Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain — hover for a brief summary, click for more detail

DOMAIN
BLEEDING EDGE ←→ ESTABLISHED

Ticket intelligence — intent, sentiment & language detection

LEADING EDGE

TRAJECTORY

Stalled

AI that classifies ticket intent, detects sentiment and escalation risk, identifies language, and routes accordingly. Includes multi-label topic tagging and escalation prediction; distinct from rule-based ticket routing, which assigns tickets according to rules rather than by understanding their content.
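The capabilities above can be sketched as a toy pipeline. This is a minimal illustration only, not any vendor's implementation: the keyword rules, labels, and thresholds are invented for the example, and production systems use trained classifiers rather than keyword matching.

```python
# Toy ticket-intelligence pipeline: intent tagging, sentiment, escalation
# risk, and routing. All rules and labels here are hypothetical; real
# deployments use trained multi-label classifiers, not keyword lookup.

INTENT_KEYWORDS = {
    "billing": ("invoice", "charge", "refund", "payment"),
    "technical": ("error", "crash", "bug", "broken"),
    "account": ("password", "login", "access"),
}
NEGATIVE_WORDS = ("angry", "unacceptable", "terrible", "cancel", "furious")

def classify(ticket_text: str) -> dict:
    text = ticket_text.lower()
    # Multi-label intent tagging: a ticket may carry several intents at once.
    intents = [label for label, words in INTENT_KEYWORDS.items()
               if any(w in text for w in words)]
    # Crude sentiment signal: count of known negative markers.
    negatives = sum(w in text for w in NEGATIVE_WORDS)
    sentiment = "negative" if negatives else "neutral"
    # Escalation risk combines sentiment intensity with a churn signal.
    escalate = negatives >= 2 or "cancel" in text
    # Route to the first matched intent's queue, else a general queue.
    queue = intents[0] if intents else "general"
    return {"intents": intents, "sentiment": sentiment,
            "escalate": escalate, "queue": queue}

result = classify("This is unacceptable - the invoice is wrong, I want to cancel.")
print(result)
```

The point of the sketch is the shape of the output, not the matching logic: one pass over the ticket yields intents, a sentiment label, an escalation flag, and a routing decision, which is what distinguishes ticket intelligence from rule-only routing.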

OVERVIEW

Ticket intelligence — AI that classifies support tickets by intent, sentiment, and escalation risk — is technically proven but organisationally stalled. The core capabilities work: production deployments routinely hit 80-90% accuracy on straightforward classification, and best-in-class implementations exceed 98% on escalation routing. Every major cloud platform ships GA intent and sentiment features. The problem is getting from pilot to production. Research consistently shows that most AI agent pilots never reach deployment, blocked by integration costs, data fragmentation, and legacy infrastructure. This gap between what the technology can do and what organisations actually operationalise defines ticket intelligence as a leading-edge practice — forward-leaning teams extract real value, but the majority have not moved beyond evaluation.

CURRENT LANDSCAPE

Zendesk, IBM Watson Assistant, Google Cloud, AWS Comprehend, and NICE all ship production intent detection and sentiment analysis, with Zendesk refining its Intelligent Triage feature through early 2026 to address overlapping-intent accuracy problems. Deployments that reach production show compelling returns. Fin AI reports greater than 98% accuracy on escalation routing; AssemblyAI cut first-response time from 15 minutes to 23 seconds with 50% automated resolution; Grove Collaborative reduced ticket volume by over 80% through intent-based routing. The sentiment analytics market reached $5.71B in 2025, and 82% of senior leaders report investing in AI-powered customer service tools.

Getting there remains hard. RAND and Gartner data indicate 88% of AI agent pilots never advance past proof-of-concept, with integration costs running $140K-$350K and timelines stretching to four to six months. OpenAI's own research frames this as a "capability overhang" — the technology is ready, but most organisations lack the execution frameworks to use it. Technical limitations compound the organisational ones: single-label routing fails when tickets carry stacked intents, accuracy metrics often miss real containment and task-success signals, and at least one vendor (Syncro) has already deprecated its AI ticket classification feature. An Intercom survey of 2,400 support professionals found 77% say AI meets or exceeds expectations, yet only 10% have reached mature deployment — a ratio that captures where this practice actually stands.
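The stacked-intent failure mode described above is easy to see in a sketch: forcing a single label onto a ticket that carries two intents necessarily drops one of them, while multi-label tagging preserves both. The scores below are invented classifier outputs for illustration.

```python
# Why single-label routing fails on "stacked" intents: hypothetical
# classifier scores for one ticket mixing a billing complaint with a
# technical problem.

scores = {"billing": 0.48, "technical": 0.46, "account": 0.06}

# Single-label routing: argmax keeps only the top intent, silently
# discarding the near-equal second intent.
single_label = max(scores, key=scores.get)

# Multi-label tagging: keep every intent above a confidence threshold,
# so downstream workflows can address both issues.
THRESHOLD = 0.30
multi_label = sorted(label for label, s in scores.items() if s >= THRESHOLD)

print(single_label)   # only one intent survives
print(multi_label)    # both intents survive
```

This also illustrates why accuracy on the argmax label can look healthy while containment suffers: the dropped second intent resurfaces as a reopened or re-filed ticket that the accuracy metric never sees.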

TIER HISTORY

Research: Jan 2018 → Jan 2018
Bleeding Edge: Jan 2018 → Jan 2020
Leading Edge: Jan 2020 → present

EVIDENCE (116)

— Named organizations (Halfbrick, Hutch Games, Supercell) deploying real ticket intelligence systems with intent classification, sentiment scoring, and issue tagging. Metrics show 84→9 hour resolution improvement.

— Industry benchmark showing 78%+ AI adoption in support operations, 40-70% ticket deflection targets, ROI metrics of $3.50–$8.00 per $1 invested, and 25.8% market CAGR through 2030.

— Production case study: independent service provider deployed real-time sentiment analysis on support calls using Whisper transcription + Bedrock Claude + Glia platform widget.

— Enterprise implementation guide demonstrating Comprehend sentiment, entity recognition, and PII detection in customer support automation, reducing processing time from 3 days to 15 minutes.

— Production deployment of sentiment-driven ticket prioritization via eZintegrations Goldfinch AI with validated metrics: 89%+ sentiment classification accuracy, 91%+ urgency detection precision.

— Comprehensive benchmark aggregating 150+ data points from Zendesk, Salesforce, Gartner, Forrester, Intercom, McKinsey, BCG, Bain. Shows intent-based deflection asymmetry and quality gaps by intent type.

— AWS vendor implementation bundling Bedrock + Comprehend for sentiment analysis, intent detection, urgency assessment, and churn risk detection in production-ready reference framework.

— Deprecation notice: IBM Watson Tone Analyzer (sentiment detection) retired Feb 2023, no longer activated for new customers. Negative signal showing vendor consolidation away from standalone tone analysis.

HISTORY

  • 2018: IBM Research demonstrated sentiment analysis on real service provider ticket data for subscription renewal prediction; production helpdesk systems achieved 90% accuracy routing 40,000+ emails/month across major providers; sentiment analysis market forecasted to grow from $123M to $3.8B; multi-label classification on real ticket data still limited to 54% accuracy.
  • 2019: Enterprise deployments accelerated—Lufthansa deployed Watson NLU across 15,000 agents, Meltwater scaled sentiment analysis to 450M documents/day for 30,000+ customers; research advanced multi-intent detection (NAACL 2019); Google released AutoML Natural Language GA; sentiment detection became production-standard, but full automation remained constrained by multi-label complexity and edge cases requiring human review.
  • 2020: IBM advanced Watson with idiom/colloquialism detection and Project Debater commercialization; intent detection benchmarking showed Watson Assistant outperforming competitors by 5-14 percentage points; Observe.ai demonstrated production sentiment analysis driving 50% conversion gains; escalation prediction emerged as validated research focus; ticket quality assessment began attracting academic attention; multi-language support and integration complexity remained primary adoption barriers.
  • 2021: Intent detection matured with peer-reviewed NAACL benchmarking validating Watson Assistant leadership; sentiment analysis expanded to 7-tone emotion detection (frustrated, satisfied, etc.); research advanced transformer-based approaches for handling imbalanced customer datasets; specialized tooling proliferated (Nyckel, etc.); methodological refinements (token-level labeling, transfer learning) incrementally improved accuracy; comprehensive multi-capability integration remained complex.
  • 2022-H1: Google Cloud and NTT DATA demonstrated production ticket intelligence deployments using cloud AutoML and semantic NLP for feedback routing and categorization; sentiment.ai and competing tools advanced multilingual accuracy through deep learning benchmarks; Info-Tech research identified ticket intelligence ROI barriers in ITSM adoption; FinTech sector began applying sentiment analysis to support for issue prioritization and escalation detection; ecosystem tooling matured with specialized support-ticket classifiers.
  • 2022-H2: Google Cloud's internal production pipeline deployed clustering and anomaly detection on support tickets at scale; intent detection advanced in financial services (Banking77 benchmarking) and real-time voice support (LSTM-based latency optimization); sentiment analysis pipelines scaled in AWS and GCP for near-real-time routing; NAACL 2022 research tackled novel intent detection under constrained annotation budgets. However, Zendesk's documented limitations on language-specific accuracy showed production failures in unsupported languages; multi-label simultaneous classification remained organizationally complex; integration with legacy ITSM systems continued to lag behind cloud-native platforms.
  • 2023-H2: Research acceleration on intent detection methods (open models, self-supervised pre-training via RSVP/EMNLP 2023) alongside continued advancement in novel intent discovery frameworks. Vendor product evolution continued with Google Cloud's new PaLM-based sentiment model in Natural Language API v2. Real-world case studies demonstrated active adoption (Qlik escalation reduction, Indonesian company 1000+ ticket pilot), validating ROI drivers. Critical adoption barriers remained: resource quota constraints on cloud platforms, unsupported language failure modes, and complexity of comprehensive multi-label classification requiring continued human review.
  • 2024-Q1: Organizational adoption accelerated with 70% of C-level support executives planning AI investment in customer service; real-world deployments demonstrated operational ROI (telecom companies reducing escalations 18% via real-time sentiment monitoring). Sentiment analysis standardized across cloud platforms with mature tooling. However, vendor platform reliability became a constraint—practitioners reported critical failures during AutoML-to-Vertex migrations, exposing inadequate migration paths and documentation gaps. Structural barriers persisted: cloud platform quota limitations, language-specific failure modes, and complexity of simultaneous multi-label classification continued to require human review in production.
  • 2024-Q2: Intent detection expanded to automotive customer systems (General Motors OnStar), confirming real-world enterprise deployment momentum. Peer-reviewed research revealed LLM limitations in complex sentiment analysis tasks—a key finding that sophisticated ticket intelligence remained beyond LLM reach. Google Cloud's Natural Language API deprecation and migration to Vertex AI created user confusion and exposed vendor documentation gaps. All major cloud platforms (IBM, Google, AWS, Azure) offered production sentiment/intent/emotion detection, but vendor platform stability and tooling reliability remained uneven constraints on broader adoption.
  • 2024-Q3: Enterprise adoption planning accelerated with 70% of C-level support execs planning AI investment per Zendesk survey, while open-source implementations demonstrated multi-label classification at 93% accuracy. However, critical headwinds emerged: peer-reviewed validation that LLMs lag on complex sentiment detection, real-world AI failures in customer service contexts, McKinsey data showing adoption plateau at 50-60% due to cost and hallucination barriers, and Google's incomplete Natural Language API migration creating vendor platform instability. Structural barriers persisted: quota constraints, language-specific failure modes, and multi-label classification requiring human review. Credibility gap between vendor marketing and production reality widened as adoption matured.
  • 2024-Q4: Market adoption metrics confirmed 1,250+ solutions globally serving 4,500+ corporate end-users, demonstrating category-level breadth. Academic research continued advancing intent detection methods (CNN-BiLSTM) while peer-reviewed evidence confirmed LLMs lag on complex sentiment. Market skepticism intensified: MIT economist warned AI infrastructure investments may underperform; industry analysis found only 44% of companies had AI strategies despite 76% feeling competitive pressure; real-world customer service AI failures documented. Vendor platform churn continued with Google's incomplete migration. Gap between investment intent and deployment execution widened; adoption plateau persisted at 50-60% due to cost, hallucination risks, and execution complexity. Practice remained in leading-edge with broad organizational consideration but significant headwinds to sustained momentum.
  • 2025-Q1: Vendor platforms matured with Google Cloud Natural Language and Freshworks shipping enhanced AI ticketing features. Real deployments showed strong ROI: Infiniticube's Sun West Mortgage case study achieved 40% faster resolution and 30% cost reduction; Glammmup improved CSAT from 62 to 78 via sentiment analysis. Research advanced multilingual emotion detection (SemEval 2025) but revealed language-specific robustness gaps. Critical failures documented in language detection production deployments exposed fundamental technical limitations. Organizational commitment remained high (70% of C-level execs planning investment) despite widening gap between investment intent and deployment feasibility.
  • 2025-Q2: Real-world deployments demonstrated substantial ROI: AssemblyAI achieved 97% reduction in first response time and 50% automated resolution; Monte dei Paschi di Siena Bank deployed BERT-based classification at 85.88% accuracy on production tickets. Practitioner guidance emerged on escalation-trigger design including sentiment and complexity detection. However, platform disruption accelerated—Google announced June 2025 cutoff for AutoML Text classification/sentiment/entity extraction, forcing deployments to migrate to Gemini-based approaches. Vendor platform instability emerged as critical adoption barrier alongside persistent multi-label classification complexity and language-detection production failures.
  • 2025-Q3: Escalation-routing deployments matured with Fin AI achieving >98% accuracy on production escalation decisions using custom models with sentiment/complexity logic. NICE and major vendors reinforced intent detection as standard GA feature. Adoption metrics confirmed ~28% resolution time improvement from AI-driven ticket triage with up to 35% ticket deflection. However, customer sentiment data revealed significant headwinds: 70% of consumers abandon brands after poor AI experiences and 88% prefer human agents, contradicting vendor deployment momentum. Platform disruption continued—Google's AutoML cutoff caused migrations; vendor documentation gaps exposed organizational friction. Credibility gap between marketing claims and production reality widened, creating tension between strong investment intent (70% of C-level execs, 78% of organizations deploying AI) and execution barriers (quota constraints, language-detection failures, multi-label complexity).
  • 2025-Q4: Independent case studies demonstrated strong intent detection ROI (Grove Collaborative's 80%+ volume reduction via intent routing, Eurail's 95% first-response improvement). Market growth accelerated with $12.06B AI customer service market in 2024, projected $47.82B by 2030 (25.8% CAGR). However, McKinsey data (September 2025) revealed 73% of AI pilots fail to reach production with integration costs of $140K–$350K and 4–6 months required. Text-based intent/sentiment analysis exposed fundamental limitations: poor tone detection, inability to probe surface issues, and weak cross-ticket pattern recognition. Platform disruption persisted with ongoing Google AutoML migrations. Practice remained firmly leading-edge with clear organizational momentum (70% of C-level executives planning investment) but widening execution-intention gap as adoption barriers (integration costs, documentation gaps, technical limitations) became more apparent.
  • 2026-Jan: Vendor platforms released coordinated capability upgrades (Google Cloud, IBM Watson, Zendesk all shipped enhanced GA features) signaling continued investment in production maturity. However, OpenAI's "capability overhang" analysis and RAND/Gartner research documented fundamental deployment barriers: 88% of AI agent pilots fail to reach production; enterprise implementation requires 4–6 months and $140K–$350K integration costs. Market data confirmed strong adoption momentum (82% of senior leaders invested, sentiment analytics market at $5.71B), but revealed persistent tension between technical readiness and organizational execution capability. The practice remained in leading-edge with clear momentum but plateauing ROI realization.
  • 2026-Feb: Vendor platforms released coordinated refinements—Zendesk shipped intent quality recommendations for personalized conflict resolution, and independent research demonstrated real-world ticket classification in public administration (ISTAT). However, Syncro deprecated AI ticket classification as adoption barriers persisted. Market adoption accelerated (77% of teams report AI meeting/exceeding expectations, 80% of routine interactions handled by AI) yet early maturity remained (only 10% at mature implementation stage). Critical analysis documented production limitations: single-label routing fails with stacked intents and accuracy metrics miss real containment/task success signals. Platform evolution and organizational execution gaps continued to define the leading-edge plateau.
  • 2026-Apr: Major vendors accelerated platform refinements—Zendesk added intent quality recommendations and entity extraction reporting; AWS announced Predictive Insights for Amazon Connect with intent and sentiment detection; Cisco Webex expanded multilingual sentiment analysis support across 7 new languages. Real-world case studies documented strong ROI: SupportLogic customers (Salesforce, Nutanix, Basware, Databricks) achieved 30-80% escalation reductions; Robylon deployments reached 93% ticket classification accuracy on 300k+ annual tickets with 83% automation. Large-scale adoption analysis (150+ enterprise deployments, 10M+ tickets) confirmed 95%+ routing accuracy and 80% autonomous resolution as baseline capabilities. Vendor product consolidation signaled market maturity—Zendesk's acquisition of Forethought positioned ticket intelligence (intent, sentiment, language detection) as foundational infrastructure rather than optional features. Market-wide sentiment analysis tools demonstrated 94%+ accuracy benchmarks. Leading-edge plateau persisted due to integration complexity and organizational execution barriers despite strong technical capability and proven ROI.
  • 2026-May: Additional production case studies confirmed strong real-world ROI: DataArt deployed real-time sentiment analysis on support calls using Whisper + Bedrock; Helpshift customers (Halfbrick, Hutch Games, Supercell) reduced resolution time from 84 to 9 hours via intent classification and sentiment scoring; eZintegrations reported 89%+ sentiment classification accuracy and 91%+ urgency detection. Market adoption data aggregating 150+ benchmarks showed intent-based deflection asymmetry (structured 70%+ deflection, sentiment-heavy 19-34%) indicating practice maturity variance. Industry benchmarks reported 78%+ AI adoption in support operations with $3.50–$8.00 ROI per dollar invested and 25.8% market CAGR through 2030. However, vendor consolidation continued—IBM's Watson Tone Analyzer (standalone sentiment detection) deprecated by February 2023, marking shift toward embedded capabilities in multi-function platforms. Leading-edge plateau persisted with proven technical maturity (80-95% accuracy, 28%+ resolution improvement) and strong organizational commitment (78%+ adoption), but execution barriers (integration costs, documentation gaps, multi-label complexity) continued to limit production deployments.

TOOLS