Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain — hover for a brief summary, click for more detail

DOMAIN
BLEEDING EDGE → ESTABLISHED

Agent quality monitoring & coaching

GOOD PRACTICE

TRAJECTORY

Stalled

AI that monitors agent interactions for quality and compliance while providing real-time sentiment and tone coaching. Includes automated QA scoring and in-call coaching prompts; distinct from agent assist which drafts responses rather than evaluating agent performance.

OVERVIEW

AI-driven quality monitoring and coaching is a proven capability with a mature vendor ecosystem, GA tooling, and documented ROI — yet a persistent gap between deployment and value extraction keeps the practice from reaching universal status. The technology itself works: auto-scoring accuracy exceeds 99%, 100% interaction coverage has replaced manual sampling at forward-leaning organisations, and real-time coaching delivers measurable gains in handle time, attrition, and compliance. The question facing most contact centres is no longer whether to adopt, but how to move past fragmented pilots into strategic integration. That transition is where most stall. Only 12% of organisations with AI in their contact centres report fully optimised value, and change management failures — agent distrust, leadership gaps in empathy training, disconnects between operational metrics and business outcomes — remain the binding constraint. The tooling is ready; the organisational maturity is not.

CURRENT LANDSCAPE

Calabrio, Observe.AI, NICE, Verint, and Gryphon all ship GA products offering 100% interaction coverage, automated scoring, and real-time agent coaching. Named deployments back the value claims: Verint's Quality Bot scaled QA coverage from 1% to 96% for a major enterprise, eliminating 1,200 manual QA roles and delivering $12.5M annual savings. Calabrio's QM platform delivers 90% reductions in manual QA time at production scale, while a healthcare deployment through its CareAI programme automated quality evaluation for 53% of patient inquiries with measurable improvements in time to care. Observe.AI, serving over 400 enterprise customers, reports consistent 20% AHT reductions and 25% CSAT improvement from real-time coaching. A Tele Access BPO deployment demonstrates distinctive coaching methodologies—Good Call/Bad Call peer learning sessions grounded in adult learning research, plus morning briefings with actionable, specific feedback—showing the operational maturity of coaching at scale.

May 2026 market evidence confirms rapid adoption: Salesforce's State of Service study (3,075 professionals) found a 66% adoption rate, with 70% of organisations observing measurable value within 60 days of deployment. Real-time coaching architecture has matured into a three-layer pattern (pre-interaction compliance cues, during-call live guidance, post-interaction micro-coaching), validated by research showing that immediate feedback produces 2-3× larger behavioral effect sizes than delayed reviews. Coaching ROI is documented: contact centers deploying continuous automated feedback report 25-40% faster new-rep ramp time and measurable CSAT gains. Sentiment Arc analysis, which extends beyond polarity to detect churn-signal patterns (a customer starts satisfied and ends negative), reveals coaching opportunities that resolved-ticket metrics miss.
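The churn-signal pattern described above (a customer who starts satisfied and ends negative) can be illustrated with a minimal sketch. It assumes per-utterance sentiment scores in the range -1 to 1 are already available from some upstream model; the function name `sentiment_arc`, the window size, and the threshold are hypothetical choices for illustration, not any vendor's method.

```python
from statistics import mean


def sentiment_arc(scores, window=3, threshold=0.2):
    """Classify a conversation by comparing its opening and closing sentiment.

    scores: per-utterance customer sentiment in [-1, 1] (assumed input).
    window: number of utterances averaged at each end (illustrative default).
    threshold: minimum magnitude to count as clearly positive or negative.
    """
    if len(scores) < 2 * window:
        return "too_short"  # not enough turns to separate opening from closing
    opening = mean(scores[:window])
    closing = mean(scores[-window:])
    if opening >= threshold and closing <= -threshold:
        return "churn_signal"  # started satisfied, ended negative
    if opening <= -threshold and closing >= threshold:
        return "recovered"  # started negative, ended satisfied
    return "stable"


# Example: a call that opens well and deteriorates.
label = sentiment_arc([0.6, 0.5, 0.4, 0.0, -0.3, -0.5, -0.6])
```

A single averaged polarity score for this example would sit near zero and look unremarkable, which is why the trajectory, rather than the overall polarity, is the signal of interest.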

June 2026 evidence deepens the adoption picture. COPC's outcome-based measurement framework shows that QA outcomes depend heavily on policy and process design, not just agent performance: in a B2B e-commerce deployment, only 7 of 60 percentage points of unresolved issues were agent-controllable, while the remaining 53 were policy-, process-, or tool-driven. ETS Labs' QEval platform processed 2.5 billion interactions in 2025, demonstrating that 100% coverage is now achievable at scale with sub-4-minute latency. Level AI's Vista deployment scaled from 1-2% manual coverage to 100% AI-automated scoring with improved coaching effectiveness. Operational governance has crystallized as the next maturity frontier: Soberan's framework articulates evidence-packet generation, supervisor calibration, and governed coaching actions as prerequisites for trusted automation. Regulatory bodies increasingly enforce 100% compliance monitoring: an RBI fine (Rs 1.31 cr) against a BFSI firm exposed how 3% sampling fails to detect systematic violations, with a 4-8 week detection lag.
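The COPC figures above (7 agent-controllable out of 60, leaving 53 attributed to policy, process, or tools) reduce to a simple share calculation. The sketch below is only that arithmetic; the function name `controllability_split` is hypothetical and the inputs are the two numbers quoted in the text.

```python
def controllability_split(agent_pp, total_pp):
    """Split an unresolved-issue total into agent-controllable and other drivers.

    agent_pp: portion attributed to agent behaviour.
    total_pp: total unresolved-issue portion, in the same unit.
    """
    if total_pp <= 0 or not 0 <= agent_pp <= total_pp:
        raise ValueError("expected 0 <= agent_pp <= total_pp and total_pp > 0")
    other_pp = total_pp - agent_pp
    return {
        "agent_share": agent_pp / total_pp,      # fraction coaching can address
        "non_agent_pp": other_pp,                # policy, process, or tool-driven
        "non_agent_share": other_pp / total_pp,  # fraction needing operational change
    }


split = controllability_split(7, 60)
```

On these inputs, a little under an eighth of the unresolved total is addressable through individual coaching, and the rest calls for changes to policy, process, or tooling.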

Yet adoption unevenness persists. A May 2026 industry assessment found that 88% of contact centers have deployed AI but only 25% have operationalised it into daily workflows. More critically, only 14% of enterprises move pilot agents to production; 78% remain blocked by governance gaps rather than model capability. Production safety requires five architectural foundations: decision audit trails for post-hoc review, human-in-the-loop escalation thresholds, rollback capability to known-good versions, ownership assignment for multi-agent workflows, and continuous runtime observability. Without these, organisations risk failed compliance audits and customer escalations. The barriers are primarily operational and organisational, not technical. Only 35% of agents understand how AI tools are being used in their workflow, more than half fear job automation, and 64% of leaders neglect empathy training despite agents rating it a core strength. Bias in scoring models (accent, sentiment, gender, and script-adherence patterns) remains documented across a majority of deployed systems, and privacy litigation under statutes like CIPA adds legal friction. The technology has arrived; operationalising it safely at scale is now the work.
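The five architectural foundations listed above can be treated as a pre-production checklist. The following is a minimal sketch under that assumption: the class `ProductionReadiness`, its field names, the function `route_score`, and the 0.8 threshold are all hypothetical illustrations, not part of any cited framework.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProductionReadiness:
    """One flag per foundation named in the text (field names are illustrative)."""
    decision_audit_trail: bool
    human_escalation_thresholds: bool
    rollback_to_known_good: bool
    workflow_ownership_assigned: bool
    runtime_observability: bool

    def missing(self):
        """Return the names of foundations that are not yet in place."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready(self):
        """A deployment is ready only when no foundation is missing."""
        return not self.missing()


def route_score(confidence, escalation_threshold=0.8):
    """Human-in-the-loop gate: low-confidence automated scores go to a reviewer.

    The default threshold is an arbitrary placeholder; a real value would be
    set through supervisor calibration.
    """
    return "auto_accept" if confidence >= escalation_threshold else "human_review"


pilot = ProductionReadiness(True, True, False, True, False)
gaps = pilot.missing()  # foundations still to be built before go-live
```

Expressing the gate as explicit flags makes the blocking gaps visible per deployment, which matches the text's point that governance, not model capability, is what holds pilots back.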

TIER HISTORY

Research: Jan-2020 → Jan-2020
Bleeding Edge: Jan-2020 → Jan-2022
Leading Edge: Jan-2022 → Jan-2025
Good Practice: Jan-2025 → present

EVIDENCE (135)

— Market consolidation signal: Verint's $2B acquisition of Calabrio (April 2026) merges two major QA/WFM vendors, indicating enterprise strategic importance and validation of QA automation maturity at scale.

— Three named orgs with quantified QA/coaching outcomes: FNB South Africa (14x automated evaluations, 15% compliance improvement); NOS Portugal (40% productivity gain, 61-point NPS); major bank ($10M agent capacity savings).

— Adoption metrics: 75% of customer interactions projected to be monitored by AI QA systems by 2026 (up from 30% in 2021); 80% of contact centers now use AI-based QA technologies; the shift from 5% manual sampling to 100% AI coverage is now mainstream.

— GA product with named deployments: financial/HR team automated 1.8M assessments with 60% QA time reduction; mental health helpline scaled 450% call volume; healthcare org saved 27,000+ clinical hours via 41% ACW reduction.

— Comprehensive best practices framework articulating the shift from 1-3% sampling to 100% AI-driven coverage, with real-time coaching shown to be superior to post-call review; the finding that 78% of customers switch after one bad experience underpins the retention ROI case.

Real-Time Agent Coaching Bot: Case Studies

— Production deployments with quantified ROI: telco €67.8M annual benefits plus 30-second AHT reduction; mortgage lender NPS improvement from +3 to +39; 500-agent deployment achieving 15x ROI with measurable CSAT/compliance gains.

— 350+ enterprise customers, 4.6-star G2 rating from 238 verified reviews, IDC MarketScape Leader status; $44.2M revenue (2024) signals category-level adoption and vendor financial maturity for sustained innovation.

— Primary research (109 directors/VPs): 85% deployed AI QA/training tools but only 29% use effectively—critical deployment-to-value gap rooted in broken training-QA integration rather than technology limitations.

HISTORY

  • 2020: Observe.AI and Calabrio establish AI-powered agent quality monitoring as a distinct capability; Observe.AI secures $80M in funding and lands partnerships with HCL and 3CLogic, enabling 100% call coverage for sentiment and compliance scoring.
  • 2021: Observe.AI reaches 160 customers with 20,000+ agent licenses; launches AI-powered coaching product suite (4X coaching session increase); integrates with Microsoft Azure. Calabrio expands to 100% omnichannel interaction analytics. Cloud adoption accelerates (68% of contact centers), validating infrastructure readiness. Implementation challenges emerge: nonlinear effort to achieve quality thresholds and risk of agent demotivation from automated scoring.
  • 2022-H1: Observe.AI reports 150% ARR growth and 40% enterprise customer increase; launches Auto QA for adaptive automation (up to 1,000x coaching insights increase). Calabrio earns G2 Leader recognition and integrates with Talkdesk. Independent survey shows 67% of contact centers still manual but CI adopters 10x more confident; ecosystem maturity advances via platform integrations and product innovation.
  • 2022-H2: Vendor ecosystem accelerates real-time coaching capabilities. Calabrio and Verint present advanced coaching and quality management features at industry conferences. Market adoption survey shows 78% of contact centers plan AI deployment within 3 years, with quality management as a top priority use case. Implementation focus shifts from manual sampling to 100% automated interaction evaluation.
  • 2023-H1: Observe.AI launches Real-Time AI product suite adding live guidance and supervisor coaching; Calabrio maintains leadership with mature Auto QM evaluation forms. Third-party adoption metrics show significant gaps: 62% of contact center managers cannot analyze enough calls for accurate performance evaluation (Invoca). Vendor innovation focuses on real-time coaching ROI with measurable behavioral impacts (5-6% win rate lifts). Two-tier market emerges between cloud-native leaders and traditional centers.
  • 2023-H2: Critical research from SQM Group reveals persistent adoption barriers: only 19% of managers believe QA programs improve CSAT, and 83% of agents don't believe QA helps their performance. Despite mature product capabilities and vendor innovation in ROI tooling, fundamental user skepticism remains a deployment barrier. Quality monitoring reaches mainstream commercial stage with 100% interaction evaluation becoming standard, but adoption unevenness persists between cloud-native and traditional contact centers.
  • 2024-Q1: Calabrio acquires Wysdom.AI to expand bot QA analytics; deployment evidence shows Awaken achieving 56% reduction in difficult calls and 10% sales uplift. ISG analyst research projects two-thirds of contact centers will increase training/coaching budgets by 2026. Level AI survey finds 100% of leaders considering AI adoption with 23% higher satisfaction among agents using real-time AI tools. Community skepticism persists about AI agent reliability in live customer interactions, highlighting deployment risks alongside vendor momentum.
  • 2024-Q2: NICE releases Real-Time Interaction Guidance for AI-driven agent coaching with contextual compliance prompts, confirming category-wide focus on real-time coaching as table-stakes capability. Calabrio continues product innovation with Bot Analytics tools. Vendor ecosystem shows steady product maturation in real-time guidance and evaluation, though no major new deployment case studies emerge in this quarter.
  • 2024-Q3: Market adoption metrics show 39% of CX leaders using AI-driven scoring for employee and customer evaluation (CallMiner survey, 700 leaders). AWS ecosystem integration advances with real-time sentiment analysis templates for contact center deployment. Research from Salesforce reveals fundamental vulnerabilities in AI evaluation systems: LLMs vulnerable to deceptive feedback with 50%+ performance degradation. Analysis of 47 failed enterprise AI deployments ($127M sunk) identifies testing gaps and insufficient human oversight as key adoption barriers. Industry data shows 74% of contact centers still rely on random sampling, with AI achieving 100% coverage—adoption bifurcation persists between cloud-native and traditional centers.
  • 2024-Q4: NICE releases AI for Agents with 100% conversation evaluation and real-time coaching, confirming vendor commitment to quality monitoring as table-stakes. Adoption momentum continues: 33% of contact centers actively using emotion recognition for sentiment analysis; Frost & Sullivan projects two-thirds of CX operations plan AI-driven coaching within 3-5 years. However, critical legal barriers emerge: class-action lawsuits under privacy statutes (CIPA) challenge automated quality management when customer consent absent. Market bifurcation persists—cloud-native leaders deploying 100% AI-automated evaluation while traditional centers remain largely manual; execution gaps widen between vendor innovation and customer deployment capability.
  • 2025-Q1: Named deployments confirm measurable ROI: AAA Northeast reduced AHT by 14 seconds via AI analytics (equivalent to 1 FTE); Australian energy provider cut QA scoring inconsistencies by 35%, improving FCR/CSAT; GE Appliances/Delta Dental show cost reduction and attrition/defect improvement. AWS Marketplace integration of Observe.AI signals cloud ecosystem maturity. McKinsey/Gartner data shows 30% CSAT improvement and 25% productivity gains from AI monitoring. Practitioner analysis emphasizes hybrid human+AI model necessity—pure automation risks employee pushback, judgment gaps, and legal compliance issues. Market remains bifurcated between cloud-native leaders and traditional centers.
  • 2025-Q2: Calabrio's survey reveals near-universal AI adoption (98%) but persistent implementation challenges: 61% of centers report more difficult conversations since AI deployment, 32% cite agent distrust as critical barrier. Calabrio releases 70+ new features (Auto QM, Trending Topics, Interaction Summary), confirming ecosystem maturity. Observe.AI documents 350+ enterprise deployments with 60% efficiency gains and 75% QA time reduction. Critical shift in evidence landscape: practitioner analysis exposes measurement gaps (e.g., telecom provider showed 12% AHT improvement but 7% revenue decline), revealing tension between operational metrics and business outcomes. Legal/compliance barriers persist; market bifurcation between cloud-native leaders and traditional centers widens.
  • 2025-Q3: Vendor ecosystem innovation continues: Omind launches AI QMS platform with 100% automation, 30% cost reduction, 95% compliance accuracy, 20% CSAT gains, and up to 59-second AHT improvement. Observe.AI case studies document RealDefense (103% quota lift, 13% revenue boost) and Nations Info Corp (50% save rate improvement, 43% AHT reduction). However, critical deployment risks crystallize: 75-80% of enterprises deploying AI QA grading; documented bias manifestations in scoring (accent, sentiment, gender, script-adherence bias) affecting agent reviews and coaching. Organizational failure rates spike: Gartner forecasts 85% of AI projects fail; S&P Global 2025 data shows 42% of companies abandoned most AI initiatives. Fundamental gaps persist—legacy system integration, change management, causation modeling between metrics and business outcomes, and compliance under CIPA privacy constraints. Market bifurcation widens: cloud-native leaders achieve strong ROI, traditional centers struggle with execution. Technology maturity exceeds implementation maturity.
  • 2025-Q4: Technical standardization emerges: Deepgram and UseScore publish production guidelines for speech-to-text sentiment analysis (5-10% WER) and scaling from 3-5% manual sampling to 100% coverage (70-90% workload reduction achieved 8-12 weeks post-deployment). Industry benefit data solidifies: 20-40% CSAT, 15-25% repeat-call reduction, 30-50% escalation gains consistently reported. Observe.AI validated as IDC MarketScape Leader in Workforce Engagement Management. However, deployment fundamentals remain unchanged: 61% of centers report conversation quality degradation post-deployment; 32% cite agent distrust; organizational failure rates sustained at 42% abandonment; bias risks (accent, sentiment, gender, script-adherence) persist across 75-80% deployed systems. No tier-advancement signals emerge; market bifurcation between cloud-native leaders (with strong ROI) and traditional centers (struggling with execution) persists unchanged.
  • 2026-Jan: Calabrio launches Omni Agent Intelligence for unified human+AI agent monitoring, confirming market evolution toward hybrid deployment frameworks. New product capabilities extend to quality measurement across autonomous and human agents. However, Gartner forecasts 40% of agentic AI projects will be canceled by 2027 due to cost, unclear ROI, inadequate risk controls, and integration friction (70% of developers report integration problems). Technical maturity advances (95-98% call-scoring accuracy achieved) but organizational adoption barriers persist: unoptimized QA processes, inadequate change management, and measurement gaps between operational metrics and business outcomes remain tier-limiting factors.
  • 2026-Feb: Calabrio QM deployment demonstrates advanced technical capability: 99%+ auto-scoring accuracy, 90% manual QM time reduction, 25% agent attrition improvement, 41% ACW reduction; healthcare deployment (CareAI) manages 53% of inquiries via automated quality evaluation. However, strategic optimization analysis reveals critical disconnect: 98% AI adoption across contact centers but only 12% claim fully optimized value; 86% remain in "pilot purgatory." Leadership and cultural barriers intensify—only 35% of agents understand AI tool usage, >50% fear automation, and 64% of leaders neglect empathy training. Deployment maturity remains unchanged with persistent challenges: bias risks in 75-80% of systems, 42% organizational abandonment rates, measurement gaps between operational and revenue metrics. Technology capability advances but implementation/organizational maturity static, preventing tier advancement.
  • 2026-Apr: New deployment evidence confirms scale ROI where adoption is mature: Verint Coaching Bot delivers €67.8M benefit and 20-second AHT reduction at a telco, $70M savings at an insurer, and +39 NPS at a mortgage lender; platform comparison data shows BPO customers scaling QA coverage from 3% to 100% with 5-point CSAT gains. The operationalization gap sharpens as the defining constraint: CMSwire data shows 88% of contact centers deployed AI but only 25% operationalized it into daily workflows, and only 52% allow shared visibility between agents and AI systems. A real failure case (insurance company's knowledge base error affecting 660 calls over 11 days before a customer complaint surfaced it) illustrates why 100% monitoring coverage has practical value beyond efficiency — it catches systematic errors that sampling misses.
  • 2026-May: Major vendor commitments formalize next-generation capabilities. Microsoft launches Quality Assurance Agent within Dynamics 365 Contact Center (GA April 2026), emphasizing the shift from sampling to real-time evaluation across both AI and human agents. Palomarr analyst framework ranks 94 vendors on transcription accuracy, real-time analytics, and coaching automation, identifying Level AI (9.8), Cresta (9.7), and Observe.AI (9.6) as leaders. Independent survey of 815 enterprise executives (Liveops/Peter Ryan Strategic Advisory) identifies a continued maturity gap: 65% remain in Walk/Run stages (hybrid human-AI workflows) requiring quality management infrastructure, while only 14% reach Fly stage with continuous real-time optimization. McKinsey data confirms 90%+ AI accuracy vs 70% manual; $286K annual savings per 1% FCR improvement. Verint Quality Bot delivers enterprise-scale outcomes: €8.6M digital channel savings at 80% containment and 30% attrition reduction, with Fiserv achieving 1%→96% QA coverage and $12.5M annual savings by eliminating 1,200 manual QA roles. Salesforce State of Service study (3,075 respondents) confirms a 66% adoption rate with 70% observing measurable value within 60 days, the strongest rapid-ROI signal in the category this cycle. Metrigy's CX Assurance research study finds companies with advanced assurance practices are 2.2× more likely to succeed, and explicitly extends the monitoring mandate from human agents to autonomous AI systems. Real-time coaching architecture consolidates as a three-layer pattern (pre-interaction compliance cues, during-call guidance, post-interaction micro-coaching), with independent technical assessment of Observe.AI confirming platform production-readiness. Vendor comparison ranking 7 automated coaching platforms documents 25-40% faster new-rep ramp time as a repeatable outcome. Tele Access BPO hybrid QA model combines 100% AI coverage with adult-learning-grounded coaching methodologies (Good Call/Bad Call peer sessions, morning briefings), demonstrating mature operationalization. 88% of centers have deployed AI but only 25% have operationalized it into daily workflows; the operationalization gap, and the risk that real-time coaching drives attrition when QA feels punitive, remain the binding constraints.
  • 2026-Jun: Outcome-based measurement, scale evidence, and M&A consolidation define the June picture. COPC's third-party case study of a B2B e-commerce deployment reveals that only 7 of 60 percentage points of unresolved issues were agent-controllable—53 pp were policy, process, or tool-driven—fundamentally reshaping how QA findings should drive operational change rather than individual coaching. ETS Labs' QEval documents 2.5 billion interactions processed in 2025 with sub-4-minute latency and zero backlog, confirming 100% coverage is now operationally achievable at industry scale. Level AI's Vista deployment illustrates the step-change from 1-2% manual sampling to 100% AI scoring with improved coaching effectiveness. Research (Journal of Applied Psychology via TechBullion) reinforces that real-time feedback produces 3x larger behavioral effect than delayed review, strengthening the case for in-call coaching over post-call evaluation. Regulatory pressure sharpens: an RBI fine of Rs 1.31 cr against a BFSI firm exposed that 3% sampling carries a 97% miss-probability per violation with a 4-8 week detection lag, making 100% coverage a compliance necessity rather than a performance aspiration. Verint's $2B acquisition of Calabrio (confirmed June 2026) merges two of the three largest QA/WFM vendors, signaling category consolidation and enterprise strategic validation. Verint Engage 2026 conference produces three named quantified deployments: FNB South Africa (14x automated evaluations, 15% compliance improvement), NOS Portugal (40% productivity gain, 61-point NPS lift), and a major unnamed bank ($10M agent capacity savings). Market adoption metrics firm up: 80% of contact centers now use AI-based QA technologies (up from 30% in 2021), with primary research on 109 directors/VPs finding 85% deployed AI QA tools but only 29% use them effectively—the deployment-to-value gap, not technology maturity, remains the binding constraint.
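
The 97% miss-probability quoted in the 2026-Jun entry follows directly from the 3% sampling rate: any single violating interaction has a 97% chance of not being reviewed. The sketch below extends that arithmetic to repeated violations, under the assumption (not stated in the source) that interactions are sampled independently and uniformly at random; the function names are hypothetical.

```python
import math


def miss_probability(sample_rate, occurrences=1):
    """Chance that none of `occurrences` violating interactions is sampled.

    Assumes independent, uniform random sampling at `sample_rate`.
    """
    if not 0 <= sample_rate <= 1:
        raise ValueError("sample_rate must be between 0 and 1")
    return (1 - sample_rate) ** occurrences


def occurrences_for_detection(sample_rate, target=0.95):
    """Smallest number of occurrences before detection probability reaches `target`."""
    if not 0 < sample_rate < 1 or not 0 < target < 1:
        raise ValueError("sample_rate and target must be strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - sample_rate))


single = miss_probability(0.03)           # one violation, 3% sampling
needed = occurrences_for_detection(0.03)  # repeats needed for 95% detection odds
```

Under these assumptions, a systematic violation has to recur on the order of a hundred times before 3% sampling is likely to surface it even once, whereas 100% coverage reviews the first occurrence.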