The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
AI that sends responses to customers automatically, with human agents involved only for escalations and edge cases. Includes confidence-gated auto-send and human escalation routing; distinct from autonomous chatbots, which handle the full interaction rather than augmenting an agent workflow.
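To make "confidence-gated auto-send" concrete, here is a minimal sketch of the routing decision. It is illustrative only: the `DraftReply` type, the `route` function, and the 0.90 threshold are hypothetical names and values, not taken from any vendor's product, and the sketch assumes the model exposes a calibrated confidence score between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical threshold; a real deployment would tune this per intent and risk level.
AUTO_SEND_THRESHOLD = 0.90


@dataclass
class DraftReply:
    ticket_id: str
    text: str
    confidence: float  # assumed to be a calibrated score in [0, 1]


def route(draft: DraftReply, threshold: float = AUTO_SEND_THRESHOLD) -> str:
    """Return "auto_send" for high-confidence drafts, otherwise "escalate"."""
    if not 0.0 <= draft.confidence <= 1.0:
        # A malformed score is treated as an edge case and goes to a human.
        return "escalate"
    return "auto_send" if draft.confidence >= threshold else "escalate"


drafts = [
    DraftReply("T-1", "Your order shipped yesterday.", 0.97),
    DraftReply("T-2", "You may qualify for a refund.", 0.62),
]
decisions = {d.ticket_id: route(d) for d in drafts}
print(decisions)
```

The gate fails closed: anything below the threshold, or with an unusable score, goes to a human agent rather than to the customer.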
Autonomous send, where AI fires customer responses without waiting for a human to press "send", remains firmly experimental even though major vendors have shipped it as generally available (GA). The concept is narrower than a fully autonomous chatbot: it augments existing agent workflows by removing the manual approval step for high-confidence replies, escalating only edge cases to humans. Zendesk and Salesforce ship the capability, and deployments have reached scale (Klarna: 2.3M conversations; 1-800-Accountant: 70% autonomous resolution during peak tax season). Yet critical failures dominate recent evidence: Klarna rehired humans after customer satisfaction (CSAT) fell, Commonwealth Bank reversed layoffs following a tribunal challenge, DPD disabled its system after an embarrassing profanity incident, and Air Canada faced legal liability for a promise its AI made up. The binding constraint remains reliability. One financial institution did deploy LLM-based autonomous agents, backed by infrastructure investment in validation and monitoring, but peer-reviewed research covering the same period finds zero gains in agent dependability from model improvements, and compounding errors mean 85% per-step reliability yields only about 20% end-to-end success on a 10-step workflow. Once an autonomous message is sent, it cannot be recalled. Governance gaps and systematic failure patterns keep this practice experimental.
Market momentum is broad, but real deployment is narrow. April 2026 data shows a market size of $15.12B and 91% of customer service leaders under pressure to implement, yet 79% of consumers still prefer human contact. Zendesk made autonomous action execution generally available in March 2026: pre-approved refunds, status updates, and replies execute without per-interaction approval. Salesforce Agentforce and Klarna have demonstrated autonomous deployment at significant scale. Klarna processed 2.3M conversations (two-thirds of all support chats), with average handle time (AHT) dropping from 11 minutes to under two. 1-800-Accountant achieved 70% autonomous resolution during peak tax season using Salesforce Agentforce. A named financial institution deployed LLM-based autonomous customer service agents with infrastructure investment in validation, monitoring, and continuous evaluation. Together, these deployments show that the capability itself is mature.
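The idea of pre-approved actions that run without per-interaction approval can be sketched as a small policy table. This is a hypothetical illustration, not Zendesk's or Salesforce's actual API: the `PRE_APPROVED` table, the `Action` type, and the $50 refund cap are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical policy table: action type -> maximum amount allowed without approval.
# None means the action carries no monetary amount.
PRE_APPROVED = {
    "status_update": None,
    "reply": None,
    "refund": 50.00,  # illustrative cap, not a vendor default
}


@dataclass
class Action:
    kind: str
    amount: float = 0.0


def needs_human_approval(action: Action) -> bool:
    """True when the action falls outside the pre-approved boundary."""
    if action.kind not in PRE_APPROVED:
        return True  # unknown action types are never executed automatically
    cap = PRE_APPROVED[action.kind]
    if cap is None:
        return False
    # Negative or non-numeric amounts also fall outside the boundary.
    return not (0 <= action.amount <= cap)
```

Under these assumptions a $20 refund would execute automatically, while a $500 refund or an unlisted action such as an account closure would wait for a human.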
Yet reliability barriers block adoption. April 2026 research shows that 95% of enterprise AI pilots deliver no measurable P&L impact and that 42% of AI initiatives are abandoned before production. Most critically, autonomous send has failed publicly and at scale. Klarna rehired human agents after its autonomous deployment caused customer satisfaction to drop; its CEO admitted, "We went too far." Commonwealth Bank of Australia reversed AI-driven layoffs after a tribunal challenge and public backlash. DPD disabled its autonomous system within hours after it swore at a customer and called the company "the worst delivery firm in the world"; 1.3M people saw the screenshots. Air Canada faced legal liability when its autonomous agent promised a bereavement fare policy that didn't exist. The infrastructure gap is fundamental: failures compound, so 85% per-step reliability yields only about 20% end-to-end success on a 10-step task. Once an autonomous message is sent, it cannot be unsent. Governance gaps around escalation design, approval boundaries, and policy encoding remain the core adoption barrier.
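The compound-failure figure follows from simple arithmetic. The sketch below assumes each step succeeds or fails independently, which is a simplification (real workflow steps can be correlated), and the function names are illustrative.

```python
def end_to_end_success(per_step: float, steps: int) -> float:
    """Probability that every one of `steps` independent steps succeeds."""
    return per_step ** steps


def required_per_step(target: float, steps: int) -> float:
    """Per-step reliability needed to reach `target` end-to-end success."""
    return target ** (1.0 / steps)


print(f"{end_to_end_success(0.85, 10):.3f}")  # 0.197
print(f"{required_per_step(0.95, 10):.4f}")   # 0.9949
```

Read the other way round, a 10-step workflow that should succeed 95% of the time end to end would need roughly 99.5% reliability at every step under the same independence assumption.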
— Salesforce Agentforce resolved 84% of cases autonomously across 380,000+ support interactions in Q1 2026, demonstrating production-scale autonomous agent maturity in customer service.
— End-to-end autonomous resolution executes full workflows including identity validation, policy checking, refunds, and system updates without human handoff; production deployment shows 4-minute resolution (vs 48 minutes), 98% SLA compliance, 85% autonomous closure rate.
— Critical barrier evidence: AI workflows outnumber autonomous agents 5:1 in regulated markets; 78% of European enterprises cite EU AI Act compliance as primary barrier to autonomous agent adoption; workflows deliver 3.4x faster time-to-value and 47% lower implementation costs.
— Analysis of 600+ deployments shows a bimodal ROI distribution: 12% of enterprise agentic AI deployments clear 300%+ ROI, while 88% operate at or below break-even on fully loaded cost; deployment discipline rather than vendor choice determines outcomes.
— Multiple autonomous service AI deployments demonstrate production maturity: Sprout Social resolves 80% of new-hire tickets autonomously; European energy company cut L1-L2 escalations by 35%; Domino's achieved 75% risk reduction with unified system of work enabling autonomous intervention authority.
— Salesforce customers using Agentforce report automating 70% of tier-1 customer support queries end-to-end. Primary failure mode identified as organizational (poor data, unclear accountability) rather than technical.
— Production case: Tier-1 auto-resolution agents autonomously resolve 40-65% of support tickets, with documented monthly savings of $8,800-$14,300 per 1,000 tickets handled each month.
— Autonomous send in customer support achieves 40-70% tier-1 ticket resolution without human involvement; IDC × Microsoft 2026 study shows 171% average first-year ROI, with top-quartile deployments exceeding 300%.