The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
AI that sends responses to customers automatically, with human agents involved only for escalations and edge cases. Includes confidence-gated auto-send and human escalation routing; distinct from autonomous chatbots, which handle the full interaction rather than augmenting an agent workflow.
Autonomous send, meaning AI that fires customer responses without waiting for a human to press "send", remains firmly experimental despite shipping as generally available (GA) at major vendors. The concept is narrower than a fully autonomous chatbot: it augments existing agent workflows by removing the manual approval step for high-confidence replies and escalating only edge cases to humans. Confidence-gated execution architectures (auto-send thresholds of 85-92%, drafting for agent review at 65-80%, escalation below 65%) are now standard in production systems. Yet independent May 2026 research reveals the core tension: while vendors report 70-84% autonomous resolution, citing 9,000+ customers (HubSpot) and 35,000+ deployments globally (Text), only 24% of consumers in production environments actually experienced full resolution without human intervention. The binding constraint remains reliability and trust. Critical failures continue: Klarna rehired humans after a CSAT collapse, Commonwealth Bank reversed layoffs following a tribunal challenge, DPD disabled its system after it swore at a customer, and Air Canada faced legal liability for policy its chatbot fabricated. Practitioner consensus (MoClaw 2026) emphasizes mandatory human gating: "Customer-facing send without approval... Always gate." Once an autonomous message is sent, it cannot be recalled. The gap between capability (70%+ in vendor metrics) and actual reliability (24% in consumer experience) signals that the practice remains at an early stage of deployment despite product maturity.
Vendor adoption is demonstrable, but consumer reality lags vendor claims. May 2026 evidence shows HubSpot Customer Agent autonomously resolving 70% of conversations across 9,000+ customers (up from 20% over 12 months), Text AI deployed at 35,000+ companies with 74% autonomous resolution, and Stratco Australia doubling its previous human support volumes by achieving 80% autonomous query resolution. Go Autonomous documents autonomous sending of order confirmations in production across European manufacturers, with a reported 43% capacity release. These are genuine at-scale deployments using confidence-gated execution (auto-send thresholds of 85-92%, drafting at 65-80%, escalation below 65%).
Yet independent research reveals the maturity gap. Ada/NewtonX's May 2026 survey of actual consumer experiences found only 24% reported full autonomous resolution without human intervention—a critical reality check against vendor claims of 70-80% autonomous send rates. Practitioner consensus emphasizes mandatory human review: MoClaw's May 2026 assessment states unambiguously that "customer-facing send without human approval" is a failure pattern and "always gate" is the safe model for customer communication. The trust gap persists: only 29% of enterprises allow unsupervised agent actions despite 88% planning increased budgets (ace8 mid-2026 assessment). Market adoption is wide (35,000+ Text deployments, 9,000+ HubSpot customers) but production readiness is narrow—success depends on deployment discipline (infrastructure validation, confidence thresholds, escalation governance) rather than vendor choice. Regulated markets show stronger hesitation: AI workflows outnumber autonomous agents 5:1, with 78% citing EU AI Act compliance as the primary barrier.
The metric-inflation problem is now explicitly recognized: Fini Labs' May 2026 research found that 71% of support leaders cite "inflated automation metrics" as their top blocker to trusting AI vendors. Vendor self-report bias is real: Decagon claims 80% deflection, while Zendesk's enterprise-wide median is 41.2%. Governance failures are widespread: Sinch's May 2026 survey of 2,500+ customer service leaders found that 62% have autonomous AI agents in production, yet 74% of those deployments were rolled back or disabled because of governance failures (31% cited customer data exposure, 22% hallucinations, 16% lack of auditability). Staged rollout approaches show promise (Salesforce survey: 70% report measurable value within 60 days), but scaling remains difficult: a realistic ROI case assumes a 3-month payback with a 20-35% year-one cost reduction, far below vendor claims of 60-80%.
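The confidence-gated execution pattern described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the names `Action`, `GateConfig`, and `route` are hypothetical, the default send threshold of 0.90 is simply one value picked from the quoted 85-92% range, and sending every score between the draft floor and the send threshold to an agent as a draft is an assumption, since the quoted bands leave the 80-85% region unspecified.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_SEND = "auto_send"  # reply goes out with no agent approval
    DRAFT = "draft"          # reply is queued for an agent to approve or edit
    ESCALATE = "escalate"    # conversation is handed to a human agent


@dataclass(frozen=True)
class GateConfig:
    # Assumed defaults, chosen from within the ranges quoted in the text.
    send_threshold: float = 0.90  # quoted auto-send range: 0.85-0.92
    draft_floor: float = 0.65     # below this score, escalate

    def __post_init__(self) -> None:
        if not 0.0 <= self.draft_floor < self.send_threshold <= 1.0:
            raise ValueError("need 0 <= draft_floor < send_threshold <= 1")


def route(confidence: float, config: GateConfig = GateConfig()) -> Action:
    """Map a model confidence score in [0, 1] to a handling action."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= config.send_threshold:
        return Action.AUTO_SEND
    if confidence >= config.draft_floor:
        # Assumption: anything short of the send threshold gets human review.
        return Action.DRAFT
    return Action.ESCALATE
```

Under these assumptions, `route(0.95)` auto-sends, `route(0.80)` produces a draft for agent review, and `route(0.50)` escalates. Teams that follow the "always gate" advice could set `send_threshold=1.0`, so that only a score of exactly 1.0 would ever bypass review.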
— Microsoft Dynamics 365 released production-ready Autonomous Email Resolution, which performs intent identification, response generation, autonomous sending, and case creation without agent review, confirming that autonomous send is moving into mainstream enterprise platforms.
— Sinch report: 74% of autonomous AI agent deployments reversed after go-live due to governance failures; demonstrates that autonomous agents including autonomous send face real barriers and failures in production.
— Independent review of Decagon platform serving Hertz, Notion, Rippling, Duolingo, Faire, ClassPass, Noom, Substack, Curology with 80% deflection rates and 90%+ autonomous resolution, documenting production-scale autonomous send adoption.
— U.S. CAN-SPAM compliance framework: penalties of up to $53,088 per individual email (2026 adjusted rate); autonomous sending must comply with the header accuracy and unsubscribe requirements and honor opt-outs within 10 business days.
— Newo.ai reports 99.6% Lead Success Score across 100,000 analyzed calls, confirming autonomous agents reliably execute core business tasks without revenue loss; deployment across 22 industries, 30 countries, 90 languages.
— Fini Labs' comparative analysis of guardrails, including the Air Canada tribunal case in which an AI chatbot invented a policy; 'when an AI agent answers with confident wrong response, the business owns that answer including refund and compliance exposure.'
— eesel expert guide with Gridwise case study: 73% tier-1 resolution in first month with confidence-based routing (auto-send for routine, escalate for refunds/compliance); directly documents agent-assist autonomous send in production.
— Azeon synthesis: 74% rollback rate due to accuracy (hallucinations) and privacy/security; shift from chatbots to agentic agents documented but governance spending exceeds AI development (75-76% trust/security vs 63% technology investment).
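The routing rule in the Gridwise entry above (auto-send for routine queries, escalate for refunds and compliance) amounts to a topic override layered on top of a confidence check. A minimal sketch follows, assuming a hypothetical upstream classifier that labels each message with a topic string; the function name, the topic labels, and the 0.90 threshold are illustrative assumptions rather than details from the case study.

```python
# Topics that always go to a human, per the refunds/compliance rule above.
# The exact label strings are assumptions for this sketch.
SENSITIVE_TOPICS = frozenset({"refund", "compliance"})


def route_by_topic(topic: str, confidence: float,
                   send_threshold: float = 0.90) -> str:
    """Return "auto_send" or "escalate" for one drafted reply.

    Sensitive topics are escalated regardless of model confidence;
    routine topics are auto-sent only at or above the threshold.
    """
    if topic.strip().lower() in SENSITIVE_TOPICS:
        return "escalate"
    if confidence >= send_threshold:
        return "auto_send"
    return "escalate"


if __name__ == "__main__":
    for topic, score in [("shipping", 0.97), ("refund", 0.99), ("shipping", 0.40)]:
        print(topic, score, route_by_topic(topic, score))
```

Checking the topic before the score is deliberate: a highly confident answer about a refund is exactly the case where a wrong reply creates liability, so confidence alone should not unlock sending.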