The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.
AI that generates and maintains operational runbooks and produces post-incident review reports. Includes automated playbook creation and blameless post-mortem drafting. Distinct from incident response automation, which executes actions rather than documenting them.
AI-generated runbooks and post-incident reports have reached vendor maturity and demonstrated quantified value in select deployments, yet organisational adoption remains constrained by gaps in governance and operational discipline, not by technical capability. The technology solves a genuine pain point: runbooks decay as systems change (static documentation has a half-life measured in weeks in rapidly deploying environments), and postmortems are routinely delayed or incomplete because documentation competes with recovery work for engineers' attention. Named deployments show measurable returns: GreatCTO achieved a 94.1% median reduction in detection time across 47 P0 incidents via persisted incident memory; incident.io customers report a 37% MTTR reduction and $29,700 in annual savings; SolarWinds measured a 17.8% cut in incident resolution time across 2,000+ ITSM deployments; Cutover platforms demonstrate a 60% MTTR reduction and 50% fewer disruptions via AI-assisted runbook execution, plus a further 25–40% MTTR reduction via autonomous execution with continuous learning; and financial services deployments (Danske Bank) report 300% resilience efficiency gains. Proven deployment patterns are now documented: the AI Advisory Board validates runbooks as a high-confidence AI use case with an emerging standard workflow (AI drafts from alert definitions, on-call engineers refine within the first week), and documents a SaaS firm scaling runbook coverage from 40% to 85% in 14 days. The vendor ecosystem has solidified: Arvo AI's May 2026 neutral taxonomy defines postmortem generation as a distinct, mature capability axis across 15+ vendors (BMC HelixGPT, PagerDuty Advance, incident.io, Rootly, ServiceNow, Datadog Bits AI, AlertOps).
June 2026 updates show AlertOps Chronicle achieving 80% time savings in automated postmortem drafting; PagerDuty Scribe Agent reaching GA with real-time transcription and enriched postmortem summaries; Datadog embedding postmortem lifecycle management (Draft/In Review/Completed) as tier-1 platform infrastructure; and Lightrun AI SRE generating evidence-based postmortems in regulated deployments (SOC 2/HIPAA). Yet the adoption gap persists. Hallucination has become the critical blocker: Seekr research documents production hallucination rates of 33–86% in agentic multi-step workflows, against sub-1% rates on benchmarks, directly contradicting vendor marketing. In one high-profile failure, KPMG, a Big Four consulting firm, withdrew an AI-generated report in June 2026 after verification identified 40 of 45 citations as hallucinated or misleading and contradicted by the organisations named (UBS, NHS, Swiss Railways, Transport for London), which is evidence that verification frameworks are essential for professional-grade documentation. June 2026 data shows that organisational barriers dominate: 58% of enterprise CTOs name governance as the #1 blocker on AI agent projects, and the binding constraints are accountability structures (a named owner, escalation paths, audit trails, change management discipline), verification checkpoints, and accuracy assurance rather than model capability. Accuracy risks remain acute: 2026 hallucination benchmarks show 3.3–60% error rates, and June's professional documentation failures (Sullivan & Cromwell court filings, the KPMG report, Deloitte audit reports, EY cybersecurity analyses) signal the real cost of absent governance checkpoints. AI-generated incident reports also face evidentiary, privacy, and compliance gaps.
Operational documentation for AI systems reveals new structural gaps: runbooks written by engineers with broad access cannot be executed by on-call SREs with restricted credentials, and postmortems must now document agentic failure modes (silent degradation, policy drift, tool ambiguity) that deterministic systems do not exhibit. The practice is vendor-ready and proven in select deployments, but the organisational prerequisites are steep, and most teams have not established them: on-call discipline, runbook testing procedures, governance frameworks covering prompt and model changes, evidence capture at the agentic execution layer, verification checkpoints for AI-generated content, and a mature incident-reporting culture.
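The credential mismatch described above (runbooks authored with broad access, executed by on-call staff with restricted access) can be caught before an incident by checking each step against the permissions of the persona expected to run it. The following is a minimal sketch only: the `RunbookStep` model, the `find_unexecutable_steps` helper, and the permission names are all hypothetical and are not taken from any vendor's product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RunbookStep:
    """One runbook step and the permissions it needs (hypothetical model)."""
    description: str
    required_permissions: frozenset[str]


def find_unexecutable_steps(steps, persona_permissions):
    """Return (step, missing_permissions) pairs the persona cannot run."""
    problems = []
    for step in steps:
        # Set difference: permissions the step needs but the persona lacks.
        missing = step.required_permissions - persona_permissions
        if missing:
            problems.append((step, missing))
    return problems


# Illustrative data only: these permission names are invented for the sketch.
runbook = [
    RunbookStep("Check pod status", frozenset({"k8s:read"})),
    RunbookStep("Restart deployment", frozenset({"k8s:read", "k8s:rollout"})),
    RunbookStep("Rotate database credentials", frozenset({"vault:write"})),
]
on_call_sre = frozenset({"k8s:read", "k8s:rollout"})

for step, missing in find_unexecutable_steps(runbook, on_call_sre):
    print(f"BLOCKED: {step.description} (missing: {', '.join(sorted(missing))})")
```

Run as a review-time or CI check, a non-empty result would flag a runbook as unexecutable by its intended audience before it is published, rather than during an incident.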
The vendor ecosystem has crystallised into mature, GA-ready offerings. BMC HelixGPT (26.1), PagerDuty Advance, ServiceNow automated post-incident review agents, Rootly, Datadog Bits AI, incident.io, and AlertOps all ship production-ready features for runbook automation and postmortem generation. June 2026 updates: PagerDuty Scribe Agent is now GA with real-time Zoom/Teams transcription and enriched postmortem summaries; Datadog announced postmortem lifecycle management (Draft/In Review/Completed status) at DASH 2026 as embedded tier-1 infrastructure; Lightrun AI SRE generates evidence-based postmortems with validated reasoning chains in SOC 2/HIPAA deployments; AlertOps Chronicle auto-drafts complete incident reviews from alert data with 80% time savings; and an ecosystem maturity signal from incident.io shows 9+ vendors shipping postmortem generation (Rootly, incident.io, Datadog, Opsrift, Arvo AI, ilert, DrDroid, PagerDuty, Atlassian). Real deployments deliver quantified returns in financial services and IT operations: Danske Bank achieved 300% resilience efficiency gains in runbook automation; SolarWinds measured a 17.8% reduction in incident resolution time across 2,000+ ITSM systems; incident.io customers report a 37% MTTR reduction and $29,700 in annual savings; Cutover platforms demonstrate a 60% MTTR reduction and 50% fewer disruptions via AI-assisted runbook execution with human-in-the-loop governance, and a further 25–40% MTTR reduction via continuous learning from incidents. A proven deployment pattern is emerging: the AI Advisory Board documents runbooks as a high-confidence AI use case in which AI drafts from alert definitions and incident history and on-call engineers edit within the first week, with one SaaS firm scaling runbook coverage from 40% to 85% in 14 days. Real-world deployments also reveal acute failure modes: Runcycles documented 20+ AI agent incidents with costs ranging from $1.40 to $12,400 in direct spend and up to $50K+ in business impact, exactly the failures runbooks should prevent.
Hallucination risk has intensified as the critical blocker: Seekr research documents production hallucination rates of 33–86% in agentic, multi-step reasoning workflows, against sub-1% rates on benchmarks, directly contradicting vendor marketing claims. In one high-profile failure, KPMG, a Big Four consulting firm, withdrew an AI-generated report in June 2026 after verification identified 40 of 45 citations as hallucinated or misleading and contradicted by the organisations named (UBS, NHS, Swiss Railways, Transport for London), signalling that verification frameworks and governance checkpoints are essential rather than optional for enterprise-scale documentation generation. Structural operational gaps are surfacing: AI runbooks written by engineers with broad access cannot be executed by on-call SREs who lack those credentials, and runbooks for agentic systems must now capture decision artifacts (workflow IDs, policy gate results, tool-call traces, side-effect ledgers) that deterministic systems do not produce. Governance frameworks are crystallising around tiered autonomy models: confidence below 0.60 requires manual selection, 0.60–0.84 requires human approval, and 0.85 or above executes autonomously, with NIST AI Risk Management alignment and 95% accuracy targets. Operator discipline remains weak: April 2026 evidence shows operational toil increased 30% despite AI investment because teams deployed agents without runbook discipline, and 69% of AI-powered decisions still require human verification, creating a "messy middle" where the automation layer was added but the manual layer was not removed. Postmortem quality is systemically broken: most AI incident postmortems miss root causes by focusing on model hallucination when the real cause is credential misconfiguration, a systematic failure pattern in how teams analyse incidents. Large-firm AI adoption in IT operations has stalled at 12%, with only 14% of enterprises successfully scaling pilots to production. The binding constraints are organisational.
Incident-reporting systems remain underused because of blame culture and reporting friction, which starves AI models of training data. Most AI deployments lack the telemetry infrastructure (model versions, prompt logs, retrieval context, embedding versions) needed for effective forensic postmortems. Governance frameworks (terminology control, human review workflows, audit trails, verification checkpoints) are emerging as essential: without them, AI-generated reports cannot be audited or defended when disputes occur. Runbook discipline requires operational governance: access federation via OpenTelemetry, runbook authoring discipline that enforces execution-persona validation, and agentic-specific controls (blast radius definition, autonomy classification, rollback procedures). Successful deployments cluster where blameless postmortem cultures and strong incident-data hygiene already exist; the AI amplifies mature practices rather than compensating for absent ones.
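The tiered autonomy model described above maps a confidence score to one of three handling modes. A minimal sketch of that routing follows, assuming the score is a float in the range 0 to 1. The names `Action` and `route_by_confidence` are hypothetical, and because the source leaves values between 0.84 and 0.85 unassigned, this sketch assumes they fall into the human-approval tier.

```python
from enum import Enum


class Action(Enum):
    MANUAL_SELECTION = "manual_selection"  # an engineer picks the runbook
    HUMAN_APPROVAL = "human_approval"      # AI proposes, a human approves
    AUTONOMOUS = "autonomous"              # AI executes without sign-off


# Thresholds as stated in the text; everything else here is an assumption.
APPROVAL_THRESHOLD = 0.60
AUTONOMY_THRESHOLD = 0.85


def route_by_confidence(confidence: float) -> Action:
    """Map a runbook-selection confidence score to a handling tier."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence!r}")
    if confidence >= AUTONOMY_THRESHOLD:
        return Action.AUTONOMOUS
    if confidence >= APPROVAL_THRESHOLD:
        return Action.HUMAN_APPROVAL
    return Action.MANUAL_SELECTION


for score in (0.42, 0.60, 0.84, 0.85):
    print(f"{score:.2f} -> {route_by_confidence(score).value}")
```

Rejecting out-of-range scores outright, rather than clamping them, is a deliberate choice in this sketch: a malformed score is more safely treated as an error than silently routed to a tier.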
— Cutover demonstrates 25–40% MTTR reduction via autonomous runbook execution with continuous learning from incidents, real-time command center dashboards, and human validation checkpoints for high-risk changes.
— Critical analysis of production hallucination rates in agentic workflows: 33–86% on reasoning tasks vs. sub-1% benchmarks. Documents gap between marketing claims and real operational deployment risk in multi-step documentation generation.
— Runbooks categorized as high-confidence AI use case with proven deployment pattern: AI drafts from alert definitions and incident history; on-call engineers edit within first week. SaaS case study: AI runbook generation increased coverage from 40% to 85% in 14 days.
— Big Four firm withdrew its AI-generated report after verification identified 40 of 45 citations as hallucinated or misleading, contradicted by named organizations. Critical negative evidence: professional documentation generation at enterprise scale lacks adequate verification frameworks.
— Tiered autonomy framework (confidence thresholds 0.60–0.84 require human approval; ≥0.85 autonomous) with NIST AI Risk Management governance; SentienGuard achieves 95% runbook selection accuracy with safety practices including separate AI decision/execution layers.
— Cutover deployment achieving 60% MTTR reduction and 50% fewer disruptions via AI-assisted runbook execution, real-time audit trails, and post-incident learning with human-in-the-loop governance checkpoints.
— AlertOps Chronicle GA feature auto-drafts complete incident reviews from alert data with 80% time savings, automated timeline assembly, and pattern surfacing across recurring incidents.
— Production platform generating evidence-based postmortems with timeline, RCA, and resolution strategies from runtime evidence; validated reasoning chains against live behavior with MTTR improvement claims and SOC 2/HIPAA alignment in regulated deployments.