The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
AI that monitors service level indicators and predicts SLA breaches before they occur, enabling proactive intervention. Includes predictive SLA risk scoring and early warning systems; distinct from APM, which monitors application health rather than business-level commitments.
Predicting SLA breaches before they happen has transitioned from vendor feature to operationalised capability in large enterprises and SaaS platforms, yet remains inaccessible to mainstream IT operations. The vanguard -- LINE, United Airlines, Agos Ducato, BT Digital -- run sophisticated agentic and ML-based SLA prediction workflows integrated with ITSM platforms, achieving measurable breach prevention and MTTR improvements. New Relic, Dynatrace, and emerging platforms like Lyzr and StackOne now ship GA breach prediction as core observability features. However, mainstream adoption faces a persistent barrier: the gap between platform capability (prediction algorithms are proven) and organisational readiness (data quality, SRE maturity, integration discipline, and tool consolidation) continues to widen. Industry data shows 60% of MSPs have formalised SLA management programs and 70% of IT professionals prioritise SLO-based monitoring, signalling ecosystem maturity and mainstream awareness; yet implementation complexity and integration friction remain the binding constraints. For most mid-market and smaller teams, SLA breach prediction remains a purchased but undeployed vendor feature.
The vanguard is producing measurable operational wins at scale. Dynatrace-ServiceNow integrations have reached GA for autonomous incident workflows; Agos Ducato (Credit Agricole) achieved a 30-point lift in critical transaction success (65%→95%) and a 30-second latency reduction. United Airlines operates ~800 Dynatrace-monitored applications and has documented top on-time performance. New Relic shipped SRE Agent (full incident lifecycle automation) and reported 25% faster incident resolution, 80% higher deployment frequency, and 27% less alert noise among AI-enabled operations teams. May 2026 added several named deployments: Air France-KLM rolled out Dynatrace enterprise-wide (98M annual passengers, 564-aircraft fleet), shifting from reactive to proactive SLA-aware monitoring; a large telecom operator (25M subscribers) deployed ML-based SLA breach prediction, achieving a 40% breach reduction and $3.5M in annual penalty savings; and Dynatrace released Intelligence to GA, billed as the first agentic operations system combining deterministic SLO insights with autonomous remediation. Vendor observability platforms delivered concrete SLA outcomes: TD Bank cut transaction failure rates from 0.16% to 0.06% and reduced monitoring costs by 45%; BNZ achieved a 58% increase in high-quality releases and a 94% reduction in major incidents; WeLab Bank reduced root-cause identification time from hours to minutes. New agentic breach prediction platforms also emerged: StackOne deployed AI agents that predict breach probability by monitoring ticket burn rate and queue depth; Lyzr released 'Breach Predict' agents, with customers reporting a 30% reduction in critical incidents; and LINE (a Japanese platform) deployed SLI/SLO-centric observability with automated breach detection tied to user-facing SLA targets.
Peer-reviewed research (May 2026, arXiv) demonstrated transformer-based breach prediction achieving 30-minute advance warning for data center colocation SLAs using per-customer multi-head attention models. Market analysis shows SLA tracking system market growing at 17.1% CAGR to $4.3B by 2030, with automated monitoring, predictive analytics, and workflow automation as standard vendor capabilities.
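The "ticket burn rate and queue depth" signal mentioned above can be illustrated with a simple queue projection. The sketch below is an assumption-laden illustration, not StackOne's or any vendor's method: the function names, the FIFO assumption, and the 0.7 "at risk" cutoff are all hypothetical.

```python
def projected_wait_hours(tickets_ahead: int, resolved_per_hour: float) -> float:
    """Estimate the wait for a ticket, assuming a FIFO queue drained at a steady rate."""
    if resolved_per_hour <= 0:
        return float("inf")  # nothing is being resolved, so the wait is unbounded
    return tickets_ahead / resolved_per_hour


def breach_risk(tickets_ahead: int, resolved_per_hour: float,
                hours_to_deadline: float, at_risk_ratio: float = 0.7) -> str:
    """Classify a ticket by comparing projected wait with time left on its SLA.

    at_risk_ratio is an illustrative cutoff, not an industry standard.
    """
    if hours_to_deadline <= 0:
        return "breached"
    ratio = projected_wait_hours(tickets_ahead, resolved_per_hour) / hours_to_deadline
    if ratio >= 1.0:
        return "likely_breach"
    if ratio >= at_risk_ratio:
        return "at_risk"
    return "on_track"
```

A production model would likely also account for priorities, staffing calendars, and arrival rates; the point here is only that queue depth divided by burn rate gives a forward-looking signal that a static "time remaining" threshold does not.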
June 2026 scan evidence confirms platform maturity alongside emerging agentic innovation. New Relic production deployments show 33-43% MTTR reduction and $95-220k in annual savings; the Dynatrace Terraform SLO provider (GA) enables SLA-as-code; Arcturus multi-org deployments demonstrate 94% SLO compliance and an 87% MTTR improvement (to 11 minutes). Virtana launched GA Agentic SLA Management (June 2026), establishing AI-native SLA orchestration as an emerging category. Product enhancements improved detection accuracy: New Relic released maintenance window support, which eliminates false violations from planned downtime, and FACET-based SLI aggregation. Emerging platforms (AINE, Sparkco) deliver breach predictions 6-12 hours in advance. However, practitioner surveys (Neubird, 1,000+ SRE professionals) document critical gaps: 78% of teams experienced missed detections and 44% suffered alert fatigue incidents, while deployment barriers (infrastructure hygiene, SRE maturity, integration complexity) remain the primary blocker. Consulting analysis (Scalence, GB Advisors) reports that a 40% breach reduction is possible when predictive analytics, anomaly detection, and dynamic escalation are combined, and practical guidance (Zazz, Snoh AI) establishes industry benchmarks (MTTD <15 min, MTTR <1 hour) and risk-scoring frameworks. Even so, organisational readiness, not platform capability, remains the limiting factor.
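To make the maintenance-window point concrete, here is a minimal sketch of an availability SLI that drops samples taken during planned downtime before computing the good/total ratio. It is a generic illustration and not New Relic's implementation; the `Sample` and `Window` types and the `sli` function are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional, Sequence


class Sample(NamedTuple):
    ts: datetime   # when the probe or request was observed
    good: bool     # True if it met the SLI criterion


class Window(NamedTuple):
    start: datetime  # inclusive
    end: datetime    # exclusive


def sli(samples: Sequence[Sample],
        maintenance: Sequence[Window] = ()) -> Optional[float]:
    """Return the good/total ratio, ignoring samples inside maintenance windows."""
    counted = [
        s for s in samples
        if not any(w.start <= s.ts < w.end for w in maintenance)
    ]
    if not counted:
        return None  # no eligible samples, so the SLI is undefined
    return sum(s.good for s in counted) / len(counted)
```

With ten one-minute samples of which two fail during a planned two-minute window, the unadjusted SLI is 0.8 while the window-aware SLI is 1.0, which is the kind of false violation the feature is described as removing.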
This activity masks a widening bifurcation. SaaS observability vendors (New Relic, Dynatrace, Chronosphere, and emerging agentic platforms including Virtana) have achieved production breach prediction with enterprise deployments, while mainstream ITSM platforms (ServiceNow on-premise, Jira Service Management) retain calculation accuracy gaps, automation reliability issues, and class-imbalance problems that block prediction. Industry adoption metrics are maturing: 60% of MSPs now operate formal Customer Success programs with structured SLA management; 70% of IT professionals prioritise SLO-based monitoring; 52-74% of tech companies and telcos have deployed AI monitoring capabilities; GitLab, a leading-edge tech company, has publicly documented error budgets as an operational release-gating mechanism; and the SLA tracking ecosystem (10+ vendors: Fivenines, Nobl9, Datadog, Checkly, Uptime.com, Better Stack, Site24x7, etc.) reached USD 2.29B in 2026 and is projected to reach USD 4.3B by 2030 (17.1% CAGR). Yet these metrics reflect widespread threshold-alerting adoption, not breach prediction. New technical complexity is also surfacing around SLA monitoring for AI-native infrastructure: traditional SLA metrics fail for probabilistic AI systems; agentic workflows require observability beyond infrastructure (state timing, agent context, evidence artifacts); standard anomaly detection needs tuning for real-world deployments (contamination thresholds, feature engineering for time-of-day effects) to avoid false positives; and AI inference systems in shared-tenant cloud environments face SLA visibility gaps that standard monitoring cannot surface, because multi-tenant contention remains invisible to tenant-level observability.
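The time-of-day tuning issue can be sketched with a per-hour baseline. The example below swaps in a simple robust z-score (median and median absolute deviation per hour of day) rather than the contamination-parameterised detectors the text alludes to; the function names are hypothetical, and the 3.5 cutoff is a commonly used default treated here as a tunable assumption.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def hourly_baseline(history: Iterable[Tuple[int, float]]) -> Dict[int, Tuple[float, float]]:
    """Build {hour_of_day: (median, MAD)} from (hour_of_day, value) observations."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    baseline = {}
    for hour, values in by_hour.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        baseline[hour] = (med, mad)
    return baseline


def is_anomalous(hour: int, value: float,
                 baseline: Dict[int, Tuple[float, float]],
                 threshold: float = 3.5) -> bool:
    """Flag a value whose robust z-score, relative to its own hour, exceeds threshold."""
    med, mad = baseline[hour]  # raises KeyError for hours with no history
    if mad == 0:
        return value != med    # no spread observed, so any deviation stands out
    # 1.4826 scales MAD to be comparable with a standard deviation for normal data.
    return abs(value - med) / (1.4826 * mad) > threshold
```

A latency of 300 ms would be flagged at 03:00 if night-time traffic normally sits near 100 ms, but not at 14:00 if the afternoon norm is around 300 ms. A single global threshold cannot make that distinction, which is one source of the false positives described above.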
The barrier remains organisational: McKinsey data shows 6% of organisations achieve meaningful AI ROI; ServiceNow Predictive Intelligence documentation lists 20+ implementation failure modes (data quality, label corruption); and Broadcom surveys find 98% of IT teams cite automation/integration issues, not inadequate tooling, as the root cause of SLA breaches. Organisational readiness gaps (data quality discipline, SRE maturity, integration architecture, business alignment) constrain deployment of proven prediction capabilities across the broader market, even as platform vendors accelerate agentic AI shipping and market growth continues (13.7% CAGR, from USD 1.38B in 2024 to a projected USD 4.21B by 2033).
Independent practitioners documented critical blockers to autonomous deployment: infrastructure hygiene (data quality, staging/production parity) must precede agentic automation; organisations cannot delegate SLA breach prevention to AI agents without first achieving operational maturity (clean pipelines, unified tooling, SRE discipline); and 80-90% of AI agent projects fail in production because of unrealistic assumptions about infrastructure readiness, not algorithm limitations. Operational SLA monitoring at scale (Levy Fleets, TD Bank) demonstrates that deployed systems require deterministic breach detection with 15-minute cron cycles, real-time analytics, and clear escalation paths. Yet practitioners document that the shift from reactive alerting to predictive breach prevention requires forward-looking multi-signal frameworks (latency drift, error budget burn, queue depth, dependency instability, resource saturation, traffic pattern shifts) that most organisations lack the operational maturity to instrument and maintain. This fundamental asymmetry, in which vendor platform maturity exceeds organisational deployment readiness, is the defining constraint preventing SLA breach prediction from crossing from leading-edge practice (SaaS vendors, Fortune 500 early adopters) into mainstream operations, with further complexity added by AI-native systems that require fundamentally different observability models.
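Of the forward-looking signals listed above, error budget burn is the easiest to show in code. The sketch below uses the standard definition (observed error ratio divided by the allowed error ratio) and a two-window paging rule in the style of Google's SRE Workbook; the function names and the 30-day (720-hour) period default are assumptions for illustration.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' errors are being consumed."""
    budget = 1.0 - slo_target          # e.g. 0.001 for a 99.9% SLO
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def hours_to_exhaustion(rate: float, budget_remaining: float = 1.0,
                        period_hours: float = 720.0) -> float:
    """Hours until the remaining budget fraction is spent at the current burn rate."""
    if rate <= 0:
        return float("inf")
    return budget_remaining * period_hours / rate


def should_page(long_window_errors: float, short_window_errors: float,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Page only if both a long and a short window burn faster than threshold.

    14.4 corresponds to spending 2% of a 30-day budget in one hour; requiring the
    short window as well stops the alert firing after the problem has cleared.
    """
    return (burn_rate(long_window_errors, slo_target) >= threshold
            and burn_rate(short_window_errors, slo_target) >= threshold)
```

For a 99.9% SLO, a sustained 0.2% error ratio is a burn rate of about 2, which would exhaust a full 30-day budget in roughly 360 hours. That is a prediction of breach timing, not merely a record that a threshold was crossed.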
— Practitioner account of deploying ML anomaly detection replacing 200 static Prometheus alerts; identifies slow-degradation detection gap (memory leaks invisible to static thresholds) as critical SLA breach signal.
— Market analysis confirms SLA tracking ecosystem maturity: USD 2.29B market in 2026 projected to USD 4.3B by 2030 at 17.1% CAGR, with automated monitoring and predictive analytics as standard vendor capabilities.
— New Relic released SLI calculation improvements enabling maintenance windows to exclude planned downtime from violations and FACET support for attribute-level SLI analysis, addressing core breach detection accuracy.
— Virtana launches AI-native Agentic SLA Management platform transforming static SLAs into intelligent operational control planes with continuous validation and breach prediction orchestration.
— Industry benchmarks document 2026 standards: MTTD <15 min, MTTR <1 hour for top MSPs; AI/automation in incident response cuts breach lifecycle by 80 days and saves $1.9M per incident on average.
— Practitioner guide details predictive SLA models using historical workflow data (time-to-first-action, assignee completion rates, queue depth, calendar context) achieving 60-80% breach prevention via proactive intervention.
— Named-org deployments including Danube Group (94% SLO compliance), AeroMexico (87% MTTR reduction to 11 minutes), and others demonstrating AI observability enables SLA compliance at scale.
— Three production deployments showing 33-43% MTTR reduction, incident count drops 20-38/year, and $95-220k annual cost savings via New Relic AI observability.
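The slow-degradation gap noted in the first practitioner account above (memory leaks that never trip a static threshold until it is too late) can be illustrated with a trend projection. This is a minimal sketch using an ordinary least-squares line, not the method from that account; `time_to_limit` is a hypothetical helper and a linear trend is an assumption that real leaks may not follow.

```python
from typing import Optional, Sequence


def time_to_limit(times: Sequence[float], values: Sequence[float],
                  limit: float) -> Optional[float]:
    """Fit a straight line to (time, value) and project when it reaches `limit`.

    Returns the time remaining after the last observation, in the same units as
    `times`, or None if the fitted trend is flat or falling.
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all observations share the same timestamp")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values)) / sxx
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - times[-1])
```

With hourly memory readings of 50, 52, 54, 56, and 58 percent and a 90 percent limit, a static alert at 90 stays silent, while the projection reports about 16 hours of headroom.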