Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain

DOMAIN
BLEEDING EDGE → ESTABLISHED

AIOps — log analysis, alerting & event correlation

GOOD PRACTICE

TRAJECTORY

Advancing

AI-powered analysis of logs, metrics, and events to detect anomalies, correlate alerts, and reduce noise in monitoring. Includes pattern detection across log streams and intelligent alert grouping; distinct from root cause analysis which diagnoses the underlying cause after detection.
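As a rough illustration of the anomaly-detection half of this definition, the sketch below flags minutes whose log error count departs sharply from a trailing baseline. It is a minimal rolling z-score under stated assumptions, not how any named vendor implements detection: the function name `detect_spikes`, the window size, the threshold, and the sample counts are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_spikes(counts, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` population standard deviations."""
    history = deque(maxlen=window)  # trailing baseline of recent counts
    anomalies = []
    for i, value in enumerate(counts):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # Skip scoring when the baseline is flat (sigma == 0).
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
        history.append(value)
    return anomalies


# Hypothetical per-minute error counts with one obvious spike at index 6.
errors_per_minute = [4, 5, 6, 5, 4, 5, 48, 5, 4, 6]
print(detect_spikes(errors_per_minute))
```

One design caveat: the spike itself enters the baseline after it is scored, which inflates the standard deviation for the next few minutes. Production systems typically use more robust baselines (seasonality-aware or median-based), which this sketch does not attempt.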

OVERVIEW

AIOps log analysis and event correlation has transitioned to a proven, GA-tooling category with documented production ROI across industries. Core capabilities—anomaly detection, alert deduplication, event clustering, and multi-signal correlation—are now standard features in enterprise observability platforms. Deployment outcomes are quantifiable: 30-80% MTTR reduction, 75-80% alert volume compression, and measurable cost savings (60+ hours/month analyst toil elimination per customer). The second phase of the practice now involves LLM-augmented agentic approaches, where natural language processing and autonomous investigation agents handle correlation at larger scale and with greater context awareness than rule-based systems.
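To make "alert deduplication" and "alert volume compression" concrete, here is a minimal sketch that groups alerts sharing a fingerprint when they arrive close together in time, then reports how much the alert stream was compressed. The `Alert` fields, the `(service, check)` fingerprint, and the 300-second window are illustrative assumptions; commercial platforms generally layer topology and learned similarity on top of simple rules like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    timestamp: float  # seconds, hypothetical clock
    service: str
    check: str


def correlate(alerts, window_seconds=300):
    """Group alerts with the same (service, check) fingerprint when each
    arrives within `window_seconds` of the previous alert in its group."""
    incidents = []
    open_incident = {}  # fingerprint -> index of its latest incident
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.service, alert.check)
        idx = open_incident.get(key)
        if idx is not None and alert.timestamp - incidents[idx][-1].timestamp <= window_seconds:
            incidents[idx].append(alert)
        else:
            open_incident[key] = len(incidents)
            incidents.append([alert])
    return incidents


def compression(alerts, incidents):
    """Fraction of alert volume removed by grouping into incidents."""
    return 1 - len(incidents) / len(alerts)


# Hypothetical alert stream: two bursts on payments, one on checkout.
alerts = [
    Alert(0, "payments", "error_rate"),
    Alert(30, "payments", "error_rate"),
    Alert(90, "payments", "error_rate"),
    Alert(100, "checkout", "latency"),
    Alert(160, "checkout", "latency"),
    Alert(1000, "payments", "error_rate"),
    Alert(1010, "payments", "error_rate"),
    Alert(1020, "payments", "error_rate"),
]
incidents = correlate(alerts)
print(len(incidents), f"{compression(alerts, incidents):.1%}")
```

The compression figure depends entirely on the window and fingerprint chosen, which is one reason vendor-reported percentages are hard to compare directly.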

The defining tension remains organisational, not technical. An estimated 70-80% of AIOps implementations fail to meet expectations, largely because of poor data quality, integration complexity, and insufficient organisational readiness to consolidate fragmented monitoring infrastructure. Tool sprawl, inconsistent log formats, and incomplete metadata governance prevent effective signal correlation. Alert fatigue persists at scale (70% of SRE teams report it; 67% of security operations centre alerts are still ignored), not because correlation engines fail, but because the prerequisites (unified data pipelines, enforced schema standards, and cross-functional discipline) remain unmet. Organisations deploying AIOps should treat data consolidation and observability governance as a prerequisite, not an afterthought.

CURRENT LANDSCAPE

Dynatrace (Forrester Wave 2025 Leader), Elastic (Gartner Magic Quadrant, IDC MarketScape Leader), Splunk, Sumo Logic, Moogsoft, BigPanda, and Coralogix all ship GA log analysis, anomaly detection, and alert correlation with competing feature sets. Vendor momentum as of May 2026 shows a shift toward LLM-augmented agentic approaches: BMC Helix AIOps v26.2 (May 2026) ships HelixGPT v7.4 conversational investigation and the Ops Swarmer agent for distributed incident analysis; IBM Cloud Pak for AIOps v4.13 replaces static event grouping with a topology-based correlation engine and adds Gen AI diagnostics at 700 concurrent users; Splunk AppDynamics ships automated anomaly detection with problem correlation; Dynatrace Grail delivers cross-signal log/trace/metric correlation (BMO customer: 80% faster resolution, 60 hours/month analyst toil eliminated).

Enterprise deployments confirm production maturity and shifting adoption patterns. Recent 2026 data: Cisco IT achieved 25% incident reduction and 99.998% alert automation over 18 months; SigNoz demonstrates real-time multi-signal correlation (logs, traces, metrics) using AI-driven dashboard generation. Academic research validates LLM-based detection: peer-reviewed survey (arXiv, May 2026) on agentic AIOps architectures documents autonomy hierarchies, evaluation frameworks, and safety constraints for deployment governance; University of Twente LogBERT deployment shows 15-second real-time latency in military environments. Telecom market analysis (Stratistics, May 2026) projects 60-80% MTTR reductions and 40-50% NOC staffing cost reduction as measurable outcomes driving 25.1% CAGR growth to $38.6B by 2034.

Adoption remains constrained by data readiness and organisational change management, not technical capability. Critical reality check (Agent Mode, May 2026) compares vendor claims to audited customer results: UK Government 20,000-user M365 Copilot trial achieved 26 min/user/day productivity gain; BT pilot achieved 35% case-resolution-time improvement; only 15% of Fortune 500 companies have cut support headcount via AI. ByteIota adoption study (May 2026) reports 73% of enterprises implementing or planning AIOps by year-end with GitLab 1.5M-developer case study showing value, but documents critical failure signal: 70-80% of AIOps implementations fail to meet expectations due to data quality and integration complexity requiring 12-18 months of data governance work before platforms deliver acceptable anomaly detection accuracy. A survey of 500+ security leaders found 90% value AI/ML for alert fatigue reduction, yet only 9% use advanced correlation. Practitioner assessments document 20 distinct failure patterns in ServiceNow implementations (event noise, incomplete CMDB), and tool sprawl adds 20-40% overhead to resolution time. Alert fatigue persists as a $3.3 billion annual cost in the US alone, with 42% of alerts uninvestigated and 67% of security operations centre alerts ignored due to volume. The constraint is data estate readiness, unified schema enforcement, and organisational willingness to consolidate fragmented monitoring architectures.

TIER HISTORY

Research: Jan-2017 → Jan-2017
Bleeding Edge: Jan-2017 → Jan-2020
Leading Edge: Jan-2020 → Jan-2026
Good Practice: Jan-2026 → present

EVIDENCE (148)

— Comprehensive peer-reviewed survey on LLM-based agentic AIOps architectures covering autonomy hierarchies, evaluation frameworks, and safety constraints for deployment governance.

— Comprehensive platform evaluation comparing Moogsoft, Dynatrace, Datadog, Splunk ITSI, BigPanda, PagerDuty, IBM Cloud Pak, BMC, and ScienceLogic across event correlation, anomaly detection, automation, and enterprise fit—providing practical vendor selection guidance.

— Market research projecting telecom AIOps from $6.4B (2026) to $38.6B (2034) at 25.1% CAGR; documents 60-80% MTTR reduction and 40-50% NOC staffing cost reduction as key deployment outcomes.

— Major vendor GA release featuring HelixGPT v7.4 for conversational investigation, Deep RCA for root cause analysis, Ops Swarmer agent, and multi-index log analysis—demonstrating continued AIOps platform maturation.

Incident Specific Dashboard Spin-Up (Product Launches)

— SigNoz deployed AI correlation of logs, traces, and metrics with natural language prompts; identified payment service error rate 52% via multi-signal analysis across service dependencies—demonstrating practical cross-domain event correlation.

— Critical market analysis comparing vendor claims to audited customer results: UK Gov 20k-user trial (26 min/user/day), BT 35% case-resolution improvement, only 15% of F500 cut headcount—documenting practical limitations and adoption barriers.

— Adoption metrics showing 73% enterprise AIOps planning, GitLab 1.5M-developer case, but negative signal: 70-80% AIOps implementations fail due to data quality issues requiring 12-18 months of data governance work.

— Market sizing study: AIOps market $18.95B in 2026, growing 14.8% CAGR; AI-powered monitoring adoption jumped from 42% to 54% in one year; alert fatigue prevalence across 70% of SRE teams.

HISTORY

  • 2017: Early academic and vendor work on alert correlation and anomaly detection in logs. Splunk, Elastic, and Sumo Logic were adding ML capabilities; dedicated AIOps platforms (Moogsoft, BigPanda) were emerging. Survey data documented widespread alert fatigue (37% of enterprises receiving more than 10,000 alerts per month), driving demand for noise reduction solutions.

  • 2018: Commercial validation and technical advancement. Moogsoft Series D funding ($40M, Goldman Sachs) with deployment wins at Cisco, T-Mobile, Intuit; demonstrated 90% alert reduction and 62% support call reduction. Log analytics vendors strengthened ML integration (Splunk, Elastic). Academic research showed measurable improvements in correlation accuracy. However, adoption barriers persisted: poor data quality, vendor hype skepticism, and tuning complexity. Gartner forecast 25% enterprise adoption by 2019; 451 Research cautioned on market maturity.

  • 2019: Vendor products matured (Moogsoft Express launch, Elastic 7.4 ML expansion, Sumo Logic Kubernetes integration). Adoption surveys showed 51% of infrastructure leaders implementing AI/ML monitoring. Microsoft ICSE research identified critical practical challenges: data quality assurance, continuous model validation, and organizational skepticism about automation. Sumo Logic expanded cloud-native deployments. Adoption progress continued but remained contingent on addressing data foundation and organizational readiness barriers.

  • 2020: Enterprise adoption acceleration and Fortune 100 case studies. Walmart published AIDR system covering 3000+ models across 25+ teams with 63% major incident coverage and 7+ minute MTTD reduction. O'Reilly data showed 50%+ of enterprises in mature AI phase (up from 27% YoY); Sumo Logic survey documented widespread alert fatigue driving cloud SIEM modernization demand (70% doubled alert volumes, 84% prefer cloud solutions). Moogsoft Enterprise v8.0 (May 2020) advanced topology-based clustering and entropy analysis. Research validated core noise-reduction capability (80%-99% false alarms detected). Organizational skepticism and skills gaps (58% ML engineer shortage) remained primary adoption barriers.

  • 2021: Platform maturation and proof-of-concept deployments widening. Peer-reviewed research (ESEC/FSE) validated practical log anomaly detection with F1-score 0.83 on real bank data. Production deployments showed quantified results: enterprise customer reduced monthly alert volume from 8,000-10,000 to 2,000 (75-80% reduction). IDC survey documented persistent alert fatigue problem: 45% of in-house security operations centers experience false positive rates exceeding 45%, with 35% ignoring alerts due to volume. Tool sprawl remained endemic (52% of enterprises using 6+ monitoring tools), validating AIOps alert aggregation value. Vendor product evolution continued with platform feature expansion, though deployment barriers remained around data quality, model validation, and organizational readiness to trust automation.

  • 2022-H1: Vendor platforms advancing toward cloud-native architectures and production quantified results. Moogsoft enhanced automatic alert-to-incident correlation in SaaS platform; Splunk case studies showed 70% MTTR reduction in SAP monitoring; Sumo Logic demonstrated correlated log/metric/trace analysis for AWS Lambda observability. IBM research detailed log parsing and anomaly detection techniques driving AIOps maturation. Alert fatigue surveys (Orca Security) documented scale of problem: 59% of organizations receive 500+ alerts daily with 43% false positive rates. Existing ML implementations showed limitations (Elastic Stack false positives requiring manual effort), reinforcing that correlation quality remained a key differentiator between vendors.

  • 2022-H2: Academic research and practical deployment challenges documented. Peer-reviewed study (ADLILog) proposed novel method using GitHub log instructions to achieve 60% F1 score improvement in unsupervised log anomaly detection, advancing algorithmic approaches. Splunk and Moogsoft tutorials demonstrated practical deployment guidance for event correlation and anomaly detection in production. Community experiences revealed limitations: Elastic Stack ML users reporting false positive issues and confusing outputs requiring manual expert diagnosis. Independent platform review documented Moogsoft achieving 50% MTTR improvement but noted automation limitations (duplicate ticket generation, unresolved alert closures). Assessment reflects market state at year-end 2022: platforms gaining production deployment traction with quantified metrics, but real-world implementations exposed reliability gaps and tuning complexity barriers.

  • 2023-H1: Platform maturity consolidation and inflection toward observability platforms. Moogsoft released APEX v9 with enterprise migration procedures indicating stable platform status; Elastic deployed AIOps Labs with ML-powered log categorization and spike detection; academic research advanced log anomaly detection with hybrid PCA/ANN frameworks showing measurable improvements. Observability adoption reached 64% of IT professionals with AIOps becoming an expected feature. Critical industry assessment published mid-2023 argued that AIOps platforms had failed to deliver acceptable alert-to-noise ratios and were being absorbed into broader observability solutions. The fundamental gap between correlation capability and operational trust persisted despite technical advances.

  • 2023-H2: Platform consolidation accelerated with AIOps capabilities absorbed into observability platforms. Elastic released full-stack observability GA (November 2023) with AI-driven log processing and customer case studies (Wells Fargo 60% log field reduction); New Relic survey (August) showed 41% AIOps deployment (up 10% from 2022) with 70% MTTR improvement. Independent assessments documented critical deployment barriers: only 53% of AI projects reach production, data quality and pipeline complexity remain primary obstacles, and trust in ML-driven correlation remains contingent on organizational readiness. Alert fatigue problem persisted at scale (59% receiving 500+ daily alerts, 43% reporting false positive rates above 40%), validating continued need for correlation while documenting implementation challenges.

  • 2024-Q1: Major vendor market expansion and continued algorithmic advancement. Cisco AIOps launched as GA product with unified telemetry correlation across AppDynamics, ThousandEyes, and VMware, signaling platform consolidation trend. Splunk ITSI production deployments (Transurban road operations) demonstrated MTTR gains. Academic research introduced novel techniques: LogELECTRA (self-supervised log anomaly detection with SOTA performance on benchmarks) and unsupervised PCA/ANN methods (72% reduction in false alerts). LLM-augmented AIOps approaches surveyed in peer-reviewed research, indicating early exploration of generative AI in log analysis and correlation. Core market tension persisted: technical capability advancing while deployment barriers (data quality, integration complexity) remained.

  • 2024-Q2: Vendor product maturation in log-specific analysis and continued academic rigor advancement. Elastic released GA log rate analysis (April) with statistical spike detection; Sumo Logic announced GA AI-driven Alerting with AutoML (May); Dynatrace case studies documented autonomous operations deployments (Coop, Experian). Microsoft Defender XDR deployed GraphWeaver for billion-scale alert correlation achieving 99% accuracy with 7.4x storage reduction. Federal agencies began migrating to Elastic for observability and log analysis at scale. Academic literature advanced with systematic evaluation of deep learning models for log-based failure prediction (May) and MicroServo benchmark framework for standardized algorithm evaluation on microservices (June). Critical independent analysis documented persistent gap between platform capabilities and marketing claims, highlighting that most vendors lack effective log noise reduction despite integration advances. Alert fatigue problem remained unresolved at scale, with 59% of organizations receiving 500+ daily alerts and 43% reporting 40%+ false positives, constraining further AIOps tier advancement despite improved product offerings.

  • 2024-Q3: Vendor product advancement in log analysis and alert correlation. Elastic named Gartner Magic Quadrant Leader for observability (August); IBM Cloud Pak for AIOps v4.11.0 released log anomaly detection with golden signals algorithm for improved noise management; ServiceNow Alert Automation went GA (August) with enhanced alert grouping and correlation accuracy; HCL IntelliOps Event Management announced GA ML-driven alert correlation engine. Academic research advanced LLM capabilities in log analysis (LogEval benchmark evaluating model performance on parsing, anomaly detection, fault diagnosis). Vendor ecosystem maturity signals confirmed across product releases, though deployment complexity remained primary adoption barrier.

  • 2024-Q4: Vendor product advancement and market maturity signals. BigPanda documented 80% alert noise reduction within eight weeks and 80% event-to-incident compression (25% MTTR reduction in 90 days); market research forecasted AIOps growing from $15.9B in 2025 to $50.5B by 2032 (17.9% CAGR) with 65% of businesses embedding AI-driven platforms. New Relic survey showed 24% AIOps deployment with organizations using 5+ capabilities experiencing 45% lower downtime. Splunk observability research (1,850 ITOps/developers) documented 2.6x ROI for leaders and 80% alert accuracy vs 54% baseline. Named deployments (Singapore Airlines: 75% faster detection, 90% backend issue reduction) demonstrated tangible operational value. Critical analysis (CNCF) argued AIOps adoption failures stem from organizational resistance rather than technology gaps, advocating GenAI-powered observability as evolution. Alert fatigue remained endemic problem (31% Singapore respondents find false positives highly problematic).

  • 2025-Q1: Enterprise deployment evidence and research advancement. Named case studies documented production results: HCL Technologies with Moogsoft reduced MTTR 33% and tickets 62%; TD Bank with Dynatrace achieved 25% proactive incident increase; ServiceNow predictive intelligence reached 68% proactive engagement. Academic research advanced log parsing (ULP/AML metrics) and LLM evaluation (LogEval benchmark) for log analysis. Cloud-native metrics showed 70-80% of AWS alerts were noise; auto-remediation resolved 40-60% of incidents pre-human. Open-source community traction (Keep platform: 9,200 stars, 110+ integrations). Practitioners emphasized data consolidation and organizational readiness as critical success prerequisites beyond technology capability.

  • 2025-Q2: Vendor product maturation and production deployment validation across multiple named organizations. Moogsoft APEX released correlation enhancements including improved list similarity algorithm for better alert clustering, advancing event correlation capabilities. LLM-based semantic event correlation advanced beyond traditional rule/statistical methods, with natural language processing enabling context-aware alert interpretation. Named production deployments demonstrated significant outcomes: Managed Service Provider using GrokStream achieved 80% incident reduction (40,000 NOC hours saved, $1.2M annually); Fortune 500 enterprise saw 72% incident reduction; Equinix deployed Moveworks achieving 96% ticket routing accuracy with 82% routed within 30 seconds; Kroger unified observability replacing 16 tools and cutting support tickets by 99%; Photobox achieved 80% MTTR reduction and 60% incident reduction during peak periods; Southeast Asian government achieved full Pre-L1 automation with knowledge graph-powered ticket resolution (60% of simple issues automated) saving millions. Survey evidence confirmed adoption acceleration: EMA research (1,000+ ITOps/DevOps/SecOps professionals) showed AIOps improving efficiency and resolution times; Sumo Logic security survey (500+ decision-makers) documented 75% exploring AI-enhanced SIEM alternatives with 33% reporting tangible incident response improvements. Critical assessment documented persistent platform limitations: steep learning curves, integration challenges, real-time performance degradation with large datasets—indicating deployment complexity as primary constraint despite continued vendor capability advancement and measurable operational improvements across diverse organizations.

  • 2025-Q3: Vendor product momentum in alert correlation and cloud platforms. Elastic released AI-driven Streams for automated log pattern detection and anomaly identification; Microsoft Azure Monitor launched AIOps issue investigation preview with multi-signal correlation; IBM Cloud Pak for AIOps v4.10.1 enhanced incident creation from multi-alert conditions. Moogsoft Enterprise 8.0 introduced Entropy noise reduction AI with topology visualization. Market analysis showed ServiceNow ITOM at 2.1% market share vs Moogsoft 0.9%, with 97% fragmented across other vendors. Critical practitioner assessment (e-commerce context) identified persistent AIOps barriers: lack of SLO-based alerting, non-actionable pages, uncorrelated alert duplicates, and runaway costs—signaling that alert management remains the core friction point despite platform advancement.

  • 2025-Q4: Platform maturation and enterprise adoption acceleration. Splunk released AI Toolkit with guided ML assistants for AIOps use cases (October); Elastic confirmed IDC MarketScape Leadership in observability (November) with AI-driven Streams log processing; New Relic GA intelligent alerting with dynamic baselines (November). Adoption metrics showed 100% of enterprise leaders using AI in operations with 41% anticipating significant value from anomaly detection; 60% characterize observability practices as mature/expert (up from 41% prior year). Technical advancement validated: unsupervised log anomaly detection using NLP and vectorization demonstrated reduced false positives on real datasets (SockShop, HDFS). Critical assessments documented persistent limitations: biased datasets, weak AI reasoning on root causes, and regulatory concerns (GDPR Article 22 liability). Deployment barriers remained endemic: tool sprawl (28% use 5+ tools, adding 20-40% overhead to resolution), slow remediation (66% of teams take 4+ hours), and alert fatigue unresolved (67% of SOC alerts ignored due to false positives). Market forecasts AIOps growing from $16.42B in 2025 to $36.6B by 2030 (17.39% CAGR), signaling mainstream adoption despite persistent operational constraints.

  • 2026-Jan: Vendor product momentum and enterprise adoption acceleration. Dynatrace named Forrester Wave 2025 Leader in AIOps; Sumo Logic released Mobot AI assistant for log analysis with natural language query conversion and knowledge agents; industry survey showed 90% of 500+ security leaders value AI/ML for alert fatigue but deployment gaps persist (only 9% use advanced correlation vs 49% basic detection). Healthcare deployment (datasensAI on Splunk) achieved 64% ROI improvement and freed 35% infrastructure capacity. Adoption metrics projected 73% enterprise AIOps adoption by end 2026 with 80% automated incident resolution, indicating mainstream platform penetration. Alert fatigue remained operational challenge despite continued vendor advancement and ecosystem maturity in log analysis capabilities.

  • 2026-Feb: Enterprise adoption validation and persistent deployment barriers. Splunk 2026 CISO Report (650 respondents) documented 92% adoption of AI for event review and 89% for data correlation, reflecting mainstream category acceptance. Elastic Observability Labs published technical results: 94% log parsing accuracy and 91% log partitioning accuracy via ML, confirming vendor product maturity. However, industry metrics remained concerning: alert fatigue cost quantified at $3.3B annually in US alone with 42% of alerts not investigated due to volume overload. Practitioner assessments highlighted structural barriers: tool sprawl adding 20-40% overhead to incident resolution time; ServiceNow AIOps implementation challenges documented 20 failure patterns including event noise, incomplete CMDB, and correlation failures. Customer satisfaction indicators remained strong (Moogsoft: 84% recommend, 98% renewal), yet deployment complexity persisted as primary adoption constraint.

  • 2026-Mar/Apr: Continued vendor product maturation and LLM-augmented approaches. Splunk AppDynamics shipped GA anomaly detection with automated baselines and problem correlation; Dynatrace Grail log management platform delivers cross-signal correlation with 850+ integrations and named customer outcomes (BMO: 80% faster issue resolution, 60 hours/month analyst toil eliminated). Splunk Enterprise Security 8.5.0 (GA, April 2026) shipped an AI triage agent for autonomous alert investigation alongside detection tuning to reduce false positives. IBM Cloud Pak for AIOps v4.13 replaced static event grouping with topology-based correlation at 700 concurrent users, adding Gen AI root cause diagnostics. BigPanda incident correlation product enables cross-domain visibility reducing triage time and MTTR; Coralogix launched trace drilldown for side-by-side signal correlation. Elastic published production benchmarks showing 3,374x speedup in ML model training for log anomaly detection (836k events/hour) via aggregation-based datafeeds; THG (UK e-commerce, £2B revenue) achieved 60% MTTR reduction and halved security triage burden using unified Elastic log/event analysis. Academic research advanced: a comprehensive benchmark found fine-tuned transformers achieve F1 0.96-0.99 and zero-shot LLMs F1 0.82-0.91 on log anomaly detection, providing practitioner selection guidance; University of Twente peer-reviewed paper (LogBERT) demonstrates real-world military deployment at 15-second real-time latency. NeuBird survey (1,039 professionals) documented 44% of outages caused by suppressed alerts and $50k-$100k+/hour downtime costs, reinforcing the core AIOps value case. Enterprise deployments confirm production value: Odigo consolidated 15 tools; BT Group consolidated 85 tools, auto-remediates 500 incidents/week; New Relic's 6.6-million-user study shows AI-enabled observability users achieve 27% less alert noise, 2X better correlation, 25% faster resolution. Data estate consolidation and organisational readiness remain primary adoption constraints.

  • 2026-May: Market sizing confirmed AIOps at $18.95B in 2026 (14.8% CAGR) with AI-powered monitoring adoption rising from 42% to 54% in a single year, while alert fatigue persists across 70% of SRE teams. A peer-reviewed arXiv survey on LLM-based agentic AIOps architectures documented autonomy hierarchies, evaluation frameworks, and safety constraints — marking the shift from rule-based correlation to governed agentic pipelines. Telecom AIOps market data projects $6.4B (2026) to $38.6B (2034) at 25.1% CAGR, with documented 60-80% MTTR reduction and 40-50% NOC staffing cost reduction as primary adoption drivers. BMC Helix AIOps v26.2 shipped HelixGPT v7.4 conversational investigation and Ops Swarmer agent for distributed incident analysis. Named deployments reinforced production value: Cisco IT achieved 25% incident reduction and 99.998% alert automation over 18 months; IDC MarketScape 2026 named New Relic a Leader; a tiered ML architecture case study demonstrated 38% MTTR reduction and 90% diagnostic accuracy. Data estate readiness and organisational change management remain the primary adoption constraints.
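The market forecasts quoted in the history above can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below recomputes the implied rate for three of the quoted forecasts. The start values, end values, and quoted rates are taken from the entries above; the year counts assume each range spans whole calendar years between its endpoints, which is an assumption about how the underlying reports count periods.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1


# (label, start in $B, end in $B, years between endpoints, quoted CAGR)
forecasts = [
    ("Telecom AIOps 2026-2034", 6.4, 38.6, 8, 0.251),
    ("AIOps 2025-2030", 16.42, 36.6, 5, 0.1739),
    ("AIOps 2025-2032", 15.9, 50.5, 7, 0.179),
]

for label, start, end, years, quoted in forecasts:
    rate = implied_cagr(start, end, years)
    print(f"{label}: implied {rate:.2%}, quoted {quoted:.2%}")
```

Under that year-count assumption, each implied rate should land close to the quoted figure; small gaps are expected from rounding in the published dollar values.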

TOOLS