The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
AI techniques that go beyond correlation to estimate causal effects and identify which interventions drive outcomes. Includes treatment effect estimation and counterfactual analysis; distinct from predictive modelling, which forecasts outcomes without inferring causation.
Causal inference and uplift modelling sit on a widening gap between capability and adoption: the tooling is production-ready and its impact is measurable, yet adoption remains narrow and deployment exposes hard constraints. Unlike predictive modelling, which forecasts outcomes, causal inference estimates what would happen under a specific intervention, such as the incremental effect of a marketing campaign, a product change, or a credit offer. Uplift modelling applies this at the individual level, identifying which customers will respond because of the treatment rather than those who would convert regardless.
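The difference between predicting conversion and estimating uplift can be sketched with a small simulation. This is a minimal illustration on synthetic data, not any vendor's method: the two segments (`sure_thing`, `persuadable`) and their conversion probabilities are assumptions invented for the example, and treatment is assigned at random so that a simple difference in conversion rates is a valid uplift estimate.

```python
import random

def simulate_customers(n, seed=0):
    """Simulate a randomised experiment over two hypothetical customer segments."""
    rng = random.Random(seed)
    # Assumed conversion probabilities per segment: (control, treated).
    rates = {"sure_thing": (0.60, 0.61), "persuadable": (0.10, 0.30)}
    segments = list(rates)
    rows = []
    for _ in range(n):
        segment = rng.choice(segments)
        treated = rng.random() < 0.5  # random assignment keeps the arms comparable
        p = rates[segment][1 if treated else 0]
        converted = rng.random() < p
        rows.append((segment, treated, converted))
    return rows

def segment_summary(rows):
    """Return {segment: (treated_rate, control_rate, uplift)}."""
    counts = {}
    for segment, treated, converted in rows:
        arms = counts.setdefault(segment, {True: [0, 0], False: [0, 0]})
        arms[treated][0] += int(converted)  # conversions in this arm
        arms[treated][1] += 1               # customers in this arm
    summary = {}
    for segment, arms in counts.items():
        treated_rate = arms[True][0] / arms[True][1]
        control_rate = arms[False][0] / arms[False][1]
        summary[segment] = (treated_rate, control_rate, treated_rate - control_rate)
    return summary

summary = segment_summary(simulate_customers(40_000))
for segment, (t_rate, c_rate, uplift) in sorted(summary.items()):
    print(f"{segment:12s} treated={t_rate:.3f} control={c_rate:.3f} uplift={uplift:+.3f}")
```

A model that ranks customers by predicted conversion would favour the `sure_thing` segment, which converts often but barely responds to treatment. Ranking by uplift favours the `persuadable` segment, where the intervention changes the outcome.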
By June 2026, the practice demonstrates maturity in three ways. First, real-world deployments continue to accumulate: Haus (Series B, $55.3M) reports a customer achieving 10x ROI in 60 days; Syneos Health runs production biopharma commercial analytics; Adyen's generally available (GA) Uplift product processes trillions of payment transactions; and foundation pricing models (C3PO) operate across healthcare, airline, and tender domains. Second, GA vendor automation is now mainstream: Google Demand Gen Uplift (June 2026) lets advertisers measure incremental campaign impact, with reported outcomes of 10% ROAS and 12% sales lift; AppsFlyer's GA incrementality measurement removes data science dependencies via automated holdout experiments; and end-to-end platforms (Geminos CauseWay) reach enterprise customers with integrated causal modelling, counterfactuals, and LLM co-pilots. Third, practitioners increasingly acknowledge that causal measurement reveals hidden value destruction: in one €40M FMCG case, 60-90% of promotions destroyed incremental value when measured causally, rather than merely failing to increase sales.
Yet these deployments concentrate in e-commerce and marketing, where randomised experiment infrastructure already exists. April 2026 Amazon Science benchmarks confirm that 62% of modern treatment-effect models still underperform trivial baselines on real-world heterogeneous data, documenting robustness gaps that persist despite tooling maturity. A multi-institutional critical assessment (U Michigan, ICL, NYU, Columbia, Harvard, MIT) cautions that causal ML remains complex and requires careful assumption validation and domain expertise, particularly in observational healthcare settings. Healthcare remains research-led, with zero clinical workflow integration despite intensified institutional interest.
The practice remains leading-edge: forward-leaning teams extract real value and scale it, but most organisations have not started. The barriers to broader adoption (data volume requirements of 10,000+ per treatment arm, method selection complexity, 10-20% ATE divergence between libraries, and assumption validation challenges) have proven resistant to vendor investment and platform integration, despite the tooling accessibility advances of 2026.
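To give a feel for why per-arm data requirements reach the thousands, the sketch below applies the textbook normal-approximation sample size formula for comparing two conversion rates. It is an illustration only: the baseline rate, lift, significance level, and power are assumed values, and the 10,000+ figure cited above is not derived from this formula.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate customers needed per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_treated - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical scenarios: a 5% baseline with a one-point lift versus a five-point lift.
n_small_lift = sample_size_per_arm(0.05, 0.06)
n_large_lift = sample_size_per_arm(0.05, 0.10)
print(f"one-point lift:  {n_small_lift} per arm")
print(f"five-point lift: {n_large_lift} per arm")
```

Small average lifts already call for several thousand customers per arm in this approximation. Estimating how the effect varies across customer segments splits the same data further, so heterogeneous-effect models generally need more.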
The production deployment landscape matured into May 2026, with concrete ROI measurement and practitioner adoption at scale. Haus (Series B, $55.3M in funding through 2025) reports that its customer Newton Living achieved 10x ROI in 60 days via synthetic-control causal MMM; founder Epstein left Google frustrated by the gaps in correlation-based attribution and built causal inference infrastructure for practitioners. Remerge's 20+ documented uplift case studies (2023-2026) show sustained RCT-based incremental measurement delivering 30-60% CPA reductions across 100+ mobile marketing campaigns. Cassandra.app serves 100+ marketing teams with geo-based uplift testing; Gina Tricot demonstrates consistent ROI improvement across markets. Adyen's GA Uplift (May 2026) processes trillions of payment transactions with a 10% conversion lift. Meta's incremental attribution in DTC achieved 18% incremental sales growth. DuckDuckGo's GeoLift implementation reveals operational barriers: it requires an organisational culture that permits switching channels off, and although Bayesian confidence intervals quantify uncertainty, implementation complexity limits adoption to unified marketing teams.
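The arithmetic behind an RCT-based incrementality readout can be sketched as follows. This is a generic frequentist calculation with a Wald confidence interval, not the method of any vendor named above (several of which use Bayesian or geo-based approaches). The function name `holdout_lift` and the campaign numbers are hypothetical.

```python
import math
from statistics import NormalDist

def holdout_lift(conv_treated, n_treated, conv_holdout, n_holdout, confidence=0.95):
    """Compare an exposed group with a randomly held-out group."""
    p_t = conv_treated / n_treated
    p_h = conv_holdout / n_holdout
    diff = p_t - p_h
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_t * (1 - p_t) / n_treated + p_h * (1 - p_h) / n_holdout)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "absolute_lift": diff,
        "relative_lift": diff / p_h,
        "ci": (diff - z * se, diff + z * se),
        "incremental_conversions": diff * n_treated,
    }

# Hypothetical campaign: 50,000 exposed users and a 10,000-user holdout.
result = holdout_lift(conv_treated=2_300, n_treated=50_000,
                      conv_holdout=400, n_holdout=10_000)
low, high = result["ci"]
print(f"relative lift: {result['relative_lift']:.1%}")
print(f"absolute lift: {result['absolute_lift']:.4f} (95% CI {low:.4f} to {high:.4f})")
print(f"incremental conversions: {result['incremental_conversions']:.0f}")
```

Only the incremental conversions, not the total conversions in the exposed group, should be credited to the campaign when computing cost per acquisition.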
Real-world deployment barriers persist and are increasingly well documented. A €40M FMCG case (Deep Marketing, May 2026) shows that 60-90% of promotions destroy incremental value when measured causally, owing to pull-forward and cannibalisation effects; causal measurement showed that promotions could be cut by 40% without revenue loss, validating the value of causal deployment while exposing large measurement gaps in practitioners' default approaches. The April 2026 Amazon Science benchmark confirms that 62% of CATE models still underperform trivial baselines on heterogeneous data, and peer-reviewed biomedical work documents that singly robust ML estimators underperform parametric regression, which motivates doubly robust approaches (TMLE, AIPW). A theoretical impossibility result (Tao et al., May 2026) proves that distribution-free prediction sets for individual treatment effects must have infinite expected length under standard assumptions, documenting a fundamental constraint on inference. Model upgrades also threaten stability: production accounts document causal risk estimates shifting by 0.12-0.19 points and confidence intervals widening by 23% on protected cohorts.
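As a rough illustration of the doubly robust idea mentioned above, the sketch below computes an AIPW (augmented inverse probability weighting) estimate of the average treatment effect on synthetic, confounded data. Everything in it is an assumption made for the example: the data-generating process, the single binary confounder, and the use of simple per-stratum means as both the propensity and outcome models. Production implementations typically fit flexible models with cross-fitting instead.

```python
import random

def simulate(n, seed=0):
    """Synthetic observational data where x drives both treatment and outcome."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        t = 1 if rng.random() < (0.8 if x else 0.2) else 0  # confounded assignment
        y = 1.0 * t + 2.0 * x + rng.gauss(0.0, 1.0)         # true ATE is 1.0
        data.append((x, t, y))
    return data

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def naive_difference(data):
    """Difference in mean outcome between treated and untreated, ignoring x."""
    return mean(y for _, t, y in data if t == 1) - mean(y for _, t, y in data if t == 0)

def aipw_ate(data):
    """AIPW estimate using per-stratum means as nuisance models."""
    strata = sorted({x for x, _, _ in data})
    e = {s: mean(t for x, t, _ in data if x == s) for s in strata}              # propensity
    mu1 = {s: mean(y for x, t, y in data if x == s and t == 1) for s in strata}  # treated outcome
    mu0 = {s: mean(y for x, t, y in data if x == s and t == 0) for s in strata}  # control outcome
    scores = [
        mu1[x] - mu0[x]
        + t * (y - mu1[x]) / e[x]
        - (1 - t) * (y - mu0[x]) / (1 - e[x])
        for x, t, y in data
    ]
    return mean(scores)

data = simulate(20_000)
naive = naive_difference(data)
aipw = aipw_ate(data)
print(f"naive difference: {naive:.2f}")
print(f"AIPW estimate:    {aipw:.2f}  (true effect in this simulation: 1.00)")
```

The naive comparison absorbs the confounder's effect and overstates the treatment effect, while the AIPW score combines the outcome model with a propensity-weighted residual correction. With the saturated stratum models used here the estimator reduces to plain stratification; the doubly robust property matters when one of the two models is misspecified.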
Methodological work continues to advance practitioner accessibility and deployment scalability: an empirical simulation handbook (Aurensanz-Crespo et al., May 2026) guides method selection across PSM, IPW, G-computation, and TMLE on real biomedical data; hierarchical causal models on 3M active users recover effects under treatment overlap; foundation models (C3PO, May 2026) are deployed for pricing optimisation across healthcare, airline, and tender domains with reported substantial gains; and the Causely benchmark (May 2026) quantifies enterprise causal reasoning in AI agent diagnostics with concrete latency, cost, and accuracy gains. Yet adoption remains concentrated in e-commerce and marketing, where randomised experiment infrastructure exists; healthcare research is intensifying (4,300+ clinical publications), but clinical workflow integration remains absent. The Syneos Health partnership with causaLens (May 2026) extends adoption to biopharma commercialisation (targeting, optimisation, territory design), signalling expansion beyond marketing, though still confined to customers with existing investment in causal infrastructure.
The research frontier is moving toward reliability assessment and practitioner accessibility. New benchmarks at ICLR 2026 exposed critical LLM weaknesses in causal reasoning: evaluated across Pearl's causal ladder (discovery, intervention, counterfactual), LLMs achieve 93.5% on discovery but degrade to 81.9% on intervention and 73% on counterfactual reasoning, which limits autonomous causal method selection. Methodological work continues to push boundaries, with theoretical advances in heterogeneous treatment effects (HTE) that clarify assumptions for mechanism testing, multi-treatment effect identification under unmeasured confounding with √n-consistent estimators, and HTE estimation for survival outcomes with clinical application. NSF-funded educational tooling (thinkCausal with stan4bart) reports a randomised validation showing superior accuracy and speed over alternative methods, advancing practitioner accessibility. Yet these advances remain concentrated in academic and research settings rather than driving operational adoption.
The core adoption barriers are well documented and persistent: minimum data volumes of 10,000+ per treatment arm, 10-20% divergence in ATE estimates between major libraries using identical configurations, and models failing to generalise across campaigns. Healthcare has generated over 4,300 clinical publications referencing causal methods, yet systematic reviews find zero integration into clinical workflows. Analyst surveys project enterprise interest, with 62% planning a shift toward causal decision intelligence within 18 months, but the gap between stated intent and operational deployment defines this practice's stalled trajectory.
— Major cloud provider deployed causal discovery system for production RCA with 85.7% recall on 35 incidents and 800+ real-world deployments; documents causal inference at hyperscale with measurable operational impact.
— Luxembourg Institute of Health multi-speaker lecture series with recognized practitioners (Huber, Mealli, Hernán) teaching causal methods; signals mainstream professional adoption and institutionalized training infrastructure.
— Critical assessment revealing 28% of standard causal inference predictors fail on unidentified counterfactual couplings; documents structural reliability limitation blocking broader deployment.
— Technical reference standardizing Qini methodology for uplift model evaluation, establishing discipline-specific evaluation standards and best practices for evaluating incremental targeting quality.
— Peer-reviewed BartCure methodology with application to real CALGB 40101 breast cancer trial; advances heterogeneous treatment effect estimation for healthcare with conservative heterogeneity detection.
— Netflix demonstrated agentic workflows for causal inference with human augmentation; open-sourced methodology and industry commentary shifting from prediction to causal explanation and intervention impact quantification.
— Microsoft-Causaly partnership GA integrating causal reasoning into biopharma R&D, enabling target identification and biomarker strategy with governed provenance; demonstrates regulated-domain production deployment.
— End-to-end causal AI platform GA with causal modeling, counterfactual reasoning, intervention impact analysis, and LLM co-pilot; demonstrates mature, feature-complete commercial tooling for enterprise deployment.
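One of the evidence items above refers to Qini methodology for evaluating uplift models. As a hedged sketch of what that evaluation involves, the code below builds a Qini curve using one common variant (cumulative treated responders minus control responders scaled by the overall treated-to-control ratio) and an unnormalised Qini coefficient as the average gap to the random-targeting line. The synthetic data and the `oracle_scores` ranking are assumptions for illustration; published definitions differ in normalisation details.

```python
import random

def qini_curve(scores, treated, outcomes):
    """Qini values after targeting the top-k customers, for k = 0..n."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_treated = sum(treated)
    n_control = len(treated) - n_treated
    ratio = n_treated / n_control  # rescales control responders to treated-group size
    resp_treated = resp_control = 0
    curve = [0.0]
    for i in order:
        if treated[i]:
            resp_treated += outcomes[i]
        else:
            resp_control += outcomes[i]
        curve.append(resp_treated - resp_control * ratio)
    return curve

def qini_coefficient(curve):
    """Average vertical gap between the curve and the random-targeting line."""
    n = len(curve) - 1
    random_line = [curve[-1] * k / n for k in range(n + 1)]
    return sum(c - r for c, r in zip(curve, random_line)) / n

# Synthetic experiment: half the customers have a true uplift of 0.3, half have none.
rng = random.Random(0)
n = 10_000
true_uplift = [0.3 if rng.random() < 0.5 else 0.0 for _ in range(n)]
treated = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
outcomes = [1 if rng.random() < 0.1 + u * t else 0
            for u, t in zip(true_uplift, treated)]

oracle_scores = true_uplift                        # a hypothetical perfect ranking
random_scores = [rng.random() for _ in range(n)]   # an uninformative ranking

q_oracle = qini_coefficient(qini_curve(oracle_scores, treated, outcomes))
q_random = qini_coefficient(qini_curve(random_scores, treated, outcomes))
print(f"Qini (oracle ranking): {q_oracle:.1f}")
print(f"Qini (random ranking): {q_random:.1f}")
```

A ranking that puts responsive customers first bends the curve above the random-targeting line, so its coefficient is clearly positive, while an uninformative ranking stays near zero.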