The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.
Each dot marks the weighted maturity of practices within a domain — hover for a brief summary, click for more detail
AI models that forecast future values from historical time series data across demand, revenue, usage, and other metrics. Includes deep learning forecasting and automated model selection; distinct from financial forecasting, which applies time series methods to a specific finance context.
AI-driven time series forecasting has reached the point where forward-leaning organisations extract real value from it -- but most have not yet started, and the field's central question remains unresolved. Neural and foundation model approaches (Transformers, TimeGPT, TimesFM) promise zero-shot generality across demand, revenue, and operational metrics, yet empirical evidence stubbornly shows that simpler methods -- gradient boosting, ARIMA, exponential smoothing -- match or beat them on most production workloads. The M4 Competition, repeated benchmarking studies, and practitioner case studies all converge on the same finding: model performance is task-dependent, not architecture-dependent. What makes this a leading-edge practice is not proof that deep learning wins, but that a mature vendor ecosystem, cloud-managed services, and confirmed multi-sector deployments have made automated forecasting accessible at scale. The tension that defines this tier is method selection: organisations can deploy forecasting today, but choosing when neural complexity justifies its cost over classical alternatives still requires domain expertise and empirical validation rather than default architectural commitment.
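That empirical validation step can be cheap. A minimal backtest sketch, assuming nothing beyond the standard library: it pits a seasonal-naive baseline against simple exponential smoothing on a synthetic monthly series (all function names and numbers are illustrative, not from the index), the kind of comparison worth running before committing to a neural architecture.

```python
# Minimal holdout comparison: seasonal-naive baseline vs simple
# exponential smoothing on a synthetic monthly demand series.

def seasonal_naive(history, horizon, period=12):
    # Repeat the last observed seasonal cycle.
    return [history[-period + (h % period)] for h in range(horizon)]

def ses(history, horizon, alpha=0.3):
    # Simple exponential smoothing: flat forecast at the final level.
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

def mape(actual, forecast):
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

# Three years of demand with a mild trend and a holiday bump (synthetic).
series = [100 + 2 * t + (20 if t % 12 in (10, 11, 0) else 0) for t in range(36)]
train, test = series[:24], series[24:]

for name, model in [("seasonal naive", seasonal_naive), ("SES", ses)]:
    print(f"{name}: MAPE {mape(test, model(train, len(test))):.1f}%")
```

Swapping in ARIMA, gradient boosting, or a foundation model is a one-line change to the model list; the harness, not the architecture, is the commitment.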
The vendor ecosystem is consolidating around foundation models even as evidence mounts against their universal superiority. AWS completed its deprecation of Amazon Forecast, retreating from specialised forecasting-as-a-service -- a significant signal from the category's largest cloud provider. Foundation model vendors filled the gap: Google released TimesFM 2.5 (March 2026) with 200M parameters and 16k context length (an 8x expansion), integrated into BigQuery ML and Google Sheets for consumer-grade accessibility; Amazon Chronos-2 passed 600M HuggingFace downloads and added multivariate/covariate support; Salesforce released Moirai-MoE, whose sparse mixture-of-experts outperforms larger rivals at 28x parameter efficiency.

Enterprise adoption is real but narrow: 62% of enterprises report increased predictive analytics demand, and documented deployments span retail (The Very Group: 9.9% SKU management improvement across 8M+ forecasts), manufacturing (Foxconn: 8% accuracy gain, $553K annual savings), energy (renewable forecasting achieving 14% balancing cost reduction), and healthcare (peer-reviewed mortality/discharge prediction).

Yet peer-reviewed research keeps undermining the case for model complexity: a billion-scale benchmark (QuitoBench) on Alipay data shows deep learning matching foundation models with 59x fewer parameters at short context lengths, while transformer-based models underperform simple linear models on financial data due to variance-driven error. Theoretical work has formalised non-zero error bounds tied to partial observability, and calibration studies confirm that TSFMs maintain reliable uncertainty estimates -- enabling deployment in high-stakes domains, but not resolving the core efficiency tension.
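The "simple linear models" finding is easy to appreciate in miniature. A hedged sketch, not any benchmark's actual method: a direct linear map from the last 24 observations to the next 6, fitted by ordinary least squares on a synthetic series, in the spirit of DLinear-style baselines (all names and data here are ours).

```python
import numpy as np

def fit_linear_forecaster(series, lookback=24, horizon=6):
    # Build (lag window -> future window) training pairs, then solve
    # a single least-squares problem: the whole "model" is one matrix.
    X, Y = [], []
    for t in range(lookback, len(series) - horizon + 1):
        X.append(series[t - lookback:t])
        Y.append(series[t:t + horizon])
    X = np.asarray(X)
    X = np.hstack([X, np.ones((len(X), 1))])  # bias column
    W, *_ = np.linalg.lstsq(X, np.asarray(Y), rcond=None)
    return W

def predict(W, recent):
    # Apply the fitted map to the most recent lag window.
    x = np.append(np.asarray(recent), 1.0)
    return x @ W

rng = np.random.default_rng(0)
t = np.arange(240)
series = 10 + 0.05 * t + 2 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, t.size)

W = fit_linear_forecaster(series[:-6])          # hold out the final 6 points
forecast = predict(W, series[-30:-6])
print(np.round(forecast, 2), np.round(series[-6:], 2))
```

On trend-plus-seasonality data like this, the linear map tracks the holdout closely, which is the point: a baseline this cheap sets a high bar for any transformer to clear.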
Perhaps the most consequential finding is that traditional accuracy metrics (MAPE, MAE) correlate poorly with economic outcomes -- optimising for forecast precision can actually reduce profitability by ignoring pricing, substitution, and agency effects. The field's real barrier remains not which model to choose but whether forecasting teams are optimising for the right objective and whether zero-shot generic foundation models offer genuine ROI over domain-specific fine-tuning.
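The metric/economics gap fits in a few lines. A toy sketch with hypothetical asymmetric costs (all numbers invented): under a 10:1 stockout-to-holding cost ratio, the forecast with the lower MAPE incurs five times the economic loss.

```python
# Assumed unit costs for illustration only: a missed sale costs far
# more than a unit of excess stock.
UNDER_COST = 10.0  # lost margin per unit of unmet demand
OVER_COST = 1.0    # holding cost per unit of excess stock

def mape(actual, forecast):
    return 100 * abs(actual - forecast) / actual

def economic_cost(actual, forecast):
    gap = actual - forecast
    return UNDER_COST * gap if gap > 0 else OVER_COST * -gap

demand = 100
for name, f in [("A (lower MAPE)", 95), ("B (cost-aware)", 110)]:
    print(f"{name}: MAPE {mape(demand, f):.0f}%, cost {economic_cost(demand, f):.0f}")
# Forecast A: MAPE 5%, cost 50. Forecast B: MAPE 10%, cost 10.
```

A team optimising MAPE would ship forecast A and lose money doing it; optimising the cost function directly is what "the right objective" means in practice.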
— Negative signal: empirical evidence of fundamental failures in rule-based model selection, documenting context-dependent performance instability and practical maturity challenges.
— Production TSO deployment: empirical evaluation of TSFMs (Chronos-2, TabPFN-TS) on energy load forecasting, zero-shot competitive with task-specific models.
— Uber production deployment: Bayesian neural networks for demand forecasting and anomaly detection, with principled uncertainty decomposition (epistemic, aleatoric, distributional shift) at scale.
— Peer-reviewed empirical study benchmarking 200,000+ model configurations with wavelet-based methodology improvements and quantified financial forecasting results.
— Comprehensive energy sector benchmark: TSFMs outperform dataset-specific ML on 54 energy datasets (9 categories); deployment-grade rigour in a critical infrastructure domain.
— Critical assessment documenting failure modes during market regime changes; identifies structural barriers to forecasting effectiveness in real production environments.
— ICLR 2026 benchmarking framework: 10,000+ experiments evaluating forecasting architecture components; 92% of configurations beat SOTA, with 5.4% error reduction through systematic design exploration.
— Financial sector production deployment: end-to-end TimesFM 2.5 pipeline with fine-tuning on 14 instruments, demonstrating real-world adoption in algorithmic trading with technical implementation evidence.