Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in one or two domains — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain — hover for a brief summary, click for more detail

[Interactive chart: each domain plotted on a maturity axis from BLEEDING EDGE to ESTABLISHED]

Feature engineering, AutoML & predictive modelling

GOOD PRACTICE

AI that automates feature engineering, model selection, hyperparameter tuning, and end-to-end predictive model building. Includes automated feature discovery and neural architecture search; distinct from model monitoring, which evaluates deployed models rather than building them.
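The core loop this definition names can be sketched in a few lines. This is a minimal illustration using plain scikit-learn rather than any framework discussed below: a single cross-validated search spanning both model selection and hyperparameter tuning. Real AutoML systems layer feature synthesis, ensembling, and time budgets on top of this loop.

```python
# Minimal AutoML-style search: one cross-validated sweep over both model
# families and their hyperparameters, selecting the best by CV accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("model", LogisticRegression())])

# The search space covers model selection AND hyperparameter tuning at once:
# each dict swaps in a different estimator with its own parameter grid.
search_space = [
    {"model": [LogisticRegression(max_iter=1000)], "model__C": [0.1, 1.0, 10.0]},
    {"model": [RandomForestClassifier(random_state=0)], "model__n_estimators": [50, 100]},
]

search = GridSearchCV(pipe, search_space, cv=5, scoring="accuracy")
search.fit(X, y)
best_model = type(search.best_estimator_.named_steps["model"]).__name__
print(best_model, round(search.best_score_, 3))
```

Everything a managed AutoML service adds — automated featurization, neural architecture search, ensembling — extends this same search-and-select pattern.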

OVERVIEW

Automated feature engineering and AutoML are proven, commercially validated capabilities with a mature tooling ecosystem and documented ROI across multiple industries. The question for most organisations is no longer whether these tools work, but how to deploy them effectively. Cloud platforms from Google, Microsoft, and AWS offer GA-quality managed services with integrated Feature Stores for centralised feature management; open-source frameworks like AutoGluon and PyCaret have specialised into distinct performance niches confirmed by independent benchmarking across hundreds of datasets. Financial services adoption sits at 68%, and named deployments span fraud detection (50% scam loss reduction), predictive maintenance (24% accuracy improvements), healthcare (93% classification accuracy), and manufacturing (14-week ROI payback).

The central tension is an adoption-reality gap. Organisations with strong data engineering foundations report measurable returns within months, yet broader surveys show that most enterprises still struggle to scale AI projects beyond pilots. The bottleneck is rarely the model-building step that AutoML automates; it is upstream data quality, problem formulation, and organisational readiness. AutoML augments skilled practitioners; it does not replace the judgment required to frame problems, curate features, and govern deployed models. Recent research reinforces this: domain-specific feature engineering still outperforms fully automated pipelines in low signal-to-noise domains, and production deployment barriers (SDK deprecation cycles, version dependency conflicts, Feature Store reliability issues) show that implementation complexity persists despite platform maturity.
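To make the domain-specific feature engineering point concrete, here is a hypothetical sketch — the columns, thresholds, and fraud heuristics are invented for illustration, not drawn from any study cited here. It derives behavioural features from raw transactions that encode analyst knowledge a generic automated featurizer has no reason to construct.

```python
# Hypothetical domain-driven feature engineering: the raw columns carry little
# signal on their own; the derived features encode analyst knowledge.
import pandas as pd

tx = pd.DataFrame({
    "account": ["a", "a", "a", "b", "b"],
    "amount":  [20.0, 25.0, 400.0, 10.0, 12.0],
    "hour":    [10, 14, 3, 9, 11],
})

feats = tx.assign(
    # Ratio of each transaction to the account's typical spend: a domain
    # signal for anomalous amounts that the raw columns do not expose.
    amount_vs_typical=tx["amount"] / tx.groupby("account")["amount"].transform("median"),
    # Night-time activity flag, encoding the (invented) heuristic that
    # fraud skews toward off-hours.
    night=tx["hour"].between(0, 5).astype(int),
)
print(feats[["amount_vs_typical", "night"]])
```

The third row (a 400.0 transaction at 3 a.m. against a typical spend of 25.0) stands out sharply in the derived features while looking unremarkable in any single raw column — the kind of construction that still requires human problem framing.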

CURRENT LANDSCAPE

The vendor ecosystem has consolidated around a clear split: managed cloud services (Google Vertex AI, Azure ML, AWS SageMaker) for enterprise teams, and specialised open-source frameworks for practitioners who need fine-grained control. AutoGluon leads on accuracy for tabular workloads, PyCaret optimises for speed and memory efficiency, and older frameworks like Auto-sklearn and TPOT have moved into maintenance mode. LLM-powered AutoML agents emerged in late 2025 and outperform traditional frameworks on multimodal benchmarks; March 2026 research on LLM-driven feature engineering deployed at Databricks reports 19% cost savings, with feature engineering loops cut from weeks to 20-30 minutes. Recent vendor consolidation signals reinforce the momentum: Microsoft's Fabric AutoML GA with auto-featurization and MLflow integration, Databricks' early-stopping classification workflows, and H2O's FedRAMP "In Process" designation all demonstrate ecosystem maturity crossing into regulated and mainstream enterprise adoption. Industry forecasts put AutoML behind 55% of new enterprise ML models, reflecting mainstream penetration across all company sizes.

Production deployments demonstrate concrete value across sectors. Commonwealth Bank reduced fraud by 70% using automated feature engineering; industrial applications show SHAP-based feature selection improving RUL prediction by up to 24% in aero-engine maintenance; healthcare studies achieve 93% classification accuracy on wearable data using ensemble-driven feature selection; a Tier-1 automotive supplier raised overall equipment effectiveness from 68% to 81% with a 14-week payback. Facio, a Brazilian fintech serving 4M customers, achieved 60-70% training time reduction and 2-3x faster loan decisions using automated feature engineering and AutoML, with 80% accuracy improvement in production credit scoring. AWS SageMaker Autopilot, Google Vertex AI, and Azure ML report enterprise deployments with documented productivity gains (Deloitte 30-40% speed improvements). Three named-organization deployments on Vertex AI span e-commerce cross-store search, vending machine placement analytics, and autonomous vehicle image processing. The market grew from USD 2.21B in 2024 to USD 3.02B in 2025, projected at 36.8% CAGR through 2032.
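The aero-engine result above used SHAP values to rank features; as a simpler, dependency-light analogue of the same importance-based feature selection pattern, this sketch uses scikit-learn's permutation importance on synthetic data. Permutation importance is a stand-in here, not the cited study's method.

```python
# Importance-based feature selection: rank features by how much shuffling
# each one degrades held-out performance, then keep only the informative set.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# 5 informative features out of 15; the remaining 10 are pure noise.
X, y = make_regression(n_samples=400, n_features=15, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Measure importance on held-out data so noise features score near zero.
imp = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=0)

# Keep features whose shuffling meaningfully hurts the held-out score.
keep = np.where(imp.importances_mean > 0.01)[0]
print("selected features:", keep)
```

SHAP-based selection follows the same shape — score each feature's contribution, threshold, retrain on the survivors — but attributes importance per prediction rather than by global shuffling.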

These successes coexist with persistent operational friction. Only one-third of ML projects reach production, per Rexer Analytics, and a PwC survey of 4,454 CEOs found 56% reporting no significant AI ROI. Implementation complexity surfaces in deployment: Azure AutoML troubleshooting documentation reveals SDK deprecation (v1→v2 migration), scikit-learn and pandas version incompatibilities, and configuration failures that block production adoption. Root causes cluster around data quality and governance: 67% of documented failures trace to these factors, not model performance. Platform reliability remains uneven: Azure ML Feature Store production failures have blocked online inference pipelines, and practitioner benchmarks show AutoML accuracy gains come at steep computational cost. Interpretability gaps and overfitting risks limit deployment in regulated and high-stakes domains, as a cybersecurity study of eight tools across eleven datasets confirmed.

TIER HISTORY

Research: Jan-2017 → Jan-2017
Bleeding Edge: Jan-2017 → Jan-2021
Leading Edge: Jan-2021 → Jul-2024
Good Practice: Jul-2024 → present

EVIDENCE (142)

— Production infrastructure research demonstrating 5x acceleration of feature rollouts and 50-55% prevention of performance degradation at scale, advancing feature engineering maturity.

— Microsoft Fabric AutoML GA with auto-featurization and MLflow integration signals ecosystem consolidation and mainstream adoption across enterprise data platforms.

— H2O.ai FedRAMP 'In Process' designation at High Impact Level signals production-ready regulatory compliance, enabling government and regulated sector deployment.

— Databricks AutoML classification GA showing configurable hyperparameters, early stopping, and integrated serving endpoints—evidence of mature, production-ready tooling.

— Peer-reviewed survey of hyperparameter optimization techniques with real-world deployment examples (AlphaGo, sentiment analysis) documenting maturity and practical challenges.

— Amazon FeatPilot research on automatic feature augmentation from data lakes, advancing AutoML beyond static features to dynamic multi-hop feature discovery from enterprise data.

— Infor analyst report tracking enterprise-scale AutoML adoption gaps and growth patterns across large organizations, reflecting mainstream adoption.

— Facio fintech (4M customers) achieved 60-70% training time reduction, 2-3x faster decisions, 80% accuracy improvement using automated feature engineering and AutoML in production.

HISTORY

  • 2017: AutoML research and early product launches demonstrate efficiency gains (MIT ATM 100x speedup, IoT feature engineering 4-6 months to 2 days) and early enterprise adoption (H2O 20% Fortune 500), but production deployment barriers and practitioner skepticism highlight immaturity.

  • 2018: AutoML field matures with formalized research frameworks and ecosystem expansion (NeurIPS challenge, 300+ participants; Feedzai domain-specific tool). However, organizational adoption plateaus despite 93% enterprise AI investment: most firms stuck in pilots, 38% struggle with deployment at scale; practitioner analysis highlights that automation addresses only 5-10% of real ML work.

  • 2019: Commercial ecosystem expands with major vendor releases and domain-specific tools (H2O Driverless AI, Databricks AutoML Toolkit, Aible Advanced). Research confirms core problem has shifted from "automate model selection" to "automate data prep and deployment"—78% of projects stall before production; data quality and labeling at scale emerge as primary barriers, not algorithm selection.

  • 2020: AutoML ecosystem matures with formalized benchmarking across tools (AutoWEKA, TPOT, H2O, Auto-Sklearn, cloud platforms). Practitioner adoption measured at 57% hands-on exposure but perception gap persists; critical assessments highlight winner's curse in hyperparameter search and limited practical value to feature engineering stage. Vendor expansion into international markets (H2O-Dell Japan). Systematic reviews formalize AutoML research scope, confirming technical maturity but organizational integration barriers remain unresolved.

  • 2021: Major cloud vendors consolidate AutoML into unified ML platforms (Google Vertex AI). Real-world deployments demonstrate viability across sectors (EDF preventive maintenance, UPMC transplant matching, Adecco CV screening), but critical gaps emerge: fairness features remain inadequate across platforms; 20-30% of organizations report non-adoption; MIT researchers identify problem formulation as intractable automation barrier, revealing that most commercial systems still require domain expert involvement.

  • 2022-H1: AutoML adoption accelerates among mature practitioners (67% adoption rate, 37% YoY growth per O'Reilly survey). Ecosystem benchmarking confirms comparative tool maturity (AutoMLBench 100-dataset evaluation). Real-world deployments expand to automotive manufacturing (warranty forecasting, operationalization challenges documented). Environmental and resource constraints emerge as explicit maturity concerns (Green AutoML research); platform reliability issues surface in production (Azure timeouts, compute resource limits). Fairness gaps persist across major platforms, limiting deployment in regulated contexts.

  • 2022-H2: Ecosystem survey documents industry uptake via diverse deployment case studies (Commonwealth Bank 70% fraud reduction, USAA insurance automation 28% improvement). Academic research confirms field maturation and simultaneously identifies persistent barriers: problem formulation and domain expert involvement remain bottlenecks to full automation; fairness-specific features inadequate across major platforms. Cloud vendor consolidation continues with Google Vertex AI and Microsoft Azure AutoML as primary managed services. Resource constraints and computational cost emerge as structural adoption barriers alongside governance and orchestration challenges.

  • 2023-H1: AutoML adoption continues accelerating in financial services (68% adoption, +12 points YoY), and research documents mature platform ecosystems with strong benchmark performance. However, CHI 2023 and academic studies reveal critical barriers: users report insufficient customizability, transparency, and privacy controls requiring workarounds; AutoML library developers identify structural adoption limits (data prep costs dwarf modeling), constraining ROI. Shift from "citizen data science" to "efficient teams" narrative reflects market maturity. Ecosystem research confirms tool maturity but emphasizes human-in-the-loop requirements and cautious case-by-case deployment decisions.

  • 2023-H2: Healthcare applications accelerate (JMIR tutorial on medical imaging). Peer-reviewed research reviews field advancements in feature engineering and hyperparameter optimization. Institutional investment increases (AutoML Hannover ERC funding for explainability, BMUV support for Green AutoML). User reports surface tool stability issues: Azure AutoML data preparation failures and OutOfMemory errors in production pipelines highlight incomplete automation and resource management challenges. Academic research groups restructure to focus on human-centered and sustainable AutoML development.

  • 2024-Q1: AutoML ecosystem shows continued maturation with product feature releases (RapidMiner unsupervised feature selection, Qlik Cloud automation) and independent benchmarking (AMLB, ICEIS) confirming performance variations across 9+ frameworks. Healthcare case studies emerge (clinical imaging, surgical outcome prediction). Critical literature reviews synthesize 25+ documented adoption barriers and limitations. Framework efficiency trade-offs evident: AutoGluon leads accuracy, PyCaret excels in speed/memory, TPOT struggles with completion rates. User adoption reports highlight risks with imbalanced data and missing variable handling in automated workflows.

  • 2024-Q2: Product ecosystem consolidation continues (IBM-H2O partnership on Power Systems, Qlik Cloud feature expansion). Market forecasts accelerate (30%+ CAGR through 2028, $1B→$6.4B projection). Enterprise AI adoption metrics strengthen: 89% of firms report measurable ROI within 18 months, but 67% of failures traced to data quality and governance challenges. Analyst recognition sustained (H2O.ai Gartner Visionary position). Practitioner guidance emphasizes adoption barriers: customization constraints, interpretability deficits, complex problem limitations, regulatory deployment risks. Research confirms AutoML addresses discrete ML pipeline stages effectively while upstream problem formulation and downstream governance remain intractable human-led tasks.

  • 2024-Q3: Large enterprise deployments confirm production readiness: Nationwide Insurance uses H2O Driverless AI for automated feature engineering with reported cost savings in millions. Analyst recognition (Forrester Wave Q3 2024 names Google Cloud a Leader) signals ecosystem consolidation and mainstream acceptance. However, critical research (FSE 2024 analysis of 37 tools via 14.3K Stack Overflow questions) identifies MLOps (43% of deployment issues) and data preparation (25%) as primary adoption barriers. Vendor documentation (Google AutoML limitations) acknowledges model quality gaps versus manual training and reproducibility challenges, reflecting honest assessment of technical constraints. Ecosystem maturity confirmed but deployment remains operationally demanding.

  • 2024-Q4: Ecosystem maturation accelerates with active benchmarking and tool consolidation. Benchmarking papers (Extreme AutoML vs Google AutoML, 5-library comparisons of PyCaret, H2O, TPOT, Auto-sklearn, FLAML) demonstrate framework performance variations and performance competition. Large-scale healthcare deployments confirm feature engineering automation viability but highlight reproducibility gaps and data quality bottlenecks. Tool ecosystem churn surfaces: practitioner migration from abandoned PyCaret to maintained AutoGluon reflects maintenance sustainability concerns. Production deployment barriers persist: Azure AutoML conda environment conflicts cause endpoint deployment failures, confirming operational complexity. Research reviews synthesize automated feature engineering innovations and challenges, signaling field maturity. End-of-year state reflects mature category with strong adoption among advanced practitioners but persistent operational and governance barriers limiting mass-market deployment.

  • 2025-Q1: AutoML ecosystem consolidates with multi-vendor deployment adoption signals and continued benchmarking analysis. Named enterprise deployments expand (Yokogawa Electric DX via H2O Driverless AI across parallel projects, Nationwide Insurance millions in cost savings documented through Q1), and Google Cloud customer survey (400 customers) confirms 40% acceleration in time-to-insight. Benchmarking landscape broadens: Ready Tensor study compares 10 libraries with systematic performance trade-offs; existing frameworks (AMLB 9 tools, ICEIS 4-library comparison) continue establishing tool performance profiles. Multi-industry case study compilation (22 named deployments across real estate, finance, retail, healthcare) demonstrates adoption breadth and concrete metrics (time savings weeks-to-hours, churn detection, revenue gains). Critical research synthesis (multivocal review of 162 sources) documents 25 limitations including data constraints, interpretability gaps, computational cost, and bias risks—reflecting balanced view of maturity. Platform evolution signals consolidation: Azure SDK v1 deprecation and troubleshooting documentation reveal operational maturity and product lifecycle transitions. State of ecosystem at quarter-end reflects sustained good-practice status with strong enterprise adoption among advanced teams but persistent barriers (data quality, problem formulation, governance) limiting broader democratization.

  • 2025-Q2: AutoML benchmarking and maturity research deepens through mid-2025. Academic research expands with peer-reviewed evaluation of 16 AutoML tools across classification tasks and updated multivocal literature review (162 sources, 25 documented limitations) reaffirming balanced maturity assessment. Market projections strengthen: USD 4.65B market size (Q2 2025) projected at 48.4% CAGR through 2032 with segment concentration in data processing (39.7%) and BFSI (38.8%), confirming sustained adoption momentum. Practitioner deployments continue across sectors: H2O Driverless AI applied to retail demand forecasting with quick time-to-value, vendor guides detail comprehensive AutoML implementation strategies. Production deployment challenges persist: vendor knowledge bases document integration barriers (dependency conflicts, version incompatibilities) highlighting ongoing operational complexity despite ecosystem maturity. Tooling ecosystem steady-state confirmed through this period with no major releases or consolidation events. State reflects stable good-practice category with market validation and benchmark maturity, yet continuous friction in deployment integration and data preparation automation.

  • 2025-Q3: AutoML ecosystem consolidation accelerates with major open-source feature releases and confirmed enterprise production deployments. AutoGluon 1.4.0 (July 2025) introduces five new tabular model families (RealMLP, TabM, TabPFNv2, TabICL, Mitra) claiming state-of-the-art performance on small-to-medium datasets (<30K samples), advancing ecosystem technical maturity. Research validates feature engineering automation in specialized domains: decision-focused learning framework (Sept 2025) combines automated feature engineering with energy storage optimization, demonstrating domain-specific applicability. Enterprise ROI evidence strengthens: Flash.co reports 366% ROI on Azure ML-powered fraud detection and analytics platform (9.6-month payback, 30% efficiency gains across teams), confirming production deployment viability in finance. Market momentum sustained: ecosystem remains stable with no major consolidation events; open-source frameworks (AutoGluon, PyCaret, Auto-Sklearn) maintain active development; cloud vendors (Azure, Google, AWS) consolidate as managed leaders. State at quarter-end reflects mature good-practice category with sustained enterprise adoption, demonstrable ROI in specialized deployments, and continued ecosystem evolution, though upstream problem formulation and data quality barriers persist as limiting factors to democratization.

  • 2025-Q4: AutoML ecosystem consolidation completes with open-source tool specialization and renewed critical assessment. End-of-year ecosystem analysis identifies AutoGluon (Amazon) as dominant for tabular/multimodal, NNI (Microsoft) for deep learning, FLAML for speed optimization, while Auto-sklearn and TPOT transition to maintenance mode, signaling maturity through differentiation. Novel research advances accessibility: LLM-powered AutoML agent (Frontiers AI, Oct 2025) achieves superior multimodal performance vs. traditional frameworks across 10 datasets, validating new optimization paradigms. Enterprise production adoption continues: Airvantage deployment of H2O Driverless AI for real-time telecom risk scoring replaces static rules with automated feature engineering in production. Market validation strengthens: USD 2.21B (2024)→USD 3.02B (2025) growth at 36.81% CAGR projects USD 27.15B by 2032, with sustained segment concentration in BFSI (38.8%) and data processing (39.7%). Critical assessment emerges alongside positive signals: empirical study of 8 AutoML tools on 11 cybersecurity datasets (Oct 2025) finds no consistent winner, identifies overfitting and interpretability as persistent risks in high-stakes domains, tempering uncritical adoption narrative. Independent tool benchmarking (ICEIS 2025) confirms performance trade-offs: AutoGluon leads accuracy, PyCaret optimizes efficiency, TPOT frequently fails to complete, reinforcing tool specialization. State at quarter-end reflects stable good-practice category with sustained commercial validation, demonstrated ROI, and balanced signal of both innovation (multimodal LLM agents) and limitations (cybersecurity risk assessment), confirming that AutoML remains operationally demanding despite technical maturity.

  • 2026-Jan: AutoML evidence in January 2026 reveals stark adoption-reality gap and validates limits of algorithmic complexity over domain-specific feature engineering. New research demonstrates that in low signal-to-noise domains (financial prediction, 2.79M observations), manual feature engineering dramatically outperforms deep learning pipelines (Sharpe 1.30 vs. 0.07, return 272.6% vs. -5.1%), challenging narratives of algorithmic superiority and affirming role for skilled feature engineering. Adoption metrics uncover enterprise challenges: PwC survey (4,454 CEOs globally) finds 56% of organizations report zero significant ROI from AI investments; McKinsey data indicates only one-third successfully scaled AI across enterprise, with failures consuming billions in R&D costs. Practitioner benchmarks confirm AutoML trade-offs: H2O AutoML wins on accuracy vs. single models across 9 datasets but requires 7+ hours computation versus minutes, trading simplicity and interpretability for marginal performance gains. Market analysis projects continued growth: AutoML market expected to reach USD 27.15B by 2032 at 36.85% CAGR, maintaining BFSI (38.8%) and data processing (39.7%) segment focus. State at month-end reinforces good-practice assessment: AutoML is technically mature and commercially scaled, but deployment remains operationally complex with significant organizational barriers, and the practice remains most effective when combined with human feature engineering expertise rather than as a replacement for data science judgment.

  • 2026-Feb: February 2026 consolidates production maturity signals across manufacturing and financial services. Manufacturer survey (520 leaders) confirms 94% AI adoption with predictive AI at 48% and explicit shift from pilots to operational integration. Case evidence: Commonwealth Bank's 70% fraud reduction, AT&T's 2X ROI, and Tier-1 automotive supplier's OEE improvement from 68% to 81% with 14-week payback demonstrate real-world deployment success. Academic validation: peer-reviewed comparative study confirms cloud AutoML platforms deliver high-performing models without manual intervention, establishing production-readiness. However, deployment barriers persist: Azure ML Feature Store production failures (online materialization blocking inference pipelines) highlight reliability gaps; only one-third of ML projects reach production per Rexer Analytics, with root causes in organizational infrastructure and governance rather than model capability. Practice remains good-practice with validated commercial deployment at scale, but operational complexity and non-technical barriers continue limiting broader adoption.

  • 2026-Mar (Q1): March 2026 evidence demonstrates ecosystem maturity with advanced feature engineering techniques and continued platform consolidation. LLM-driven AutoML research (LeJOT-AutoML at Databricks) shows automated feature synthesis reducing engineering loops from weeks to 20-30 minutes with 19% cost savings; industrial applications show SHAP-based feature selection improving predictive accuracy 4.63–24.05% in safety-critical aero-engine maintenance. Healthcare studies validate ensemble-driven feature selection achieving 93% classification accuracy on wearable data. Cloud platform Feature Stores reach GA maturity: Vertex AI integrates centralized repository with automated drift detection and retraining. Independent analysis confirms AutoML now table-stakes across AWS, Google, Azure with 55% of new enterprise models created via automated pipelines. AWS SageMaker Autopilot documents multiple enterprise deployments with documented productivity gains (30-40% speed improvements). Three named production deployments on Vertex AI (vending machines, e-commerce, automotive) show model development timelines reduced to 'just months'. Implementation friction persists: Azure troubleshooting documentation reveals SDK v1→v2 migration barriers, scikit-learn/pandas version conflicts, configuration failures. Practitioner patterns show feature engineering remains most effective when combined with domain expertise rather than fully autonomous. State reflects stable good-practice with demonstrated LLM-agent advancement and ecosystem convergence on feature engineering infrastructure, sustained deployment challenges, and continued validation of ROI in specialized domains.

  • 2026-Apr (Q2): April 2026 scan (04-08 to 04-22) reveals production deployment maturation at massive scale offset by widening adoption-to-accountability gap. New evidence: Uber's Michelangelo platform operates 400+ production ML use cases with 20K training jobs/month and 15M predictions/second, documenting feature engineering practices (null handling, imputation consistency, drift detection) at hyperscale; Model Feature Agent (MoFA) deployed across three production systems demonstrating LLM-driven feature selection with operational constraints and measured outcomes. Harmonic Security case study: autonomous agent-based model tuning achieved 20% F1 improvement through systematic feature engineering and threshold tuning without human bias. PMTS production trading system demonstrates hundreds of engineered features (Parkinson volatility, regime indicators, microstructure signals) with versioned feature store, walk-forward validation, and drift detection achieving 67.69% win rate. Independent analyst assessment (ISG evaluation of 83 AI/data platforms) confirms ecosystem consolidation with H2O.ai named Overall Leader in Emerging Providers. Market analysis (Technavio) values AutoML market at USD 17.66B with 44.5% CAGR through 2030, cloud deployment dominating. Critical fairness research documents trade-off: fairness integration reduced accuracy 9.4% while improving fairness 14.5%—essential governance signal for regulated deployment. Operational barriers persist: Azure AutoML featurization generating 600+ features causes memory exhaustion blocking training—core pipeline fragility. Amazon SageMaker Automatic Model Tuning (AMT) demonstrates major vendor GA commitment to gradient-free hyperparameter optimization at enterprise scale.
State reflects consolidation: ecosystem vendors (H2O, AWS, Google, Azure) have reached feature parity on core AutoML; adoption breadth confirmed (USD 17.66B market with sustained BFSI/data-processing segment focus); production viability established through infrastructure maturity at scale (Uber 400+ use cases, Harmonic autonomous agents), but governance, fairness constraints, and operational reliability remain blocking factors to advancement beyond good-practice.

  • 2026-May: Platform GA releases and production research signal ecosystem consolidation across enterprise AutoML. Microsoft Fabric AutoML reached GA with auto-featurization and MLflow integration, adding to Databricks' early-stopping classification workflows and H2O.ai's FedRAMP "In Process" designation at High Impact Level—all signalling mainstream enterprise and regulated-sector readiness. Amazon published FeatPilot research on automatic feature augmentation from data lakes, advancing AutoML toward dynamic multi-hop feature discovery. A Brazilian fintech (Facio, 4M customers) reported 60–70% training time reduction, 80% accuracy improvement, and 2–3x faster loan decisions from automated feature engineering in production credit scoring. Research on intelligent elastic feature fading demonstrated 5x acceleration of feature rollouts with 50–55% prevention of performance degradation at scale. The AutoML market reached USD 3.02B (2025), maintaining 36.8% CAGR trajectory, with 55% of new enterprise ML models now created via automated pipelines.
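Several 2026 entries above mention automated feature drift detection in production Feature Stores. One widely used technique is the Population Stability Index (PSI): bin a feature on training data, then compare bin frequencies at serving time. A minimal sketch, assuming nothing from any vendor API; the 0.1 (watch) and 0.25 (act) thresholds are common rules of thumb, not a standard.

```python
# Feature drift via the Population Stability Index: bin edges come from the
# training distribution, then live data is scored against those same bins.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between a training sample and a live sample of one feature."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range live values
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10_000)
stable = rng.normal(0.0, 1.0, 10_000)    # same distribution: PSI near zero
shifted = rng.normal(0.8, 1.0, 10_000)   # mean shift: PSI well above 0.25

print(round(psi(train, stable), 3), round(psi(train, shifted), 3))
```

Managed Feature Stores wire this kind of check (or a KL/KS-style equivalent) into scheduled jobs per feature, alerting or triggering retraining when a threshold is crossed.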

TOOLS