Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain — hover for a brief summary, click for more detail

[Interactive chart: each domain plotted along a maturity axis running from BLEEDING EDGE to ESTABLISHED]

Budget variance analysis & narrative explanation

GOOD PRACTICE

AI that analyses budget variances and generates narrative explanations of why actuals deviated from plan. Includes automated waterfall decomposition and natural-language variance commentary; distinct from financial reporting, which presents results rather than explaining variances.

OVERVIEW

AI-driven budget variance analysis has moved past proof-of-concept into proven, accessible tooling. The practice automates what was once one of the most labour-intensive steps in the close cycle: decomposing actual-versus-plan deviations into their drivers (price, volume, mix) and generating narrative explanations fit for management review. GA features from Microsoft, IBM, Pigment, and HighRadius now handle this end-to-end, and a Workiva survey of nearly 1,500 finance professionals found 91% reporting that AI improved the timeliness of financial decisions.
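
The decomposition itself is ordinary arithmetic; the vendor value is running it continuously against governed planning data and drafting the commentary. Below is a minimal sketch of one common price/volume/mix convention on made-up figures; the product names, numbers, and reconciliation check are illustrative assumptions rather than any cited vendor's method.

```python
# Illustrative price/volume/mix (PVM) decomposition for a single revenue line.
# Product names and figures are made up; production tools run the same arithmetic
# against governed planning-cube data and then draft the commentary.

def pvm_decomposition(budget, actual):
    """Split actual-vs-budget revenue variance into price, volume and mix drivers.

    budget / actual: dict of product -> {"volume": units, "price": unit price}.
    Returns per-driver totals plus the total variance for reconciliation.
    """
    bud_vol = sum(b["volume"] for b in budget.values())
    act_vol = sum(a["volume"] for a in actual.values())

    price = volume = mix = 0.0
    for prod, b in budget.items():
        a = actual[prod]
        vol_at_budget_mix = act_vol * (b["volume"] / bud_vol)  # actual total volume held at budget mix
        price  += (a["price"] - b["price"]) * a["volume"]
        volume += (vol_at_budget_mix - b["volume"]) * b["price"]
        mix    += (a["volume"] - vol_at_budget_mix) * b["price"]

    total = (sum(a["volume"] * a["price"] for a in actual.values())
             - sum(b["volume"] * b["price"] for b in budget.values()))
    assert abs(total - (price + volume + mix)) < 1e-6  # drivers must reconcile to the total
    return {"price": price, "volume": volume, "mix": mix, "total": total}


budget = {"standard": {"volume": 1000, "price": 50.0}, "premium": {"volume": 200, "price": 120.0}}
actual = {"standard": {"volume": 950,  "price": 52.0}, "premium": {"volume": 260, "price": 115.0}}

r = pvm_decomposition(budget, actual)
print(f"Revenue variance {r['total']:+,.0f} vs plan: "
      f"price {r['price']:+,.0f}, volume {r['volume']:+,.0f}, mix {r['mix']:+,.0f}")
# -> Revenue variance +5,300 vs plan: price +600, volume +617, mix +4,083
```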

The question facing finance teams is no longer whether the technology works, but whether their data and processes are ready for it. Vendor capability is mature; the constraint is organisational. Data quality, multi-entity complexity, and the enduring need for human review on publication-ready narratives set the pace of rollout. For teams with clean, well-governed planning data, variance automation delivers measurable close-cycle compression. For those without it, the tooling outpaces the foundation.

CURRENT LANDSCAPE

The vendor ecosystem has consolidated around a handful of production-grade platforms. Microsoft Copilot for Finance reached GA in October 2025 with end-to-end recurring variance workflows, bringing automated decomposition and narrative drafting into the productivity suite used by most enterprise finance teams. Early adopters report 30-40% close-cycle reductions and 50% faster variance analysis. Pigment's Analyst Agent, launched the following month, automates variance narratives across all cost centres; customers including Coca-Cola, Unilever, ServiceNow, and Supercell point to days of manual work eliminated per cycle—Supercell compressed P&L updates from 2 days to 4 minutes. Carta's deployment with Pigment achieved 80% reduction in data aggregation time. IBM Planning Analytics added AI-driven price/volume decomposition with generated summaries in early 2026, and HighRadius continues to scale past 1,000 customer deployments. V7 Labs launched a production Business Performance Analysis Agent in April 2026, reducing monthly variance reporting from 1-2 days to 10-15 minutes.

Adoption metrics are strong at the top. Workiva's survey of 1,497 finance professionals found 65% already using AI in quarterly or annual disclosures and 91% reporting AI improved decision timeliness. But a critical adoption gap has emerged: only 19% of finance controllers report actual AI adoption versus 51% of CFOs, revealing that presentation-layer automation (dashboards, narratives) masks unchanged manual data preparation and reconciliation workflows underneath. Broader ROI evidence remains sobering: Deloitte's survey of 1,854 executives found only 6% of AI projects achieved payback within a year, and 80% of financial services AI projects fail outright due to governance and data foundation gaps. Yale's Fin-RATE research documents systematic LLM failures on temporal financial reasoning critical to variance analysis, with 18.6% accuracy degradation on multi-period tasks.

The technical barriers are equally binding. Washington State University researchers found that ChatGPT achieved 80% accuracy in answering financial questions but only 16.4% accuracy in identifying false hypotheses, with consistency degrading to 73% across repeated queries—a critical failure mode for variance narrative reliability where models generate fluent-sounding explanations that may be factually wrong. A 2026 CFO.com benchmark found that even the best AI models fail 1 in 5 accounting tasks, and independent Claryx analysis documents a 41% hallucination rate in financial NLP, directly constraining the reliability of published commentary. Accounting domain experts note that LLM-based variance analysis fails at validating whether driver classifications are correct or whether explanations reflect actual system behavior; errors compound silently as outputs become inputs to subsequent periods without grounding in live data. A financial services case study documented an AI model with 91% accuracy that failed 3x more often than expected in production due to miscalibration—confidently wrong decisions that users cannot detect. EY surveys put combined enterprise AI losses at $4.4 billion, and MIT research finds that 95% of generative AI pilots fail to deliver measurable returns.
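
This is why review gates persist: a common mitigation is to mechanically reconcile every figure a drafted narrative quotes against the computed variance table before it reaches a reviewer, so fluent but unsupported numbers are caught rather than published. The sketch below illustrates such a post-hoc check under assumed data; the draft text, driver totals (rounded from the earlier sketch), and tolerance are invented for illustration and do not describe any vendor's implementation.

```python
import re

# Illustrative guardrail: reconcile every dollar figure quoted in an AI-drafted
# variance narrative against the computed driver table before it reaches review.
# The draft text, driver totals (rounded from the earlier sketch) and tolerance
# are assumptions for illustration, not any vendor's implementation.

computed = {"price": 600.0, "volume": 617.0, "mix": 4083.0, "total": 5300.0}

draft = ("Revenue finished $5,300 favourable to plan, driven mainly by mix (+$4,083) "
         "and price (+$600); volume contributed +$671.")  # 671 is a deliberate transposition of 617

def quoted_figures(text):
    """Extract dollar amounts quoted in the narrative as floats."""
    return [float(m.replace(",", "")) for m in re.findall(r"\$([\d,]+(?:\.\d+)?)", text)]

def unsupported_figures(text, facts, tolerance=1.0):
    """Return quoted figures that do not match any computed driver within tolerance."""
    return [q for q in quoted_figures(text)
            if not any(abs(q - abs(v)) <= tolerance for v in facts.values())]

issues = unsupported_figures(draft, computed)
if issues:
    print(f"Hold for human review: unsupported figures {issues}")   # -> [671.0]
else:
    print("All quoted figures reconcile to the variance table.")
```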

The gap between technically capable tooling and organisationally ready finance functions remains the defining friction. Battery Ventures surveyed 129 CFOs and found variance analysis listed as a priority automation target, yet only 4% reported pilot success rates above 50%, reflecting a systematic implementation-to-value gap. Governance failures—inconsistent data definitions, missing baseline metrics, and unaccountable pilots—are systematically killing AI ROI in financial services, not technical limitations. 70% of enterprise AI projects fail to reach production, with data quality cited as a barrier by 61% and projects scoped as large transformations failing 78% of the time. A 2026 Bain report finds that 83% of CFOs plan AI budget increases over two years, yet only 15-25% have scaled AI to production, revealing the structural chasm between adoption intent and realized deployment. An independent Forrester study quantifies Copilot's business case across 12 organizations, but even high-profile deployments expose a measurement problem: Deloitte's 2026 research on AI ROI frameworks shows that organizations struggle to connect AI spending to financial outcomes due to baseline gaps and accountability voids. Human-in-the-loop review persists as standard practice for published narratives, and for most teams evaluating adoption the binding constraint is data governance maturity and organisational discipline, not vendor selection.

TIER HISTORY

Research: Jan-2023 → Apr-2024
Bleeding Edge: Apr-2024 → Jul-2025
Leading Edge: Jul-2025 → Oct-2025
Good Practice: Oct-2025 → present

EVIDENCE (68)

AI can't read an investor deck (Research Papers)

— Mercor benchmark of frontier models on 25 financial tasks finds 16-20pp accuracy drop for visual vs text inputs and systematic mathematical reasoning failures, directly undermining AI reliability for variance narrative analysis.

— Alteryx GA starter kit for AI-driven budget variance diagnosis with threshold-based alerting, driver decomposition, and narrative generation directly addresses the practice at scale.

— Independent tool evaluation on variance explanation shows Claude 64.4% vs Copilot 4.4% on Wall Street Prep benchmark; identifies Copilot's data fabrication risk and Claude's superior source attribution.

— University facilities case: AI variance detection within 48 hours vs 5-month manual lag, preventing $1.4M downstream cost; demonstrates variance speed as cost multiplier across cascading and inflation-driven overruns.

— Product evaluation finds Copilot variance-explanation failure mode: fabricates data when semantic model gaps exist; identifies governance requirements and risk as more critical than vendor capability.

— Reveals critical adoption gap: 83% of CFOs plan AI spending increases, only 31% report positive outcomes, 15-25% have production deployment; ROI frameworks show time-to-resolve-variances as emerging metric.

— Credible benchmark assessment of AI accuracy limitations in accounting tasks including month-end close where variance analysis operates; latest data documenting deployment risk factors.

— Microsoft Dynamics 365 Copilot GA adds embedded AI agents in period-end close workflows (including variance analysis) with reported 25-30% performance gains; major ERP platform brings generative AI to variance-related finance operations.

HISTORY

  • 2023-H1: Variance analysis automation integrated into record-to-report platforms (HighRadius, Workiva); fintech companies (Ramp) demonstrated variance management case studies; adoption remained concentrated in large enterprises with mature FP&A functions.
  • 2024-Q1: ClickUp launched dedicated AI agents for budget variance analysis; academic research documented 23% variance reduction and 35% accuracy gains in construction project deployments; adoption barriers persisted including data quality, organizational resistance, and implementation complexity.
  • 2024-Q2: Anaplan deployments scaled to mid-market (Abilene Christian University, 6-week production implementation); independent analyst validation emerged (Nucleus Research: 2 months annual work saved); Gartner survey identified budget variance explanation as #1 finance GenAI priority; public sector began AI-assisted workflows; human-in-the-loop verification remained industry standard, limiting pure automation potential.
  • 2024-Q3: Platform maturity expanded (HighRadius, Workiva, D365 Copilot); industry adoption signals strengthened (KPMG survey: 71% AI in finance, 57% exceeding ROI expectations; 66% of finance leaders see GenAI as most impactful for variance explanation); vendor confidence reflected in product roadmap investments; critical practitioner analysis highlighted persistent data preparation and causal reasoning bottlenecks limiting automation potential.
  • 2024-Q4: Mainstream platform integration accelerated: Microsoft Business Central added GA variance analysis Power BI reports (November); D365 Finance deployed AI co-pilots for variance-to-narrative workflows (December); KPMG US survey showed 78% of companies piloting/using AI for financial planning. Adoption remained constrained by data quality requirements and mid-market complexity rather than technical feasibility.
  • 2025-Q1: HighRadius and Workiva continued platform maturity (HighRadius reporting 1100+ customers with 95% forecast accuracy claims; Workiva powering public sector budget automation including City of Dubuque's production budget book workflow). Broader AI adoption uncertainty signaled by enterprise ROI timelines extending beyond six months; data quality challenges cited by 85% of finance leaders as primary barrier. Practice remained technically mature but adoption paced by organizational readiness and data governance rather than vendor availability.
  • 2025-Q2: Workiva reported 30% of customer base enabled AI features in production (May 2025 earnings call), signaling acceleration from prior periods and vendor monetization confidence. Competitive landscape expanded with specialized variance tools (Vena Copilot, Planful Predict) explicitly marketed for AI-assisted variance explanation. Practitioner adoption signals emerged (57% of finance professionals piloting or using AI). Organizational bottlenecks remained: data quality, human review requirements, and causal reasoning complexity in multi-entity models continued to pace mid-market adoption.
  • 2025-Q3: HighRadius scaled to 1000+ customers with AI-powered variance insights (99% accuracy, 60%+ automation); Microsoft Copilot for Finance entered public preview with variance analysis in Excel. Adoption expanded to 79% of FP&A teams using AI. Critical limitations documented: AI struggles with nuanced commentary, causal reasoning in multi-entity models, and stakeholder confidence. Human-in-the-loop verification remained industry standard, constraining pure automation ROI for mid-market organizations.
  • 2025-Q4: Microsoft Copilot for Finance 'Automate Variance Analysis' reached GA in October 2025, enabling recurring end-to-end variance automation with notifications. Pigment launched Analyst Agent in November for departmental variance narratives. However, economic headwinds emerged: Deloitte survey (October) found only 6% of AI projects broke even in under one year; MIT study found 95% of organizations saw zero GenAI ROI. Practice remained technically mature but ROI timelines and scaling challenges constrained mid-market adoption growth.
  • 2026-Jan: IBM Planning Analytics Workspace (v3.1.3) delivered GA variance analysis with AI-driven driver decomposition (price/volume) and AI summaries; continued mainstream platform maturity with minimal new deployment announcements signaling consolidation around existing vendor ecosystem.
  • 2026-Feb: Ecosystem consolidation accelerated with steady product advancement across major vendors: Workiva survey (1,497 respondents) confirmed 91% of finance leaders believe AI improved decision timeliness and 65% actively use AI in disclosures, suggesting mainstream production adoption. Pigment and Copilot for Finance offered production-grade variance automation with documented efficiency gains (30-40% close cycle reduction). Critical headwinds persisted: Gartner predicted 60% of AI projects lacking data readiness would be abandoned through 2026, and broad MIT/Gartner research showed 95% of organizations saw zero GenAI ROI. Implementation complexity and ROI expectations remained the constraining factors rather than technical capability.
  • 2026-Mar: Governance failures emerged as the primary adoption barrier, not technology: LedgerUp analysis revealed 32-point perception gap between CFOs claiming AI adoption and controllers reporting unchanged manual workflows. Finance industry analysis documented 80% overall AI project failure rate, with root cause identified as governance and data foundation gaps, not technology limitations. Accounting VC analysis cited Yale Fin-RATE research showing 18.6% accuracy degradation on temporal financial reasoning tasks critical to variance analysis. Named customer deployment (Carta/Pigment) documented 80% reduction in data aggregation time with 12-week implementation, validating production viability for data-ready organizations. Market consolidation continued with Aleph FP&A comparison showing AI-powered variance explanation as 2026 table-stakes capability across seven major platforms.
  • 2026-Apr: V7 Labs launched a production Business Performance Analysis Agent compressing monthly variance reporting from 1-2 days to 10-15 minutes, reinforcing the time-savings case for narrowly-scoped deployments. Ecosystem expansion included Datarails (product GA for AI-assisted variance narrative drafting), Microsoft Dynamics 365 Copilot (25-30% close cycle gains), and continued Microsoft Copilot for Finance adoption across the enterprise base. Critical measurement data emerged: CFO.com benchmark showed best-in-class AI models fail 1 in 5 accounting tasks; Claryx documented 41% hallucination rate in financial NLP; Bain found 83% of CFOs planning AI increases but only 15-25% have scaled to production; Deloitte released 2026 framework for measuring AI ROI as organizations struggle with baseline gaps and accountability voids. The ROI gap sharpened: 70% of enterprise AI projects fail to reach production, with narrowly-scoped automation (54% success) significantly outperforming large-transformation initiatives (22% success) — validating focused, incremental rollout strategy for most finance teams.
  • 2026-May: Alteryx released a GA starter kit for AI-driven budget variance diagnosis with threshold alerting and narrative generation; independent tool benchmarks showed Claude at 64.4% vs Copilot at 4.4% on variance explanation (Wall Street Prep benchmark), with Copilot flagged for data fabrication risk when semantic model gaps exist. A Mercor benchmark of frontier models across 25 financial tasks found 16-20pp accuracy drops for visual inputs and systematic mathematical reasoning failures, adding to growing evidence that model reliability for variance narratives requires careful scope control and data grounding. A university facilities case study documented AI variance detection within 48 hours versus a 5-month manual lag, preventing $1.4M in downstream costs.