The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
AI that generates financial reports, management commentary, and board-ready presentations from financial data. Includes automated variance narratives and board pack generation; distinct from budget variance analysis, which analyses data rather than producing formal reports.
AI-generated financial narratives — variance commentary, board pack summaries, management reports — remain stuck in early-pilot territory despite production-ready tooling from several vendors. The practice uses generative AI to produce formal written output from financial data, distinct from variance analysis (which interprets data) or forecasting (which projects it). The appeal is obvious: automating the labour-intensive drafting that sits between structured numbers and investor- or board-facing prose.
The defining tension is capability without organisational follow-through. Platforms can generate 200-page board reports in seconds, yet Forrester finds only 10-15% of AI projects scale beyond pilot, and a PwC survey of over 4,400 CEOs reports 56% seeing zero financial return from AI deployments. Trust is the core constraint — only 14% of mid-market CFOs fully trust AI-produced accounting content, and 97% demand human oversight. Vendors have solved the generation problem; governance, data quality, and credible ROI frameworks have not kept pace. Until organisations close that gap, this practice will continue to stall at the bleeding edge.
The vendor ecosystem now extends beyond OneStream and Workiva to include BlackLine (Verity Narrate), Board Intelligence (Report Writer and Lucia), Magical (GA financial statement automation), and AI-native tools (V7 Labs, Claryx, Prime AI, LayerNext). All are shipping production-ready narrative-generation capabilities. Workiva's Q1 2026 results (May 5) demonstrate vendor momentum: $247M revenue (+20% YoY), 6,665 customers, 112.4% net retention, and 605 customers at ACV >$300K (+38%), indicating expanding high-value deployments. OneStream claims 86% faster planning cycles and 27% improved forecast accuracy; both Workiva and OneStream have doubled AI bookings and deployed agentic narrative features.
Tier-1 financial institutions now validate enterprise-scale deployment: JPMorgan Chase's DocLLM processes 1.2M documents monthly with 98.7% accuracy, reducing manual review by 76%; Goldman Sachs achieved 99.2% accuracy in translating research across 17 languages; Morgan Stanley deployed GPT-4 across 16,000 wealth advisors, generating portfolio summaries in 47 seconds versus 14 minutes manually. These are full production deployments, not pilots.
Named deployments outside the tier-1 banks, including SMBs, also demonstrate viability. Carlyle Group ($195B AUM) and Takeaway.com went live on OneStream in early 2026; named case studies from Solution Analysts show a US multinational reducing forecast cycles from 2-3 weeks to under 1 week; Prime AI deployed Claude-powered board narrative generation for a UAE regulated financial services firm, cutting monthly close from 2 days to 2 hours (20 person-days recovered annually). Board Intelligence reports 45% time savings at Nationwide; Accenture's internal system achieved 95% automation across 737 company codes globally, though it still required human review. Inoxoft documented an SMB client reducing financial report generation from 5 days to <1 day (70% reduction) with <1% error rate. These are real productivity wins in controlled settings.
Yet adoption remains constrained by governance, hallucination risk, and trust barriers, not capability. A KPMG survey (May 2026) of 1,000+ finance leaders reports 75% now actively using AI (+150% from 30% in 2024), but only 42% describe systems as "assurance-ready" for financial statement verification; critically, 45% of AI-generated finance content contains hallucinations or inaccuracies. A Harris Poll of 300+ finance leaders found 92% use AI tools but only 28% see measurable financial impact; 33% cannot audit or explain AI-driven results; only 43% are confident AI fits within existing financial controls and audit frameworks. Only 14% of mid-market CFOs fully trust AI-produced accounting content. Grant Thornton's 2026 survey of 950 banking executives found only 18% confident they could pass an independent AI audit of controls, with 50% citing governance barriers as limiting AI performance. Most critically, a May 2026 ChatFin study shows 82% of midsize companies deployed agentic AI for close/reporting automation, yet only 7% report strong ROI: deployed systems are not delivering value at scale. Adoption in core finance workflows remains stuck: 60% of CFOs believe AI is transformative, but only 11% actively use it in core reporting workflows; 35% remain in pilots, unable to move to production because of data quality and governance gaps. Three specific friction points block progress: system and data constraints (inconsistent, untimely data sources); change management (skill gaps, fear of displacement); and AI-human handshake governance (unclear override protocols, undefined responsibility boundaries).
Production failures have crystallized the risk: Deloitte Australia (June 2026) refunded $291K of a $440K government report containing fabricated academic citations and non-existent court judgment quotes; EY Canada simultaneously retracted a consulting report with 59% hallucinated citations, and the contaminated source data subsequently appeared in Claude and ChatGPT responses. Together these cases exemplify the liability exposure and systemic risk that arise when AI narratives fail quality assurance. Regulatory bodies (COSO, FINRA, FRC, PCAOB) have explicitly signaled that AI-generated financial narratives require governance frameworks; PCAOB guidance (May 2026) establishes that AI influencing "numbers, estimates, journal entries, reconciliations, disclosures" becomes part of SOX/ICFR control environments, requiring audit trails and human accountability. COSO released its roadmap in April 2026, and regulatory compliance timelines (EU AI Act August 2026, OSFI September 2026) are accelerating institutional governance maturity. The core barrier is not technical: platforms can generate board packs in minutes, and tier-1 financial institutions demonstrate 98%+ accuracy in controlled production deployments. The barrier is organisational: governance structures, audit trails, escalation protocols, and liability frameworks for AI-generated financial narratives remain largely unbuilt at scale, creating a structural adoption ceiling that persists despite vendor product maturity and is reinforced by growing hallucination evidence.
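The audit-trail and human-accountability requirement described above can be made concrete with a small sketch. The Python example below is an illustration only: it is not drawn from PCAOB or COSO text or from any vendor's product, and the record fields, the function names (`record_generation`, `sign_off`, `releasable`) and the model identifier are all hypothetical. It assumes a minimal trail that hashes the source figures and the generated narrative, and that blocks release until a named human reviewer has signed off.

```python
import hashlib
import json
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Any, Optional


def sha256_of(obj: Any) -> str:
    """Hash any JSON-serialisable object in a stable, key-sorted form."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def now_utc() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class NarrativeAuditRecord:
    """Hypothetical audit record for one AI-generated narrative."""
    model_id: str
    source_data_sha256: str   # fingerprint of the figures the model saw
    narrative_sha256: str     # fingerprint of the text it produced
    generated_at: str
    reviewer: Optional[str] = None      # named human accountable for release
    reviewed_at: Optional[str] = None


def record_generation(model_id: str, source_data: dict, narrative: str) -> NarrativeAuditRecord:
    """Create an unreviewed record at generation time."""
    return NarrativeAuditRecord(
        model_id=model_id,
        source_data_sha256=sha256_of(source_data),
        narrative_sha256=sha256_of(narrative),
        generated_at=now_utc(),
    )


def sign_off(record: NarrativeAuditRecord, reviewer: str) -> NarrativeAuditRecord:
    """Return a copy of the record carrying the reviewer's sign-off."""
    if not reviewer.strip():
        raise ValueError("a named reviewer is required")
    return replace(record, reviewer=reviewer, reviewed_at=now_utc())


def releasable(record: NarrativeAuditRecord) -> bool:
    """A narrative may leave the finance team only after human sign-off."""
    return record.reviewer is not None


draft = record_generation(
    "hypothetical-model-v1",
    {"revenue": 100, "budget": 90},
    "Revenue exceeded budget by 10.",
)
print(releasable(draft))                                       # False
print(releasable(sign_off(draft, "controller@example.com")))   # True
```

The record is immutable, so a sign-off produces a new record rather than editing the old one; a real control environment would also need to persist these records and tie them to the reporting period, which this sketch leaves out.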
— High-profile incident: KPMG withdrew agentic AI report after UBS, NHS, Swiss Railways, Transport for London publicly contradicted claims about their AI usage. GPTZero attributed errors to AI hallucinations, undermining professional services credibility.
— Rigorous compilation of verified hallucination rates and documented incidents with legal consequences; debunks inflated cost figures and provides independent benchmarks from Stanford and Vectara.
— Financial Stability Board consultation report proposing 12 sound practices across AI lifecycle (development, deployment, monitoring, management). Authoritative regulatory framework reflecting board/senior management perspectives on governance for financial institutions.
— KPMG 2026 Global AI in Finance Report: 75% of organizations actively using AI (up from 30% in 2024); 71% report meeting or exceeding ROI. Only 42% are 'assurance-ready'; only 29% track AI adoption failures.
— Fintech analyst (CloudFintech) distinguishes high-deployment mediated outputs (fraud scoring, monitoring with human review) from cautious unmediated deployment (customer chatbots). Financial reporting narratives fit mediated category where deployment scales rapidly.
— Large-scale study (480M AI outputs, Jan–Apr 2026) shows multi-model verification reduces hallucination from 8.3% to 3.2%. Financial sector baseline: 9.1% single-model hallucination, indicating financial data especially vulnerable to LLM fabrication.
— KPMG survey of 1,800 companies across 6 industries: 72% piloting or using AI in financial reporting; 99% expected to adopt within 3 years. 57% planning generative AI implementation specifically for narrative generation.
— Expert analysis from Stripe/TaxJar Global Indirect Tax Lead (PhD/MBA/LLM in taxation) on why AI language models fail in financial/tax compliance: confidence mirage, RAG limitations, and hallucination as architectural inevitability.
2023-H1: Generative AI emergence creates high executive interest in financial applications, but deployment remains nascent. Academic evidence (Turkish firm study) shows potential for reporting accuracy improvement. Regulatory bodies (FASB) and major advisory firms begin signaling need for governance frameworks. 60% of executives remain 1-2 years from first implementation; governance, technology, and talent gaps are widespread barriers.
2023-H2: AI adoption moves into mainstream conversation with clearer use case definition (BCG reports generative AI will create reports and explain variances); 86% of financial institutions expect significant AI increase. However, practitioner sentiment remains mixed—FP&A leaders express skepticism about automating narrative storytelling. Technical barriers become evident: independent study shows LLMs fail on SEC filings (70% error rate for Llama2, 81% failure for GPT-4 on standard queries). Regulators formally identify AI risks to financial stability. Governance and accuracy concerns dominate, shifting focus from capability to safe deployment.
2024-Q1: Vendor tooling reaches general availability (Govrn, OneStream); real-world deployment begins despite governance gaps. AICPA survey shows 26% of finance teams experimenting, 6% implemented, but 71% report risk concerns. Gartner forecasts 80% adoption by 2026. Critical governance deficit emerges: only 12% of boards had in-depth AI discussions. Problem-driven adoption accelerates as Board Intelligence research reveals board pack crisis (226-page average, low reader satisfaction), creating market demand for AI narrative solutions.
2024-Q2: Real-world deployment evidence emerges: Doyon generates 200+ page board reports in seconds via OneStream automation. Academic research advances the field (FASTER framework for multimodal financial summarization). Adoption surveys show text/data summarization among top generative AI use cases, though one-third of organizations still in evaluation phase. Critical reality check: 95% of AI use remains internal-facing due to accuracy/compliance risks. Research highlights persistent LLM limitations on financial decisions (universal failure on tax questions, arithmetic errors). Regulatory pressure accelerates: Treasury issues RFI on AI in financial services, focusing on compliance and data privacy risks. Adoption shifts from "is this possible?" to "how do we manage accuracy and governance at scale?"
2024-Q3: Adoption continues despite ROI headwinds. Gartner survey shows 58% of finance functions using AI (up 21 points from 2023), confirming mainstream market penetration across finance. However, critical barriers persist: Georgia Tech study reveals structural LLM limitations (biases toward large-cap companies, high hallucination rates on financial data); Gartner predicts 30% of GenAI projects will be abandoned after POC by end-2025 due to poor data quality and escalating costs ($5-20M deployment). Governance risk awareness intensifies—281 Fortune 500 companies now flag AI as risk factor (473% increase), highlighting mounting board-level concern about AI deployment in sensitive financial contexts. The window captures widening adoption paired with persistent accuracy and cost-effectiveness challenges that constrain enterprise-wide rollout.
2024-Q4: Practice reaches mainstream adoption scale with significant governance gaps. KPMG research of 2,900 orgs shows 71% using AI in finance, with reporting as the most common use case (2/3 piloting or deployed). Vendor commitment accelerates: OneStream launches pre-built AI models for reporting workflows. However, adoption-reality disconnect persists: Federal Reserve analysis finds AI rhetoric no longer predicts capital investment; Deloitte board survey shows 45% have no AI on board agenda, only 3% feel ready for deployment. Treasury report acknowledges growing AI use while emphasizing governance, privacy, and bias risks. Window captures paradox of mainstream adoption without corresponding organizational readiness.
2025-Q1: CFO demand for narrative generation becomes explicit and urgent. Bain survey shows 79% of CFOs planning to increase AI budgets, with 94% believing gen AI can benefit finance; CFOs specifically seek solutions for generating P&L variance narratives and management commentary. OneStream formalizes dedicated Narrative Reporting product with integrated workflows. However, production deployment remains severely constrained: CDO survey shows 67% unable to move AI pilots to production due to data quality barriers; 97% of data leaders struggle to demonstrate business value. Window reveals growing sophistication of requirements paired with persistent infrastructure and governance gaps blocking scale.
2025-Q2: Vendor maturity and regulatory acceptance intensify, but governance concerns emerge as critical barrier. OneStream announces GA of SensibleAI Studio with 30+ AI routines for narrative generation (June). Regulatory bodies (HKMA, FRC) release guidance on responsible AI adoption, signaling acceptance but emphasizing governance. However, governance professionals express heightened accuracy concerns: 74% of 600+ governance leaders worried about AI-generated content in corporate reporting. Academic research confirms AI's potential for board-level reporting while highlighting persistent deployment barriers tied to accuracy risk perception. Global adoption shows 40% of finance organizations have deployed AI in some form, but scale-up constrained by governance readiness and accuracy confidence gaps.
2025-Q3: Platform consolidation advances but deployment ROI reality surfaces. Workiva launches Intelligent Finance (September) with agentic AI for reporting automation; OneStream and Workiva both push narrative reporting capabilities. However, MIT Media Lab study (August) examining 300 real deployments finds only 5% deliver measurable profit impact—regulated sectors face governance delays that prevent rollout. Study of 2,300 finance professionals (July) confirms 74% use AI daily but majority unable to move pilots to production due to data and governance barriers. RSM notes financial institutions accelerating automation, but MIT evidence suggests execution challenges limit transformative outcomes. Window reveals capability-readiness divergence: platforms are production-ready, but organizations struggle with data quality, governance alignment, and ROI justification.
2025-Q4: Vendor maturity reaches production stage across multiple platforms. OneStream and Workiva both announce agentic and generative AI capabilities for narrative report generation (December 2025). Real-world AWS customer reviews validate auto-linking data consistency. However, a critical perception gap emerges: 51% of midmarket CFOs believe they've fully adopted AI, but only 19% of financial controllers agree—exposing a dangerous alignment gap between executive intent and operational reality. Simultaneously, McKinsey analysis shows 68% of AI projects miss ROI targets within 2 years, with a detailed case study revealing cost overruns (budgeted $1.2M, actual $4.7M) and adoption failures (34% realized vs. 95% target). The window captures the paradox of mature tooling meeting hard implementation realities: platforms are production-ready, but deployment at scale remains constrained by organizational readiness, data quality, and honest ROI assessment.
2026-Jan: Real-world enterprise deployments confirm OneStream market leadership while implementation barriers persist. Carlyle Group ($195B AUM) and Takeaway.com both deployed OneStream for financial consolidation and narrative reporting, demonstrating category adoption by major organizations. Workiva releases Generative AI product in GA (January 29), signaling competitive feature parity. However, macro reality check emerges: Deloitte survey shows CFO confidence rising (87% see AI as important to 2026 operations), but Forrester reveals only 10-15% of AI projects scale from pilot to production—60% fail due to integration, data quality, and workflow redesign delays. Finance leaders report minimal financial value (11% in 2025). Survey aggregation shows less than 1% of executives report significant ROI (≥20%), with 30% of GenAI projects abandoned after POC. Window captures simultaneous trend: real-world deployments validating product capability, but system-wide ROI realization remaining elusive.
2026-Feb: Adoption growth meets persistent trust and ROI realization barriers. Workiva executive benchmark of 1,497 professionals shows 91% report AI improved timeliness/strategic value of financial decisions; 65% use AI in disclosures. Yet trust remains critical constraint: 100 mid-market CFOs show 60-77% plan adoption but only 14% trust AI completely; 97% demand human oversight. Regulatory recognition advances: FINRA identifies narrative generation and content drafting as observed use cases, signaling governance acceptance. PwC survey of 4,454 CEOs finds 56% report zero financial return—adoption stuck in "Pilot Purgatory." Window reveals growing capability-deployment gap: vendors deliver proven outcomes (OneStream: 86% faster cycles, 27% accuracy improvement), but organizational ROI realization and discipline lag feature maturity.
2026-Mar: Real deployments emerge but outcome-investment gap widens further. Independent case study demonstrates viable SMB deployment path: 15-person accounting firm reduced monthly close by 8 hours per client using Claude API + QuickBooks integration, achieving near-100% accuracy. Workiva's agentic AI product (GA March 2026) shows named customer wins (Cognizant reporting 40% time savings). However, macro data reveals persistent barriers: Wolters Kluwer survey of 1,672 CFOs shows 62% expect AI to reshape reporting within 3 years, yet Richmond Federal Reserve finds 50%+ of firms invested heavily in AI over past 12 months with zero reported outcomes in labor productivity, decision speed, or high-value work—indicating massive implementation maturity gap. Critical perception disconnect surfaces: 51% of CFOs perceive their organization has adopted AI for reporting, but only 19% of controllers agree, exposing presentation-layer automation masking unchanged underlying workflows. Academic research confirms technical barriers persist: peer-reviewed study demonstrates LLM arithmetic incompetence and semantic conflation remain fundamental obstacles to trustworthy financial narrative generation. Window captures inflection point: viable technical solutions exist and deployments validate capability, but system-wide organizational barriers (data quality, governance frameworks, ROI measurement, workflow integration) prevent scaled adoption. Bleeding-edge tier classification sustained by real deployments coexisting with massive outcome realization gap.
2026-Apr: Deployment evidence accumulates but regulatory and governance barriers intensify. Workiva Q4 2025 shows 30% of customer base activated AI features with 88% of practitioners reporting ROI increases and 112.8% net retention; OneStream AI bookings doubled in 2025; Board Intelligence's Report Writer reached 60% of FTSE 20 financial institutions, and Accenture's internal system automated variance narratives across 737 company codes globally while still requiring human review. Named case studies emerge: board narrative automation reducing preparation from 7 hours to 20 minutes per quarter (97% time savings); UAE SME firm achieving 40-60 hours monthly reduction. However, the Deloitte Australia scandal crystallized governance risk: a $440K government report generated with AI contained fabricated references, false quotations, and non-existent footnotes — only 14% of enterprises maintain AI audit trails for generated narratives. FRC published first global guidance on AI in financial auditing, warning hallucination and data distortion risks do not absolve human accountability; AnchorDrift analysis documents silent model drift creating hidden compliance exposure even when monitoring dashboards appear green. Federal Reserve/Duke research finds CFO-reported productivity gains (1.8%) exceed revenue-implied gains by ~1 year, confirming execution lag persists despite expanding deployment evidence.
2026-May: Vendor growth and survey data converge on a sharp adoption-vs-value gap while production failures raise governance alerts. Workiva Q1 2026 (May 5) reports $247M revenue (+20% YoY), 6,665 customers with 112.4% net retention, 605 customers at ACV >$300K (+39%), with agentic narrative features deployed alongside GRC and sustainability agents. KPMG's May 2026 Global AI in Finance report shows 71% of organisations meeting or exceeding ROI expectations, but only 23% exceeding, with agentic AI outperforming by 32 points. However, a parallel KPMG survey (May 28) of 1,000+ finance leaders reveals 75% actively using AI (2.5× growth from 30% in 2024), but only 42% describe it as "assurance-ready"; critically, 45% of AI outputs contain hallucinations. Richmond Federal Reserve CFO survey (~750 firms) found 82% deployed agentic AI for reporting yet only 7% report strong ROI, with governance and implementation barriers dominating explanations. Cambridge Judge Business School research reveals >80% of financial services firms adopting AI but only 52% experimenting with agentic AI; most deployment remains back-office with institutional-scale narrative integration rare. Analyst assessment (Tambellini Group, May 27) validates OneStream's Splash 2026 agentic architecture with four production deployments (including Cox, Amer Sports, and Milo's Tea) using hallucination-aware context preservation. Tier-1 financial institutions (JPMorgan, Goldman Sachs, Morgan Stanley, AmEx) deployed AI narrative generation at full production stage with 98.7-99.2% accuracy metrics (VAHU governance guide, May 28). Yet production failures crystallize governance risk: Deloitte Australia (June 2) refunded $291K of a $440K report containing fabricated academic citations and non-existent court judgments; simultaneously, EY Canada (May 30) retracted a consulting report with 59% hallucinated citations, which subsequently contaminated Claude/ChatGPT response caches.
Finance-specific adoption barriers documented: 60% of CFOs believe AI transformative but only 11% actively using it in core reporting workflows; 35% stuck in pilots due to data quality, change management friction, and unclear AI-human handshake protocols. Regulatory frameworks accelerating: COSO released governance roadmap for GenAI in financial reporting (April 2026); PCAOB explicit guidance establishes that AI influencing "numbers, estimates, journal entries, reconciliations, disclosures" becomes part of SOX control environment; EU AI Act enforcement (August 2026) and OSFI guidance (September 2026) begin shifting governance expectations from optional policy to mandatory operational control. Practitioners cite specific governance tensions: 47% of finance users made decisions on hallucinated content; only 21% have mature autonomous AI agent governance; SOX auditability incompatible with non-deterministic AI outputs. Product maturity and named customer wins coexist with a structural adoption ceiling created by governance maturity gaps, data infrastructure constraints, and honest ROI shortfalls—now punctuated by visible production failures.
2026-Jun: High-profile hallucination failures by Big Four firms sharpened the governance debate while tier-1 institution deployments confirmed production-stage maturity. Deloitte Australia refunded $291K of a $440K government contract for a 237-page report containing fabricated citations and non-existent court judgment quotes; EY Canada simultaneously retracted a consulting report with 59% hallucinated citations, and the contaminated data subsequently surfaced in Claude and ChatGPT outputs — demonstrating source contamination risk beyond a single deliverable. Most significantly, KPMG itself (June 16) withdrew an AI-generated research report on agentic AI after UBS, NHS, Swiss Railways, and Transport for London publicly contradicted claims about their AI usage, with GPTZero analysis revealing only 5 of 45 citations were accurate and approximately 50% of claims fabricated — marking the fourth major consulting firm caught publishing hallucinated AI-generated analysis in months. The Financial Stability Board released a consultation report proposing 12 sound practices for responsible AI adoption across the full AI lifecycle in financial institutions, providing the most authoritative cross-jurisdictional governance framework to date. Contrasting with those failures, KPMG's June 2026 survey of 1,800 companies documents 72% piloting or using AI in financial reporting (with 57% planning generative AI specifically for narrative generation), while tier-1 institutions (JPMorgan, Goldman Sachs, Morgan Stanley, AmEx) ran full-production narrative generation at 98.7-99.2% accuracy; PCAOB guidance formalized that AI touching "numbers, estimates, journal entries, reconciliations, disclosures" now falls inside SOX control environments, intensifying audit-trail requirements industry-wide. 
Large-scale research (480M AI outputs, Jan–Apr 2026) quantified that multi-model verification reduces hallucination from 8.3% to 3.2%, against a financial-sector single-model baseline of 9.1%, establishing a concrete mitigation benchmark, though the residual rate remains material for assurance-grade reporting. The practice enters 2026-H2 at an inflection: production deployments and vendor maturity are proven; hallucination failures are now documented at multiple tiers (Big Four, government, consulting); governance frameworks are formally emerging (FSB, PCAOB, COSO); but organizational adoption is still constrained by infrastructure, trust, and accountability gaps that architecture improvements cannot solve.
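Several figures in this entry are quoted in more than one form, so a short arithmetic check may help when comparing them. The sketch below uses only numbers cited above (adoption rising from 30% to 75%, and hallucination falling from 8.3% to 3.2% under multi-model verification); the helper names are illustrative and not taken from any cited study.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def growth_multiple(before: float, after: float) -> float:
    """How many times larger `after` is than `before`."""
    return after / before


def relative_reduction(before: float, after: float) -> float:
    """Share of the original rate that was eliminated, in percent."""
    return (before - after) / before * 100.0


# Adoption: 30% of finance leaders in 2024, 75% in 2026.
print(percent_change(30.0, 75.0))    # 150.0  (the "+150%" figure)
print(growth_multiple(30.0, 75.0))   # 2.5    (the "2.5x" figure)

# Hallucination rate: 8.3% single-model, 3.2% with multi-model verification.
print(round(relative_reduction(8.3, 3.2), 1))   # 61.4
```

The "+150%" and "2.5×" adoption figures therefore describe the same change, and the cited verification result corresponds to removing roughly three fifths of hallucinations, leaving about two fifths of the original rate in place.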