Spreadsheet & data task automation

BLEEDING EDGE

TRAJECTORY: Stalled

AI that automates spreadsheet tasks, including formula creation, data analysis, pivot tables, and formatting, from natural language. Includes Excel/Sheets AI assistants and formula generation; distinct from natural-language-to-SQL, which queries databases rather than manipulating spreadsheets.

OVERVIEW

AI-driven spreadsheet automation entered a new phase in mid-2026, marked by simultaneous capability expansion and a governance crisis. Microsoft, Google, OpenAI, and Anthropic independently converged on spreadsheet agents as a core platform strategy, which validates the market. That convergence masks a fundamental split, however: finance and bounded operational workflows show sustained ROI (52% accounting adoption, 250% returns, 90-day payback), while mainstream deployment stalls at the governance barrier. Peer-reviewed benchmarks (Columbia, Reutlingen, U Illinois) confirm that frontier models "frequently fall short of professional finance standards" on multi-step logic. Microsoft's own SearchLeak vulnerability (CVE-2026-42824, CVSS 10/10) and CSA findings (82% of organizations discovered ungoverned shadow agents) show that the practice's bleeding-edge status is constrained primarily by trust and security architecture rather than by capability alone. The practice remains at bleeding edge: finance is the strongest signal, with elite deployment in finance and structured operations, while mainstream adoption is blocked by governance gaps, verification burden, and the inability to demonstrate ROI beyond bounded finance workflows (60% of enterprises achieve minimal value despite investment).

CURRENT LANDSCAPE

The June 2026 platform updates confirm ecosystem maturity while exposing the governance crisis as the binding constraint. Microsoft Copilot Agent Mode (GA April 22) showed strong preview engagement (+67% usage, +50% retention, 65% satisfaction) and expanded to full deployment in June. Google's semantic layer announcement (Cloud Next, April 22) unifies Sheets automation with cross-Drive context (emails, docs, chats), enabling end-to-end natural language spreadsheet construction. OpenAI's ChatGPT for Excel/Sheets (GA May 8) deploys via manifest XML, removing app-store friction for regulated enterprises. Microsoft's Copilot Cowork (GA June 16) represents the agentic frontier: autonomous task execution across emails, meetings, files, and data with human oversight. That every major vendor is converging on spreadsheet agents signals market validation, yet the convergence masks a critical trust gap. Microsoft's own terms of service label Copilot as "entertainment only" while its marketing emphasizes productivity, and real market penetration remains 3.3% despite the feature announcements. CVE-2026-42824 (SearchLeak, CVSS 10/10) demonstrated one-click data exfiltration via prompt injection in Copilot Enterprise Search, affecting emails, MFA codes, and financial data.

Real-world deployment data shows a sharp bifurcation. The finance sector sustains ROI: accounting shows 52% adoption with 250% ROI in 18 months, and AR automation achieves 40% payment acceleration and 90% error reduction. EY's 150,000-user Microsoft 365 Copilot deployment (June 2026) achieves 94% monthly usage and 85% weekly usage, signaling enterprise-scale uptake where governance is structured. EPC Group's 200+ Fortune 500 deployments deliver 90–120 days to value with 60–75% DAU under structured rollout; without structured governance, DAU drops to 15–25%.

Peer-reviewed benchmarks confirm capability ceilings: Claude leads frontier LLMs but "frequently falls short of professional finance standards" on multi-step logic (Columbia, Reutlingen, arXiv:2605.22664). FP&A workflows succeed narrowly: three-statement models and auditing work reliably (30→4 min), while VBA/macros, Power Query, and datasets of 50K+ rows fail. Independent testing (Neuriflux) found 7/10 first-attempt success for formula generation on 8K-row files; the 12-second latency is acceptable, but verification is mandatory.

Governance has become disqualifying. A CSA survey (418 security professionals, May 2026) found that 82% of organizations had discovered ungoverned shadow agents, 65% had experienced security incidents, and 61% reported data exposure. Gartner predicts that 40% of agentic AI projects will be discontinued by 2027 due to governance gaps and unclear ROI. Deloitte's finance survey (1,300+ leaders) shows that 63% have deployed automation but only 21% report measurable ROI. Broader enterprise data points the same way: 60% of organizations achieve minimal AI value despite investment, and 95% report no measurable ROI on generative AI. Outside finance, adoption stalls: only 3% of enterprises achieve meaningful AI transformation, and 56% of CEOs report no AI ROI.

The practice bifurcates sharply: finance automation succeeds where task scope is bounded and verification gates are tight, while mainstream advancement is blocked by governance control gaps, verification burden (Gartner: benefits mask diffuse data quality risks), and the structural inability to demonstrate ROI at enterprise scale.

TIER HISTORY

Research: Jun-2023 → Jun-2023

Bleeding Edge: Jun-2023 → present

EVIDENCE (113)

Google Workspace Intelligence: Semantic Layer for Sheets (Industry Reports, 2026-06-19)
Google Cloud Next 2026 announcement (April 22): unified semantic layer enabling end-to-end natural language spreadsheet construction. The user describes a goal and Gemini builds the structure. Ecosystem maturity signal.

Microsoft 365 Copilot Cowork (Product Launches, 2026-06-16)
Microsoft GA product (June 2026) representing agentic evolution: autonomous task execution on emails, meetings, files, data. Grounds work in user context while keeping the human in control. Maturity milestone for spreadsheet and data task automation.

Microsoft Copilot Agent Mode: April 2026 GA with engagement metrics (Product Launches, 2026-06-16)
Copilot Agent Mode GA (April 22, 2026): Excel engagement +67%, retention +50%, satisfaction 65% in preview. Enables multi-step native spreadsheet actions (formulas, pivot tables, charts) from plain language. Core capability signal.

Microsoft 365 Copilot Can Be Turned Into a One-Click Data Theft Tool (CVE-2026-42824) (News Coverage, 2026-06-16)
Critical CVE-2026-42824 (CVSS 10/10) in Copilot Enterprise Search: prompt injection enables one-click data exfiltration from emails, OneDrive, SharePoint. Negative signal: governance and trust barrier limiting mainstream adoption.

ROI and Business Impact: 60% of Orgs Achieve Minimal Value (Adoption Metrics, 2026-06-15)
Synthesis of 60+ sources: 60% of companies achieve minimal ROI despite substantial AI investment; the gap between leaders (5x revenue gains) and the rest is widening. Critical negative signal on real-world value realization.

AI Agent Statistics 2026: 40%+ Projects Cancelled by 2027 (Adoption Metrics, 2026-06-15)
Comprehensive agentic AI adoption reality check: 40%+ of projects forecast for cancellation by 2027 (Gartner); 95% report no measurable ROI; only 5% of pilots reach production. Documents the adoption-ROI divergence central to bleeding-edge maturity.

Spreadsheets as Shared Working Surface for Business and AI (Opinion, 2026-06-08)
Independent analysis of vendor convergence: all major AI vendors (Anthropic, Microsoft, OpenAI) independently shipped spreadsheet agents. Argues that substrates are chosen by their existing user population, not by design merit: spreadsheets are already 'occupied' by millions of users.

Braintree: Microsoft Copilot Agent Mode Rollout and Governance Requirements (Opinion, 2026-06-08)
Independent South African consulting firm's implementation guide: Excel engagement +67%, security governance required, agents use Claude Opus (Jan 2026). Documents deployment readiness and governance barriers.

HISTORY

2023-H1: M365 Copilot enters early access with Excel integration announced; initial user feedback reveals limitations in complex formula generation.
2023-H2: Copilot reaches GA in October with formula suggestions and Python integration. Peer-reviewed research documents accuracy breakdown on complex problems. Practical deployments of ChatGPT+Sheets for data analysis appear. Feature availability gaps and access friction limit real-world rollout.
2024-Q1: Google Sheets ecosystem solidifies as dominant platform with multiple AI integration approaches (native, third-party add-ons like SheetGen, app scripts). Third-party tool ecosystem expands (GptExcel, others). Copilot remains unavailable in desktop Excel despite GA claims; licensing/access issues persist. Research benchmarks confirm formula generation accuracy challenges via NL2Formula dataset.
2024-Q2: Adoption metrics emerge: automated reporting platforms show 60%+ organizational adoption and 80% time savings. Real-world case studies of scale deployments (1,700+ response surveys with ChatGPT analysis). Institutional adoption programs launched (M365 Copilot training bootcamps). Desktop Excel access gaps persist despite training programs.
2024-Q3: No significant new evidence of capability expansion or adoption barrier shifts captured during this window.
2024-Q4: Google launches Gemini AI integration in Google Sheets and new =AI() function via Workspace Labs. Microsoft ships Copilot Lite in Microsoft 365 Family plans, expanding distribution. User feedback and peer-reviewed research document persistent reliability gaps: Copilot in Excel fails on common tasks (find-replace, pivot table analysis) despite GA claims. Formal trustworthiness framework published identifying hallucination risks in formula generation.
2025-Q1: Google ships Gemini GA for chart generation and Python-driven insights in Sheets (January). Microsoft expands Copilot in Excel with Python-driven advanced analytics for forecasting and risk analysis (March). Meanwhile, SPREADSHEETBENCH benchmark reveals 75%+ of models score below 24% accuracy on real-world Excel forum queries. Alteryx analyst survey confirms adoption paradox: 70% say AI improves productivity, yet 76% still rely on spreadsheets and 45% spend 6+ hours weekly on data prep. Emerging third-party solutions (GRID) begin bridging AI-to-spreadsheet logic gaps. Capability expansion continues, but accuracy and logic interpretation barriers remain blocking broader tier advancement.
2025-Q2: Ecosystem expansion: Sourcetable launches AI-powered spreadsheet platform with $4.3M seed funding (April), signaling continued venture interest. Production deployment issues emerge: NHS experiences Copilot outage blocking Excel file analysis in April (resolved November). Access friction persists: Copilot unavailable for Family/Personal tiers, causing user frustration documented across support forums. Vendor acknowledgment: Coherent publishes analysis arguing AI cannot reliably replace Excel for complex models, must remain complementary. Market adoption stalled: analyst productivity claims hold at 70%, but data prep time-sinks (6+ hours weekly, 45% of analysts) remain unaddressed. Core barriers unchanged: formula accuracy, access complexity, organizational rollout friction, trustworthiness concerns.
2025-Q3: Vendor use-case contraction becomes visible: Microsoft ships the =COPILOT formula but explicitly warns that it is unsuitable for accuracy-dependent work (financial, legal, calculations). Google ships incremental Gemini features for Sheets (table auto-formatting, =AI() text function). Paradigm AI and other third-party entrants launch with specialized agent ecosystems (5,000+ agents) as an alternative to general-purpose tools. Landmark production deployment: UK government trial of M365 Copilot (1,000 licenses) finds no productivity gains and an actual slowdown on complex analysis. Analyst adoption metrics frozen: 70% report productivity gains, 76% still rely on spreadsheets, 45% still spend 6+ hours on data prep. The market is bifurcating into text-focused use cases that the tools support and accuracy-dependent use cases that vendors exclude. Analyst skepticism deepens as vendor warnings narrow the applicable use case.
2025-Q4: Google expands Gemini for multi-table analysis (October); Microsoft deprecates Copilot application skills in Excel (removal Feb 2026). Third-party ecosystem matures with 10+ distinct platforms (Julius, Equals, Arcwise, Rows, SheetGod, GPTExcel, SheetAI, Coefficient, Quadratic, others). Critical case studies emerge: companies with high-volume spreadsheet work (Lula Commerce, REVOLVE) abandon spreadsheets for dedicated BI platforms rather than adopt AI spreadsheet tools. Quadratic critiques Excel AI architecture as unsuitable for production data work. Feature expansion continues to mask use-case contraction and user exodus to non-spreadsheet platforms.
2026-Jan: Adoption bifurcation crystallizes: financial services teams deploy spreadsheet automation at scale (accounting sector 52% adoption, 250% ROI, 20-30% capacity gains) while mainstream adoption stalls (56% of CEOs report no AI ROI). Microsoft scales back Copilot integration due to user trust concerns and privacy friction, signaling vendor pivot from aggressive embedding to tactical deployment. Third-party ecosystem remains active but no major breakthroughs. Core barrier: structured domains see strong unit ROI while long-tail knowledge workers face trust, cost, and governance obstacles preventing mainstream scaling.
2026-Feb: Vendor feature expansion continues: Microsoft releases Agent Mode availability in EU and local file querying with Copilot Chat; Google announces admin usage reports for Gemini and forecasting in Connected Sheets via BigQuery ML. However, structural adoption barriers intensify: Microsoft 365 Copilot security bug (CW1226324) bypasses Data Loss Prevention policies, exposing confidential data; technical analysis shows 83% of AI-generated formulas fail under conditional formatting; organizational surveys reveal only 3% of enterprises highly transformed with AI while 72% remain early-stage. Accounting case studies document strong sector adoption (60% of firms, 25-35% time savings, 90-day ROI), yet broader market stalled by compliance concerns, formula fragility, and skills readiness gaps (61% use AI daily but only one-third prepared to adapt).
2026-Mar: Adoption bifurcation sharpens: Microsoft Copilot seats reach 15M with 160% YoY growth and a 3x increase in large deployments (35k+ seats); 60% of the Fortune 500 have now deployed, with 20-40% measured productivity gains on Excel data analysis. Google launches Fill with Gemini for Sheets with real-time web data integration and pattern summarization. Yet critical barriers persist: Excel data analysis achieves only 20% adoption despite wide distribution; complex workbooks score worse than random guessing (82% best-case vs >50% human baseline); governance gaps cause 60% of AI projects to fail; and measured ROI remains elusive despite high engagement. Finance sector remains the strongest signal (52% adoption, 250% ROI). Core tension unresolved: vendor feature acceleration masks fundamental accuracy, governance, and trust gaps preventing mainstream tier advancement.
2026-Apr: Trust fracture and market bifurcation deepen. Microsoft's own terms of service contradict its marketing: the ToS label Copilot "entertainment only" while marketing emphasizes productivity gains, and real market penetration stands at 3.3%. Microsoft restricts free Copilot Chat access in Excel effective April 15, requiring $30/month per-user licensing, a major pricing barrier and source of adoption friction. Meanwhile, leading third-party tool GPT for Work reaches 7M+ installations (ranked #1 in Kinross 2026 report, 4.9★ rating) and Google Workspace reports 128 customer deployments with named-org metrics (Geotab 89% adoption, Docusign 80% positive, Pinnacol 96% time savings). Microsoft ships Work IQ context-aware Copilot, Claude Opus 4.6 model support, and Copilot Notebooks for Excel generation. Yet a Gartner forecast warns that >40% of agentic AI projects will be discontinued by 2027 due to unclear ROI, uncontrolled costs, and insufficient data. Practice remains at bleeding-edge: strong specialized adoption in accounting, but mainstream tier advancement blocked by trust gaps between vendor marketing and contractual disclaimers, persistent formula accuracy failures under conditional formatting (83%), and licensing/pricing barriers preventing organizational scale.
2026-May: Governance crisis crystallizes as the primary adoption barrier, superseding feature maturity concerns. OpenAI ships ChatGPT for Excel/Sheets sidebar GA (May 8) with enterprise deployment options, removing app-store friction. CSA survey (418 security professionals) reveals 82% of organizations discovered ungoverned shadow automation agents created without IT/security/governance knowledge; 65% experienced security incidents; 61% reported data exposure from AI agents. Ramp Labs Sheets AI vulnerability (detected May 7, patched March 16) demonstrated prompt injection exploits allowing agents to exfiltrate financial data via formula injection. New peer-reviewed benchmark (WorkstreamBench, arXiv:2605.22664) evaluated LLM agents on end-to-end professional spreadsheet tasks in finance: the Claude family leads overall, but models "frequently fall short of professional finance standards," with performance degrading sharply on tasks beyond simple chained calculations, a critical negative signal for production finance deployment. Enterprise adoption framework from EPC Group (200+ deployments, Fortune 500) documents 90-120 days to value versus a 6-12 month industry baseline, with structured rollouts achieving 60-75% DAU versus 15-25% without structure. Finance automation ROI benchmarks (Kwestra, citing Hackett/Gartner data) show 45% cost reduction in AP/AR/close processes with 41% reaching breakeven within 12 months. Independent testing (Neuriflux, May 2026) documents Copilot in Excel achieving 7/10 formula generation success on first attempt with 12-second latency on 8K-row datasets. Finance sector shows a bifurcated signal: AR automation metrics are strong (40% payment acceleration, 90% error reduction, 91% mid-market success), but a Deloitte survey of 1,300+ finance leaders shows 63% deployed financial automation yet only 21% report clear ROI, a measurement gap that masks benefit realization. Named case studies demonstrate payback: Mayo Clinic RPA deployment yielded 84,000 annual staff hours (18:1 ROI); City of Los Angeles licensing automation reduced processing from 45 to 6 days. Core tension: deep governance gaps (shadow agents, data exposure, lifecycle control, permission drift) now block mainstream adoption more than feature maturity or pricing, and even leading models fall short of professional finance standards on complex tasks.
2026-Jun: Platform convergence and governance crisis arrived simultaneously. Microsoft Copilot Agent Mode (GA April 22) delivered +67% Excel engagement, +50% retention, and 65% satisfaction; Google's semantic layer (Cloud Next) enabled end-to-end natural language spreadsheet construction; and Microsoft Copilot Cowork (GA June 16) extended autonomous task execution to emails, meetings, and files. All major vendors are converging on spreadsheets as a core AI substrate. Yet CVE-2026-42824 (SearchLeak, CVSS 10/10) exposed one-click data exfiltration from Copilot Enterprise Search via prompt injection, and EY's 150,000-user deployment confirmed that structured governance is the critical differentiator (94% monthly usage) vs. ungoverned rollouts (15–25% DAU). ROI data remained bifurcated: 60% of enterprises achieve minimal AI value despite investment, and only 21% of finance leaders who deployed automation report clear ROI, while finance automation (accounting 52% adoption, 250% returns) continues to be the practice's strongest signal. Independent analysis confirmed that spreadsheets are the AI substrate of choice by default rather than by design merit: they were already occupied by millions of users before any AI design choices were made.

TOOLS

Microsoft 365 Copilot, ChatGPT, Google Sheets, Google Apps Script