Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement across one or two domains, delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain — hover for a brief summary, click for more detail

DOMAIN
BLEEDING EDGE ←→ ESTABLISHED
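How a domain's dot position might be derived can be sketched as a weighted average of per-practice maturity tiers. The tier scores, the evidence-count weighting, and the example numbers below are illustrative assumptions only, not the index's actual methodology:

```python
# Hypothetical sketch: place a domain's dot on a 0..1 axis
# (0 = bleeding edge, 1 = established) as the evidence-weighted
# average of its practices' maturity tiers.
# Tier scores and weights are assumptions, not the index's real method.

TIER_SCORE = {
    "Research": 0.0,
    "Bleeding Edge": 0.25,
    "Leading Edge": 0.5,
    "Good Practice": 0.75,
    "Established": 1.0,
}

def domain_maturity(practices):
    """practices: list of (tier_name, evidence_count) pairs.
    Returns a 0..1 axis position, weighting each practice
    by the amount of evidence behind it."""
    total = sum(weight for _, weight in practices)
    if total == 0:
        return 0.0
    return sum(TIER_SCORE[tier] * w for tier, w in practices) / total

# Illustrative only: a domain with one well-evidenced Good Practice
# entry and a smaller Leading Edge one sits right of centre.
example = [("Good Practice", 110), ("Leading Edge", 40)]
print(round(domain_maturity(example), 3))  # → 0.683
```

Under these assumed weights, heavily evidenced mature practices pull the dot toward "Established", while thinly evidenced frontier work pulls it left.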

Writing assistance — drafting, editing & style

GOOD PRACTICE

TRAJECTORY

Stalled

AI that assists with drafting documents and improving writing quality through editing, style, and tone suggestions. Includes document drafting from outlines and readability improvement; distinct from content generation in Marketing, which targets published content rather than personal professional writing.

OVERVIEW

AI writing assistance is a proven productivity category with mature tooling, validated ROI, and mainstream adoption, but its real-world impact remains tightly constrained, and adoption paradoxes suggest the category has reached a stable equilibrium rather than a base for further growth. The ecosystem spans dedicated vendors like Grammarly (40M daily users, $13B+ valuation, $200M+ ARR), platform-native offerings like Microsoft 365 Copilot (90%+ Fortune 500 adoption), and general-purpose LLM interfaces. Randomised controlled trials and field studies confirm bounded gains: Microsoft's RCT of 7,137 workers across 66 firms found 3.6 hours saved weekly on email alone, and practitioner analysis shows the category succeeds precisely because writing tasks have clear inputs and measurable outputs, unlike broader productivity claims. Consumer uptake is broad (63% of U.S. adults use AI tools monthly, with writing assistance the top use case) and sector-specific adoption is strong (73% of college students, 89% of marketing teams, 94% of content agencies). Yet peer-reviewed UC Berkeley research (April 2026) reveals a critical paradox: heavy use of AI writing assistance reduces argument coherence by 70% and erodes writer voice, while users report equal or higher satisfaction, suggesting a systematic misalignment between perceived and actual writing quality.

The structural barriers preventing tier advancement are both technical and behavioural. Hallucination rates in professional writing are escalating (18% of news-related prompts in Aug 2024 rose to 35% by Aug 2025, resulting in 12,842 articles pulled from circulation in Q1 2025), with the legal domain showing 69–88% failure rates and 486+ court cases involving AI hallucinations. Enterprise adoption remains stuck in pilot limbo: only 16% of pilots reach production, and 73% of regulated-sector organisations have paused rollouts over security, data-privacy, and measurement gaps. Specialist vendors face intensifying pressure: Jasper AI collapsed with 40% layoffs as users migrated to free platforms; Grammarly's Expert Review feature (Aug 2025–Mar 2026) was disabled after seven months following a class-action lawsuit over the use of journalists' identities without consent; Writer.com abandoned the SMB market for enterprise-only positioning. Market growth data (22.5% CAGR, $4.8B→$20B over 2025–2034) masks a permanent bifurcation: Fortune 500 firms and content professionals extract value in bounded tasks (drafting, summarising, light editing), while mid-market and regulated organisations remain structurally blocked by accuracy, compliance, and measurement immaturity.

CURRENT LANDSCAPE

The vendor landscape is consolidating around two poles while specialist competitors exit or retreat. Microsoft continues expanding Copilot with Agent Mode for Word/Excel/PowerPoint and new draft-generation agents, reporting 90%+ Fortune 500 adoption. Grammarly ($700M+ revenue, 50,000 enterprise customers, 40M daily active users) is specialising vertically, shipping dedicated products for HR and customer-support teams with features like style-guide enforcement, term banks, and multilingual support. Platform offerings continue to mature: Grammarly reports a 283% return for enterprise customers, and Writer.com released 200+ Skills, observability/Datadog integration, and NVIDIA NIM support (Mar–Apr 2026). Specialist vendors face structural pressure: Jasper AI (once valued at $1.5B) collapsed with 40% layoffs in early 2026 as users migrated to free ChatGPT/Claude; Grammarly's Expert Review feature was disabled in March 2026 following a class-action lawsuit over the use of journalists' identities without consent; Writer.com abandoned SMB markets and now targets enterprise exclusively at $18,000/year minimums. The consolidation signals that writing assistance is splitting between general-purpose platforms and vertical specialisation.

Adoption breadth masks a deployment stall. Writing and communications account for 80% of all workplace generative-AI use, the single largest application category. Sector-specific metrics show strength: 73% of college students use AI for writing, 89% of marketing teams use AI (up from 67% in 2024), 94% of content agencies embed it in workflows, and 74% of content marketers use AI tools. In professional services (legal, tax), 40% have deployed generative AI, with 30% reductions in drafting time. Yet enterprise conversion remains the critical bottleneck: only 19% of content marketing teams track AI-specific KPIs despite reporting 3.4x velocity and 67% cost reduction, indicating measurement immaturity, and Copilot rollouts routinely stall at 20% adoption within organisations due to absent manager modelling, poor workflow integration, and unclear permissions. EU enterprises face additional compliance barriers: Grammarly Enterprise requires a Data Processing Agreement (Standard Contractual Clauses only, no EU data residency), the DPA is unavailable on Free/Premium tiers, and Works Council participation rights under German labour law add deployment friction for regulated sectors.

Accuracy and reliability remain the limiting factors, and the problem is escalating. Hallucination rates in professional writing are worsening: news-related prompts roughly doubled from 18% (Aug 2024) to 35% (Aug 2025), with 12,842 AI-generated articles pulled from circulation in Q1 2025. Enterprise testing of six AI content tools (March 2026) documented 15–25% hallucination rates. The legal domain shows systematic failure: a 69–88% hallucination rate on specific legal queries, 486+ court cases now involving AI hallucinations, and 128 lawyers sanctioned (including Mata v. Avianca, where ChatGPT fabricated six cases). In medicine, Mount Sinai (2025) measured a 64% hallucination rate on long clinical cases without mitigation. The paradox persists: UC Berkeley research shows heavy AI use reduces argument coherence by 70% and erodes writer voice, yet users report equal satisfaction, suggesting confidence is masking actual quality loss. The market's next phase hinges less on feature expansion than on solving reliability (acceptable hallucination rates for high-stakes contexts), measurement (demonstrating ROI beyond early-adopter cohorts), and preference alignment (actual versus perceived writing quality).

May 2026 platform developments reveal the "verification tax" shaping deployment reality. Google shipped tone/style personalisation in Gmail (May 7, 2026), and Microsoft deployed Writing Tools natively in Notepad, confirming platform convergence on integrated drafting. Yet field research exposes an integration paradox: MIT's 1,258-person experiment validated a 137% increase in message volume and a 20% reduction in editing time, but independent studies document that 80% of the recovered time is reabsorbed into review and quality control. Leaders report spending more time validating polished AI output than writing from scratch, and researchers find that improved fluency masks errors and hallucinations rather than reducing them. In regulated domains (legal, healthcare, finance), the verification burden blocks adoption entirely: 58% hallucination on legal queries and institutional shifts to sampling-based triage confirm that polished AI writing requires deeper human scrutiny, not less. The category has completed the "generation" phase and entered the "verification" phase; adoption now depends on whether organisations can afford the human overhead of validation, not on whether tools can generate plausible text.

TIER HISTORY

Research        Nov 2022 → Nov 2022
Bleeding Edge   Nov 2022 → Jan 2024
Leading Edge    Jan 2024 → Jan 2025
Good Practice   Jan 2025 → present

EVIDENCE (110)

— Comprehensive hallucination benchmarks: 0.7–4.6% on basic summarisation, 18.7% on legal queries, 15.6% on medical queries. MIT research shows models are 34% more confident when false, establishing hallucination as a fundamental ceiling on trustworthiness.

— Google released tone/style personalization for Gmail writing, inferring writer voice from email history and integrating context from Drive—confirming major platform production deployment of learning-based style adaptation.

— Research-backed analysis revealed bimodal productivity distribution: power users reclaim 9–20+ hours/week from drafting/summarization/translation, while casual users see negligible gains—confirming writing assistance ROI is context-dependent and adoption barriers are structural.

— SSHRC-funded research identifies psychological risks: deferring writing to AI shifts users from active contributors to passive reviewers, eroding confidence in own writing abilities—documenting fundamental adoption barrier rooted in skill erosion and loss of authorship agency.

— Independent research of 90+ leaders: ~80% reported recovered writing time reabsorbed into reviewing/prompting/QC; polished AI output is harder to fact-check than rough drafts—confirming that writing assistance creates 'different work' rather than productivity gains at scale.

— Carl Benedikt Frey identifies the 'verification tax': field study of experienced developers showed AI access made them 19% slower (vs. 14% gains in customer support); Sullivan & Cromwell fabricated citations case shows net productivity depends on error cost—explaining why writing assistance stalls in high-stakes domains.

— Real-world adoption (42% of tenants use AI for legal interpretation) met with institutional friction: 58% hallucination on legal queries; organizations shifted from linear reading to triage/sampling/cross-checking because polished AI output conceals fabricated authorities—evidence of high adoption and high institutional friction.

— MIT field experiment (n=1,258 teams) showed AI writing agents reduced editing time 20% while increasing message volume 137%; paid X campaigns validated quality, providing rare RCT-grade evidence of productive human-AI writing collaboration.

HISTORY

  • 2022-H2: ChatGPT released, establishing new performance benchmarks for conversational writing assistance; Grammarly integrated into Hi Marley insurance platform for customer communications; student survey identified mixed adoption patterns with uneven understanding of risks; AI writing detection revealed fundamental accuracy limitations; VC interest in AI writing tools accelerated.
  • 2023-H1: Microsoft 365 Copilot announced with early access for enterprise writing workflows; student adoption accelerated despite institutional bans (51% plan to use tools regardless); Grammarly Business demonstrated internal productivity gains; Marshall University discontinued Grammarly due to low institutional uptake; professional journalism deployments (BuzzFeed, CNET) exposed factual accuracy limitations and plagiarism risks; experts emphasized mandatory human verification for AI-generated writing.
  • 2023-H2: OpenAI discontinued its AI text detector due to unacceptable accuracy; student usage plateaued at 43-51% with Turnitin detecting AI in 9.6% of submissions but generating ~50% false positives; academic institutions struggled with policy definitions of permissible grammar assistance versus prohibited AI generation, with documented student penalties; free ChatGPT displaced paid writing-specific tools in comparative reviews; Grammarly consolidated toward platform-native integration while critical voices questioned whether AI writing assistance improved or eroded writer skill development.
  • 2024-Q1: Workplace adoption accelerated with 28% of US workers using generative AI on the job, writing assistance as top use case (NBER); named case studies demonstrated strong ROI (Databricks 1994% return, Amadeus >90% Copilot adoption); Grammarly Text Editor SDK reached GA (2.9M users); early-adopter feedback documented adoption barriers ($30/user cost, feature reliability gaps, accountability erosion); research evidence showed AI-assisted writing increased productivity but decreased user accountability and diversity in output.
  • 2024-Q2: Independent surveys confirmed sustained adoption momentum (BCG, Microsoft/LinkedIn both reported 75% of knowledge workers using generative AI with ≥5 hours/week time savings); research documented divergent adoption patterns by context—business professionals prioritized efficiency while academic writers expressed concerns about bias and skill erosion; vendor attempts to expand AI writing customization (Microsoft Copilot Pro GPT Builder) faced adoption friction and were discontinued after 3 months; PR and marketing practitioners continued to report quality limitations, finding AI output inadequate for creative work requiring authentic voice and emotional connection.
  • 2024-Q3: Broad consumer awareness masked stalled enterprise adoption; independent national survey found 39% population usage but companies spending $150B on AI with only 10-15% experiment rollout; Microsoft WorkLab documented production deployments (document creation +45-58%, email 31% faster), but named pilot failures (pharma Copilot cancellation after 500-user trial citing low ROI) and CMO dissatisfaction (60% success rate) signaled persistent adoption friction; cost ($30/user), reliability gaps, and ROI clarity remained barriers to expansion.
  • 2024-Q4: Large-firm adoption accelerated with 70% of Fortune 500 using Microsoft 365 Copilot and independent case studies showing strong ROI (OneSource Virtual 27x, Eaton 83% time savings), but broader IT surveys documented declining deployment rates (55.5%→47.4% since 2021) and slipping ROI realization (56.7%→47.3%), with 49% of organizations unable to demonstrate value; CSIRO's real-world pilot trial showed mixed outcomes (productivity gains in structured tasks offset by data privacy concerns); market consolidation evident as Grammarly acquired Coda (December) to offset competitive moat erosion from platform integrations.
  • 2025-Q1: Large-scale empirical validation arrived with Microsoft's RCT showing 12% faster document completion and 7% less email reading time across 6,000+ workers at 56 firms, validating prior productivity claims. Named deployments confirmed strong ROI: Emplifi (Grammarly) achieved 19x returns with 2-4 hours/week per-employee savings; adoption breadth held steady at 74% of content professionals using AI weekly. Critical professional voices (author/editor Nathan Bransford) documented AI maturity gap: strong on grammar and summarization but limited feedback for creative writing. Grammarly GA'd enterprise ROI measurement tools (January), signaling vendor focus on demonstrating mid-market value against cost and reliability concerns.
  • 2025-Q2: Empirical evidence expanded with Microsoft's larger RCT (7,137 workers across 66 firms, May 2025) confirming 3.6 hours/week email savings and faster document completion, validated independent productivity gains. Grammarly Enterprise scaled to 50,000 organizational customers with 283% ROI and 3.3% CSAT lift (June), signaling vendor-driven adoption momentum. Global adoption accelerated: 87% of internet users have access to tools; Southeast Asia achieved 314% YoY growth. Critical barriers emerged: Gartner survey revealed only 16% of pilots reach production, with 73% of regulated-sector organizations pausing rollouts due to security, data privacy, and ROI clarity concerns. Specialized vendors (Grammarly 40M daily users, $700M+ revenue) maintained competitive position against Copilot expansion.
  • 2025-Q3: Consumer and professional adoption broadened significantly: Verasight found 63% of U.S. adults use AI tools monthly with writing assistance as top use case (45% of users); Orbitmedia survey showed 95% of marketers use AI (up from 80% in 2024) with editing as primary application, reducing writing time from 4h10m to 3h25m. Platform vendors advanced products: Grammarly launched eight specialized writing agents (Reader Reactions, Citation Finder, Expert Review, etc.) in beta to enterprise/education customers; Microsoft expanded Copilot Chat metrics for usage tracking. However, critical signals highlighted persistent adoption friction: M365 analysis found ~70% of Copilot rollouts report no measurable ROI despite activation metrics; independent licensing analysis questioned inflated adoption claims, arguing many deployments remain in trial phases with modest actual per-user engagement. Grammarly's support team deployment with Forethought achieved 87% chat/email deflection and 4.2/5 CSAT, demonstrating real-world efficiency gains in service contexts. The window reinforced bifurcated market: mainstream consumer and professional adoption alongside continued mid-market and regulated-sector deployment friction.
  • 2025-Q4: Platform consolidation accelerated with Microsoft Ignite 2025 announcing major feature expansions (Work IQ, Word/Excel/PowerPoint agents, Agent 365 control plane) and reporting 90%+ Fortune 500 Copilot adoption, while Gallup survey (December) confirmed U.S. workplace AI adoption rose to 45% with writing assistance as leading use case. However, critical independent research exposed systematic deployment failures: MIT Technology Review analysis of 300+ enterprise deployments found 95% deliver no measurable business value; BBC/EBU study of 3,000 AI responses documented 45% contain significant accuracy issues and 31% with sourcing problems, undermining reliability claims. Practitioner evidence surfaced persistent limitations: technical writers reported AI output requiring extensive rewriting (5 days to salvage single document), highlighting standardization barriers; limited positive case studies (Grammarly/Iterable 35% editing time reduction) confirmed gains remain niche and modest. Window crystallized the paradox: mainstream adoption (87% global access, 45% U.S. workplace use) coexists with structural enterprise barriers (only 16% pilot projects reach production, 73% regulated-sector firms paused rollouts, 95% of deployments without measurable ROI), indicating market bifurcation is permanent rather than transitional.
  • 2026-Jan: Early 2026 showed continued platform evolution with Microsoft shipping Agent Mode for Word/Excel/PowerPoint and new draft-generation agents, while independent adoption metrics remained stable (Gallup: 26% frequent AI workplace use, 77% among tech workers). However, critical signals dominated: Grammarly maintained market dominance ($13B+ valuation, 30M+ daily users) but faced structural adoption headwinds. Research identified a fundamental design gap—users consistently mispredict their AI writing assistance preferences, with behavioral-pattern systems (61.3% accuracy) outperforming stated-preference designs (57.7%), indicating misalignment between perceived value and actual utility. Deployment reality remained stark: analysis of UK government and enterprise pilots documented systematic adoption failures, with Copilot rollouts stalling at ~20% adoption due to lacking manager modeling, workflow integration gaps, and permission ambiguities; quantified cost burden at £240k annually per 1,000-person organization. Window reinforced that despite stable mainstream usage, structural adoption barriers prevent expansion beyond early-adopter cohorts, and fundamental user-preference misalignment constrains effectiveness of design-focused product investments.
  • 2026-Feb: February 2026 signaled market consolidation and operational optimization focus in writing assistance. Grammarly expanded vertical-specific deployments with dedicated Business products for customer support teams and HR functions (featuring multilingual support, SCIM provisioning, style guide enforcement), reflecting market maturity and specialization. However, critical adoption signals emerged: independent analysis documented 25-35% enterprise license underutilization due to over-provisioning, exposing ROI efficiency gaps in writing assistance rollouts; Grammarly's Word add-in for Mac was discontinued, signaling ongoing product consolidation challenges in platform integration. Professional community continued to articulate evolved expectations: industry analysis emphasized meaning preservation, fact-checking, citation integrity, and privacy as core evaluation criteria for writing assistance tools, indicating the category had moved beyond grammar/spelling toward reliability and domain-specific accuracy. Window reinforced bifurcated market: vendor specialization and platform expansion continue among leaders, while adoption barriers remain structural for mid-market and regulated-sector organizations.
  • 2026-Mar: Market consolidation accelerated with Jasper AI collapsing (40% layoffs, user migration to ChatGPT/Claude), Grammarly degrading UX through aggressive AI upsells, and Writer.com abandoning SMB to focus on enterprise-only pricing—confirming specialist writing tools are losing ground to general-purpose platforms. Reliability risks sharpened: enterprise testing documented 15-25% hallucination rates across 6 AI writing tools, and 486 AI hallucination cases logged in courts worldwide resulted in 128 lawyers sanctioned, giving concrete legal weight to accuracy concerns. Adoption breadth remains strong—writing accounts for 80% of all workplace GenAI use, 74% of content marketers use AI tools (with 3.4x velocity gains), and professional services reached 40% generative AI adoption with 30% drafting time reductions—but only 19% of teams track AI-specific KPIs, signalling persistent measurement immaturity.
  • 2026-Apr: UC Berkeley research documented a fundamental adoption paradox: heavy AI writing assistance use reduces argument coherence by 70% and erodes writer voice, yet users report equal satisfaction—systematic misalignment between perceived and actual quality. Grammarly's Expert Review feature was disabled following a class action lawsuit for impersonating journalists and deceased academics without consent, exposing governance failure that halts enterprise adoption. Adoption metrics show breadth but not depth: 97% of content marketers plan AI use and 85% are already drafting/editing (11 hours/week saved, 420% ROI), yet 44% regularly fix AI mistakes and JPMorgan Chase/Vanguard deployments show 500% volume gains are contingent on an editor-in-chief validation workflow. Hallucination escalation continues (18%→35% in news-writing, 12,842 articles pulled in Q1 2025), confirming that category maturity is real in bounded tasks but structurally blocked by accuracy and compliance risks in regulated or high-stakes contexts.
  • 2026-May: Google's Gmail tone/style personalisation GA and an MIT RCT (n=1,258) confirming 20% editing-time reduction reinforce platform-level adoption, but the dominant theme this month is the verification tax: independent research across 90+ leaders found ~80% of recovered writing time is reabsorbed into reviewing and QC, productivity gains are bimodally distributed (power users: 9–20+ hours/week; casual users: negligible), and 58% hallucination on legal queries is prompting institutions to shift from linear reading to triage-and-sampling workflows — confirming that writing assistance productivity depends heavily on error cost and domain risk tolerance.