Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain.

[Chart: domains plotted on a maturity axis running from Bleeding Edge to Established]

Customer health scoring & churn prediction

GOOD PRACTICE

TRAJECTORY

Stalled

AI that scores customer health, detects churn signals, and triggers proactive intervention workflows. Includes usage-based health scoring and early warning systems; distinct from customer journey analysis which maps experience rather than predicting outcomes.

OVERVIEW

Customer health scoring and churn prediction has matured from research into table-stakes operational practice, with broadly deployed tooling and consistent evidence of churn reduction at scale. Yet a deep execution gap persists between technical capability and measurable business impact. The practice solves a well-understood problem: customer success teams need both a prioritisation signal (which accounts need attention now) and a forecasting signal (which accounts will churn). Vendor platforms from Gainsight, ChurnZero, Salesforce, and Microsoft now ship both as standard GA features, and peer-reviewed research reports ensemble ML models achieving 87-97% accuracy on production datasets. Market adoption has shifted from bleeding edge to mainstream: as of June 2026, 65% of large enterprises (1,000+ customers) use ML-based churn prediction, up from 38% in 2023, and 41% of B2B SaaS companies have deployed dedicated tools. Top deployments demonstrate 31% gross churn reduction within 12 months and $4-7 in protected revenue per $1 spent on implementation.

The wider picture is less encouraging: 95% of enterprise generative AI pilots produce no measurable P&L impact, and 60% of AI projects are abandoned for lack of AI-ready data. The constraint is no longer technological; it is organisational and operational. Successful deployments require unified data infrastructure (product, CRM, support, billing), front-loaded onboarding signal detection (not usage-only scoring), automated playbook wiring (scores must trigger action), and specialist configuration. Only 22% of organisations have successfully adopted AI-driven health scoring despite 76% piloting or deploying it; mid-market penetration lags the enterprise tier because of cost ($60K-$140K annual TCO), data fragmentation, and the absence of configured retention workflows. For enterprises with dedicated CS operations and clean data pipelines, the practice delivers measurable revenue impact. For mid-market and smaller teams, implementation barriers remain binding constraints on adoption.

CURRENT LANDSCAPE

Gainsight's agentic platform, announced in May 2026, represents the current state of the art: the entire Gainsight platform is now agentic, with the Staircase Risk and Expansion Analysts live in production and automatically surfacing churn signals and expansion opportunities months in advance; 175K+ tool calls and 96K+ queries indicate live ecosystem adoption. Staircase AI Health Score delivers real-time 0-100 scoring that blends sentiment analysis, engagement patterns, and response-time signals from emails, calls, and Slack; ChurnZero offers structured health scores mapped to specific churn archetypes, with ~40% autonomous agent deployment; Salesforce Einstein and Microsoft Dynamics 365 Customer Insights provide embedded prediction engines with automated model retraining. The vendor ecosystem is feature-complete and mature.

Deployment results at well-resourced organisations are compelling. Arete research on 500+ mid-market SaaS companies shows AI churn prediction achieving a 31% gross churn reduction within 12 months and generating $4-7 in protected revenue per $1 invested. One mid-market supply chain SaaS company ($84K ACV) reduced first-year churn from 34% to 11%, increased NRR from 96% to 118%, and attributed $8.4M in retained revenue to unified health scoring with automated interventions. A feedback-driven approach using AI analysis of support tickets and survey data cut monthly churn from 8% to 3.5% (a 56% reduction) by identifying behavioral patterns and triggering psychological interventions. A systematic review of 142 studies found predictive health scoring achieving 89%+ accuracy with 34-47% NRR improvements in production settings. G2 survey data across platforms documents 15-25% churn reductions (Chargebee up to 25%, Velaris averaging 15%). The global AI-enhanced churn scoring market reached $2.53B in 2025 and is projected to grow at 24.5% annually, to $3.15B in 2026 and $7.48B by 2030.
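The growth and reduction figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names `implied_cagr` and `relative_reduction` are our own and do not come from any cited source, and the inputs are simply the numbers quoted in this section.

```python
def implied_cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two values `years` apart."""
    return (end / start) ** (1 / years) - 1


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.5 means halved)."""
    return (before - after) / before


# Market size in $B, as quoted in the text.
growth_2025_2026 = implied_cagr(2.53, 3.15, 1)
growth_2026_2030 = implied_cagr(3.15, 7.48, 4)

# Monthly churn falling from 8% to 3.5%.
feedback_case = relative_reduction(8.0, 3.5)

print(f"2025 to 2026 growth: {growth_2025_2026:.1%}")
print(f"2026 to 2030 implied CAGR: {growth_2026_2030:.1%}")
print(f"8% to 3.5% monthly churn: {feedback_case:.1%} relative reduction")
```

On these inputs the 2025 to 2026 step matches the quoted 24.5%, the 2030 figure implies a slightly lower rate of roughly 24% per year, and the drop from 8% to 3.5% works out to about 56%, so the quoted numbers are internally consistent to within rounding.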

Adoption of AI-driven approaches, however, remains constrained even though trialling is now mainstream. By May 2026, 76% of B2B SaaS companies had deployed or piloted AI churn prediction, but only 22% had successfully adopted AI-driven health scoring, signalling a persistent execution and operationalisation gap. Eighty percent of customer success teams remain experimental with AI-driven scoring despite years of vendor investment and the availability of production-grade tooling. Year 1 total cost of ownership runs from $60K-$99K for ChurnZero to $90K-$140K for Gainsight, and realistic setup demands 150+ hours plus specialist resources for ongoing calibration. Practitioner assessments reveal that most deployed scores fail to outperform churn-rate baselines, a consequence of subjective weighting and poor signal selection that erodes CSM trust through persistent false positives. Critical implementation analysis shows that even models with 83% precision fail in production because prediction capability does not automatically translate into intervention execution. Deployment gaps include cold-start unreliability (models are unreliable for customers with less than 14-30 days of tenure), prediction-window mismatches (30-day models surface signals too late for multi-touch retention campaigns), and silent model drift that requires frequent retraining. TSIA analysts identify an "actionability gap": even directionally correct scores often fail because teams cannot prescribe specific next steps from the identified drivers. Disconnected data systems remain the primary blocking constraint. The practice delivers proven value at enterprise scale but has stalled at the boundary of mid-market adoption, where cost, complexity, and data quality challenges compound.
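Two of the deployment gaps described above, cold-start unreliability and prediction-window mismatch, can be expressed as simple guards placed in front of a model's output. The following is a minimal sketch under stated assumptions: the `Prediction` record, the `triage` function, and all three constants are hypothetical values chosen for illustration, not settings from any vendor or cited study. Model drift is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    account_id: str
    tenure_days: int          # days since the customer signed up
    churn_probability: float  # model output in [0, 1]
    horizon_days: int         # how far ahead the model predicts


# Assumed values for illustration only.
MIN_TENURE_DAYS = 30      # upper end of the 14-30 day cold-start range
CAMPAIGN_LEAD_DAYS = 60   # time a multi-touch retention campaign needs
RISK_THRESHOLD = 0.5      # probability above which an account counts as at risk


def triage(p: Prediction) -> str:
    """Decide what to do with a single churn prediction."""
    if p.tenure_days < MIN_TENURE_DAYS:
        return "cold_start"      # too little history to trust the score
    if p.churn_probability < RISK_THRESHOLD:
        return "no_action"
    if p.horizon_days < CAMPAIGN_LEAD_DAYS:
        return "late_signal"     # at risk, but too late for the full campaign
    return "start_campaign"


print(triage(Prediction("acct-1", tenure_days=10, churn_probability=0.9, horizon_days=90)))
print(triage(Prediction("acct-2", tenure_days=200, churn_probability=0.8, horizon_days=30)))
```

The point of the design is that a high-probability score alone never starts a campaign; the account must also have enough history and the prediction must arrive with enough runway to act on it.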

TIER HISTORY

Research: Jan-2017 → Jan-2017
Bleeding Edge: Jan-2017 → Jan-2021
Leading Edge: Jan-2021 → Jan-2024
Good Practice: Jan-2024 → present

EVIDENCE (150)

  • Critical analysis of enterprise AI project failures: 95% of enterprise generative AI pilots produce no measurable P&L impact; 60% of AI projects are abandoned for lack of AI-ready data; 40% of agentic AI projects are forecast to be cancelled by the end of 2027. Uses churn prediction as an explicit failure case and documents seven failure modes, including weak success criteria, scope creep, data quality, and governance gaps. A negative signal on adoption reality.
  • Large-scale empirical analysis identifying eight key behavioral signals that predict retention: 23% of active users are fully disengaged (no activity in 30 days); monthly plans churn 4.7x faster than annual; adoption of a single core action creates a 3-6x churn gap; the top 10% of customers hold 58% of revenue; non-renewal signals are the strongest predictors (40%+ churn post-flag); single-user accounts churn 14-33x faster than teams; feature adoption compounds retention; churn peaks at 6-12 months (median 25%).
  • Best-practices guide: teams with automated health score workflows report a 31% reduction in gross revenue churn within two quarters; 74% of SaaS companies still rely on manual assessment despite available automation; AI-based scoring is 2-4x better at 90-day churn prediction than telemetry-only scoring; AI detects risk 63 days before cancellation versus 11 days for manual review. Presents a five-phase automation framework with one critical insight: telemetry-only scores fail because they measure behavior (lagging) rather than intent (the decision is made in conversation).
  • Duplo (fintech, 2K+ merchants) deployed a four-metric health system: DIS < 3 corresponds to 7.4x churn risk; TVR < 0.85 corresponds to 67% churn likelihood; a rise in SSI corresponds to a 3x volume reduction. Measured outcomes: 35% retention lift after TVR alerts, 32% increase in support saves, 25% QoQ improvement, and 14% wallet-share growth. Demonstrates a domain-generalizable framework that favors leading indicators over lagging usage metrics.
  • Market research: 65% of large enterprises use ML-based churn prediction (vs 38% in 2023); 41% of B2B SaaS companies have deployed dedicated tools (vs 19% in 2022); 78% of CS leaders report that AI health scoring has replaced manual reviews; the market stood at $9.8B in 2025 and is projected to reach $24.1B by 2030; companies reduce churn 15-25% relative to manual approaches; 85-92% accuracy on 90-day windows; $2.1M ARR retained per 100 accounts.
  • Mid-market supply chain management SaaS ($84K ACV): health scoring reduced first-year churn from 34% to 11% and raised NRR from 96% to 118%, generating $8.4M in documented revenue impact. Integrated product usage, support, billing, and engagement signals; customers completing 6+ onboarding milestones showed a 94% renewal rate.
  • Practitioner framework: 70-80% of churning customers display warning signs 30+ days before cancellation. Critical operational principle: a health score 'only earns its place' when score changes trigger automatic CRM actions (tasks, alerts, workflows). Without wired playbooks, scores become 'dashboard ornaments'; operationalization is the constraint.
  • Technical HubSpot implementation: four-category signal framework (Product 60-90d, Relationship 30-60d, Intent 7-30d, Financial any time) with specific weighted scoring. Automated six-step workflows trigger on health score band changes. Demonstrates production-ready system design for operationalizing health scores at scale.
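The last two evidence items describe a weighted, multi-category score whose band changes fire automated actions. A minimal sketch of that pattern follows. Everything specific in it is an assumption made for illustration: the weights, the band cut-offs, and the action strings are hypothetical, since the source does not publish them, and nothing here calls a real CRM API.

```python
# Hypothetical weights and band cut-offs; the source does not publish them.
WEIGHTS = {"product": 0.4, "relationship": 0.25, "intent": 0.2, "financial": 0.15}
BANDS = [(70, "green"), (40, "yellow"), (0, "red")]  # (minimum score, band name)


def health_score(signals: dict[str, float]) -> float:
    """Weighted sum of per-category scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


def band(score: float) -> str:
    """Map a 0-100 score to the first band whose minimum it meets."""
    for minimum, name in BANDS:
        if score >= minimum:
            return name
    return "red"  # scores below zero fall into the lowest band


def on_score_update(account_id: str, previous: float, current: float) -> list[str]:
    """Return the actions to queue when a score moves between bands."""
    old, new = band(previous), band(current)
    if old == new:
        return []  # no band change, so nothing fires
    if new == "red":
        return [f"create_task:{account_id}:executive_outreach",
                f"alert_csm:{account_id}"]
    if new == "yellow":
        return [f"create_task:{account_id}:check_in_call"]
    return [f"log:{account_id}:recovered"]


score = health_score({"product": 80, "relationship": 60, "intent": 20, "financial": 90})
print(score, band(score))
print(on_score_update("acme", previous=75.0, current=score))
```

Triggering on band transitions rather than on every score movement is one way to keep the volume of tasks manageable; whether the cut-offs should be fixed or segment-specific is a design choice this sketch leaves open.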

HISTORY

  • 2017: Customer health scoring emerges as a distinct practice with product-level support (Gainsight Sally bot) and real-world deployments (Feedvisor); academic research advances churn prediction methodologies; industry recognition of fundamental tension between subjective CSM prioritization and objective forecasting models limits broader adoption.
  • 2018: Churn prediction research expands beyond telecom into e-commerce and general SaaS; deep learning and gradient-boosting tooling (Keras, XGBoost) enters mainstream practitioner adoption; novel methods (PU learning) address real-world data challenges; adoption remains constrained by data scarcity and the unresolved CSM-vs.-forecasting tension.
  • 2019: Major CRM vendors (Salesforce, Microsoft) launch productized churn prediction capabilities; custom deployments accelerate across multiple sectors with 90%+ model accuracies; vendor platforms evolve to separate health scoring (CSM visibility) from churn propensity (forecasting), reducing the core architectural tension; adoption broadens beyond early-adopter SaaS into financial services and telecom, though data scarcity and skills gaps remain barriers to widespread custom implementation.
  • 2021: Churn prediction reaches commodity status in enterprise SaaS and CRM platforms; Salesforce Einstein and Microsoft Dynamics 365 embed prediction engines as standard features; practitioner guides document multi-dimensional health scoring approaches and implementation maturity; research literature reviews compare statistical methods for production deployment; adoption barriers shift from algorithm innovation and technical capability to organizational readiness and data quality.
  • 2022-H1: Vendor platforms continue feature expansion: Salesforce Einstein adds multiclass prediction and temporal awareness (Projected Predictions) in Spring/Summer releases; Gainsight publishes D.E.A.R. framework for operationalizing health scores at scale; SaaS vendors report 20%+ churn reduction with ML-powered health scoring; segmentation-driven churn models validated across telecom operators; vendor innovation emphasizes lifecycle-stage health scoring to improve CSM targeting and forecast accuracy.
  • 2022-H2: Microsoft Dynamics 365 announces GA predictive churn capabilities in Customer Insights; academic research validates advanced data transformation methods (26% AUC improvements via feature selection); practitioner roundtables surface real-world implementation challenges including data hygiene and tool integration complexity; vendor consolidation accelerates with health scoring becoming standard across SaaS and CRM platforms; market barriers shift from technical capability to organizational readiness and data quality.
  • 2023-H1: Peer-reviewed research validates health scoring as established B2B CS metric with production deployments across SaaS, banking, and subscription sectors; Notion publishes segment-specific D.E.A.R. framework implementation; Salesforce Einstein confirms 80 billion daily predictions including churn; Gainsight redesigns health scoring feature replacing traffic-light models with multi-dimensional approaches; practitioner adoption documented across service industries with churn prediction as core use case; sector expansion includes banking and OTT subscription services with 97%+ model accuracy achieved in production.
  • 2023-H2: Industry adoption metrics confirm 60% of 400+ North American companies use health scores (Gainsight Oct 2023); Klaviyo launches churn prediction in CDP with 70% baseline churn data; peer-reviewed research validates production-grade churn models in banking (GA-XGBoost) and telecom (0.832 accuracy); systematic ML/DL review confirms ensemble methods dominate with technical maturity but persistent gaps in interpretability; critical finding: SaaS survey shows no correlation between health scores and upsell revenue, challenging traditional ROI narratives despite evidence of churn control benefits.
  • 2024-Q1: Systematic review of 212 peer-reviewed papers confirms ensemble ML/DL dominance with 93-97% AUC on production datasets; researchers emphasize profit-based evaluation metrics; verified deployments at enterprises (PTC/ChurnZero) demonstrate automation of retention workflows; vendor CAB surveys predict 2024 emphasis on AI-driven CS and expansion-over-acquisition strategy; Totango and ChurnZero maintain 7.6/10 user satisfaction with 90%+ recommendation rates and 70%+ renewal intent; third-party health score analysis tools (nCloud Integrators) emerge for Gainsight ecosystem, indicating vendor consolidation and platform extensibility.
  • 2024-Q2: Multi-org case studies document real-world deployments at BigTime, Logiwa, and Qualtrics with specific health scoring methodologies; peer-reviewed research achieves 91.66% accuracy with Random Forest on telecom data with 30%+ churn rates; industry benchmark established at 13% median SaaS churn; critical assessment reveals adoption barriers—only 7% of companies actively track health scores and naive prediction-based interventions can trigger unintended churn; experts emphasize proactive onboarding signals over order cadence and highlight need for uplift modeling over classification.
  • 2024-Q3: Microsoft launches transactional churn prediction (GA) in Dynamics 365 Customer Insights with automated retraining; Gainsight acquires Staircase AI to enhance interaction-based health signals, signaling continued vendor consolidation in AI-driven CS; adoption metrics show 42% of CS teams track health scores; practitioner guides highlight behavioral alternatives when survey data is unavailable; McKinsey reports AI churn reduction potential of 15%.
  • 2024-Q4: Cross-sector deployments validate production maturity: NBFC in Sri Lanka achieves 90% accuracy and 20-30% churn reduction with behavioral feature engineering; telecom operators (Viatel) achieve 97.92% precision with LightGBM; analysis across 67 B2B SaaS companies confirms 82% accuracy and 5.2x median ROI from proactive intervention. Vendor innovation continues: Gainsight releases Scorecard Optimizer for improved health scoring. Critical assessment emerges: JoySuite analysis warns of lagging-indicator risk and false-confidence traps in all-green health scores, reinforcing need for qualitative validation. Practice solidifies as table-stakes feature across enterprise platforms with demonstrated business impact, though execution barriers persist around data quality, intervention timing, and comprehensive implementation.
  • 2025-Q1: Vendor platforms integrate AI-driven interaction analysis: Gainsight launches Atlas AI agents and deepens Staircase AI integration for sentiment monitoring and proactive risk detection. Case studies validate mid-market deployments: SmartReach achieves 35% churn reduction (27% to 17.5%) through weighted health scoring; ChurnZero implementation cases document custom model development and automation. Industry adoption survey reveals expansion limits: 70% adoption at enterprises vs. lower mid-market penetration; only 21% incorporated AI despite 87% planning to do so, indicating implementation gap. Critical assessments surface methodology limitations: advocates argue quantitative-only health scores miss sentiment signals and relationship changes, recommending integrated qualitative approaches. Practice demonstrates broad adoption among large companies with emerging focus on behavioral signal integration and intervention optimization.
  • 2025-Q2: Independent case studies confirm production maturity: GitLab publishes internal health scoring methodology with use-case adoption tracking; SaaS companies report health score improvements from 40% to 82% prediction accuracy with 60% fewer false positives; fintech deployments reduce churn from 18% to 14% via early warning systems. Consultant analyses document real-world implementations including global financial services platforms shifting from reactive to proactive CS via segment-specific models and CSM sentiment integration. Industry adoption shows persistent execution gaps: data quality remains primary barrier (incomplete journeys, legacy CRM silos), churn benchmarks vary significantly by sector (6.9% digital media to 25% finance), and qualitative signal integration emerges as critical gap in quantitative-only approaches. Practice demonstrates table-stakes maturity with proven deployment patterns, yet execution barriers and data quality challenges limit broader adoption below enterprise tier.
  • 2025-Q3: Vendor platforms accelerate AI integration: Gainsight's Insight Agent (Staircase AI) reaches GA, delivering automated health scoring (0-100) with real-time churn signals and an Executive Dashboard for NRR tracking; ChurnZero Success Insights reaches GA, enabling ML-powered risk detection; Totango launches Unison AI (though Gartner cautions execution lags behind static legacy systems). Market growth accelerates: the customer health scoring AI market reaches USD 1.48B with a projected 25.7% CAGR through 2033. Deployment evidence remains strong: a London fintech reduces churn from 18% to 14% using 60-day early warning; SaaS companies report prediction accuracy improving from 40% to 82%. A critical signal emerges: analyst reviews highlight platform maturity gaps (Totango's AI remains limited despite roadmap claims), signaling uneven vendor execution. The practice solidifies as table stakes with accelerating AI-driven capabilities, yet mid-market adoption lags the enterprise tier; data quality and qualitative signal gaps persist as core barriers to intervention effectiveness.
  • 2025-Q4: Deployment evidence expands across sectors: peer-reviewed research validates 95.13% accuracy in telecom churn prediction; consulting case studies document 260%+ conversion improvements from predictive analytics (Hydrant with Pecan AI); implementation guides establish quantified operational metrics (88% renewal accuracy, 90% health scoring precision, 50% churn reduction). Vendor deployments confirm signal viability: ChurnZero demonstrates engagement-churn correlation through community platform integration. Critical assessment surfaces implementation reality: realistic setup requires 150+ hours and $20K-$50K consulting costs, with data quality challenges and sector-specific churn variation (6.9% to 25%) demanding segment-specific models. Practice demonstrates strong enterprise adoption and deployment maturity with consistent churn reduction outcomes, yet mid-market penetration gaps (21% AI adoption despite 87% planning), execution barriers around data integration and cost, and unresolved ROI clarity on upsell correlation remain key scaling constraints.
  • 2026-Jan: Vendor platforms deepen AI signal integration: Gainsight Staircase AI Health Score (GA) delivers 0-100 scoring with sentiment, engagement, open items, and response time analysis; ChurnZero structures health scores for specific churn scenarios. New case studies validate enterprise deployments (churn reduced from 18% to 11%, save rate improved from 42% to 68%). Practitioner consensus emphasizes outcome-based design and segment-specific models. Vendor guidance moves toward hybrid scoring (25-50% AI weighting) balanced with traditional metrics. Execution barriers persist: implementation demands specialist resources, false positive rates remain problematic, and opaque scoring erodes CSM trust. Global AI infrastructure adoption remains early-stage (6% of large enterprises, 13.4% of Fortune 500 with LLM tools), constraining the sophistication of AI-driven interventions.
  • 2026-Feb: Independent research and practitioner assessments surface critical implementation gaps despite vendor feature maturity. Academic systematic review (142 studies) confirms predictive health scoring achieves 89%+ accuracy and 34-47% NRR gains in production settings, validating capability potential. However, adoption reality diverges sharply: 80% of CS teams remain experimental with AI despite aggressive vendor investment; Year 1 TCO barriers of $60K-$99K (ChurnZero) to $90K-$140K (Gainsight) persist alongside ongoing configuration demands. Practitioner consensus surfaces accuracy and design failures: most deployed scores underperform churn baseline rates due to subjective weighting and poor signal selection. Platform complexity and false positive rates erode CSM trust. Systematic assessments (Gartner, TSIA, G2) highlight that widespread pilot programs lack ROI visibility and data unification remains the blocking constraint. Market evidence of measurable impact (15-25% churn reductions) concentrated at well-resourced enterprises; mid-market adoption remains significantly constrained. The practice demonstrates proven technical capability at scale but persistent barriers to mainstream deployment—execution complexity, specialist resource requirements, and unproven ROI at non-enterprise tiers limit broader market penetration.
  • 2026-Apr: New evidence reinforces both the deployment upside and the execution ceiling. A mid-market B2B SaaS case study documented churn reduction from 34% to 11% with NRR improving from 96% to 118% and $8.4M attributed to integrated health scoring with automated interventions. Market sizing confirmed at $2.53B (2025) growing to $3.15B (2026) at 24.5% CAGR. Against these positive signals, critical practitioner analysis identified core production failure patterns: high-precision models (83% precision) fail because prediction does not guarantee intervention execution, cold-start unreliability persists below 14-30 days of customer tenure, 30-day prediction windows surface signals too late for multi-touch retention campaigns, and silent model drift demands continuous retraining cycles that most teams do not run. Execution gap remains the defining constraint.
  • 2026-May: Latest evidence reinforces persistent execution gap despite improved platform capabilities. Gainsight launched its full agentic stack for customer retention (May 28), with Staircase Risk and Expansion Analysts live in production surfacing churn signals months in advance — 175K+ tool calls and 96K+ queries document live ecosystem adoption. Arete analysis of 500+ mid-market SaaS companies confirms 31% gross churn reduction within 12 months and $4-7 in protected revenue per $1 spent; models trained on 80+ behavioral signals achieve 75-82% accuracy, reaching 94% with LLM sentiment analysis. Practitioner case study (Momentum Nexus) documented 81% churn prediction accuracy (9 of 11) with save rates improving from 14% to 51% through composite health scoring with 60-90 day intervention runway. Critical practitioner assessments document specific accuracy ceilings (health scores ~85% baseline, 34% better with 4+ dimensions per Gainsight) and intervention failures (accurate models yield no better retention than unflagged accounts without actionable next-step guidance). Independent platform comparison shows the category shifting from static health scores toward relationship intelligence: Staircase AI analyses emails, calls, and Slack for sentiment and stakeholder changes while ChurnZero deploys ~40% autonomous agents — yet a 20-year practitioner review confirms that 70-85% accurate models still fail without an operational "action layer" providing authorised playbooks and intervention guidance. Broader market evidence confirms adoption concentration: SaaS platforms hold 61.4% of churn prediction market share; large enterprises represent 64.8% of demand. Gainsight research documents companies using predictive health scoring reporting 27% lower gross churn; median B2B NRR stands at 102% (down from 110%+ in 2021-22). 
    TSIA critical assessment confirms only 22% of organizations have successfully adopted AI-driven health scoring approaches despite 76% piloting or deploying, and identifies the "actionability gap" as the defining production failure mode. Platform comparison data shows ChurnZero (90% recommendation rate, 71% renewal intent) and Gainsight (86% recommendation, 95% renewal intent) maintain strong user satisfaction; however, 80% of CS teams remain experimental with AI despite years of vendor investment. Implementation reality: typical setup requires 150+ hours and $60-140K annual TCO, with most deployed scores underperforming churn baselines due to poor signal selection and data fragmentation barriers across CRM, usage, and support systems. Practice demonstrates proven technical capability at enterprise scale with documented 15-25% churn reductions, but mid-market adoption remains significantly constrained by cost, complexity, and unresolved ROI clarity on intervention outcomes.
  • 2026-Jun: Independent research and practitioner ecosystem evidence validates the execution-as-constraint thesis. Peer-reviewed research (arXiv) confirms temporal framing—not model complexity—drives robust churn prediction; rolling-window approaches achieve 87.6% accuracy with 83%+ performance on unseen data without retraining, addressing the real production requirement of handling temporal shift. Adversarial independent assessment (ChurnTools) surfaces critical practice limitation: most 'AI' tools are repackaged rule-based health scoring; genuine ML adds only 10-20% accuracy at 10-50x implementation cost, raising serious ROI questions for non-enterprise teams. Mid-market deployment case study ($84K ACV supply chain SaaS) validates upside when execution is complete: health scoring reduced first-year churn 34%→11%, improved NRR 96%→118%, generating $8.4M documented impact. Large-scale empirical analysis of 44,000 SaaS users (CustomerScore.io) quantified signal precision at the behavioral level: single core-action adoption creates a 3-6x churn gap; monthly-plan customers churn 4.7x faster than annual; 23% of nominally active users have fully disengaged (zero activity in 30 days); non-renewal signals predict 40%+ post-flag churn — reinforcing that signal selection and intervention timing matter more than model sophistication. Enterprise AI failure research (Grid Dynamics) uses churn prediction as a canonical project failure case, citing 95% of enterprise GenAI pilots producing no measurable P&L impact and 60% abandoned due to data quality. Practitioner automation guide documents AI-based health scoring detecting risk 63 days before cancellation versus 11 days manual, with teams using automated health score workflows reporting 31% gross revenue churn reduction within two quarters — yet 74% of SaaS companies still rely on manual assessment. 
    Sector-level data is increasingly clear: enterprise deployments with dedicated resources and unified data infrastructure achieve documented 25-40% churn reductions; mid-market trials continue to underperform baselines due to cost ($60-140K/year TCO), data fragmentation (CRM/product/support silos), and inability to operationalize interventions at scale. The practice has plateaued at the execution ceiling: technology is mature and proven, but organizational readiness, data unification, and qualified resource availability remain binding constraints on broader adoption.
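The June 2026 entry notes that temporal framing, via rolling-window approaches, matters more than model complexity. As a rough illustration of the general idea (not the cited paper's method), the sketch below generates rolling train/test splits over time-ordered periods so that a model is always evaluated on data later than what it was fitted on. The function name `rolling_windows` and the window sizes are assumptions made for this example.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def rolling_windows(
    periods: Sequence[T], train_size: int, test_size: int = 1
) -> Iterator[tuple[Sequence[T], Sequence[T]]]:
    """Yield (train, test) slices that slide forward one period at a time.

    Each test slice lies strictly after its training slice, so a model is
    always scored on periods later than the ones it was fitted on.
    """
    last_start = len(periods) - train_size - test_size
    for start in range(last_start + 1):
        train = periods[start : start + train_size]
        test = periods[start + train_size : start + train_size + test_size]
        yield train, test


months = ["2025-01", "2025-02", "2025-03", "2025-04", "2025-05"]
splits = list(rolling_windows(months, train_size=3))

for train, test in splits:
    print("train:", train, "test:", test)
```

A random shuffle split would let future behavior leak into training; keeping the test window after the training window is what makes the evaluation resemble production use, where the model only ever sees the past.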