Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain.

DOMAIN
BLEEDING EDGE → ESTABLISHED

Legacy code analysis & migration

LEADING EDGE

TRAJECTORY

Advancing

AI that analyses legacy systems to document behaviour, identify dependencies, and assist migration to modern platforms. Includes COBOL-to-Java migration and mainframe modernisation; distinct from code refactoring, which improves existing code within its current platform.

OVERVIEW

AI-assisted legacy code migration has solidified at leading-edge maturity: vendors are shipping competitive agentic platforms, consulting firms are building AI practices at industrial scale (EPAM has certified 1,300+ architects against a target of 10,000+), and deployments span multiple vertical markets. The practice uses AI to analyse, document, and transform systems written in older languages, primarily COBOL on mainframes, into modern platforms such as Java and cloud-native microservices. Its urgency is demographic and economic: 10% of COBOL developers retire annually, 43% of US banking is still COBOL-reliant, and the modernization services market is projected to grow from $22.1B (2026) to $50.7B (2033) at a 12.6% CAGR. Yet production deployment remains at ~13-14% (per 2025 surveys), and the binding constraints are organisational (semantic validation expertise, behavioral equivalence assurance, and change management), not technical capability. The tools work: real-world evidence now shows AI accelerates the discovery phase by 2-3x and handles 30-60% of migration work. The remaining 40-70% (business logic validation, regulatory compliance, zero-trust testing) is still human-intensive, so scaling the tools means solving the human problem first.
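The market projection quoted above can be sanity-checked with the standard compound-growth formula. The following is a minimal sketch, assuming the 2026 and 2033 figures are endpoint values seven years apart; the function names are illustrative and do not come from any source cited here.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding start_value at the given annual rate."""
    return start_value * (1 + rate) ** years


# Figures quoted in the overview: $22.1B (2026) to $50.7B (2033) at 12.6% CAGR.
YEARS = 2033 - 2026
cagr = implied_cagr(22.1, 50.7, YEARS)
projected_2033 = project(22.1, 0.126, YEARS)

print(f"Implied CAGR: {cagr:.1%}")
print(f"2033 value at 12.6%: ${projected_2033:.1f}B")
```

Both directions agree with the quoted numbers to one decimal place, which suggests the three figures are internally consistent.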

CURRENT LANDSCAPE

IBM dominates the vendor landscape through watsonx Code Assistant for Z, which reached v2.8.0 in December 2025 with agentic capabilities that orchestrate multi-step analysis and transformation across mainframe codebases. Its Project Bob initiative consolidates RPG and COBOL assistants into a single platform. That dominance faces intensifying competitive pressure: Anthropic's February 2026 announcement of Claude for COBOL modernisation triggered a 13.2% single-day drop in IBM stock. AWS launched its Transform service (GA April 2026) with agentic AI for code analysis and PL/I modernization; in one documented deployment, a software firm migrated 12 weeks' worth of Control-M workflows to Apache Airflow in 2.5 weeks, a 3-5x delivery acceleration with 100% validation success. BMC has shifted to an agentic architecture that captures institutional knowledge from historical resolutions and produces AI-analyzed application narratives. CAST Imaging and OpenLegacy round out the ecosystem. Documented results include a leading insurer analyzing 110M lines of COBOL in four weeks, and Thoughtworks modernizing 4M lines of COBOL/HLASM for a financial services firm in four weeks using agentic AI with 80% code comprehension accuracy.

However, validation overhead persists. A 2026 survey of 200 enterprise SRE/DevOps leaders found that 43% of AI-generated code requires manual debugging in production and that developers spend 38% of their weekly time fixing AI output. Gartner predicts 70% of 2026 mainframe exit projects will fail. The Stack Overflow survey (49K+ developers) shows 80% AI adoption but only 29% confidence in accuracy; 66% spend extra time fixing near-correct output. Independent testing confirms that all mainstream AI tools produce semantically incorrect COBOL-to-Java transformations without expert validation. The UK government's experience illustrates the organisational headwinds: legacy systems consume £2.3 billion of a £4.7 billion IT budget, with high-risk systems growing 26% annually despite remediation efforts.
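The validation overhead described above comes down to showing that migrated code behaves like the original. One common way to frame this is a characterization (golden-master) check: replay inputs recorded from the legacy system through the migrated routine and compare the outputs exactly. The following is a minimal sketch under stated assumptions, not a method attributed to any vendor named here. The routines `migrated_interest` and `half_even_interest`, the recorded cases, and the assumption that the legacy routine rounds half-up to two decimal places are all hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP
from typing import Callable, Iterable, NamedTuple


class Mismatch(NamedTuple):
    inputs: tuple
    expected: Decimal
    actual: Decimal


def migrated_interest(principal: Decimal, rate_pct: Decimal) -> Decimal:
    """Hypothetical migrated routine: rounds half-up, as the legacy code is assumed to."""
    raw = principal * rate_pct / Decimal(100)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def half_even_interest(principal: Decimal, rate_pct: Decimal) -> Decimal:
    """Hypothetical faulty translation: silently switches to banker's rounding."""
    raw = principal * rate_pct / Decimal(100)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)


def check_equivalence(
    candidate: Callable[..., Decimal],
    recorded_cases: Iterable[tuple[tuple, Decimal]],
) -> list[Mismatch]:
    """Replay recorded legacy inputs through the candidate and collect exact mismatches."""
    mismatches = []
    for inputs, expected in recorded_cases:
        actual = candidate(*inputs)
        if actual != expected:
            mismatches.append(Mismatch(inputs, expected, actual))
    return mismatches


# Hypothetical input/output pairs captured from the legacy system.
RECORDED_CASES = [
    ((Decimal("1000.00"), Decimal("5")), Decimal("50.00")),
    ((Decimal("266.50"), Decimal("1")), Decimal("2.67")),  # exact half-cent case
    ((Decimal("19.99"), Decimal("7.5")), Decimal("1.50")),
]

clean = check_equivalence(migrated_interest, RECORDED_CASES)
faulty = check_equivalence(half_even_interest, RECORDED_CASES)
print(f"half-up candidate mismatches: {len(clean)}")
print(f"half-even candidate mismatches: {len(faulty)}")
```

The half-even variant agrees on ordinary inputs and diverges only on the exact half-cent case, which is the kind of near-correct output that tends to pass casual review. A harness like this is only as good as its recorded cases, so deciding which inputs to capture remains a job for someone who knows the business rules.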

TIER HISTORY

Research: Jan-2023 → Jan-2023
Bleeding Edge: Jan-2023 → Jan-2024
Leading Edge: Jan-2024 → present

EVIDENCE (88)

— Named global services firm (FPT) with 300+ systems and 200M+ LOC transformed; 30% effort reduction in assessment phase for major steel manufacturer case, documenting real deployment economics.

— Market sizing: $22.1B (2026) → $50.7B (2033) at 12.6% CAGR, with application modernization 34% of market, BFSI dominant, broad vertical adoption signaling category maturation.

— AWS and Anthropic official documentation demonstrating integrated workflow for legacy mainframe modernization using reverse engineering and agentic code generation.

— Major consulting firm (EPAM) publicly commits to certifying 10,000+ architects on Claude with 1,300 already certified and 5,000 by Q3 2026; signals enterprise-scale consulting shift toward vendor-specialized practices.

— IBM SVP critical assessment: code translation ≠ modernization. Real work is system-level engineering (data architecture, runtime, transaction integrity). Includes three named customers with metrics.

— Market signal: IBM stock dropped 13.2% (worst day since 2000) after Anthropic announced AI-driven COBOL analysis, indicating an investor perception that AI automates the costly legacy discovery bottleneck.

— AWS-IBM hybrid cloud collaboration with specific named deployment (Toyota Motor NA): 40M+ LOC COBOL to Java in 50% less time with AI, demonstrating deployment velocity and ecosystem acceleration.

— Named case study (ZK Fiddle, 13-year-old app) with specific problem (lost institutional memory), solution (Claude Code + CaseFoundry knowledge base), outcome (unblocked 2-year stalled migration).

HISTORY

  • 2023-H1: Vendor deployments (SoftRoad 950+ migrations, CAST 50%+ time savings) establish proof-of-concept; IBM announces Watson GA; adoption barriers identified as organizational (strategy clarity, skills gaps) and technical (LLM semantic limitations).
  • 2023-H2: IBM watsonx Code Assistant and CAST Advisors reach GA; however, Gartner notes lack of customer case studies for watsonx validation. Developer skepticism persists around correctness and non-technical barriers (organizational alignment, skills crisis). Real-world adoption remains incremental and risk-averse.
  • 2024-Q1: INAIL case study shows 700-app portfolio migration with CAST, proving scale viability. Academic research (ICSE 2024) advances COBOL-to-Java LLM translation. CAST and IBM expand product capabilities (advanced search, new migration advisors). Persistent adoption barriers: programmer shortage (now including C/C++), organizational inertia, need for human expertise in semantic translation.
  • 2024-Q3: Enterprise deployments expand: insurance company achieves 80% faster code understanding with IBM watsonx. Independent survey (Kyndryl, 500 IT leaders) reports 86% planning GenAI deployment for mainframe modernization with 114-225% ROI, but 43% lack skills to operationalize. IBM expands on-premises deployment options. Skills shortage intensifies as critical adoption constraint.
  • 2024-Q4: IBM releases watsonx Code Assistant GA with multi-language support; AWS endorses CAST Highlight in official guidance. CAST serves hundreds of modernization clients via Google Cloud partnership. Software AG discontinues legacy Unix support, creating platform rehosting mandate. NTT DATA documents persistent GenAI challenges (hallucinations, semantic complexity); category enters mature production phase but remains constrained by organizational and human factors rather than technical capability.
  • 2025-Q1: Vendors advance (IBM watsonx v2.x, OpenLegacy on AWS Marketplace), but adoption gap widens: 2025 Arcati survey shows only 13% in production (vs. 86% planning in 2024). Case studies confirm value (110M COBOL in 4 weeks, 66% reverse engineering speedup), but technical validation remains critical: independent testing shows all major AI tools fail semantic correctness in COBOL-to-Java conversion. Market drivers persist (platform EOL urgency, proven ROI), but semantic correctness and organizational skills remain binding constraints.
  • 2025-Q2: Vendors mature tooling through Q2: IBM releases watsonx v2.6 (June 27) with AI agents for autonomous COBOL generation and expanded language support; IBM and CAST deepen partnership for enhanced application discovery; OpenLegacy documents 353% ROI in independent Forrester study. Academic and industry research intensifies: FSE 2025 industry paper presents automated testing framework for semantic equivalence validation in COBOL-to-Java translation, addressing production reliability. Generative models' capability for reverse engineering of legacy systems gains technical validation. Category remains in production deployment phase with mature vendor capabilities but sustained adoption constraint from semantic correctness validation requirements and organizational change management complexity.
  • 2025-Q3: Vendor ecosystem expands across platforms: IBM releases watsonx v2.7 (August) with business rule discovery and natural language COBOL generation; previews watsonx Code Assistant for i (RPG modernization) with 10M+ LOC training. Academic research accelerates quality assurance: ASE 2025 submissions present automated evaluation systems for semantic correctness in COBOL-to-Java translation. Practitioner and market evidence: Thoughtworks case study documents one-month assessment of 2.2M-LOC system; Stack Overflow survey reveals trust paradox (80% adoption but 29% accuracy confidence, 66% fixing "almost-right" code). Adoption remains nascent (13% in production per Arcati survey), confirming organizational and validation barriers persist despite mature technical capabilities and documented ROI.
  • 2025-Q4: IBM advances vendor platform consolidation: watsonx v2.8.0 (December) introduces agentic chat enabling multi-step orchestration across mainframe transformation tasks; Project Bob announcement unifies RPG and COBOL assistants into single platform. Q4 saw primarily vendor announcements and feature releases rather than new deployment case studies; agentic capabilities signal movement toward higher-order automation in legacy analysis workflows. Adoption barriers (semantic validation expertise, domain knowledge, organizational change management) remain unchanged; category remains at leading-edge production phase with mature capabilities but nascent organizational adoption.
  • 2026-Jan: Vendor momentum and real-world deployment signals from the end of Q4 2025 are confirmed: IBM's CEO highlighted watsonx COBOL-to-Java refactoring driving highest z17 revenue in 20 years (Q4 growth 48% YoY). Market data shows 75%+ of organizations now using AI for legacy modernization, with the market projected to grow from $24.98B (2025) to $56.87B (2030). Practical deployment evidence from the insurance sector and engineering case studies documents both AI-assisted approaches (IBM watsonx reference deployments) and deterministic alternatives (AlfaStrakhovanie ANTLR-based migration). Organizational barriers remain binding: a UK government report shows legacy systems consuming half of IT budgets (£2.3B of £4.7B in 2025), with high-risk systems increasing 26% despite remediation efforts, indicating a systemic failure to scale modernization. Category consolidated at leading-edge with a mature vendor ecosystem, proven economics (353% ROI documented), and demonstrated real-world deployment, but constrained by human validation expertise and organizational change management rather than technical capability.
  • 2026-Feb: Competitive disruption enters the market: Anthropic announces Claude AI for COBOL modernization (February 23-24, 2026), triggering a 13.2% single-day decline in IBM stock, the largest move in 25+ years, and signaling investor concern about competitive encroachment on IBM's mainframe modernization franchise. Market context: 220B lines of COBOL running in banking, government, and healthcare; 10% of the COBOL programmer base retiring annually. Real-world deployment accelerates: a food wholesaler modernization case (Keyhole Software) documents 20-30% development acceleration via AI tools; practitioner analysis shows iterative agentic workflows (engineer-agent refactoring, critic-agent validation) enable traceable, testable modernization. Analyst competitive analysis (Futurum) notes Claude targets the discovery, analysis, and documentation phases, while full modernization encompasses broader architectural transformation. Assessment tool market broadens: Replay's visual reverse engineering approach reduces timelines from 18 months to weeks, addressing the 70% legacy rewrite failure rate. Vendor ecosystem remains IBM-dominant, but competitive pressure signals market acceleration and feature consolidation across platforms.
  • 2026-Mar: Deployment evidence and practitioner analysis refine understanding of AI's actual role. Thoughtworks documents 66% reverse engineering acceleration (6 weeks to 2 weeks per 10K lines) on a 15M-line automotive COBOL codebase, validating AI's discovery-phase impact. Practitioner consensus emerges: Thoughtworks (18+ months delivery) describes the real workflow as layered (chunking, summarization, relationship inference with external scaffolding), not single-prompt automation; LLMs augment but cannot simultaneously process tens of millions of lines. Heirloom (PHEAA, largest mainframe-to-AWS migration) argues LLMs are structurally unsuited to a 100% correctness requirement; deterministic compilation produces byte-for-byte equivalence verifiable against tests. Independent adoption survey (Arcati 2026 mainframe practitioners): only 49% expect significant AI impact over 3-5 years; 8% expect major transformation. Top AI use cases remain narrow (anomaly detection 29%, security monitoring 26%), signaling an augmentation rather than replacement posture. Market delineation: Indium analysis clarifies IBM's stock reaction: AI genuinely solves code analysis and documentation but cannot solve business logic extraction, behavioral equivalence validation, or organizational change (estimated 80% of total cost). New vendors entering: CLPS Incorporation (20+ years banking domain) completed a PoC with a major Hong Kong bank, demonstrating commercial viability of AI-assisted transformation in regulated financial services. Bob 1.0 release (March 2026) consolidates RPG/COBOL assistants into a multi-model platform (Anthropic Claude, Meta Llama, Mistral, IBM Granite) with unified pricing tiers ($20-200/month). Category remains leading-edge: vendor capabilities mature and competitive, deployment evidence accumulating across financial services and engineering, but adoption gap persists (13% production per Q1 2025 survey) due to organizational and validation barriers rather than technical capability. Risk-balanced evidence now confirms AI's discovery-phase impact while documenting persistent limitations in semantic validation and regulatory compliance, the key factors governing adoption velocity.
  • 2026-Apr: Vendor ecosystem expands with AWS market entry: AWS Transform service (GA April 2026) delivers agentic AI for code analysis and migration, with a documented case migrating 12 weeks of Control-M workflows to Apache Airflow in 2.5 weeks (3-5x delivery acceleration, 100% validation success) and extending agentic support to PL/I, directly challenging IBM's mainframe-focused dominance. Thoughtworks documented a $12B revenue financial services firm modernizing 4M lines of COBOL/HLASM using agentic AI in 4 weeks (vs 8 weeks planned) with 80% code comprehension accuracy under human-in-the-loop validation, concrete evidence of agentic approaches reaching production speed targets. European enterprises progressed from pilots to production GenAI in mainframe workflows, with an ISG analyst report documenting governance frameworks and human oversight requirements for scaled deployment. BMC Software shifts from generative AI assistance to agentic architecture: Knowledge Hub captures institutional knowledge from historical resolutions; zAdviser Enterprise generates narrative application analysis combining code analysis with operational telemetry. Critical counter-evidence emerged: Gartner predicted 70% of 2026 mainframe exit projects will fail, and a survey of 200 enterprise SRE/DevOps leaders found 43% of AI-generated code requires manual production debugging, with developers spending 38% of weekly time on fixes; Thoughtworks Technology Radar v34 identified 'cognitive debt' as a systemic risk from AI-generated code, warning of semantic diffusion and the need for zero-trust controls. Practitioner evidence accumulates: AltexSoft documents AI agent discovery effectiveness (revealed 11 hidden dependencies vs. team's expectation of 5) balanced against critical context-window and persistence limitations; Sigma Software identifies a critical flaw in naive AI test generation (locks in incorrect behaviors as untestable technical debt) and proposes a behavior-first testing approach. Vendor market survey (IT Jungle) documents competitive ecosystem maturity (IBM Bob multi-model framework, Profound Logic CoderFlow, Fresche Solutions, ARCAD), signaling market acceleration despite persistent adoption barriers. Category remains at leading-edge: vendor capabilities competitive and deployment evidence accumulating, but Gartner's failure prediction and persistent validation overhead confirm that organizational barriers (semantic correctness, expertise) remain the binding constraint.
  • 2026-May: Consulting industry structural shift and market validation accelerate: EPAM Systems announces multi-year partnership with Anthropic to certify 10,000+ architects on Claude, with 1,300 already certified as of May 6, 2026, signaling an enterprise-scale industry shift toward vendor-specialized practices (vs. vendor-agnostic deck consulting). Named deployment evidence strengthens: ZK Fiddle (13-year-old legacy app, 2-year stalled migration) unblocked using Claude Code + institutional knowledge base; Novacomp completed Java 8→17 modernization (10K LOC) in 50 minutes vs. 3 weeks, with 60% technical debt reduction; FPT deployed AI on 300+ systems with 200M+ LOC transformed and 30% assessment effort reduction. Market analysts assess boundaries: DORA 2026 study shows 35-40% productivity gains on greenfield code vs. 10% or less on legacy brownfield, establishing AI's realistic contribution to legacy work. IBM SVP critical analysis articulates the key limitation: code translation ≠ modernization. Real work spans data architecture, runtime behavior, transaction integrity, and regulatory compliance, with business logic extraction and behavioral validation remaining human-intensive. AWS-IBM collaboration (Toyota Motor NA case: 40M+ LOC COBOL to Java in 50% less time) and IBM Bob Premium Package (Z-aware code analysis with multi-model platform) demonstrate vendor maturation and competitive ecosystem positioning. Market sizing confirms category growth: Persistence Market Research projects $22.1B (2026) → $50.7B (2033) at 12.6% CAGR, with BFSI leading and application modernization representing 34% of market. Category consolidated at leading-edge with mature competitive vendor ecosystem, strong real-world deployment evidence, and consulting industry-scale adoption signals; constraints remain organizational (semantic validation expertise, regulatory compliance) and human (verification overhead, with 43% of AI-generated code requiring production debugging per the SRE survey), not technical.

TOOLS