Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.


AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain.

[Chart: domains plotted on a maturity axis running from BLEEDING EDGE to ESTABLISHED]

Language learning with conversational AI

LEADING EDGE

TRAJECTORY: Stalled

AI-powered conversational practice for language learning, providing immersive dialogue with pronunciation and grammar feedback. Includes voice-based conversation practice and contextual correction; distinct from content localisation, which translates existing content rather than teaching language.

OVERVIEW

Conversational AI for language learning has proven it can attract users at scale, but it has not yet proven it can teach them effectively on its own. A handful of forward-leaning platforms (Duolingo, Speak, Talkpal) have deployed AI-driven dialogue practice to tens of millions of users, and meta-analyses confirm measurable gains in pronunciation and vocabulary. That places the practice firmly in leading-edge territory: real value is being delivered, but most language-learning programs and institutions have not adopted it, and the evidence base reveals a hard ceiling. Learners using AI chatbots alone retain only 22% of proficiency gains after six months, compared with 68% for those working with live tutors. The defining tension is not whether conversational AI works as a supplement (it does) but whether it can function as a standalone pedagogical tool. So far, the answer is no. Retention gaps, shallow error correction, and app drop-off rates of 75% within 30 days suggest that engagement mechanics have outpaced learning design. The organisations extracting value are those treating conversational AI as one component of a blended approach, not a replacement for human instruction.

CURRENT LANDSCAPE

Duolingo reached 56.5M daily active users in Q1 2026 (21% YoY growth), with conversational speaking features now core to the platform. The Video Call feature doubled the average spoken words per user over the past year, demonstrating engagement with conversational practice at scale. Speaking Adventures and Spoken tokens have made real-world conversation practice central to the free user experience, and 20,500 course units were published in Q1 2026 alone, a 10x increase over 2024's production pace, driven by AI-accelerated content generation. However, monetization metrics are softening: revenue growth decelerated to 27% YoY (vs. 38% in the prior year), paid subscriber growth slowed, and analysts note rising subscriber acquisition costs and declining conversion rates. CEO Luis von Ahn disclosed in May 2026 that content quality control remains a critical bottleneck: approximately 20% of AI-generated educational material comes out unusable and requires substantial human curation despite infrastructure advances, which directly constrains scale and profitability. Specialist platforms continue to validate market demand: Speak reached $5M in monthly revenue (Feb 2026), with the US emerging as its second-largest revenue source, and user testimony ('better than Duolingo' appears 66 times in US reviews) indicates the app captures learners seeking actual conversational ability. Saylore launched as a new generally available (GA) platform offering CEFR-aligned conversational practice in six languages with offline capability and immediate error correction. Mainstream adoption signals remain visible: Google Translate launched pronunciation practice (April 2026), and Pronounce AI expanded to 100,000+ professionals across 80 countries, validating B2B demand for conversational feedback.

Regulatory and pedagogical constraints are raising adoption barriers. The EU Education Council formally adopted AI education policy in May 2026, documenting three named risks: reduced learner autonomy, bias and data protection threats, and widening digital divides. The EU AI Act's high-risk requirements take effect in August 2026, mandating transparency and human oversight in assessment and learning pathway systems, which effectively sets compliance deadlines that constrain autonomous conversational AI deployments in European institutions. Internationally, peer-reviewed research continues to validate technical efficacy while highlighting implementation barriers: a meta-analysis of 36 studies (2023-2025) shows conversational AI achieves moderate effect sizes on achievement (d=0.61) but negligible impact on motivation (d=0.29), and warns that students who obtain easy answers without evaluating feedback weaken their deep thinking and become dependent. A study of 221 EFL teachers reports widespread barriers to adoption: 65% lack adequate training, alongside data privacy concerns and displacement fears. New empirical evidence from international student research (N=60 survey, N=14 interviews, May 2026) identifies conversational AI as a 'first-aid tool for immediate challenges,' capturing the maturity inflection point: technical capability is proven but pedagogical sustainability is unresolved. Customer sentiment shows sustained backlash following the April 2025 AI-first announcements, with trust erosion documented across 500K+ reviews and user concerns that AI-mediated interaction is displacing human connection.
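To put the reported effect sizes in context, the sketch below labels a Cohen's d value using Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The function name `cohens_d_band` and the "negligible" label for values under 0.2 are assumptions made for illustration; the cited meta-analysis may use different cut-offs. On these benchmarks d=0.61 falls in the medium band, while d=0.29 sits in the small band close to the 0.2 floor, which fits the reading that motivation effects are weak.

```python
def cohens_d_band(d: float) -> str:
    """Label an effect size using Cohen's conventional benchmarks.

    The thresholds (0.2, 0.5, 0.8) are rules of thumb, not hard limits,
    and the 'negligible' label below 0.2 is an illustrative assumption.
    """
    magnitude = abs(d)  # the sign only indicates direction of the effect
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


# Effect sizes quoted in the text for the 36-study meta-analysis.
reported = {"achievement": 0.61, "motivation": 0.29}

for outcome, d in reported.items():
    print(f"{outcome}: d={d:.2f} -> {cohens_d_band(d)}")
```

The same helper can be applied to other figures in the index, such as the d=1.18 reported for the 2024 meta-analysis of 61 studies, which lands in the large band.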

Infrastructure maturity coexists with unresolved equity, competitive, and design challenges. Azure Pronunciation Assessment and other deployed systems achieve measurable performance, yet persistent bias in accent detection and language diversity constrains equitable global rollout; algorithmic fairness gaps disadvantage underrepresented linguistic groups and widen digital divides across developing markets. Competitive commoditization is accelerating: free AI translation tools from Google and T-Mobile, ChatGPT's free language practice capability, and general-purpose AI parity are threatening subscription-based conversational platforms. Hybrid approaches that pair conversational AI with corpus-based scaffolding produce better outcomes than standalone chatbots, yet most consumer platforms still offer conversation as a single modality. Learner retention and user trust remain critical limiting factors: sustained customer backlash over AI-first positioning, 75% app drop-off within 30 days, a documented preference for human interaction, and learner complaints about 'AI slop' in user-generated content suggest that engagement mechanics designed for gamification are misaligned with learning outcomes. The organisations capturing real value are treating conversational AI as one component of blended instruction, not as a replacement for human guidance. Economically, the market is moving from supplementary tool toward primary channel; pedagogically and ethically, the practice remains contingent on human integration and responsible governance frameworks.

TIER HISTORY

Research: Jan-2023 → Jan-2023
Bleeding Edge: Jan-2023 → Apr-2024
Leading Edge: Apr-2024 → present

EVIDENCE (109)

— Peer-reviewed article in Cambridge Annual Review of Applied Linguistics synthesizing evidence on how GenAI addresses gaps in input, interaction, and feedback—three mechanisms central to second language acquisition theory—while identifying open research questions about motivation effects.

— Empirical research showing LLMs provide pronunciation feedback driven by stereotypes rather than acoustic evidence—LLMs converge to fixed L2 difficulty phones regardless of actual pronunciation, revealing fundamental reliability limitation in LLM-based conversational language tutoring.

— Peer-reviewed research quantifying demographic bias (gender, accent, ethnicity, age) across ASR systems—persistent disparities in phoneme accuracy limit equitable conversational AI language learning effectiveness for non-native and regional accent speakers.

— Peer-reviewed action research (n=30 non-native English speakers, 6-week intervention, p<.001, d=0.75) shows AI-powered learning buddy 'Walter' achieved statistically significant improvement in oral presentation scores with identified limitations on conversational depth and cultural sensitivity.

— Technical analysis documenting critical failure mode: code-switched speech (multilingual mixing standard in India, Southeast Asia, urban Asia) causes 30-50% relative WER increase in monolingual ASR models, preventing reliable conversational practice in multilingual learner contexts.

— Vendor deployment report across named Thai schools showing concrete outcomes: pronunciation accuracy +30%, lesson prep time reduced 2.5 hours→6 min, classroom engagement revival, HSK pass rates—demonstrates production-scale adoption of conversational AI language tutoring in ASEAN.

— LatIA literature review identifies algorithmic bias disadvantaging underrepresented linguistic groups, inadequate data security, and transparency gaps in AI language tools; recommends hybrid human-AI approach as most responsible deployment model.

— Observational data from job postings and career profiles documents hundreds of organizations deploying learner-facing AI for language teaching—independent methodology capturing real-world adoption beyond vendor metrics.

HISTORY

  • 2023-H1: Controlled classroom studies documented effectiveness of AI pronunciation coaching in improving student confidence and segmental accuracy; academic research advanced multi-task learning for error detection; consumer platforms reached 89% awareness among university students but received criticism for limited native-speaker interaction and pedagogical depth. Production systems (Azure, major platforms) showed documented reliability issues in pronunciation scoring.
  • 2023-H2: GPT-4 release triggered major vendor launches: Duolingo Max (Mar), Netease Hi Echo (Oct), Google AI English tools (Oct), and Speak expansion (USD 16M Series B-2). Microsoft Pronunciation Assessment reached GA in 14+ languages. Classroom studies (93 EFL students) showed significant speaking-skill gains. Despite investor enthusiasm and polished products, core technical challenges (pronunciation reliability, tonal-language support) and pedagogical gaps remained unresolved.
  • 2024-Q1: Duolingo's Roleplay feature achieved GA with CEFR-aligned conversational practice; independent startups (Loora, Talkpal) demonstrated viable consumer adoption with 15,000+ users. Peer-reviewed research confirmed significant skill improvements after 27+ hours. However, production deployments revealed accuracy and scoring reliability issues; user reports documented level-matching failures; academic assessment research identified AI-human rater discrepancies. Category transitioned from research-driven to deployment-driven, with acknowledged technical limitations alongside effectiveness signals.
  • 2024-Q2: Meta-analysis of 61 studies (N=8,282) confirmed large effect sizes (d=1.18) for AI-guided language learning efficacy; Speak crossed 10M users with $500M valuation; university deployments (ASU, Purdue) expanded institutional adoption. Market investment remained robust ($1.6B in H2 2023 startups) with sustained user growth (Loora 8.3x DAU growth). Technical constraints persisted: Azure Pronunciation Assessment API documented 1-minute processing limits; expert skepticism grew around narrow chatbot implementations in education. Efficacy validation coexisted with production-stage reliability gaps.
  • 2024-Q3: Duolingo reached 103.6M MAUs with 34.1M DAUs (59% growth) and launched GPT-4-powered Max features (video calls, role-play) across 188 countries; Talkpal and Speak continued scaling consumer adoption. Systematic review of 32 studies identified positive learning outcomes alongside geographic and methodological gaps. Research validated positive attitudes toward AI language learning correlating with proficiency gains. Critical analyses emerged highlighting risks of AI dependency in tutoring and limitations of practice-based features alone. Category demonstrated sustained scale but unresolved tension between validated efficacy and production reliability.
  • 2024-Q4: Speak secured $78M Series C funding at $1B valuation (with OpenAI Startup Fund, Accel) after reaching 10M+ downloads and 1B+ spoken sentences, signalling investor confidence in infrastructure maturity; Duolingo deployed Lily AI chatbot with GPT-4 Video Call feature reaching 37.2M DAUs (54% growth). Market analysis positioned AI-powered adaptive learning as key growth driver (+3.8% CAGR impact) in USD 50B+ language learning market by 2031. Systematic review confirmed chatbots improve communication skills, motivation, and self-confidence; however critical analysis raised concerns about anthropomorphization and over-reliance, with OpenAI cautioning that human-like voice interactions could displace human mentors. Category entered mature production phase with clear market validation alongside emerging risks.
  • 2025-Q1: Rigorous evidence continued to validate conversational AI efficacy: randomized trial (N=363) showed 5.90% lexical diversity gains with 9.53% gains for below-proficiency learners; quasi-experiment (N=60) confirmed speaking proficiency improvements and anxiety reduction; meta-review (N=125 studies) positioned bots as mainstream language education technology. Duolingo Max reached ~2M users (5% of 40M DAUs) driven by Lily Video Call feature adoption. Market expansion continued with AI-generated immersive lesson segment growing 30% to $3.73B. Critical analysis emerged questioning AI-only models, emphasizing need for human interaction and integrated pedagogy. Deployment evidence demonstrates sustained efficacy at feature and platform scale while practitioner concerns underscore pedagogical integration gaps.
  • 2025-Q3: Microsoft Azure launched GA feature for conversational AI with unscripted dialogue and real-time pronunciation feedback, advancing infrastructure maturity. Research validated integrated human-AI approaches (N=150 EFL learners) with scaffolded instruction outperforming isolated conversational systems. However, user adoption challenges emerged: Duolingo reported significant user backlash over AI-first pivot with complaints of buggy, culturally insensitive content; market-level analysis documented 75% app drop-off within 30 days, gamification fatigue, and competition from free AI alternatives. Category demonstrates peak technical maturity alongside mounting questions about user trust, retention, and pedagogical sustainability.
  • 2025-Q4: Vendor infrastructure governance matured: Microsoft published Pronunciation Assessment transparency note documenting 100,000+ hours training data and responsible AI considerations. Systematic review of 11 AI tools for ESL (2021-2025) confirmed efficacy gains but highlighted critical limitations (feedback accuracy, learner dependency, cultural bias, insufficient without human mediation). Speak raised $78M Series C at $1B valuation; Duolingo's Lily AI reached 37.2M DAUs. However, user adoption barriers intensified: user research documented critical assessments of Duolingo's gamification-first model, platform migration to alternatives, and practitioner consensus that blended human-AI approaches outperform isolated conversational systems. Category enters late mainstream with technical maturity validated but pedagogical integration and user retention challenges unresolved.
  • 2026-Jan: Market maturation visible: Duolingo stock fell 69.6% despite AI features driving 51% DAU growth, signaling investor skepticism on valuation and market saturation. Systematic review (39 studies, 2021-2025) confirms sustained learning benefits but reveals tight dependence on digital access and teacher training. Technical limitations persist: Azure Pronunciation Assessment forums document word-substitution errors and phoneme-scoring inconsistencies in production. Global adoption barriers remain: voice-first multilingual AI tutors identified as prerequisite for equitable reach; free alternatives cannibalizing premium platforms; practitioner consensus on need for human-centered blended approaches. Category demonstrates technical stability with unresolved challenges in user retention, global reach, and pedagogical integration.
  • 2026-Feb: Platform deployment stabilized while market signals diverged sharply: Duolingo reached 50M+ DAUs with Lily video-call feature but stock corrected 23% as growth guidance slowed (18-20% projected vs 40%+ prior). Talkpal sustained 4.39M MAU with 6.78% monthly growth, validating niche positioning with structured feedback. Critical adoption barrier evidence emerged: comparative research found AI-chatbot learners retain only 22% of proficiency gains vs 68% for live tutors after 6 months; ChatGPT Voice Mode shown insufficient for structured learning (lacks correction, memory, accountability). Vendor infrastructure maturity confirmed (Azure Pronunciation Assessment Feb GA) but with persistent limitations (word-substitution scoring gaps, phoneme inconsistencies). Category at inflection point: technical maturity achieved, platform-scale deployment confirmed, but pedagogical integration tensions and user retention challenges blocking broader adoption.
  • 2026-Mar: Duolingo Q4 2025 results confirmed 50M DAU, 135M MAU, and $1.04B revenue (+39% YoY) with profitability (29.5% EBITDA), validating category-leading scale; inference cost reductions of 10x enabled distribution of conversational AI features (Lily video calls) from premium to free tiers. Peer-reviewed studies (Frontiers in Education N=66, Showa Women's University N=32) validated semester-long LLM chatbot engagement and pronunciation feedback gains in controlled classroom contexts. Investment analysis documents a 100M DAU roadmap, signalling continued vendor commitment. However, critical accessibility limitations surfaced: Gladia technical analysis documents severe ASR bias—women face higher error rates, Black speakers 10x more likely rated 'unusable', and code-switching causes system failures; a Nature Machine Intelligence study reveals WER/CER evaluation metrics are inadequate for language-learning contexts, undermining confidence in reported performance. Category demonstrates clear infrastructure maturity at scale but systemic ASR bias and flawed evaluation methodologies represent unresolved barriers to equitable deployment across diverse learner populations.
  • 2026-Apr: Duolingo achieved 52.7M DAUs (+30% YoY) with $1.04B revenue (+39% YoY) and 36% YoY subscription growth, demonstrating sustained monetization of conversational AI features (Video Call with Lily, Roleplay, Max tier) despite market headwinds. Classroom deployment evidence expanded: Chinese university study (N=108) documented active integration of generative AI for conversational compensation with mixed benefits (anxiety reduction, practice engagement) and documented risks (over-reliance, academic integrity); American School of Budapest piloting Speakology AI with 90 students, targeting 80% teacher integration; Cambridge peer-reviewed study comparing Duolingo + classroom vs classroom-only conditions on beginner French efficacy, confirming Video Call with Lily effectiveness in structured conditions. Systematic review of 221 EFL teachers revealed active AI adoption for lesson planning/assessment with critical barriers: 65% untrained, widespread data privacy/displacement concerns. Mechanism evidence strengthened: PRISMA systematic review (31 studies) identified dual pathways for willingness to communicate—anxiety reduction and growth mindset—with outcomes moderated by proficiency level, technology agency, teacher collaboration. Infrastructure limitations persisted: ASR research documented Whisper matching human performance for English but 'considerable challenges remain for almost all other languages,' constraining global voice-based deployment. Market adoption headwinds intensified: consumer usage declined 15.7% YoY, churn accelerated +85.2%, and machine translation adoption showed 38.4% of students reducing language learning motivation. Engineering analysis documented fundamental STT-based pronunciation feedback failures: systems optimize for word identity not phoneme accuracy. 
    Category enters Q2 2026 with clear infrastructure maturity and deployment validation in controlled classroom settings, but persistent tensions between technical capability, pedagogical design, teacher readiness, pronunciation assessment accuracy, learner retention, and competitive convergence with general-purpose AI tools.
  • 2026-May: Duolingo Q1 2026 confirmed 56.5M DAUs (+21% YoY) and 20,500 course units published in the quarter — a 10x increase from 2024 production pace — but revenue growth decelerated to 27% YoY (vs. 38% prior year) and the CEO disclosed that 20% of AI-generated content comes out unusable, requiring substantial human curation. Speak app reached $5M monthly revenue with the US as second-largest market; Saylore launched as a new GA CEFR-aligned conversational platform across six languages with offline capability. Peer-reviewed research (N=60 survey, N=14 interviews) documents international students using ChatGPT and Gemini as a 'first-aid tool' for language adaptation, with unmet demand for long-term learning engagement — capturing the maturity ceiling precisely: technical capability proven at scale, pedagogical sustainability unresolved. The EU Education Council formally adopted AI education policy in May 2026, documenting reduced learner autonomy as a named risk and setting August 2026 as the compliance deadline for high-risk AI systems in assessment and learning pathways. The engagement-learning tension sharpened: Video Call doubled spoken words per user, yet monetization is softening — confirming that feature-level engagement metrics do not translate directly into retention or revenue.
  • 2026-Jun: Market momentum and competitive pressure sharpened simultaneously: the language tutor bots market is on a 14.2% CAGR trajectory toward USD 7.91B by 2036 (FMI), and Burning Glass labor-market data documents hundreds of organizations actively deploying learner-facing AI for language teaching — yet Duolingo's post-April 2025 AI-first pivot produced documented trust erosion (trust-complaint share rising from 0.27% to 3.71% across 500K+ reviews) and analyst downgrades citing free-AI commoditization from Google, T-Mobile, and ChatGPT as a structural subscription threat. A meta-analysis of 36 studies confirmed moderate achievement gains (d=0.61) but negligible motivation effects (d=0.29), reinforcing that engagement mechanics and learning outcomes remain decoupled at the platform level. Two new empirical findings deepened the pronunciation reliability gap: LLMs were shown to provide stereotype-driven pronunciation feedback—converging to fixed expected-difficulty phonemes regardless of the learner's actual acoustic output—while a separate study quantified persistent demographic bias across ASR systems (gender, accent, ethnicity) that disadvantages non-native and regional-accent speakers; code-switching (standard in multilingual Asian contexts) causes an additional 30-50% relative word-error-rate increase in monolingual models. Against this, iFlytek deployments in Thai schools documented production gains (pronunciation accuracy +30%, lesson prep 2.5 hours reduced to 6 minutes, improved HSK pass rates), confirming that purpose-built multilingual platforms can outperform general-purpose systems in constrained contexts.
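
The "30-50% relative word-error-rate increase" cited for code-switched speech is a relative change, not an absolute one. The sketch below shows how word error rate (WER) is conventionally computed, as word-level edit distance divided by reference length, and how a relative increase is derived from two WER values. The function names and the sample WER figures (10% and 14%) are hypothetical and chosen only for illustration; they are not taken from the cited analysis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_increase(baseline: float, degraded: float) -> float:
    """Fractional change of `degraded` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (degraded - baseline) / baseline


# One substituted word out of four reference words.
print(word_error_rate("where is the station", "where is a station"))

# Hypothetical figures: 10% WER on monolingual speech, 14% on code-switched.
print(f"{relative_increase(0.10, 0.14):.0%} relative increase")
```

Read this way, a 30-50% relative increase on a hypothetical 10% baseline means a WER of roughly 13-15%, so the practical impact depends heavily on how accurate the model was to begin with.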