Language learning with conversational AI



TRAJECTORY: Stalled

AI-powered conversational practice for language learning, providing immersive dialogue with pronunciation and grammar feedback. Includes voice-based conversation practice and contextual correction; distinct from content localisation, which translates existing content rather than teaching a language.

OVERVIEW

Conversational AI for language learning has proven it can attract users at scale, but it has not yet proven it can teach them effectively on its own. A handful of forward-leaning platforms -- Duolingo, Speak, Talkpal -- have deployed AI-driven dialogue practice to tens of millions of users, and meta-analyses confirm measurable gains in pronunciation and vocabulary. That places the practice firmly in leading-edge territory: real value is being delivered, but most language-learning programs and institutions have not adopted it, and the evidence base reveals a hard ceiling. Learners using AI chatbots alone retain only 22% of proficiency gains after six months, compared with 68% for those working with live tutors. The defining tension is not whether conversational AI works as a supplement -- it does -- but whether it can function as a standalone pedagogical tool. So far, the answer is no. Retention gaps, shallow error correction, and 75% app drop-off rates within 30 days suggest that engagement mechanics have outpaced learning design. The organisations extracting value are those treating conversational AI as one component of a blended approach, not a replacement for human instruction.

CURRENT LANDSCAPE

Duolingo reached 56.5M daily active users in Q1 2026 (21% YoY growth) with speaking features now core to the platform, though investor sentiment has diverged from user metrics. Q1 2026 earnings showed revenue of $292M (+27% YoY) and 12.5M paid subscribers (+21% YoY), but the company shifted its profitability expectations to 2027 or later, prioritizing engagement over margin as gross margins decline from 73% to 69% due to AI feature scaling. The Q1 strategic pivot to "user-first" engagement, which included a CEO pledge to reach 100M DAU, triggered user backlash over AI-first positioning, evidence of adoption friction and a preference for human interaction. Specialist platforms continue scaling: Talkpal reached 4.39M monthly active users with steady 6.78% month-over-month growth. Mainstream adoption is also accelerating: Google Translate launched pronunciation practice with AI assessment (April 2026, initially English/Spanish/Hindi), and Pronounce AI deployed conversational coaching to 100,000+ professionals in 80+ countries, signaling B2B and consumer-mainstream demand for conversational pronunciation feedback beyond dedicated language-learning apps.

The underlying infrastructure is production-grade but adoption remains constrained by technical and pedagogical gaps. Microsoft Azure Pronunciation Assessment achieves measurable performance, yet deployed systems show persistent bias: accent detection failures and language diversity gaps constrain equitable global deployment. Peer-reviewed research demonstrates hybrid approaches—pairing AI dialogue with corpus-based scaffolding—produce better outcomes than standalone conversational systems. Independent reviews of leading platforms (Speak) document specific limitations: feedback lacks depth and personalization, pronunciation scoring remains inconsistent across accent variations, and learners develop platform dependency without human mediation.

The most consequential findings concern retention and adoption barriers. Comparative research shows AI-only learners retain only 22% of proficiency gains after six months, compared with 68% for those working with live tutors. A survey of 221 EFL teachers found active AI adoption for lesson planning and assessment alongside widespread barriers: 65% were untrained, and teachers cited data privacy concerns and displacement fears. Duolingo's April 2026 "AI-first" pivot met significant user backlash, with learners complaining about buggy content and expressing concerns about reduced human interaction. These results frame the adoption inflection point: conversational AI has demonstrable technical maturity and early mainstream adoption (Google, Pronounce AI), but pedagogical design, learner retention, equitable performance across languages and accents, and teacher integration remain unresolved. Without breakthroughs in these dimensions, they effectively cap further tier advancement.

TIER HISTORY

Research: Jan-2023 → Jan-2023

Bleeding Edge: Jan-2023 → Apr-2024

Leading Edge: Apr-2024 → present

EVIDENCE (88)

Pronounce AI: Professional English speech coaching platform (Product Launches, 2026-05-07)

— Pronounce AI deployed conversational coaching at scale (100K+ users across 80 countries), validating specialized B2B demand for conversational pronunciation feedback.

Speak App Review: Is It Worth It in 2026? (Opinion, 2026-05-07)

— Critical independent assessment documents specific limitations of deployed conversational platforms—feedback depth, accent bias, learner retention—balancing adoption narratives.

Duolingo goes rogue amid 'AI-first' backlash (News Coverage, 2026-05-07)

— Market signal: Duolingo's user base expressing concerns about AI-first shift, documenting adoption friction and learner preference for human interaction—key barrier to scaling.

Examining the effectiveness of integrating corpus-based and AI approaches for English speaking practice (Research Papers, 2026-05-05)

— Peer-reviewed study documents hybrid corpus+AI approach for EFL speaking practice, informing design of more effective conversational scaffolding systems.

Duolingo Stock Slides After Q1 Beat as 2026 Growth Outlook Tests AI Bet (News Coverage, 2026-05-04)

— Duolingo expanded AI-driven speaking features across 56.5M daily users with 10x scaling of AI content generation, demonstrating production-scale deployment and user adoption.

Duolingo Reports First Quarter 2026 Results (Product Launches, 2026-05-04)

— Official company announcement repositioning conversational practice (speaking/dialogue) from peripheral feature to core strategic priority across platform.

Google Translate Adds Pronunciation Practice Feature Supporting English, Spanish, and Hindi (Product Launches, 2026-04-30)

— Google Translate launched AI-powered pronunciation practice with automated assessment—demonstrating mainstream adoption by major platform with 500M+ users.

How to Test Voice AI Agents for Accent and Language Diversity (Tutorials, 2026-04-29)

— Technical documentation of systemic bias in deployed voice AI agents—accent detection failures, language diversity gaps—constraining equitable global deployment.

HISTORY

2023-H1: Controlled classroom studies documented effectiveness of AI pronunciation coaching in improving student confidence and segmental accuracy; academic research advanced multi-task learning for error detection; consumer platforms reached 89% awareness among university students but received criticism for limited native-speaker interaction and pedagogical depth. Production systems (Azure, major platforms) showed documented reliability issues in pronunciation scoring.
2023-H2: GPT-4 release triggered major vendor launches: Duolingo Max (Mar), Netease Hi Echo (Oct), Google AI English tools (Oct), and Speak expansion (USD 16M Series B-2). Microsoft Pronunciation Assessment reached GA in 14+ languages. Classroom studies (93 EFL students) showed significant speaking-skill gains. Despite investor enthusiasm and polished products, core technical challenges (pronunciation reliability, tonal-language support) and pedagogical gaps remained unresolved.
2024-Q1: Duolingo's Roleplay feature achieved GA with CEFR-aligned conversational practice; independent startups (Loora, Talkpal) demonstrated viable consumer adoption with 15,000+ users. Peer-reviewed research confirmed significant skill improvements after 27+ hours. However, production deployments revealed accuracy and scoring reliability issues; user reports documented level-matching failures; academic assessment research identified AI-human rater discrepancies. Category transitioned from research-driven to deployment-driven, with acknowledged technical limitations alongside effectiveness signals.
2024-Q2: Meta-analysis of 61 studies (N=8,282) confirmed large effect sizes (d=1.18) for AI-guided language learning efficacy; Speak crossed 10M users with $500M valuation; university deployments (ASU, Purdue) expanded institutional adoption. Market investment remained robust ($1.6B in H2 2023 startups) with sustained user growth (Loora 8.3x DAU growth). Technical constraints persisted: Azure Pronunciation Assessment API documented 1-minute processing limits; expert skepticism grew around narrow chatbot implementations in education. Efficacy validation coexisted with production-stage reliability gaps.
2024-Q3: Duolingo reached 103.6M MAUs with 34.1M DAUs (59% growth) and launched GPT-4-powered Max features (video calls, role-play) across 188 countries; Talkpal and Speak continued scaling consumer adoption. Systematic review of 32 studies identified positive learning outcomes alongside geographic and methodological gaps. Research validated positive attitudes toward AI language learning correlating with proficiency gains. Critical analyses emerged highlighting risks of AI dependency in tutoring and limitations of practice-based features alone. Category demonstrated sustained scale but unresolved tension between validated efficacy and production reliability.
2024-Q4: Speak secured $78M Series C funding at $1B valuation (with OpenAI Startup Fund, Accel) after reaching 10M+ downloads and 1B+ spoken sentences, signalling investor confidence in infrastructure maturity; Duolingo deployed Lily AI chatbot with GPT-4 Video Call feature reaching 37.2M DAUs (54% growth). Market analysis positioned AI-powered adaptive learning as key growth driver (+3.8% CAGR impact) in USD 50B+ language learning market by 2031. Systematic review confirmed chatbots improve communication skills, motivation, and self-confidence; however critical analysis raised concerns about anthropomorphization and over-reliance, with OpenAI cautioning that human-like voice interactions could displace human mentors. Category entered mature production phase with clear market validation alongside emerging risks.
2025-Q1: Rigorous evidence continued to validate conversational AI efficacy: randomized trial (N=363) showed 5.90% lexical diversity gains with 9.53% gains for below-proficiency learners; quasi-experiment (N=60) confirmed speaking proficiency improvements and anxiety reduction; meta-review (N=125 studies) positioned bots as mainstream language education technology. Duolingo Max reached ~2M users (5% of 40M DAUs) driven by Lily Video Call feature adoption. Market expansion continued with AI-generated immersive lesson segment growing 30% to $3.73B. Critical analysis emerged questioning AI-only models, emphasizing need for human interaction and integrated pedagogy. Deployment evidence demonstrates sustained efficacy at feature and platform scale while practitioner concerns underscore pedagogical integration gaps.
2025-Q3: Microsoft Azure launched GA feature for conversational AI with unscripted dialogue and real-time pronunciation feedback, advancing infrastructure maturity. Research validated integrated human-AI approaches (N=150 EFL learners) with scaffolded instruction outperforming isolated conversational systems. However, user adoption challenges emerged: Duolingo reported significant user backlash over AI-first pivot with complaints of buggy, culturally insensitive content; market-level analysis documented 75% app drop-off within 30 days, gamification fatigue, and competition from free AI alternatives. Category demonstrates peak technical maturity alongside mounting questions about user trust, retention, and pedagogical sustainability.
2025-Q4: Vendor infrastructure governance matured: Microsoft published Pronunciation Assessment transparency note documenting 100,000+ hours training data and responsible AI considerations. Systematic review of 11 AI tools for ESL (2021-2025) confirmed efficacy gains but highlighted critical limitations (feedback accuracy, learner dependency, cultural bias, insufficient without human mediation). Speak raised $78M Series C at $1B valuation; Duolingo's Lily AI reached 37.2M DAUs. However, user adoption barriers intensified: user research documented critical assessments of Duolingo's gamification-first model, platform migration to alternatives, and practitioner consensus that blended human-AI approaches outperform isolated conversational systems. Category enters late mainstream with technical maturity validated but pedagogical integration and user retention challenges unresolved.
2026-Jan: Market maturation visible: Duolingo stock fell 69.6% despite AI features driving 51% DAU growth, signaling investor skepticism on valuation and market saturation. Systematic review (39 studies, 2021-2025) confirms sustained learning benefits but reveals tight dependence on digital access and teacher training. Technical limitations persist: Azure Pronunciation Assessment forums document word-substitution errors and phoneme-scoring inconsistencies in production. Global adoption barriers remain: voice-first multilingual AI tutors identified as prerequisite for equitable reach; free alternatives cannibalizing premium platforms; practitioner consensus on need for human-centered blended approaches. Category demonstrates technical stability with unresolved challenges in user retention, global reach, and pedagogical integration.
2026-Feb: Platform deployment stabilized while market signals diverged sharply: Duolingo reached 50M+ DAUs with Lily video-call feature but stock corrected 23% as growth guidance slowed (18-20% projected vs 40%+ prior). Talkpal sustained 4.39M MAU with 6.78% monthly growth, validating niche positioning with structured feedback. Critical adoption barrier evidence emerged: comparative research found AI-chatbot learners retain only 22% of proficiency gains vs 68% for live tutors after 6 months; ChatGPT Voice Mode shown insufficient for structured learning (lacks correction, memory, accountability). Vendor infrastructure maturity confirmed (Azure Pronunciation Assessment Feb GA) but with persistent limitations (word-substitution scoring gaps, phoneme inconsistencies). Category at inflection point: technical maturity achieved, platform-scale deployment confirmed, but pedagogical integration tensions and user retention challenges blocking broader adoption.
2026-Mar: Duolingo Q4 2025 results confirmed 50M DAU, 135M MAU, and $1.04B revenue (+39% YoY) with profitability (29.5% EBITDA), validating category-leading scale; 10x inference cost reductions enabled distribution of conversational AI features (Lily video calls) from premium to free tiers. Peer-reviewed studies (Frontiers in Education N=66, Showa Women's University N=32) validated semester-long LLM chatbot engagement and pronunciation feedback gains in controlled classroom contexts. Investment analysis documents a 100M DAU roadmap, signalling continued vendor commitment. However, critical accessibility limitations surfaced: Gladia technical analysis documents severe ASR bias (women face higher error rates, Black speakers are 10x more likely to be rated 'unusable', and code-switching causes system failures), and a Nature Machine Intelligence study reveals that WER/CER evaluation metrics are inadequate for language-learning contexts, undermining confidence in reported performance. Category demonstrates clear infrastructure maturity at scale but systemic ASR bias and flawed evaluation methodologies represent unresolved barriers to equitable deployment across diverse learner populations.
2026-Apr: Duolingo achieved 52.7M DAUs (+30% YoY) with $1.04B revenue (+39% YoY) and 36% YoY subscription growth, demonstrating sustained monetization of conversational AI features (Video Call with Lily, Roleplay, Max tier) despite market headwinds. Classroom deployment evidence expanded: a Chinese university study (N=108) documented active integration of generative AI for conversational compensation, with mixed benefits (anxiety reduction, practice engagement) and documented risks (over-reliance, academic integrity); the American School of Budapest is piloting Speakology AI with 90 students, targeting 80% teacher integration; and a Cambridge peer-reviewed study compared Duolingo plus classroom instruction against classroom-only instruction for beginner French, confirming the effectiveness of Video Call with Lily in structured conditions. Survey evidence from 221 EFL teachers revealed active AI adoption for lesson planning and assessment alongside critical barriers: 65% untrained, plus widespread data privacy and displacement concerns. Mechanism evidence strengthened: a PRISMA systematic review (31 studies) identified dual pathways to willingness to communicate (anxiety reduction and growth mindset), with outcomes moderated by proficiency level, technology agency, and teacher collaboration. Infrastructure limitations persisted: ASR research documented Whisper matching human performance for English but 'considerable challenges remain for almost all other languages,' constraining global voice-based deployment. Market adoption headwinds intensified: consumer usage declined 15.7% YoY, churn accelerated by 85.2%, and machine translation adoption was associated with reduced language-learning motivation among 38.4% of students. Engineering analysis documented fundamental STT-based pronunciation feedback failures: systems optimize for word identity, not phoneme accuracy. Category enters Q2 2026 with clear infrastructure maturity and deployment validation in controlled classroom settings, but persistent tensions between technical capability, pedagogical design, teacher readiness, pronunciation assessment accuracy, learner retention, and competitive convergence with general-purpose AI tools.
2026-May: Pronounce AI validated B2B conversational coaching at scale (100,000+ professionals across 80 countries), confirming demand for specialised pronunciation feedback outside consumer language-learning apps. Duolingo's AI-first pivot continued generating documented user backlash, with learners citing reduced human interaction and buggy content — a concrete market signal that conversational AI product design remains a competitive differentiator, not a solved problem.
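
The 2026-Mar and 2026-Apr entries note that WER/CER metrics fit language learning poorly and that STT-based feedback optimizes for word identity rather than phoneme accuracy. The sketch below illustrates that gap under stated assumptions. It is not any vendor's scoring method. The sentence, the ARPAbet-style phoneme sequences, and the assumption that the recogniser normalises a mispronounced "think" (spoken as "sink") back to the intended word are all hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalised by reference length (WER or PER, by token type)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical learner utterance: "I think so", with "think" spoken as "sink".
target_words = ["i", "think", "so"]
stt_words = ["i", "think", "so"]  # assumption: STT outputs the intended word

# Hypothetical ARPAbet-style transcriptions of target vs. actual speech.
target_phonemes = ["AY", "TH", "IH", "NG", "K", "S", "OW"]
spoken_phonemes = ["AY", "S", "IH", "NG", "K", "S", "OW"]

wer = error_rate(target_words, stt_words)           # word-level view
per = error_rate(target_phonemes, spoken_phonemes)  # phoneme-level view

print(f"WER = {wer:.3f}, PER = {per:.3f}")
```

Under these assumptions the word-level score reports no error while the phoneme-level score registers the TH/S substitution, which is the kind of mismatch a learner would need feedback on.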