1. Introduction
As artificial intelligence (AI) matures from experimental prototypes into essential components of digital infrastructure, it has initiated profound changes in how we engage with technology for learning and communication. Early AI-based language assistants like Amazon Alexa provided basic pronunciation training and conversation simulation, though they were limited in accuracy and naturalness (Dizon & Tang, 2020). The situation changed dramatically with the emergence of powerful generative AI platforms such as ChatGPT in late 2022, which have opened unprecedented pathways for second language (L2) development by offering more intelligent language support and highly personalized practice opportunities (Soyoof, Gee & Liu, 2025). Meanwhile, the influence of AI extends beyond standalone large language model (LLM)–powered chatbots to specialized language learning applications, with platforms like Duolingo and Englia (an e-dictionary) embedding LLMs and AI algorithms into platform design to enhance their pedagogical offerings and learner experiences. As pointed out by Godwin-Jones (2024), the ongoing evolution and proliferation of AI across digital spaces point toward a future where AI tools become as indispensable to language learning as mobile technology is to our daily interactions.
In the field of computer-assisted language learning (CALL), researchers have started to document how learners now actively harness AI’s instant feedback mechanisms, writing assistance, and interactive features for additional language development (Haristiani, Dewanty & Rifai, 2022; Jeon & Lee, 2024; Tai & Chen, 2024). Particularly noteworthy is the mounting evidence on AI-mediated informal digital learning of English (AI-IDLE), which reveals how English language learners worldwide independently leverage AI tools beyond classroom settings (Liu, Darvin & Ma, 2024a; Liu, Lee & Zhao, 2025; Liu & Zhao, 2025). This phenomenon extends beyond English to several other languages, including French and Spanish (e.g., Alm, 2024; Huang & Cassany, 2025). This research convergence suggests that today’s learners are not limited to completing academic assignments but are agentively experimenting with AI for diverse informal multilingual learning activities. They are able to critically negotiate AI’s affordances and limitations while immersing themselves in the diverse range of languages, genres, styles, and registers that AI can expose them to (Liu, Lee & Zhao, 2025). Through these emerging practices, AI-mediated informal language learning (AI-ILL), which focuses on L2 development in out-of-class environments, is transitioning from theoretical possibility to lived reality.
With the accelerating integration of AI resources into learners’ informal language development, there remains a critical need for a scoping review that comprehensively maps this nascent field to examine the complex interplay between learners and intelligent CALL environments. By synthesizing existing research and identifying emergent patterns across contexts and languages, this review serves as a pivotal reference point for CALL researchers and practitioners as AI technologies continue to help us reimagine how L2 is learned informally in a world shaped by constantly emerging technologies.
2. Understanding AI-ILL through the lens of proactive language learning theory
This study draws upon proactive language learning theory (PLLT) to frame AI-ILL as a theoretically grounded phenomenon within L2 acquisition research. PLLT, developed by Papi and Hiver (2025a), conceptualizes L2 acquisition as an inherently agentic process where learners actively engage with linguistic resources in their environment rather than passively receiving instruction. The theory positions learners as autonomous agents who deliberately seek out input, interaction, feedback, and metalinguistic information to advance their language development (Papi & Hiver, 2025b). It emphasizes the self-directed and strategic nature of language learning behaviors, recognizing learners as complete individuals with distinct motivations, aspirations, and self-determined actions. In addition, Papi and Hiver (2025a, 2025b) propose that proactive learning behaviors are influenced by both contextual variables (such as learning environment and available resources) and learner variables (including personality traits, motivation, and cognitive factors), which drive specific proactive behaviors that lead to enhanced language outcomes.
PLLT provides a compelling lens for understanding AI-ILL because it captures the fundamentally autonomous and self-directed nature of learners’ engagement with AI tools outside formal classroom settings. The theory’s emphasis on learners as proactive agents aligns with how individuals independently explore and utilize AI technologies for language development, enabling researchers to systematically examine the varied approaches learners from different backgrounds employ to actively identify, generate, and leverage learning opportunities within informal and independent L2 learning environments (Liu & Zhao, 2025).
Viewed through the lens of PLLT, AI-ILL can be understood as the self-directed, proactive, and independent use of AI technologies by language learners to support their L2 development outside formal educational settings. Note that we use the term “AI-mediated” rather than “AI-assisted” to emphasize the reciprocal role AI technologies play in shaping learners’ self-directed language engagement, which closely chimes with PLLT’s view of learners as proactive agents who co-construct their learning environments. Examples of AI-ILL practices might include proactively using ChatGPT for conversation practice (interaction-seeking behavior) and employing AI writing assistants for feedback on their compositions (feedback-seeking behavior) beyond the classroom. Understanding the antecedents (or the predictive factors/conditions) that motivate these AI-ILL behaviors and their subsequent outcomes is crucial for comprehending how AI tools can effectively support autonomous language development and for identifying optimal conditions that promote successful AI-mediated learning experiences outside of the classroom.
3. Prior literature reviews associated with AI-ILL
While review studies on AI-ILL are still scarce, a few have explored related areas. Soyoof, Reynolds, Vazquez-Calvo and McLay (2023) offered the first systematic and comprehensive overview of informal digital learning of English (IDLE). Similarly, building upon ecological systems theory, Guo and Lee’s (2023) systematic review of IDLE examined factors affecting learners’ IDLE behaviors and perceptions. The thematic review by Liu, Soyoof, Lee and Zhang (2025) drew attention to the development of IDLE in Asian English as a foreign language contexts. In parallel, researchers have begun examining the application of AI in L2 learning (Law, 2024; Weng & Chiu, 2023; Yang & Li, 2024; Zhu & Wang, 2025). For example, Weng and Chiu (2023) reviewed the relationship between instructional design and learning outcomes in intelligent CALL environments, highlighting that such environments provide learners with multiple benefits through personalized and automatic feedback as well as intelligent tutoring. Yang and Li (2024) conducted a systematic review of 44 selected studies on ChatGPT’s role in facilitating L2 learning in general. Chang and Sun’s (2024) systematic review examined how AI influenced self-regulated language learning from 2000 to 2022, emphasizing AI’s metacognitive importance in enabling students to learn with emerging technology as a partner and in facilitating independent critical thinking.
Although the above review studies have highlighted AI’s growing role in enhancing L2 learning through personalized support, self-regulation, and increased interaction with the target language, they largely overlook informal learning contexts. One exception is Guan, Li and Gu’s (2024) meta-analysis on the influence of generative AI in informal digital English learning on L2 learners’ English proficiency, motivation, and self-regulation. Nevertheless, a key limitation of this review is that it includes only 15 experimental studies, most published before 2023. Furthermore, while the authors claimed that their reviewed studies focus on AI-mediated informal English learning, most may not genuinely fit within the scope of informal language learning because they do not explicitly clarify how learners leverage AI tools to learn an L2 in an out-of-class and self-directed way.
Thus, there remains a knowledge gap requiring a review that adopts stricter and more robust criteria to map out the terrain of AI-ILL. Such a review should not be confined to English but should incorporate other languages as well. With this in mind, this paper addresses three research questions (RQs):
- RQ1. What is the current landscape of research on AI-ILL?
- RQ2. What are the key reported antecedents of AI-ILL by L2 learners?
- RQ3. What are the key reported outcomes of AI-ILL by L2 learners?
4. Method
4.1 A scoping review approach
This study adopts a scoping review approach to identify and map the growing body of research on AI-ILL. As noted in Chong and Plonsky’s (2024) taxonomy of secondary research in applied linguistics, scoping reviews can be understood as a systematic form of review, particularly well-suited for emerging fields with diverse methodologies and rapidly evolving literature. In line with this perspective, scoping reviews have been recognized as especially valuable for providing a broad yet systematic overview, often serving as a foundational step before more narrowly focused systematic reviews by examining key features of a concept and surveying the types of available evidence (Alexander, 2020; Chong & Plonsky, 2024).
4.2 Initial literature retrieval
We conducted multiple iterative literature searches, with the final search completed in mid-April 2025. To ensure comprehensive coverage, we included peer-reviewed journal articles, conference proceedings, book chapters, and graduate-level theses (both master’s and doctoral), provided they were published in English. As shown in Figure 1, to ensure coverage of both educational and interdisciplinary AI-focused scholarship, we selected four major academic databases – Scopus, Web of Science, ERIC, and ProQuest – based on their broad indexing of research in applied linguistics, educational technology, and computer science. We used the following search string, which was carefully designed to balance precision and recall:
((“artificial intelligence” OR “AI” OR “large language model*” OR “LLM*” OR “chatbot*” OR “ChatGPT” OR “GPT” OR “virtual assistant*”) AND (“language learning” OR “language acquisition” OR “second language” OR “L2” OR “foreign language” OR “additional language”) AND (“informal learning” OR “informal digital language learning” OR “extramural learning” OR “self-directed” OR “autonomous” OR “out-of-class” OR “independent learning” OR “self-study”))

Figure 1. The PRISMA (Page et al., 2021) flowchart on the identification of the studies in the review pool.
Following the methodological guidance of Alexander (2020), we employed a carefully structured search strategy. Instead of limiting search terms to titles and keywords, we also targeted their appearance in abstracts and main texts to reduce the risk of missing relevant studies that use specialized terminology in varied contexts. We further conducted manual reference checks of included studies. This approach yielded 176 unique records from database searches (after removing duplicates) and an additional eight studies from reference checks.
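As a minimal sketch, the three-concept structure of the search string reported above (AI terms AND language-learning terms AND informal-learning terms) can be assembled programmatically, which may help when adapting it to the syntax of a particular database. The names `or_group` and `build_query` and the three list variables are hypothetical and introduced only for illustration; the terms themselves are those in the reported string, written here with straight quotes.

```python
# Hypothetical helper (not from the original study): rebuilds the Boolean
# search string from its three concept groups.
AI_TERMS = ["artificial intelligence", "AI", "large language model*", "LLM*",
            "chatbot*", "ChatGPT", "GPT", "virtual assistant*"]
LANGUAGE_TERMS = ["language learning", "language acquisition", "second language",
                  "L2", "foreign language", "additional language"]
INFORMAL_TERMS = ["informal learning", "informal digital language learning",
                  "extramural learning", "self-directed", "autonomous",
                  "out-of-class", "independent learning", "self-study"]


def or_group(terms):
    """Quote each term, join with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join the OR-groups with AND and wrap the result in parentheses."""
    return "(" + " AND ".join(or_group(group) for group in groups) + ")"


query = build_query(AI_TERMS, LANGUAGE_TERMS, INFORMAL_TERMS)
print(query)
```

Keeping each concept as a separate list makes it straightforward to add or drop a synonym without disturbing the overall AND/OR structure.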
4.3 Inclusion criteria
The selection of studies for this scoping review adheres to the following inclusion criteria to ensure relevance and quality (see Figure 1 for a detailed illustration of the screening process):
1. Studies must explicitly examine informal, self-directed L2 learning contexts where learners can exercise autonomy outside traditional classroom settings. Sixty-three studies were excluded because they were unrelated to informal learning.
2. The research must substantively incorporate AI technologies that exhibit core AI features (e.g. natural language processing, intelligent adaptivity) and clearly describe how these tools dynamically mediate the learning experience. We removed 12 records that failed to satisfy this criterion.
3. Studies should focus specifically on the acquisition of languages beyond the learner’s first language(s), with clear identification of the target language(s) being learned. Only one record did not meet this requirement.
4. All included research must present original empirical data, whether qualitative, quantitative, or mixed methods, that document actual learner attitudes/perceptions, behaviors, practices, or outcomes when engaging with AI tools. Consequently, 26 records from the database search and three from the reference search were excluded.
5. To maintain scholarly rigor, we included only peer-reviewed publications from academic journals and high-quality conference proceedings with full papers. In addition, a small number of peer-reviewed book chapters presenting original empirical data from reputable edited volumes were incorporated. Although our initial ProQuest search generated 36 records, none corresponded to master’s or doctoral theses or dissertations, which were excluded because they are not subject to formal peer review. Consistent with this criterion, we removed 14 non-peer-reviewed items, including book reviews, retraction notices, and editorial notes.
Through the application of these inclusion criteria, our final corpus comprised 65 publications, including 51 journal articles, two book chapters, and 12 conference papers (see the supplementary materials).
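The screening arithmetic can be checked with a short sketch. It uses only the counts reported in Sections 4.2 and 4.3 and assumes that each excluded record is counted under exactly one criterion; the dictionary keys are illustrative labels, not terms from the original coding.

```python
# Record counts as reported in Sections 4.2 and 4.3.
# Assumption: each excluded record is counted under exactly one criterion.
identified = {"database_search": 176, "reference_checks": 8}
excluded = {
    "criterion_1_not_informal_learning": 63,
    "criterion_2_no_core_ai_features": 12,
    "criterion_3_target_language": 1,
    "criterion_4_no_original_empirical_data": 26 + 3,  # database + reference search
    "criterion_5_not_peer_reviewed": 14,
}
final_corpus = {"journal_articles": 51, "book_chapters": 2, "conference_papers": 12}

total_identified = sum(identified.values())
total_excluded = sum(excluded.values())
remaining = total_identified - total_excluded

print(total_identified, total_excluded, remaining)  # 184 119 65
```

Under that assumption, the 184 identified records minus the 119 exclusions leave 65, which matches the sum of the journal articles, book chapters, and conference papers in the final corpus.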
4.4 Coding
To address RQ1, we employed a deductive coding approach using Microsoft Excel. Informed by established scoping review guidelines (Alexander, 2020) and relevant precedents in L2 education research (e.g. Soyoof et al., 2023), the coding scheme included eight key dimensions: (1) research design nature, (2) geographical distribution of studies, (3) primary methodological approaches, (4) participant demographics, (5) technological context (pre-ChatGPT or post-ChatGPT), (6) specific AI tools employed, (7) target languages being acquired, and (8) theoretical frameworks, alongside supplementary notes capturing unique characteristics of each reviewed study.
For RQ2 and RQ3, we implemented an inductive coding approach (Thomas, 2006) using NVivo 12. This process began by consolidating abstracts and key findings from all 65 studies into a structured document for NVivo analysis. Following an initial familiarization reading, we conducted systematic line-by-line coding without predetermined categories, generating descriptive codes that captured essential elements related to antecedents, outcomes, and challenges of AI-ILL. These initial codes underwent multiple iterations of refinement and systematic organization into hierarchical structures through continuous cross-study comparison. Before formal coding, both coders (the authors) participated in collaborative training sessions involving trial coding and joint discussion to establish a shared understanding of the coding procedure. Disagreements were resolved through iterative discussion until consensus was reached, ensuring consistent code application across the data set. We also co-developed a coding memo to guide consistent code application as new categories emerged. The analysis culminated in the synthesis of recurring patterns into coherent thematic categories (e.g. affective antecedents, linguistic outcomes) that directly respond to RQ2 and RQ3. Thus, we emphasize that these categories emerged from an inductive, data-driven process in NVivo, and the codebook was refined collaboratively through multiple cycles of comparison, grouping, and thematic consolidation. Furthermore, these categories represent domain-level themes derived from patterns across studies, rather than reflexive or interpretative constructs, to ensure transparency and replicability. To ensure robust interrater reliability throughout the coding process, both coders independently coded all 65 studies. We then compared coding outputs for all studies and calculated the observed kappa coefficient using the standard formula:
$$\kappa = \frac{P_o - P_e}{1 - P_e}$$
With an observed agreement (Po) of 92% against an expected chance agreement (Pe) of 60%, we achieved a kappa value of .8 (p < .001), indicating strong interrater agreement according to established standards (Cohen, 1960).
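As a brief worked check, the reported kappa value can be reproduced from the two agreement figures given above. The function name `cohen_kappa` is a hypothetical helper written for this illustration; it takes the two proportions directly because the underlying coding table is not reported here.

```python
def cohen_kappa(p_o: float, p_e: float) -> float:
    """Cohen's kappa from observed (p_o) and expected chance (p_e) agreement."""
    if not 0.0 <= p_e < 1.0:
        raise ValueError("expected agreement must be in [0, 1)")
    return (p_o - p_e) / (1.0 - p_e)


# Reported values: 92% observed agreement, 60% expected chance agreement.
kappa = cohen_kappa(p_o=0.92, p_e=0.60)
print(round(kappa, 2))  # 0.8
```

In other words, the coders agreed on 0.32 more of the codes than chance alone would predict, out of a maximum possible 0.40 above chance, which gives 0.32 / 0.40 = 0.8.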
5. Findings
5.1 The current landscape of research on AI-ILL
5.1.1 Year of publication
The publication trend revealed that research on AI-ILL is a highly nascent field of investigation, with only isolated studies appearing before 2022 and the overwhelming majority of research (55 out of 65 studies) emerging in the brief period between 2023 and early 2025 (see Figure 2). This trend may be attributable to the release of ChatGPT in late 2022, which opened broad access to generative AI and substantially expanded both the accessibility and the potential applications of AI tools for language learning, thereby triggering unprecedented research interest.

Figure 2. Study counts by publication years.
5.1.2 Geographical distribution of studies Footnote 1
As demonstrated in Figure 3, the geographical distribution of the selected studies appears notably concentrated in East Asia, with China Mainland representing the primary source with 25 studies, followed by Japan (6 studies), China Hong Kong (5 studies), and other East Asian contexts such as South Korea (3 studies). This East Asian prominence suggests regional differences in AI integration for language learning, alongside a relative scarcity of research from Africa, South America, and much of Europe. Furthermore, the United States contributes only 3 studies despite being a major AI development hub.

Figure 3. Study counts by geographical distribution
Note. The category “Others” refers to studies that did not explicitly indicate a research context or studies conducted across multiple contexts that could not be attributed to a single country/region.
5.1.3 Research design nature and primary methodological approaches
The research design characteristics summarized in Table 1 suggest a field still in its methodological development phase. Cross-sectional designs overwhelmingly dominate the research landscape (75.4%), with longitudinal studies representing only a small fraction (4.6%) of the reviewed literature, suggesting limited investigation into the long-term effects of AI-mediated language learning. Regarding methodological approaches, mixed-methods research appears to be the most common single approach (24.6%). This may reflect researchers’ attempts to capture both quantitative metrics and qualitative insights in this emerging field. Qualitative approaches (exploratory studies and case studies combined) represent 35.4% of the methodologies employed, while purely quantitative approaches (quasi-experimental and questionnaire-based) account for 30.8%, which indicates a relatively balanced distribution between qualitative and quantitative paradigms despite the field’s novelty. Notably, all studies using quantitative survey designs reported good internal consistency.
Table 1. Study counts by research design

5.1.4 Participant profile
The participant profiles in Figure 4 demonstrate a pronounced focus on undergraduate students, who are the participants in just over half of the reviewed studies (33 studies, 50.8%), with elementary students (6 studies, 9.2%) forming a distant second group of interest. We interpret this strong emphasis on higher education contexts as a sign that researchers may be prioritizing populations with greater technological access and autonomy, while the notable share of studies with “unspecified” participants (13 studies, 20%) points to shortcomings in participant reporting in this emerging field. Other participant profiles receive minimal attention, including language teachers (3 studies), middle school students (2 studies), preschoolers (1 study), and non-traditional learner groups (2 studies) that include YouTubers and online learning app users.

Figure 4. Study counts by participant.
5.1.5 Technology focus and the types of AI tools employed
Figure 5 shows that self-developed chatbots remain the predominant AI tools investigated (27 studies, 41.5%), though it is important to note that many of these tools directly build upon the underlying LLMs released after 2022. In addition, the combined prevalence of multiple generative AI tools (17 studies, 26.2%) and ChatGPT (12 studies, 18.5%) may reflect researchers’ balanced approach between exploring commercial platforms and developing specialized AI language learning environments tailored to specific educational contexts.

Figure 5. Study counts by types of AI tools involved.
5.1.6 The target language
Figure 6 shows the distribution of target languages in the selected studies. English is the overwhelming focus, appearing in 49 of the reviewed studies (75.4%). Japanese is a distant second with only 4 studies (6.2%), followed by Spanish with 3 studies (4.6%) and Chinese with 2 studies (3.1%), while Finnish, French, German, Italian, and Russian each appear in just a single study. This salient emphasis on English reflects both its global status and possibly the predominant language capabilities of current AI systems, though it leaves significant questions about how AI-mediated informal learning might operate differently when applied to less frequently spoken languages.

Figure 6. Study counts by the target language.
Note. The category “Others” refers to studies that involve multiple target languages.
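The percentages reported for the target languages follow from dividing each study count by the 65 studies in the review pool. The sketch below reproduces them for the four languages whose counts are stated explicitly; the helper name `share` is hypothetical, and the “Others” category is left out because its count is not given in the text.

```python
TOTAL_STUDIES = 65  # size of the final review pool

# Counts stated in the text for Figure 6; "Others" is omitted (count not reported).
reported_counts = {"English": 49, "Japanese": 4, "Spanish": 3, "Chinese": 2}


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Percentage of the review pool, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {language: share(count) for language, count in reported_counts.items()}
print(shares)
# {'English': 75.4, 'Japanese': 6.2, 'Spanish': 4.6, 'Chinese': 3.1}
```

The same calculation applies to the other study-count figures in this section, since all of them use the 65 reviewed studies as the denominator.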
5.1.7 The theoretical/conceptual framework
As demonstrated in Figure 7, the theoretical foundation of AI-ILL research presents a notable gap, with 36 studies (55.4%) operating without any specified theoretical framework or leaving their conceptual underpinnings unclear. When examining the 29 studies with theoretical bases, self-directed learning theory and the technology acceptance model emerge as the most frequently applied frameworks, each appearing in 5 studies (7.7%), while self-regulated learning theory follows with 4 studies (6.2%). The theoretical landscape further fragments into various approaches, such as self-determination theory (3 studies) and learning autonomy (2 studies). This absence of explicit theoretical frameworks may mirror earlier phases in CALL research history, where the focus was primarily on technological exploration and pedagogical experimentation rather than theoretical articulation (Hubbard & Levy, 2016). The rapid evolution of AI technologies in informal language learning may have similarly led to a prioritization of applied research, underscoring the need for more systematic theoretical integration in future studies.

Figure 7. Study counts by the theoretical/conceptual framework.
Note. The category “Others” refers to theories that appeared only once in the review pool, such as the theory of planned behaviors.
5.2 The key reported antecedents of AI-ILL
In this review, AI-ILL is understood as learners’ self-directed and proactive practices of using AI technologies to support L2 development outside formal educational settings. Although the included studies differed in focus, ranging from actual practices to willingness to use AI, they all align with this broader conceptualization. Within this premise, our analysis identified several predictive variables of AI-ILL, which can be categorized into cognitive (related to thinking processes and prior knowledge), affective (concerning emotions and motivation), and sociocontextual dimensions (regarding both sociodemographic characteristics and sociotechnological environments).
5.2.1 Cognitive antecedents
Our analysis revealed that learners’ perceptions of AI’s learning affordances are powerful cognitive antecedents influencing their adoption of AI tools for informal language learning. Fifteen studies have demonstrated that how learners recognize and interpret the capabilities of AI applications directly shapes their willingness to integrate these tools into self-directed learning practices (Dizon & Tang, 2020; Li, Wang & Bonk, 2024; Wang & Wang, 2024; Wu & Wang, 2025). For example, Li, Wang and Bonk (2024) found that students’ use of ChatGPT was largely motivated by their perception of its flexibility and personalization, which enabled engagement with the target language in ways aligned with their individual goals, knowledge levels, and interests. This perception of personalized learning emerges as a key cognitive driver of AI adoption.
Findings grounded in the technology acceptance model further support this perspective. Studies have identified perceived ease of use and perceived usefulness as strong predictors of learners’ adoption of generative AI for informal English learning (Dizon, Gold & Barnes, 2025; Kohnke, 2023; Liu & Ma, 2024; Liu, Darvin & Ma, 2024a). These cognitive evaluations of utility and accessibility play a central role in learners’ decisions to implement AI in their language development. Similarly, research informed by the theory of planned behavior highlights that learners’ attitudes toward AI functionalities, combined with perceived support from significant others, largely influence their use of generative AI outside formal classrooms (Wu & Dong, 2025). This underscores the importance of both technological perceptions and social validation in learners’ cognitive decision-making processes.
Moreover, studies predating the emergence of ChatGPT have likewise confirmed that learners’ awareness of specific technological affordances, especially interactivity, can foster more self-directed and engaging learning experiences. Applications offering interactive rather than merely mobile-based features have been shown to enhance overall L2 learning with AI (Dizon & Tang, 2020; Haristiani & Rifai, 2021), suggesting that cognitive recognition of interactivity as a learning affordance remains a consistent predictor of AI adoption across different technological generations.
5.2.2 Affective antecedents
Our analysis also showed that affective factors play a significant role in shaping learners’ adoption of AI for informal language learning, with motivation and positive emotional experiences emerging as the most consistent predictors across studies. Research has shown that learners who experience enjoyment, reduced anxiety, and increased confidence in using the target language are considerably more likely to initiate and sustain engagement with AI language tools outside formal educational settings (Haristiani, 2019; Liu, Darvin & Ma, 2024b; Tram, Nguyen & Tran, 2024; Wang & Li, 2024).
Motivation, in particular, stands out as a primary affective antecedent. Eight studies have underscored its predictive power for AI tool adoption in self-directed language learning (e.g. Lee & Cho, 2025; Liu, Darvin & Ma, 2024b). Importantly, promotion-focused motivation (e.g. learning an L2 for future workplace communication) has been found to be more influential than prevention-focused motives, such as learning an L2 to pass exams. Learners driven by promotion-focused goals are more likely to explore AI’s affordances for creative and productive language use beyond the classroom.
Beyond motivation, positive emotions such as enjoyment also significantly influence learners’ use of AI for informal language learning. Tram, Nguyen and Tran (2024) found that learners who anticipate or experience enjoyment during AI-mediated activities are more likely to incorporate these tools into their regular learning routines. Wang and Li (2024) further reported that while initial adoption may be influenced by various factors, long-term engagement is primarily sustained by positive emotional experiences during early interactions with the technology.
Confidence in learning also functions as a critical affective driver. Liu and Zhao (2025) demonstrated that learners’ confidence in their ability to effectively use digital tools strongly predicts their willingness to engage with AI for independent English learning. Higher confidence levels are associated with more frequent and sustained AI usage in self-directed contexts.
Additionally, communication anxiety in real-world settings often prompts learners to turn to AI tools as lower-stress alternatives for language practice. Haristiani (2019) observed that learners with high anxiety gravitate toward AI-mediated environments, which provide judgment-free, psychologically safe spaces for language development. These findings suggest that AI tools can serve an important buffering function for learners with communication apprehension, supporting the development of L2 communication skills in less intimidating settings (Chang & Sun, 2024; Cong-Lem, Soyoof & Tsering, 2025; Yang & Li, 2024).
5.2.3 Sociocontextual antecedents
Sociocontextual variables in this study refer to the contextual factors that situate individuals within their social environments and shape their language learning experiences and outcomes across diverse spaces. Our analysis identified two broad categories: sociodemographic characteristics and sociotechnological factors. The former includes variables such as age (Dizon, 2024), cultural background (Liu & Zhao, 2025), target language (Alm, 2024), and prior language proficiency (Van Horn, 2024). The latter comprises differentiated AI tool functionalities (Haristiani & Rifai, 2021), digital self-efficacy (Wang & Li, 2024), and digital literacy or competence (Liu & Zhao, 2025).
These interrelated factors significantly influence how learners approach, engage with, and benefit from AI-ILL opportunities. For instance, Dizon (2024) found that learners’ age groups (e.g., young adults vs. adolescents) affect their receptiveness to ChatGPT and their willingness to experiment with AI-enhanced learning beyond traditional classroom settings. Similarly, prior language proficiency influences how learners interact with AI tools because beginners may require more scaffolded, structured input, whereas advanced learners can benefit from more open-ended tasks (Van Horn, 2024).
Digital literacy and self-efficacy also play an important role. Learners with higher confidence in navigating digital environments are more likely to overcome technical challenges and make fuller use of AI’s capabilities (Wang & Li, Reference Wang and Li2024). Liu, Darvin & Ma (Reference Liu, Darvin and Ma2024b) further emphasize that learners seeking productive engagement with AI in informal digital English learning must negotiate access to resources (e.g., AI platforms), technological readiness (e.g., digital skills), and individual goals (e.g., classroom performance vs. communicative competence). These factors interact in complex and often nonlinear ways, giving rise to varied AI-IDLE practices across learners and contexts.
Taken together, the interplay of sociocontextual variables not only influences who participates in AI-ILL but also shapes how they participate. As AI technologies and learning environments continue to evolve, understanding these variables is critical for ensuring equitable access and meaningful participation in AI-mediated language learning.
5.3 The key reported outcomes of AI-ILL
The analysis of the reviewed literature highlighted that AI-ILL yields diverse outcomes that can be grouped into three main domains: linguistic, affective, and cognitive. These outcomes demonstrate how AI technologies not only enhance language proficiency but also influence learners’ affective engagement with language learning and foster various cognitive processes essential for L2 development.
5.3.1 Linguistic outcomes
The integration of AI tools into informal language learning environments has shown substantial positive effects on learners’ linguistic gains across key components of L2 proficiency. The literature consistently reports improvements in both overall language performance and specific skills when learners engage with AI technologies outside traditional classrooms. Notable gains include enhanced reading comprehension (Pan, Lai & Guo, Reference Pan, Lai and Guo2025), writing proficiency (Aladini, Ismail, Khasawneh & Shakibaei, Reference Aladini, Ismail, Khasawneh and Shakibaei2025), and speaking ability (Wang, Zou, Du & Wang, Reference Wang, Zou, Du and Wang2024; Zou, Liviero, Ma, Zhang, Du & Xing, Reference Zou, Liviero, Ma, Zhang, Du and Xing2024), along with improved performance on standardized assessments such as the TOEIC (Hsu, Chen & Yu, Reference Hsu, Chen and Yu2023). Additionally, AI-mediated learning supports the development of foundational linguistic knowledge, including grammatical understanding (Haristiani & Rifai, Reference Haristiani and Rifai2021) and vocabulary acquisition (Huang & Cassany, Reference Huang and Cassany2025; Wang, Zhou, Li, Cheung & Tian, Reference Wang, Zhou, Li, Cheung and Tian2025).
Among these outcomes, improved speaking ability emerges as the most salient benefit. This is likely because AI addresses a long-standing challenge in traditional language education by offering learners authentic-seeming oral practice opportunities. Our analysis shows that the technological affordances of AI systems foster measurable improvements in pronunciation accuracy across languages such as Spanish, Mandarin Chinese, English, and Japanese. This cross-linguistic effectiveness is attributed to AI’s capacity to deliver individualized feedback on subtle phonetic features often overlooked in group instruction.
Moreover, AI tools help reframe speaking practice by alleviating performance anxiety commonly experienced in classroom settings. Learners benefit from the ability to rehearse and receive feedback in a judgment-free environment (Celik, Yildiz & Kara, Reference Celik, Yildiz and Kara2025; Haristiani, Reference Haristiani2019). As Jeon and Lee (Reference Jeon and Lee2024) note, the multimodal nature of AI that integrates visual, auditory, and interactive elements supports the comprehensive development of speaking skills, including pronunciation, grammatical accuracy, idea organization, and presentation fluency. Importantly, the gamified features of many AI applications foster sustained learner engagement, transforming speaking practice, traditionally one of the most challenging aspects of L2 learning, into an enjoyable, self-motivated activity pursued outside formal educational settings (Liu et al., Reference Liu, Darvin and Ma2024a; Zou et al., Reference Zou, Liviero, Ma, Zhang, Du and Xing2024).
5.3.2 Affective outcomes
Analysis of the literature also revealed a consistent pattern of affective outcomes that transcend purely linguistic gains when learners interact with AI technologies outside traditional classroom settings. More than half of the studies in the review pool reported that AI-ILL experiences can enhance learners’ sense of enjoyment (Aladini et al., Reference Aladini, Ismail, Khasawneh and Shakibaei2025), boost confidence in speaking the target language (Van Horn, Reference Van Horn2024), reduce communication anxiety (Tai & Chen, Reference Tai and Chen2024), increase willingness to communicate (Wang et al., Reference Wang, Zou, Du and Wang2024), and foster positive attitudes toward long-term language learning success (Har & Ma, Reference Har, Ma, Hong and Ma2023). These affective outcomes appear particularly significant because they address common psychological barriers that often impede L2 learning and teaching in formal educational settings.
A particularly noteworthy dimension of affective outcomes involves the bidirectional relationship between motivation and AI-ILL. While initial motivation levels influence learners’ engagement with AI tools, the evidence suggests that these technologies simultaneously enhance and reshape motivational constructs (Jiang, Yang & Shen, Reference Jiang, Yang and Shen2025; Zhang, Zou & Cheng, Reference Zhang, Zou and Cheng2024). This reciprocal relationship manifests particularly in the development of the ideal L2 self (i.e., the learner’s vision of themselves as a competent language user), as AI interactions provide immediate success experiences that help crystallize this future identity. Furthermore, the personalized, low-stakes nature of AI interactions appears to strengthen intrinsic motivation by satisfying core psychological needs for autonomy, competence, and relatedness (Wu & Wang, Reference Wu and Wang2025). In addition, within the framework of Dörnyei’s (Reference Dörnyei, Dörnyei and Ushioda2009) L2 motivational self-system, research has reported that AI-mediated informal learning significantly enhances the L2 learning experience component by creating positive situational motives through gamification, personalization, and achievement recognition (Liu, Darvin & Ma, Reference Liu, Darvin and Ma2024b). This impact on the affective dimension indicates that AI tools are not merely instructional aids but potentially powerful catalysts for sustainable motivational development in language learners, which can create emotional conditions conducive to long-term language learning success.
5.3.3 Cognitive outcomes
Cognitive outcomes in L2 learning refer to the mental processes and strategies that learners develop and employ when acquiring an L2 (Guo & Lee, Reference Guo and Lee2023). The findings demonstrated that leveraging AI tools in informal language learning contexts holds the potential to foster several significant cognitive benefits among learners, including higher awareness of L2 writing strategies (Yuasa & Takeuchi, Reference Yuasa and Takeuchi2024), enhanced cognitive engagement with the L2 learning process (Tram, Nguyen & Tran, Reference Tram, Nguyen and Tran2024), improved self-regulation strategies (Aladini et al., Reference Aladini, Ismail, Khasawneh and Shakibaei2025), and increased metacognitive awareness in autonomous learning (Haristiani, Dewanty & Rifai, Reference Haristiani, Dewanty and Rifai2022; Van Horn, Reference Van Horn2024). These cognitive outcomes emerge from AI-ILL primarily because the technological affordances of AI tools provide immediate, personalized feedback that prompts learners to reflect on their language use and learning approaches (Cong-Lem, Soyoof & Tsering, Reference Cong-Lem, Soyoof and Tsering2025; Yang & Li, Reference Yang and Li2024). In addition, the multimodal nature of AI interactions creates seemingly authentic language processing contexts that demand higher-order thinking skills (Jeon & Lee, Reference Jeon and Lee2024), while the scaffolded learning environments these technologies offer gradually transfer responsibility to learners, which requires them to develop self-monitoring and self-evaluation strategies essential for effective L2 development.
6. Discussion and implications
6.1 An overview of the field of AI-ILL
The first RQ concerns understanding the current landscape of research on AI-ILL. Our findings demonstrated that this is an emerging research area characterized by exponential growth following ChatGPT’s release in late 2022, with 55 of 65 studies published between 2023 and early 2025. Geographically, research is heavily concentrated in East Asia, particularly China, with notable underrepresentation from Africa, South America, and parts of Europe. Methodologically, the field is dominated by cross-sectional studies and balanced between qualitative and mixed-methods approaches, while primarily focusing on undergraduate students learning English through custom-developed chatbots and generative AI tools. Most critically, we found that 55.4% of studies lack explicit theoretical frameworks, pointing to a field that still needs to solidify its conceptual grounding.
These findings help substantiate the claim that AI-ILL represents a specialized subfield within intelligent CALL, building on the technological evolution from traditional CALL to mobile-assisted language learning and now to AI-enhanced approaches (Duman, Orhon & Gedik, Reference Duman, Orhon and Gedik2015). As Godwin-Jones (Reference Godwin-Jones2024) notes, this shift reflects a natural progression from programmed instruction toward more naturalistic, personalized learning that simulates authentic language interactions. While intelligent CALL has long incorporated AI features like natural language processing and intelligent tutoring systems (Dizon & Tang, Reference Dizon and Tang2020), our review indicates that rapidly advancing AI technologies are reshaping how learners autonomously engage with language learning tools outside formal settings. This emerging subfield is characterized by a focus on learner independence and AI’s capacity to offer immediate, context-sensitive support tailored to individual learning paths.
6.2 Antecedents and outcomes of AI-ILL
RQ2 and RQ3 examine the antecedents and outcomes of AI-ILL reported in the reviewed studies. Drawing from the synthesis of empirical findings across the reviewed literature, we developed a conceptual model (Figure 8) that illustrates the dynamic relationships between AI-ILL antecedents, practices, and outcomes within the PLLT framework. The model recognizes key factors that influence learners’ adoption of AI tools for language learning outside formal educational settings, categorized into cognitive, affective, and sociocontextual dimensions. Our findings further indicate that the cognitive antecedents include learners’ perceptions of AI’s learning affordances, particularly personalization features and perceived ease of use, while the affective factors consist of motivation, enjoyment, reduced anxiety, and increased confidence in the target language. The sociocontextual variables span both sociodemographic characteristics (e.g., age, cultural background, target language, and proficiency level) and sociotechnological factors (e.g., different or limited functionalities of AI, digital self-efficacy, and digital literacy or competence). These antecedents collectively shape AI-ILL practices, which are characterized by informal L2 learning behaviors that are proactive and agentic in nature.

Figure 8. A conceptual model of AI-mediated informal language learning (AI-ILL) based on proactive language learning theory (PLLT).
The cyclical nature of our model, as depicted in Figure 8, reflects how these practices generate outcomes that may in turn interact with future antecedents. Regarding outcomes, our findings cluster into three domains: linguistic gains (improved speaking, reading, writing, grammar, vocabulary), affective benefits (increased enjoyment, reduced anxiety, heightened motivation, positive attitudes), and cognitive improvements (enhanced metacognitive awareness, self-regulation, deeper cognitive engagement). Following PLLT (Papi & Hiver, Reference Papi and Hiver2025a), we propose that this tripartite outcome structure mirrors the antecedent dimensions, suggesting a recursive relationship in which outcomes feed back into the learning conditions and factors that prepare learners for future AI-ILL practices.
It is important to note that this conceptual model remains provisional, given the emergent and rapidly evolving nature of AI-ILL research captured in our scoping review, but the findings that underpin this model both parallel and extend previous research on informal language learning in digital but non-AI environments. For example, like previous studies on mobile-assisted language learning and IDLE (Guo & Lee, Reference Guo and Lee2023; Liu, Zhang & Zhang, Reference Liu, Zhang and Zhang2024; Rezai, Goodarzi & Liu, Reference Rezai, Goodarzi and Liu2025; Soyoof et al., Reference Soyoof, Reynolds, Vazquez-Calvo and McLay2023), our results emphasize the negotiation of sociotechnical resources as essential for accessing and participating in AI-mediated learning practices (Liu & Darvin, Reference Liu and Darvin2024). Additionally, the research reveals that AI-ILL appears to uniquely amplify affective outcomes by providing psychologically safer, conversational spaces for language practice and facilitating risk-taking and confidence-building beyond what earlier tools or peer communities afforded. The central positioning of AI-ILL practices in our model emphasizes this distinctive characteristic as learners often engage in more agentic, AI-mediated behaviors rather than passive AI-assisted activities.
6.3 Suggestions for future research
By taking stock of the existing literature on AI-ILL, we propose six critical directions for future research to advance the field. First, the field urgently needs longitudinal studies to better understand the long-term effects of AI-mediated informal learning on linguistic and affective outcomes, particularly to investigate whether the observed improvements in speaking proficiency and reduced anxiety persist over extended periods of engagement. Second, researchers should diversify participant profiles beyond undergraduate students, especially to examine how sociocontextual factors like age and digital literacy influence AI adoption patterns across different demographic groups. Priority may be given to elementary school children, working adults from lower socioeconomic backgrounds, and elderly learners to understand developmental and access-related barriers. Third, we emphasize a pressing need to establish stronger theoretical foundations that can better underpin the field and explain the complex interplay between cognitive antecedents like perceived usefulness and affective outcomes such as motivation development. Fourth, we advocate for more geographically diverse research beyond East Asia to investigate how cultural, educational, and technological contexts influence AI-mediated language learning practices and the development of self-regulation strategies. Fifth, to address the current overwhelming emphasis on English in the evolving field of informal language learning (see Liu, Zhao & Yang, Reference Liu, Zhao and Yang2024), research must expand beyond English to explore how AI tools support the acquisition of less commonly taught languages, such as those with non-Latin scripts (e.g., Arabic and Hindi).
Finally, methodological approaches should incorporate more robust, mixed-methods designs and standardized assessment protocols to strengthen the empirical foundation of the field, particularly for measuring the cognitive outcomes and self-directed learning capabilities surrounding AI. For example, future sequential explanatory mixed-methods studies may consider combining eye tracking during L2 reading with longitudinal surveys of motivational shifts, while concurrent designs could integrate learning analytics with qualitative interviews about metacognitive strategy use. Such designs are especially relevant as innovative AI tools (e.g., multimodal AI platforms) constantly emerge and create new opportunities for L2 learning.
6.4 Implications for pedagogical practices
This paper offers important implications for bridging informal language learning and formal classroom instruction in AI-mediated environments. Language educators should recognize the affective benefits of AI-ILL, particularly its potential to reduce anxiety and boost motivation. Structured opportunities should be created to guide learners in using AI tools for low-stakes practice outside the classroom, especially for speaking skills, which often provoke high anxiety.
Given the strong predictive power of cognitive factors such as perceived usefulness (Dizon, Gold & Barnes, Reference Dizon, Gold and Barnes2025; Liu & Ma, Reference Liu and Ma2024; Tram, Nguyen & Tran, Reference Tram, Nguyen and Tran2024), instructors should explicitly demonstrate how AI tools can support diverse language learning goals. Providing concrete examples can help learners identify affordances aligned with their individual needs and trajectories.
In addressing the sociocontextual dimensions of AI-ILL, educators should incorporate targeted digital literacy training to reduce equity gaps caused by differing levels of technological self-efficacy. Furthermore, the metacognitive benefits highlighted in our findings suggest the need for reflective activities that go beyond tool use. Teachers should encourage learners to articulate how they engage with AI, identify effective strategies, and critically evaluate the quality and reliability of AI-generated language input.
Language programs should also revisit assessment practices to better reflect the realities of independent, AI-mediated learning. Process-oriented tasks and critical language analysis may prove more meaningful than product-based evaluations in an AI-enhanced learning ecosystem (Godwin-Jones, Reference Godwin-Jones2024).
Ultimately, rather than viewing AI-mediated informal learning as a threat to traditional pedagogy, educators and curriculum designers should adopt a complementary perspective, one that positions formal instruction as preparation for effective, self-directed learning with AI. This approach can support the development of learners’ critical thinking, cultural awareness, and creative language use in ways that extend beyond what AI alone can offer.
7. Conclusion
This scoping review has synthesized findings from empirical studies on AI-ILL, providing a comprehensive mapping of this emerging subfield and identifying critical avenues for future research. It contributes to the field by theoretically framing AI-ILL as a type of proactive language learning behavior and situating AI-ILL as a distinctive subfield within the broader intelligent CALL tradition. The review recognizes both the continuity of AI-ILL with earlier chatbot-based IDLE research and the distinctive features brought by recent AI advancements that warrant dedicated theoretical and methodological attention.
Several limitations constrain this review, notably the reliance on English-language databases and the mid-April 2025 temporal cut-off for the review pool. Additionally, this review primarily focuses on reported antecedents and outcomes aligned with the stated research questions and thus does not fully capture potential drawbacks of AI-ILL (e.g., overreliance and misinformation), which merit deeper exploration in future research. Furthermore, future research could adopt integrative approaches to examine the dynamic interplay between cognitive, affective, and sociocontextual antecedents and their influence on varied learning outcomes in AI-ILL. We also acknowledge the potential limitations of our coding framework, such as the granularity of participant background, the broad taxonomies of research designs, and the variability in theoretical applications. These constraints reflect both the heterogeneity of reporting practices in the primary literature and the methodological boundaries of a scoping review, which foregrounds the need for future studies to adopt more standardized and detailed reporting protocols.
Supplementary materials
To view supplementary materials referred to in this article, please visit https://doi.org/10.1017/S0958344025100359.
Data availability statement
Data available on request from the authors.
Authorship contribution statement
Guangxiang Leon Liu: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing. Xian Zhao: Conceptualization, Data curation, Formal analysis, Writing – original draft, Writing – review & editing.
Funding disclosure statement
This research did not receive any specific funding.
Competing interests statement
The authors declare no competing interests.
Ethical statement
Ethical approval was not required.
GenAI use disclosure statement
The authors declare no use of generative AI.
About the authors
Guangxiang Leon Liu (PhD, The Chinese University of Hong Kong) is an associate professor (research-track) at Southeast University, China. His research interests include AI-mediated informal language learning and digital literacies. He has published in journals such as Computers in Human Behavior, System, TESOL Quarterly, ReCALL, and Computer Assisted Language Learning.
Xian Zhao (PhD, The University of Auckland) is an assistant professor at Nanjing University, China. Her current research focuses on positive psychology, Chinese as a second language learning and teaching, and education technology. Her recent work appears in Studies in Second Language Acquisition, System, Applied Linguistic Review, and Language Teaching Research.