Introduction
The enjoyment of a task is well recognized as a predictor of sustainable motivation (Ryan & Deci, 2017). Over the past half century, task enjoyment, also known as intrinsic motivation (Deci, 1972), has moved from the realm of psychological abstraction to popular understanding (Kohn, 1993). Popular YouTube talks with millions of views make specific reference to this unobservable mental phenomenon (Pink, n.d.). Popular books and podcasts make direct reference to the idea of an internally controlled desire to achieve the task for its own sake (Grant, 2023). Empirically as well, the construct has strong situational and predictive validity across domains (Howard, Bureau, Guay, Chong, & Ryan, 2021; Bureau, Howard, Chong, & Guay, 2022; Ryan, 2023). As a central coordinating variable in self-determination theory (SDT; Ryan & Deci, 2017; Ryan, 2023) and across the field of psychology and educational psychology more generally (e.g., Eccles & Wigfield, 2020), intrinsic motivation is a well-accepted theoretical construct with an established record of predictable effects. Intrinsic motivation’s reach and effect extend to language education. Since the 1990s, empirical work has demonstrated that task enjoyment is a strong and sustainable source of motivation for learning a language (Noels, Pelletier, Clément, & Vallerand, 2000).
From the early 2000s to mid-2010s, however, there was a diminution in research efforts directly employing this variable by name (Boo, Dörnyei, & Ryan, 2015). The growing paradigm of the L2 Motivational Self System (L2MSS; Dörnyei, 2005, 2009) set the direction for the field of language learning motivation for two decades. Under this paradigm, however, new constructs specific to languages and with only a tenuous connection to the field of psychology began to proliferate (Dörnyei, Henry, & Muir, 2016). Consciously or unconsciously, these constructs often mirrored existing constructs in the field of psychology generally, expanding the trend of rebranding extant theoretical ideas as “positive psychology” from other areas (Kristjánsson, 2012). One of these constructs was the L2 learning experience (Dörnyei, 2019), a construct highly similar to intrinsic motivation. Since its inception, the L2 learning experience has been a broadly defined construct with an ambitious scope apparently spanning nearly all aspects of the classroom environment. According to Dörnyei (2009), this construct “concerns situated, ‘executive’ motives related to the immediate learning environment and experience (e.g. the impact of the teacher, the curriculum, the peer group, the experience of success)” (p. 29). However, the items in the standard scale used to measure this construct show strong overlaps in wording and conceptualization with intrinsic motivation (see Oga-Baldwin, 2024).
Given the jingle–jangle proliferation of terminology across the field of motivation (Henry & Liu, 2024; Skinner, 2023), there is now a need to empirically consolidate constructs and terminology to facilitate communication between fields.
Divergence or replication? A critical look at the constructs
Although research using SDT is conducted in a wide variety of fields, from business to sports to parenting (Ryan & Deci, 2017), the closest comparison to language learning is likely that of education. Within the literature on educational psychology, intrinsic motivation has shown a strong predictive effect on achievement and learning outcomes. The meta-analysis by Howard et al. (2021) underscored the significant influence of intrinsic motivation on academic achievement, persistence, well-being, goal orientations, and self-evaluation. Specifically, intrinsic motivation was strongly correlated with academic achievement (ρ = .41), indicating that students driven by inherent interest tend to perform better academically. In terms of persistence, intrinsic motivation showed a robust relationship with outcomes such as effort (ρ = .54) and engagement (ρ = .62), highlighting its crucial role in sustaining student commitment to learning activities. Regarding well-being, intrinsic motivation was positively associated with indicators such as positive affect (ρ = .52) and vitality (ρ = .61), suggesting that students motivated by intrinsic factors experience enhanced psychological health. Additionally, intrinsic motivation was related to goal orientations, with correlations found for mastery-approach goals (ρ = .64) and performance-approach goals (ρ = .25), which underscores its impact on students’ achievement strategies. In the domain of self-evaluation, intrinsic motivation correlated positively with self-efficacy (ρ = .41) and self-esteem (ρ = .34), reinforcing its importance in fostering positive self-perceptions. There is thus strong empirical evidence that the feeling of enjoyment for a task is a strong predictor of continuation and subsequent achievement.
SDT overall has received strong empirical support across numerous geographical and cultural contexts (Ryan & Deci, 2017; Ryan, 2023). While a complete coverage of the theoretical proposals, corollaries, and interrelationships between constructs is beyond the scope of this short report (see Al-Hoorie, Oga-Baldwin, Hiver, & Vitta, 2022, for a more complete review), the effects and relationships of the various elements of the theory remain consistent in educational contexts as varied as South Korea (Jang, Reeve, Ryan, & Kim, 2009), Saudi Arabia (Alamer, 2022), Peru (Benita, Matos, & Cerna, 2022), and China (Shi, Levesque, & Maeda, 2018), as well as the more standard Western, educated, industrialized, rich, and democratic societies like Canada (Noels et al., 2000), the US (Jang, Reeve, & Deci, 2010), and Belgium (Aelterman et al., 2019). Although there are emic realities and different within-culture localizations (Lynch, 2023), SDT indicates a broad, generalizable, and recognizable reality across distinct regions and backgrounds.
Contrasting with the universal claims of SDT, Dörnyei’s (2005) L2MSS is rooted in the foundational assumption that learning a foreign language differs from other pursuits (although see Al-Hoorie & Hiver, 2020), necessitating the addition of an “L2” twist to each construct. Dörnyei’s L2MSS is a theoretical framework designed to explain individual differences in L2 learning motivation. It comprises three main components: the ideal L2 self, the ought-to L2 self, and the L2 learning experience. The ideal L2 self reflects the learner’s aspirations and desired future self as a proficient language user, serving as a motivational force (Dörnyei, 2009). The ought-to L2-self component represents the expectations imposed by significant others, such as parents and teachers. While it may motivate some learners, it often lacks the energizing effect necessary to drive actual behaviors (Dörnyei & Chan, 2013). Finally, the L2 learning experience theoretically, and imprecisely, focuses on learners’ immediate experiences and attitudes toward their language learning environment, including interactions with teachers, peers, and the curriculum.
In recent years, cracks have begun to show in the theoretical, practical, and empirical edifices of the L2MSS (Al-Hoorie, 2018; Hiver & Al-Hoorie, 2020; Al-Hoorie, Hiver, & In’nami, 2024; Henry & Liu, 2023, 2024) and the broader psychology of language learning (Sudina, 2021, 2023). In a meta-analysis similar to those conducted on SDT (Bureau et al., 2022; Howard et al., 2021), Al-Hoorie (2018) found that the L2 learning experience had a significant correlation with intended effort (r = .41) but a weaker link to objective measures of achievement (r = .17). At the same time, of all the variables in the L2MSS studied, the L2 learning experience had the strongest predictive relationship with any outcome measures. What is concerning about the L2 learning experience, being the best predictor in the L2MSS, is that it is severely undertheorized. When it was first proposed, Dörnyei simply described it as the situated, executive motive (Dörnyei, 2009, p. 29) and as the causal dimension (Dörnyei, 2005, p. 106) of the model without further elaboration. A decade later, Dörnyei (2019) described this construct as the “Cinderella” of the L2MSS, acknowledging how little progress had been made in fleshing out its theoretical underpinnings. Surprisingly, Dörnyei additionally acknowledged that this construct is “hardly more than a broad, place-holding umbrella term that would need to be fine-tuned at one point” (Dörnyei, 2019, p. 22). This suggests that a thorough investigation of this construct and its validity is long overdue.
This investigation becomes especially important in light of recent failures related to the measurement and psychometric validity of its constructs in efforts to replicate the theory (Hiver & Al-Hoorie, 2020; Al-Hoorie et al., 2024). In a recent study, for example, Al-Hoorie and colleagues (2025) found that in terms of its measurement, the ideal L2 self is functionally another name for ability beliefs. That is, when participants respond to an ideal-L2-self item (e.g., “I can imagine myself speaking English fluently”), they resort to their beliefs in their ability to achieve this, rather than a discrepancy between actual–ideal self-guides. These studies have been conducted in a variety of settings; their failure to replicate has prompted what has alternately been called a validation crisis (Al-Hoorie et al., 2024) and a research opportunity (Oga-Baldwin, 2024) in the psychology of language learning. Unlike those of SDT, the empirical and theoretical foundations of the L2MSS are anything but set (Henry & Liu, 2024) and are currently in flux.
These results are broadly in line with the status of measurement across the psychology of language learning, which has been shown to be in need of renovation, or at least a thorough housecleaning. Sudina’s (2021, 2023) comprehensive methodological syntheses of measurement of anxiety, motivation, and willingness to communicate (WTC) scales found that as few as one in four regularly used survey scales reported factor analysis results, fewer than 6% demonstrated convergent validity, fewer than 6% provided discriminant validity evidence, and fewer than 7% tested measurement invariance. Methodologically, this would indicate the need to strengthen the quality of measurement practices while improving the definitions of what it means to be anxious, willing to communicate, or motivated. Thus, although the L2MSS is notable due to its breadth and popularity (Boo et al., 2015), it is also not alone in showing definitional fuzziness.
A major potential issue existing between SDT and the L2MSS is the absolute similarity of items used in well-established surveys of both constructs (Oga-Baldwin, 2024). More formally, this overlap is known as a jangle fallacy, where two different terms are used to represent the same or highly similar set of items. While these similarities are evident (see Table 1), it is important to recognize that they may not be recognized throughout the field. Indeed, researchers of disparate paradigms may filter these ideas through their own theoretical lenses.
Table 1. Item similarities between L2 learning experience and intrinsic motivation (originally presented in Oga-Baldwin, 2024)

† While English is used in the originals, any language could potentially be inserted here.
One central aspect of intrinsic motivation is its strong connection to affect, particularly positive emotions such as interest, excitement, and enjoyment, which are integral to the definition of the construct. Rather than a simple state of feeling motivated, intrinsic motivation refers to a specific quality of motivation punctuated by the joy and satisfaction of the task itself. While some researchers have called for a clearer differentiation between emotion and motivation within the theoretical framework, including those aligned with SDT (e.g., Alamer, 2024), there is a risk that such proposals may overlook key elements of the theory. For example, efforts to create new scales measuring the continuum of SDT (Alamer, 2022) suggest an awareness of the interplay between affect and motivation. However, subsequent arguments to separate these constructs (Alamer, 2024) seem to downplay the significance of the emotional content that is inherently embedded in these scales. SDT, as articulated by foundational works (e.g., Deci & Ryan, 1985; Ryan & Deci, 2000), incorporates both cognitive and affective components, underscoring the importance of positive affect in driving motivation. This highlights the need for careful alignment between theoretical claims and operationalizing constructs in empirical research.
To test understanding of these constructs throughout the field, a look at how currently active researchers recognize the wordings may offer insight. Because these researchers make front-line use of the surveys, knowing which constructs they take the item wordings to represent can help to identify the source of jangle and may help resolve aspects of theoretical divergence.
The present study
Given the broad and imprecise definition of the construct of the L2 learning experience, the present study aimed to examine the content validity of the items associated with its scale. Specifically, we sought to determine the extent to which these items closely reflect this construct and the degree to which they may overlap with intrinsic motivation. Verifying content and construct validity, the extent to which a measurement instrument accurately reflects the scope and conceptual definition of the construct it is intended to measure, is a critical step in scale development and validation (DeVellis & Thorpe, 2021; Kline, 2019). According to Lynn (1986), a minimum of three experts is necessary to establish content validity, with a recommended range of five to ten experts for more robust evaluation (Almanasreh, Moles, & Chen, 2019). Expert panels have similarly been used for content validation in social work and psychology (Rubio, Berg-Weger, Tebb, Lee, & Rauch, 2003; Vogt, King, & King, 2004), with some large-scale content validation studies employing as many as 37 independent experts (Ahmadi et al., 2023). These guidelines help ensure a balance between diverse expert opinions and practical feasibility in recruiting qualified participants.
In this study, we presented participants with a set of items drawn from various scales and asked them to identify the construct(s) each item most closely aligned with. This approach allowed us to assess whether the wording of the items was clear and representative of their intended constructs. Additionally, we note that the L2 learning experience is sometimes referred to as “attitudes toward language learning” in the literature (see Hiver & Al-Hoorie, 2020). This terminological variation further highlights potential ambiguities in conceptualizing this construct. For consistency and clarity, we adopt the term the L2 learning experience throughout this paper.
Examination of the alignment of these items with their respective constructs contributes to the ongoing refinement of L2 motivation theories and their measurement tools. Moreover, it sheds light on the potential conceptual overlap between the L2 learning experience and intrinsic motivation, which has implications for both theoretical understanding and practical applications in the field of second language acquisition.
Method
After obtaining ethical approval, and to identify participants with the requisite expertise for this study, we compiled the names of authors who had published in an applied linguistics journal specializing in language learning psychology between 2021 and 2024. These individuals were invited via email by the second author to participate in the study (see Appendix 1 for the invitation email). Thirty emails were sent. The invitation clarified that the study aimed to examine the content validity of a set of questionnaire items by having a panel of experts review them. Invitees were also asked to avoid consulting external sources during the task, as reliance on such sources would indicate ambiguity in the item wording. If an item was unclear or associated with multiple constructs, participants were encouraged to report this directly. Example responses were provided to illustrate the process and clarify expectations. The landing page instructions and survey results are presented in Appendices 1 and 2, respectively.
Participants
Thirteen researchers initially responded to the online survey, but only 12 completed it (three identified as female, and two did not report their gender). All responses were anonymous. Most participants held a PhD degree, with only two reporting a master’s degree as their highest qualification. They represented a wide range of ages: two were 25–34 years old, five were 35–44, three were 45–54, one was 55–64, and one was 65–74. Participants also reported varying years since completing their highest degrees: five had completed their degrees 1–5 years ago, three had done so 5–10 years ago, three had 10–15 years of experience, and two had more than 15 years. In terms of research output, the number of publications in the past 2 years ranged from 1–2 (two participants) to 3–4 (four participants), 5–6 (three participants), 7–8 (two participants), and more than 10 (one participant). No further background data were solicited for analytic purposes.
One participant completed responses to only two items (neither of which was a target item; see next section) before discontinuing the survey; consequently, their data were excluded from the analysis. Another participant acknowledged consulting online resources during the task. However, as their response pattern aligned with that of the other participants, their data were retained. For the purposes of transparency and completeness, we have included even the incomplete answers in Appendix 2.
Instruments
The participants were presented with 10 items, including two target items and eight distractor items. The distractor items were either sourced from unrelated existing scales or created specifically for this study. The two target items were “I find learning English really interesting” and “I really enjoy learning English,” which, as noted earlier, appear in both the L2 Learning Experience scale and the Intrinsic Motivation scale. Table 2 provides the full list of items presented to the participants, shown in the order they appeared in the survey.
Table 2. Items used in this study

Note: † = target item; AMTB = Attitudes/Motivation Test Battery; IMI = Intrinsic Motivation Inventory; SRQ-A = Academic Self-Regulation Questionnaire.
Data Analysis
As this was a validation study focusing on content validity through expert review, the primary analytical approach focused on descriptive statistics and analysis of open-ended responses. Following recommendations for content validity assessments through open-ended expert review (Lynn, 1986; Almanasreh et al., 2019), we looked for consistency of responses, particularly frequency counts. Given the relatively small sample of experts answering open-ended questions, analysis beyond percentage statistics and raw frequency counts was deemed inappropriate. Similar methods have been used in content validity studies (Vogt et al., 2004). Had we used a larger sample with a finite set of categories (cf. Ahmadi et al., 2023), further statistical testing might have been possible, although large samples are not generally the target of expert panel studies (Vogt et al., 2004).
While statistical tests of agreement such as Cohen’s Kappa (Cohen, 1960) or Fleiss’ Kappa (Fleiss, 1971) are often used in validation studies, the diversity and open-ended nature of expert responses in this study made such analyses inappropriate. Recoding the data might lend a veneer of truth through statistical certainty while sacrificing the diversity of responses and removing the more intuitive quality of evidence imparted by frequency. An open response format better reflects the reality of theoretical understanding in the field, but often precludes the use of traditional agreement statistics, which require fixed nominal categories (Ostrov & Hart, 2013). Unlike previous studies where raters chose from predetermined categories, our experts could freely choose construct identifications, making measures of agreement as to the categories at best unwieldy and at worst practically unfeasible. In light of the goals of the study (i.e., testing expert recognition of the L2 learning experience items vs. other constructs), a basic majority or plurality was set as the threshold for agreement.
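As an illustration of the frequency-based approach described above, the following minimal sketch tallies free-text construct labels for a single item and reports whether the most frequent label reaches a majority or only a plurality. The function name, the case-insensitive matching of labels, and the example responses are assumptions made purely for illustration; they are not the study's data or its actual analysis procedure.

```python
from collections import Counter


def classify_consensus(labels):
    """Tally open-ended labels for one item and describe the level of agreement.

    Hypothetical helper: labels are matched case-insensitively after trimming
    whitespace, and blank answers are ignored (an illustrative assumption).
    """
    counts = Counter(label.strip().lower() for label in labels if label.strip())
    total = sum(counts.values())
    if total == 0:
        return None

    ranked = counts.most_common()
    top_label, top_n = ranked[0]
    # A tie for first place means no single label leads.
    tied = len(ranked) > 1 and ranked[1][1] == top_n

    if top_n * 2 > total:
        level = "majority"    # more than half of all answers
    elif not tied:
        level = "plurality"   # most frequent label, but not more than half
    else:
        level = "none"

    return {
        "label": top_label,
        "n": top_n,
        "share": 100 * top_n / total,  # percentage of non-blank answers
        "level": level,
    }


# Invented example responses for one item (not taken from the survey).
example = [
    "Intrinsic motivation", "intrinsic motivation", "Motivation",
    "enjoyment", "intrinsic motivation ", "motivation", "Interest",
]
result = classify_consensus(example)
print(result["label"], result["n"], result["level"])
```

A sketch of this kind keeps each distinct label visible rather than forcing answers into fixed categories, which is the property that agreement statistics such as kappa would require giving up.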
Results
Table 3 presents the items and their responses with frequencies. Analysis of expert responses revealed varying levels of consensus regarding construct identification across the questionnaire items. However, when it came to the two target items, it was clear to most of the respondents that the relevant construct was intrinsic motivation and enjoyment, rather than the L2 learning experience. Specifically, the item “I find learning English really interesting” was identified by 35.7% of respondents (n = 5) as measuring intrinsic motivation, with additional respondents categorizing it under related constructs such as general motivation (n = 3) and curiosity (n = 1). Similarly, the item “I really enjoy learning English” was recognized as measuring intrinsic motivation by 28.6% of respondents (n = 4), with additional related classifications of enjoyment (n = 2) and general motivation (n = 3). Notably, not one respondent associated these items with the L2 learning experience.
Table 3. Items and their response classifications

Note: Bold indicates target items.
Although not a target of this study, anxiety and instrumental motivation items were also similarly recognizable. The anxiety-related item “It embarrasses me to volunteer answers in our English class” was identified by 50% of respondents (n = 7) as measuring foreign language anxiety or classroom anxiety. Instrumental motivation was consistently identified in career-related and travel-related items, with 28.6% (n = 4) of experts identifying these constructs in both relevant items.
Notably absent from all expert responses was any reference to the L2 learning experience construct from Dörnyei’s L2MSS. The original wordings (Items 3 and 7) were identified as intrinsic motivation or motivation generally. Despite some other items potentially relating to the “experiences” within the classroom, particularly those concerning classroom environment and teacher evaluation, no experts indicated this construct in their answers. This indicates that intrinsic motivation tends to be the first construct that comes to mind when experts read an L2 learning experience item. This is understandable considering that intrinsic motivation predates the L2 learning experience by several decades.
Discussion
Our expert-panel survey results reveal several insights regarding the theoretical understanding and thus the subsequent measurement of language learning motivation constructs. Most notably, none of the experts identified items traditionally associated with the L2 learning experience as belonging to this construct, despite these items forming a core component of the L2MSS, a model that has dominated the field for two decades (Dörnyei, 2005, 2009). Instead, experts consistently categorized these items as measuring intrinsic motivation or related constructs such as enjoyment, suggesting a natural recognition of the theoretical overlap identified by Oga-Baldwin (2024). This suggests that, despite the purported theoretical differences between the two constructs, the items used in the standard L2 learning experience scale may have been inadvertently derived from the intrinsic motivation scale, which, as explained above, predates the L2 learning experience construct by several decades.
Other constructs frequently used with the L2MSS were identified. At least three of the participating experts correctly indicated that the item “My parents believe that I must study English to be an educated person” represented the L2MSS construct ought-to L2 self, but none noted the similar construct from SDT of introjected regulation. Thus, despite their familiarity with the L2MSS, none of these researchers indicated the potential L2 learning experience items as belonging to that construct. We interpret this as a further indication that interest and enjoyment are not associated with the L2 learning experience, even in the minds of researchers familiar with the L2MSS.
Our results also contribute to the growing evidence of the shaky foundations of the L2MSS (McClelland & Larson-Hall, 2025). The ideal L2 self, the crown jewel of the model, has failed to demonstrate discriminant validity from confidence in one’s ability (Al-Hoorie et al., 2024; Al-Hoorie et al., in press). This, in turn, exacerbates the current validation crisis (Al-Hoorie et al., 2024) in language learning motivation research. The experts’ consistent identification of interest and enjoyment as indicating intrinsic motivation, rather than the L2 learning experience, supports other findings demonstrating the empirical inconsistency of the theoretical foundations of the L2MSS (Al-Hoorie, 2018; Hiver & Al-Hoorie, 2020). The results suggest that what has been termed the “Cinderella” of the L2MSS (Dörnyei, 2019) may in fact be a rebranding of the well-established construct of intrinsic motivation from SDT (Ryan & Deci, 2017); in essence, a jangle fallacy (Henry & Liu, 2024). Any positive results found by L2MSS research may thus be inadvertently replicating results from other theories under new terminology that, unfortunately, does not communicate across disciplines (King & Fryer, 2024). In short, the L2 learning experience as it is currently operationalized likely does not make a unique contribution to motivation literature. This possibility was pointed out years ago:
theorists need to consider in what respects this new [L2MSS] formulation is more than self-determination theory cast in self-terminology. This is a crucial consideration since it is desirable to avoid a situation where different researchers within one field deal with more or less the same phenomena but independently due to different terminology. (Al-Hoorie, 2018, p. 738)
It is worth noting that while intrinsic motivation is often associated with SDT (Ryan & Deci, 2000; Brophy, 2010), it is not exclusive to SDT. The importance of interest and enjoyment in learning is well established (Fryer & Ainley, 2019), and other major theories include components that call intrinsic motivation by recognizable names (cf. Eccles & Wigfield, 2020; Pekrun, 2006). This set of terminology is recognizable across theories, and researchers can recognize the overlapping similarity in terminology of constructs such as intrinsic value, task enjoyment, and interest, given the breadth of lay discussions that include intrinsic motivation. The question for the L2 learning experience, and more broadly the L2MSS, is, to some extent, therefore one of comprehensibility and translation of terminology.
These findings suggest that any renovation of the L2MSS will require a fundamental shift in how constructs are recognized and labeled. Rather than maintaining theoretical distinctions based on traditional nomenclature, constructs should be identified and validated based on their actual content and measurement. This aligns with recent calls for theoretical consolidation in motivation research (Henry & Liu, 2024; Skinner, 2023) and suggests that the field will benefit from moving away from language-specific theories toward more general motivational frameworks (see Al-Hoorie & Hiver, 2020). Indeed, in the interest of more complete models of learning, recognition of both the domain-general and domain-specific features of language can contribute more globally to understanding. Improving theoretical and terminological clarity will benefit scientific understanding of motivation and learning.
We recognize several shortcomings in our data. First, the nature of the open-ended data does not allow for intuitive statistical comparison. While this is only a relatively small and initial exploratory study, future studies into the nature of content and construct validity of the L2MSS and other latent variables will benefit from approaches that more readily lend themselves to statistical nuance. Further, the sampling of the study may not have achieved fully representative randomization across a broad swath of experts on the psychology of language learning, even though we targeted authors who have recently published in a journal dedicated to the psychology of language learning. Future studies will need to enlist participants from a planned variety of research traditions.
To resolve these theoretical overlaps and potential jangle fallacies, future research should directly compare the L2MSS and SDT within the same studies. Such comparative research would help determine the true discriminant validity of these constructs and could potentially lead to a more parsimonious theoretical framework. This is particularly important given the strong empirical support for SDT across various cultural contexts (Ryan, 2023) compared with the current theoretical flux in L2MSS research (Henry & Liu, 2024; McClelland & Larson-Hall, 2025).
Conclusions
Ultimately, the study of language education is the study of how individuals learn to communicate. A significant part of this comes through building tools for common understanding, including the learning of appropriate vocabulary for the variety of situations that language users encounter. In the interest of matching the goal of learning a language to communicate across regional and national boundaries, an agreement on terminology and constructs has become necessary (Skinner, 2023). While recognizing that many of the differences in learning an L1 and L2 are as important as the similarities (DeKeyser, 2000), the overlaps remain similarly crucial for a complete picture (Al-Hoorie & Hiver, 2020; Oga-Baldwin & Fryer, 2020); formal L2 learning indeed contains practices and processes that are unique and explicable only within the bounds of this domain, but also contains domain-general elements that are recognizable to researchers in traditions such as educational and developmental psychology more generally. To build broad communicative interplay and expand the field of human knowledge of how motivation works in a variety of domains, a common language of constructs will facilitate ease of interpretation and interpolation.
From the perspective of parsimony, there is now both theoretical and empirical evidence that intrinsic motivation and the L2 learning experience constitute practically the same construct. Logically, the next step will be the expansion of comparison, running competing models of the L2MSS against the SDT continuum. Given the overlaps presented in this paper, there is room to speculate that other correlated constructs may also overlap. Could the ideal L2 self, with its focus on achieving future abilities and goals, show similar overlap with identified regulation (see Al-Hoorie et al., in press), the motivation toward personal achievement? Both the ought-to L2 self and introjected regulation represent motivation stemming from guilt, social pressure, and social comparison; could these similarly be empirically indistinguishable? In the interest of determining the potential theoretical role of the L2MSS moving forward, these questions must be resolved. An affirmative response to these questions should not be a cause for concern but rather an opportunity to develop more parsimonious and theoretically robust models.
Competing models using both theories can help to reach theoretical parsimony and, perhaps, eventually identify both the language-specific and general conceptual differences explained by these theoretical constructs. Preregistration methods (Liu, Al-Hoorie, & Hiver, 2024) and other foundational principles of open science, in concert with exacting methodological practices, will provide empirical answers and help to move the field toward improved parsimony. In deciding and defining the terms to be used in scientific communication, language learning researchers can bring the field of theoretical motivation into greater harmony with the practice of acquiring a language.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0272263125100880.


