
Allophonic and phonemic tap dance: the influence of native phonology on nonnative phonetic perception and lexical encoding

Published online by Cambridge University Press:  19 November 2025

Zhiyi Wu*
Affiliation:
The Graduate Program of Second Language Acquisition, School of Languages, Literatures, and Cultures, University of Maryland, College Park, USA
Kira Gor
Affiliation:
The Graduate Program of Second Language Acquisition, School of Languages, Literatures, and Cultures, University of Maryland, College Park, USA
*Corresponding author: Zhiyi Wu; Email: zhiyiw1@umd.edu

Abstract

This study investigates second language (L2) phonetic categorization and phonological encoding of L2 words (hereafter, phonolexical encoding; see Footnote 1) with phonemic and allophonic cross-linguistic mismatches. We focus on the acquisition of Spanish /ɾ/-/l/ and /ɾ/-/t/ contrasts among Spanish learners with American English (AE) and Mandarin Chinese (hereafter, Chinese) as first languages (L1s). [ɾ] and [t] are positional allophones in AE but separate phonemes in Spanish. The phoneme /ɾ/ is absent in Chinese. AE learners showed nativelike phonetic categorization and little between-contrast difference in phonolexical encoding, suggesting that L1 positional allophony does not necessarily impede L2 contrast acquisition. Chinese learners showed persistent perceptual difficulties with both contrasts due to perceptual similarity. Phonetic categorization significantly predicted phonolexical encoding for /ɾ/-/t/ contrasts for Chinese learners bidirectionally, while AE learners showed this relationship only when /t/ was incorrectly replaced by /ɾ/ in Spanish words. This asymmetry can be driven by the fact that [t] is the dominant allophone of /t/ in AE, while [ɾ] is a positional allophone. It suggests L1 allophonic knowledge heightens perceptual monitoring when evaluating substitutions that conflict with L1 phonological expectations. This study calls for more nuanced treatment of L1 influence in L2 phonological acquisition models, especially at the allophonic level.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Introduction

A key area of second language acquisition (SLA) research concerns the influence of first language (L1) phonology on the perception, encoding, and representation of second language (L2) sounds. Studies have explored how L1 phonemic inventories affect L2 phonetic perception and phonological representations, focusing on how matches and mismatches between L1 and L2 phonemic inventories influence various aspects of L2 phonological acquisition, including perception accuracy, processing speed, and the formation of phonological and lexical representations (Best, 1995; Best & Tyler, 2007; Flege, 1995; Flege & Bohn, 2021). Extensive experience with rapid and accurate L1 phonemic categorization can be recruited to categorize L2 phonemic contrasts, leading to weakened sensitivity and poor phonological encoding of L2 sound segments (i.e., imprecise phonological representations) and lexical units, or words (i.e., imprecise phonolexical representations; Gor et al., 2021; Llompart, 2021) for an extended period (Gor et al., 2021; Pallier et al., 2001). Moreover, when there is a mismatch, the degree of perceptual similarity between L1 and L2 phonemes or between L2 phonemes can create varying difficulty levels in acquiring L2 phonemes.

While phonemes are often considered the minimal linguistic units, some researchers argue that allophones are the minimal units of speech perception (Luce et al., 2000; Mitterer et al., 2018). Allophones represent the physical realizations of phonemes, the latter being abstract units that distinguish meanings (Hayes, 2010). For instance, in North American English (AE), the phoneme /t/ is realized as a flap [ɾ] when placed post-tonic between two vowels (e.g., water [wɔɾɚ]), as an unaspirated [t] after /s/ (e.g., stand [stænd]), and as an aspirated [tʰ] in most other positions (e.g., take [tʰeɪk]). In this case, [ɾ], [t], and [tʰ] are three allophones of the phoneme /t/. A listener identifies these sounds based on their phonetic properties and categorizes them as the phoneme /t/ based on allophonic variation rules. Accordingly, allophonic variation influences the acquisition of a language’s sound system no less than abstract phonemic categories. Despite their crucial role in speech processing (Hayes, 2010), the perception and representations of these subphonemic speech segments among L2 speakers are not sufficiently understood (Shoemaker, 2014).

This study contributes to understanding the influence of L1 phonology on L2 acquisition by investigating two scenarios: the role of L1 allophonic variation in L2 phonological development when L1 allophones correspond to separate L2 phonemes, and the impact of a complete mismatch between L1 and L2 phonemic inventories, where the L2 phoneme is absent in the L1 both as a phoneme and an allophone. Using an auditory oddity task (OT) and an auditory lexical decision task (LDT), we first examine whether the L1 allophone–L2 phoneme relationship influences learners’ acquisition of the L2 sound system among late learners with intermediate or higher proficiency. Specifically, we explore whether the allophonic status of [ɾ] and [t] in AE impacts the categorization of /ɾ/ and /t/ as separate phonemes in Spanish at the phonemic and phonolexical levels. American learners’ perception and representation of this Spanish phonemic pair were compared with /ɾ/ and /l/, which are separate phonemes in both languages but are articulatorily and acoustically more similar to each other than /ɾ/ and /t/. Then, we compared this set of comparisons with the ones for L1 Chinese learners of Spanish, who must acquire /ɾ/ as a new L2 category. For Chinese learners, the relative articulatory and perceptual similarities between /ɾ/ and /t/, as well as between /ɾ/ and /l/, are predicted to be the major influences on their categorization patterns and phonolexical representations: The closer the Spanish phonemes are to each other, the less accurately these learners can categorize and represent them in words.

Background

Numerous studies have consistently demonstrated that mismatches between L1 and L2 phonemic inventories could inhibit L2 phonemic acquisition (e.g., Aoyama et al., 2004; Bosch et al., 2000; Kkese & Karpava, 2019). Moreover, these studies have shown that the perceptual difference between L1 and L2 phonemes is critical for how well learners perceive and acquire L2 phonemes. These recurring findings have led to the proposal of different models of cross-language speech perception with similar core premises but different operationalizations, such as the Perceptual Assimilation Model (PAM) and PAM-L2 (Best et al., 1994; Best & Tyler, 2007), the Speech Learning Model (SLM) and SLM-r (Flege, 1995; Flege & Bohn, 2021), and the Second Language Linguistic Perception model (L2LP; Escudero, 2005; van Leussen & Escudero, 2015). Despite their different conceptualizations (e.g., perceptual proximity in SLM, articulatory similarity in PAM, and acoustic similarity in L2LP), these models agree that perceptual differences are key predictors of success in perceiving and pronouncing nonnative phonemes. L2 phonemes that are highly similar to L1 categories could be more challenging, as learners tend to assimilate new sounds to their existing representations, while L2 phonemes dissimilar from the L1 inventory may be easier to acquire as separate categories.

While the perception and categorization of phonemes are important, the mapping of these abstract phonological categories onto their allophonic variants, which differ across languages, is often overlooked (Barrios et al., 2016a; Barrios et al., 2016b; Kohler, 1981; Lopez Velarde, 2020). Allophonic variation can occur as free variation, where allophones of a phoneme appear in the same environment, or more commonly, as context-dependent allophones constrained by the phonetic environment. Studies have demonstrated that individuals who acquired a language from birth struggle to discriminate context-dependent allophonic pairs in that language when presented outside their phonetic context (Boomershine et al., 2008; Kazanina et al., 2006; Whalen et al., 1997), suggesting that pure auditory discriminability plays a limited role in phonetic categorization (Boomershine et al., 2008). Instead, abstract phonological representations strongly influence sound categorization (Kazanina et al., 2006). When two speech sounds share a phonological representation (e.g., [d] and [t] in Korean), L1 speakers may be unable to discriminate these phonemically noncontrastive sounds, even if they are acoustically distinct (see Barrios et al., 2016b for a review).

Allophonic variation creates unique perceptual and processing patterns for both L1 and L2 speakers. The SLM-r proposes that “the mapping of L2 to L1 sounds occurs at the level of position-sensitive allophones, not phonemes” (Flege & Bohn, 2021, p. 13). Related research on L2 speakers has focused on free variation (Barrios et al., 2023; Zheng & Gor, 2024), but studies on context-dependent allophones among L2 speakers remain scant. Some studies have investigated learners acquiring context-dependent allophones in L2 (Barrios et al., 2023; Shea & Curtin, 2010, 2011), while few have examined how L1 allophones influence L2 perception and acquisition (e.g., Barrios et al., 2016b; Eckman et al., 2001; Eckman & Iverson, 2013). One unique situation is when L1 allophonic pairs map onto two separate L2 phonemes, termed “allophonic split” (Eckman et al., 2001), a learning scenario considered challenging by the contrastive analysis hypothesis (Lado, 1957). For instance, [t] and [ɾ] are allophones of /t/ in AE but separate phonemes /t/ and /ɾ/ in Spanish (see the next section for examples). American learners of Spanish must learn that /t/ and /ɾ/ distinguish word meanings in Spanish. While some evidence supports this difficulty, mainly in production (Eckman et al., 2001; Eckman & Iverson, 2013), research on the perception of L2 phonemic contrasts involving L1 allophones is limited to two studies (Eckman et al., 2009; Barrios et al., 2016b).

These two studies have provided valuable insights into the relationship between L1 allophony and L2 phonemic acquisition, each focusing on different aspects of this complex interaction. The first study, Eckman et al. (2009), examined the perception and production of the English /s/-/š/ contrast by L1 speakers of Japanese and Korean, in which these sounds are allophones of the same phoneme, and found that L2 learners could successfully acquire this contrast across morpheme boundaries. Using an AX discrimination task, Barrios et al. (2016b) examined the cross-linguistic influence of allophonic split and found that advanced Spanish late learners of English successfully discriminated the L2 contrast of [d] and [ð], which are allophones in Spanish but phonemes in English. This study suggests that the L1 allophonic status of the L2 phonological contrast does not necessarily impede L2 sound acquisition and that learners can establish new phonological relationships between familiar phones despite their different statuses in the L1. However, more research is needed to examine the generalizability of these results and how L1 allophony affects learners’ phonetic perception and phonological and phonolexical encoding of L2 phonemic contrasts.

To examine whether and how L2 learners encode these difficult phonological contrasts in their lexical representations, researchers have increasingly turned to the auditory LDT (for a comprehensive review, see Darcy et al., 2025). Using this methodology, research has revealed not only general difficulty with challenging L2 contrasts but also directional asymmetries in processing them, where one sound may activate representations of another but not vice versa (Barrios et al., 2016b; Darcy et al., 2013; Llompart, 2021; Llompart & Reinisch, 2019). For instance, Weber and Cutler (2004) found that Dutch–English bilinguals hearing /pæn/ in “panda” inappropriately activated “pencil,” but /pɛn/ in “pencil” did not activate “panda,” demonstrating how asymmetries can reveal the more nuanced nature of learners’ phonological representations. The patterns of these asymmetries can help determine whether learners’ difficulties stem from perceptual neutralization of the contrast or imprecise lexical encoding (Barrios & Hayes-Harb, 2021). This approach thus offers a window into L2 phonolexical processing, making it an ideal method for investigating how L1 allophonic variants are engaged in a phonemic contrast in L2.

Studies on the persistent difficulty of phonemic contrasts at phonological and phonolexical levels have yielded mixed results regarding whether perceptual abilities can predict lexical-level processing accuracy. Some support the Direct Mapping from Acoustics to Phonology (DMAP) theory, positing that well-established L2 phonemic categories are dissociated from phonolexical encoding (Amengual, 2016; Daidone & Darcy, 2021; Darcy et al., 2012). Others support the fuzzy lexical representation hypothesis (FLRH; Gor et al., 2021), suggesting a close relationship between phonemic and phonolexical encoding, with imprecise perception leading to confusion at the word level (Broersma & Cutler, 2008; Ota et al., 2009; Rhodes et al., 2022). These competing theoretical positions have been investigated using a combination of methods, with researchers employing auditory OT to assess phonetic discrimination and auditory LDT to evaluate phonolexical encoding (as in Daidone & Darcy, 2021). This study aims to contribute to this debate by examining the influence of allophonic split on L2 acquisition at phonemic and phonolexical levels, using these same complementary methods: an auditory OT and an auditory LDT. We investigated how AE L1 speakers learning Spanish phonetically perceive and phonolexically encode Spanish /ɾ/-/t/ in relation to Spanish /ɾ/-/l/, compared to Chinese learners of Spanish.

[ɾ], [t], and [l] in Spanish, AE, and Chinese

Among the three targeted speech sounds in Spanish, [ɾ] and [l] share both place of articulation and voicing, but [ɾ] and [t] do not share any articulatory features, as shown in Table 1.

Table 1. Articulatory features of [t], [l], [ɾ] in Spanish

In Spanish, /ɾ/, /t/, and /l/ are distinct phonemes (see Table 2; e.g., caro, /kaɾo/ “expensive” vs. cato, /kato/ “(I) taste” vs. calo, /kalo/ “(I) soak”). /ɾ/ is a voiced alveolar consonant typically found in intervocalic positions, whereas the voiceless /t/ and voiced /l/ can occur in any position within a word.

Table 2. /t/, /l/, /ɾ/ in Spanish

In AE, /l/ is a distinct phoneme, while [ɾ] and [tʰ] are allophones of the phoneme /t/ (see Table 3). [ɾ] occurs exclusively in intervocalic positions, and the unaspirated [t] appears only after an /s/. The most common allophone of /t/ is the aspirated [tʰ]. Consequently, [ɾ] and [tʰ] in AE are considered to belong to the same phonological category, /t/, despite their different phonetic realizations. This presents an allophonic split, where an L1 allophonic pair (AE [ɾ] and [t]) corresponds to two separate L2 phonemes with distinct phonetic features (Spanish /ɾ/ and /t/). Additionally, Spanish /ɾ/ is orthographically represented by “r,” contrasting with the AE [ɾ], represented by “t”. Considering that [ɾ] and [t] map onto the same L1 phoneme but [ɾ] and [l] are more phonetically similar, it is unclear which pair of Spanish phonemes is harder for AE L1 speakers in terms of perception and processing.

Table 3. /t/, /l/, /ɾ/ in AE

Note: Bolded allophones are the core sounds examined in the current study.

For Chinese learners of Spanish, both /l/ and /t/ can be mapped onto the distinct Chinese phonemes, /l/ and /t/. While the Spanish /ɾ/ does not have an exact equivalent in Chinese, it may still involve the “phantom activation” of similar-sounding phonemes in Chinese learners’ L1 inventory (Broersma & Cutler, 2008). Table 4 provides an overview of the corresponding allophones for these three phonemes in Chinese. Studies have shown that Chinese speakers most often associate /ɾ/ with the alveolar lateral /l/ that exists in Spanish and Chinese, likely due to their similar place of articulation (Ortí Mateu, 1990; also, see Patience, 2018). Additionally, the voiced stop /d/ and the voiceless stop /t/ have been identified as potential candidates for perceptual confusion with the Spanish /ɾ/ (Duanmu, 2007). When acquiring a novel L2 phoneme similar to L1 phonemes, learners tend to rely on those L1 sounds as a basis for perception, creating difficulties in establishing the new phonemic category and distinguishing it from similar L1 and L2 phonemes (Best & Tyler, 2007; van Leussen & Escudero, 2015). Considering the phonetic proximity between /l/ and /ɾ/, Chinese learners tend to perceive the Spanish /ɾ/ as similar to existing Chinese phonemes, particularly /l/, rather than treating it as an entirely novel category. This perceptual assimilation aligns with several studies showing Chinese learners’ considerable difficulty in distinguishing between Spanish /l/ and /ɾ/ (Chih, 2013; Ortí Mateu, 1990; Patience, 2018).

Table 4. /t/, /l/, /ɾ/ in Chinese

This study employs an auditory OT and LDT to examine how intermediate-to-advanced learners of Spanish with AE and Chinese L1 backgrounds phonetically discriminate and phonologically and phonolexically encode the Spanish /ɾ/-/l/ and /ɾ/-/t/ contrasts. Based on the theoretical frameworks and empirical evidence discussed above, we propose three research questions (RQs) and hypotheses (Hs):

RQ1: How do the allophonic split (AE learners) and the absence of phonemic equivalence (Chinese learners) influence L2 phonetic discrimination?

Using d’ (d-prime) as a measure of perceptual sensitivity, Experiment 1 utilizes an auditory OT to compare how accurately AE and Chinese learners differentiate between /ɾ/-/t/ and /ɾ/-/l/ contrasts.

H1.1: For AE learners, L1 allophonic status is predicted to either reduce sensitivity to /ɾ/-/t/ through interference or enhance sensitivity through phonetic familiarity.

If phonetic similarity is the dominant factor, AE learners will show greater sensitivity to /ɾ/-/t/ than /ɾ/-/l/.

If L1 allophonic status is the dominant factor, AE learners will show either reduced or similar sensitivity to /ɾ/-/t/ compared to /ɾ/-/l/, despite the phonetic differences. The allophonic relationship between [ɾ] and [t] in AE may counteract any advantage that would typically result from the greater phonetic distinction between these sounds.

H1.2: Chinese learners are hypothesized to show particularly low sensitivity to /ɾ/-/l/ given both the absence of /ɾ/ in their L1 and its similarity to /l/.

RQ2: How do these L1–L2 relationships affect phonolexical encoding?

Using an auditory LDT, Experiment 2 measures how accurately learners identify nonwords created by phoneme substitution (e.g., bonito /boˈnito/ “beautiful” becoming */boˈniɾo/ through /t/➔/ɾ/ substitution).

H2.1: Both learner groups are hypothesized to show lower accuracy in rejecting /ɾ/-/l/ nonwords than /ɾ/-/t/ nonwords, reflecting the influence of perceptual similarity. We also expect this difficulty to manifest asymmetrically, with greater challenges when the less familiar phoneme (/ɾ/) replaces the more familiar one (/t/).

H2.2: AE learners’ accuracy in rejecting /ɾ/-/t/ nonwords is hypothesized to depend on whether their L1 allophonic experience facilitates or hinders L2 encoding. If allophonic split creates processing asymmetries, we expect differential performance based on substitution direction, with less accurate rejection of /t/➔/ɾ/ than /ɾ/➔/t/.

H2.3: Chinese learners are hypothesized to show low accuracy in their rejection of /ɾ/-/l/ nonwords, indicating fuzzy phonolexical representations involving this contrast. Due to the much lower familiarity with /ɾ/ than the other two phonemes, this group is expected to reject /t/➔/ɾ/ and /l/➔/ɾ/ less accurately than the opposite substitution directions.

RQ3: What is the relationship between phonetic discrimination and phonolexical encoding for these contrasts? We examine this by analyzing correlations between Experiment 1’s discrimination performance and Experiment 2’s nonword rejection accuracy.

H3: Following the FLRH, discrimination ability is hypothesized to predict nonword rejection accuracy more strongly for perceptually challenging contrasts:

Stronger correlation for /ɾ/-/l/ than /ɾ/-/t/ for both learner groups.

Stronger correlations overall for Chinese learners compared to AE learners.

Experiment 1

Experiment 1 used an auditory OT to investigate participants’ ability to perceive the difference between the two sounds in each pair and categorize them as different. We chose the oddball paradigm due to its higher cognitive demand than other discrimination tasks (e.g., ABX tasks), as it would reduce the likelihood of a ceiling effect for less challenging contrasts (Daidone & Darcy, 2021). We hypothesized that Spanish L2 speakers would show more accurate responses to /ɾ/-/t/ triads than /ɾ/-/l/ triads because of the greater phonetic similarity between /ɾ/ and /l/. Moreover, Chinese L1 speakers may have the most difficulty because they have no experience with /ɾ/ from their L1 (see Footnote 2). AE L1 speakers may face two contrasting learning scenarios: struggling with the /ɾ/-/t/ triads if their learning is driven by the L1–L2 allophonic mismatch, or finding these triads easier if the learning is driven by their familiarity with the phonetic properties of /ɾ/ and /t/ in L1.

Participants

Forty Spanish L1 speakers (21 females; mean age = 28 years, SD = 7.2; two from Argentina, two from Chile, nine from Spain, and 22 from Mexico) and two groups of Spanish learners, 42 L1 speakers of AE and 48 L1 speakers of Chinese, participated in this experiment. During post-experiment debriefing, it was discovered that two Chinese L1 speakers had used incorrect response keys for more than a third of one task, leading to their exclusion from the study. Additionally, data from one Spanish L1 speaker was lost due to technical issues. These adjustments resulted in final group sizes of 39 Spanish L1 speakers, 42 AE L1 speakers, and 46 Chinese L1 speakers.

Chinese-speaking participants were recruited from Spanish departments at various universities in China, while Spanish and AE L1 speakers were recruited from different universities in the United States, word of mouth, social media, and the online participant recruitment platform Prolific (www.prolific.com). All participants received monetary compensation for their time and effort.

To ensure that all Spanish L2 speakers had at least an intermediate level of proficiency, two screening proficiency tests were administered: LexTALE-Esp (Izura et al., 2016) and 30 multiple-choice DELE questions adapted from Montrul (2012). Participants also completed a modified Language Experience and Proficiency Questionnaire (Marian et al., 2007), which collected information on their educational background, language learning history, self-rated Spanish proficiency, age of acquisition (AoA; see Footnote 3), and average daily exposure to Spanish (see Table 5). As shown in Table 5, Chinese and AE learners of Spanish differed significantly in their AoA. Given that earlier language learning and higher language proficiency levels have been associated with enhanced listening performance and improved cognitive and auditory processing (Bak et al., 2014; Dastgerdi et al., 2023; Vandergrift, 2006), we tested whether AoA and self-rated listening influenced perceptual performance by including them as main factors in our statistical models, although investigating age or proficiency effects was not the primary focus of our study. However, model comparisons across all analyses consistently revealed that neither AoA nor self-rated listening significantly contributed to explaining the variance in our data, suggesting that neither the timing of Spanish acquisition nor self-rated listening proficiency meaningfully affected participants’ performance in our tasks. The LexTALE and Montrul test scores were used as covariates in follow-up analyses to validate our main findings and ensure they were not driven by proficiency differences between the two Spanish late learner groups (see the Results section for details).

Table 5. Background information for participants

Note: * One American L1 speaker’s demographic data were not saved due to technical issues.

All tasks involved in the study protocol were approved by the Institutional Review Board at the authors’ institution. Prior to participation, all participants provided informed consent.

Materials

All stimuli were disyllabic and respected the Spanish phonotactic constraints. They were all nonwords in Spanish, Chinese, and English. There were two minimal pairs for each of the two critical pairs (/ɾ/-/l/ pair: /meɾi/-/meli/ and /kiɾu/-/kilu/; /ɾ/-/t/ pair: /priɾo/-/prito/ and /seɾi/-/seti/) and five filler pairs (/n/-/ɲ/, /l/-/ʎ/, /b/-/f/, /g/-/k/, /d/-/ɾ/; see Footnote 4). All stimuli were digitally recorded in a sound-attenuated room by three L1 speakers of Castilian Spanish (one male and two females). The complete set of materials, data, code, analyses, and model output for both experiments is available on the Open Science Framework (OSF) website: osf.io/v6esc.

Each pair of stimuli (e.g., nonword A: /meɾi/ and B: /meli/) appeared in eight possible orders: AAA, BBB, ABB, BAA, ABA, BAB, AAB, and BBA. For example, the /meɾi/–/meli/ pair appeared once as /meɾi/-/meɾi/-/meɾi/ (AAA), once as /meli/-/meli/-/meli/ (BBB), once as /meɾi/-/meli/-/meli/ (ABB), etc. A total of 112 trials were included in the study, with 56 trials pseudo-randomized for presentation in each of two blocks. Each block contained all seven phonemic pairs, with each pair appearing in all eight possible stimulus orders.
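As a rough illustration of the trial structure described above, the following Python sketch enumerates the eight triad orders for one pair and reproduces the reported trial counts. The labels A and B and the function names are hypothetical and chosen only for illustration; the actual stimulus lists are those in the OSF materials.

```python
from itertools import product

def triad_orders():
    """All three-item sequences over the two members (A, B) of a pair."""
    return ["".join(p) for p in product("AB", repeat=3)]

def classify(order):
    """Return ('same', None) for AAA/BBB, else ('odd', 1-based position of the odd item)."""
    if len(set(order)) == 1:
        return ("same", None)
    for position, item in enumerate(order, start=1):
        if order.count(item) == 1:
            return ("odd", position)

orders = triad_orders()                           # 8 orders: AAA, AAB, ..., BBB
n_contrasts = 7                                   # 2 critical + 5 filler contrasts
trials_per_block = n_contrasts * len(orders)      # 7 x 8
total_trials = trials_per_block * 2               # two blocks

print(orders)
print(trials_per_block, total_trials)
```

Of the eight orders, two are "same" triads and six contain an odd item, two for each of the three positions, so a listener who always answers "same" would be correct on only a quarter of the orders.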

Procedure

Participants were tested individually on the online experimental platform Gorilla (www.gorilla.sc; Anwyl-Irvine et al., 2020). In each trial, participants heard the first stimulus produced by the male speaker, the second by one female speaker, and the third by the other female speaker, separated by a lag of 1200 ms. The order of the talkers was fixed across trials. For each trial, participants were asked to press keys to denote, as quickly and accurately as possible, whether all three nonwords were the same (the L key) and, if not, which was different (the A, S, or D key). The inter-trial interval was 500 ms. Participants were allowed 6500 ms to give their responses. The experiment began with a practice phase consisting of seven practice trials with explicit feedback, informing participants whether their responses were correct or incorrect, allowing them to become familiar with the task before proceeding to the test trials. Following this, the main task commenced, with the first eight trials also treated as practice and excluded from the analysis. The entire task took approximately 12 min, with an optional break between the two blocks.

Results

Before conducting our main analysis, we followed Bramlett and Wiener (2024) and established removal criteria based on accuracy and reaction time (RT) for participants and items within each language group. The threshold for exclusion was set at three standard deviations (SDs) from the mean. Specifically, we removed participants whose accuracy or RT fell beyond three SDs from the mean of all participants within their language group. Similarly, we removed items (three-stimulus trials) whose accuracy or RT exceeded three SDs from the mean of all items within their language group. This procedure removed one AE L1 speaker, leaving 41 participants for the following analysis, and one triad (/kiɾu/-/kilu/-/kilu/) from the Spanish L1 speaker group. Table 6 provides the raw accuracy and RT data for each group. Figure 1 visualizes the accuracy rates (mean and standard error) for the three language groups across the two critical contrasts for odd (i.e., difference) and same trials.
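The three-SD removal rule can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and the input format are hypothetical, the use of the sample SD (rather than the population SD) is an assumption, and the function handles one measure within one language group at a time.

```python
from statistics import mean, stdev

def flag_outliers(scores, n_sd=3.0):
    """Return the keys whose value lies more than n_sd SDs from the group mean.

    scores: dict mapping a participant (or item) ID to one measure,
            e.g. mean accuracy or mean RT within a single language group.
    Assumption: the sample SD is used; the source does not specify which SD.
    """
    values = list(scores.values())
    group_mean = mean(values)
    group_sd = stdev(values)
    if group_sd == 0:
        return []  # no variation, so nothing can be flagged
    return [key for key, value in scores.items()
            if abs(value - group_mean) > n_sd * group_sd]

# Hypothetical accuracy scores for one group: 19 typical participants, 1 extreme.
example = {f"p{i}": 0.9 for i in range(19)}
example["p19"] = 0.1
print(flag_outliers(example))
```

In practice the same check would be run separately for accuracy and RT, for participants and for items, within each language group, and a case would be removed if it was flagged on either measure.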

Table 6. Mean accuracy rate and RT information for the three L1 groups in OT

Figure 1. Accuracy rates by contrast type, language group, and trial type (odd or same trials).

Note: Bars show proportion correct for /l/-/ɾ/ (black) and /t/-/ɾ/ (gray) contrasts across AE, Spanish, and Chinese L1 speakers. Error bars represent one standard error of the mean for each contrast for each group.

To provide a more comprehensive measure of participants’ discrimination abilities, we calculated each participant’s sensitivity to each contrast using d’ instead of raw response accuracy (Stanislaw & Todorov, 1999). We computed d’ by collapsing across same and different trials, with hits defined as correct identifications of the odd item in different trials, and false alarms as incorrect identifications of a difference when all three items were the same. This approach accounts for both correct detections and error rates, offering a more nuanced view of perceptual sensitivity. To address extreme proportions in our d’ calculation, we applied the log-linear correction method (Hautus, 1995), adding .5 to both hits and false alarms and 1 to the total number of trials, preventing infinite values while maintaining unbiased sensitivity estimates. Figure 2 illustrates the d’ values for the OT across language groups and phonemic contrasts. As shown in Figures 1 and 2, Chinese L1 speakers showed the lowest accuracy and sensitivity across both contrasts, with more accurate performance on /t/-/ɾ/ discrimination compared to /l/-/ɾ/ discrimination. Their response pattern for /l/-/ɾ/ trials revealed a pronounced “same” bias, evidenced by near-chance performance (.25) on odd trials where they needed to detect differences, yet relatively high accuracy (.70) on same trials where all items were the same. This “same” bias suggests a lack of discrimination of these phonemes: Chinese speakers systematically failed to detect acoustic differences between /l/ and /ɾ/. In contrast, both Spanish L1 speakers and AE speakers exhibited robust discrimination abilities with high sensitivity across both contrasts. Neither of these groups displayed a “same” bias.
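As an illustration of the sensitivity measure described above, the following sketch computes d’ with the log-linear correction. It assumes the standard yes/no formula, d’ = z(hit rate) − z(false-alarm rate), which may differ from the exact computation in the authors' OSF scripts; the function name and the trial counts in the example are hypothetical.

```python
from statistics import NormalDist

def d_prime(hits, n_signal, false_alarms, n_noise):
    """d' with the log-linear correction (add .5 to counts, 1 to trial totals).

    hits          correct identifications of the odd item on different trials
    n_signal      number of different (odd) trials
    false_alarms  "different" responses on same trials
    n_noise       number of same trials
    """
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # assumption: yes/no formula, d' = z(H) - z(F)
    return z(hit_rate) - z(fa_rate)

# Hypothetical counts: perfect performance stays finite after correction
print(d_prime(hits=20, n_signal=20, false_alarms=0, n_noise=20))

# Equal hit and false-alarm rates give zero sensitivity
print(d_prime(hits=10, n_signal=20, false_alarms=10, n_noise=20))
```

Without the correction, a hit rate of 1 or a false-alarm rate of 0 would send the z-transform to infinity, which is the situation the correction is meant to avoid.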

Figure 2. d' values for the oddity task across language groups (Spanish L1 speakers, AE L1 speakers, and Chinese L1 speakers) and phonemic contrasts (/l/-/ɾ/ vs. /t/-/ɾ/).

We analyzed d’ values using linear mixed-effects modeling (LMM) with Subject and Item as random effects, using the lme4 (Bates et al., 2015) and the lmerTest (Kuznetsova et al., 2017) packages in R (v4.2.3; R Core Team, 2021). Although we aimed to keep the random-effects structure as maximal as possible, no model included random slopes because such models failed to converge. Model comparisons based on AIC and BIC values (see analysis scripts on the OSF repository [Footnote 5]) favored a model containing fixed effects of Language Group (Spanish, AE, and Chinese), Contrast (/t/-/ɾ/, /l/-/ɾ/), and their interactions (d’∼ Group × Contrast + (1|Subject)). To ensure that our findings were not driven by early learners, we further conducted a validation analysis including only participants who began learning Spanish after age 12 years (AE: n = 23; Chinese: n = 46) and included LexTALE and Montrul test scores in the model selection process. The best model from this subset analysis was one with Language Group, Contrast, their interactions, and Montrul test scores (d’∼ Group × Contrast + Montrul Score + (1|Subject)).
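For readers unfamiliar with the information criteria used in the model comparisons, the sketch below shows how AIC and BIC are computed from a fitted model's log-likelihood. The log-likelihoods, parameter counts, and sample size are invented for illustration and are not taken from the study; in practice these values would come from the fitted lme4 models.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical candidates: (log-likelihood, number of estimated parameters)
candidates = {
    "Group + Contrast": (-310.0, 6),
    "Group x Contrast": (-300.0, 8),
}
n_obs = 250  # hypothetical number of observations

for name, (log_lik, k) in candidates.items():
    print(name, round(aic(log_lik, k), 1), round(bic(log_lik, k, n_obs), 1))
# Lower values indicate the preferred model under each criterion.
```

Both criteria penalize extra parameters, with BIC penalizing them more heavily as the number of observations grows.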

We employed treatment, or dummy, coding for categorical variables to facilitate the interpretation of main effects and interactions (Brehm & Alday, 2022). Post hoc comparisons between language groups and phonetic contrasts were conducted using the emmeans() function from the emmeans package (Lenth, 2025), which provides estimated marginal means and pairwise contrasts while appropriately adjusting for multiple comparisons.

The main model revealed significant differences among groups and between contrasts. Chinese L1 speakers exhibited significantly lower sensitivity than both Spanish and AE L1 speakers across both contrasts (ps < .001). Spanish and AE L1 speakers showed comparable levels of sensitivity, with no significant differences observed in either /t/-/ɾ/ trials (|β| = .04; p = .84) or /l/-/ɾ/ trials (|β| = .13; p = .53). When comparing the two contrasts within each group, a consistent pattern emerged: All three groups showed significantly higher sensitivity to /t/-/ɾ/ trials than to /l/-/ɾ/ trials. This effect was most pronounced for Chinese L1 speakers (|β| = 1.08; p < .001), followed by AE L1 speakers (|β| = .65, p = .0003) and Spanish L1 speakers (|β| = .56, p = .002). These results suggest that while language background plays a crucial role in overall sensitivity, the relative difficulty of the /l/-/ɾ/ contrast compared to the /t/-/ɾ/ contrast is consistent across all groups, albeit to varying degrees.

The secondary model with only learners with an AoA of over 12 revealed the same results for the AE and Chinese L1 groups. The only difference it revealed was a significant positive effect of the Montrul test scores (β = .05, p = .04). This result shows that the higher the participants scored in the Montrul test, the more sensitive they were to the two contrasts at the earlier stage of speech processing, confirming previous findings that higher L2 proficiency contributes to L2 listening (e.g., Vandergrift, 2006).

Discussion

Experiment 1 tested the perceptual sensitivity to the two target contrasts (/l/-/ɾ/ and /t/-/ɾ/) across the three L1 groups (Spanish, AE, and Chinese). Our findings largely support our initial predictions but also reveal some unexpected patterns.

We anticipated, among Spanish learners, lower sensitivity in discriminating /l/-/ɾ/ than /t/-/ɾ/, likely due to the closer articulatory proximity of /l/ and /ɾ/. This pattern was found across all three groups, including Spanish L1 speakers, which was somewhat unexpected. The fact that Spanish L1 speakers did not perform at ceiling for /l/-/ɾ/ triads can be attributed to the significant cognitive load required to complete the OT. Discrimination tasks, such as the OT employed in this study, are known to place high demands on working memory (Mitterer & Mattys, 2017). In our three-item OT, participants must maintain and compare phonological representations of complex stimuli, which can create processing difficulties even for native speakers when dealing with phonetically similar sounds. Moreover, varying levels of familiarity with Castilian Spanish among these L1 speakers, most of whom were from Latin America (as noted in the Participants section), may also contribute to this not-at-ceiling performance. While other factors, such as regional dialect differences or individual cognitive variations, could potentially play a role, a comprehensive analysis of these factors is beyond the scope of the current study.

Chinese L1 speakers, as predicted, showed the lowest sensitivity to both contrasts, with a particularly pronounced difficulty in discriminating /l/-/ɾ/. This aligns with our hypothesis based on the absence of the /ɾ/ phone in Mandarin Chinese and the perceptual similarity between /l/ and /ɾ/. The significantly higher d’ values for /t/-/ɾ/ trials compared to /l/-/ɾ/ trials across all groups further support the notion that the /l/-/ɾ/ contrast is perceptually more challenging.

Interestingly, English L1 speakers have been shown to struggle to detect differences between allophonic pairs in their L1 when these sounds are presented outside their appropriate phonetic contexts (Boomershine et al., 2008; Kazanina et al., 2006; Whalen et al., 1997). By contrast, AE L1 speakers in our study performed as well as Spanish L1 speakers in discriminating both contrasts. This supports the hypothesis that the allophonic status of these phones in the L1 may have increased AE L1 speakers’ sensitivity to these nonnative phonemic contrasts through phonetic familiarity, allowing them to establish new mappings for these familiar phones, consistent with Barrios and colleagues (2016b).

Experiment 2

An auditory LDT investigated participants’ phonological encoding of nonnative phonemic contrasts (/ɾ/-/l/ and /ɾ/-/t/) and phonolexical encoding of words differentiated by these contrasts. Participants decided whether the sound sequence they heard was a real word. Critical nonwords were created by replacing the critical phonemes with the other consonant in the pair (e.g., bonito /boˈnito/ “beautiful” became */boˈniɾo/; /eˈneɾo/ “January” became /eˈneto/; /baˈsuɾa/ “garbage” became /baˈsula/; /aˈbwelo/ “grandfather” became /aˈbweɾo/). We hypothesized that if Spanish-learning participants had distinct phonolexical representations for words containing the target phonemes, they would successfully reject these nonwords. Conversely, if minimal pairs containing nonnative contrasts had imprecise lexical representations, participants would have difficulty rejecting the nonwords. By comparing AE and Chinese learners of Spanish, we aimed to determine how the presence and absence of allophonic split influence their phonolexical encoding of these contrasts in their L2.

Using results from Experiments 1 and 2, we further examined the relationship between participants’ performance in perceiving phonemic contrasts and their phonolexical encoding. The FLRH posits that discrimination ability would predict nonword rejection accuracy more strongly for perceptually challenging contrasts. Based on this hypothesis, we predicted that learners with more difficulty discriminating a phonemic pair may show a stronger relationship between their phonemic discrimination and phonolexical encoding, whereas learners with less difficulty would show more influence from lexical properties on their lexical decision performance than from their discrimination performance.

Participants

Participants were the same as in Experiment 1 and performed the auditory LDT before the OT.

Materials

Stimuli included 78 Spanish words (40 /ɾ/-/l/ pairs, 37 /t/-/ɾ/ pairs [Footnote 6]; see the full list on the OSF repository). For each contrast, half contained /ɾ/ and half the corresponding /l/ or /t/. Critical nonwords were created by replacing the target phoneme with the other phoneme of the contrast pair. An additional 140 words and 140 nonwords served as fillers [Footnote 7]. No fillers contained phonological contrasts of interest. Items ranged from one to four syllables.

All stimuli were recorded by the male speaker from Experiment 1. Critical stimuli were counterbalanced across two lists to ensure that a word and its corresponding nonword were not heard by the same participant. The critical stimuli for the two contrasts were matched for word frequency (t = −1.12, p = .27) and neighborhood size (t = −.93, p = .36) using CLEARPOND (Marian et al., 2012). Trials were pseudo-randomized and presented in two blocks.

No critical words were cognates, with most chosen from beginner-level Spanish textbooks. L1 speaker-based frequency measures do not necessarily apply to learners (Barrios et al., 2016a; Darcy et al., 2012), so learners translated and rated familiarity for each critical word on a scale of 1 to 5 (1 = I have never seen this word before; 2 = I don’t remember what it means; 3 = I think I know this word; 4 = I know this word; 5 = I know this word very well). Ratings were adjusted based on translation accuracy (see the OSF repository for the adjustment method). Only words with a final score of 4 or 5 and their corresponding nonwords were retained for analysis. Both groups reported over 90% familiarity, and 9.39% of the items were removed due to unfamiliarity.

Procedure

Participants were assigned to one list and tested on Gorilla. Each trial began with a fixation cross displayed for 1000 ms, followed by an auditory stimulus. Participants decided, as quickly and accurately as possible, whether the stimulus was a Spanish word, pressing the P and Q keys for words and nonwords, respectively. RT was measured from stimulus onset, with a 4,000-ms response window. Eight practice trials, including four words and four nonwords, preceded the test trials. Feedback was provided on practice items but not on the test items. The task lasted approximately 25 minutes, with an optional between-block break.

Results

We applied a data cleaning procedure similar to that used in the OT, employing the 3-SD criterion for both accuracy and RT. Participants and items were assessed separately within each language group. No participants or items met the exclusion criteria.

Figure 3 reveals a clear dichotomy in performance across stimulus types. While all groups achieved near-ceiling accuracy (>95%) with real words, their performance diverged markedly in nonword rejection, reflecting varying degrees of phonological robustness in their representations of /t/, /ɾ/, and /l/. This pattern aligns with previous research establishing nonword rejection as a sensitive measure of phonological representation strength (e.g., Darcy et al., 2013; Llompart, 2021). Given the ceiling effect in word recognition, we focused our analyses on nonword rejection trials, where participants had to identify and reject items that differed from real words by a single phoneme (e.g., */læmon/ for “lemon”; */boniɾo/ for “bonito”). Figure 4 shows the mean accuracy for different directions.

Figure 3. Accuracy rates for nonwords and words by contrast type and language group.

Note: Bars show proportion correct for /t/-/ɾ/ (gray) and /l/-/ɾ/ (blue) contrasts across Spanish, American English (AE), and Chinese L1 speakers. Error bars present one standard error of the mean for each contrast for each group.

Figure 4. Accuracy rates for the four nonword creation directions among the three language groups (Spanish, AE, and Chinese L1 speakers).

Note: Bars show the proportion correct for responses for the four nonword creation directions across the three participant groups. Error bars represent one standard error of the mean. For each L1 speaker group, L-R: /t/→/ɾ/, /ɾ/→/t/, /l/→/ɾ/, /ɾ/→/l/.

The RT data in Figure 5 show different processing patterns across language groups. Spanish L1 speakers demonstrated the fastest overall responses with relatively consistent RTs across all conditions. AE L1 speakers showed similar consistency but with generally longer RTs. Chinese L1 speakers exhibited the most variable pattern, with notably longer RTs and a clear speed–accuracy trade-off: They responded more slowly when correctly rejecting nonwords than when incorrectly accepting them as real words. Also, this group responded more quickly to /l/-/ɾ/ contrasts than to /t/-/ɾ/ contrasts, despite showing higher accuracy for /t/-/ɾ/ pairs as shown in Figure 4. While these RT patterns suggest interesting processing differences across groups, we focus our analysis on accuracy data for several methodological reasons. First, nonword rejection studies typically emphasize accuracy over RT data (e.g., Llompart, 2021; Llompart & Reinisch, 2019). Second, standard RT analyses require the removal of incorrect responses (Jiang, 2012), which would create highly imbalanced datasets in our case, particularly for Chinese L1 speakers who showed very low accuracy rates. Including incorrect trials, while potentially informative of processing patterns, would deviate from established analytical approaches and present significant challenges for interpretation. However, the patterns that can be observed from the graph, while not subjected to statistical analysis, could complement our accuracy findings.

Figure 5. Reaction times by Language Group, Contrast, and response accuracy.

Note: Bars show mean RTs (ms) for incorrect (gray) vs. correct (blue) responses for the two contrasts. Error bars represent one standard error of the mean. L-R: Spanish, AE, and Chinese L1 speakers.

To analyze the between-group differences in nonword rejection accuracy and potential asymmetries across phoneme substitution directions shown in Figures 3 and 4, we ran a generalized linear mixed model (GLMM) for nonword rejection accuracy (coded as “0=incorrect” and “1=correct” for each trial) using the lme4 and lmerTest packages in R. The model included Subject as a random effect. Although we aimed to keep the random effects structure maximal, models including random slopes failed to converge. Therefore, we proceeded with random intercepts only. We used the emmeans() function for pairwise comparisons to examine specific contrasts between language groups and directions (see the sections below for results; complete model specifications and outputs are available in the OSF repository).

LDT – GLMMs

To determine the optimal model structure, we conducted a series of model comparisons using likelihood ratio tests. We included the variable Direction, that is, nonword creation direction (e.g., /l/ replaced by /ɾ/: /l/➔/ɾ/; /ɾ/ replaced by /l/: /ɾ/➔/l/), to examine potential asymmetries in nonword rejection based on which phoneme was replaced. The final model included fixed effects of Language Group (Spanish, AE, and Chinese), Direction (/l/➔/ɾ/, /ɾ/➔/l/, /t/➔/ɾ/, /ɾ/➔/t/), the interaction between Language Group and Direction, and two z-score transformed covariates: Frequency and Neighborhood Size [Footnote 8] (Accuracy ∼ 1 + Group × Direction + Frequency + Neighborhood Size + (1|Subject)). Random intercepts for Item were not included because Item co-varied with Frequency and Neighborhood Size, which would have led to multicollinearity issues in the model (Llompart, 2021). Figures 4 and 5 show the mean accuracy and RT for different directions, respectively.

As in Experiment 1, we employed treatment coding for categorical variables to facilitate the interpretation of the main effects and interactions (Brehm & Alday, 2022). Across all three groups, the analysis consistently revealed that participants were more accurate in rejecting nonwords involving the /t/-/ɾ/ contrasts compared to those involving the /l/-/ɾ/ contrasts, as evidenced by both /t/-/ɾ/ directions (/t/→/ɾ/ and /ɾ/→/t/) outperforming both /l/-/ɾ/ directions (/l/→/ɾ/ and /ɾ/→/l/) in pairwise comparisons. This pattern suggests that the /t/-/ɾ/ contrasts may be more perceptually salient or easier to process across different language backgrounds, while the /l/-/ɾ/ contrasts pose more challenges, particularly for the Chinese and AE learners of Spanish.

Spanish L1 speakers showed no significant difference in accuracy between /ɾ/➔/t/ and /t/➔/ɾ/ nonwords (p = .99) or between /ɾ/➔/l/ and /l/➔/ɾ/ nonwords (p = .99), indicating symmetric processing of both contrasts. However, they were significantly more accurate in rejecting /ɾ/-/t/ nonwords compared to /ɾ/-/l/ nonwords (ps < .022).

AE L1 speakers demonstrated response patterns that partially aligned with Spanish L1 speakers, particularly in their processing of the /ɾ/-/t/ contrasts, where they showed no significant difference in accuracy between /ɾ/➔/t/ and /t/➔/ɾ/ nonwords (p = .98). However, their performance diverged when processing the /l/-/ɾ/ contrasts, revealing a distinct asymmetrical pattern. Specifically, they were significantly less accurate in rejecting /l/➔/ɾ/ nonwords compared to /ɾ/➔/l/ nonwords (|β| = .57, p = .003). This heightened sensitivity to /ɾ/-items appears counterintuitive, as one might expect greater accuracy with /l/-items given that /l/ exists as a distinct phoneme in the participants’ L1. One potential explanation for this unexpected pattern lies in AE learners’ metalinguistic awareness of the similarity between the Spanish /ɾ/ and the AE [ɾ]. Having encountered /ɾ/ as an allophone in their L1, these learners may be particularly attuned to the phonetic relationship between their familiar English [ɾ] and the Spanish /ɾ/, especially if explicit instruction has highlighted this cross-linguistic connection. This awareness of the phonetic similarity across languages, combined with their prior experience of [ɾ] in English contexts, may make Spanish /ɾ/ more perceptually salient when it inappropriately appears in place of /l/ in nonwords.

Chinese L1 speakers exhibited a distinct pattern among the three groups. First of all, their accuracy rates for all directions were significantly lower than in the other two groups (ps < .001). While they also showed no significant difference between the two directions within each contrast (/ɾ/-/t/ contrasts: p = .96; /ɾ/-/l/ contrasts: p = .14), their accuracy dropped significantly when processing /ɾ/-/l/ contrasts compared to /ɾ/-/t/ contrasts (ps < .001). This group’s below-chance performance in rejecting /ɾ/-/l/ nonwords (see Figure 4: 15% for /l/➔/ɾ/ and 20% for /ɾ/➔/l/) suggests that participants were likely unable to discriminate between these phonemes at all. This pervasive difficulty with the /l/-/ɾ/ contrasts likely stems from the absence of /ɾ/ in the Chinese phonological inventory, leading to fundamental challenges in establishing distinct phonological categories for these sounds.

The analysis also revealed significant effects of both word frequency and neighborhood size across all participants. Higher word frequency was associated with increased accuracy in nonword rejection (β = .09, p = .01), while a larger neighborhood size led to decreased accuracy (β = −.18, p < .001). These findings align with established lexical effects in word recognition tasks, where more frequent words are typically processed more accurately (e.g., Brysbaert et al., 2016; Monsell et al., 1989), and words with many neighbors often lead to more competition and potential errors in word recognition (e.g., Imai et al., 2005) [Footnote 9].
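To make the size of these covariate effects more concrete, the coefficients can be converted to odds ratios. The sketch below assumes the GLMM used a logit link (the usual default for binary accuracy data, although the text does not state the link function), so that exp(β) gives the multiplicative change in the odds of a correct rejection per one-SD increase in a z-scored covariate.

```python
import math

def odds_ratio(beta):
    """Convert a log-odds coefficient to an odds ratio (assumes a logit link)."""
    return math.exp(beta)

# Coefficients as reported in the text; covariates were z-scored,
# so each odds ratio is per one-SD increase in the covariate.
effects = {"Frequency": 0.09, "Neighborhood Size": -0.18}

for name, beta in effects.items():
    print(f"{name}: odds ratio per SD = {odds_ratio(beta):.2f}")
# Frequency: odds ratio per SD = 1.09
# Neighborhood Size: odds ratio per SD = 0.84
```

Under this assumption, a one-SD increase in frequency would raise the odds of correct rejection by about 9%, and a one-SD increase in neighborhood size would lower them by about 16%.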

We further analyzed the data from the learner groups only, using another GLMM to examine the potential influence of participants’ familiarity with the critical words as well as their AoAs, self-rated listening proficiency, LexTALE, and Montrul test scores [Footnote 10]. The final model from the selection process is the one with fixed effects of Language Group, Direction, the interaction between Language Group and Direction, z-score transformed Neighborhood Size, z-score transformed Familiarity Rating, and its interaction with Language Group (Accuracy ∼ Group × Direction + Neighborhood Size + Familiarity Rating × Group + (1|Subject)). While the within-group comparisons for different directions and the effect of Neighborhood Size remained the same, Familiarity Rating showed a significant main effect and a significant interaction with Group. Post hoc analyses revealed this effect was driven exclusively by AE speakers (β = −.29, p < .001), with no effect for Chinese speakers (β = .33). This counterintuitive pattern suggests that AE learners exhibit an inverse relationship between perceived word familiarity and nonword detection accuracy: The more familiar they believed a word to be, the less likely they were to notice a phonemic substitution. This lexical bias may have reduced phonological monitoring for highly familiar items.

Relationship between OT and LDT

A subsequent GLMM analysis incorporated OT Performance, operationalized as d’ scores from Experiment 1, as a predictor of lexical decision accuracy (Accuracy ∼ Group × Direction × d’ + Neighborhood Size + Frequency + (1|Subject)) [Footnote 11]. The results showed that when controlling for OT performance, directional asymmetries persisted only for AE speakers on the /l/-/ɾ/ contrasts, with no significant asymmetries for Spanish or Chinese groups, consistent with the findings when the OT performance was not included in the modeling.

Looking into the relationship between OT and LDT performance, the model revealed no significant main effect of OT performance nor any significant interaction (three-way Group × Direction × OT performance: χ² = 5.50, p = .48; two-way Group × OT performance: χ² = .94, p = .63; two-way Direction × OT performance: χ² = 1.23, p = .75). However, examination of simple slopes revealed selective predictive relationships between phonetic discrimination and lexical decision performance that varied systematically across language groups and phonemic contrasts.

For Spanish L1 speakers, OT performance significantly predicted LDT accuracy only for the /ɾ/➔/l/ condition (β = .41, p = .04), despite both being native phonemes. This unexpected finding may reflect the high phonetic similarity between these sounds, suggesting that even native speakers require heightened perceptual acuity for accurate lexical decision when sounds are acoustically close and they need to overcome the lexical bias.

Chinese L1 speakers showed a contrasting pattern. OT performance significantly predicted LDT accuracy for both /ɾ/-/t/ directions (/ɾ/➔/t/: β = .34, p = .006; /t/➔/ɾ/: β = .31, p = .01), but not for the /ɾ/-/l/ contrast. This pattern aligns with the relative perceptual distinctiveness of these phonemes for Chinese speakers, who lack /ɾ/ in their L1 inventory. The significant relationships suggest that participants’ varying levels of perceptual sensitivity to the /ɾ/-/t/ contrast directly impact their lexical processing abilities for these novel but discriminable contrasts. The absence of effects for /ɾ/-/l/ likely reflects floor effects in both their discrimination and lexical decision accuracy. When the discrimination ability and lexical decision accuracy are uniformly low across participants, insufficient variability precludes the detection of predictive relationships.

AE L1 speakers demonstrated a unique asymmetrical pattern, with OT performance predicting LDT accuracy only for the /t/➔/ɾ/ direction (β = .59, p = .02). This unidirectional effect is particularly intriguing given that [ɾ] and [t] are allophones of /t/ in AE. The effect suggests that when AE speakers must reject nonwords where /t/ has been replaced by /ɾ/, their perceptual sensitivity becomes crucial. This may reflect the status of [t] as the dominant allophone of /t/ and the marked status of [ɾ] as merely a positional allophone, making its inappropriate appearance in place of [t] more perceptually salient and dependent on discrimination ability.

A follow-up analysis examining learner groups incorporated Familiarity Rating and its interaction with Language Group as additional predictors (Accuracy ∼ Group × Direction × d’ + Neighborhood Size + Frequency + Familiarity Rating × Group + (1|Subject)). Similar to previous findings, Familiarity Rating showed a significant main effect and a significant interaction with Group (p = .0008). This analysis confirmed that Chinese L1 speakers demonstrated significant OT–LDT relationships for both /ɾ/-/t/ contrasts (/ɾ/➔/t/: β = .34, p = .006; /t/➔/ɾ/: β = .31, p = .02), while AE L1 speakers showed significance only in the /t/➔/ɾ/ direction (β = .54, p = .04).

As before, to examine potential AoA effects, we conducted a robustness check on late learners (AoA > 12 years; Chinese: N = 46; AE: N = 15). Chinese speakers’ results remained consistent with the main analysis. However, AE speakers showed two differences: The OT performance effect for /t/➔/ɾ/ lost significance (β = .30, p = .43), and the familiarity rating effect was smaller (β = −0.21, p = .06). This divergence suggests that the relationship between perceptual sensitivity and lexical encoding for AE speakers may be modulated by AoA, but only for contrasts with an allophonic split status. In other words, a better /t/-/ɾ/ contrast sensitivity is associated with a more accurate detection of nonword /t/➔/ɾ/ substitution only for earlier learners. Because of L1 allophony, later learners may rely less on bottom-up perceptual sensitivity when making lexical decisions, potentially compensating through top-down lexical knowledge.

Discussion

Lexical decision accuracy

Experiment 2 examined the robustness of phonological encoding and phonolexical representations for three target Spanish phonemes among Spanish L1 and L2 speakers. Chinese L1 speakers were hypothesized to struggle more with /ɾ/-/l/ than /ɾ/-/t/ due to phonetic similarities (e.g., Pallier et al., 1997, 2001). Interestingly, Chinese L1 speakers rejected /l/-/ɾ/ nonwords below chance level but faster than /t/-/ɾ/ nonwords, as shown in Figures 3 and 5, indicating they assimilated /l/ and /ɾ/ as a single phoneme, did not notice the phoneme swap, and thus were confident in their answers. Our results support and extend previous findings on the lateral /l/ substitution for the Spanish tap /ɾ/ among Chinese participants (Chih, 2013; Patience, 2018), confirming that this lack of discrimination persists at the intermediate level and extends to phonolexical processing. Chinese participants’ accuracy in rejecting /t/-/ɾ/ nonwords was higher than for /l/-/ɾ/ nonwords but still lower than that of the other two groups, indicating an imprecise, or fuzzy, phonolexical representation of Spanish /ɾ/. Moreover, Chinese L1 speakers showed faster lexical retrieval for the /l/-/ɾ/ contrast, as shown in the raw RT data presented in Figure 5, despite lower accuracy, indicating its fuzzy encoding and thus overconfidence in lexical decision.

Contrary to expectations based on phonetic similarities, the inhibitory influence of these similarities was not evident among the AE L1 speakers. They performed equally well in rejecting nonwords for both phonemic pairs, suggesting robust representations of all three L2 phonemes and thus no negative influence from the allophonic representations of [t] and [ɾ] in AE on their Spanish phonolexical representations. Instead, the existence of [ɾ] in L1 may have facilitated the establishment of the /ɾ/ phoneme in their L2.

Relationship between phonemic discrimination and lexical decision accuracy

To examine how phonetic sensitivity relates to lexical processing, we analyzed the predictive relationship between OT performance (d’) and LDT accuracy across language groups and phonological contrasts. The results revealed different patterns in learners’ reliance on phonemic discrimination for their lexical processing, dependent on how the L2 phonemic distinctions map onto their L1 phonological systems.

First, when L2 phonemes are well established as distinct categories, as with /ɾ/-/l/ for AE speakers, who maintain this distinction in their L1, learners demonstrate efficient lexical processing based on automatic phonemic discrimination. The absence of OT–LDT relationships in this case suggests that well-established phonemic categories enable effortless lexical access.

Second, at the opposite extreme, when L2 phonemes are perceptually assimilated into a single category, as with /ɾ/-/l/ for Chinese speakers, both phonemic discrimination and lexical encoding suffer from floor effects. The lack of OT–LDT relationships here reflects a different mechanism: Without reliable phonemic distinctions, learners cannot leverage perceptual sensitivity to support lexical processing. This pattern illustrates how perceptual assimilation creates persistent difficulties at both phonetic and lexical levels.

Third, an intermediate scenario emerges when L2 phonemes remain discriminable but challenging, as with /ɾ/-/t/ for Chinese speakers. Here, significant OT–LDT relationships indicate that perceptual sensitivity to the two sounds impacts lexical encoding. This pattern suggests that when phonemic distinctions are unstable but achievable, bottom-up perceptual processing becomes an important foundation for accurate lexical representation.

Lastly, a more complex pattern emerged for the allophonic split scenario. AE L1 speakers demonstrated an asymmetrical OT–LDT relationship, significant only for the /t/➔/ɾ/ direction. Given that [t] and [ɾ] function as positional allophones of /t/ and that [t] is the main allophone, while [ɾ] commonly replaces [t] in intervocalic positions in AE, this asymmetry suggests that detecting the incorrect appearance of /ɾ/ that replaces /t/ requires active perceptual monitoring, while the inverse substitution is easier to detect.

General discussion

We conducted two experiments to investigate the phonetic perception and phonological and phonolexical encoding of a Spanish-specific phoneme /ɾ/, in contrast with /t/ and /l/, by intermediate to advanced Chinese and American late learners of Spanish. We hypothesized that L1 positional allophones are not immediately associated with L2 phonemes and that an allophonic split may not always hinder L2 phonemic categorization. Moreover, the similarity between the phonetic properties of L1 allophones and L2 phonemes may benefit L2 learners. Experiment 1 employed an auditory OT to investigate Spanish learners’ ability to categorize and discriminate minimally contrastive Spanish disyllabic nonword triads. Experiment 2 used an auditory LDT to examine word recognition and phonolexical encoding, focusing on the effect of L1 phonology on rejecting nonwords differing from real words by one phoneme. A further analysis combining the results from both experiments showed contrast-specific predictive relationships between phonetic discrimination and phonolexical encoding, which varied by language background and phonological pairing. The findings revealed that (1) Chinese L1 speakers showed very low accuracy in the OT, whereas AE L1 speakers exhibited nativelike phonetic categorization, and both Spanish learner groups differed from L1 speakers in the robustness of their phonolexical representations. Both AE and Chinese L1 speakers showed significantly lower accuracy than Spanish L1 speakers in the LDT, reflecting typical L1 speaker–learner differences in SLA literature (see Gor et al., Reference Gor, Cook, Bordag, Chrabaszcz and Opitz2021). Also, (2) in both tasks, Chinese participants demonstrated low accuracy rates for both contrasts, with significantly lower accuracy on /ɾ/-/l/ than /ɾ/-/t/, suggesting a strong influence of phonetic similarity on L2 phonetic perception. 
Conversely, (3) American participants performed similarly to Spanish L1 speakers in phonetic perception and showed no significant differences in the robustness of their phonolexical representations, although the /ɾ/-/t/ pair is not phonemically contrastive in AE. Moreover, (4) Chinese L1 speakers demonstrated a significant OT–LDT relationship for the /ɾ/-/t/ contrast but not for /ɾ/-/l/, likely suggesting that perceptual discrimination must reach a functional level before it can support lexical processing. In contrast, (5) AE L1 speakers showed no relationship between perceptual sensitivity and lexical decisions for the /l/-/ɾ/ contrast, suggesting sufficiently distinct phonemic representations that allow lexical processing to operate independently of lower-level phonetic discrimination. However, an asymmetry emerged for the allophonic pair, as their OT performance significantly predicted lexical decision accuracy only in the /t/➔/ɾ/ direction. These findings shed light on the “fuzzy” nature of learners’ phonolexical representations as proposed in the FLRH, showing that learners systematically fail to represent novel contrasts in words in a nativelike manner, even at higher proficiency levels (Gor et al., Reference Gor, Cook, Bordag, Chrabaszcz and Opitz2021).

The difficulty Chinese L1 speakers faced with the two contrasts was expected, as numerous studies have shown that establishing a novel phoneme category in L2 is challenging, especially for adult learners (Guion et al., Reference Guion, Flege, Akahane-Yamada and Pruitt2000; Pallier et al., Reference Pallier, Colomé and Sebastián-Gallés2001). The difficulty of differentiating L2 phonemes depends on how acoustically similar they are to each other, with the greatest challenges arising when one L2 phoneme has no L1 counterpart. /ɾ/, a novel phoneme for Chinese L1 speakers, presented significant difficulty at both phonetic and phonolexical levels. As /ɾ/ is phonetically closer to /l/ than to /t/, Chinese learners experienced more difficulty discriminating /ɾ/-/l/ than /ɾ/-/t/ in the OT. Their representations of /ɾ/-/l/ at phonological and phonolexical levels were much less robust, leading to near-chance performance in rejecting /ɾ/-/l/ nonwords. This pattern fits with the PAM (Best, Reference Best and Strange1995), as /ɾ/ and /l/ are likely assimilated to a single L1 category due to phonetic similarities. Even at intermediate to advanced proficiency levels, when an L2 phoneme is assimilated to a similar-sounding L1 phoneme, the two categories are likely to be represented as one at phonetic, phonological, and phonolexical levels for a long time.

AE learners of Spanish exhibited nativelike phonetic perception and robust phonolexical encoding for words with both contrasts, supporting the prediction that L1 allophony does not necessarily affect L2 acquisition and perception. AE L1 speakers successfully acquired the Spanish /ɾ/ as a new phonemic category separate from /t/, despite [ɾ] and [t] being allophones of /t/ in AE, and no evidence suggests this L1 allophonic relationship negatively impacted their discrimination of the /ɾ/-/t/ contrast in Spanish phonetic perception or lexical representations. Their performance suggests that they have established relatively robust representations for all three critical phonemes in their L2 Spanish phonological system. This result differs from findings that allophonic pairs in L1 are harder to distinguish than phonemically contrastive pairs (Boomershine et al., Reference Boomershine, Hall, Hume, Johnson, Avery, Dresher and Rice2008; Kazanina et al., Reference Kazanina, Phillips and Idsardi2006). Our finding suggests that an allophonic pair in L1 represented by two separate L2 phonemes does not necessarily confuse L2 learners. This absence of confusion could be attributed to three factors. First, a substantial phonetic distance between [ɾ] and [t] may have facilitated AE learners’ ability to establish them as separate phonemic categories in Spanish despite their allophonic relationship in the L1. Second, the allophone [ɾ] in the AE phonological system may have facilitated the acquisition of the Spanish phoneme /ɾ/, as it represents a familiar phonetic category. The extra complication caused by the cross-language mismatch between its orthographic representations has also been overcome. Specifically, the [ɾ] in AE appears in words spelled with “t” (as in “water”), but in Spanish, the same sound corresponds to the letter “r” (as in “pero”). 
This orthographic mismatch means that AE learners must reassign the familiar [ɾ] sound from “t” to “r” when reading Spanish, potentially creating an additional layer of difficulty in establishing the phoneme–grapheme correspondence (e.g., Hayes-Harb & Barrios, Reference Hayes-Harb and Barrios2021). The fact that our participants showed little evidence of confusion suggests that they have successfully reconfigured these mismatches for the L2. Third, at higher proficiency levels, learners may have restructured their phonological representations to accommodate L2-specific contrasts, aligning with the SLM (Flege, Reference Flege and Strange1995). While the strong L1 influence on Chinese learners’ performance might question this reasoning, differences in linguistic experience and knowledge of Spanish among learners should be considered. Chinese participants studied Spanish as a foreign language in classrooms with limited natural exposure, while American participants had richer, multidimensional learning experiences through access to Spanish-speaking communities and pop culture in the United States. Additionally, the greater typological distance between Chinese and Spanish compared to AE adds an extra phonological learning burden for Chinese L1 learners.

Although AE learners of Spanish exhibited nativelike phonetic categorization, they differed from Spanish L1 speakers in their phonolexical encoding, consistent with previous findings (Amengual, Reference Amengual2016; Barrios et al., Reference Barrios, Namyst, Lau, Feldman and Idsardi2016b). While AE L1 speakers showed robust performance overall, an asymmetry emerged for the /t/-/ɾ/ contrasts: OT performance predicted lexical decision accuracy only in the /t/➔/ɾ/ direction, suggesting that L1 allophonic knowledge creates specific processing demands when learners must evaluate whether [ɾ] can replace /t/ in Spanish—a substitution that would be natural in English intervocalic contexts but may create nonwords in Spanish. This asymmetrical pattern reveals how L1 phonological knowledge continues to shape L2 processing even at intermediate or advanced proficiency levels.

The discrepancy between phonetic categorization and phonolexical encoding could be partially explained by the DMAP theory (Darcy et al., Reference Darcy, Dekydtspotter, Sprouse, Glover, Kaden, McGuire and Scott2012) and the FLRH (Gor et al., Reference Gor, Cook, Bordag, Chrabaszcz and Opitz2021). The DMAP theory proposes a dissociation between L2 phonetic categorization and phonolexical representations, potentially explaining American participants’ well-established phonemic categories but less robust phonolexical encoding. However, it cannot fully explain the contrast-specific and directionally asymmetric relationships observed in our data, particularly why Chinese L1 speakers showed OT–LDT relationships for /ɾ/-/t/ but not /ɾ/-/l/ contrasts, or why AE L1 speakers showed this relationship only for /t/➔/ɾ/ but not the reverse.

More comprehensively, the FLRH proposes that phonetic perception is foundational to phonolexical representations. At the same time, while phonetic perception significantly influences phonolexical representations for difficult L2 phonological contrasts, the robustness of phonolexical encoding for less difficult L2 contrasts depends more on individual words; that is, it is largely influenced by lexical properties such as neighborhood density, word familiarity, and frequency (Gor et al., Reference Gor, Cook, Bordag, Chrabaszcz and Opitz2021). This is supported by the different predictive powers of OT performance between Chinese and AE L1 speakers and the presence of phonolexical encoding struggles for Chinese L1 speakers and their absence for AE L1 speakers when controlling for phonetic sensitivity. Moreover, word frequency was a significant factor for AE L1 speakers, but not Chinese L1 speakers, further supporting the FLRH by showing that factors beyond phonetic sensitivity play a larger role in shaping phonolexical representations and lexical retrieval.

Taken together, these findings suggest that the role of L1 phonology in L2 acquisition differs based on specific scenarios. When an L2 phoneme does not exist in a learner’s L1 even as a positional allophone, its similarity with another L2 phoneme along articulatory-acoustic dimensions can lead to their conflation into a single L2 category, resulting in persistent categorization difficulties and weak, or “fuzzy,” phonolexical representations. Conversely, when an L2 phoneme can be categorized as a new realization of a familiar L1 sound, it does not necessarily impede the acquisition of this L2 phoneme, especially at higher proficiency levels. Its existence, even as an allophone in L1, may facilitate the establishment of its phonemic category in the L2. These findings emphasize the need for theoretical models to incorporate more nuanced considerations of how specific phonetic properties drive learners’ perceptual assimilation patterns and phonolexical encoding. They also underscore the importance of examining how L2 phonemes are represented in actual words within the mental lexicon, going beyond whether learners can discriminate them in isolation or nonword contexts. While phonetic discrimination tasks using nonwords can show perceptual abilities at the prelexical level, examining phonolexical representations reveals whether these distinctions are robustly encoded in stored words.

Conclusion and future directions

This study provides a novel assessment of how an allophonic split influences L2 speakers’ spoken word recognition and compares the effects of an allophonic split and phonetic similarities and differences between L2 phonemes. Our findings suggest that intermediate–advanced American learners of Spanish can overcome the familiar categorization of L1 positional allophones and acquire a new target-language contrast between familiar phones [ɾ] and [t] at the phonetic level. Although not as robust as among Spanish L1 speakers, American learners’ representations of the /ɾ/ and /t/ contrast are well established at the phonolexical level, indicating that an allophonic relationship between two L1 phones does not necessarily hinder learners from categorizing them as separate phonemes in the L2. However, we acknowledge that our interpretation is limited by the absence of a baseline condition comparing L2 learners’ performance on contrasts that exist as separate phonemes in both L1 and L2. Without such a comparison, we cannot definitively determine whether the observed L1–L2 differences reflect general L2 lexical processing difficulties or specific challenges related to the allophonic split.

Future studies focusing on beginner learners may provide a more comprehensive understanding of the evolution of this influence throughout an L2 speaker’s learning trajectory. Additionally, future research should include baseline contrasts (e.g., /s/-/t/ or /n/-/t/) that are phonemically distinct in both L1 and L2 to better isolate the specific effects of allophonic relationships on L2 phonolexical encoding. Moreover, future research could systematically investigate how varying degrees of phonetic distance between L1 allophones affect the acquisition of L2 phonemic contrasts, extending beyond the substantial articulatory differences between /ɾ/ and /t/ examined in this study to cases where the phonetic distinction might be more subtle. Finally, persistent discrimination difficulties can arise when two L2 sounds are perceptually similar to a single L1 category, as observed in Chinese speakers’ performance on the /ɾ/-/l/ pair, leading to fuzzy phonolexical representations of words with these phonemes. We encourage further research on the acquisition of L2 phonological contrasts, given the particular phonetic and phonological mappings across L1 and L2.

Replication package

All data, research materials, and analysis code are available at https://osf.io/v6esc.

Acknowledgements

We are grateful to the anonymous reviewers and the Associate Editor, Dr. Ethan Kutlu, for their thoughtful and constructive feedback throughout the review process, which greatly strengthened this manuscript. We also wish to thank Mireia Toda Cosi for her invaluable help with Spanish material development, Drs. Hua Yuan and Yan Li for their help in recruiting Chinese L1 participants, and Dr. Nan Zhang for her guidance on data analysis.

Competing interests

We have no known conflict of interest to disclose.

Footnotes

1 In this paper, we use the terms “phonetic,” “phonemic,” and “phonolexical” as follows. Phonetic processing refers to the ability of listeners to perceive sound segments, and it can be inaccurate for nonnative speech sounds. Following standard notation in linguistics, phonetic transcriptions are enclosed in square brackets (e.g., [ɾ]). Phonemic processing, on the other hand, occurs only within the context of words and is accessible solely to speakers of the language. It involves the abstract mental categories that distinguish word meanings in a specific language, and the transcriptions of phonemes are denoted with slashes (e.g., /ɾ/). For example, [t] and [ɾ] are phonetically distinct sounds that function as variants (allophones) of the same phoneme /t/ in American English, but as separate /t/ and /ɾ/ in Spanish. Phonolexical encoding refers to the phonological encoding of words as sequences of phonemes. This level of processing enables speakers to store words in the mental lexicon and retrieve them during speech comprehension.

2 To examine whether Chinese speakers’ L2 English phonology influences their third-language Spanish phonological representation, we conducted two norming transcription studies with different participant groups. Participants transcribed sound sequences, 28 based on English and 11 based on Spanish phonology, containing intervocalic /ɾ/ using the Latin alphabet. Chinese speakers predominantly perceived /ɾ/ as “t” (59% of all trials) in English, but as “l” (63% of all trials) in Spanish, suggesting their perception of Spanish sounds was not significantly influenced by their knowledge of English.

3 Due to missing data from three participants who did not complete the language background questionnaire (one English L1 speaker and two Chinese L1 speakers), their AoA information was not available. For the purpose of data analysis, the missing AoA values were replaced with the rounded mean AoA of the remaining participants in each language group, which was 12 for the American participant and 18 for Chinese participants.
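The group-wise replacement of missing AoA values described in this footnote can be sketched as follows. This is a hedged illustration, not the authors' script: the function name `impute_rounded_group_mean`, the record layout, the field names `group` and `aoa`, and the round-half-up rule are all assumptions.

```python
import math
from collections import defaultdict


def impute_rounded_group_mean(records, group_key="group", value_key="aoa"):
    """Return copies of the records with missing values (None) replaced by
    the rounded mean of the observed values in the same group.

    Assumes every group has at least one observed value.
    """
    observed = defaultdict(list)
    for rec in records:
        if rec[value_key] is not None:
            observed[rec[group_key]].append(rec[value_key])

    # Round half up; Python's built-in round() rounds halves to even.
    fill = {
        group: math.floor(sum(values) / len(values) + 0.5)
        for group, values in observed.items()
    }

    result = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input is left unchanged
        if new_rec[value_key] is None:
            new_rec[value_key] = fill[new_rec[group_key]]
        result.append(new_rec)
    return result


# Hypothetical participants; None marks a missing questionnaire value.
participants = [
    {"id": 1, "group": "AE", "aoa": 10},
    {"id": 2, "group": "AE", "aoa": 13},
    {"id": 3, "group": "AE", "aoa": None},
]
print(impute_rounded_group_mean(participants))
```

Computing the replacement value within each language group, rather than over the whole sample, keeps group differences in AoA from leaking into the imputed values.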

4 Originally, we included the /d-ɾ/ contrast as one of the critical pairs in both experiments. However, due to pronunciation inconsistency discovered after data collection, we excluded this contrast from the data analysis and introduce it here as a difficult contrast.

5 See OT.r for all model comparisons.

6 The slight imbalance in the number of pairs between the two contrasts reflects the availability of suitable word pairs in Spanish that met our criteria for familiarity, word frequency, and phonological structure. These 78 word pairs represent the complete set of usable stimuli we could identify that satisfied our stringent selection criteria.

7 Originally, we included the /d-ɾ/ contrast with 20 words and 20 nonwords in each counterbalanced list. However, due to pronunciation inconsistency discovered after data collection, we excluded these items from the data analysis. They were not counted when establishing the cutoff accuracy scores for filler items.

8 Based on the words used to create the critical nonword.

9 As kindly noted by a reviewer, recent studies (Llompart, Reference Llompart2021; Rocca et al., Reference Rocca, Llompart and Darcy2025) have reported contrasting findings regarding neighborhood density effects. While a comprehensive examination of this discrepancy is beyond the scope of the current study, interested readers are directed to these works for further discussion of neighborhood density effects in L2 phonological processing.

10 To ensure our findings were robust for late Spanish learners, we analyzed a subset of participants with an AoA over 12 years, mirroring our approach in Experiment 1. The main results remained consistent across both analyses. The only notable difference was the loss of the asymmetry effect for the /l/-/ɾ/ contrasts among the AE-L1 learner group (β = −.49, p = .11). However, given the similar coefficient magnitude and reduced sample size (N = 15), this difference likely reflects diminished statistical power rather than a substantive change. Full details are available in “LDT.r” and “Statistical Model Selection and Complete Results” in our OSF repository.

11 Following a reviewer’s suggestion, we conducted additional Pearson’s correlational analyses between LDT performance and the d-prime scores from the OT. Complete details, including code and results, are available in our OSF repository (“LDT.R” in the “Analysis” folder and “Between-Task Correlational Analyses” in the “Other docs” folder).
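For readers unfamiliar with the statistic, a Pearson correlation between per-participant d′ scores and LDT accuracy can be computed as in the minimal Python sketch below. The function name `pearson_r` and the example values are hypothetical and are not taken from the authors' repository.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-participant scores: OT d' and LDT proportion correct.
ot_d_prime = [0.4, 1.1, 1.8, 2.5, 3.0]
ldt_accuracy = [0.52, 0.61, 0.70, 0.74, 0.85]
print(pearson_r(ot_d_prime, ldt_accuracy))
```

A value near 1 would indicate that participants with higher perceptual sensitivity also reject nonwords more accurately, a value near 0 would indicate no linear relationship, and a negative value would indicate the opposite pattern.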

12 In Spanish, /t/ is denti-alveolar (Martínez-Celdrán et al., Reference Martínez-Celdrán, Fernández-Planas and Carrera-Sabaté2003) rather than purely dental, meaning that it is articulated with the tongue against the alveolar ridge and the upper teeth at the same time.

13 Handedness was included in the exploratory models for American participants due to the relatively high percentage of left-handed participants in this group. It was not a significant contributor to participants’ performance.

References

Amengual, M. (2016). The perception of language-specific phonetic categories does not guarantee accurate phonological representations in the lexicon of early bilinguals. Applied Psycholinguistics, 37(5), 1221–1251. https://doi.org/10.1017/S0142716415000557
Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2020). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52(1), 388–407. https://doi.org/10.3758/s13428-019-01237-x
Aoyama, K., Flege, J. E., Guion, S. G., Akahane-Yamada, R., & Yamada, T. (2004). Perceived phonetic dissimilarity and L2 speech learning: The case of Japanese /r/ and English /l/ and /r/. Journal of Phonetics, 32(2), 233–250. https://doi.org/10.1016/S0095-4470(03)00036-6
Bak, T. H., Vega-Mendoza, M., & Sorace, A. (2014). Never too late? An advantage on tests of auditory attention extends to late bilinguals. Frontiers in Psychology, 5, 485. https://doi.org/10.3389/fpsyg.2014.00485
Barrios, S., & Hayes-Harb, R. (2021). L2 processing of words containing English /æ/-/ɛ/ and /l/-/ɹ/ contrasts, and the uses and limits of the auditory lexical decision task for understanding the locus of difficulty. Frontiers in Communication, 6, 689470. https://doi.org/10.3389/fcomm.2021.689470
Barrios, S., Jiang, N., & Idsardi, W. J. (2016a). Similarity in L2 phonology: Evidence from L1 Spanish late-learners’ perception and lexical representation of English vowel contrasts. Second Language Research, 32(3), 367–395. https://doi.org/10.1177/0267658316630784
Barrios, S. L., Namyst, A. M., Lau, E. F., Feldman, N. H., & Idsardi, W. J. (2016b). Establishing new mappings between familiar phones: Neural and behavioral evidence for early automatic processing of nonnative contrasts. Frontiers in Psychology, 7, 995. https://doi.org/10.3389/fpsyg.2016.00995
Barrios, S. L., Rodriguez, J. M., & Barriuso, T. A. (2023). The acquisition of L2 allophonic variants: The role of phonological distribution and lexical cues. Second Language Research, 39(3), 899–924. https://doi.org/10.1177/02676583221099237
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1). https://doi.org/10.18637/jss.v067.i01
Best, C. T. (1995). A direct realist view of cross-language speech perception. In Strange, W. (Ed.), Speech perception and linguistic experience: issues in cross-language research (pp. 345–360). York Press.
Best, C. T., Goodman, J. C., & Nusbaum, H. C. (1994). The emergence of native-language phonological influences in infants: A perceptual assimilation model. In The development of speech perception: the transition from speech sounds to spoken words (pp. 167–224). The MIT Press. https://direct.mit.edu/books/book/4743/The-Development-of-Speech-PerceptionThe-Transition
Best, C. T., & Tyler, M. D. (2007). Nonnative and second-language speech perception: Commonalities and complementarities. In Bohn, O.-S. & Munro, M. J. (Eds.), Language learning & language teaching (Vol. 17, pp. 13–34). John Benjamins Publishing Company. https://doi.org/10.1075/lllt.17.07bes
Boomershine, A., Hall, K. C., Hume, E., & Johnson, K. (2008). The impact of allophony versus contrast on speech perception. In Avery, P., Dresher, B. E., & Rice, K. (Eds.), Phonology and Phonetics [PP]. Mouton de Gruyter. https://doi.org/10.1515/9783110208603.2.145
Bosch, L., Costa, A., & Sebastián-Gallés, N. (2000). First and second language vowel perception in early bilinguals. European Journal of Cognitive Psychology, 12(2), 189–221. https://doi.org/10.1080/09541446.2000.10590222
Bramlett, A. A., & Wiener, S. (2024). The art of wrangling: Working with web-based visual world paradigm eye-tracking data in language research. Linguistic Approaches to Bilingualism, 15(4), 538–570. https://doi.org/10.1075/lab.23071.bra
Brehm, L., & Alday, P. M. (2022). Contrast coding choices in a decade of mixed models. Journal of Memory and Language, 125, 104334. https://doi.org/10.1016/j.jml.2022.104334
Broersma, M., & Cutler, A. (2008). Phantom word activation in L2. System, 36(1), 22–34. https://doi.org/10.1016/j.system.2007.11.003
Brysbaert, M., Stevens, M., Mandera, P., & Keuleers, E. (2016). The impact of word prevalence on lexical decision times: Evidence from the Dutch Lexicon Project 2. Journal of Experimental Psychology: Human Perception and Performance, 42(3), 441–458. https://doi.org/10.1037/xhp0000159
Chih, M. T.-C. (2013). E/LE en Taiwán: Problemas de apreciación fonética en estudiantes universitarios de grado. SinoELE, 9, 17–32.
Daidone, D., & Darcy, I. (2021). Vocabulary size is a key factor in predicting second language lexical encoding accuracy. Frontiers in Psychology, 12, 688356. https://doi.org/10.3389/fpsyg.2021.688356
Darcy, I., Daidone, D., & Kojima, C. (2013). Asymmetric lexical access and fuzzy lexical representations in second language learners. The Mental Lexicon, 8(3), 372–420. https://doi.org/10.1075/ml.8.3.06dar
Darcy, I., Dekydtspotter, L., Sprouse, R. A., Glover, J., Kaden, C., McGuire, M., & Scott, J. H. (2012). Direct mapping of acoustics to phonology: On the lexical encoding of front rounded vowels in L1 English-L2 French acquisition. Second Language Research, 28(1), 5–40. https://doi.org/10.1177/0267658311423455
Darcy, I., Llompart, M., Hayes-Harb, R., Mora, J. C., Adrian, M., Cook, S., & Ernestus, M. (2025). Phonological processing and the L2 mental lexicon: Looking back and moving forward. Studies in Second Language Acquisition, 47(1), 361–387. https://doi.org/10.1017/S0272263124000482
Dastgerdi, Z. H., Seifi, H., & Vahabi, M. (2023). The effect of second language acquisition age (AoA) on auditory processing skills. Indian Journal of Otolaryngology and Head & Neck Surgery, 75(4), 3221–3227. https://doi.org/10.1007/s12070-023-03978-w
Duanmu, S. (2007). The phonology of standard Chinese (2nd ed). Oxford University Press. https://doi.org/10.1093/oso/9780199215782.001.0001
Eckman, F., & Iverson, G. K. (2013). The role of native language phonology in the production of l2 contrasts. Studies in Second Language Acquisition, 35(1), 67–92. https://doi.org/10.1017/S027226311200068X
Eckman, F., Iverson, G. K., & Elreyes, A. (2001). Allophonic splits in L2 phonology: The question of learnability. International Journal of English Studies, 9(2), 1–40.
Eckman, F., Iverson, G. K., Fox, R. A., Jacewicz, E., & Lee, S. A. (2009). Recent research in second language phonetics/phonology: Perception and production. In Watkins, M. A., Rauber, A. S., & Baptista, B. O. (Eds.), Recent research in second language phonetics/phonology: Perception and production (First edition). Cambridge Scholars Publishing.
Escudero, P. (2005). Linguistic perception and second language acquisition: Explaining the attainment of optimal phonological categorization. LOT.
Flege, J. E. (1995). Second language speech learning: Theory, findings and problems. In Strange, W. (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 233–276). York Press.
Flege, J. E., & Bohn, O.-S. (2021). The Revised Speech Learning Model (SLM-r). In Wayland, R. (Ed.), Second language speech learning (1st ed., pp. 3–83). Cambridge University Press. https://doi.org/10.1017/9781108886901.002
Gor, K., Cook, S., Bordag, D., Chrabaszcz, A., & Opitz, A. (2021). Fuzzy lexical representations in adult second language speakers. Frontiers in Psychology, 12, 732030. https://doi.org/10.3389/fpsyg.2021.732030
Guion, S. G., Flege, J. E., Akahane-Yamada, R., & Pruitt, J. C. (2000). An investigation of current models of second language speech perception: The case of Japanese adults’ perception of English consonants. The Journal of the Acoustical Society of America, 107(5), 2711–2724. https://doi.org/10.1121/1.428657
Hautus, M. J. (1995). Corrections for extreme proportions and their biasing effects on estimated values of d’. Behavior Research Methods, Instruments, & Computers, 27(1), 46–51. https://doi.org/10.3758/BF03203619
Hayes, B. (2010). Introductory phonology (Nachdr.). Wiley-Blackwell.
Hayes-Harb, R., & Barrios, S. (2021). The influence of orthography in second language phonological acquisition. Language Teaching, 54(3), 297–326. https://doi.org/10.1017/S0261444820000658
Imai, S., Walley, A. C., & Flege, J. E. (2005). Lexical frequency and neighborhood density effects on the recognition of native and Spanish-accented words by native English and Spanish listeners. The Journal of the Acoustical Society of America, 117(2), 896–907. https://doi.org/10.1121/1.1823291
Izura, C., Cuetos, F., & Brysbaert, M. (2016). Lexical Test for Advanced Learners of Spanish [Dataset]. American Psychological Association. https://doi.org/10.1037/t47086-000
Jiang, N. (2012). Conducting reaction time research in second language studies. Routledge.
Kazanina, N., Phillips, C., & Idsardi, W. J. (2006). The influence of meaning on the perception of speech sounds. Proceedings of the National Academy of Sciences of the United States of America, 103(30), 11381–11386. https://doi.org/10.1073/pnas.0604821103
Kkese, E., & Karpava, S. (2019). Applying the Native Language Magnet Theory to an L2 setting: Insights into the CGR adult perception of L2 English. In Babatsouli, E., (Ed.), Proceedings of the international symposium on monolingual and bilingual speech 2019 (pp. 67–74). Institute of Monolingual and Bilingual Speech. http://ismbs.eu/publications-2019
Kohler, K. J. (1981). Contrastive phonology and the acquisition of phonetic skills. Phonetica, 38(4), 213–226. https://doi.org/10.1159/000260025
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
Lado, R. (1957). Linguistics across cultures: applied linguistics for language teachers. University of Michigan press.
Lenth, R. V. (2025). emmeans: Estimated Marginal Means, aka Least-Squares Means (Version 1.11.1-00001) [R].
Llompart, M. (2021). Lexical and phonetic influences on the phonolexical encoding of difficult second-language contrasts: Insights from nonword rejection. Frontiers in Psychology, 12, 659852. https://doi.org/10.3389/fpsyg.2021.659852
Llompart, M., & Reinisch, E. (2019). Robustness of phonolexical representations relates to phonetic flexibility for difficult second language sound contrasts. Bilingualism: Language and Cognition, 22(5), 1085–1100. https://doi.org/10.1017/S1366728918000925
Lopez Velarde, M. (2020). Effects of native phonology on spoken word recognition and second language phonological processing. ProQuest Dissertations and Theses. https://www.proquest.com/docview/2434486312?accountid=14696&bdid=48052&_bd=jsccXa9wA3ggEIP36I23tirgaYs%3D
Luce, P. A., Goldinger, S. D., Auer, E. T., & Vitevitch, M. S. (2000). Phonetic priming, neighborhood activation, and PARSYN. Perception & Psychophysics, 62(3), 615625. https://doi.org/10.3758/BF03212113 CrossRefGoogle ScholarPubMed
Marian, V., Bartolotti, J., Chabal, S., & Shook, A. (2012). CLEARPOND: cross-linguistic easy-access resource for phonological and orthographic neighborhood densities. PLoS ONE, 7(8), e43230. https://doi.org/10.1371/journal.pone.0043230 CrossRefGoogle ScholarPubMed
Marian, V., Blumenfeld, H. K., & Kaushanskaya, M. (2007). The language experience and proficiency questionnaire (leap-q): Assessing language profiles in Bilinguals and Multilinguals. Journal of Speech, Language, and Hearing Research, 50(4), 940967. https://doi.org/10.1044/1092-4388(2007/067)CrossRefGoogle ScholarPubMed
Martínez-Celdrán, E., Fernández-Planas, A. M., & Carrera-Sabaté, J. (2003). Castilian Spanish. Journal of the International Phonetic Association, 33(2), 255–259. https://doi.org/10.1017/S0025100303001373
Mitterer, H., & Mattys, S. L. (2017). How does cognitive load influence speech perception? An encoding hypothesis. Attention, Perception, & Psychophysics, 79(1), 344–351. https://doi.org/10.3758/s13414-016-1195-3
Mitterer, H., Reinisch, E., & McQueen, J. M. (2018). Allophones, not phonemes in spoken-word recognition. Journal of Memory and Language, 98, 77–92. https://doi.org/10.1016/j.jml.2017.09.005
Monsell, S., Doyle, M. C., & Haggard, P. N. (1989). Effects of frequency on visual word recognition tasks: Where are they? Journal of Experimental Psychology: General, 118(1), 43–71. https://doi.org/10.1037/0096-3445.118.1.43
Ortí Mateu, R. (1990). Comparación fonética, diagnóstico y tratamiento de las dificultades de los estudiantes chinos para aprender español [Phonetic comparison, assessment, and treatment of Chinese students’ difficulties learning Spanish]. Doctoral dissertation, University of the Philippines.
Ota, M., Hartsuiker, R. J., & Haywood, S. L. (2009). The KEY to the ROCK: Near-homophony in nonnative visual word recognition. Cognition, 111(2), 263–269. https://doi.org/10.1016/j.cognition.2008.12.007
Pallier, C., Bosch, L., & Sebastián-Gallés, N. (1997). A limit on behavioral plasticity in speech perception. Cognition, 64(3), B9–B17. https://doi.org/10.1016/S0010-0277(97)00030-9
Pallier, C., Colomé, A., & Sebastián-Gallés, N. (2001). The influence of native-language phonology on lexical access: Exemplar-based versus abstract lexical entries. Psychological Science, 12(6), 445–449. https://doi.org/10.1111/1467-9280.00383
Patience, M. (2018). Acquisition of the Tap-Trill Contrast by L1 Mandarin–L2 English–L3 Spanish Speakers. Languages, 3(4), 42. https://doi.org/10.3390/languages3040042
R Core Team. (2021). R: A language and environment for statistical computing [Computer software]. R Foundation for Statistical Computing. https://www.R-project.org/
Rhodes, R., Avcu, E., Han, C., & Hestvik, A. (2022). Auditory predictions are phonological when phonetic information is variable. Language, Cognition and Neuroscience, 1–16. https://doi.org/10.1080/23273798.2022.2043395
Rocca, B., Llompart, M., & Darcy, I. (2025). Phonological neighborhood density, phonetic categorization, and vocabulary size differentially affect the phonolexical encoding of easy and difficult L2 segmental contrasts. Bilingualism: Language and Cognition, 28(3), 662–675. https://doi.org/10.1017/S1366728924000865
Shea, C. E., & Curtin, S. (2010). Discovering the relationship between context and allophones in a second language: Evidence for distribution-based learning. Studies in Second Language Acquisition, 32(4), 581–606. https://doi.org/10.1017/S0272263110000276
Shea, C. E., & Curtin, S. (2011). Experience, representations and the production of second language allophones. Second Language Research, 27(2), 229–250. https://doi.org/10.1177/0267658310375753
Shoemaker, E. (2014). The exploitation of subphonemic acoustic detail in L2 speech segmentation. Studies in Second Language Acquisition, 36(4), 709–731. https://doi.org/10.1017/S027226311400014X
Stanislaw, H., & Todorov, N. (1999). Calculation of signal detection theory measures. Behavior Research Methods, Instruments, & Computers, 31(1), 137–149. https://doi.org/10.3758/BF03207704
van Leussen, J.-W., & Escudero, P. (2015). Learning to perceive and recognize a second language: The L2LP model revised. Frontiers in Psychology, 6, 1000. https://doi.org/10.3389/fpsyg.2015.01000
Vandergrift, L. (2006). Second language listening: Listening ability or language proficiency? The Modern Language Journal, 90(1), 6–18. https://doi.org/10.1111/j.1540-4781.2006.00381.x
Weber, A., & Cutler, A. (2004). Lexical competition in non-native spoken-word recognition. Journal of Memory and Language, 50(1), 1–25. https://doi.org/10.1016/S0749-596X(03)00105-0
Whalen, D. H., Best, C. T., & Irwin, J. (1997). Lexical effects in the perception and production of American English /p/ allophones. Journal of Phonetics, 25(4), 501–528. https://doi.org/10.1006/jpho.1997.0058
Zheng, Q., & Gor, K. (2024). The influence of native phonology, allophony, and phonotactics on nonnative lexical encoding: A vocabulary training study. Language Learning, 74(1), 146–183. https://doi.org/10.1111/lang.12581

Table 1. Articulatory features of [t], [l], [ɾ] in Spanish

Table 2. /t/, /l/, /ɾ/ in Spanish

Table 3. /t/, /l/, /ɾ/ in AE

Table 4. /t/, /l/, /ɾ/ in Chinese

Table 5. Background information for participants

Table 6. Mean accuracy rate and RT information for the three L1 groups in OT

Figure 1. Accuracy rates by contrast type, language group, and trial type (odd or same trials). Note: Bars show proportion correct for /l/-/ɾ/ (black) and /t/-/ɾ/ (gray) contrasts across AE, Spanish, and Chinese L1 speakers. Error bars represent one standard error of the mean for each contrast for each group.

Figure 2. d' values for the oddity task across language groups (Spanish L1 speakers, AE L1 speakers, and Chinese L1 speakers) and phonemic contrasts (/l/-/ɾ/ vs. /t/-/ɾ/).

Figure 3. Accuracy rates for nonwords and words by contrast type and language group. Note: Bars show proportion correct for /t/-/ɾ/ (gray) and /l/-/ɾ/ (blue) contrasts across Spanish, American English (AE), and Chinese L1 speakers. Error bars represent one standard error of the mean for each contrast for each group.

Figure 4. Accuracy rates for the four nonword creation directions among the three language groups (Spanish, AE, and Chinese L1 speakers). Note: Bars show the proportion of correct responses for the four nonword creation directions across the three participant groups. Error bars represent one standard error of the mean. For each L1 speaker group, left to right: /t/→/ɾ/, /ɾ/→/t/, /l/→/ɾ/, /ɾ/→/l/.

Figure 5. Reaction times by language group, contrast, and response accuracy. Note: Bars show mean RTs (ms) for incorrect (gray) vs. correct (blue) responses for the two contrasts. Error bars represent one standard error of the mean. Left to right: Spanish, AE, and Chinese L1 speakers.