
Learning novel complex words in a second language

Effects of morpheme training and family size

Published online by Cambridge University Press:  15 December 2025

Junmin Li
Affiliation:
English Department, Hangzhou City University, Hangzhou, China
Ali Behzadnia
Affiliation:
School of Psychological Sciences, Macquarie University, Sydney, Australia
Elisabeth Beyersmann*
Affiliation:
School of Psychological Sciences, Macquarie University, Sydney, Australia
*Corresponding author: Elisabeth Beyersmann; Email: lisi.beyersmann@mq.edu.au

Abstract

The role of morphology in complex word acquisition was examined in Chinese (L1)–English (L2) bilinguals. Participants learned words consisting of two novel constituents, by pairing them with pictures. Items either belonged to large (torbnel, torbilm, torbla, torbiph) or small morphological families (torbilm, torbla). After training, participants completed recognition and spelling tasks with novel words that either included or excluded a trained morpheme. Results revealed robust stem-training effects, showing that items including a trained constituent were harder to reject and easier to spell than items including two untrained constituents. There was also a significant effect of morphological family size, with greater training effects for items belonging to large than small families. Effect sizes were overall smaller in L2 than in L1. These findings point to the important role of morphological structure in L2 word acquisition and suggest that large morphological family-clusters lead to better learning outcomes.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Research on first language (L1) English speakers has demonstrated that proficient readers can instinctively break down morphologically complex words into their elemental morphemes while engaged in reading activities (e.g., Beyersmann et al., 2016; Diependaele et al., 2009; Rastle et al., 2004). This morphemic knowledge significantly aids in language comprehension, especially when deciphering unfamiliar words constructed from known morphemes. Although even adept readers frequently come across new vocabulary, the process of acquiring complex words is especially pertinent for those who are regularly confronted with novel vocabulary in their reading material. This includes both developing readers (Beyersmann et al., 2022) and individuals learning a second language (Behzadnia et al., 2024a). However, the specific impact of morphemic structure and the size of a morphological family on second language (L2) novel word learning remains an area that requires further investigation.

The debate on how L2 learners process morphology is more complex than for L1 processing. L2 research, particularly in masked priming studies, often seeks to determine whether L2 learners automatically decompose complex words at the initial stages of processing. However, findings have been mixed. Some studies indicate that L2 learners tend to adopt a more holistic approach to word recognition (e.g., Clahsen et al., 2010; Silva & Clahsen, 2008). In contrast, other research suggests that L2 learners, like L1 speakers, automatically decompose words early in processing (e.g., Coughlin & Tremblay, 2015; Diependaele et al., 2011). If L2 learners approach new vocabulary acquisition using a holistic method, they are unlikely to develop an awareness of the morphological structure of words. Conversely, focusing on learning new words through understanding their morphological components encourages them to engage in morphological processing. The current study aimed to replicate a study design earlier used with L1 monolinguals (Behzadnia et al., 2024b) to investigate whether L2 speakers benefit from morphological structure in novel word learning by focusing on Chinese learners of English.

Research on Chinese learners of English has indicated that, in the early stages of processing complex words, these L2 learners rely heavily on surface forms. This reliance is evidenced by findings from several masked priming studies, which consistently show that morphological priming effects are as strong as form priming effects (e.g., Kahraman & Beyersmann, 2024; Li & Taft, 2020; J. Li et al., 2017; M. Li et al., 2017; for a recent review see Kahraman & Beyersmann, 2023). Specifically, the magnitude of priming in the morphological condition (e.g., teacher-TEACH) is comparable to that in the form condition (e.g., pillow-PILL) when both are compared with an unrelated control condition. This equivalence raises questions about whether L2 learners process morphologically complex words similarly to orthographically overlapping words, potentially activating target words letter by letter.

Some L2 processing theories suggest that L2 learners possess different mechanisms from L1 speakers and face greater challenges in memorizing derived words. For instance, Ullman (2005) argues that when processing L1 vocabulary, two memory systems are involved: the declarative memory system, which stores whole words and phrases, and the procedural memory system, which handles language rules (e.g., inflection and derivation). In early L2 development, however, learners rely more heavily on the declarative system than the procedural one. This overreliance can lead to differences in how L2 learners process morphologically complex words relative to L1 speakers. For instance, L2 learners might treat variations of the same lemma (e.g., create vs. creating) as separate words; that is, such words might be processed as wholes rather than being segmented into their morphological components. Ward and Chuenjundaeng (2009) discovered that L2 learners often struggle to associate related forms of words. Similarly, Schmitt (1998) found that even advanced L2 learners, when prompted, demonstrated a lack of proficiency in productive derivational word form knowledge.

While the above-mentioned studies inform mechanisms of complex word recognition in bilinguals, our understanding of how L2 learners use morphemic knowledge when acquiring new vocabulary remains limited. At present, there is a lack of evidence in this area, a point we return to in the Discussion. The current study seeks to address this gap by building on the findings of a complex word training study originally conducted with L1 English speakers (Behzadnia et al., 2024b). This will shed light on whether L2 learners employ similar morphemic processing strategies and how these may differ from or align with those of L1 speakers.

In language learning, generalization plays a crucial role in helping individuals acquire linguistic knowledge. One of the most compelling examples of linguistic generalization at the single-word level lies in the domain of morphology. In English, as in many languages, stems (e.g., form, act) combine with a limited set of prefixes (e.g., re-, inter-) and suffixes (e.g., -ation, -or) to create a vast majority of word forms—estimated at around 85% (e.g., reform, formation, interact, actor). A key characteristic of this combinatorial system is that language users develop generalized knowledge of its components: they can apply morphemic units flexibly beyond the specific contexts in which they were learned, enabling them to understand and generate new, meaningful words. For example, exposure to existing words like jumped, walked, and watched helps us learn how to form new past-tense constructions (e.g., talked).

Interestingly, although morphology naturally lends itself to novel word generalizations and has been found to support novel word acquisition in L1 speakers (e.g., Behzadnia et al., 2024b; Merkx et al., 2011; Tamminen et al., 2015), it is less clear whether L2 speakers also use morphology to their advantage during novel word learning. Indeed, the above evidence from lexical decision and masked priming suggests that L2 speakers may be less sensitive to the presence of morphemes in their reading (e.g., Clahsen et al., 2010). Our research aims to understand the processes underlying the formation of morphemic representations in L2 learners. Specifically, we investigated whether the development of general morpheme processing in L2 learners mirrors the processes observed in L1 speakers.

A limited number of studies have explored the effect of morphemic knowledge on learning new words. For example, Merkx et al. (2011) investigated this by training participants on new suffixes attached to familiar stems (e.g., sleep + nept = sleepnept). Participants were either exposed to a form-learning condition where the suffix had no associated meaning or a semantic-learning condition where a definition was provided for the novel suffix. Post-training tasks showed that participants had more difficulty rejecting novel words that combined a trained and untrained morpheme than they did with entirely untrained words. This suggests that trained morphemes made the novel words seem more “word-like,” increasing the difficulty of rejection. In a definition selection task, the effect of training was stronger in the semantic-learning condition, showing that participants could generalize the meaning of the new suffixes to novel words, a finding supported by Tamminen et al. (2015). Tamminen et al. (2012) further confirmed this generalization of new morphemic knowledge by building on Merkx et al.’s (2011) training paradigm. They trained adult English speakers on new words with familiar stems and novel suffixes (e.g., sleep + afe = sleepafe) and tested them both immediately and after two days. In a shadowing task (i.e., speeded repetition of spoken novel words) conducted two days after training, participants were faster and more accurate when responding to words containing a trained suffix, replicating Merkx et al.’s findings.

Further evidence for the important role of morphology in novel word learning comes from studies investigating novel words consisting of novel stems and familiar suffixes (e.g., Dawson et al., 2021; Tucker et al., 2016). For example, Dawson et al. (2021) found that in a post-training lexical decision task, participants showed much lower accuracy for trained nonwords with trained stems + trained suffixes (e.g., clantist) compared to recombined nonwords (e.g., clantful), nonwords with untrained stems (e.g., clontist), distant nonwords in which both the stem and the suffix contained a letter substitution (e.g., clontilt), and nonwords with untrained suffixes (e.g., clantify). This further supports the idea that morphological structure facilitates the reading of novel words in L1, at least when the novel words consist of one existing and one novel morphemic constituent.

Taking these findings one step further, other studies have asked how readers identify morphemes embedded in letter strings that consist entirely of novel constituents, removing the possibility of relying on pre-existing morphological or lexical knowledge. Encountering completely novel complex words without any familiar morphemes (e.g., torb + iph) presents a more rigorous test of how readers identify morphemic boundaries and derive meaning. Novel word training studies have shown that L1 speakers indeed identify morphemic structure relatively effortlessly, even in the absence of any known constituents (e.g., Behzadnia et al., 2024b; Beyersmann et al., 2023). Research shows that the learning of novel morphemes is facilitated by novel constituent frequency (Beyersmann et al., 2023) and morphological family size (Behzadnia et al., 2024b).

In contrast, research on novel word learning in L2 is relatively limited. Behzadnia et al. (2024a) used an oral novel word training paradigm to examine whether exposure to the spoken form of morphologically complex words could facilitate the later reading of the embedded stem morphemes within a group of German (L1)–English (L2) bilinguals. Half of the stems had spellings that were predictable from their phonological form, and half had unpredictable spellings. After three days of oral training, participants saw the written forms of the stems for the first time, embedded in sentences, while their eye movements were monitored. The results showed that, unlike those of L1 speakers, the eye movements of L2 speakers were not modulated by spelling predictability, suggesting that L2 participants had not developed phoneme-grapheme mapping rules robust enough to influence their eye movements when encountering trained words again. This task was particularly challenging for L2 learners, as the phoneme-grapheme correspondences in English (their L2) differ significantly from those in German (their L1). While participants may have learned the novel words in English, automatic interference from L1 likely impacted their L2 processing. This interference may be less likely for Chinese speakers: because the Chinese writing system is logographic rather than alphabetic, it lacks consistent phoneme-grapheme correspondences that could compete with those of English. In another study with Dutch-English bilingual children, Raudszus et al. (2021) examined how fifth-grade readers use morphological and contextual cues to infer the meanings of unfamiliar words, as well as the extent to which these skills relate to their cognitive and linguistic abilities. In the study, both L1 and L2 Dutch children completed a lexical inferencing task where the availability of morphological and contextual information was controlled. The findings showed that readers used both types of information to infer word meanings. However, while L1 and L2 readers were similar in their use of morphological cues, L2 readers relied less on contextual information than L1 readers. This difference was primarily observed in L2 readers with limited vocabulary. The results suggest that decoding skills are essential for effectively accessing morphological information.

Another factor that significantly affects the response time and accuracy of visual word recognition is morphological family size. A morphological family comprises all words in the language that share the same “family-head” (e.g., play), including both affixed words (played, playing, player, playful, replay, etc.) and compound words (playbill, playhouse, playbook, playtime, playmate, etc.). Research has consistently shown that words that belong to large morphological families are recognized more quickly and accurately than those from smaller families (Baayen et al., 1997; Bertram et al., 2000; Beyersmann & Grainger, 2018; Boudelaa & Marslen-Wilson, 2011; De Jong et al., 2002; Juhasz & Berkowitz, 2011; Kuperman et al., 2008; Schreuder & Baayen, 1997). Morphological family size also plays an essential role in language processing in L2 learners, improving accuracy and response time in word recognition tasks (e.g., Dijkstra et al., 2005; Moscoso del Prado Martín et al., 2005; Mulder et al., 2013). For instance, De Zeeuw et al. (2012) explored this in Turkish-Dutch bilingual children (L2) and Dutch-speaking children (L1) using a Dutch lexical decision task. The children, drawn from second, fourth, and sixth grades, were asked to decide whether a letter string was a Dutch word or not. The target words were matched for word frequency, imageability, and length but varied in morphological family size. Both groups showed better accuracy (in second grade) and faster responses (in fourth and sixth grades) for words with larger morphological families, highlighting the importance of morphological structure in L2 processing. Notably, sixth-grade L2 children showed a greater accuracy benefit from words with a large morphological family size compared to their L1 peers. Similarly, Akbari (2016) investigated whether bilingual and monolingual children demonstrated a morphological family size effect for written L2 (English) words using a lexical decision task. The bilingual children spoke a heritage language as their L1 and were learning English as their L2, which was also the language of education in their host country. The results showed that bilingual students had lower accuracy scores than monolingual students in Grades 5–6 but had similar response speeds in both Grades 5–6 and 11–12. Crucially, the study revealed that both word frequency and morphological family size positively influenced reading performance in bilingual children.

However, few studies have investigated the effects of morphological family size on novel word learning. One such study, by Tamminen et al. (2015), reported that morphological family size does indeed facilitate learning. In their experiment, similar to Merkx et al. (2011), participants were trained on novel words that combined an existing stem with a novel suffix (e.g., sleep + nept = sleepnept). The suffixes were selected either from a large morphological family (e.g., creepesh, grabesh, sleepesh, sheepesh) or a small morphological family (e.g., bringane, lockane). After training, participants read aloud sentence-final words that included an untrained stem and a trained suffix, and they were faster with words containing a trained suffix from a large family. Additionally, when asked to judge the semantic congruence of sentence frames with the sentence-final word, accuracy was higher when the word contained a trained suffix from a large family. These findings suggest that larger morphological families aid in the learning of new morphemes, supporting the notion that family size plays an important role in L1 word acquisition. This aligns with the observation that skilled readers process words from large morphological families more efficiently, indicating that morphological family size may be crucial in how new words are integrated into a reader’s vocabulary.

Similar results were reported by Behzadnia et al. (2024b), who examined the role of morphemic knowledge in learning novel words consisting of two entirely novel constituents. In that study, English L1 speakers learned novel complex words by associating them with pictures of objects. The key variable was the manipulation of morphological family size for the stems: some novel stems (e.g., torb) were paired with four different morphemes (e.g., torbnel, torbilm, torbla, torbiph), representing a large family, while others were paired with only two (e.g., torbilm, torbla), representing a small family. Training continued until participants reached 90% accuracy, after which they completed recognition and spelling tasks to assess their ability to generalize learned morphemes (e.g., torb) to new contexts. In the recognition task, participants identified whether an item was trained or untrained, while the spelling task required them to spell out the spoken forms of the target words. The results showed that words made up of a combination of trained and untrained constituents were harder to reject in the recognition task but easier to spell than those without any trained constituents. Moreover, novel words that included trained constituents from large morphological families were more difficult to reject than those from smaller families, suggesting that larger morphological families aid the learning process. The study highlights the facilitatory role of large morphological families in novel word learning and the importance of considering morphological family size in word learning, particularly in second language acquisition and literacy development.

Present study

The current study closely followed the design of Experiment 2 from Behzadnia et al. (2024b) and was conducted entirely online, with one key modification: the L1 reading fluency test (Test of Word Reading Efficiency [TOWRE]; Torgesen et al., 1999) was replaced with a test of L2 proficiency (LexTALE; Lemhöfer & Broersma, 2012).

Participants were trained on a set of novel words including novel stems (e.g., torb) that belonged to a large morphological family (e.g., torbnel, torbilm, torbla, torbiph), and novel stems that belonged to a small morphological family (e.g., torbilm, torbla). Following training, each participant completed a form recognition task and a spelling task, each including three conditions: a trained stem (large family size) + untrained constituent condition, a trained stem (small family size) + untrained constituent condition, and an untrained + untrained condition. The data from the post-training tasks were used to address two questions. First, we asked whether Chinese learners of English were able to extract the trained morphemes from an untrained context. This question was addressed by comparing the two trained + untrained conditions against the untrained + untrained condition. Second, we explored whether stems with large morphological families were more easily and rapidly extracted than stems with small morphological families. This question was addressed by comparing the trained stem (large family size) + untrained constituent condition against the trained stem (small family size) + untrained constituent condition.

We hypothesized that, with respect to the recognition task, if participants can generalize novel morphological knowledge to an untrained context, they should respond more slowly and with lower accuracy (i.e., with a higher likelihood of responding “yes”) in the trained + untrained condition compared to the untrained + untrained condition. Moreover, if novel word learning in L2 speakers of English is facilitated by morphological structure, we would expect slower and less accurate responses in the large compared to the small morphological family size condition. However, if participants rely primarily on form processing in novel word learning, as has been observed in visual word recognition (e.g., Kahraman & Beyersmann, 2024; Li & Taft, 2020; J. Li et al., 2017; M. Li et al., 2017), we would not expect an effect of morphological family size. That is, response times and accuracy should be similar for both large and small family size conditions.

Regarding the spelling task, our predictions were less defined. Since the results from the spelling task in L1 speakers did not reveal an effect of morphological family size (Behzadnia et al., 2024b), we only predicted a main effect of stem. That is, we expected more accurate written production of constituents in novel words that consisted of a trained constituent + an untrained constituent compared to the untrained + untrained control condition. These hypotheses were pre-registered, along with the experimental design and data analysis plan (https://aspredicted.org/htt9-sgyd.pdf).

Method

Participants

Using methods from Langenberg et al. (2023), we conducted a power analysis to determine the required sample size for the two post-training tasks: the recognition task and the spelling task. For the recognition task (trained vs. untrained words), detecting a medium effect (d = 0.4) with 80% power required 51 participants. For the spelling task (large vs. small family stems), detecting a typical effect (d = 0.4) with 80% power required 34 participants. To ensure robust detection of effects across both tasks, particularly if effect sizes proved more modest than anticipated, we recruited 80 participants, exceeding the minimum requirement while accounting for potential exclusions. In addition, we aimed for a sample size comparable to that of Behzadnia et al. (2024b), who had 100 participants in Experiment 2.

Eighty Chinese learners of English from various Chinese universities were recruited online for this study. The participants had an average age of 23.77 years (range: 17–32, SD = 4.66). Before participating, all participants were provided with information about the experimental procedure and gave their informed consent. Their average LexTALE score was 78.42 (range: 67.25–96.00, SD = 7.84), corresponding to an approximate B2 (Upper Intermediate) level on the Common European Framework of Reference for Languages (Lemhöfer & Broersma, 2012).

Materials

The materials were identical to the ones used in the L1 speaker group by Behzadnia et al. (2024b).

Training materials

There were 16 initial and 24 final constituents, which were combined to create 48 complex words. The initial constituents had 4–6 letters, and the final constituents had 2–3 letters, forming legal and pronounceable letter sequences. These initial constituents were not similar to existing English words and were not found in the English Lexicon Project (ELP; Balota et al., 2007) or Subtlex-UK databases (Van Heuven et al., 2014). The final constituents were not identified as affixes in the MorphoLex database (Sánchez-Gutiérrez et al., 2018), indicating they lacked an affixal status or meaning. However, in this experiment, each constituent was given a meaning by associating a picture with the constituent. The initial constituents (e.g., torb) referred to an object (e.g., a ball), and the final ones modified that meaning. For example, torb could mean a ball, and torb + ilm could mean a big ball. The complete list of new words and their meanings can be found in the online Supplementary Material A.

Each morphological family size condition featured four distinct initial constituents. In the large family size condition, each initial constituent was paired with four second constituents, leading to a total of four appearances for each initial constituent (e.g., farsherp, farshlor, farshoth, farshib). Conversely, in the small family size condition, each initial constituent was matched with only two different second constituents (e.g., dirchilm, dirchla), resulting in just two occurrences for each. To avoid a confound between morphological family size and constituent frequency, the number of exposures to target constituents was balanced by repeating the exposure to novel words in the small family size. This ensured that the first constituent in both large and small family sizes had four exposures each. The two item sets were further divided into four lists (Sets 1a, 1b, 2a, and 2b; see Supplementary Material B) to ensure that each item was categorized under a large morphological family for half of the trials, and under a small morphological family for the remaining trials.
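The exposure balancing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' stimulus script: the function name `build_training_trials` and the dictionary layout are hypothetical, and only the example items quoted in the text (farsh-, dirch-) are used.

```python
from collections import Counter

def build_training_trials(large_family, small_family):
    """Build (stem, final) training trials with balanced stem exposure.

    Both arguments map a stem to its list of final constituents
    (hypothetical data layout, chosen for this sketch).
    """
    trials = []
    for stem, finals in large_family.items():
        if len(finals) != 4:
            raise ValueError("large-family stems need four finals")
        trials += [(stem, final) for final in finals]
    for stem, finals in small_family.items():
        if len(finals) != 2:
            raise ValueError("small-family stems need two finals")
        # Each small-family word is presented twice so that the stem is
        # seen as often as a large-family stem.
        trials += [(stem, final) for final in finals] * 2
    return trials

# Example items taken from the text above.
large = {"farsh": ["erp", "lor", "oth", "ib"]}
small = {"dirch": ["ilm", "la"]}

trials = build_training_trials(large, small)
exposures = Counter(stem for stem, _ in trials)
family_size = {
    stem: len({final for s, final in trials if s == stem}) for stem in exposures
}
```

Here both stems receive four exposures while their family sizes remain four and two. With four stems per condition, the same rule gives 16 + 16 = 32 training trials, which is consistent with the 32 items per set reported for the training materials.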

To counterbalance the training effects, two sets of complex novel words were developed, each containing 32 items. One group of participants was trained on Set 1, with Set 2 serving as the untrained control in the post-training phase. The other group was trained on Set 2, with Set 1 as the untrained control. Each set was divided into two conditions: large family size and small family size. Morphological family size referred to the number of distinct morphologically complex words containing a particular morpheme. We chose smaller family sizes (two versus four) to limit the total number of constituent combinations and, consequently, the total number of novel words to be learned. This approach ensured the practicality and feasibility of the training task.
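One way to picture the counterbalancing is as a rotation of participants through the four lists. The sketch below is an assumption-laden illustration: the text does not state how participants were allocated, so the round-robin rule, the function name `assign_list`, and the reading of the a/b suffix as the swap of family-size assignment are all hypothetical.

```python
from collections import Counter

# List labels as given in the text (Sets 1a, 1b, 2a, 2b).
LISTS = ["1a", "1b", "2a", "2b"]

def assign_list(participant_index):
    """Assign a participant to a list by simple rotation (assumed rule)."""
    list_id = LISTS[participant_index % len(LISTS)]
    trained_set = list_id[0]
    # The set that was not trained serves as the untrained control.
    control_set = "2" if trained_set == "1" else "1"
    return {"list": list_id, "trained_set": trained_set, "control_set": control_set}

assignments = [assign_list(i) for i in range(80)]
per_list = Counter(a["list"] for a in assignments)
```

Under this assumed rotation, a sample of 80 would yield 20 participants per list, with each set serving as trained material for half of the sample and as untrained control for the other half.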

Pictures

Eight object images were selected from the Multilingual Picture (MultiPic) database (Duñabeitia et al., 2018), with each picture corresponding to one of the first constituents. For instance, torb was used to represent “ball” in examples like torbilm and torbla. The second constituents were shorter than the first constituents and modified their meaning, indicating color (red/blue), size (small/large), price (high/low), age (old/new), or cleanliness (clean/dirty). The concreteness and imageability scores of the pictures representing the constituents were matched, using data from Brysbaert et al. (2014) and the Glasgow Norms (Scott et al., 2019). The concreteness scale ranged from 1 (abstract) to 5 (concrete), and the imageability scale from 1 (not at all imageable) to 7 (highly imageable). See Table 1 for details.

Table 1. Mean Characteristics of Word Sets and Constituent Morphemes Used in the Experiment

Post-training materials

To create untrained items for the trained stem condition (trained first constituent and untrained second constituent) and the untrained stem condition (untrained first and second constituents), each first constituent was paired with three different untrained second constituents. We used 12 second constituents, each repeated once, with the same untrained second constituents used across both conditions. There was an equal number of trials in both family size conditions.
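The pairing constraint above (three different second constituents per stem, with each of the 12 second constituents used twice) can be met with a simple rotation. The following is a minimal sketch under assumptions: it takes eight stems per condition, reads "each repeated once" as "each used twice", and uses placeholder labels (`stem0`, `final0`, ...) rather than the actual items, which are listed only in the Supplementary Material.

```python
from collections import Counter

def pair_stems_with_finals(stems, finals, per_stem=3):
    """Give each stem `per_stem` distinct finals by rotating through `finals`."""
    pairs = []
    for i, stem in enumerate(stems):
        for k in range(per_stem):
            final = finals[(i * per_stem + k) % len(finals)]
            pairs.append((stem, final))
    return pairs

# Placeholder labels, not real stimuli.
stems = [f"stem{i}" for i in range(8)]
finals = [f"final{j}" for j in range(12)]

pairs = pair_stems_with_finals(stems, finals)
final_use = Counter(final for _, final in pairs)
finals_per_stem = {s: {f for st, f in pairs if st == s} for s in stems}
```

With eight stems and 12 second constituents this produces 24 items in which every second constituent appears exactly twice. The same list of second constituents can be passed again with the untrained stems to build the untrained + untrained condition.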

Procedures

Prior to taking part in the experiment, participants were told that they would be learning novel words in English. Figure 1 shows a flowchart of the full procedure.

Figure 1. Study Flowchart.

Training phase

The entire study was designed and conducted online using the Gorilla Experiment Builder (www.gorilla.sc; Anwyl-Irvine et al., 2021). Each trial began with a 500 ms display of a blank screen, which was then replaced by two pictures of objects and a novel printed word that corresponded to one of the pictures (see Figure 2).

Figure 2. Training procedure.

A novel word training paradigm was employed to train participants on morphologically complex novel English words in written form. The two objects shared the same first constituent meaning (e.g., ball) but differed in their visual features (e.g., big, small) due to the second constituent meaning. The task was designed so that participants would assign meaning to the embedded reading units. Knowing the meanings of the embedded constituents (e.g., ball + big; ball + small) was a key prerequisite. However, because the letter strings used in the study were entirely novel, participants had to rely solely on the training and could not use any pre-existing knowledge of the words, which enhanced the effectiveness of the training. Participants’ task was to match each novel word with one of the pictures by pressing a keyboard button. They were instructed to respond as accurately and quickly as possible, with a maximum response time of 5,000 ms. After each response, participants received feedback on the correctness of their choice. If they failed to respond within the time limit, the trial moved on to the next novel word and pictures without feedback. The presentation order of items was randomized for each participant. After all novel words and their corresponding pictures were shown, participants received a performance summary with their accuracy percentage and the number of correct and incorrect responses. To complete the training phase, participants had to repeat the task until they reached an accuracy threshold of 90%. This threshold was calculated across the entire list of novel words, not for individual items. If participants did not reach the threshold, they were asked to repeat the training with the full word list. These procedures ensured that the number of exposures to items in the small and large family size conditions remained balanced throughout the study.
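
The training criterion can be illustrated with a short sketch. It assumes that timed-out trials count as incorrect and that the criterion is met at 90% or above; the text specifies neither detail, and the function names are hypothetical.

```python
from typing import List, Optional

THRESHOLD = 0.90  # accuracy criterion stated in the text


def run_accuracy(responses: List[Optional[bool]]) -> float:
    """Proportion correct across the whole word list for one training run.

    None marks a timed-out trial; counting it as incorrect is an assumption.
    """
    correct = sum(1 for response in responses if response is True)
    return correct / len(responses)


def runs_needed(runs: List[List[Optional[bool]]]) -> Optional[int]:
    """Return the 1-based number of the first run that meets the criterion."""
    for run_number, responses in enumerate(runs, start=1):
        if run_accuracy(responses) >= THRESHOLD:
            return run_number
    return None  # criterion never reached in the runs supplied


# Made-up runs over a 32-item list: 20, 28 and 29 correct responses
example_runs = [
    [True] * 20 + [False] * 12,
    [True] * 28 + [None] * 4,
    [True] * 29 + [False] * 3,
]
print(runs_needed(example_runs))
```

Because accuracy is pooled over the full list and every repetition presents the full list, each item receives the same number of exposures regardless of its family size condition.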

Post-training phase

This phase involved two tasks: a recognition task and a spelling task. Both tasks included three conditions: a trained item condition (e.g., torbilm), a trained stem condition with a trained first constituent and an untrained second constituent (e.g., torberv), and an untrained stem condition where both constituents were untrained (e.g., voopurt). Each condition had 24 items, with half of the trained stems belonging to a large morphological family and half to a small one. The presentation of all words was randomized across both tasks.

Recognition task. The task began with a fixation cross (+) displayed for 500 ms, followed by the presentation of a written novel word, which remained on the screen until the participant responded. The goal was to quickly and accurately decide if the presented word was trained or untrained, with a maximum response time of 4,000 ms. Participants responded by pressing a corresponding button and received feedback on the accuracy of their responses. At the end of the task, participants were provided with their total number of correct and incorrect responses, as well as their mean accuracy percentage.

Spelling task. A fixation cross (+) appeared for 500 ms at the start of each trial. Participants were then asked to click a “play” button to hear an audio recording of the novel words. They could listen to each recording up to three times. After listening, participants typed their response into a text box that appeared on the screen for each item. Spellcheck was disabled solely to prevent typed answers from being automatically corrected or modified; because the experiment used novel words, participants could not rely on online dictionaries or tools for assistance. Although the primary focus of the post-training task was the assessment of participants’ recognition of the trained constituents, it also served as an important control to gauge the impact of whole-word exposures on the observed learning outcomes.

Analysis

The statistical analyses were conducted using the lme4 package (Bates et al., 2015) in the R statistical software environment (R Core Team, 2020). Our focus was on examining both the response times (RTs) and error rates (ERs) in the recognition task, as well as the ERs in the spelling task. We confined our analysis to trials with untrained items (i.e., the trained stem and untrained stem conditions). For each task and dependent variable, we ran two distinct analyses. The first analysis aimed to assess the impact of stem status by comparing responses to novel words that included either trained or untrained stems. The linear mixed-effects model for this analysis incorporated stem status as a fixed-effect predictor, with trained stems coded as 0.5 and untrained stems as –0.5. The second analysis aimed to evaluate the influence of morphological family size by contrasting responses to trained stems from large families with those from small families. The corresponding model included morphological family size as a predictor, with large family size coded as 0.5 and small family size as –0.5. In each model, the random effect structure accommodated participant-specific and item-specific varying intercepts and slopes. Models were simplified in the event of convergence issues, and random terms were sequentially removed, starting with the term exhibiting the smallest value.
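
To see what the 0.5/–0.5 coding buys, the sketch below fits an ordinary least-squares line to made-up RTs. It is not the authors' lme4 model: random effects are omitted entirely, and the data and the helper `ols_fit` are hypothetical. It only illustrates that, with this coding, the slope equals the difference between the two condition means and the intercept equals their midpoint.

```python
from statistics import mean

CODES = {"trained": 0.5, "untrained": -0.5}  # contrast coding given in the text


def ols_fit(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up RTs in ms, for illustration only (not data from the study)
trials = [
    ("trained", 1010), ("trained", 990), ("trained", 1000),
    ("untrained", 900), ("untrained", 920), ("untrained", 880),
]
x = [CODES[condition] for condition, _ in trials]
y = [rt for _, rt in trials]

intercept, slope = ols_fit(x, y)
print(intercept, slope)
```

In the full mixed-effects model the same interpretation carries over to the fixed-effect estimate, with by-participant and by-item variability absorbed by the random terms.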

Prior to analysis, RTs were visualized using a density plot to identify and exclude outliers. RTs less than 250 ms and greater than 3,500 ms were deemed outliers and were removed from the dataset. We first ran a minimal model including random intercepts for items and participants only. This model was then used to perform the residual trimming procedure outlined by Baayen and colleagues (2008; Baayen & Milin, 2010) by excluding data points with absolute standardized residuals greater than 2.5. Only the results from the second (post-trimming) model are reported here. For the analysis of ERs, we employed generalized linear mixed-effects models, with response accuracy (coded as 1 for accurate and 0 for error) as the dependent variable. These models were constructed similarly to those used for RT analysis.
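
The two-stage trimming can be sketched as below. Two assumptions are made and should be read as such: the residual criterion is taken to refer to absolute standardized residuals, and the minimal mixed model is replaced by a grand-mean fit purely to keep the example self-contained. The RT values are invented.

```python
from statistics import mean, pstdev

LOWER_MS, UPPER_MS = 250, 3500  # absolute cut-offs stated in the text
RESIDUAL_LIMIT = 2.5            # residual criterion stated in the text


def trim_absolute(rts):
    """Drop RTs below 250 ms or above 3,500 ms."""
    return [rt for rt in rts if LOWER_MS <= rt <= UPPER_MS]


def trim_residuals(rts):
    """Drop RTs whose standardized residual exceeds the limit in absolute value.

    Stand-in for the minimal mixed model: the fitted value is the grand mean.
    """
    fitted = mean(rts)
    residuals = [rt - fitted for rt in rts]
    sd = pstdev(residuals)
    return [rt for rt, res in zip(rts, residuals) if abs(res / sd) <= RESIDUAL_LIMIT]


# Made-up RTs in ms, for illustration only
rts = [200, 900, 950, 1000, 920, 980, 940, 960, 910, 990, 930, 970, 3000, 4000]

after_cutoffs = trim_absolute(rts)
after_residuals = trim_residuals(after_cutoffs)
print(len(rts), len(after_cutoffs), len(after_residuals))
```

In the actual analysis the residuals would come from the fitted mixed model, and the final model would then be refitted to the trimmed data.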

Finally, the same statistical approach was applied to the combined data analysis, including data from L1 English speakers in Experiment 2 of Behzadnia et al. (2024b), though different cut-offs and outlier exclusions were used to account for variations in RT distributions across experiments. Moreover, a participant group factor (L1 vs. L2) was added to the linear mixed-effects models, such that the first model examined the interaction between stem status and participant group, and the second model examined the interaction between morphological family size and participant group.

Results

We report the data analysis for the bilingual participants first, followed by the analysis of the combined data, which included the 100 L1 English speakers from Experiment 2 of Behzadnia et al. (2024b).

Recognition task

One participant did not complete the word recognition task, bringing the total number of participants to 79.

Stem status

RT analysis. After excluding erroneous trials from a total of 3,699, we were left with 3,207 valid trials, resulting in an error rate of 13.3%. To further refine the data, we removed trials with response times over 3,500 ms or under 250 ms, accounting for 0.31% of the total. This resulted in 3,197 trials that were included in our analysis. A linear mixed-effects model, with stem status as a fixed effect, revealed a significant effect of stem status (χ2(1) = 23.07, p < .001), indicating slower response times for novel words containing a trained stem compared to those with an untrained stem.
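
The exclusion figures in this paragraph can be checked with simple arithmetic. The sketch below uses only the trial counts reported above; the helper `pct` is hypothetical. It also makes explicit that the trimming percentage is expressed relative to the correct trials rather than to all trials.

```python
def pct(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100 * part / whole


total_trials = 3699      # all trials in the stem status analysis
correct_trials = 3207    # trials remaining after removing errors
analysed_trials = 3197   # trials remaining after RT trimming

errors = total_trials - correct_trials
trimmed = correct_trials - analysed_trials

print(errors, round(pct(errors, total_trials), 1))
print(trimmed, round(pct(trimmed, correct_trials), 2))
```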

RT analysis of L1 vs L2. When combining the data from L1 speakers, we excluded incorrect trials from a total of 8,497, leaving 7,289 valid trials and resulting in an error rate of 14.2%. We also trimmed trials with RTs exceeding 3,500 ms or below 250 ms, which constituted 0.21% of the total. This left us with 7,274 trials included in the analysis. The mean RTs for both L1 and L2 participants are presented in Figure 3.

Figure 3. Mean RTs and ERs as a function of stem status for L1 and L2 participants.

The linear mixed-effects model, with stem status as the fixed effect, revealed a significant interaction between stem status and participant group (χ2(1) = 10.53, p = .001), showing a larger stem effect for L1 (t = 9.20, p < .001) than for L2 speakers (t = 6.00, p < .001). There was a significant main effect of stem status (χ2(1) = 70.29, p < .001), showing that both L1 and L2 participants responded significantly more slowly to trained stems than to untrained stems (see Figure 3). Furthermore, a significant main effect of participant group was observed (χ2(1) = 51.19, p < .001), indicating that L1 speakers responded faster overall than L2 learners.

ER analysis. The ER analysis revealed a significant effect of stem status (χ2(1) = 26.24, p < .001), indicating that participants were less accurate in the trained stem condition than in the untrained stem condition.

ER analysis of L1 vs L2. The model revealed a significant interaction between stem status and participant group (χ2(1) = 29.09, p < .001), showing that the effect of stem status was significantly larger for L1 (z = –10.08, p < .001) than for L2 speakers (z = –5.18, p < .001). The main effect of stem status was significant (χ2(1) = 101.67, p < .001), but there was no main effect of participant group (χ2(1) = 0.06, p = .800). Figure 3 presents the error percentages of both L1 and L2 participants for trained and untrained items.

Morphological family size

RT analysis. We excluded 325 erroneous trials, which accounted for 17.55% of the total 1,852 trials. Four outlier trials were also removed from the 1,527 correct responses. After these exclusions, the dataset of 1,523 trials was analyzed. The average response time was 974 ms (SD = 362) in the large stem family size condition and 953 ms (SD = 345) in the small stem family size condition. The results indicated no significant effect of family size on response times (χ2(1) = 0.47, p = .494).

RT analysis of L1 vs L2. We removed 914 errors (21.50%) out of 4,252 trials from the dataset. Outlier trials were also removed (0.42% of the 3,338 correct responses). The remaining 3,324 trials were included in the analysis. The results showed a main effect of participant group (χ2(1) = 29.63, p < .001) but no significant interaction between family size and participant group (p = .731) and no main effect of family size (p = .264). Figure 4 reports mean RTs for the large and small family size conditions by L1 and L2 participants.

Figure 4. Mean RTs and ERs for each morphological family size condition by L1 and L2 participants.

ER analysis. The statistical model showed a significant effect of family size (χ2(1) = 6.28, p = .012), indicating that participants made more errors when rejecting items in the large family size condition compared to the small family size condition.

ER analysis of L1 vs L2. When the data from L1 participants were combined, there was a marginally significant interaction between family size and participant group (χ2(1) = 3.22, p = .073), showing a trend towards a larger family size effect for L1 speakers (z = –4.40, p < .001) than for L2 learners (z = –1.93, p = .053). There was a significant main effect of participant group (χ2(1) = 13.17, p < .001) and a significant effect of family size (χ2(1) = 13.37, p < .001). Figure 4 reports mean ERs for the large and small family size conditions by L1 and L2 participants.

Spelling task

Stem status. We followed the methodology of Behzadnia et al. (2024b), coding only for the accuracy of the stem’s spelling rather than the entire letter string. Ten L2 participants with error rates exceeding 90% were excluded from the analysis. The results showed a significant effect of stem status (χ2(1) = 23.11, p < .001): participants spelled trained stems more accurately than untrained stems.
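
Stem-only scoring might look like the sketch below. The exact coding rules are not given in the text, so the prefix match, the case folding, and the whitespace stripping are all assumptions, and `stem_correct` is a hypothetical helper.

```python
def stem_correct(response: str, stem: str) -> bool:
    """Score a typed response as correct if it begins with the target stem.

    Only the first constituent is checked; the rest of the string is ignored.
    Case folding and whitespace stripping are assumptions.
    """
    return response.strip().lower().startswith(stem.lower())


# Target item torberv (stem torb), taken from the examples in the text;
# the typed responses are invented.
responses = ["torberv", "torbirv", "tarberv", " Torbev "]
scores = [stem_correct(r, "torb") for r in responses]
print(scores)
```

Scoring the stem alone keeps errors on the untrained second constituent from diluting the measure of whether the trained constituent was retained.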

When L1 and L2 data were combined, there was a significant interaction between participant group and stem status (χ2(1) = 4.83, p = .028), indicating that the stem effect for L1 (z = 3.71, p < .001) was smaller than that for L2 (z = 4.96, p < .001). There was also a significant main effect of stem status (χ2(1) = 21.60, p < .001) and a main effect of participant group (χ2(1) = 26.27, p < .001). Statistical information categorized by stem status is presented in Figure 5.

Figure 5. Mean error rates as a function of stem status (left panel) and family size (right panel) for L1 and L2 participants in spelling task.

Morphological family size. The mean spelling ERs were comparable between the large family size condition (63.63%; SD = 0.48) and the small family size condition (62.27%; SD = 0.48). The statistical model showed no significant effect of family size on responses (χ2(1) = 0.66, p = .415). When L1 and L2 data were combined, there was no significant interaction between participant group and condition (χ2(1) = 0.84, p = .359), and no significant main effect of family size (χ2(1) = 0.08, p = .771). The only significant effect was a main effect of participant group (χ2(1) = 11.32, p < .001), showing that L1 speakers made fewer spelling mistakes than L2 speakers. Statistical information grouped by morphological family size is shown in Figure 5. The detailed analyses, scripts, and results are accessible in an Open Science Framework repository at https://osf.io/vtgzj/.

Discussion

The present study aimed to investigate how Chinese-English bilinguals acquire morphologically complex words composed of novel constituent morphemes. Participants learned novel words by associating their written form with one of two pictures during the training stage. Training was repeated until a response accuracy of 90% was achieved. The number of training repetitions ranged from 2 to 9, with a mean of 7.16 repetitions (SD = 1.96), which was similar to the number of repetitions in the L1 speakers tested in Behzadnia et al.’s (2024b) second experiment, where the mean number of training runs was 7.57 (SD = 1.98). Following training, participants completed a recognition task and a spelling task, which included three conditions: a trained stem with large family size + an untrained constituent condition, a trained stem with small family size + an untrained constituent condition, and an untrained + untrained control condition. The data from the post-training tasks were analyzed to determine whether Chinese-English bilinguals extracted the trained morphemes by comparing the two trained + untrained conditions against the untrained + untrained condition. Additionally, we examined whether morphological family size modulated the training effect by comparing the trained stem with large family size + untrained constituent condition to the trained stem with small family size + untrained constituent condition.

L2 morpheme acquisition

The results of the recognition task indicated a significant influence of stem status. This suggests that Chinese learners of English were able to identify the trained morphemes when embedded within new, untrained letter strings, even though they had never been exposed to the morphemic constituents in isolation. This shows that they were able to segment the trained items into their constituent morphemes during training, which supports the hypothesis that participants were not just memorizing whole letter strings; instead, they demonstrated an ability to generalize their understanding of trained stems to novel morphemic contexts. These findings underscore the importance of morphemes in the acquisition of new words within L2. Interestingly, since the trained constituents occurred in combination with an untrained constituent, this suggests that readers were able to recognize the trained constituents even if the corresponding letter strings were not exhaustively decomposable into morphemes. This, in turn, points to an embedded constituent activation mechanism that allows readers to identify embedded words within non-segmentable nonword letter strings. Our data are in line with prior findings from masked primed lexical decision, showing that lexical decision times on a given word target (e.g., FARM) are facilitated by both affixed (e.g., farmity-FARM) and non-affixed nonword primes (e.g., farmald-FARM), in both L1 (e.g., Beyersmann et al., 2015; Beyersmann et al., 2021; Heathcote et al., 2018) and L2 (e.g., Kahraman & Beyersmann, 2023; Kahraman & Beyersmann, 2024; Li & Taft, 2020; Li et al., 2017).
Our results thus further support one of the core theoretical assumptions of the word and affix model (Beyersmann & Grainger, 2023; Grainger & Beyersmann, 2017), which suggests that readers rapidly identify words that are embedded at the left or right edges of a given letter string. This mechanism has been found to play a central role in the acquisition and processing of complex nonwords (e.g., Beyersmann et al., 2019) and explains the trained constituent effects reported here.

When the current L2 data were analyzed in combination with the L1 data from Experiment 2 of Behzadnia et al. (2024b), the results indicated that L1 participants exhibited a larger stem effect in the recognition task than L2 learners. This finding suggests that L1 participants were more effective in utilizing morphemic structures. As these novel words were constructed as pseudowords, they were not part of the existing knowledge for either L1 or L2 speakers. However, the letter strings were constructed in accordance with English orthographic rules, which are fundamentally different from the Chinese logographic script, and this would have facilitated the identification of novel constituents in L1 speakers. L1 speakers develop deeply ingrained morphological structures from childhood, enabling automatic decomposition of words into stems and affixes. This entrenchment allows rapid, subconscious access to stems in novel contexts. In contrast, even proficient L2 learners lack this depth of processing due to their later and often less immersive acquisition.

Morphological family size effect in L2 recognition

A second key finding is that L2 learners, like L1 speakers (Behzadnia et al., 2024b), showed a significant effect of morphological family size in the post-training recognition task, which is consistent with prior research suggesting that morphological family size generally tends to have a facilitatory effect on word recognition (e.g., Baayen et al., 1997; Bertram et al., 2000; Beyersmann & Grainger, 2018; De Jong et al., 2002; Moscoso del Prado Martin et al., 2004). The post-training recognition task was deliberately designed to assess the form-based segmentation mechanisms that operate when processing complex words. The aim was to evaluate two conditions (large versus small family size) under circumstances that were identical in terms of the semantic and orthographic relationships between the whole item and its embedded constituents, as well as the amount of exposure during training. The sole variable that differed between these two critical conditions was the size of the morphological family. Consequently, it can be confidently concluded that the observed effects were not due to recognition of orthographic form or meaning. This underscores the notion that stems from large morphological families are processed more easily than those from small morphological families. The morphological family size effect was evident in the increased difficulty participants faced in correctly rejecting items with embedded constituents from large morphological families, as opposed to those with constituents from small morphological families.
Although the impact of morphological family size was significant in the ERs but not in the RTs during the recognition task, the direction of the mean effect sizes was consistent in both ERs and RTs. This pattern is also consistent with what has been previously found in L1 speakers (Behzadnia et al., 2024b). Neither study observed a family size effect in RT data, presumably because error rates in this challenging task are particularly informative, since readers are prone to making errors in identifying trained constituents embedded in untrained nonwords.

According to De Zeeuw et al. (2012), L2 children showed a larger morphological family size effect than monolingual children; that is, the L2 children gained more of an advantage than L1 children when a word had a large morphological family. However, Akbari (2016) found that bilingual children in an L2 setting showed lower accuracy in responding to words with very small morphological families (three or fewer related words) but benefited from high frequency and larger morphological family sizes (more than three words per family). This suggests that their L2 lexicons, despite five to twelve years of education in their L2 and high general proficiency in their L2, contained fewer mid- and low-frequency words and words with smaller morphological family sizes. However, the morphological family size effect in the present study was comparable across L1 and L2 participants, possibly because the training words of the present study were novel to both L1 and L2 participants, which is consistent with de Zeeuw et al.’s (2012) and Mulder et al.’s (2013) finding that morphological family size has an impact on both L1 and L2 word processing.

By simulating real-world novel word learning, our findings shed light on the role of morphological families in L2 vocabulary instruction. Teaching L2 learners novel words belonging to the same morphological family may not only increase their vocabulary size but simultaneously teach them about the important morphological relationships that are shared amongst family members. In turn, this may equip L2 learners with a tool they can apply to derive new morphologically complex words from stems they are already familiar with. By understanding relationships between words within morphological families, L2 speakers acquire a morphological rule system that is applicable to other words and sheds light on the meaning that affixes contribute at the word level. This morphological family size effect supports the idea that word families facilitate reading skills by clustering learning around base words, making it easier for learners to recognize and understand new words in texts.

Morpheme effects in L2 written novel word production

In addition to the lexical recognition task, all participants completed a spelling task where they were asked to spell items that either did or did not contain a trained constituent morpheme. The results revealed that, similar to the recognition task, L2 learners exhibited a significant stem effect, showing that the learning of the morphemic constituents generalized to word production. Interestingly, the combined L1 and L2 analyses showed that the effect of stem status on spelling was even larger in L2 than L1, although the mean effect sizes were highly comparable. One possible explanation for the slightly larger stem effect in our L2 speaker group is that our second language learners employed a particularly careful orthographic analysis of the novel letter strings during training, which allowed them to retrieve the correct constituent spellings in the post-training spelling task. This fits with the general notion that higher levels of attention during novel word reading are associated with better later form-meaning recall development (e.g., Godfroid et al., 2020; Laufer & Goldstein, 2004; Perfetti, 2007; Rice & Tokowicz, 2020). The current findings suggest that L2 speakers may have allocated slightly more attention to the precise spellings of the novel items, not only during training but also during the spelling task itself, which explains why their spelling accuracy was slightly higher than in the L1 comparison group. While attention allocation was not directly measured here, future studies may employ eye-tracking as a more dynamic measure to empirically investigate how stems attract attention during processing.
Crucially, our study highlights important task differences: While the spelling task revealed a larger stem-effect (on error rates) in L2 than L1, the recognition task showed a larger stem-effect (on response times and error rates) in L1 compared to L2. This shows that L1 speakers demonstrated a greater level of fluency and automaticity in their retrieval of the trained constituents, as captured in the results of the recognition task, whereas L2 speakers paid greater attention to the retrieval of the precise spellings, which explains their outcomes in the spelling task. Overall, across both tasks, our findings show that L2 speakers were able to identify the embedded trained constituents and further highlight the robustness of morpheme effects in L2 novel word learning.

One outstanding question for future research is whether the present constituent training effects extend to novel words with more abstract, affix-like constituent-meanings. The current study used novel words that cannot be clearly classified as novel compound or affixed words. Our second constituents were shorter than the first constituents, which is typical for suffixed words where suffixes tend to be shorter on average than the preceding stem morphemes (e.g., farm [4 letters] + er [2 letters]). However, there are many exceptions to the rule, as it is not uncommon for suffixes to exceed the length of the preceding stem (e.g., pay [3 letters] + ment [4 letters]) and second constituents in compound words can on occasion be shorter than their first constituents (e.g., police [6 letters] + man [3 letters]). As such, constituent length is not a reliable distinguishing factor between compounding and affixation. A more notable distinction lies in the fact that affixes convey more abstract meanings (e.g., -er; “someone who”). In our study, constituent meanings were picturable and concrete and therefore had greater resemblance to stem-stem concatenations that are typical for compound words. Hence, an interesting future extension of this work would be the investigation of novel complex words that consist of second constituents with more abstract meanings, either based on the explicit teaching of affix meanings (Merkx et al., 2011) or using implicit picture-word associations (Behzadnia et al., 2024b).

Conclusions

Prior evidence on L2 morphological processing has yielded mixed findings. Some studies reported that L2 learners rely more on whole-word processing than on morphological decomposition (e.g., Silva & Clahsen, 2008; Clahsen et al., 2010). In contrast, other studies have found that L2 learners exhibit similar priming effects for targets, regardless of whether the primes are derived, pseudo-derived, or form-overlapping (e.g., J. Li et al., 2017; M. Li et al., 2017). The current study showed that L2 participants can decipher the morphological structure of completely novel letter strings and apply this newly acquired knowledge to other novel letter strings. Importantly, L2 participants in our experiments had no prior knowledge of the embedded orthographic units and had to rely entirely on mapping letters (e.g., torbilm) onto meanings (e.g., big ball). This finding revealed the ability of L2 readers to leverage morphemic structures even in the absence of prior exposure to the orthographic units, highlighting the flexibility and adaptability of morphological processing in reading. Moreover, our study indicates that, despite the significant differences in orthography between Chinese and English, morpheme segmentation is specific to the target language.

In conclusion, this study was among the first to examine the learning of morphologically complex novel words consisting of two entirely novel constituents. Our findings demonstrate that morphological structure, and morphological family size in particular, had a facilitatory effect on the learning outcomes in our L2 speaker group. In a broader sense, our findings have implications for instructional methods employed in vocabulary training programs for second language learners of English, suggesting that morphology may serve as an important tool to boost reading skills in bilinguals.

Data availability statement

The datasets and materials that support the findings of this study are openly available in the Open Science Framework (OSF) at https://osf.io/vtgzj/. The study design was also preregistered. The analyses followed the preregistered plan.

Acknowledgments

The writing was supported by Zhejiang Provincial Philosophy and Social Sciences Planning Project (22NDJC037Z). We would like to thank Yufen Chen from Zhejiang Gongshang University, Lijuan Qiao from Zhejiang Chinese Medical University, Linfei Huang from Hubei University of Technology, and Fang Chen, Aijin Que, Xiaoqin Yin, Yaping Zhang, Ruodan Pang, and Yejuan Zha from Hangzhou City University for their help in collecting data.

Competing interests

There is no conflict of interest in this submission.

References

Akbari, N. (2016). Word frequency and morphological family size effects on the accuracy and speed of lexical access in school-aged bilingual students. International Journal of Applied Linguistics, 26(3), 311–328. https://doi.org/10.1111/ijal.12113
Anwyl-Irvine, A., Dalmaijer, E. S., Hodges, N., & Evershed, J. K. (2021). Realistic precision and accuracy of online experiment platforms, web browsers, and devices. Behavior Research Methods, 53(4), 1407–1425. https://doi.org/10.3758/s13428-020-01501-5
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. https://doi.org/10.1016/j.jml.2007.12.005
Baayen, R. H., Lieber, R., & Schreuder, R. (1997). The morphological complexity of simplex nouns. Linguistics, 35(5), 861–878. https://doi.org/10.1515/ling.1997.35.5.861
Baayen, R. H., & Milin, P. (2010). Analyzing reaction times. International Journal of Psychological Research, 3(2), 12–28. https://doi.org/10.21500/20112084.807
Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., Neely, J. H., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English Lexicon Project. Behavior Research Methods, 39(3), 445–459. https://doi.org/10.3758/bf03193014
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Behzadnia, A., Wegener, S., Bürki, A., & Beyersmann, E. (2024a). The role of oral vocabulary when L2 speakers read novel words: A complex word training study. Bilingualism: Language and Cognition, 27(3), 388–399. https://doi.org/10.1017/S1366728923000627
Behzadnia, A., Ziegler, J. C., Colenbrander, D., Bürki, A., & Beyersmann, E. (2024b). The role of morphemic knowledge during novel word learning. Quarterly Journal of Experimental Psychology, 77(8), 1620–1634. https://doi.org/10.1177/17470218231216369
Bertram, R., Baayen, R. H., & Schreuder, R. (2000). Effects of family size for complex words. Journal of Memory and Language, 42(3), 390–405. https://doi.org/10.1006/jmla.1999.2681
Beyersmann, E., Casalis, S., Ziegler, J. C., & Grainger, J. (2015). Language proficiency and morpho-orthographic segmentation. Psychonomic Bulletin & Review, 22, 1054–1061. https://doi.org/10.3758/s13423-014-0752-9
Beyersmann, E., & Grainger, J. (2018). Support from the morphological family when unembedding the stem. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44(1), 135–142. https://doi.org/10.1037/xlm0000435
Beyersmann, E., & Grainger, J. (2023). The role of embedded words and morphemes in reading. In Crepaldi, D. (Ed.), Linguistic morphology in the mind and brain. Routledge.
Beyersmann, E., Grainger, J., & Castles, A. (2019). Embedded stems as a bootstrapping mechanism for morphological parsing during reading development. Journal of Experimental Child Psychology, 182, 196–210. https://doi.org/10.1016/j.jecp.2019.01.010
Beyersmann, E., Grainger, J., Dufau, S., Fournet, C., & Ziegler, J. C. (2023). The effect of constituent frequency and distractor type on learning novel complex words. Language, Cognition and Neuroscience, 39(2), 251–264. https://doi.org/10.1080/23273798.2023.2263590
Beyersmann, E., Mousikou, P., Schroeder, S., Javourey-Drevet, L., Ziegler, J. C., & Grainger, J. (2021). The dynamics of morphological processing in developing readers: A cross-linguistic masked priming study. Journal of Experimental Child Psychology, 208, 105140. https://doi.org/10.1016/j.jecp.2021.105140
Beyersmann, E., Wegener, S., Pescuma, V. N., Nation, K., Colenbrander, D., & Castles, A. (2022). The effect of oral vocabulary training on reading novel complex words. Quarterly Journal of Experimental Psychology, 76, 1321–1332. https://doi.org/10.1177/17470218221113949
Beyersmann, E., Ziegler, J. C., Castles, A., Coltheart, M., Kezilas, Y., & Grainger, J. (2016). Morpho-orthographic segmentation without semantics. Psychonomic Bulletin & Review, 23(2), 533–539. https://doi.org/10.3758/s13423-015-0927-z
Boudelaa, S., & Marslen-Wilson, W. D. (2011). Productivity and priming: Morphemic decomposition in Arabic. Language and Cognitive Processes, 26(4–6), 624–652. https://doi.org/10.1080/01690965.2010.521022
Brysbaert, M., Warriner, A. B., & Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904–911. https://doi.org/10.3758/s13428-013-0403-5
Clahsen, H., Felser, C., Neubauer, K., Sato, M., & Silva, R. (2010). Morphological structure in native and nonnative language processing. Language Learning, 60(1), 21–43. https://doi.org/10.1111/j.1467-9922.2009.00550.x
Coughlin, C. E., & Tremblay, A. (2015). Morphological decomposition in native and non-native French speakers. Bilingualism: Language and Cognition, 18(03), 524–542. https://doi.org/10.1017/s1366728914000200
Dawson, N., Rastle, K., & Ricketts, J. (2021). Bridging form and meaning: Support from derivational suffixes in word learning. Journal of Research in Reading, 44(1), 27–50. https://doi.org/10.1111/1467-9817.12338
De Jong, N. H., Feldman, L. B., Schreuder, R., Pastizzo, M., & Baayen, R. H. (2002). The processing and representation of Dutch and English compounds: Peripheral morphological and central orthographic effects. Brain and Language, 81(1), 555–567. https://doi.org/10.1006/brln.2001.2547
De Zeeuw, M., Verhoeven, L., & Schreuder, R. (2012). Morphological family size effects in young first and second language learners: Evidence of cross-language semantic activation in visual word recognition. Language Learning, 62(1), 68–92. https://doi.org/10.1111/j.1467-9922.2011.00691.x
Diependaele, K., Sandra, D., & Grainger, J. (2009). Semantic transparency and masked morphological priming: The case of prefixed words. Memory & Cognition, 37(6), 895–908. https://doi.org/10.3758/MC.37.6.895
Diependaele, K., Duñabeitia, J. A., Morris, J., & Keuleers, E. (2011). Fast morphological effects in first and second language word recognition. Journal of Memory and Language, 64(4), 344–358. https://doi.org/10.1016/j.jml.2011.01.003
Dijkstra, T., Moscoso del Prado Martín, F., Schulpen, B., Schreuder, R., & Baayen, R. (2005). A roommate in cream: Morphological family size effects on interlingual homograph recognition. Language and Cognitive Processes, 20, 7–41. https://doi.org/10.1080/01690960444000124
Duñabeitia, J. A., Crepaldi, D., Meyer, A. S., New, B., Pliatsikas, C., Smolka, E., & Brysbaert, M. (2018). MultiPic: A standardized set of 750 drawings with norms for six European languages. Quarterly Journal of Experimental Psychology, 71(4), 808–816. https://doi.org/10.1080/17470218.2017.1310261
Godfroid, A., Winke, P., & Conklin, K. (2020). Exploring the depths of second language processing with eye tracking: An introduction. Second Language Research, 36(3), 243–255. https://doi.org/10.1177/0267658320922578
Grainger, J., & Beyersmann, E. (2017). Edge-aligned embedded word activation initiates morpho-orthographic segmentation. In Ross, B. H. (Ed.), The Psychology of Learning and Motivation (Vol. 67, pp. 285–317). Elsevier Academic Press.
Heathcote, L., Nation, K., Castles, A., & Beyersmann, E. (2018). Do “blacheap” and “subcheap” both prime “cheap”? An investigation of morphemic status and position in early visual word processing. Quarterly Journal of Experimental Psychology, 71(8), 1645–1654. https://doi.org/10.1080/17470218.2017.1362704
Juhasz, B. J., & Berkowitz, R. N. (2011). Effects of morphological families on English compound word recognition: A multitask investigation. Language and Cognitive Processes, 26(4–6), 653–682. https://doi.org/10.1080/01690965.2010.498668
Kahraman, H., & Beyersmann, E. (2023). Cross-language influences on morphological processing in bilinguals. In Elgort, I., Siyanova-Chanturia, A., & Brysbaert, M. (Eds.), Cross-language Influences in Bilingual Processing and Second Language Acquisition (pp. 230–261). (Bilingual Processing and Acquisition; No. 16). John Benjamins Publishing Company. https://doi.org/10.1075/bpa.16.10kah
Kahraman, H., & Beyersmann, E. (2024). Sand, sandpaper, and sandwiches: Evidence from a masked compound priming task in L1 and L2 speakers of English. Journal of Cognition, 7(1), 30. https://doi.org/10.5334/joc.350
Kuperman, V., Bertram, R., & Baayen, R. H. (2008). Morphological dynamics in compound processing. Language and Cognitive Processes, 23(7–8), 1089–1132. https://doi.org/10.1080/01690960802193688
Langenberg, B., Janczyk, M., Koob, V., Kliegl, R., & Mayer, A. (2023). A tutorial on using the paired t test for power calculations in repeated measures ANOVA with interactions. Behavior Research Methods, 55(5), 2467–2484. https://doi.org/10.3758/s13428-022-01902-8
Laufer, B., & Goldstein, Z. (2004). Testing vocabulary knowledge: Size, strength, and computer adaptiveness. Language Learning, 54(3), 399–436. https://doi.org/10.1111/j.0023-8333.2004.00260.x
Lemhöfer, K., & Broersma, M. (2012). Introducing LexTALE: A quick and valid Lexical Test for Advanced Learners of English. Behavior Research Methods, 44(2), 325–343. https://doi.org/10.3758/s13428-011-0146-0
Li, J., & Taft, M. (2020). The processing of English prefixed words by Chinese-English bilinguals. Studies in Second Language Acquisition, 42(1), 239–249. https://doi.org/10.1017/S0272263119000172
Li, J., Taft, M., & Xu, J. (2017). The processing of English derived words by Chinese-English bilinguals. Language Learning, 67(4), 858–884. https://doi.org/10.1111/lang.12247
Li, M., Jiang, N., & Gor, K. (2017). L1 and L2 processing of compound words: Evidence from masked priming experiments in English. Bilingualism: Language and Cognition, 20(2), 384–402. https://doi.org/10.1017/S1366728915000681
Merkx, M., Rastle, K., & Davis, M. H. (2011). The acquisition of morphological knowledge investigated through artificial language learning. Quarterly Journal of Experimental Psychology, 64(6), 1200–1220. https://doi.org/10.1080/17470218.2010.538211
Moscoso del Prado Martin, F., Bertram, R., Haikio, T., Schreuder, R., & Baayen, R. H. (2004). Morphological family size in a morphologically rich language: The case of Finnish compared with Dutch and Hebrew. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(6), 1271–1278. https://doi.org/10.1037/0278-7393.30.6.1271
Moscoso del Prado Martín, F., Deutsch, A., Frost, R., Schreuder, R., De Jong, N. H., & Baayen, R. H. (2005). Changing places: A cross-language perspective on frequency and family size in Dutch and Hebrew. Journal of Memory and Language, 53, 496–512. https://doi.org/10.1016/j.jml.2005.07.003
Mulder, K., Schreuder, R., & Dijkstra, T. (2013). Morphological family size effects in L1 and L2 processing: An electrophysiological study. Language and Cognitive Processes, 28(7), 1004–1035. https://doi.org/10.1080/01690965.2012.733013
Perfetti, C. (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11(4), 357–383. https://doi.org/10.1080/10888430701530730
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Rastle, K., Davis, M. H., & New, B. (2004). The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin & Review, 11(6), 1090–1098. https://doi.org/10.3758/BF03196742
Raudszus, H., Segers, E., & Verhoeven, L. (2021). Use of morphological and contextual cues in children’s lexical inferencing in L1 and L2. Reading & Writing, 34(6), 1513–1538. https://doi.org/10.1007/s11145-021-10122-z
Rice, C. A., & Tokowicz, N. (2020). A review of laboratory studies of adult second language vocabulary training. Studies in Second Language Acquisition, 42(2), 439–470. https://doi.org/10.1017/S0272263119000500
Sánchez-Gutiérrez, C. H., Mailhot, H., Deacon, S. H., & Wilson, M. A. (2018). MorphoLex: A derivational morphological database for 70,000 English words. Behavior Research Methods, 50(4), 1568–1580. https://doi.org/10.3758/s13428-017-0981-8
Schreuder, R., & Baayen, R. H. (1997). How complex simplex words can be. Journal of Memory and Language, 37(1), 118–139. https://doi.org/10.1006/jmla.1997.2510
Schmitt, N. (1998). Tracking the incremental acquisition of second language vocabulary: A longitudinal study. Language Learning, 48(2), 281–317. https://doi.org/10.1111/1467-9922.00042
Scott, G. G., Keitel, A., Becirspahic, M., Yao, B., & Sereno, S. C. (2019). The Glasgow Norms: Ratings of 5,500 words on nine scales. Behavior Research Methods, 51(3), 1258–1270. https://doi.org/10.3758/s13428-018-1099-3
Silva, R., & Clahsen, H. (2008). Morphologically complex words in L1 and L2 processing: Evidence from masked priming experiments in English. Bilingualism: Language and Cognition, 11(02), 245–260. https://doi.org/10.1017/s1366728908003404
Tamminen, J., Davis, M. H., Merkx, M., & Rastle, K. (2012). The role of memory consolidation in generalisation of new linguistic information. Cognition, 125(1), 107–112. https://doi.org/10.1016/j.cognition.2012.06.014
Tamminen, J., Davis, M. H., & Rastle, K. (2015). From specific examples to general knowledge in language learning. Cognitive Psychology, 79, 1–39. https://doi.org/10.1016/j.cogpsych.2015.03.003
Torgesen, J. F., Wagner, R. K., & Rashotte, C. A. (1999). Test of Word Reading Efficiency (TOWRE). Pro-Ed.
Tucker, R., Castles, A., Laroche, A., & Deacon, S. H. (2016). The nature of orthographic learning in self-teaching: Testing the extent of transfer. Journal of Experimental Child Psychology, 145, 79–94. https://doi.org/10.1016/j.jecp.2015.12.007
Ullman, M. T. (2005). A cognitive neuroscience perspective on second language acquisition: the declarative/procedural model. In Sanz, C. (Ed.), Mind and context in adult second language acquisition: Methods, theory and practice (pp. 141–178). Georgetown University Press.
Van Heuven, W. J. B., Mandera, P., Keuleers, E., & Brysbaert, M. (2014). Subtlex-UK: A new and improved word frequency database for British English. Quarterly Journal of Experimental Psychology, 67(6), 1176–1190. https://doi.org/10.1080/17470218.2013.850521
Ward, J., & Chuenjundaeng, J. (2009). Suffix knowledge: Acquisition and applications. System, 37, 461–469. https://doi.org/10.1016/j.system.2009.01.004

Table 1. Mean Characteristics of Word Sets and Constituent Morphemes Used in the Experiment

Figure 1. Study Flowchart.

Figure 2. Training procedure.

Figure 3. Mean RTs and ERs as a function of stem status for L1 and L2 participants.

Figure 4. Mean RTs and ERs for each morphological family size condition by L1 and L2 participants.

Figure 5. Mean error rates as a function of stem status (left panel) and family size (right panel) for L1 and L2 participants in the spelling task.