Highlights
• Language-switching costs vary by context and are influenced by language dominance.
• Picture naming shows greater switch cost asymmetry due to top-down control demands.
• Reading aloud reduces asymmetry with distinct orthographies.
• Language context and task modality jointly shape bilingual language production processes.
1. Introduction
Bilingual speakers are known to switch from one language to another flexibly in various communicative contexts and language modalities (Green & Abutalebi, 2013). This impressive ability requires the brain to select the target language and de-select the non-target language, with the activation and inhibition of two or more languages (Inhibitory Control Model; Green, 1998). However, there is a delayed response when switching languages, and this “switch cost” is greater when switching from the later-acquired, non-dominant language (here, L2) to the dominant native language (L1). This switch cost asymmetry is usually driven by the differences in L1 and L2 language proficiency and use, leading to greater switch cost to the dominant L1. Switch cost has been reported in previous studies that focus on language production processes such as object naming (e.g., Christoffels et al., 2007; Costa et al., 2006; Costa & Santesteban, 2004; Declerck et al., 2012; Finkbeiner et al., 2006; Ma et al., 2016; Timofeeva et al., 2023) and other production modalities such as reading written words aloud (Macizo et al., 2012; Slevc et al., 2016).
In addition to language proficiency and use, the factors contributing to the asymmetry still require further investigation (see Gade et al., 2021, for a meta-analysis). Recently, Timmer et al. (2019) used a picture naming task and suggested that language context is another key factor that leads to the pattern of switch cost asymmetry. Picture naming in a language switching paradigm is a top-down production process, and this has been a popular way to examine switch cost asymmetry, but speech production does not merely involve naming pictures. Consider asking a bilingual person to read words aloud instead (e.g., Macizo et al., 2012; Slevc et al., 2016): this production modality relies on bottom-up processing. The major question we ought to answer is: when exposed to different language contexts, do bilingual speakers perform language switching in the same way? More specifically, can the underlying language control mechanism be modulated by language context and production modality? Here, investigating different production modalities and manipulating language ratio in different contexts, we aim to further examine the pattern of switch cost asymmetry. To that end, we present a study consisting of two experiments that directly address this ongoing debate regarding the nature of language switching in bilingual speakers.
1.1. Switch cost in bilingual production
Languages (dominant L1 and non-dominant L2) are activated in parallel (e.g., joint activation, Bialystok, 2017); hence, inhibition arises when the two languages compete for production (Green, 1998). When bilinguals switch between languages, this places a burden on the brain, as it needs to activate the target language and deactivate the non-target language. To reduce the competition, the non-target language (i.e., the language that is not currently needed) is inhibited to boost the activation of the target language. When an inhibited language needs to be reactivated (i.e., to switch back to an inhibited language), overcoming such inhibition causes a delayed response, and this increased naming latency depends on the magnitude of inhibition. The L1 requires more inhibition than the L2 during speech production due to its higher resting level of activation (higher interference), resulting in a longer delay when switching from L2 to L1 than vice versa. It is also known that, within different language contexts, bilinguals can adapt themselves and optimise the necessary mechanisms to achieve the language goal (Adaptive Control Hypothesis (ACH), Green & Abutalebi, 2013). These language contexts then shape the level of activation, thereby affecting the cognitive control processes (e.g., attention, inhibition, conflict monitoring, task (dis)engagement). Hence, switching between languages in a dual-language context (i.e., a context where both languages are required) can lead to varying language performance.
To test inhibition and switch cost asymmetry, a cued-switching paradigm is used in which bilinguals name sets of items (e.g., pictures, digits) according to a cue signalling the target language. This paradigm contains switch trials, in which the current naming language differs from the preceding naming language (e.g., an L1 switch trial refers to switching from L2 to L1), and non-switch trials, in which the current naming language is the same as the preceding naming language (e.g., an L1 non-switch trial refers to naming in L1 after a trial that was also named in L1).
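The trial definitions above can be illustrated with a short sketch. It is only an illustration of the labelling logic, not the code used in any of the cited studies; the function name `label_trial_types` and the example sequence are hypothetical.

```python
from typing import List, Optional


def label_trial_types(languages: List[str]) -> List[Optional[str]]:
    """Label each trial as 'switch' or 'non-switch' relative to the previous trial.

    The first trial has no predecessor, so it is labelled None.
    """
    if not languages:
        return []
    labels: List[Optional[str]] = [None]
    for previous, current in zip(languages, languages[1:]):
        # A switch trial is one whose naming language differs from the preceding one.
        labels.append("switch" if current != previous else "non-switch")
    return labels


# Hypothetical cue sequence: L1, L1, L2, L2, L1
sequence = ["L1", "L1", "L2", "L2", "L1"]
labels = label_trial_types(sequence)
```

In this sequence the third trial is an L2 switch trial (L1 to L2) and the fifth is an L1 switch trial (L2 to L1).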
In a seminal study, Meuter and Allport (1999) employed a cued-switching paradigm and tested unbalanced bilinguals (i.e., L1 more proficient than L2) by asking the participants to name digits in either L1 or L2 according to the colour cues presented on each trial. Supporting Green’s (1998) hypothesis, Meuter and Allport’s results showed greater L1 switch cost when switching from L2 to L1 than vice versa (i.e., switch cost asymmetry). This finding highlighted that the magnitude of inhibition that bilinguals need to overcome differs across languages, with L1, the more proficient dominant language, requiring greater inhibition than L2. There are alternative accounts for greater switch cost from L2 to L1 apart from inhibitory control (e.g., response selection hypothesis, Finkbeiner et al., 2006; persisting activation hypothesis, Philipp et al., 2007; general selective mechanism hypothesis, Blanco-Elorrieta & Caramazza, 2021; see Declerck & Koch, 2023, for a review).
Language switching studies to date have mostly reported switch cost, with different cue-stimulus manipulations. There are studies showing asymmetrical switch cost in unbalanced bilinguals (e.g., Costa & Santesteban, 2004; Campbell, 2005; Jackson et al., 2001; Meuter & Allport, 1999; Verhoef et al., 2009; Zuo et al., 2022; but see Liu et al., 2016), symmetrical switch cost in balanced bilinguals (Costa et al., 2006; Costa & Santesteban, 2004; Timofeeva et al., 2023), and an asymmetrical switch cost pattern when language switching involves more than two languages (Costa et al., 2006; Declerck et al., 2015; Declerck & Philipp, 2018; Philipp et al., 2007).
Some studies found switch cost reduction or absence of switch cost when presenting a language cue before a target stimulus (Costa & Santesteban, 2004; Ma et al., 2016; Verhoef et al., 2009; Khateb et al., 2017; Mosca & Clahsen, 2016; Mosca et al., 2022), when switching is voluntary (Blanco-Elorrieta & Pylkkänen, 2017; De Bruin & McGarrigle, 2024; De Bruin & Xu, 2023; De Bruin et al., 2018; Gollan & Ferreira, 2009; Jevtović et al., 2020), and when a contextual cue is used (Blanco-Elorrieta & Pylkkänen, 2018; Liu et al., 2019). Recently, some researchers have also investigated the effect of language contexts (Olson, 2016; Timmer et al., 2019) and processing modalities (Li et al., 2024) on bilingual language control.
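To make the notion of asymmetry concrete, the sketch below computes a switch cost per language as the difference between mean switch and non-switch latencies, and the asymmetry as the difference between the L1 and L2 costs. The reaction times are hypothetical values chosen for illustration and are not taken from Meuter and Allport (1999); the function names are likewise assumptions.

```python
def switch_cost(mean_rt_switch: float, mean_rt_non_switch: float) -> float:
    """Switch cost in ms: how much slower switch trials are than non-switch trials."""
    return mean_rt_switch - mean_rt_non_switch


def switch_cost_asymmetry(l1_cost: float, l2_cost: float) -> float:
    """Positive values mean a larger cost when switching into L1 than into L2."""
    return l1_cost - l2_cost


# Hypothetical mean naming latencies in milliseconds (illustration only).
l1_cost = switch_cost(mean_rt_switch=950, mean_rt_non_switch=850)
l2_cost = switch_cost(mean_rt_switch=930, mean_rt_non_switch=880)
asymmetry = switch_cost_asymmetry(l1_cost, l2_cost)
```

With these made-up numbers the L1 cost is 100 ms, the L2 cost is 50 ms, and the asymmetry is 50 ms in favour of a larger L1 cost; a value near zero would correspond to symmetrical switch cost, and a negative value to reversed asymmetry.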
1.2. The effect of language context and production modalities
When it comes to language switching, the performance of two languages is highly associated with the dual-language context. According to ACH, this context involves the most cognitive control processes, especially goal maintenance and interference control. The inhibitory process is embedded within interference control. Given the load of cognitive demands, bilingual speakers must show a highly flexible language processing ability, which makes switching in a dual-language context an interesting behaviour to explore. To be able to test the nature of language control mechanisms, some studies have taken a step further by manipulating the ratio of L1 and L2.
According to Olson (2016) and Timmer et al. (2019), language context is an important factor, as bilinguals are exposed to different dual-language contexts that modulate the magnitude of inhibition, thereby determining the presence of switch cost asymmetry and highlighting their flexibility. Olson (2016) manipulated the ratio of L1 and L2 in different language contexts (i.e., “Monolingual” contexts, with pictures mostly named in either L1 or L2, and “Bilingual” contexts, with half named in L1 and half in L2). The findings showed a switching asymmetry in the Monolingual context (greater switch cost in L1 than L2), but symmetrical switch cost in the Bilingual context. These results suggest that language context might affect the amount of competition and inhibition during switching, and hence the delay in responding, whereas when both languages are equally used, the asymmetry is absent. It is noteworthy that Olson’s (2016) asymmetrical pattern between L1 and L2 in the monolingual context could be driven by the language block itself. For instance, L1 switch trials were presented in an L2 monolingual context, whilst L1 non-switch trials were presented in an L1 monolingual context. Therefore, L1 switch cost effects were the result of not only a switching manipulation but also a simultaneous language context manipulation (i.e., L1 switch in L2 context vs. L1 non-switch in L1 context). Hence, the asymmetrical pattern found could potentially be the result of the different language contexts in which the switch and non-switch trials were presented. Timmer et al. (2019) took a step further and teased apart language effects (L1 vs. L2) from context effects during language switching. They employed L1-predominant and L2-predominant contexts, with both contexts including L1 and L2 switch and non-switch trials.
Note that one group of Dutch-English bilinguals completed the L1-predominant context, and another group of Dutch-English bilinguals completed the L2-predominant context. In contrast to Olson (2016), Timmer and colleagues found symmetrical switch cost in the L1-predominant context but reversed asymmetry (i.e., greater L2 switch cost than L1 switch cost) in the L2-predominant context. The authors suggest that the symmetrical pattern observed in the L1-predominant context could be the result of global L1 inhibition, meaning that L1 responses were slowed overall to the benefit of L2 (see also Christoffels et al., 2007). Taken together, these results indicate that language context might modulate switch cost patterns and overall L1/L2 performance. However, the underlying reasons leading to two completely different findings across these two studies remain unclear (e.g., individual bilingual life experiences, methodological choices). It is noteworthy that in Timmer et al. (2019) the sequence of switch and non-switch trials was fixed, which might also affect the predictability of switching and the role of activation and inhibition (see Jackson et al., 2004).
What previous studies show so far is that bilingual language control mechanisms are highly adaptable and that switch cost does not remain static. However, the evidence for the dynamic nature of switch cost patterns in different linguistic environments has come from language switching tasks in which the same stimuli were presented several times (e.g., four times or more) to each participant. This can, to some extent, cause facilitation during the naming process (Wodniecka et al., 2020). The current study therefore set out to investigate how adaptable the bilingual language control mechanism is while minimising the repetition of each stimulus. Furthermore, if switch cost asymmetry can be modulated by language context, it is also intriguing to see how different language processing approaches affect switch cost asymmetry when bilinguals encounter different stimuli. To that end, the question we aim to answer is whether switch cost asymmetry is consistent across top-down (naming pictures) and bottom-up (reading words aloud) speech production and whether the more frequently used language leads to delayed performance when re-activating an inhibited language.
Reading words aloud is a bottom-up production modality in which a word form automatically activates its semantic and phonological features (see BIA+, Dijkstra & van Heuven, 2002; see also Mosca & De Bot, 2017). The speed of responding to a word depends on language proficiency; that is, responses in the dominant, more proficient L1 are expected to be faster than in the non-dominant L2. This production modality has been investigated in the past, with different switch cost patterns observed (Declerck et al., 2019; Filippi et al., 2014; Macizo et al., 2012; Reynolds et al., 2016; Slevc et al., 2016; Zuo et al., 2022). As with the top-down production modality, this work has shown that when bilinguals process and produce words, competition still emerges between languages (e.g., cross-language lexical activation, Thierry & Wu, 2007). Taking Chinese-English bilinguals as an example, they can show either asymmetrical or symmetrical switch cost. Slevc et al. (2016) used Pinyin (i.e., Romanised Chinese phonology), Chinese characters, and English words and found that participants showed symmetrical switch cost. However, in a recent study, Zuo et al. (2022) used even simpler univalent stimuli (i.e., alphabet letters) by asking Chinese-English bilinguals to read the letters aloud, and their results showed asymmetrical switch cost. In other words, we expect inhibition to be present whenever bilinguals switch, yet the resulting switch cost can be either symmetrical or asymmetrical. This leads to a question: what causes (a)symmetrical switch cost in the bottom-up production modality?
In the present study, we aimed to investigate not only the difference between the two production modalities but also the effect of language context contributing to switch cost patterns.
1.3. The current study
In the current study, we tested how language context and production modality affect switch cost asymmetry, in order to assess the flexibility of bilingual language control mechanisms in a group of highly proficient Chinese-English bilinguals living in the UK. Previous studies (e.g., Olson, 2016; Timmer et al., 2019) have revealed how bilingual speakers adapt to different language contexts, but multiple repetitions of stimuli and fixed sequences of critical trials can potentially affect the (a)symmetrical patterns within and across language contexts. To create the contexts in the current study, we minimised stimulus repetition (i.e., presenting each picture only once in each language) and used a cued-switching paradigm in three language contexts: an “L1-predominant” context (75% of pictures named in L1-Chinese), an “L2-predominant” context (75% of pictures named in L2-English), and a “Balanced” context (50% L1-Chinese, 50% L2-English). Note that we also used face cues to establish a more natural cued-switching task in Experiment 1 and to avoid increasing cognitive demands during switching (Blanco-Elorrieta & Pylkkänen, 2017; Liu et al., 2019).
Furthermore, we were also interested in how language context affects bilingual speech production in bottom-up production modality (here, reading words aloud). This then motivated us to conduct the second experiment. In the second experiment, we tested if the switch cost pattern was sensitive to different language contexts when reading aloud. In both Experiments 1 and 2, we used a within-participant design (i.e., all participants undertook all language contexts) to avoid confounding effects that could potentially modulate (a)symmetrical patterns (e.g., slightly different bilingual experiences). Our research questions and predictions are as follows:
(1) How does language context affect switch cost asymmetry in top-down language processing? If the magnitude of inhibition can be driven by language use and context, we would expect asymmetrical switch cost (i.e., greater L1 switch cost) in the L1-predominant context, given that L1 is the dominant language and this context further raises its activation, resulting in the asymmetry. Similarly, we would expect reversed asymmetry (i.e., greater L2 switch cost than L1 switch cost) in the L2-predominant context, since L2 would be highly activated (see also Timmer et al., 2019). Here, it is noteworthy that our participants are highly proficient bilinguals living in an L2 environment. As a result, we also predicted that the asymmetry would be absent (i.e., symmetrical switch cost) in the Balanced context (Zhu & Sowman, 2020; but see Campbell, 2005, for asymmetrical switch cost in bilinguals living in the L2 country). Given that L1 and L2 are used equally in this context, this should lead to the same magnitude of inhibition and switch cost for both languages in highly proficient bilinguals. Note that the Balanced context in this study also serves as a baseline condition for our sample of bilinguals, since inherent demographic language variables of a given sample have been shown to lead to different switching patterns (e.g., proficiency/dominance, Costa et al., 2006; age of acquisition, Bonfieni et al., 2019; switching frequency, Han et al., 2022).
(2) How does language context affect switch cost in a bottom-up production modality (i.e., reading words aloud)? To date, no study has investigated bottom-up production in different contexts; thus, we based our predictions on those for the first research question. We predicted that switch cost asymmetry would emerge in the L1-predominant context (i.e., asymmetrical switch cost) and the L2-predominant context (i.e., reversed asymmetrical switch cost), since both contexts contain a predominant language. In the Balanced context, however, we predicted symmetrical switch cost since, again, both languages are equally used.
2. Experiment 1
2.1. Methods
2.1.1. Participants
We ran a sample-size power analysis based on Olson (2016) before recruiting participants. This was done using the SuperPower package developed by Lakens and Caldwell (2021). The results showed that we needed at least 44 participants to detect the switch effect. Given that this was an online experiment and we could not monitor participants’ performance during the task, we increased the number of participants to 61 to make sure that there were still enough data to run the analysis. Sixty-one bilingual adults (54 women), with Chinese (L1) as the dominant language and English (L2) as the non-dominant language and with normal or corrected-to-normal vision, volunteered to participate. These Chinese-English participants (mean age = 26.8, SD = 5.1) were residing in the United Kingdom. We used English LexTale (Lemhofer & Broersma, 2012) to measure participants’ English proficiency. Four participants were excluded before data analysis: one with invalid audio files, one who reported a learning disability, and two who did not follow the given instructions (i.e., they named all pictures in English throughout the experiment). In addition, another four participants were excluded because their accuracy rate was below 65%, leaving 53 participants for data analysis. These 53 participants scored a mean of 67.7 (SD = 12.5) on English LexTale (see Table 1 for LexTale and self-rated language proficiency). Of the 53 participants, 25 spoke other languages and dialects (e.g., French, Japanese, Korean, Spanish and Taiwanese Hokkien) (see Table S1 in the Supplementary Material). At the end of the experiment, participants received a £5 voucher. Ethics approval for this study was granted by Lancaster University (FASSLUMS-2022-0759-RECR-3).
Table 1. Self-rating language background (range: 0–10 for the first six components)

2.1.2. Materials
Stimuli were presented in the form of grey pictures. Faces of a Chinese and an American celebrity were used as the language cues for this study (i.e., Jackie Chan and Tom Cruise, respectively). For initial picture selection, objects that scored below one on the H-statistic index (see Footnote 1; a criterion for name agreement) for both L1 and L2 were selected as critical stimuli. Given that all fillers would be excluded from data analysis, fillers were randomly selected from pictures scoring above one on the H-statistic. For our experiment, the 150 fillers were divided into three sets, one for each language context.
Three hundred pictures (150 critical pictures and 150 fillers) were selected from Multipic (Duñabeitia et al., 2022) and divided into six sets of pictures. All critical pictures were matched in terms of their name agreement, frequency, number of syllables and number of phonemes (see Table S3 in the Supplementary Material for means and standard deviations). In addition to the Multipic Picture Dataset developed by Duñabeitia and colleagues, four other datasets were employed to retrieve more information for each lexical item. These four datasets were (1) Chinese Lexical Dataset (Sun et al., 2018), (2) SUBTLEX-UK (Van Heuven et al., 2014), (3) SUBTLEX-CH (Cai & Brysbaert, 2010), and (4) Irvine Phonotactic Online Dictionary (Vaden et al., 2009). The first was used to retrieve the number of syllables and number of phonemes for Chinese lexical items. The second was used to retrieve English word frequency based on Zipf’s law (Zipf, 1949). The third was used to retrieve Chinese word frequency. To calculate word frequency for each target word according to Zipf’s law, we used a function proposed by Van Heuven and colleagues (see Footnote 2). Finally, the fourth was used to retrieve the number of syllables and phonemes as well as phonetic notation for English words. The sets of critical stimuli did not differ significantly in average number of phonemes or frequency, either for the Chinese items (Mean word phonemes = 5.61, SD = 1.72, p = .68; Mean frequency = 3.9, SD = .59, p = .36) or for the English items (Mean word phonemes = 4.71, SD = 1.76, p = .83; Mean frequency = 4.13, SD = .55, p = .97).
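For readers unfamiliar with Zipf-scaled frequency, the sketch below shows the commonly cited definition of the Zipf scale associated with Van Heuven et al. (2014): the base-10 logarithm of a word's frequency per million words, plus 3. The exact function referred to in the footnote is not reproduced here, so this is an assumption about its general form; it omits any smoothing for unseen words, and the function name and the counts in the example are hypothetical.

```python
import math


def zipf_value(count: int, corpus_size_tokens: int) -> float:
    """Zipf-scale frequency: log10(frequency per million words) + 3.

    Simplified sketch: no smoothing is applied, so the count must be positive.
    """
    if count <= 0 or corpus_size_tokens <= 0:
        raise ValueError("count and corpus size must be positive")
    per_million = count / (corpus_size_tokens / 1_000_000)
    return math.log10(per_million) + 3.0


# Hypothetical word occurring 10 times in a 1-million-token corpus:
# 10 per million words corresponds to a Zipf value of 4.
example_zipf = zipf_value(count=10, corpus_size_tokens=1_000_000)
```

On this scale, the mean frequencies reported above (3.9 for Chinese, 4.13 for English) would correspond to roughly 8 to 13 occurrences per million words.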
2.1.3. Procedure
The picture-naming experiment was conducted online via Gorilla, an online data collection platform for behavioural experiments (Anwyl-Irvine et al., 2020). Our language-switching task was adapted from a task structure developed by Declerck and Kirk (2023) (https://app.gorilla.sc/openmaterials; see Footnote 3). Each participant completed a practice block (i.e., 25 practice trials using non-critical stimuli), followed by three experimental picture-naming blocks of 200 trials each, leading to 625 trials in total. Within the 200 trials of each block, 100 trials were critical trials and 100 trials were fillers. Note that the practice trials and filler trials were later excluded from data analysis. There were four conditions in each set of 100 critical trials, namely 25 L1 switch trials (i.e., switch from L2 to L1), 25 L2 switch trials (i.e., switch from L1 to L2), 25 L1 non-switch trials (i.e., staying in L1), and 25 L2 non-switch trials (i.e., staying in L2). Therefore, the ratio of critical trials for each language (L1/L2) and condition (switch/non-switch) remained constant across the three language contexts: 50% of critical trials were in each language and 50% were in each condition. Fillers were presented randomly and were used to create the language context conditions. We employed three language context blocks: “L1-predominant” (75% of pictures were named in L1, achieved with 100 fillers named in L1), “L2-predominant” (75% of pictures were named in L2, achieved with 100 fillers named in L2), and “Balanced” (50% of pictures were named in L1 and the other 50% in L2, with 50 fillers in L1 and 50 fillers in L2).
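The arithmetic behind the three contexts can be checked with a small sketch. The trial counts come from the description above; the dictionary layout and the function name `l1_proportion` are illustrative assumptions rather than part of the original experiment code.

```python
# Critical trials per block: 25 switch + 25 non-switch in each language.
CRITICAL_PER_LANGUAGE = 50

# Filler trials per block, by naming language (counts taken from the text).
FILLERS = {
    "L1-predominant": {"L1": 100, "L2": 0},
    "L2-predominant": {"L1": 0, "L2": 100},
    "Balanced": {"L1": 50, "L2": 50},
}


def l1_proportion(context: str) -> float:
    """Proportion of all trials in a block that are named in L1."""
    fillers = FILLERS[context]
    l1_trials = CRITICAL_PER_LANGUAGE + fillers["L1"]
    total_trials = 2 * CRITICAL_PER_LANGUAGE + fillers["L1"] + fillers["L2"]
    return l1_trials / total_trials


proportions = {context: l1_proportion(context) for context in FILLERS}

# 25 practice trials plus three blocks of 200 trials.
total_trials_per_participant = 25 + 3 * 200
```

The critical trials are identical in composition across blocks, so only the fillers shift the overall language ratio: 150 of 200 trials are in L1 in the L1-predominant block, 50 of 200 in the L2-predominant block, and 100 of 200 in the Balanced block.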
Filler pictures were presented twice within the same block. In L1-predominant and L2-predominant, fillers were named twice, and the naming language was congruent with the predominant language of the context. In the Balanced context, fillers were named once in L1 and once in L2. All critical stimuli were presented only twice throughout the entire experiment, once in each language and never within the same block. For instance, if a picture was presented in an L1 switch trial in L1-predominant, the next presentation of the same picture would be presented in an L2 switch trial in L2-predominant. Six lists were created, and participants were allocated to each list pseudo-randomly. Within each block, all stimuli were randomly presented whilst preserving a fixed order of trial type and language, and the first three trials were kept as fillers.
The apparatus used for the experiment was participants’ personal computers/laptops (for stimulus presentation on Gorilla) and their headphones (for audio recording). Before the experiment began, all participants were asked to do the tasks in a quiet place so that there was no background noise in the recording. Participants were first presented with instructions for each block, followed by a fixation cross that was presented on screen for 250 ms. Subsequently, a language cue indicating the target language and a stimulus were presented at the same time for 1500 ms. Upon the presentation of a cue and a stimulus, participants were instructed to name the picture according to the target language cue as fast and as accurately as they could. Both the cue and the stimulus disappeared after 1500 ms, but participants had another 1500 ms to respond; that is, the response window lasted 3000 ms in total. The trial ended with a blank screen for 150 ms prior to the beginning of the next trial. To reduce fatigue during the experiment, participants had a 2-min break between blocks. After completing the main tasks, participants proceeded to English LexTale (Lemhofer & Broersma, 2012) and LEAP-Q (Marian et al., 2007). The experiment took approximately 45 minutes to complete.
2.1.4. Accuracy coding and reaction time measurement
Each participant had 625 audio files (25 practice trials, 300 critical trials, and 300 filler trials), making a total of 38,125 audio files (625 audio files × 61 recruited participants). Before measuring the accuracy rate and reaction times, audio recordings of the practice trials were excluded. Accuracy coding was done for all 600 remaining trials before excluding the filler trials for reaction time analysis. This was to ensure that if a critical trial was preceded by an error (in either a filler or a critical trial), that trial would be excluded, since it could be considered neither a switch nor a non-switch trial. To measure the accuracy of each trial, data were first coded manually by adapting an accuracy coding method developed by Declerck et al. (2021) on the Gorilla platform (https://app.gorilla.sc/openmaterials/236318). In Declerck and colleagues’ example, manual accuracy coding responses were categorised as “Correct,” “Incorrect,” or “No/Other-Sound.” However, we categorised the audio files based on five criteria: (1) Correct same word, (2) Correct different word (i.e., synonym), (3) Incorrect same language, (4) Incorrect different language and (5) No/other sound. We added two more accuracy criteria because participants were not familiarised with the pictures and words beforehand (this was because we aimed to minimise the potential for facilitation after a second naming, see also Branzi et al., 2014). If a participant named a picture with a different word (a synonym in the same language), such a response was considered “Correct different word.” Any audio files with hesitations, filled pauses (e.g., uh, oh, um, ah), or no sound were considered “No/other sound.”
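The reason for coding fillers before discarding them can be sketched as follows: a critical trial only counts as a switch or non-switch trial if the trial immediately before it, whether filler or critical, was answered correctly. The sketch is a hypothetical illustration, not the authors' script. It treats codes (1) and (2) as correct, and it additionally drops critical trials that were themselves errors, which is an assumption made here for the reaction time analysis.

```python
from typing import Dict, List

# Accuracy codes (1) and (2) from the five-way scheme count as correct.
CORRECT_CODES = {1, 2}


def usable_critical_trials(trials: List[Dict]) -> List[int]:
    """Return indices of critical trials usable for reaction time analysis.

    `trials` is in presentation order; each item has 'is_critical' (bool)
    and 'code' (accuracy code 1-5). Field names are hypothetical.
    """
    usable = []
    for index, trial in enumerate(trials):
        if not trial["is_critical"]:
            continue  # fillers are never analysed, only used as context
        if trial["code"] not in CORRECT_CODES:
            continue  # assumption: errors on the trial itself are dropped
        if index == 0 or trials[index - 1]["code"] not in CORRECT_CODES:
            continue  # preceded by an error (or nothing): not a valid (non-)switch
        usable.append(index)
    return usable


# Hypothetical mini-sequence of seven trials.
example_trials = [
    {"is_critical": False, "code": 1},
    {"is_critical": True, "code": 1},
    {"is_critical": True, "code": 3},
    {"is_critical": True, "code": 1},
    {"is_critical": False, "code": 5},
    {"is_critical": True, "code": 2},
    {"is_critical": True, "code": 1},
]
usable = usable_critical_trials(example_trials)
```

In the example, the sixth trial is a correct critical trial but follows a filler coded as "No/other sound", so it is excluded even though the filler itself would never enter the analysis.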
After accuracy coding, we measured reaction times for each recording file. The reaction times were measured both via Chronset (Roux et al., 2017) and manually by the experimenter. We uploaded the 300 audio files of critical trials from each participant to Chronset to measure the voice onset time. The experimenter then randomly selected 5% of the recording files from each participant to manually measure the reaction times (this method was adopted from Declerck et al., 2021). To check the agreement between the two methods, we ran a Pearson correlation between the Chronset estimates and the manually measured voice onset times (r = .90).
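The reliability check described above amounts to a Pearson correlation between paired onset estimates. A minimal sketch using only the standard library is given below; the function name and the example onset values are hypothetical and do not reproduce the study's data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sample has zero variance")
    return covariance / math.sqrt(ss_x * ss_y)


# Hypothetical voice onset times (ms) for the same recordings,
# estimated automatically and by hand.
automatic_onsets = [812.0, 945.0, 1020.0, 760.0, 1310.0]
manual_onsets = [805.0, 960.0, 1001.0, 772.0, 1295.0]
agreement = pearson_r(automatic_onsets, manual_onsets)
```

A coefficient close to 1 indicates that the automatic and manual onsets rank and scale the recordings similarly, although it does not by itself rule out a constant offset between the two methods.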
2.1.5. Data cleaning and processing
The data were cleaned and processed using RStudio (R Core Team, 2022). Trials that were preceded by errors or no responses (i.e., criteria (3), (4), and (5)) were excluded from analysis because these were not considered switch or non-switch trials (13.38%). Five items were excluded from further analyses upon inspection of item accuracy (i.e., compass, dummy, microscope, ostrich, scales) as the accuracy rate was below 20%. Physiological implausible responses (RTs below 150 ms) and timeout (RTs above 3s) were excluded (0.52%). Responses above and below 2.5 standard deviations of the mean RTs by participant (2.21%), by item (2.63%), and by participant and item (4.01%) were excluded.
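The reaction time trimming can be sketched in a few lines. The sketch below is a simplified, hypothetical illustration in Python rather than the R code used in the study: it applies the 150 ms and 3000 ms limits and then the 2.5 SD criterion by participant only, whereas the study also trimmed by item and by participant and item. The order of the steps, the field names, and the use of the sample standard deviation are assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List


def trim_rts(trials: List[Dict], low: float = 150, high: float = 3000,
             sd_cutoff: float = 2.5) -> List[Dict]:
    """Drop implausible RTs, then RTs beyond sd_cutoff SDs of each participant's mean."""
    # Step 1: remove physiologically implausible responses and timeouts.
    plausible = [t for t in trials if low <= t["rt"] <= high]

    # Step 2: collect the remaining RTs per participant.
    rts_by_participant = defaultdict(list)
    for trial in plausible:
        rts_by_participant[trial["participant"]].append(trial["rt"])

    # Step 3: keep trials within the SD cutoff of the participant's mean.
    kept = []
    for trial in plausible:
        rts = rts_by_participant[trial["participant"]]
        if len(rts) < 2:
            kept.append(trial)  # SD is undefined for a single observation
            continue
        if abs(trial["rt"] - mean(rts)) <= sd_cutoff * stdev(rts):
            kept.append(trial)
    return kept


# Hypothetical data for one participant: nine typical RTs, one slow outlier,
# one anticipatory response, and one timeout.
example = [{"participant": "p1", "rt": rt}
           for rt in [800] * 9 + [2500, 100, 3500]]
trimmed = trim_rts(example)
```

In the example, the 100 ms and 3500 ms responses are removed by the absolute limits, and the 2500 ms response is then removed because it lies more than 2.5 SDs above the participant's mean of the remaining trials.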
Accuracy rate and reaction times were analysed using logistic and linear mixed-effects models, respectively, using lme4 (Bates, Mächler, et al., Reference Bates, Mächler, Bolker and Walker2015b). Fixed effects included Context (L1-predominant, L2-predominant, Balanced), Language (L1, L2), and Trial type (switch, non-switch). We first fitted a maximal random-effects structure. If the model failed to converge, we ran a principal component analysis of the random effects (with the rePCA function) and dropped the components that did not contribute to the cumulative variance to reach a parsimonious model (Bates, Kliegl, et al., Reference Bates, Kliegl, Vasishth and Baayen2015). Language and trial type were contrast-coded using sum contrasts divided by the number of levels (i.e., −0.5, 0.5). Context was set with “Balanced” as the reference level. For reaction time analyses, F-values for main effects and interactions were computed using the lmerTest package (Kuznetsova et al., Reference Kuznetsova, Brockhoff and Christensen2017), using Satterthwaite approximation for degrees of freedom. For accuracy data, follow-up models for significant 3-way interactions were then conducted for each context separately using the same procedure.
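To make the ±0.5 contrast coding concrete, the Python sketch below shows what the fixed-effect coefficients of a 2 × 2 design correspond to when computed directly from cell means. It is an illustration under stated assumptions, not the mixed-effects analysis itself: the mapping of levels to −0.5 and 0.5 is assumed (the text does not say which level received which sign), the function name is hypothetical, and random effects and Context are ignored.

```python
# Assumed coding for this sketch: L1 and non-switch = -0.5, L2 and switch = +0.5.
CODES = {"L1": -0.5, "L2": 0.5, "non-switch": -0.5, "switch": 0.5}


def effects_from_cell_means(cells):
    """Return the coefficients of a saturated 2 x 2 model with +/-0.5 coding.

    `cells` maps (language, trial_type) to a mean RT in ms.
    """
    l1_ns, l1_s = cells[("L1", "non-switch")], cells[("L1", "switch")]
    l2_ns, l2_s = cells[("L2", "non-switch")], cells[("L2", "switch")]
    return {
        # Intercept: grand mean of the four cells.
        "intercept": (l1_ns + l1_s + l2_ns + l2_s) / 4,
        # Language: L2 minus L1, averaged over trial type.
        "language": (l2_ns + l2_s) / 2 - (l1_ns + l1_s) / 2,
        # Trial type: average switch cost across both languages.
        "trial_type": (l1_s + l2_s) / 2 - (l1_ns + l2_ns) / 2,
        # Interaction: L2 switch cost minus L1 switch cost (the asymmetry).
        "interaction": (l2_s - l2_ns) - (l1_s - l1_ns),
    }


def predict(coefs, language, trial_type):
    """Reconstruct a cell mean from the coefficients and the contrast codes."""
    lang, trial = CODES[language], CODES[trial_type]
    return (coefs["intercept"] + coefs["language"] * lang
            + coefs["trial_type"] * trial + coefs["interaction"] * lang * trial)
```

Under this coding, each main effect is evaluated at the average of the other factor, and the interaction coefficient is the difference between the two switch costs, so a non-zero interaction corresponds to an asymmetrical switch cost.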
2.2. Results
Here, we present the findings including descriptive statistics, accuracy, mean RTs, and main effects according to each language context, L1-predominant, L2-predominant, and Balanced. Table 2 reports the descriptive statistics within each context.
Table 2. Descriptive statistics for the cued-switching task (Experiment 1)

Note: Mean reaction times are presented in milliseconds and accuracy rates in percentage with standard deviations in parentheses.
2.2.1. Accuracy
We ran analyses based on three predictors: Language (L1 vs. L2), Trial type (non-switch vs. switch) and Context (L1-predominant vs. L2-predominant vs. Balanced) with Balanced context as the reference level. We found a significant interaction of Context with Language [χ2(2) = 8.53, p = .014]. Whilst accuracy in L2 did not significantly differ across contexts (all ps > .338), accuracy in L1 was significantly higher in L1-predominant context (mean = 92%, SD = 6%) than L2-predominant context (mean = 89%, SD = 8%) (β = 0.23, SE = .08, z = 2.79, p = .015). L1 accuracy in Balanced language context did not significantly differ from either L1- or L2-predominant context (all ps > .248). Results also revealed a significant interaction between Context and Trial type [χ2(2) = 6.05, p = .039], with significant switching effects (i.e., more accurate responses in non-switch than switch trials) in L1-predominant context (β = 0.41, SE = .18, z = 2.20, p = .028) and L2-predominant context (β = 0.61, SE = .18, z = 3.28, p = .001), but only a tendency in Balanced context (β = 0.31, SE = .18, z = 1.67, p = .094).
2.2.2. Reaction times
We ran analyses based on three predictors: Language (L1 vs. L2), Trial type (non-switch vs. switch) and Context (L1-predominant vs. L2-predominant vs. Balanced). There was a main effect of trial type (F(1, 166.3) = 51.44, p < .001) and language (F(1, 84.3) = 6.18, p = .014), showing shorter reaction times on non-switch trials than switch trials. There was a significant interaction between context and language (F(2, 10793) = 20.10, p < .001), showing that overall L1 was significantly faster in L1-predominant than L2-predominant and Balanced context (β = −63.08, SE = 7.54, t = −8.37, p < .001, and β = −57.01, SE = 7.50, t = −7.61, p < .001, respectively) with no significant differences between the latter two (β = 6.08, SE = 7.59, t = .80, p = .422). Furthermore, overall L2 performance was significantly faster in L2-predominant than Balanced (β = −15.62, SE = 7.78, t = −2.01, p = .044) but not L1-predominant (β = 4.19, SE = 7.80, t = .54, p = .591), with no significant differences in L2 between the latter two contexts (β = −11.43, SE = 7.78, t = −1.47, p = .141). A significant interaction between context and trial type was also found (F(2, 10799) = 12.96, p < .001). Simple comparisons showed a significant switching effect in each language context (L1-predominant: β = −150.09, SE = 17.66, t = −8.50, p < .001, L2-predominant: β = −106.20, SE = 17.70, t = −6.00, p < .001 and Balanced: β = −99.44, SE = 17.68, t = −5.63, p < .001). There was also a significant two-way interaction between trial type and language (F(1, 127.2) = 5.714, p = .018), which was in turn modulated by context, as revealed by a significant three-way interaction (F(2, 10795.3) = 3.02, p = .049).
We then ran additional analyses to examine the significant interactions within each context. In L1-predominant (see Figure 1 and Table 3), there was a main effect of trial type, showing longer reaction times on switch than non-switch trials (F(1, 136) = 77.34, p < .001). While we did not find a main effect of language (F(1, 72.7) = .21, p = .651), it significantly interacted with trial type (F(1, 88.9) = 9.72, p = .002), indicating an asymmetrical switch cost pattern. Simple comparisons showed a significant switching effect in both languages (see Table 3). However, results revealed a significant slowdown in L1 switch trials compared to L2 switch trials (β = 43.82, SE = 21.01, t = 2.09, p = .04), but not in non-switch trials (β = −28.34, SE = 20.19, t = −1.40, p = .164, see Table 3). In L2-predominant, there was a main effect of trial type (F(1, 136.5) = 30.43, p < .001) (i.e., slower reaction times for switch than non-switch trials, see Table 3) as well as language (F(1, 63.6) = 20.50, p < .001) (slower reaction times in L1 trials than L2 trials). There was no significant two-way interaction between trial type and language (F(1, 119.7) = .48, p = .488); that is, the switch cost pattern appeared to be symmetrical (see Figure 1). In Balanced context, there was a significant main effect of trial type (F(1, 140.2) = 27.074, p < .001), and language (F(1, 75.7) = 7.55, p = .008), suggesting that reaction times on switch trials were slower than on non-switch trials, and that L1 trials were slower than L2 trials. Additionally, the results also revealed a significant two-way interaction between trial type and language (F(1, 126.7) = 4.54, p = .035), indicating an asymmetrical switch cost pattern. Similar to L1-predominant context, simple comparisons showed significant switch effects in both languages, but significantly slower responses in L1 trials compared to L2 only in the switch condition (see Table 3).

Figure 1. Mean reaction times to switch and non-switch trials in Experiment 1.
Table 3. Estimated fixed effects of language and trial type in each context

Note: The results correspond to the post-hoc analyses performed to investigate the origin of the interaction.
*p < .05, **p < .01, ***p < .001.
Table 4. Descriptive statistics for the read-aloud task (Experiment 2)

Note: Mean reaction times are presented in milliseconds and accuracy rates in percentage with standard deviation in parentheses.
2.3. Discussion
In Experiment 1, we manipulated the ratio of L1 and L2 in each context, with each picture and language cue presented unpredictably, to measure whether language context could modulate switch cost patterns. As expected, switch cost showed an asymmetrical pattern in L1-predominant context, with larger switching effects in L1 than in L2. This asymmetrical switching pattern was also found in the Balanced context (L1/L2 equally active). This was a surprising finding as our participants were highly fluent Chinese-English bilinguals living in an L2 environment. However, we also found overall slower L1 performance in L2-predominant and Balanced context, meaning there was global inhibition of the dominant language in Balanced context as well. This indicates that our participants were clearly dominant and more proficient in their L1. In the L2-predominant context, we found a symmetrical switch cost pattern instead of the predicted reversed asymmetry (Timmer et al., Reference Timmer, Christoffels and Costa2019). However, as in the Balanced context, L2 proficiency might be playing a role. In this regard, stronger inhibition is needed for the dominant L1, leading to symmetrical switch cost. Although these patterns were inconsistent with previous findings on switch cost (Olson, Reference Olson2016; Timmer et al., Reference Timmer, Christoffels and Costa2019), our findings further confirmed that bilingual language switching performance can be modulated by different dual-language contexts (see Figure 2).

Figure 2. Mean naming reaction times to switch and non-switch trials in Experiment 2.
3. Experiment 2
We tested whether the switch cost pattern (i.e., Experiment 1: asymmetrical in the L1-predominant and Balanced and symmetrical in L2-predominant) would also emerge in other production-based tasks (here, reading aloud) or whether it was constrained to picture naming. In this experiment, Chinese-English bilinguals were asked to read aloud Chinese characters and English words.
3.1. Methods
3.1.1. Participants
We recruited 56 participants of the same type as in Experiment 1: Chinese-dominant (L1), English non-dominant (L2) bilinguals who were residing in the UK at the time of the experiment (45 women, mean age = 30.4, SD = 7.2) (see Table 1 for L1/L2 proficiency). Five participants were excluded: two whose recordings had no sound or a noisy background, two who did not follow the instructions, and one who reported cerebral palsy, leaving 51 participants for analysis. The remaining participants scored 65.9 on average on English LexTale, with some participants speaking other dialects/languages (see Table S2 in the Supplementary Material for proficiency). Ethics approval was granted by Lancaster University (FASSLUMS-2023-0759-SA-1).
3.1.2. Procedure
The procedure was identical to that of Experiment 1, including the ratio of L1 and L2 for context manipulation, but pictures were replaced with words.
3.1.3. Accuracy coding and reaction time measurement
The criteria for accuracy coding in the second experiment were identical to those of Experiment 1. The accuracy of the remaining 51 participants was high; hence, all were included in the analysis.
3.1.4. Data cleaning and processing
The method for data cleaning and processing was identical to that of Experiment 1.
3.2. Results
3.2.1. Accuracy
There was only a significant language effect in Balanced, with more accurate responses in L1 than in L2 (β = 1.917, SE = .73, z = 2.617, p = .008). No other significant effects or interactions between the three predictors were observed (all ps > .138) (see Table 4 for descriptive statistics).
3.2.2. Reaction times
We ran analyses based on three predictors: Language (L1 vs. L2), Trial type (non-switch vs. switch) and Context (L1-predominant vs. L2-predominant vs. Balanced). There was a main effect of language (F(1, 99) = 89.40, p < .001), showing faster performance in L1 than in L2. There was also a significant effect of trial type (F(1, 154) = 5.86, p = .016), with participants responding faster on non-switch than switch trials. None of the two-way or three-way interactions was significant (Language × Trial type: F(1, 147) = .01, p = .17; Language × Context: F(2, 13208) = .54, p = .583; Trial type × Context: F(1, 13198) = 1.80, p = .166; Language × Trial type × Context: F(2, 13200) = .08, p = .920). Hence, no further analyses were conducted (see Table 5).
Table 5. Estimated fixed effects of language, trial type, and interaction by language and trial type in each context

Note: The results correspond to the post-hoc analyses performed to investigate the origin of the interaction.
*p < .05, **p < .01, ***p < .001.
3.3. Discussion
We adopted a read-aloud task that required participants to read L1 and L2 words while switching between languages. As in Experiment 1, we manipulated the ratio of L1 and L2 in three different contexts. Although previous studies reported asymmetrical switch cost (e.g., Macizo et al., Reference Macizo, Bajo and Paolieri2012), Experiment 2 showed that switch cost remained symmetrical and that bilingual speakers were faster in their dominant L1 than non-dominant L2. These findings were in line with the assumption that this production modality (i.e., reading words aloud) can lead to faster L1/L2 production speed (e.g., Slevc et al., Reference Slevc, Davey and Linck2016). Furthermore, in contrast to Experiment 1, language context here did not affect switch cost pattern.
4. General discussion
This study set out to measure whether language context affected the pattern of switch cost and, importantly, whether different production modalities led to the same switching pattern. Adding to the accounts of bilingual language control, in our first experiment we investigated the effect of language context using a picture naming paradigm. To avoid potential facilitation, and unlike previous studies, we minimised repetition of each critical stimulus (i.e., presenting each critical stimulus only twice throughout the entire experiment), and our participants did not receive familiarisation training on the stimuli. In our second experiment, we further explored the dynamics of switching performance with a read-aloud task. This allowed us to test whether the switch cost pattern found in picture naming also holds when reading words aloud.
4.1. How does language context affect switch cost asymmetry in a picture-naming task?
In Experiment 1, we found asymmetrical switch cost in L1-predominant and Balanced but symmetrical switch cost in L2-predominant bilingual switching production. According to previous studies, language proficiency and use are important factors in bilingual language processing, indicating that highly proficient bilinguals will always perform differently at switching (e.g., Costa et al., Reference Costa, Santesteban and Ivanova2006; Costa & Santesteban, Reference Costa and Santesteban2004). However, our findings in Experiment 1, in line with two language context-related studies (e.g., Olson, Reference Olson2016; Timmer et al., Reference Timmer, Christoffels and Costa2019), suggested that language context, as shown by a significant three-way interaction, also plays an important role in top-down language switching. Notably, the switch cost patterns differed from those of these two previous studies, which is intriguing as it highlights the flexibility of the bilingual language control mechanism.
To begin with, we ought to discuss why top-down language switching is dynamic and why the asymmetry only emerged in L1-predominant and Balanced. As mentioned above, top-down production requires choosing between two languages when seeing an object that is associated with different lexical candidates. This often causes longer reaction times in a picture naming task and, occasionally, better performance in the non-dominant language (i.e., reversed dominance, Christoffels et al., Reference Christoffels, Firk and Schiller2007; Timmer et al., Reference Timmer, Christoffels and Costa2019; Casado et al., Reference Casado, Szewczyk, Wolna and Wodniecka2022, see also Goldrick & Gollan, Reference Goldrick and Gollan2023). Findings on the switch cost pattern in L1-predominant have been mixed. Olson (Reference Olson2016) found greater L1 switch cost in this context and suggested that the switch cost pattern merely depends on the activation of the predominant language (e.g., the predominant language triggered greater switch cost than the other language). Timmer et al. (Reference Timmer, Christoffels and Costa2019), with a more thorough study design, found symmetrical switch cost. In the current study, however, we found greater L1 switch cost. In particular, at least in L1-predominant where L1 was highly activated, we speculated that the greater switch cost in L1 was caused by the stronger inhibition applied to L1 when using L2. These findings were consistent with the classic studies (i.e., Costa & Santesteban, Reference Costa and Santesteban2004; Green, Reference Green1998; Meuter & Allport, Reference Meuter and Allport1999) in which inhibition is required to prevent L1 from interfering with L2 production.
Interestingly, in the L2-predominant context, where L2 was highly activated, the switch cost pattern became symmetrical. This finding was not in line with previous studies reporting greater L2 than L1 switch cost (i.e., reversed switch cost asymmetry) in an L2-predominant context (Olson, Reference Olson2016; Timmer et al., Reference Timmer, Christoffels and Costa2019). Furthermore, overall L2 performance was faster than that of L1. We assumed that there was a link between L1 slowing and symmetrical switch cost (see also Christoffels et al., Reference Christoffels, Firk and Schiller2007; Costa et al., Reference Costa, Santesteban and Ivanova2006; Costa & Santesteban, Reference Costa and Santesteban2004). The reason is that the language control mechanism needs to optimise the efficiency of switching, so that bilinguals can adapt to different contexts. Thus, when bilinguals are required to use L2 more often in a context (e.g., 75% of stimuli named in L2), the language control mechanism needs to substantially increase L2 activation and accessibility by globally delaying L1, given that our participants were more dominant in L1. This will then lead to slower L1 production, and the pattern of inhibition and switch cost on L1 and L2 becomes symmetrical when bilinguals switch between a proficient dominant language (here L1) and a highly activated language (here L2). Note that in L1-predominant, L1 overall performance was not faster than L2, suggesting that when L2 is not frequently used, the responses will not become faster than L1, at least in a picture naming task. We use this assumption to explain the findings in the Balanced context.
The findings for the Balanced context (i.e., 50% L1 and 50% L2) have been divergent across studies. For example, Declerck et al. (Reference Declerck, Koch and Philipp2012) and Olson (Reference Olson2016) did not find asymmetrical switch cost, showing that unbalanced bilinguals can also produce the same patterns of switch cost on L1 and L2. According to Olson, the switch cost pattern in a Balanced context should be symmetrical, given that the two languages are equally used, which was also our initial prediction for Experiment 1. Nevertheless, we found asymmetrical switch cost and global L1 slowing in this context. We attribute this to the ratio of L2, based on the assumption outlined above for L2-predominant. Compared to L1-predominant (i.e., 75% use of L1), use of L2 in Balanced was increased by 25%, thereby increasing the need to boost L2 activation by globally inhibiting L1. Still, L2 activation was not increased as much as in L2-predominant; hence, L1 was not as globally inhibited as in L2-predominant. The Balanced context, thus, would still show asymmetrical switch cost with overall slower L1 performance. Alternatively, another potential reason is that our participants were unbalanced bilinguals. Putting this type of bilingual participant in a Balanced context or in a context where the non-dominant L2 was not highly required (e.g., in L1-predominant) would eventually show asymmetrical switch cost. This reason, to some extent, would be consistent with previous studies (e.g., Costa & Santesteban, Reference Costa and Santesteban2004) that highlighted the difference between unbalanced and balanced bilinguals.
However, instead of categorising our bilingual participants as unbalanced or balanced, in the current study we focused on the flexibility of bilingual speakers in different contexts, following Green and Abutalebi (Reference Green and Abutalebi2013). For example, in Timmer and colleagues’ study (2019), L1-predominant showed symmetrical switch cost with global L1 slowing, whereas L2-predominant showed reversed asymmetrical switch cost with no global L1 slowing. In the current study, we reported the opposite findings, which was an interesting phenomenon in bilingual language control. This, perhaps, was because of where the participants were living at the time of the experiment. In particular, our participants were tested in an L2-speaking country. As bilingual speakers residing in an L2-speaking country still need to switch to L1 to communicate (as in our L2-predominant context), they have to rely on global L1 inhibition to balance the activation of L1 and L2, leading to global L1 slowing and symmetrical switch cost. Switching in L1-predominant, however, would not be a problem for these bilingual speakers because all they need to do is to strongly activate their L1 and rely on local inhibition to switch, leading to asymmetrical switch cost. In Timmer and colleagues’ study, the bilingual participants were tested in an L1-speaking country and were exposed to L2 very often. Therefore, to be able to use both L1 and L2 in L1-predominant, bilingual speakers needed to rely on global L1 inhibition to balance both languages, as shown by overall slower L1 than L2 responses (Timmer et al., Reference Timmer, Christoffels and Costa2019). However, in L2-predominant, they tended to rely on local inhibition to switch, showing reversed asymmetrical switch cost.
These findings revealed the phenomenon of how bilingual speakers adapt themselves in different contexts flexibly and potentially reflected the effect of real-life language experience on speech production. Hence, from the notion of ACH (Green & Abutalebi, Reference Green and Abutalebi2013) and individual differences (see also DeLuca et al., Reference DeLuca, Rothman, Bialystok and Pliatsikas2019), future studies can expand our interpretations from “experiments” to “real-life experience” by comparing participants living in L1 and L2-speaking countries and exploring how these experiences can modulate bilingual language processing.
4.2. How does a bottom-up production modality (i.e., reading words aloud) affect switch cost?
Another focus of the current study is the effect of production modality on switch cost pattern. In Experiment 2, we also manipulated the ratio of L1 and L2 in each language context to test whether language context had an influence on a bottom-up production modality (i.e., reading words aloud). We used three language contexts, L1-predominant, L2-predominant, and Balanced, with the same repetition of each critical stimulus as in Experiment 1. Previous studies have shown that switch cost emerged even when bilingual speakers read words aloud (e.g., Macizo et al., Reference Macizo, Bajo and Paolieri2012; Slevc et al., Reference Slevc, Davey and Linck2016; Reynolds et al., Reference Reynolds, Schloffel and Peressotti2016; Zuo et al., Reference Zuo, Schwieter, Cao and Liu2022). Indeed, in our experiments, we found consistent switching effects in both the picture naming and the read-aloud tasks. This is not surprising, as both reading aloud and picture naming require language control processes. Although it is evident that production involves more language control (e.g., inhibition) than comprehension to reduce interference and achieve a language goal (Declerck et al., 2019; Li et al., Reference Li, Midgley, Ferreira, Holcomb and Gollan2024), reading words aloud also requires a certain degree of language control. For example, reading an L2 word will automatically activate L1 lexical competitors; hence, inhibiting the L1 might also be necessary to achieve the language goal. More interestingly, even in the presence of a switching effect during reading aloud, the switch cost pattern can be either asymmetrical or symmetrical (Macizo et al., Reference Macizo, Bajo and Paolieri2012 for asymmetrical switch cost; Slevc et al., Reference Slevc, Davey and Linck2016 for symmetrical switch cost), depending on the relative activation/interference from the non-target language.
Nevertheless, as there was no interaction between any of the predictors, we reported a lack of asymmetry (i.e., symmetrical switch cost), and this pattern was not modulated by language context, regardless of the ratio of L1 and L2.
First, before going through findings of the absence of asymmetry, we would like to discuss the overall performance of L1 and L2 across the three language contexts. For orthographically unique words, the activation of lexical candidates becomes faster, depending on one’s language proficiency (Dijkstra & Van Heuven, Reference Dijkstra and Van Heuven2002; Kroll et al., Reference Kroll, Bobb and Hoshino2014). In comprehension-based studies, we can see bilingual speakers were faster at recognising words in their L1 than in L2, and this was what we also found in the bottom-up production in Experiment 2. With the performance across the three contexts, bilinguals were significantly faster at reading words aloud in their L1 than L2. From the perspective of production modality, language context did not seem to affect overall performance, indicating that the higher the proficiency, the faster the performance. This, however, was not the case for Experiment 1, as L1 was not faster than L2, and it even became slower than L2 in L2-predominant and Balanced contexts. This different pattern of results across the two experiments suggests that during top-down production processes (e.g., picture naming), inhibition of the dominant language (L1) is more required than during bottom-up production (e.g., reading out loud), especially in dual-language contexts. The next question we ought to discuss is the reason behind the lack of switch cost asymmetry across the three contexts.
Second, at least according to our findings in Experiment 2, the answer to “whether language context affects language performance” is no. We did not find a three-way interaction, which means there was no statistically significant context effect when reading words aloud. Symmetrical switch cost thus emerged across the three contexts, which was an interesting finding given the contrast between our results and previous studies (e.g., asymmetrical switch cost: Macizo et al., Reference Macizo, Bajo and Paolieri2012; Zuo et al., Reference Zuo, Schwieter, Cao and Liu2022). Note that our results were nonetheless consistent with other studies (e.g., symmetrical switch cost: Declerck et al., 2019; Slevc et al., Reference Slevc, Davey and Linck2016). If we explore this switch cost pattern from the perspective of ACH (Green & Abutalebi, Reference Green and Abutalebi2013), it is possible that bilingual speakers are highly flexible, and that bottom-up production enables them to minimise the interference suppression control process. The reason is that Chinese characters and English words are written in two patently different scripts (i.e., linguistically distant scripts); hence, bilingual speakers can use this distinctive information as a language cue and promote lexical access by constraining the amount of cross-language lexical competition (Casaponsa et al., Reference Casaponsa, Thierry and Duñabeitia2019; see also Radman et al., Reference Radman, Jost, Dorood, Mancini and Annoni2021). Language processing here will then rely more on language proficiency, which corresponds to the activation of lexical representations (e.g., lexical frequency). This interpretation would explain our results, where we found symmetrical switching effects that were not modulated by language context.
Hence, even though inhibition of the non-target language might take place to maintain the language goal (i.e., switching effects), automatic activation of the lexical representations and their phonological code will predominantly drive the effects. Therefore, the effect of language context on reading words aloud will not be as significant as in picture naming, especially for bi-script readers where word forms can be used as language cues.
With respect to a previous study, Finkbeiner et al. (Reference Finkbeiner, Almeida, Janssen and Caramazza2006) proposed a hypothesis in which univalent stimuli eliminate switch cost (but see Abutalebi & Green, Reference Abutalebi and Green2007 for an alternative account regarding Finkbeiner and colleagues’ results). However, some later studies contrasted with Finkbeiner and colleagues’ findings (e.g., Declerck et al., 2019; Macizo et al., Reference Macizo, Bajo and Paolieri2012; Reynolds et al., Reference Reynolds, Schloffel and Peressotti2016; Slevc et al., Reference Slevc, Davey and Linck2016, see also Zuo et al., Reference Zuo, Schwieter, Cao and Liu2022 for naming English alphabet and Chinese characters). Hence, we suggest that using univalent stimuli in a production-based study does not necessarily eliminate switch cost. In a previous study focusing on language switching in comprehension and production, Declerck et al. (2019) observed switch cost in the latter but not in the former. One of the factors, according to Declerck and colleagues, could possibly contribute to switch cost elimination is “parallel language activation.” Within the framework of Inhibitory Control Model (Green, Reference Green1998), all languages are activated in parallel to compete for selection, which is why inhibition is required to reduce any form of interference from the non-target language, followed by a consequence (i.e., switch cost). This cost, also pointed out by Declerck and colleagues, seems to emerge more often in production than comprehension, indicating that the parallel language activation and competition are stronger than in a comprehension-based task (see also Li et al., Reference Li, Midgley, Ferreira, Holcomb and Gollan2024). If we look at the lack of asymmetry in Experiment 2 from the perspective of parallel language activation, we suggest that parallel activation in a read-aloud task will still be present; hence, some degree of language control is required. 
Furthermore, according to ACH, a dual-language context requires the most cognitive control processes, with greater demands on goal maintenance and interference control (Green & Abutalebi, Reference Green and Abutalebi2013). In bottom-up production, this parallel activation does not necessarily trigger strong inhibition of the dominant language, thereby reducing the language context effect and causing bilingual speakers to rely more on the activation of the target language. If the language cue (i.e., word form) can help the target language reach the activation threshold, why would the control mechanisms in the dual-language context (i.e., a context that elicits the most cognitive demands) need to expend effort on the much less activated candidate? Hence, as far as our findings go, a different ratio of L1 and L2 in the dual-language context we employed here may not be as influential, given that the processing system will minimise the cognitive demands and make it easier for bilingual speakers to switch between L1 and L2. At least from our results, the cost will still be present when switching production (both top-down and bottom-up) takes place, here with a symmetrical pattern.
Finally, we are also aware that switching does not always have to be costly. Beyond Finkbeiner et al. (Reference Finkbeiner, Almeida, Janssen and Caramazza2006), it is noteworthy that an absence of switch cost has been observed in several production tasks (1) when pre-cuing is used to pre-activate the target language before naming the target picture (e.g., Mosca & Clahsen, Reference Mosca and Clahsen2016; Mosca et al., Reference Mosca, Manawamma and de Bot2022) and (2) when the target picture is associated with a more easily accessible word (e.g., Kleinman & Gollan, Reference Kleinman and Gollan2016). However, our results in reading aloud showed that using orthographically distinct stimuli would still yield switch cost. This indicates that distinct orthography does not eliminate switch cost and further reflects cognitive demands on bilingual language control mechanisms. Still, we suggest that future studies examine the effect of linguistic distance to unpack the dynamics of language processing (e.g., how distant/close language pairs affect the need for language control and switch cost, Radman et al., Reference Radman, Jost, Dorood, Mancini and Annoni2021). If an absence of switch cost emerges in future production studies, we would suggest that the relationship between language processing and inhibition (or any interference control) remains “neutral” (as proposed by Green & Abutalebi, Reference Green and Abutalebi2013) according to the nature of bilingual language processing. That is, inhibitory control in language switching will always be on guard to reduce interference, and this applies not only to bilingual speakers but also to monolingual speakers, given that we are always inhibiting irrelevant information via our neural network daily (Munakata et al., Reference Munakata, Herd, Chatham, Depue, Banich and O’Reilly2011).
Taken together, language switching performance is dynamic, and much remains to be uncovered about how easy (or costly) it is for bilingual speakers to process their L1 and L2.
5. Conclusion
In the current study, we explored the flexibility of bilingual speakers. The pattern of switch cost in bilingual language switching depends on language context (e.g., ratio of L1 and L2), but only when more cognitively demanding processing is needed (e.g., picture naming). The current study showed that in picture naming, bilingual speakers rely more on top-down inhibition to reduce interference, which leads to (a)symmetrical switch cost. In addition to switch cost, when the non-dominant language (L2) is highly activated in a context, the inhibition of L1 becomes more global, which, in turn, causes slower L1 responses. When it comes to reading words aloud, switch cost becomes symmetrical, regardless of the ratio of L1 and L2 within a dual-language context. These findings revealed that bilingual speakers are flexible and that they do not always rely heavily on greater inhibition of the dominant language in a production-based language task. To further tease apart the underlying control mechanism, more investigations of the association between language context and production modality are needed to gain a better understanding of bilingual language processing.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S1366728925100357.
Data availability statement
Our materials, anonymized data, and R scripts are available on our project site on the Open Science Framework (OSF) platform (https://osf.io/2yab9/).
Acknowledgements
We gratefully acknowledge the support provided by Lancaster University’s Camões Institute Cátedra for Multilingualism and Diversity.
Competing interests
The authors declare none.