Highlights
• Larger pupil responses in switch vs. non-switch trials, i.e., a pupil switching cost.
• Switching costs were not modulated by language, pointing to an overall balanced proficiency.
• A global slowing of the more dominant language was observed, i.e., a reversed dominance effect.
• Larger pupil responses for the non-dominant language despite faster naming speed.
1. Introduction
Research has shown that when using language, bilinguals activate words in the target and the nontarget language in parallel (Costa et al., 2000; Giezen & Emmorey, 2016). Furthermore, despite a lack of conscious awareness of the nontarget language, both languages are always active in the bilingual brain (Costa et al., 1999; Kroll et al., 2014; Wu & Thierry, 2010). This leads to interference from the nontarget language in the processing of the target language, i.e., cross-language interference. It is hypothesized that this cross-language interference is resolved through language control (e.g., Declerck et al., 2015; Green, 1998; Green & Abutalebi, 2013). Language control during bilingual language processing has mainly been explained in terms of inhibitory control processes (Green, 1998), whereby activation of the nontarget language is inhibited, reducing cross-language interference. A substantial body of work has proposed that bilingual language control relies mainly on inhibition (for a review, see Declerck & Koch, 2023). However, inconsistent findings on the effects linked to inhibitory language control have raised doubts about this view. The current study aimed to contribute to this debate by examining language control processes in a group of highly proficient bilinguals.
The inhibitory view is reflected in the Inhibitory Control Model (ICM; Green, 1998), which posits that words in the nontarget language are inhibited to facilitate the use of words in the target language. According to the ICM, bilinguals rely on inhibitory control in most linguistic situations, regardless of language proficiency. To investigate bilingual language control processes, previous research has mainly relied on language switching paradigms. Language switching occurs when bilinguals alternate between their two languages while talking to others. Switches often occur within a single discourse or sentence, and usually with no change of topic or interlocutor. Language switching is a fast and flexible process, which reflects cross-linguistic activation and systematic control of bilinguals’ two languages (Kroll et al., 2015). In cued language switching paradigms, bilinguals are tasked with naming pictures or digits following a cue indicating one of their languages. A switching cost typically emerges in mixed language conditions, with faster naming in non-switch trials (i.e., naming two consecutive items in the same language) than in switch trials (i.e., naming two consecutive items in different languages; e.g., de Bruin et al., 2020). This switching cost is hypothesized to reflect two aspects of language control, namely the process of inhibiting the nontarget language and reactivating the previously inhibited representations (Green, 1998).
Two phenomena have been repeatedly reported in cued language switching tasks: the asymmetry in language switching costs and the reversed language dominance effect. The asymmetry in language switching costs refers to larger switching costs (i.e., bigger response time differences between switch and non-switch trials) in the dominant language (DL) than in the non-dominant language (NDL). Reversed language dominance refers to worse overall performance (i.e., slower response times) in the DL than in the NDL when naming words in mixed language blocks. It is hypothesized that these two phenomena reflect distinct control processes (Declerck, 2020), both achieved through inhibitory control of one language (Green, 1998). Asymmetrical language switching costs have been argued to indicate reactive or transient language control stemming from inhibition, while reversed language dominance reflects proactive or sustained language control adjusting the overall activation level of the two languages.
Specifically, the asymmetry in language switching costs arguably results from the need for more inhibition of the DL when it is irrelevant, leading to more difficulty in reactivating it when it is subsequently the target language. Nonetheless, findings of symmetric switching costs have been reported in highly proficient bilinguals (e.g., Costa et al., 2006; Costa & Santesteban, 2004; Martin et al., 2013). Moreover, symmetric switching costs have been observed in second language learners who are clearly dominant in their native language, instead of the expected larger switching costs in the DL than in the NDL (e.g., Declerck et al., 2012).
In the case of reversed language dominance, one explanation posits that performance is best when both languages are equally activated (Declerck et al., 2020), and, thus, bilinguals may try to ensure equal activation by over-inhibiting the DL. This in turn results in more activation of the NDL and a general slowing of the DL. Findings of this effect have been inconsistent: some studies have shown a reversed language dominance effect, while others observed an advantage for the DL over the NDL or no overall language effect (for a review, see Declerck, 2020). Nonetheless, Declerck et al. (2020) found that reversed language dominance is more often seen in highly proficient balanced bilinguals, because they are more likely to overshoot when trying to equalize language accessibility.
Several studies have recorded electroencephalography (EEG) while participants performed cued language switching tasks to shed light on the cognitive processes involved in cued language switching. Specifically, an event-related potential (ERP) component that has been linked to bilingual inhibitory control is the N2 component, which has been related to response inhibition (e.g., Falkenstein et al., 1999). This component has been proposed to reflect the suppression of habitual (i.e., DL) responses during NDL switch trials (Meuter & Allport, 1999). Thus, in bilinguals, where one language is usually more dominant than the other, inhibition (of the DL) may be strongest when switching to the NDL from the DL (i.e., NDL switch trials). This interpretation is in line with the ICM (Green, 1998), which posits that higher task-irrelevant activity leads to stronger levels of inhibition. Furthermore, this perspective could explain the observed global slowing in response times (RTs) when bilinguals name words in their DL compared to the NDL (i.e., reversed dominance). On trials where there is a switch to the NDL, the DL must be strongly inhibited due to its dominance. As a result, when there is a subsequent need to switch back to the DL, that previously suppressed language requires more effort and time to re-activate, leading to slower RTs. In contrast, the NDL is less inhibited on DL switch trials, making it easier to access and resulting in faster switches to the NDL.
In an EEG cued language switching task, Jackson et al. (2001) examined the time course of language switching in bilinguals naming digits in their first or second language. Their results showed a larger N2 component (i.e., more inhibition) when switching from the first to the second language than when switching from the second to the first language, reflecting asymmetrical switching costs. However, as is the case with response time evidence in cued language switching tasks, this asymmetry in the N2 component has not been replicated consistently (for a review, see Declerck & Koch, 2023). Regarding the reversed language dominance effect observed (albeit inconsistently) in response time data, most ERP studies do not report such an effect with the N2 component (e.g., Martin et al., 2013; Kang et al., 2020; but see also Kang et al., 2018).
Importantly, it has been suggested that rather than representing inhibition, the N2 component reflects interference detection, which in turn triggers inhibition (Donkers & van Boxtel, 2004; Nieuwenhuis et al., 2003). Since these two processes are assumed to be inherently connected, it is impossible to determine which of them is linked to the N2 component. Therefore, despite the contribution of research conducted with the N2 component to the literature, it does not provide conclusive evidence for bilingual inhibitory control (Declerck & Koch, 2023).
A technique with the potential to further illuminate cognitive control processes in bilinguals is pupillometry (Reimer et al., 2014, 2016). After controlling for stimulus luminance, the measure of pupil diameter over time has been shown to be a robust measure of cognitive processing load, with larger pupil responses under conditions of increased attentional allocation, memory use and task difficulty (e.g., Ershaid et al., 2024; for reviews, see Van Engen & McLaughlin, 2018; van der Wel & van Steenbergen, 2018). Moreover, it has been reported that the pupil dilates more with increasing attentional demands related to linguistic uncertainty and complexity (Schmidtke, 2018; Zekveld et al., 2018). In bilinguals, larger pupil responses are typically observed when listening or speaking in one’s second language compared to one’s first language (Duñabeitia & Costa, 2015; Hyönä et al., 1995).
Few studies have combined pupillometry with language switching paradigms, and those that did have mainly investigated comprehension rather than production (e.g., Beatty-Martínez et al., 2021; Byers-Heinlein et al., 2017). For example, to examine language control processes during bilingual language switching, pupil responses in bilingual infants and adults were examined using a visual-world paradigm (Byers-Heinlein et al., 2017). In this study, participants saw pairs of pictures and heard a speaker name one of the pictures in either one language (“look! Find the dog”) or two languages (“look! Find le chien!”). Results showed that compared to sentences spoken in one language, codeswitched sentences resulted in larger pupil responses, indicating higher cognitive load during language switches. These and other findings (e.g., Beatty-Martínez et al., 2021; Hyönä et al., 1995) demonstrate that pupil response is a sensitive measure of a variety of language-related processes. Thus, pupillometry can be used as an index of cognitive load and attentional control during cued language switching, potentially providing insights beyond what can be concluded from response time and ERP data.
1.1. Aims and hypotheses
The primary aim of the current study was to contribute to the growing body of literature examining language control in bilinguals. A novel measure of switching cost, pupil size, was implemented during a cued language switching task. To the best of our knowledge, this is the first study to use pupillometry to examine cued language switching costs in a group of highly proficient bilinguals.
We predicted that pupil responses would be larger in switch compared to non-switch trials, indicating increased cognitive load and attention allocation when bilinguals are forced to switch. In line with previous findings of symmetrical response time switching costs in highly proficient bilinguals (Costa et al., 2006; Costa & Santesteban, 2004), we predicted symmetrical pupil switching costs across languages. Regarding a language effect, we anticipated two possible outcomes, aligning with different views on bilingual language control. Firstly, previous findings in Spanish–Basque bilinguals indicated slower responses in Spanish than Basque, despite Spanish being the more proficient or dominant language (e.g., de Bruin et al., 2020; Jevtović et al., 2020). This reflects a reversed language dominance effect, with higher global inhibition of the dominant language. Therefore, larger pupil responses during Spanish than Basque trials might be observed, since relieving this inhibition would tax more resources and, hence, lead to the slower response times observed for Spanish than Basque. Secondly, in accordance with proposals of the involvement of processes other than inhibition in cued language switching in highly proficient bilinguals (e.g., Costa & Santesteban, 2004; Declerck et al., 2015), faster response times in the less dominant language might be related to higher activation or cognitive resource allocation when processing this language; on this view, we would expect larger pupil responses during the Basque than Spanish trials (see Table 1 for a summary of these potential findings).
This outcome would align with the hypothesis that reversed language dominance might reflect a constant increase in the activation of the NDL throughout a mixed cued language switching task, alongside a global inhibition of the DL (Declerck et al., 2015).
Table 1. Summary of two potential outcomes for a language effect in the cued switching task

In addition to the pupillometry task, we included a classic cued language switching task used in previous research (e.g., de Bruin et al., 2020) to objectively assess language switching costs (i.e., response time differences between switch and non-switch trials). The aim was to confirm the validity of the adapted and novel pupillometry task for capturing the same processes examined in classic cued language switching tasks. It was hypothesized that our group of highly proficient bilinguals would exhibit symmetrical language switching costs, in line with previous findings in highly proficient bilinguals (Costa et al., 2006; Costa & Santesteban, 2004). However, studies with Spanish–Basque bilinguals have found faster response times in Basque than Spanish, despite balanced proficiency across the two languages or higher proficiency in Spanish than Basque (e.g., de Bruin et al., 2018, 2020). These results reflect a reversed language dominance effect, wherein the DL (Spanish) is more strongly inhibited than the NDL (Basque). Therefore, we predicted that a reversed language dominance effect would be observed here.
2. Materials and methods
2.1. Participants
Fifty-one early Spanish–Basque bilingual adults (35 females; mean age = 27.2, SD = 4.5, range = 19–36) were recruited from the BCBL Participa database (https://www.bcbl.eu/participa/). Sample size was selected based on sufficient power in prior work (McLaughlin et al., 2024). Because the data were collected as part of a larger study in which handedness was an inclusion criterion, one participant who reported being left-handed was excluded from further testing and analyses. Participants (N = 50, 34 females) were right-handed, had normal or corrected-to-normal vision, no known neurological or hearing impairments and gave informed consent. The study was approved by the BCBL Ethics Review Board and complied with the guidelines of the Helsinki Declaration.
All participants were highly proficient in Basque and Spanish and reported age of acquisition (AOA) for both languages before or by the age of 6 (mean AOA for Basque = 1; mean AOA for Spanish = 0.4; p > .05). Additional information about participants’ language proficiency, use and exposure for Basque and Spanish was retrieved from the Participa database (see Table 2). This included a picture naming task (BEST proficiency test; de Bruin et al., 2017) and a lexical decision task (LexTALE; de Bruin et al., 2017; Izura et al., 2014). The picture naming task involved naming 65 non-cognate pictures in Basque and Spanish (mean Basque = 58.6, SD = 5.9; mean Spanish = 64.5, SD = 1.4; p < .001), while the LexTALE was a computerized lexical decision task including 75 items in Basque and 65 items in Spanish (mean Basque = 89, SD = 7.9; mean Spanish = 93.6, SD = 5.7; p < .001). These proficiency scores are in line with previous studies within this group of highly proficient early bilinguals (e.g., de Bruin et al., 2018, 2020; Jevtović et al., 2020) and indicate that overall, participants were more proficient in Spanish than in Basque. Furthermore, participants’ subjective ratings of their language proficiency, exposure and use in Basque and Spanish were provided. For the subjective proficiency measures, participants rated their proficiency in each language on a scale from 0 to 10 in terms of speaking, understanding, reading, writing and a general proficiency score. For exposure, participants rated on a scale from 0 to 100% how often they are exposed to each language. For all subjective ratings, there were no significant differences between Basque and Spanish (all ps > .05).
Most participants had additional languages (e.g., English) acquired as a third language, but these languages are not reported here given that they are not the focus of this study.
Table 2. Summary of objective and subjective measurements of language proficiency, exposure and use of Basque (left) and Spanish (right)

2.2. Behavioral cued language switching task
2.2.1. Stimuli
Stimuli were based on de Bruin et al. (2020; see also de Bruin et al., 2018) and included 30 colored drawings from the MultiPic database (Duñabeitia et al., 2018). Picture names were matched between Basque and Spanish on word length and log frequency (see de Bruin et al., 2018, for more information about the stimuli selection and the stimuli list). A country flag preceded each picture to indicate the language to be used. As in de Bruin et al. (2020), two versions of each flag were used per language to ensure the presence of a cue switch on every trial, even when there was no language switch. This design was used because, with only one cue per language, every language switch coincides with a cue switch whereas non-switch trials repeat the cue, so the measured switching cost partly reflects a switch between cues (Logan & Bundesen, 2003; Mayr & Kliegl, 2003).
2.2.2. Procedure
The procedure followed that of de Bruin et al. (2020). Sessions started with a familiarization phase, where participants were familiarized with the names of each drawing in Basque and Spanish. This ensured participants recognized the pictures and used the correct words to name them in both languages. In the cued picture-naming task, each trial started with a fixation cross presented for 500 ms, followed by a flag cue presented for 500 ms to indicate the language to be used, which was then displayed in a smaller version above the picture to be named for 2500 ms (see Figure 1). Participants’ responses were recorded during the presentation of the picture. The task began with a practice block containing 10 trials. Next, two blocks each containing 120 trials were presented with a small break in between. Half of the trials were switch trials, and the other half were non-switch trials. For each trial type, half the trials had to be named in each language. Items were distributed equally across the trial type and language combinations and were pseudo-randomly distributed across conditions, ensuring the same item did not appear twice in a row, and that there were no more than four trials of the same trial type in a row.
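The two sequencing constraints described above (no immediate item repetition; at most four consecutive trials of the same type) can be illustrated with a minimal Python sketch. Rejection sampling is an assumption made here for illustration and is not necessarily how the trial lists were generated; the function names, item labels and seed are hypothetical, and the balancing of languages within each trial type is not modeled.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical consecutive values."""
    best = run = 0
    prev = object()  # sentinel that equals no real value
    for value in seq:
        run = run + 1 if value == prev else 1
        best = max(best, run)
        prev = value
    return best

def shuffle_until(seq, max_run, rng, max_tries=100_000):
    """Reshuffle until no value occurs more than max_run times in a row."""
    seq = list(seq)
    for _ in range(max_tries):
        rng.shuffle(seq)
        if longest_run(seq) <= max_run:
            return seq
    raise RuntimeError("no valid order found")

def make_block(items, n_trials=120, seed=0):
    """One block: balanced trial types, sequencing constraints enforced."""
    rng = random.Random(seed)
    reps = n_trials // len(items)
    # max_run=1 means the same item never appears twice in a row.
    item_order = shuffle_until(items * reps, max_run=1, rng=rng)
    type_order = shuffle_until(
        ["switch", "non-switch"] * (n_trials // 2), max_run=4, rng=rng
    )
    return list(zip(item_order, type_order))

block = make_block([f"item{i:02d}" for i in range(30)])
```

With 30 items and 120 trials per block, each item appears four times per block in this sketch; in a full design the naming language on each trial would follow from the trial-type sequence.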

Figure 1. Illustration of the behavioral switching task design, showing a switch trial, followed by a non-switch trial.
Response accuracy was coded manually following Bonfieni et al. (2019) and Han et al. (2022). Responses were coded as errors when participants used the wrong language or did not answer; these trials, together with the trial immediately following each of them, were excluded from the RT analysis. Trials with hesitations, pauses or self-corrections were also marked as errors and excluded from further analysis, but the trials following them were retained. The onset time of the recorded responses was determined with Chronset software (Roux et al., 2017) and checked manually with CheckVocal (Protopapas, 2007), following de Bruin et al. (2020).
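The exclusion rule can be summarized in a short Python sketch. The code labels ("correct", "wrong_language", "no_response", "disfluency") and the function name are hypothetical, and the sketch ignores block boundaries, which the text does not address.

```python
def rt_trials_to_keep(codes):
    """Return one boolean per trial: True if it enters the RT analysis.

    codes: per-trial labels in presentation order (hypothetical labels).
    Wrong-language and no-response errors also remove the next trial;
    disfluencies (hesitations, pauses, self-corrections) remove only
    the trial itself.
    """
    keep = []
    drop_next = False
    for code in codes:
        keep.append(code == "correct" and not drop_next)
        drop_next = code in {"wrong_language", "no_response"}
    return keep

example = ["correct", "wrong_language", "correct", "correct",
           "disfluency", "correct"]
kept = rt_trials_to_keep(example)
```

In this example the third trial is dropped because it follows a wrong-language error, whereas the trial after the disfluency is retained.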
2.3. Pupillometry cued language switching task
2.3.1. Stimuli
Twenty-six of the pictures used in the behavioral cued language switching task were adapted for use in the pupillometry switching task. Adobe Photoshop (version 25.4.0) was used to first convert the pictures to black and white and then adjust brightness and contrast, to minimize potential effects on the pupil response. All pictures were adjusted over the same gray square (300 by 300 pixels) to a mean luminosity of 174, with an orange or blue border (of matched luminosity) overlaid. The baseline image was made following this same process but with a gray border of matched luminosity. The assignment of the color cue to the response language was counterbalanced across participants.
2.3.2. Procedure
The procedure for the cued language switching task was adapted for use with pupillometry. During a familiarization phase, participants were familiarized with the color cue-naming language combinations by being presented with the orange and blue framed pictures one at a time, with the names of each item presented in the appropriate color–language combinations. In this task, a trial in which pupil response was measured consisted of two successive pictures to be named. This paired-pictures design was utilized to control for luminance across switch and non-switch trials and to allow for the pupil to go back to baseline before each new trial (see Figure 2). Participants were asked to fixate on the center of the screen throughout the task. Trials started with the presentation of the baseline image for 5 seconds. Next, participants had to name a color-framed picture (presented for 3 seconds). This first picture was followed by 500 ms of fixation on the center of the baseline image, and finally, participants had to name a second color-framed picture (presented for 3 seconds). Participants’ responses were recorded during the presentation of the first and second pictures, with the baselining occurring prior to the first naming part, and the critical analysis being on the second naming part, where the switch/non-switch conditions occurred. The task began with a practice block containing 3 trials. Next, four blocks each containing 52 trials were presented with a small break in between each block. Half of the trials were switch trials, and the other half were non-switch trials. For each trial type, half the trials had to be named in each language. Items were distributed equally across the trial type and language combinations and were randomly distributed across conditions, while ensuring the same item did not appear twice in a row, and that there were no more than four trials of the same trial type in a row. 
The procedure for coding accuracy and RTs of the oral responses was the same as the one explained in the behavioral language switching task.
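As a worked illustration of the paired-pictures timing, the following Python sketch adds up the event durations stated above. It assumes that the events follow one another without gaps and that there is no additional inter-trial interval; breaks and the practice block are excluded. All names are hypothetical.

```python
BASELINE_MS = 5000   # baseline image
PICTURE_MS = 3000    # each color-framed picture
FIXATION_MS = 500    # fixation between the two pictures

def trial_timeline():
    """Onset of each event (ms from trial start) and total trial duration."""
    events = [
        ("baseline", BASELINE_MS),
        ("picture_1", PICTURE_MS),
        ("fixation", FIXATION_MS),
        ("picture_2", PICTURE_MS),
    ]
    onsets, t = {}, 0
    for name, duration in events:
        onsets[name] = t
        t += duration
    return onsets, t

onsets, trial_ms = trial_timeline()
n_trials = 4 * 52                      # four blocks of 52 trials
session_ms = n_trials * trial_ms       # experimental trials only
print(onsets["picture_2"], trial_ms, session_ms / 60000)
```

Under these assumptions, the second (critical) picture appears 8500 ms after trial onset, that is, 3500 ms after the onset of the first picture, and the 208 experimental trials take just under 40 minutes.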

Figure 2. Illustration of the pupillometry switching task design, showing a switch trial, followed by a non-switch trial after a five-second baselining period, hence the “paired pictures” design.
2.3.3. Pre-processing of pupillometry data
Pupillometry data were acquired using an EyeLink 1000 Plus with a sampling rate of 1 kHz. Pre-processing of the pupil data was implemented with the R packages gazeR (Geller et al., 2020) and dplyr (Wickham et al., 2025). First, timepoints where the participant’s gaze was outside of the center of the screen were excluded (4.2%), and trials with more than 30% data loss due to blinks were also identified and excluded (8.5%). Six participants were subsequently excluded due to the loss of more than 30% of trials. Next, periods of missing data caused by blinking were identified and extended 100 ms prior and 200 ms following to remove extraneous values (i.e., values recorded when the eyelid is partially obscuring the pupil). Windows of missing data were then interpolated with linear fits, and data were smoothed with a 5-point moving average. Subtractive baselining (Reilly et al., 2019) was used to align the data across trials. Finally, data were resampled from 1 kHz to 50 Hz.
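The order of the per-trial steps (blink extension, linear interpolation, smoothing, subtractive baselining, resampling) can be sketched in plain Python as follows. This is an illustration under stated assumptions and not the gazeR implementation: missing samples are represented as None, one sample equals 1 ms (1 kHz), gaps at the edges of a trace are filled with the nearest valid value, the baseline is the mean of the first samples of the trace, and resampling averages non-overlapping bins of 20 samples. All function names are hypothetical.

```python
from statistics import mean

def extend_gaps(x, before, after):
    """Widen each run of missing samples (None) by `before`/`after` samples."""
    out = list(x)
    for i, value in enumerate(x):
        if value is None:
            for j in range(max(0, i - before), min(len(x), i + after + 1)):
                out[j] = None
    return out

def interpolate_linear(x):
    """Fill gaps linearly; edge gaps copy the nearest valid value (assumption)."""
    valid = [i for i, value in enumerate(x) if value is not None]
    if not valid:
        raise ValueError("trace has no valid samples")
    out = list(x)
    for i in range(valid[0]):
        out[i] = x[valid[0]]
    for i in range(valid[-1] + 1, len(x)):
        out[i] = x[valid[-1]]
    for a, b in zip(valid, valid[1:]):
        for i in range(a + 1, b):
            out[i] = x[a] + (x[b] - x[a]) * (i - a) / (b - a)
    return out

def moving_average(x, width=5):
    """Centered moving average; windows are truncated at the edges."""
    half = width // 2
    return [mean(x[max(0, i - half): i + half + 1]) for i in range(len(x))]

def subtract_baseline(x, n_baseline):
    """Subtract the mean of the first n_baseline samples (assumed baseline)."""
    base = mean(x[:n_baseline])
    return [value - base for value in x]

def downsample(x, factor):
    """Average non-overlapping bins of `factor` samples (1 kHz -> 50 Hz: 20)."""
    return [mean(x[i:i + factor]) for i in range(0, len(x) - factor + 1, factor)]

def preprocess(trace, n_baseline):
    x = extend_gaps(trace, before=100, after=200)  # 100 ms / 200 ms at 1 kHz
    x = interpolate_linear(x)
    x = moving_average(x, width=5)
    x = subtract_baseline(x, n_baseline)
    return downsample(x, factor=20)
```

Gaze-based sample exclusion and the 30% data-loss criteria operate at the level of samples, trials and participants and are therefore not part of this per-trace sketch.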
2.4. Statistical analysis
2.4.1. Behavioral switching task analysis
Outliers were defined as RT values more than 2.5 mean absolute deviations from the median and were removed. To examine switching costs, switch and non-switch trials were included, and a model was constructed with the predictors language (levels: Basque and Spanish; dummy-coded) and trial type (levels: switch and non-switch; dummy-coded), as well as their interaction. RT data were fit with linear mixed-effects models, using the lme4 package in R (Bates et al., 2014). Random intercepts for participants and items and random slopes for language were included. Random slopes for trial type were attempted but resulted in model singularity and were ultimately dropped.
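The outlier criterion can be illustrated with a minimal Python sketch that takes the description literally: the spread is the mean of the absolute deviations from the median. Whether the criterion was applied per participant, per condition or over all trials is not stated, so the sketch simply operates on whatever list of RTs it is given. The function name and example values are hypothetical; passing `spread=median` instead would give the median absolute deviation variant.

```python
from statistics import mean, median

def remove_outliers(rts, k=2.5, spread=mean):
    """Drop RTs lying more than k deviation units from the median.

    The deviation unit is spread(|rt - median|); with spread=mean this is
    the mean absolute deviation around the median.
    """
    center = median(rts)
    unit = spread([abs(rt - center) for rt in rts])
    return [rt for rt in rts if abs(rt - center) <= k * unit]

rts_ms = [600, 620, 640, 660, 680, 3000]   # made-up values for illustration
cleaned = remove_outliers(rts_ms)
```

In this toy example the median is 650 ms and the deviation unit is about 413 ms, so only the 3000 ms response exceeds the 2.5-unit threshold.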
2.4.2. Pupillometry switching task analysis
The procedure for analyzing RTs of the oral responses from the pupillometry language switching task was the same as the one described in the behavioral language switching task.
For the pupil data, only the second naming of a given trial, where the switch/non-switch occurred, was analyzed. Growth curve analysis (i.e., curve-fitting; Mirman, 2017) was conducted using linear mixed-effects regression in R (Bates et al., 2014). For each trial, a time window from 4500 ms to 6000 ms was selected to center the analysis on the a priori selected critical peak of the data. Every trial has two picture namings, resulting in two pupil dilation peaks; the critical peak is the second one, where the switch/non-switch response occurs, and the selected window covers this second peak. The window was chosen before the results were known, based only on the overall shape of the pupil curve and without viewing the effects of interest in our data, matching procedures from McLaughlin et al. (2024) and following advice from Peelle and Van Engen (2021). Linear, quadratic and cubic polynomial terms were included in the model. The critical effects of interest were trial type (levels: switch vs. non-switch; dummy-coded), language (levels: Basque and Spanish; dummy-coded) and their interaction. We also examined these effects as they interact with the linear, quadratic and cubic polynomial terms. An effect of trial (a proxy for time across the experiment) was included based on methodological advice from McLaughlin et al. (2023). Random effects included intercepts of subjects and items and slopes of the linear, quadratic and cubic polynomials. R model code is provided in Appendix A.
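To illustrate what polynomial time terms look like for this analysis window, the following Python sketch builds linear, quadratic and cubic terms over 4500–6000 ms at 50 Hz. It assumes orthogonal polynomials obtained by Gram-Schmidt orthogonalization, similar in spirit to R's `poly()`; the text does not state whether orthogonal or natural polynomials were used, and the signs and scaling may differ from those in the authors' model. The function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonal_polynomials(times, degree=3):
    """Orthonormal polynomial time terms of order 1..degree.

    Each term is orthogonal to the intercept and to all lower-order terms.
    """
    n = len(times)
    center = sum(times) / n
    half_range = max(abs(t - center) for t in times)
    scaled = [(t - center) / half_range for t in times]  # rescale to [-1, 1]
    basis = [[1.0] * n]  # constant term, used only for orthogonalization
    for d in range(1, degree + 1):
        v = [s ** d for s in scaled]
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm = dot(v, v) ** 0.5
        basis.append([vi / norm for vi in v])
    return basis[1:]

# 4500-6000 ms sampled every 20 ms (50 Hz) gives 76 timepoints.
window_ms = list(range(4500, 6001, 20))
linear, quadratic, cubic = orthogonal_polynomials(window_ms)
```

Because the terms are mutually orthogonal, each can be interpreted separately: the linear term captures the overall slope, the quadratic term the sharpness of the peak, and the cubic term asymmetries in the rise and fall of the curve.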
3. Results
3.1. Behavioral switching task
Average accuracy was high in both non-switch and switch trials and across languages (non-switch Basque M = .95, SD = .2; non-switch Spanish M = .94, SD = .2; switch Basque M = .93, SD = .2; switch Spanish M = .93, SD = .2). Therefore, accuracy data were not further analyzed, as they were close to ceiling and not the main interest of this study (for a similar procedure, see de Bruin et al., 2020).
For the RT data, the model showed a significant main effect of trial type, with switch trials being slower than non-switch trials (β = 38.12, p < .001), reflecting a switching cost. There was a main effect of language (β = 59.98, p < .001), indicating that responses in Basque were faster than in Spanish (Figure 3). The trial type by language interaction was not significant, indicating equal switching costs in Basque and Spanish.

Figure 3. Boxplots showing the RTs in Basque (left) and Spanish (right) across switch and non-switch trials. The horizontal line within each box indicates the median, while the box edges represent the interquartile range (25th–75th percentile). Whiskers extend to the highest and lowest values, and dots indicate individual participants’ mean RTs in each language and trial type. Black markers indicate the mean ± standard error.
3.2. Pupillometry switching task
RTs in the pupillometry switching task were highly correlated with those in the behavioral switching task (r = .87, p < .001). As in the behavioral switching task, average accuracy was high in both non-switch and switch trials and across languages (non-switch Basque M = .97, SD = .15; non-switch Spanish M = .97, SD = .16; switch Basque M = .96, SD = .18; switch Spanish M = .95, SD = .21). Therefore, accuracy data were not further analyzed, as they were close to ceiling and not the main interest of this study.
For the RT data, results matched the behavioral switching task RT results. The model showed a significant main effect of trial type, with switch trials being slower than non-switch trials (β = 65.68, p < .001), reflecting a switching cost. There was a main effect of language (β = 61.97, p < .001), indicating that responses in Basque were faster than in Spanish. The trial type by language interaction was not significant, indicating equal switching costs in Basque and Spanish.
For the pupil data, fixed effects were first examined in a model with no interaction terms. Log-likelihood model comparisons were performed using reduced versions of this model, with a single effect removed in each iteration. This was done to determine the contribution of each effect to the model fit. Thereafter, a full model containing all fixed effects and two-way interactions was built for log-likelihood comparisons on each of the two-way interactions. The model estimates for fixed effects and two-way interactions from each model are reported below. Full model summaries including the partialled lower-order coefficients in the model containing interactions are available in Appendix B.
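The comparison logic described above can be sketched in R. This is a minimal illustration on simulated data, not the analysis code used in the study: the data frame `d`, its values, and the use of `lm()` in place of `lmer()` (so that the sketch runs in base R, without random effects) are all assumptions made for illustration.

```r
# Simulated data: all names and values here are hypothetical, not the study's data.
set.seed(1)
n <- 400
d <- data.frame(
  TrialType = factor(rep(c("nonswitch", "switch"), each = n / 2)),
  Language  = factor(rep(c("Basque", "Spanish"), times = n / 2))
)
# Build a pupil-like outcome with a switch effect and a language effect.
d$Pupil <- 0.30 * (d$TrialType == "switch") -
  0.20 * (d$Language == "Spanish") + rnorm(n)

# Main-effects model and a reduced model with TrialType removed.
# lm() stands in for lmer() so the sketch needs no extra packages.
m_full    <- lm(Pupil ~ TrialType + Language, data = d)
m_reduced <- lm(Pupil ~ Language, data = d)

# Likelihood-ratio test: twice the log-likelihood difference, referred to a
# chi-square distribution with df equal to the number of removed parameters.
ll_full    <- logLik(m_full)
ll_reduced <- logLik(m_reduced)
lr_stat <- as.numeric(2 * (ll_full - ll_reduced))
lr_df   <- attr(ll_full, "df") - attr(ll_reduced, "df")
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
```

For mixed models fitted with lme4 by maximum likelihood, `anova(m_reduced, m_full)` performs the same kind of likelihood-ratio comparison.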
Results of all log-likelihood model comparisons are presented in Table 3. As predicted, the fixed effect of trial type significantly improved model fit (χ² = 348.71, p < .001), reflecting a greater overall pupil response for switch compared to non-switch trials (see Figure 4). The effect of language was also significant (χ² = 645.6, p < .001), indicating that overall, naming in Spanish elicited a smaller pupil response than naming in Basque (see Figure 4). The trial type by language interaction was not significant (χ² = 1.52, p = .217), indicating equal pupil switching costs across Basque and Spanish. The effect of trial was significant (χ² = 180.99, p < .001), indicating a decrease in the overall height of the pupil response across trials, which reflects pupil fatigue (McLaughlin et al., Reference McLaughlin, Zink, Gaunt, Reilly, Sommers, Van Engen and Peelle2023).

Figure 4. Pupil dilation response for Basque (left) and Spanish (right) across switch and non-switch trials, for the time window in which growth curve analyses were conducted. Solid lines represent growth curve model fits, and points represent mean values of the raw data at each timepoint.
Table 3. Log-likelihood model comparisons

Note: df = degrees of freedom; * p < .05; ** p < .01; *** p < .001.
As expected, the linear, quadratic and cubic polynomial effects all significantly improved model fit (all ps < .05). The interactions between language and each polynomial were significant (all ps < .05), indicating differences in the shape of the pupil response between the two languages. As seen in Figure 4, the Spanish naming trials elicited a shallower pupil response than the Basque naming trials. The interactions between trial type and the linear and quadratic polynomials were also significant (all ps < .05), but the interaction between trial type and the cubic polynomial was not (p = .25). The interaction between trial type and the linear polynomial indicated a greater rate of increase in pupil response for the switch than the non-switch condition. The interaction with the quadratic polynomial reflects a difference in the shape of the pupil response (the “pointiness” of the peak) between conditions (p < .001).
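As an illustration of how polynomial time terms of this kind are commonly constructed in growth curve analysis, the sketch below builds linear, quadratic and cubic terms with base R's `poly()`. The time axis (50 samples at 50 ms steps) is hypothetical, and whether the polynomials in the present models were orthogonal is an assumption; only the column names follow Appendix A.

```r
# Hypothetical time axis: 50 equally spaced samples within an analysis window.
time_ms <- seq(0, 2450, by = 50)

# Orthogonal polynomial terms up to the third order.
polys <- poly(time_ms, degree = 3)
colnames(polys) <- c("LinearPoly", "QuadraticPoly", "CubicPoly")

# The columns are orthonormal, so their cross-product matrix is
# (numerically) the identity: each term can be estimated independently
# of the others.
cross <- crossprod(polys)
```

Under this construction, an interaction with the linear term corresponds to a difference in overall slope, and an interaction with the quadratic term to a difference in how sharply the response rises to and falls from its peak.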
4. Discussion
Bilingual language control has been closely investigated through cued language switching paradigms, in various bilingual populations. Most of the findings rely on RTs or ERPs to draw conclusions about the cognitive processes involved in bilingual switching. Here, pupillometry, a robust measure of cognitive processing load and attentional allocation, was utilized for the first time to further illuminate these processes in a group of highly proficient bilinguals. A novel cued language switching task combined with pupillometry was implemented, in addition to a classic cued language switching task, to assess language switching in a group of Spanish–Basque bilingual adults.
Response time data from the two cued language switching tasks were used to assess language switching costs (i.e., RT differences between switch and non-switch trials). Results from these two tasks were highly correlated (r = .87), showing faster responses on non-switch compared to switch trials, i.e., a switching cost. This switching cost was not modulated by language (i.e., symmetrical switching costs emerged across languages), pointing to an overall balanced dominance in this group of bilinguals. However, responses were on average faster in Basque than in Spanish. This is in line with previous findings in Spanish–Basque bilingual adults (e.g., de Bruin et al., Reference de Bruin, Samuel and Duñabeitia2020; Jevtović et al., Reference Jevtović, Duñabeitia and de Bruin2020), despite proficiency being somewhat higher in Spanish, as was the case in our study. This provides support for studies showing a global slowing of the most dominant language (i.e., overall slower naming in the DL than NDL), even when switching costs are symmetrical (e.g., Costa & Santesteban, Reference Costa and Santesteban2004; Gollan & Ferreira, Reference Gollan and Ferreira2009; Martin et al., Reference Martin, Strijkers, Santesteban, Escera, Hartsuiker and Costa2013; for a review, see Bobb & Wodniecka, Reference Bobb and Wodniecka2013). This global slowing effect might stem from inhibitory control of the dominant language, but findings of this effect have been inconsistent (e.g., Declerck et al., Reference Declerck, Koch and Philipp2012).
In their meta-analysis, Gade et al. (Reference Gade, Declerck, Philipp, Rey-Mermet and Koch2021) speculated that if reversed language dominance (linked to proactive language control) and asymmetrical switching costs (linked to reactive language control) are related processes, evidence for reversed language dominance should be seen while asymmetrical switching costs should not be observed. This is because strong, sustained language control should reduce the need for transient reactive control. The current study exhibits this pattern of results and thus provides support for a link between these language control processes. However, in the absence of a baseline single-language condition, it is not possible to conclude whether this reflects language control processes or a more general difference in lexical retrieval speed between Spanish and Basque words.
In addition to RT data, pupil response differences between switch and non-switch trials were assessed. Results were in line with those of the RT data, with switch trials eliciting larger pupil responses than non-switch trials. This finding points to higher cognitive processing load in switch compared to non-switch trials. Pupil switching costs were not modulated by language, matching the RT results and indicating balanced language dominance across Spanish and Basque. However, results further indicated that, overall, Basque elicited faster RTs and larger pupil responses than Spanish; in other words, the pupil data indicated greater cognitive load when naming Basque words despite them being named faster. This reversed pattern of results between RT and pupil data could provide an explanation for the paradoxical finding of faster RTs in Basque than in Spanish, despite Spanish being the DL. Pupillometry, the measure of pupil diameter over time, has been shown to be a robust measure of cognitive processing load in a variety of cognitive tasks (Van Engen & McLaughlin, Reference Van Engen and McLaughlin2018). Greater pupil response when naming in Basque than Spanish might thus indicate increased attentional allocation, which might have resulted in faster naming in Basque than Spanish. This could reflect increased recruitment of cognitive resources when processing the NDL, resulting in increased efficiency and thereby countering the effects of language dominance. These results support proposals of the involvement of processes alongside inhibition in cued language switching in highly proficient bilinguals. For example, the language-specific selection threshold hypothesis (Costa & Santesteban, Reference Costa and Santesteban2004) postulates that bilinguals may try to compensate for an imbalance in language dominance by making the lexical representations of the weaker language more available. On the other hand, Declerck et al. (Reference Declerck, Kleinman and Gollan2020) suggested that the reversed language dominance effect, observed often in highly proficient bilinguals, occurs because of unintended inhibitory overshooting of the DL when the goal is to make both languages equally accessible (Declerck et al., Reference Declerck, Kleinman and Gollan2020; Gollan & Ferreira, Reference Gollan and Ferreira2009). This is because highly proficient bilinguals have somewhat equal accessibility to their two languages, and they may try to ensure equal activation by over-inhibiting their DL. This in turn results in the NDL being highly activated and in a general slowing of the DL. Therefore, our results of larger pupil responses combined with faster RTs to Basque compared to Spanish clearly indicate the involvement of processes alongside inhibition during cued language switching, without excluding the role of inhibition in this complex process. Specifically, they support the proposal of greater activation when naming words in the NDL compared to the DL, resulting from differences in resource allocation, indexed by our novel pupillometry paradigm. However, as this was the first study to implement pupillometry during cued language switching, future studies are needed to further disentangle these processes.
5. Conclusion
This study implemented pupillometry for the first time to objectively assess cognitive resource allocation on a cued language switching task. Findings of the pupil data showed larger pupil responses (i.e., higher cognitive load) on switch compared to non-switch trials and equal pupil switching costs across the two languages. These results were in line with the RT results, which showed slower responses on switch compared to non-switch trials and equal RT switching costs across languages. The results point to an overall balanced proficiency across the two languages in our group of highly proficient bilinguals.
The RT results further exhibited an effect often observed in highly proficient bilinguals, the reversed dominance effect. This effect results from a global slowing of the DL, leading to overall faster RTs in the NDL than DL. Most studies interpret reversed dominance findings using inhibitory control processes. However, the pupil data in the current study revealed that Basque, the less dominant language, elicited larger pupil responses than Spanish, indicating that naming Basque words led to greater cognitive load despite the words being named faster. These results point to increased cognitive load or attentional resource allocation when processing the NDL compared to the DL. This could signal the involvement of processes alongside inhibition in bilingual language control. Finally, our results support the feasibility and value of utilizing pupillometry as a measure of cognitive load to further our understanding of processes involved in bilingual language processing. Future studies could explore pupillary responses during language switching paradigms (cued or voluntary) in groups of bilingual speakers with differing profiles and proficiencies.
Data availability statement
The data that support the findings of this study are openly available in OSF at https://osf.io/uynfv/.
Acknowledgements
This research was supported by the Spanish Ministry of Science, Innovation and Universities (grant ref. PRE2021-098017; project CEX2020-001010-S-21-3 funded by MCIN/AEI/10.13039/501100011033 and FSE+). Additional support was received from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 101103964, the Basque Government through the BERC 2022–2025 program, the Spanish State Research Agency through BCBL Severo Ochoa excellence accreditation CEX2020-001010-S; SEV-2015-0490 and from project grant PID2022-136989OB-I00.
Competing interests
The authors declare none.
Appendix A
R Model Syntax
Growth curve analysis model with all fixed effects and two-way interactions
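library(lme4)  # provides lmer() and lmerControl()
# Maximum-likelihood fit (REML = FALSE) so that nested models can be compared by log-likelihood
gca.model.full <- lmer(
  Pupil ~ 1 + LinearPoly + QuadraticPoly + CubicPoly + TrialType +
    Language2 + Trial + Language2:TrialType +
    TrialType:LinearPoly + TrialType:QuadraticPoly + TrialType:CubicPoly +
    Language2:LinearPoly + Language2:QuadraticPoly + Language2:CubicPoly +
    (1 + LinearPoly + QuadraticPoly + CubicPoly | Subject) +  # by-subject random intercepts and polynomial slopes
    (1 | Word2),  # by-word random intercepts
  data = timecourse_data, REML = FALSE,
  control = lmerControl(optimizer = "bobyqa",
                        optCtrl = list(maxfun = 1e9))
)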
Appendix B
Full model summaries
Table A1. Model with all lower-order fixed effects

Table A2. Model with all lower-order fixed effects and two-way interactions

Note: Summaries of the linear mixed-effects regression models. Table A1 shows the model containing only lower-order fixed effects and no interactions. Table A2 shows the model containing all fixed effects and two-way interactions. Coefficients in A2 (shaded) contain partialled values. S.E.: Standard Error. S.D.: Standard Deviation.