Highlights
• Polish-Dutch and Turkish-Dutch parents do not vary significantly in mixing behavior.
• Parental language mixing is not related to any majority language outcome.
• Parental mixing negatively relates to expressive minority language vocabulary.
• Method to measure mixing does not moderate the relation with language outcomes.
1. Introduction
Language mixing, also called code-switching, is the use of more than one language within the same conversation (Lam & Matthews, 2020; Poeste et al., 2019; Poplack, 2001; Yow et al., 2018). Language mixing is common among bilingual speakers, with large individual differences between speakers and communities (Hoff & Core, 2015; Myers-Scotton, 2017; Yow et al., 2018). Different forms of language mixing occur regularly in bilingual children’s language input (Kremin et al., 2022). Bilingual families are sometimes advised to avoid language mixing and to use the “One-Parent-One-Language” (OPOL) approach (Blom et al., 2018; De Houwer, 2007). This advice comes from the belief that children would acquire their languages better when the languages are separated in their input (Aronsson, 2020) and may stem from fears that mixing confuses children, even though empirical support for this idea is scarce. To date, only five studies have investigated the relation between the degree of parental language mixing and children’s language outcomes, and their results are equivocal (Bail et al., 2015; Byers-Heinlein, 2013; Carbajal & Peperkamp, 2020; Place & Hoff, 2011, 2016). Their various findings will be discussed throughout this introduction. The present study aims to provide further insight by investigating the relation between language mixing by caregivers and children’s language outcomes.
It focuses on an underinvestigated age range (3–5 years) and includes data from two different bilingual groups, Polish-Dutch and Turkish-Dutch. Its findings are of both theoretical and practical relevance. They will provide insights into the relation between specific language environment factors and children’s language development, and they will contribute to improved advice provided to bilingual families.
1.1. Parental language mixing in relation to children’s language outcomes
Investigating the quantity and characteristics of child-directed language mixing is crucial to understanding how it might affect language acquisition (Kremin et al., 2022). As previously mentioned, only a few studies have examined the relation between parental language mixing and bilingual children’s language skills. Two studies found a negative relation such that more parental language mixing was related to smaller vocabularies in children (Byers-Heinlein, 2013; Carbajal & Peperkamp, 2020). Two other studies did not find a relation between the frequency of mixed language input children heard and their vocabulary or grammar outcomes (Place & Hoff, 2011, 2016). One final study reported a positive relation where more mixing was associated with children having larger vocabularies (Bail et al., 2015). Overall, it remains unknown whether the relation between parental language mixing and children’s language outcomes is negative, positive or non-existent, and whether it differs for various language aspects (e.g., receptive and expressive vocabulary outcomes or grammar). Moreover, all three empirical outcomes can be expected on theoretical grounds, as we elaborate in this section.
1.1.1. More mixing relates to lower child language outcomes
The first possibility is that more parental language mixing is related to lower language outcomes in children. According to the theoretical framework of the Bilingual Interactive Activation Model (Dijkstra & Van Heuven, 2002; Grainger & Dijkstra, 1992; Grainger et al., 2010), both languages of bilinguals are simultaneously active and accessible. When children process language input, they implicitly expect an upcoming word to be in the same language as the previous word. Thus, when language mixing occurs, this is counter to their expectations. They need to inhibit the previously activated language and retrieve knowledge from the other language, resulting in a processing cost that may temporarily hinder language comprehension (Byers-Heinlein et al., 2022). Ruan et al. (2023) note, moreover, that children’s comprehension could be diminished due to language mixing disrupting the statistical regularities in the speech stream. Children use mechanisms such as statistical learning to implicitly detect patterns in their language input (Erickson & Thiessen, 2015; Romberg & Saffran, 2010). Parental language mixing may cause disruptions in these regularities, which could temporarily complicate the processing and comprehension of mixed language utterances (Place & Hoff, 2016; Potter et al., 2019). It is not expected that a limited amount of mixed language will affect children’s language outcomes, but frequent mixing might have a detrimental effect on children’s language outcomes (Ruan et al., 2023).
Two studies found a negative relation between parental language mixing and children’s language skills. Byers-Heinlein (2013) investigated 129 18-month-old and 39 24-month-old bilingual infants with various language backgrounds and found a negative relation between parents’ language mixing score and children’s receptive vocabulary score in English in the 18-month-olds. No negative relation was found for the 24-month-old children. The frequency of parental language mixing was measured with the Language Mixing Scale (LMS; Byers-Heinlein, 2013), which comprises five statements. Four out of the five statements refer to language mixing within sentences (i.e., intra-sentential), and only one statement refers to general language mixing. Therefore, the LMS could be seen as a measure of intra-sentential language mixing instead of overall language mixing (see also Place & Hoff, 2016).
The second study by Carbajal and Peperkamp (2020) examined the dual language exposure of 58 11-month-old bilingual infants in Paris who were exposed to a variety of languages besides the majority language French. The study did not assess the frequency of language mixing; rather, it focused on the inverted form, namely language purity. More language mixing corresponded to a lower language purity. Carbajal and Peperkamp reported a positive relation between receptive vocabulary in French and within-speaker language purity (i.e., the less a speaker mixed, the higher the children’s receptive vocabulary scores in French). No relation was found for within-block language purity (i.e., the absence of language mixing within a 30-minute block). It should be noted that this method lacks fine-grained information regarding the types of mixing that occur within a block or speaker. The occurrence of multiple languages within a 30-minute time frame does not indicate how these languages are being mixed. We assume that intra-sentential language mixing is better reflected by within-speaker language purity than within-block language purity, as language mixing within a sentence implicitly takes place within a speaker. Taken together, the results suggest that hearing multiple languages within a 30-minute time frame is not negatively related to children’s early receptive vocabulary outcomes but hearing a speaker mix their languages may be.
In summary, two studies found negative relations between parental language mixing and children’s receptive vocabulary, which is in line with the hypothesis that children might have more difficulties comprehending mixed language input. The variables that are negatively related to vocabulary outcomes more closely represent intra-sentential language mixing. This may support the hypothesis that intra-sentential language mixing induces more processing costs and may hinder language comprehension, whereas inter-sentential language mixing may not (Byers-Heinlein et al., 2017; Gullifer et al., 2013; Kremin et al., 2022).
1.1.2. More mixing relates to higher child language outcomes
There are also reasons why more language mixing could relate to higher language outcomes in bilingual children. First, language mixing might support the acquisition of new vocabulary by increasing the contextual variability (i.e., the range of unique contexts in which words appear). Some studies have shown that more variability in the linguistic context can enhance learning (Denby et al., 2018; Gómez, 2002). Language mixing increases this variability by creating infrequent contexts that may draw the child’s attention and highlight novel words, making it easier for children to learn novel words in a mixed context (Kaushanskaya et al., 2023).
Second, parents indicate that they mainly mix their languages to bolster their children’s comprehension or to teach new words (Kremin et al., 2022). Additionally, considering the often unbalanced language levels of bilingual children (Baker & Jones, 1998; Yip & Matthews, 2006), learning new vocabulary in their weaker language might be aided by presenting these words within sentences of their stronger language (Ruan et al., 2023; but see Potter et al., 2019). These instances of language mixing could facilitate language learning and could have a positive effect on children’s language outcomes.
Third, parents may alter their language use based on children’s increasing language abilities (Soderstrom, 2007). If children’s language abilities are more developed, parents tend to form longer sentences, thereby creating more possibilities for switching within sentences (Bail et al., 2015). Additionally, parents may perceive an increase in their child’s language abilities to be indicative of an enhanced capacity to process more complex linguistic input. Indeed, it has been found that parents mixed more when speaking to their infant at 18 months of age than at 10 months of age (Kremin et al., 2022). Importantly, the direction of the relation remains unknown. It may be that language mixing enhances language abilities, or that enhanced language abilities cause parents to mix more.
A positive relation between intra-sentential language mixing and productive vocabulary outcomes was found in one study that involved toddlers (Bail et al., 2015). In the study, 24 18- to 24-month-old Spanish-English bilingual children were observed during a 13-minute play session with their parents in a laboratory setting. Parents were given some toys and were instructed to play with their children as they usually would. Language mixing was measured as the number of observed intra- and inter-sentential mixes during the play session. For vocabulary outcomes, the authors created total and conceptual vocabulary scores, comprising both English and Spanish expressive vocabulary scores. The frequency of intra-sentential language mixing was positively correlated with both types of vocabulary outcomes, while no relation was found for the number of inter-sentential switches. The observed positive relation does not indicate whether language mixing is beneficial for vocabulary development, or whether parents mix more when children show enhanced language skills.
1.1.3. Mixing frequency is unrelated to children’s language outcomes
The final possibility is that language mixing frequency is not associated with children’s language outcomes. Again, there are several potential explanations for null findings. First, a non-existent relation may be explained by the type of mixed input that children generally receive. The most common type of language mixing occurs between sentences (Bail et al., 2015; Byers-Heinlein et al., 2017; Kremin et al., 2022), which is not associated with processing costs (Byers-Heinlein et al., 2017; Gullifer et al., 2013; Kremin et al., 2022). It could also be the case that children do not experience substantial processing costs when they are used to language mixing in their natural language environment, regardless of the type of language mixing (Adamou & Shen, 2019; Blanco-Elorrieta & Pylkkänen, 2017).
Second, language mixing is a natural phenomenon (Hoff & Core, 2015; Yow et al., 2018) and is considered fluid and effortless for highly proficient bilingual speakers (Poplack, 1980, 2001). If language mixing were detrimental to language comprehension, we would not expect it to be such a common behavior among bilingual speakers in naturalistic settings (Backus, 2005; Myers-Scotton, 1993; Zentella, 1998).
Third, although language mixing occurs frequently in bilingual speech, mixed language input makes up a relatively small part of a bilingual child’s total language input (Kremin et al., 2022). It could be that, regardless of any possible processing costs, the frequency of mixed language input is too low to impact children’s language outcomes in either direction.
Place and Hoff investigated the relation between parental language mixing and children’s expressive vocabulary and grammar outcomes in two studies. Their first study included 29 2-year-old Spanish-English bilingual children whose parents kept detailed language diaries of their children’s language exposure over seven days (Place & Hoff, 2011). Language mixing was measured as the percentage of 30-minute blocks in which the children were exposed to two languages. They found no relation between language mixing and expressive vocabulary or grammar outcomes. The authors noted that the percentage of mixed 30-minute blocks lacked fine-grained information about the actual frequency of language mixing occurring within these time blocks. In a follow-up study (Place & Hoff, 2016), 90 parents kept a language diary on the language exposure of their 30-month-old Spanish-English bilingual children, and 58 of them also filled in the LMS (Byers-Heinlein, 2013). The results from Place and Hoff (2011) were replicated, as the total frequency of mixed blocks was again unrelated to any measure of English or Spanish development. Furthermore, the LMS score did not relate to children’s expressive vocabulary or grammar outcomes either (it related only marginally to one of the five Spanish language outcomes).
1.2. The present study
The present study examines the relation between parental language mixing and language outcomes in 3- to 5-year-old children. The participants come from two distinct bilingual groups in the Netherlands: Turkish-Dutch and Polish-Dutch families. It is important to investigate language mixing in diverse groups of bilingual speakers to increase our understanding of the factors that underlie the influence of parental language mixing on children’s language acquisition, such as the type of mixing or the amount of experience with mixed language input (Adamou & Shen, 2019; Byers-Heinlein et al., 2022). We selected Polish-Dutch and Turkish-Dutch families because both groups are well represented in the Netherlands, where the study is situated. Moreover, the Polish and Turkish immigrants in the Netherlands exhibit notable demographic differences, particularly with regard to their migration history (Central Bureau of Statistics (CBS), n.d.). Therefore, before turning to studying the relation between parental language mixing and children’s language outcomes, we first address the question:
• RQ1: How do the language mixing behaviors of Turkish-Dutch parents and Polish-Dutch parents differ in terms of frequency and type of mixing?
Language mixing is a frequent phenomenon in Turkish families in the Netherlands (Backus, 2012; Backus & Demirçay, 2021; Yagmur, 2009). This is because most Turkish parents in the Netherlands are second- or third-generation immigrants (CBS, n.d.) and have a good command of both languages (Backus & Demirçay, 2021; Yagmur, 2009). In comparison, the Polish group in the Netherlands largely consists of first-generation immigrants (78.3%; CBS, n.d.). Furthermore, the Turkish community in the Netherlands is nearly twice as large as the Polish community, resulting in Turkish children having more (bilingual) speakers in their environment with whom they can speak both the heritage language and the societal language in comparison to Polish children. As further-generation immigrants show more frequent language mixing than first-generation immigrants (Backus & Demirçay, 2021), and Turkish children may have more bilingual interlocutors in their immediate environment, we hypothesize that the Turkish-Dutch group mixes languages more frequently than the Polish-Dutch group. We do not have any hypotheses regarding differences between the two groups in terms of intra- and inter-sentential language mixing.
The other research questions look at the relation between parental language mixing and bilingual children’s language outcomes. We investigate both overall language mixing behavior and intra- and inter-sentential language mixing separately (Poplack, 2001), and distinguish between two types of language outcomes: vocabulary skills (research question 2) and sentence repetition scores (research question 3). For research question 2, vocabulary outcomes in the majority language (Dutch) and the minority language (Polish/Turkish) are separated into expressive and receptive vocabulary outcomes. Previous studies have found negative effects on receptive vocabulary (Byers-Heinlein, 2013; Carbajal & Peperkamp, 2020), but not on expressive vocabulary (Bail et al., 2015; Byers-Heinlein, 2013; Place & Hoff, 2011, 2016) and only two studies addressed grammar and found no relation (Place & Hoff, 2011, 2016). In summary, to investigate the relation between parental language mixing and language outcomes, we set up the following two research questions:
• RQ2: To what extent does parental language mixing relate to children’s receptive and expressive vocabulary outcomes?
• RQ3: To what extent does parental language mixing relate to children’s sentence repetition scores?
The three possible hypotheses regarding the relation between parental language mixing and children’s language outcomes have been substantiated in the introduction. With the use of Bayesian statistics, we evaluate the evidence for a positive, a negative or a non-existent relation (Hoijtink, Gu, et al., 2019).
Finally, we investigate whether the method used to measure language mixing affects the observed relations. The present study measured language mixing using both daylong audio recordings and questionnaires. Using two methods allows us to investigate whether the relation between parental language mixing and children’s language outcomes varies as a result of the type of measure used. This leads us to our final research question:
• RQ4: Does the method used to measure language mixing affect the observed relations between parental language mixing and children’s language outcomes?
Previous studies that used different methods to measure parental language mixing yielded mixed evidence (Bail et al., 2015; Byers-Heinlein, 2013; Carbajal & Peperkamp, 2020; Place & Hoff, 2011, 2016). Whether their disparate findings can be (partly) ascribed to their differences in methodology is currently unknown.
2. Materials and methods
The data in this study are part of the larger project “Children and Language Mixing: developmental, psycholinguistic and sociolinguistic aspects” (CALM; https://osf.io/p9gje/). The project has been approved by the Ethics Committee of Utrecht University (FETC20–0291). The data collection took place from July 2022 to July 2023 in the Netherlands. At that time, the overarching project had gathered data from Turkish-Dutch (n = 33), Polish-Dutch (n = 23) and English-Dutch (n = 4) bilingual children and their families. For this study, a subsample of only the Turkish and Polish groups was used. The study was preregistered on the Open Science Framework on August 1, 2024 (https://doi.org/10.17605/OSF.IO/849RZ). Deviations from the preregistration are made explicit throughout the methods section. Data and scripts are available on the project page for this study on the Open Science Framework (https://osf.io/qudr7/).
2.1. Participants
The participants were 56 multilingual children, aged between 36 and 72 months (M = 55, SD = 10), and their families. Data from four children were excluded from the analysis because they either received more than 50% exposure to a third language (n = 2) or did not finish the daylong audio recording and second test appointment (n = 2). Our final sample consisted of 21 Polish-Dutch (10 girls) and 31 Turkish-Dutch (17 girls) bilingual children. Families were recruited via schools, (local) events, online calls on social media platforms and personal networks. At the time of data collection, none of the children had received a diagnosis of a (suspected) language disorder. All children lived in the Netherlands and heard either the minority language Polish or Turkish in addition to the majority language Dutch.
Table 1 presents the descriptive statistics of the sample per group. Both groups are varied in terms of language exposure and language dominance. Six children were reported to be exposed to a limited amount of English as a third language (M = 4% exposure, SD = 5%, range: 2–14%). The educational level of parents was measured as the highest level of education attained between parents via the questionnaire for Quantifying Bilingual Experience (Q-BEx; De Cat et al., 2022) and was relatively high in both groups.
Table 1. Descriptive statistics of the sample per group included in the analysis

Note. Language exposure is expressed as a proportion. Language dominance was determined via the current overall language exposure from the Q-BEx questionnaire, with >60% exposure being considered the threshold for dominance (Cattani et al., 2014; Siow et al., 2023).
2.2. Measures
2.2.1. Vocabulary
Children’s vocabulary was measured in both their languages (Dutch and Polish/Turkish) via the Cross-linguistic Lexical Task (CLT; Haman et al., 2015). The CLT is appropriate for our age range (Haman et al., 2017) and is considered a valid measure of vocabulary (Van Wonderen & Unsworth, 2021). The task consists of four parts, each comprising 32 items: (1) receptive knowledge of nouns; (2) receptive knowledge of verbs; (3) expressive knowledge of nouns and (4) expressive knowledge of verbs. The order of the four parts was counterbalanced.
In the receptive part, children were asked about the target word via a prerecorded phrase (e.g., “Where is the candle?”). The child was prompted to point to the one of four images on the screen that corresponded to the word. In the expressive part, children saw an image and were asked to describe it in a single word (e.g., “What is this?”). Each correct response was awarded one point. Nouns and verbs were collapsed to increase the number of items tapping into receptive and expressive vocabulary, thereby increasing variation within each modality (sum score range: 0–64). The analyses were conducted using the raw sum scores. Accuracy in the CLT is based on a list of target words, but we made a few adaptations to the target list and included more responses as correct. After consultation with at least two native speakers per language (Dutch, Polish and Turkish as spoken in the Netherlands; Doğruöz & Backus, 2010), we decided to include fourteen Dutch, four Polish and five Turkish additional synonyms in the expressive vocabulary test (e.g., for Dutch, we included both strijkijzer and strijkbout as correct responses to the picture of an iron, as these were considered synonyms, although strijkbout is more old-fashioned).
2.2.2. Sentence repetition
The standardized sentence repetition test from the Clinical Evaluation of Language Fundamentals Preschool-II-Dutch (CELF-P-2-NL; Wiig et al., 2012) was used. The test is appropriate for children between the ages of 3 and 7 years. The sentence repetition measure was administered in Dutch only, because, currently, there is no equivalent for Polish or Turkish. The CELF sentence repetition comprised thirteen prerecorded sentences of increasing length and complexity, preceded by three practice sentences. The experimenter demonstrated the task by repeating the practice sentences together with the child. Afterward, the child repeated the thirteen test sentences on their own. The experimenter scored how many errors the child made. Errors consisted of repeating, omitting, replacing or adding words or saying words in the wrong order. The task was recorded so the experimenter could listen to the child’s productions again and score accurately. Children’s productions received a score of 3 if no errors were made, 2 if one error was made, 1 if two to three errors were made and 0 if they made four or more errors. The task was discontinued after three consecutive 0-scores. The analyses were conducted using the raw scores.
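The scoring rule described above can be made concrete with a short sketch. The function names (`item_score`, `raw_total`) and the input format (a list of error counts, one per test sentence) are assumptions made for illustration; they are not taken from the CELF manual or from the study's scripts. The sketch also assumes that sentences after the discontinuation point contribute no points.

```python
from typing import List


def item_score(n_errors: int) -> int:
    """Map the number of errors in one repeated sentence to its 0-3 score."""
    if n_errors < 0:
        raise ValueError("error count cannot be negative")
    if n_errors == 0:
        return 3
    if n_errors == 1:
        return 2
    if n_errors <= 3:  # two or three errors
        return 1
    return 0  # four or more errors


def raw_total(errors_per_sentence: List[int]) -> int:
    """Sum item scores, stopping after three consecutive 0-scores."""
    total = 0
    zero_run = 0
    for n_errors in errors_per_sentence:
        score = item_score(n_errors)
        total += score
        zero_run = zero_run + 1 if score == 0 else 0
        if zero_run == 3:
            break  # discontinuation rule (assumed: later items score 0)
    return total
```

With thirteen test sentences and a maximum of 3 points each, the raw score in this sketch ranges from 0 to 39.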
2.2.3. Language mixing – Naturalistic daylong audio recordings (LENA)
The present study employed two methods to assess parental language mixing. The first method involves naturalistic, daylong audio recordings in the home environment, gathered with the Language Environment Analysis (LENA; Greenwood et al., 2011) recording device. Parents turned the device on and put it in the pocket of a T-shirt the child was wearing during a weekend day. No researchers were present during the recording day, and parents were instructed to go about their day as usual. From the full daylong audio recording, 270 30-second segments were sampled based on the conversational turn count (CTC) provided by the automatic LENA output. The 270 segments were manually coded for speaker(s), language(s) spoken, activity and (target) child-directed speech. The segments that contained more than one language were transcribed in CHAT (MacWhinney, 2014) by bilingual research assistants. All instances of intra- and inter-sentential language mixing uttered by a parent were used to create the language mixing variables. More details can be found in the Appendix and Supplementary Materials. The language mixing variables represent the frequency of intra- and inter-sentential language mixing per hour produced by parents to the target child. An overall language mixing score was calculated by summing intra- and inter-sentential language mixing.
2.2.4. Language mixing – Questionnaire (Q-BEx)
The second measure of language mixing was the questionnaire for Quantifying Bilingual Experience (Q-BEx; De Cat et al., 2022). The Q-BEx is a modular questionnaire with two fixed modules, namely “background information” and “risk factors”, and several optional modules, one of which is “language mixing”. This module asks parents about different types of language mixing. The frequency with which parents switch is assessed via three questions, formulated as “At home, when people (including yourself) speak to the child, how often do you do any of the following….” The questions ask about one-word switches, two- or three-word switches or inter-sentential switches. Parents answer on an ordinal 5-point scale ranging from almost never to more than 5 conversations per day with the option to select I do not know or not relevant. We transformed these responses into an interval-scaled variable by converting them into the number of mixes per week (e.g., one or two conversations per week becomes 1.5, one or two conversations per day becomes 10.5 and we set more than five conversations per day to 42). Both one-word switches and two- or three-word switches take place within the sentence and were summed to create one score for intra-sentential language mixing. Thus, the Q-BEx provided separate scores for the number of intra- and inter-sentential switches that parents report producing per week. Overall language mixing was calculated as the sum of the intra- and inter-sentential switches per week.
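As an illustration of this conversion, the following sketch maps the three response categories whose weekly values are stated above and sums the two within-sentence items. The names (`WEEKLY_MIXES`, `to_weekly`, `intra_sentential`, `overall`) are hypothetical, and treating “I do not know” and “not relevant” as missing values is an assumption; the text does not give the values for “almost never” or for the remaining scale point, so they are left out rather than guessed.

```python
from typing import Optional

# Conversions stated in the text (number of mixes per week).
WEEKLY_MIXES = {
    "one or two conversations per week": 1.5,
    "one or two conversations per day": 10.5,   # consistent with 1.5 per day x 7 days
    "more than five conversations per day": 42.0,  # value set by the authors
}
# Assumption: these two options are treated as missing data.
MISSING = {"I do not know", "not relevant"}


def to_weekly(response: str) -> Optional[float]:
    """Return mixes per week, or None for a missing response."""
    if response in MISSING:
        return None
    # Raises KeyError for scale points whose value is not stated in the text.
    return WEEKLY_MIXES[response]


def intra_sentential(one_word: str, two_three_word: str) -> Optional[float]:
    """Sum the two within-sentence items; None if either is missing."""
    a, b = to_weekly(one_word), to_weekly(two_three_word)
    if a is None or b is None:
        return None
    return a + b


def overall(intra: Optional[float], inter: Optional[float]) -> Optional[float]:
    """Overall mixing per week: intra- plus inter-sentential switches."""
    if intra is None or inter is None:
        return None
    return intra + inter
```

For example, a parent reporting one-word switches in “one or two conversations per week” and two- or three-word switches in “more than five conversations per day” would receive an intra-sentential score of 1.5 + 42 = 43.5 mixes per week in this sketch.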
2.3. Procedure
The research took place in the home environment of the participant and consisted of two home visits. Parents provided informed consent during their first test appointment. The first visit was a bilingual session with a bilingual research assistant (either Turkish-Dutch or Polish-Dutch) to assess children’s vocabulary skills in the minority language. During the first appointment, parents also received the LENA recorder with instructions to use it on a day on the weekend before the second test appointment. To safeguard the privacy of the participants, only fragments of 30 seconds were listened to. Consequently, the researcher was unable to understand the full context of the conversations. It was made clear to the participants that the focus of the study was on the languages being spoken, rather than on the content of the conversations. Furthermore, parents were allowed to have parts of the audio removed before it would ever be listened to. One family had two hours of audio deleted.
The average time between two test appointments was approximately three weeks (range: 2–9 weeks). During the second visit, a Dutch speaker administered the Dutch vocabulary and sentence repetition tasks, retrieved the LENA recorder and filled out the Q-BEx questionnaire together with a parent. The data from this study are part of a larger project and additional tests were administered during home visits, but these are not discussed in this paper.
2.4. Data analysis
Analyses and hypotheses were preregistered on the Open Science Framework. All analyses were carried out in R (version 4.3.0; R Core Team, 2024). To address our research questions, we used Bayesian informative hypothesis evaluation. Whereas null-hypothesis significance testing (NHST) can only test one specific hypothesis against its null hypothesis, Bayesian informative hypothesis testing allows multiple hypotheses to be compared with each other simultaneously. In other words, NHST would require multiple analyses to compare the negative hypothesis to the null hypothesis and the positive hypothesis to the null hypothesis, which increases the type I error rate. The Bayesian analysis calculates the support for each hypothesis in the set based on its relative fit and relative complexity. The hypothesis that describes the data best, without being unnecessarily complex, receives the most support. Because we wanted to evaluate multiple hypotheses at once (i.e., a negative, positive or non-existent relation), we conducted Bayesian analyses.
We used two values to evaluate the hypotheses: the posterior model probability (PMP) and the Bayes factor (BF). We did not set predetermined thresholds on these values for accepting or rejecting a hypothesis, as this defeats the purpose of Bayesian hypothesis testing (Hoijtink, Mulder, et al., Reference Hoijtink, Mulder, van Lissa and Gu2019). Instead, we looked at the PMPa to quantify the evidence for each hypothesis relative to the other pre-specified hypotheses. The PMP can be used to compute Bayesian error probabilities. When H1 is selected as the preferred hypothesis, 1 – PMP1a denotes the probability of choosing the wrong hypothesis (Hoijtink, Mulder, et al., Reference Hoijtink, Mulder, van Lissa and Gu2019). After selecting the hypothesis with the highest PMPa as the best of the set, we looked at the Bayes factor (BF.C) to evaluate how good each hypothesis was on its own, without comparing it to the other hypotheses in the set. BF.C quantifies how much more likely the data are under that hypothesis than under its complement (i.e., any hypothesis other than the hypothesis under consideration). The values can be interpreted as follows: a higher PMPa indicates more certainty that the hypothesis is the right one from the set, and a higher BF.C implies stronger evidence for that hypothesis compared to its complement.
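To make the link between Bayes factors, PMPs and Bayesian error probabilities concrete, the following Python sketch may help. It is an illustration, not the bain implementation used in the study. It assumes that each hypothesis in the set has a Bayes factor against a common reference (such as the unconstrained hypothesis), that prior model probabilities are equal unless specified, and that the PMP of a hypothesis is its prior-weighted Bayes factor divided by the sum over the set. The function names and the numbers are invented.

```python
def posterior_model_probabilities(bayes_factors, priors=None):
    """Return PMPs for a set of hypotheses.

    bayes_factors maps each hypothesis name to its Bayes factor against a
    common reference hypothesis (an assumption of this sketch).
    """
    names = list(bayes_factors)
    if priors is None:
        # Equal prior model probabilities across the set.
        priors = {name: 1.0 / len(names) for name in names}
    weighted = {name: priors[name] * bayes_factors[name] for name in names}
    total = sum(weighted.values())
    return {name: weighted[name] / total for name in names}


def bayesian_error_probability(pmps):
    """Return the preferred hypothesis and 1 minus its PMP."""
    best = max(pmps, key=pmps.get)
    return best, 1.0 - pmps[best]


# Invented Bayes factors for a two-hypothesis set.
pmps = posterior_model_probabilities({"H0": 3.0, "H1": 1.0})
best, error = bayesian_error_probability(pmps)
print(pmps, best, error)
```

With these invented values, H0 receives a PMP of .75 and is preferred, but the Bayesian error probability of .25 shows that H1 cannot be discarded with much confidence.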
RQ1: How do the language mixing behaviors of Turkish-Dutch parents and Polish-Dutch parents differ in terms of frequency and type of mixing?
First, the difference in the frequency of language mixing between the two groups was tested with Bayesian independent sample t-tests using the bain package (Gu, Reference Gu2016; Hoijtink, Gu, et al., Reference Hoijtink, Gu and Mulder2019). The two informative hypotheses were that the Turkish group mixed more than the Polish group, or that there was no difference in overall language mixing behavior between the two groups. We compared the two groups on the frequency of overall language mixing as observed in the daylong audio recording and as reported in the Q-BEx questionnaire. In our preregistration, we stated that we would not investigate the difference between groups on overall language mixing behavior reported in the Q-BEx questionnaire, as these might reflect parents’ attitudes toward mixing rather than actual frequencies (Treffers-Daller et al., Reference Treffers-Daller, Ongun, Hofweber and Korenar2020). However, considering all other analyses were conducted for both methods, we deemed it appropriate to use the Q-BEx data for this question as well.
Second, the difference in the types of language mixing used by the two groups was tested by comparing the ratio of inter- versus intra-sentential switches between the two groups via a Bayesian independent samples t-test. A ratio score was calculated for each family by dividing the number of inter-sentential switches by the number of intra-sentential switches. The comparison of the ratios was exploratory since no hypothesis was formulated.
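The ratio score can be sketched in a few lines of Python. The function name and the counts are invented, and the treatment of families without any intra-sentential switches is a hypothetical choice, since the text does not describe how such cases were handled.

```python
def inter_intra_ratio(inter_switches, intra_switches):
    """Ratio of inter- to intra-sentential switches for one family.

    Values above 1 mean more mixing between sentences than within them.
    Returns None when there are no intra-sentential switches (a hypothetical
    rule chosen here to avoid dividing by zero).
    """
    if intra_switches == 0:
        return None
    return inter_switches / intra_switches


# Invented counts for two families.
print(inter_intra_ratio(30, 10))  # 3.0: three times as many inter-sentential switches
print(inter_intra_ratio(12, 8))   # 1.5: a smaller gap between the two types
```

A lower ratio, as reported for one group relative to another, therefore reflects relatively more intra-sentential mixing, even when both ratios exceed 1.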
RQ2: To what extent does parental language mixing relate to children’s vocabulary outcomes?
To test the relation between language mixing and vocabulary outcomes, we ran eight multivariate regressions based on a 2×2×2 matrix. The first factor in the matrix was the vocabulary modality. Separate multivariate regression models were created for expressive and receptive vocabulary scores. The second factor was the method used to measure language mixing. Separate models were created for each method (questionnaire and daylong audio recordings). The third factor was the type of language mixing. One set of models included the overall frequency of language mixing whereas the other models investigated the specific types of language mixing (intra- and inter-sentential). We did not have different hypotheses for the individual languages. Therefore, each model has two dependent variables: vocabulary scores in the majority language (Dutch) and in the minority language (Polish/Turkish). In sum, we ran eight multivariate regression models to investigate the relation between parental language mixing and children’s receptive and expressive vocabulary outcomes. All regression models controlled for age, Dutch language exposure, parental education, and group (Polish or Turkish). The models can be found in the preregistration.
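The design of the eight vocabulary models can be laid out as a short Python sketch. It only enumerates the model specifications described above and does not fit any model; the exact specifications are in the preregistration. The variable names and labels are illustrative.

```python
from itertools import product

MODALITIES = ("expressive", "receptive")
METHODS = ("Q-BEx questionnaire", "LENA recording")
PREDICTOR_SETS = {
    "overall": ("overall mixing",),
    "types": ("intra-sentential mixing", "inter-sentential mixing"),
}
OUTCOME_LANGUAGES = ("Dutch", "Polish/Turkish")
CONTROLS = ("age", "Dutch language exposure", "parental education", "group")


def vocabulary_models():
    """Enumerate the 2 x 2 x 2 design as a list of model descriptions."""
    models = []
    for modality, method, (label, predictors) in product(
        MODALITIES, METHODS, PREDICTOR_SETS.items()
    ):
        models.append(
            {
                # Two dependent variables per model: one per language.
                "outcomes": tuple(
                    f"{modality} vocabulary ({lang})" for lang in OUTCOME_LANGUAGES
                ),
                "method": method,
                "mixing": label,
                "predictors": predictors + CONTROLS,
            }
        )
    return models


for model in vocabulary_models():
    print(model["outcomes"], model["method"], model["mixing"])
```

Each of the eight entries has two dependent variables, which is what makes the regressions multivariate rather than multiple.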
RQ3: To what extent does parental language mixing relate to children’s sentence repetition scores?
To test the relation between overall language mixing and Dutch sentence repetition scores, we ran multiple linear regression models containing the same set of control variables as used in the models for vocabulary outcomes. The regression models were based on a 2×2 matrix. The two factors were the method of measuring language mixing (questionnaire and daylong audio recordings) and the type of language mixing (overall language mixing and types of language mixing). Summarizing, four multiple linear regression models were run to investigate the relation between parental language mixing and children’s sentence repetition scores.
3. Results
3.1. RQ1: How do the language mixing behaviors of Turkish-Dutch parents and Polish-Dutch parents differ in terms of frequency and type of mixing?
All descriptive statistics are presented in Table 2. The overall language mixing frequency per group and method is displayed in Figure 1 and the corresponding BFs and PMPs are presented in Table 3. The results from the Bayesian independent samples t-tests are indecisive regarding differences in the overall frequency of language mixing observed between the Turkish-Dutch group and the Polish-Dutch group. The BF.Cs show that the LENA data suggest that the Turkish-Dutch and Polish-Dutch groups do not differ in terms of overall language mixing and that the Q-BEx data strongly suggest that the Turkish-Dutch group mixes more than the Polish-Dutch group. However, similar to non-significant p-values in classical hypothesis testing, the relatively high Bayesian error probabilities prevent us from fully discarding the other hypotheses. Summarizing, we have no conclusive evidence for a difference in overall frequency of language mixing between the two groups.
Table 2. Descriptive statistics of the sample per group included in the analysis

Note. Vocabulary outcomes are number of items correct. Sentence repetition scores are raw scores from the CELF-P-2-NL. The LENA represents the observed instances of language mixing per hour. The Q-BEx represents the number of times that parents report to mix per week.

Figure 1. Boxplots of overall language mixing per group and method.
Table 3. Bayes Factors and Posterior Model Probabilities for group differences in mixing behavior

Regarding the ratio of inter- and intra-sentential switches (Figure 2), both groups mix more between sentences than within sentences. The independent samples t-tests revealed that Turkish-Dutch parents have a lower ratio of inter- versus intra-sentential language mixing than Polish-Dutch parents. Given the large BF.C values and the low Bayesian error probabilities (5% and 10% respectively), we conclude that the Turkish-Dutch group makes relatively more use of intra-sentential language mixing than the Polish-Dutch group.

Figure 2. Ratio of parental inter- and intra-sentential language mixing per family, as observed in the LENA recordings. Each bar represents the parental language mixing in one household.
3.2. RQ2: To what extent does parental language mixing relate to children’s vocabulary outcomes?
All regression coefficients of language mixing on language outcomes and their 95% confidence intervals can be found in Table 4 and Figure 3. The corresponding BF.C and PMPa for our hypotheses are presented in Table 5. In Figures 3A–F, a general trend can be observed that language mixing seems to have little to no relation with the majority language (Dutch), regardless of type of language mixing, method of measurement or vocabulary modality. Regarding relations with the minority language (Polish/Turkish), overall language mixing and intra-sentential language mixing seem to be negatively related to vocabulary outcomes. We will discuss the model outcomes in more detail below.
Table 4. Regression coefficients and 95% confidence intervals of parental language mixing on children’s language outcomes

Note. Colors indicate the following: blue = no relation, red = negative relation, yellow = negative trend, green = positive trend. A regression coefficient was considered to indicate a significant relation when its confidence interval did not contain 0, and a trend when its confidence interval crossed 0 by less than .5.

Figure 3. The relation between all types of language mixing and expressive vocabulary (A – C), receptive vocabulary (D – F) and sentence repetition scores (G – I). Sentence repetition scores were only available in Dutch.
Table 5. Bayes factors and posterior model probabilities for our hypotheses regarding the relation between language mixing and language outcome

Note. inter = inter-sentential language mixing, intra = intra-sentential language mixing, 1 = Dutch vocabulary and 2 = Polish/Turkish vocabulary. Note that the hypotheses for sentence repetition were only available for the majority language.
3.2.1. Language mixing and receptive vocabulary outcomes
By inspecting the regression coefficients (Table 4), we see that overall, intra-sentential and inter-sentential language mixing do not relate to receptive vocabulary outcomes in Dutch or the minority language. A combination of relatively high BFs and posterior model probabilities confirm that both the Q-BEx and the LENA find positive evidence that there is no relation between overall language mixing and children’s receptive vocabulary outcomes.
The results of the multivariate regressions are not decisive regarding the relation between the types of mixing and children’s receptive vocabulary outcomes. The Q-BEx finds strong evidence for a non-existent relation, but the Bayesian error probability of .20 prevents us from fully discarding the other hypotheses. This can be interpreted similarly to how a non-significant p-value does not lead to acceptance of the H0, but merely that the data fail to fully discard H0. When language mixing is measured with the LENA, equal evidence is found for H0: there is no relation between each type of language mixing and receptive vocabulary, and for H1: there is a negative relation between intra-sentential language mixing and receptive vocabulary. The shared evidence likely stems from both hypotheses partly predicting the true effect. The regression coefficients in Table 4 show that inter-sentential language mixing is indeed unrelated to Dutch and Polish/Turkish receptive vocabulary (as predicted by both H0 and H1), intra-sentential language mixing is unrelated to Dutch receptive vocabulary (as predicted by H0) but negatively related to Polish/Turkish receptive vocabulary (as predicted by H1). Thus, our data suggest that intra-sentential language mixing is not related to receptive Dutch vocabulary, but negatively related to receptive Polish/Turkish vocabulary. This newly generated hypothesis should be tested in future studies with new data.
3.2.2. Language mixing and expressive vocabulary outcomes
We specified in our hypotheses that the relations between language mixing and vocabulary outcomes would be the same in both languages. In Table 4 we observe that overall language mixing is unrelated to Dutch expressive vocabulary, but negatively related to Polish/Turkish expressive vocabulary. Our hypotheses are thus not in line with our data. Consequently, none of our informative hypotheses about the relation between overall language mixing and expressive vocabulary receives any support compared to its complement (all BF.Cs < 1; Table 5).
When looking at the types of language mixing, neither LENA nor Q-BEx finds any relation between inter-sentential or intra-sentential language mixing and Dutch expressive vocabulary. The results of the multivariate regressions are not decisive regarding the relation between types of language mixing and expressive vocabulary in the minority language. A negative relation is observed between inter-sentential language mixing as measured by the Q-BEx questionnaire and Polish/Turkish expressive vocabulary. No negative relations with inter-sentential language mixing were predicted by any of our informative hypotheses. Therefore, the Q-BEx data do not support any hypothesis in the set (Table 5). With the LENA method, a negative trend between intra-sentential language mixing and Polish/Turkish expressive vocabulary is observed (Table 4). This negative trend contributes to the evidence that LENA finds for H1: expressive vocabulary is not related to inter-sentential language mixing, but negatively related to intra-sentential language mixing.
3.3. RQ3: To what extent does parental language mixing relate to children’s sentence repetition scores?
As can be seen in Figures 3G–I and Table 4, all regression coefficients are weak and indicate a non-existent relation between all types of language mixing and Dutch sentence repetition scores, but the results from our Bayesian regression analyses show high error probabilities (Table 5). Both the Q-BEx and LENA find the most support for the H0 in the case of overall language mixing and types of language mixing. In conclusion, based on regression coefficients and BFs, our data strongly suggest that all types of language mixing are unrelated to Dutch sentence repetition scores, although high error probabilities prevent us from ruling out other possible hypotheses.
3.4. RQ4: Does the method used to measure language mixing affect the observed relations between parental language mixing and children’s language outcomes?
As can be seen in Figure 3, the solid lines (LENA) and the dotted lines (Q-BEx) are very close to each other, indicating that both methods of measurement generally predict the same relation between language mixing and language outcomes. Indeed, leaving trends aside, data obtained with LENA and Q-BEx yield the same conclusion in 14 out of 15 regressions (either no relation or a negative relation; Table 4). The only relation where the Q-BEx and LENA differ is that between inter-sentential language mixing and Polish/Turkish expressive vocabulary; Q-BEx points to a negative relation, whereas for LENA no relation emerged. This discrepancy results in the LENA method providing evidence for H1, whereas the Q-BEx did not find evidence for any hypothesis in the set (Table 5). In nearly all other cases, both methods agreed on which informative hypothesis of the set received the most support, resulting in identical conclusions.
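One way to see where the 15 regressions come from, and how agreement between methods can be counted, is the Python sketch below. It assumes that the 15 relations are the three mixing measures (overall, intra-sentential, inter-sentential) crossed with the five language outcomes. The conclusion labels are taken from the prose of this paper, with trends set aside, and not from Table 4, which is not reproduced here; the function names are illustrative.

```python
from itertools import product

MIXING_MEASURES = ("overall", "intra-sentential", "inter-sentential")
OUTCOMES = (
    "expressive vocabulary (Dutch)",
    "expressive vocabulary (Polish/Turkish)",
    "receptive vocabulary (Dutch)",
    "receptive vocabulary (Polish/Turkish)",
    "sentence repetition (Dutch)",
)


def conclusions_from_text(method):
    """Conclusion per relation as described in the prose (trends set aside)."""
    labels = {rel: "no relation" for rel in product(MIXING_MEASURES, OUTCOMES)}
    # Both methods: overall mixing relates negatively to minority expressive vocabulary.
    labels[("overall", "expressive vocabulary (Polish/Turkish)")] = "negative"
    if method == "Q-BEx":
        # Only the questionnaire points to a negative relation here.
        labels[("inter-sentential", "expressive vocabulary (Polish/Turkish)")] = "negative"
    return labels


def agreement(first, second):
    """Return the number of agreeing relations and the list of differing ones."""
    differing = [rel for rel in first if first[rel] != second[rel]]
    return len(first) - len(differing), differing


agreeing, differing = agreement(
    conclusions_from_text("LENA"), conclusions_from_text("Q-BEx")
)
print(agreeing, differing)
```

Under these assumptions, the two methods agree on 14 of the 15 relations and differ only on inter-sentential mixing and minority-language expressive vocabulary.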
4. Discussion
The present study investigated the relation between parental language mixing and bilingual children’s language outcomes. Previous studies have investigated this relation, and their results were equivocal. Their findings varied from a negative relation to a positive relation to no relation at all (Bail et al., Reference Bail, Morini and Newman2015; Byers-Heinlein, Reference Byers-Heinlein2013; Carbajal & Peperkamp, Reference Carbajal and Peperkamp2020; Place & Hoff, Reference Place and Hoff2011, Reference Place and Hoff2016). These inconsistent findings may be related to which aspect of language development was examined, which languages were examined, which group of participants was included in the study, which type of language mixing was investigated and/or which method was used to measure language mixing. To ensure a thorough approach, the present study used two different methods to measure language mixing and distinguished between intra- and inter-sentential language mixing. Moreover, it included receptive and expressive vocabulary outcomes in both the majority and minority languages as well as sentence repetition scores in two different groups of participants. This discussion will address the relation between parental language mixing and language outcomes in greater detail, focusing on how our results align with the existing body of literature on the potentially negative, positive or non-existent relations. Moreover, we will discuss the role of the method that is used to measure language mixing. However, before going into this, we will discuss the observed frequency and type of language mixing of our two groups of parents. This is important because the disparate findings between previous studies may be partly attributed to having investigated participants from different communities, who may exhibit different language-mixing practices (Torres & Potowski, Reference Torres and Potowski2016). 
The inclusion of two groups and the comparison of their language-mixing behavior is something that has not been addressed so far in other studies.
4.1. Frequency and type of language mixing by Polish-Dutch and Turkish-Dutch parents
We hypothesized that the Turkish-Dutch parents would engage more in language mixing than the Polish-Dutch parents because in general Turkish immigrants in the Netherlands have a longer migration history than the Polish immigrants and have developed a practice of intense language mixing (Backus & Demirçay, Reference Backus and Demirçay2021; CBS, n.d.). Additionally, it was speculated that the Turkish group has access to more bilingual interlocutors, facilitating bilingual language use such as language-mixing practices. However, we could not conclude that the Turkish parents mixed more than the Polish parents based on our data.
There are two potential explanations for the lack of evidence. First, our sample may not be representative of the general Turkish community in the Netherlands. According to Statistics Netherlands (CBS, n.d.), 52% of Dutch Turks are born in the Netherlands. In our sample, however, only 38% of Turkish parents were born in the Netherlands. This means that a relatively large percentage of Turkish parents in our sample did not grow up in the context of the intense mixing that characterizes the Turkish community in the Netherlands (Backus & Demirçay, Reference Backus and Demirçay2021). As a result, the Turkish families in our sample may mix less than the average Turkish family in the Netherlands. Second, the bain package uses the largest prior variance by default, resulting in a slight preference for the null hypothesis. The developers motivate this default as follows: “In an era of heightened awareness of publication bias, sloppy science, and irreplaceability of research results, researchers should be conservative, that is, convincing evidence is needed before another hypothesis is preferred over H0” (Hoijtink et al., Reference Hoijtink, Gu and Mulder2019, p. 30). This may have led to an increase in support for the null hypothesis.
Regarding the different types of language mixing, both groups mixed more inter-sententially than intra-sententially, which is the common pattern among bilingual speakers (Bail et al., Reference Bail, Morini and Newman2015; Kremin et al., Reference Kremin, Alves, Orena, Polka and Byers-Heinlein2022; Torres & Potowski, Reference Torres and Potowski2016). However, the discrepancy between inter- and intra-sentential language mixing was smaller in the Turkish-Dutch group. One possible explanation for this finding is that we considered conversational language mixing (e.g., when the parent responds in a language other than the one in which the child addressed them; Gross et al., Reference Gross, López González, Girardin and Almeida2022; Ribot & Hoff, Reference Ribot and Hoff2014) to be a form of inter-sentential language mixing. It may be that parents who do not (natively) speak Dutch are more likely to respond in the minority language than parents who do speak Dutch, resulting in relatively more cross-speaker inter-sentential language mixing. In our Polish sample, 86% of parents were born outside of the Netherlands (in contrast to 69% of Turkish parents); as a result, the Polish group may engage more in cross-speaker inter-sentential language mixing compared to the Turkish group. In addition, several Turkish parents may show more intra-sentential language mixing as a result of the intense language mixing practices that are typical for the Turkish community in the Netherlands (Backus & Demirçay, Reference Backus and Demirçay2021).
4.2. Relations between caregiver language mixing and children’s language outcomes
In the introduction, we reviewed empirical and conceptual support for a negative, positive or non-existent relation between parental language mixing and children’s language outcomes. The present study found (a) evidence for a negative relation between overall language mixing and children’s expressive vocabulary in the minority language, regardless of the method with which language mixing was measured (LENA or Q-BEx); (b) no evidence for any positive relation with children’s language outcomes and (c) in particular, evidence for the absence of a relation between parental language mixing and children’s language outcomes when it comes to receptive and expressive vocabulary and sentence repetition scores in the majority language and receptive vocabulary in the minority language.
4.2.1. Negative relation
The negative relation between overall language mixing and expressive vocabulary in the minority language may be explained in two ways. First, the minority language may be susceptible to effects of language mixing in the home environment because mixed language input constitutes a relatively large proportion of the total language input in this language. This contrasts with the majority language, to which children are frequently exposed outside of the home environment and in largely monolingual settings (e.g., in daycare/school). It may thus be that the proportion of mixed language input is too small to impact any language outcomes in the majority language (Kremin et al., Reference Kremin, Alves, Orena, Polka and Byers-Heinlein2022), whereas it may be substantial enough to affect minority language outcomes. This could also explain why we found evidence for a relation between overall language mixing and child language outcomes, while no relations emerged for intra- and inter-sentential language mixing individually. Each of the latter two types of language mixing may make up too small a part of the total language input on its own (Bail et al., Reference Bail, Morini and Newman2015; Kremin et al., Reference Kremin, Alves, Orena, Polka and Byers-Heinlein2022). Additionally, it has been argued that language mixing might diminish the prestige of the minority language (Cooper & Fishman, Reference Cooper and Fishman1971; Place & Hoff, Reference Place and Hoff2016), which may discourage children from learning the minority language and negatively affect their language outcomes.
Second, (mixed) language input may specifically impact expressive language skills (Dijkstra et al., Reference Dijkstra, Kuiken, Jorna and Klinkenberg2016; Verhoeven et al., Reference Verhoeven, van Witteloostuijn, Oudgenoeg-Paz and Blom2024). The Weaker Links Hypothesis (Gollan et al., Reference Gollan, Montoya, Cera and Sandoval2008) explains this phenomenon based on phonological representations (i.e., the mental representations of the combination of sounds that make up words) that become stronger with experience. When children hear sentences or words more frequently, their phonological representations of those words become stronger (Gibson et al., Reference Gibson, Peña and Bedore2014). When two languages are mixed, the monolingual equivalents occur less frequently and become less entrenched. This weakens their phonological representations. Importantly, these weaker phonological representations may be sufficient for recognizing words, but not for producing them (Gollan et al., Reference Gollan, Slattery, Goldenberg, Van Assche, Duyck and Rayner2011). Then, in line with the findings reported here, language mixing would not result in any issues with regard to the recognition of words (receptive vocabulary), but it would impede the retrieval of words from the mental lexicon during production (expressive vocabulary).
Place and Hoff (Reference Place and Hoff2011, Reference Place and Hoff2016) are the only other studies that have related the frequency of parental language mixing to expressive vocabulary outcomes in the minority language, but they found no relation. This may be explained by their measures of language mixing not accurately representing the overall language mixing that takes place within the home environment. Language diaries lack information on how frequently the languages are being mixed within the 30-minute blocks and the Language Mixing Scale (LMS) is a closer representation of intra-sentential language mixing than overall language mixing (Place & Hoff, Reference Place and Hoff2016). The two studies that did find negative relations did so for receptive vocabulary in the majority language in children younger than 18 months (Byers-Heinlein, Reference Byers-Heinlein2013; Carbajal & Peperkamp, Reference Carbajal and Peperkamp2020). These effects were not found in older children, suggesting that children may become more proficient at processing mixed language input with age, thus overcoming the previously identified negative effect of language mixing.
4.2.2. Positive relation
We did not find any evidence for a positive relation between language mixing and children’s vocabulary outcomes as was found by Bail et al. (Reference Bail, Morini and Newman2015). Importantly, Bail and colleagues used correlational analyses without controlling for other predictor variables. It may be that the positive relation they found between intra-sentential language mixing and children’s total vocabulary score disappears when age, gender, language input and parental education are controlled for.
4.2.3. No relation
In this study, we investigated 15 potential relations between language mixing by parents and children’s language outcomes. Out of those 15 relations, evidence for only one negative relation emerged, as discussed above. Importantly, we found evidence that parental language mixing was unrelated to children’s language outcomes in the majority language, regardless of type of mixing and language outcome measure. In addition, parental language mixing was unrelated to receptive vocabulary in the minority language. This suggests that the null findings of Bail et al. (Reference Bail, Morini and Newman2015) and Place and Hoff (Reference Place and Hoff2011, Reference Place and Hoff2016), who looked at Spanish-English bilinguals between 17 and 30 months old, can be generalized to other populations (i.e., Polish-Dutch and Turkish-Dutch language combinations and children aged 36–72 months). Most previous research focused on children’s vocabulary outcomes, except for Place and Hoff (Reference Place and Hoff2011, Reference Place and Hoff2016), who also investigated the relation between parental language mixing and grammatical aspects of children’s language. Using a sentence repetition measure, we found, in line with Place and Hoff (Reference Place and Hoff2011, Reference Place and Hoff2016), that language mixing in the input was unrelated to children’s grammar outcomes, at least in the majority language.
4.2.4. The role of method used to measure language mixing
To determine whether the method used to measure language mixing underlies the disparate findings from previous studies, we measured language mixing with daylong audio recordings made with LENA and with the Q-BEx questionnaire. Previous studies have used language diaries, the LMS or short recordings in a laboratory setting, but called for more naturalistic measures of language mixing in the home environment (Bail et al., Reference Bail, Morini and Newman2015; Byers-Heinlein, Reference Byers-Heinlein2013; Carbajal & Peperkamp, Reference Carbajal and Peperkamp2020; Place & Hoff, Reference Place and Hoff2011, Reference Place and Hoff2016). Place and Hoff (Reference Place and Hoff2016) measured language mixing with both a language diary and the LMS and found no relations with any language outcome with either method. The present study found that the LENA recording and the Q-BEx questionnaire mostly yielded comparable results with regard to children’s language outcomes, suggesting that they are comparable instruments to measure language mixing (Verhoeven et al., Reference Verhoeven, van Witteloostuijn, Oudgenoeg-Paz and Blom2024). However, the Q-BEx revealed a negative relation between inter-sentential language mixing and expressive vocabulary in the minority language, while no relation emerged when language mixing was measured with the LENA. This may be due to the difference in what an inter-sentential language mix entails for the two methods. The LENA included conversational language mixes (i.e., cross-speaker) as cases of inter-sentential language mixing, whereas the Q-BEx only contained within-speaker inter-sentential language mixes. As noted earlier, the disparate findings from previous studies may in part be ascribed to their different methodologies (e.g., Byers-Heinlein (Reference Byers-Heinlein2013) finding a negative relation with the LMS and Bail et al. (Reference Bail, Morini and Newman2015) finding a positive relation with audio recordings).
However, the current study shows that the language mixing measure has limited impact, as long as the measures target the same construct (e.g., whether inter-sentential language mixing contains conversational language mixing or not). Other factors that may underlie these disparate results, such as group of participants, ages, the languages they speak and how accustomed the participants are to language mixing, should be addressed in future work.
4.3. Limitations and future research
The present study is subject to some limitations. The first limitation is our relatively small sample size. Inclusion criteria such as specific language combinations and age ranges, in combination with extensive research methods involving daylong audio recordings, make it challenging to obtain a large sample. Furthermore, the (manual) processing of naturalistic data leads to intensive workloads (see also suggestions from Cychosz et al., Reference Cychosz, Villanueva and Weisleder2021). The rapid development of technology will hopefully facilitate automatic processing of bilingual daylong audio recordings in the future, thereby facilitating the collection and analysis of larger samples using this method. Smaller sample sizes result in higher Bayesian error probabilities, preventing us in some cases from discarding the other hypotheses in the set. Thus, despite our data sometimes showing strong evidence in support of a hypothesis with a high BF, the high error probabilities forced us to conclude that the results remained inconclusive (e.g., the hypothesis that intra- and inter-sentential language mixing did not relate to sentence repetition outcomes had a BF of 21.27, but a Bayesian error probability of .32). Even though our results may not allow for very strong conclusions, our analytic approach is strong and nuanced. Moreover, communicating weaker results to the field is important to help mitigate publication bias (Song et al., Reference Song and Loke2013).
Second, only one (weekend) day was recorded. As the children in our sample go to daycare or school, it was not feasible to record weekdays. Even though parents exhibit similar language use during weekdays and weekend days (Orena et al., 2020), recording a single day provides merely a snapshot and may lead to inaccurate representations of the typical language use in the home environment (e.g., when a parent who mixes their languages frequently is not at home during the recording day). However, Verhoeven et al. (2024) showed that a single daylong audio recording can be considered a good measure of the bilingual language environment.
Third, we measured grammatical abilities only in Dutch, the majority language, and not in Polish and Turkish, the two minority languages. It thus remains an open question whether the observed negative relation between language mixing and minority language outcomes is limited to expressive vocabulary.
Finally, the cross-sectional design of the study does not allow us to investigate the direction of the relation between overall language mixing and expressive vocabulary in the minority language (Bail et al., 2015; Soderstrom, 2007). It remains unknown whether parents alter their language use based on children’s expressive skills, or whether parental language mixing affects children’s ability or desire to express themselves in the minority language. Future studies should conduct a longitudinal investigation to ascertain whether there is a causal relation, and in which direction.
4.4. Conclusion
The current study provides rich and naturalistic data regarding the language-mixing behavior of multilingual families in the Netherlands and its relation to children’s language outcomes. Our data do not provide strong evidence for differences in overall language mixing frequency between the Turkish-Dutch and Polish-Dutch families, but they do suggest that Turkish-Dutch parents mix more intra-sententially than Polish-Dutch parents. Parental language mixing was not related to any language outcome in the majority language, whereas overall language mixing was negatively related to children’s expressive vocabulary outcomes in the minority language. We abstain from advising parents to separate their languages because (a) the direction of the negative relation is unclear, (b) overall, our study provided evidence that parental language mixing and child language outcomes are unrelated and (c) language mixing is such a common practice (Kremin et al., 2022). Parents may, however, consider specific rules about minority language use in the home environment, as these have been found to support minority language maintenance (Hollebeke et al., 2023), but these do not need to involve a strict separation of the two languages.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S1366728925100175.
Data availability statement
The data that support the findings of this study will be openly available on the Open Science Framework at https://osf.io/qudr7/.
Acknowledgments
We would like to thank all participating children and parents. Furthermore, we thank Laura Koelma, Vera Snijders, Gülşah Yaziçi, Hatice Bulut, Fatma Nur Öztürk, Klaudia Latkowska, Patricia Dworak, Zuzanna Kruber and Jakob Kaiser for their help with collecting and/or processing the data. Lastly, we want to thank Herbert Hoijtink for his help with the Bayesian statistics.
Funding statement
This research is funded by the VICI grant awarded to E.B. by the Netherlands Organization for Scientific Research (NWO; Grant Number VI.C.191.042).
Competing interests
The authors declare that there are no competing interests.
Disclosure of AI tools
The authors declare that no images, text, data or ideas have been AI-generated. The translational tool DeepL was used for translation purposes, as English is not the authors’ native language.
Ethical standard
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Appendix: Steps to calculate the number of occurrences of parental language mixing from daylong audio recordings
1. Based on the automatic output provided by the LENA software, we first removed silent segments. Subsequently, 54 5-minute segments were sampled based on their conversational turn count (CTC). We selected 18 segments that contained the highest number of conversational turns, 18 segments with the lowest number of conversational turns (that were not silent) and 18 segments that were in the middle. More detailed information regarding the sampling method can be found on the Open Science Framework page of the study.
2. From those 5-minute segments, we analyzed every other 30-second segment (Marasli & Montag, 2023; Ramírez-Esparza et al., 2014, 2017), resulting in 270 30-second segments per participant (five from each of the 54 5-minute segments).
3. All 270 30-second segments were manually coded for speaker(s), language(s) spoken, activity and whether there was speech directed to the target child (CDS).
4. Every 30-second segment that was coded as containing more than one language was fully transcribed in CHAT (MacWhinney, 2014) by bilingual research assistants. In the transcript, each time the language changed, it was coded as either intra- or inter-sentential language mixing.
5. All individual occurrences of language mixing were further coded for speaker, addressee, switch direction (i.e., whether the switch was from the majority language to the minority language or vice versa) and switch type. More specifically, we coded whether a switch was within a speaker, between speakers or between two different conversations. The latter was coded when a change of language occurred in the transcript, but the languages were not part of the same conversation. In this case, the switch was not considered an instance of language mixing and was not analyzed further. For within-speaker switches, we further coded the following subtypes: insertions, alternations, congruent lexicalizations (Muysken, 1997) and inter-sentential switches. As we are only interested in the difference between intra- and inter-sentential switches in this study, all instances of insertions, alternations and congruent lexicalizations are collapsed into one category of intra-sentential switches. Switches that occur between speakers are by definition inter-sentential.
6. As we investigate the role of parental language mixing in children’s language outcomes, we retained only those switches that were uttered by a parent to the target child. The numbers of intra- and inter-sentential switches represent how many times these types of switches occurred in the sampled LENA audio, which had a total duration of 135 minutes (i.e., 270 30-second segments). The outcome variable used in the analysis is the frequency of each type of switch per hour of recording, which is calculated by dividing these total mixing frequencies by 2.25 (135 minutes equal 2.25 hours).
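The arithmetic behind steps 1, 2 and 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study. The function names are hypothetical. The "middle" segments are assumed to be those centred on the median CTC rank, and the coded 30-second windows are assumed to start with the first window of each 5-minute segment. The exact sampling procedure is documented on the Open Science Framework page of the study.

```python
from typing import List, Tuple

SEGMENTS_PER_STRATUM = 18   # step 1: 18 highest, 18 lowest, 18 middle
WINDOWS_PER_SEGMENT = 10    # a 5-minute segment holds ten 30-second windows
WINDOW_SECONDS = 30


def sample_segments(ctc_by_segment: List[Tuple[int, int]],
                    k: int = SEGMENTS_PER_STRATUM) -> List[int]:
    """Return ids of the k lowest-, k middle- and k highest-CTC segments.

    ctc_by_segment holds (segment_id, conversational_turn_count) pairs
    for non-silent 5-minute segments only.
    """
    ranked = sorted(ctc_by_segment, key=lambda pair: pair[1])
    if len(ranked) < 3 * k:
        raise ValueError("need at least 3 * k non-silent segments")
    lowest = ranked[:k]
    highest = ranked[-k:]
    # Assumption: "middle" means the k segments centred on the median rank.
    start = (len(ranked) - k) // 2
    middle = ranked[start:start + k]
    return [seg_id for seg_id, _ in lowest + middle + highest]


def coded_windows(segment_ids: List[int]) -> List[Tuple[int, int]]:
    """Step 2: keep every other 30-second window (indices 0, 2, 4, 6, 8).

    Starting at index 0 is an assumption made for this example.
    """
    return [(seg, w)
            for seg in segment_ids
            for w in range(0, WINDOWS_PER_SEGMENT, 2)]


def switches_per_hour(n_switches: int, n_windows: int) -> float:
    """Step 6: convert a raw switch count into a per-hour frequency."""
    hours = n_windows * WINDOW_SECONDS / 3600
    return n_switches / hours


# Hypothetical recording with 100 non-silent segments and made-up CTC values.
example_ids = sample_segments([(i, i) for i in range(100)])
example_windows = coded_windows(example_ids)
print(len(example_ids), len(example_windows))          # 54 segments, 270 windows
print(switches_per_hour(9, len(example_windows)))      # 9 / 2.25 = 4.0
```

With 270 coded windows of 30 seconds each, the divisor is 270 × 30 / 3600 = 2.25 hours, which matches the value used in step 6.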