Introduction
Narratives are indispensable for everyday communication and later academic success (Wallach, 2008). Narrative production, be it personal or picture-based, is a complex activity requiring not only language-specific skills, such as vocabulary and morphosyntax (Berman & Slobin, 1994), but also cognitive skills, such as the ability to recognise the (pictorial) components and connect them into a meaningful whole, i.e., macrostructure (Gagarina & Bohnacker, 2022).
To monitor pictorial stimuli and to connect the events in a causal-temporal order, a storyteller relies on cognitive skills, such as shifting, memory, and inhibition (Mozeiko et al., 2011). Shifting helps discriminate objects or protagonists from complex backgrounds and generate a complete episode. Memory is needed, since constructing a narrative requires the ability to retain the visually presented stimuli and connect the story elements into a cohesive whole in a temporal sequence (logical order). Finally, inhibition is crucial for constructing a cohesive story, since it requires the ability to hold attention to two or more characters or actions simultaneously, organise them, and inhibit irrelevant information while producing the story.
Research on the effects of cognitive skills on narrative production is still inconclusive and deals predominantly with monolingual children and children with developmental language disorders (DLDs). Khan (2013) found that shifting predicted narrative production in three- to six-year-old monolinguals, while Veraksa et al. (2020) reported a significant relationship between macrostructure and visual working memory of pictures and spatial arrangements in typically developing (TD) children aged five to six. Additionally, correlations were found between macrostructure production and attention in both TD and DLD monolinguals aged six to nine (Duinmeijert et al., 2012).
Recently, Oshchepkova et al. (2022) found that narrative macrostructure in six- to seven-year-old Russian-speaking monolinguals and bilinguals is affected by cognitive flexibility (a measure of shifting), working memory, and inhibition, with both groups showing similar macrostructure and cognitive flexibility. Similarly, Clark-Whitney and Melzi (2023) described significant contributions of inhibitory control to personal narratives’ macrostructure in English-Spanish bilingual preschoolers.
These studies employed diverse approaches to measuring macrostructure, which impedes comparability. Moreover, bilingual research is limited to atypical language development, or to comparisons between monolingual and bilingual English-speaking populations. Consequently, further research is needed to understand cognitive effects on narrative macrostructure in both languages of bilinguals and to find out if cognitive mechanisms underlying narratives vary due to linguistic differences. The present study endeavours to address this gap.
Narrative macrostructure
Previous research employed different pictorial stimuli and grammar models to assess macrostructure in picture-based narratives. The seminal study of Berman and Slobin (1994) using Frog, Where are you? (Mayer, 1969) and the research that followed (Fiestas & Peña, 2004; Paradis et al., 2010; Pearson, 2002; Uccelli & Páez, 2007) used the story grammar model (Stein & Glenn, 1979), suggesting that stories consist of single components that are organised around the setting and episodes.
In recent years, the multidimensional model of narrative organisation has served as a base for the Multilingual Assessment Instrument for Narratives (LITMUS – MAIN, henceforth MAIN; Gagarina et al., 2012, 2019), which is part of the Language Impairment Testing in Multilingual Settings battery. This model provides a unified approach to assess narrative organisation (i.e., macrostructure) with picture-based stories. The core constituent of macrostructure in MAIN is an episode including a Goal (G) of a protagonist, an Attempt (A) to reach the goal, an Outcome (O) of the protagonist’s activities and, additionally, internal states (ISs) as initiating and resulting events of these activities. A and O constitute the factual components (that can be directly seen in the pictures), whereas G and IS are the inferred ones. Each MAIN story consists of three comparable episodes with the same number of components. Story Structure (SS) is operationalised as the sum of all components of all episodes in a narrative (quantity dimension) including the introductory setting. Story complexity (SC) assesses the narrator’s ability to combine the core episodic components (quality dimension), with the co-occurrence of G, A, and O forming the highest level of complexity.
For SS, studies using MAIN reported differences across bilinguals’ two languages, showing that it is affected by language proficiency: higher language proficiency resulted in higher SS (Lindgren & Bohnacker, 2022; Kapalková et al., 2016; Lindgren et al., 2023, for a comprehensive overview; Tribushinina et al., 2022). In contrast, SC was found to be similar in bilinguals’ two languages, which might indicate that it is less dependent on language proficiency (Lindgren & Bohnacker, 2022).
Narratives, demographic, and linguistic factors
Bilinguals’ narrative skills are affected by various demographic and linguistic factors. For instance, sex can affect narrative production, with girls displaying better understanding of other people’s thoughts and feelings than boys, especially at preschool age (Sheldon & Engstrom, 2005). Additionally, socioeconomic status (SES) is a determinant factor in children’s language development since it can lead to fewer language disparities in both the first (L1; Hoff, 2013) and second languages (L2; Armon-Lotem et al., 2011). Finally, vocabulary affects SS in both languages of bilinguals (Uccelli & Páez, 2007). However, other studies (Lindgren & Bohnacker, 2022) found that vocabulary impacts SS in the less developed home language only, which the authors attribute to the high proficiency in the societal language.
The present study
The present study explores how cognitive skills affect narrative macrostructure, operationalised as SS and SC, in both languages of Russian-German-bilingual preschoolers aged 4;6 to 5;1. Specifically, we ask how shifting (Figure Ground), memory (Form Completion), and inhibition (Attention Divided) affect macrostructure production when sex, SES, and language proficiency (measured by lexical and morphosyntactic skills) are controlled for. We additionally ask whether cognitive skills affect SS and SC in Russian and German in a similar way.
Based on previous findings (Khan, 2013; Veraksa et al., 2020), we expect the cognitive subtasks to positively affect macrostructure in both languages, after controlling for demographic factors and language proficiency. Given that SC is less dependent on language skills (Lindgren & Bohnacker, 2022), we expect stronger effects of cognitive skills on SC than on SS in both languages.
Method
Participants
Thirty-eight bilingual Russian-German children (16 boys, 22 girls) with a mean age of 4;8 (range = 4;6–5;1, SD = 0;2) participated in the study and were tested in both Russian and German. Their mean age of onset (AoO) for L2 German was 25 months (range: 0–48, SD = 12.87). All children had at least one Russian-speaking parent, always a first-generation immigrant (five fathers and one mother were non-Russian-speaking; home language data were missing from one set of parents and one father in a family with a Russian-speaking mother). The children attended full-time monolingual German or bilingual Russian-German nursery schools in Berlin or Bavaria.
Written parental consent was provided prior to the experiment. Parents filled out a questionnaire (based upon the Russian Language Proficiency Test for Multilingual Children, Gagarina et al., 2010) targeting the children’s AoO, the family’s use of German and Russian, and SES.
An overview of the demographic information is presented in Table 1.
Following the Berlin Senate’s manual (Senatsverwaltung für Wissenschaft, Gesundheit, Pflege und Gleichstellung, 2018), SES was classified on a 4-point scale based on parental education: (4) university degree, (3) a diploma program equivalent to college entrance requirements (Germany: Abitur, Russia: Attestat o Srednem Polnom Obščem Obrazovanii, or comparable in another country), (2) basic general education (Germany: Realschulabschluss, Russia: Attestat o Srednem Obščem Obrazovanii, or comparable in another country), (1) German Hauptschulabschluss (max. nine years of school in total). In the analysis, we used the mean of both parents’ scores. Parents were educated in Germany, Russia, Kazakhstan or Estonia (see Table 2).
Note: 1–4 represent the scale measuring parental education, G is for parents completing their education in Germany, R for Russia, Ot for other countries (e.g., Kazakhstan or Estonia), and NA for missing information.
All parents had completed at least 9 years of schooling. Data on education was missing from one father and one mother; in these cases, the other parent’s score was used. Additionally, two mothers and one father did not fill out the country where they received their highest education (cf. Table 2).
Participants’ hearing, neurological, socioemotional, and language development as well as IQ were within the norm. The participants were tested in their homes or preschools in a quiet room. The study was approved by the German Linguistics Society ethics committee, and it was carried out according to the Declaration of Helsinki.
Materials and procedure
Language proficiency
Language proficiency was assessed with productive and receptive lexicon and sentence comprehension tasks. Parallel tests were used in both languages: the Sprachstandstest Russisch für mehrsprachige Kinder (SRUK; Gagarina et al., 2010) in Russian and the Patholinguistische Diagnostik bei Sprachentwicklungsstörungen (PDSS; Kauschke & Siegmüller, 2009) in German. In both tests, the total score is the number of correctly labelled items in production and reception. For Russian, the maximum total score is 72 (36 verbs and 36 nouns, each with 26 reception and 10 production items), whereas for German, it is 80 (20 verbs and 20 nouns, each in production and reception). For sentence comprehension, the SRUK grammar comprehension subtest (Gagarina et al., 2010) was used for Russian and TROG-D: Test zur Überprüfung des Grammatikverständnisses (Fox, 2011) for German. In TROG-D, the participant listens to a sentence and chooses the correct picture out of four possibilities, including three distractors (lexical or morphosyntactic). There are 21 blocks of four sentences, and a block is scored as incorrect if one or more sentences are identified incorrectly. The test ends after five consecutive incorrect blocks. SRUK uses 11 blocks, each containing two similar structures. A composite score for language proficiency in each language was calculated by summing up the scores from the three tasks. Thus, the maximum composite scores were 83 for Russian and 101 for German.
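The arithmetic behind the composite maxima can be sketched as follows. This is only an illustration: the dictionary layout and the function name `composite_score` are hypothetical, and the sentence comprehension maxima (11 and 21) are assumed to equal the number of blocks in each test, which is what the stated composite maxima of 83 and 101 imply.

```python
# Maximum scores per task, taken from the description above.
# Assumption: one point per sentence comprehension block
# (11 SRUK blocks, 21 TROG-D blocks).
MAX_SCORES = {
    "Russian": {"production": 20, "reception": 52, "sentence_comprehension": 11},
    "German": {"production": 40, "reception": 40, "sentence_comprehension": 21},
}


def composite_score(production: int, reception: int, sentence_comprehension: int) -> int:
    """Sum the three task scores into one language proficiency composite."""
    return production + reception + sentence_comprehension


def max_composite(language: str) -> int:
    """Maximum attainable composite score for a language."""
    return composite_score(**MAX_SCORES[language])


if __name__ == "__main__":
    for lang in MAX_SCORES:
        print(lang, max_composite(lang))
```

Because the two composites have different maxima, comparisons across languages are more readable as percentages of the maximum, which is how the proficiency ranges are reported in the Results.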
Narrative tests
Narrative production (telling) was elicited using the Baby Goats and Baby Birds stories from the MAIN (Gagarina et al., 2012, 2019). Native speakers of Russian and German administered the test in a monolingual mode with a mean interval of 18 days between languages. Story content (Baby Birds/Baby Goats) and order of testing (L1/L2) were counterbalanced across participants. Twenty-two children were tested in German first, and 16 in Russian first. Following the protocol (see Gagarina et al., 2012), the experimenter puts three envelopes in front of the child and asks him/her to choose one and look at the sequence of the pictures in the story. Then, the story is folded up so that the child sees only the first two pictures (then four, then all six pictures), and s/he is asked to tell the story in such a way that only the child can see the sequence of pictures (non-joint attention mode).
The narratives were transcribed verbatim in CHAT (CHILDES; MacWhinney, 2000). The transcribers and coders were trained native speakers of the respective language, and their transcripts were double-checked by a third person for accuracy. The reliability procedure included transcribing one-third of the audio files per language and computing Cohen’s kappa with computerised language analysis; the agreement on the transcribed words was over 80%.
Macrostructure production was operationalised by SS (according to the MAIN protocol) and SC in the following way. For SS, points were awarded for setting information (one point each for time and place; maximum two points) and for each episode component produced: Goal (G), Attempt (A), Outcome (O), and an internal state as an initiating event and as a reaction (maximum five points per episode) in each of the three episodes of the story. Thus, the maximum score was 17 points. Following previous studies (Yang et al., 2023), SC had a maximum score of 3 per episode (maximum 9 for all three). It was scored as follows: for factual components: A, O or AO score 1; for inferred components: G, GA, or GO score 2; for full GAO score 3.
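The two scoring schemes described above can be sketched in a few lines. This is a minimal illustration, not the MAIN scoring software: the component labels and function names are hypothetical, and the score of 0 for an episode with no G, A, or O is an assumption, since the text does not state that case.

```python
from typing import Iterable, List, Set

# Hypothetical labels for the five scorable components of one episode.
EPISODE_COMPONENTS: Set[str] = {"IS_initiating", "G", "A", "O", "IS_reaction"}


def story_structure(setting_time: bool, setting_place: bool,
                    episodes: List[Iterable[str]]) -> int:
    """SS: up to 2 setting points plus 1 point per produced episode component."""
    score = int(setting_time) + int(setting_place)
    for components in episodes:
        score += len(set(components) & EPISODE_COMPONENTS)
    return score


def episode_complexity(components: Iterable[str]) -> int:
    """SC for one episode: A, O, AO = 1; G, GA, GO = 2; GAO = 3."""
    core = set(components) & {"G", "A", "O"}
    if core == {"G", "A", "O"}:
        return 3
    if "G" in core:
        return 2
    if core:
        return 1
    return 0  # assumption: no core component produced scores 0


def story_complexity(episodes: List[Iterable[str]]) -> int:
    """SC for a whole story: sum over its episodes (maximum 9 for three)."""
    return sum(episode_complexity(components) for components in episodes)


if __name__ == "__main__":
    full_episode = ["IS_initiating", "G", "A", "O", "IS_reaction"]
    print(story_structure(True, True, [full_episode] * 3))
    print(story_complexity([["A", "O"], ["G", "A"], ["G", "A", "O"]]))
```

Note that internal states count towards SS but not towards SC, which is why the two measures can diverge for the same narrative.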
Cognitive tests
For the current study, the subtests “Figure Ground,” “Form Completion,” and “Attention Divided” from the Leiter International Performance Scale test (Leiter-3; Roid et al., 2013) were administered. The “Attention Divided” subtest was performed together with the narrative task. The two other subtests were performed six months earlier, as part of the IQ testing for inclusion in the study. Following the test manual, the test was performed nonverbally. Each subtest allows for a teaching trial, which can be repeated up to three times. There are no time limits. According to the manual, the subtests described below show a high internal consistency, with alphas between .80 and .94 for ages three to five.
Figure Ground measures the ability to identify embedded figures within a complex background, and it tests shifting ability. In this subtask, the child is presented with an image (stimulus) and two to five cards. The cards show a figure or item that is located (“hidden”) on the stimulus image. The child must identify and point to the figures in the stimulus. There are a total of 12 stimulus images, and a point is awarded for each item the child correctly locates (maximum score 33). The stopping rule is six cumulative incorrect responses.
Form Completion assesses visual working memory; it requires that the child goes back and forth to recognise a whole picture from randomly displayed pieces. The child is presented with an image and, in the first two trials, must push blocks together to resemble the stimulus image. Then, the child sees a card with an incomplete item and must point to the complete version of the item within the stimulus image. There are 15 stimulus images; each correctly placed block or card gives a point (maximum score 36). The test is stopped if a child gives seven cumulative incorrect responses.
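The stopping rules for Figure Ground (six cumulative incorrect responses) and Form Completion (seven) can be illustrated with a short sketch. The function name and the representation of responses as a list of booleans are hypothetical; the Leiter-3 manual remains the authority on how items are actually administered and scored.

```python
from typing import Iterable


def score_with_stop_rule(responses: Iterable[bool], max_incorrect: int) -> int:
    """Count correct responses until `max_incorrect` cumulative errors occur.

    Responses after the stopping point are ignored, on the assumption that
    testing has ended and those items were never administered.
    """
    correct = 0
    incorrect = 0
    for is_correct in responses:
        if is_correct:
            correct += 1
        else:
            incorrect += 1
            if incorrect >= max_incorrect:
                break  # stopping rule reached
    return correct


if __name__ == "__main__":
    # Made-up response pattern, for illustration only.
    pattern = [True] * 5 + [False] * 6 + [True] * 3
    print(score_with_stop_rule(pattern, max_incorrect=6))  # Figure Ground rule
    print(score_with_stop_rule(pattern, max_incorrect=7))  # Form Completion rule
```

The rule is cumulative rather than consecutive, so errors spread across the whole subtest count towards the limit.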
The subtest Attention Divided tests inhibition skills. It consists of ten cards (four of which show a red triangle) and 12 yellow discs. The child has two tasks. S/he must throw the 12 yellow discs into a yellow cup as quickly as possible. At the same time, the examiner flips over cards (one per second). The child must – while placing the yellow discs – slap the card that has been flipped over if it is a red triangle. If the card does not show a red triangle, the child should not interact with it. The test ends when the examiner has flipped over all 10 cards. There is no stop rule. The subtest Attention Divided results in two scores (max. 16 for each), one for correctly performed actions and one for incorrectly performed actions (a slap on a card without a triangle, a missed triangle, or any yellow discs remaining outside of the cup when the task is finished). Since the test manual describes the incorrect score as identifying a greater set of possible executive function problems (Roid et al., 2013), this score was used in the analysis.
Data analysis
All statistics were performed in R (R Core Team, 2018). In the first step, we compared the performance in language proficiency as well as narrative macrostructure across the two languages in a paired sample two-tailed t-test. For language proficiency, we used a composite score by summing up the correct responses from all three language tests.
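For readers who want to see what the paired comparison computes, the test statistic can be sketched with the Python standard library. The study itself used R, so this is an illustrative re-expression, not the authors' code. The function name is hypothetical, and obtaining the p-value requires the t distribution, which is outside the standard library and is omitted here.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return the paired-samples t statistic and its degrees of freedom.

    t = mean(d) / (sd(d) / sqrt(n)), where d = x - y for each child.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Made-up scores for four children, for illustration only.
    first_language = [1.0, 2.0, 3.0, 4.0]
    second_language = [2.0, 4.0, 5.0, 4.0]
    print(paired_t(first_language, second_language))
```

With 38 children the test has 37 degrees of freedom. The sign of t depends only on which language is subtracted from which.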
Then, we calculated separate hierarchical linear regression models with SS and SC as dependent variables, to find out which predictors affected the two macrostructure dimensions. In total, there were four models, two for each language. All four models considered the same predictors in the same order: demographic factors (sex, SES), language proficiency, and cognitive skills (Figure Ground, Form Completion, Attention Divided).
Since our study is exploratory, we combined hierarchical steps with nonhierarchical ones. The theoretical motivation was to first control for demographic variables and language proficiency, and then to consider the effect of the three cognitive skills. However, within the cognitive skills group, the order in which the individual predictors (independent variables) Figure Ground and Form Completion were added was stepwise rather than hierarchical. In fact, to more accurately understand the effects of both tasks, we calculated models using both orders of entry (first Figure Ground and first Form Completion).
The order of the steps in all four models was thus:
1. Sex
2. SES (average parental education)
3. Language proficiency German/Russian
4. Figure Ground/Form Completion
5. Figure Ground/Form Completion
6. Attention Divided
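The steps above define two entry orders, each producing a sequence of six nested models. A small sketch makes the resulting predictor sets explicit. The variable and function names are hypothetical, and the sketch only enumerates the predictor sets; it does not fit any regression.

```python
from typing import List

# Hypothetical predictor names for the fixed (hierarchical) steps.
CONTROL_STEPS = ["sex", "SES", "language_proficiency"]
# The two cognitive predictors whose order of entry was varied.
SWAPPABLE_STEPS = ["figure_ground", "form_completion"]
FINAL_STEP = "attention_divided"


def entry_orders() -> List[List[str]]:
    """Both orders of entry: Figure Ground first, then Form Completion first."""
    orders = []
    for first, second in (SWAPPABLE_STEPS, SWAPPABLE_STEPS[::-1]):
        orders.append(CONTROL_STEPS + [first, second, FINAL_STEP])
    return orders


def nested_models(order: List[str]) -> List[List[str]]:
    """Predictor set at each step: step k contains the first k predictors."""
    return [order[:k] for k in range(1, len(order) + 1)]


if __name__ == "__main__":
    for order in entry_orders():
        for step, predictors in enumerate(nested_models(order), start=1):
            print(step, " + ".join(predictors))
        print()
```

Each step is then compared with the preceding one, so the two orders differ only at steps 4 and 5. Steps 1 to 3 and step 6 contain the same predictors in both.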
Results
Only significant findings (p ≤ .05) are reported. We describe the different orders of entry only where the results differ. For all the models used, please see the Appendix.
Differences across the two languages
The children’s proficiency in Russian was significantly higher than in German, t(37) = -8.16, α = .05, p < .001. Moreover, German proficiency had a much wider range (15%–83% correct) than Russian (43%–88% correct). Although the maximum score was similar, the minimum one was much lower in German. This indicates high variability in German proficiency among participants.
Similarly, the narrative macrostructure across languages was compared. For SS, the scores in Russian (M = 5.42, SD = 1.90) were significantly higher than for German (M = 3.82, SD = 2.33), t(37) = -4.03, α = .05, p < .001. For SC, no significant difference was found between Russian (M = 3.63, SD = 1.37) and German (M = 2.90, SD = 1.83), t(37) = -1.83, α = .05, p = .07.
Story structure
The model used to predict SS in Russian never significantly explained the variance in the data. The model for German SS including demographic factors and German language proficiency (F(4,34) = 4.13, p = .01, R² = .27) was significantly better than the models with only demographic variables, which did not significantly explain the variance (ΔF(1,34) = 9.35, p = .004, ΔR² = .20). Within this model, the individual variable German language proficiency was significant (t(34) = 3.06, β = .08, p = .004). No other predictor additions (in any order) were significant.
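The relationship between the reported ΔR² and ΔF values can be checked with the standard F-change formula for nested least-squares models, ΔF = (ΔR² / df_added) / ((1 − R²_full) / df_residual). The sketch below assumes that this is the statistic reported here. The function name is hypothetical, and since the inputs are the rounded values from the text, only approximate agreement should be expected.

```python
def f_change(r2_full: float, delta_r2: float, df_added: int, df_resid: int) -> float:
    """F statistic for the R-squared increase between two nested models.

    r2_full  : R-squared of the larger model
    delta_r2 : increase in R-squared over the smaller model
    df_added : number of predictors added in this step
    df_resid : residual degrees of freedom of the larger model
    """
    return (delta_r2 / df_added) / ((1.0 - r2_full) / df_resid)


if __name__ == "__main__":
    # German SS, step adding language proficiency (rounded values from the text).
    print(round(f_change(r2_full=0.27, delta_r2=0.20, df_added=1, df_resid=34), 2))
```

With these rounded inputs the formula gives roughly 9.3, close to the reported ΔF(1,34) = 9.35. The small gap is consistent with rounding of R² and ΔR² to two decimals.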
Story complexity
The models for Russian SC were not significant when they included demographic variables or Russian language proficiency. The model including Figure Ground (F(4,33) = 2.85, p = .04, R² = .26) was significantly more predictive, ΔF(1,33) = 5.40, p = .03, ΔR² = .12. Figure Ground was the best individual predictor (t(33) = 2.32, p = .03), followed by Russian language proficiency, which had a significant negative effect (t(33) = -2.18, β = -.05, p = .04). This is most probably a suppressor effect, which would imply that the two variables are correlated. However, this could not be confirmed numerically using any classic test for correlation (Pearson’s, p = .06; Spearman’s, p = .05; or Kendall’s, p = .05). No other steps significantly improved the model. When Form Completion was added before Figure Ground, the models never reached significance.
The model for German SC remained nonsignificant when it included demographic variables and German language proficiency. When Figure Ground was added first, the model was not significant. Though the addition of Form Completion resulted in significant improvement, the model including it remained nonsignificant.
When Form Completion was added first, it significantly improved the model (ΔF(1,33) = 6.10, p = .02, ΔR² = .14). The model including sex, SES, German language proficiency, and Form Completion was significant (F(4,33) = 2.82, p = .04, R² = .26). Form Completion was a significant individual predictor (t(33) = 2.47, β = .05, p = .02). Adding Figure Ground as the next step did not improve the model, and in fact caused the model to lose significance.
Discussion
The present study examined the effect of cognitive skills on macrostructure production (SS and SC) in both languages of 38 Russian-German-speaking preschool children, after controlling for demographic factors (sex and SES) and language proficiency (measured by lexical and morphosyntactic skills in both languages). The performance in macrostructure dimensions across the two languages was also compared.
Macrostructure production was analysed for SS and SC using MAIN (Gagarina et al., 2012, 2019). The cognitive tasks were assessed nonverbally via Leiter-3 (Roid et al., 2013), motivated by their relevance for telling a picture-based story: Figure Ground (shifting), Form Completion (visual working memory), and Attention Divided (inhibition).
Results showed that SS was higher in the L1 Russian than in the L2 German, aligning with findings by Tribushinina et al. (2022). However, this contrasts with studies such as Lindgren and Bohnacker (2022), which found SS to be higher in the societal language Swedish than in the home language German. For SC, as predicted, there were no differences across languages, thus confirming that it is less language-dependent (Lindgren & Bohnacker, 2022).
The controlled demographic factors (sex, SES) did not explain significant variance in SS or SC in either Russian or German, which is in contrast with previous research on bilinguals (Armon-Lotem et al., 2011; Sheldon & Engstrom, 2005). Language proficiency was a significant predictor of SS, not of SC; however, its effect was only measurable in the weaker language, L2 German SS, not L1 Russian. The differing effects of language proficiency on SS across Russian and German could be explained by variability in vocabulary. Specifically, children’s language proficiency in Russian was high, meaning they had enough vocabulary to produce a story. In contrast, the German vocabulary scores were lower with larger variability, meaning some children had limited German vocabulary. This could prevent them from producing a coherent SS. The same pattern was found in Lindgren and Bohnacker (2022), where language proficiency significantly affected SS in the weaker language but not in the dominant Swedish.
After controlling for demographic factors and language proficiency, cognitive skills did not affect SS in either Russian or German. However, they affected SC in both languages in different ways. While Figure Ground was a significant predictor of Russian SC (together with a negative effect of language proficiency), for German SC, only a model including Form Completion reached significance. Finally, Attention Divided did not affect macrostructure in either language.
Our results on cognitive effects on narratives follow previous literature in which macrostructure was influenced by shifting (Khan, 2013) and visual working memory (Veraksa et al., 2020). Nevertheless, they are in contrast with previous studies on the positive effect of inhibition on narrative macrostructure (Clark-Whitney & Melzi, 2023). This might be because the Attention Divided test was conducted six months earlier than the other cognitive tests, potentially resulting in much lower performance and therefore lack of significant impact on narrative macrostructure.
Since SC did not differ between the two languages, we ask the question: Why does Figure Ground affect L1 Russian SC and Form Completion L2 German SC? The differential effects of cognitive skills on the two languages could be linked to differences in language proficiency. In Russian, children had higher language proficiency, which negatively affected SC when Figure Ground was added to the model. This suggests that children focused more on specific linguistic details to produce correct grammar in the narrative and thus paid less attention to creating more complex episodes in Russian. Consequently, children made frequent shifts to depict suitable objects and combine them to create a cohesive story. Conversely, children had lower proficiency in German, which led them to focus more on holding visually presented stimuli in memory and connecting the story elements into a coherent whole.
Nevertheless, future studies will be necessary to truly differentiate the various cognitive skills affecting storytelling over time and across different language pairs of bilinguals. Furthermore, they should refrain from administering cognitive tests at different time points, particularly with preschool-aged children who are still developing their cognitive skills.
In sum, to our knowledge, studies investigating the effect of cognitive skills on narratives have not been conducted in both languages of bilinguals. Thus, we consider our study exploratory. The findings advance our understanding of how cognitive factors impact bilinguals’ SC by suggesting that shifting and visual working memory have a differential influence on SC in the L1 and L2 of preschool bilinguals, when sex, SES, and language proficiency are controlled for.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0305000924000394.
Acknowledgements
Data from this study were collected in the research project “Verbal and Non-Verbal Indicators for Identifying Specific Language Impairment in Successive Bilingual Preschoolers (DFG Az. LI 410/5-1 & Az. GA 1424/3-1)” funded by the German Research Foundation. We thank the German Research Foundation as well as all those who participated in the project for data collection, transcription and coding. Finally, we thank the two anonymous reviewers and the Action Editor of this Journal for their insightful and thorough comments.
Competing interest
The authors declare none.