The repetition of language-specific non-words: A weak clinical marker for language-related disorders in German-speaking monolingual and multilingual children

Published online by Cambridge University Press:  17 March 2025

Eugen Zaretsky*
Affiliation:
Department of Phoniatrics and Paediatric Audiology, Marburg University Hospital, Philipps University, Marburg, Germany
Benjamin P. Lange
Affiliation:
Department of Social Sciences, IU International University of Applied Sciences, Berlin, Germany.
*Corresponding author: Eugen Zaretsky; E-mail: zaretsky@staff.uni-marburg.de

Abstract

Non-word repetition (NWR) is often utilized for the assessment of phonological short-term memory (PSTM) and as a clinical marker for language-related disorders. In this study, associations between children's language competence and their performance in language-specific NWR tasks as well as the relevance of NWR for the prediction of language development were scrutinized. German preschoolers (N = 1,801) were compared regarding their performance in NWR, German vocabulary, and articulation. For 141 children, results of a school enrolment test were available. Multilingual children performed as well as monolingual German-speaking children in NWR only under the condition of comparable German language skills. NWR performance depended on item length, children's vocabulary and articulation skills and was weakly associated with language-related medical issues. The predictive power of NWR for children's performance in the school enrolment test was minimal. To conclude, chosen German-based NWR tasks did not deliver convincing results as a clinical marker or predictor of language development.

Résumé

La répétition de non-mots (NWR) est souvent utilisée pour évaluer la mémoire phonologique à court terme (PSTM) et comme marqueur clinique des troubles du langage. Dans cette étude, nous examinons de près les associations entre la compétence linguistique des enfants et leur performance dans des tâches de répétition de non-mots spécifiques à la langue, ainsi que la pertinence de la répétition de non-mots pour prédire le développement du langage. Des enfants allemands d’âge préscolaire (N = 1 801) ont été comparés en ce qui concerne leurs performances en matière de NWR, de vocabulaire allemand et d'articulation. Pour 141 enfants, les résultats d'un test d'inscription à l’école étaient disponibles. Les enfants multilingues ont obtenu d'aussi bons résultats que les enfants germanophones monolingues en NWR, à condition que leurs compétences en allemand soient comparables. La performance en NWR dépendait de la longueur des items, du vocabulaire des enfants et de leurs capacités d'articulation, et était faiblement associée à des problèmes médicaux liés à la langue. Le pouvoir prédictif du NWR pour le test d'inscription à l’école était minime. En conclusion, les tâches de NWR choisies en allemand n'ont pas donné de résultats convaincants en tant que marqueur clinique ou prédicteur du développement du langage.

Copyright
Copyright © Canadian Linguistic Association/Association canadienne de linguistique 2025

1. Introduction

Non-word repetition (NWR) tasks utilize non-existing words (non-words) that can be either language-specific (drawn from a given language's morphology and phonemes) or language-unspecific (avoiding any language-specific characteristics such as consonant clusters) (Chiat 2015).¹ NWR tasks are used to quantify phonological short-term memory (PSTM), one of the subsystems of working memory in Baddeley and Hitch's (1974) model, thus measuring our innate ability to preserve phonological information in short-term memory and process it more or less automatically (Archibald 2008). PSTM, in turn, has been shown to be closely linked to language processing (Deldar et al. 2020).

A weak performance in NWR tasks constitutes a more or less reliable clinical marker (statistically significant sign or predictor) of various language-related disorders (Penke 2018, Masoura et al. 2020) as well as of fluency disorders (Howell et al. 2017). It also predicts the pace of L1 and L2 acquisition or learning in children and adults (Ogino et al. 2017, Nowbakht 2019), including those with language-related disorders and impairments such as severe hearing loss (Ching et al. 2019). In some countries (e.g., Germany), NWR tasks are sometimes used in language screenings as a substitute for traditional tasks in speech comprehension, vocabulary, articulation, and grammar (H. Grimm 2003, Shriberg et al. 2009, Lavesson et al. 2018).²

However, the predictive power of NWR tasks varies considerably depending on a number of factors. Among other things, it may be affected by the choice of the study sample (Weismer et al. 2000). Studies with comparatively unselected (population) samples tend to find less prominent differences in NWR performance between typically developing and language-impaired children than studies comparing two groups (children with and without language impairment) matched for age, sex, sociolinguistic background, etc. Children's performances in NWR tasks vary depending on their age (Coady and Evans 2008, Ebert et al. 2008) and other sociodemographic characteristics, such as their family's socioeconomic status (Guerra et al. 2021). Furthermore, the level of difficulty of the NWR tasks can be influenced by the varying linguistic characteristics of non-words, in terms of number of consonant clusters, language-specific phonemes, length of non-words, possible rhymes with real words in the language they are based on, and so on (Coady and Evans 2008, Ferré et al. 2015).

However, irrespective of study sample and design, few studies successfully translated significant differences between typically developing and language-impaired children into acceptable sensitivity and specificity of NWR tasks, which may then be used for the identification of language-related impairments (Schwob et al. 2021). This indicates that such tasks cannot and should not be used as an independent clinical marker for language-related impairments, due to (a) the heterogeneity of such impairments in terms of etiology and symptoms, as well as (b) varying sociodemographic characteristics of children and their families associated with some impairments or subgroups (e.g., bilinguals) (Chiat and Polišenska 2016, Schwob et al. 2021). Nevertheless, in some language tests (e.g., “Kindersprachscreening” [KiSS]; Holler-Zittlau et al. 2011), NWR constitutes an independent cut-off parameter for the identification of children needing medical assistance in language acquisition.

2. Background

2.1 The confounding of language skills with performance in non-word repetition tasks

Doubts have been raised whether NWR tasks are really suitable for the assessment of PSTM or, more generally, working memory (A. Grimm 2016, Arthur 2017) and whether they assess only short-term or also long-term memory (Dollaghan et al. 1995, Casalini et al. 2007). Indeed, performance in NWR tasks depends not only on PSTM capacity, but also on the internalized phonological and lexical representations of the language the non-words are based on (Arthur 2017). With the acquisition of L1 or L2 vocabulary, phonotactic regularities are extracted from the input, statistically analyzed, and built into a complex system of acquired linguistic knowledge. Children's attempts to re-etymologize (lexicalize) non-words, that is, to substitute real words or morphemes for some segments of non-words, are reflections of this phenomenon (Dollaghan et al. 1995).

Multilinguals usually score higher in NWR tasks in their L1 than their L2 (Summers et al. 2010), which again demonstrates a link between language competence and performance in repetition tasks. Also, children who begin to acquire the language the non-words are based on at an earlier age, for instance simultaneous bilinguals (Mathieu et al. 2016), usually perform better on repetition tasks. This confounding of language skills with PSTM performance contributes to a certain ambiguity in the interpretation of the findings from NWR tests (A. Grimm 2016).

2.2 Possible disadvantages of non-word repetition tasks for multilingual children

There are often no separate norms for multilingual children in widely used language tests with NWR tasks (e.g., in the “Screening des Entwicklungsstandes bei Einschulungsuntersuchungen” [Developmental Screening for the School Enrolment Examination] (S-ENS); Döpfner et al. 2005). This means that multilinguals are expected to perform at the same level as monolinguals, which could lead to a misinterpretation of their PSTM performance, especially in case of successive bilingualism and highly language-specific non-words. Due to a well-known link between various language-related impairments and deficits in PSTM, multilinguals with weak L2 competence can be falsely classified as language-impaired if the respective tasks are based on their L2 (Engel de Abreu et al. 2013).

An assessment which includes a high linguistic load of language-specific NWR tasks raises the question of the extent to which their predictive power for children's language acquisition can be attributed to PSTM. Because language-specific non-words presuppose a command of the language-specific phonotactics and vocabulary (Verhagen et al. 2017), it can be assumed that the predictive power of NWR tasks is based less on PSTM performance and more on command of the relevant language. For instance, the weak performance of Turkish-German preschoolers on German-based non-words does not mean that Turkish children have PSTM deficits. Rather, it indicates that they have not yet acquired the regularities of German phonotactics and cannot precisely reproduce L2-based non-words due to lexical and phonotactic transfers from their L1 (Zaretsky et al. 2013).

A reduction of the linguistic load in repetition tasks can minimize differences in repetition performance between monolingual and multilingual children (Meir 2017). For this purpose, various modifications of repetition tasks, such as meaningless sentences, are used (e.g., in the German language screening “Sprachscreening für das Vorschulalter” [Language Screening for the Preschool Age (SSV)]; H. Grimm 2003). Nevertheless, multilingual children seldom score at the same level as or outperform their monolingual peers (e.g., Wild and Fleck 2013), and these rare exceptions seem to be attributable to simultaneous bilingualism or other conditions resulting in comparable L1 and L2 skills, or to comparatively language-independent non-words (Bloder et al. 2023). Under other conditions, multilinguals have been shown to be outperformed by monolinguals given large enough sample sizes (Yoo and Kaushanskaya 2012, Armon-Lotem and Meir 2016, Zaretsky and Hey 2022). Thus, the use of language-specific NWR tasks often puts multilinguals at a disadvantage in comparison with monolinguals and does not clearly reflect their PSTM performance.

However, a growing body of evidence suggests that multilinguals sometimes perform better than monolinguals in tests related to executive functions due to what is sometimes called the “bilingual advantage” (Gunnerud et al. 2020). This term refers to better developed executive functions, including working memory, as a result of the double pressure imposed by multilingualism (Engel de Abreu et al. 2014). Consistent with this view, when studies control for comparable language skills between monolinguals and multilinguals, multilinguals tend to outperform monolinguals in NWR tasks (e.g., Zaretsky and Lange 2023).

2.3 Language-specific vs. “cross-linguistic”/“quasi-universal” repetition tasks

Whereas the present study is dedicated to language-specific non-words, it should be noted that in recent years, some NWR tests with a low linguistic load have been developed, such as Chiat's (2015) “cross-linguistic”/“quasi-universal” items. Such non-words account for the phonotactic regularities of many world languages and, thus, do not put multilingual children at a disadvantage in terms of their overall score. They can also identify language-impaired children both in monolingual and multilingual subgroups (Boerma et al. 2015, Tuller et al. 2018). However, again, they do so not in terms of sensitivity and specificity but, rather, in terms of statistically significant differences in NWR total scores. Although there is some promising evidence based on small samples (e.g., A. Grimm 2022), studies with large samples (e.g., Zaretsky and Hey 2022) show that children's performance in cross-linguistic items is very weakly associated with their language-related impairments. Moreover, language-specific NWR tasks sometimes showed better results than cross-linguistic tasks in the detection of language-related impairments in multilingual children (e.g., Eikerling et al. 2022).

3. Research questions and hypotheses

Our research questions and hypotheses can be summarized as follows:

  1. Do monolingual German-speaking children outperform multilingual children in German-based NWR tasks from the language tests KiSS.2 and S-ENS

    (a) in the whole sample without consideration of German language skills,

    (b) under the condition of comparable German language skills?

We hypothesized that multilinguals would score lower than monolinguals under the condition of deficient German language skills but would perform better under comparable German language skills, as has been shown for another sample of German preschoolers by Zaretsky and Lange (2023). Furthermore, we assumed that one can segment multilinguals into subgroups (here: age and linguistic subgroups) with similar results as for all multilinguals taken together. Due to the supposed “bilingual advantage” (see section 2.2), we could not exclude the possibility that multilinguals would outperform monolinguals in chosen NWR tasks in some linguistically advanced subgroups. The results of the NWR tasks were expected to be associated with performance in (expressive) German vocabulary and articulation. Thus, chosen German-based NWR tasks were predicted to be of limited validity for assessing multilinguals’ PSTM performance due to lexical and phonological interference from other languages, combined with limited command of German.

  2. Does NWR performance correlate with language-related medical issues or allow for identification of such issues in terms of sensitivity and specificity?

No strong associations between NWR performance and language-related medical issues (such as hearing disorders or Developmental Language Disorder, DLD) were expected.

  3. How much does NWR performance depend on extra-linguistic factors (e.g., children's German articulation skills, mono-/multilingualism, language-related impairments) and intra-linguistic factors (e.g., length of non-words)?

It was hypothesized that, taking children's German language skills and intra-linguistic factors into consideration, NWR would be weakly associated with language-related medical issues or other factors such as mono-/multilingualism.

  4. Can NWR performance contribute to the prediction of language development in a follow-up study design in terms of

    (a) providing an independent predictor of the result of a language test (total score of correct answers) that was carried out, on average, 1.5 years later (S-ENS),

    (b) sensitivity and specificity for dichotomized results (“pass/fail”) of the same language test?

Due to linguistic interference in NWR performance, this predictive power was expected to be very limited.

Thus, the study examined intra-linguistic and extra-linguistic influencing factors on the children's performance in NWR tasks and quantified the role of children's language skills, their sociodemographic and medical characteristics, as well as linguistic characteristics of non-words in the NWR performance.

4. Material and Methods

4.1 Participants

In this study, a cumulative data set that included several language test validation studies was analyzed retrospectively. In the original prospective studies (Neumann and Euler 2009, 2010, 2013; Neumann et al. 2011), kindergarten children aged 41 to 71 months (3.0-5.11 years) were included if their parents gave written consent, without any further inclusion or exclusion criteria.³ After the first test session (t1), some children were retested during the school enrolment examination (t2) at the age of 60 to 81 months (5.0-6.9 years).

4.1.1 KiSS.2 samples

At t1, a total of 1,801 three- to five-year-old children were tested in 258 German kindergartens with the language screening KiSS.2 (Neumann and Euler 2009, 2010, 2013; Holler-Zittlau et al. 2011). Because children's age was shown to influence repetition performance (see section 1), monolingual and multilingual children had to be matched for age (in months). For all children taken together (N = 1,801), the age difference between monolinguals and multilinguals was not statistically significant (p > .05), and, thus, they can be considered age-matched. However, in the subsamples of three- to five-year-olds that were used for some calculations, a total of 57 randomly selected children had to be excluded to keep the groups age-matched. Therefore, the subsample sizes of three- to five-year-olds do not add up to 1,801. Biological sex, as one of the variables related to children's language competence, including PSTM (Lange and Zaretsky 2021), was evenly distributed among monolinguals and multilinguals, both in the whole sample and in the subsamples of three- to five-year-olds according to chi-square analyses (ps > .05).

KiSS.2 questionnaires for parents and kindergarten teachers were utilized to define children as monolinguals or multilinguals. Information on the language(s) spoken at home was taken from the questionnaire for parents, or, if not available, from the questionnaire for kindergarten teachers. Children who had acquired at least one language other than German were defined as multilingual.

4.1.2 S-ENS subsample

For a subgroup of 141 children from the KiSS sample (t1), results of the school enrolment test S-ENS (Döpfner et al. 2005) were available (t2). S-ENS was carried out 7 to 33 months after t1 (M = 16.8 ± 6.7 months), at the age of 60 to 81 months (5.0-6.9 years). No inclusion or exclusion criteria for t2 were defined other than a second informed consent form signed by parents and a time span of at least six months between the two test sessions. Among children who were retested with S-ENS, no significant differences between monolinguals and multilinguals regarding age and biological sex were found (ps > .05).

Table 1 presents other sociodemographic characteristics of the t1 and t2 samples, as well as dichotomized results of the language tests KiSS.2 and S-ENS. KiSS.2 categorizes both monolingual and multilingual children as those speaking German age-appropriately (AA) and those needing either educational (ED) or medical (MED) assistance in acquiring German (Neumann et al. 2011). Usually, most MED are also ED but not vice versa. ED are those who can benefit from German language courses. Their deficits in German language competence result from poor quality and/or quantity of German language input. MED need some kind of medical assistance, such as hearing aids in the case of a hearing disorder. Thus, MED are children who are highly likely to have language-related impairments such as DLD, hearing disorders, or attention-deficit/hyperactivity disorder. Criteria for the classifications AA/ED/MED (Neumann et al. 2011) are extensive and include some items from the KiSS.2 questionnaires (e.g., whether the child's German language competence improved in the last six months) and cut-off values for most of the KiSS.2 subtests; that is, certain total scores must be achieved to be classified as AA (e.g., correct pronunciation of /r/ in at least one item out of three). Weak performance in the NWR task also constitutes an independent cut-off criterion for the classification of (monolingual) children as MED. In KiSS.2 validation studies, the definition of ED targeted the lowest 16% of the (monolingual German-speaking) sample regarding German language competence, the definition of MED the lowest 10%.
In terms of the CATALISE study (a multinational and multidisciplinary Delphi consensus study on the terminology regarding language impairments in children), ED children are those with insufficient exposure to the language used by the school or community to be fully fluent in it and who do not have a language impairment (Bishop et al. 2017), whereas MED children are those with language impairments (Bishop et al. 2016). S-ENS categorizes children as “pass/fail”, without differentiation between ED and MED; “fail” means that children's German language competence is considered insufficient for school enrolment. S-ENS cut-off criteria aim to identify children whose German language competence was below the 17th percentile of the (monolingual German-speaking) norming sample and to identify children with language-related medical issues.

Table 1. Sample characteristics: Children tested with the language screening KiSS.2 and the school enrolment test S-ENS

Note. ED: children needing educational assistance in acquiring German; MED: children needing medical assistance in acquiring German; AA: children speaking German age-appropriately.

4.2 Test Materials

The validated, standardized language screening KiSS.2 contains subtests on speech comprehension, productive vocabulary, productive articulation, productive grammar, and PSTM. The latter is assessed by the repetition of four German-based non-words (see Appendix 1) and two sentences. In the present study, only non-words were studied.

Sociodemographic characteristics of children and their families were assessed by the KiSS.2 questionnaires for parents and kindergarten teachers. Information on children's language-related medical issues was taken from the questionnaire for parents (item “Does your child have any disorder or impairment that affects language acquisition?”). This item was used as a gold standard for research question 2 and includes not only DLD but also all language-related impairments with comorbidities, such as hearing disorders or attention-deficit/hyperactivity disorder.

The validated and standardized school enrolment test S-ENS (Döpfner et al. 2005) contains tasks on productive articulation, PSTM (repetition of German-based non-words and sentences), and productive phonological awareness (filling out gaps for single phonemes in words).⁴ S-ENS is one of the official school enrolment tests in Germany and was carried out with all five- to six-year-old children, irrespective of their participation in the present study. For the children included in this study, an extended version of S-ENS was conducted (“S-ENS-Add-on”), with additional subtests on speech comprehension, productive vocabulary, and productive grammar (Neumann et al. 2011). S-ENS non-words (see Appendix 1) are also German-based.

Both KiSS.2 and S-ENS are predominantly picture-based tests. Children are asked questions such as “What is it?” or “What does it feel like?” In case of the PSTM tasks (non-words and sentences), children are asked to repeat after the test administrator. Articulation errors (e.g., those resulting from rhotacism) are not considered as repetition errors. In articulation tasks, children have to name objects on the pictures, but, if not possible, they repeat after the test administrator. For further information on KiSS.2 and S-ENS, see Table 2.

Table 2. Description of language tests KiSS.2 and S-ENS

* Additional tasks known as S-ENS-Add-on (Neumann et al. 2011)

For further information on the testing procedure, see Neumann et al. (2011) and Neumann and Euler (2013).

4.3 Statistical Analysis

First, German language skills of monolinguals and multilinguals, including their subgroups (three-, four-, and five-year-olds; Turkish-, Italian-, English-, and Arabic-speaking children), were compared by means of chi-square tests.

Next, children's language skills according to KiSS.2 and S-ENS subtests on vocabulary, articulation, and NWR were compared for monolinguals and multilinguals by Mann-Whitney U-tests, (a) without any preconditions and (b) under the condition of comparable German language skills.
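
For readers unfamiliar with this test, the following is a minimal sketch of a Mann-Whitney U-test with a normal approximation, written in plain Python. The scores are invented for illustration and are not data from this study, and the function names are hypothetical. The study's own procedure is described in Appendix 2 and may differ in detail (e.g., in the handling of ties or the use of exact p-values).

```python
from math import erfc, sqrt


def midranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for sample x, Z, two-sided p) via the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over every group of t tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / sqrt(variance)
    p = erfc(abs(z) / sqrt(2))
    return u1, z, p


# Invented NWR scores (0-4 correct non-words) for two hypothetical groups.
group_a = [4, 3, 4, 2, 3, 4, 3, 4]
group_b = [2, 3, 1, 2, 4, 2, 3, 1]
u, z, p = mann_whitney_u(group_a, group_b)
print(f"U = {u:.1f}, Z = {z:.2f}, p = {p:.3f}")
```

With total scores that can take only a few values, as in a four-item NWR task, ties are frequent, which is why the sketch uses mid-ranks and a tie-corrected variance.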

The strength of the link between NWR and language-related impairments was quantified by point-biserial correlations, receiver operating characteristic (ROC) curves, and a classification tree.

The predictive power of the KiSS.2 NWR task for S-ENS results was assessed by a linear regression and four ROC curves.

Details on the statistical analyses can be found in Appendix 2.

5. Results

First, dichotomized German language skills of monolinguals and multilinguals (including subgroups) were compared. In t1, multilinguals were classified as ED significantly more often than monolinguals in the whole sample, as well as in all age and linguistic subgroups (see Table 3). In the subgroups of five-year-olds and (three- to five-year-old) Italians, multilinguals were additionally more often classified as MED.

Table 3. Differences between monolingual (MO) and multilingual (MU) children, including subgroups, in dichotomized KiSS.2 results

*** p < .001, ** p < .01, * p < .05

In S-ENS (t2), multilinguals also demonstrated deficient German language skills (“fail”) more often than monolinguals (χ²(1) = 8.65, p = .003; 37/91 (40.7%) vs. 8/49 (16.3%), n = 140, one child without dichotomized result). Thus, both in t1 and t2 monolinguals showed better German language skills than multilinguals and, therefore, could be expected to outperform multilinguals in German-based NWR tasks.
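
The reported test statistic can be reproduced from the counts given above. The following sketch, in plain Python, computes Pearson's chi-square for the 2 × 2 table without a continuity correction. That the uncorrected statistic was used is an assumption, although it is the variant that agrees with the reported value; the variable names are illustrative only.

```python
from math import erfc, sqrt

# Counts taken from the text: "fail" results in S-ENS at t2.
fail_mu, n_mu = 37, 91  # multilingual children
fail_mo, n_mo = 8, 49   # monolingual children

observed = [
    [fail_mu, n_mu - fail_mu],
    [fail_mo, n_mo - fail_mo],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (count - expected) ** 2 / expected

# With one degree of freedom, the upper-tail probability has a closed form.
p_value = erfc(sqrt(chi2 / 2))

print(f"chi2(1) = {chi2:.2f}, p = {p_value:.3f}, n = {total}")
print(f"fail rates: {fail_mu / n_mu:.1%} vs. {fail_mo / n_mo:.1%}")
```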

Second, total scores of correct answers in KiSS.2 subtests on NWR, vocabulary, and articulation were compared for multilinguals and monolinguals, including various subgroups, using Mann-Whitney U-tests (research questions 1a and 1b). As can be seen in Tables 4 and 5 (cf. rows KiSS.2 subtest “Non-words” vs. KiSS.2 subtest “Articulation”), (a) monolinguals outperformed multilinguals in KiSS.2 NWR tasks under the condition of better performance in the articulation subtest, (b) monolinguals outperformed Turkish children under the same condition, and (c) in the subgroups with comparable performances in articulation tasks, monolinguals and multilinguals (including their subgroups) did not differ in their NWR scores. Comparable performance in articulation tasks was mostly achieved by the exclusion of ED and MED children. Also, monolinguals always outperformed multilinguals in vocabulary, including AA subgroups (Tables 4 and 5, row KiSS.2 subtest “Vocabulary”). Therefore, the condition of comparable vocabulary skills could not be tested.

Table 4. Differences between monolingual (MO) and multilingual (MU) children, including subgroups, in KiSS.2 results

*** p < .001, ** p < .01, * p < .05; M: mean; SD: standard deviation; AA: children with age-appropriate German language skills

Table 5. Differences between monolingual children (MO) and subgroups of multilingual children (MU) in KiSS.2 results

*** p < .001, ** p < .01, * p < .05; M: mean; SD: standard deviation; TU: Turkish; IT: Italian; EN: English; AR: Arabic; AA: children with age-appropriate German language skills

In S-ENS, monolinguals outperformed multilinguals in vocabulary (Z = -5.68, p < .001, effect size = .21, n = 141, M = 4.0 ± 1.2 vs. 2.0 ± 1.9), but not in articulation and NWR (p > .05). After the exclusion of children with a “fail” result in S-ENS, multilinguals outperformed monolinguals in NWR (Z = -2.37, p = .018, effect size = .42, n = 95, M = 5.2 ± 0.9 vs. 4.6 ± 1.2), although they remained significantly weaker in vocabulary (Z = -3.15, p = .002, effect size = .32, n = 95, M = 2.9 ± 1.8 vs. 4.1 ± 1.2), without differences in articulation (p > .05). These findings confirm those in Tables 4 and 5 above, where the condition of comparable articulation skills almost reversed the results of monolinguals and multilinguals in two cases (three-year-olds in Table 4 and English-speaking children in Table 5), that is, multilinguals outperformed monolinguals numerically in the NWR task, although the results did not reach statistical significance. Generally, Mann-Whitney U-tests showed that multilinguals could score on the same level as or outperform monolinguals in NWR tasks under the condition of comparable skills in the language the non-words are based on.

A point-biserial correlation between total scores of correctly repeated KiSS.2 non-words and parents’ information on children's language-related medical issues (research question 2) yielded rpb = .142, p < .001. In t2, the same correlation was not statistically significant: rpb = .074, p > .05. Thus, associations between children's NWR performance and language-related impairments can be described as weak to non-existent.
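
As an illustration of the statistic used here, the sketch below computes a point-biserial correlation in plain Python from the difference between the two group means. The data are invented and the coding of the binary variable (1 = parent-reported issue) is an assumption made for the example; the sign of the coefficient depends on that coding.

```python
from math import sqrt


def point_biserial(binary, scores):
    """Point-biserial correlation between a 0/1 variable and a numeric score."""
    n = len(scores)
    group1 = [s for b, s in zip(binary, scores) if b == 1]
    group0 = [s for b, s in zip(binary, scores) if b == 0]
    mean_all = sum(scores) / n
    # Standard deviation of all scores with divisor n (population form).
    sd_all = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    share1 = len(group1) / n
    return (mean1 - mean0) / sd_all * sqrt(share1 * (1 - share1))


# Invented example: 1 = language-related issue reported, 0 = none reported.
has_issue = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
nwr_score = [1, 3, 2, 4, 3, 2, 4, 1, 3, 4]
print(f"r_pb = {point_biserial(has_issue, nwr_score):.3f}")
```

The coefficient is numerically identical to Pearson's r computed on the same two variables, so it can be read on the usual scale, on which values such as .142 or .074 indicate very little shared variance.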

In an ROC curve, language-related impairments were detected by KiSS.2 non-words with a sensitivity of 74.2%, a specificity of 46.4%, and an area under the curve of .652 (95% confidence interval .557-.727). S-ENS non-words identified language-related impairments with a sensitivity of 84.4%, a specificity of 25.0%, and an area under the curve of .558 (.392-.724). Thus, total scores of correct answers in NWR tasks could not be “translated” into reliable quality criteria for the detection of language-related impairments.

A classification tree with NWR performance in KiSS.2 as dependent variable was calculated for the whole sample (research question 3); see Figure 1. Advanced performance in the NWR subtest was predominantly associated with a high KiSS.2 articulation subtest score (the highest level of the classification tree under the dependent variable in Figure 1), followed by a short length of non-words (the next level). On the third level (not depicted in Figure 1), NWR performance was associated with a high KiSS.2 vocabulary score (ps < .05). The best NWR results were shown by children who achieved at least 10 points in the KiSS.2 subtest on articulation and at least nine points in the subtest on vocabulary, which corresponds to the “pass” result in these subtests according to KiSS.2 cut-off criteria. Other independent variables did not yield significant results, that is, did not appear in the classification tree. To sum up, children's performance in the KiSS.2 NWR subtest depended on their German language competence and item length, rather than on their language-related impairments.

Figure 1. Possible extra- and intra-linguistic influencing factors on the answers (correct/wrong) in the KiSS.2 subtest on the repetition of non-words

A linear regression was calculated with the total score of correct answers in S-ENS as dependent variable (research question 4a; see Table 6). The regression model was statistically significant (F(7, 133) = 48.19, p < .001) and accounted for 70% of the variance (corrected R² = .70). Better S-ENS results were most significantly associated with advanced children's vocabulary and articulation skills (in KiSS.2). NWR performance in KiSS.2 was only weakly (positively) associated with the S-ENS total score. A larger time span between t1 and t2 also contributed to a higher S-ENS total score. Thus, the predictive power of non-words for the children's German language competence in, on average, 1.5 years was minimal.
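As a plausibility check, the reported F statistic and the corrected R² can be related to each other with two standard identities for an ordinary least-squares model with an intercept. The sketch below assumes such a model; the function names are hypothetical, and only the values F = 48.19 and the degrees of freedom (7, 133) are taken from the text.

```python
def r_squared_from_f(f_value, df_model, df_resid):
    """Recover R^2 from the overall F test: R^2 = F*p / (F*p + df_resid)."""
    return (f_value * df_model) / (f_value * df_model + df_resid)


def adjusted_r_squared(r2, df_model, df_resid):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n_minus_1 = df_model + df_resid  # n - 1 = p + (n - p - 1)
    return 1.0 - (1.0 - r2) * n_minus_1 / df_resid


r2 = r_squared_from_f(48.19, 7, 133)
adj_r2 = adjusted_r_squared(r2, 7, 133)
print(round(r2, 2), round(adj_r2, 2))
```

Under these assumptions, the unadjusted R² is about .72 and the adjusted value about .70, which agrees with the figure reported for the model.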

Table 6. Possible predictors of the S-ENS total score of correct answers

*** p < .001, * p < .05

Four ROC curves with dichotomized S-ENS results (“pass/fail”) as dependent variable were calculated (research question 4b). The KiSS.2 NWR score as well as the KiSS.2 total score with and without NWR score as independent variables showed unsatisfactory quality criteria in the prediction of the S-ENS result (see Table 7). The addition of the weighted variable “age of onset of German language acquisition” to the KiSS.2 total score improved the sensitivity (93%) and specificity (70%), so that the minimal requirements for both values (90% and 70% respectively) were fulfilled; that is, the predictive power of the model (d) can be considered sufficient. ROC curves were not calculated separately for monolinguals and multilinguals because neither classification tree nor linear regression identified monolingualism/multilingualism as a relevant influencing factor.

Table 7. Characteristics of four ROC curves for the prediction of the dichotomized S-ENS result

ROC receiver operating characteristic, CI confidence interval

6. Discussion

This study has examined whether particular language-specific NWR tasks can be used for the detection of language-impaired children and as a predictor of language acquisition, especially in multilinguals. Because non-words are usually language-specific (although there are exceptions, such as non-words developed by Chiat (2015)), they can put multilingual children with limited L2 skills at a disadvantage (Zaretsky and Lange 2023). Weak results of multilinguals in repetition tasks can be misinterpreted as correlates of language-related medical issues (A. Grimm and Schulz 2016). In this study, multilingual preschoolers had a weaker command of German than their monolingual German-speaking peers and scored significantly lower in chosen German-based NWR tasks (research question 1a). After the exclusion of children with a limited command of German, the performance of multilinguals improved and mostly did not differ from that of monolinguals (research question 1b). Also, in the subgroups where no significant difference between multilinguals and monolinguals in articulation skills was found, multilinguals scored at the same level as monolinguals even before the exclusion of children with weak German language skills. Associations between children's performance in chosen NWR tasks and their language-related impairments were very weak or non-existent (research question 2). Rather, NWR performance depended on children's German articulation and vocabulary skills as well as on the item length (research question 3). Thus, it is highly likely that many multilingual children who were classified as having defective PSTM were not impaired in a clinical sense. Their weak performance in the repetition tasks might have resulted from a limited proficiency in German articulation and vocabulary. Also, although the KiSS.2 NWR task did constitute an independent predictor of the children's linguistic outcomes in the school enrolment examination, its predictive power was very limited (research question 4).

As was shown in section 1, it remains unclear to what extent language-specific NWR tasks can quantify PSTM and to what extent they quantify L1 or L2 competence. In the current study, a close link between repetition tasks and PSTM performance was called into question and examined under consideration of children's German articulation and vocabulary skills. First, the German language skills of monolinguals and multilinguals were compared. It is obvious that children who speak two or three languages at home need more help in acquiring German than monolingual German-speaking children. Therefore, as expected, in t1 multilinguals were classified as ED (according to KiSS.2 criteria) significantly more often than monolinguals (cf. Zaretsky et al. 2014, Eisenwort et al. 2018, Weiland et al. 2019). They also scored significantly lower in the KiSS.2 subtests on articulation and vocabulary. Because non-words in the KiSS.2 subtest on PSTM assessment are German-based, multilinguals were outperformed by monolinguals in these tasks. The exclusion of children with weak German language skills (ED, MED) from both subgroups (multilinguals, monolinguals) resulted in comparable performance of multilingual and monolingual children in the KiSS.2 NWR subtest. These results were confirmed by the analyses of the three- to five-year-old children as well as the four linguistic subgroups, namely children speaking Turkish, Italian, English, and Arabic. The common denominators of the associations between results in KiSS.2 NWR, vocabulary, and articulation tasks can be described as follows:

  • In all subgroups where the performance of multilinguals and monolinguals in the articulation tasks did not differ, it did not differ in NWR either.

  • In all subgroups where multilinguals scored lower than monolinguals in articulation, they also scored lower in NWR.

  • Exclusion of ED and MED led to the improvement of results of multilingual children in articulation and NWR.

  • In all cases, multilinguals scored lower than monolinguals in vocabulary.

  • There is no direct link between the percentage of MED children and children's performance in the KiSS.2 NWR task (e.g., Italian children were more often MED than German monolinguals in spite of comparable NWR performance, Turkish children were as often MED as monolinguals in spite of significantly weaker NWR results) due to a high number of other cut-off criteria for the MED classification.

The exclusion of ED and MED almost reversed the results of multilinguals and monolinguals in two cases. Three-year-old multilinguals with age-appropriate German language skills scored numerically higher than their monolingual peers with a considerable effect size of .36. However, the result did not reach statistical significance, probably due to a low sample size in this subgroup (n = 115). The same is true for English-speaking children (effect size .39, n = 36).

In t2, monolinguals and multilinguals did not differ in their articulation skills. After the exclusion of linguistically weak children, multilinguals outperformed monolinguals in NWR tasks, which supports the findings of t1 and delivers further evidence for the hypothesis of “bilingual advantage” (see section 1).

To sum up, according to univariate calculations, in both t1 and t2 monolinguals could not outperform multilinguals in NWR tasks under the condition of comparable articulation skills, demonstrating that NWR performance depended, first and foremost, on German articulation competence.

Although only 10 language-specific non-words were analyzed in the present study (four KiSS.2 and six S-ENS non-words), a separate publication (Zaretsky and Lange 2023) was dedicated to the analysis of 18 German-based SSV non-words (H. Grimm 2003) and 20 German-based non-words from an older and larger version of KiSS (Neumann and Euler 2009; see footnote 5). The results were directly comparable to those presented here in terms of a close link between children's performance in NWR tasks and the KiSS subtest on articulation. After the exclusion of ED and MED, multilinguals outperformed monolinguals in the NWR task in SSV (cf. S-ENS NWR in the present study) and were at the same NWR performance level in KiSS. Again, this shows that multilinguals might have a certain advantage over monolinguals in PSTM or, more generally, in working memory and/or in other cognitive functions (Bialystok and Barac 2012, Van den Noort et al. 2019).

A very close link between performance in German-based KiSS.2 NWR items and articulation skills in KiSS.2 was confirmed by the highest level of a classification tree. On its next (lower) levels, NWR performance was closely associated with a short item length and children's advanced vocabulary skills. Thus, performance in the NWR tasks can be partially attributed to advanced German language skills. Indeed, KiSS.2 non-words contained several German morphemes (see Appendix 1). This lexical and grammatical load probably put those children who had not yet acquired the relevant morphemes at a disadvantage. Neither the children's age nor their classification as monolinguals/multilinguals appeared as nodes in the classification tree, indicating that a link between NWR and articulation/vocabulary skills was valid for all three- to five-year-old children. The KiSS.2 questionnaire item on children's language-related medical issues likewise did not influence the results. Thus, a multivariate statistical analysis (classification tree) confirmed the results of the univariate methods described above (Mann-Whitney U-tests) and the findings of Zaretsky and Lange (2023) on other German-based non-words.

The link between performance in the NWR subtests in KiSS.2 or S-ENS and known language-related medical issues was also weak or non-existent, according to the point-biserial correlations and ROC curves (see another KiSS.2 study, Zaretsky and Hey (2022), with comparable findings). Thus, if parents’ information about language-related medical issues is taken as the gold standard, a test using the repetition of non-words could not detect most children with such issues. It follows that the German-based non-words administered in the present study can hardly be used to detect children's language-related impairments. Moreover, in spite of some promising evidence regarding “quasi-universal” non-words, at the moment they cannot be recommended as an alternative to the German-based items: in one previous KiSS.2 study, “cross-linguistic” non-words from Chiat (2015) were not as effective in detecting MED as German-based non-words for either monolingual or multilingual subgroups (Zaretsky and Hey 2022). Furthermore, differences in the NWR performance (both “cross-linguistic” and German-based items) were minimal or non-existent for children with and without language-related medical issues, hearing disorders, preterm or high-risk birth, family history of language disorders and of participation in language therapies (Zaretsky and Hey 2022).

It should be noted that most other studies, including those that reported sensitivity and specificity of up to 100% in the identification of language impairments by means of NWR tasks, usually focused on DLD children and utilized very small, carefully selected samples (e.g., Thordardottir and Brandeker 2013). The present study used a broader definition of language impairments and a very large, almost unselected sample, which might account for discrepancies in the results.

A close association between performance on KiSS.2 repetition tasks and other linguistic domains raises the question of whether multilinguals are sometimes pathologized (i.e., prescribed unnecessary medical interventions) due to the misinterpretation of NWR results. Weak performance in NWR tasks has been shown to be linked to numerous language-related disorders and impairments, especially to DLD (Zebib et al. 2020, Schwob et al. 2021). Furthermore, weak results in such tasks are associated with various factors that are indirectly related to delayed language acquisition, such as preterm birth (Kaul et al. 2021). Genetic influences have also been described in the literature, with a family history of language-literacy problems being a significant predictor of children's persisting problems in NWR (Bishop et al. 2012). Therefore, low performance in NWR tasks can be misinterpreted as an indication that a child might have acquired German under some of the unfavourable medical or sociodemographic conditions described above. The cut-off criteria of some language tests contribute to such erroneous interpretations. For instance, in the pre-final KiSS.2 version, low performance in repetition tasks served as a criterion for the classification of multilinguals as MED, but this criterion was excluded prior to the publication of the final version (Neumann and Euler 2010). After that, the rates of MED monolinguals and multilinguals no longer differed (e.g., Weiland et al. 2019). Since this was the only difference in the cut-off criteria of the MED classification between the pre-final and final versions, we can draw the conclusion that the NWR task contributed to the imbalance of MED results among monolinguals and multilinguals in the pre-final KiSS.2 version. However, because most speech-language therapists do not rely on language test results alone but use in-depth diagnostic procedures to examine children's medical issues, misleading PSTM test results might disadvantage multilinguals in terms of unnecessary medical examinations rather than misdiagnoses or therapies.

Apart from associations between children's language skills and their NWR performance, the present study also aimed to quantify the predictive power of NWR tasks for German language competence in a follow-up study design. Indeed, as demonstrated above, linguistic interference in NWR performance does not necessarily mean that chosen repetition tasks assess nothing but articulation and vocabulary skills. As was shown in the linear regression, performance in NWR constituted an independent predictor of children's German language competence at the time of the school enrolment examination (cf. Adlof and Patten 2017, Arthur 2017, Ching et al. 2019, Cunningham et al. 2021). A higher total score of correct answers in the KiSS.2 NWR subtest was associated with a better linguistic outcome in both monolinguals and multilinguals. The result was statistically significant, but the regression coefficient clearly indicates a very weak predictive power of non-words (β = .126).

The predictive power can also be assessed in terms of dichotomous quality criteria. Several ROC curves were calculated to find optimal constellations of sensitivity and specificity for the prediction of “pass/fail” results on the school enrolment examination. Neither NWR alone nor the total KiSS.2 score (with or without NWR) could predict the dichotomized S-ENS result. After the age of onset of language acquisition, one of the most important factors influencing language competence (Mayberry 2007, Abrahamsson and Hyltenstam 2009), was considered in the statistical analyses, sensitivity and specificity reached satisfactory, although not excellent, values. Thus, the conclusion can be drawn that NWR tasks contribute to a certain degree to the success of prediction models but cannot predict later German language skills alone.

Some sociodemographic variables that are known to be significantly associated with NWR scores, such as the family's socioeconomic status (Guerra et al. 2021), could not be controlled for in the present retrospective analysis because relevant questionnaire items were either missing or available only for small subgroups of children. Also, measurements of parents’ PSTM (cf. Waters et al. 2021) were not carried out in the original studies. Less than one tenth of children tested in t1 were retested in t2. Monolinguals and multilinguals could not be matched on two out of three factors that were shown to often be associated with children's language delays and disorders: (1) family history of language disorders and (2) lower educational levels of parents. However, monolinguals and multilinguals were matched for (3) biological sex (Berkmann et al. 2015). All these shortcomings of the original KiSS.2 studies (Neumann and Euler 2009, 2010, 2013, Neumann et al. 2011) can be considered limitations of the retrospective study presented here.

To sum up, language-specific (here: German-based KiSS.2 and S-ENS) non-words are not able to detect children with language-related impairments and cannot contribute much to the prediction of children's language development. Such non-words do constitute an independent predictor of the linguistic outcome in the school enrolment examination, but a very weak one. They assess not only PSTM but also the command of German vocabulary and, especially, articulation. Multilinguals with weak L2 skills cannot be expected to perform at the same level as their monolingual German-speaking peers in L2-based non-words. As a consequence, multilinguals might be prescribed unnecessary medical examinations. Because language-specific repetition tasks were integrated into the test batteries of large-scale language screening programmes, such false positive results might impose an unnecessary economic burden on local communities and families that try to provide children with (often costly) medical examinations instead of German language courses. Generally, quality criteria (sensitivity, specificity) of language-specific non-words are too low to be used in the diagnostics for language-related impairments and for the prediction of language development. It can be assumed that the results of the present study on KiSS.2 and S-ENS non-words can be generalized to many other language-specific NWR tasks regarding such influencing factors as item characteristics (e.g., length) and articulation and/or vocabulary skills.

Appendix 1: Characteristics of KiSS.2 and S-ENS non-words

Appendix 2: Statistical analyses in detail

To verify the assumption of a close link between German language skills and performance in KiSS.2 and S-ENS NWR tasks, differences between monolinguals and multilinguals in command of the German language were assessed for the whole sample (N = 1,801) by cross-tables with the ED and MED classifications, including a Chi-Square calculation. These calculations were repeated for the following t1 subgroups: three-, four- and five-year-olds, as well as children speaking Turkish (n = 130), Italian (n = 51), English (n = 36), and Arabic (n = 35). Calculations were also repeated for the t2 sample. Multilinguals were expected to show limited German language skills compared to monolingual German-speaking children.

Next, total scores of correct answers in the KiSS.2 vocabulary, articulation, and NWR subtests were compared for monolinguals and multilinguals, including subgroups, by means of Mann-Whitney U-tests (see research question 1a in section 3). Calculations were repeated for the t2 sample with S-ENS results. In case of significantly better German language skills in dichotomized language test results (according to cross-tables described in the previous paragraph), monolinguals were expected to outperform multilinguals in KiSS.2 and S-ENS subtests, including NWR.

The Mann-Whitney U-tests were repeated for AA monolinguals and multilinguals only (see research question 1b in section 3). It was assumed that the disadvantage of multilingual children in NWR tasks would vanish if multilinguals and monolinguals are tested under equal conditions (here, comparable German language competence).

The effect sizes in all Mann-Whitney U-tests were quantified with the probability of superiority index (Grissom and Kim 2012). Values close to .5 indicate small effect sizes, those close to .0 large effect sizes.
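The probability of superiority can be estimated directly from two samples as the share of all cross-group pairs in which a score from the first group exceeds a score from the second, with ties counted as half a win; this equals the Mann-Whitney U statistic divided by the product of the group sizes. The sketch below illustrates that estimate with hypothetical toy scores. Reporting the smaller of the two directional values, so that values near .0 signal large effects, is an assumption made here to match the interpretation given above; the study's exact SPSS-based procedure is not reproduced.

```python
def probability_of_superiority(a, b):
    """Estimate P(score from a > score from b); ties count as 0.5."""
    if not a or not b:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for x in a:
        for y in b:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    # wins is the Mann-Whitney U statistic for group a.
    return wins / (len(a) * len(b))


# Hypothetical toy scores for two groups.
group_a = [5, 6, 4, 5]
group_b = [4, 3, 5, 2]
ps_ab = probability_of_superiority(group_a, group_b)
ps_ba = probability_of_superiority(group_b, group_a)
effect = min(ps_ab, ps_ba)  # assumed reporting convention: .5 = no effect
print(ps_ab, ps_ba, effect)
```

The two directional estimates always sum to 1, so a value of .5 means that neither group tends to score higher than the other.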

The strength of the link between total scores of correct answers in the KiSS.2 NWR subtest and presence of language-related impairments (yes/no) according to questionnaires for parents was quantified by means of a point-biserial correlation (see research question 2 in section 3). This item was answered in 904 questionnaires. Calculations were repeated for S-ENS, whose questionnaire also contained the same item (n = 134). Additionally, sensitivity and specificity in the identification of language-related impairments by KiSS.2 and S-ENS non-words were assessed by two ROC curves (ROC = receiver operating characteristic). ROC curves deliver many constellations of sensitivity and specificity depending on cut-off values of the independent variable. The optimal constellation can be chosen manually. This study aimed at sensitivity values of at least 90% and specificity of at least 70%.
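The idea that each cut-off value yields its own pair of sensitivity and specificity can be sketched as follows. The example assumes that a low NWR score counts as a positive screening result (score at or below the cut-off) and uses hypothetical toy data; the function names are likewise hypothetical, and only the target values of 90% sensitivity and 70% specificity come from the text.

```python
def roc_points(scores, impaired):
    """Return (cutoff, sensitivity, specificity) for every observed score.

    A child screens positive when score <= cutoff (assumption: low score = risk).
    """
    positives = sum(impaired)
    negatives = len(impaired) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both impaired and unimpaired cases are required")
    points = []
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, impaired) if y == 1 and s <= cutoff)
        true_neg = sum(1 for s, y in zip(scores, impaired) if y == 0 and s > cutoff)
        points.append((cutoff, true_pos / positives, true_neg / negatives))
    return points


def acceptable_cutoffs(points, min_sens=0.90, min_spec=0.70):
    """Keep only cut-offs that meet both minimal requirements."""
    return [p for p in points if p[1] >= min_sens and p[2] >= min_spec]


# Hypothetical toy data: NWR total scores and impairment status (1 = yes).
scores = [0, 1, 1, 2, 2, 3, 3, 4, 4, 4]
impaired = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
points = roc_points(scores, impaired)
good = acceptable_cutoffs(points)
print(good)
```

If no cut-off satisfies both thresholds, the list is empty, which corresponds to the situation described for the NWR scores in this study.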

Next, an influencing factor most closely associated with the variable “correct/wrong answer” in the KiSS.2 NWR task was identified by means of a multivariate statistical method called classification tree (N = 1,801, calculation method “CHAID = Chi-square Automatic Interaction Detectors” with Bonferroni correction; Bühl 2012). For this purpose, total scores of the KiSS.2 subtests on (a) vocabulary and (b) articulation, under consideration of (c) the classification “monolingual/multilingual”, (d) children's age in months, (e) language-related impairments according to the KiSS.2 questionnaire for parents, (f) four KiSS.2 non-words, (g) non-word length in syllables, (h) German prefixes in non-words (yes/no), (i) German suffixes in non-words (yes/no), (j) German-specific phonemes in non-words (yes/no; see Appendix 1), and (k) number of consonant clusters in non-words were utilized as independent variables (see research question 3 in section 3). Thus, both intra-linguistic and extra-linguistic influencing factors were considered. In contrast to other multivariate statistical methods, classification trees depict interactions between independent variables in a hierarchical way.

To quantify the predictive power of the KiSS.2 NWR subtest for the linguistic part of the school enrolment test S-ENS (including Add-on part), a linear regression was calculated with the total score of correct answers in all linguistic S-ENS subtests as dependent variable (see research question 4a in section 3). The following independent variables were utilized: (a) KiSS.2 NWR score, (b) KiSS.2 vocabulary score, (c) KiSS.2 articulation score, (d) children's age in months in t1, (e) the same for t2, (f) time span in months between t1 and t2, and (g) classification “monolingual/multilingual”. The inclusion of vocabulary and articulation scores as independent variables, along with the NWR score, was necessary to figure out whether the KiSS.2 NWR task constituted an independent predictor of the S-ENS result, if the other two factors (German vocabulary and articulation skills) were taken into account.

Additionally, various ROC curves were calculated to figure out whether NWR can be considered a reliable predictor of performance in the school enrolment test in terms of sensitivity and specificity (see research question 4b in section 3). For these calculations, the dichotomized result (“pass/fail”) of the linguistic part of S-ENS was utilized as a dependent variable. Four ROC curves are presented here, with the following independent variables: (a) KiSS.2 NWR score, (b) KiSS.2 total score without NWR, (c) the same with NWR, (d) KiSS.2 total score including NWR and weighted age of onset of German language acquisition. For (d), the formula was “KiSS.2 total score + ((maximum age of onset in years * 3) - (real age of onset in years * 3))”, with 1 meaning “since birth”. Thus, lower age of onset resulted in higher values of the variable (d). Many more models with different sociodemographic variables from KiSS.2 questionnaires were tried out but delivered considerably worse sensitivity and specificity than models (a-d).
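The weighting formula for model (d) can be written out as a small function. This is a minimal sketch of the quoted formula only; the function and parameter names are hypothetical, and the maximum age of onset of 5 years and the total score of 20 used in the example are assumed values chosen for illustration, not figures from the study.

```python
def weighted_kiss_score(kiss_total, onset_years, max_onset_years, weight=3):
    """KiSS.2 total score + ((max onset * weight) - (real onset * weight)).

    onset_years uses the coding from the text: 1 means "since birth".
    """
    if onset_years > max_onset_years:
        raise ValueError("onset cannot exceed the maximum onset in the sample")
    return kiss_total + (max_onset_years * weight - onset_years * weight)


# Two hypothetical children with the same KiSS.2 total score of 20,
# assuming a maximum age of onset of 5 years in the sample.
early = weighted_kiss_score(20, onset_years=1, max_onset_years=5)
late = weighted_kiss_score(20, onset_years=4, max_onset_years=5)
print(early, late)
```

With equal test scores, the child with the earlier onset of German acquisition receives the higher value, which is the direction described for variable (d).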

All statistical analyses were carried out in the software “IBM SPSS 24” (International Business Machines Corp., Armonk, New York, USA).

Footnotes

1 Abbreviations: AA: children speaking German age-appropriately; CHAID: Chi-square Automatic Interaction Detectors; DLD: Developmental Language Disorder; ED: children needing additional educational assistance in acquiring German; KiSS: language test “Kindersprachscreening”; L1: first language; L2: second language; M: mean; MED: children needing additional medical assistance in acquiring German; NWR: non-word repetition; PSTM: phonological short-term memory; ROC: receiver operating characteristic; S-ENS: school enrolment test “Screening des Entwicklungsstandes bei Einschulungsuntersuchungen”; SD: standard deviation; SSV: language test “Sprachscreening für das Vorschulalter”; t1: test session 1; t2: test session 2.

2 However, many authors emphasize that NWR tasks alone do not suffice for language assessment (e.g., Lüke et al. 2020).

3 For this retrospective analysis of anonymized data, no ethical approval was necessary. The original prospective studies were conducted in accordance with the Declaration of Helsinki, and the protocol (131/07) was approved by the Ethics Committee of Frankfurt/Main University Hospital (Germany).

4 S-ENS also contains several non-linguistic subtests that are of no relevance for the present study.

5 KiSS.2's predecessor, KiSS.XL, which was utilized in Zaretsky and Lange (2023), has never been published or used anywhere except in one KiSS validation study (Neumann and Euler 2009). KiSS.2 is used in a large-scale language screening programme for four-year-old children.

References

Abrahamsson, Niclas, and Hyltenstam, Kenneth. 2009. Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning 59(2): 249306.CrossRefGoogle Scholar
Adlof, Suzanne M., and Patten, Hannah. 2017. Nonword repetition and vocabulary knowledge as predictors of children's phonological and semantic word learning. Journal of Speech, Language, and Hearing Research 60(3): 682693.CrossRefGoogle ScholarPubMed
Archibald, Lisa M. D. 2008. The promise of nonword repetition as a clinical tool. Revue canadienne d'orthophonie et d'audiologie 32(1): 2128.Google Scholar
Armon-Lotem, Sharon, and Meir, Natalia. 2016. Diagnostic accuracy of repetition tasks for the identification of specific language impairment (SLI) in bilingual children: Evidence from Russian and Hebrew. International Journal of Language and Communication Disorders 51(6): 715731.CrossRefGoogle ScholarPubMed
Arthur, Dana. 2017. Influences on nonword repetition in young children. Doctoral dissertation, University of Connecticut, USA.Google Scholar
Baddeley, Alan D., and Hitch, Graham. 1974. Working memory. Psychology of Learning and Motivation: Advances in Research and Theory 8: 4789.CrossRefGoogle Scholar
Berkmann, Nancy D., Wallace, Ina, Watson, Linda, Coyne-Beasley, Tamera, Cullen, Katie, Wood, Charles, and Lohr, Kathleen N.. 2015. Screening for speech and language delays and disorders in children age 5 years or younger: A systematic review for the U.S. Preventive Services Task Force. Report No.: 13-05197-EF-1. Rockville: Agency for Healthcare Research and Quality US.Google Scholar
Bialystok, Ellen, and Barac, Raluca. 2012. Emerging bilingualism: Dissociating advantages for metalinguistic awareness and executive control. Cognition 122(1): 6773.CrossRefGoogle ScholarPubMed
Bishop, Dorothy V., Holt, Georgina, Line, Elisabeth, McDonald, David, McDonald, Sarah, and Watt, Helen. 2012. Parental phonological memory contributes to prediction of outcome of late talkers from 20 months to 4 years: A longitudinal study of precursors of specific language impairment. Journal of Neurodevelopmental Disorders 4(1): 112.CrossRefGoogle Scholar
Bishop, Dorothy V. M., Snowling, Margaret J., Thompson, Paul A., Greenhalgh, Trisha, and the CATALISE-2 consortium. 2016. CATALISE: A multinational and multidisciplinary Delphi consensus study. Identifying language impairments in children. PlosONE 11(7): e0158753.CrossRefGoogle ScholarPubMed
Bishop, Dorothy V. M., Snowling, Margaret J., Thompson, Paul A., Greenhalgh, Trisha, and the CATALISE-2 consortium. 2017. Phase 2 of CATALISE: A multinational and multidisciplinary Delphi consensus study of problems with language development: Terminology. Journal of Child Psychology and Psychiatry 58(10): 10681080.CrossRefGoogle ScholarPubMed
Bloder, Theresa, Eikerling, Maren, and Lorusso, Maria Luisa,. 2023. Evaluating the role of word-related parameters in the discriminative power of a novel nonword repetition task for bilingual children. Clinical Linguistics & Phonetics. https://doi.org/10.1080/02699206.2023.2226304Google ScholarPubMed
Boerma, Tessel, Chiat, Shula, Leseman, Paul, Timmermeister, Mona, Wijnen, Frank, and Blom, Elma. 2015. A quasi-universal nonword repetition task as a diagnostic tool for bilingual children learning Dutch as a second language. Journal of Speech, Language, and Hearing Research 58(6): 1747–1760.
Bühl, Achim. 2012. SPSS 20. Einführung in die moderne Datenanalyse [SPSS 20. Introduction to modern data analysis]. 13th ed. Munich: Pearson.
Casalini, Claudia, Brizzolara, Daniela, Chilosi, Anna, Cipriani, Paola, Marcolini, Stefania, Pecini, Chiara, Roncoli, Silvia, and Burani, Cristina. 2007. Non-word repetition in children with specific language impairment: A deficit in phonological working memory or in long-term verbal knowledge? Cortex 43(6): 769–776.
Chiat, Shula. 2015. Non-word repetition. In Assessing multilingual children. Disentangling bilingualism from language impairment, ed. Armon-Lotem, Sharon, Jong, Jan de, and Meir, Natalia, 125–150. Bristol: Multilingual Matters.
Chiat, Shula, and Polišenska, Kamila. 2016. A framework for cross-linguistic nonword repetition tests: Effects of bilingualism and socioeconomic status on children's performance. Journal of Speech, Language, and Hearing Research 59(5): 1179–1189.
Ching, Teresa Y. C., Cupples, Linda, and Marnane, Vivienne. 2019. Early cognitive predictors of 9-year-old spoken language in children with mild to severe hearing loss using hearing aids. Frontiers in Psychology 10: 2180.
Coady, Jeffry A., and Evans, Julia L. 2008. Uses and interpretations of non-word repetition tasks in children with and without specific language impairments (SLI). International Journal of Communication Disorders 43(1): 1–40.
Cunningham, Anna J., Burgess, Adrian P., Witton, Caroline, Talcott, Joel B., and Shapiro, Laura R. 2021. Dynamic relationships between phonological memory and reading: A five year longitudinal study from age 4 to 9. Developmental Science 24(1): e12986.
Deldar, Zoha, Gevers-Montoro, Carlos, Khatibi, Ali, and Ghazi-Saidi, Ladan. 2020. The interaction between language and working memory: A systematic review of fMRI studies in the past two decades. AIMS Neuroscience 8(1): 1–32.
Dollaghan, Christine A., Biber, Maureen E., and Campbell, Thomas F. 1995. Lexical influences on nonword repetition. Applied Psycholinguistics 16(2): 211–222.
Döpfner, Manfred, Dietmair, Iris, Mersmann, Heiner, Simon, Klaus, and Trost-Brinkhues, Gabrielle. 2005. Screening des Entwicklungsstandes bei Einschulungsuntersuchungen (S-ENS). Theoretische und statistische Grundlagen [Developmental Screening for the School Enrolment Examination (S-ENS). Theory and statistics]. Goettingen: Hogrefe.
Ebert, Kerry D., Kalanek, Jocelyne, Cordero, Kelly N., and Kohnert, Kathryn. 2008. Spanish nonword repetition: Stimuli development and preliminary results. Communication Disorders Quarterly 29(2): 67–74.
Eikerling, Maren Rebecca, Bloder, Theresa Sophie, and Lorusso, Maria Luisa. 2022. A nonword repetition task discriminates typically developing Italian-German bilingual children from bilingual children with Developmental Language Disorder: The role of language-specific and language-non-specific nonwords. Frontiers in Psychology 13: 826540.
Eisenwort, Brigitte, Felnhofer, Anna, and Klier, Claudia. 2018. Mehrsprachiges Aufwachsen und Sprachentwicklungsstörungen. Eine Übersichtsarbeit [Multilingualism in childhood and language impairments: A review]. Zeitschrift für Kinder- und Jugendpsychiatrie und Psychotherapie 46(6): 488–496.
Engel de Abreu, Pascale M., Baldassi, Martine, Puglisi, Marina L., and Befi-Lopes, Debora M. 2013. Cross-linguistic and cross-cultural effects on verbal working memory and vocabulary: Testing language minority children with an immigrant background. Journal of Speech, Language and Hearing Research 56(2): 630–642.
Engel de Abreu, Pascale M., Cruz-Santos, Anabela, and Puglisi, Marina L. 2014. Specific language impairment in language-minority children from low-income families. International Journal of Communication Disorders 49(6): 736–747.
Ferré, Sandrine, Santos, Christophe dos, and de Almeida, Laetitia. 2015. Potential phonological markers for SLI in bilingual children. In Proceedings of the 39th annual Boston University Conference on Language Development, ed. Grillo, Elizabeth and Jepson, Kyle, 152–164. Somerville, MA: Cascadilla Press.
Grimm, Angela. 2016. Quatschwörter nachsprechen – gleiche Anforderungen für alle Kinder? [Repetition of nonwords: The same requirements for all children?]. Diskurs Kindheits- und Jugendforschung 11(1): 113–117.
Grimm, Angela. 2022. The use of the LITMUS quasi-universal nonword repetition task to identify DLD in monolingual and early second language learners aged 8 to 10. Languages 7(3): 218.
Grimm, Angela, and Schulz, Petra. 2016. Warum man bei mehrsprachigen Kindern dreimal nach dem Alter fragen sollte: Sprachfähigkeiten simultan-bilingualer Lerner im Vergleich mit monolingualen und frühen Zweitsprachlernern [Why the age of multilingual children should be asked thrice: Language abilities of simultaneous bilinguals in comparison with monolinguals and early L2 learners]. Diskurs Kindheits- und Jugendforschung 11(1): 27–42.
Grimm, Hannelore. 2003. SSV – Sprachscreening für das Vorschulalter [SSV – Language Screening for the Preschool Age]. Goettingen: Hogrefe.
Grissom, Robert J., and Kim, John J. 2012. Effect sizes for research: Univariate and multivariate applications. 2nd ed. New York: Routledge.
Guerra, Amanda, Hazin, Izabel, Guerra, Yasmin, Roulin, Jean-Luc, Le Gall, Didier, and Roy, Arnaud. 2021. Developmental profile of executive functioning in school-age children from Northeast Brazil. Frontiers in Psychology 11: 596075.
Gunnerud, Hilde L., Ten Braak, Dieuwer, Reikerås, Elin Kirsti L., Donolato, Enrica, and Melby-Lervåg, Monica. 2020. Is bilingualism related to a cognitive advantage in children? A systematic review and meta-analysis. Psychological Bulletin 146(12): 1059–1083.
Holler-Zittlau, Inge, Euler, Harald A., and Neumann, Katrin. 2011. Kindersprachscreening (KiSS) – das hessische Verfahren zur Sprachstandserfassung [Kindersprachscreening (KiSS) – a Hessian test for the language assessment]. Sprachheilarbeit 5(6): 263–268.
Howell, Peter, Tang, Kevin, Tuomainen, Outi, Chan, Sin K., Beltran, Kirsten, Mirawdeli, Avin, and Harris, John. 2017. Identification of fluency and word-finding difficulty in samples of children with diverse language backgrounds. International Journal of Language and Communication Disorders 52(5): 595–611.
Kaul, Ylva F., Johansson, Martin, Månsson, Johanna, Stjernqvist, Karin, Farooqi, Aijaz, Serenius, Frederik, and Thorell, Lisa B. 2021. Cognitive profiles of extremely preterm children: Full-scale IQ hides strengths and weaknesses. Acta Paediatrica 110(6): 1817–1826.
Lange, Benjamin P., and Zaretsky, Eugen. 2021. Sex differences in language competence of four-year-old children: Female advantages are mediated by phonological short-term memory. Applied Psycholinguistics 42(6): 1503–1522.
Lavesson, Ann, Lövdén, Martin, and Hansson, Kristina. 2018. Development of a language screening instrument for Swedish 4-year-olds. International Journal of Language and Communication Disorders 53(3): 605–614.
Lüke, Carina, Starke, Anja, and Ritterfeld, Ute. 2020. Sprachentwicklungsdiagnostik bei mehrsprachigen Kindern [Diagnostics of the language development in multilingual children]. In Sprachentwicklung. Entwicklung – Diagnostik – Förderung im Kleinkind- und Vorschulalter, ed. Sachse, Steffi, Bockmann, Ann-Katrin, and Buschmann, Anke, 221–237. Berlin: Springer.
Masoura, Elvira, Gogou, Anastasia, and Gathercole, Susan E. 2020. Working memory profiles of children with reading difficulties who are learning to read in Greek. Dyslexia 27(3): 312–324.
Mathieu, Jennipher, Lindner, Katrin, Lomako, Julia, and Gagarina, Natalia. 2016. „Wo bist du, kleiner Monster?” Sprachspezifische nonword repetition Tests zur Differenzierung von bilingualen typisch entwickelten Kindern und entsprechenden Risikokindern für USES [“Where are you, little monster?” Language specific nonword repetition tasks to differentiate bilingual typically developing children and those at risk for SLI]. Forschung Sprache 1: 5–24.
Mayberry, Rachel I. 2007. When timing is everything: Age of first-language acquisition effects on second-language learning. Applied Psycholinguistics 28(3): 537–549.
Meir, Natalia. 2017. Effects of Specific Language Impairment (SLI) and bilingualism on verbal short-term memory. Linguistic Approaches to Bilingualism 7(3–4): 301–330.
Neumann, Katrin, and Euler, Harald A. 2009. Einführung einer flächendeckenden Sprachstandserfassung in Hessen. Forschungsbericht [Introduction of a state-wide language screening programme. Research report]. University Frankfurt/Main.
Neumann, Katrin, and Euler, Harald A. 2010. Einführung einer flächendeckenden Sprachstandserfassung in Hessen. Forschungsbericht [Introduction of a state-wide language screening programme. Research report]. University Frankfurt/Main.
Neumann, Katrin, and Euler, Harald A. 2013. Kann ein Sprachstandsscreening zwischen dem Bedarf für Sprachförderung und für Sprachtherapie trennen? [Can a language screening differentiate between need for educational or medical assistance?]. In Sprachförderung und Sprachdiagnostik – interdisziplinäre Perspektiven, ed. Redder, Angelika and Weinert, Sabine, 174–198. Muenster: Waxmann.
Neumann, Katrin, Euler, Harald A., and Zaretsky, Yevgen. 2011. Einführung einer flächendeckenden Sprachstandserfassung in Hessen. Forschungsbericht [Introduction of a state-wide language screening programme. Research report]. University Frankfurt/Main.
Van den Noort, Mauritius, Struys, Esli, Bosch, Peggy, Jaswetz, Lars, Perriard, Benoit, Yeo, Sujung, Barisch, Pia, Vermeire, Katrien, Lee, Sook-Hyun, and Lim, Sabrina. 2019. Does the bilingual advantage in cognitive control exist and if so, what are its modulating factors? A systematic review. Behavioral Sciences (Basel) 9(3): 27.
Nowbakht, Mohammad. 2019. The role of working memory, language proficiency, and learners’ age in second language English learners’ processing and comprehension of anaphoric sentences. Journal of Psycholinguistic Research 48(2): 353–370.
Ogino, Tatsuya, Hanafusa, Kaoru, Morooka, Teruko, Takeuchi, Akihito, Oka, Makio, and Ohtsuka, Yoko. 2017. Predicting the reading skill of Japanese children. Brain and Development 39(2): 112–121.
Penke, Martina. 2018. Verbal agreement inflection in German children with Down syndrome. Journal of Speech, Language, and Hearing Research 61(9): 2217–2234.
Schwob, Salomé, Eddé, Laurane, Jacquin, Laure, Leboulanger, Mégane, Picard, Margot, Oliveira, Patricia R., and Skoruppa, Katrin. 2021. Using nonword repetition to identify developmental language disorder in monolingual and bilingual children: A systematic review and meta-analysis. Journal of Speech, Language, and Hearing Research 64(9): 3578–3593.
Shriberg, Lawrence D., Lohmeier, Heather L., Campbell, Thomas F., Dollaghan, Christine A., Green, Jordan R., and Moore, Christopher A. 2009. A nonword repetition task for speakers with misarticulations: The Syllable Repetition Task (SRT). Journal of Speech, Language, and Hearing Research 52(5): 1189–1212.
Summers, Connie, Bohman, Thomas M., Gillam, Ronald B., Peña, Elisabeth D., and Bedore, Lisa M. 2010. Bilingual performance on nonword repetition in Spanish and English. International Journal of Language and Communication Disorders 45(4): 480–493.
Thordardottir, Elin, and Brandeker, Myrto. 2013. The effect of bilingual exposure versus language impairment on nonword repetition and sentence imitation scores. Journal of Communication Disorders 46(1): 1–16.
Tuller, Laurice, Hamann, Cornelia, Chilla, Solveig, Ferré, Sandrine, Morin, Eléonore, Prevost, Philippe, Santos, Christophe dos, Ibrahim, Lina Abed, and Zebib, Racha. 2018. Identifying language impairment in bilingual children in France and in Germany. International Journal of Language and Communication Disorders 53(4): 888–904.
Verhagen, Josje, de Bree, Elise, Mulder, Hanna, and Leseman, Paul. 2017. Effects of vocabulary and phonotactic probability on 2-year-olds’ nonword repetition. The Journal of Psycholinguistic Research 46(3): 507–524.
Waters, Nicholas E., Ahmed, Sammy F., Tang, Sandra, Morrison, Frederick J., and Davis-Kean, Pamela E. 2021. Pathways from socioeconomic status to early academic achievement: The role of specific executive functions. Early Childhood Research Quarterly 54(1): 321–331.
Weiland, Marina, Schmidt, Melanie, Herper-Klein, Sabine, and Kieslich, Matthias. 2019. Sprachstandserfassung im Elementarbereich für ein- und mehrsprachige Kinder am Beispiel des hessischen Kindersprachscreenings KiSS [Language assessment at the preschool age for monolingual and bilingual children, exemplified by the language screening KiSS in Hesse]. Sprachförderung und Sprachtherapie 8(2): 84–88.
Weismer, Susan E., Tomblin, Bruce J., Zhang, Xuyang, Buckwalter, Paula, Chynoweth, Jan G., and Jones, Maura. 2000. Nonword repetition performance in school-age children with and without language impairment. Journal of Speech, Language, and Hearing Research 43(4): 865–878.
Wild, Nicole, and Fleck, Christine. 2013. Neunormierung des Mottier-Tests für 5- bis 17-jährige Kinder mit Deutsch als Erst- oder Zweitsprache [New norms for the Mottier test for 5- to 17-year-old children speaking German as L1 or L2]. Praxis Sprache 3: 152–157.
Yoo, Jeewon, and Kaushanskaya, Margarita. 2012. Phonological memory in bilinguals and monolinguals. Memory and Cognition 40(8): 1314–1330.
Zaretsky, Eugen, Euler, Harald A., Neumann, Katrin, and Lange, Benjamin P. 2014. Sociolinguistic predictors of language deficits in pre-school children with and without immigrant background. In ACLL2014 − The Asian Conference on Language Learning, 42–53. Aichi: The International Academic Forum.
Zaretsky, Eugen, and Hey, Christiane. 2022. Deutsch-basierte und (quasi-)universelle Kunstwörter als Prädiktoren für sprachbezogene medizinische Störungsbilder bei Vorschulkindern [German-based and (quasi-)universal non-words as predictors of language-related medical impairments in preschoolers]. In Sprachentwicklung im Dialog. Digitalität – Kommunikation – Partizipation, ed. Spreer, Markus, Wahl, Michael, and Beek, Helmut, 198–204. Idstein: Schulz-Kirchner Verlag.
Zaretsky, Eugen, and Lange, Benjamin P. 2023. Language-specific non-words for the assessment of working memory: Dealing with bilingual children. Communication Disorders Quarterly 44(4): 219–227.
Zaretsky, Eugen, Neumann, Katrin, Euler, Harald A., and Lange, Benjamin P. 2013. Pluralerwerb im Deutschen bei russisch- und türkischsprachigen Kindern im Vergleich mit anderen Migranten und monolingualen Muttersprachlern [German plural acquisition by Russian- and Turkish-speaking children in comparison with other immigrants and monolingual Germans]. Zeitschrift für Slawistik 58(1): 43–71.
Zebib, Racha, Tuller, Laurice, Hamann, Cornelia, Ibrahim, Lina A., and Prévost, Philippe. 2020. Syntactic complexity and verbal working memory in bilingual children with and without Developmental Language Disorder. First Language 40(4): 461–484.

Table 1. Sample characteristics: Children tested with the language screening KiSS.2 and the school enrolment test S-ENS

Table 2. Description of language tests KiSS.2 and S-ENS

Table 3. Differences between monolingual (MO) and multilingual (MU) children, including subgroups, in dichotomized KiSS.2 results

Table 4. Differences between monolingual (MO) and multilingual (MU) children, including subgroups, in KiSS.2 results

Table 5. Differences between monolingual children (MO) and subgroups of multilingual children (MU) in KiSS.2 results

Figure 1. Possible extra- and intra-linguistic influencing factors on the answers (correct/wrong) in the KiSS.2 subtest on the repetition of non-words

Table 6. Possible predictors of the S-ENS total score of correct answers

Table 7. Characteristics of four ROC curves for the prediction of the dichotomized S-ENS result