Introduction
Relative clauses (RCs) have been recognized as an important probe into language acquisition because of their syntactic complexity, and have received attention in the dyslexia literature (e.g., Arosio et al., 2017; Byrne, 1981; Robertson & Joanisse, 2010; Shankweiler et al., 1984, 1995; Smith et al., 1989; Stella & Engelhardt, 2021). Most studies of dyslexia have examined languages with head-initial RCs (e.g., English and Italian), with few focusing on head-final RCs (Chan, 2014, 2015). Cross-linguistically, head-final RCs are extremely rare in verb–object (VO) languages, with notable exceptions such as Mandarin Chinese (henceforth Chinese), Cantonese, and Wu (Dryer, 2013; Hu et al., 2018a). In the current study, we investigated the comprehension of RCs in Chinese children with developmental dyslexia (DD), comparing them with typically developing (TD) children matched for age or reading level, and explored the potential associations of RC comprehension with vocabulary knowledge and verbal working memory. In this way, we aim to gain a more in-depth understanding of the relationship between reading impairment and difficulties in complex syntactic processing, and to help disentangle the underlying causes of the syntactic deficits in dyslexia.
The article is organized as follows. We first briefly introduce the comprehension of subject relative clauses (SRCs) and object relative clauses (ORCs) in TD children and the theories accounting for the asymmetry between the two structures. We then review previous studies on dyslexic children’s comprehension of RCs and the factors that may influence their processing of complex syntax. Next, we present the current study and, finally, offer a general discussion.
The comprehension of relative clauses and theoretical accounts
The comprehension of RCs has been extensively examined in TD children across a variety of languages. Specifically, studies often compare SRCs and ORCs with animate noun phrases (NPs). In languages with head-initial RCs, as illustrated in (1), a consistent SRC advantage has been observed (Tanaka et al., 2024). In languages with head-final RCs such as Chinese, as shown in (2), the picture is less clear: most studies report a similar SRC advantage (e.g., Hu et al., 2016), though some have found no such advantage (e.g., He et al., 2017). These conflicting findings have been attributed to various factors, such as word order, experimental tasks, and individual differences (e.g., Hu et al., 2020).


Various theories have been proposed for the subject–object asymmetry in the processing of RCs. In the present paper, we focus on two influential theories, namely, the Dependency Locality Theory (DLT; Gibson, 1998, 2000) and featural Relativized Minimality (fRM; Rizzi, 1990, 2004, 2018). One reason is that both theories deal with filler–gap dependencies but make different predictions about the processing of head-initial and head-final RCs. Another reason is that both theories hypothesize that the processing of filler–gap dependencies relies heavily on computational resources such as working memory, and working memory impairment is pervasive in dyslexia (e.g., Chiappe et al., 2000; Smith-Spark & Fisk, 2007).
According to the DLT, sentence comprehension requires two computational resources: storage of the structure built thus far and integration of the current word into the existing structure. One key aspect of this account is that sentence complexity is related to the locality of integration between dependent syntactic elements (e.g., a dependent with a head). Locality is measured by the distance between these elements, i.e., the number of new discourse referents (nouns and verbs) intervening between them. Within this framework, the longer an element has to be kept in working memory, the greater the computational resources required. The DLT predicts that, in languages like English, ORCs should be more difficult to process than SRCs because the relation between the relative head (i.e., the noun modified by the clause) and its trace in ORCs is resolved at a later stage. In particular, in (1a), the integration between the relative head the child and its trace is local. By contrast, in (1b), the integration between the relative head and its trace has to cross the embedded subject the teacher and the embedded verb draws, and is thus hypothesized to consume more computational resources. However, in languages like Chinese, SRCs should be harder to process than ORCs because the distance between the relative head and its trace in SRCs (2a) is longer than that in ORCs (2b), and by hypothesis requires more computational resources.
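The locality metric described above can be illustrated with a toy computation. The following is only a sketch under simplifying assumptions: the token lists are schematic glosses rather than the actual test sentences, the Chinese verb is left as a placeholder, determiners are omitted, and the function name `integration_distance` is hypothetical. It simply counts the nouns and verbs that lie between the relative head and its gap.

```python
# Toy illustration of a DLT-style locality count (not the full theory):
# distance = number of new discourse referents (nouns, verbs) that occur
# strictly between the relative head and its gap.

REFERENT_TAGS = {"N", "V"}  # categories treated as discourse referents


def integration_distance(clause, head):
    """Count referent-introducing words between `head` and the GAP token."""
    tokens = [token for token, _ in clause]
    lo, hi = sorted((tokens.index(head), tokens.index("GAP")))
    return sum(1 for _, tag in clause[lo + 1:hi] if tag in REFERENT_TAGS)


# Schematic word orders (assumed glosses; "VERB" is a placeholder).
english_src = [("child", "N"), ("that", "C"), ("GAP", "-"), ("draws", "V"), ("teacher", "N")]
english_orc = [("child", "N"), ("that", "C"), ("teacher", "N"), ("draws", "V"), ("GAP", "-")]
chinese_src = [("GAP", "-"), ("VERB", "V"), ("laoshi", "N"), ("de", "C"), ("haizi", "N")]
chinese_orc = [("laoshi", "N"), ("VERB", "V"), ("GAP", "-"), ("de", "C"), ("haizi", "N")]

distances = {
    "English SRC": integration_distance(english_src, "child"),
    "English ORC": integration_distance(english_orc, "child"),
    "Chinese SRC": integration_distance(chinese_src, "haizi"),
    "Chinese ORC": integration_distance(chinese_orc, "haizi"),
}
```

Under these assumptions the count is larger for English ORCs than for English SRCs and larger for Chinese SRCs than for Chinese ORCs, which is the direction of the DLT predictions summarized in the text.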
Under the fRM framework, when two elements that enter a local relation (i.e., the moved NP and the position where it is first merged) are hierarchically separated by an intervening element (i.e., another NP) matching the featural specification of the elements it separates, an intervention effect arises and the sentence is more complex to comprehend. In the case of RCs, the difficulty is modulated by the nature of the relative head and of the embedded noun. Under a raising analysis, the relative head is attracted by a complex attractor [+R, +NP], where R and NP represent the relative feature and the lexical restriction feature, respectively. In an English SRC (1a), the relative head, which bears [+R, +NP] features, does not cross over the lexically restricted element in the clause, i.e., the embedded object, which bears a [+NP] feature, a subset of the features of the relative head. In an English ORC (1b), the lexically restricted relative head has to cross over another lexically restricted element, i.e., the embedded subject, which also bears a [+NP] feature. To consider the relative head and the embedded noun distinct as required by RM, one has to compute the superset–subset relation, but by hypothesis, limited computational resources sometimes prevent younger children from making this computation. Accordingly, an RM violation arises and children have greater trouble in acquiring ORCs compared with SRCs (Belletti et al., 2012; Friedmann et al., 2009, 2021). The fRM approach also makes precise predictions for languages like Chinese. As shown in (3a), in a Chinese SRC, the relative head (haizi “child”) structurally dominates (i.e., c-commands) its trace (marked as t, representing the head’s original position in the clause).
Critically, there is no intervening element between the relative head and its trace; that is, the embedded object (laoshi “teacher”) does not disrupt this relationship. In contrast, in a Chinese ORC, as illustrated in (3b), the embedded subject (laoshi “teacher”) structurally intervenes between the relative head (haizi “child”) and its trace. Here, the relative head c-commands the embedded subject, which in turn c-commands the trace, creating a structural conflict. To reiterate, under the raising analysis, the embedded noun has a [+NP] feature, which is a subset of the features of the relative head. According to the fRM approach sketched above, children encounter difficulty computing this subset relation and, thus, Chinese ORCs are also expected to be more difficult to acquire than SRCs, as in English.

Critically, intervening elements can disrupt dependency formation both hierarchically and linearly. Unlike structural intervention, which is defined by c-command, linear intervention is based on precedence. In head-initial RCs, such as those in English, structural and linear factors are intertwined and cannot be separated, whereas in Chinese RCs, the two factors can be disentangled (Hu et al., 2016). In Chinese SRCs, the embedded object (laoshi “teacher”) linearly intervenes between the relative head (haizi “child”) and its trace, but does not structurally intervene (see 2a and 3a). By contrast, in Chinese ORCs, the embedded subject (laoshi “teacher”) does not linearly intervene between the relative head (haizi “child”) and its trace, but does structurally intervene (see 2b and 3b). Through comprehension experiments with TD children, Hu et al. (2016) demonstrated that both hierarchical and linear interventions create interference, but to varying degrees: structural intervention has a stronger effect than linear intervention.
To summarize, the SRC advantage is uniformly reported in the comprehension of head-initial RCs, such as those in English, and can be explained by both the DLT and the fRM. Conversely, this SRC advantage is not as clear-cut in the comprehension of Chinese RCs, and the two theories diverge in predicting the asymmetry between the two structures: the DLT predicts an ORC advantage, and the fRM predicts an SRC advantage. Therefore, the first aim of our study is to use a clinical population to distinguish between these two approaches and thus to characterize the syntactic difficulties in the dyslexic population. In the next section, we turn to previous studies on the comprehension of RCs in dyslexia.
Relative clause comprehension in children with developmental dyslexia
Developmental dyslexia (DD) is a neurobiological condition affecting around 3–10% of the school population across languages (e.g., Stevenson et al., 1982; Zhang et al., 1996). Children with DD often show problems in reading fluently and comprehending written materials accurately, despite their normal intelligence and adequate learning opportunities. In addition, they face difficulties in language domains such as syntax (e.g., Hu et al., 2018b, 2024; Shankweiler et al., 1984, 1995). Several studies on dyslexic children have reported deficits in the comprehension of RCs, with some differences depending on the manipulation of animate and inanimate NPs in the structure, the varying ages of the children, and the different experimental tasks used (Arosio et al., 2017; Bar-Shalom et al., 1993; Byrne, 1981; Casalis et al., 2013; Leikin & Assayag-Bouskila, 2004; Mann et al., 1984; Robertson & Joanisse, 2010; Stein et al., 1984).
With respect to head-initial RCs, a number of studies showed that the comprehension of SRCs is not impaired for children with DD, while the comprehension of ORCs is problematic for them. Using an act-out task, Stein et al. (1984) examined 7- to 10-year-old English-speaking reading-disabled children (N = 20) and typical readers (N = 20) with sentences containing two animate NPs and one inanimate NP. Results showed that reading-disabled children exhibited difficulties in comprehending ORCs (e.g., The bear bites the lion that the ball hits) compared to SRCs (e.g., The lion hugs the bear that rolls the ball). Using a picture-selection task, Arosio et al. (2017) examined the comprehension of SRCs and ORCs with animate NPs in Italian children with DD (N = 13, age range: 8;7–13;3) and TD controls matched for age or vocabulary. Results showed that SRCs were unproblematic for all the children, while ORCs were challenging for children with DD and vocabulary-matched TD controls (N = 13, age range: 7;8–11;0). Accordingly, the authors claimed that this finding supports intervention effects within the fRM framework.
Turning to head-final RCs, children with DD also have difficulties comprehending RCs, although only a couple of studies have examined Chinese RCs (Chan, 2014, 2015). Using a listening and reading comprehension test, Chan (2014) found that Chinese dyslexic children (N = 43, age range: 8;4–12;5) in Hong Kong comprehended SRCs and ORCs less accurately than TD children (N = 43, age range: 8;4–12;5). In addition, dyslexic children comprehended SRCs less accurately than ORCs (56% vs 75%), while TD children comprehended both structures similarly (83% vs 89%). Similar results were reported in Chan (2015), which examined the reading comprehension of RCs using a sentence reading comprehension test. However, the ORC advantage observed in the dyslexic group must be interpreted with caution. In those studies, as illustrated in (4), the head noun (na zhi xiao fei zhu “that small fat pig”) is a complex noun phrase, including a demonstrative (na “that”), a general classifier (zhi), two adjectives (xiao “small” and fei “fat”), and a noun (zhu “pig”), while the embedded noun is a noun with an adjective (xiao tu “small rabbit”). To fully address the theoretical issue related to the subject–object asymmetry of RCs, one should carefully match the head noun and the embedded noun. In addition, these studies did not explore how children respond when they fail to understand RCs. It is essential to analyze errors, as they provide an important window into the underlying deficit. Given these limitations, the current study further examined Chinese children with DD by carefully manipulating the syntactic features of SRCs and ORCs and analyzing both correct and incorrect responses.

Dyslexic children’s impairments in the comprehension of complex syntax may be associated with several sources. One is poor vocabulary knowledge. Vocabulary has been suggested to be a pivotal measure in evaluating individual differences in linguistic performance (e.g., Joshi, 2005; Tunmer & Chapman, 2012). Studies with children and adults showed that increased vocabulary knowledge affects both processes and representations shared with spoken language (Borovsky et al., 2012; Mani & Huettig, 2012; Nation et al., 2003). Crucially, vocabulary knowledge robustly accounted for unique variance in prediction during spoken language processing beyond production fluency and nonverbal IQ (Hintz et al., 2017; Rommers et al., 2015). Van Dyke et al. (2014) administered a comprehensive skill battery and found that receptive vocabulary knowledge was the only significant predictor of comprehension performance when the variance shared with IQ was removed. They interpreted these results in the light of a model that emphasizes retrieval interference and the quality of lexical representations as key determinants of successful comprehension.
Working memory has been suggested to be a critical cognitive factor contributing to comprehension deficits in dyslexia (Mann et al., 1984; Shankweiler & Crain, 1986). Studies have shown that working memory deficits are pervasive in individuals with dyslexia (Chiappe et al., 2000; Smith-Spark & Fisk, 2007). In a study that directly examined the relationship between working memory and spoken sentence comprehension in dyslexic children, Robertson and Joanisse (2010) manipulated working memory loads by varying sentence length and the delay between the offset of the sentence and the presentation of picture stimuli across different sentence types. Results showed that, compared to canonical sentences (i.e., active sentences and SRCs), English-speaking children with DD (N = 14, age range: 9;1–12;1) performed more poorly on noncanonical sentences (i.e., passive sentences such as The boy in the dark blue pants is tapped by the girl with the nice blond hair and passive RCs such as This is the man in the light brown shirt that is waved at by the boy in the dark blue pants) under a high working memory load. Research on dyslexic adults’ sentence comprehension has also reported similar findings (e.g., Wiseheart et al., 2009): College students with dyslexia were significantly outperformed by their TD peers in comprehending complex sentences containing center-embedded RCs with high working memory loads.
To sum up, several critical factors have been considered to account for comprehension deficits in dyslexia, but no study of Chinese dyslexia has examined whether those factors predict RC comprehension. The current study therefore investigated Chinese dyslexic children’s comprehension of RCs and measured their vocabulary knowledge and verbal working memory, with the aim of establishing the source of processing difficulty in RCs.
The current study
The present study investigated the comprehension of subject and object RCs in Chinese children with and without DD. The first goal was to contribute to theoretical debates concerning the source of processing difficulty associated with subject and object RCs. The second goal was to determine whether Chinese children with DD have difficulty comprehending RCs, and whether vocabulary knowledge and working memory contribute to their RC comprehension.
To achieve these goals, we used an RC comprehension task, measuring both accuracy and response latency, to test Chinese children with DD and their chronological age-matched (CA) and reading-level-matched (RL) peers. The purpose of including two control groups was to help determine whether any observed syntactic processing difficulties reflected a delay that could potentially be explained by limited reading and vocabulary experience. We also administered further tasks to determine how children’s vocabulary knowledge and verbal working memory relate to their RC comprehension. To characterize working memory capacity, we used forward and backward digit span tasks. The forward digit span is a measure of phonological short-term memory, as the task requires participants to maintain the correct order of an increasing sequence of digits and to repeat it. The backward digit span is a measure of central executive functioning, as the task requires participants to retain a digit sequence, reverse its order, and recall it. By assessing vocabulary knowledge and working memory, we aimed to identify the underlying factors responsible for RC processing difficulties.
For the first research question, we focused on analyzing whether there was an asymmetry between subject and object RCs across all the groups, and which errors children made in comprehending RCs. Regarding the theoretical psycholinguistic debate, the DLT predicts an ORC advantage, and the fRM predicts an SRC advantage. We therefore anticipated significant differences between SRCs and ORCs either in accuracy or response latency. Additionally, due to linear intervention effects, we expected children not only to have difficulties in ORC comprehension but also to exhibit some difficulties in SRC comprehension. For the second research question, we focused on analyzing whether children with DD differed significantly from TD controls in RC comprehension, and whether vocabulary knowledge and verbal working memory influenced RC comprehension. We hypothesized that children with DD would show poorer comprehension accuracy and longer response latencies compared to the CA controls, and perform similarly to the RL controls. We also expected vocabulary knowledge and verbal working memory scores to correlate significantly with RC comprehension, particularly in children with DD.
Methods
Participants
Sixty-six Chinese children were recruited for three groups, namely, children with DD, the CA controls, and the younger RL controls. All the participants were right-handed (determined by writing habits) and had normal or corrected-to-normal vision, with no reported ophthalmologic or neurological abnormalities. All the information was verified through school records and teacher reports.
Dyslexic children were screened following a well-established procedure in the literature (e.g., Wang et al., 2014; Zhao et al., 2019). A total of 1,105 children from Grades 2 to 6 at a public primary school in Beijing were administered two screening tests: the Standardized Chinese Character Recognition Test (Wang & Tao, 1996) and the Chinese version of the Raven’s Standard Progressive Matrices test (Zhang & Wang, 1985). Children were identified as being at risk of dyslexia if their scores on the character recognition test were at least 1.5 standard deviations below the grade-level mean, and if their nonverbal intelligence fell within the normal range (i.e., above the 5th percentile of the norm). In addition, the children’s Chinese teachers were invited to evaluate whether the screened individuals demonstrated difficulties in their daily Chinese learning and to complete the revised Chinese version of the Swanson, Nolan, and Pelham Scale-IV (SNAP-IV; Zhou et al., 2013), which was used to assess attention-deficit/hyperactivity disorder based on the criteria from the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5). To be included in the DD group, children were required to meet fewer than three items on both the inattention and hyperactivity-impulsivity subscales of the SNAP-IV. Based on these criteria, 60 children (5.43%) were identified as at risk for dyslexia among the 1,105 students screened. From this group, 22 children with DD were recruited to participate in the current study. CA and RL controls were selected from their peers.
To ensure the validity of the dyslexia screening, all participants were administered multiple linguistic skill assessments. These included phonological awareness, assessed using an onset, rime, and lexical tone detection task; morphological awareness, assessed using homophonetic and homographic awareness tests; and rapid automatized naming, assessed using a rapid digit naming task. All tests have been used in prior research (Huang et al., 2021; Zhao et al., 2017). Table 1 presents the means and standard deviations of the measured variables for the three groups, along with the ANOVA F-values for group differences on these measures and pairwise comparisons derived from Tukey’s post hoc tests. The results revealed that children with DD, matched for age and Raven scores with their CA peers, had significantly lower scores in character recognition, phonological awareness, morphological awareness, and rapid digit naming compared to their CA peers. In addition, compared to their RL peers, they were older and had higher Raven scores and lower homophonetic awareness scores, but were matched for character recognition, phonological awareness, homographic awareness, and rapid digit naming.
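The screening rule described above can be written out as a simple predicate. The sketch below is illustrative only: the record type, field names, and function name are hypothetical, the teacher evaluation of daily learning difficulties is not modeled, and the cutoffs are taken from the text as stated (at least 1.5 standard deviations below the grade mean, Raven score above the 5th percentile, fewer than three items on each SNAP-IV subscale).

```python
from dataclasses import dataclass


@dataclass
class ScreeningRecord:
    """Hypothetical per-child screening data (field names are assumptions)."""
    char_recognition: float       # character recognition raw score
    grade_mean: float             # grade-level mean on the same test
    grade_sd: float               # grade-level standard deviation
    raven_percentile: float       # nonverbal IQ percentile
    snap_inattention_items: int   # SNAP-IV inattention items met
    snap_hyperactivity_items: int # SNAP-IV hyperactivity-impulsivity items met


def at_risk_for_dyslexia(record: ScreeningRecord) -> bool:
    """Apply the three screening criteria described in the text."""
    z = (record.char_recognition - record.grade_mean) / record.grade_sd
    reading_deficit = z <= -1.5
    normal_nonverbal_iq = record.raven_percentile > 5
    no_attention_flags = (record.snap_inattention_items < 3
                          and record.snap_hyperactivity_items < 3)
    return reading_deficit and normal_nonverbal_iq and no_attention_flags


# Proportion of screened children flagged as at risk (60 of 1,105).
at_risk_rate = round(100 * 60 / 1105, 2)
```

The final line reproduces the reported proportion of at-risk children from the two counts given in the text.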
Table 1. Participant characteristics

a Standardized Chinese Character Recognition Test.
b Morphological awareness was assessed with homophonetic and homographic awareness tests; *p < .05; **p < .01; ***p < .001
Materials
The study comprised an RC comprehension test, a vocabulary knowledge test, and a working memory test.
First, children’s RC comprehension was tested with a character-sentence matching task, developed by Hu et al. (2020). Chinese SRCs and ORCs are exemplified in (5a) and (5b), respectively.

The task consisted of 16 black and white pictures with the same structure (i.e., one animal X on the left, a pair of animals Y in the middle, and another X on the right). Figure 1 is a sample of experimental pictures. The pictures depicted 8 actions: bite, chase, follow, hit, push, smell, spurt, and wipe. The actions appeared equally often, i.e., each action appeared in both types of RCs exemplified in (5). To avoid priming effects, two lists of the task (i.e., list A and list B) were created, and each picture was used only once in each list. Each list included 8 SRCs and 8 ORCs. In addition, there were 8 filler sentences involving intransitive verbs (e.g., sleep) or actional irreversible verbs (e.g., drink), and 3 practice items. In total, there were 27 items in each version of the RC test. All the sentences were recorded by a female native speaker of Chinese. The pictures and the sentences were presented at the same time. The test was programmed in E-Prime 2.0, which recorded accuracy and response latency (Schneider et al., 2012).

Figure 1. An example of experimental pictures in the RC comprehension task.
Second, to assess working memory abilities, children were tested with a forward digit span (FDS) and a backward digit span (BDS) task, taken from the Wechsler Intelligence Scale for Children Revised in China (C-WISC; Gong & Cai, 1994). The FDS includes a series of digits of increasing length, from 3 to 13 digits, and the BDS includes a series of digits of increasing length, from 2 to 12 digits. The lists were digitally recorded in audio files by a native speaker of Chinese. Each list was read at a rate of one digit per second. Children were asked to listen carefully to the series of digits and immediately repeat them aloud, either in the same order (yielding the FDS) or in the reversed order (yielding the BDS). The task was stopped when children failed both trials at a given length.
Third, to assess receptive vocabulary knowledge, children were tested with the Peabody Picture Vocabulary Test-Revised Chinese Version (PPVT-R; Sang & Miao, 1990). The test consists of 175 items. During the test, each word was presented with four pictures, and children were asked to select the picture that best corresponded to the word. The total score was calculated by subtracting the number of errors from the maximum achievable score.
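The discontinue rule for the digit span tasks can be sketched as follows. This is a hypothetical illustration: the text specifies two trials per length and the stopping rule, but not how raw scores were derived, so the function below (a made-up name) returns both the longest length with at least one correct repetition and the number of correct trials, either of which could serve as a raw score.

```python
def score_digit_span(levels):
    """Score a digit span run under a two-failures-at-one-length stop rule.

    `levels` is an ordered list of (sequence_length, (trial1_ok, trial2_ok)).
    Returns (longest_length_passed, number_of_correct_trials).
    Both outputs are candidate raw scores; which one was used is an assumption.
    """
    longest = 0
    n_correct = 0
    for length, trials in levels:
        n_correct += sum(trials)   # True counts as 1
        if not any(trials):        # both trials failed: stop testing
            break
        longest = length
    return longest, n_correct


# Example forward-span run: the child fails both trials at length 5,
# so the length-6 trials would never be administered.
example_run = [
    (3, (True, True)),
    (4, (True, False)),
    (5, (False, False)),
    (6, (True, True)),
]
example_score = score_digit_span(example_run)
```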
Procedure
Participants were tested individually in a quiet room at the primary school. Written informed consent was obtained for the children from their parents and teachers before assessment took place. The study was approved by the Research Ethics Committee of the School of Psychology at Capital Normal University, in accordance with the standards of the Helsinki Declaration (1964).
Participants took approximately 3 hours to complete all the tests, which were administered over two or three sessions within one week. The tests included those reported in the Participants section, as well as the RC comprehension test, the PPVT-R, the FDS, and the BDS. Each task was explained to the children in detail. The experimental materials of the RC test were presented on a laptop. Half of the children completed list A, and the other half completed list B. Children were asked to listen to a series of sentences while looking at corresponding pictures. They were instructed to choose the character referred to in the sentence by pressing specific keys on the keyboard: “V” for the left character, “B” for the middle character, and “N” for the right character. These keys corresponded to the positions of the characters from left to right in the picture. They were encouraged to respond as quickly and accurately as possible. Before the main experiment began, three practice trials were conducted to familiarize the participants with the task. No feedback was provided during the experimental trials. Test administrators were instructed to give children breaks if any signs of fatigue were observed.
Scoring and error coding
In the RC comprehension test, the dependent variable was the proportion of accurate responses, namely, the accuracy in identifying the correct character. When participants did not choose the correct character(s), we coded the response as Error. Errors were labeled as Reversal Error and Embedded Error, following Hu et al. (2020).
Consider “the horse that the lions are chasing” and Figure 1. A Reversal Error was coded if the horse on the right was chosen, i.e., the horse that is chasing the lions. In this case, the theta-roles are reversed, i.e., the relative head the horse is a Patient, but it was interpreted as an Agent. An Embedded Error was coded when children selected the middle characters, i.e., the lions, which correspond to the embedded noun within the RC. The Reversal Error may reflect a misunderstanding of the thematic assignment, whereas the Embedded Error may arise from children’s confusion about the syntactic role of the relative head, indicating an insensitivity to the fact that the RC adds information to the relative head (Arnon, 2005; Hu et al., 2016).
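The coding scheme can be summarized as a small lookup from key press to response category. The sketch below is an assumption-laden illustration: it relies on the key mapping described in the Procedure (“V” left, “B” middle, “N” right) and on the picture layout in which the two X animals are at the sides and the Y pair is in the middle; the function and variable names are hypothetical.

```python
KEY_TO_POSITION = {"V": "left", "B": "middle", "N": "right"}


def code_response(key, target_position):
    """Classify a key press as Correct, Reversal Error, or Embedded Error.

    `target_position` is "left" or "right": the side of the X animal that
    matches the relative clause. The other X animal has the reversed
    thematic role; the middle pair corresponds to the embedded noun.
    """
    if target_position not in ("left", "right"):
        raise ValueError("experimental targets are the left or right animal")
    position = KEY_TO_POSITION[key.upper()]
    if position == target_position:
        return "Correct"
    if position == "middle":
        return "Embedded Error"
    return "Reversal Error"


# For the example sentence in the text, the horse on the right is the
# reversal choice, so the target is taken to be the horse on the left.
example_codes = [code_response(k, "left") for k in ("V", "B", "N")]
```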
Statistical analysis
Response accuracy and response latency were analyzed by employing mixed effects models (Baayen et al., 2008), based on the lme4 and lmerTest packages in the R environment (R Development Core Team, 2023). Models were constructed with a maximal random effects structure and were successively simplified when they failed to converge (Barr et al., 2013). The BOBYQA optimizer was used when models with the default optimizer did not converge. Performance on filler sentences was at ceiling (98% correct responses), and fillers were therefore excluded from the analysis.
The categorical accuracy data were analyzed via generalized linear mixed effects models, and the response latency data were analyzed with linear mixed effects models. The analysis of response latency was performed only on correct responses, with response latency being logarithmically transformed. Response latencies shorter than 200 ms or exceeding 2 SDs above the group mean were classified as outliers, resulting in 0.88% of the data being excluded from the analyses. To understand whether the asymmetry existed between subject and object RCs across all groups and whether children with DD differed significantly from TD controls in RC comprehension, Sentence Type (SRC vs ORC) and Group (DD vs CA vs RL) were included as potentially significant fixed factors in the analyses, with subjects and items as random factors. We used the ORC as the reference category for the Sentence Type factor, and the DD group for the Group factor. To investigate the effects of vocabulary knowledge and working memory, additional analyses were conducted using the PPVT-R, FDS, and BDS as continuous factors. Raw scores for the PPVT-R were selected for the analyses because the norms for the PPVT-R were established in 1990 (Sang & Miao, 1990) and are now outdated. Raw scores from the working memory tasks were also used to align with methodological precedents from prior studies (e.g., Bentea et al., 2016). Effects were evaluated one by one on the basis of likelihood ratio tests; both main effects and the interactions between the fixed factors were tested.
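The latency preprocessing described above can be sketched in a few lines. This is a minimal illustration rather than the analysis script itself: it assumes that the cutoffs are applied to raw latencies in milliseconds within one group before the log transform, that the sample standard deviation is used, and that the function name and default arguments are hypothetical.

```python
import math
import statistics


def trim_and_log(latencies_ms, floor_ms=200.0, sd_cutoff=2.0):
    """Drop outlying latencies for one group, then log-transform the rest.

    A latency is dropped if it is shorter than `floor_ms` or lies more than
    `sd_cutoff` standard deviations above the group mean (assumptions:
    raw-scale cutoffs, sample SD computed over all correct-response trials).
    """
    mean = statistics.mean(latencies_ms)
    sd = statistics.stdev(latencies_ms)
    upper = mean + sd_cutoff * sd
    kept = [rt for rt in latencies_ms if floor_ms <= rt <= upper]
    return [math.log(rt) for rt in kept]


# Made-up latencies for illustration: one anticipatory response (150 ms)
# and one very slow response (5000 ms) fall outside the bounds.
example_latencies = [150, 900, 1000, 1100, 1200, 1000, 950, 1050, 5000]
example_logged = trim_and_log(example_latencies)
```

The logged values would then enter the linear mixed effects model as the dependent variable.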
In addition, we explored dyslexic children’s individual performance in RC comprehension against the CA group’s mean score in terms of accuracy and response latency. Furthermore, we analyzed children’s errors using Poisson regression models.
Results
We first report children’s comprehension of RCs and then present their relationship with vocabulary knowledge and working memory.
RC comprehension
We report the results of RC comprehension in the following order: correct responses, individual performance, and error analyses.
Correct responses
Table 2 reports means and standard deviations of the accuracy and response latency of correct responses in children with DD and in their CA and RL controls. The comprehension of RCs in children with DD differed from that of the CA controls and resembled that of the RL controls. Across all groups, SRCs were comprehended more accurately than ORCs, whereas no significant difference in response latency was observed between the two structures.
Table 2. Mean (and standard deviation) of accuracy and response latency (ms) for each group in RC comprehension

Accuracy. Sentence Type and Group were initially entered into a factorial model, and significantly contributed to the model fit [χ2(3) = 16.95, p < .001]. The interaction between Sentence Type and Group did not significantly contribute to the goodness of fit of the model, as shown by the likelihood ratio test [χ2(2) = 0.40, p = .82]; therefore, it was removed. The best-fitting model included Sentence Type and Group as fixed factors. Table 3 shows the output of the analysis. Overall, the results indicate that SRCs were comprehended more accurately than ORCs; children with DD comprehended RCs significantly less accurately than their CA peers, and displayed a similar pattern with their RL peers.
Table 3. Fixed effects in the mixed-effects model for accuracy in the RC comprehension

Reference level for Sentence Type = ORCs; reference level for Group = DD; *p < .05; ***p < .001.
Response latency. Sentence Type and Group were initially entered into a factorial model, and significantly contributed to the model fit [χ2(3) = 15.35, p < .01]. Again, the interaction between Sentence Type and Group did not significantly contribute to the goodness of fit of the model, as shown by the likelihood ratio test [χ2(2) = 1.12, p = .57]; therefore, it was removed. The best-fitting model included Sentence Type and Group as fixed factors. Table 4 shows the output of the analysis. Overall, the main effect of Sentence Type was not significant, revealing a similarity between subject and object RCs; children with DD comprehended RCs significantly more slowly than their CA peers, and performed similarly to their RL peers.
Table 4. Fixed effects in the mixed-effects model for response latency in the RC comprehension

Reference level for Sentence Type = ORCs; reference level for Group = DD; *p < .05; ***p < .001.
To sum up, there are two main findings. First, a clear SRC advantage over ORC in comprehending Chinese RCs was observed in accuracy across all the groups, but no significant difference between two structures was found in response latency. Second, children with DD demonstrated lower accuracy and longer response latencies than their CA peers, while they performed similarly to their RL peers.
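The likelihood ratio tests reported above compare nested models through a chi-square statistic. As a rough check on how the reported p values follow from the statistics, the sketch below computes the upper-tail chi-square probability in closed form for 1, 2, or 3 degrees of freedom. The function names are illustrative and this is not the authors' lme4 workflow.

```python
import math

def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square statistic (df = 1, 2, or 3)."""
    if x < 0:
        raise ValueError("the statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("closed forms are given only for df = 1, 2, 3")

# Interaction terms reported above (df = 2).
print(round(chi2_sf(0.40, 2), 2))  # accuracy: 0.82
print(round(chi2_sf(1.12, 2), 2))  # response latency: 0.57
# Joint contribution of Sentence Type and Group (df = 3).
print(chi2_sf(16.95, 3) < 0.001)   # accuracy: True
print(chi2_sf(15.35, 3) < 0.01)    # response latency: True
```

A non-significant result for the interaction term means that the simpler model without the interaction fits the data about as well, which is why the interaction was dropped in both analyses.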
Individual performance
We ran individual analyses to better understand the individual severity of the deficit. Following Arosio et al. (Reference Arosio, Panzeri, Molteni, Magazù and Guasti2017), we examined individual mean scores in SRC and ORC comprehension and their deviations from the mean scores of the CA group.
Accuracy. In comprehending SRCs, 12 out of 22 children with DD (54.55%) scored below the mean score of the CA group. Among these, 2 children (9.09%) scored more than 1 SD below the CA mean, and 2 children (9.09%) scored more than 2 SD below the CA mean. In comprehending ORCs, 13 out of 22 children with DD (59.09%) scored below the mean score of the CA group. Among these, 3 children (13.64%) scored more than 1 SD below the CA mean, and 2 children (9.09%) scored more than 2 SD below the CA mean. Notably, 3 of these children (13.64%) who scored more than 1 or 2 SD below the CA mean in ORC comprehension also scored more than 1 or 2 SD below the CA mean in SRC comprehension.
Response latency. For SRCs, 17 out of 22 children with DD (77.27%) responded more slowly than the mean response latency of the CA group. Among these, 3 children (13.64%) exceeded the CA group’s mean by 1 SD, and 4 children (18.18%) exceeded it by 2 SD. For ORCs, 18 out of 22 children with DD (81.82%) showed slower response latencies than the CA group’s mean. Among these, 2 children (9.09%) exceeded the mean by 1 SD, and 2 children (9.09%) exceeded it by 2 SD. Notably, the same 4 children (18.18%) who exceeded 1 or 2 SD in ORC comprehension also exhibited the slowest response latencies in SRC comprehension.
To summarize, individual analyses demonstrate that a significant proportion of children with DD exhibit deficits in comprehending SRCs, ORCs, or both structures.
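The individual analyses above place each child with DD relative to the CA group's mean and standard deviation. A minimal sketch of that classification follows; the band labels, the function name `deficit_band`, and the control scores are hypothetical, and the bands are assumed to be mutually exclusive, as the reported counts suggest.

```python
import statistics

def deficit_band(score, control_scores, higher_is_worse=False):
    """Classify one child's score against the control group's mean and SD.

    Set higher_is_worse=True for response latency, where larger values
    indicate poorer performance.
    """
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)
    z = (score - mean) / sd
    if higher_is_worse:
        z = -z  # flip so that negative z always means poorer performance
    if z < -2:
        return "more than 2 SD from the control mean"
    if z < -1:
        return "more than 1 SD from the control mean"
    if z < 0:
        return "poorer than the control mean"
    return "at or better than the control mean"

# Hypothetical numbers of correct responses in the control group.
controls = [10, 12, 11, 9, 13]
for child_score in (7, 9, 10, 12):
    print(child_score, deficit_band(child_score, controls))
```

Counting the children in each band and dividing by the group size of 22 yields percentages of the kind reported in this section.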
Error analyses
We examined the distribution of errors to better understand what children did when they failed to understand RCs. Figure 2 shows the means and standard deviations of Reversal and Embedded Errors for each group in the comprehension of SRCs and ORCs.

Figure 2. Means (and standard deviation) of errors in the SRC and ORC comprehension.
We counted the number of each type of error made by each child and treated it as a count variable to run a Poisson regression model. In SRC comprehension, there was no significant difference between Reversal and Embedded Errors, a pattern observed across all groups (all ps > .53). In ORC comprehension, Embedded Errors occurred significantly more frequently than Reversal Errors (β = –1.39, SE = 0.25, Wald Z = –5.55, p < .001). This difference was consistent across the DD group (β = –1.56, SE = 0.39, Wald Z = –4.01, p < .001), the CA group (β = –0.98, SE = 0.48, Wald Z = –2.05, p < .05) and the RL group (β = –1.47, SE = 0.45, Wald Z = –3.24, p < .01).
In summary, when the children were not able to comprehend SRCs, they made Reversal and Embedded Errors at similar rates; in other words, they appeared to choose an incorrect character at random. When they did not comprehend ORCs, they were more likely to make Embedded Errors than Reversal Errors.
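The Wald Z values reported for the Poisson models are the ratio of each coefficient to its standard error. The sketch below recomputes them from the rounded estimates given above, so small discrepancies from the reported values are expected. Reading exp(β) as a rate ratio of Reversal to Embedded Errors assumes that Embedded Error was the reference level, which the text does not state explicitly.

```python
import math

def wald_test(beta, se):
    """Return the Wald Z statistic and its two-sided normal p value."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

def rate_ratio(beta):
    """Exponentiated Poisson coefficient, interpretable as a rate ratio."""
    return math.exp(beta)

# Coefficients and standard errors reported for ORC comprehension.
reported = {
    "all groups": (-1.39, 0.25),
    "DD": (-1.56, 0.39),
    "CA": (-0.98, 0.48),
    "RL": (-1.47, 0.45),
}
for label, (beta, se) in reported.items():
    z, p = wald_test(beta, se)
    print(f"{label}: Z = {z:.2f}, p = {p:.4f}, rate ratio = {rate_ratio(beta):.2f}")
```

Under the stated assumption about the reference level, the overall coefficient of –1.39 corresponds to Reversal Errors occurring at roughly a quarter of the rate of Embedded Errors in ORC comprehension.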
The relationship between RC comprehension and individual measures
We report children’s performance on vocabulary knowledge and working memory tasks, followed by the analyses of their relationship with RC comprehension.
First, children with DD (M = 141.77, SD = 13.14) scored lower on the PPVT-R than their CA peers (M = 149.55, SD = 10.15), but scored numerically higher than their RL peers (M = 137.55, SD = 18.35). One-way ANOVA revealed significant differences between the DD and the CA group, F(1, 42) = 4.82, p < .05, η² = 0.10, as well as between the CA and the RL group, F(1, 42) = 7.20, p < .05, η² = 0.15, while there was no significant difference between the DD and the RL group, F(1, 42) = 0.77, p = .39, η² = 0.02.
Second, children with DD performed better on the FDS task than on the BDS task (M = 6.82, SD = 1.33; M = 4.00, SD = 1.69, respectively), as did the CA group (M = 7.77, SD = 1.63; M = 4.77, SD = 1.45, respectively) and the RL group (M = 7.23, SD = 0.92; M = 4.05, SD = 1.43, respectively). This pattern was as expected, consistent with previous studies on children and adults (Bentea et al., Reference Bentea, Durrleman and Rizzi2016; Wiseheart et al., Reference Wiseheart, Altmann, Park and Lombardino2009). Regarding the FDS, one-way ANOVA revealed a significant difference between the DD and the CA group, F(1, 42) = 4.52, p < .05, η² = 0.10, but no significant differences between the DD and the RL group, F(1, 42) = 1.40, p = .24, η² = 0.03, or between the CA and the RL group, F(1, 42) = 1.86, p = .18, η² = 0.04. Regarding the BDS, one-way ANOVA revealed no significant differences between the DD and the CA group, F(1, 42) = 2.66, p = .11, η² = 0.06, between the DD and the RL group, F(1, 42) = 0.01, p = .92, η² = 0.01, or between the CA and the RL group, F(1, 42) = 2.82, p = .10, η² = 0.06.
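The pairwise ANOVA statistics above can be approximately recovered from the reported group means and standard deviations. The sketch below assumes 22 children per group, as implied by the 22 children with DD and the 42 error degrees of freedom; the function names are illustrative, and because the inputs are rounded summary statistics, the output can differ slightly from the reported F values.

```python
def f_two_groups(m1, sd1, m2, sd2, n):
    """One-way ANOVA F for two groups of equal size n, from summary statistics."""
    ms_between = n * (m1 - m2) ** 2 / 2   # between-groups mean square, df = 1
    ms_within = (sd1 ** 2 + sd2 ** 2) / 2  # pooled within-group variance
    return ms_between / ms_within

def eta_squared(f, df_between, df_within):
    """Eta squared recovered from an F statistic and its degrees of freedom."""
    return f * df_between / (f * df_between + df_within)

# PPVT-R, DD vs CA (reported: F(1, 42) = 4.82, eta squared = 0.10).
f_dd_ca = f_two_groups(141.77, 13.14, 149.55, 10.15, n=22)
print(round(f_dd_ca, 2), round(eta_squared(f_dd_ca, 1, 42), 2))

# PPVT-R, CA vs RL (reported: F(1, 42) = 7.20, eta squared = 0.15).
f_ca_rl = f_two_groups(149.55, 10.15, 137.55, 18.35, n=22)
print(round(f_ca_rl, 2), round(eta_squared(f_ca_rl, 1, 42), 2))
```

With a single between-groups degree of freedom, eta squared reduces to F / (F + 42), so an F near 4.8 corresponds to about 10% of the variance in PPVT-R scores being associated with group membership.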
To investigate whether vocabulary knowledge and working memory contribute to the processing of complex syntactic structures such as RCs, we examined whether scores on the PPVT-R, FDS, and BDS predicted accuracy and response latencies in the RC comprehension task.
Accuracy. We first carried out an analysis of collinearity between the PPVT-R, FDS, and BDS measures and RC accuracy to decide whether to include these measures in the mixed effects model. Interestingly, we found a significant correlation between PPVT-R and RC accuracy in the DD group (r = 0.72, p < .001), but not in the CA group (r = –0.34, p = .12) or the RL group (r = 0.30, p = .17). Thus, we decided to run analyses without PPVT-R. We then tested whether the inclusion of the scores of FDS and BDS predicted accuracy in the comprehension task. We observed that the FDS did not contribute to the fit of the model [χ2(1) = 1.84, p = .18], while the BDS contributed to the fit of the model [χ2(1) = 6.09, p < .05; β = .35, SE = .14, Wald Z = 2.48, p < .05], indicating that accurate responses were more likely with higher backward span.
We conducted an additional series of analyses to identify the differential roles exerted by working memory separately for each group in each condition. The FDS and the BDS did not significantly predict accuracy in each condition in the DD or the CA group (all ps > .05). In the RL group, the FDS did not contribute to the fit of the models (both ps > .05), while the BDS significantly predicted accuracy in both SRC and ORC conditions (β = .87, SE = .42, Wald Z = 2.06, p < .05; β = 1.88, SE = .66, Wald Z = 2.86, p < .01).
Response latency. As we did for accuracy measure, we further tested whether scores of vocabulary knowledge and working memory predicted response times to the comprehension task, considering all age groups collapsed. The PPVT-R predicted response latency to the RC comprehension task [χ2(1) = 8.14, p < .01; β = –.01, SE = .01, t = –2.90, p < .01], indicating that high scores in a receptive vocabulary task were associated with a faster comprehension of RCs. Neither the FDS [χ2(1) = 1.17, p = .28] nor the BDS [χ2(1) = 0.07, p = .80] contributed to the fit of the model.
We also conducted an additional series of analyses to identify the differential roles exerted by vocabulary knowledge and working memory separately for each group in each condition. In the DD group, the PPVT-R (β = –.01, SE = .01, t = –4.47, p < .001) and the FDS (β = .02, SE = .01, t = 2.34, p < .05) predicted response latency in the ORC condition, indicating that response latency in this condition was associated with vocabulary skills and forward span. In the CA and the RL groups, no significant effects were observed (all ps > .05).
To sum up, there are two main findings. First, high scores in a receptive vocabulary task were associated with a more accurate and faster comprehension of RCs, and this was particularly evident in the DD group. Second, FDS was associated with a faster comprehension of ORCs, and this was only evident in DD group; BDS was associated with a more accurate comprehension of RCs, and this was particularly evident in the RL group.
Discussion
In the next sections, we discuss experimental findings to address our research questions and highlight some clinical implications and open questions in future research.
Subject advantage in the comprehension of relative clauses in dyslexia
The first goal of the present study was to contribute to theoretical debates concerning the source of processing difficulty associated with subject and object RCs.
According to the DLT, an ORC advantage should have been found in the comprehension of Chinese RCs. To recall, the comprehension difficulty is related to the locality of assembling two dependent syntactic heads: the earlier the dependency is resolved, the fewer computational resources are required. In the case of Chinese, the dependency in ORCs is resolved earlier than that in SRCs. Thus, ORCs require less computational resources than SRCs, and should be less difficult to process. On the contrary, according to the fRM, in Chinese SRCs, the embedded object does not structurally intervene between the relative head and its trace, whereas in Chinese ORCs, the embedded subject structurally intervenes between the relative head and its trace. Due to this structural intervention, children with DD and their CA and RL peers are expected to perform more poorly on ORCs compared to SRCs. Our data showed that all three groups comprehended ORCs less accurately than SRCs, revealing the SRC advantage in comprehension in line with the predictions based on the fRM, contrary to the DLT. This SRC advantage is consistent with a body of literature showing that children have difficulties comprehending ORCs (e.g., Belletti et al., Reference Belletti, Friedmann, Brunato and Rizzi2012; Friedmann et al., Reference Friedmann, Belletti and Rizzi2009; Hu et al., Reference Hu, Gavarró, Vernice and Guasti2016), confirming intervention effects in the acquisition of RCs.
It is worth pointing out that our results did not replicate the finding reported by Chan (Reference Chan2014, Reference Chan2015). Recall that in those studies, Chinese children with DD comprehended SRCs significantly worse than ORCs, and TD children comprehended SRCs and ORCs at numerically similar levels. As pointed out earlier, those results may be due to a limitation of the experimental design, namely, that the head noun had many more features than the embedded noun, which may have facilitated the comprehension of ORCs. Our study carefully controlled the features of the head noun and the embedded noun, as both are nouns bearing a [+NP] feature.
Our data also revealed that the pattern of errors was similar in children with DD and their TD peers. That is, when children were not able to comprehend SRCs, they made Reversal and Embedded Errors at similar rates; when they did not comprehend ORCs, they were more likely to make Embedded Errors than Reversal Errors. These results replicate what has been reported in previous studies on Mandarin-speaking TD children (Hu et al., Reference Hu, Gavarró, Vernice and Guasti2016). As stated earlier, the Reversal Error seems to reflect a misunderstanding of the thematic assignment, whereas the Embedded Error may reflect children’s confusion about the syntactic role of the relative head, indicating that children are not sensitive to the fact that the RC adds information to the relative head and they do not integrate the two sets of information, from the relative head and from the RC. As pointed out by previous studies (Hu et al., Reference Hu, Gavarró, Vernice and Guasti2016), the finding on SRCs reveals that linear intervention is taxing for Chinese children, although to a lesser extent than structural intervention (operating in ORCs). In an SRC (hua laoshi de haizi “the child that draws the teacher”), the embedded object (laoshi ‘teacher’) linearly intervenes between the relative head (haizi ‘child’) and its gap. With respect to the findings on Chinese ORCs, an Embedded Error amounts to choosing the first NP heard as the Agent of the action, confirming that children were also influenced by the linear order when processing RCs.
To sum up, Chinese children with and without DD showed the SRC advantage in comprehension as predicted by the fRM, contrary to the DLT. In addition, the results of correct responses and errors confirmed that both hierarchical and linear interventions create interference, but to different degrees, namely, structural intervention has stronger effects than linear intervention.
Factors contributing to impaired comprehension of relative clauses in dyslexia
The second goal was to determine whether Chinese children with DD have difficulty comprehending subject and object RCs, compared to their TD peers, and whether vocabulary knowledge and working memory contribute to RC comprehension in dyslexia.
First, our data revealed that a subset of children with DD had deficits in the comprehension of SRCs. A significant difference was observed between the DD group and CA controls, but not between the DD group and RL controls. Specifically, 54.55% of children with DD scored below the CA group’s mean accuracy for SRCs, and 77.27% exhibited slower response latencies than the CA group’s mean. Critically, 9.09% of children with DD scored more than 2 SD below the CA mean in accuracy, and 18.18% exceeded the CA group’s mean by 2 SD in response latency. This contrasts with findings from languages with head-initial RCs (e.g., Italian), where children with DD achieve ceiling performance in SRC comprehension (Arosio et al., Reference Arosio, Panzeri, Molteni, Magazù and Guasti2017). As theorized, these difficulties in Chinese SRCs can be attributed to linear intervention, where the embedded object linearly intervenes between the relative head and its trace.
Deficits were also observed in ORC comprehension. Similar to the SRC comprehension, children with DD performed significantly worse than the CA controls, and did not differ from the RL controls. In total, 59.09% of children with DD scored below the CA group’s mean accuracy for ORC, and 81.82% exhibited slower response latencies than the CA group’s mean. Critically, 9.09% scored more than 2 SD below the CA group’s mean accuracy, and 9.09% exceeded the CA group’s mean by 2 SD in response latency. These results are in line with cross-linguistic studies of head-initial RCs (e.g., Arosio et al., Reference Arosio, Panzeri, Molteni, Magazù and Guasti2017). This suggests that a subset of children with DD have problems moving a relative head over an embedded subject endowed with a feature which is a subset of the relative head.
A significant proportion of children with DD showed comorbid deficits in both SRC and ORC comprehension. In terms of accuracy, 13.64% of children with DD who scored more than 1 or 2 SD below the CA group’s mean accuracy in ORC comprehension also scored more than 1 or 2 SD below the CA mean in SRC comprehension. For response latency, 18.18% of children who exceeded 1 or 2 SD in ORC comprehension also exhibited slower response latencies in SRC comprehension. These results underscore that both linear and structural interventions contribute to impaired RC comprehension in Chinese children with DD.
Second, results showed that high scores in a receptive vocabulary task were associated with a more accurate and faster comprehension of RCs, and this was particularly evident in the DD group. This result confirmed that vocabulary was the best predictor of sentence comprehension, in line with previous studies (e.g., Joshi, Reference Joshi2005; Tunmer & Chapman, Reference Tunmer and Chapman2012). Importantly, the dyslexic group did not show a significant difference in vocabulary knowledge compared to the RL group, and performed similarly to the RL group in both accuracy and response latency. Taken together, these results raise the possibility that sentence comprehension performance might be influenced by vocabulary knowledge. Research has shown that literacy enhances vocabulary and syntactic knowledge, and that increased vocabulary knowledge enables increased prediction of spoken language (Huettig & Pickering, Reference Huettig and Pickering2019). The vital role of vocabulary seems to signify that it is a fundamental element in an architectural account of comprehension difficulty, according to which the memory retrieval mechanism may play a primary role (Van Dyke et al., Reference Van Dyke, Johns and Kukona2014).
Note that our results also showed that FDS was associated with a faster comprehension of ORCs, and this was only evident in the DD group; BDS was associated with a more accurate comprehension of RCs, and this was particularly evident in the RL group. As introduced earlier, the FDS is a measure of phonological short-term memory, and the BDS is a measure of central executive functioning. Our results revealed a close relationship between phonological short-term memory and sentence processing in DD, and a close relationship between central executive functioning and sentence comprehension in RL controls. The contrast between the DD and the RL children seems to implicate different computational resources in comprehending complex structures such as RCs. For this reason, it is likely that the subtle sentence processing deficits found in dyslexia in the current study can also be explained by the processing deficit theory (Mann et al., Reference Mann, Shankweiler and Smith1984; Shankweiler et al., Reference Shankweiler, Smith and Mann1984, Reference Shankweiler, Crain, Katz, Fowler, Liberman, Brady and Shaywitz1995; Smith et al., Reference Smith, Macaruso, Shankweiler and Crain1989). According to this hypothesis, spoken sentence comprehension deficits are grounded in a phonological deficit. In this view, it is not reading that is directly related to sentence processing, but rather phonological processing. Accordingly, a core deficit in phonology makes spoken language processing difficult, especially when working memory loads are high in tasks that are completely auditory.
Together, our results have further implications. First, our data revealed that a subset of children with DD had a consistent language deficit, as they scored more than 2 SD below the CA mean. This is particularly important, as norm-referenced language tests for children at these ages are not available in China, and it is not common to evaluate these children’s grammatical abilities. Since there is a high comorbidity between reading disorder and language impairment (Snowling & Hulme, Reference Snowling and Hulme2012), and children with DD can only be diagnosed during the school years, our results can help establish whether these children are also affected by language impairment. At a clinical level, the current study suggests that an evaluation of SRC and ORC comprehension should be included in testing materials for identifying language impairment in dyslexia. Second, our results revealed that vocabulary knowledge and phonological short-term memory play a vital role in dyslexic children’s sentence comprehension. This indicates that clinical efforts focused on building children’s vocabulary and working memory skills may improve their spoken language abilities. However, these results should be interpreted with caution, as the study relied solely on a receptive vocabulary measure and a digit span task. Future research could be improved by adding measures of expressive vocabulary (Wiseheart & Altmann, Reference Wiseheart and Altmann2018) and of the episodic buffer, a component of working memory responsible for integrating information from a range of sources into a multidimensional code (Baddeley, Reference Baddeley2000). Furthermore, given the influence of reading experience on vocabulary development (Cain & Oakhill, Reference Cain and Oakhill2011), subsequent studies could investigate the specific effect of reading experience on the comprehension of complex syntactic structures.
Conclusion
This study investigated the comprehension of SRCs and ORCs in Chinese children with DD, compared with TD children matched for age or reading level. All the groups showed the SRC advantage in accuracy, a finding that can be explained in terms of structural intervention within the fRM framework, contrary to the DLT. The DD group comprehended SRCs and ORCs less accurately and more slowly than the CA group, and performed similarly to the RL group. Also, a substantial number of children with DD exhibited deficits in comprehending SRCs, ORCs, or both structures. These findings confirmed the existence of syntactic difficulties in dyslexia. Dyslexic children’s receptive vocabulary knowledge was associated with higher accuracy and shorter response latencies in RC comprehension, and their phonological short-term memory was specifically linked to faster RC processing. These findings are most compatible with the claim that syntactic difficulties may stem from limited vocabulary knowledge and phonological short-term memory deficits.
Acknowledgements
The authors are grateful to children who participated in the study, as well as to their families, teachers and pediatricians; to Weijie Li who helped us out with data collection; and to Angel Chan, Denis Delfitto, Jiang Guiying, Cornelia Hamann, Stavroula Stavrakaki, Yi (Esther) Su, Wu Zhuang, Yu Yue, Zhang Hui, Lucy Xia Zhao, Zhou Peng and anonymous reviewers for their insightful suggestions. The first author would like to thank China Scholarship Council for supporting her research in Italy. The research was supported by Fujian Provincial Federation of Social Sciences (Grant No. FJ2025A021), National Social Science Fund of China (Grant No. 18BYY080), Beijing Social Science Foundation (Grant No. 23JYC017), and Beijing Municipal University Excellent Youth Talent Support Program. Authors’ contribution is as follows: Shenai Hu, Maria Teresa Guasti and Jing Zhao conceived the experimental questions; Shenai Hu and Jing Zhao developed the experimental tasks; Jing Zhao recruited and tested the children, and Shenai Hu supervised the testing; Shenai Hu performed the statistical analyses, in strict cooperation with Jing Zhao; Shenai Hu drafted the article, and Maria Teresa Guasti and Jing Zhao commented on the article.