
Comprehension of indirect answers: Developmental trajectory for preschool- and early elementary school-aged children with typical development

Published online by Cambridge University Press:  30 August 2023

Timothy HUANG*
Affiliation:
Department of Communication Sciences and Disorders, West Chester University of Pennsylvania, USA
Lizbeth H. FINESTACK
Affiliation:
Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, USA
*Corresponding Author: Timothy Huang; Email: chuang@wcupa.edu

Abstract

Indirect answers are a common type of non-literal language that do not provide an explicit “yes” or “no” to a question (e.g., “I have to work late” indirectly answered “Are you going to the party?” with a negative response). In the current study, we examined the developmental trajectory of comprehension of indirect answers among 5- to 10-year-old children with typical development. Forty-eight children, 23 boys and 25 girls, between the ages of 5 years; 0 months and 10 years; 11 months (M = 8;2, SD = 19.77 months) completed an experimental task to judge whether a verbally presented indirect answer meant yes or no (Comprehension Task) and then explain their choice (Explanation Task). Responses were scored for accuracy and coded for error analysis. On the Comprehension Task, the 5- to 8-year-olds performed with approximately 85% accuracy, while the 9- and 10-year-olds achieved 95% accuracy. On the Explanation Task, the cross-sectional trajectory revealed three stages: the 5- and 6-year-olds adequately explained indirect answers 32% of the time, the 7- and 8-year-olds performed significantly higher at 55%, and the 9- and 10-year-olds performed significantly higher than the younger children at 66%. Error analysis revealed that when children fail to interpret speaker intentions appropriately, they repeat the speaker’s utterance or provide an insufficient explanation 80% of the time. Other responses (those irrelevant to the context, “I don’t know” or no response, and made-up interpretations) each accounted for 2%-10% of total inadequate explanations. Study findings indicate discrepancies between task performances and offer two separate sets of baseline data for future comparisons that investigate comprehension or explanation of indirect answers by children with different cultural and linguistic backgrounds and by those with varying cognitive and language profiles.

Copyright
© The Author(s), 2023. Published by Cambridge University Press

Introduction

Consider the following exchange between two friends:

Question: “Are you going to the party?”

Answer: “I have an exam tomorrow.”

The response is an example of an indirect answer because it does not provide an explicit “yes” or “no” to the question. Nonetheless, a refusal to attend the party can be inferred from the response. Along with metaphors and sarcasm, indirect answers are a form of non-literal language due to the discrepancy between the speaker’s intended message and the explicit linguistic expression. Indirect answers are natural and common in everyday communication, accounting for 13%-38% of responses to yes/no questions (de Marneffe et al., Reference de Marneffe and Tonhauser2009; Hockey et al., Reference Hockey, Rossen-Knill, Spejewski, Stone and Isard1997; Stenstrom, Reference Stenstrom1984).

The intended meanings of indirect answers are known as conversational implicatures in the field of linguistic pragmatics. In “Logic and Conversation,” Grice (Reference Grice1975) described the phenomenon of meaning one thing while saying another and explained how speakers manage to understand each other. Grice postulated a general principle that speakers are cooperative with the intention to achieve effective communication. Assuming the responder intended to answer her friend’s question of attending the party in the exchange above, her response that she has an exam the next day must be relevant to her attendance at the party. Because it is commonly known that people prioritize responsibilities and obligations over leisure activities, her utterance intentionally communicated (or implicated per Grice’s terminology) a negative answer to the question.

Existing research suggests that children with typical development begin to comprehend indirect answers with some consistency around the age of 6 years, and this development continues to improve steadily throughout the early school years (Bernicot et al., Reference Bernicot, Laval and Chaminaud2007; Bucciarelli et al., Reference Bucciarelli, Colle and Bara2003; de Villiers et al., Reference de Villiers, de Villiers, Coles-White and Carpenter2009; Loukusa et al., Reference Loukusa, Leinonen and Ryder2007). Four recent studies depict this development from 2 to 10 years of age. First, Bucciarelli et al. (Reference Bucciarelli, Colle and Bara2003) used video-taped stories to test 2- to 7-year-old children on their comprehension of indirect answers (named “complex indirects” in the study). After viewing a story, the children chose a possible ending from four pictures. For example, in one scenario, two siblings stop in front of a doll shop. The brother asks, “Would you get me that game?” and the sister answers, “We don’t have any money.” In this example, selecting the picture of the siblings walking away from the store empty handed would be scored correct. Results indicated that accuracy increased with age: 38% for 2;6- to 3-year-olds, 42% for 3;6- to 4-year-olds, 43% for 4;6- to 5;6-year-olds, and 68% for 6- to 7-year-olds. It is noteworthy that only children older than 6 years showed relatively reliable performance above 50% accuracy when deriving indirect inferences.

Second, Loukusa et al. (Reference Loukusa, Leinonen and Ryder2007) tested children with typical development between the ages of 3 and 9 years on their ability to comprehend indirect comments. The researchers verbally presented a scenario such as: “A man is mowing, and a woman says to him, ‘There are flowers growing in the middle of the lawn so remember to be careful.’” followed by a question prompt, “Why does the woman say this?” Responses were judged as correct/appropriate (e.g., “So that the flowers wouldn’t be cut.”) or incorrect/inappropriate (e.g., “She doesn’t want to do it.”) based on whether the implicated meaning was derived. The researchers found that the mean score of correct/appropriate answers increased with age. Moreover, there was a significant difference in mean scores between 3- and 4-year-olds as well as between 5- and 6-year-olds. Examination of percent correct by age groups revealed that 3-year-olds drew indirect inferences from 21% of the questions; this percentage increased to 77% by the age of 6 years. Eight- and 9-year-old children performed near the ceiling.

Third, Bernicot et al. (Reference Bernicot, Laval and Chaminaud2007) used a computer-based story completion task to test comprehension of indirect answers, among other non-literal language forms, of children between the ages of 6 and 10 years. In one story, Donald and Daisy are in the yard. Donald asks Daisy, “Should I mow the lawn?” and Daisy replies, “The nephews are taking a nap.” The children had to pick a picture from two possible endings: one indicating the inference is understood (i.e., Donald waters the flowers) and the other indicating the inference is not understood (i.e., Donald mows the lawn). The researchers found that 75% of the 6-year-old children were able to correctly select the implicated ending in three or four of the four tested items. Performance for older children was near the ceiling, with 95% for the 8-year-olds and 100% for the 10-year-olds.

Finally, de Villiers et al. (Reference de Villiers, de Villiers, Coles-White and Carpenter2009) investigated comprehension of indirect answers by children aged 3 to 10 years. The researchers presented pictures with short question-answer pairs (e.g., Adult: “What happened to the ham?” Child: “The dog looks happy.”) to children and asked them to explain what the speaker meant (e.g., “What did the boy mean?” or “Why did he say that?”). Responses were coded as adequate or inadequate based on whether the implicated message was derived (e.g., “Because the dog ate the ham” vs. “Because the dog looked happy”). Results indicated that performance increased with age, where 4-year-olds provided adequate answers about 25% of the time and 9-year-olds provided adequate answers 90% of the time. Six-year-olds were able to provide adequate answers about half of the time.

In summary, the reviewed studies report a range of success rates for comprehension of indirect answers by 2- to 10-year-old children with typical development and indicate that this skill grows steadily with age. Particularly, children at the age of 6 years appear to be capable of drawing indirect inferences more consistently, ranging from 50% (de Villiers et al., Reference de Villiers, de Villiers, Coles-White and Carpenter2009) to 75% (Bernicot et al., Reference Bernicot, Laval and Chaminaud2007). It is important to note that one major methodological difference among the studies arises from how “comprehension” was measured. That is, Loukusa et al. (Reference Loukusa, Leinonen and Ryder2007) and de Villiers et al. (Reference de Villiers, de Villiers, Coles-White and Carpenter2009) used open-ended why-questions to probe children’s ability to explain indirect answers, whereas Bucciarelli et al. (Reference Bucciarelli, Colle and Bara2003) and Bernicot et al. (Reference Bernicot, Laval and Chaminaud2007) adapted a forced-choice format that simply assessed participants’ judgement of indirect answers.

Current Study

The purpose of the current study was to further examine the developmental trajectory of comprehension of indirect answers among 5- to 10-year-old children with typical development. There were three primary aims. The first was to examine comprehension of indirect answers with both forced-choice and open-ended questions. Observed differences in performance will clarify the discrepancies in previous findings and will inform future studies regarding the potential impact of methodology on task performance. Moreover, previous literature had only investigated indirect answers that were contextually clear (e.g., Q: “Are you going to the party?” A: “I have an exam tomorrow.”) but not indirect answers that were contextually ambiguous (e.g., Q: “Are you going to the party?” A: “Bob will be there.”). The current study included such a novel category to provide insight into children’s abilities to interpret the speaker’s intentions that are presumably unclear and more complex.

The second aim was to provide more empirical data on the development of comprehension and explanation of indirect answers in children, especially over the preschool and early elementary school years. The findings not only will further our understanding of the development of non-literal language but also will provide critical baseline data that can be compared with data from children with different cultural and linguistic backgrounds as well as those with varying cognitive and language profiles and communication difficulties, such as autism, Down syndrome, and developmental language disorder.

The third aim was to gather preliminary data on children’s explanations of indirect answers when they fail to interpret speaker intentions appropriately. Previous studies, such as Chin (Reference Chin2017), de Villiers et al. (Reference de Villiers, de Villiers, Coles-White and Carpenter2009), and Loukusa et al. (Reference Loukusa, Leinonen and Ryder2007), mentioned erroneous responses but did not analyze the characteristics of those errors. The findings in the current study will shed light on the challenges children may encounter in the reasoning process and will inform future investigations on teaching pedagogies for interpreting indirect answers. To address these aims, we included the following three research questions:

  1. Is there a significant difference between comprehending and explaining indirect answers by 5- to 10-year-old children with typical development?

  2a. Is there a significant difference in comprehending indirect answers by 5- to 10-year-old children with typical development?

  2b. Is there a significant difference in explaining indirect answers by 5- to 10-year-old children with typical development?

  3. When failing to derive speaker intentions from indirect answers, what are the most common error patterns produced by 5- to 10-year-old children with typical development?

Method

Participants

The study included 48 children, 23 boys and 25 girls, between the ages of 5 years; 0 months and 10 years; 11 months (M = 8;2, SD = 19.77 months). Of the 48 children, seven were 5-year-olds, eight were 6-year-olds, nine were 7-year-olds, seven were 8-year-olds, nine were 9-year-olds, and eight were 10-year-olds. Data collection occurred in summer 2019 during Minnesota’s annual State Fair. Participants were recruited through the University of Minnesota’s research facility where interested fairgoers could volunteer for a variety of research studies. The study was approved by the University of Minnesota’s Institutional Review Board for human subjects. Parents or guardians signed consent forms prior to participating in any study sessions.

Child participants met the following inclusionary criteria: (a) be a monolingual English speaker, (b) use at least 3-word utterances to communicate, (c) have normal or corrected-to-normal vision and hearing per parent report, and (d) receive a T-score lower than 60 on the Social Responsiveness Scale- Second Edition (SRS-2; Constantino & Gruber, Reference Constantino and Gruber2012). The SRS-2 identifies social impairments associated with autism; scores lower than 60 are considered within normal limits and not associated with clinical presentations of autism. This criterion was in place because research has found that individuals on the autism spectrum often do not appropriately interpret and use non-literal language, such as metaphors and irony (Colich et al., Reference Colich, Wang, Rudie, Hernandez, Bookheimer and Dapretto2012; Deliens et al., Reference Deliens, Papastamou, Ruytenbeek, Geelhand and Kissine2018b; Happé, Reference Happé1993, Reference Happé1995; Kalandadze et al., Reference Kalandadze, Norbury, Nærland and Næss2018; Norbury, Reference Noveck2005; Rundblad & Annaz, Reference Rundblad and Annaz2010).

Additionally, participants could not have a history of language impairment or developmental delay per parent report. For five participants, parents reported that they were receiving speech-language services, of which four were due to speech sound errors and one was due to stuttering. All five participants were included in the study. To determine eligibility for the study, participants completed the Matrices subtest of the Kaufman Brief Intelligence Test- Second Edition (KBIT-2; Kaufman & Kaufman, Reference Kaufman and Kaufman2004) and the Recalling Sentences subtest of the Clinical Evaluation of Language Fundamentals- Fourth Edition (CELF-4; Semel et al., Reference Semel, Wiig and Secord2003) as indices of non-verbal cognitive ability and expressive language ability, respectively. Table 1 provides a detailed summary of demographic and linguistic characteristics of the participants.

Table 1. Characteristics of Participants

Note.

a KBIT-2 = Matrices subtest of the Kaufman Brief Intelligence Test- Second Edition (Kaufman & Kaufman, Reference Kaufman and Kaufman2004), mean standard score = 100, SD = 15

b CELF-4 = Recalling Sentences subtest of the Clinical Evaluation of Language Fundamentals- Fourth Edition (Semel et al., Reference Semel, Wiig and Secord2003), mean scaled score = 10, SD = 3

c SRS-2 = Social Responsiveness Scale- Second Edition (Constantino & Gruber, Reference Constantino and Gruber2012), T-scores < 60 are considered within normal limits and scores ≥ 60 are associated with clinical presentations of autism spectrum disorder.

Procedure

Parents or legal guardians of participants completed a Family Background Questionnaire (FBQ; adapted from Bangert et al., Reference Bangert, Halverson and Finestack2019) and the Social Responsiveness Scale- Second Edition (SRS-2; Constantino & Gruber, Reference Constantino and Gruber2012). The FBQ (Bangert et al., Reference Bangert, Halverson and Finestack2019) included a series of questions about family demographic variables, including race, ethnicity, maternal education, employment, and household income. These variables were used to characterize participants’ demographic backgrounds. The FBQ also asked about medical history, diagnosis of neurodevelopmental disorders, and ongoing special services (e.g., speech, occupational, physical therapy). Participants were excluded from the study if language impairments or developmental delays were reported.

The SRS-2 (Constantino & Gruber, Reference Constantino and Gruber2012) is a measure of social behaviors of children and adults across three different age groups: preschool-age, school-age, and adult. The current study used the school-age version appropriate for individuals between 4-18 years of age. Parents or legal guardians rated their children’s reciprocal social behaviors (e.g., “Plays appropriately with children his/her own age,” “Has an unusually narrow range of interests”) on a 4-point Likert scale (i.e., “not true,” “sometimes true,” “often true,” and “almost always true”). Scores lower than 60 are considered within normal limits and not associated with clinical presentations of autism. Thus, participants with an SRS-2 score of 60 and higher were excluded from the study.

Participants completed the Matrices subtest of the KBIT-2 (Kaufman & Kaufman, Reference Kaufman and Kaufman2004) and the Recalling Sentences subtest of the CELF-4 (Semel et al., Reference Semel, Wiig and Secord2003) to measure non-verbal IQ and language ability. The KBIT-2 (Kaufman & Kaufman, Reference Kaufman and Kaufman2004) is a measure of verbal and non-verbal intelligence for individuals between the ages of 4 years; 0 months and 90 years; 11 months. Participants completed the Matrices subtest of the assessment, during which they viewed a pair of pictures that were related (e.g., a rabbit and a carrot) and a third picture (e.g., a dog) paired with a question mark. Then, they were asked to select a picture from five possibilities that would best match the third picture in a way similar to the first pair of pictures (i.e., a bone). As participants progressed, the test items became more difficult, transitioning from relationships between people and objects to abstract symbols and designs with more pictures to analyze. Participants established basal by obtaining three consecutive correct answers and reached ceiling by four consecutive scores of 0. Standard scores (M = 100, SD = 15) were calculated and used to characterize participants’ non-verbal IQ.

The CELF-4 (Semel et al., Reference Semel, Wiig and Secord2003) is a measure of expressive and receptive language abilities for individuals aged 5 years; 0 months to 21 years; 11 months. Participants completed the Recalling Sentences subtest of the CELF-4 in which they repeated sentences with varying length and syntactic complexity (e.g., “My mom is the nurse who works in the community clinic.”). Participants did not need to establish basal, and they reached ceiling after five consecutive scores of 0. Raw scores were converted to standardized scaled scores (M = 10, SD = 3), and these scores were used to characterize participants’ language ability. Table 1 provides a detailed summary of demographic and linguistic characteristics of the participants.

Experimental Task

Participants also completed an experimental task designed to measure comprehension and explanation of indirect answers. The experimental stimuli consisted of 30 novel question-answer pairs. Similar to those found in Bernicot et al. (Reference Bernicot, Laval and Chaminaud2007), each item included two images with audio stimuli of two people having a conversation. One person asked the question and the conversational partner responded with an indirect positive answer (e.g., Q: “Are you feeling cold?” A: “I should have worn a sweater.”), an indirect negative answer (e.g., Q: “Are you feeling hungry?” A: “I just came from a pizza party.”), an ambiguous answer (e.g., Q: “Are you feeling hot?” A: “It feels like yesterday.”) or a direct answer (e.g., Q: “Are you feeling tired?” A: “I am feeling tired.”). After viewing the conversation on an iPad, the researcher pointed to the responder and asked the participant whether the speaker meant yes or no (Comprehension Task). Then, the researcher asked the participant “Why?” or “How did you know that?” to explain his or her answer (Explanation Task). If the participant simply repeated the second person’s utterance (e.g., “He said he just came from a pizza party.”), the researcher prompted the child by asking “Tell me more.” or “Why did you think he meant yes/no?” The researcher recorded child responses verbatim and scored them online. All sessions were audio recorded using a digital audio recorder for further coding and reliability purposes. The task required approximately 10-15 minutes to complete.

The 30 question-answer pairs were split across four conditions based on how the conversation partner responded to the question posed: Indirect Yes (10 items), Indirect No (10 items), Ambiguous Response (5 items), and Direct Response (5 items). Indirect Yes answers provided a positive response to the yes-no question without stating “yes.” Indirect No answers provided a negative response to the yes-no question without stating “no.” Ambiguous Responses were designed to provide an unclear answer that could be interpreted as either yes or no to the question. Direct Responses provided a clear “yes” or “no” to the question. The Ambiguous and Direct Response conditions had fewer items because the former served as an exploratory condition and the latter served as a control condition for comparison. Appendix A provides a complete list of the stimuli.

Prior to testing, we created a total of 48 question-answer pairs that were evenly split across the four conditions. We invited 20 adults, who were native English speakers between 19 and 45 years of age, to read these pairs in written form and judge whether the answer meant yes or no. For Indirect Yes and Indirect No conditions, we selected the items with the highest agreement. The consensus indicated a mean agreement of 97% for the Indirect Yes items and 99% for the Indirect No items, with agreements ranging from 90% to 100% for all items. We selected the items with an agreement close to 50% for the Ambiguous Response items. The mean agreement for this condition was between 40% and 45%. Finally, the mean agreement for the Direct Response items was 100%.

We controlled for the syntactic complexity and semantic difficulty of the stimuli. Specifically, the utterance length of all question-answer prompts ranged from 5 to 7 morphemes (M = 5.8, SD = 0.79). Mean length of utterance (MLU; Brown, Reference Brown1973) indexes children’s syntactic development to their age. The MLU of the stimuli was in line with the participants’ language development as Rice et al. (Reference Rice, Smolik, Perpich, Thompson, Rytting and Blossom2010) found that 5- and 8-year-olds with typical development had an MLU of 4.92 and 5.59, respectively, in a large-scale study of more than 300 participants. All words used in the stimuli were acquired by age four, according to the age-of-acquisition norms created by Kuperman et al. (Reference Kuperman, Stadthagen-Gonzalez and Brysbaert2012). We selected age-appropriate vocabulary to evaluate comprehension and reasoning because we did not expect young children to explain items with advanced terms (e.g., Q: “Do you have any siblings?” A: “My father had a vasectomy after me.”). There were no suggestive words (e.g., “good”, “bad”, “favorite”, “hate”, “like”) that might reveal preferences, and, thus, bias judgement. We avoided contractions (e.g., “she’s”, “isn’t”, “can’t”) to maximize the clarity of utterances and avoided other forms of non-literal language (e.g., idiom, sarcasm) to ensure the validity of the experimental stimuli.

Two native English speakers, one male and one female, with Midwestern US dialects recorded the auditory stimuli. They were naïve to the study aims and were instructed to read through a list of statements followed by a list of questions (i.e., the experimental stimuli). Two individual recording sessions took place in a quiet therapy room using a microphone and digital recorder. The speakers were instructed to keep the same volume and speech rate and use a neutral tone when reading the sentences. We edited the sound files so that the male and female alternated asking and answering questions. The number of items with the male vs. female asking the question in each category was counterbalanced. We created two randomized sequences of the 30 question-answer pairs. For each sequence, no more than two of the Indirect Yes or Indirect No items appeared consecutively to reduce response bias, in which participants respond yes or no to all questions (Winkler et al., Reference Winkler, Kanouse and Ware1982).

Scoring and Coding

Prior to scoring and coding, the first author and a research assistant listened to each participant’s audio recordings and transcribed the child explanations in a spreadsheet. For the Comprehension Task, the researcher scored Indirect Yes, Indirect No, and Direct Response items as correct (1) or incorrect (0). The Ambiguous Response items were not scored because the answers could be interpreted either way (e.g., Q: “Did you have fun playing baseball?” A: “I tossed the ball.”). Thus, the maximum score for the Comprehension Task was 25 (i.e., 10 Indirect Yes, 10 Indirect No, and 5 Direct Response items). For the Explanation Task, two trained research assistants independently judged whether the explanation for an indirect answer was adequate (1) or inadequate (0). Both were undergraduate majors in Speech-Language-Hearing Sciences who were not involved in the transcription and were naïve to the purpose of the study and to participant characteristics (e.g., age, sex). A detailed description of the training and coding procedures can be found in Appendix B.
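The Comprehension Task scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the condition labels, the function name, and the data layout are assumptions made for the example and are not taken from the study materials.

```python
# Item counts per condition, as given in the task description.
# The dictionary keys are hypothetical labels chosen for this sketch.
SCORED_CONDITIONS = {"indirect_yes": 10, "indirect_no": 10, "direct": 5}
UNSCORED_CONDITIONS = {"ambiguous": 5}  # not scored: either answer is defensible

# Maximum Comprehension Task score: only scored conditions contribute.
MAX_SCORE = sum(SCORED_CONDITIONS.values())


def comprehension_score(responses):
    """Sum item scores (1 = correct, 0 = incorrect) over scored conditions.

    `responses` is a list of (condition, score) pairs; items from
    unscored conditions are skipped.
    """
    return sum(score for condition, score in responses
               if condition in SCORED_CONDITIONS)
```

Under these assumptions, an Ambiguous Response item never changes a child's total, which is why the maximum is 25 rather than 30.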

After scoring was completed, the research assistants further assigned an error code to the inadequate responses as I Don’t Know/No Response (1), Repetition of Response (2), Irrelevant to Context (3), Made-up Interpretation (4) that is appropriate to context but different from the speaker’s intention, or Insufficient Explanation (5) that fails to capture speaker intention. We created the error codes based on an initial evaluation of 150 child explanations and the coding scheme by Nippold and Martin (Reference Nippold and Martin1989) for idiom interpretation. We developed a worksheet that outlined a binary approach for the research assistants to categorize participant explanations (Figure 1). The research assistants coded all inadequate explanations independently. Interrater reliability across Indirect Yes, Indirect No, and Ambiguous Response items ranged from 88%-98% with an overall average of 95%. Interrater reliability for Direct Response items was not calculated, as all explanations for this category were judged adequate and, thus, did not receive an error code. Appendix B provides descriptions and example responses assigned to each error code.
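The text does not state which reliability index was used. As a minimal sketch, assuming simple percent agreement between the two coders (an assumption, not a detail confirmed by the study), the calculation could look like the following. The function name and the sample codes are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items to which two coders assigned the same code.

    Assumes simple percent agreement; chance-corrected indices such as
    Cohen's kappa would require a different calculation.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same number of items.")
    if not coder_a:
        raise ValueError("At least one rated item is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical error codes (1-5) from two coders for four explanations.
example = percent_agreement([1, 2, 5, 5], [1, 2, 5, 3])  # 3 of 4 codes match
```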

Figure 1. Error Code Worksheet.

Statistical Analyses

To address Research Question 1, the research assistants scored participants’ answers as correct (1) or incorrect (0) for the Comprehension Task. We calculated percent correct for the Indirect Yes, Indirect No, and Direct Response conditions. Percent correct for the Ambiguous Response condition was not calculated because the answers could be interpreted either way (e.g., Q: “Are you going to the party?” A: “Bob will be at the party.”). Given that all participants achieved 100% accuracy for the Direct Response items, we created an Overall percent correct variable by averaging the accuracy of the two experimental conditions only (i.e., Indirect Yes and Indirect No). For the Explanation Task, the research assistants scored participants’ responses as adequate (1) or inadequate (0). We calculated percent “adequate” for all four conditions and created an Overall percent “adequate” variable by averaging the accuracy of the three experimental conditions (i.e., Indirect Yes, Indirect No, and Ambiguous Response). Direct Response items were not included because all participants scored with 100% adequacy. Finally, we conducted a series of Wilcoxon signed-rank tests to examine mean differences in the Indirect Yes, Indirect No, and Overall categories between the Comprehension Task and the Explanation Task. The Wilcoxon signed-rank test is a non-parametric test for paired data based on independent units of analysis (Woolson, Reference Woolson2007). We also evaluated effect sizes using Cohen’s d, with 0.2, 0.5, and 0.8 representing small, medium, and large effect sizes, respectively (Howell, Reference Howell2016).
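To illustrate the paired comparison, the following sketch computes the rank sums that underlie the Wilcoxon signed-rank test. It uses the common convention of dropping zero differences and assigning average ranks to ties; the study does not say how these cases were handled, so both choices are assumptions. The p-value calculation is omitted, the function names are hypothetical, and the scores are invented for the example. In practice a statistics package would be used.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def signed_rank_sums(first, second):
    """Return (W+, W-): rank sums of positive and negative paired differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Invented percent scores for four children (not study data).
comprehension = [90, 80, 100, 85]
explanation = [50, 80, 60, 90]
w_plus, w_minus = signed_rank_sums(comprehension, explanation)
```

A large imbalance between the two sums indicates that one task tends to yield higher scores than the other within the same children.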

To increase the statistical power for Research Question 2, we combined the individual age groups (n = 7-9) into three larger groups: 5- and 6-year-olds, 7- and 8-year-olds, and 9- and 10-year-olds (n = 15-17). We conducted a series of Wilcoxon Mann-Whitney U-tests to examine mean differences in the Overall percent correct of the Comprehension Task and the Overall percent “adequate” of the Explanation Task between the three groups. The Wilcoxon Mann-Whitney U-tests provided a more conservative non-parametric approach to test whether two independent groups had been sampled from the same population (Siegel & Castellan, Reference Siegel and Castellan1988). Additionally, we adjusted the p-value cutoff to 0.017 using the Bonferroni correction and evaluated effect sizes using Cohen’s d, where 0.2, 0.5, and 0.8 represent small, medium, and large effect sizes, respectively (Howell, Reference Howell2016).
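The adjusted cutoff of 0.017 is consistent with dividing 0.05 by three pairwise group comparisons, although the number of comparisons is inferred here rather than stated. The sketch below shows that arithmetic together with a direct pair-counting form of the Mann-Whitney U statistic. Function names and sample values are hypothetical, and no p-value is computed.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance cutoff under the Bonferroni correction."""
    return alpha / n_comparisons


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: each (a, b) pair counts 1 if a > b, 0.5 if tied."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Assumed: three pairwise comparisons among the three combined age groups.
cutoff = bonferroni_alpha(0.05, 3)

# Invented scores for two small groups (not study data).
u_a = mann_whitney_u([3, 5], [1, 3, 4])
u_b = mann_whitney_u([1, 3, 4], [3, 5])
```

The two U values always sum to the product of the group sizes, which gives a quick sanity check on any implementation.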

To address Research Question 3, the research assistants independently coded inadequate explanations with one of the five error codes: I Don’t Know/No Response, Repetition of Response, Irrelevant to Context, Made-up Interpretation, or Insufficient Explanation. Next, we tallied the number of instances of each error code and calculated the percentage of each error code for the three larger age groups (i.e., "5 & 6", "7 & 8", and "9 & 10"). Finally, we conducted the Wilcoxon Mann-Whitney U-tests with the Bonferroni correction to examine differences in the distribution of the error codes between groups.
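The tallying step can be sketched as follows. The five code labels come from the coding scheme described above; the function name and the sample list of codes are hypothetical and do not represent study data.

```python
from collections import Counter

# Error codes as listed in the coding scheme.
ERROR_CODES = {
    1: "I Don't Know/No Response",
    2: "Repetition of Response",
    3: "Irrelevant to Context",
    4: "Made-up Interpretation",
    5: "Insufficient Explanation",
}


def error_code_percentages(codes):
    """Percentage of inadequate explanations assigned to each error code."""
    counts = Counter(codes)
    total = len(codes)
    return {
        label: (100.0 * counts[code] / total if total else 0.0)
        for code, label in ERROR_CODES.items()
    }


# Invented codes for ten inadequate explanations from one age group.
sample = error_code_percentages([2, 2, 5, 5, 5, 1, 3, 4, 2, 5])
```

Running the same function separately on each combined age group would yield the per-group distributions that the between-group tests compare.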

Results

Research Question 1

The first research question compared performance between the Comprehension Task and the Explanation Task. The children performed significantly higher on the Comprehension Task than the Explanation Task across the Indirect Yes (M (SD) = 87 (13) vs. 52 (23), p < 0.001, d = 1.86), Indirect No (M (SD) = 91 (11) vs. 56 (24), p < 0.001, d = 1.89), and Overall conditions (M (SD) = 89 (9) vs. 52 (21), p < 0.001, d = 2.26). Table 2 summarizes the mean percent correct and the mean percent “adequate” of the two tasks.
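As a rough check on the reported effect sizes, Cohen's d can be approximated from the means and standard deviations above. The article does not specify which variant of d was computed; this sketch assumes the form that divides the mean difference by the root mean square of the two standard deviations. The function name is hypothetical.

```python
import math


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d using the root mean square of the two SDs as the denominator.

    This variant is an assumption; the article does not state its formula.
    """
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Reported summary statistics (rounded percentages) for the two tasks.
d_indirect_yes = cohens_d(87, 13, 52, 23)
d_indirect_no = cohens_d(91, 11, 56, 24)
d_overall = cohens_d(89, 9, 52, 21)
```

Applied to the rounded values, this gives roughly 1.87, 1.87, and 2.29, close to the reported 1.86, 1.89, and 2.26. The small gaps are plausibly due to rounding of the published means and standard deviations or to a different variant of the formula.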

Table 2. Comparison of Performance on Comprehension Task vs. Explanation Task

Note.

a Overall percent correct calculated by averaging Indirect Yes and Indirect No for the Comprehension Task (Ambiguous items were not scored because the answers could be interpreted either way); Overall percent adequate calculated by averaging Indirect Yes, Indirect No, and Ambiguous Response for the Explanation Task.

b Mean comparisons using Wilcoxon signed-rank test to evaluate significance at 0.05 and Cohen’s d to evaluate effect sizes.

Research Question 2

The second research question examined performance on the two tasks by age. When comparing the Overall percent correct of the Comprehension Task, no significant difference was found across the three larger age groups (i.e., 5 & 6 vs. 7 & 8 vs. 9 & 10; M (SD) = 85 (11) vs. 87 (13) vs. 95 (9)). However, a large effect size emerged when comparing the 5- and 6- to the 9- and 10-year-olds (d = 0.98) and a medium effect size emerged when comparing the 7- and 8- to the 9- and 10-year-olds (d = 0.71). Within-group analyses indicated no significant differences between Indirect Yes and Indirect No items and small effect sizes from all comparisons (d-values < 0.47). Ambiguous Response items were not scored because they could be interpreted either way. Table 3 summarizes the mean percent correct of the Comprehension Task by the three larger age groups, and Figure 2 contains a scatter plot that visualizes the relationship between age and comprehension of indirect answers.
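The effect sizes above can be approximated from the reported means and standard deviations. The sketch below assumes the root-mean-square form of the pooled standard deviation, which the text does not state; the function names and the "negligible" label are illustrative. Because the inputs are rounded summary values, the results only approximate the published figures.

```python
import math


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d from summary statistics, pooling the two SDs by root mean square."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return abs(mean_1 - mean_2) / pooled_sd


def effect_size_label(d):
    """Label d using the 0.2 / 0.5 / 0.8 benchmarks cited in the Method."""
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"  # label for d < 0.2 is an assumption, not from the text


# Reported Overall percent correct, M (SD), on the Comprehension Task.
d_youngest_vs_oldest = cohens_d(85, 11, 95, 9)
d_middle_vs_oldest = cohens_d(87, 13, 95, 9)

print(round(d_youngest_vs_oldest, 2), effect_size_label(d_youngest_vs_oldest))
print(round(d_middle_vs_oldest, 2), effect_size_label(d_middle_vs_oldest))
```

Under these assumptions the two values come out at roughly 0.99 and 0.72, close to the reported 0.98 and 0.71 and falling in the same large and medium bands.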

Table 3. Performance on Comprehension Task by Age

Note.

a Overall percent correct calculated by averaging Indirect Yes and Indirect No items for the Comprehension Task. Ambiguous Response items were not scored because they could be interpreted either way. No significant statistical difference was found between groups.

Figure 2. Scatter Plot of Overall Percent Correct of Comprehension Task.

When comparing the Overall percent “adequate” of the Explanation Task, the 5- and 6-year-olds (M (SD) = 32 (19)) performed significantly lower than the 7- and 8-year-olds (M (SD) = 55 (15); U(30) = 51053, p < 0.0001, d = 1.37). Additionally, the 7- and 8-year-olds performed significantly lower than the 9- and 10-year-olds (M (SD) = 66 (14); U(32) = 66553, p < 0.0001, d = 0.73). Within-group analyses of the Indirect Yes, Indirect No, and Ambiguous items revealed no significant differences. However, a medium effect size was found between Indirect No and Ambiguous in the 9- and 10-year-olds (d = 0.68). The remaining effect sizes were small, ranging from 0.12 to 0.47. Table 4 summarizes the mean percent “adequate” of the Explanation Task by the three larger age groups, and Figure 3 contains a scatter plot that visualizes the relationship between age and explanation of indirect answers.

Table 4. Performance on Explanation Task by Age

Note.

a Overall percent “adequate” calculated by averaging Indirect Yes, Indirect No, and Ambiguous Response items for the Explanation Task. Significant statistical differences were found between groups (all p-values < 0.0001), indicating “5 & 6” < “7 & 8” < “9 & 10”.

Figure 3. Scatter Plot of Overall Percent “Adequate” of Explanation Task.

Research Question 3

The third research question examined inadequate responses by categorizing them into five error types: I Don’t Know/No Response, Repetition of Response, Irrelevant to Context, Made-up Interpretation, and Insufficient Explanation. When comparing the mean percentages of I Don’t Know/No Response, Repetition of Response, and Irrelevant to Context, no significant difference was found across the three larger age groups. However, medium effect sizes emerged from the Repetition of Response when comparing the 5- and 6-year-olds to the 7- and 8-year-olds (d = 0.71) and to the 9- and 10-year-olds (d = 0.74). When comparing the distribution of Made-up Interpretations, the 9- and 10-year-olds had a significantly lower mean percentage (M (SD) = 2.27 (6.32)) than the 5- and 6-year-olds (M (SD) = 5.56 (5.64); U(32) = 56.5, p = 0.007, d = 0.54) and the 7- and 8-year-olds (M (SD) = 9.09 (8.46); U(33) = 61.5, p = 0.007, d = 0.91). Finally, for Insufficient Explanations, 9- and 10-year-olds had a significantly higher mean percentage (M (SD) = 41.66 (26.29)) than 5- and 6-year-olds (M (SD) = 22.62 (18.99); U(32) = 66.5, p = 0.012, d = 0.83), with a medium effect size when comparing 5- and 6-year-olds to 7- and 8-year-olds (d = 0.78). Table 5 shows the distribution of the error types for the three larger age groups, and Table 6 further details the mean percentage of error types across the experimental conditions (i.e., Indirect Yes, Indirect No, and Ambiguous Response).

Table 5. Mean Percentage of Error Types by Age

Note. For I Don’t Know/No Response, Repetition of Response, and Irrelevant to Context, no significant differences were found across the three larger age groups. For Made-up Interpretation, 9- and 10-year-olds had a significantly lower mean percentage than 5- and 6-year-olds and 7- and 8-year-olds (both p-values = 0.007). For Insufficient Explanation, 9- and 10-year-olds had a significantly higher mean percentage than 5- and 6-year-olds (p = 0.012).

Table 6. Mean Percentage of Error Types across Three Experimental Conditions (Indirect Yes, Indirect No, and Ambiguous Response)

Note.

a The percentages indicate the distribution of the inadequate responses to Indirect Yes items across the error codes.

Discussion

The purpose of the current study was to examine the developmental trajectory of comprehension of indirect answers among preschool- and early elementary school-aged children with typical development. To address the first research question, we investigated the impact of methodological measures (i.e., forced-choice vs. open-ended questions) on task performance given that previous studies have used both formats and revealed varying levels of proficiency. The current study incorporated both types of measures (i.e., the Comprehension Task and Explanation Task), and the results indicated a significant difference with the Comprehension Task at 89% correct and the Explanation Task at 52% “adequate” overall. These findings are consistent with Bernicot et al. (2007) and de Villiers et al. (2009), who reported 75% vs. 50% accuracy between forced-choice and open-ended questions. This performance gap is expected because explanation is more difficult than comprehension of non-literal language and thus is developed and mastered over time. Laval (2003) examined both comprehension and metapragmatic knowledge of idioms (i.e., participants’ abilities to justify their chosen answers) and found that 6-year-old children with typical development understand idioms, but the corresponding metapragmatic knowledge does not mature until after age 9.

For the second research question, we examined children’s performance on comprehension and explanation of indirect answers by age. For the Comprehension Task, all children achieved above 84% accuracy, with the 9- and 10-year-olds reaching 95% accuracy. Across the three larger age groups, there was no significant difference between the performance on Indirect Yes and Indirect No items, suggesting that children comprehend indirect yes and no answers correctly at similar levels.

The developmental trajectory of the Explanation Task depicts noticeable gains over the preschool and early elementary years. Specifically, the cross-sectional trajectory shows three distinctive stages. In the first stage, 5- and 6-year-olds can adequately explain speaker intentions behind indirect answers 32% of the time. In the second stage, that percentage increases significantly to 55% among 7- and 8-year-olds. In the third stage, 9- and 10-year-olds make further significant gains, adequately interpreting and explaining indirect answers 66% of the time. The results also demonstrated that children of the same age performed similarly across the experimental conditions (i.e., Indirect Yes, Indirect No, and Ambiguous Response), suggesting no differential performance between the presumably more difficult Ambiguous Responses and the more common Indirect Yes and No answers. In particular, older children can adequately explain Ambiguous Responses just as well as Indirect Yes and No answers by linking the speaker’s utterance to their intention (e.g., interpreting an ambiguous response “I tossed the ball” as having fun at a baseball game by reasoning that “he [the speaker] practiced and became better at it” – more examples can be found in Appendix B).

The development of indirect answers is similar to that of metaphors, another common form of non-literal language (Colston & Kuiper, 2002; Kerbel & Grunwell, 1997): comprehension begins at the age of 5 to 6 years and improves steadily throughout childhood and adolescence (Nippold, 1985; Rundblad & Annaz, 2010; Vosniadou & Ortony, 1983). Winner et al. (1976) examined comprehension of metaphors by older children between the ages of 6 and 14 years by asking the participants to explain metaphoric sentences, such as “After many years of working at the jail, the prison guard had become a hard rock that could not be moved.” The researchers found that comprehension increased gradually with age, and a higher level of metaphoric understanding emerged in early adolescence. Responses by the youngest participants showed little to no signs of metaphoric understanding – for example, explaining that the prison had hard rock walls, or the guard used to sit on a rock. Eight-year-old children demonstrated initial understanding of metaphors by commenting that the guard had muscles as hard as rocks, linking physical similarities between the guard and rock. By 10 years of age, children began to provide genuine metaphoric responses, interpreting the guard as hard as a rock because he did not care about anybody. Such progressive comprehension continues into early adolescence. In another experimental sentence where participants had to explain “The taste was a sharp knife,” a 10-year-old interpreted it to mean “It was spicy,” while a 14-year-old explained with more enriched descriptions that “The taste was a shocking flavor, hitting all of my senses at once.” In the current study, we also see more elaborated explanations of indirect answers in older participants. For example, to explain the speaker’s intention in Q: “Are you going to the circus?” A: “I have my binoculars ready.” a 5-year-old participant simply answered, “She already packed her stuff.” A 7-year-old further explained, “So she can see better at the circus.” Several 9- and 10-year-olds provided even more detailed explanations that the speaker “brought her binoculars to see closer,” and “If there’s something small at the circus, she can use the binoculars to see it.” The oldest children in the current study were able to adequately explain indirect answers 66% of the time, and their performance may continue to improve well into adolescence similar to other forms of non-literal language (Nippold & Rudzinski, 1993; Vieiro & García-Madruga, 1997; Winner et al., 1976).

The acquisition and development of other common forms of non-literal language – such as irony (Dews et al., 1996; Hancock et al., 2000; Pexman & Glenwright, 2007), humor and sarcasm (Keenan & Quigley, 1999; Semrud-Clikeman & Glass, 2008, 2010), and scalar implicature (Guasti et al., 2005; Noveck, 2001; Papafragou & Musolino, 2003) – show trends similar to indirect answers, further suggesting that the preschool and early elementary years are prime years for acquiring non-literal language. However, more studies that compare and cross-examine multiple forms of non-literal language are needed before a comprehensive view of non-literal language development can be obtained.

Finally, for the third research question, we compared the distribution of five types of inadequate explanations (i.e., I Don’t Know/No Response, Repetition of Response, Irrelevant to Context, Made-up Interpretation, and Insufficient Explanation) across three larger age groups (i.e., 5- and 6-year-olds, 7- and 8-year-olds, and 9- and 10-year-olds). Overall, Repetition of Response and Insufficient Explanation were most common, accounting for approximately 80% of inadequate explanations. Irrelevant to Context, I Don’t Know/No Response, and Made-up Interpretation were less common, each accounting for 2%-10% of total inadequate responses. These data highlight that when children fail to explain indirect answers, their response often repeats the speaker’s utterance or lacks a convincing explanation for the speaker’s intention.

The distributions of I Don’t Know/No Response (Error Code 1), Repetition of Response (2), and Irrelevant to Context (3) did not differ significantly across the three larger age groups. For Made-up Interpretation (4), 9- and 10-year-olds had a significantly lower distribution of 2% than 5- and 6-year-olds at 6% and 7- and 8-year-olds at 9%. For Insufficient Explanation (5), 9- and 10-year-olds had a significantly higher distribution of 42% than 5- and 6-year-olds at 23%. The coding scheme for inadequate explanations was developed to reflect a hierarchy of reasoning, where lower codes represented no or poor explanations and higher codes represented more satisfactory (but still inadequate) explanations. While there was no significant difference in the distribution of the lower codes 1-3, a shift was seen in the higher codes 4 and 5. That is, the oldest group had significantly fewer Made-up Interpretations (4) than the other two younger groups and significantly more Insufficient Explanations (5) than the youngest group. Combining the findings, children’s ability to adequately explain indirect answers improves with age; this shift further suggests that the quality of inadequate explanations also improves with age.

Limitations and Future Directions

Although the current study provides a detailed examination of the comprehension and explanation abilities of preschool- and early elementary school-aged children, one limitation to these findings arises from the small sample sizes in each individual age group (n = 7-9). To increase statistical power, we combined these groups into three larger groups: 5- and 6-year-olds, 7- and 8-year-olds, and 9- and 10-year-olds (n = 15-17) to compare performance. While we found significant differences in explaining indirect answers between groups (i.e., “5 & 6” < “7 & 8” < “9 & 10”), we were not able to pinpoint a specific age or ages at which children make significant gains.

Another limitation stems from the racially homogeneous sample, in which 79% of participants (38 of 48) were identified as White. Therefore, results from the current study are most appropriately generalized to children with the same demographic background. Future studies should more closely examine and compare comprehension of indirect answers across different racial and linguistic backgrounds as well as include a larger sample to increase external validity of the results found in this study.

A third limitation arises from the exploratory nature of the error analyses. While we created the coding scheme based on Nippold and Martin’s (1989) classification for idiom interpretation, the five error types are not an exhaustive list. Additionally, we attempted to organize the error codes hierarchically from least to most satisfactory, although all of them represent inadequate explanations (i.e., I Don’t Know/No Response (1), Repetition of Response (2), Irrelevant to Context (3), Made-up Interpretation (4), and Insufficient Explanation (5)). This framework is not evidence-based and thus requires further investigation to validate its methodological soundness.

The current study did not examine suprasegmental or paralinguistic features that may facilitate comprehension of indirect answers. For example, intonation contours are known for their pragmatic function of suggesting or imposing an alternative meaning different from the literal utterance (de Marneffe & Tonhauser, 2019; Dennison & Schafer, 2017; Kurumada et al., 2014; Pierrehumbert & Hirschberg, 1990). Facial expressions can provide visual cues to determine whether there is a discrepancy between the literal message and speaker intention (Attardo et al., 2003; Caucci & Kreuz, 2012; Deliens et al., 2018a). In everyday communication, indirect answers are naturally accompanied by these multimodal cues. Thus, future research may investigate their influences on comprehension of indirect answers.

Future studies may also consider comparing comprehension of indirect answers across different neurodevelopmental conditions, such as autism and developmental language disorder. Understanding non-literal aspects of language, such as metaphors and irony, is often cited as a communication difficulty for individuals on the autism spectrum (e.g., Colich et al., 2012; Deliens et al., 2018b; Dennis et al., 2001; Emerich et al., 2003; Happé, 1993, 1995; Kalandadze et al., 2018; Martin & McDonald, 2004; Mitchell, 1997; Norbury, 2005; Rundblad & Annaz, 2010). However, this population’s ability to comprehend and explain the speaker’s intention behind indirect answers as another form of non-literal language remains unknown.

Conclusions

The main contribution of the current study is that it provides empirical evidence for the developmental trajectories of comprehension and explanation of indirect answers in preschool- and early elementary school-aged children with typical development. For comprehension, 5- to 8-year-olds performed at around 84%-86% accuracy, and 9- and 10-year-olds were near ceiling at 95%. For explanation, the cross-sectional trajectory indicated three stages, with 5- and 6-year-olds explaining indirect answers adequately 32% of the time, 7- and 8-year-olds performing significantly higher at 55%, and 9- and 10-year-olds performing significantly higher than the two younger groups at 66%. By examining the two tasks separately, the findings offer two sets of baseline data for future studies that investigate acquisition of indirect answers by children with different cultural and linguistic backgrounds or those with varying cognitive and language profiles.

The error analysis offers novel insight into what happens when children fail to interpret the speaker’s intentions appropriately. Overall, Repetition of Response and Insufficient Explanation are the most common errors and account for approximately 80% of inadequate explanations. Irrelevant to Context, I Don’t Know/No Response, and Made-up Interpretation each account for 2%-10% of total inadequate responses. Additionally, the quality of inadequate explanations improves with age, as evidenced by older children providing more Insufficient Explanations and fewer Made-up Interpretations than younger children.

Competing interest statement

The authors have no financial or nonfinancial relationships to disclose.

Funding statement

Timothy Huang received funding from West Chester University’s Creative Activity and Research Experience Award and from the University of Minnesota’s Bryng Bryngelson Research Award to support this study.

Appendix A

Appendix B

This appendix details the training procedure for research assistants. The table at the end of the appendix provides descriptions of the error codes and examples of child responses. All research assistants completed the online Basic Course for Social/Behavior or Humanist Research training through the Collaborative Institutional Training Initiative Program. The researcher trained two coding assistants through direct instruction on definitions of an adequate explanation and characteristics of each error code for inadequate explanations, utilizing participant responses as examples. An adequate explanation was defined as a response that links the speaker’s utterance to his/her intention or provides a context-appropriate alternative reason. For example, adequate explanations of the conversation Q: “Are you hungry?” A: “I just came from a pizza party.” included “He was not hungry because he already ate at the party.” Inadequate explanations, however, failed to capture or interpret the speaker’s intention appropriately. For example, inadequate explanations of the same item included “He said he went to a pizza party.” and “Parties are fun.” Scoring of the Explanation Task was independent of the Comprehension Task. That is, a participant could score 1 for the Comprehension Task but 0 for the Explanation Task and vice versa. For example, a participant could score 0 on the Comprehension Task by answering “Yes (he is hungry),” which contradicts adult judgement, but still score 1 on the Explanation Task by reasoning that “Because he didn’t eat anything at the party, so probably hungry.”

Scoring for the Ambiguous and Direct Response items differed from the other two conditions. Because the Ambiguous answer prompts were designed to be unclear, explanations were judged based on whether an appropriate or adequate speaker intention was provided. For example, in the conversation, Q: “Did you have fun playing baseball?” A: “I tossed the ball.”, adequate explanations included “(Yes) because he practiced and became better at it” and “(No) all he did was tossing the ball.” Inadequate explanations included “(Yes) he tossed the baseball” and “(No) he didn’t have fun.” For Direct Response items, repetition of the answer prompt was considered adequate. For example, in the conversation: Q: “Did you go to the garden?” A: “I did not go to the garden.”, “He said he did not go to the garden.” and “He said so.” would be judged as adequate. More examples are provided at the end of the appendix.

The research assistants completed approximately 1 hour of training with the researcher. Next, the research assistants independently scored and coded 90 child responses (i.e., responses by three participants). The researcher coded the same responses and calculated interrater reliability between the researcher and the assistants for each test item. Interrater reliability was computed by dividing the number of instances of agreement by the total number of opportunities and multiplying by 100. Reliability on the Comprehension Task across all test items ranged from 80%-100% between the researcher and one assistant and 60%-100% between the researcher and the other assistant. Reliability on the Explanation Task across all test items ranged from 80%-100% between the researcher and the two assistants. The researcher and the research assistants met a second time to discuss instances of disagreement item by item and resolved all issues using the error code worksheet (Figure 1). After training was completed, the researcher and both assistants reached 100% reliability on 30 new child responses.
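The reliability formula described above (agreements divided by opportunities, multiplied by 100) can be sketched as follows. The function name and the sample scores are hypothetical and are used only to illustrate the arithmetic.

```python
def percent_agreement(coder_a, coder_b):
    """Percent agreement between two coders who scored the same responses."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must score the same number of responses.")
    if not coder_a:
        raise ValueError("At least one scored response is required.")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100 * agreements / len(coder_a)


# Hypothetical adequate (1) / inadequate (0) scores for five responses.
researcher = [1, 0, 1, 1, 0]
assistant = [1, 0, 0, 1, 0]

print(percent_agreement(researcher, assistant))
```

Here the two coders agree on four of five responses, which gives 80% agreement.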

Both research assistants scored all responses independently. Reliability across Indirect Yes, Indirect No, and Ambiguous Response items ranged from 82%-94% with an overall average of 87%. Instances of disagreement arose from the coders’ subjective judgment of whether a response “adequately” explained speaker intentions. For example, for the item Q: “Have you finished your homework?” A: “I just got home from school.”, the explanation “He doesn’t want to do it.” was coded as adequate by one coder but inadequate by the other. The former argued that the response showed that the child interpreted the indirect answer as an excuse, and the latter argued that the response failed to provide a convincing explanation, such as “He did not have time to do it.” All disagreements were subsequently resolved by the first author as a third coder. Interrater reliability was 100% for Direct Response items as all explanations were judged adequate. The table below provides descriptions of the error codes and corresponding examples across the experimental conditions.

Note. a Child did not respond after further prompting (e.g., “Tell me more.”); b All participants scored “adequate” for Direct Response items. * indicates adequate explanations based on alternative assumptions.

References

Attardo, S., Eisterhold, J., Hay, J., & Poggi, I. (2003). Multimodal markers of irony and sarcasmHumor16(2), 243260.Google Scholar
Bangert, K. J., Halverson, D. M., & Finestack, L. H. (2019). Evaluation of an explicit instructional approach to teach grammatical forms to children with low-symptom severity autism spectrum disorderAmerican Journal of Speech-Language Pathology28(2), 650663.10.1044/2018_AJSLP-18-0016CrossRefGoogle ScholarPubMed
Bernicot, J., Laval, V., & Chaminaud, S. (2007). Non-literal language forms in children: In what order are they acquired in pragmatics and metapragmatics? Journal of Pragmatics39(12), 21152132.10.1016/j.pragma.2007.05.009CrossRefGoogle Scholar
Brown, R. (1973). A first language: The early stages. Oxford, England: Harvard U. Press.CrossRefGoogle Scholar
Bucciarelli, M., Colle, L., & Bara, B. G. (2003). How children comprehend speech acts and communicative gesturesJournal of Pragmatics35(2), 207241.10.1016/S0378-2166(02)00099-1CrossRefGoogle Scholar
Caucci, G. M., & Kreuz, R. J. (2012). Social and paralinguistic cues to sarcasmHumor25(1), 122.10.1515/humor-2012-0001CrossRefGoogle Scholar
Chin, I. (2017). Variability in Pragmatic Abilities in Children with Autism Spectrum Disorder. Doctoral Dissertation. University of Connecticut.Google Scholar
Colich, N. L., Wang, A. T., Rudie, J. D., Hernandez, L. M., Bookheimer, S. Y., & Dapretto, M. (2012). Atypical neural processing of ironic and sincere remarks in children and adolescents with autism spectrum disordersMetaphor and symbol27(1), 7092.CrossRefGoogle ScholarPubMed
Colston, H. L., & Kuiper, M. S. (2002). Figurative Language Development Research and Popular Children’s Literature: Why We Should Know,” Where the Wild Things Are”. Metaphor and Symbol17(1), 2743.CrossRefGoogle Scholar
Constantino, J. N., & Gruber, C. P. (2012). Social Responsiveness Scale Second Edition (SRS-2). Western Psychological Services (WPS).Google Scholar
Deliens, G., Antoniou, K., Clin, E., Ostashchenko, E., & Kissine, M. (2018a). Context, facial expression and prosody in irony processingJournal of memory and language99, 3548.CrossRefGoogle Scholar
Deliens, G., Papastamou, F., Ruytenbeek, N., Geelhand, P., & Kissine, M. (2018b). Selective pragmatic impairment in autism spectrum disorder: Indirect requests versus ironyJournal of autism and developmental disorders48(9), 29382952.CrossRefGoogle ScholarPubMed
de Marneffe, M. C., & Tonhauser, J. (2019). Inferring meaning from indirect answers to polar questions: The contribution of the rise-fall-rise contour. In Questions in discourse (pp. 132163). Brill.CrossRefGoogle Scholar
Dennis, M., Lazenby, A. L., & Lockyer, L. (2001). Inferential Language in High Function Children with Autism, Journal of Autism and Developmental Disorders 31: 4754.10.1023/A:1005661613288CrossRefGoogle ScholarPubMed
Dennison, H., & Schafer, A. J. (2017). Processing intonationally implicated contrast versus negation in American EnglishLanguage and Speech60(2), 174199.10.1177/0023830917694066CrossRefGoogle ScholarPubMed
de Villiers, P. A., de Villiers, J., Coles-White, D., & Carpenter, L. (2009). Acquisition of relevance implicatures in typically-developing children and children with autism. In Proceedings of the 33th annual Boston university conference on language development (pp. 121132). Somerville, MA: Cascadilla Press.Google Scholar
Dews, S., Winner, E., Kaplan, J., Rosenblatt, E., Hunt, M., Lim, K., McGovern, A., Qualter, A., & Smarsh, B. (1996). Children’s understanding of the meaning and functions of verbal irony. Child Development, 67(6), 30713085.CrossRefGoogle ScholarPubMed
Emerich, D. M., Creaghead, N. A., Grether, S. M., Murray, D., & Grasha, C. (2003). The comprehension of humorous materials by adolescents with high-functioning autism and Asperger’s syndromeJournal of autism and developmental disorders33(3), 253257.CrossRefGoogle ScholarPubMed
Grice, H. P. (1975). Logic and conversation. In Speech acts (pp. 4158). Brill.CrossRefGoogle Scholar
Guasti, M. T.Chierchia, G.Crain, S.Foppolo, F.Gualmini, A., & Meroni, L. (2005). Why children and adults sometimes (but not always) compute implicaturesLanguage and Cognitive Processes20667.CrossRefGoogle Scholar
Hancock, J. T., Dunham, P. J., & Purdy, K. (2000). Children’s comprehension of critical and complimentary forms of verbal ironyJournal of Cognition and Development1(2), 227248.CrossRefGoogle Scholar
Happé, F. G. (1993). Communicative competence and theory of mind in autism: A test of relevance theoryCognition48(2), 101119.CrossRefGoogle ScholarPubMed
Happé, F. G. (1995). The role of age and verbal ability in the theory of mind task performance of subjects with autismChild development66(3), 843855.CrossRefGoogle ScholarPubMed
Hockey, B. A., Rossen-Knill, D., Spejewski, B., Stone, M., & Isard, S. (1997). Can you predict responses to yes/no questions? yes, no, and stuff. In Fifth european conference on speech communication and technology. 10.21437/Eurospeech.1997-597CrossRefGoogle Scholar
Howell, D. C. (2016). Fundamental statistics for the behavioral sciences. Nelson Education.Google Scholar
Kalandadze, T., Norbury, C., Nærland, T., & Næss, K. A. B. (2018). Figurative language comprehension in individuals with autism spectrum disorder: A meta-analytic reviewAutism22(2), 99117.CrossRefGoogle ScholarPubMed
Kaufman, A. S., & Kaufman, N. L. (2004). Kaufman Brief Intelligence Test, Second Edition. Bloomington, MN: Pearson, Inc.Google Scholar
Keenan, T. R., & Quigley, K. (1999). Do young children use echoic information in their comprehension of sarcastic speech? A test of echoic mention theoryBritish Journal of Developmental Psychology17(1), 8396.CrossRefGoogle Scholar
Kerbel, D., & Grunwell, P. (1997). Idioms in the classroom: An investigation of language unit and mainstream teachers’ use of idiomsChild Language Teaching and Therapy13(2), 113123.CrossRefGoogle Scholar
Kuperman, V., Stadthagen-Gonzalez, H., & Brysbaert, M. (2012). Age-of-acquisition ratings for 30,000 English wordsBehavior research methods44(4), 978990.10.3758/s13428-012-0210-4CrossRefGoogle ScholarPubMed
Kurumada, C., Brown, M., Bibyk, S., Pontillo, D. F., & Tanenhaus, M. K. (2014). Is it or isn’t it: Listeners make rapid use of prosody to infer speaker meaningsCognition133(2), 335342.10.1016/j.cognition.2014.05.017CrossRefGoogle ScholarPubMed
Laval, V. (2003). Idiom comprehension and metapragmatic knowledge in French children. Journal of Pragmatics, 35, 723739.CrossRefGoogle Scholar
Loukusa, S., Leinonen, E., & Ryder, N. (2007). Development of pragmatic language comprehension in Finnish-speaking childrenFirst Language27(3), 279296.CrossRefGoogle Scholar
Martin, I., & McDonald, S. (2004). An exploration of causes of non-literal language problems in individuals with Asperger syndromeJournal of autism and developmental disorders34(3), 311328.CrossRefGoogle ScholarPubMed
Mitchell, P. (1997). Introduction to theory of mind: Children, autism and apes. Edward Arnold Publishers.Google Scholar
Nippold, M. A. (1985). Comprehension of figurative language in youth. Topics in Language Disorders.CrossRefGoogle Scholar
Nippold, M. A., & Martin, S. T. (1989). Idiom interpretation in isolation versus context: A developmental study with adolescents. Journal of Speech, Language, and Hearing Research, 32(1), 59–66. https://doi.org/10.1044/jshr.3201.59
Nippold, M. A., & Rudzinski, M. (1993). Familiarity and transparency in idiom explanation: A developmental study of children and adolescents. Journal of Speech and Hearing Research, 36, 728–737. https://doi.org/10.1044/jshr.3604.728
Norbury, C. F. (2005). The relationship between theory of mind and metaphor: Evidence from children with language impairment and autistic spectrum disorder. British Journal of Developmental Psychology, 23(3), 383–399.
Noveck, I. A. (2001). When children are more logical than adults: Experimental investigations of scalar implicature. Cognition, 78(2), 165–188.
Papafragou, A., & Musolino, J. (2003). Scalar implicatures: Experiments at the semantics–pragmatics interface. Cognition, 86(3), 253–282.
Pexman, P. M., & Glenwright, M. (2007). How do typically developing children grasp the meaning of verbal irony? Journal of Neurolinguistics, 20(2), 178–196.
Pierrehumbert, J., & Hirschberg, J. B. (1990). The meaning of intonational contours in the interpretation of discourse.
Rice, M. L., Smolik, F., Perpich, D., Thompson, T., Rytting, N., & Blossom, M. (2010). Mean length of utterance levels in 6-month intervals for children 3 to 9 years with and without language impairments. Journal of Speech, Language, and Hearing Research, 53(2), 333–349.
Rundblad, G., & Annaz, D. (2010). Development of metaphor and metonymy comprehension: Receptive vocabulary and conceptual knowledge. British Journal of Developmental Psychology, 28(3), 547–563.
Semel, E., Wiig, E. H., & Secord, W. A. (2003). Clinical Evaluation of Language Fundamentals, Fourth Edition (CELF-4). Toronto, Canada: The Psychological Corporation/A Harcourt Assessment Company.
Semrud-Clikeman, M., & Glass, K. (2008). Comprehension of humor in children with nonverbal learning disabilities, reading disabilities, and without learning disabilities. Annals of Dyslexia, 58, 163–180. https://doi.org/10.1007/s11881-008-0016-3
Semrud-Clikeman, M., & Glass, K. (2010). The relation of humor and child development: Social, adaptive, and emotional aspects. Journal of Child Neurology, 25(10), 1248–1260.
Siegel, S., & Castellan, N. J., Jr. (1988). Nonparametric statistics for the behavioral sciences. New York, NY: McGraw-Hill, Inc.
Stenstrom, A.-B. (1984). Questions and Responses in English Conversation. Lund: Lund University Press.
Vieiro, P., & García-Madruga, J. A. (1997). An analysis of story comprehension through spoken and written summaries in school-age children. Reading and Writing: An Interdisciplinary Journal, 9, 41–53. https://doi.org/10.1023/A:1007932429184
Vosniadou, S., & Ortony, A. (1983). The emergence of the literal-metaphorical-anomalous distinction in young children. Child Development, 154–161.
Winkler, J. D., Kanouse, D. E., & Ware, J. E. (1982). Controlling for acquiescence response set in scale development. Journal of Applied Psychology, 67(5), 555.
Winner, E., Rosenstiel, A. K., & Gardner, H. (1976). The development of metaphoric understanding. Developmental Psychology, 12(4), 289.
Woolson, R. F. (2007). Wilcoxon signed-rank test. Wiley Encyclopedia of Clinical Trials, 1–3.
Table 1. Characteristics of Participants

Figure 1. Error Code Worksheet.

Table 2. Comparison of Performance on Comprehension Task vs. Explanation Task

Table 3. Performance on Comprehension Task by Age

Figure 2. Scatter Plot of Overall Percent Correct of Comprehension Task.

Table 4. Performance on Explanation Task by Age

Figure 3. Scatter Plot of Overall Percent “Adequate” of Explanation Task.

Table 5. Mean Percentage of Error Types by Age

Table 6. Mean Percentage of Error Types across Three Experimental Conditions (Indirect Yes, Indirect No, and Ambiguous Response)