San Sebastián del Monte Mixtec (henceforth SSM), also known as Tò’on Ndà’vi, is a language of the Mixtecan family, Otomanguean stock. SSM has lexical tones that are orthogonal to rearticulation on vowels. The aim of this production study is to examine both long modal and rearticulated vowels to gain insight into the SSM tonal system, contrastive voice quality, and any potential interactions between voice quality and f0. Rearticulated vowels are described as having a glottal gesture between two vowels of the same quality (V͡ˀV), while modal vowels have no such gesture (VV). First, we examined the phonetic realization of the lexical tones in long modal vowels in terms of f0. All tones are distinguished by f0; f0 patterns largely as expected given previously ascribed labels, with minor deviations. Second, the phasing and degree of glottalization in rearticulated vowels were measured using ‘strength of excitation’ (SoE); generally the glottal gesture was vowel medial, with a dip in SoE at the beginning of the glottal gesture and a rise in SoE following it. However, there was a large degree of interspeaker variation in the production of rearticulated vowels. Additionally, lexical tone category was found to have an impact on the phasing and degree of the glottal gesture in rearticulated vowels, and on voice quality in long modal vowels. This supports the idea that voice quality is an additional correlate of lexical tone in SSM.
An open question about cartography is whether one and the same functional head may iterate on the functional hierarchy. We demonstrate that the stackability of certain modals from the same semantic class in Mandarin offers clear evidence for such a possibility.
Jury selection in the US involves voir dire, an examination process wherein prospective jurors are questioned about their potential for fairness or bias. Such inquiries are hampered by social desirability pressures inhibiting admissions of bias. Analogous pressures hamper survey interviews, but since voir dire examinations are unscripted their study can reveal how desirability pressures are addressed through naturally occurring variations in question design. This article combines sequential and distributional analyses of >100 transcribed question-answer sequences targeting juror fairness/bias, and documents various tendencies and preferences in question design. Court officials focus on bias rather than fairness by default, and the predominant bias-targeting questions are mitigated through: (i) indirect references to bias, (ii) diffusion of responsibility for bias, and (iii) projecting bias as minimal or unlikely. The findings shed light on the social dynamics of jury selection and, more broadly, how question design practices are adapted for inquiry into sensitive subjects. (Questions, law, voir dire, juries, social desirability bias, conversation analysis)
Cadastral data reveal key information about the historical organization of cities but are often non-standardized due to diverse formats and human annotations, complicating large-scale analysis. We explore as a case study Venice’s urban history during the critical period from 1740 to 1808, capturing the transition following the fall of the ancient Republic and the Ancien Régime. This era’s complex cadastral data, marked by its volume and lack of uniform structure, presents unique challenges that our approach adeptly navigates, enabling us to generate spatial queries that bridge past and present urban landscapes. We present a text-to-programs framework that leverages large language models to process natural language queries as executable code for analyzing historical cadastral records. Our methodology implements two complementary techniques: a SQL agent for handling structured queries about specific cadastral information, and a coding agent for complex analytical operations requiring custom data manipulation. We propose a taxonomy that classifies historical research questions based on their complexity and analytical requirements, mapping them to the most appropriate technical approach. This framework is supported by an investigation into the execution consistency of the system, alongside a qualitative analysis of the answers it produces. By ensuring interpretability and minimizing hallucination through verifiable program outputs, we demonstrate the system’s effectiveness in reconstructing past population information, property features and spatiotemporal comparisons in Venice.
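The SQL-agent route described above can be illustrated with a minimal sketch: a natural-language question is paired with a SQL string, and the string is executed against a table so that the answer can be checked against the underlying records. Everything specific here is an assumption made for illustration: the table name `catastici`, its columns, the rows, and the hand-written query standing in for what a language model might generate. None of it is taken from the article's system or data.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory table with a few made-up cadastral rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catastici (owner TEXT, parish TEXT, use_type TEXT, rent INTEGER)"
    )
    rows = [
        ("Owner A", "San Polo", "bottega", 40),
        ("Owner B", "San Polo", "casa", 25),
        ("Owner C", "Castello", "casa", 12),
        ("Owner A", "Castello", "bottega", 30),
    ]
    conn.executemany("INSERT INTO catastici VALUES (?, ?, ?, ?)", rows)
    return conn

def answer_question(conn, generated_sql):
    """Run the generated SQL and return the rows, so the output is verifiable."""
    return conn.execute(generated_sql).fetchall()

# Hypothetical question and a hand-written stand-in for the model's SQL output.
QUESTION = "What is the average rent per parish?"
GENERATED_SQL = (
    "SELECT parish, AVG(rent) FROM catastici GROUP BY parish ORDER BY parish"
)

conn = build_demo_db()
result = answer_question(conn, GENERATED_SQL)
print(QUESTION)
print(result)
```

Because the answer is the output of an executed program rather than free text, a reader can inspect both the query and the rows it touched.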
Prediction is a central feature of mature language comprehension, but little is known about how and when it develops. This study investigates whether lexical prediction emerges before age seven using a novel, naturalistic cloze task. Five- and six-year-old children listened to a storybook and occasionally guessed which word might come next. We selected 180 words from the story that were shown to be more or less predictable in a prior cloze norming task with adults. We found that children frequently guessed the correct word or provided an alternative that was semantically related to the target, demonstrating an ability to use the context to explicitly predict upcoming words. Six-year-olds were more accurate than five-year-olds. These findings show prediction is present (but still improving) in early childhood, motivating future work on the role of prediction in children’s comprehension and learning. Finally, we demonstrate that it is feasible to collect cloze values from children.
In this paper, we present two corpus-based case studies which cast doubt on the postulation of a distinction between complements and modifiers in pre-head position in the English noun phrase. Based on examples such as medical student, the paper focuses on ordering patterns as an easily observable criterion, rather than more difficult or less reliable criteria such as anaphoric replacement or stress patterns. The conclusion is that the pre-head dependents treated as complements in, for example, the Cambridge Grammar of the English Language (Huddleston & Pullum et al. 2002), should rather be treated as type-dependents. This conclusion, at least as far as ordering patterns are concerned, is in line with the postulation of a “classifier” function in approaches to English noun phrases such as Feist (2009).
This article explores the potential of large language models (LLMs), particularly through the use of contextualized word embeddings, to trace the evolution of scientific concepts. It thus aims to extend the potential of LLMs, currently transforming much of humanities research, to the specialized field of history and philosophy of science. Using the concept of the virtual particle – a fundamental idea in understanding elementary particle interactions – as a case study, we domain-adapted a pretrained Bidirectional Encoder Representations from Transformers model on nearly a century of Physical Review publications. By employing semantic change detection techniques, we examined shifts in the meaning and usage of the term “virtual.” Our analysis reveals that the dominant meaning of “virtual” stabilized after the 1950s, aligning with the formalization of the virtual particle concept, while the polysemy of “virtual” continued to grow. Augmenting these findings with dependency parsing and qualitative analysis, we identify pivotal historical transitions in the term’s usage. In a broader methodological discussion, we address challenges such as the complex relationship between words and concepts, the influence of historical and linguistic biases in datasets, and the exclusion of mathematical formulas from text-based approaches.
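One common way to quantify semantic change with contextualized embeddings is to average a word's usage vectors within each time period and compare the resulting centroids by cosine distance. The sketch below shows that measure only; it is not necessarily the technique used in the article, and the three-dimensional vectors and period labels are toy values invented for illustration (real contextual embeddings would have hundreds of dimensions).

```python
import math

def centroid(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy usage vectors for one target word, grouped by (hypothetical) period.
usage_by_period = {
    "1930s": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "1950s": [[0.1, 0.9, 0.0], [0.0, 1.0, 0.0]],
    "1970s": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}

periods = list(usage_by_period)
centroids = {p: centroid(vs) for p, vs in usage_by_period.items()}

# Distance between consecutive periods: a large value suggests a shift in usage.
shifts = {
    (a, b): cosine_distance(centroids[a], centroids[b])
    for a, b in zip(periods, periods[1:])
}
for pair, dist in shifts.items():
    print(pair, round(dist, 4))
```

In this toy setup the distance between the first two periods is much larger than between the last two, which is the kind of pattern one would read as a meaning stabilizing after an earlier shift.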
Previous research has demonstrated that predictable words that are not presented linger in memory and lead to false recognition in subsequent memory tests. However, little is known about these effects among second language learners, a population that is known for engaging less in prediction. Here, we used a self-paced reading and word recognition memory test to examine encoding differences and subsequent memory effects in groups of L1 and L2 speakers of German. For initial reading, results showed no group differences in the size of the predictability effect, possibly because group differences in attention allocation during reading masked predictability effects. For recognition memory, L2 learners showed reduced rates of false remembering for predictable words (after correcting for response bias), and they were also less likely to false-alarm to predictable words with high subjective memory confidence, similar to L1 speakers. In addition, L2 learners showed reduced recognition memory for previously presented words. Taken together, these results are consistent with models arguing that lexical-semantic entries are less firmly represented in the L2 lexicon, which in turn lowers pre-activation of predictable referents during L2 sentence processing and leads to the formation of less distinct memory representations for previously encoded information.
While statistical learning of adjacent constructions is well-documented in SLA, our knowledge of this cognitive mechanism concerning nonadjacent constructions remains limited. To address this, we investigated the acquisition of Mandarin predicate-argument constructions containing the preposition duì. Specifically, via a corpus-based approach, we probed whether learners’ core predicate use within these nonadjacent constructions mirrors the patterns of frequency and contingency in their natural language input. Our findings show that learners’ usage aligns with target language distributional regularities, which is consistent with statistical learning. However, our study underscores the necessity of going beyond a sole focus on distributional factors within learners’ input to more fully comprehend L2 production choices and the intricacies of statistical learning. This includes examining variables that shape learners’ exposure to input, such as input accessibility, proficiency, and prototypicality. Finally, we demonstrate the suitability of mixed-effects negative binomial regression to effectively address non-normality and overdispersion in linguistic data.
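The motivation for a negative binomial model can be sketched with a simple overdispersion check: a Poisson model assumes the variance of the counts equals their mean, so a variance-to-mean ratio well above 1 signals overdispersion. The code below computes that ratio and a method-of-moments estimate of the negative binomial dispersion parameter. It does not fit the mixed-effects model itself, and the counts are made up for illustration.

```python
import statistics

def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 indicate overdispersion."""
    return statistics.variance(counts) / statistics.mean(counts)

def nb_alpha_moments(counts):
    """Method-of-moments alpha under Var = mu + alpha * mu**2 (NB2 form)."""
    mu = statistics.mean(counts)
    var = statistics.variance(counts)
    return (var - mu) / (mu ** 2)

# Hypothetical per-learner frequencies of one construction (invented values).
counts = [0, 0, 1, 0, 2, 0, 15, 1, 0, 9, 0, 3]

ratio = dispersion_index(counts)
alpha = nb_alpha_moments(counts)
print("variance/mean ratio:", round(ratio, 2))
print("moment estimate of alpha:", round(alpha, 2))
```

A positive `alpha` means the variance grows faster than the mean, which a Poisson model cannot capture and a negative binomial model can.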
We investigate timing and eye-movement behavior during semantic prediction in L1 and L2 speakers of English using the Visual World Paradigm, additionally exploring speech rate. We differentiate first-stage predictions, considered to be automatic and relatively cost-free, from second-stage predictions, which are non-automatic and more cognitively demanding, with differences between L1 and L2 speakers believed to arise in second-stage predictions. We found no differences in the divergence of looks to the target in first- or second-stage predictions across groups. However, speech rate played an important role. Both L1 and L2 speakers showed similar first-stage predictions at slower speech rates, but L1 speakers showed earlier predictions as the speech rate increased. L2 speakers showed reduced and more variable second-stage predictions, suggesting they were impacted during the more demanding second-stage prediction. This may indicate a wait-and-see strategy to help reduce costs associated with second-stage prediction.
This study examined the relationship between intelligibility and comprehensibility in second language speech. Four extended speech samples from 50 speakers spanning a wide range of proficiency were drawn from archived test data. These samples were listened to by 570 English users, who provided comprehensibility ratings and transcriptions to measure intelligibility. The relationship between intelligibility and comprehensibility was strong (r = .81, ρ = .88) and nonlinear. A segmented regression model suggested a breakpoint for intelligibility scores (transcription accuracy) at 64%, below which speakers were perceived as uniformly hard to understand and above which increased intelligibility was strongly associated with higher comprehensibility.
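The idea of locating a breakpoint can be sketched with a simplified grid search: for each candidate breakpoint, fit one least-squares line to the points below it and another to the points at or above it, and keep the candidate with the smallest total squared error. This is a simplification, since segmented regression usually also constrains the two segments to meet at the breakpoint, and it is not the study's model. The data are synthetic (flat, then rising) and unrelated to the study's ratings.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns slope, intercept, SSE."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse

def find_breakpoint(xs, ys, candidates, min_points=3):
    """Return (breakpoint, total SSE) minimizing error over the candidates."""
    best = None
    for c in candidates:
        left = [(x, y) for x, y in zip(xs, ys) if x < c]
        right = [(x, y) for x, y in zip(xs, ys) if x >= c]
        if len(left) < min_points or len(right) < min_points:
            continue  # too few points to fit a line on one side
        total = fit_line(*zip(*left))[2] + fit_line(*zip(*right))[2]
        if best is None or total < best[1]:
            best = (c, total)
    return best

# Synthetic scores: flat below x = 62, rising linearly above it.
xs = list(range(0, 101, 5))
ys = [2.0 if x < 62 else 2.0 + 0.15 * (x - 62) for x in xs]

best = find_breakpoint(xs, ys, candidates=range(10, 95, 5))
print("estimated breakpoint:", best[0])
```

The estimate can only be as fine as the candidate grid, so a denser grid (or a continuous optimizer) would be needed for a more precise breakpoint.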
Verbal fluency (VF) tasks are used in cognitive assessments to detect early signs of neurodegenerative diseases like Alzheimer’s. This study aimed to assess the contribution of VF tasks with varying executive processing loads to the early identification of cognitive impairment in the preclinical stage of subjective cognitive decline (SCD). A total of 97 older adults were classified into three groups: healthy controls (HC), SCD and mild cognitive impairment (MCI). Participants completed phonemic, semantic, alternating and orthographic VF tasks. Education level significantly affected VF performance, with gender differences being inconsistent. The HC and SCD groups performed similarly in phonemic and semantic tasks but differed significantly in high-executive-load tasks, where SCD participants performed worse. MCI patients showed lower performance across all VF tasks. Discriminant and ROC analyses identified alternating and orthographic VF tasks as effective markers for distinguishing cognitive status, supporting their potential for early detection of Alzheimer’s disease.
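The ROC analysis mentioned above summarizes how well a task score separates two groups. A minimal sketch, assuming lower fluency scores indicate impairment: the area under the ROC curve equals the probability that a randomly chosen impaired participant scores lower than a randomly chosen control, with ties counted as one half. The scores below are invented for illustration and are not the study's data.

```python
def auc_lower_is_impaired(impaired, controls):
    """AUC as the share of (impaired, control) pairs where impaired < control."""
    wins = 0.0
    for i in impaired:
        for c in controls:
            if i < c:
                wins += 1.0
            elif i == c:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(impaired) * len(controls))

# Hypothetical word counts on one fluency task.
controls = [14, 12, 15, 11, 13]
impaired = [9, 12, 8, 10, 11]

auc = auc_lower_is_impaired(impaired, controls)
print("AUC:", auc)
```

An AUC of 0.5 would mean the task does not separate the groups at all, and values closer to 1 mean better discrimination.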
Zhuang (ISO 639-3, zha) is a group of languages belonging to the Tai language family (Diller, Edmondson & Luo 2008: 7), spoken by the Zhuang people, who form the largest minority group in China with a population of approximately 17 million. Most Zhuang speakers live in Guangxi Zhuang Autonomous Region, with over 14.4 million permanent residents. It is estimated that more than 20 million people speak a variety of Zhuang (Wei, Qin & Wei 2009: 7), including some other ethnic minorities, such as the Yao and Maonan, who live in the same regions as the Zhuang. A small number of Zhuang speakers inhabit regions in provinces adjacent to Guangxi, like Wenshan (Yunnan province) and Lianshan, in the northwest of Guangdong province (see the map in Figure 1).
We tested masked morphological priming effects with prefixed and suffixed words in L2 speakers of German with L1 Turkish, a language in which prefixes are virtually absent. We found weaker prefixation than suffixation priming, suggesting that cross-linguistic morphological differences between speakers’ L1 and L2 may influence L2 morphological processing. We additionally compared our findings to those of a previous study involving L1 Russian-L2 German speakers and L1 German speakers (Ciaccio & Clahsen (2020). Variability and consistency in first and second language processing: A masked morphological priming study on prefixation and suffixation. Language Learning, 70(1), 103–136). The magnitude of prefixation versus suffixation priming of our group was significantly larger than that reported for the L1 Russian-L2 German group, further corroborating the cross-linguistic hypothesis. However, we found no significant difference between our group and L1 German speakers. Therefore, we additionally consider the hypothesis of a general processing disadvantage for prefixed words as an alternative explanation. We conclude that several factors may contribute to why prefixation, in some studies, proves to be more challenging than suffixation, cross-linguistic influences being possibly just one of them.
This State-of-the-Art review examines second language (L2) writing assessment research over the past 25 years through a framework of fairness, justice, and criticality. Recognizing the socio-political implications of assessment, the authors argue for a shift toward more equitable and socially conscious approaches. Drawing from a corpus of 869 peer-reviewed articles across leading journals, the review identifies five major themes: (1) features of writing performance, (2) rating and scoring, (3) integrated assessment, (4) teacher and learner perspectives, and (5) feedback. Each theme is reviewed for foundational findings, then critiqued through questions related to fairness and justice using a critical lens. The authors advocate for a multilingual turn in writing assessment, greater attention to teacher and student voices, and questioning dominant norms embedded in assessment practices. The review concludes with a call for future research to engage with fairness, justice, and criticality in both theory and practice, ensuring that writing assessments serve as tools for empowerment rather than exclusion.