This article theorizes the concept ‘ethnolinguistic infusion’ as a language socialization and language management practice. Infusion involves community members incorporating fragments of their group language, in which most members have little or no competence, in the context of a different dominant language, with the potential effect of fostering ideological links among the individual, group, and language. I explain the metaphor, enumerate several characteristics, and offer a categorization of different types of infusion. I contextualize ethnolinguistic infusion among related constructs in language contact, sociolinguistics, and linguistic anthropology, including translanguaging, postvernacularity, and metalinguistic communities; I explain its relationship to ethnolinguistic repertoire; and I distinguish it from out-group-initiated phenomena like crossing and mock language. I demonstrate how ethnolinguistic infusion plays out in my research on American Jewish summer camps. I offer empirical questions for future research, and I conclude by arguing for the utility of ethnolinguistic infusion, both for academic analysis and for language activism. (Language and ethnicity, heritage language, symbolic language, emblematic language, language and group identity, Hebrew, infusion, loanwords, language contact, translanguaging, metalinguistic community, postvernacularity, endangered languages, language reclamation, language revitalization)
The exploration and retrieval of information from large, unstructured document collections remain challenging. Unsupervised techniques, such as clustering and topic modeling, provide only a coarse overview of thematic structure, while traditional keyword searches often require extensive manual effort. Recent advances in large language models and retrieval-augmented generation (RAG) introduce new opportunities by enabling focused retrieval of relevant documents or chunks tailored to a user’s query. This allows for dynamic, chat-like interactions that streamline exploration and improve access to pertinent information. This article introduces Topic-RAG, a chat engine that integrates topic modeling with RAG to support interactive and exploratory document retrieval. Topic-RAG uses BERTopic to identify the most relevant topics for a given query and restricts retrieval to documents or chunks within those topics. This targeted strategy enhances retrieval relevance by narrowing the search space to thematically aligned content. We apply the pipeline to 4,711 articles related to nuclear energy from the Impresso historical Swiss newspaper corpus. Our experimental results demonstrate that Topic-RAG outperforms a baseline RAG architecture that does not incorporate topic modeling, as measured by widely recognized metrics, such as BERTScore (including Precision, Recall and F1), ROUGE and UniEval. Topic-RAG also achieves improvements in computational efficiency for both single and batch query processing. In addition, we performed a qualitative analysis in collaboration with domain experts, who assessed the system’s effectiveness in supporting historically grounded research. Although our evaluation is focused on historical newspaper articles, the proposed approach applies more generally: it integrates topic information to enhance retrieval performance within a transparent and user-configurable pipeline.
It supports the targeted retrieval of contextually rich and semantically relevant content while also allowing users to adjust key parameters such as the number of documents retrieved. This flexibility provides greater control and adaptability to meet diverse research needs in historical inquiry, literary analysis and cultural studies. Due to copyright restrictions, the raw data cannot be publicly shared. Data access instructions are provided in the repository, and the replication code is available on GitHub: https://github.com/KeerthanaMurugaraj/Topic-RAG-for-Historical-Newspapers.
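The topic-restricted retrieval step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a toy bag-of-words vector stands in for real embeddings, topic labels are assumed to have been assigned beforehand (the article uses BERTopic for this), and the names `CHUNKS`, `embed`, `retrieve`, `n_topics` and `top_k` are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical chunks; the "topic" labels are assumed to come from a topic model.
CHUNKS = [
    {"id": 0, "topic": "plant_construction", "text": "new reactor construction approved near the river"},
    {"id": 1, "topic": "plant_construction", "text": "reactor construction costs rise again"},
    {"id": 2, "topic": "protest", "text": "citizens protest against the reactor site"},
    {"id": 3, "topic": "waste", "text": "storage of radioactive waste debated in parliament"},
]

def topic_centroids(chunks):
    """Sum the vectors of all chunks that share a topic."""
    centroids = {}
    for chunk in chunks:
        centroids.setdefault(chunk["topic"], Counter()).update(embed(chunk["text"]))
    return centroids

def retrieve(query, chunks, n_topics=1, top_k=2):
    """Pick the n_topics closest topics, then rank only the chunks inside them."""
    q = embed(query)
    centroids = topic_centroids(chunks)
    best_topics = sorted(centroids, key=lambda t: cosine(q, centroids[t]), reverse=True)[:n_topics]
    candidates = [c for c in chunks if c["topic"] in best_topics]
    ranked = sorted(candidates, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:top_k]

if __name__ == "__main__":
    for hit in retrieve("reactor construction costs", CHUNKS):
        print(hit["id"], hit["text"])
```

In this sketch `n_topics` and `top_k` play the role of the user-adjustable parameters the abstract mentions; the retrieved chunks would then be passed to a language model as context, a step omitted here.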
This study explored the acquisition of Spanish nominal morphology in 116 children aged 4;0 to 6;11, grouped according to language ability (developmental language disorder [DLD] and typical development [TD]) and bilingualism (Spanish–English bilingual and Spanish monolingual). Monolinguals produced more target-like articles and direct object clitics than bilinguals, as did children with TD compared to peers with DLD. Bilinguals with TD produced more target-like morphology than monolinguals with DLD, particularly clitics. Children with DLD were more likely to omit clitics than peers with TD, but this contrast did not extend to bilinguals compared to monolinguals. Children produced singular default articles in plural contexts. Overall, our results suggest that clitics function better than articles for identifying DLD in bilinguals on quantitative and qualitative grounds.
This article provides an overview of key challenges in second language (L2) pronunciation learning and teaching within the context of instructed second language acquisition (SLA), with the goal of identifying promising directions for future research. It begins by examining persistent difficulties in L2 pronunciation instruction, such as the typically limited quality of input and the dominant emphasis on grammar and vocabulary in communicative language teaching (CLT). These conditions often result in learners having limited awareness of their pronunciation needs and teachers facing challenges in incorporating pronunciation instruction into CLT-based curricula. The article then reviews emerging instructional approaches that aim to integrate attention to phonetic form within CLT, highlighting the need for further empirical investigation. In addition, several pronunciation training techniques, some underexplored (HVPT, shadowing, embodied pronunciation training, captioned video, accent imitation, and pronunciation self-assessment), are briefly described, with an emphasis on their pedagogical potential both inside and outside the classroom. Finally, the article considers the role of individual differences in L2 pronunciation development and proposes directions for future research in instructed SLA.
We present Digital Collections Explorer, a web-based, open-source exploratory search platform that leverages Contrastive Language-Image Pre-training for enhanced visual discovery of digital collections. Our Digital Collections Explorer can be installed locally and configured to run on a visual collection of interest on disk in just a few steps. Building upon recent advances in multimodal search techniques, our interface enables natural language queries and reverse image searches over digital collections with visual features. This article describes the system’s architecture, implementation and application to various cultural heritage collections, demonstrating its potential for democratizing access to digital archives, especially those with impoverished metadata. We present case studies with maps, photographs and PDFs extracted from web archives in order to demonstrate the flexibility of the Digital Collections Explorer, as well as its ease of use. We demonstrate that the Digital Collections Explorer scales to hundreds of thousands of images on a MacBook Pro with an M4 chip. Lastly, we host a public demo of Digital Collections Explorer.
This article aims to explain how passive participles used as prenominal modifiers developed their eventive nature throughout the history of English. It is argued that prenominal participles first expressed stative result states in Old English (OE) and came to express perfect result states later on. The locus of required resultativity in participles was the inner aspect head in OE, while in Early Middle English (EME), it shifted to the outer aspect head. This shift was triggered by the loss of OE aspectual prefixes, which generally functioned to perfectivize or transitivize the verb by affecting its (internal) argument and assigning a change-of-state meaning to the verb. The shift made participial formation less constrained, so that it became possible for prenominal participles to express perfect resultative meanings, which in turn gave rise to their eventive meanings.
Utterance-final weakening refers to a prosodic feature found at the right periphery of some clauses in Pite Saami. This paper provides the most thorough general description of this prosodic phenomenon to date. The dataset used comes from an annotated corpus of spontaneous speech collected during the last 60 years. The phonetic-acoustic correlates are a complete devoicing of all segments in the final syllables of the affected clause, although creaky or breathy voice may also be present. Typically only one syllable is affected, but sometimes multiple syllables are affected. No syntactic units appear to correlate with this, and the weakening phase can even cross word boundaries. The phenomenon marginally correlates with gender, dialect, and age, with the speech of older speakers tending to feature it more frequently and with a longer prosodic scope. Similar utterance-final weakening phenomena are likely found in other languages, especially those in surrounding areas.
Half a century ago, Noam Chomsky posited that humans have specific innate mental abilities to learn and use language, distinct from other animals. This book, a follow-up to the author's previous textbook, A Mind for Language, continues to critically examine the development of this central aspect of linguistics: the innateness debate. It expands upon key themes in the debate - discussing arguments that come from other disciplines, such as psychology, anthropology, sociology, criminology, computer science, formal language theory, neuroscience, genetics, animal communication, and evolutionary biology. The innateness claim also leads us to ask how human language evolved as a characteristic trait of Homo sapiens. Written in an accessible way, assuming no prior knowledge of linguistics, the book guides the reader through technical concepts, and employs concrete examples throughout. It is accompanied by a range of online resources, including further material, a glossary, discussion points, questions for reflection, and project suggestions.
The account of extraction using only generalized context-free phrase structure (put forth in a series of papers by Gazdar in the late 1970s and early 1980s and then codified in Generalized Phrase Structure Grammar) used slash as a feature to indicate that there was something missing in wh-extraction constructions. Although this was (deliberately) reminiscent of the slash of Categorial Grammar (CG) (which encodes argument selection), they treated it as distinct from the CG slash. Subsequent work by Steedman proposed to unite them. This paper argues, first, that Gazdar et al. were correct to treat the two differently. Second, I advocate a natural view of syntactic categories under the CG world view. Thus, we take the function categories of CG to correspond to functions on strings, and with this we preclude what I call S-crossing composition, used in many CG analyses. With this in mind, we suggest that rightward extraction as in Right Node Raising really is function composition, while wh-extraction should be handled by something much closer to the account in Gazdar et al. The two behave differently under coordination chains involving a silent ‘and’ or ‘or’. This behavior provides evidence that the two should be kept distinct (see also work by Oehrle for this point), while providing striking evidence for the view of syntactic categories advocated here.
Chapter 4 maps prototypical features of onomatopoeia by means of extensive empirical data on 124 sample languages, bearing on the phonological, morphological, syntactic, word-formation, semantic, and sociopragmatic characteristics. The prototypical features are identified primarily, but not exclusively, on the basis of the markedness theory. It is postulated that the defining properties of onomatopoeias are marked relative to the properties of the general non-onomatopoeic word-stock in the sense of aberration from the latter’s properties. This claim does not mean that all onomatopoeias in natural languages or all onomatopoeias in a given language are marked in all the defining characteristics. Their prototypical characteristics should be viewed in the sense that onomatopoeia as a class of words has the capacity to deviate from the characteristics common to the non-onomatopoeic word-stock. It is postulated that these prototypical, defining features of onomatopoeia are marked features in the majority of languages, even if the languages differ in the degree of manifestation of these features at individual levels of language description. The discussion is supported by numerous examples.
This chapter consists of a transcription of a fictitious forum discussion in which a number of fictitious scholars participated, including some very surprising participants. The wide-ranging discussion covers the topics discussed throughout this book, and the chapter ends with the conclusion that the nature–nurture debate is still a vibrant one in which we are seeking to understand the interplay between the nurturing experience and the role of nature, whether in the form of an innate biological endowment or in the form of natural factors that go beyond the realm of the human mind.
Chapter 2 identifies various types of sound-related words in order to define the scope of onomatopoeia and the place of onomatopoeia in the system of language. It argues in favour of its ‘narrow’ definition by reserving this notion exclusively for direct sound imitation to distinguish onomatopoeia from signs based on cross-modal iconicity, from interjections, and from onomatopoeia-based derivatives and semantic shifts. This chapter also illustrates the different status and functions of onomatopoeic words in the sample languages.