The Digital Archive of Southern Speech (DASS) is a subset of sixty-four recorded interviews collected as part of the Linguistic Atlas Project from 1970 to 1983. Full transcriptions of all DASS interviews have been produced by the Atlas Project at the University of Georgia; however, these transcriptions have until now existed only as untagged text files, making them unsuitable for detailed corpus analysis. We discuss the process of preparing the DASS transcriptions to be uploaded to a Corpus Workbench (CWB) server, as well as a functionality test of the resulting corpus through analysis of the Southern American English feature reckon. Syntactic analysis indicates that reckon is more grammaticalized than the epistemic verbs think or believe in Southern American English, and sociolinguistic results show that the use of reckon is highly stratified by age, dropping off sharply among speakers born after 1917. Its use also decreases as socioeconomic class and education increase. These findings provide evidence of the growing stigmatization of reckon as a Southern lexeme over time, as well as its association with nonstandard speech. This chapter serves as an example of secondary data analysis of an existing dataset and demonstrates the utility of the DASS corpus for such research.
Recent studies in Construction Grammar have suggested that contracted modals constitute different constructions from their full forms. In this article, we present a corpus-based analysis of the relationship between the modal forms going to and gonna in British English used on the blogging platform LiveJournal. We report a Collostructional Analysis and a Behavioural Profile Analysis based on a logistic regression model of blind annotations, assessing factors of semantic, pragmatic and social meaning on the choice of the variant, in addition to processing factors. The results show that register formality is the only significant meaning predictor for the alternation between going to or gonna in the corpus. We discuss these results in light of recent theoretical debates on isomorphism and synonymy avoidance in Construction Grammar: specifically, our study provides evidence that social meaning drives the distinction between going to and gonna, validating the recently formulated Principle of No Equivalence, and providing further evidence for the constructionhood of contracted modals.
The New Cambridge History of the English Language is aimed at providing a contemporary and comprehensive overview of English, tracing its roots in Germanic and investigating the contact scenarios in which the language has been an active participant. It discusses the various models and methodologies which have been developed to analyse diachronic data concisely and consistently. The new history furthermore examines the trajectories which the language has embarked on during its spread worldwide and presents overviews of the varieties of English found throughout the world today.
This chapter explores the role of central aspects of cognition in historical linguistics. After describing and discussing the cognitive commitment and its theoretical background, this chapter highlights the relation to cognitive archaeology as well as historical psychology and explores the methodological prerequisites for cognitive approaches to the history of English, particularly the quantitative turn in cognitive linguistics. Case studies from different periods of English illustrate how cognitive factors can shed light on synchronic historical language stages and diachronic developments, and how these in turn can help us to further explore the cognitive commitment. Finally, we argue for a feedback loop, where modern cognitive linguistic theories feed into and guide historical enquiries, but are also checked and modified, if necessary, on the basis of historical findings.
In this chapter, we address some ways in which the use of corpora has revolutionised the study of the history of English. We first account for the development of historical corpora of English and discuss advantages and drawbacks associated with different corpus sizes. We also address types of language use that are not well represented in existing corpora, potential clashes between comparability and representativity, and features such as tagging and spelling normalisation. We then consider contributions that historical corpora have made to specific linguistic fields, notably in variationist studies, historical sociolinguistics and historical pragmatics, and illustrate historical corpus methodology by presenting a case study on sentence-initial ‘and’ in Late Modern English based on the Corpus of Historical American English (COHA). We conclude the chapter with a list of desiderata for future corpus-based research on the history of the English language.
Chapter 3 considers different approaches to data collection. Three case studies are included. The first study involves a purpose-built corpus of news articles about obesity. We focus on theoretical considerations attending to corpus design, as well as practical challenges involved in processing texts provided by repositories such as LexisNexis to make them amenable to corpus analysis. The second study focuses on how corpus linguists might work with existing datasets, in this case, transcripts collected by research collaborators conducting ethnographic research in Australian Emergency Departments. We discuss the ways in which data collected for the purposes of different kinds of analysis is likely to require some pre-processing before it becomes suitable for corpus-based analysis. The third study is concerned with the creation of a corpus of anti-vaccination literature from Victorian England. We discuss the challenges involved in sourcing historical material from existing databases, selecting a principled set of potential texts for inclusion, and using optical character recognition (OCR) software to convert the texts into a format that is appropriate for corpus tools.
Chapter 1 introduces the context and aims of the book, and provides a brief introduction to corpus linguistics for readers unfamiliar with it. It finishes by providing a chapter-by-chapter overview of the book.
Chapter 11 introduces the concept of legitimation in discourse and considers how it might function, and be studied, in the context of health(care) communication. First, we look at how contributors to the online parenting forum Mumsnet use labels denoting attitudes towards vaccinations. We point out how labels that involve opposition to vaccinations, such as ‘anti-vaxxer’, tend to collocate with negation, and then consider how people justify negating the applicability of the label to themselves. This reveals a range of different concerns around vaccinations. We then draw on a study of patient feedback in which we examined how patients legitimate their perspectives and the evaluations they gave in their feedback. For example, patients represented themselves as experienced users of healthcare services. Additionally, some patients used aspects of their identities to position themselves as requiring attention, while others used techniques such as employing second person pronouns to imply that their experiences could be generalised to other patients.
Chapter 7 considers how language change over short timespans can be examined using corpus-assisted methods. We present three case studies. The first study involves a corpus of patient feedback relating to cancer care, collected for four consecutive years. A technique called the coefficient of variation was used to identify lexical items that had increased or decreased over time. The second study considered UK newspaper articles about obesity. To examine changing themes over time, we employed a combination of keyness and concordance analyses to identify which themes in the corpus were becoming more or less popular over time. Additionally, the analysis considered time in a different way, by using the concept of the annual news cycle. To this end, the corpus was divided into 12 parts, consisting of articles published according to a particular month, and the same type of analysis was applied to each part. The third case study involves an analysis of a corpus of forum posts about anxiety. Time was considered in terms of the age of the poster and in terms of the number of contributions that a poster had made to the forum, and differences were found depending on both approaches to time.
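As a rough illustration of the coefficient of variation mentioned above, the following minimal sketch computes it for each word across a series of yearly relative frequencies and ranks words by how much their use fluctuates. The word list and frequency values are hypothetical, and the choice of the population standard deviation is an assumption; the abstract does not state which variant the study used. A high value only flags a word as unstable over time, so the direction of change would still need to be checked against the yearly figures.

```python
from statistics import mean, pstdev


def coefficient_of_variation(freqs):
    """Return the standard deviation divided by the mean, as a percentage."""
    m = mean(freqs)
    if m == 0:
        # A word that never occurs has no meaningful variation.
        return 0.0
    # Assumption: population standard deviation (pstdev), not sample (stdev).
    return pstdev(freqs) / m * 100


# Hypothetical relative frequencies (per million words) for four consecutive years.
yearly_freqs = {
    "waiting": [100.0, 100.0, 100.0, 100.0],
    "online": [10.0, 20.0, 30.0, 40.0],
}

cv_by_word = {word: coefficient_of_variation(f) for word, f in yearly_freqs.items()}

# Words with the highest coefficient of variation come first.
ranked = sorted(cv_by_word, key=cv_by_word.get, reverse=True)

for word in ranked:
    print(f"{word}: {cv_by_word[word]:.2f}")
```

Relative rather than raw frequencies are used here so that yearly subcorpora of different sizes remain comparable.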
Chapter 6 shows how it is possible to use demographic metadata to study identities in health-related corpora. We present two case studies, based on research on patient feedback on NHS services in England. The first study compares how cancer patients of different age and sex groups evaluate healthcare services and, specifically, how they use distinct linguistic and rhetorical strategies to do this. The corpus was encoded with demographic metadata which allowed the researchers to explore the language used by people of different age and sex identity groups. For the second study, a different corpus of more general patient feedback was used, one which did not contain demographic information metadata. Instead, targeted searches were used to identify patients’ demographic characteristics based on cases where they made those characteristics explicit within their feedback. In contrasting these case studies, we also evaluate the two different approaches taken, considering the affordances and limitations of both. Taken together, the case studies demonstrate how language and identity can be explored in corpora with and without reliable demographic metadata.
Chapter 5 is concerned with sequential aspects of health-oriented interactions and the challenges this poses for corpus research. Two case studies demonstrate how conventional corpus procedures can be augmented with other linguistic approaches to facilitate a critical examination of the relationships between parts of the data that might otherwise be separated in corpus analysis. The first study is an investigation of a thread from an online forum dedicated to cancer – one that is explicitly dedicated to irreverent verbal play. We show how a corpus approach enabled the identification of humorous metaphors and helped us reveal recurrent lexical and grammatical features that facilitate discussion around sensitive topics, enable a coherent identity, and contribute to a sense of community. In the second study we use an approach that was originally applied to the Spoken BNC 2014 corpus to examine interactional data in terms of functional discourse units. We apply this coding framework to a sample of anxiety support forum data in order to document, quantify, and evaluate how various communicative purposes are formulated in forum posts and are met with different types of response.
Chapter 13 presents a synthesis of the previous chapters, beginning by asking the question – what have our experiences taught us about health communication that we didn’t know? We go on to examine lessons we learnt about carrying out corpus-based research on health communication, offering practical advice and tips for people who might be carrying out similar kinds of studies to the ones described in this book. We then consider the limitations of a corpus-based approach and end by looking to the future – what changes have taken place since we completed our analyses? What kinds of developments in the field of healthcare and in corpus linguistic analysis have occurred recently? And what avenues of research into health care do we believe are potentially interesting to investigate next?
Chapter 2 is concerned with research questions. We discuss the different processes through which research questions can be identified and developed in corpus-based research on health communication. Three case studies are considered. The first study involved the analysis of press representations of obesity. In this study, the researchers developed their own research questions in a variety of ways, including by drawing from the non-linguistic literature on obesity. The second study focused on the McGill Pain Questionnaire – a well-known language-based diagnostic tool for pain. A pain consultant asked the researchers if they could help understand why some patients find it difficult to respond to some sections of the questionnaire. In response, the researchers formulated a series of questions that could be answered using corpus linguistic tools, and identified some issues with the questionnaire that address the pain consultant’s concerns. The third study involved the analysis of patient feedback on the UK’s National Health Service. The researchers were approached by the NHS Feedback Team and given 12 questions that they were commissioned to answer by means of corpus linguistic methods.
Chapter 12 discusses the potential opportunities and challenges associated with disseminating the findings of corpus-based approaches to health communication, which also apply more generally to interdisciplinary research and collaborations between researchers and non-academic stakeholders. We include two case studies. The first case study involves work on patient feedback with members of the NHS who had provided a list of questions for us to work on. We discuss the importance of and challenges around building and maintaining relationships with members of this large, changing organisation, as well as outlining how we approached dissemination of findings, both in academic and non-academic senses, and the extent to which we were able to achieve impact. The second case study considers our experiences of disseminating findings from a project on metaphors and cancer, focussing particularly on writing for a healthcare journal, dealing with the media, and going beyond corpus data to create a metaphor-based resource for communication about cancer.
Chapter 4 considers ethical issues in healthcare communication research through two case studies. The first case study looks at a relatively straightforward situation involving a study of the Pain Concern online forum. Data from the forum was provided by Health Unlocked, a company that runs a large number of online communities related to health. One advantage of using their service was that Health Unlocked took care of relevant legal requirements concerning ethics and only shared data from contributors to the forum who had agreed for their posts to be used for research purposes. The second case study relates to the study of dementia and brings into focus the difficulties of working with multiple datasets and a range of stakeholders. The data collection for this project involved public health communication in terms of news media and external communications from support services, including social media. As such, it presents scenarios that are common to studies of health communication and thereby offers instruction in how to navigate related ethical concerns.
Chapter 10 demonstrates how corpus approaches support the study of various social actors. We include two case studies. The first study investigates how representations of people with obesity in the UK press contribute to stigmatisation. The analysis orients around the naming strategies to collectively and individually refer to people with obesity, as well as the adjectives used to describe them and the activities that they are reported to be involved in. Furthermore, we show that people with obesity are regularly held up as figures of ridicule and obesity is discussed in the context of social deviance, foregrounded when reporting on perpetrators of crimes. The second study uses a tailor-made annotation system to discuss referential strategies, descriptions of traits and the capacity to carry out different kinds of actions in the context of voice-hearing, to critically consider the different degrees to which people who experience psychosis personify their voices. We track these representations in the reports of those with lived experience over time and consider the implications of a social actor model for therapeutic interventions to support those with chronic mental health issues.
Chapter 8 is concerned with the use of historical corpora in the study of language relating to health. We present two case studies – one where an issue is well understood and discussed publicly, the other where there was a clear issue with the framing of a discussion. For the former study we explore the VicVaDis corpus, first introduced in chapter 1. We combine different corpus techniques to show the main anti-vaccination arguments in the corpus and to point out parallels with present-day anti-vaccination discourse. The second case study looks at the emergence of venereal disease in the seventeenth century using the Early English Books Online corpus. By examining collocates of the word pox, we are able to separate relevant uses of the word (i.e., those which refer to venereal disease) from those which are not relevant. Additionally, we show that through the investigation of one type of collocate (words referring to geographical locations) the analysis was taken in an unexpected but rewarding direction.
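To make the idea of examining collocates concrete, here is a minimal sketch that counts the words occurring within a fixed window on either side of a node word. The function name, the window size, and the sample sentence are all hypothetical illustrations and are not taken from the Early English Books Online corpus or from the study itself. Real collocation analysis would normally also apply an association measure rather than rely on raw counts.

```python
from collections import Counter


def collocates(tokens, node, window=5):
    """Count tokens appearing within `window` positions of each `node` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            start = max(0, i - window)
            # Left context plus right context, excluding the node itself.
            context = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(context)
    return counts


# Hypothetical sample text, invented for illustration only.
text = "the french pox is a foul disease and the small pox is another"
tokens = text.split()

result = collocates(tokens, "pox", window=2)

for word, n in result.most_common():
    print(word, n)
```

In this toy example, a collocate such as "french" or "small" is the kind of cue an analyst could use to sort occurrences of the node word into different senses.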
Chapter 9 considers how the experience of illness is represented linguistically, focussing on two contexts. In the first case study, collocational patterns were examined in order to show how people represented the word anxiety. Different patterns around anxiety were grouped together in order to identify oppositional pairs of representation (e.g., medicalising/normalising). The second case study involved an examination of the ways in which cancer was constructed in a corpus of interviews with and online forum posts by people with cancer, family carers, and healthcare professionals. Using a combination of manual analysis and corpus searches, we considered how metaphors were used to convey a sense of empowerment or disempowerment in the experience of cancer. More specifically, the analysis of metaphors around cancer revealed insights into people’s identity construction and the relationships between doctors and patients.