Chapter 2 - What Big Data Tells Us about American English Phonetics

Published online by Cambridge University Press: 03 December 2025

Mikko Laitinen (University of Eastern Finland)
Paula Rautionaho (University of Eastern Finland)

Summary

In an NSF-funded project, we extracted around two million vowel tokens from a sample of sixty-four speakers (b. 1886–1965; 35M/29F; 16 African Americans/48 non-African Americans) across eight states in the American South. We validated the automatic measurements by manually inspecting samples of the alignments and found that 87 percent of alignments are successful and another 6 percent are partially successful. This large body of tokens (big data) complements existing sociophonetic research by providing a more thorough, detailed picture of the phonetics of American English. We report three findings. (1) There is a much wider range of realization for vowels than is typically represented. (2) There is no central tendency for any vowel: using spatial methods drawn from technical geography, we find that all distributions of tokens in vowel space are nonlinear. This suggests that the traditional reliance on finding average acoustic properties of a vowel may be unrepresentative of what most speakers actually do. (3) Distributional patterns for vowels are fractal: when we break up the overall dataset into subgroups (e.g., male/female), the same nonlinear distributional pattern appears but with varying locations of highest density of tokens. Together, these findings demonstrate methods by which variation can be both represented and analyzed.
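
The contrast the summary draws between an average and the location of highest token density can be illustrated with a minimal sketch. It is not the chapter's method: the authors use spatial point-pattern methods, whereas this example swaps in a simple grid count. The function names, the 50 Hz cell size, and the (F1, F2) values are all hypothetical and chosen only to show how a skewed cloud of tokens can have a mean that falls where few or no tokens actually occur.

```python
from collections import Counter
from statistics import mean


def grid_cell(f1, f2, cell_hz=50):
    """Map an (F1, F2) point to the lower corner of its grid cell."""
    return (int(f1 // cell_hz) * cell_hz, int(f2 // cell_hz) * cell_hz)


def densest_cell(tokens, cell_hz=50):
    """Return the grid cell holding the most tokens, with its token count."""
    counts = Counter(grid_cell(f1, f2, cell_hz) for f1, f2 in tokens)
    return counts.most_common(1)[0]


def centroid(tokens):
    """Return the mean F1 and mean F2 of all tokens."""
    return mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens)


# Hypothetical (F1, F2) values in Hz, NOT data from the chapter:
# a tight cluster plus a long tail of scattered realizations.
TOKENS = [
    (510, 1820), (520, 1830), (530, 1810), (540, 1840), (515, 1845),
    (560, 1760), (610, 1700), (650, 1650),
    (700, 1500), (780, 1400), (850, 1300), (900, 1200),
]

if __name__ == "__main__":
    cell, count = densest_cell(TOKENS)
    mean_f1, mean_f2 = centroid(TOKENS)
    print("densest cell:", cell, "tokens:", count)
    print("centroid:", round(mean_f1, 1), round(mean_f2, 1))
    print("centroid falls in cell:", grid_cell(mean_f1, mean_f2))
```

In this toy sample the densest cell holds five of the twelve tokens, while the cell containing the centroid holds none, which is the kind of mismatch the summary describes for skewed distributions.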

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2025
