
7 - Comparing Logistic Regression, Multinomial Regression, Classification Trees and Random Forests Applied to Ternary Variables

Three-Way Genitive Variation in English

from Part III - Perspectives on Multifactorial Methods

Published online by Cambridge University Press: 06 May 2022

Ole Schützler, Universität Leipzig
Julia Schlüter, Universität Bamberg

Summary

The authors apply logistic regression, multinomial regression, classification trees and random forests to a ternary outcome variable: the variation between the ’s-genitive, the of-genitive and functionally equivalent noun + noun combinations. The statistical approaches discussed fall into regression models on the one hand and classification trees on the other. Specifically, as an alternative to successive binomial regression analyses, the authors implement a multinomial model, which can analyse the entire dataset with all three outcome categories simultaneously. Further, a basic classification tree is calculated alongside a more complex (and more robust) random forest. The chapter not only weighs the advantages and shortcomings of all four models but also explicates the different rationales and interpretations that come with them. As a major insight, it emerges that the nature of the dataset, the analytic purpose and the statistical model are interdependent and condition each other in several non-trivial respects.
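
As a rough illustration of what it means for a multinomial model to handle three outcome categories simultaneously, the standard baseline-category logit formulation can be written as below. This is a textbook sketch, not necessarily the parameterisation used in the chapter: the coding of the outcomes as Y = 0, 1, 2, the choice of Y = 0 as the reference category, and the symbols x and β_k are assumptions made here for illustration.

```latex
% Assumption: Y takes the values 0, 1, 2 (one per genitive variant),
% with Y = 0 as an arbitrarily chosen reference category.
% x is the vector of predictors; beta_k is the coefficient vector for outcome k.
\[
\log\frac{P(Y = k \mid \mathbf{x})}{P(Y = 0 \mid \mathbf{x})}
  = \mathbf{x}^{\top}\boldsymbol{\beta}_k, \qquad k = 1, 2
\]
% The two log-odds equations jointly imply probabilities that sum to one:
\[
P(Y = k \mid \mathbf{x})
  = \frac{\exp(\mathbf{x}^{\top}\boldsymbol{\beta}_k)}
         {1 + \sum_{j=1}^{2}\exp(\mathbf{x}^{\top}\boldsymbol{\beta}_j)},
\qquad
P(Y = 0 \mid \mathbf{x})
  = \frac{1}{1 + \sum_{j=1}^{2}\exp(\mathbf{x}^{\top}\boldsymbol{\beta}_j)}
\]
```

In this formulation both coefficient vectors are estimated in a single fit to the full dataset, whereas successive binomial regressions each model one binary contrast at a time.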

Type: Chapter
Information: Data and Methods in Corpus Linguistics: Comparative Approaches, pp. 194–223
Publisher: Cambridge University Press
Print publication year: 2022


