
Three Steps Towards Robust Regression

Published online by Cambridge University Press:  01 January 2025

Howard Wainer*
Affiliation:
The University of Chicago
David Thissen
Affiliation:
The University of Chicago
* Requests for reprints should be sent to Howard Wainer, 5848 University Avenue, Chicago, Illinois 60637.

Abstract

The three most commonly used statistics, the arithmetic mean, variance, and the product-moment correlation, are most unfortunate choices when data are not strictly Gaussian. A new measure of correlation and a measure of scale are proposed which are substantially more robust than their least squares counterparts. An illustration shows how increased robustness can be obtained through the use of equal regression weights without severe loss in accuracy. The paper also shows how incorporating knowledge about the theoretical structure of the regression coefficients into their estimation can aid substantially in increasing their robustness.
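The abstract's opening claim, that the mean, the variance, and the product-moment correlation behave badly once data depart from the Gaussian case, can be illustrated with a small sketch. The code below is not the authors' method: the measure of correlation and the measure of scale proposed in the paper are not defined on this page, so the median, the median absolute deviation, and Spearman's rank correlation are used here as stand-in resistant statistics. The data are invented for the illustration, and the helper names (`pearson`, `spearman`, `mad`, `ranks`) are hypothetical.

```python
from statistics import mean, median, pstdev


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(v):
    """Ranks 1..n. Assumes no ties, which holds for the toy data below."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman(x, y):
    """Rank correlation: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def mad(v):
    """Median absolute deviation from the median (unscaled)."""
    m = median(v)
    return median([abs(a - m) for a in v])


# Invented toy data: a nearly linear relation between x and y.
x = list(range(1, 11))
y_clean = [1.2, 1.9, 3.1, 4.0, 5.2, 5.8, 7.1, 8.0, 9.1, 9.9]

# The same data with one wild value, e.g. a recording error.
y_bad = y_clean[:-1] + [-50.0]

if __name__ == "__main__":
    for label, y in (("clean", y_clean), ("one outlier", y_bad)):
        print(label)
        print(f"  mean={mean(y):.2f}  median={median(y):.2f}")
        print(f"  sd={pstdev(y):.2f}  MAD={mad(y):.2f}")
        print(f"  pearson={pearson(x, y):.3f}  spearman={spearman(x, y):.3f}")
```

In this toy case a single bad observation out of ten reverses the sign of the product-moment correlation and inflates the standard deviation several times over, while the median and the MAD move only slightly. The rank correlation is also pulled down, though it stays positive, which is a reminder that resistance is a matter of degree.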

Type
Original Paper
Copyright
Copyright © 1976 The Psychometric Society

Footnotes

Various aspects of the research reported here were supported by: NICH and HD grant 1 R01 HD08896-01 AFY to The University of Chicago, Howard Wainer, Principal Investigator; The Social Sciences Divisional Research Grants of the University of Chicago; NIH grant HD-04660 and contract NICHD-72-2735 to the Fels Research Institute, A. F. Roche, Principal Investigator.

We wish to thank John W. Tukey for his initial help and continuing interest. The measure of correlation (rt) and the resistant measure of scale herein proposed stem directly from a suggestion by Professor Tukey. Additionally, we wish to acknowledge the help and useful suggestions of W. Kruskal, A. F. Roche, J. Kettenring, and R. Gnanadesikan.
