Tests of Homogeneity of Means and Covariance Matrices for Multivariate Incomplete Data

Published online by Cambridge University Press: 01 January 2025

Kevin H. Kim*
Affiliation:
University of California, Los Angeles
Peter M. Bentler
Affiliation:
University of California, Los Angeles
*Requests for reprints should be sent to Kevin H. Kim, Department of Psychology, UCLA, Box 951563, Los Angeles, CA 90095-1563. E-mail: kevinkim@ucla.edu.

Abstract

Existing test statistics for assessing whether incomplete data represent a missing completely at random sample from a single population are based on a normal likelihood rationale and effectively test for homogeneity of means and covariances across missing data patterns. The likelihood approach cannot be implemented adequately if a pattern of missing data contains very few subjects. A generalized least squares rationale is used to develop parallel tests that are expected to be more stable in small samples. Three factors were varied for a simulation: number of variables, percent missing completely at random, and sample size. One thousand data sets were simulated for each condition. The generalized least squares test of homogeneity of means performed close to an ideal Type I error rate for most of the conditions. The generalized least squares test of homogeneity of covariance matrices and a combined test performed quite well also.
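The concern about sparse missing data patterns can be illustrated with a minimal sketch. It is not the authors' simulation design or their test statistic. It only generates data under a missing completely at random (MCAR) mechanism, groups cases by missing data pattern, and lists the patterns that contain very few subjects. The function names (`simulate_mcar`, `pattern_counts`, `sparse_patterns`), the use of independent standard-normal variables, and the `min_cases` threshold are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_mcar(n, p, rate, seed=0):
    """Draw n cases on p independent standard-normal variables, then delete
    each value independently with probability `rate` (an MCAR mechanism).

    Illustrative assumption: the variables are uncorrelated and deletion is
    independent per cell, so a case can lose all of its values.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        data.append([None if rng.random() < rate else x for x in row])
    return data


def pattern_counts(data):
    """Count cases per missing data pattern; True marks an observed value."""
    return Counter(tuple(x is not None for x in row) for row in data)


def sparse_patterns(counts, min_cases):
    """Return the patterns observed in fewer than `min_cases` cases."""
    return {pattern: k for pattern, k in counts.items() if k < min_cases}


if __name__ == "__main__":
    data = simulate_mcar(n=200, p=4, rate=0.2, seed=1)
    counts = pattern_counts(data)
    # List patterns from most to least frequent.
    for pattern, k in sorted(counts.items(), key=lambda item: -item[1]):
        print(pattern, k)
    # The threshold of 5 is a hypothetical cutoff, not a value from the paper.
    print("patterns with fewer than 5 cases:", len(sparse_patterns(counts, 5)))
```

With p variables there are up to 2^p possible patterns, so a moderate amount of missing data can leave many patterns with only a handful of cases. Within-pattern means and covariance matrices are poorly determined in such patterns, which is the small-sample difficulty the abstract describes.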

Type
Articles
Copyright
Copyright © 2002 The Psychometric Society

Footnotes

Preliminary results on this research were presented at the 1999 Western Psychological Association convention, Irvine, CA, and in the UCLA Statistics Preprint No. 265 (http://www.stat.ucla.edu). The assistance of Ke-Hai Yuan and several anonymous reviewers is gratefully acknowledged.
