How Should We Assess the Fit of Rasch-Type Models? Approximating the Power of Goodness-of-Fit Statistics in Categorical Data Analysis

Published online by Cambridge University Press: 01 January 2025

Alberto Maydeu-Olivares*
Affiliation:
Faculty of Psychology, University of Barcelona
Rosa Montaño
Affiliation:
Universidad de Santiago de Chile
* Requests for reprints should be sent to Alberto Maydeu-Olivares, Faculty of Psychology, University of Barcelona, P. Valle de Hebrón, 171, 08035 Barcelona, Spain. E-mail: amaydeu@ub.edu

Abstract

We investigate the performance of three statistics, R1, R2 (Glas in Psychometrika 53:525–546, 1988), and M2 (Maydeu-Olivares & Joe in J. Am. Stat. Assoc. 100:1009–1020, 2005, Psychometrika 71:713–732, 2006) to assess the overall fit of a one-parameter logistic model (1PL) estimated by (marginal) maximum likelihood (ML). R1 and R2 were specifically designed to target specific assumptions of Rasch models, whereas M2 is a general purpose test statistic. We report asymptotic power rates under some interesting violations of model assumptions (different item discrimination, presence of guessing, and multidimensionality) as well as empirical rejection rates for correctly specified models and some misspecified models. All three statistics were found to be more powerful than Pearson’s X^2 against two- and three-parameter logistic alternatives (2PL and 3PL), and against multidimensional 1PL models. The results suggest that there is no clear advantage in using goodness-of-fit statistics specifically designed for Rasch-type models to test these models when marginal ML estimation is used.
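
As an illustration of the model violations named in the abstract, the following minimal Python sketch writes the 1PL, 2PL, and 3PL item response functions as one logistic curve. It is not taken from the article: the function name `irf` and the parameter names `theta` (latent trait), `b` (item difficulty), `a` (discrimination), and `c` (guessing) are assumptions chosen for this example, and the common 1PL slope is simply fixed to 1 here.

```python
import math


def irf(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    Hypothetical helper for illustration only:
    a = 1, c = 0  -> 1PL (slope fixed to 1 for every item)
    free a, c = 0 -> 2PL (item-specific discrimination)
    free a, c > 0 -> 3PL (lower asymptote c models guessing)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# A respondent whose trait level equals the item difficulty.
p_1pl = irf(0.0, 0.0)          # 1PL
p_2pl = irf(1.0, 0.0, a=2.0)   # 2PL: steeper curve, evaluated above b
p_3pl = irf(0.0, 0.0, c=0.2)   # 3PL: guessing lifts the probability

print(p_1pl, p_2pl, p_3pl)
```

Under these assumptions, a 1PL fitted to data generated with unequal `a` or nonzero `c` is misspecified, which is the kind of departure the fit statistics are meant to detect.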

Type
Original Paper
Copyright
Copyright © The Psychometric Society

Footnotes

This research was supported by an ICREA-Academia Award and Grant SGR 2009 74 from the Catalan Government, and by Grants PSI2009-07726 and PR2010-0252 from the Spanish Ministry of Education awarded to the first author, and by a Dissertation Research Award of the Society of Multivariate Experimental Psychology awarded to the second author. The authors are indebted to the reviewers and to David Thissen for comments that improved the manuscript.

References

Agresti, A., & Yang, M. (1987). An empirical investigation of some effects of sparseness in contingency tables. Computational Statistics & Data Analysis, 5, 9–21.
Andersen, E.B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.
Bartholomew, D.J., & Leung, S.O. (2002). A goodness of fit test for sparse 2^p contingency tables. British Journal of Mathematical & Statistical Psychology, 55, 1–15.
Bartholomew, D., & Tzamourani, P. (1999). The goodness of fit of latent trait models in attitude measurement. Sociological Methods & Research, 27, 525–546.
Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika, 46, 443–459.
Bock, R.D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179–197.
Cai, L., Maydeu-Olivares, A., Coffman, D.L., & Thissen, D. (2006). Limited information goodness of fit testing of item response theory models for sparse 2^p tables. British Journal of Mathematical & Statistical Psychology, 59, 173–194.
Christoffersson, A. (1975). Factor analysis of dichotomized variables. Psychometrika, 40, 5–32.
De Leeuw, J., & Verhelst, N. (1986). Maximum likelihood estimation in generalized Rasch models. Journal of Educational and Behavioral Statistics, 11, 183–196.
Fischer, G.H., & Molenaar, I.W. (1995). Rasch models: foundations, recent developments and applications, New York: Springer.
Glas, C.A.W. (1988). The derivation of some tests for the Rasch model from the multinomial distribution. Psychometrika, 53, 525–546.
Glas, C.A.W., & Verhelst, N.D. (1989). Extensions of the partial credit model. Psychometrika, 54, 635–659.
Glas, C.A.W., & Verhelst, N.D. (1995). Testing the Rasch model. In Fischer, G.H., & Molenaar, I.W. (Eds.), Rasch models: foundations, recent developments and applications, New York: Springer, 69–96.
Glas, C.A.W. (2009). Personal communication.
Irtel, H. (1995). An extension of the concept of specific objectivity. Psychometrika, 60, 115–118.
Joe, H., & Maydeu-Olivares, A. (2010). A general family of limited information goodness-of-fit statistics for multinomial data. Psychometrika, 75, 393–419.
Jöreskog, K.G. (1994). On the estimation of polychoric correlations and their asymptotic covariance matrix. Psychometrika, 59, 381–389.
Jöreskog, K.G., & Moustaki, I. (2001). Factor analysis of ordinal variables: a comparison of three approaches. Multivariate Behavioral Research, 36, 347–387.
Koehler, K., & Larntz, K. (1980). An empirical investigation of goodness of fit statistics for sparse multidimensional tables. Journal of the American Statistical Association, 75, 336–344.
Kullback, S., & Leibler, R.A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.
Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores, Reading: Addison-Wesley.
Mathai, A.M., & Provost, S.B. (1992). Quadratic forms in random variables: theory and applications, New York: Marcel Dekker.
Maydeu-Olivares, A., & Joe, H. (2005). Limited and full information estimation and goodness-of-fit testing in 2^n tables: a unified approach. Journal of the American Statistical Association, 100, 1009–1020.
Maydeu-Olivares, A., & Joe, H. (2006). Limited information goodness-of-fit in multidimensional contingency tables. Psychometrika, 71, 713–732.
Maydeu-Olivares, A., & Joe, H. (2008). An overview of limited information goodness-of-fit testing in multidimensional contingency tables. In Shigemasu, K., Okada, A., Imaizumi, T., & Hoshino, T. (Eds.), New trends in psychometrics, Tokyo: Universal Academy Press, 253–262.
Maydeu-Olivares, A., & Liu, Y. (2012). Item diagnostics in multivariate discrete data. Manuscript under review.
Mavridis, D., Moustaki, I., & Knott, M. (2007). Goodness-of-fit measures for latent variable models for binary data. In Lee, S.-Y. (Ed.), Handbook of latent variables and related models, Amsterdam: Elsevier, 135–162.
McDonald, R.P. (1999). Test theory: a unified treatment, Mahwah: Lawrence Erlbaum.
Montaño, R. (2009). Una comparación de las estadísticas de bondad de ajuste R1 y M2 para modelos de la Teoría de Respuesta al Ítem [Comparing the R1 and M2 statistics for goodness of fit assessment in IRT models]. Unpublished Ph.D. dissertation, University of Barcelona.
Pfanzagl, J. (1993). A case of asymptotic equivalence between conditional and marginal maximum likelihood estimators. Journal of Statistical Planning and Inference, 35, 301–307.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests, Copenhagen: Paedagogiske Institut.
Reiser, M. (1996). Analysis of residuals for the multinomial item response model. Psychometrika, 61, 509–528.
Reiser, M. (2008). Goodness-of-fit testing using components based on marginal frequencies of multinomial data. British Journal of Mathematical & Statistical Psychology, 61, 331–360.
Satorra, A., & Saris, W.E. (1985). Power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50, 83–90.
Suárez-Falcon, J.C., & Glas, C.A.W. (2003). Evaluation of global testing procedure for item fit to the Rasch model. British Journal of Mathematical & Statistical Psychology, 56, 127–143.
Swaminathan, H., Hambleton, R.K., & Rogers, H.J. (2007). Assessing the fit of item response models. In Rao, C.R., & Sinharay, S. (Eds.), Psychometrics, Amsterdam: Elsevier, 683–718.
Teugels, J.L. (1990). Some representations of the multivariate Bernoulli and binomial distributions. Journal of Multivariate Analysis, 32, 256–268.
Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic models. Psychometrika, 47, 175–186.
van den Wollenberg, A.L. (1982). Two new test statistics for the Rasch model. Psychometrika, 47, 123–139.