On the Misuse of Manifest Variables in the Detection of Measurement Bias

Published online by Cambridge University Press: 01 January 2025

William Meredith
University of California, Berkeley
Roger E. Millsap*
Baruch College, City University of New York
* Requests for reprints should be sent to Roger E. Millsap, Department of Psychology, Baruch College, City University of New York, 17 Lexington Ave, New York, NY 10010.

Abstract

Measurement invariance (lack of bias) of a manifest variable Y with respect to a latent variable W is defined as invariance of the conditional distribution of Y given W over selected subpopulations. Invariance is commonly assessed by studying subpopulation differences in the conditional distribution of Y given a manifest variable Z, chosen to substitute for W. A unified treatment of conditions that may allow the detection of measurement bias using statistical procedures involving only observed or manifest variables is presented. Theorems are provided that give conditions for measurement invariance, and for invariance of the conditional distribution of Y given Z. Additional theorems and examples explore the Bayes sufficiency of Z, stochastic ordering in W, local independence of Y and Z, exponential families, and the reliability of Z. It is shown that when Bayes sufficiency of Z fails, the two forms of invariance will often not be equivalent in practice. Bayes sufficiency holds under Rasch model assumptions, and in long tests under certain conditions. It is concluded that bias detection procedures that rely strictly on observed variables are not in general diagnostic of measurement bias, or the lack of bias.
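
The gap between the two forms of invariance described above can be illustrated with a small exact calculation. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical two-valued latent variable W, binary manifest variables Y and Z that are locally independent given W, and two hypothetical groups "A" and "B" that differ only in the distribution of W. All probabilities are invented for illustration. By construction, the conditional distribution of Y given W is the same in both groups, so Y is measurement invariant; the code then computes the conditional probability of Y given Z separately in each group.

```python
from fractions import Fraction as F

# Hypothetical setup (illustrative numbers, not from the paper):
# latent W is 0 or 1; Y and Z are binary and locally independent given W.
# Both conditional tables are shared by the two groups, so Y (and Z)
# are measurement invariant with respect to W.
P_Y1_GIVEN_W = {0: F(1, 5), 1: F(4, 5)}    # P(Y=1 | W)
P_Z1_GIVEN_W = {0: F(3, 10), 1: F(7, 10)}  # P(Z=1 | W)

# The groups differ only in the distribution of the latent variable.
P_W1 = {"A": F(1, 2), "B": F(1, 5)}        # P(W=1 | group)


def p_y1_given_z(group, z):
    """Return P(Y=1 | Z=z, group) by summing over the latent W."""
    num = F(0)  # accumulates P(Y=1, Z=z | group)
    den = F(0)  # accumulates P(Z=z | group)
    for w in (0, 1):
        pw = P_W1[group] if w == 1 else 1 - P_W1[group]
        pz = P_Z1_GIVEN_W[w] if z == 1 else 1 - P_Z1_GIVEN_W[w]
        num += pw * pz * P_Y1_GIVEN_W[w]
        den += pw * pz
    return num / den


if __name__ == "__main__":
    for z in (0, 1):
        a = p_y1_given_z("A", z)
        b = p_y1_given_z("B", z)
        print(f"Z={z}: P(Y=1|Z,A)={a}  P(Y=1|Z,B)={b}  equal={a == b}")
```

In this toy setup the conditional probabilities of Y given Z differ between the groups at both values of Z, even though Y is unbiased with respect to W. A procedure that conditions only on Z would therefore flag Y, which is consistent with the abstract's conclusion that manifest-variable procedures are not in general diagnostic of measurement bias.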

Type
Original Paper
Copyright
Copyright © 1992 The Psychometric Society

Footnotes

Preparation of this article was supported in part by PSC-CUNY grant #661282 to Roger E. Millsap.
