IRT Test Equating in Complex Linkage Plans

Published online by Cambridge University Press: 01 January 2025

Michela Battauz*
Affiliation:
Department of Economics and Statistics, University of Udine
* Requests for reprints should be sent to Michela Battauz, Department of Economics and Statistics, University of Udine, Via Tomadini 30/A, 33100 Udine, Italy. E-mail: michela.battauz@uniud.it

Abstract

Linkage plans can be rather complex, including many forms, several links, and the connection of forms through different paths. This article studies item response theory equating methods for complex linkage plans when the common-item nonequivalent group design is used. An efficient way to average equating coefficients that link the same two forms through different paths is presented, and the asymptotic standard errors of indirect and average equating coefficients are derived. The methodology is illustrated using simulation studies and a real data example.
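
To make the notions of indirect and average equating coefficients concrete, the following is a minimal sketch, not the article's procedure. It assumes each direct link is a linear scale transformation theta_target = A * theta_source + B, chains links along a path by composing those transformations, and then combines paths with a plain unweighted mean. The function names (compose, chain, average) and the unweighted mean are illustrative assumptions; the abstract does not state the weighting scheme the article uses, and standard errors are not computed here.

```python
from functools import reduce
from typing import List, Tuple

# An equating coefficient pair (A, B), assumed to act on the ability scale as
# theta_target = A * theta_source + B. This convention is an assumption.
Coef = Tuple[float, float]


def compose(first: Coef, second: Coef) -> Coef:
    """Apply `first`, then `second`: A2 * (A1 * theta + B1) + B2."""
    a1, b1 = first
    a2, b2 = second
    return (a1 * a2, a2 * b1 + b2)


def chain(path: List[Coef]) -> Coef:
    """Indirect coefficients for a path of direct links, in traversal order."""
    return reduce(compose, path, (1.0, 0.0))  # (1, 0) is the identity map


def average(paths: List[List[Coef]]) -> Coef:
    """Unweighted mean over paths linking the same two forms (illustrative
    only; the article's averaging method may weight paths differently)."""
    chained = [chain(p) for p in paths]
    n = len(chained)
    return (sum(a for a, _ in chained) / n, sum(b for _, b in chained) / n)


if __name__ == "__main__":
    # Hypothetical coefficients: form 1 -> 2 -> 3, plus a direct link 1 -> 3.
    via_form_2 = [(2.0, 1.0), (0.5, 3.0)]
    direct = [(1.0, 3.0)]
    print(chain(via_form_2))
    print(average([via_form_2, direct]))
```

The coefficient values in the usage block are made up for illustration and do not come from the article's simulations or data.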

Information

Type
Original Paper
Copyright
Copyright © 2013 The Psychometric Society
