
Bayesian Model Assessment for Jointly Modeling Multidimensional Response Data with Application to Computerized Testing

Published online by Cambridge University Press:  01 January 2025

Fang Liu
Affiliation:
Northeast Normal University
Xiaojing Wang*
Affiliation:
University of Connecticut
Roeland Hancock
Affiliation:
University of Connecticut
Ming-Hui Chen
Affiliation:
University of Connecticut
* Correspondence should be made to Xiaojing Wang, University of Connecticut, Storrs, CT 06250, USA. Email: xiaojing.wang@uconn.edu; URL: https://xiaojing-wang.uconn.edu

Abstract

Computerized assessment provides rich multidimensional data, including trial-by-trial accuracy and response time (RT) measures. A key question in modeling this type of data is how to incorporate RT data, for example, to aid ability estimation in item response theory (IRT) models. To address this, we propose a joint model consisting of a two-parameter IRT model for the dichotomous item response data, a log-normal model for the continuous RT data, and a normal model for the corresponding paper-and-pencil scores. We then reformulate and reparameterize the model to capture the relationship between the model parameters, to facilitate the prior specification, and to make the Bayesian computation more efficient. Further, we propose several new model assessment criteria based on the decomposition of the deviance information criterion (DIC) and the logarithm of the pseudo-marginal likelihood (LPML). The proposed criteria can quantify the improvement in the fit of one part of the multidimensional data given the other parts. Finally, we conduct several simulation studies to examine the empirical performance of the proposed model assessment criteria and illustrate the application of these criteria using a real dataset from a computerized educational assessment program.
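As a rough illustration of the two quantities the abstract builds on, the following minimal sketch computes the standard DIC and LPML from MCMC output. It is not the decomposition proposed in the paper. The function names `lpml` and `dic` and the input layout (an array of pointwise log-likelihoods with one row per posterior draw) are assumptions made for this example. LPML is taken here as the sum of log conditional predictive ordinates, each estimated by the harmonic mean of the likelihood over draws.

```python
import numpy as np


def lpml(pointwise_loglik):
    """LPML from an (S, n) array: S posterior draws, n observations.

    Hypothetical helper: log CPO_i is estimated as minus the log of the
    average of 1 / p(y_i | theta_s) over draws, computed on the log scale
    for numerical stability.
    """
    pointwise_loglik = np.asarray(pointwise_loglik, dtype=float)
    n_draws = pointwise_loglik.shape[0]
    neg = -pointwise_loglik
    shift = neg.max(axis=0)  # per-observation shift for log-sum-exp
    log_mean_inverse = (
        shift + np.log(np.exp(neg - shift).sum(axis=0)) - np.log(n_draws)
    )
    return float(-log_mean_inverse.sum())


def dic(total_loglik_draws, loglik_at_posterior_mean):
    """Standard DIC = D(theta_bar) + 2 * p_D, with p_D = D_bar - D(theta_bar).

    Hypothetical helper: total_loglik_draws holds the total log-likelihood
    at each posterior draw; loglik_at_posterior_mean is the total
    log-likelihood evaluated at the posterior mean of the parameters.
    """
    mean_deviance = -2.0 * float(np.mean(total_loglik_draws))
    deviance_at_mean = -2.0 * float(loglik_at_posterior_mean)
    p_d = mean_deviance - deviance_at_mean
    return deviance_at_mean + 2.0 * p_d, p_d
```

Under these conventions, a smaller DIC and a larger LPML indicate better fit. In a joint model, the pointwise log-likelihood array could in principle be formed separately for the response, RT, and score parts, but how the paper combines those parts is not reproduced here.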

Type
Theory and Methods
Copyright
Copyright © 2022 The Author(s) under exclusive licence to The Psychometric Society


Footnotes

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/S0033312300005470a.

Supplementary material: File

Liu et al. supplementary material
