Modeling Not-Reached Items in Timed Tests: A Response Time Censoring Approach

Published online by Cambridge University Press: 01 January 2025

Jinxin Guo
Affiliation:
Beijing Normal University
Xin Xu
Affiliation:
Beijing Normal University
Zhiliang Ying
Affiliation:
Columbia University
Susu Zhang*
Affiliation:
University of Illinois at Urbana-Champaign
* Correspondence should be made to Susu Zhang, University of Illinois at Urbana-Champaign, Illinois, USA. Email: szhan105@illinois.edu

Abstract

Time limits are imposed on many computer-based assessments, and it is common for examinees to run out of time, which produces missing data in the form of not-reached items. The present study proposes an approach that accounts for the missingness mechanism of not-reached items via response time censoring. The censoring mechanism is incorporated directly into the observed likelihood of item responses and response times. A marginal maximum likelihood estimator is proposed, and its asymptotic properties are established. Simulation studies evaluated the proposed method and compared it with several alternative approaches that ignore the censoring. An empirical study based on the PISA 2018 Science Test was also conducted.
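
To make the idea of response time censoring concrete, the following is a minimal sketch of a right-censored log-likelihood for one examinee's response times. It is a generic illustration and not the likelihood defined in the paper: it assumes lognormal response times with a common log-scale standard deviation, covers response times only (no item responses, no latent traits), and treats the first not-reached item as censored at the time remaining. The function names and the parameters `mus`, `sigma`, and `time_limit` are hypothetical.

```python
import math


def lognormal_logpdf(t, mu, sigma):
    """Log density of a lognormal response time t with log-scale mean mu."""
    z = (math.log(t) - mu) / sigma
    return -math.log(t * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


def lognormal_logsf(t, mu, sigma):
    """Log survival function log P(T > t) for a lognormal response time."""
    z = (math.log(t) - mu) / sigma
    # 1 - Phi(z) written with the complementary error function
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))


def censored_rt_loglik(times, mus, sigma, time_limit):
    """Response-time log-likelihood for one examinee (illustrative assumption).

    times      : observed times for the reached items, in test order
    mus        : assumed log-scale means for ALL items on the test
    sigma      : assumed common log-scale standard deviation
    time_limit : total testing time allowed
    """
    n_reached = len(times)
    # Reached items contribute their density.
    loglik = sum(lognormal_logpdf(t, mus[j], sigma) for j, t in enumerate(times))
    if n_reached < len(mus):
        # The first not-reached item is right-censored: its time would have
        # exceeded whatever time was left when the test ended.
        remaining = time_limit - sum(times)
        if remaining > 0:
            loglik += lognormal_logsf(remaining, mus[n_reached], sigma)
    return loglik


# One item answered in 1.0 time units, a second item not reached, limit 2.0.
example = censored_rt_loglik([1.0], [0.0, 0.0], sigma=1.0, time_limit=2.0)
```

Under these assumptions, dropping the survival term is what an analysis that ignores the censoring would do: the not-reached item would then carry no information about how slowly the examinee was working.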

Type
Theory & Methods
Copyright
Copyright © 2021 The Psychometric Society
