When multiple items are clustered around a reading passage, the local independence assumption in item response theory is often violated. The amount of information contained in an item cluster is usually overestimated if the violation of local independence is ignored and items are treated as locally independent when in fact they are not. In this article we provide a general method that adjusts for the inflation of information associated with a test containing item clusters. We present a computational scheme for evaluating the adjustment factor, first in the restrictive case of two items per cluster and then in the general case of more than two items per cluster. The methodology was motivated by a study of the NAEP Reading Assessment. We present a simulation study along with an analysis of a NAEP data set.
In this note, we prove that the three-parameter logistic model with fixed-effect abilities is identified only up to a linear transformation of the ability scale under mild regularity conditions, contrary to the claims in Theorem 2 of San Martín et al. (Psychometrika, 80(2):450–467, 2015a).
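As a brief illustration of what identification up to a linear transformation means here (a sketch under the standard 3PL parameterization): rescaling and shifting the ability axis can be absorbed into the item parameters without changing any response probability,
\[
P(X_{ij}=1\mid\theta_i)=c_j+(1-c_j)\,\frac{1}{1+\exp\{-a_j(\theta_i-b_j)\}},
\]
since for any \(\alpha>0\) and \(\beta\), setting \(\theta_i^{*}=\alpha\theta_i+\beta\), \(a_j^{*}=a_j/\alpha\), \(b_j^{*}=\alpha b_j+\beta\), and \(c_j^{*}=c_j\) leaves every probability unchanged.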
This study proposes a new item parameter linking method for the common-item nonequivalent groups design in item response theory (IRT). Previous studies assumed that examinees are randomly assigned to either test form. In practice, however, examinees can frequently select their own test forms, and that selection often depends on their abilities. In such cases, concurrent calibration or multiple-group IRT modeling that does not model test form selection behavior can yield severely biased results. We propose a model wherein test form selection behavior depends on test scores and use a Monte Carlo expectation maximization (MCEM) algorithm for estimation. This method provided adequate parameter estimates.
The relationship between the EM algorithm and the Bock-Aitkin procedure is described with a continuous distribution of ability (latent trait) from an EM-algorithm perspective. Previous work has been restricted to the discrete case from a probit-analysis perspective.
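For orientation, a compact statement of the marginalization the Bock-Aitkin procedure approximates (a sketch in standard notation, not a derivation from this paper): the integral over the continuous ability distribution \(g(\theta)\) is replaced by a quadrature sum,
\[
L_i=\int \prod_j P_j(\theta)^{x_{ij}}\,\{1-P_j(\theta)\}^{1-x_{ij}}\,g(\theta)\,d\theta
\;\approx\;\sum_{q=1}^{Q}\prod_j P_j(\theta_q)^{x_{ij}}\,\{1-P_j(\theta_q)\}^{1-x_{ij}}\,A(\theta_q),
\]
where \(\theta_q\) are quadrature nodes and \(A(\theta_q)\) their weights; the EM E-step evaluates each examinee's posterior weight at every node, and the M-step updates the item parameters from the resulting expected counts.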
When latent variables are used as outcomes in regression analysis, a common approach to the otherwise ignored measurement error problem is to take a multilevel perspective on item response theory (IRT) modeling. Although recent computational advances allow efficient and accurate estimation of multilevel IRT models, we argue that a two-stage divide-and-conquer strategy still has unique advantages. Within the two-stage framework, three methods that take into account heteroscedastic measurement errors of the dependent variable in the stage II analysis are introduced: the closed-form marginal MLE, the expectation maximization algorithm, and the moment estimation method. They are compared to the naïve two-stage estimation and the one-stage MCMC estimation. A simulation study is conducted to compare the five methods in terms of model parameter recovery and standard error estimation. The pros and cons of each method are also discussed to provide guidelines for practitioners. Finally, a real data example based on the National Education Longitudinal Study of 1988 (NELS:88) illustrates the application of the various methods.
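As a minimal sketch of the kind of stage-II marginal likelihood such two-stage methods target (my notation and assumptions, not necessarily the authors'): if the stage-I ability estimate satisfies \(\hat\theta_i=\theta_i+e_i\) with known, person-specific error variance \(\sigma_i^2\), and \(\theta_i=\mathbf{x}_i^\top\boldsymbol\beta+\varepsilon_i\) with \(\varepsilon_i\sim N(0,\sigma^2)\), then marginally \(\hat\theta_i\sim N(\mathbf{x}_i^\top\boldsymbol\beta,\ \sigma^2+\sigma_i^2)\), and the marginal log-likelihood
\[
\ell(\boldsymbol\beta,\sigma^2)=-\tfrac12\sum_i\Bigl[\log(\sigma^2+\sigma_i^2)+\frac{(\hat\theta_i-\mathbf{x}_i^\top\boldsymbol\beta)^2}{\sigma^2+\sigma_i^2}\Bigr]+\text{const}
\]
can be maximized directly, in contrast to naïve two-stage estimation, which ignores the \(\sigma_i^2\).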
This paper presents a machine learning approach to multidimensional item response theory (MIRT), a class of latent factor models that can be used to model and predict student performance from observed assessment data. Inspired by collaborative filtering, we define a general class of models that includes many MIRT models. We discuss the use of penalized joint maximum likelihood to estimate individual models and cross-validation to select the best performing model. This model evaluation process can be optimized using batching techniques, such that even sparse large-scale data can be analyzed efficiently. We illustrate our approach with simulated and real data, including an example from a massive open online course. The high-dimensional model fit to this large and sparse dataset does not lend itself well to traditional methods of factor interpretation. By analogy to recommender-system applications, we propose an alternative “validation” of the factor model, using auxiliary information about the popularity of items consulted during an open-book examination in the course.
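The following sketch illustrates the collaborative-filtering flavor of penalized joint maximum likelihood for a compensatory MIRT-like model; the function, penalty, and tuning values are illustrative assumptions, not the authors' implementation.

```python
# Illustrative penalized joint maximum likelihood for a K-dimensional
# compensatory MIRT-like model: P(X_ij = 1) = sigmoid(theta_i . a_j + b_j),
# fit by penalized gradient ascent on the observed entries only.
import numpy as np

def fit_pjml(X, observed, K=3, lam=0.1, lr=0.05, n_iter=500, seed=0):
    """X: 0/1 response matrix (persons x items); observed: 1 where a response exists."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    theta = rng.normal(scale=0.1, size=(n, K))   # person factor scores
    a = rng.normal(scale=0.1, size=(m, K))       # item discriminations (loadings)
    b = np.zeros(m)                              # item intercepts
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(theta @ a.T + b)))
        resid = observed * (X - p)               # gradient of the Bernoulli log-likelihood
        theta += lr * (resid @ a - lam * theta)  # L2-penalized updates
        a += lr * (resid.T @ theta - lam * a)
        b += lr * resid.sum(axis=0)
    return theta, a, b
```

Held-out cells (observed == 0) can then be predicted from the fitted parameters, and cross-validated predictive log-likelihood used to choose K and lam, in the spirit of the model-selection step described above.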
This paper presents a noncompensatory latent trait model for cognitive diagnosis, the multicomponent latent trait model for diagnosis (MLTM-D). In MLTM-D, a hierarchical relationship between components and attributes is specified, permitting diagnosis at two levels. MLTM-D generalizes the multicomponent latent trait model (MLTM; Whitely in Psychometrika, 45:479–494, 1980; Embretson in Psychometrika, 49:175–186, 1984) to measures of broad traits, such as achievement tests, in which component structure varies between items. Conditions for model identification are described and marginal maximum likelihood estimators are presented, along with simulation results demonstrating parameter recovery. To illustrate how MLTM-D can be used for diagnosis, an application to a large-scale test of mathematics achievement is presented. An advantage of MLTM-D for diagnosis is that it may be more applicable to large-scale assessments with heterogeneous items than are latent class models.
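To make the term "noncompensatory" concrete, the MLTM family that MLTM-D generalizes posits that an item is solved only if every required component is executed successfully, so component probabilities multiply rather than add; schematically,
\[
P(X_{ij}=1\mid\boldsymbol\theta_i)=\prod_{k}\left[\frac{\exp(\theta_{ik}-b_{jk})}{1+\exp(\theta_{ik}-b_{jk})}\right]^{q_{jk}},
\]
where \(q_{jk}\) indicates whether component \(k\) is involved in item \(j\). This is only a schematic form; MLTM-D adds the component-attribute hierarchy described above.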
This commentary presents some additional alternatives to the suggestions made by Reise et al. (2021). IRT models as they are used for patient-reported outcome (PRO) scales may not be fully satisfactory under commonly made assumptions. The suggested change to an alternative parameterization is critically examined, with the intent of initiating discussion around more comprehensive alternatives that allow for more complex latent structures and thus have the potential to be more appropriate for PRO scales applied to diverse populations.
We introduce an extended multivariate logistic response model for multiple choice items; this model includes several earlier proposals as special cases. The discussion includes a theoretical development of the model, a description of the relationship between the model and data, and a marginal maximum likelihood estimation scheme for the item parameters. Comparisons of the performance of different versions of the full model with more constrained forms corresponding to previous proposals are included, using likelihood ratio statistics and empirical data.
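As orientation (this is the generic multivariate logistic form, not necessarily the paper's exact parameterization), such multiple-choice models assign each response category its own slope and intercept:
\[
P(X_{ij}=k\mid\theta_i)=\frac{\exp(a_{jk}\theta_i+c_{jk})}{\sum_{h=1}^{m_j}\exp(a_{jh}\theta_i+c_{jh})},\qquad k=1,\dots,m_j,
\]
with identification constraints such as \(\sum_h a_{jh}=\sum_h c_{jh}=0\); the earlier proposals mentioned above correspond to constrained versions of the category parameters.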
An application of a hierarchical IRT model for items in families generated through the application of different combinations of design rules is discussed. Within the families, the items are assumed to differ only in surface features. The parameters of the model are estimated in a Bayesian framework, using a data-augmented Gibbs sampler. An obvious application of the model is computerized algorithmic item generation. Such algorithms have the potential to increase the cost-effectiveness of item generation as well as the flexibility of item administration. The model is applied to data from a non-verbal intelligence test created using design rules. In addition, results from a simulation study conducted to evaluate parameter recovery are presented.
This article considers the application of the simulation-extrapolation (SIMEX) method for measurement error correction when the error variance is a function of the latent variable being measured. Heteroskedasticity of this form arises in educational and psychological applications with ability estimates from item response theory models. We conclude that there is no simple solution for applying SIMEX that generally will yield consistent estimators in this setting. However, we demonstrate that several approximate SIMEX methods can provide useful estimators, leading to recommendations for analysts dealing with this form of error in settings where SIMEX may be the most practical option.
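For readers unfamiliar with SIMEX, the following is a minimal sketch of the basic (homoskedastic) recipe that the article builds on; variable names and settings are illustrative, and the heteroskedastic adaptations discussed in the article require modifying the simulation step.

```python
# Minimal SIMEX sketch for a regression slope attenuated by measurement
# error in the predictor: w = x + u, with Var(u) = sigma_u**2 assumed known.
import numpy as np

def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), B=200, seed=0):
    rng = np.random.default_rng(seed)
    lam_grid, estimates = [0.0], [np.polyfit(w, y, 1)[0]]   # lambda = 0: naive fit
    for lam in lambdas:
        fits = []
        for _ in range(B):
            # simulation step: add extra error with variance lam * sigma_u**2
            w_star = w + rng.normal(scale=sigma_u * np.sqrt(lam), size=w.shape)
            fits.append(np.polyfit(w_star, y, 1)[0])
        lam_grid.append(lam)
        estimates.append(np.mean(fits))
    # extrapolation step: fit a quadratic in lambda and evaluate at lambda = -1
    return np.polyval(np.polyfit(lam_grid, estimates, 2), -1.0)
```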
Test items are often evaluated and compared by contrasting the shapes of their item characteristic curves (ICCs) or surfaces. The current paper develops and applies three general (i.e., nonparametric) comparisons of the shapes of two item characteristic surfaces: (i) proportional latent odds, (ii) uniform relative difficulty, and (iii) item sensitivity. Two items may be compared in these ways while making no assumption about the shapes of the item characteristic surfaces of other items and no assumption about the dimensionality of the latent variable. Also studied is a method for comparing the relative shapes of two item characteristic curves in two examinee populations.
The aim of the research presented here is the use of extensions of longitudinal item response theory (IRT) models in the analysis and comparison of group-specific growth in large-scale assessments of educational outcomes.
A general discrete latent variable model was used to specify and compare two types of multidimensional item-response-theory (MIRT) models for longitudinal data: (a) a model that handles repeated measurements as multiple, correlated variables over time and (b) a model that assumes one common variable over time and additional variables that quantify the change. Using extensions of these MIRT models, we approach the issue of modeling and comparing group-specific growth in observed and unobserved subpopulations. The analyses presented in this paper aim at answering the question whether academic growth is homogeneous across types of schools defined by academic demands and curricular differences. In order to facilitate answering this research question, (a) a model with a single two-dimensional ability distribution was compared to (b) a model assuming multiple populations with potentially different two-dimensional ability distributions based on type of school and to (c) a model that assumes that the observations are sampled from a discrete mixture of (unobserved) populations, allowing for differences across schools with respect to mixing proportions. For this purpose, we specified a hierarchical-mixture distribution variant of the two MIRT models. The latter model, (c), is a growth-mixture MIRT model that allows for variation of the mixing proportions across clusters in a hierarchically organized sample. We applied the proposed models to the PISA-I-Plus data for assessing learning and change across multiple subpopulations. The results of this study support the hypothesis of differential growth.
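In schematic Rasch-type notation (a sketch of the two parameterizations, not the exact models fitted to the PISA-I-Plus data), the contrast between (a) and (b) for two time points is
\[
\text{(a)}\ \ \operatorname{logit}P(X_{ijt}=1)=\theta_{it}-b_j,
\qquad
\text{(b)}\ \ \operatorname{logit}P(X_{ijt}=1)=\theta_{i}+\delta_i\,\mathbb{1}\{t=2\}-b_j,
\]
where the \(\theta_{it}\) in (a) are correlated time-specific abilities, while in (b) a single ability \(\theta_i\) is combined with a change dimension \(\delta_i\).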
Assuming a nonparametric family of item response theory models, a theory-based procedure for testing the hypothesis of unidimensionality of the latent space is proposed. The asymptotic distribution of the test statistic is derived assuming unidimensionality, thereby establishing an asymptotically valid statistical test of the unidimensionality of the latent trait. Based upon a new notion of dimensionality, the test is shown to have asymptotic power 1. A 6300-trial Monte Carlo study using published item parameter estimates of widely used standardized tests indicates conservative adherence to the nominal level of significance and statistical power averaging 81 rejections out of 100 for examinee sample sizes and psychological test lengths often encountered in practice.
When surveys contain direct questions about sensitive topics, participants may not provide their true answers. Indirect question techniques incentivize truthful answers by concealing participants’ responses in various ways. The Crosswise Model aims to do this by pairing a sensitive target item with a non-sensitive baseline item, and only asking participants to indicate whether their responses to the two items are the same or different. Selection of the baseline item is crucial to guarantee participants’ perceived and actual privacy and to enable reliable estimates of the sensitive trait. This research makes the following contributions. First, it describes an integrated methodology to select the baseline item, based on conceptual and statistical considerations. The resulting methodology distinguishes four statistical models. Second, it proposes novel Bayesian estimation methods to implement these models. Third, it shows that the new models introduced here improve efficiency over common applications of the Crosswise Model and may relax the required statistical assumptions. These three contributions facilitate applying the methodology in a variety of settings. An empirical application on attitudes toward LGBT issues shows the potential of the Crosswise Model. An interactive app, Python and MATLAB codes support broader adoption of the model.
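For readers new to the technique, the basic crosswise estimator (the simplest design, prior to the Bayesian extensions proposed here) follows from one line of algebra: with sensitive-trait prevalence \(\pi\) and a baseline item whose "yes" probability \(p\neq 1/2\) is known,
\[
\Pr(\text{``same''})=\lambda=\pi p+(1-\pi)(1-p)
\quad\Longrightarrow\quad
\hat\pi=\frac{\hat\lambda+p-1}{2p-1},
\]
where \(\hat\lambda\) is the observed proportion of "same" answers; the reliability of \(p\) is exactly why baseline-item selection matters.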
Change-point analysis (CPA) is a well-established statistical method for detecting abrupt changes, if any, in a sequence of data. In this paper, we propose a procedure based on CPA to detect test speededness. The procedure not only classifies examinees into speeded and non-speeded groups, but also identifies the point at which an examinee starts to speed. Identifying the change point can be very useful. First, it informs decision makers of the appropriate length of a test. Second, by removing only the speeded responses, instead of the entire response sequence of an examinee suspected of speededness, ability estimation can be improved. Simulation studies show that the procedure is efficient in detecting both speeded examinees and the speeding point. Ability estimation is dramatically improved by removing speeded responses identified by the procedure. The procedure is then applied to a real dataset for illustration purposes.
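A minimal CUSUM-style sketch of the kind of change-point scan such a procedure performs for a single examinee; the statistic and decision rule used in the paper may differ, and the names below are illustrative.

```python
# Scan one examinee's residuals (observed minus model-expected correctness)
# for the point after which performance drops, as would happen under speeding.
import numpy as np

def speededness_change_point(x, p_expected):
    """x: 0/1 responses in administration order; p_expected: probabilities of a
    correct response under the fitted IRT model if the examinee were not speeded."""
    resid = np.asarray(x) - np.asarray(p_expected)
    n = len(resid)
    stats = []
    for k in range(1, n):                               # candidate change points
        drop = resid[:k].mean() - resid[k:].mean()      # pre- vs post-change residual mean
        stats.append(np.sqrt(k * (n - k) / n) * drop)   # weighted mean difference
    k_hat = int(np.argmax(stats)) + 1                   # estimated speeding onset
    return k_hat, float(np.max(stats))
```

A large value of the statistic, judged against a simulated null distribution, flags the examinee as speeded, and responses after k_hat can be dropped before re-estimating ability.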
We propose a nonparametric item response theory model for dichotomously scored items in a Bayesian framework. The model is based on a latent class (LC) formulation, and it is multidimensional, with dimensions corresponding to a partition of the items into homogeneous groups specified on the basis of inequality constraints among the conditional success probabilities given the latent class. Moreover, an innovative system of prior distributions is proposed following the encompassing approach, in which the largest model is the unconstrained LC model. A reversible-jump type algorithm is described for sampling from the joint posterior distribution of the model parameters of the encompassing model. By suitably post-processing its output, we make inference on the number of dimensions (i.e., the number of groups of items measuring the same latent trait) and cluster items according to the dimensions when unidimensionality is violated. The approach is illustrated by two examples on simulated data and two applications based on educational and quality-of-life data.
Given known item parameters, the bootstrap method can be used to determine the statistical accuracy of ability estimates in item response theory. Through a Monte Carlo study, the method is evaluated as a way of approximating the standard error and confidence limits for the maximum likelihood estimate of the ability parameter, and compared to the use of the theoretical standard error and confidence limits developed by Lord. At least for short tests, the bootstrap method yielded better estimates than the corresponding theoretical values.
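A sketch of the parametric bootstrap in this setting, assuming a 2PL response function for concreteness (the names and settings below are illustrative, not the study's code): simulate response vectors from the estimated ability and the known item parameters, re-estimate the ability for each replicate, and take the standard deviation across replicates.

```python
# Parametric bootstrap standard error for the ability MLE with known item
# parameters; a 2PL item response function is assumed for this sketch.
import numpy as np
from scipy.optimize import minimize_scalar

def mle_theta(x, a, b):
    """Ability MLE for a 0/1 response vector x under a 2PL with known a, b."""
    def neg_loglik(theta):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    return minimize_scalar(neg_loglik, bounds=(-6, 6), method="bounded").x

def bootstrap_se(theta_hat, a, b, B=1000, seed=0):
    rng = np.random.default_rng(seed)
    p = 1.0 / (1.0 + np.exp(-a * (theta_hat - b)))      # response probabilities at theta_hat
    reps = [mle_theta(rng.binomial(1, p), a, b) for _ in range(B)]
    return float(np.std(reps, ddof=1))                  # bootstrap standard error
```

Percentile confidence limits follow directly from the empirical quantiles of the bootstrap replicates in reps.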
Standard procedures for drawing inferences from complex samples do not apply when the variable of interest θ cannot be observed directly, but must be inferred from the values of secondary random variables that depend on θ stochastically. Examples are proficiency variables in item response models and class memberships in latent class models. Rubin's “multiple imputation” techniques yield approximations of sample statistics that would have been obtained, had θ been observable, and associated variance estimates that account for uncertainty due to both the sampling of respondents and the latent nature of θ. The approach is illustrated with data from the National Assessment of Educational Progress.
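Concretely, the combining rules behind this approach are Rubin's: with \(M\) plausible values (imputations) of \(\theta\), a statistic \(t\) is estimated by
\[
\bar t=\frac{1}{M}\sum_{m=1}^{M}\hat t^{(m)},\qquad
V=\bar U+\Bigl(1+\frac{1}{M}\Bigr)B,\qquad
\bar U=\frac{1}{M}\sum_m U^{(m)},\quad
B=\frac{1}{M-1}\sum_m\bigl(\hat t^{(m)}-\bar t\bigr)^2,
\]
where \(U^{(m)}\) is the complete-data sampling variance of \(\hat t^{(m)}\); \(\bar U\) captures uncertainty from sampling respondents and \(B\) the uncertainty due to the latent nature of \(\theta\).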
A formal framework for measuring change in sets of dichotomous data is developed and implications of the principle of specific objectivity of results within this framework are investigated. Building upon the concept of specific objectivity as introduced by G. Rasch, three equivalent formal definitions of that postulate are given, and it is shown that they lead to latent additivity of the parametric structure. If, in addition, the observations are assumed to be locally independent realizations of Bernoulli variables, a family of models follows necessarily which are isomorphic to a logistic model with additive parameters, determining an interval scale for latent trait measurement and a ratio scale for quantifying change. Adding the further assumption of generalizability over subsets of items from a given universe yields a logistic model which allows a multidimensional description of individual differences and a quantitative assessment of treatment effects; as a special case, a unidimensional parameterization is introduced also and a unidimensional latent trait model for change is derived. As a side result, the relationship between specific objectivity and additive conjoint measurement is clarified.