Investigating the practical consequences of model misfit in unidimensional IRT models


In this article, the practical consequences of violations of unidimensionality on selection decisions in the framework of unidimensional item response theory (IRT) models are investigated using simulated data. The manipulated factors include the severity of the violations, the proportion of misfitting items, and test length. The outcomes considered are the precision and accuracy of the estimated model parameters; the correlations of estimated ability ($\widehat{\theta}$) and number-correct ($NC$) scores with the true ability ($\theta$); the ranks of the examinees and the overlap between the sets of examinees selected on the basis of $\theta$, $\widehat{\theta}$, or $NC$ scores; and the bias in criterion-related validity estimates. Results show that the $\widehat{\theta}$ values remained unbiased under violations of unidimensionality, but their precision decreased as multidimensionality and the proportion of misfitting items increased; the estimated item parameters were robust to violations of unidimensionality. The correlations between $\theta$, $\widehat{\theta}$, and $NC$ scores, the agreement between the three selection criteria, and the accuracy of criterion-related validity estimates were all negatively affected, to some extent, by increasing levels of multidimensionality and the proportion of misfitting items. However, removing the misfitting items improved the results only under severe multidimensionality with a large proportion of misfitting items, and worsened them otherwise.
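The kind of design described above can be illustrated with a minimal simulation sketch (not the authors' code): responses are generated from a two-dimensional compensatory 2PL, where misfitting items also load on a nuisance dimension, and examinees are then scored with the unidimensional number-correct ($NC$) score. The correlation of $NC$ with the intended ability and the overlap between top-decile selections based on each can then be computed. All parameter values (discriminations, difficulties, sample size, proportion of misfitting items) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items, prop_misfit = 2000, 30, 0.3

# True abilities: theta1 is the intended dimension, theta2 a nuisance dimension.
theta1 = rng.standard_normal(n_persons)
theta2 = rng.standard_normal(n_persons)

# Item parameters: misfitting items also discriminate on the nuisance dimension.
a1 = rng.uniform(0.8, 2.0, n_items)
a2 = np.where(np.arange(n_items) < prop_misfit * n_items,
              rng.uniform(0.8, 2.0, n_items),  # severity of the violation
              0.0)
b = rng.uniform(-2.0, 2.0, n_items)

# Compensatory 2PL probabilities and simulated 0/1 responses.
logit = a1 * theta1[:, None] + a2 * theta2[:, None] - b
p = 1.0 / (1.0 + np.exp(-logit))
x = (rng.random((n_persons, n_items)) < p).astype(int)

nc = x.sum(axis=1)                         # number-correct score
r = np.corrcoef(theta1, nc)[0, 1]          # validity of NC for theta1

# Overlap of top-10% selections based on true theta versus NC score.
k = n_persons // 10
top_theta = set(np.argsort(theta1)[-k:])
top_nc = set(np.argsort(nc)[-k:])
overlap = len(top_theta & top_nc) / k
print(f"corr(theta1, NC) = {r:.3f}, top-10% overlap = {overlap:.2f}")
```

Varying `prop_misfit` and the range of `a2` mimics the manipulated factors (proportion of misfitting items and severity of the violation); comparing runs with the misfitting items dropped from `x` mimics the item-removal condition.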

Applied Psychological Measurement, 41