Abstract
Rapid guessing (RG) is a form of non-effortful responding characterized by short response latencies. Previous research has shown that this construct-irrelevant behavior biases inferences concerning measurement properties and scores. To mitigate these deleterious effects, a number of response-time threshold scoring procedures have been proposed, which recode RG responses (e.g., treat them as incorrect or missing, or impute probable values) and then estimate parameters for the recoded dataset using a unidimensional or multidimensional item response theory (IRT) model. To date, there have been limited attempts to compare these methods under the possibility that RG may be misclassified in practice. To address this shortcoming, the present simulation study compared item and ability parameter recovery across four scoring procedures while manipulating sample size, the linear relationship between RG propensity and ability, the percentage of RG responses, and the type and rate of RG misclassification. Results demonstrated two general trends. First, across all conditions, treating RG responses as incorrect produced the largest degree of combined systematic and random error (larger than that from ignoring RG). Second, the remaining scoring approaches generally provided equally accurate parameter recovery when RG was perfectly identified; however, the multidimensional IRT approach was susceptible to increased error as misclassification rates grew. Overall, the findings suggest that recoding RG responses as missing and employing a unidimensional IRT model is a promising approach.
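As a minimal illustration of the recommended recoding step (not taken from the study itself), the sketch below flags responses faster than an item-level response-time threshold as RG and recodes them as missing before IRT estimation. The function name, the simulated data, and the 5-second threshold are hypothetical choices for demonstration only; in practice, thresholds would be set with an established method (e.g., a fraction of the median item response time), and the recoded matrix would be passed to a unidimensional IRT routine that accommodates missing data.

```python
import numpy as np

def recode_rapid_guesses(responses, response_times, thresholds):
    """Flag responses faster than each item's RT threshold as rapid guesses
    and recode them as missing (NaN) prior to IRT estimation.

    responses      : (n_persons, n_items) matrix of 0/1 scored responses
    response_times : (n_persons, n_items) matrix of latencies in seconds
    thresholds     : (n_items,) vector of per-item RT thresholds
    """
    rg_flags = response_times < thresholds        # True where a response is classified as RG
    recoded = responses.astype(float).copy()      # float array so NaN can mark missingness
    recoded[rg_flags] = np.nan                    # treat flagged RG responses as missing
    return recoded, rg_flags

# Hypothetical usage with simulated data; the recoded matrix would then be
# fit with a unidimensional IRT model that handles missing responses.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 20))                 # scored responses
RT = rng.gamma(shape=2.0, scale=15.0, size=(500, 20))  # response times in seconds
tau = np.full(20, 5.0)                                 # illustrative 5-second threshold per item
X_recoded, flags = recode_rapid_guesses(X, RT, tau)
print(f"Flagged {flags.mean():.1%} of responses as rapid guesses")
```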