Abstract
In this article, we emphasize that the Rasch model is not only very useful for psychological test calibration but is also necessary if the number of solved items is to be used as an examinee's score. A simplified proof is given that the Rasch model implies specifically objective parameter comparisons; as a consequence, the model can be checked per se. Various reasons are listed why data and item pools may fail to fit the Rasch model; for instance, the two-parameter or three-parameter logistic model may simply be more suitable. Several suggestions are given for controlling the overall Type I risk, for including a power analysis (i.e., taking the Type II risk into account), for disclosing artificial model-check results, and for deleting examinees who misfit the Rasch model. These suggestions are empirically founded and may help establish rough state-of-the-art standards. However, a degree of statistical sophistication is required, and prospective test authors will still face the fact that no standard software package offers all of the approaches described.
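
For reference, the item response models named above have the following standard forms; the notation ($\theta_v$ for person ability, $\beta_i$ for item difficulty, $\alpha_i$ for item discrimination, $c_i$ for a pseudo-guessing parameter) is chosen here for illustration and is not taken from the article itself.

% Rasch model (1PL): probability that person v solves item i
\[
P(X_{vi} = 1 \mid \theta_v, \beta_i)
  = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)}
\]
% The 2PL model adds a discrimination parameter \alpha_i; the 3PL
% model further adds a lower asymptote c_i for guessing:
\[
P_{\mathrm{2PL}} = \frac{\exp\bigl(\alpha_i(\theta_v - \beta_i)\bigr)}
                        {1 + \exp\bigl(\alpha_i(\theta_v - \beta_i)\bigr)},
\qquad
P_{\mathrm{3PL}} = c_i + (1 - c_i)\,P_{\mathrm{2PL}}
\]
% Specific objectivity: under the Rasch model the log-odds difference
% between two persons v and w is the same for every item i,
\[
\log\frac{P_{vi}}{1 - P_{vi}} - \log\frac{P_{wi}}{1 - P_{wi}}
  = (\theta_v - \beta_i) - (\theta_w - \beta_i)
  = \theta_v - \theta_w .
\]

This item-free comparison of persons (and, symmetrically, person-free comparison of items) is the standard sense of "specifically objective" comparisons; it also underlies the abstract's opening claim, since under the Rasch model the number of solved items is a sufficient statistic for $\theta_v$, whereas under the 2PL and 3PL models it is not.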
