Abstract
The logistic regression model has become the standard tool for analysing binary responses in medical statistics. Methods for assessing goodness-of-fit, however, are less developed. The problem is especially pronounced when global goodness-of-fit tests are performed with sparse data, that is, when the data contain only a small number of observations for each pattern of covariate values. It has long been known that in this situation the standard goodness-of-fit tests (residual deviance and Pearson chi-square) behave unsatisfactorily if p-values are calculated from the χ²-distribution. As a remedy, the Hosmer–Lemeshow test is frequently recommended; it avoids sparseness by regrouping the observations, where the grouping depends on the estimated probabilities from the model. It has been shown, however, that the Hosmer–Lemeshow test also has deficiencies: for example, it depends heavily on the algorithm used to calculate it, so different implementations might lead to different conclusions regarding the fit of the model. We present some alternative tests from the statistical literature which should also perform well with sparse data. Results from a simulation study show that some goodness-of-fit tests (for example, the Farrington test) have good properties regarding size and power and even outperform the Hosmer–Lemeshow test. We illustrate the various tests with an example from dermatology on occupational hand eczema in hairdressers. Copyright © 2002 John Wiley & Sons, Ltd.
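
The grouping idea behind the Hosmer–Lemeshow test can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper; the function name `hosmer_lemeshow`, the equal-count binning rule, and the default of 10 groups are assumptions made for illustration. Observations are sorted by fitted probability, cut into groups, and observed and expected event counts are compared within each group.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees_of_freedom) for a Hosmer-Lemeshow-type test.

    y: observed 0/1 outcomes; p: fitted probabilities from the model.
    Illustrative sketch only: bins are formed by rank, not by fixed cut points.
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    n = len(y)
    if n < groups:
        raise ValueError("need at least one observation per group")

    # Sort observation indices by fitted probability (stable sort).
    order = sorted(range(n), key=lambda i: p[i])

    stat = 0.0
    for g in range(groups):
        # Equal-count bins by rank. Tied probabilities may be split across
        # bins, which is one way implementations can disagree.
        idx = order[g * n // groups:(g + 1) * n // groups]
        n_g = len(idx)
        observed = sum(y[i] for i in idx)   # observed events in the group
        expected = sum(p[i] for i in idx)   # expected events in the group
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom

    # The statistic is conventionally referred to a chi-square
    # distribution with (groups - 2) degrees of freedom.
    return stat, groups - 2
```

Changing how ties are handled or how the cut points are chosen changes the group membership and therefore the statistic, which is the sensitivity to the calculating algorithm mentioned above.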
