Missing data mechanisms for the SF-36 questionnaire items in the SU.VI.MAX study

  • 1 October 2003
    • journal article
    • research article
    • Vol. 51 (5), 513-525
Abstract
Background: Health-related quality of life is becoming increasingly important in the medical field. Nevertheless, methodological problems persist, particularly in the handling of missing data on quality of life questionnaires. Missing data raise three difficulties: (i) loss of power; (ii) bias; (iii) choice of the most adequate method for treating the missing data. Prevention is the best way to avoid unanswered questions, but it does not guarantee the absence of missing data. The treatment of missing data therefore depends on (i) identifying the missing data mechanism and (ii) choosing the most appropriate method to correct the data. The main objective of this article is to illustrate how the mechanisms behind item non-response can be identified, using the SF-36 questionnaire items in the SU.VI.MAX study.
Methods: A logistic regression on the characteristics of the subjects was used to distinguish between two missing data mechanisms: missing completely at random (MCAR) and missing at random (MAR). Two global chi-square tests of the MCAR mechanism were proposed. The missing not at random (MNAR) mechanism was also analysed by considering the features of the questionnaire.
Results: The percentage of non-responses was small (1.7%), with a maximum of 3% for four questions of the General Health dimension (GH2 to GH5). Both global chi-square tests rejected the hypothesis that all SF-36 non-responses were MCAR. For the 32 items with less than 2.3% of non-responses, the mechanisms were MCAR for 29 items, MAR for 2 items, and probably MNAR for 1 item. The logistic regression indicated that the factors related to non-response were gender (female), age (50 years or older), attention problems, and number of children (three or more). The hierarchical relationship between item PF5 (climb one flight of stairs) and item PF4 (climb several flights of stairs) would generate MNAR non-responses. The "I don't know" response option in the block of items GH2 to GH5 would also generate MNAR non-responses.
Conclusion: Identifying missing data mechanisms, both through statistical analysis and through further reflection on the features of the questionnaire, is a necessary preliminary step in the treatment of non-responses.
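The item-level reasoning described in the Methods can be illustrated with a minimal sketch. It is not the authors' analysis: instead of the logistic regression used in the study, it applies a simpler stand-in, a Pearson chi-square test (1 degree of freedom) of association between non-response on one item and a single binary subject characteristic. The function name `missingness_chi2`, the use of `None` to mark a non-response, and the counts in the example are all hypothetical. A small p-value suggests that non-response depends on the observed characteristic, which argues against MCAR and is compatible with MAR; a large p-value is merely consistent with MCAR for that characteristic.

```python
import math


def missingness_chi2(responses, covariate):
    """Pearson chi-square test (1 df) between item non-response and a 0/1 covariate.

    responses: list of item answers, where None marks a non-response (assumption).
    covariate: list of 0/1 values for one subject characteristic, same length.
    Returns (chi2, p_value).
    """
    if len(responses) != len(covariate):
        raise ValueError("responses and covariate must have the same length")

    # 2x2 table: counts[group][missing], with missing = 1 for a non-response.
    counts = [[0, 0], [0, 0]]
    for answer, group in zip(responses, covariate):
        counts[group][1 if answer is None else 0] += 1

    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [counts[0][j] + counts[1][j] for j in range(2)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("test undefined: an entire row or column is empty")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (counts[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical data: 2 non-responses among 100 subjects in group 0,
    # 12 non-responses among 100 subjects in group 1.
    answers = [3] * 98 + [None] * 2 + [3] * 88 + [None] * 12
    group = [0] * 100 + [1] * 100
    chi2, p = missingness_chi2(answers, group)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With several characteristics (gender, age group, and so on), a multivariable logistic regression of the non-response indicator, as in the study, is the more natural tool; the 2x2 test above only covers one binary characteristic at a time and cannot detect MNAR, which the abstract addresses through the features of the questionnaire.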