Using generalized additive models to reduce residual confounding

Abstract
Traditionally, confounding by continuous variables is controlled by including a linear or categorical term in a regression model. Residual confounding occurs when the effect of the confounder on the outcome is mis‐modelled. A continuous representation of a covariate was previously shown to result in a less biased estimate of the adjusted exposure effect than categorization, provided the functional form of the covariate–outcome relationship is correctly specified. However, this form is rarely known. In contrast to parametric regression, generalized additive models (GAM) fit a smooth dose–response curve to the data, without requiring a priori knowledge of the functional form. We used simulations to compare parametric multiple logistic regression with its non‐parametric GAM extension in their ability to control for a continuous confounder. We also investigated several issues related to the implementation of GAM in this context, including: (i) selecting the degrees of freedom; and (ii) alternative criteria for inclusion/exclusion of the potential confounder and for choosing between parametric and non‐parametric representation of its effect. The impact of the shape and strength of the confounder–disease association, sample size, and the correlation between the confounder and exposure were investigated. Simulations showed that when the confounder has a non‐linear association with the outcome, GAM modelling, compared to a parametric representation, (i) reduced the mean squared error for the adjusted exposure effect; and (ii) avoided inflation of the type I error for testing the exposure effect. When the true confounder–outcome relationship was linear, GAM performed as well as the parametric logistic regression.
When modelling a continuous exposure non‐parametrically, in the presence of a continuous confounder, our results suggest that assuming a linear effect of the confounder and focussing on the non‐linearity of the exposure–outcome relationship leads to spurious findings of non‐linearity: joint non‐linear modelling is necessary. Overall, our results suggest that the use of GAM to reduce residual confounding offers several improvements over conventional parametric modelling. Copyright © 2004 John Wiley & Sons, Ltd.
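
The residual confounding described above can be illustrated with a minimal sketch. It is not the authors' simulation design: the data-generating values, the knot locations, and all function names (`fit_logistic`, `spline_basis`, `compare_adjustments`) are hypothetical, and a fixed-knot cubic regression spline stands in for the GAM smoother. The true exposure log-odds ratio is set to zero, so any non-zero adjusted estimate reflects confounding left over after adjustment.

```python
import math
import random


def expit(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for c in reversed(range(n)):
        s = m[c][n] - sum(m[c][k] * x[k] for k in range(c + 1, n))
        x[c] = s / m[c][c]
    return x


def fit_logistic(rows, y, iters=25):
    # Maximum likelihood by Newton-Raphson; rows already include an intercept.
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(rows, y):
            mu = expit(sum(b * v for b, v in zip(beta, xi)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(j, p):
                    info[j][k] += w * xi[j] * xi[k]
        for j in range(p):
            for k in range(j):
                info[j][k] = info[k][j]  # fill the symmetric lower triangle
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta


def spline_basis(z, knots=(-1.0, 0.0, 1.0)):
    # Cubic truncated-power basis; the knot locations are an assumption.
    return [z, z ** 2, z ** 3] + [max(z - k, 0.0) ** 3 for k in knots]


def compare_adjustments(n=3000, seed=1):
    # Hypothetical scenario: U-shaped confounder effect, null exposure effect.
    rng = random.Random(seed)
    z = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    x = [float(rng.random() < expit(zi * zi - 4.0 / 3.0)) for zi in z]
    y = [float(rng.random() < expit(-1.5 + zi * zi)) for zi in z]
    linear = [[1.0, xi, zi] for xi, zi in zip(x, z)]
    smooth = [[1.0, xi] + spline_basis(zi) for xi, zi in zip(x, z)]
    # Return the adjusted exposure log-odds ratio under each specification.
    return fit_logistic(linear, y)[1], fit_logistic(smooth, y)[1]


if __name__ == "__main__":
    b_linear, b_smooth = compare_adjustments()
    print("linear adjustment for confounder:", round(b_linear, 3))
    print("spline adjustment for confounder:", round(b_smooth, 3))
```

In this assumed scenario the linear term cannot capture the U-shaped confounder effect, so the exposure estimate under linear adjustment should sit well away from the true value of zero, while the flexible specification should bring it close to zero. A penalized smoother with data-driven degrees of freedom, as in a GAM, would replace the fixed-knot basis in practice.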
