Abstract
Suppose, in contrast to the fact, in 1950, we had put the cohort of 18-year-old non-smoking American men on a stringent mandatory diet that guaranteed that no one would ever weigh more than their baseline weight established at the age of 18 years. How would the counterfactual mortality of these 18 year olds have compared to their actual observed mortality through 2007? We describe in detail how this counterfactual contrast could be estimated from longitudinal epidemiologic data similar to that stored in the electronic medical records of a large health maintenance organization (HMO) by applying g-estimation to a novel of structural nested model (SNM). Our analytic approach differs from any alternative approach in that, in the absence of model misspecification, it can successfully adjust for (i) measured time-varying confounders such as exercise, hypertension and diabetes that are simultaneously intermediate variables on the causal pathway from weight gain to death and determinants of future weight gain, (ii) unmeasured confounding by undiagnosed preclinical disease (that is, reverse causation) that can cause both poor weight gain and premature mortality (provided an upper bound can be specified for the maximum length of time a subject may suffer from a subclinical illness severe enough to affect his weight without the illness becomes clinically manifest) and (iii) the presence of particular identifiable subgroups, such as those suffering from serious renal, liver, pulmonary and/or cardiac disease, in whom confounding by unmeasured prognostic factors is so severe as to render useless any attempt at direct analytic adjustment. However, (ii) and (iii) limit the ability to empirically test whether the SNM is misspecified. The other two g-methods—the parametric g-computation algorithm and inverse probability of treatment weighted estimation of marginal structural models—can adjust for potential bias due to (i) but not due to (ii) or (iii).