Population Stratification Bias in the Case-Only Study for Gene-Environment Interactions

Abstract
The case-only study is a convenient approach that provides increased statistical efficiency in detecting gene-environment interactions. Its validity hinges on one well-recognized assumption: the susceptibility genotypes and the environmental exposures of interest are independent in the population. If they are not, the study will be biased. The authors show that hidden stratification in the study population can also invalidate a case-only study, and they derive formulas for the resulting population stratification bias. The bias involves three terms: 1) the coefficient of variation of the exposure prevalence odds, 2) the coefficient of variation of the genotype frequency odds, and 3) the correlation coefficient between the exposure prevalence odds and the genotype frequency odds. The authors perform simulations to investigate the magnitude of the bias over a wide range of realistic scenarios. They find that the estimated interaction effect is frequently biased by more than 5%. For a rarer gene and a rarer exposure, the bias becomes even larger (>30%). Because the bias is potentially large, researchers conducting case-only studies should use the boundary formula presented in this paper to interpret their results more prudently, or they should use stratified analysis or a modeling approach to adjust for population stratification bias.
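To make the mechanism concrete, the following Python sketch computes the factor by which a pooled case-only odds ratio is inflated when strata are ignored. It is an illustration under stated assumptions, not the authors' derivation: genotype and exposure are taken to be independent within each stratum, disease risk is assumed to follow a multiplicative model with a stratum-specific baseline risk, and the function name `case_only_bias` and its parameters are hypothetical. Under these assumptions the inflation factor can be rewritten as one plus the product of the three quantities named above, with the coefficients of variation and the correlation taken across strata using weights proportional to the unexposed non-carrier cases; whether this weighting matches the paper's formula exactly is not confirmed here.

```python
import math


def case_only_bias(weights, exposure_prev, genotype_freq, baseline_risk=None):
    """Return (pooled_factor, identity_value) for a stratified population.

    weights        -- stratum proportions in the source population
    exposure_prev  -- exposure prevalence in each stratum
    genotype_freq  -- susceptibility genotype frequency in each stratum
    baseline_risk  -- assumed baseline disease risk per stratum (default: equal)

    Assumes G and E are independent within strata and a multiplicative risk
    model, so main effects and the true interaction cancel out of the factor.
    """
    if baseline_risk is None:
        baseline_risk = [1.0] * len(weights)
    # Relative number of baseline cases contributed by each stratum.
    m = [w * r for w, r in zip(weights, baseline_risk)]
    strata = list(zip(m, exposure_prev, genotype_freq))

    # Pooled 2x2 table of cases (genotype by exposure), up to a constant.
    n11 = sum(mk * p * q for mk, p, q in strata)
    n10 = sum(mk * p * (1 - q) for mk, p, q in strata)
    n01 = sum(mk * (1 - p) * q for mk, p, q in strata)
    n00 = sum(mk * (1 - p) * (1 - q) for mk, p, q in strata)
    pooled_factor = n11 * n00 / (n10 * n01)

    # Same quantity via the odds: weights from the unexposed non-carrier cell.
    v = [mk * (1 - p) * (1 - q) / n00 for mk, p, q in strata]
    a = [p / (1 - p) for p in exposure_prev]   # exposure prevalence odds
    b = [q / (1 - q) for q in genotype_freq]   # genotype frequency odds
    mean_a = sum(vk * ak for vk, ak in zip(v, a))
    mean_b = sum(vk * bk for vk, bk in zip(v, b))
    var_a = sum(vk * (ak - mean_a) ** 2 for vk, ak in zip(v, a))
    var_b = sum(vk * (bk - mean_b) ** 2 for vk, bk in zip(v, b))
    cov = sum(vk * (ak - mean_a) * (bk - mean_b) for vk, ak, bk in zip(v, a, b))

    if var_a == 0 or var_b == 0:
        # No variation in one set of odds: correlation undefined, no bias.
        return pooled_factor, 1.0
    cv_a = math.sqrt(var_a) / mean_a
    cv_b = math.sqrt(var_b) / mean_b
    rho = cov / math.sqrt(var_a * var_b)
    return pooled_factor, 1.0 + rho * cv_a * cv_b


if __name__ == "__main__":
    # Two equally sized hidden strata with different prevalences (made-up values).
    print(case_only_bias([0.5, 0.5], [0.1, 0.3], [0.05, 0.2]))
```

With these made-up inputs the factor is about 1.48: a null interaction would appear as an odds ratio near 1.48 if the two strata were pooled. When every stratum shares the same exposure prevalence or the same genotype frequency, the factor reduces to 1.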