Statistical methods for studying disease subtype heterogeneity

Abstract
A fundamental goal of epidemiologic research is to investigate the relationship between exposures and disease risk. Cases of the disease are often considered a single outcome and assumed to share a common etiology. However, evidence indicates that many human diseases arise and evolve through a range of heterogeneous molecular pathologic processes, influenced by diverse exposures. Pathogenic heterogeneity has been considered in various neoplasms, such as colorectal, lung, prostate, and breast cancers, leukemia, and lymphoma, and in non‐neoplastic diseases, including obesity, type II diabetes, glaucoma, stroke, cardiovascular disease, autism, and autoimmune disease. In this article, we discuss analytic options for studying disease subtype heterogeneity, emphasizing methods for evaluating whether the association of a potential risk factor with disease varies by disease subtype. Methods are described for scenarios where disease subtypes are categorical or ordinal, and for cohort studies, matched and unmatched case–control studies, and case–case study designs. For illustration, we apply the methods to a molecular pathological epidemiology study of alcohol intake and colon cancer risk by tumor LINE‐1 methylation subtypes. User‐friendly software to implement the methods is publicly available.
Copyright © 2015 John Wiley & Sons, Ltd.
Funding Information
  • National Institutes of Health (R01 CA151993, UM1 CA167552, P01 CA55075, P01 CA87969, UM1 CA186107, R01 CA137178, R35 CA197735)