Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models
Open Access
- 24 June 2008
- journal article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 9 (1), 1-11
- https://doi.org/10.1186/1471-2105-9-292
Abstract
Growing interest in biological pathways has called for new statistical methods for modeling and testing a genetic pathway effect on a health outcome. The fact that genes within a pathway tend to interact with each other and relate to the outcome in a complicated way makes nonparametric methods more desirable. The kernel machine method provides a convenient, powerful and unified method for multi-dimensional parametric and nonparametric modeling of the pathway effect.

In this paper we propose a logistic kernel machine regression model for binary outcomes. This model relates the disease risk to covariates parametrically, and to genes within a genetic pathway parametrically or nonparametrically using kernel machines. The nonparametric genetic pathway effect allows for possible interactions among the genes within the same pathway and for a complicated relationship between the genetic pathway and the outcome. We show that kernel machine estimation of the model components can be formulated using a logistic mixed model. Estimation can hence proceed within a mixed model framework using standard statistical software. A score test based on a Gaussian process approximation is developed to test for the genetic pathway effect. The methods are illustrated using a prostate cancer data set and evaluated using simulations. An extension to continuous and discrete outcomes using generalized kernel machine models, and its connection with generalized linear mixed models, is discussed.

Logistic kernel machine regression and its extension, generalized kernel machine regression, provide a novel and flexible statistical tool for modeling pathway effects on discrete and continuous outcomes. Their close connection to mixed models and their attractive performance make them promising for wide application in bioinformatics and other biomedical areas.
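To make the model structure concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the model logit P(y = 1) = x'beta + h(z), with h represented through a kernel matrix K built from the pathway genes. It fits the null model (no pathway effect) by Newton-Raphson and then forms a quadratic statistic Q = (y - mu0)' K (y - mu0) / 2 from the null residuals. The Gaussian kernel, its scale parameter `rho`, the exact form of Q, the simulated data, and all function names are illustrative assumptions that do not come from the abstract. No p-value is computed, because the details of the Gaussian process approximation are not given in this text.

```python
import numpy as np


def gaussian_kernel(Z, rho):
    """Gaussian kernel matrix K[i, j] = exp(-||z_i - z_j||^2 / rho) (assumed kernel choice)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / rho)


def fit_null_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of logit P(y = 1) = X beta, i.e. the model without a pathway term."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = mu * (1.0 - mu)
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def score_quadratic_form(X, Z, y, rho):
    """Return (Q, K, mu0), where Q = r' K r / 2 and r = y - mu0 are null-model residuals."""
    beta0 = fit_null_logistic(X, y)
    mu0 = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    r = y - mu0
    K = gaussian_kernel(Z, rho)
    return 0.5 * (r @ K @ r), K, mu0


# Simulated data (hypothetical): one clinical covariate and a five-gene pathway
# whose first two genes interact nonlinearly in their effect on disease risk.
rng = np.random.default_rng(0)
n, p = 100, 5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, p))
eta = -0.2 + 0.5 * X[:, 1] + 0.8 * np.sin(Z[:, 0]) * Z[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

Q, K, mu0 = score_quadratic_form(X, Z, y, rho=float(p))
print("quadratic score statistic Q:", Q)
```

A larger Q indicates that subjects with similar pathway expression profiles tend to have similar null-model residuals, which is the kind of signal a pathway effect would produce. Turning Q into a test requires its null distribution, which this sketch does not attempt to reproduce.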