A multivariate regression approach to association analysis of a quantitative trait network
Open Access
- 27 May 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (12), i204-i212
- https://doi.org/10.1093/bioinformatics/btp218
Abstract
Motivation: Many complex disease syndromes such as asthma consist of a large number of highly related, rather than independent, clinical phenotypes, raising a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. Although a causal genetic variation may influence a group of highly correlated traits jointly, most previous association analyses considered each phenotype separately, or combined results from a set of single-phenotype analyses.
Results: We propose a new statistical framework called graph-guided fused lasso (GFlasso) to address this issue in a principled way. Our approach represents the dependency structure among the quantitative traits explicitly as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity. While most traditional methods examine each phenotype independently, our approach analyzes all of the traits jointly in a single statistical method to discover the genetic markers that perturb a subset of correlated traits jointly rather than a single trait. Using simulated datasets based on the HapMap consortium data and an asthma dataset, we compare the performance of our method with single-marker analysis and with other sparse regression methods that do not use any structural information in the traits. Our results show that there is a significant advantage in detecting the true causal single nucleotide polymorphisms when we incorporate the correlation pattern in traits using our proposed methods.
Availability: Software for GFlasso is available at http://www.sailing.cs.cmu.edu/gflasso.html
Contact: sssykim@cs.cmu.edu; ksohn@cs.cmu.edu
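The abstract does not state the objective function, so the following is only a minimal sketch of what a graph-guided fused lasso objective could look like under stated assumptions: a squared-error loss for the multivariate regression, a lasso penalty on all coefficients, and a fusion penalty that pulls together the coefficients of traits connected by an edge in the trait network, weighted by the absolute correlation on that edge. The function name `gflasso_objective`, the parameter names `lam` and `gamma`, the edge representation, and the choice of edge weight are all illustrative assumptions, not the interface of the released GFlasso software.

```python
import numpy as np


def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Evaluate an assumed graph-guided fused lasso objective.

    X     : (n, p) genotype matrix (n samples, p markers)
    Y     : (n, k) trait matrix (k quantitative traits)
    B     : (p, k) regression coefficients, one column per trait
    edges : iterable of (m, l, r_ml), where m and l are trait indices and
            r_ml is the correlation between the two traits (hypothetical format)
    lam   : weight of the lasso (sparsity) penalty
    gamma : weight of the fusion (trait-network) penalty
    """
    # Squared-error loss summed over all samples and traits.
    residual = Y - X @ B
    loss = np.sum(residual ** 2)

    # Lasso penalty: encourages most markers to have zero effect.
    lasso = lam * np.sum(np.abs(B))

    # Fusion penalty: for each edge, penalize the difference between the two
    # traits' coefficient vectors, flipping the sign for negatively
    # correlated traits. Weighting by |r_ml| is an assumption of this sketch.
    fusion = 0.0
    for m, l, r in edges:
        fusion += abs(r) * np.sum(np.abs(B[:, m] - np.sign(r) * B[:, l]))

    return loss + lasso + gamma * fusion
```

Under this form, a marker that affects two positively correlated traits in the same direction incurs no fusion penalty, which is one way the trait network could favor markers that jointly influence a correlated subgroup of traits.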