Identifying Causal Genes and Dysregulated Pathways in Complex Diseases

Abstract
In complex diseases, various combinations of genomic perturbations often lead to the same phenotype. On a molecular level, combinations of genomic perturbations are assumed to dysregulate the same cellular pathways. Such a pathway-centric perspective is fundamental to understanding the mechanisms of complex diseases and identifying potential drug targets. To provide an integrated perspective on complex disease mechanisms, we developed a novel computational method to simultaneously identify causal genes and dysregulated pathways. First, we identified a representative set of genes that are differentially expressed in cancer compared to non-tumor control cases. Assuming that disease-associated gene expression changes are caused by genomic alterations, we determined potential paths from such genomic causes to target genes through a network of molecular interactions. Applying our method to sets of genomic alterations and gene expression profiles of 158 glioblastoma multiforme (GBM) patients, we uncovered candidate causal genes and causal paths that are potentially responsible for the altered expression of disease genes. We discovered a set of putative causal genes that may play a role in the disease. Combining an expression quantitative trait loci (eQTL) analysis with pathway information, our approach allowed us not only to identify potential causal genes but also to find intermediate nodes and pathways mediating the information flow between causal and target genes. Our results indicate that different genomic perturbations indeed dysregulate the same functional pathways, supporting a pathway-centric perspective of cancer. While copy number alterations and gene expression data of glioblastoma patients provided opportunities to test our approach, our method can be applied to any disease system where genetic variations play a fundamental causal role.
It is now recognized that complex diseases should be studied from the perspective of dysregulated pathways and processes rather than individual genes. Indeed, various combinations of molecular perturbations might lead to the same disease. In such cases, responses to these perturbations are expected to converge on common pathways. In addition, signals that are associated with each individual perturbation might be weak, rendering studies of complex diseases particularly challenging. Aiming to provide an integrated perspective on complex disease mechanisms, we developed a novel computational method to simultaneously identify causal genes and dysregulated pathways. Starting with the identification of a disease-associated set of genes and their statistical associations with genomic alterations, we utilized graph-theoretical techniques and combinatorial algorithms to determine potential paths from the genomic causes through a network of molecular interactions. We applied our method to sets of genomic alterations and gene expression profiles of glioblastoma multiforme (GBM) patients, uncovering candidate causal genes and causal paths that are potentially responsible for the altered expression of disease-associated target genes. While copy number alterations and gene expression data of GBM patients provided opportunities to test our approach, our method can be applied to any disease system where genetic alterations play a fundamental causal role, and provides an important step toward the understanding of complex diseases.
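To make the idea of tracing paths from genomic causes to target genes concrete, the following is a minimal sketch and not the method developed in this work. It assumes that eQTL-style associations are already available as a mapping from each target gene to its candidate causal genes, that the interaction network is a directed adjacency list, and that a plain breadth-first shortest path is an acceptable stand-in for the graph-theoretical techniques mentioned above. All gene names and function names (`shortest_path`, `causal_paths`, `pathway_usage`) are hypothetical.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search; return one shortest path as a list, or None."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parent:
                continue
            parent[neighbor] = node
            if neighbor == target:
                # Walk back through the parents to rebuild the path.
                path = [neighbor]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def causal_paths(graph, associations):
    """Map each (candidate causal gene, target gene) pair to a path, if any."""
    paths = {}
    for target, candidates in associations.items():
        for causal in candidates:
            path = shortest_path(graph, causal, target)
            if path is not None:
                paths[(causal, target)] = path
    return paths


def pathway_usage(paths):
    """Count how many distinct causal genes route through each intermediate node."""
    usage = {}
    for (causal, _target), path in paths.items():
        for node in path[1:-1]:
            usage.setdefault(node, set()).add(causal)
    return {node: len(genes) for node, genes in usage.items()}


# Hypothetical toy network: two different causal genes converge on one pathway.
graph = {
    "CAUSAL_A": ["HUB"],
    "CAUSAL_B": ["HUB"],
    "HUB": ["TF"],
    "TF": ["TARGET_1", "TARGET_2"],
}
# Hypothetical associations: target gene -> candidate causal genes.
associations = {
    "TARGET_1": ["CAUSAL_A"],
    "TARGET_2": ["CAUSAL_B"],
}

paths = causal_paths(graph, associations)
usage = pathway_usage(paths)
```

In this toy example, both candidate causal genes reach their targets through the same intermediate nodes, which is the kind of convergence on common pathways described above. A shortest path ignores edge confidence and alternative routes, so it should be read only as an illustration of the path-finding step.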