Prediction of phenotype and gene expression for combinations of mutations

Abstract
Molecular interactions provide paths for information flows. Genetic interactions reveal active information flows and reflect their functional consequences. We integrated these complementary data types to model the transcription network controlling cell differentiation in yeast. Genetic interactions were inferred from linear decomposition of gene expression data and were used to direct the construction of a molecular interaction network mediating these genetic effects. This network included both known and novel regulatory influences, and predicted genetic interactions. For corresponding combinations of mutations, the network model predicted quantitative gene expression profiles and precise phenotypic effects. Multiple predictions were tested and verified.

### Synopsis

Capitalizing on recent advances in genetics will require not only linking individual genes to traits but also understanding how multiple genes act together in complex ways to affect outcomes. Most phenotypes are controlled by multiple genes with multiple allelic variants, and these alleles often interact in complex ways ([Shook and Johnson, 1999][1]; [Steinmetz et al, 2002][2]; [Carlborg and Haley, 2004][3]; [Sinha et al, 2006][4]). Network models that account for this complexity will have the capacity to predict, systematically and explicitly, the effects of multiple interacting genetic perturbations. This capacity will enable testing of genetically complex hypotheses, prioritization of candidate genes for targeted intervention, and the personalization of prognoses and therapies ([Ideker et al, 2001][5]; [Galitski, 2004][6]). In this paper, we describe a method to systematically identify genetic‐interaction effects in yeast cells. Our goal was to infer specific functional relationships to drive network modeling and make precise testable predictions for novel combinations of perturbed genes.
We based our analysis on a quantitative generalization of the classical genetic‐interaction approach of observing how genetic perturbations interact to affect phenotypes. This method has historically been used to reveal functional relationships such as activation, repression, and pathway order ([Avery and Wasserman, 1992][7]). However, because mutant phenotypes result from the activities of complex molecular pathways, the biochemical interpretation of a genetic interaction is often ambiguous and frequently involves multiple alternative molecular models and both direct and indirect mechanisms ([Kelley and Ideker, 2005][8]; [Zhang et al, 2005][9]). Conversely, molecular interactions, plentifully generated through high‐throughput methods, often lack functional interpretation or are of uncertain relevance to specific genetic observations ([Galitski, 2004][6]). We exploited this complementarity by using knowledge gained from genetic interactions to direct the integration of molecular data, and thereby assign function to specific molecular interaction paths.

We used the filamentous growth response of budding yeast as a model system ([Gimeno et al, 1992][10]; [Lengeler et al, 2000][11]). In response to environmental cues, yeast cells switch from their round single‐cell growth form to a pathogen‐like, adhesive, invasive, filamentous form. We collected microarray data for multiple genetic perturbations under these conditions and, viewing the expression of each gene as a quantitative phenotype, we subjected the expression data to our mathematical decomposition. The genetically ‘direct’ (not necessarily molecularly direct) effects from regulator genes on the expression of hundreds of differentially expressed genes were separated from the genetically ‘indirect’ effects that involve genetic interactions between regulator genes. In this way our decomposition method dissected the complexities of genetic interactions. This strategy is outlined in [Figure 1][12].
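To make the idea of separating ‘direct’ effects from interaction effects concrete, the following is a minimal sketch and not the decomposition used in the paper. It assumes the simplest possible case: two regulator genes (called A and B here), log2 expression of a target gene measured in wild type, both single deletions, and the double deletion, and a linear model with one interaction term. With four measurements and four parameters the fit is exact. The gene names, expression values, and function names are all hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class Decomposition(NamedTuple):
    """Terms of y = baseline + effect_a*xA + effect_b*xB + interaction*xA*xB,
    where xA and xB are 1 if the regulator is deleted and 0 otherwise."""
    baseline: float
    effect_a: float
    effect_b: float
    interaction: float


def decompose(wt: float, mut_a: float, mut_b: float, mut_ab: float) -> Decomposition:
    """Split four log2 expression values into direct and interaction terms."""
    return Decomposition(
        baseline=wt,
        effect_a=mut_a - wt,                      # 'direct' effect of deleting A
        effect_b=mut_b - wt,                      # 'direct' effect of deleting B
        interaction=mut_ab - mut_a - mut_b + wt,  # deviation from additivity
    )


def predict(d: Decomposition, a_deleted: bool, b_deleted: bool) -> float:
    """Predicted log2 expression for a given combination of deletions."""
    x_a = 1.0 if a_deleted else 0.0
    x_b = 1.0 if b_deleted else 0.0
    return d.baseline + d.effect_a * x_a + d.effect_b * x_b + d.interaction * x_a * x_b


# Hypothetical log2 expression values: (wild type, a-deletion, b-deletion, double deletion).
expression: Dict[str, Tuple[float, float, float, float]] = {
    "target_1": (0.0, -1.0, -0.5, -1.5),  # single-mutant effects simply add up
    "target_2": (0.0, -2.0, -1.0, -2.0),  # double mutant resembles the a-deletion
}

results = {gene: decompose(*values) for gene, values in expression.items()}

for gene, d in results.items():
    additive_only = d.baseline + d.effect_a + d.effect_b
    print(gene, d, "additive-only double-mutant prediction:", additive_only)
```

In this toy setting, a nonzero interaction term flags a genetic interaction between the two regulators for that target gene, and the gap between the additive-only prediction and the observed double mutant is exactly that term. Scaling the same idea to more regulators would require more interaction terms and more measured genotypes than parameters, which this sketch does not attempt.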
We next integrated molecular interaction data with our decomposition results to construct regulatory network models. [Figure 2][13] from the paper illustrates the strategy with a small network for the transcriptional regulation of a single gene by three of our trait‐linked genes. These networks represent specific, testable hypotheses of influence from the causal perturbation to the expression of affected genes. We tested a set of predictions with additional microarray experiments, and found that our model provided significantly more accurate predictions than a similar model that did not incorporate genetic interactions. We then identified an expression pattern that was strongly correlated with measurements of the filamentous growth phenotype. The network model inferred for the regulation of these genes successfully implicated new regulators of the phenotype and was used to predict phenotypes for 13 novel combinatorial deletions. The model correctly predicted all of the double‐mutant phenotypes.

Our methods are designed for application to any system in which multiple interacting genes are linked to phenotypes. The data‐integration strategy exploited the availability of accurate, genome‐scale molecular interaction data sets, and identified instances in which functionally important molecular data are missing. With the increasing availability of human interaction data and further modeling developments to address allelic variation in outbred populations, similar quantitative and integrative techniques may ultimately be applied to disease‐related models ([Stelzl et al, 2005][14]).

Mol Syst Biol. 3: 96

[1]: #ref-28
[2]: #ref-31
[3]: #ref-7
[4]: #ref-29
[5]: #ref-17
[6]: #ref-13
[7]: #ref-3
[8]: #ref-18
[9]: #ref-43
[10]: #ref-15
[11]: #ref-20
[12]: #F1
[13]: #F2
[14]: #ref-32