Comparative Gene Expression Analysis by a Differential Clustering Approach: Application to the Candida albicans Transcription Program

Abstract
Differences in gene expression underlie many of the phenotypic variations between related organisms, yet approaches to characterize such differences on a genome-wide scale are not well developed. Here, we introduce the “differential clustering algorithm” for revealing conserved and diverged co-expression patterns. Our approach is applied at different levels of organization, ranging from pair-wise correlations within specific groups of functionally linked genes to higher-order correlations between such groups. Using the differential clustering algorithm, we systematically compared the transcription program of the fungal pathogen Candida albicans with that of the model organism Saccharomyces cerevisiae. Many of the identified differences are related to the differential requirement for mitochondrial function in the two yeasts. Distinct regulation patterns of cell cycle genes and of amino acid metabolic genes were also revealed and, in some cases, could be linked to the differential appearance of cis-regulatory elements in the gene promoter regions. Our study provides a comprehensive framework for comparative gene expression analysis and a rich source of hypotheses for uncharacterized open reading frames and putative cis-regulatory elements in C. albicans.

Candida albicans is a fungal inhabitant of the intestinal tract of most healthy humans. It becomes a serious and often lethal pathogen in people with a weak immune system. C. albicans is a distant relative of the well-studied baker's yeast, Saccharomyces cerevisiae. It is now possible to determine the degree to which these two fungi have similar or different patterns of transcription. Here, methods were developed that comprehensively compare the expression patterns of S. cerevisiae and C. albicans. A novel algorithm was used to determine whether the expression of groups of genes in one organism is fully, partially, or not at all similar in the other organism. This algorithm was first applied to pre-defined groups of genes predicted to have similar functions and was then used to compare the global organization of the transcription programs between the two organisms. The analysis revealed that the expression patterns reflect the different metabolic preferences of the two yeasts. The authors also found that amino acid metabolism regulation is more differentiated in C. albicans. Furthermore, the different expression patterns can be traced to the use of different regulatory sequences. This study provides a comprehensive framework for comparative gene expression analysis, as well as a Web site with interactive analysis tools, which allow the development of hypotheses concerning uncharacterized genes and the sequences that regulate them.
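
The idea of asking whether co-expression within a gene group is fully, partially, or not at all conserved can be illustrated with a minimal Python sketch. This is not the differential clustering algorithm itself, only a simplified stand-in based on pair-wise Pearson correlations. The function names (`conserved_fraction`, `classify`), the correlation threshold of 0.5, and the cutoffs of 0.8 and 0.2 are all hypothetical choices made for illustration. The sketch assumes two genes-by-conditions expression matrices whose rows are aligned by orthology; the number of conditions may differ between the organisms.

```python
import numpy as np

def conserved_fraction(expr_ref, expr_tgt, threshold=0.5):
    """Fraction of gene pairs co-expressed in the reference organism
    that are also co-expressed in the target organism.

    expr_ref, expr_tgt: genes x conditions arrays; row i of each matrix
    must describe orthologous genes. The threshold is an illustrative
    assumption, not a value taken from the paper.
    """
    c_ref = np.corrcoef(expr_ref)  # pair-wise Pearson correlations, reference
    c_tgt = np.corrcoef(expr_tgt)  # pair-wise Pearson correlations, target
    pairs = np.triu_indices(c_ref.shape[0], k=1)  # each gene pair once
    coexpressed_ref = c_ref[pairs] > threshold
    if not coexpressed_ref.any():
        return float("nan")  # nothing co-expressed in the reference
    return float((c_tgt[pairs][coexpressed_ref] > threshold).mean())

def classify(fraction, full=0.8, partial=0.2):
    """Map a conserved fraction to a label (cutoffs are hypothetical)."""
    if np.isnan(fraction):
        return "undefined"
    if fraction >= full:
        return "full"
    if fraction >= partial:
        return "partial"
    return "none"
```

Calling `classify(conserved_fraction(ref, tgt))` on a group of functionally linked genes gives a coarse label for that group. A fuller treatment would also cluster the genes within each organism and compare the resulting sub-clusters, which this sketch does not attempt.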