Abstract
Data on replacement mutations in genes of disease patients exist in a variety of online resources. In addition, genome sequencing projects and individual gene sequencing efforts have led to the identification of disease gene homologs in diverse metazoan species. The availability of these two types of information provides unique opportunities to investigate factors that are important in the development of genetically based disease by contrasting long- and short-term molecular evolutionary patterns. We therefore analyzed disease-associated human genetic variation in seven disease genes: the cystic fibrosis transmembrane conductance regulator, glucose-6-phosphate dehydrogenase, the neural cell adhesion molecule L1, phenylalanine hydroxylase, paired box 6, the X-linked retinoschisis gene, and TSC2/tuberin. Our analyses indicate that disease mutations show distinct patterns when examined from an evolutionary perspective. Human replacement mutations resulting in disease are overabundant at the amino acid positions that have been most conserved throughout the long-term history of metazoans. In contrast, human polymorphic replacement mutations and silent mutations are randomly distributed across sites with respect to the level of conservation of amino acid sites within genes. Furthermore, disease-causing amino acid changes are of types that are usually not observed among species. Using Grantham's chemical difference matrix, we find that amino acid changes observed in disease patients are far more radical than the variation found among species and in non-diseased humans. Overall, our results demonstrate the usefulness of evolutionary analyses for understanding patterns of human disease mutations and underscore the biomedical significance of the sequence data currently being generated by model organism genome sequencing projects.