The origins of apicomplexan sequence innovation
- 10 April 2009
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 19 (7), 1202-1213
- https://doi.org/10.1101/gr.083386.108
Abstract
The Apicomplexa are a group of phylogenetically related parasitic protists that include Plasmodium, Cryptosporidium, and Toxoplasma. Together they are a major global burden on human health and economics. To meet this challenge, several international consortia have generated vast amounts of sequence data for many of these parasites. Here, we exploit these data to perform a systematic analysis of protein family and domain incidence across the phylum. A total of 87,736 protein sequences were collected from 15 apicomplexan species. These were compared with three protein databases, including the partial genome database, PartiGeneDB, which increases the breadth of taxonomic coverage. From these searches we constructed taxonomic profiles that reveal the extent of apicomplexan sequence diversity. Sequences without a significant match outside the phylum were denoted as apicomplexan specialized. These were collated into 9134 discrete protein families and placed in the context of the apicomplexan phylogeny, identifying the putative origin of each family. Most apicomplexan families were associated with an individual genus or species. Interestingly, many genera-specific innovations were associated with specialized host cell invasion and/or parasite survival processes. Contrastingly, those families reflecting more ancestral relationships were enriched in generalized housekeeping functions such as translation and transcription, which have diverged within the apicomplexan lineage. Protein domain searches revealed 192 domains not previously reported in apicomplexans together with a number of novel domain combinations. We highlight domains that may be important to parasite survival.
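The two classification steps described in the abstract (flagging sequences with no significant match outside the phylum, then placing each family on the phylogeny to find its putative origin) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the e-value cutoff, the `Hit` record, the function names, and the flat toy tree. The origin of a family is approximated as the most recent common ancestor of the species that contain its members.

```python
from dataclasses import dataclass

# Hypothetical significance cutoff; the abstract does not state the one used.
E_VALUE_CUTOFF = 1e-5


@dataclass(frozen=True)
class Hit:
    query: str      # identifier of the apicomplexan query protein
    taxon: str      # broad taxonomic group of the matched sequence
    e_value: float  # significance of the match


def taxonomic_profile(hits, cutoff=E_VALUE_CUTOFF):
    """Map each query to the set of taxa with at least one significant hit."""
    profile = {}
    for hit in hits:
        if hit.e_value <= cutoff:
            profile.setdefault(hit.query, set()).add(hit.taxon)
    return profile


def is_specialized(query, profile, phylum="Apicomplexa"):
    """True if the query has no significant match outside the phylum."""
    return not (profile.get(query, set()) - {phylum})


# Toy phylogeny as child -> parent links (illustrative, unresolved tree;
# not the phylogeny used in the paper).
PARENT = {
    "Plasmodium falciparum": "Plasmodium",
    "Plasmodium vivax": "Plasmodium",
    "Cryptosporidium parvum": "Cryptosporidium",
    "Cryptosporidium hominis": "Cryptosporidium",
    "Toxoplasma gondii": "Toxoplasma",
    "Plasmodium": "Apicomplexa",
    "Cryptosporidium": "Apicomplexa",
    "Toxoplasma": "Apicomplexa",
}


def lineage(node):
    """Return the path from a node up to the root of the toy tree."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def putative_origin(species):
    """Most recent common ancestor of the species holding family members."""
    species = list(species)
    if not species:
        raise ValueError("a family needs at least one member species")
    shared = set(lineage(species[0]))
    for name in species[1:]:
        shared &= set(lineage(name))
    # Walk up from the first species; the first shared node is the deepest one.
    for node in lineage(species[0]):
        if node in shared:
            return node
    raise ValueError("species do not share an ancestor in the toy tree")


hits = [
    Hit("protA", "Apicomplexa", 1e-30),
    Hit("protA", "Metazoa", 1e-12),   # significant match outside the phylum
    Hit("protB", "Apicomplexa", 1e-20),
    Hit("protB", "Fungi", 0.5),       # not significant, so ignored
]
profile = taxonomic_profile(hits)
print(is_specialized("protA", profile))
print(is_specialized("protB", profile))
print(putative_origin(["Plasmodium falciparum", "Plasmodium vivax"]))
```

In this sketch a family found only in the two Plasmodium species is assigned to the genus node, while one shared with Toxoplasma gondii would be assigned to the phylum root, mirroring the distinction the abstract draws between genus-level innovations and more ancestral families.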