Annotation and analysis of a large cuticular protein family with the R&R Consensus in Anopheles gambiae
Open Access
- 18 January 2008
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Genomics
- Vol. 9 (1), 22
- https://doi.org/10.1186/1471-2164-9-22
Abstract
The most abundant family of insect cuticular proteins, the CPR family, is recognized by the R&R Consensus, a domain of about 64 amino acids that binds to chitin and is present throughout arthropods. Several species have now been shown to have more than 100 CPR genes, inviting speculation as to the functional importance of this large number and diversity. We have identified 156 genes in Anopheles gambiae that code for putative cuticular proteins in this CPR family, over 1% of the total number of predicted genes in this species. Annotation was verified using several criteria including identification of TATA boxes, INRs, and DPEs plus support from proteomic and gene expression analyses. Two previously recognized CPR classes, RR-1 and RR-2, form separate, well-supported clades with the exception of a small set of genes with long branches whose relationships are poorly resolved. Several of these outliers have clear orthologs in other species. Although both clades are under purifying selection, the RR-1 variant of the R&R Consensus is evolving at twice the rate of the RR-2 variant and is structurally more labile. In contrast, the regions flanking the R&R Consensus have diversified in amino-acid composition to a much greater extent in RR-2 genes compared with RR-1 genes. Many genes are found in compact tandem arrays that may include similar or dissimilar genes but always include just one of the two classes. Tandem arrays of RR-2 genes frequently contain subsets of genes coding for highly similar proteins (sequence clusters). Properties of the proteins indicated that each cluster may serve a distinct function in the cuticle. The complete annotation of this large gene family provides insight on the mechanisms of gene family evolution and clues about the need for so many CPR genes. These data also should assist annotation of other Anopheles genes.