TARGeT: a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences
Open Access
- 8 May 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 37 (11), e78
- https://doi.org/10.1093/nar/gkp295
Abstract
Gene families compose a large proportion of eukaryotic genomes. The rapidly expanding genomic sequence database provides a good opportunity to study gene family evolution and function. However, most gene family identification programs are restricted to searching protein databases where data are often lagging behind the genomic sequence data. Here, we report a user-friendly web-based pipeline, named TARGeT (Tree Analysis of Related Genes and Transposons), which uses either a DNA or amino acid ‘seed’ query to: (i) automatically identify and retrieve gene family homologs from a genomic database, (ii) characterize gene structure and (iii) perform phylogenetic analysis. Due to its high speed, TARGeT is also able to characterize very large gene families, including transposable elements (TEs). We evaluated TARGeT using well-annotated datasets, including the ascorbate peroxidase gene family of rice, maize and sorghum and several TE families in rice. In all cases, TARGeT rapidly recapitulated the known homologs and predicted new ones. We also demonstrated that TARGeT outperforms similar pipelines and has functionality that is not offered elsewhere.