The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy
Open Access
- 26 November 2012
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 41 (D1), D597-D604
- https://doi.org/10.1093/nar/gks1160
Abstract
The interrogation of genetic markers in environmental meta-barcoding studies is currently seriously hindered by the lack of taxonomically curated reference data sets for the targeted genes. The Protist Ribosomal Reference database (PR2, http://ssu-rrna.org/) provides unique access to eukaryotic small sub-unit (SSU) ribosomal RNA and DNA sequences with curated taxonomy. The database consists mainly of nuclear-encoded protistan sequences. However, metazoans, land plants, macrosporic fungi and eukaryotic organelles (mitochondrion, plastid and others) are also included because they are useful for the analysis of high-throughput sequencing data sets. Introns and putative chimeric sequences have also been carefully checked. The taxonomic assignment of each sequence consists of eight unique taxonomic fields. In total, 136 866 sequences are nuclear encoded and 45 708 (36 501 mitochondrial and 9657 chloroplastic) are from organelles, the remainder being putative chimeric sequences. The website allows users to download sequences from the entire database or from subsets of it (including representative sequences after clustering at a given level of similarity). Several web tools also allow searches by sequence similarity. The presence of both rRNA and rDNA sequences, the handling of introns (crucial for eukaryotic sequences), a normalized eight-rank taxonomy and updates following new GenBank releases were made possible by a long-term collaboration between experts in taxonomy and computer scientists.
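To make the eight-field taxonomy concrete, the following is a minimal Python sketch of how a lineage string with eight ranked fields might be split into named ranks. The rank names, the semicolon separator, the function name `parse_taxonomy` and the example lineage are all assumptions made for illustration; the abstract states only that there are eight fields, and the actual layout should be checked against the downloaded files.

```python
from typing import Dict

# Assumed rank names, for illustration only. The abstract says only that
# each sequence carries eight ranked taxonomic fields.
RANKS = (
    "kingdom", "supergroup", "division", "class",
    "order", "family", "genus", "species",
)


def parse_taxonomy(lineage: str, sep: str = ";") -> Dict[str, str]:
    """Split a lineage string into a rank -> name mapping.

    The separator is an assumption; a trailing separator is tolerated.
    Raises ValueError when the number of fields is not exactly eight.
    """
    fields = [f.strip() for f in lineage.strip().rstrip(sep).split(sep)]
    if len(fields) != len(RANKS):
        raise ValueError(
            f"expected {len(RANKS)} taxonomic fields, got {len(fields)}"
        )
    return dict(zip(RANKS, fields))


# Illustrative lineage, written by hand and not copied from the database.
example = (
    "Eukaryota;Stramenopiles;Ochrophyta;Bacillariophyta;"
    "Thalassiosirales;Thalassiosiraceae;Thalassiosira;"
    "Thalassiosira_pseudonana"
)

taxonomy = parse_taxonomy(example)
print(taxonomy["genus"])
```

Rejecting lineages that do not have exactly eight fields reflects the idea of a normalized taxonomy: every sequence can then be compared or grouped at the same rank without special cases.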