ForestTreeDB: a database dedicated to the mining of tree transcriptomes

Open Access

27 November 2006

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 35 (Database), D888-D894
https://doi.org/10.1093/nar/gkl882

Abstract

ForestTreeDB is intended as a resource that centralizes large-scale expressed sequence tag (EST) sequencing results from several tree species. It currently encompasses 344 878 quality sequences from 68 libraries, from diverse organs of conifer and hybrid poplar trees. It utilizes the Nimbus data model to provide a hosting system for multiple projects, and uses object-relational mapping APIs in Java and Perl for data access within an Oracle database designed to be scalable, maintainable and extendable. Transcriptome builds, or unigene sets, occupy the focal point of the system. Several of the five current species-specific unigenes were used to design microarrays and SNP resources. The ForestTreeDB web application provides the means for multiple-combination database queries. It presents the user with a list of discrete queries to retrieve and download large EST datasets or sequences from precompiled unigene assemblies. Functional annotation assignment is not trivial in conifers, which are distantly related to angiosperm model plants. Optimal annotations are achieved through database queries that integrate results from several procedures based on open-source tools. ForestTreeDB aims to facilitate sequence mining of coherent annotations in multiple species to support comparative genomic approaches. We plan to continuously enrich ForestTreeDB with other resources through collaborations with other genomic projects.


