EST analysis in barley defines a unigene set comprising 4,000 genes

Abstract
We report the generation of 13,109 EST (Expressed Sequence Tag) sequences from barley as a first step towards a unigene set for this organism. Sequences were generated from three libraries encompassing 7,568 cDNA clones. Comparisons to nucleic acid and protein sequence databases enabled the assignment of putative functions to the mRNAs. The results of the searches against protein databases were parsed and built into a regularly updated database, available over the World Wide Web. The Stack_Pack clustering system was applied to survey the level of redundancy, which was calculated to be 69%, indicating that the ESTs represent approximately 4,000 different barley genes. To demonstrate the usefulness of the clustering results for further experiments, we subjected alignments of sequences similar to elongation factor 1 alpha to additional analysis. These sequences represented the largest group with identical putative functions (228 members), and clustering based on the analysis of 3′ sequences subdivided the group into five different assemblies. Alignments of the consensus sequences facilitated the development of PCR assays suitable for genetic mapping of four of the different gene-family members, which reside on chromosomes 2H, 4H and 5H, thus demonstrating the suitability of the clustering results as a basis for in-depth analyses of barley gene families.
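The relationship between the reported redundancy and the size of the unigene set can be checked with a short calculation. The sketch below assumes that redundancy is defined as 1 − (distinct genes / total ESTs); the abstract does not state the exact formula, so this definition and the function name `unique_gene_estimate` are illustrative assumptions, not the Stack_Pack procedure itself.

```python
def unique_gene_estimate(total_ests: int, redundancy: float) -> int:
    """Estimate the number of distinct genes in an EST collection.

    Assumption (not stated in the abstract): redundancy is defined as
    1 - (distinct genes / total ESTs).
    """
    if not 0.0 <= redundancy <= 1.0:
        raise ValueError("redundancy must lie between 0 and 1")
    # Fraction of ESTs that represent a gene not already seen.
    return round(total_ests * (1.0 - redundancy))


# Figures from the abstract: 13,109 ESTs at 69% redundancy.
estimate = unique_gene_estimate(13109, 0.69)
print(estimate)
```

Under this assumed definition the estimate is about 4,064, which agrees with the figure of roughly 4,000 genes given above.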