Normalization and subtraction: two approaches to facilitate gene discovery.
Open Access
- 1 September 1996
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 6 (9), 791-806
- https://doi.org/10.1101/gr.6.9.791
Abstract
Large-scale sequencing of cDNAs randomly picked from libraries has proven to be a very powerful approach to discover (putatively) expressed sequences that, in turn, once mapped, may greatly expedite the process involved in the identification and cloning of human disease genes. However, the integrity of the data and the pace at which novel sequences can be identified depend to a great extent on the cDNA libraries that are used. Because, in a typical cell, the mRNAs of the prevalent and intermediate frequency classes together comprise as much as 50-65% of the total mRNA mass but represent no more than 1000-2000 different mRNAs, redundant identification of mRNAs of these two frequency classes is destined to become overwhelming relatively early in any such random gene discovery program, thus seriously compromising its cost-effectiveness. With the goal of facilitating such efforts, we previously developed a method to construct directionally cloned normalized cDNA libraries and applied it to generate infant brain (INIB) and fetal liver/spleen (INFLS) libraries, from which a total of 45,192 and 86,088 expressed sequence tags, respectively, have been derived. While improving the representation of the longest cDNAs in our libraries, we developed three additional methods to normalize cDNA libraries and generated over 35 libraries, most of which have been contributed to our Integrated Molecular Analysis of Genomes and Their Expression (IMAGE) Consortium and thus distributed widely and used for sequencing and mapping. In an attempt to facilitate the process of gene discovery further, we have also developed a subtractive hybridization approach designed specifically to eliminate (or reduce significantly the representation of) large pools of arrayed and (mostly) sequenced clones from normalized libraries yet to be (or just partly) surveyed.
Here we present a detailed description and a comparative analysis of four methods that we developed and used to generate normalized cDNA libraries from human (15), mouse (3), rat (2), as well as the parasite Schistosoma mansoni (1). In addition, we describe the construction and preliminary characterization of a subtracted liver/spleen library (INFLS-SI) that resulted from the elimination (or reduction of representation) of ~5000 INFLS-IMAGE clones from the INFLS library.