Initial assessment of human gene diversity and expression patterns based upon 83-million nucleotides of cDNA sequence
- 28 September 1995
- journal article
- research article
- Vol. 377, 3-+
Abstract
In an effort to identify new genes and analyse their expression patterns, 174,472 partial complementary DNA sequences (expressed sequence tags (ESTs)), totalling more than 52 million nucleotides of human DNA sequence, have been generated from 300 cDNA libraries constructed from 37 distinct organs and tissues. These ESTs have been combined with an additional 118,406 ESTs from the database dbEST, for a total of 83 million nucleotides, and treated as a shotgun sequence assembly project. The assembly process yielded 29,599 distinct tentative human consensus (THC) sequences and 58,384 non-overlapping ESTs. Of these 87,983 distinct sequences, 10,214 further characterize previously known genes based on statistically significant similarity to sequences in the available databases; the remainder identify previously unknown genes. Thirty tissues were sampled by over 1,000 ESTs each; only eight genes were matched by ESTs from all 30 tissues, and 227 genes were represented in 20 or more of the tissues sampled with more than 1,000 ESTs. Approximately 40% of identified human genes appear to be associated with basic energy metabolism, cell structure, homeostasis and cell division, 22% with RNA and protein synthesis and processing, and 12% with cell signalling and communication.
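The counts reported in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python. The variable names are illustrative and do not come from the paper, and the mean EST length is only a rough figure derived from the rounded 83-million-nucleotide total.

```python
# Counts as reported in the abstract; variable names are illustrative.
new_ests = 174_472              # ESTs generated for this study
dbest_ests = 118_406            # additional ESTs taken from dbEST
thc_sequences = 29_599          # tentative human consensus (THC) sequences
singleton_ests = 58_384         # non-overlapping ESTs
known_gene_matches = 10_214     # sequences matching previously known genes
total_nucleotides = 83_000_000  # rounded total quoted in the abstract

total_ests = new_ests + dbest_ests
distinct_sequences = thc_sequences + singleton_ests
unmatched_sequences = distinct_sequences - known_gene_matches

# Rough mean EST length; approximate because the nucleotide total is rounded.
approx_mean_est_length = total_nucleotides / total_ests

# Share of identified genes covered by the three functional groups named.
named_category_share = 40 + 22 + 12

print(total_ests)           # 292878
print(distinct_sequences)   # 87983
print(unmatched_sequences)  # 77769
print(round(approx_mean_est_length))
print(named_category_share) # 74
```

The sum of THC sequences and non-overlapping ESTs reproduces the 87,983 distinct sequences stated in the abstract. The 77,769 figure is simply the remainder after subtracting the 10,214 matches to known genes, and the three named functional groups account for roughly 74% of identified genes.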