The Institute for Genomic Research & Human Genome Sciences, Inc.
Venter et al. (1995) "Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence", Nature 377 Suppl., pp. 3-174
Abstract
In an effort to identify new genes and analyse their expression patterns,
174,472 partial complementary DNA sequences (expressed sequence tags (ESTs)),
totalling more than 52 million nucleotides of human DNA sequence, have been
generated from 300 cDNA libraries constructed from 37 distinct organs and
tissues. These ESTs have been combined with an additional 118,406 ESTs from
the database dbEST, for a total of 83 million nucleotides, and treated as a
shotgun sequence assembly project. The assembly process yielded 29,599
distinct tentative human consensus (THC) sequences and 58,384 non-overlapping
ESTs. Of these 87,983 distinct sequences, 10,214 further characterize
previously known genes based on statistically significant similarity to
sequences in the available databases; the remainder identify previously
unknown genes. Thirty tissues were sampled by over 1,000 ESTs each; only
eight genes were matched by ESTs from all 30 tissues, and 227 genes were
represented in 20 or more of the tissues sampled with more than 1,000 ESTs.
Approximately 40% of identified human genes appear to be associated with
basic energy metabolism, cell structure, homeostasis and cell division, 22%
with RNA and protein synthesis, and 12% with cell signalling and
communication.