Mark D. Adams, Antony R. Kerlavage, Robert D. Fleischmann, Rebecca A. Fuldner, Carol J. Bult, Norman H. Lee, Ewen F. Kirkness, Keith G. Weinstock, Jeannine D. Gocayne, Owen White, Granger Sutton, Judith A. Blake, Rhonda C. Brandon, Man-Wei Chiu, Rebecca A. Clayton, Robin T. Cline, Matthew D. Cotton, Julie Earl Hughes, Leah D. Fine, Lisa M. Fitzgerald, William M. FitzHugh, Janice L. Fritchman, N. S. M. Geoghagen, Anna Glodek, Cheryl L. Gnehm, Michael C. Hanna, Eva Hedblom, Paul S. Hinkle Jr., Jenny M. Kelley, Karim M. Klimek, John C. Kelley, Li-Ing Liu, Simos M. Marmaros, Joseph M. Merrick, Ruben F. Moreno-Palanques, Lisa A. McDonald, Dave T. Nguyen, Susan M. Pellegrino, Cheryl A. Phillips, Sean E. Ryder, John L. Scott, Deborah M. Saudek, Robert Shirley, Keith V. Small, Tracy A. Spriggs, Teresa R. Utterback, Janice F. Weidman, Yi Li, Ray Barthlow, Daniel P. Bednarik, Liang Cao, Mario A. Cepeda, Timothy A. Coleman, Erin-Joi Collins, Donna Dimke, Ping Feng, Andrew Ferrie, Carrie Fischer, Gregg A. Hastings, Wei-Wu He, Jing-Shan Hu, Kathleen A. Huddleston, John M. Greene, Joachim Gruber, Peter Hudson, Ann Kim, Diane L. Kozak, Charles Kunsch, Hungjun Ji, Haodong Li, Paul S. Meissner, Henrik Olsen, Lisa Raymond, Ying-Fei Wei, John Wing, Charlotte Xu, Guo-Liang Yu, Steven M. Ruben, Patrick J. Dillon, Michael R. Fannon, Craig A. Rosen, William A. Haseltine, Chris Fields, Claire M. Fraser & J. Craig Venter

The Institute for Genomic Research & Human Genome Sciences, Inc.

Adams et al. (1995) "Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence", Nature 377 Suppl., pp. 3-174

In an effort to identify new genes and analyse their expression patterns, 174,472 partial complementary DNA sequences (expressed sequence tags (ESTs)), totalling more than 52 million nucleotides of human DNA sequence, have been generated from 300 cDNA libraries constructed from 37 distinct organs and tissues. These ESTs have been combined with an additional 118,406 ESTs from the database dbEST, for a total of 83 million nucleotides, and treated as a shotgun sequence assembly project. The assembly process yielded 29,599 distinct tentative human consensus (THC) sequences and 58,384 non-overlapping ESTs. Of these 87,983 distinct sequences, 10,214 further characterize previously known genes based on statistically significant similarity to sequences in the available databases; the remainder identify previously unknown genes. Thirty tissues were sampled by over 1,000 ESTs each; only eight genes were matched by ESTs from all tissues, and 227 genes were represented in 20 or more of the tissues sampled with more than 1,000 ESTs. Approximately 40% of identified human genes appear to be associated with basic energy metabolism, cell structure, homeostasis and cell division, 22% with RNA and protein synthesis, and 12% with cell signalling and communication.
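The counts quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis: the constant and function names are hypothetical, and the derived figures (the combined EST count, the number of sequences without a database match, and the matched fraction) are computed here from the stated counts rather than quoted from the paper.

```python
# Counts as stated in the abstract; all names here are illustrative only.
NEW_ESTS = 174_472          # ESTs generated for this study
DBEST_ESTS = 118_406        # additional ESTs taken from dbEST
THC_SEQUENCES = 29_599      # tentative human consensus (THC) sequences
NON_OVERLAPPING = 58_384    # ESTs that did not overlap any other sequence
MATCHED_KNOWN = 10_214      # distinct sequences matching previously known genes


def tally_counts():
    """Derive summary figures from the counts stated in the abstract."""
    total_ests = NEW_ESTS + DBEST_ESTS
    distinct = THC_SEQUENCES + NON_OVERLAPPING
    unmatched = distinct - MATCHED_KNOWN
    return {
        "total_ests": total_ests,
        "distinct_sequences": distinct,
        "unmatched_sequences": unmatched,
        "matched_fraction": MATCHED_KNOWN / distinct,
    }


if __name__ == "__main__":
    for name, value in tally_counts().items():
        print(f"{name}: {value}")
```

The sum of THC sequences and non-overlapping ESTs reproduces the 87,983 distinct sequences given in the abstract. Subtracting the 10,214 database matches leaves 77,769 sequences, which corresponds to "the remainder" described as identifying previously unknown genes.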

