Article
Nature 424, 157-164 (10 July 2003) | doi:10.1038/nature01782; Received 25 February 2003; Accepted 23 April 2003
The DNA sequence of human chromosome 7
LaDeana W. Hillier1, Robert S. Fulton1, Lucinda A. Fulton1, Tina A. Graves1, Kymberlie H. Pepin1, Caryn Wagner-McPherson1, Dan Layman1, Jason Maas1, Sara Jaeger1, Rebecca Walker1, Kristine Wylie1, Mandeep Sekhon1, Michael C. Becker1, Michelle D. O'Laughlin1, Mark E. Schaller1, Ginger A. Fewell1, Kimberly D. Delehaunty1, Tracie L. Miner1, William E. Nash1, Matt Cordes1, Hui Du1, Hui Sun1, Jennifer Edwards1, Holland Bradshaw-Cordum1, Johar Ali1, Stephanie Andrews1, Amber Isak1, Andrew VanBrunt1, Christine Nguyen1, Feiyu Du1, Betty Lamar1, Laura Courtney1, Joelle Kalicki1, Philip Ozersky1, Lauren Bielicki1, Kelsi Scott1, Andrea Holmes1, Richard Harkins1, Anthony Harris1, Cynthia Madsen Strong1, Shunfang Hou1, Chad Tomlinson1, Sara Dauphin-Kohlberg1, Amy Kozlowicz-Reilly1, Shawn Leonard1, Theresa Rohlfing1, Susan M. Rock1, Aye-Mon Tin-Wollam1, Amanda Abbott1, Patrick Minx1, Rachel Maupin1, Catrina Strowmatt1, Phil Latreille1, Nancy Miller1, Doug Johnson1, Jennifer Murray1, Jeffrey P. Woessner1, Michael C. Wendl1, Shiaw-Pyng Yang1, Brian R. Schultz1, John W. Wallis1, John Spieth1, Tamberlyn A. Bieri1, Joanne O. Nelson1, Nicolas Berkowicz1, Patricia E. Wohldmann1, Lisa L. Cook1, Matthew T. Hickenbotham1, James Eldred1, Donald Williams1, Joseph A. Bedell1, Elaine R. Mardis1, Sandra W. Clifton1, Stephanie L. Chissoe1, Marco A. Marra1,10, Christopher Raymond2, Eric Haugen2, Will Gillett2, Yang Zhou2, Rose James2, Karen Phelps2, Shawn Iadanoto2, Kerry Bubb2, Elizabeth Simms2, Ruth Levy2, James Clendenning2, Rajinder Kaul2, W. James Kent3, Terrence S. Furey3, Robert A. Baertsch3, Michael R. Brent4, Evan Keibler4, Paul Flicek4, Peer Bork5, Mikita Suyama5, Jeffrey A. Bailey6, Matthew E. Portnoy7, David Torrents5, Asif T. Chinwalla1, Warren R. Gish1, Sean R. Eddy8, John D. McPherson1,10, Maynard V. Olson2, Evan E. Eichler6, Eric D. Green7, Robert H. Waterston1,10 & Richard K. Wilson1
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St Louis, Missouri 63108, USA
- University of Washington Genome Center, 2225 Fluke Hall on Mason Road, Campus Box 352145, Seattle, Washington 98195, USA
- Center for Biomolecular Science and Engineering, University of California, 321 BE, Santa Cruz, California 95064, USA
- Department of Computer Science, Washington University, Box 1045, St Louis, Missouri 63130, USA
- EMBL, Meyerhofstrasse 1, Heidelberg 69117, Germany
- Department of Genetics, Center for Computational Genomics and Center for Human Genetics, Case Western Reserve University School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106, USA
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Building 50, Room 5222, South Drive, Bethesda, Maryland 20892, USA
- Howard Hughes Medical Institute and Department of Genetics, Washington University School of Medicine, Campus Box 8232, 4566 Scott Ave., St Louis, Missouri 63110, USA
- Present addresses: Genome Sciences Centre, British Columbia Cancer Agency, 600 West 10th Avenue, Room 3427, Vancouver, British Columbia, V5Z-4E6, Canada (M.A.M.); Department of Genome Sciences, Box 357730, University of Washington, 1705 NE Pacific Street, Seattle, Washington 98195-7730, USA (R.H.W.); Baylor College of Medicine, 1 Baylor Plaza, Human Genome Sequencing Center, N1519, Houston, Texas 77030, USA (J.D.M.).
Correspondence to: Richard K. Wilson1
Email: rwilson@watson.wustl.edu
Accession numbers for the sequence analysed for this paper can be found in Table 1. All reported DNA sequences have been deposited in GenBank or EMBL. The updated chromosome 7 sequence can be accessed through GenBank accession BL000002.
Abstract
Human chromosome 7 has historically received prominent attention in the human genetics community, primarily because of the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome to be completed. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of the genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame.


