Article

Nature 402, 761-768 (16 December 1999) | doi:10.1038/45471; Received 5 October 1999; Accepted 29 October 1999

Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana

Xiaoying Lin1, Samir Kaul1, Steve Rounsley2, Terrance P. Shea2, Maria-Ines Benito, Christopher D. Town, Claire Y. Fujii, Tanya Mason, Cheryl L. Bowman, Mary Barnstead, Tamara V. Feldblyum, C. Robin Buell, Karen A. Ketchum, John Lee, Catherine M. Ronning, Hean L. Koo, Kelly S. Moffat, Lisa A. Cronin, Mian Shen, Grace Pai, Susan Van Aken, Lowell Umayam, Luke J. Tallon, John E. Gill, Mark D. Adams2, Ana J. Carrera, Todd H. Creasy, Howard M. Goodman2, Chris R. Somerville2, Greg P. Copenhaver2, Daphne Preuss2, William C. Nierman, Owen White, Jonathan A. Eisen, Steven L. Salzberg, Claire M. Fraser & J. Craig Venter2

  The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
  1. These authors contributed equally to this work
  2. Present addresses: Cereon Genomics, 270 Albany Street, Cambridge, Massachusetts 02139, USA (S.R., T.P.S.); Celera Genomics Corporation, 45 West Gude Drive, Rockville, Maryland 20850, USA (M.D.A., J.C.V.); Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114, USA (H.M.G.); Department of Plant Biology, Carnegie Institution of Washington, 260 Panama St., Stanford, California 94305, USA (C.R.S.); Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, Illinois 60637, USA (G.P.C., D.P.).

Correspondence and requests for materials should be addressed to J. C. Venter (e-mail: jcventer@celera.com).


Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130–140 Mb), excellent physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes 4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing of nearly 2 Mb within the genetically defined centromere revealed a low density of recognizable genes, and a high density and diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a continuous stretch of 75% of the mitochondrial genome into chromosome 2.
