Nature 408, 796-815 (14 December 2000) | doi:10.1038/35048692; Received 20 October 2000; Accepted 15 November 2000
Analysis of the genome sequence of the flowering plant Arabidopsis thaliana
The Arabidopsis Genome Initiative
Abstract
The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene transfer from a cyanobacterial-like ancestor of the plastid. The genome contains 25,498 genes encoding proteins from 11,000 families, similar to the functional diversity of Drosophila and Caenorhabditis elegans, the other sequenced multicellular eukaryotes. Arabidopsis has many families of new proteins but also lacks several common protein families, indicating that the sets of common proteins have undergone differential expansion and contraction in the three multicellular eukaryotes. This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.
- Authorship of this paper should be cited as "The Arabidopsis Genome Initiative"
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
- Kazusa DNA Research Institute, 1532-3 Yana, Kisarazu, Chiba 292, Japan
- Plant Science Institute, Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA;
- Plant Gene Expression Center/USDA-UC Berkeley, 800 Buchanan Street, Albany, California 94710, USA;
- Stanford Genome Technology Center, 855 California Avenue, Palo Alto, California 94304, USA.
- AGOWA GmbH, Glienicker Weg 185, D-12489 Berlin, Germany;
- John Innes Centre, Colney Lane, Norwich NR4 7UH, UK;
- QIAGEN GmbH, Max-Volmer-Str. 4, D-40724 Hilden, Germany;
- Greenomics, Plant Research International, Droevendaalsesteeg 1, NL-6700 AA Wageningen, The Netherlands;
- GATC GmbH, Fritz-Arnold Strasse 23, D-78467 Konstanz, Germany;
- SRD GmbH, Oberurseler Str. 43, Oberursel 61440, Germany;
- Department for Plant Genetics (VIB), University of Gent, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium;
- Katholieke Universiteit Leuven, Laboratory of Gene Technology, Kardinaal Mercierlaan 92, B-3001 Leuven, Belgium
- Genoscope and CNRS FRE2231, 2 rue G. Crémieux, 91057 Evry Cedex, France;
- Genotype GmbH, Angelhofweg 39, D-69259 Wilhelmsfeld, Germany;
- European Molecular Biology Laboratory, Biochemical Instrumentation Program, Meyerhofstr. 1, D-69117 Heidelberg, Germany;
- LION Bioscience AG, Im Neuenheimer Feld 515-517, 69120 Heidelberg, Germany;
- MWG-Biotech AG, Anzinger Strasse 7a, 85560 Ebersberg, Germany;
- CRIBI, Università di Padova, via G. Colombo 3, Padova 35131, Italy
- Washington University Genome Sequencing Center, Washington University in St Louis School of Medicine, 4444 Forest Park Blvd., St. Louis, Missouri 63108, USA;
- Lita Annenberg Hazen Genome Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA;
- Celera Genomics, 850 Lincoln Center Drive, Foster City, California 94494, USA;
- Plant Biology Group, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- GSF-Forschungszentrum f. Umwelt u. Gesundheit, Munich Information Center for Protein Sequences, am Max-Planck-Institut f. Biochemie, Am Klopferspitz 18a, D-82152, Germany;
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA;
- Molecular Genetics Department, John Innes Centre, Colney Lane, Norwich NR4 7UH, UK;
- Dept Genetics, Stanford University Medical School, Stanford, California 94305-5120, USA.
- Cereon Genomics LLC, 45 Sidney St, Cambridge, Massachusetts 02139, USA
- Max-Delbrück-Laboratorium in der Max-Planck-Gesellschaft, Carl-von-Linné-Weg 10, 50829 Cologne, Germany;
- Brassicas and Oilseeds Research Department, John Innes Centre, Norwich NR4 7UJ, UK
- Genoscope, Centre National de Séquençage, 2 rue Gaston Crémieux, CP 5706, 91057 Evry Cedex, France;
- Molekulare Botanik, Universität Ulm, 89069 Ulm, Germany;
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
- McGill University, Dept of Biology, 1205 rue Dr Penfield, Montreal, Quebec, H3A 1B1, Canada;
- Plant Biology Group, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Howard Hughes Medical Institute, The University of Chicago, 1103 East 57th Street, Chicago, Illinois, USA;
- Biology Department, Washington University in St Louis, St Louis, Missouri 63130, USA
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA;
- University of Wisconsin Biotechnology Center, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Section of Plant Biology, University of California, Davis, California 95616, USA;
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
- Department of Plant Sciences, University of Arizona, 303 Forbes Hall; and
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona 85721, USA;
- Biology Department, Washington University in St Louis, St Louis, Missouri 63130, USA
- Entwicklungsgenetik, ZMBP-Centre for Plant Molecular Biology, auf der Morgenstelle 1, Tuebingen D-72076, Germany
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
- The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, California 92037, USA;
- Plant Gene Expression Center/USDA-UC Berkeley, 800 Buchanan Street, Albany, California 94710, USA
- Biology Department, Coker Hall, University of North Carolina, Chapel Hill, North Carolina 27599, USA;
- Sainsbury Laboratory, John Innes Centre, Colney Lane, Norwich NR4 7UJ, UK
- Howard Hughes Medical Institute and Plant Biology Laboratory, The Salk Institute, 10010 North Torrey Pines Road, La Jolla, California 92037, USA
- Carnegie Institution, 260 Panama Street, Stanford, California 94305, USA
- genomeanalysis@tigr.org or genomeanalysis@gsf.de
Correspondence to: Correspondence and requests for materials should be addressed to The Arabidopsis Genome Initiative (e-mail: genomeanalysis@tigr.org or genomeanalysis@gsf.de).


