Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana

Abstract

The genome of the flowering plant Arabidopsis thaliana has five chromosomes (refs 1, 2). Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.

Figure 1: Density of various features along chromosome 1.
Figure 2: Clusters of tRNA genes in chromosome 1.
Figure 3

References

  1. Goodman, H., Ecker, J. R. & Dean, C. The genome of Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 93, 10831–10835 (1995).
  2. Meyerowitz, E. M. in Arabidopsis (eds Meyerowitz, E. M. & Somerville, C.) 21–36 (Cold Spring Harbor Press, Cold Spring Harbor, NY, 1994).
  3. Goffeau, A. et al. Life with 6000 genes. Science 274, 546–567 (1996).
  4. C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282, 2012–2046 (1998).
  5. Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
  6. Lin, X. et al. Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402, 761–768 (1999).
  7. Mayer, K. et al. Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769–777 (1999).
  8. Mozo, T. et al. A complete BAC-based physical map of the Arabidopsis thaliana genome. Nature Genet. 22, 271–275 (1999).
  9. Marra, M. et al. A map for sequence analysis of the Arabidopsis thaliana genome. Nature Genet. 22, 265–270 (1999).
  10. Creusot, F. et al. The CIC library: a large insert YAC library for genome mapping in Arabidopsis thaliana. Plant J. 8, 763–770 (1995).
  11. Ewens, W. J. et al. Genome mapping with anchored clones: theoretical aspects. Genomics 11, 799–805 (1991).
  12. Venter, J. C., Smith, H. O. & Hood, L. A new strategy for sequencing. Nature 381, 364–366 (1996).
  13. Choi, S., Creelman, R. A., Mullet, J. E. & Wing, R. Construction and characterization of a bacterial artificial chromosome library of Arabidopsis thaliana. Plant Mol. Biol. Rep. 13, 124–128 (1995).
  14. Mozo, T., Fischer, S., Shizuya, H. & Altmann, T. Construction and characterization of the IGF Arabidopsis BAC library. Mol. Gen. Genet. 258, 562–570 (1998).
  15. Round, E. K., Flowers, S. K. & Richards, E. J. Arabidopsis thaliana centromere regions: genetic map positions and repetitive DNA structure. Genome Res. 7, 1045–1053 (1997).
  16. Richards, E. J., Chao, S., Vongs, A. & Yang, J. Characterization of Arabidopsis thaliana telomeres isolated in yeast. Nucleic Acids Res. 20, 4039–4046 (1992).
  17. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
  18. Lister, C. & Dean, C. Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J. 4, 745–750 (1993).
  19. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).
  20. Beier, D., Stange, N., Gross, H. J. & Beier, H. Nuclear tRNA(Tyr) genes are highly amplified at a single chromosomal site in the genome of Arabidopsis thaliana. Mol. Gen. Genet. 225, 72–80 (1991).
  21. Copenhaver, G. P. et al. Genetic definition and sequence analysis of Arabidopsis centromeres. Science 286, 2468–2474 (1999).
  22. Conner, J. A., Conner, P., Nasrallah, M. E. & Nasrallah, J. B. Comparative mapping of the Brassica S locus region and its homolog in Arabidopsis: implications for the evolution of mating systems in the Brassicaceae. Plant Cell 10, 801–812 (1998).
  23. Rottmann, W. E. et al. 1-aminocyclopropane-1-carboxylate synthase in tomato is encoded by a multigene family whose transcription is induced during fruit and floral senescence. J. Mol. Biol. 222, 937–961 (1991).
  24. Salanoubat, M. et al. Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana. Nature 408, 820–822 (2000).
  25. Tabata, S. et al. Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana. Nature 408, 823–826 (2000).
  26. Chory, J. et al. National Science Foundation-sponsored workshop report: “The 2010 Project” functional genomics and the virtual plant. A blueprint for understanding how plants are built and how to improve them. Plant Physiol. 123, 423–426 (2000).
  27. Lockhart, D. J. & Winzeler, E. A. Genomics, gene expression and DNA arrays. Nature 405, 827–836 (2000).
  28. Ecker, J. R. PFGE and YAC analysis of the Arabidopsis genome. Methods 1, 186–194 (1990).
  29. Oefner, P. J. et al. Efficient random subcloning of DNA sheared in a recirculating point-sink flow system. Nucleic Acids Res. 24, 3879–3886 (1996).
  30. Dietrich, F. S. et al. The nucleotide sequence of Saccharomyces cerevisiae chromosome V. Nature (Suppl.) 387, 78–81 (1997).
  31. Marziali, A., Willis, T. D., Federspiel, N. A. & Davis, R. W. An automated sample preparation system for large-scale DNA sequencing. Genome Res. 9, 457–462 (1999).
  32. Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using Phred I. Accuracy assessment. Genome Res. 8, 175–185 (1998).
  33. Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred II. Error probabilities. Genome Res. 8, 186–194 (1998).
  34. Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998).
  35. Uberbacher, E. C. & Mural, R. J. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl Acad. Sci. USA 88, 11261–11265 (1991).
  36. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
  37. Salzberg, S. L., Pertea, M., Delcher, A. L., Gardner, M. J. & Tettelin, H. Interpolated Markov models for eukaryotic gene finding. Genomics 59, 24–31 (1999).
  38. Lukashin, A. V. & Borodovsky, M. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res. 26, 1107–1115 (1998).
  39. Hebsgaard, S. M. et al. Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3430–3452 (1996).
  40. Huang, X., Adams, M. D., Zhou, H. & Kerlavage, A. R. A tool for analyzing and annotating genomic sequences. Genomics 46, 37–45 (1997).
  41. Frishman, D. & Mewes, H.-W. PEDANTic genome analysis. Trends Genet. 13, 415–416 (1997).
  42. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
  43. Emanuelsson, O., Nielsen, H., Brunak, S. & von Heijne, G. Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005–1016 (2000).

Acknowledgements

We thank K. Mayer and H. Schoof of MIPS for discussions; S. Rhee and E. Huala of TAIR for sequences for the RI markers; and R. Wells for editing the manuscript. This work was funded by National Science Foundation/US Department of Energy/US Department of Agriculture (NSF/DOE/USDA) grants to the SPP Consortium and TIGR.

Author information

Correspondence to Athanasios Theologis or Joseph R. Ecker.

Supplementary information

Supplementary Tables 1–3
