Sequence and analysis of rice chromosome 4


Rice is the principal food for over half of the population of the world. With its genome size of 430 megabase pairs (Mb), the cultivated rice species Oryza sativa is a model plant for genome research¹. Here we report the sequence analysis of chromosome 4 of O. sativa, one of the first two rice chromosomes to be sequenced completely². The finished sequence spans 34.6 Mb and represents 97.3% of the chromosome. In addition, we report the longest known sequence for a plant centromere, a completely sequenced contig of 1.16 Mb corresponding to the centromeric region of chromosome 4. We predict 4,658 protein coding genes and 70 transfer RNA genes. A total of 1,681 predicted genes match available unique rice expressed sequence tags. Transposable elements have a pronounced bias towards the euchromatic regions, indicating a close correlation of their distributions to genes along the chromosome. Comparative genome analysis between cultivated rice subspecies shows that there is an overall syntenic relationship between the chromosomes and divergence at the level of single-nucleotide polymorphisms and insertions and deletions. By contrast, there is little conservation in gene order between rice and Arabidopsis.
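The figures quoted in the abstract allow a few back-of-envelope derivations, sketched below. Only the four input constants come from the abstract. The derived quantities (implied chromosome length, gene density, mean spacing per gene, EST-supported fraction) and the function name `summarize` are illustrative assumptions, not values or methods reported by the authors.

```python
# Input figures taken from the abstract.
FINISHED_MB = 34.6          # finished sequence length, in Mb
COVERAGE = 0.973            # fraction of the chromosome the sequence represents
PREDICTED_GENES = 4658      # predicted protein-coding genes
EST_MATCHED_GENES = 1681    # predicted genes matching unique rice ESTs


def summarize():
    """Derive rough summary statistics (illustrative only, not from the paper)."""
    # If 34.6 Mb is 97.3% of the chromosome, the full length is 34.6 / 0.973.
    implied_length_mb = FINISHED_MB / COVERAGE
    return {
        "implied_chromosome_mb": implied_length_mb,
        "unsequenced_mb": implied_length_mb - FINISHED_MB,
        # Density and spacing are averaged over the finished sequence only.
        "genes_per_mb": PREDICTED_GENES / FINISHED_MB,
        "mean_kb_per_gene": FINISHED_MB * 1000 / PREDICTED_GENES,
        "est_supported_fraction": EST_MATCHED_GENES / PREDICTED_GENES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

These are averages over the whole finished sequence. They say nothing about how genes are distributed between euchromatic and heterochromatic regions, which the paper addresses separately.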


Figure 1: Maps of rice chromosome 4.
Figure 2: Distribution of various repeats and features along chromosome 4.
Figure 3: Comparative sequence analyses between rice subspecies indica GLA4 and japonica Nipponbare.


  1. Sasaki, T. & Burr, B. International rice genome sequencing project: the effort to completely sequence the rice genome. Curr. Opin. Plant Biol. 3, 138–141 (2000)
  2. Sasaki, T. et al. The genome sequence and structure of rice chromosome 1. Nature this issue
  3. Harushima, Y. et al. A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148, 479–494 (1998)
  4. Wu, J. et al. A comprehensive rice transcript map containing 6591 expressed sequence tag sites. Plant Cell 14, 525–535 (2002)
  5. Chen, M. et al. An integrated physical and genetic map of the rice genome. Plant Cell 14, 537–545 (2002)
  6. Gale, M. D. & Devos, K. M. Comparative genetics in the grasses. Proc. Natl Acad. Sci. USA 95, 1971–1974 (1998)
  7. Meinke, D. W., Cherry, J. M., Dean, C. D., Rounsley, S. & Koornneef, M. Arabidopsis thaliana: a model plant for genome analysis. Science 282, 662–682 (1998)
  8. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000)
  9. Lin, X. et al. Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402, 761–768 (1999)
  10. Mayer, K. et al. Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769–777 (1999)
  11. Messing, J. & Llaca, V. Importance of anchor genomes for any plant genome project. Proc. Natl Acad. Sci. USA 95, 2017–2020 (1998)
  12. Zhao, Q. et al. A fine physical map of the rice chromosome 4. Genome Res. 12, 817–823 (2002)
  13. Saji, S. et al. A physical map with yeast artificial chromosome (YAC) clones covering 63% of the 12 rice chromosomes. Genome 44, 32–37 (2001)
  14. Copenhaver, G. et al. Genetic definition and sequence analysis of Arabidopsis centromeres. Science 286, 2468–2474 (1999)
  15. Jenny, A. & Keller, W. Cloning of cDNAs encoding the 160 kDa subunit of the bovine cleavage and polyadenylation specificity factor. Nucleic Acids Res. 23, 2629–2635 (1995)
  16. Yoshimura, S. et al. Expression of Xa1, a bacterial blight-resistance gene in rice, is induced by bacterial inoculation. Proc. Natl Acad. Sci. USA 95, 1663–1668 (1998)
  17. Dong, F. et al. Rice (Oryza sativa) centromeric regions consist of complex DNA. Proc. Natl Acad. Sci. USA 95, 8135–8140 (1998)
  18. Oka, H. I. in Rice Biotechnology (eds Khush, G. S. & Toenniessen, G. H.) 55–80 (CAB International, Oxon, 1991)
  19. Khush, G. S. Origin, dispersal, cultivation and variation of rice. Plant Mol. Biol. 35, 25–34 (1997)
  20. Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92 (2002)
  21. Goff, S. A. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100 (2002)
  22. Huang, X., Adams, M. D., Zhou, H. & Kerlavage, A. R. A tool for analyzing and annotating genomic sequences. Genomics 46, 37–45 (1997)
  23. Paterson, A. H. et al. Toward a unified genetic map of higher plants, transcending the monocot–dicot divergence. Nature Genetics 14, 380–382 (1996)
  24. Maugenest, S., Martinez, I., Godin, B., Perez, P. & Lescure, A. M. Structure of two maize phytase genes and their spatio-temporal expression during seedling development. Plant Mol. Biol. 39, 503–514 (1999)
  25. Knutzon, D. S. et al. Cloning of a coconut endosperm cDNA encoding a 1-acyl-sn-glycerol-3-phosphate acyltransferase that accepts medium-chain-length substrates. Plant Physiol. 109, 999–1006 (1995)
  26. Ewing, B. & Green, P. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8, 186–194 (1998)
  27. Gordon, D., Abajian, C. & Green, P. Consed. A graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998)
  28. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997)
  29. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997)
  30. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997)


We thank T. Sasaki and the RGP for rice genetic and EST markers and a PAC genomic library of rice Nipponbare; R. Wing and the CUGI for providing BAC libraries of the Nipponbare variety; Monsanto for the rice working-draft sequence data; G. Barry and J. Liu for help; R. Buell and Q. Yuan for help with the annotation and analysis of chromosome 4 sequences; X. Huang and Z. Ning for help with using the AAT and the ssaha programs, respectively; members of the National Centre for Gene Research for assistance; Z. Xu, Z. Chen, G. Wang, Q. Ma and Q. Zhang for support; and X. Lin, X. Deng, Y. Li, L. Zhou, N. Zheng, X. Liu and members of the IRGSP for discussion. This work was supported by grants from the Ministry of Science and Technology of the People's Republic of China, Chinese Academy of Sciences, and the Shanghai Municipal Commission of Science and Technology.

Author information

Correspondence to Bin Han.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

About this article

Cite this article

Feng, Q., Zhang, Y., Hao, P. et al. Sequence and analysis of rice chromosome 4. Nature 420, 316–320 (2002) doi:10.1038/nature01183
