Article

Genome-wide association study of flowering time and grain yield traits in a worldwide collection of rice germplasm

Nature Genetics volume 44, pages 32–39 (2012)


A high-density haplotype map recently enabled a genome-wide association study (GWAS) in a population of indica subspecies of Chinese rice landraces. Here we extend this methodology to a larger and more diverse sample of 950 worldwide rice varieties, including the Oryza sativa indica and Oryza sativa japonica subspecies, to perform an additional GWAS. We identified a total of 32 new loci associated with flowering time and with ten grain-related traits, indicating that the larger sample increased the power to detect trait-associated variants using GWAS. To characterize various alleles and complex genetic variation, we developed an analytical framework for haplotype-based de novo assembly of the low-coverage sequencing data in rice. We identified candidate genes for 18 associated loci through detailed annotation. This study shows that the integrated approach of sequence-based GWAS and functional genome annotation has the potential to match complex traits to their causal polymorphisms in rice.





We thank the China National Rice Research Institute for providing the rice germplasm samples. We thank S. Griffiths and G. Moore for critical reading of the manuscript. We thank Z. Zhang and E.S. Buckler for helping us use the compressed MLM and Z. Ning for assistance with sequence alignment. This work was supported by the Chinese Academy of Sciences (KSCX2-YW-N-094), the Ministry of Agriculture of China (2011ZX08001-004 and 2011ZX08009-002), the National Natural Science Foundation of China (30821004) and the Ministry of Science and Technology of China (2011CB100205) to B.H.

Author information

Author notes

    Xuehui Huang, Yan Zhao & Xinghua Wei

    These authors contributed equally to this work.


  1. National Center for Gene Research, National Center for Plant Gene Research (Shanghai), Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.

    Xuehui Huang, Yan Zhao, Canyang Li, Ahong Wang, Qiang Zhao, Wenjun Li, Yunli Guo, Liuwei Deng, Chuanrang Zhu, Danlin Fan, Yiqi Lu, Qijun Weng, Kunyan Liu, Taoying Zhou, Yufeng Jing, Lizhen Si, Guojun Dong, Tao Huang, Tingting Lu, Qi Feng & Bin Han

  2. Chinese Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.

    Xuehui Huang, Yan Zhao & Bin Han

  3. State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou, China.

    Xinghua Wei, Guojun Dong & Qian Qian

  4. National Center for Plant Gene Research, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China.

    Jiayang Li




B.H. conceived of the project and its components. J.L. and B.H. contributed to the original concept of the project. W.L., Y.G., L.D., D.F., Y.L., Q.W. and Q.F. performed the genome sequencing. X.H., Q.Z., Y.Z., C.Z., K.L., L.S., T.H. and T.L. performed the genome data analyses. Y.Z., C.Z., Q.Z. and X.H. improved the imputation program for the data analyses. X.H., Q.Z. and Y.Z. developed an analytical framework for de novo assembly of the low-coverage sequencing data. X.W., C.L., A.W., T.Z., Y.J., G.D. and Q.Q. collected samples and performed the phenotyping. Y.Z. and X.H. performed the GWAS and statistical analyses. X.H. and B.H. analyzed all of the data together and wrote the paper.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Bin Han.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Note, Supplementary Tables 4, 6, 7 and 9–12 and Supplementary Figures 1–31.

Excel files

  1. Supplementary Table 1

    The list of 950 accessions sampled in the collection.

  2. Supplementary Table 2

    The levels of sequence diversity (π) in each group across the rice genome.

  3. Supplementary Table 3

    The levels of pairwise population differentiation (Fst) across the rice genome.

  4. Supplementary Table 5

    The list of SNP sites with population-specific alleles.

  5. Supplementary Table 8

    The detailed list of all the large-effect variations in the rice genome.

  6. Supplementary Table 13

    The detailed list of the microarrays used in the study and their related descriptions.

  7. Supplementary Table 14

    The genotype dataset of indica accessions at the causal polymorphic sites of Hd3a.

  8. Supplementary Table 15

    The genotype dataset of indica accessions at the causal polymorphic sites of OsFBX310.

  9. Supplementary Table 16

    The genotype dataset of japonica accessions at the causal polymorphic sites of OsRAL6.
