A genome-wide association study (GWAS) can be a powerful tool for the identification of genes associated with agronomic traits in crop species, but it is often hindered by population structure and the large extent of linkage disequilibrium. In this study, we identified agronomically important genes in rice using GWAS based on whole-genome sequencing, followed by the screening of candidate genes based on the estimated effect of nucleotide polymorphisms. Using this approach, we identified four new genes associated with agronomic traits. Some genes were undetectable by standard SNP analysis, but we detected them using gene-based association analysis. This study provides fundamental insights relevant to the rapid identification of genes associated with agronomic traits using GWAS and will accelerate future efforts aimed at crop improvement.
- Supplementary Text and Figures: Supplementary Figures 1–21 and Supplementary Tables 1, 3–5, 13 and 14.
- Supplementary Table 2: Phenotypic data of the seven traits observed in 2013 and 2014.
- Supplementary Table 6: List of the top 50 P-value-ranked genes in the gene-based association analysis of days to heading.
- Supplementary Table 7: List of the top 50 P-value-ranked genes in the gene-based association analysis of plant height.
- Supplementary Table 8: List of the top 50 P-value-ranked genes in the gene-based association analysis of panicle length.
- Supplementary Table 9: List of the top 50 P-value-ranked genes in the gene-based association analysis of panicle number per plant.
- Supplementary Table 10: List of the top 50 P-value-ranked genes in the gene-based association analysis of leaf blade width.
- Supplementary Table 11: List of the top 50 P-value-ranked genes in the gene-based association analysis of spikelet number per panicle.
- Supplementary Table 12: List of the top 50 P-value-ranked genes in the gene-based association analysis of awn length.