Revisiting a GWAS peak in Arabidopsis thaliana reveals possible confounding by genetic heterogeneity

Abstract

Genome-wide association studies (GWAS) have become a standard approach for exploring the genetic basis of phenotypic variation. However, correlation is not causation, and only a tiny fraction of all associations have been experimentally confirmed. One practical problem is that a peak of association does not always pinpoint a causal gene, but may instead be tagging multiple causal variants. In this study, we reanalyze a previously reported peak associated with flowering time traits in a Swedish Arabidopsis thaliana population. The peak appeared to pinpoint the AOP2/AOP3 cluster of glucosinolate biosynthesis genes, which is known to be responsible for natural variation in herbivore resistance. Here we propose an alternative hypothesis by demonstrating that the AOP2/AOP3 flowering association can be wholly accounted for by allelic variation in two flanking genes with clear roles in regulating flowering: NDX1, a regulator of the main flowering time controller FLC, and GA1, which plays a central role in gibberellin synthesis and is required for flowering under some conditions. In other words, we propose that the AOP2/AOP3 flowering-time association may be yet another example of a spurious, “synthetic” association, arising from fitting a single-locus model in the presence of two statistically associated causative loci. We conclude that caution is needed when using GWAS for fine-mapping.
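The mechanism behind a synthetic association can be illustrated with a toy simulation. The sketch below is not the analysis performed in this study: the loci `a`, `b`, and `m`, the effect sizes, and the linkage parameters are all hypothetical, and it uses plain least squares with no correction for population structure, which a real GWAS would need. Two correlated causal loci flank a non-causal marker that mostly tags the haplotype carrying both causal alleles. A single-locus scan gives the marker a strong signal, which should largely disappear once both causal loci are included in the model.

```python
import numpy as np

def simulate(n=2000, seed=1):
    """Simulate inbred-line genotypes (0/1) and a phenotype.

    a, b: hypothetical causal loci, statistically associated with each other.
    m:    hypothetical non-causal marker tagging the a=1, b=1 haplotype.
    """
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, n)
    # b matches a in about 70% of lines, so the causal loci are correlated.
    b = np.where(rng.random(n) < 0.7, a, 1 - a)
    # m marks lines carrying both causal alleles, with 5% mismatches.
    tag = a & b
    m = np.where(rng.random(n) < 0.05, 1 - tag, tag)
    # Additive phenotype: only a and b have an effect; m has none.
    y = a + b + rng.normal(0.0, 1.0, n)
    return a, b, m, y

def ols_t(X, y):
    """Fit OLS with an intercept; return slopes and their t-statistics."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta[1:], (beta / se)[1:]

if __name__ == "__main__":
    a, b, m, y = simulate()
    _, t_single = ols_t(m, y)                            # single-locus model
    _, t_joint = ols_t(np.column_stack([a, b, m]), y)    # multilocus model
    print("marker alone, t:", t_single[0])
    print("a, b, marker jointly, t:", t_joint)
```

Because the marker has no effect of its own in this simulation, its t-statistic in the joint model is expected to fall near zero while those of `a` and `b` stay large. In real data the causal loci are not known in advance, which is why fine-mapping from a single peak calls for caution.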

Fig. 1: GWAS for flowering time revealed a peak centered on the chromosome 4 AOP cluster.
Fig. 2: Functional validation of AOP2 alleles.
Fig. 3: Haplotype structure around the AOP peak.
Fig. 4: Multilocus GWAS suggests genetic heterogeneity.
Fig. 5: Summary of our results.

Acknowledgements

The authors would like to thank Caroline Dean for sharing knowledge and seeds of the AtNDX1 mutant. We also thank Ümit Selen for technical support with the data analysis, and Daniel J. Kliebenstein and Haijun Liu for comments and helpful discussions. The VBCF Metabolomics Facility is supported by the City of Vienna through the Vienna Business Agency.

Author information

Corresponding author

Correspondence to Magnus Nordborg.

Ethics declarations

Competing interests

The authors declare no competing interest.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Associate editor Marc Stift

Cite this article

Sasaki, E., Köcher, T., Filiault, D.L. et al. Revisiting a GWAS peak in Arabidopsis thaliana reveals possible confounding by genetic heterogeneity. Heredity 127, 245–252 (2021). https://doi.org/10.1038/s41437-021-00456-3
