Genome-wide association studies (GWAS) have become a standard approach for exploring the genetic basis of phenotypic variation. However, correlation is not causation, and only a tiny fraction of all associations have been experimentally confirmed. One practical problem is that a peak of association does not always pinpoint a single causal gene, but may instead tag multiple causal variants. In this study, we reanalyze a previously reported peak associated with flowering time traits in a Swedish Arabidopsis thaliana population. The peak appeared to pinpoint the AOP2/AOP3 cluster of glucosinolate biosynthesis genes, which is known to be responsible for natural variation in herbivore resistance. Here we propose an alternative hypothesis by demonstrating that the AOP2/AOP3 flowering association can be wholly accounted for by allelic variation in two flanking genes with clear roles in regulating flowering: NDX1, a regulator of the main flowering time controller FLC, and GA1, which plays a central role in gibberellin synthesis and is required for flowering under some conditions. In other words, we propose that the AOP2/AOP3 flowering-time association may be yet another example of a spurious, “synthetic” association, arising from fitting a single-locus model in the presence of two statistically associated causative loci. We conclude that caution is needed when using GWAS for fine-mapping.
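As a toy illustration of how a synthetic association of this kind can arise, the Python sketch below simulates two causal loci in mild linkage disequilibrium and a non-causal marker that tags haplotypes carrying the trait-increasing allele at both. Everything in it is an assumption made for illustration: the names `a`, `b`, and `m`, the haplotype frequencies, the effect sizes, and the noise level are hypothetical and are not estimates from the study. Ordinary least squares stands in for the mixed model normally used in GWAS, so this is not the authors' analysis.

```python
import numpy as np


def simulate(seed=0, n=5000):
    """Toy simulation of a synthetic association (all values hypothetical)."""
    rng = np.random.default_rng(seed)

    # Two-locus haplotypes (a, b) with assumed frequencies.
    # The excess of (0, 0) and (1, 1) puts the two causal loci in mild LD.
    haplotypes = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
    freqs = [0.3, 0.2, 0.2, 0.3]
    idx = rng.choice(4, size=n, p=freqs)
    a = haplotypes[idx, 0]  # causal locus 1
    b = haplotypes[idx, 1]  # causal locus 2

    # Non-causal marker: present only on haplotypes carrying both causal alleles.
    m = a * b

    # The phenotype depends additively on a and b only; m has no effect.
    y = a + b + rng.normal(0.0, 0.5, size=n)

    # Single-locus scan: squared correlation of each variant with the phenotype.
    r2 = {name: np.corrcoef(g, y)[0, 1] ** 2
          for name, g in (("a", a), ("m", m), ("b", b))}

    # Joint model including both causal loci and the marker.
    X = np.column_stack([np.ones(n), a, b, m])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    coef = dict(zip(["intercept", "a", "b", "m"], beta))
    return r2, coef


if __name__ == "__main__":
    r2, coef = simulate()
    print("single-locus r^2:", {k: round(v, 3) for k, v in r2.items()})
    print("joint-model effects:", {k: round(v, 3) for k, v in coef.items()})
```

Under these assumptions, the single-locus scan ranks the non-causal marker above either causal locus, while the joint model assigns the marker an effect close to zero. This mirrors the logic of the abstract: the association is accounted for once both flanking loci are included in the model.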
The authors would like to thank Caroline Dean for sharing knowledge and seeds of the AtNDX1 mutant. We also thank Ümit Selen for technical support with the data analysis, and Daniel J. Kliebenstein and Haijun Liu for comments and helpful discussions. The VBCF Metabolomics Facility is supported by the City of Vienna through the Vienna Business Agency.
The authors declare no competing interest.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Associate editor: Marc Stift
Cite this article
Sasaki, E., Köcher, T., Filiault, D.L. et al. Revisiting a GWAS peak in Arabidopsis thaliana reveals possible confounding by genetic heterogeneity. Heredity 127, 245–252 (2021). https://doi.org/10.1038/s41437-021-00456-3