Abstract
Genomic loci that control the variance of agronomically important traits are increasingly important due to the profusion of unpredictable environments arising from climate change. The ability to identify such variance-controlling loci in association studies will be critical for future breeding efforts. Two statistical approaches that have already been used in the variance genome-wide association study (vGWAS) paradigm are the Brown–Forsythe test (BFT) and the double generalized linear model (DGLM). To ensure that these approaches are deployed as effectively as possible, it is critical to study the factors that influence their ability to identify variance-controlling loci. We used genome-wide marker data in maize (Zea mays L.) and Arabidopsis thaliana to simulate traits controlled by epistasis, genotype by environment (GxE) interactions, and variance quantitative trait nucleotides (vQTNs). We then quantified true and false positive detection rates of the BFT and DGLM across all simulated traits. We also conducted a vGWAS using both the BFT and DGLM on plant height in a maize diversity panel. The observed true positive detection rates at the maximum sample size considered (N = 2815) suggest that both of these vGWAS approaches are capable of identifying epistasis and GxE for sufficiently large sample sizes. We also noted that the DGLM decisively outperformed the BFT for simulated traits controlled by vQTNs at sample sizes of N = 500. Although we conclude that there are still certain aspects of vGWAS approaches that need further refinement, this study suggests that the BFT and DGLM are capable of identifying variance-controlling loci in current state-of-the-art plant or agronomic data sets.
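To illustrate the kind of signal the BFT looks for, the following is a minimal sketch, not the authors' pipeline. It computes the Brown–Forsythe statistic (a one-way ANOVA F statistic on absolute deviations from each genotype class's median) for one marker with no variance effect and one behaving like a vQTN. The sample size per class, the trait distributions, the seed, and the function name `brown_forsythe_stat` are all hypothetical choices made for illustration.

```python
import random
import statistics


def brown_forsythe_stat(groups):
    """Brown-Forsythe F statistic for a list of samples, one per genotype class."""
    # Absolute deviation of every observation from its own group's median.
    z = []
    for g in groups:
        med = statistics.median(g)
        z.append([abs(y - med) for y in g])

    k = len(z)
    n_total = sum(len(zg) for zg in z)
    group_means = [sum(zg) / len(zg) for zg in z]
    grand_mean = sum(sum(zg) for zg in z) / n_total

    # One-way ANOVA on the absolute deviations.
    between = sum(len(zg) * (m - grand_mean) ** 2 for zg, m in zip(z, group_means))
    within = sum((v - m) ** 2 for zg, m in zip(z, group_means) for v in zg)
    return (between / (k - 1)) / (within / (n_total - k))


rng = random.Random(2022)  # fixed seed, arbitrary
n_per_class = 250          # hypothetical: two homozygous classes, N = 500

# Null marker: both genotype classes share the same trait variance.
null_groups = [[rng.gauss(0, 1) for _ in range(n_per_class)] for _ in range(2)]
# Hypothetical vQTN: same mean, but the second class has twice the standard deviation.
vqtn_groups = [
    [rng.gauss(0, 1) for _ in range(n_per_class)],
    [rng.gauss(0, 2) for _ in range(n_per_class)],
]

f_null = brown_forsythe_stat(null_groups)
f_vqtn = brown_forsythe_stat(vqtn_groups)
print(f"null marker F = {f_null:.2f}, vQTN marker F = {f_vqtn:.2f}")
```

In a genome-wide scan, a statistic like this would be compared against an F distribution with k − 1 and N − k degrees of freedom at every marker, followed by a multiple-testing correction. The sketch stops at the statistic itself and does not account for population structure or relatedness.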
Data availability
The genotypic data, simulated trait data, ROC curves, and code to simulate traits are available at https://github.com/mdm10-code/vGWAS_arabidopsis_maize.
Acknowledgements
We would like to thank the Associate Editor and four anonymous referees for their helpful suggestions. The research conducted in this manuscript is supported by National Science Foundation project accession numbers 1355406 and 1733606 and by the University of Illinois Urbana-Champaign Department of Crop Science’s J.C. Hackleman and Lawrence E. Schrader and Elfriede Massier Plant Physiology Fellowship Programs.
Author information
Contributions
MDM conducted all simulations and analyses, wrote the computer program that simulates vQTNs, created all figures and tables, and wrote and edited the manuscript. SBF contributed to the design of the simulation settings, edited the computer program to make it more computationally efficient, and edited the manuscript. GM contributed to various aspects of the statistical analysis, including how to use the DGLM in a meaningful manner and how to interpret the results of the simulation study. These contributions significantly guided the direction of our study. AEL oversaw the entire analysis, designed the simulation study, designed the procedure for simulating vQTNs, and wrote and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Associate editor: Yuan-Ming Zhang.
About this article
Cite this article
Murphy, M.D., Fernandes, S.B., Morota, G. et al. Assessment of two statistical approaches for variance genome-wide association studies in plants. Heredity 129, 93–102 (2022). https://doi.org/10.1038/s41437-022-00541-1