Different exposures, including diet, physical activity, and external conditions, can contribute to genotype–environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated in G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.
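The abstract does not spell out the model, so the following is only an illustrative sketch of what a multi-environment G×E effect can look like, not the StructLMM implementation. It assumes that each individual's allelic effect is a persistent effect plus a weighted sum of that individual's environmental variables, with the weights drawn at random. The function name `simulate_gxe` and all parameters (`beta_g`, `sigma_gxe`, `sigma_noise`, the 0.3 allele frequency) are hypothetical choices made for this example.

```python
import random


def simulate_gxe(n=200, n_env=3, beta_g=0.5, sigma_gxe=0.3,
                 sigma_noise=1.0, seed=0):
    """Simulate a phenotype with a multi-environment G×E effect.

    Hypothetical toy model (an assumption, not the published method):
        effect_i = beta_g + sum_k env[i][k] * gamma[k]
        pheno_i  = geno_i * effect_i + noise_i
    """
    rng = random.Random(seed)

    # Environmental variables: n individuals x n_env exposures.
    env = [[rng.gauss(0.0, 1.0) for _ in range(n_env)] for _ in range(n)]

    # Genotypes: minor-allele counts (0, 1 or 2) at an assumed frequency of 0.3.
    geno = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]

    # Random weight of each environment in the interaction effect; scaling by
    # sqrt(n_env) keeps the total interaction variance near sigma_gxe ** 2.
    gamma = [rng.gauss(0.0, sigma_gxe / n_env ** 0.5) for _ in range(n_env)]

    # Per-individual allelic effect: persistent part plus environment-driven part.
    effect = [beta_g + sum(e * w for e, w in zip(row, gamma)) for row in env]

    # Phenotype: genotype times individual effect, plus independent noise.
    pheno = [g * b + rng.gauss(0.0, sigma_noise) for g, b in zip(geno, effect)]
    return env, geno, effect, pheno


env, geno, effect, pheno = simulate_gxe()
```

Setting `sigma_gxe=0.0` in this sketch removes the interaction, so every individual has the same allelic effect `beta_g`; a test for G×E asks whether the data support a non-zero interaction variance across the set of environments considered jointly.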
The BIOS RNA data can be obtained from the European Genome-phenome Archive (EGA; accession EGAS00001001077). Genotype data are available from the respective biobanks.
The authors thank C. Lippert and L. Parts for helpful discussions. This research was conducted using the UK Biobank Resource (Application Number 14069). R.M. was supported by a PhD fellowship from the Mathematical Genomics and Medicine program, funded by the Wellcome Trust. F.P.C., D.H. and O.S. received support from core funding of the European Molecular Biology Laboratory and the European Union’s Horizon2020 research and innovation program under grant agreement N635290. I.B. acknowledges funding from Wellcome (WT098051 and WT206194). M.J.B. was supported by a fellowship from the EMBL Interdisciplinary Postdoc (EI3POD) program under Marie Skłodowska-Curie Actions COFUND (grant number 664726). The Biobank-Based Integrative Omics Studies (BIOS) Consortium is funded by BBMRI-NL, a research infrastructure financed by the Dutch government (NWO 184.021.007).
F.P.C. was employed at Microsoft while performing the research.
Supplementary Figures 1–23, Supplementary Tables 1 and 2, and Supplementary Note
Interactions identified by StructLMM for BMI in UK Biobank
Associations identified by StructLMM and LMM in the association analysis of BMI using data from UK Biobank
Summary table of interaction eQTL analysis in blood cohort
Pathway enrichment analysis for interaction eQTLs that are in linkage with GWAS loci
eQTL Manhattan plots for interaction eQTLs that colocalize with disease variants
Interaction eQTL colocalizing with disease variants
Cite this article
Moore, R., Casale, F.P., Jan Bonder, M. et al. A linear mixed-model approach to study multivariate gene–environment interactions. Nat Genet 51, 180–186 (2019). https://doi.org/10.1038/s41588-018-0271-0