Different exposures, including diet, physical activity, or external conditions, can contribute to genotype–environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated in G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.


Data availability

The BIOS RNA data can be obtained from the European Genome-phenome Archive (EGA; accession EGAS00001001077). Genotype data are available from the respective biobanks.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




Acknowledgements

The authors thank C. Lippert and L. Parts for helpful discussions. This research was conducted using the UK Biobank Resource (Application Number 14069). R.M. was supported by a PhD fellowship from the Mathematical Genomics and Medicine program, funded by the Wellcome Trust. F.P.C., D.H. and O.S. received support from core funding of the European Molecular Biology Laboratory and the European Union’s Horizon 2020 research and innovation program under grant agreement N635290. I.B. acknowledges funding from Wellcome (WT098051 and WT206194). M.J.B. was supported by a fellowship from the EMBL Interdisciplinary Postdoc (EI3POD) program under Marie Skłodowska-Curie Actions COFUND (grant number 664726). The Biobank-Based Integrative Omics Studies (BIOS) Consortium is funded by BBMRI-NL, a research infrastructure financed by the Dutch government (NWO 184.021.007).

Author information

Author notes

  1. These authors contributed equally: Rachel Moore, Francesco Paolo Casale.


Affiliations

  1. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

    • Rachel Moore
    • Inês Barroso
  2. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

    • Rachel Moore
    • Marc Jan Bonder
    • Danilo Horta
    • Oliver Stegle
  3. University of Cambridge, Cambridge, UK

    • Rachel Moore
  4. Microsoft Research New England, Cambridge, Massachusetts, USA

    • Francesco Paolo Casale
  5. University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, the Netherlands

    • Lude Franke
  6. European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany

    • Oliver Stegle
  7. Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany

    • Oliver Stegle
  8. Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands

    • Bastiaan T. Heijmans
    • P. Eline Slagboom
    • Marian Beekman
    • Joris Deelen
    • H. Eka D. Suchiman
    • Ruud van der Breggen
    • Nico Lakenberg
    • Maarten van Iterson
    • Matthijs Moed
    • René Luijk
  9. Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands

    • Peter A. C. ’t Hoen
    • Michael Verbiest
    • Michiel van Galen
    • Martijn Vermaat
  10. Department of Internal Medicine, ErasmusMC, Rotterdam, the Netherlands

    • Joyce van Meurs
    • André G. Uitterlinden
    • P. Mila Jhamai
    • Marijn Verkerk
    • Jeroen van Rooij
  11. Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, the Netherlands

    • Aaron Isaacs
    • Jan H. Veldink
    • Leonard H. van den Berg
  12. Department of Psychiatry, VU University Medical Center, Neuroscience Campus Amsterdam, Amsterdam, the Netherlands

    • Rick Jansen
  13. Department of Genetics, University of Groningen, University Medical Centre Groningen, Groningen, the Netherlands

    • Lude Franke
    • Cisca Wijmenga
    • Alexandra Zhernakova
    • Ettje F. Tigchelaar
    • Patrick Deelen
    • Dasha V. Zhernakova
    • Marc Jan Bonder
    • Freerk van Dijk
    • Morris A. Swertz
  14. Department of Biological Psychology, VU University Amsterdam, Neuroscience Campus Amsterdam, Amsterdam, the Netherlands

    • Dorret I. Boomsma
    • René Pool
    • Jenny van Dongen
    • Jouke J. Hottenga
  15. Department of Internal Medicine and School for Cardiovascular Diseases (CARIM), Maastricht University Medical Center, Maastricht, the Netherlands

    • Marleen M. J. van Greevenbroek
    • Coen D. A. Stehouwer
    • Carla J. H. van der Kallen
    • Casper G. Schalkwijk
  16. Department of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, the Netherlands

    • Diana van Heemst
  17. Department of Genetic Epidemiology, ErasmusMC, Rotterdam, the Netherlands

    • Cornelia M. van Duijn
  18. Department of Epidemiology, ErasmusMC, Rotterdam, the Netherlands

    • Bert A. Hofman
  19. Sequence Analysis Support Core, Leiden University Medical Center, Leiden, the Netherlands

    • Hailiang Mei
    • Peter van ’t Hof
    • Szymon M. Kielbasa
  20. SURFsara, Amsterdam, the Netherlands

    • Jan Bot
    • Irene Nooren
  21. Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands

    • Freerk van Dijk
    • Morris A. Swertz
  22. Medical Statistics Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands

    • Wibowo Arindrarto
    • Erik W. van Zwet



Consortia

  1. BIOS Consortium


Contributions

R.M., F.P.C., I.B., and O.S. conceived the method. R.M., F.P.C., and D.H. implemented the methods. R.M., F.P.C., and M.J.B. analyzed the data. L.F. and the BIOS Consortium provided data resources. R.M., F.P.C., I.B., and O.S. interpreted results and wrote the paper.

Competing interests

F.P.C. was employed at Microsoft while performing the research.

Corresponding authors

Correspondence to Inês Barroso or Oliver Stegle.

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–23, Supplementary Tables 1 and 2, and Supplementary Note

  2. Reporting Summary

  3. Supplementary Table 3

    Interactions identified by StructLMM for BMI in UK Biobank

  4. Supplementary Table 4

    Associations identified by StructLMM and LMM in the association analysis of BMI using data from UK Biobank

  5. Supplementary Table 5

    Summary table of interaction eQTL analysis in blood cohort

  6. Supplementary Table 6

    Pathway enrichment analysis for interaction eQTLs that are in linkage with GWAS loci

  7. Supplementary Data 1

    eQTL Manhattan plots for interaction eQTLs that colocalize with disease variants

  8. Supplementary Data 2

    Interaction eQTL colocalizing with disease variants
