A linear mixed-model approach to study multivariate gene–environment interactions

Abstract

Different exposures, including diet, physical activity, or external conditions, can contribute to genotype–environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated in G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.
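As a rough illustration of what a multi-environment G×E signal looks like, the sketch below simulates a phenotype in which the effect of one variant varies across individuals as a function of several environments, and then computes a joint statistic for all genotype-by-environment product terms. The function names, parameter values, and simulation design are assumptions made for illustration and are not taken from the paper. The statistic is an ordinary fixed-effect F-statistic, used here only as a simple stand-in; it is not the StructLMM test.

```python
import numpy as np


def simulate_gxe(n=2000, n_env=5, beta=0.3, sigma_gxe=0.5, maf=0.3, seed=0):
    """Simulate one variant whose effect depends on several environments.

    Hypothetical toy design: each individual's allelic effect is
    beta + E @ gamma, where gamma holds one weight per environment.
    """
    rng = np.random.default_rng(seed)
    x = rng.binomial(2, maf, size=n).astype(float)  # genotype dosage 0/1/2
    x -= x.mean()                                   # center the genotype
    E = rng.standard_normal((n, n_env)) / np.sqrt(n_env)
    gamma = sigma_gxe * rng.standard_normal(n_env)  # environment weights
    beta_i = beta + E @ gamma                       # per-individual effect
    y = x * beta_i + rng.standard_normal(n)         # phenotype with unit noise
    return x, E, y, beta_i


def _rss(X, y):
    """Residual sum of squares of a least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def interaction_f_stat(x, E, y):
    """F-statistic comparing models with and without the x*E product terms."""
    n, q = E.shape
    reduced = np.column_stack([np.ones(n), x, E])      # intercept, G, E
    full = np.column_stack([reduced, x[:, None] * E])  # plus G x E terms
    rss0, rss1 = _rss(reduced, y), _rss(full, y)
    return ((rss0 - rss1) / q) / (rss1 / (n - full.shape[1]))


if __name__ == "__main__":
    for s in (0.0, 0.5):
        x, E, y, _ = simulate_gxe(sigma_gxe=s, seed=1)
        print(f"sigma_gxe={s}: F = {interaction_f_stat(x, E, y):.2f}")
```

Running the script with `sigma_gxe` set to zero and to a positive value shows how the joint statistic responds when interaction effects are present. A fixed-effect test of this kind spends one degree of freedom per environment, which becomes costly as the number of environments grows into the hundreds.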

Fig. 1: Overview of the StructLMM model.
Fig. 2: Assessment of statistical calibration and power using simulated data.
Fig. 3: Applications to model G×E on BMI in UK Biobank.
Fig. 4: Downstream analysis to explore identified G×E loci.
Fig. 5: Gene-context interactions in a blood gene expression cohort.

Data availability

The BIOS RNA data can be obtained from the European Genome-phenome Archive (EGA; accession EGAS00001001077). Genotype data are available from the respective biobanks.

Acknowledgements

The authors thank C. Lippert and L. Parts for helpful discussions. This research was conducted using the UK Biobank Resource (Application Number 14069). R.M. was supported by a PhD fellowship from the Mathematical Genomics and Medicine program, funded by the Wellcome Trust. F.P.C., D.H. and O.S. received support from core funding of the European Molecular Biology Laboratory and the European Union’s Horizon2020 research and innovation program under grant agreement N635290. I.B. acknowledges funding from Wellcome (WT098051 and WT206194). M.J.B. was supported by a fellowship from the EMBL Interdisciplinary Postdoc (EI3POD) program under Marie Skłodowska-Curie Actions COFUND (grant number 664726). The Biobank-Based Integrative Omics Studies (BIOS) Consortium is funded by BBMRI-NL, a research infrastructure financed by the Dutch government (NWO 184.021.007).

Author information

Contributions

R.M., F.P.C., I.B., and O.S. conceived the method. R.M., F.P.C., and D.H. implemented the methods. R.M., F.P.C., and M.J.B. analyzed the data. L.F. and the BIOS Consortium provided data resources. R.M., F.P.C., I.B., and O.S. interpreted results and wrote the paper.

Corresponding authors

Correspondence to Inês Barroso or Oliver Stegle.

Ethics declarations

Competing interests

F.P.C. was employed at Microsoft while performing the research.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–23, Supplementary Tables 1 and 2, and Supplementary Note

Reporting Summary

Supplementary Table 3

Interactions identified by StructLMM for BMI in UK Biobank

Supplementary Table 4

Associations identified by StructLMM and LMM in the association analysis of BMI using data from UK Biobank

Supplementary Table 5

Summary table of interaction eQTL analysis in blood cohort

Supplementary Table 6

Pathway enrichment analysis for interaction eQTLs that are in linkage with GWAS loci

Supplementary Data 1

eQTL Manhattan plots for interaction eQTLs that colocalize with disease variants

Supplementary Data 2

Interaction eQTL colocalizing with disease variants

Cite this article

Moore, R., Casale, F.P., Jan Bonder, M. et al. A linear mixed-model approach to study multivariate gene–environment interactions. Nat Genet 51, 180–186 (2019). https://doi.org/10.1038/s41588-018-0271-0
