Article | Published:

Reverse gene–environment interaction approach to identify variants influencing body-mass index in humans


Identifying gene–environment (G×E) interactions contributing to human cardiometabolic disorders is challenging. Here we apply a reverse G×E candidate search by deriving candidate variants from promoter–enhancer interactions that respond to dietary fatty acid challenge through altered chromatin accessibility in primary human adipocytes. We then test all variants residing in lipid-responsive open chromatin sites in adipocyte promoter–enhancer contacts for interaction effects between genotype and dietary saturated fat intake on body-mass index (BMI) in the UK Biobank. We discover 14 new G×E variants in 12 lipid-responsive promoters, including in well-known lipid-related genes (LIPE, CARM1 and PLIN2) and newly associated genes, such as LDB3, for which we provide further functional and integrative genomic evidence. We further identify 24 G×E variants in enhancers, for a total of 38 new G×E variants for BMI in the UK Biobank, demonstrating that molecular genomics data produced in physiologically relevant contexts can be applied to discover new functional G×E mechanisms in humans.


Data availability

The ATAC-seq data for primary human preadipocytes and adipocytes (untreated and lipid-challenged cells) and the pCHi-C data for primary human adipocytes under lipid-challenge conditions have been deposited in the Gene Expression Omnibus under accession GSE129574 and are available upon request from the corresponding author.




Acknowledgements

This research has been conducted using the UK Biobank Resource under application number 33934. We thank the individuals who participated in the METSIM and UK Biobank studies. We also thank the UNGC sequencing core at UCLA for performing the DNA and RNA sequencing. This study was funded by National Institutes of Health (NIH) grants HL-095056, HL-28481 and U01DK105561. K.M.G. was supported by NIH-NHLBI grant 1F31HL142180, M.A. was supported by an HHMI Gilliam grant, D.Z.P. was supported by NIH-NCI grant T32LM012424 and NIH-NIDDK grant F31DK118865, and A.K. was supported by NIH grant F31HL127921. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the article.

Author information

K.M.G. and P.P. designed the study. K.M.G., D.Z.P., Z.M., J.R.P., C.J.Y., J.S.S. and P.P. performed methods development and statistical analysis. K.M.G., D.Z.P., Z.M., M.A. and C.R.R. performed computational analysis of the data. K.M.G., Y.V.B., M.A., C.C. and J.N.B. performed the experiments. M.L., K.M. and P.P. produced the METSIM RNA-seq data. A.K. performed quality control of the METSIM RNA-seq data. K.M.G. and P.P. wrote the manuscript and all authors read, reviewed and/or edited the manuscript.

Correspondence to Päivi Pajukanta.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figures 1–6 and Supplementary Tables 1, 3, 5, 6, 8, 10–14, 16 and 21

Reporting Summary

Supplementary Table 2

Differentially accessible ATAC-seq peaks between human preadipocytes and adipocytes. Peaks were considered differentially accessible at a cutoff of FDR < 0.05. FDR was calculated (adjusting for n = 154,647 ATAC-seq peaks) from the P values of the QL F test for differential accessibility between preadipocytes and adipocytes by using ATAC-seq libraries from three replicates per cell type. Related to Fig. 1.

Supplementary Table 4

Differentially accessible ATAC-seq peaks in lipid-challenged human adipocytes. The table lists the significant differential ATAC-seq peaks in human primary adipocytes that were treated with saturated (palmitic) or monounsaturated (oleic) fatty acids or vehicle (BSA) control. Peaks were considered differentially accessible at a cutoff of FDR < 0.05. FDR was calculated (adjusting for n = 122,252 ATAC-seq peaks) from the P values of the QL F test in one-way ANOVA. For the post hoc test to determine which comparison was significant after one-way ANOVA (OA vs. BSA, PA vs. BSA or OA vs. PA), we determined the least significant difference. Related to Fig. 2.
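
The FDR < 0.05 cutoff used in Supplementary Tables 2 and 4 can be illustrated with a minimal sketch. The table legends do not name the adjustment procedure, so the Benjamini-Hochberg method is an assumption here, and the function name `benjamini_hochberg` and the toy P values are hypothetical, not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: BH is used for illustration only; the table legends
    do not state which FDR procedure was applied.
    """
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so that the adjusted values
    # stay monotone in rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy P values standing in for per-peak QL F test results.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(p_vals)
significant = [q < 0.05 for q in fdr]
```

In the real analyses the adjustment runs over all tested peaks (n = 154,647 and n = 122,252 in the two tables), so a given raw P value is penalised far more heavily than in this five-peak toy example.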

Supplementary Table 7

154 genes with lipid-responsive promoters in chromosomal interactions in adipocytes. The table lists the Ensembl ID and gene symbol for genes with promoters in interactions in adipocyte promoter-capture Hi-C that also had lipid-responsive ATAC-seq peaks. Related to Fig. 3.

Supplementary Table 9

323 gene promoters physically interact with lipid-responsive enhancers in adipocytes. The table lists the Ensembl ID and gene symbol for genes with promoters that interact with enhancers that contained lipid-responsive ATAC-seq peaks. Related to Supplementary Fig. 5.

Supplementary Table 15

75 lipid-responsive peaks in gene promoters contain SNPs with MAF > 0.05 in the UK Biobank. The table lists the lipid-responsive ATAC-seq peaks within gene promoters involved in adipocyte chromosomal interactions that contain SNPs with MAF > 0.05 in the UK Biobank (n = 75/91 peaks). The SNPs in these regions were tested for gene–environment interactions in the UK Biobank. Related to Fig. 3 and Table 2.

Supplementary Table 17

142 lipid-responsive peaks in enhancers contain SNPs with MAF > 0.05 in the UK Biobank. The table lists the lipid-responsive ATAC-seq peaks within enhancers involved in adipocyte chromosomal interactions that contain SNPs with MAF > 0.05 in the UK Biobank (n = 142/169 peaks). The SNPs in these regions were tested for gene–environment interactions in the UK Biobank. Related to Supplementary Fig. 5 and Supplementary Table 18.
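
The MAF > 0.05 filter mentioned in Supplementary Tables 15 and 17 can be sketched as follows. This is only an illustration of the standard definition of minor allele frequency. The genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call), the function name and the SNP labels are all hypothetical and are not taken from the UK Biobank data or the study's pipeline.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency from per-individual allele counts.

    genotypes: list of 0, 1 or 2 (copies of one allele), or None
    for a missing call. The coding is an assumption for illustration.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each individual carries two alleles at an autosomal SNP.
    allele_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(allele_freq, 1.0 - allele_freq)


# Hypothetical SNPs with toy genotypes.
snps = {
    "snp_a": [0, 1, 2, 1, 0, 0, 1, None],
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0],
    "snp_c": [2, 2, 2, 2, 1, 2, 2, 2],
}
kept = [name for name, gt in snps.items()
        if minor_allele_frequency(gt) > 0.05]
```

Here `snp_b` is monomorphic and is dropped, while `snp_c` is kept because the rarer of its two alleles still exceeds the threshold.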

Supplementary Table 18

Significant G×E interactions with BMI from a multivariable linear model for 410 enhancer SNPs. The cis-eQTLs were identified in adipose tissue from the METSIM cohort. §When more than one non-independent SNP (LD r² > 0.2) has a significant G×E P value for the lipid-responsive region, both SNPs are listed together in order of more to less significant. Genes in separate promoter-containing baits are marked when a lipid-responsive enhancer with a G×E SNP is interacting with more than one bait in adipocyte pCHi-C. The reported P values are from the multivariable linear model (see equation (2) in the Methods), where g is the number of minor alleles of the genotype and e is saturated fat intake. Here p-g indicates the P value for the genotype effect and p-g*e indicates the P value for the G×E effect; beta values follow the same notation. For the multivariable linear model, there were a total of 410 SNPs and 18,318 individuals with no missing data available for study.
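
The g, e and g*e terms described above can be illustrated with a minimal ordinary least squares sketch. Equation (2) itself is not reproduced in this text, so the model below (BMI regressed on an intercept, g, e and their product, with no covariates) is an assumed simplification rather than the authors' specification. All function names and the toy data are hypothetical, and the sketch estimates beta values only, not the reported P values.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gxe(bmi, g, e):
    """Least squares fit of bmi ~ 1 + g + e + g*e (no covariates).

    g: minor allele count (0, 1 or 2); e: saturated fat intake.
    Returns the estimated beta for each term.
    """
    rows = [[1.0, gi, ei, gi * ei] for gi, ei in zip(g, e)]
    k = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, bmi)) for i in range(k)]
    betas = solve_linear_system(xtx, xty)
    return dict(zip(["intercept", "g", "e", "g*e"], betas))


# Noise-free toy data built from known coefficients.
g = [gi for gi in (0, 1, 2) for _ in range(4)]
e = [10.0, 20.0, 30.0, 40.0] * 3
bmi = [25.0 + 0.5 * gi + 0.1 * ei + 0.2 * gi * ei
       for gi, ei in zip(g, e)]
betas = fit_gxe(bmi, g, e)
```

A non-zero `g*e` beta means that the per-allele effect on BMI changes with saturated fat intake, which is the quantity that p-g*e tests in the table. A real analysis would also include covariates and a proper test statistic, both omitted here.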

Supplementary Table 19

DeepSEA analysis of the 20 G×E SNPs in interacting lipid-responsive gene promoters. The table lists the predicted functional impact of promoter G×E SNPs on chromatin features such as transcription factor binding and histone marks. Related to Table 2.

Supplementary Table 20

DeepSEA analysis of the 26 G×E SNPs in interacting lipid-responsive enhancers. The table lists the predicted functional impact of enhancer G×E SNPs on chromatin features such as transcription factor binding and histone marks. Related to Supplementary Table 18.

Fig. 1: ATAC-seq analysis comparing primary human preadipocytes and adipocytes indicates successful adipocyte differentiation and widespread changes in chromatin accessibility.
Fig. 2: Lipid-responsive regions fall within adipocyte accessible regions of the genome, as well as within context-dependent regions that are not present in untreated adipocytes.
Fig. 3: The 154 genes with lipid-responsive promoters within chromosomal interactions exhibit cross-species conservation and constraints on loss-of-function mutations, in line with their potential importance for energy homeostasis and survival.
Fig. 4: A lipid-responsive open chromatin region in human primary adipocytes at the 11q12.2 FADS1–FADS2–FADS3 locus harbours GWAS SNPs for serum lipid traits.
Fig. 5: Fine-mapping of the gene–diet interaction for BMI in the LDB3 promoter region.
Fig. 6: Analytical approach.