Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation

Abstract

Characterizing genetic influences on DNA methylation (DNAm) provides an opportunity to understand mechanisms underpinning gene regulation and disease. In the present study, we describe results of DNAm quantitative trait locus (mQTL) analyses on 32,851 participants, identifying genetic variants associated with DNAm at 420,509 DNAm sites in blood. We present a database of >270,000 independent mQTLs, of which 8.5% comprise long-range (trans) associations. Identified mQTL associations explain 15–17% of the additive genetic variance of DNAm. We show that the genetic architecture of DNAm levels is highly polygenic. Using shared genetic control between distal DNAm sites, we constructed networks, identifying 405 discrete genomic communities enriched for genomic annotations and complex traits. Shared genetic variants are associated with both DNAm levels and complex diseases, but only in a minority of cases do these associations reflect causal relationships from DNAm to trait or vice versa, indicating a more complex genotype–phenotype map than previously anticipated.

Fig. 1: Discovery and replication of mQTLs.
Fig. 2: Cis- and trans-mQTLs operate through distinct mechanisms.
Fig. 3: Communities constructed from trans-mQTLs.
Fig. 4: Identifying putative causal relationships between sites and traits using bidirectional MR.

Data availability

A database of our results is available as a resource to the community at http://mqtldb.godmc.org.uk. The individual-level genotype and DNAm data are available by request from each individual study or can be downloaded from Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo), European Genome–Phenome Archive (EGA, https://ega-archive.org) or Array Express (https://www.ebi.ac.uk/arrayexpress). As the consent for most studies requires the data to be under managed access, the individual-level genotype and DNAm data are not available from a public repository unless stated.

ALS BATCH1 and -2 data are available to researchers by request as outlined in the Project MinE access policy. ARIES data are available to researchers by request from the Avon Longitudinal Study of Parents and Children Executive Committee (http://www.bristol.ac.uk/alspac/researchers/access) as outlined in the study’s access policy http://www.bristol.ac.uk/media-library/sites/alspac/documents/researchers/data-access/ALSPAC_Access_Policy.pdf. BAMSE data are available from the GABRIEL consortium as well as on request in EGA, under accession no. EGAC00001000786. BASICMAR DNAm data are available under accession no. GSE69138. Born-in-Bradford data are available to researchers who submit an expression of interest to the Born-in-Bradford Executive Group (https://borninbradford.nhs.uk/research). BSGS DNAm data are available under accession no. GSE56105. GOYA data are available by request from DNBC: https://www.dnbc.dk. Dunedin data are available via a managed access system (contact: ac115@duke.edu). E-Risk DNAm data are available under accession no. GSE105018. Estonian biobank (ECGUT) data can be accessed on ethical approval by submitting a data release request to the Estonian Genome Center, University of Tartu (http://www.geenivaramu.ee/en/access-biopank/data-access). EPIC-Norfolk data can be accessed by contacting the study management committee: http://www.srl.cam.ac.uk/epic/contact. Requests for EPICOR data accession may be sent to Professor Giuseppe Matullo (giuseppe.matullo@unito.it). FTC data can be accessed on approval from the Data Access Committee of the Institute for Molecular Medicine Finland FIMM (fimm-dac@helsinki.fi). Requests for Generation R data access are evaluated by the Generation R Management Team. Researchers can obtain a de-identified GLAKU dataset after having obtained an approval from the GLAKU Study Board. GSK DNAm data are available under accession no. GSE125105. 
INMA data are available by request from the INfancia y Medio Ambiente Executive Committee for researchers who meet the criteria for access to confidential data. IOW F2 data are available by request from Isle of Wight Third Generation Study. Please contact Mr Stephen Potter (stephen.potter@iow.nhs.uk). LLS DNAm data were submitted to the EGA under accession no. EGAS00001001077. LBC1921 and LBC1936 data are available on request from the Lothian Birth Cohort Study, Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh (I.Deary@ed.ac.uk). DNAm from MARTHA participants are available under accession no. E-MTAB-3127. NTR DNAm data are available on request in EGA, under the accession no. EGAD00010000887. PIAMA data are available on request. Requests can be submitted to the PIAMA Principal Investigators (https://piama.iras.uu.nl/english). PRECISESADS data are available through ELIXIR at https://doi.org/10.17881/th9v-xt85. Collaboration in data analysis of PREDO is possible through specific research proposals sent to the PREDO Study Board (predo.study@helsinki.fi) or primary investigators Katri Räikkönen (katri.raikkonen@helsinki.fi) or Hannele Laivuori (hannele.laivuori@helsinki.fi). Data are available on request at Project MinE (https://www.projectmine.com). Raine data are available on request (https://ross.rainestudy.org.au). Requests for the data accession of the Rotterdam Study may be sent to Frank van Rooij (f.vanrooij@erasmusmc.nl). SABRE data are available by request from SABRE (https://www.sabrestudy.org). SCZ1 DNAm data are available under accession no. GSE80417. SCZ2 DNAm data are available under accession no. GSE84727. SYS data are available on request addressed to Dr. Zdenka Pausova (zdenka.pausova@sickkids.ca) and Dr. Tomas Paus (tpausresearch@gmail.com). Further details about the protocol can be found at http://www.saguenay-youth-study.org. TwinsUK DNAm data are available in the GEO under accession nos. GSE62992 and GSE121633. 
TwinsUK adipose DNAm data are stored in EGA under the accession no. E-MTAB-1866. Access to additional individual-level genotype and phenotype data can be applied for through the TwinsUK data access committee: http://twinsuk.ac.uk/resources-for-researchers/access-our-data. Individual-level DNAm and genetic data from the UK Household Longitudinal Study are available on application through the EGA under accession no. EGAS00001001232. Nonidentifiable Generation Scotland data will be made available to researchers through the GS:SFHS Access Committee. MESA DNAm data are available under accession nos. GSE56046 and GSE56581. Tissue DNAm data are available from accession no. GSE78743. Brain DNAm data can be found under accession no. GSE58885.

Cohort descriptions and further contact details can be found in the Supplementary Note.

For the enrichments, we used chromatin states from the Epigenome Roadmap (https://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/imputed12marks/jointModel/final), TFBSs from the ENCODE project (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeAwgTfbsUniform) downloaded from the LOLA core database (http://databio.org/regiondb), and gene annotations from https://zwdzwd.github.io/InfiniumAnnotation or GARFIELD (https://www.ebi.ac.uk/birney-srv/GARFIELD). To extract GWA signals for co-localization, we used the MRBase database (https://www.mrbase.org).

Code availability

Datasets were processed using https://github.com/perishky/meffil unless stated otherwise. Individual study analysts used a GitHub pipeline (https://github.com/MRCIEU/godmc) to conduct the mQTL analysis. We used https://github.com/MRCIEU/godmc_phase1_analysis for the phase 1 analysis, https://github.com/explodecomputer/random-metal for the meta-analyses and https://github.com/MRCIEU/godmc_phase2_analysis for the follow-up analyses.

References

  1. Petronis, A. Epigenetics as a unifying principle in the aetiology of complex traits and diseases. Nature 465, 721–727 (2010).

  2. van Dongen, J. et al. Genetic and environmental influences interact with age and sex in shaping the human methylome. Nat. Commun. 7, 11115 (2016).

  3. Hannon, E. et al. Characterizing genetic and environmental influences on variable DNA methylation using monozygotic and dizygotic twins. PLoS Genet. 14, e1007544 (2018).

  4. Kerkel, K. et al. Genomic surveys by methylation-sensitive SNP analysis identify sequence-dependent allele-specific DNA methylation. Nat. Genet. 40, 904–908 (2008).

  5. Schadt, E. E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302 (2003).

  6. Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).

  7. Gaunt, T. R. et al. Systematic identification of genetic influences on methylation across the human life course. Genome Biol. 17, 61 (2016).

  8. Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).

  9. Hannon, E. et al. Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci. Nat. Neurosci. 19, 48–54 (2016).

  10. Hop, P. J. et al. Genome-wide identification of genes regulating DNA methylation using genetic anchors for causal inference. Genome Biol. 21, 220 (2020).

  11. Abecasis, G. R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).

  12. Pidsley, R. et al. Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol. 17, 208 (2016).

  13. Bibikova, M. et al. High density DNA methylation array with single CpG site resolution. Genomics 98, 288–295 (2011).

  14. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).

  15. Shah, S. et al. Genetic and environmental exposures constrain epigenetic drift over the human life course. Genome Res. 24, 1725–1733 (2014).

  16. Gutierrez-Arcelus, M. et al. Passive and active DNA methylation and the interplay with genetic variation in gene regulation. Elife 2, e00523 (2013).

  17. Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414.e24 (2016).

  18. McRae, A. F. et al. Identification of 55,000 replicated DNA methylation QTL. Sci. Rep. 8, 17605 (2018).

  19. Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

  20. Wahl, S. et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 541, 81–86 (2017).

  21. Elliott, G. et al. Intermediate DNA methylation is a conserved signature of genome regulation. Nat. Commun. 6, 6363 (2015).

  22. Feldmann, A. et al. Transcription factor occupancy can mediate active turnover of DNA methylation at regulatory regions. PLoS Genet. 9, e1003994 (2013).

  23. Grundberg, E. et al. Global analysis of DNA methylation variation in adipose tissue from twins reveals links to disease-associated variants in distal regulatory elements. Am. J. Hum. Genet. 93, 876–890 (2013).

  24. Kim-Hellmuth, S. et al. Cell type-specific genetic regulation of gene expression across human tissues. Science 369, eaaz8528 (2020).

  25. Qi, T. et al. Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood. Nat. Commun. 9, 2282 (2018).

  26. Yin, Y. et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, eaaj2239 (2017).

  27. Domcke, S. et al. Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 528, 575–579 (2015).

  28. Baubec, T. et al. Genomic profiling of DNA methyltransferases reveals a role for DNMT3B in genic methylation. Nature 520, 243–247 (2015).

  29. Ginno, P. A. et al. A genome-scale map of DNA methylation turnover identifies site-specific dependencies of DNMT and TET activity. Nat. Commun. 11, 2680 (2020).

  30. Sánchez-Castillo, M. et al. CODEX: a next-generation sequencing experiment database for the haematopoietic and embryonic stem cell communities. Nucleic Acids Res. 43, D1117–D1123 (2015).

  31. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  32. Waszak, S. M. et al. Population variation and genetic control of modular chromatin architecture in humans. Cell 162, 1039–1050 (2015).

  33. Viny, A. D. et al. Dose-dependent role of the cohesin complex in normal and malignant hematopoiesis. J. Exp. Med. 212, 1819–1832 (2015).

  34. Battle, A. et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).

  35. Kumasaka, N., Knights, A. J. & Gaffney, D. J. High-resolution genetic mapping of putative causal interactions between regions of open chromatin. Nat. Genet. 51, 128–137 (2019).

  36. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  37. Delaneau, O. et al. Chromatin three-dimensional interactions mediate genetic effects on gene expression. Science 364, eaat8266 (2019).

  38. Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. https://doi.org/10.1038/s41588-021-00913-z (2021).

  39. Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429.e19 (2016).

  40. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  41. Tachmazidou, I. et al. Whole-genome sequencing coupled to imputation discovers genetic signals for anthropometric traits. Am. J. Hum. Genet. 100, 865–884 (2017).

  42. Kato, N. et al. Trans-ancestry genome-wide association study identifies 12 genetic loci influencing blood pressure and implicates a role for DNA methylation. Nat. Genet. 47, 1282–1293 (2015).

  43. Iotchkova, V. et al. GARFIELD classifies disease-relevant genomic features through integration of functional annotations with association signals. Nat. Genet. 51, 343–353 (2019).

  44. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).

  45. Reinius, L. E. et al. Differential DNA methylation in purified human blood cells: implications for cell lineage and studies on disease susceptibility. PLoS ONE 7, e41361 (2012).

  46. Houseman, E. A. et al. Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions. BMC Bioinform. 9, 365 (2008).

  47. Hemani, G., Tilling, K. & Davey Smith, G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 13, e1007081 (2017).

  48. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  49. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

  50. Zheng, J. et al. Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nat. Genet. 52, 1122–1131 (2020).

  51. Richardson, T. G. et al. Systematic Mendelian randomization framework elucidates hundreds of CpG sites which may mediate the influence of genetic variants on disease. Hum. Mol. Genet. 27, 3293–3304 (2018).

  52. Hemani, G., Bowden, J. & Davey Smith, G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 27, R195–R208 (2018).

  53. Brion, M. J., Shakhbazov, K. & Visscher, P. M. Calculating statistical power in Mendelian randomization studies. Int. J. Epidemiol. 42, 1497–1501 (2013).

  54. Pierce, B. L. & Burgess, S. Efficient design for Mendelian randomization studies: subsample and 2-sample instrumental variable estimators. Am. J. Epidemiol. 178, 1177–1184 (2013).

  55. Hemani, G. et al. The MR-base platform supports systematic causal inference across the human phenome. Elife 7, e34408 (2018).

  56. Dekkers, K. F. et al. Blood lipids influence DNA methylation in circulating cells. Genome Biol. 17, 138 (2016).

  57. Braun, K. V. E. et al. Epigenome-wide association study (EWAS) on lipids: the Rotterdam study. Clin. Epigenet. 9, 15 (2017).

  58. Simpson, J. T. et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods 14, 407–410 (2017).

  59. Zhou, W., Laird, P. W. & Shen, H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 45, e22 (2017).

  60. Winkler, T. W. et al. Quality control and conduct of genome-wide association meta-analyses. Nat. Protoc. 9, 1192–1212 (2014).

  61. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

  62. Conomos, M. P., Reiner, A. P., Weir, B. S. & Thornton, T. A. Model-free estimation of recent genetic relatedness. Am. J. Hum. Genet. 98, 127–148 (2016).

  63. Min, J. L., Hemani, G., Davey Smith, G., Relton, C. & Suderman, M. Meffil: efficient normalization and analysis of very large DNA methylation datasets. Bioinformatics 34, 3983–3989 (2018).

  64. Zeilinger, S. et al. Tobacco smoking leads to extensive genome-wide changes in DNA methylation. PLoS ONE 8, e63812 (2013).

  65. Aulchenko, Y. S., de Koning, D. J. & Haley, C. Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics 177, 577–585 (2007).

  66. Chen, Y. A. et al. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics 8, 203–209 (2013).

  67. Naeem, H. et al. Reducing the risk of false discovery enabling identification of biologically significant genome-wide methylation status using the HumanMethylation450 array. BMC Genom. 15, 51 (2014).

  68. Price, M. E. et al. Additional annotation enhances potential for biologically-relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array. Epigenet. Chromatin 6, 4 (2013).

  69. Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).

  70. Dahl, A., Guillemot, V., Mefford, J., Aschard, H. & Zaitlen, N. Adjusting for principal components of molecular phenotypes induces replicating false positives. Genetics 211, 1179–1189 (2019).

  71. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).

  72. DerSimonian, R. & Laird, N. Meta-analysis in clinical trials. Control Clin. Trials 7, 177–188 (1986).

  73. Hedges, L. V. & Olkin, I. Statistical Methods for Meta-Analysis 189–203 (Academic Press, 1985).

Acknowledgements

C.L.R., G.D.S., G.S., J.L.M., K.B., M. Suderman, T.G.R. and T.R.G. are supported by the UK Medical Research Council (MRC) Integrative Epidemiology Unit at the University of Bristol (MC_UU_00011/1, MC_UU_00011/4, MC_UU_00011/5). C.L.R. receives support from a Cancer Research UK Programme grant (no. C18281/A191169). G.H. is funded by the Wellcome Trust and the Royal Society (208806/Z/17/Z). E.H. and J.M. were supported by MRC project grants (nos. MR/K013807/1 and MR/R005176/1 to J.M.) and an MRC Clinical Infrastructure award (no. MR/M008924/1 to J.M.). B.T.H. is supported by the Netherlands CardioVascular Research Initiative (the Dutch Heart Foundation, Dutch Federation of University Medical Centres, the Netherlands Organisation for Health Research and Development, and the Royal Netherlands Academy of Sciences) for the GENIUS project ‘Generating the best evidence-based pharmaceutical targets for atherosclerosis’ (CVON2011-19, CVON2017-20). J.T.B. was supported by the Economic and Social Research Council (grant no. ES/N000404/1). The present study was also supported by the JPI HDHL-funded DIMENSION project (administered by the BBSRC UK, grant no. BB/S020845/1 to J.T.B., and by ZonMW the Netherlands, grant no. 529051021 to B.T.H.). A.D.B. has been supported by a Wellcome Trust PhD Training Fellowship for Clinicians and the Edinburgh Clinical Academic Track programme (204979/Z/16/Z). J. Klughammer was supported by a DOC fellowship of the Austrian Academy of Sciences. Cohort-specific acknowledgements and funding are presented in the Supplementary Note.

Author information

Contributions

G.H., G.S. and J.L.M. managed the project. A.A.C., A. Caspi, A.D.H., A.G.U., A. Metspalu, A. Murray, A.M.M., B.B., B.T.H., C.H., C.L.R., C.P., C. Sacerdote, C. Shaw, C. Söderhäll, D.A.L., D.v.H., D.I.B., D.-A.T., E.A.N., E.B.B., E.J.C.d.G., E.M., F.G., F.R., G.E.D., G.H.K., G.P., G.W.M., H.R.E., H.T., H.Z., I.J.D., J.F.F., J.H.V., J.J.-C., J. Kaprio, J.L., J.M., J.M.S., J.M.V., J.v.M., J.R., J.R.B.P., J.R.G., J. Shin, J.T.B., J.W., J.W.H., K.K.O., K.L.E., K.R., L.A., L.C.S., L.M., M.A.I., M. Beekman, M. Bustamante, M.E.A.-R., M.H.v.IJ., M. Kerick, M.O., N.C., N.G.M., N.J.W., N.R.W., P.E.S., P.-E.M., P.M.V., R.-C.H., R.P., S.L., S.P., T.D.S., T.E., T.E.M., T.I.A.S., T.P., T.T., V.W.V.J., W.K. and Z.P. designed individual studies and contributed data. A.A.K., A.I., A.S., B.C., C.S.M., H.R.E., J.L.M., K.B., K.M.H., N.K., S.M.R., T.H., R.M.W. and W.L.M. generated and/or quality-controlled data. G.H., J.L.M., M. Suderman, T.R.G. and V.I. designed new statistical or bioinformatics tools. A.D.B., A. Cardona, A.D., A.F.M., A.K., B.T.H., C.B., C.H., C.L.R., C.R.-A., C.S.-T., C.V., C.-J.X., C.W., D.A., D.C., D.J.L., D.L.C., D.M., E.C.-M., E.G.-S., E.H., E.M., F.C.-M., F.I.R., F.R.D., G.B., G.C., G.D.S., G.H., G.H.K., G.M., G.W., I.Y., J.C.-F., J.v.D., J.-J.H., J. Kaprio, J. Klughammer, J.L.M., J.M., J. Sunyer, J.T.B., K.B., K.v.E., K.F.D., K.S., L.C.S., M. Bernard, M. Bustamante, M.H.v.IJ., M.G., M. Kumari, M.L., M. Smart, M. Suderman, N.K., P. Melton, P. Mandaviya, P.M.V., R.E.M., R.G., R.L., R.Z., S.B., S.G., S.K., T.-K.C., T.G.-S., T.G.R., T.I.A.S., T.L., T.R.G., Y.A., Y.Z., V.I. and V.S. analyzed the data and/or provided critical interpretation of results. B.T.H., C.B., C.L.R., J.M., J.T.B. and T.R.G. designed and/or managed the study. A.D.B., B.T.H., C.B., C.L.R., D.J.L., E.C.-M., E.H., G.D.S., G.H., J.C.-F., J. Klughammer, J.L.M., J.M., J.T.B., K.B., K.F.D., M. Suderman, P.M.V., R.L., T.G.R., T.R.G. and V.I. wrote the manuscript.

Corresponding author

Correspondence to Josine L. Min.

Ethics declarations

Competing interests

T.R.G. receives funding from GlaxoSmithKline and Biogen for unrelated research. The other authors declare no competing interests.

Additional information

Peer review information: Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer review reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Quality control of 36 studies.

We used 337 independent SNPs on chromosome 20 with a p-value<1e-14. The number of SNPs used for each study is indicated in the bottom plot. a, M statistic (Magosi et al., PLoS Genet., 13, e1006755 (2017)) for each of the 36 cohorts. b, Boxplot of mQTL effect sizes for each of the 36 studies. The center line of a boxplot corresponds to the median value. The lower and upper box limits indicate the first and third quartiles (the 25th and 75th percentiles). The length of the whiskers corresponds to values up to 1.5 times the IQR in either direction.

Extended Data Fig. 2 Distance of SNP from DNAm site.

a, Density plot of the distance of SNP from DNAm site against the -log10 p-value of 4,533 intrachromosomal trans-mQTL associations (>1Mb). b, Density plot of the distance of SNP from DNAm site against the -log10 p-value of 248,607 cis-mQTL associations (<1Mb).
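The distance thresholds in this caption can be written as a small classifier. The following is a minimal sketch rather than the authors' pipeline: the function name and return labels are hypothetical, and treating a distance of exactly 1 Mb as trans is an assumption, because the caption only specifies <1 Mb and >1 Mb.

```python
CIS_WINDOW_BP = 1_000_000  # 1 Mb threshold taken from the caption


def classify_mqtl(snp_chr: str, snp_pos: int, site_chr: str, site_pos: int) -> str:
    """Label a SNP-DNAm site pair by chromosome and distance (hypothetical helper)."""
    if snp_chr != site_chr:
        return "trans (interchromosomal)"
    distance_bp = abs(snp_pos - site_pos)
    if distance_bp < CIS_WINDOW_BP:
        return "cis"
    # Assumption: a distance of exactly 1 Mb is grouped with trans.
    return "trans (intrachromosomal)"


pairs = [
    ("20", 1_200_000, "20", 1_250_000),
    ("20", 1_200_000, "20", 9_000_000),
    ("20", 1_200_000, "6", 1_250_000),
]
labels = [classify_mqtl(*pair) for pair in pairs]
```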

Extended Data Fig. 3 Effect sizes and weighted standard deviation (SD) for each mQTL category.

a, For each DNAm site, the strongest absolute effect size (the maximum absolute additive change in DNAm level measured in SD per allele) was selected. The kernel density estimates of the effect sizes are shown for all sites with an mQTL (n=190,102), sites with cis-only effects (n=170,986), cis effects for sites with cis and trans effects (n=11,902), trans effects for sites with cis and trans effects (n=11,902) and sites with trans-only effects (n=7,214). Comparing the strongest effect size for each site in a two-sided linear regression model showed that cis+trans sites had larger cis effect sizes (per allele SD change = 0.05 (s.e.= 0.002), p<2e-16) than cis-only sites and weaker trans effect sizes (per allele SD change = −0.06 (s.e.= 0.002), p<2e-16) than trans-only sites. To detect these small trans effect sizes at sites with both a cis and a trans association, it is crucial to regress out the cis effect to decrease the residual variance and improve power to detect a trans effect. b, The violin plots represent kernel density estimates of the weighted SD across 36 cohorts for each DNAm site. The center line of the boxplot in the violin plots corresponds to the median value. The lower and upper box limits indicate the first and third quartiles (the 25th and 75th percentiles). The length of the whiskers corresponds to values up to 1.5 times the IQR in either direction.
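The point about regressing out the cis effect can be illustrated on simulated data. This is a sketch under stated assumptions, not the GoDMC implementation: the sample size, allele frequency and effect sizes are invented for illustration, the helper names are hypothetical, and the degrees of freedom are not adjusted for the extra cis covariate.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns the slope, its standard error and the residuals.
    """
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_var = sum(r * r for r in residuals) / (n - 2)
    return slope, (residual_var / sxx) ** 0.5, residuals


def simulate(n=2000, maf=0.3, cis_effect=1.5, trans_effect=0.1, seed=1):
    """Simulate additive genotypes (0/1/2) and a DNAm trait; all values are illustrative."""
    rng = random.Random(seed)

    def genotype():
        return sum(rng.random() < maf for _ in range(2))

    cis = [genotype() for _ in range(n)]
    trans = [genotype() for _ in range(n)]
    dnam = [cis_effect * c + trans_effect * t + rng.gauss(0.0, 1.0)
            for c, t in zip(cis, trans)]
    return cis, trans, dnam


cis, trans, dnam = simulate()
# Trans test on the raw trait: the cis signal sits in the residual variance.
_, se_unadjusted, _ = fit_line(trans, dnam)
# Regress out the cis SNP first, then test the trans SNP on the residuals.
_, _, dnam_minus_cis = fit_line(cis, dnam)
_, se_adjusted, _ = fit_line(trans, dnam_minus_cis)
```

In this toy setting the standard error of the trans estimate shrinks once the cis signal has been removed, which is the mechanism by which power improves.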

Extended Data Fig. 4 Impact of the two-stage design on mQTL coverage.

a, Loss in power in the two-stage design. We calculated the power of detecting a cis association in at least one of the 22 studies at p<1e-5 or a trans association in at least two of 22 studies at p<1e-5. b, Expected number of mQTLs. Using the number of mQTLs with a particular r2 value, and the power of detecting mQTLs with that r2 value, we calculated how many mQTLs would be expected to exist with that value.
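The two calculations in this caption can be sketched as follows. This is an illustration only: it assumes that studies are independent and that a per-study detection probability at p<1e-5 is already known (the values below are invented), and the function names are hypothetical. Dividing the observed count by the power is one natural reading of panel b, not a confirmed description of the authors' code.

```python
def prob_at_least(k, per_study_power):
    """Probability that at least k independent studies detect the association."""
    # dist[j] holds the probability that exactly j of the studies seen so far detect it.
    dist = [1.0]
    for p in per_study_power:
        updated = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            updated[j] += q * (1.0 - p)
            updated[j + 1] += q * p
        dist = updated
    return sum(dist[k:])


def expected_total(observed_count, power):
    """Scale an observed mQTL count by the power to detect mQTLs of that r2."""
    return observed_count / power


per_study = [0.10] * 22  # hypothetical per-study power at p<1e-5
cis_power = prob_at_least(1, per_study)    # cis rule: at least one of 22 studies
trans_power = prob_at_least(2, per_study)  # trans rule: at least two of 22 studies
```

Because the trans rule requires two studies, its first-stage power is always at or below the cis power for the same per-study detection probabilities.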

Extended Data Fig. 5 Correlation of mQTL effects (p<1e-14) between blood and other tissues.

For each mQTL category, the correlation of genetic effects between tissues (rb) was estimated using the rb method (ref. 25), with the blood mQTLs as the reference. DNAm levels are categorized as low (<0.2), intermediate (0.2–0.8) or high (>0.8).
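The three DNAm level categories translate directly into code. A minimal sketch, assuming the input is a mean methylation proportion between 0 and 1 (the function name is hypothetical); the boundary values 0.2 and 0.8 fall in the intermediate class, as the caption's ranges imply.

```python
def dnam_level_category(mean_dnam: float) -> str:
    """Assign a DNAm site to the low, intermediate or high methylation class."""
    if not 0.0 <= mean_dnam <= 1.0:
        raise ValueError("mean DNAm level must be a proportion between 0 and 1")
    if mean_dnam < 0.2:
        return "low"
    if mean_dnam > 0.8:
        return "high"
    return "intermediate"


categories = [dnam_level_category(v) for v in (0.05, 0.2, 0.5, 0.8, 0.93)]
```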

Extended Data Fig. 6 2D enrichment of SNP and DNAm site TFBS annotation.

a, To test whether the annotations of the SNPs involved in trans-mQTLs were specific to the annotations of the DNAm sites that they influence, we compared the real SNP-DNAm site pairs against permuted SNP-DNAm site pairs, where the biological link between SNP and site is severed whilst maintaining the distribution of annotations for the SNPs and sites. We constructed 100 such permuted datasets. b, SNP and site positions were annotated against genomic features, and we quantified how frequently mQTLs were found for each pair of SNP-DNAm site annotations. This enabled the construction of 2D-annotation matrices for both the real trans-mQTL list and the permuted trans-mQTL lists. c, Distribution of two-dimensional enrichment values of trans-mQTLs. There was substantial departure from the null in the real dataset for all tissues, indicating that the TFBS of a site depended on the TFBS of the SNP that influenced it. d, A bipartite graph of the two-dimensional enrichment for trans-mQTLs. SNP annotations (blue) with empirical P (pemp) < 0.01 after multiple testing correction co-occur with particular site annotations (red).
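The permutation logic in panels a and b can be sketched for a single annotation pair. This is a simplified illustration, not the published analysis: each SNP and each site carries exactly one annotation label, shuffling the site labels is used as one way to break the SNP-site link while keeping both marginal distributions, and the function names, toy labels and the (1 + count) / (1 + permutations) form of the empirical P value are assumptions.

```python
import random
from collections import Counter


def pair_counts(snp_annots, site_annots):
    """Count how often each (SNP annotation, site annotation) pair occurs."""
    return Counter(zip(snp_annots, site_annots))


def permutation_enrichment(snp_annots, site_annots, pair, n_perm=100, seed=0):
    """Compare the observed count for one annotation pair with permuted pairings."""
    rng = random.Random(seed)
    observed = pair_counts(snp_annots, site_annots)[pair]
    shuffled = list(site_annots)
    null_counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # severs the SNP-site link, keeps annotation frequencies
        null_counts.append(pair_counts(snp_annots, shuffled)[pair])
    mean_null = sum(null_counts) / n_perm
    enrichment = observed / mean_null if mean_null > 0 else float("inf")
    p_emp = (1 + sum(c >= observed for c in null_counts)) / (1 + n_perm)
    return observed, enrichment, p_emp


# Toy trans-mQTL list: SNPs in annotation "A" always act on sites in annotation "X".
snp_labels = ["A"] * 20 + ["B"] * 20
site_labels = ["X"] * 20 + ["Y"] * 20
observed, enrichment, p_emp = permutation_enrichment(snp_labels, site_labels, ("A", "X"))
```

Repeating this for every pair of annotations would fill a 2D matrix of enrichment values of the kind described in panel b.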

Extended Data Fig. 7 Correspondence of MR estimates amongst multiple independent instruments.

a, To evaluate whether a site having a shared causal variant with a trait was potentially due to the site being on the causal pathway to the trait, we reasoned that independent instruments for the site should exhibit effects on the outcome consistent with the original co-localizing variant. b, Amongst the putative co-localizing signals, 440 involved a DNAm site that had at least one other independent mQTL. The plot shows the causal effect estimate from the original co-localizing signal against the causal effect estimates obtained from the independent variants (n=440). Grey regions represent the 95% confidence interval of the slope. c, Correspondence of MR estimates amongst multiple independent instruments on 36 blood traits, following the same reasoning as in a. Amongst the putative co-localizing signals, 30% involved a DNAm site that had at least one other independent mQTL. The plot shows the causal effect estimate from the original co-localizing signal against the causal effect estimates obtained from the independent variants. The HLA region has been removed and betas are plotted.
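The comparison in this figure rests on single-instrument causal estimates. The sketch below uses the standard Wald ratio (SNP-outcome effect divided by SNP-exposure effect) with a first-order standard error; the caption does not state which estimator was used, so this choice, the function names and the summary statistics are all assumptions for illustration.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate of the effect of DNAm on a trait.

    The standard error uses the first-order approximation that ignores
    uncertainty in the SNP-DNAm effect.
    """
    estimate = beta_outcome / beta_exposure
    standard_error = se_outcome / abs(beta_exposure)
    return estimate, standard_error


def same_direction(estimate_a, estimate_b):
    """True when two causal estimates share a sign."""
    return (estimate_a > 0) == (estimate_b > 0)


# Hypothetical summary statistics: (SNP-DNAm beta, SNP-trait beta, SNP-trait SE).
colocalizing_variant = (0.50, 0.10, 0.02)
independent_variant = (-0.30, -0.05, 0.03)

est_coloc, se_coloc = wald_ratio(*colocalizing_variant)
est_indep, se_indep = wald_ratio(*independent_variant)
consistent = same_direction(est_coloc, est_indep)
```

If DNAm at the site lies on the causal pathway, estimates from independent instruments would be expected to agree in sign and roughly in size with the estimate from the co-localizing variant.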

Extended Data Fig. 8 Genomic inflation factors for genome-wide scans of causal effects of traits on DNAm sites.

Each trait (x axis) was tested for causal effects against (on average) 317,659 DNAm sites, excluding sites in the MHC region. The p-values from IVW MR analysis were used to estimate the genomic inflation for each trait (y-axis). Traits are ordered by genomic inflation factor.
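A genomic inflation factor can be computed from a set of p-values as the median of the corresponding 1-d.f. chi-squared statistics divided by the median expected under the null (about 0.455). The sketch below uses only the Python standard library; the function name is hypothetical and the p-values are synthetic, and p-values must lie strictly between 0 and 1 for the normal quantile to be defined.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-squared distribution with 1 d.f., i.e. the squared 75th normal percentile.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values."""
    # A two-sided p-value corresponds to |z| with lower-tail probability p / 2.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced p-values mimic a null scan; squaring them mimics an inflated scan.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
inflated_p = [p ** 2 for p in null_p]

lambda_null = genomic_inflation(null_p)
lambda_inflated = genomic_inflation(inflated_p)
```

Values near 1 indicate test statistics that follow the null distribution, while values well above 1 indicate many more small p-values than expected.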

Supplementary information

Supplementary Note

Supplementary Methods and Results, Acknowledgements, Supplementary Figs. 1–40, Supplementary References.

Reporting Summary

Peer Review Information.

Supplementary Tables

Supplementary Tables 1–20.

Supplementary Data 1

Discovery and replication of 169,656 mQTL associations in GoDMC (n = 27,750) and Generation Scotland (n = 5,101).

Supplementary Data 2

The relationship between the variance in DNA methylation explained by mQTL effects in GoDMC, and the estimated contribution of additive genetic effects.

About this article

Cite this article

Min, J.L., Hemani, G., Hannon, E. et al. Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation. Nat Genet 53, 1311–1321 (2021). https://doi.org/10.1038/s41588-021-00923-x

  • DOI: https://doi.org/10.1038/s41588-021-00923-x
