Integrated analysis of environmental and genetic influences on cord blood DNA methylation in new-borns

Czamara, Darina; Eraslan, Gökçen; Page, Christian M.; Lahti, Jari; Lahti-Pulkkinen, Marius; Hämäläinen, Esa; Kajantie, Eero; Laivuori, Hannele; Villa, Pia M.; Reynolds, Rebecca M.; Nystad, Wenche; Håberg, Siri E.; London, Stephanie J.; O’Donnell, Kieran J.; Garg, Elika; Meaney, Michael J.; Entringer, Sonja; Wadhwa, Pathik D.; Buss, Claudia; Jones, Meaghan J.; Lin, David T. S.; MacIsaac, Julie L.; Kobor, Michael S.; Koen, Nastassja; Zar, Heather J.; Koenen, Karestan C.; Dalvie, Shareefa; Stein, Dan J.; Kondofersky, Ivan; Müller, Nikola S.; Theis, Fabian J.; Räikkönen, Katri; Binder, Elisabeth B.

doi:10.1038/s41467-019-10461-0

Article
Open access
Published: 11 June 2019

Darina Czamara ORCID: orcid.org/0000-0001-7381-904X¹,
Gökçen Eraslan ORCID: orcid.org/0000-0001-9579-2909^2,3,
Christian M. Page ORCID: orcid.org/0000-0002-1897-3666^4,5,
Jari Lahti ORCID: orcid.org/0000-0002-4310-5297^6,7,
Marius Lahti-Pulkkinen ORCID: orcid.org/0000-0002-5244-5525^6,8,
Esa Hämäläinen⁹,
Eero Kajantie^10,11,12,
Hannele Laivuori ORCID: orcid.org/0000-0003-3212-7826^13,14,15,16,
Pia M. Villa¹³,
Rebecca M. Reynolds⁸,
Wenche Nystad¹⁷,
Siri E. Håberg⁵,
Stephanie J. London ORCID: orcid.org/0000-0003-4911-5290¹⁸,
Kieran J. O’Donnell^19,20,
Elika Garg ORCID: orcid.org/0000-0002-3093-7940¹⁹,
Michael J. Meaney^19,20,21,
Sonja Entringer^22,23,
Pathik D. Wadhwa^23,24,
Claudia Buss ORCID: orcid.org/0000-0002-8738-3133^22,23,
Meaghan J. Jones²⁵,
David T. S. Lin ORCID: orcid.org/0000-0001-5695-9446²⁵,
Julie L. MacIsaac²⁵,
Michael S. Kobor²⁵,
Nastassja Koen^26,27,
Heather J. Zar²⁸,
Karestan C. Koenen²⁹,
Shareefa Dalvie²⁶,
Dan J. Stein^26,27,
Ivan Kondofersky^2,30,
Nikola S. Müller²,
Fabian J. Theis ORCID: orcid.org/0000-0002-2419-1943^2,30,
Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium,
Katri Räikkönen⁶ &
…
Elisabeth B. Binder^1,31

Nature Communications volume 10, Article number: 2548 (2019)

Abstract

Epigenetic processes, including DNA methylation (DNAm), are among the mechanisms allowing integration of genetic and environmental factors to shape cellular function. While many studies have investigated either environmental or genetic contributions to DNAm, few have assessed their integrated effects. Here we examine the relative contributions of prenatal environmental factors and genotype on DNA methylation in neonatal blood at variably methylated regions (VMRs) in 4 independent cohorts (overall n = 2365). We use Akaike’s information criterion to test which factors best explain variability of methylation in the cohort-specific VMRs: several prenatal environmental factors (E), genotypes in cis (G), or their additive (G + E) or interaction (GxE) effects. Genetic and environmental factors in combination best explain DNAm at the majority of VMRs. The CpGs best explained by either G, G + E or GxE are functionally distinct. The enrichment of genetic variants from GxE models in GWAS for complex disorders supports their importance for disease risk.

Introduction

Foetal or prenatal programming describes the process by which environmental events during pregnancy influence the development of the embryo with on-going implications for future health and disease. Several studies have shown that the in utero environment is associated with disease risk, including coronary heart disease^1,2, type 2 diabetes³, childhood obesity^4,5 as well as psychiatric problems⁶ and disorders^7,8,9.

Environmental effects on the epigenome, for example, via DNA methylation, could lead to sustained changes in gene transcription and thus provide a molecular mechanism for the enduring influences of the early environment on later health¹⁰. Smoking during pregnancy influences widespread and highly reproducible differences in DNA methylation at birth¹¹. Less dramatic effects have been reported for maternal body mass index (BMI)¹², pre-eclampsia and gestational diabetes^13,14. Possible epigenetic changes as a consequence of prenatal stress are less well established¹⁵. Some of these early differences in DNA methylation persist, although attenuated, through childhood^11,16 and might be related to later symptoms and indicators of disease risk, including BMI during childhood^17,18 or substance use in adolescence¹⁹. These data emphasise the potential importance of the prenatal environment for the establishment of inter-individual variation in the methylome as a predictor or even mediator of disease risk trajectories.

In addition to the environment, the genome plays an important role in the regulation of DNA methylation. To this end, the impact of genetic variation, especially of single nucleotide polymorphisms (SNPs) on DNA methylation in different tissues, has resulted in the discovery of a large number of methylation quantitative trait loci (meQTLs, i.e., SNPs significantly associated with DNA methylation status²⁰). These variants are primarily in cis, i.e., at most 1 million base pairs away from the DNA methylation site^20,21,22 and often co-occur with expression QTLs or other regulatory QTLs^23,24,25. The association of meQTLs with DNA methylation is relatively stable throughout the life course²¹. In addition, SNPs within meQTLs are strongly enriched for genetic variants associated with common disease in large genome-wide association studies (GWAS) such as BMI, inflammatory bowel disease, type 2 diabetes or major depressive disorder^21,23,24,26.

Environmental and genetic factors may act in an additive or multiplicative manner to shape the epigenome to modulate phenotype presentation and disease risk²⁷. However, very few studies have so far investigated the joint effects of environment and genotype on DNA methylation, especially in a genome-wide context. Klengel et al.²⁸, for instance, reported an interaction of the FK506 binding protein 5 gene (FKBP5) SNP genotype and childhood trauma on FKBP5 methylation levels in peripheral blood cells, with trauma associated changes only observed in carriers of the rare allele. The most comprehensive study of integrated genetic and environmental contributions to DNA methylation so far was performed by Teh et al.²⁹. This study examined variably methylated regions (VMRs), defined as regions of consecutive CpG-sites showing the highest variability across all methylation sites assessed on the Illumina Infinium HumanMethylation450 BeadChip array. In a study of 237 neonate methylomes derived from umbilical cord tissue, the authors explored the proportions of the influence of genotype vs. prenatal environmental factors such as maternal BMI, maternal glucose tolerance and maternal smoking on DNA methylation at VMRs. They found that 75% of the VMRs were best explained by the interaction between genotype and environmental factors (GxE) whereas 25% were best explained by SNP genotype and none by environmental factors alone. Collectively, these studies highlight the importance of investigating the combination of environmental and genetic contributions to DNA methylation and not only their individual contribution.

The main objective of the present study is to extend our knowledge of combined effects of prenatal environment and genetic factors on DNA methylation at VMRs. Specifically, this is addressed by: (1) assessing the stability of the best explanatory factors across different cohorts and whether this extends to all environmental factors, (2) dissecting differences between additive and interactive effects of gene and environment not explored in Teh et al., (3) testing whether VMRs influenced by genetic and/or environmental factors might have a different predicted impact on gene regulation and (4) evaluating the relevance of genetic variants that interact with the environment to shape the methylome for their contribution to genetic disease risk.

Our results show that across cohorts genetic variants in combination with prenatal environment are the best predictors of variance in DNA methylation. We observe functional differences of both the genetic variants and the methylation sites best explained by genetic or additive and interactive effects of genes and environment. Finally, the enrichment of genetic variants within additive as well as interactive models in GWAS for complex disorders supports the importance of these environmentally modified methylation quantitative trait loci for disease risk.

Results

Cohorts and analysis plan

We investigated the influence of the prenatal environment and genotype on VMRs in the DNA of 2365 newborns within 4 different cohorts: Prediction and Prevention of Pre-eclampsia and Intrauterine Growth Restrictions (PREDO, cord blood)³⁰, the UCI cohort (refs. ^31,32,33, heel prick), the Drakenstein Child Health Study (DCHS, cord blood)^34,35 and the Norwegian Mother and Child Cohort Study (MoBa, cord blood)³⁶. A description of the workflow of this manuscript is given in Fig. 1 and the details for each of the cohorts are given in Table 1.

Table 1 Overview of investigated cohorts

We analysed 963 cord blood samples from the PREDO cohort with available genome-wide DNA methylation and genotype data. Of these samples, 817 had data on the Illumina 450 K array (PREDO I) and 146 on the Illumina EPIC array (PREDO II). The main analyses are reported for PREDO I, and replication and extension of the results are shown for PREDO II as well as for three independent cohorts: 121 heel prick samples (UCI cohort, EPIC array), 258 cord blood samples (DCHS, 450 K and EPIC array) and 1023 cord blood samples (MoBa, 450 K array). We tested 10 different prenatal environmental factors covering a broad spectrum of prenatal phenotypes (see Table 1) (referred to as E), as well as cis SNP genotype (referred to as G), i.e., SNPs located at most 1 MB from the specific CpG, additive effects of cis SNP genotype and prenatal environment (G + E) and cis SNP×environment interactions (GxE) for association with DNA methylation levels (see Fig. 1). We then assessed for each VMR independently which model described the variance of DNAm best using Akaike’s information criterion (AIC)³⁷. In all models, we corrected for child’s gender, ethnicity (using MDS-components), gestational age as well as estimated cell proportions to account for cellular heterogeneity.

Variably methylated regions

We first identified candidate VMRs, defined as regions of CpG-sites showing the highest variability across all methylation sites. In PREDO I, we identified 10,452 variable CpGs that clustered into 3982 VMRs (see Supplementary Data 1). Most VMRs (n = 2683) include 2 CpGs. As detailed in Supplementary Note 1, the distribution of methylation levels of CpGs within these VMRs is unimodal (see Supplementary Fig. 1A), and VMRs are enriched in specific functional regions of the genome, correlate with differences in gene expression, and overlap with sites associated with specific prenatal environmental factors.

To examine the factors that best explain the variance in methylation in these functionally relevant sites, we chose the CpG-site with the highest median absolute deviation (MAD) score as representative of each VMR. These CpGs are named tagCpGs. The correlation between methylation levels of the tagCpG and average methylation of the respective VMR was high (mean r = 0.85, sd r = 0.08), suggesting that the tagCpGs are valid representatives of their VMRs. Furthermore, tagCpGs are mainly uncorrelated with each other (mean r = 0.03, sd = 0.12).
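The tagCpG choice can be sketched in a few lines. This is a minimal illustration, assuming the MAD is the unscaled median absolute deviation of beta values across samples (a constant scaling factor would not change the ranking); the function names and the example values are hypothetical and not taken from the study.

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation of methylation beta values."""
    med = median(values)
    return median(abs(v - med) for v in values)


def pick_tag_cpg(vmr):
    """Return the CpG id with the highest MAD within one VMR.

    `vmr` maps a CpG id to its beta values across samples.
    """
    return max(vmr, key=lambda cpg: mad(vmr[cpg]))


# Hypothetical VMR with two CpGs measured in five samples (made-up values).
example_vmr = {
    "cpg_a": [0.40, 0.45, 0.50, 0.55, 0.60],
    "cpg_b": [0.20, 0.40, 0.50, 0.60, 0.80],
}
tag = pick_tag_cpg(example_vmr)  # the more variable CpG becomes the tagCpG
```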

Which models explain methylation of tagCpGs best?

We next compared the fit of four models for each of the 3982 tagCpGs (see Fig. 1): best SNP (G model), best environment (E model), SNP + environment (G + E model) and SNP × environment (GxE model). Association results for each model are listed in Supplementary Data 2–5. For each tagCpG, the model with the lowest AIC was chosen as the best model (see Methods section). In total, 40.6% of tagCpGs were best explained by GxE (n = 1616), followed by G (30%, n = 1194) and G + E (29%, n = 1171) (Fig. 2a). E alone was the best model for only one tagCpG. All tagCpGs and the respective SNPs and environments from the best model are listed in Supplementary Data 6–8 and Supplementary Table 1.
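The comparison for a single tagCpG could look roughly like the following. It is a simplified sketch, not the authors' pipeline: it assumes ordinary least squares with Gaussian errors, a single SNP coded as 0/1/2 dosage and a single environmental variable (the study selects the best SNP and best environment per model), and it omits the covariates. The data are simulated and all names are hypothetical.

```python
import numpy as np


def aic_ols(y, X):
    """AIC of a Gaussian linear model fitted by least squares.

    Uses AIC = 2k - 2 ln(L) at the maximised likelihood, where k counts
    the regression coefficients plus the residual variance.
    """
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    k = X.shape[1] + 1
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    return 2 * k - 2 * log_lik


def best_model(y, g, e):
    """Fit G, E, G + E and GxE for one tagCpG; return (winner, all AICs)."""
    ones = np.ones_like(y)
    designs = {
        "G": np.column_stack([ones, g]),
        "E": np.column_stack([ones, e]),
        "G+E": np.column_stack([ones, g, e]),
        "GxE": np.column_stack([ones, g, e, g * e]),
    }
    aics = {name: aic_ols(y, X) for name, X in designs.items()}
    return min(aics, key=aics.get), aics


# Simulated data (not from the study): methylation driven by a G-by-E product.
rng = np.random.default_rng(0)
n = 500
g = rng.integers(0, 3, size=n).astype(float)  # SNP dosage 0/1/2
e = rng.normal(size=n)                        # standardised exposure
y = 0.5 + 0.3 * g * e + rng.normal(scale=0.1, size=n)
winner, aics = best_model(y, g, e)
```

Because the AIC penalises each extra coefficient, the interaction model is only selected when the product term improves the fit by more than its added complexity.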

With regard to environmental factors, 27.0% of tagCpGs best explained by the G + E model were associated with environmental factors related to stress or, in particular, glucocorticoids (i.e., maternal betamethasone treatment), 40.8% with general maternal factors (mostly maternal age) and 32.2% with factors related to metabolism (pre-pregnancy BMI, hypertension, gestational diabetes). For tagCpGs with GxE as the best model, the proportions of environmental factors were similar with 22.2, 44.1 and 33.7%, respectively (see Fig. 2b).

We next looked into the delta AIC, i.e., the difference between the AIC of the best model and the AIC of the next best model (see Supplementary Note 2). GxE models won by a significantly larger AIC margin over the next best model than the other types of winning models (see Fig. 2c).
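As a small illustration of the delta AIC, the margin can be read off once the four AIC values for a tagCpG are available. The helper name and the AIC values below are hypothetical and purely illustrative.

```python
def delta_aic(aics):
    """Return (best_model, margin), with margin = AIC(next best) - AIC(best)."""
    ranked = sorted(aics.items(), key=lambda item: item[1])
    (best_name, best_aic), (_, runner_up_aic) = ranked[0], ranked[1]
    return best_name, runner_up_aic - best_aic


# Hypothetical AIC values for one tagCpG (illustrative only).
example = {"G": -1510.2, "E": -1498.7, "G+E": -1512.9, "GxE": -1521.4}
name, margin = delta_aic(example)
```

A larger margin means the winning model is separated more clearly from its closest competitor.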

DeepSEA prediction of SNP function

We were next interested in understanding the functionality of both the VMRs as well as the associated SNPs in the G, GxE and G + E models. For this we restricted the analyses to potentially functionally relevant SNPs identified using DeepSEA³⁸, rather than all linkage disequilibrium (LD)-pruned SNPs as described above. DeepSEA, a deep neural network pretrained with DNase-seq and ChIP-seq data from the ENCODE³⁹ project, predicts the presence of histone marks, DNase hypersensitive regions (DHS) or TF binding for a given 1 kb sequence. The likelihood that a specific genetic variant influences regulatory chromatin features is estimated by comparing predicted probabilities of two sequences where the bases at the central position are the reference and alternative alleles of a given variant. We reran the four models now restricting the cis-SNPs to those 36,241 predicted DeepSEA variants that were available in our imputed, quality-controlled genotype dataset.

Top results for models including G, GxE and G + E are depicted in Supplementary Data 9–12.

Results were comparable to what we observed before: 1195 (30.09%) of tagCpGs presented with best model G, 1193 CpGs (30.04%) with best model G + E, 1510 CpGs (38.02%) with best model GxE and 74 CpGs (1.86%) with best model E (Fig. 3a) and also showed similar differences in delta-AIC and proportions of E categories (see Supplementary Note 3). Only 10 tagCpGs did not present with any DeepSEA variant within 1 MB distance in cis and were therefore not further considered. All respective CpG-environment-DeepSEA SNP combinations are depicted in Supplementary Data 13–16.

The distribution of best models was not influenced by the degree of variability of DNA methylation, but was comparable across the whole range of DNA methylation variation (see Supplementary Note 4 and Supplementary Fig. 2). A slight enrichment for G + E models was observed in longer VMRs with at least 3 CpGs (p = 9.00 × 10⁻⁰⁶, OR = 1.39, Fisher-test, see Supplementary Fig. 3).
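Enrichment statements of this kind rest on a 2×2 contingency table. The sketch below implements a two-sided Fisher's exact test with the standard library only; it assumes a two-sided test and reports the simple sample odds ratio, neither of which is specified in the text (statistical packages may instead report a conditional maximum-likelihood estimate). The counts in the example are made up.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). The p-value sums the
    hypergeometric probabilities of all tables with the same margins that
    are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so that ties with the observed table are included.
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Made-up counts: rows = VMRs with >= 3 CpGs vs 2 CpGs,
# columns = best model G + E vs any other best model.
odds_ratio, p_value = fisher_exact_2x2(3, 1, 1, 3)
```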

In conclusion, also when we focus on potentially functionally relevant SNPs, it is the combination of genotype and environment which best explains VMRs.

We observed that, as expected, different types of exposures or maternal factors have different relative impact on DNA methylation (see Supplementary Note 5). However, even for those exposures with the highest fraction of VMRs best explained by E alone, combined models of G + E and GxE remain the best models in even higher fractions of VMRs (see Supplementary Fig. 4B).

Functional annotation of different best models

Focusing on combinations between tagCpGs, environmental factors and DeepSEA variants, we found functional differences for both the SNPs as well as the tagCpGs (see Supplementary Note 6) within the different models. Overall, 895 DeepSEA variants were uniquely involved in best G models, 905 were uniquely in best G + E models and 1162 uniquely in best GxE models. As a DeepSEA variant can be in multiple 1 MB-cis windows around the tagCpGs, several DeepSEA variants were involved in multiple best models: 138 DeepSEA variants overlapped between G and GxE, 118 between G and G + E and 147 between GxE and G + E VMRs. We observed no significant differences with regard to gene-centric location for DeepSEA variants involved only in G models, only in G + E models or in multiple models. However, DeepSEA variants involved only in GxE models were significantly depleted for promoter locations (p = 3.92 × 10⁻⁰², OR = 0.79, Fisher-test, see Supplementary Fig. 5A).

Although no significant differences were present, DeepSEA SNPs involved in the G and G + E model were located in closer proximity to the specific CpG (model G: mean absolute distance = 256.8 kb, sd = 291.2 kb, model G + E: mean absolute distance = 244.8 kb, sd = 284.0 kb, Supplementary Fig. 5B) whereas DeepSEA SNPs involved in GxE models (mean absolute distance = 352.6 kb, sd = 305.3 kb) showed broader peaks around the CpGs.

With regards to histone marks, DeepSEA variants in general were enriched across multiple histone marks indicative of active transcriptional regulation (Fig. 4c). DeepSEA variants involved in best model G + E showed further enrichment for strong transcription (p = 7.19 × 10⁻⁰³, OR = 1.34, Fisher-test) as well as depletion for quiescent loci (p = 7.17 × 10⁻⁰³, OR = 0.78, Fisher-test). In contrast, GxE DeepSEA variants were significantly enriched in these regions (p = 2.62 × 10⁻⁰², OR = 1.22, Fisher-test, Fig. 4d).

Taken together, these analyses indicate that both the genetic variants and the VMRs in the different best models (G, GxE and G + E) preferentially annotate to functionally distinct genomic regions.

Replication of best models in independent cohorts

To assess whether the relative distribution of the best models for VMRs and DeepSEA variants was stable across different samples, we assessed the relative distribution of these models in 3 additional samples (DCHS I and DCHS II, UCI and PREDO II) with VMR data from both the Illumina 450 K and the Illumina EPIC arrays. Information on these cohorts is summarised in Table 1 and the number of VMRs, the distribution of VMR methylation levels, VMR length and specific SNP information are given in Supplementary Note 7 and Supplementary Fig. 6.

While major maternal factors overlapped among the cohorts - such as maternal age, delivery method, parity and depression during pregnancy - there were also differences, as the non-PREDO cohorts did not include betamethasone treatment but additionally included maternal smoking (see Table 1). Despite these differences and differences in the total number of VMRs, the overall pattern remained stable: in all 4 analyses, DCHS I, DCHS II, UCI and PREDO II, we replicated that E alone models almost never explained most of the variance, while G alone models explained the most variance in up to 15% of the VMRs; G + E in up to 32%; and GxE models in up to 60% (see Fig. 5 and Table 2).

Table 2 VMRs and best models across cohorts

The importance of including G for a best model fit could also be observed for maternal smoking, described as one of the most highly replicated factors shaping the newborns’ methylome¹¹ and present in the replication but not the discovery cohort PREDO I. These analyses are detailed in Supplementary Note 8.

We were also able to replicate our finding that GxE VMRs were enriched for OpenSea positions, with a trend on the 450 K array (DCHS I, OR = 1.11, p = 5.03 × 10⁻⁰², Fisher-test) and significantly for the EPIC array data (PREDO II: p = 2.96 × 10⁻⁰⁶, OR = 1.29, UCI: p = 3.79 × 10⁻⁰², OR = 1.09, DCHS II: p = 2.91 × 10⁻⁰⁴, OR = 1.16, Fisher-tests). For all additional cohorts, the delta AIC for best model GxE to the next best model was also significantly higher as compared to CpGs with G, E or G + E as the best model.

Overall, 387 tagCpGs overlapped between PREDO I, PREDO II, DCHS I and DCHS II (see Supplementary Fig. 7), which allowed us to test the consistency of the best models for specific VMRs across the different cohorts. Over 70% of the overlapping tagCpGs showed consistent best models in at least 3 cohorts (see Fig. 6) with GxE being the most consistent model (for over 60% of consistent models, see Supplementary Fig. 8). Focusing only on EPIC data (PREDO II, DCHS II and UCI), we identified more, namely 2091, tagCpGs that overlap across the three cohorts and here 86% show a consistent best model in at least two of the three cohorts, despite differences in study design, prenatal phenotypes and ethnicity.
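The consistency check across cohorts amounts to counting, per tagCpG, how many cohorts agree on the best model. The following sketch uses made-up best-model calls and a hypothetical helper name; it illustrates the counting logic only.

```python
from collections import Counter


def consistent_model(best_by_cohort, min_cohorts):
    """Return the model shared by at least `min_cohorts` cohorts, else None."""
    model, count = Counter(best_by_cohort.values()).most_common(1)[0]
    return model if count >= min_cohorts else None


# Hypothetical best-model calls for three overlapping tagCpGs (made-up labels).
calls = {
    "cpg_1": {"PREDO I": "GxE", "PREDO II": "GxE", "DCHS I": "GxE", "DCHS II": "G"},
    "cpg_2": {"PREDO I": "G", "PREDO II": "G+E", "DCHS I": "GxE", "DCHS II": "G"},
    "cpg_3": {"PREDO I": "G+E", "PREDO II": "G+E", "DCHS I": "G+E", "DCHS II": "G+E"},
}
consistent = {cpg: consistent_model(m, min_cohorts=3) for cpg, m in calls.items()}
share = sum(v is not None for v in consistent.values()) / len(consistent)
```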

Thus, the additional cohorts not only showed a consistent replication of the proportion of the models best explaining variance of VMRs but also consistency of the best model for specific VMRs. Within this context, we observed that GxE models were the most consistent models across the cohorts (see Supplementary Fig. 8), with 85% of the CpGs with consistent models across 5 cohorts having GxE as the best model. Furthermore, we could validate specific GxE combinations between PREDO I and MoBa as shown in Supplementary Note 9, in Supplementary Data 17 and 18 and in Supplementary Fig. 9.

Disease relevance

Finally, we tested whether functional DeepSEA SNPs involved in only G, only GxE and only G + E models in PREDO I were enriched in GWAS hits. We used all functional SNPs and their LD proxies (defined as r² of at least 0.8 in the PREDO cohort and within a maximal distance of 1 MB to the target SNP) and performed enrichment analysis with the overlap of nominally significant GWAS hits. We selected a broad spectrum of GWAS, including GWAS for complex disorders for which differences in prenatal environment are established as risk factors, but also including GWAS on other complex diseases. For psychiatric disorders, we used summary statistics of the Psychiatric Genomics Consortium (PGC) including association studies for autism⁴⁰, attention-deficit-hyperactivity disorder⁴¹, bipolar disorder⁴², major depressive disorder⁴³, schizophrenia⁴⁴ and the cross-disorder associations including all five of these disorders⁴⁵. Additionally, we included GWAS of inflammatory bowel disease⁴⁶, type 2 diabetes⁴⁷ and BMI⁴⁸. Nominally significant GWAS findings were enriched for DeepSEA variants and their LD proxies per se across psychiatric as well as non-psychiatric diseases (Fig. 7a). However, G, GxE and G + E DeepSEA variants showed a differential enrichment pattern above all DeepSEA variants (Fig. 7b), with strongest enrichments of GxE DeepSEA variants in GWAS of autism spectrum disorder (p < 2.20 × 10⁻¹⁶, OR = 2.07 above DeepSEA, Fisher-test), attention-deficit-hyperactivity disorder (p < 2.20 × 10⁻¹⁶, OR = 1.71, Fisher-test) and inflammatory bowel disease (p < 2.20 × 10⁻¹⁶, OR = 1.71, Fisher-test) and G + E DeepSEA variants in GWAS for attention-deficit-hyperactivity disorder (p = 9.54 × 10⁻³⁶, OR = 1.23, Fisher-test) and inflammatory bowel disease (p = 1.85 × 10⁻⁵², OR = 1.30, Fisher-test). While SNPs with strong main meQTL effects such as those within G and G + E models have been reported to be enriched in GWAS for common disease, we now also show this for SNPs within GxE models that often have non-significant main G effects.
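The LD-proxy definition (r² of at least 0.8 and at most 1 MB from the target SNP) can be illustrated as below. This sketch approximates LD by the squared Pearson correlation of genotype dosages, which is an assumption; dedicated genetics tools may compute r² differently. SNP names, positions and genotypes are invented.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)


def ld_proxies(target, snps, min_r2=0.8, max_dist=1_000_000):
    """Names of SNPs within `max_dist` bp of `target` with r2 >= `min_r2`.

    `snps` maps a SNP name to (base-pair position, dosage vector).
    """
    t_pos, t_geno = snps[target]
    return [
        name
        for name, (pos, geno) in snps.items()
        if name != target
        and abs(pos - t_pos) <= max_dist
        and r_squared(t_geno, geno) >= min_r2
    ]


# Invented example: one nearby correlated SNP, one nearby uncorrelated SNP,
# and one perfectly correlated SNP that lies too far away.
snps = {
    "snp_target": (1_000_000, [0, 1, 2, 1, 0, 2, 1, 0]),
    "snp_near": (1_200_000, [0, 1, 2, 1, 0, 2, 1, 1]),
    "snp_unlinked": (1_500_000, [2, 0, 1, 1, 2, 0, 0, 1]),
    "snp_far": (5_000_000, [0, 1, 2, 1, 0, 2, 1, 0]),
}
proxies = ld_proxies("snp_target", snps)
```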

Discussion

We evaluated the effects of prenatal environmental factors and genotype on DNA methylation at VMRs identified in neonatal blood samples. We found that most variable methylation sites were best explained by either genotype and prenatal environment interactions (GxE) or additive effects (G + E) of these factors, followed by main genotype effects. This pattern was replicated in independent cohorts and underscores the need to consider genotype in the study of environmental effects on DNA methylation.

In fact, VMRs best explained by G, G + E or GxE and their associated functional genetic variants were located in distinct genomic regions, suggesting that different combinatorial effects of G and E may impact VMRs with distinct downstream regulatory effects and thus possibly context-dependent impact on cellular function. We also observed that functional variants with best models G, G + E or GxE, all showed significant enrichment within GWAS signals for complex disorders beyond the enrichment of the functional variants themselves. While this was expected for G and G + E models based on results from previous studies^21,23,24,26, it was surprising for GxE SNPs, as these often do not have highly significant main genetic effects. Their specific enrichment in GWAS for common disorders supports the importance of these genetic variants that moderate environmental impact both at the level of DNA methylation but also, potentially, for disease risk.

The fact that GxE and G + E best explained the majority of VMRs (see Fig. 5) and that GxE models were selected by a larger margin than the other models (see Fig. 2c) was consistently found across all tested cohorts. These findings are in line with a previous report by Teh et al.²⁹ who performed a similar analysis based on AIC in umbilical cord tissue. Differences to the findings by Teh et al. are discussed in the Supplemental Discussion. Using data from four different cohorts, we not only saw comparable proportions of VMRs best explained by the different models, but also saw in the VMRs common across cohorts that specific VMRs had consistent best models (see Fig. 6). This is in line with the fact that VMRs best explained by G, GxE or G + E show functional differences and may differentially impact gene regulation.

In addition to consistent findings using AIC-based approaches, we also observed some indication for validation of individual GxE and G + E combinations on selected VMRs using p-value based criteria, with a small number of specific G + E and GxE effects on VMRs replicating between the PREDO I and the MoBa cohort. The low number of specific replications could be due to lack of overall power as well as larger differences in prenatal factors between these two cohorts (see Table 1). As shown in Supplementary Fig. 4B, which specific G and E combinations best explain VMRs is also dependent on the specific prenatal factors. Larger cohorts that are more homogeneous with regard to exposures will be needed for such analyses to be more conclusive.

While E alone was rarely the best model, it should be pointed out that main environmental effects on DNA methylation were observed (see Supplementary Data 3) and were consistent with previous large meta-analyses, such as in the case of maternal smoking (see Supplementary Note 7). Within the MoBa cohort, the cohort with the largest proportion of maternal smoking, 10% of all tagCpGs were best explained by maternal smoking alone. However, in all other cohorts, where smoking was less prevalent, the inclusion of genotypic effects in addition to maternal smoking explained more of the variance. This supports that while main E effects on the newborn methylome are present, genotype is an important factor that, in combination with E, may explain even more of the variance in DNA methylation.

VMRs best explained by either E, G, G + E or GxE and their associated functional SNPs were enriched for distinct genomic locations and chromatin states (see Fig. 4), suggesting that VMRs moderated by different combinations of G and E may in fact have distinct functional roles in gene regulation. Overall, VMRs best explained by GxE were consistently enriched for OpenSea regions, which have lower CpG density and are located farthest from CpG islands⁴⁹. OpenSea regions have been reported to be enriched for environmentally associated CpGs, for example those associated with exposure to childhood trauma⁵⁰, and may harbour more long-range enhancers.

In addition to their position relative to CpG islands and their CpG content, G, GxE and G + E VMRs and their associated functional SNPs also showed distinct enrichments for chromatin marks. Compared to 450 K VMRs in general, VMRs with GxE as the best models were relatively depleted in regions surrounding the TSS, while VMRs with G + E were relatively enriched in these regions (see Fig. 4), suggesting that GxE VMRs are located at more distance from the TSS than G + E VMRs. To better map the potential functional variants in these models and to compare methylation-associated SNPs from a regulatory perspective, we used DeepSEA³⁸, a machine learning algorithm that predicts SNP functionality from the sequence context based on sequencing data for different regulatory elements in different cell lines using ENCODE data³⁹. We identified the SNPs with putatively functional consequences on regulatory marks by DeepSEA and compared putative regulatory effects of G, G + E and GxE hits. Relative to the imputed non-DeepSEA SNPs contained in our dataset, these predicted functional DeepSEA SNPs were enriched for TSS and enhancer regions and depleted for quiescent regions, supporting their relevance in regulatory processes (see Fig. 4). Compared to DeepSEA SNPs overall, DeepSEA SNPs within the three different best models also showed distinct enrichment or depletion patterns. Similar to GxE VMRs, likely functional GxE SNPs also showed a relative depletion in TSS regions while G + E SNPs showed enrichment in genic enhancers. Overall, both the VMRs as well as the associated functional SNPs appear to be in distinct regulatory regions, depending on their best model. In addition, GxE functional SNP and tagCpGs were located farther apart than SNP/tagCpG pairs within G or G + E models (see Supplementary Fig. 5B), supporting a more long-range type of regulation in GxE interactions on molecular traits as compared to all genes; a similar relationship has been reported previously for GxE with regard to gene expression in C. elegans^51,52.

SNPs associated with differences in gene expression but also DNA methylation have consistently been shown to be enriched among SNPs associated with common disorders in GWAS^21,24,26,53. The functional genetic variants that were within G, GxE or G + E models predicting variable DNA methylation were even further enriched in GWAS association results (beyond the baseline enrichment of DeepSEA SNPs per se). The fact that such enrichment was observed not only for G and G + E SNPs, with strong main genetic effects, but also for GxE SNPs, with smaller to sometimes no main genetic effect on DNA methylation, underscores the importance of also including SNPs within GxE models in the functional annotation of GWAS. A detailed catalogue of meQTLs that are responsive to environmental factors could support a better pathophysiological understanding of diseases for which risk is shaped by a combination of environment and genetic factors.

Finally, we want to note the limitations of this study. First, we restricted our analyses to specific DNA methylation array contents that are inherently biased as compared to genome-wide bisulfite sequencing, for example. In addition, we restricted our analysis to VMRs, which also limits the generalisability of the findings, but also has advantages. Ong and Holbrooke⁵⁴ showed that this approach increases statistical power. Furthermore, VMRs are enriched for enhancers and transcription factor binding sites, overlap with GWAS hits⁵⁵ and are associated with gene expression of nearby genes at these sites⁵⁶. VMRs in this study presented with intermediate methylation levels which have been shown to be enriched in regions of regulatory function, like enhancers, exons and DNase I hypersensitivity sites⁵⁷. Hence, the effects of genotypes on DNA methylation levels in VMRs might be higher as compared to less variable CpG-sites. In addition, genotypes are measured with much less error as compared to environmental factors which may also reduce the overall explained variance in large cohorts.

Second, it has been reported that different cell types display different patterns of DNA methylation⁵⁵. Therefore, the most variable CpG-sites may also include those that reflect differences in cord blood cell type proportions. To address this issue, all analyses were corrected for estimated cell proportions, to the extent that currently available methods allow, so that differences in cell type proportion are unlikely to account for all of the observed effects. However, only replication in specific cell types will be able to truly assess the proportion of VMRs influenced by this.

Third, we used the AIC as the main criterion for model fit³⁷, which is equivalent to a penalised likelihood function. There are a variety of other model selection criteria⁵⁸, and choosing between these is an ongoing debate which also depends on the underlying research question. We decided to use the AIC because one of our main aims was to compare our results with the study of Teh et al.²⁹, in which this criterion was applied, and because this method may be more powerful for detecting GxE than, for example, model selection criteria based on lowest p-values.

Fourth, all reported interactions are statistical interactions and limited to a cis window around the CpG-site. Further experiments are required to assess whether these would also reflect biological/mechanistic interactions. Much larger cohorts will be needed to assess potential trans effects. Additional inclusion of further covariates such as maternal smoking or maternal age may further modify the effects of specific Es but is beyond the scope of this manuscript.

Fifth, as summarised in Table 1, the results presented are based on cohorts which differ in ethnicity, assessed phenotypes, methylation and SNP arrays, processing pipelines and sample sizes. While all these factors may contribute to differences in the proportions of models across the cohorts, this heterogeneity also suggests that our findings are quite robust to these methodological differences.

Finally, our analyses are restricted to DNA methylation in neonatal blood and to pregnancy environments. Whether similar conclusions can be drawn for methylation levels assessed at a later developmental stage needs to be investigated.

We tested whether genotype, a combination of different prenatal environmental factors and the additive or the multiplicative interactive effects of both mainly influence VMRs in the newborn’s epigenome. Our results show that G in combination with E are the best predictors of variance in DNA methylation. This highlights the importance of including both individual genetic differences as well as environmental phenotypes into epigenetic studies and also the importance of improving our ability to identify environmental associations. Our data also support the disease relevance of variants predicting DNA methylation together with the environment beyond main meQTL effects, and the view that there are functional differences of additive and interactive effects of genes and environment on DNA methylation. Improved understanding of these functional differences may also yield novel insights into pathophysiological mechanisms of common non-communicable diseases, as risk for all of these disorders is driven by both genetic and environmental factors.

Methods

The PREDO cohort

The Prediction and Prevention of Preeclampsia and Intrauterine Growth Restriction (PREDO) Study is a longitudinal multicenter pregnancy cohort study of Finnish women and their singleton children born alive between 2006 and 2010 ³⁰. We recruited 1,079 pregnant women, of whom 969 had one or more and 110 had none of the known clinical risk factors for preeclampsia and intrauterine growth restriction. The recruitment took place when these women attended the first ultrasound screening at 12 + 0–13 + 6 weeks + days of gestation in one of the ten hospital maternity clinics participating in the study. The cohort profile³⁰ contains details of the study design and inclusion criteria.

Ethics

The study protocol was approved by the Ethical Committees of the Helsinki and Uusimaa Hospital District and by the participating hospitals. A written informed consent was obtained from all women.

Maternal characteristics

We tested 10 different maternal environments:

Depressive symptoms

Starting from 12 + 0–13 + 6 gestational weeks + days pregnant women filled in the 20 item Center for Epidemiological Studies Depression Scale (CES-D)⁵⁹ for depressive symptoms in the past 7 days. They filled in the CES-D scale biweekly until 38 + 0–39 + 6 weeks + days of gestation or delivery. We used the mean-value across all the CES-D measurements.

Symptoms of anxiety

At 12 + 0–13 + 6 weeks + days of gestation, women filled in the 20 item Spielberger’s State Trait Anxiety Inventory (STAI)⁶⁰ for anxiety symptoms in the past 7 days. They filled in the STAI scale biweekly until 38 + 0–39 + 6 weeks + days of gestation or delivery. We used the mean-value across all these measurements.

Betamethasone

Antenatal betamethasone treatment (yes/no) was derived from the hospital records and the Finnish Medical Birth Register (MBR).

Delivery method

Mode of delivery (vaginal delivery vs. caesarean section) was derived from patient records and MBR.

Parity

Parity (number of previous pregnancies leading to childbirth) at the start of present pregnancy was derived from the hospital records and the MBR.

Maternal age

Maternal age at delivery (years) was derived from the hospital records and the MBR.

Pre-pregnancy BMI

Maternal pre-pregnancy BMI (kg/m²), calculated from weight and height measurements verified at the first antenatal clinic visit at 8 + 4 (SD 1 + 3) gestational weeks, was derived from the hospital records and the MBR.

Hypertension

Hypertension was defined as any hypertensive disorder including gestational hypertension, chronic hypertension and preeclampsia against normotension. Gestational hypertension was defined as systolic/diastolic blood pressure ≥140/90 mm Hg on ≥2 occasions at least 4 h apart in a woman who was normotensive before 20th week of gestation. Preeclampsia was defined as systolic/diastolic blood pressure ≥140/90 mm Hg on ≥2 occasions at least 4 h apart after 20th week of gestation and proteinuria ≥300 mg/24 h. Chronic hypertension was defined as systolic/diastolic blood pressure ≥140/90 mm Hg on ≥2 occasions at least 4 h apart before 20th gestational week or medication for hypertension before 20 weeks of gestation.

Gestational diabetes and oral glucose tolerance test

Gestational diabetes was defined as fasting, 1 h or 2 h plasma glucose during a 75 g oral glucose tolerance test ≥5.1, ≥10.0 and/or ≥8.5 mmol/L, respectively, that emerged or was first identified during pregnancy. We took the area under the curve from the three measurements as a single measure for the oral glucose tolerance test (OGTT) itself.
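As an illustration of the two glucose-based measures described above, the following minimal sketch applies the stated diagnostic thresholds and computes an area under the curve from the fasting, 1 h and 2 h values. The use of the trapezoidal rule and the function names `has_gestational_diabetes` and `ogtt_auc` are assumptions made for illustration; the text does not specify how the area was computed.

```python
def has_gestational_diabetes(fasting, one_hour, two_hour):
    """Apply the thresholds from the text (mmol/L): any one exceeded counts."""
    return fasting >= 5.1 or one_hour >= 10.0 or two_hour >= 8.5


def ogtt_auc(fasting, one_hour, two_hour):
    """Area under the glucose curve (mmol/L x h) for samples at 0, 1 and 2 h.

    Assumption: trapezoidal integration over equally spaced time points.
    """
    times = [0.0, 1.0, 2.0]
    values = [fasting, one_hour, two_hour]
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (values[i] + values[i - 1]) / 2.0
    return area
```

For example, glucose values of 4.0, 8.0 and 6.0 mmol/L give two trapezoids of 6.0 and 7.0, so an area of 13.0 mmol/L x h.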

Genotyping and imputation

Genotyping was performed on Illumina Human Omni Express Exome Arrays containing 964,193 SNPs. Only markers with a call rate of at least 98%, a minor allele frequency of at least 1% and a p-value for deviation from Hardy-Weinberg-Equilibrium >1.0 × 10⁻⁰⁶ were kept in the analysis. After QC, 587,290 SNPs were available.
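The marker filters above (call rate, minor allele frequency and Hardy-Weinberg equilibrium) can be sketched for a single SNP as follows. This is an illustrative sketch only: the function name `snp_qc` is hypothetical, genotypes are assumed to be coded as 0/1/2 allele counts with `None` for missing calls, and the Hardy-Weinberg p-value uses a one-degree-of-freedom chi-square approximation rather than whichever test the genotyping software applied.

```python
import math


def snp_qc(genotypes, min_call_rate=0.98, min_maf=0.01, min_hwe_p=1e-6):
    """Return call rate, MAF, HWE p-value and a keep flag for one SNP."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes)
    if n == 0:
        return {"call_rate": 0.0, "maf": 0.0, "hwe_p": 1.0, "keep": False}
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    freq_b = (n_ab + 2 * n_bb) / (2 * n)
    freq_a = 1.0 - freq_b
    maf = min(freq_a, freq_b)
    if maf == 0.0:
        hwe_p = 1.0  # monomorphic marker: no deviation can be tested
    else:
        expected = [n * freq_a ** 2, 2 * n * freq_a * freq_b, n * freq_b ** 2]
        observed = [n_aa, n_ab, n_bb]
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Survival function of a chi-square distribution with 1 df.
        hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    keep = call_rate >= min_call_rate and maf >= min_maf and hwe_p > min_hwe_p
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "keep": keep}
```

The default thresholds mirror the values stated in the text; other cohorts in this study used different cut-offs, which can be passed as arguments.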

In total, 996 cord blood samples were genotyped. Samples with a call rate below 98% (n = 11) were removed.

Any pair of samples with IBD estimates >0.125 was checked for relatedness. As we corrected for admixture in our analyses using MDS-components (see Supplementary Fig. 10), these samples were kept except for one pair which could not be resolved. From this pair we excluded one sample from further analysis. Individuals showing discrepancies between phenotypic and genotypic sex (n = 1) were removed. We also checked for heterozygosity outliers but found none. Nine hundred and eighty-three participants were available in the final dataset.

Before imputation, AT and CG SNPs were removed. Imputation was performed using shapeit2 (http://mathgen.stats.ox.ac.uk./genetics_software/shapeit/shapeit.html) and impute2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html). Chromosomal and base pair positions were updated to the 1000 Genomes Phase 3 reference set, allele strands were flipped where necessary.

After imputation, we reran quality control, filtering out SNPs with an info score <0.8, a minor allele frequency below 1% and a deviation from HWE with a p-value <1.0 × 10⁻⁰⁶.

This resulted in a dataset of 9,402,991 SNPs. After conversion into best guessed genotypes using a probability threshold of 90%, we performed another round of QC (using a SNP call rate of at least 98%, a MAF of at least 1% and a p-value threshold for HWE of 1.0 × 10⁻⁰⁶), after which 7,314,737 SNPs remained for the analysis.
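The conversion of imputed genotype probabilities into best guessed genotypes can be illustrated with a short sketch. The function name `best_guess_genotype` is hypothetical, and treating a probability exactly equal to the threshold as a valid call is an assumption; calls that do not reach the threshold are set to missing, which is why a further call-rate filter is applied afterwards.

```python
def best_guess_genotype(probs, threshold=0.9):
    """Convert (P(AA), P(AB), P(BB)) into a hard call of 0, 1 or 2.

    Returns None (missing) when no genotype reaches the threshold.
    """
    best = max(range(3), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None
```

For instance, probabilities of (0.05, 0.93, 0.02) yield a heterozygous call, whereas (0.50, 0.45, 0.05) yields a missing genotype.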

To evaluate which model best explained the methylation sites, we pruned the dataset using an r² threshold of 0.2 and a window size of 50 SNPs with an overlap of 5 SNPs. The final, pruned dataset contained 788,156 SNPs. Of these variants, 36,241 were DeepSEA variants (see Methods below).
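Window-based LD pruning of this kind can be sketched as a greedy procedure. The sketch below is a simplification and not the pruning tool used in the study: the function names are hypothetical, r² is computed as the squared Pearson correlation of 0/1/2 genotype counts, the 5-SNP parameter is treated as the step by which the window slides (an assumption), and the later SNP of a correlated pair is always the one removed.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def ld_prune(genotype_matrix, window=50, step=5, r2_threshold=0.2):
    """Return indices of SNPs kept after greedy window-based pruning.

    genotype_matrix: SNPs in chromosomal order, each a list of 0/1/2
    counts per sample.
    """
    n_snps = len(genotype_matrix)
    removed = set()
    start = 0
    while start < n_snps:
        in_window = [i for i in range(start, min(start + window, n_snps))
                     if i not in removed]
        for pos, i in enumerate(in_window):
            if i in removed:
                continue
            for j in in_window[pos + 1:]:
                if j in removed:
                    continue
                if r_squared(genotype_matrix[i], genotype_matrix[j]) > r2_threshold:
                    removed.add(j)  # drop the later SNP of the pair
        start += step
    return [i for i in range(n_snps) if i not in removed]
```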

DNA methylation

Cord blood samples were run on Illumina 450k Methylation arrays. The quality control pipeline was set up using the R-package minfi⁶¹ (https://www.r-project.org). Three samples were excluded as they were outliers in the median intensities. Furthermore, 20 samples showed discordance between phenotypic sex and estimated sex and were excluded. Nine samples were contaminated with maternal DNA according to the method suggested by Morin et al.⁶² and were also removed.

Methylation beta-values were normalised using the funnorm function⁶³. After normalisation, two batches, i.e., slide and well, were significantly associated and were removed iteratively using the Combat function⁶⁴ in the sva package⁶⁵.

We excluded any probes on chromosome X or Y, probes containing SNPs and cross-hybridising probes according to Chen et al.⁵³ and Price et al.⁶⁶ Furthermore, any CpGs with a detection p-value >0.01 in at least 25% of the samples were excluded.

The final dataset contained 428,619 CpGs and 822 participants. Genotypes were also available for 817 of these participants.

An additional 161 cord blood samples were run on Illumina EPIC Methylation arrays.

Three samples were excluded as they were outliers in the median intensities. Three samples showed discordance between phenotypic sex and estimated sex and were excluded. Three samples were contaminated with maternal DNA and were also removed⁶².

Methylation beta-values were normalised using the funnorm function⁶³ in the R-package minfi⁶¹. Three samples showed density artefacts after normalisation and were removed from further analysis. We excluded any probes on chromosome X or Y, probes containing SNPs and cross-hybridising probes according to Chen et al.⁵³, Price et al.⁶⁶ and McCartney et al.⁶⁷. Furthermore, any CpGs with a detection p-value >0.01 in at least 25% of the samples were excluded. The final dataset contained 812,987 CpGs and 149 samples. After normalisation no significant batches were identified. For 146 of these samples, genotypic data were also available.

Cord blood cell counts were estimated for seven cell types (nucleated red blood cells, granulocytes, monocytes, natural killer cells, B cells, CD4(+)T cells, and CD8(+)T cells) using the method of Bakulski et al.⁶⁸ which is incorporated in the R-package minfi⁶¹.

Identification of VMRs (variable methylated regions)

The VMR approach was described by Ong and Holbrook⁵⁴. We chose all 42,862 CpGs with a MAD score greater than the 90th percentile. For each CpG-site, the MAD (median absolute deviation) is defined as the median of the absolute deviations of each individual's methylation beta-value at this CpG-site from the CpG's median. A candidate VMR was defined as at least two spatially contiguous probes which were at most 1 kb apart from each other. This resulted in 3982 VMRs in the 450k sample and in 8547 VMRs in the EPIC sample. The CpG with the highest MAD score was chosen as representative of the whole VMR in the statistical analysis.
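The steps above (MAD score per CpG, selection above the 90th percentile, grouping of nearby probes, and choice of a representative tagCpG) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the percentile uses linear interpolation, and "spatially contiguous" is interpreted as consecutive highly variable probes on the same chromosome lying at most 1 kb apart.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of one CpG's beta-values from their median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def percentile_value(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def find_vmrs(probes, percentile=90, max_gap=1000, min_probes=2):
    """probes: list of (chromosome, position, beta_values) tuples.

    Returns one dict per candidate VMR with the member probe indices and
    the index of the tag probe (highest MAD score).
    """
    scores = [mad(betas) for _, _, betas in probes]
    cutoff = percentile_value(scores, percentile)
    variable = sorted(
        (i for i, s in enumerate(scores) if s > cutoff),
        key=lambda i: (probes[i][0], probes[i][1]),
    )
    groups, current = [], []
    for i in variable:
        if current:
            prev = current[-1]
            same_chrom = probes[prev][0] == probes[i][0]
            close = probes[i][1] - probes[prev][1] <= max_gap
            if not (same_chrom and close):
                groups.append(current)
                current = []
        current.append(i)
    if current:
        groups.append(current)
    return [
        {"probes": g, "tag": max(g, key=lambda i: scores[i])}
        for g in groups
        if len(g) >= min_probes
    ]
```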

The Drakenstein cohort

Details on this cohort and the assessed phenotypes can be found in refs. ^34,35. The birth cohort design recruits pregnant women attending one of two primary health care clinics in the Drakenstein sub-district of the Cape Winelands, Western Cape, South Africa – Mbekweni (serving a black African population) and TC Newman (serving a mixed ancestry population). Consenting mothers were enrolled during pregnancy, and mother–child dyads are followed longitudinally until children reach at least 5 years of age. Mothers are asked to request that the father of the index pregnancy attend a single antenatal study visit where possible. Follow-up visits for mother–child dyads take place at the two primary health care clinics and at Paarl Hospital.

Pregnant women were eligible to participate if they were 18 years or older, were accessing one of the two primary health care clinics for antenatal care, had no intention to move out of the district within the following year, and provided signed written informed consent. Participants were enrolled between 20 and 28 weeks' gestation, upon presenting for an antenatal care visit. In addition, consenting fathers of the index pregnancy, when available, were enrolled in the study and attended a single antenatal study visit.

Ethics

The study was approved by the Faculty of Health Sciences, Human Research Ethics Committee, University of Cape Town (401/2009), by Stellenbosch University (N12/02/0002), and by the Western Cape Provincial Health Research committee (2011RP45). All participants provided written informed consent.

Maternal characteristics

After providing consent, participants were asked to complete a battery of self-report and clinician-administered measures at a number of antenatal and postnatal study visits. All assessed phenotypes are described in detail in ref. ³⁴. Here, we give a short outline on the phenotypes which were used in our analysis. Maternal parity was obtained from the antenatal record; maternal age was from the date of birth as recorded on the mothers’ national identity document. The mode of delivery was ascertained by direct observation of the birth by a member of the study team as all births occurred at Paarl hospital. The SRQ-20⁶⁹ is a WHO-endorsed measure of psychological distress consisting of 20 items which assess non-psychotic symptoms, including symptoms of depressive and anxiety disorders. Each item is scored according to whether the participant responds in the affirmative (scored as 1) or negative (scored as 0) to the presence of a symptom. Individual items are summed to generate a total score. The Beck Depression Inventory (BDI-II) is a widely-used and reliable measure of depressive symptoms⁷⁰. The BDI-II comprises 21 items, each of which assesses the severity of a symptom of major depression. Each item is assessed on a severity scale ranging from 0 (absence of symptoms) to 3 (severe, often with functional impairment). A total score is then obtained by summing individual item responses, with a higher score indicative of more severe depressive symptoms.

Smoking was assessed using The Alcohol, Smoking and Substance Involvement Screening Test (ASSIST)⁷¹, a tool that was developed by the WHO to detect and manage substance use among people attending primary health care services. The tool assesses substance use and substance-related risk across 10 categories (tobacco, alcohol, cannabis, cocaine, amphetamine-type stimulants, inhalants, sedatives/sleeping pills, hallucinogens, opioids and other substances), as well as enquiring about a history of intravenous drug use. Total scores are obtained for each substance by summing individual item responses, with a higher score indicative of greater risk for substance-related health problems.

Hypertension was assessed by blood pressure measured antenatally.

Genotyping and Imputation

Genotyping in DCHS was performed using the Illumina PsychArray for those samples with 450k data, or the Illumina GSA for those samples with EPIC DNA methylation data (Illumina, San Diego, USA). For both array types, QC and imputation was the same; first, raw data was imported into Genome Studio and exported into R for QC. SNPs were filtered out if they had a tenth percentile GC score below 0.2 or an average GC score below 0.1, for a total of 140 SNPs removed. Phasing was performed using shapeit, and imputation was performed using impute2 with 1000 Genomes Phase 1 reference data. After imputation, we used qctool to filter out SNPs with an info score <0.8 or out of Hardy–Weinberg equilibrium. All SNPs with MAF <1% were removed.

Because only 5286 DeepSEA variants were available after imputation for the samples genotyped on the PsychArray, and only 4049 for the samples genotyped on the GSA chip, we performed LD-pruning based on an r² threshold of 0.2 and a window size of 50 SNPs with an overlap of 5 SNPs. This resulted in 162,292 SNPs (PsychArray) and 176,553 SNPs (GSA chip).

DNA methylation

We performed basic quality control on data generated by either the 450k or EPIC arrays using Illumina’s Genome Studio software for background subtraction and colour correction. Data was filtered to remove CpGs with high detection p values, those on the X or Y chromosome, or with previously identified poor performance. 450k data was normalised using SWAN and EPIC data using BMIQ, and both used ComBat to correct for chip (both), and row (450k only). Details for DNA methylation measurements and quality control have been published⁶². The final analysis was performed with 107 samples with methylation levels from the 450k array and 151 with methylation levels assessed on the EPIC array and available genotypes. Neonatal blood cell counts were estimated for seven cell types: nucleated red blood cells, granulocytes, monocytes, natural killer cells, B cells, CD4(+)T cells, and CD8(+)T cells⁶⁸.

VMRs

We identified 6072 candidate VMRs in DCHS I and 10,005 candidate VMRs in DCHS II.

The UCI cohort

Mothers and children were part of an ongoing, longitudinal study, conducted at the University of California, Irvine (UCI), for which mothers were recruited during the first trimester of pregnancy^31,32,33. All women had singleton, intrauterine pregnancies. Women were not eligible for study participation if they used corticosteroids or illicit drugs during pregnancy (verified by urinary cotinine and drug toxicology). Exclusion criteria for the newborn were preterm birth (i.e., less than 34 weeks of gestational age at birth), as well as any congenital, genetic, or neurologic disorders at birth.

Ethics

The UCI institutional review board approved all study procedures and all participants provided written informed consent.

Maternal characteristics

Maternal sociodemographic characteristics (age, parity) were obtained via a standardised structured interview at the first pregnancy visit. Maternal pre-pregnancy BMI (weight kg/height m²) was computed based on pre-pregnancy weight abstracted from the medical record, and maternal height was measured at the research laboratory during the first pregnancy visit. Obstetric risk conditions during pregnancy, including presence of gestational diabetes and hypertension, and delivery mode were abstracted from the medical record. At each pregnancy visit the Center for Epidemiological Studies Depression Scale⁵⁹ and the State scale from the State–Trait Anxiety Inventory⁶⁰ were administered. For individuals with <3 missing items on any scale at any time point, the mean responses for that scale were calculated and then multiplied by the total number of items in the respective scale, to generate total scale scores that are comparable to those generated from participants without any missing data. We used the average depression and anxiety score throughout pregnancy in the calculations. Maternal smoking during pregnancy was determined by maternal self-report and verified by measurement of urinary cotinine concentration. Urinary cotinine was assayed in maternal samples collected at each trimester using the Nicotine/COT(Cotinine)/Tobacco Drug Test Urine Cassette (http://www.meditests.com/nicuintescas.html), which involves transferring 4 drops of room temperature urine into the well of the cassette, and employs a cutoff for COT presence of 200 ng/ml. Endorsement of smoking or detection of urinary COT in any trimester was coded as 1, and absence of evidence for smoking in any trimester coded as 0.
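The handling of missing questionnaire items described above amounts to prorating the scale total. A minimal sketch follows, assuming item responses are held in a list with `None` marking a missing item; the function name `prorated_total` is hypothetical, and returning `None` when three or more items are missing is an assumption about how such cases were treated.

```python
def prorated_total(responses, max_missing=2):
    """Total scale score, prorated when fewer than 3 items are missing.

    The mean of the answered items is multiplied by the total number of
    items so that scores are comparable to complete questionnaires.
    """
    answered = [r for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    if n_missing > max_missing or not answered:
        return None
    return sum(answered) / len(answered) * len(responses)
```

For a 20-item scale with two missing items and a mean of 1.0 on the remaining 18 items, the prorated total is 20.0.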

Genotyping

Genomic DNA was extracted from heel prick blood samples and used for all genomic analysis. Genotyping was performed on Illumina Human Omni Express (24 v1.1) Arrays containing 713,014 SNPs. All samples had a high call rate (above 97%). SNPs with a minor allele frequency >5% and a p-value for deviation from Hardy-Weinberg-Equilibrium >1.0 × 10⁻²⁵ were retained for analysis. After QC, 602,807 SNPs were available.

Imputation

Before imputation, chromosomal and base pair positions were updated to the Haplotype Reference Consortium (r1.1) reference set, allele strands were flipped where necessary. Phasing was performed using EAGLE2 (https://data.broadinstitute.org/alkesgroup/Eagle/) and imputation was performed using PBWT (https://github.com/VertebrateResequencing/pbwt). Imputed SNPs with an info score <0.8, duplicates and ambiguous SNPs were removed resulting in 21,341,980 SNPs. All SNPs with MAF <0.01 were removed. Of the remaining SNPs, 19,530 were DeepSEA variants.

DNA methylation

DNAm analysis using the Infinium Illumina MethylationEPIC BeadChip (Illumina, Inc., San Diego, CA) was performed according to the manufacturer's guidelines using genomic DNA derived from neonatal heel prick samples. Quality control was carried out in minfi⁶¹. No outliers were detected in the median intensities of methylated and unmethylated channels. All samples had a high call rate of at least 95% and their predicted sex was the same as the phenotypic sex. We removed CpGs with a high detection value (p < 0.0001), probes missing >3 beads in >5% of the cohort, in addition to non-specific/cross-hybridising and SNP probes^66,67. Methylation beta-values were normalised using functional normalisation (funnorm)⁶³. We also iteratively adjusted the data for relevant technical factors, i.e., array row, experimental batch and sample plate, using Combat⁶⁴. The final dataset contained 768,910 CpGs. Neonatal blood cell counts were estimated for seven cell types: nucleated red blood cells, granulocytes, monocytes, natural killer cells, B cells, CD4(+)T cells, and CD8(+)T cells⁶⁸. The final dataset contained 121 samples with available genotypes and methylation values.

VMRs

Applying the same procedure as for PREDO I and PREDO II, we identified 9525 candidate VMRs in the UCI cohort.

The MoBa cohort

Participants represent two subsets of mother-offspring pairs from the national Norwegian Mother and Child Cohort Study (MoBa)⁷². MoBa is a prospective population-based pregnancy cohort study conducted by the Norwegian Institute of Public Health. The years of birth for MoBa participants ranged from 1999 to 2009. MoBa mothers provided written informed consent. Each subset is referred to here as MoBa1 and MoBa2. MoBa1 is a subset of a larger study within MoBa that included a cohort random sample and cases of asthma at age 3 years⁷³. We previously reported an association between maternal smoking during pregnancy and differential DNA methylation in MoBa1 newborns⁷⁴. We subsequently measured DNA methylation in additional newborns (MoBa2) in the same laboratory (Illumina, San Diego, CA)¹¹. MoBa2 included cohort random sample plus cases of asthma at age 7 years and non-asthmatic controls. Years of birth were 2002–2004 for children in MoBa1, 2000–2005 for MoBa2.

Ethics

The establishment and data collection in MoBa obtained a license from the Norwegian Data Inspectorate and approval from The Regional Committee for Medical Research Ethics. Both studies were approved by the Regional Committee for Ethics in Medical Research, Norway. In addition, MoBa1 and MoBa2 were approved by the Institutional Review Board of the National Institute of Environmental Health Sciences, USA.

Maternal characteristics

To replicate specific GxE and G + E from PREDO I, we focused on those characteristics which were available in both cohorts: maternal age, pre-pregnancy BMI and hypertension.

Within MoBa, the questionnaires at weeks 17 and 30 include general background information as well as details on previous and present health problems and exposures. The birth record from the Medical Birth Registry of Norway⁷⁵ which includes maternal health during pregnancy as well as procedures around birth and pregnancy outcomes, is integrated in the MoBa database.

Genotyping and imputation

DNA was extracted from the MoBa biobank and genotyped on the Illumina HumanExomeCore platform. The genotypes were called with GenomeStudio software. Phasing and imputation were done using shapeit2 (https://mathgen.stats.ox.ac.uk./genetics_software/shapeit/shapeit.html) and impute2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html) with the 1000 Genomes Phase 3 reference panel for the European population. Variants with an imputation score of <0.8 and with a minor allele frequency below 1% were filtered out.

DNA methylation

Details of the DNA methylation measurements and quality control for the MoBa1 participants were previously described³⁶ and the same protocol was implemented for the MoBa2 participants. Briefly, umbilical cord blood samples were collected at birth and frozen at −80 °C. All biological material was obtained from the Biobank of the MoBa study³⁶. Bisulfite conversion was performed using the EZ-96 DNA Methylation kit (Zymo Research Corporation, Irvine, CA) and DNA methylation was measured at 485,577 CpGs in cord blood using Illumina's Infinium HumanMethylation450 BeadChip⁷⁶. Raw intensity (.idat) files were handled in R using the minfi package to calculate the methylation level at each CpG as the beta-value (β = intensity of the methylated allele (M)/(intensity of the unmethylated allele (U) + intensity of the methylated allele (M) + 100)) and the data were exported for quality control and processing. Control probes (N = 65) and probes on X (N = 11 230) and Y (N = 416) chromosomes were excluded in both datasets. Remaining CpGs missing >10% of methylation data were also removed (N = 20 in MoBa1, none in MoBa2). Samples indicated by Illumina to have failed or have an average detection p value across all probes <0.05 (N = 49 MoBa1, N = 35 MoBa2) and samples with gender mismatch (N = 13 MoBa1, N = 8 MoBa2) were also removed. For MoBa1 and MoBa2, we accounted for the two different probe designs by applying the intra-array normalisation strategy Beta Mixture Quantile dilation (BMIQ)⁷⁷. The Empirical Bayes method via ComBat was applied separately in MoBa1 and MoBa2 for batch correction using the sva package in R⁶⁵. After quality control exclusions, the sample sizes were 1068 for MoBa1 and 685 for MoBa2.
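The beta-value definition given above can be written as a one-line function. The sketch below is only a transcription of the stated formula; the function name `beta_value` and the argument names are hypothetical.

```python
def beta_value(methylated, unmethylated, offset=100):
    """Beta-value: M / (U + M + offset), with the offset of 100 stated in the text."""
    return methylated / (unmethylated + methylated + offset)
```

The offset keeps the ratio defined when both intensities are close to zero, so a probe with M = 450 and U = 450 yields 0.45 rather than 0.5.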

After QC, the total number of samples was 1732, with 1592 overlapping with the methylation samples. Specific G + E and GxE associations were calculated in the combined dataset of MoBa1 and MoBa2, while VMR analysis was conducted in MoBa1 only.

Regression analysis

Linear regression analysis was conducted using the lm function in R 3.3.1 (https://www.r-project.org). We included the child’s sex, gestational age, seven estimated cell counts as well as the first two (PREDO I and PREDO II), first three (UCI) and first five (DCHS I and II) principal components of the MDS analysis on the genotypes in the model. The corresponding plot of the first ten MDS-components in PREDO is depicted in Figure S4. SNP genotypes were recoded into a count of 0, 1 or 2 representing the number of minor allele copies. For each VMR site, we tested SNPs located in a 1MB window up- and downstream of the specific site. In PREDO and UCI, we restricted the analysis to DeepSEA variants while we used the pruned SNP-set in DCHS.
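The genotype recoding and the selection of cis SNPs around a VMR site can be illustrated as follows. This is a hedged sketch: the function names and the dictionary keys `chrom` and `pos` are hypothetical, the input counts are assumed to refer to an arbitrary allele B, and the window is taken as 1,000,000 bp on either side of the CpG.

```python
def to_minor_allele_count(b_allele_counts):
    """Recode 0/1/2 counts of allele B into counts of the minor allele."""
    freq_b = sum(b_allele_counts) / (2 * len(b_allele_counts))
    if freq_b > 0.5:
        return [2 - g for g in b_allele_counts]  # B is the major allele: flip
    return list(b_allele_counts)


def cis_snps(cpg_chrom, cpg_pos, snps, window=1_000_000):
    """Select SNPs on the same chromosome within the window around a CpG."""
    return [
        s for s in snps
        if s["chrom"] == cpg_chrom and abs(s["pos"] - cpg_pos) <= window
    ]
```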

For each VMR, we tested four models:

(1) Methylation at tagCpG ~ covariates + environment
(2) Methylation at tagCpG ~ covariates + SNP
(3) Methylation at tagCpG ~ covariates + SNP + environment
(4) Methylation at tagCpG ~ covariates + SNP + environment + SNP×environment

In model (1) we included all ten different environments, in model (2) all DeepSEA cis SNPs and in models (3) and (4) all possible environment-cis-SNP combinations. Please also see Fig. 1.

For each model, the AIC, Akaike’s information criterion³⁷ was calculated and the model with the lowest AIC was chosen as the best model. The AIC was obtained using the AIC function in R 3.3.1 (https://www.r-project.org).
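The comparison of the four models by AIC can be sketched for a single tagCpG, SNP and environment. The sketch below is written in Python rather than R and is not the authors' code: the function names are hypothetical, the least-squares fit uses the normal equations, and the AIC is computed from the Gaussian log-likelihood with the residual variance counted as an additional parameter, which corresponds to the usual definition for a linear model.

```python
import math


def ols_rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    p = len(X[0])
    # Augmented system [X'X | X'y], solved by Gauss-Jordan elimination.
    aug = [[sum(r[i] * r[j] for r in X) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(p):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def gaussian_aic(rss, n, n_coef):
    """AIC = -2 log-likelihood + 2 * (coefficients + residual variance)."""
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return -2 * log_lik + 2 * (n_coef + 1)


def compare_models(y, covariates, snp, env):
    """Fit the E, G, G+E and GxE models and return the lowest-AIC model."""
    base = [[1.0] + list(c) for c in covariates]  # intercept + covariates
    designs = {
        "E": [b + [e] for b, e in zip(base, env)],
        "G": [b + [s] for b, s in zip(base, snp)],
        "G+E": [b + [s, e] for b, s, e in zip(base, snp, env)],
        "GxE": [b + [s, e, s * e] for b, s, e in zip(base, snp, env)],
    }
    aics = {name: gaussian_aic(ols_rss(X, y), len(y), len(X[0]))
            for name, X in designs.items()}
    return min(aics, key=aics.get), aics
```

In the analysis described here, this comparison would be repeated over all environments and cis SNPs for each VMR; the sketch covers one combination only.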

P-values were obtained from the summary function and adjusted for the number of tested Es (E model), of tested cis SNPs (G model) or of tested cis SNP-environment combinations (G + E/GxE model) using Bonferroni-correction. Afterwards, we used FDR to correct for all tested tagCpGs (all models) using p.adjust in R.
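The two-stage multiple-testing correction can be sketched as a Bonferroni adjustment within each tagCpG followed by a false discovery rate adjustment across tagCpGs. The sketch assumes the Benjamini-Hochberg procedure for the FDR step, which is an assumption because the text does not name the method passed to p.adjust; the function names are hypothetical.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

In this scheme, the smallest p-value per tagCpG would first be multiplied by the number of Es, cis SNPs or SNP-environment combinations tested for that tagCpG, and the resulting values would then be adjusted jointly across all tagCpGs.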

Enrichment analyses

With regard to enrichment for VMRs, CpG-sites within VMRs were compared to all other CpG-sites on the 450k array located in non-VMR regions. With regard to enrichment for VMRs best explained by G, G + E or GxE, tagCpGs best explained by the specific model were compared to tagCpGs best explained by any of the other models. For enrichment tests for DeepSEA SNPs, non-DeepSEA SNPs present in our dataset were used as the comparison group. Enrichment tests were performed based on a hypergeometric test, i.e. a Fisher test. The significance level was set at p < 0.05.
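A hypergeometric enrichment test on a 2×2 table can be sketched as follows. The sketch computes a one-sided p-value for over-representation; whether the tests in this study were one- or two-sided is not stated, so the sidedness is an assumption, and the function name `fisher_enrichment_p` is hypothetical.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher (hypergeometric) p-value for enrichment.

    a: test-set items with the feature      b: test-set items without it
    c: comparison items with the feature    d: comparison items without it
    Returns P(X >= a) with all table margins held fixed.
    """
    n_feature = a + c
    n_other = b + d
    n_set = a + b
    total = comb(n_feature + n_other, n_set)
    upper = min(n_feature, n_set)
    tail = sum(comb(n_feature, k) * comb(n_other, n_set - k)
               for k in range(a, upper + 1))
    return tail / total
```

For example, if all three CpGs of a test set carry an annotation that none of three comparison CpGs carries, the p-value is 1/20 = 0.05.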

With regard to enrichment for GWAS hits, DeepSEA variants were matched to GWAS variants based on chromosome and position (hg19). To check for enrichment for nominally significant GWAS hits, the full summary statistics were derived from the respective publication.

Histone ChIP-seq peaks from the Roadmap Epigenomics project for blood and embryonic stem cells were downloaded from http://egg2.wustl.edu/roadmap/data/byFileType/peaks/consolidated/broadPeak/.

The pre-processed consolidated broad peaks from the uniform processing pipeline of the Roadmap project were used.

Genomic annotation mapping

CpG sites were mapped to the genome location according to Illumina’s annotation using the R-package minfi.

DeepSEA analysis

A pretrained DeepSEA model was downloaded from http://deepsea.princeton.edu/media/code/deepsea.v0.94.tar.gz, and variant files in VCF format were used for producing e-values. VCF files were first split into smaller files each containing one million variants, and the model was run using the command line on a server with an NVIDIA Titan X GPU card.

We reran our models using only DeepSEA variants which had been identified by the algorithm of Zhou and Troyanskaya³⁸. This method predicts functionality of a SNP based on the DNA-sequence. We included all 212,210 variants with a functional significance e-value below 5 × 10⁻⁰⁵. The e-values represent the significance of the regulatory impact of given variants compared to one million random variants.

Random-effects meta-analysis

GxE and G + E results for PREDO and for MoBa were meta-analysed using a random-effects model in the R-package rmeta. Replication was defined as DeepSEA-tagCpG-environment combinations showing the same effect direction in both cohorts, presenting with smaller p-values than for PREDO alone and with an FDR-corrected p-value (across all combinations tested in the meta-analysis) below 0.05.
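A random-effects meta-analysis of two cohort-level estimates can be sketched with the DerSimonian-Laird estimator of between-study variance. This is an illustrative stand-in written in Python, not the rmeta code used in the study; the choice of the DerSimonian-Laird estimator, the normal approximation for the p-value and the function name `random_effects_meta` are assumptions.

```python
import math


def random_effects_meta(estimates, std_errors):
    """Pooled effect, its standard error and a two-sided p-value."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * b for w, b in zip(weights, estimates)) / sum_w
    # Cochran's Q and the DerSimonian-Laird between-study variance.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, estimates))
    k = len(estimates)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * b for w, b in zip(re_weights, estimates)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return pooled, pooled_se, p_value
```

Under the replication definition given above, the pooled p-value would then be compared with the PREDO-only p-value, the effect directions of the two cohorts checked for agreement, and the pooled p-values FDR-corrected across all tested combinations.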

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

Due to ethical issues and consent, the datasets analysed during the current study are not publicly available. However, an interested researcher can obtain a de-identified dataset after approval from the PREDO Study Board. Data requests may be subject to further review by the national register authority and by the ethical committees. Data can be obtained upon reasonable request from the PREDO Study Board (predo.study@helsinki.fi) or individual researchers. The summary statistics of the best models for PREDO I are accessible at: https://doi.org/10.6084/m9.figshare.8074964.

For access to the UCI cohort, please contact claudia.buss@charite.de; for access to DCHS, please contact Heather.Zar@uct.ac.za; for MoBa access, please apply for data access at https://www.fih.no.

References

Roseboom, T., de Rooij, S. & Painter, R. The Dutch famine and its long-term consequences for adult health. Early Hum. Dev. 82, 485–491 (2006).
Barker, D. J., Osmond, C., Forsen, T. J., Kajantie, E. & Eriksson, J. G. Trajectories of growth among children who have coronary events as adults. N. Engl. J. Med. 353, 1802–1809 (2005).
Hovi, P. et al. Glucose regulation in young adults with very low birth weight. N. Engl. J. Med. 356, 2053–2063 (2007).
Hillier, T. A. et al. Childhood obesity and metabolic imprinting: the ongoing effects of maternal hyperglycemia. Diabetes Care 30, 2287–2292 (2007).
Dancause, K. N. et al. Prenatal stress due to a natural disaster predicts adiposity in childhood: the Iowa Flood Study. J. Obes. 2015, 570541 (2015).
Lahti, M. et al. Maternal depressive symptoms during and after pregnancy and psychiatric problems in children. J. Am. Acad. Child Adolesc. Psychiatry 56, 30–39 e37 (2017).
Bronson, S. L. & Bale, T. L. The Placenta as a mediator of stress effects on neurodevelopmental reprogramming. Neuropsychopharmacology 41, 207–218 (2016).
Schwarze, C. E. et al. Prenatal adversity: a risk factor in borderline personality disorder? Psychol. Med. 43, 1279–1291 (2013).
Entringer, S., Buss, C. & Wadhwa, P. D. Prenatal stress, development, health and disease risk: a psychobiological perspective-2015 Curt Richter Award Paper. Psychoneuroendocrinology 62, 366–375 (2015).
Gutierrez-Arcelus, M. et al. Passive and active DNA methylation and the interplay with genetic variation in gene regulation. Elife 2, e00523 (2013).
Joubert, B. R. et al. DNA methylation in newborns and maternal smoking in pregnancy: genome-wide consortium meta-analysis. Am. J. Hum. Genet. 98, 680–696 (2016).
Sharp, G. C. et al. Maternal BMI at the start of pregnancy and offspring epigenome-wide DNA methylation: findings from the pregnancy and childhood epigenetics (PACE) consortium. Hum. Mol. Genet 26, 4067–4085 (2017).
Girchenko, P. et al. Associations between maternal risk factors of adverse pregnancy and birth outcomes and the offspring epigenetic clock of gestational age at birth. Clin. Epigenet. 9, 49 (2017).
Rijlaarsdam, J. et al. An epigenome-wide association meta-analysis of prenatal maternal stress in neonates: a model approach for replication. Epigenetics 11, 140–149 (2016).
Sosnowski, D. W., Booth, C., York, T. P., Amstadter, A. B. & Kliewer, W. Maternal prenatal stress and infant DNA methylation: a systematic review. Dev. Psychobiol. 60, 127–139 (2018).
Bauer, T. et al. Environment-induced epigenetic reprogramming in genomic regulatory elements in smoking mothers and their children. Mol. Syst. Biol. 12, 861 (2016).
Sharp, G. C. et al. Maternal pre-pregnancy BMI and gestational weight gain, offspring DNA methylation and later offspring adiposity: findings from the Avon Longitudinal Study of Parents and Children. Int. J. Epidemiol. 44, 1288–1304 (2015).
Lin, X. et al. Developmental pathways to adiposity begin before birth and are influenced by genotype, prenatal environment and epigenome. BMC Med. 15, 50 (2017).
Cecil, C. A. et al. DNA methylation and substance-use risk: a prospective, genome-wide study spanning gestation to adolescence. Transl. Psychiatry 6, e976 (2016).
Gibbs, J. R. et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet 6, e1000952 (2010).
Gaunt, T. R. et al. Systematic identification of genetic influences on methylation across the human life course. Genome Biol. 17, 61 (2016).
McClay, J. L. et al. High density methylation QTL analysis in human blood via next-generation sequencing of the methylated genomic DNA fraction. Genome Biol. 16, 291 (2015).
Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414 e1324 (2016).
Hannon, E., Weedon, M., Bray, N., O’Donovan, M. & Mill, J. Pleiotropic effects of trait-associated genetic variation on DNA Methylation: utility for refining GWAS loci. Am. J. Hum. Genet 100, 954–959 (2017).
Pierce, B. L. et al. Co-occurring expression and methylation QTLs allow detection of common causal variants and shared biological mechanisms. Nat. Commun. 9, 804 (2018).
Cheung, W. A. et al. Functional variation in allelic methylomes underscores a strong genetic contribution and reveals novel epigenetic alterations in the human epigenome. Genome Biol. 18, 50 (2017).
Gluckman, P. D., Hanson, M. A., Cooper, C. & Thornburg, K. L. Effect of in utero and early-life conditions on adult health and disease. N. Engl. J. Med 359, 61–73 (2008).
Klengel, T. et al. Allele-specific FKBP5 DNA demethylation mediates gene-childhood trauma interactions. Nat. Neurosci. 16, 33–41 (2013).
Teh, A. L. et al. The effect of genotype and in utero environment on interindividual variation in neonate DNA methylomes. Genome Res 24, 1064–1074 (2014).
Girchenko, P. et al. Prediction and Prevention of Preeclampsia and Intrauterine Growth Restriction (PREDO) study. Int. J. Epidemiol. https://doi.org/10.1093/ije/dyw154 (2016).
Graham, A. M. et al. Maternal systemic interleukin-6 during pregnancy is associated with newborn amygdala phenotypes and subsequent behavior at 2 years of age. Biol. Psychiatry 83, 109–119 (2018).
Moog, N. K. et al. Intergenerational effect of maternal exposure to childhood maltreatment on newborn brain anatomy. Biol. Psychiatry 83, 120–127 (2018).
Entringer, S. et al. Maternal cortisol during pregnancy and infant adiposity: a prospective investigation. J. Clin. Endocrinol. Metab. 102, 1366–1374 (2017).
Stein, D. J. et al. Investigating the psychosocial determinants of child health in Africa: The Drakenstein Child Health Study. J. Neurosci. Methods 252, 27–35 (2015).
Zar, H. J., Barnett, W., Myer, L., Stein, D. J. & Nicol, M. P. Investigating the early-life determinants of illness in Africa: the Drakenstein Child Health Study. Thorax 70, 592–594 (2015).
Ronningen, K. S. et al. The biobank of the Norwegian Mother and Child Cohort Study: a resource for the next 100 years. Eur. J. Epidemiol. 21, 619–625 (2006).
Akaike, H. Proc. 2nd International Symposium on Information Theory. 267–281 (Akademiai Kiado, Budapest, 1973).
Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
Consortium, E. P. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306, 636–640 (2004).
Autism Spectrum Disorders Working Group of The Psychiatric Genomics Consortium. Meta-analysis of GWAS of over 16,000 individuals with autism spectrum disorder highlights a novel locus at 10q24.32 and a significant overlap with schizophrenia. Mol. Autism 8, 21 (2017).
Demontis, D. et al. Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder. Nat. Genet. 51, 63–75 (2019).
Psychiatric GWAS Consortium Bipolar Disorder Working Group Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nat. Genet. 43, 977–983 (2011).
Wray, N. R. et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat. Genet 50, 668–681 (2018).
Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Cross-Disorder Group of the Psychiatric Genomics Consortium. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet 381, 1371–1379 (2013).
Liu, J. Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).
Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012).
Horikoshi, M. et al. Discovery and fine-mapping of glycaemic and obesity-related trait loci using high-density imputation. PLoS Genet. 11, e1005230 (2015).
Sandoval, J. et al. Validation of a DNA methylation microarray for 450,000 CpG sites in the human genome. Epigenetics 6, 692–702 (2011).
Mehta, D. et al. Childhood maltreatment is associated with distinct genomic and epigenetic profiles in posttraumatic stress disorder. Proc. Natl Acad. Sci. USA 110, 8302–8307 (2013).
Grishkevich, V. & Yanai, I. The genomic determinants of genotype × environment interactions in gene expression. Trends Genet. 29, 479–487 (2013).
Grishkevich, V. et al. A genomic bias for genotype-environment interactions in C. elegans. Mol. Syst. Biol. 8, 587 (2012).
Chen, Y. A. et al. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics 8, 203–209 (2013).
Ong, M. L. & Holbrook, J. D. Novel region discovery method for Infinium 450K DNA methylation data reveals changes associated with aging in muscle and neuronal pathways. Aging Cell 13, 142–155 (2014).
Gu, J. et al. Mapping of variable DNA methylation across multiple cell types defines a dynamic regulatory landscape of the human genome. G3 (Bethesda) 6, 973–986 (2016).
Feinberg, A. P. & Irizarry, R. A. Evolution in health and medicine Sackler colloquium: stochastic epigenetic variation as a driving force of development, evolutionary adaptation, and disease. Proc. Natl Acad. Sci. USA 107(Suppl 1), 1757–1764 (2010).
Elliott, G. et al. Intermediate DNA methylation is a conserved signature of genome regulation. Nat. Commun. 6, 6363 (2015).
Zhang, P. Inference after variable selection in linear regression models. Biometrika 79, 741–746 (1992).
Radloff, L. S. The CES-D scale: a self-report depression scale for research in the general population. Appl. Psychol. Meas. 1, 385–401 (1977).
Spielberger, C. D. State-trait anxiety inventory: bibliography 2nd edn (Consulting Psychologists Press, Palo Alto, CA, 1989).
Aryee, M. J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).
Morin, A. M. et al. Maternal blood contamination of collected cord blood can be identified using DNA methylation at three CpGs. Clin. Epigenet. 9, 75 (2017).
Fortin, J. P. et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol. 15, 503 (2014).
Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
Price, M. E. et al. Additional annotation enhances potential for biologically-relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array. Epigenetics Chromatin 6, 4 (2013).
McCartney, D. L. et al. Identification of polymorphic and off-target probe binding sites on the Illumina Infinium MethylationEPIC BeadChip. Genom. Data 9, 22–24 (2016).
Bakulski, K. M. et al. DNA methylation of cord blood cell types: applications for mixed cell birth studies. Epigenetics 11, 354–362 (2016).
van der Westhuizen, C., Wyatt, G., Williams, J. K., Stein, D. J. & Sorsdahl, K. Validation of the self reporting questionnaire 20-Item (SRQ-20) for use in a low- and middle-income country emergency centre setting. Int J. Ment. Health Addict. 14, 37–48 (2016).
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J. & Erbaugh, J. An inventory for measuring depression. Arch. Gen. Psychiatry 4, 561–571 (1961).
Group, W. A. W. The Alcohol, Smoking and Substance Involvement Screening Test (ASSIST): development, reliability and feasibility. Addiction 97, 1183–1194 (2002).
Magnus, P. et al. Cohort profile update: the Norwegian Mother and Child Cohort Study (MoBa). Int J. Epidemiol. 45, 382–388 (2016).
Haberg, S. E. et al. Maternal folate levels in pregnancy and asthma in children at age 3 years. J. Allergy Clin. Immunol. 127, 262–264, 264.e1 (2011).
Joubert, B. R. et al. 450K epigenome-wide scan identifies differential DNA methylation in newborns related to maternal smoking during pregnancy. Environ. Health Perspect. 120, 1425–1431 (2012).
Irgens, L. M. The Medical Birth Registry of Norway. Epidemiological research and surveillance throughout 30 years. Acta Obstet. Gynecol. Scand. 79, 435–439 (2000).
Bibikova, M. et al. High density DNA methylation array with single CpG site resolution. Genomics 98, 288–295 (2011).
Teschendorff, A. E. et al. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics 29, 189–196 (2013).


Acknowledgements

We want to thank Susanne Sauer and Maik Ködel for their technical assistance and Jessica Keverne for language editing. We thank all mothers who took part in the on-going PREDO study. We are grateful to all the families in Norway who participate in the on-going MoBa cohort study. We thank the Drakenstein Child Health Study staff, and the clinical and administrative staff of the Western Cape Government Department of Health at Paarl Hospital and at the clinics for support of the Study. We also thank our collaborators and students. Finally, we thank all mothers and children enrolled in the Drakenstein Child Health Study. We thank the research participants and employees of 23andMe, Inc. for their contribution to this study. This work was supported by the Academy of Finland (E.K., H.L., K.R., and J.L.); University of Helsinki Research Funds (J.L., M.L.P., and H.L.); British Heart Foundation (RMR); Tommy’s (RMR); European Commission (EK, KR, Horizon 2020 Award SC1–2016-RTD-733280 RECAP); NorFace DIAL (E.K., KR PremLife); Foundation for Pediatric Research (E.K.); Juho Vainio Foundation (E.K.); Novo Nordisk Foundation (E.K.); Signe and Ane Gyllenberg Foundation (E.K., K.R.); Sigrid Jusélius Foundation (E.K.); Finnish Medical Foundation (H.L.); Jane and Aatos Erkko Foundation (H.L.); Päivikki and Sakari Sohlberg Foundation (H.L., P.M.V.); the Clinical Graduate school in Pediatrics and Obstetrics/Gynaecology in University of Helsinki (P.M.V.). The Norwegian Mother and Child Cohort Study is supported by the Norwegian Ministry of Health and Care Services and the Ministry of Education and Research, NIH/NIEHS (contract no N01-ES-75558), NIH/NINDS (grant no.1 UO1 NS 047537–01 and grant no.2 UO1 NS 047537–06A1). For this work, MoBa 1 and 2 were supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Z01-ES-49019) and the Norwegian Research Council/BIOBANK (grant no 221097).
This work was also partly supported by the Research Council of Norway through its Centres of Excellence funding scheme, project number 262700. The Drakenstein Child Health Study is supported by the Bill and Melinda Gates Foundation (OPP 1017641); with additional support for this work from the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health (NICHD) under Award Number R21HD085849; and the Fogarty International Center (FIC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Additional support for H.J.Z., D.J.S. and N.K., and for research reported in this publication was by the South African Medical Research Council (SAMRC); N.K. receives support from the SAMRC under a Self-Initiated Research Grant. The views and opinions expressed are those of the authors and do not necessarily represent the official views of the SAMRC. This work was also funded by the German Federal Ministry of Education and Research through the Research Consortium Integrated Network IntegraMent (grant 01ZX1314H) under the auspices of the e:Med Programme (NSM). The UCI cohort was supported by a European Research Area Network (ERA Net) Neuron grant (01EW1407A, CB) and National Institutes of Health grant (R01 HD-060628, CB) as well as NIH grant R01 MH-105538 (PDW). This work was also funded by the Canadian Institute for Advanced Research, Child and Brain Development Program, Toronto, ON, Canada (KJOD).

Author information

Authors and Affiliations

Max-Planck-Institute of Psychiatry, Department of Translational Research in Psychiatry, Munich, 80804, Germany
Darina Czamara, Till F. M. Andlauer, Bertram Müller-Myhsok & Elisabeth B. Binder
Institute of Computational Biology, Helmholtz-Zentrum München, German Research Center for Environmental Health, Neuherberg, 85764, Germany
Gökçen Eraslan, Ivan Kondofersky, Nikola S. Müller & Fabian J. Theis
School of Life Sciences, Weihenstephan, Technische Universität München, Freising, 85354, Germany
Gökçen Eraslan
Oslo Centre for Biostatistics and Epidemiology, Research Support Unit, Oslo University Hospital, Oslo, 0372, Norway
Christian M. Page
Center for Fertility and Health, Norwegian Institute of Public Health, Oslo, 0213, Norway
Christian M. Page & Siri E. Håberg
Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, 00014, Finland
Jari Lahti, Marius Lahti-Pulkkinen & Katri Räikkönen
Helsinki Collegium for Advanced Studies, University of Helsinki, Helsinki, 00101, Finland
Jari Lahti
British Heart Foundation Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, Edinburgh, EH16 4TJ, UK
Marius Lahti-Pulkkinen & Rebecca M. Reynolds
HUSLAB and Department of Clinical Chemistry, Helsinki University, Helsinki, 00290, Finland
Esa Hämäläinen
Oulu University Hospital and University of Oulu, PEDEGO Research Unit, MRC Oulu, 90014, Finland
Eero Kajantie & E. J. C. de Geus
Hospital for Children and Adolescents, University of Helsinki and Helsinki University Hospital, Helsinki, 00029, Finland
Eero Kajantie
National Institute for Health and Welfare, Helsinki, 00271, Finland
Eero Kajantie
Medical and Clinical Genetics and Obstetrics and Gynaecology University of Helsinki and Helsinki University Central Hospital, Helsinki, 00014, Finland
Hannele Laivuori & Pia M. Villa
Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, 00014, Finland
Hannele Laivuori
Faculty of Medicine and Life Sciences, University of Tampere, Tampere, 33100, Finland
Hannele Laivuori
Department of Obstetrics and Gynecology, Tampere University Hospital, Tampere, 33100, Finland
Hannele Laivuori
Department of Chronic Diseases and Ageing, Norwegian Institute of Public Health, Oslo, 0213, Norway
Wenche Nystad
Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, U.S. Department of Health and Human Services, Research Triangle Park, North Carolina, 20814, USA
Stephanie J. London
Ludmer Centre for Neuroinformatics and Mental Health, Douglas Mental Health University Institute, McGill University, Montreal, H3A 2B4, QC, Canada
Kieran J. O’Donnell, Elika Garg & Michael J. Meaney
Sackler Program for Epigenetics and Psychobiology at McGill University, Montreal, H3A 0G4, QC, Canada
Kieran J. O’Donnell & Michael J. Meaney
Singapore Institute for Clinical Sciences, Singapore, 117609, Singapore
Michael J. Meaney
Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Institute of Medical Psychology, Berlin, 10117, Germany
Sonja Entringer & Claudia Buss
University of California, Irvine, Development, Health, and Disease Research Program, Orange, CA, 92697, USA
Sonja Entringer, Pathik D. Wadhwa & Claudia Buss
Department of Psychiatry and Human Behavior, Obstetrics and Gynecology, and Epidemiology, University of California, Irvine, School of Medicine, Irvine, CA, 92697, USA
Pathik D. Wadhwa
Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia and the BC Children’s Hospital Research Institute, Vancouver, V5Z 4H4, BC, Canada
Meaghan J. Jones, David T. S. Lin, Julie L. MacIsaac & Michael S. Kobor
Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, 7925, South Africa
Nastassja Koen, Shareefa Dalvie & Dan J. Stein
South African Medical Research Council (SAMRC), Unit on Risk and Resilience in Mental Disorders, Cape Town, 7505, South Africa
Nastassja Koen & Dan J. Stein
Department of Paediatrics & Child Health and SAMRC Unit on Child and Adolescent Health, University of Cape Town, Cape Town, 7505, South Africa
Heather J. Zar
Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, 02115, USA
Karestan C. Koenen
Department of Mathematics, Technische Universität München, Munich, 85748, Germany
Ivan Kondofersky & Fabian J. Theis
Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, 30329, USA
Elisabeth B. Binder
Institute for Molecular Bioscience, The University of Queensland, Brisbane, 4072, QLD, Australia
Naomi R. Wray, Maciej Trzaskowski, Enda M. Byrne, Peter M. Visscher, Yang Wu & Futao Zhang
Queensland Brain Institute, The University of Queensland, Brisbane, 4072, QLD, Australia
Naomi R. Wray, Robert M. Maier, Divya Mehta, Grant W. Montgomery, Peter M. Visscher & Jian Yang
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, 02114, USA
Stephan Ripke
Department of Psychiatry and Psychotherapy, Universitätsmedizin Berlin Campus Charité Mitte, Berlin, 14129, Germany
Stephan Ripke, Julia Kraft & Vassily Trubetskoy
Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142, USA
Stephan Ripke & Tõnu Esko
Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, 17177, SE, Sweden
Manuel Mattheisen
Department of Biomedicine, Aarhus University, Aarhus, 8000, Denmark
Manuel Mattheisen, Jane Hvarregaard Christensen, Jakob Grove, Per Qvist & Anders D. Børglum
iSEQ, Centre for Integrative Sequencing, Aarhus University, Aarhus, 8000, Denmark
Manuel Mattheisen, Henriette N. Buttenschøn, Jane Hvarregaard Christensen, Jakob Grove, Per Qvist, Preben Bo Mortensen & Anders D. Børglum
iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus, 8000, Denmark
Manuel Mattheisen, Esben Agerbo, Marie Bækvad-Hansen, Henriette N. Buttenschøn, Jonas Bybjerg-Grauholm, Jane Hvarregaard Christensen, Jakob Grove, Christine Søholm Hansen, David M. Hougaard, Carsten Bøcker Pedersen, Marianne Giørtz Pedersen, Per Qvist, Wesley Thompson, Yunpeng Wang, Shantel Marie Weinsheimer, Ole Mors, Preben Bo Mortensen, Merete Nordentoft, Thomas Werge & Anders D. Børglum
Department of Biological Psychology & EMGO+ Institute for Health and Care Research, Vrije Universiteit Amsterdam, Amsterdam, 1081 BT, Netherlands
Abdel Abdellaoui, Conor V. Dolan, Jouke-Jan Hottenga, Hamdi Mbarek, Christel M. Middeldorp, Michel G. Nivard, Gonneke Willemsen & Dorret I. Boomsma
Division of Psychiatry, University of Edinburgh, Edinburgh, EH10 5HF, UK
Mark J. Adams, Douglas H. R. Blackwood, Toni-Kim Clarke, Lynsey S. Hall & Andrew M. McIntosh
Centre for Integrated Register-based Research, Aarhus University, Aarhus, 8210, Denmark
Esben Agerbo, Carsten Bøcker Pedersen, Marianne Giørtz Pedersen & Preben Bo Mortensen
National Centre for Register-Based Research, Aarhus University, Aarhus, 8210, Denmark
Esben Agerbo, Carsten Bøcker Pedersen, Marianne Giørtz Pedersen & Preben Bo Mortensen
Discipline of Psychiatry, University of Adelaide, Adelaide, 5000, SA, Australia
Tracy M. Air & Bernhard T. Baune
Munich Cluster for Systems Neurology (SyNergy), Munich, 81377, Germany
Till F. M. Andlauer & Bertram Müller-Myhsok
Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, 22903, USA
Silviu-Alin Bacanu, Tim B. Bigdeli, Per Hoffmann, Roseann E. Peterson, Brien P. Riley & Kenneth S. Kendler
Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, 2300, Denmark
Marie Bækvad-Hansen, Jonas Bybjerg-Grauholm, Christine Søholm Hansen & David M. Hougaard
Department of Psychiatry, Vrije Universiteit Medical Center and GGZ inGeest, Amsterdam, 1081, NL, Netherlands
Aartjan T. F. Beekman, Rick Jansen, Yuri Milaneschi, Wouter J. Peyrot, Johannes H. Smit & Brenda W. J. H. Penninx
Virginia Institute for Psychiatric and Behavior Genetics, Richmond, VA, 23298, USA
Tim B. Bigdeli
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, 17177, SE, Sweden
Julien Bryois, Erik Pettersson, Alexander Viktorin, Patrik K. Magnusson, Nancy L. Pedersen & Patrick F. Sullivan
Department of Clinical Medicine, Translational Neuropsychiatry Unit, Aarhus University, Aarhus, 8240, Denmark
Henriette N. Buttenschøn
Human Genetics, Wellcome Trust Sanger Institute, Cambridge, CB10 1SA, UK
Na Cai
Statistical genomics and systems genetics, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
Na Cai
Department of Psychiatry, University Hospital of Lausanne, Prilly, Vaud, 1004, Switzerland
Enrique Castelao, Giorgio Pistis & Martin Preisig
MRC Social Genetic and Developmental Psychiatry Centre, King’s College London, London, WC2R 2LS, UK
Jonathan R. I. Coleman, Thalia C. Eley, Héléna A. Gaspar, Peter McGuffin, Niamh Mullins, Paul F. O’Reilly, Margarita Rivera, Cathryn M. Lewis & Gerome Breen
Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Herston, 4006, QLD, Australia
Lucía Colodro-Conde, Eske M. Derks, Penelope A. Lind, Sarah E. Medland & Jodie N. Painter
Centre for Advanced Imaging, The University of Queensland, Saint Lucia, 4072, QLD, Australia
Baptiste Couvy-Duchesne & Hilary K. Finucane
Queensland Brain Institute, The University of Queensland, Saint Lucia, 4072, QLD, Australia
Baptiste Couvy-Duchesne
Psychological Medicine, Cardiff University, Cardiff, CF14 4XN, UK
Nick Craddock
Center for Genomic and Computational Biology, Duke University, Durham, NC, 27705, USA
Gregory E. Crawford
Division of Medical Genetics, Department of Pediatrics, Duke University, Durham, NC, 27708, USA
Gregory E. Crawford
Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, EH8 9JZ, UK
Gail Davies, Ian J. Deary & Andrew M. McIntosh
Institute of Human Genetics, University of Bonn, Bonn, 53127, DE, Germany
Franziska Degenhardt, Andreas J. Forstner, Stefan Herms, Per Hoffmann, Sven Cichon & Markus M. Nöthen
Life & Brain Center, Department of Genomics, University of Bonn, Bonn, 53127, Germany
Franziska Degenhardt, Andreas J. Forstner, Stefan Herms, Per Hoffmann & Markus M. Nöthen
Epidemiology, Erasmus MC, Rotterdam, 3015, Zuid-Holland, Netherlands
Nese Direk, Saira Saeed Mirza & Henning Tiemeier
Psychiatry, Dokuz Eylul University School Of Medicine, Izmir, 35220, Turkey
Nese Direk
Department of Psychiatry, Massachusetts General Hospital, Boston, MA, 02114, USA
Erin C. Dunn, Roy H. Perlis & Jordan W. Smoller
Psychiatric and Neurodevelopmental Genetics Unit (PNGU), Massachusetts General Hospital, Boston, MA, 02114, USA
Erin C. Dunn & Jordan W. Smoller
Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, 02142, USA
Erin C. Dunn & Jordan W. Smoller
Neuroscience and Mental Health, Cardiff University, Cardiff, CF24 4HQ, UK
Valentina Escott-Price
Bioinformatics, University of British Columbia, Vancouver, V5Z 4S6, BC, Canada
Farnush Farhadi Hassan Kiadeh
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
Hilary K. Finucane
Department of Psychiatry (UPK), University of Basel, Basel, 4002, Switzerland
Andreas J. Forstner
Human Genomics Research Group, Department of Biomedicine, University of Basel, Basel, 4031, Switzerland
Andreas J. Forstner, Stefan Herms & Sven Cichon
Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, 68159, Baden-Württemberg, Germany
Josef Frank, Fabian Streit, Jana Strohmaier, Jens Treutlein, Stephanie H. Witt, Marcella Rietschel & Thomas G. Schulze
Department of Psychiatry, Trinity College Dublin, Dublin, 8, Ireland
Michael Gill
Psychiatry & Behavioral Sciences, Johns Hopkins University, Baltimore, MD, 21287, USA
Fernando S. Goes, Dean F. MacKinnon, Francis M. Mondimore & J. Raymond DePaulo
Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006, QLD, Australia
Scott D. Gordon & Nicholas G. Martin
Bioinformatics Research Centre, Aarhus University, Aarhus, 8000, Denmark
Jakob Grove
Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, NE1 3BZ, England
Lynsey S. Hall
Danish Headache Centre, Department of Neurology, Rigshospitalet, Glostrup, 2600, Denmark
Thomas F. Hansen
Institute of Biological Psychiatry, Mental Health Center Sct. Hans, Mental Health Services Capital Region of Denmark, Copenhagen, 4000, Denmark
Thomas F. Hansen, Wesley Thompson, Yunpeng Wang, Shantel Marie Weinsheimer & Thomas Werge
iPSYCH, The Lundbeck Foundation Initiative for Psychiatric Research, Copenhagen, 8000, Denmark
Thomas F. Hansen
Brain and Mind Centre, University of Sydney, Sydney, 2050, NSW, Australia
Ian B. Hickie
Interfaculty Institute for Genetics and Functional Genomics, Department of Functional Genomics, University Medicine and Ernst Moritz Arndt University Greifswald, Greifswald, 17489, Mecklenburg-Vorpommern, Germany
Georg Homuth
Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, 4070, Switzerland
Carsten Horn
Max Planck Institute of Psychiatry, Munich, 80804, Germany
Marcus Ising, Stefan Kloiber & Susanne Lucae
Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
Eric Jorgenson, Ling Shen & Catherine Schaefer
Psychiatry & The Behavioral Sciences, University of Southern California, Los Angeles, CA, 90033, USA
James A. Knowles
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
Isaac S. Kohane
Department of Medicine, Brigham and Women’s Hospital, Boston, MA, 02115, USA
Isaac S. Kohane
Informatics Program, Boston Children’s Hospital, Boston, MA, 02115, USA
Isaac S. Kohane
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK
Warren W. Kretzschmar & Yihan Li
Department of Endocrinology at Herlev University Hospital, University of Copenhagen, Copenhagen, 2730, Denmark
Jesper Krogh
Institute of Social and Preventive Medicine (IUMSP), University Hospital of Lausanne, Lausanne, VD, 1010, Switzerland
Zoltán Kutalik
Swiss Institute of Bioinformatics, Lausanne, VD, 1015, Switzerland
Zoltán Kutalik
Division of Psychiatry, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, EH16 4SB, UK
Donald J. MacIntyre
Mental Health, NHS 24, Glasgow, G12 0XH, UK
Donald J. MacIntyre
Department of Psychiatry and Psychotherapy, University of Bonn, Bonn, 53105, Germany
Wolfgang Maier
Statistics, University of Oxford, Oxford, OX1 3LB, UK
Jonathan Marchini
Psychiatry, Columbia University College of Physicians and Surgeons, New York, NY, 10032, USA
Patrick McGrath & Myrna M. Weissman
School of Psychology and Counseling, Queensland University of Technology, Brisbane, QLD, 4059, Australia
Divya Mehta
Child and Youth Mental Health Service, Children’s Health Queensland Hospital and Health Service, South Brisbane, QLD, 4000, Australia
Christel M. Middeldorp
Child Health Research Centre, University of Queensland, Brisbane, QLD, 4101, Australia
Christel M. Middeldorp
Estonian Genome Center, University of Tartu, Tartu, 51005, Estonia
Evelin Mihailov, Lili Milani, Tõnu Esko & Andres Metspalu
Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
Sara Mostafavi
Statistics, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Sara Mostafavi & Bernard Ng
DZHK (German Centre for Cardiovascular Research), Partner Site Greifswald, University Medicine, University Medicine Greifswald, Greifswald, Mecklenburg-Vorpommern, 17489, Germany
Matthias Nauck
Institute of Clinical Chemistry and Laboratory Medicine, University Medicine Greifswald, Greifswald, Mecklenburg-Vorpommern, 17489, Germany
Matthias Nauck
Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, QLD, 4059, Australia
Dale R. Nyholt
Humus, Reykjavik, 101, Iceland
Hogni Oskarsson
MRC Centre for Neuropsychiatric Genetics and Genomics, Cardiff University, Cardiff, CF24 4HQ, UK
Michael J. Owen & Michael C. O’Donovan
Virginia Institute for Psychiatric & Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, 23298, USA
Roseann E. Peterson
Clinical Genetics, Vrije Universiteit Medical Center, Amsterdam, 1081HV, Netherlands
Danielle Posthuma
Complex Trait Genetics, Vrije Universiteit Amsterdam, Amsterdam, 1081 HV, Netherlands
Danielle Posthuma
Solid Biosciences, Boston, MA, 02139, USA
Jorge A. Quiroz
Department of Psychiatry, Washington University in Saint Louis School of Medicine, Saint Louis, MO, 63110, USA
John P. Rice, Andrew C. Heath & Pamela A. F. Madden
Department of Biochemistry and Molecular Biology II, Institute of Neurosciences, Center for Biomedical Research, University of Granada, Granada, CP 18100, Spain
Margarita Rivera
Department of Psychiatry, University of Groningen, University Medical Center Groningen, Groningen, 9700 RB, Netherlands
Robert Schoevers
Department of Psychiatry and Psychotherapy, Medical Center of the University of Munich, Campus Innenstadt, Munich, 80336, Germany
Eva C. Schulte
Institute of Psychiatric Phenomics and Genomics (IPPG), Medical Center of the University of Munich, Campus Innenstadt, Munich, 80336, Germany
Eva C. Schulte & Thomas G. Schulze
Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, USA
Jianxin Shi
Behavioral Health Services, Kaiser Permanente Washington, Seattle, WA, 98112, USA
Stanley I. Shyn
Faculty of Medicine, Department of Psychiatry, University of Iceland, Reykjavik, 101, Iceland
Engilbert Sigurdsson
School of Medicine and Dentistry, James Cook University, Townsville, QLD, 4811, Australia
Grant C. B. Sinnamon
Institute of Health and Wellbeing, University of Glasgow, Glasgow, G12 8RZ, UK
Daniel J. Smith
deCODE Genetics/Amgen, Reykjavik, 101, Iceland
Hreinn Stefansson, Stacy Steinberg & Thorgeir E. Thorgeirsson
College of Biomedical and Life Sciences, Cardiff University, Cardiff, CF14 4EP, UK
Katherine E. Tansey
Institute of Epidemiology and Social Medicine, University of Münster, Münster, Nordrhein-Westfalen, 48149, Germany
Henning Teismann, Jürgen Wellmann & Klaus Berger
Institute for Community Medicine, University Medicine Greifswald, Greifswald, Mecklenburg-Vorpommern, 17489, Germany
Alexander Teumer & Henry Völzke
Department of Psychiatry, University of California, San Diego, San Diego, CA, 92093, USA
Wesley Thompson & Pippa A. Thomson
KG Jebsen Centre for Psychosis Research, Norway Division of Mental Health and Addiction, Oslo University Hospital, Oslo, 0407, Norway
Wesley Thompson & Yunpeng Wang
Clinical Neurosciences, University of Cambridge, Cambridge, CB2 1QW, UK
Matthew Traylor
Internal Medicine, Erasmus MC, Rotterdam, Zuid-Holland, 3015, Netherlands
André G. Uitterlinden
Roche Pharmaceutical Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases Discovery & Translational Medicine Area, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Basel, 4070, Switzerland
Daniel Umbricht
Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Mecklenburg-Vorpommern, 17475, Germany
Sandra Van der Auwera & Hans J. Grabe
Department of Psychiatry, Leiden University Medical Center, Leiden, 2333 ZA, Netherlands
Albert M. van Hemert
Virginia Institute of Psychiatric & Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, 23298, USA
Bradley T. Webb
Computational Sciences Center of Emphasis, Pfizer Global Research and Development, Cambridge, MA, 02139, USA
Hualin S. Xi
Institute for Molecular Bioscience; Queensland Brain Institute, The University of Queensland, Brisbane, QLD, 4072, Australia
Jian Yang
Department of Psychiatry, University of Münster, Münster, Nordrhein-Westfalen, 48149, Germany
Volker Arolt & Udo Dannlowski
Institute of Medical Genetics and Pathology, University Hospital Basel, University of Basel, Basel, 4031, Switzerland
Sven Cichon
Institute of Neuroscience and Medicine (INM-1), Research Center Juelich, Juelich, 52425, Germany
Sven Cichon
Amsterdam Public Health Institute, Vrije Universiteit Medical Center, Amsterdam, 1081 BT, Netherlands
E. J. C. de Geus
Centre for Integrative Biology, Università degli Studi di Trento, Trento, Trentino-Alto Adige, 38123, Italy
Enrico Domenici
Department of Psychiatry and Psychotherapy, Medical Center, University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, 79104, Germany
Katharina Domschke
Psychiatry, Kaiser Permanente Northern California, San Francisco, CA, 94115, USA
Steven P. Hamilton
Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, UK
Caroline Hayward
Department of Psychiatry, University of Toronto, Toronto, ON, M5T 1R8, Canada
Stefan Kloiber
Centre for Addiction and Mental Health, Toronto, ON, M6J 1H4, Canada
Stefan Kloiber
Division of Psychiatry, University College London, London, W1T 7NF, UK
Glyn Lewis
Neuroscience Therapeutic Area, Janssen Research and Development, LLC, Titusville, NJ, 08560, USA
Qingqin S. Li
Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
Andres Metspalu
Psychosis Research Unit, Aarhus University Hospital, Risskov, Aarhus, 8200, Denmark
Ole Mors & Kari Stefansson
University of Liverpool, Liverpool, L69 3BX, UK
Bertram Müller-Myhsok
Mental Health Center Copenhagen, Copenhagen University Hospital, Copenhagen, 2100, Denmark
Merete Nordentoft
Human Genetics and Computational Biomedicine, Pfizer Global Research and Development, Groton, CT, 06340, USA
Sara A. Paciga
Psychiatry, Harvard Medical School, Boston, MA, 02215, USA
Roy H. Perlis
Medical Genetics Section, CGEM, IGMM, University of Edinburgh, Edinburgh, EH4 2XU, UK
David J. Porteous
Psychiatry, University of Iowa, Iowa City, IA, 52246, USA
James B. Potash
Department of Psychiatry and Behavioral Sciences, Johns Hopkins University, Baltimore, MD, 21287, USA
Thomas G. Schulze
Department of Psychiatry and Psychotherapy, University Medical Center Göttingen, Goettingen, Niedersachsen, 37075, Germany
Thomas G. Schulze
Human Genetics Branch, NIMH Division of Intramural Research Programs, Bethesda, MD, 20892-9663, USA
Thomas G. Schulze
Faculty of Medicine, University of Iceland, Reykjavik, 101, Iceland
Kari Stefansson
Child and Adolescent Psychiatry, Erasmus MC, Rotterdam, Zuid-Holland, 3015, Netherlands
Henning Tiemeier
Psychiatry, Erasmus MC, Rotterdam, Zuid-Holland, 3015, Netherlands
Henning Tiemeier
Psychiatry, Dalhousie University, Halifax, NS, B3H 2E2, Canada
Rudolf Uher
Division of Epidemiology, New York State Psychiatric Institute, New York, NY, 10032, USA
Myrna M. Weissman
Department of Clinical Medicine, University of Copenhagen, Copenhagen, 2200, Denmark
Thomas Werge
Department of Medical & Molecular Genetics, King’s College London, London, WC2R 2LS, UK
Cathryn M. Lewis
Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA, 94305-5717, USA
Douglas F. Levinson
NIHR BRC for Mental Health, King’s College London, London, SE5 8AF, UK
Gerome Breen
Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27514, USA
Patrick F. Sullivan
Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27514, USA
Patrick F. Sullivan

Authors

Darina Czamara, Gökçen Eraslan, Christian M. Page, Jari Lahti, Marius Lahti-Pulkkinen, Esa Hämäläinen, Eero Kajantie, Hannele Laivuori, Pia M. Villa, Rebecca M. Reynolds, Wenche Nystad, Siri E. Håberg, Stephanie J. London, Kieran J. O’Donnell, Elika Garg, Michael J. Meaney, Sonja Entringer, Pathik D. Wadhwa, Claudia Buss, Meaghan J. Jones, David T. S. Lin, Julie L. MacIsaac, Michael S. Kobor, Nastassja Koen, Heather J. Zar, Karestan C. Koenen, Shareefa Dalvie, Dan J. Stein, Ivan Kondofersky, Nikola S. Müller, Fabian J. Theis, Katri Räikkönen & Elisabeth B. Binder

Consortia

Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium

Naomi R. Wray, Stephan Ripke, Manuel Mattheisen, Maciej Trzaskowski, Enda M. Byrne, Abdel Abdellaoui, Mark J. Adams, Esben Agerbo, Tracy M. Air, Till F. M. Andlauer, Silviu-Alin Bacanu, Marie Bækvad-Hansen, Aartjan T. F. Beekman, Tim B. Bigdeli, Douglas H. R. Blackwood, Julien Bryois, Henriette N. Buttenschøn, Jonas Bybjerg-Grauholm, Na Cai, Enrique Castelao, Jane Hvarregaard Christensen, Toni-Kim Clarke, Jonathan R. I. Coleman, Lucía Colodro-Conde, Baptiste Couvy-Duchesne, Nick Craddock, Gregory E. Crawford, Gail Davies, Ian J. Deary, Franziska Degenhardt, Eske M. Derks, Nese Direk, Conor V. Dolan, Erin C. Dunn, Thalia C. Eley, Valentina Escott-Price, Farnush Farhadi Hassan Kiadeh, Hilary K. Finucane, Andreas J. Forstner, Josef Frank, Héléna A. Gaspar, Michael Gill, Fernando S. Goes, Scott D. Gordon, Jakob Grove, Lynsey S. Hall, Christine Søholm Hansen, Thomas F. Hansen, Stefan Herms, Ian B. Hickie, Per Hoffmann, Georg Homuth, Carsten Horn, Jouke-Jan Hottenga, David M. Hougaard, Marcus Ising, Rick Jansen, Eric Jorgenson, James A. Knowles, Isaac S. Kohane, Julia Kraft, Warren W. Kretzschmar, Jesper Krogh, Zoltán Kutalik, Yihan Li, Penelope A. Lind, Donald J. MacIntyre, Dean F. MacKinnon, Robert M. Maier, Wolfgang Maier, Jonathan Marchini, Hamdi Mbarek, Patrick McGrath, Peter McGuffin, Sarah E. Medland, Divya Mehta, Christel M. Middeldorp, Evelin Mihailov, Yuri Milaneschi, Lili Milani, Francis M. Mondimore, Grant W. Montgomery, Sara Mostafavi, Niamh Mullins, Matthias Nauck, Bernard Ng, Michel G. Nivard, Dale R. Nyholt, Paul F. O’Reilly, Hogni Oskarsson, Michael J. Owen, Jodie N. Painter, Carsten Bøcker Pedersen, Marianne Giørtz Pedersen, Roseann E. Peterson, Erik Pettersson, Wouter J. Peyrot, Giorgio Pistis, Danielle Posthuma, Jorge A. Quiroz, Per Qvist, John P. Rice, Brien P. Riley, Margarita Rivera, Saira Saeed Mirza, Robert Schoevers, Eva C. Schulte, Ling Shen, Jianxin Shi, Stanley I. Shyn, Engilbert Sigurdsson, Grant C. B. Sinnamon, Johannes H. Smit, Daniel J. Smith, Hreinn Stefansson, Stacy Steinberg, Fabian Streit, Jana Strohmaier, Katherine E. Tansey, Henning Teismann, Alexander Teumer, Wesley Thompson, Pippa A. Thomson, Thorgeir E. Thorgeirsson, Matthew Traylor, Jens Treutlein, Vassily Trubetskoy, André G. Uitterlinden, Daniel Umbricht, Sandra Van der Auwera, Albert M. van Hemert, Alexander Viktorin, Peter M. Visscher, Yunpeng Wang, Bradley T. Webb, Shantel Marie Weinsheimer, Jürgen Wellmann, Gonneke Willemsen, Stephanie H. Witt, Yang Wu, Hualin S. Xi, Jian Yang, Futao Zhang, Volker Arolt, Bernhard T. Baune, Klaus Berger, Dorret I. Boomsma, Sven Cichon, Udo Dannlowski, E. J. C. de Geus, J. Raymond DePaulo, Enrico Domenici, Katharina Domschke, Tõnu Esko, Hans J. Grabe, Steven P. Hamilton, Caroline Hayward, Andrew C. Heath, Kenneth S. Kendler, Stefan Kloiber, Glyn Lewis, Qingqin S. Li, Susanne Lucae, Pamela A. F. Madden, Patrik K. Magnusson, Nicholas G. Martin, Andrew M. McIntosh, Andres Metspalu, Ole Mors, Preben Bo Mortensen, Bertram Müller-Myhsok, Merete Nordentoft, Markus M. Nöthen, Michael C. O’Donovan, Sara A. Paciga, Nancy L. Pedersen, Brenda W. J. H. Penninx, Roy H. Perlis, David J. Porteous, James B. Potash, Martin Preisig, Marcella Rietschel, Catherine Schaefer, Thomas G. Schulze, Jordan W. Smoller, Kari Stefansson, Henning Tiemeier, Rudolf Uher, Henry Völzke, Myrna M. Weissman, Thomas Werge, Cathryn M. Lewis, Douglas F. Levinson, Gerome Breen, Anders D. Børglum & Patrick F. Sullivan

Contributions

D.C. and E.B.B. conceived the analyses. J.L., M.L.P., E.H., E.K., H.L., P.M.V., R.M.R. and K.R. conceptualised and planned the PREDO study and collected the data. C.M.P., W.N., S.H. and S.J.L. conceptualised and planned the MoBa study and collected the data. C.B., S.E., P.D.W. and K.J.O.D. conceptualised and planned the UCI study and collected the data. D.T.S.L., J.L.M. and E.G. performed the DNA methylation and genotyping arrays for the UCI and DCH studies. D.J.S., N.K. and H.J.Z. designed and undertook the DCHS; M.J.M., M.S.K. and K.C.K. were involved in testing and analysis of epigenetic data; S.D. was involved in testing and analysis of genetic data. The Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium calculated summary statistics for enrichment tests. D.C., G.E., C.M.P. and M.J.J. ran the statistical analysis. N.S.M., I.K. and F.J.T. co-supervised the statistical analysis. D.C. and E.B.B. wrote the manuscript with contributions from G.E., S.J.L., C.M.P., K.R. and J.L.; D.C., J.L., K.R. and E.B.B. interpreted the results. All authors contributed to and approved the final version of the manuscript.

Corresponding author

Correspondence to Elisabeth B. Binder.

Ethics declarations

Competing interests

E.B.B. is co-inventor on the following patent applications: FKBP5: a novel target for antidepressant therapy. European Patent# EP 1687443 B1; Polymorphisms in ABCB1 associated with a lack of clinical response to medicaments. United States Patent # 8030033; Means and methods for diagnosing predisposition for treatment emergent suicidal ideation (TESI). European application number: 08016477.5 International application number: PCT/EP2009/061575. The remaining authors declare no competing interests.

Additional information

Journal peer review information: Nature Communications thanks Carrie Breton and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Supplemental Data 2

Supplemental Data 3

Supplemental Data 4

Supplemental Data 5

Supplemental Data 6

Supplemental Data 7

Supplemental Data 8

Supplemental Data 9

Supplemental Data 10

Supplemental Data 11

Supplemental Data 12

Supplemental Data 13

Supplemental Data 14

Supplemental Data 15

Supplemental Data 16

Supplemental Data 17

Supplemental Data 18

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Czamara, D., Eraslan, G., Page, C.M. et al. Integrated analysis of environmental and genetic influences on cord blood DNA methylation in new-borns. Nat Commun 10, 2548 (2019). https://doi.org/10.1038/s41467-019-10461-0


Received: 13 September 2018
Accepted: 10 May 2019
Published: 11 June 2019
DOI: https://doi.org/10.1038/s41467-019-10461-0


