Meta-analysis of epigenome-wide association studies in neonates reveals widespread differential DNA methylation associated with birthweight

Birthweight is associated with health outcomes across the life course, and DNA methylation may be an underlying mechanism. In this meta-analysis of epigenome-wide association studies of 8,825 neonates from 24 birth cohorts in the Pregnancy And Childhood Epigenetics Consortium, we find that DNA methylation in neonatal blood is associated with birthweight at 914 sites, with a difference in birthweight ranging from −183 to 178 grams per 10% increase in methylation (PBonferroni < 1.06 × 10−7). In additional analyses in 7,278 participants, <1.3% of birthweight-associated differential methylation is also observed in childhood and adolescence, but not adulthood. Birthweight-related CpGs overlap with some Bonferroni-significant CpGs that were previously reported to be related to maternal smoking (55/914, p = 6.12 × 10−74) and BMI in pregnancy (3/914, p = 1.13 × 10−3), but not with those related to folate levels in pregnancy. Whether the associations that we observe are causal or explained by confounding or fetal growth influencing DNA methylation (i.e. reverse causality) requires further research.

Intrauterine exposures, such as maternal smoking, prepregnancy body mass index (BMI), hyperglycaemia, hypertension, folate and famine are associated with fetal growth and hence birthweight [1][2][3][4][5][6]. Observational studies show that birthweight is also associated with later-life health outcomes, including cardio-metabolic and mental health, some cancers and mortality [7][8][9][10][11]. In these long-term associations, birthweight may act as a proxy for potential effects of intrauterine exposures 12,13. Several mechanisms may explain the associations of intrauterine exposures with birthweight and later-life health as we illustrate in Fig. 1. Our overall conceptual framework in this study was that the intrauterine environment induces epigenetic alterations, which influence fetal growth and hence correlate with birthweight. This is partly supported by previous large-scale epigenome-wide association studies (EWAS) that have reported associations of relevant maternal pregnancy exposures, including smoking, air pollution and BMI, with DNA methylation in offspring neonatal blood [14][15][16]. However, whilst four previous EWAS have observed associations of DNA methylation with birthweight [17][18][19][20], the evidence to date has been limited in scale and power with sample sizes ranging from approximately 200 to 1000.
In this study, we hypothesised that there are associations between DNA methylation and birthweight. We further aimed to explore if these epigenetic alterations are associated with later disease outcomes (Fig. 1). If birthweight is a proxy for a range of adverse prenatal exposures, we might expect neonatal blood DNA methylation to be associated with birthweight. However, we acknowledge that any associations of DNA methylation with birthweight may be explained by confounding 21 or reflect fetal growth influencing DNA methylation.
Here we present a large meta-analysis of multiple EWAS to explore associations between neonatal blood DNA methylation and birthweight. In further analyses, we explore whether any birthweight-associated differential methylation persists at older ages. To aid functional interpretation, we (i) explore the overlap of identified cytosine-phosphate-guanine sites (CpGs) that are differentially methylated in relation to birthweight with those known to be associated with intrauterine exposure to smoking, famine and different levels of BMI and folate; (ii) associate DNA methylation at identified CpGs with gene expression and (iii) explore potential causal links with birthweight and later-life health using Mendelian randomization (MR) 22. We show that DNA methylation in neonatal blood is associated with birthweight and some of the differential methylation is also observed in childhood and adolescence, but not in adulthood.
Also, we show overlap between birthweight-related CpGs and CpGs related to intrauterine exposures. Potential causality of the associations needs to be studied further.

Results
Participants. We used data from 8825 neonates from 24 studies in the Pregnancy And Childhood Epigenetics (PACE) Consortium, representing mainly European, but also African and Hispanic ethnicities with similar proportions of males and females. Details of participants used in all analyses are presented in Table 1, Supplementary Data 1 and study-specific Supplementary Methods.
Meta-analysis. Primary, secondary and follow-up analyses are outlined in the study design in Fig. 2. Methylation at 8170 CpGs, measured in neonatal blood using the Illumina Infinium® HumanMethylation450 BeadChip assay and adjusted for cell-type heterogeneity [23][24][25], was associated with birthweight (false discovery rate (FDR) <0.05), of which 1029 located in or near 807 genes survived the more stringent Bonferroni correction (p < 1.06 × 10−7, Supplementary Data 2). We observed both positive (45%) and negative (55%) directions of associations between methylation levels of these 1029 CpGs and birthweight (Fig. 3) and these CpGs were spread throughout the genome (orange track (1) in Fig. 4 and Supplementary Fig. 1). We found evidence of between-study heterogeneity (I² > 50%) for 115 of the 1029 sites (Supplementary Data 2), thus we prioritised 914 CpGs, located in or near 729 genes, based on p < 1.06 × 10−7 and I² ≤ 50% for further analyses (Fig. 3 and orange track (1) in Fig. 4). The CpG with the largest positive association was cg06378491 (in the gene body of MAP4K2). For each 10% increase in methylation at this site, birthweight was 178 g higher (95% confidence interval (CI): 138, 218 g). The CpG with the largest negative association was cg10073091 (in the gene body of DHCR24), which showed a 183 g decrease in birthweight per 10% increase in methylation (95% CI: −225, −142 g). The CpG with the smallest P-value and I² ≤ 50% was cg17714703 (in the gene body of UHRF1), which showed a 130 g increase in birthweight per 10% increase in methylation (95% CI: 109, 151 g).
Findings were consistent with results from our main analyses when restricted to participants of European ethnicity, with a Pearson correlation coefficient for effect estimates of 0.99 for the 914 birthweight-associated CpGs (Supplementary Fig. 2 [...] Fig. 4) and when we did not exclude neonates born preterm or to women with pre-eclampsia or diabetes (Supplementary Fig. 4 and Supplementary Data 5A and 5C, and red track (3) in Fig. 4). Without these exclusions, we were able to examine associations with low (<2500 g, n = 178) versus normal (2500-4000 g, n = 4197) birthweight, though statistical power was still limited. Four CpGs were associated with low versus normal birthweight (Bonferroni-corrected threshold), none of which overlapped with the 914 CpGs from the main analysis (Supplementary Data 5B, purple track (4) in Fig. 4). We identified that 161 of the 914 differentially methylated CpGs potentially contained a single-nucleotide polymorphism (SNP) at cytosine or guanine positions (i.e. polymorphic CpGs; Supplementary Data 6). Polymorphic CpGs may affect probe binding and hence measured DNA methylation levels 26,27. We used one of the largest studies (ALSPAC; n = 633) to explore this. We found no indication of bimodal distributions for any of the 161 CpGs, suggesting SNPs had not markedly affected methylation measurements at these sites (dip test p-values: 0.299-1.00) [28][29][30].
Table 1 notes. Results are presented as mean ± SD or N (%). Normal birthweight: 2500−4000 g, high birthweight: >4000 g, low birthweight: <2500 g. Studies with mixed ethnicities analysed all participants together with adjustment for ethnicities. g: grams, wk: weeks, y: years. Full study names can be found in study-specific Supplementary Methods. For some studies the sample size for defining normal/high BW was too small. (a) CBC, CHS and NCL used heel prick blood spot samples instead of cord blood. (b) GOYA is a case-cohort study (cases are mothers with BMI>32 and controls are mothers randomly sampled from the underlying study population in which the cases were identified); in analyses where we included a random sample with a normal BMI distribution, results were essentially the same as in the main analyses.
Analyses at older ages. We took the 914 neonatal blood CpGs that were associated with birthweight at Bonferroni-corrected statistical significance and with I² ≤ 50% and examined their associations with birthweight when measured in blood taken in childhood (2-13 years; n = 2756 from 10 studies), adolescence (16-18 years; n = 2906 from six studies) and adulthood (30-45 years; n = 1616 from three studies). Only participants from ALSPAC, CHAMACOS and Generation R had also contributed to the main neonatal blood EWAS. In childhood, adolescence and adulthood, we observed 87, 49 and 42 of the 914 CpGs to be nominally associated with birthweight (p < 0.05). All these CpGs showed consistent directions of association. Ten CpGs showed differential methylation across all four age periods. However, only a minority survived Bonferroni correction for 914 tests (p < 5.5 × 10−5): 12 (1.3%), 1 (0.1%) and 0 CpGs in childhood, adolescence and adulthood, respectively (Supplementary Data 7; the 12 CpGs that persisted in childhood are presented in the green track (6) in Fig. 4). 
Of the 914 CpGs, 50, 52 and 49% showed consistency in direction of association in childhood, adolescence and adulthood, but correlations of the associations of DNA methylation and birthweight between methylation measured in infancy and that measured in childhood, adolescence and adulthood were weak (Pearson correlation coefficients: 0.15, 0.06 and 0.02, respectively).
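The two summary measures used in this comparison (the share of CpGs with the same direction of association, and the Pearson correlation of effect estimates between age periods) can be illustrated with a minimal sketch. The function names and the example effect estimates below are hypothetical and are not study data; the sketch only assumes two equal-length lists of per-CpG effect estimates in the same CpG order.

```python
import math

def direction_consistency(neonatal, older):
    """Fraction of CpGs whose two effect estimates have the same sign."""
    same = sum(1 for a, b in zip(neonatal, older) if a * b > 0)
    return same / len(neonatal)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical effect estimates (grams per 10% methylation) for four CpGs.
neonatal_effects = [120.0, -95.0, 60.0, -150.0]
childhood_effects = [30.0, 10.0, 5.0, -40.0]

share_same_direction = direction_consistency(neonatal_effects, childhood_effects)
correlation = pearson_r(neonatal_effects, childhood_effects)
```

A share close to 50% together with a correlation near zero is what would be expected if the older-age estimates were unrelated to the neonatal ones.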
Intrauterine factors. We observed enrichment of previously published maternal smoking-related CpGs in the birthweight-associated CpGs 14 (55/914 (6.0%), p enrichment = 6.12 × 10−74, of which cg00253658 and cg26681628 also showed persistent methylation differences in the look-up in childhood). We additionally found enrichment of maternal BMI-related CpGs in the list of birthweight-related CpGs 15 (3/914 (0.3%), p enrichment = 1.13 × 10−3). All directions of association were consistent with the birthweight-lowering influence of maternal smoking or the positive association of maternal BMI with birthweight (Supplementary Data 8). We did not find evidence for overlap with plasma folate 31. For famine, we were unable to explore overlap with DNA methylation at the Bonferroni-significant level as the previous EWAS of famine only reported results that reached an FDR level of statistical significance 32.

Fig. 2 Design of the study. Schematic representation of the main meta-analysis, secondary meta-analyses, follow-up analyses and exploration of persistence at older ages. Panel text: main meta-analysis in n = 8,825 neonates* from 24 birth cohorts. Follow-up of methylation sites for function and causality, focused on n = 914 prioritised CpGs (p < 1.06 × 10−7 and I² ≤ 50%): in silico overlap with CpGs associated with intrauterine exposures (a. maternal smoking and BMI; b. metastable epialleles and imprinted genes); functional analyses (a. in silico overlap with a publicly available list of cis-eQTMs; b. whole blood mRNA gene expression in 112 Spanish four-year-olds and 84 Gambian two-year-olds; c. in silico functional enrichment analyses (GO and KEGG pathways)); causality with birthweight and later-life health using two-sample Mendelian randomization with publicly available summary data. Exploration of persistence at older ages, focused on the same 914 prioritised CpGs: associations with birthweight in blood samples collected at older ages. *We removed multiple births from all analyses and excluded preterm births (<37 weeks) and offspring of mothers with pre-eclampsia or diabetes (three major pathological causes of differences in fetal growth). **For sufficient power in the low vs normal BW analyses, we only included nine studies with >10 low birthweight cases.
Fig. 3 [...] studies. The X-axis represents the difference in birthweight in grams per 10% methylation difference, the Y-axis represents the −log10(P). The red line shows the Bonferroni-corrected significance threshold for multiple testing (p < 1.06 × 10−7). Highlighted in orange are the 914 CpGs with p < 1.06 × 10−7 and I² ≤ 50% and highlighted in blue are the 115 CpGs with p < 1.06 × 10−7 and I² > 50%.

Metastable epialleles and imprinted genes. We tested the birthweight-associated CpGs for enrichment of metastable epialleles (loci for which the methylation state is established in the periconceptional period 33,34). We additionally tested for enrichment of CpGs annotated to imprinted genes (loci that depend on the maintenance of parental-origin-specific methylation marks in the preimplantation embryo, some of which are known to regulate fetal growth 35,36). We did not find evidence of enrichment for metastable epialleles (3/1936 metastable epialleles overlap a birthweight-associated CpG), imprinting control regions (0/741) or imprinted gene transcription start sites (5/1728) (Supplementary Data 9).
Comparison with GWAS for birthweight. To compare these EWAS results to those from genetic studies, we used the 60 recently published fetal SNPs associated with birthweight in a GWAS meta-analysis of 153,781 newborns 37 and mapped the CpG sites identified in the EWAS to these SNPs to seek evidence of co-localisation of genetic and epigenetic variation (Supplementary Data 10). We repeated this for the 10 recently published maternal SNPs associated with birthweight in a GWAS meta-analysis of 86,577 women 38 (Supplementary Data 11). We observed that one or more of the 914 birthweight-associated CpGs were within +/−2 Mb of 34/60 fetal and all 10 maternal birthweight-associated SNPs. Of the 34 fetal SNPs, three were located in the same gene as the CpG, as was one of the ten maternal SNPs. Ten fetal and four maternal SNPs were within 100 kb of identified CpGs. In a look-up of the fetal and maternal SNPs from GWAS of birthweight in an online cord blood methylation quantitative trait loci (mQTL) database (mqtldb.org 39), 35 fetal and four maternal SNPs affected methylation at some CpG(s), but none at the 914 birthweight-associated CpGs specifically.
Functional analyses. [...] the transcription start site), CpG sites known to correlate with gene expression, from whole blood samples of 2101 Dutch adult individuals. We found that 82 of the 914 birthweight-associated CpGs were associated with gene expression of 98 probes (cis-eQTMs) 40 (p enrichment < 1.73 × 10−11, Supplementary Data 12). Additionally, in 112 Spanish 4-year-olds 41, we observed that 19 CpGs were inversely associated with whole blood mRNA gene expression and four CpGs were positively associated with gene expression (FDR<0.05, Supplementary Data 13). Of these 23 CpGs, 13 were also found in the publicly available cis-eQTM list 40. 
In 84 Gambian children (age 2 years) 42 , we found two CpGs that were inversely associated with whole blood mRNA gene expression, but neither were found in the Spanish results or the publicly available cis-eQTM list. The 914 birthweight-associated CpGs showed no functional enrichment of Gene Ontology (GO) terms or Kyoto Encyclopedia of Genes and Genomes (KEGG) terms (FDR<0.05).
Mendelian randomization. We aimed to explore causality using MR analysis, in which genetic variants associated with methylation levels (methylation quantitative trait loci (mQTLs)) are used as instrumental variables to appraise causality. For 788 (86%) of the 914 birthweight-associated CpGs, no mQTLs were identified in a publicly available mQTL database 39 . For 108 (86%) of the remaining 126 CpGs, only one mQTL was identified and for the remainder none had more than four mQTLs (Supplementary Data 14 provides a complete list of all mQTLs identified for these 126 CpGs). Many of the currently available methods that can be used as sensitivity analyses to explore whether MR results are biased by horizontal pleiotropy (a single mQTL influencing multiple traits) require more than one genetic instrument (here mQTLs) and even with two or three this can be difficult to interpret 43 . Having determined that it was not possible to undertake MR analyses of 86% of the birthweight-related differentially methylated CpGs (because we did not identify any mQTLs), and for the majority of the remaining CpGs we would not have been reliably able to distinguish causality from horizontal pleiotropy (because only one mQTL could be identified), we decided not to pursue MR analyses further.

Discussion
This large-scale meta-analysis shows that birthweight is associated with widespread differences in DNA methylation. We observed some enrichment of birthweight-associated CpGs among sites that have previously been linked to smoking during pregnancy 14 and pre-pregnancy BMI 15, consistent with the hypothesis that epigenetic pathways may underlie the observational associations of those prenatal exposures with birthweight 21,44,45. However, the actual overlap in this analysis was modest, likely explained by the adjustments for maternal smoking and BMI in the EWAS analyses. The overlap that we observed with pregnancy smoking-related CpGs may reflect the possibility that smoking-related CpGs capture smoking better than self-report 46,47, in line with expectations of pregnant women underreporting their smoking behaviour. Adjustment for maternal smoking and BMI may have masked a greater level of overlap between our results and EWAS of these two maternal exposures. The fact that we find an association of DNA methylation across the genome with birthweight provides some support for our conceptual framework shown in Fig. 1. However, we acknowledge that the associations that we have observed may also be explained by causal effects of maternal pregnancy exposures on both DNA methylation and fetal growth, as well as subtle inflammatory responses in cell-type proportions associated with maternal smoking that might not have been completely captured with the currently available cell type estimation methods. The differential methylation associated with birthweight in neonates persisted only minimally across childhood and into adulthood. Larger (preferably longitudinal) studies are needed to explore persistent differential methylation in more detail and with better power at older ages. 
It is possible that inclusion of the Gambia study in the childhood EWAS (which was the only non-European study in these analyses and was not included in the main meta-analyses with neonatal blood) might have impacted these results, although this study made up just 7% of the total child follow-up sample. A rapid attenuation of differential methylation in relation to birthweight in the first years after birth has previously been reported 19 , but our sample size for these analyses may have been too small to detect persistence. This rapid decrease, if real, may indicate a reduction in the dose of the child's exposure to maternal factors such as smoking once the offspring is delivered, with that reduction continuing as the child ages. Persistence of birthweight-related differential DNA methylation may not necessarily be a prerequisite for long-term effects, as transient differential methylation in early life may cause lasting functional alterations in organ structure and function that predispose to later adverse health effects.
Methylation is known to be associated with gene expression 48. However, we found no consistent associations between birthweight-related methylation and gene expression in two childhood studies. This could be due to the relatively small sample sizes, differences in ethnicities, age, or platforms to measure gene expression. The use of blood, which is likely only a possible surrogate tissue for fetal growth phenotypes, for gene expression analysis might also explain the lack of findings. We did find multiple cis-eQTMs among the birthweight-related CpGs at which methylation was related to gene expression in blood when using a publicly available database from a larger adult sample 40, providing some evidence that birthweight-related differentially methylated CpGs may be associated with gene expression. These initial in silico association analyses need further exploration to establish any underlying causal mechanisms.
In observational studies, birthweight has repeatedly been associated with a range of later-life diseases. Change in DNA methylation has been hypothesised as a potential mechanism linking early exposures, birthweight and later health (Fig. 1). We originally aimed to explore this using MR analysis. For the vast majority of the birthweight-associated CpGs, no genetic instrumental variables were available. For most of the remaining 126 CpGs, only one mQTL was available, which would make it impossible to disentangle causality from horizontal pleiotropy. To ensure a strong basis for future MR analyses on this topic, there is a clear need for a more extensive mQTL resource.
Strengths of this study are its large sample size and the extensive analyses that we have undertaken. In a post hoc power calculation based on the sample size of 8825 with a weighted mean birthweight of 3560 g (weighted mean standard deviation (SD): 483 g) and with an alpha set at the Bonferroni-corrected level of P < 1.06 × 10 −7 we had 80% power, with a two-sided test, to detect a minimum difference of 0.13 SD (63 g) in birthweight for each SD increase in methylation. The difference in methylation corresponding to a 1 SD increase differs per CpG, as it depends on the distribution of the methylation values. We acknowledge that smaller differences which might be clinically or biologically relevant may not have been identified in the current analysis. Nonetheless, to our knowledge this analysis has brought together all studies currently available with relevant data and is the largest published study of this association. DNA methylation patterns in neonatal blood, whilst easily accessible in large numbers, may not reflect the key tissue of importance in relation to birthweight. DNA methylation and gene expression in placental tissue may be important targets for future studies. DNA methylation varies between leucocyte subtypes 49 and we used an adult whole blood reference to correct for this in the main analyses 23,24 , as the study-specific analyses were completed before the widespread availability of specific cord blood reference datasets 50,51 . However, we observed very similar findings in two studies (Generation R and GECKO) when we compared the results with those using one of the currently available cord blood references 50 . Although we adjusted for potential major confounders that may affect both methylation and fetal growth, we acknowledge that the main results cannot ascertain causality. 
That is, whilst we have hypothesised that variation in fetal DNA methylation influences fetal growth and hence birthweight, and undertaken the analyses accordingly, we cannot exclude the possibility that differences in neonatal blood DNA methylation are caused by variation in fetal growth itself, or that the association is confounded by factors, including maternal smoking and BMI, that independently influence both fetal growth and DNA methylation (as suggested in Fig. 1). The 450k array that was used to measure genome-wide DNA methylation only covers 1.7% of the total number of CpGs present in the genome and specifically targets CpGs in promoter regions and gene bodies 52. We removed the CpGs that were flagged as potentially cross-reactive, as the measured methylation levels may represent methylation at either of the potential loci. Also, although we did not find evidence for polymorphic effects for the 161 potentially polymorphic CpGs in ALSPAC, we cannot completely exclude these potential polymorphic effects in the meta-analysed results. The majority of participants were of European ethnicity and when analyses were restricted to those of European ethnicity the results were essentially identical to those with all studies included. Direct comparisons of the main analysis with analyses in those of Hispanic or of African ethnicity for the 914 hits suggested strong correlations with Hispanic but weaker with African ethnicity. However, these results need to be treated with caution. First, we had very few studies of Hispanic and African populations. Second, we only compared the initial hits from the main meta-analysis with all ethnicities included. A detailed exploration of ethnic differences would require similar large samples for each ethnic group and within ethnic EWAS, which is beyond the scope of the data currently available.
Neonatal blood DNA methylation at many sites across the genome is associated with birthweight. Further research is required to determine if these are causal and if so whether they mediate any long-term effect of intrauterine exposures on future health.

Methods
Participants. In the main EWAS meta-analysis we explored associations of neonatal blood DNA methylation with birthweight using data from 8825 neonates from 24 studies in the PACE Consortium 53 (Table 1). We removed multiple births from all analyses and excluded preterm births (<37 weeks) and offspring of mothers with pre-eclampsia or diabetes (three major pathological causes of differences in fetal growth). In follow-up analyses, we explored whether any sites found in the main analysis were discernible in relation to birthweight when examined in DNA from blood drawn during childhood (2-13 years; 2756 children from 10 studies), adolescence (16-18 years; 2906 adolescents from six studies) or adulthood (30-45 years; 1616 adults from three studies), see Supplementary Data 1B. Informed consent was obtained from all participants, and all studies received approval from local ethics committees. Study-specific methods and ethical approval statements are provided in Supplementary Methods.
Birthweight, DNA methylation and covariates. Our primary outcome was birthweight on a continuous scale (grams), adjusted for gestational age, and measured immediately after birth or retrospectively reported by mothers in questionnaires. In secondary analyses, we categorised and compared associations with high (>4000 g, n = 1593) versus normal (2500-4000 g, n = 6377) birthweight. We also explored all associations with (continuous and categorical) birthweight in analyses that did not exclude women with pre-eclampsia, diabetes or preterm delivery, which also resulted in enough cases to explore low (<2500 g, n = 178) versus normal (2500-4000 g, n = 4197) birthweight (Supplementary Data 1C shows the characteristics of participants). Primary, secondary and follow-up analyses are outlined in the study design in Fig. 2. DNA methylation was measured in neonatal blood samples using the Illumina Infinium® HumanMethylation450 BeadChip assay. All participants had cord blood samples except for three studies with heel stick blood spots (n = 1254 [14.2%]). After study-specific laboratory analyses, quality control, normalisation, and removal of control probes (n = 65) and probes that mapped to the X (n = 11,232) and Y (n = 370) chromosomes, we included 473,864 CpGs. DNA methylation is expressed as the proportion of cells in which the DNA was methylated at a specific site and hence takes values from zero to one. We converted this to a percentage and present differences in mean birthweight per 10% higher DNA methylation level at each CpG. All analyses were adjusted for gestational age at delivery, child sex, maternal age at delivery, parity (0/≥1), smoking during pregnancy (no smoking/stopped in early pregnancy/smoking throughout pregnancy), pre-pregnancy BMI, socio-economic position, technical variation, and estimated white blood cell proportions (B-cells, CD8+ T-cells, CD4+ T-cells, granulocytes, NK-cells and monocytes) [23][24][25]. 
In studies with participants from multiple ethnic groups, each group was analysed separately and results were added to the meta-analyses as separate studies. Further details are provided in the study-specific Supplementary Methods.
Statistical methods. Robust linear (birthweight as a continuous outcome) or logit (binary birthweight outcomes) regression EWAS were undertaken within each study according to a pre-specified analysis plan. Quality control, normalisation and regression analyses were conducted independently by each study. After confirming comparability of study-specific summary statistics 54, we combined results using a fixed effects inverse variance weighted meta-analysis 55. The meta-analysis was done independently by two study groups and the results were compared in order to minimise the likelihood of human error. We show (two-sided) results after correcting for multiple testing using both the FDR<0.05 56 and the Bonferroni correction (p < 1.06 × 10−7). We completed follow-up analyses for differentially methylated CpGs that reached the Bonferroni-adjusted threshold and did not show large between-study heterogeneity 57 (I² ≤ 50%). We annotated the nearest gene for each CpG using the UCSC Genome Browser build hg19 58,59. We explored whether between-study heterogeneity might be explained by differences in ethnicity between studies, by repeating the meta-analysis including only participants of European ethnicity, which was by far the largest ethnic subgroup (n = 6023 from 17 studies) (Fig. 2). Ethnicity was defined using maternal or self-report, unless specified otherwise in study-specific Supplementary Methods. We also did meta-analyses only including the Hispanic studies and only including the African American studies and present those results for illustrative purposes only, given the much smaller sample size. All analyses were performed using R 60, except for the meta-analysis which was performed using METAL 55. We removed CpGs that cohybridised to alternate sequences (i.e. cross-reactive sites), because we cannot distinguish whether the differential methylation is at the locus that we have reported or at the one that the probe cross-reacts with. 
We compared the birthweight-related CpGs to lists of CpGs that are potentially influenced by a SNP (polymorphic sites) 26,27 . For these CpGs, we determined if DNA methylation levels were influenced by nearby SNPs, by assessing whether their distributions deviated from unimodality using Hartigans' dip test 28,29 and visual inspection of density plots in n = 742 cord blood samples in the ALSPAC study.
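As an illustration of the pooling step for a single CpG, the following is a minimal sketch of a fixed-effects inverse-variance weighted meta-analysis with the I² heterogeneity statistic. It is not the METAL implementation used in the study; the function name and the example estimates are hypothetical, and the two-sided P-value assumes a normal approximation for the pooled z-score.

```python
import math

def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates for one CpG.

    betas: per-study effect estimates; ses: their standard errors.
    Returns (pooled beta, pooled SE, I^2 in percent, two-sided P-value).
    """
    weights = [1.0 / se ** 2 for se in ses]          # inverse-variance weights
    w_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)

    # Cochran's Q and I^2 = max(0, (Q - df) / Q) as a percentage.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal P-value
    return pooled, pooled_se, i2, p_value

# Hypothetical estimates (grams per 10% methylation) from three studies.
pooled, pooled_se, i2, p_value = fixed_effect_meta(
    betas=[150.0, 120.0, 135.0], ses=[30.0, 25.0, 40.0]
)
keep_for_follow_up = p_value < 1.06e-7 and i2 <= 50.0
```

The last line mirrors the prioritisation rule stated in the text: a CpG is carried forward only if it passes the Bonferroni threshold and has I² ≤ 50%.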
Analyses at older ages. Analyses of the associations with DNA methylation in blood collected in childhood, adolescence and adulthood followed the same covariable adjustment and methods as for the main analyses (p < 5.5 × 10−5 for 914 tests). All participants and studies in these analyses at older ages had not been included in the main meta-analysis in neonatal blood, except for ALSPAC (n = 633 in neonatal analyses, n = 605 in childhood and n = 526 in adolescence), CHAMACOS (n = 283 in neonatal analyses and n = 191 in childhood) and Generation R (n = 717 in neonatal analyses and n = 372 in childhood). Characteristics are shown in study-specific Supplementary Methods and Supplementary Data 1B.
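The two Bonferroni thresholds quoted in the text can be reproduced with a short calculation. This sketch assumes a family-wise alpha of 0.05 divided by the number of tests, which is consistent with the reported values for the 473,864 CpGs in the main analysis and the 914 CpGs in the look-up at older ages; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# Assumed alpha = 0.05; the test counts are taken from the text.
main_threshold = bonferroni_threshold(0.05, 473_864)   # main EWAS
lookup_threshold = bonferroni_threshold(0.05, 914)     # look-up at older ages

print(f"main: {main_threshold:.2e}, look-up: {lookup_threshold:.1e}")
```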
Intrauterine factors. We used a hypergeometric test to explore the extent to which any of the birthweight-related CpGs overlapped with those previously associated with intrauterine exposure to smoking 14 (n = 568 CpGs), BMI 15 (n = 104 CpGs) and plasma folate 31 (n = 48 CpGs), using the same (Bonferroni-corrected) cut-off for statistical significance. No CpGs reached the Bonferroni-corrected cut-off for famine 32 . We additionally appraised this overlap using the FDR<0.05 cut-off for all traits (n = 8170 birthweight-related CpGs, n = 6703 smoking-related CpGs, n = 16,067 BMI-related CpGs, n = 443 folate-related CpGs, n = 7 famine-related CpGs). These FDR results were available from the publications for smoking, folate and famine, and we obtained them from the corresponding author for BMI.
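A hypergeometric enrichment test of this kind can be sketched as follows. The sketch computes the upper-tail probability of seeing at least the observed overlap when one CpG list is drawn at random from a common universe. The function and parameter names are hypothetical, and the choice of the 473,864 analysed CpGs as the universe is an assumption, since the text does not state which background set was used.

```python
from math import comb

def hypergeom_enrichment_p(overlap, universe, set_a, set_b):
    """P(X >= overlap) for a hypergeometric draw.

    universe: number of CpGs tested; set_a: size of the exposure-related
    list; set_b: size of the birthweight-related list; overlap: number of
    CpGs present in both lists.
    """
    upper = min(set_a, set_b)
    # Exact integer arithmetic avoids underflow for very small P-values.
    numerator = sum(
        comb(set_a, i) * comb(universe - set_a, set_b - i)
        for i in range(overlap, upper + 1)
    )
    return numerator / comb(universe, set_b)

# Counts from the text: 55 of 914 birthweight CpGs among 568 smoking CpGs.
# The universe size is an assumption (all CpGs included in the EWAS).
p_smoking = hypergeom_enrichment_p(
    overlap=55, universe=473_864, set_a=568, set_b=914
)
```

Summing exact binomial coefficients is slower than a library routine but keeps the example free of dependencies and numerically stable for extreme tail probabilities.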
Metastable epialleles and imprinted genes. We tested the birthweight-associated CpGs for enrichment of metastable epialleles and CpGs associated with imprinted genes. The metastable epialleles were derived from a recently published study that identified 1936 putative metastable epialleles 34. For imprinted genes, we first identified a set of CpGs falling within a curated set of imprinting control regions, i.e. differentially methylated regions controlling the parental-specific expression of one or more imprinted genes 36. Second, we extracted the set of genes controlled by imprinting control regions from the above source and identified all 450k CpGs within +/−10 kb of the gene transcription start site (TSS), including all known alternative TSS identified in grch37.ensembl.org using biomaRt 61,62.
Comparison with GWAS for birthweight. We compared the birthweight-associated CpGs with the 60 SNPs from the most recent GWAS meta-analyses of fetal genotype associations with birthweight in >150,000 newborns 37 and with the 10 SNPs from the most recent GWAS meta-analysis of maternal genotype associations with birthweight in >86,000 women 38 surrounding these SNPs. We additionally checked whether SNPs and CpGs were located in the same gene.
Functional analyses. To explore the association of methylation with gene expression, we compared birthweight-related CpGs with a recently published list of 18,881 cis-eQTMs from whole blood samples of 2101 Dutch adult individuals 40. With a hypergeometric test, we calculated enrichment of cis-eQTMs in the list of birthweight-associated CpGs. We further explored methylation of birthweight-associated CpGs in relation to whole blood mRNA gene expression (transcript levels) within a 500 kb region of the CpGs (+/−250 kb, FDR<0.05) in 112 Spanish 4-year-olds 41 and 84 Gambian 2-year-olds 42 (Supplementary Methods). To better understand the potential mechanisms linking DNA methylation and birthweight, we explored the potential functions of the birthweight-associated CpGs using GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. We used the missMethyl R package 63, which enabled us to correct for the number of probes per gene on the 450k array, based on the November 2018 version of the GO and KEGG source databases. To filter out large, general pathways, we restricted the analysis to gene sets containing between 15 and 1000 genes. We calculated FDR-corrected P-values for enrichment, using a 5% threshold.
Mendelian randomization. MR uses genetic variants as instrumental variables to study the causal effect of exposures on outcomes 64,65. We aimed to use two-sample MR 22,66 to explore (a) evidence of a causal association of methylation levels at the identified CpGs with birthweight and (b) evidence of a causal association of these CpGs with later-life health outcomes (i.e. to explore our hypothesised causal mechanisms shown in Fig. 1). We did this by first searching a publicly available mQTL database 39 to identify cis-mQTLs within 1 Mb of each birthweight-related differentially methylated CpG that reached the Bonferroni-corrected threshold with I² ≤ 50%. These mQTLs could then be used as genetic instrumental variables for methylation levels of the birthweight-related CpGs. We then aimed to determine the association of these mQTLs with birthweight and later-life health outcomes from publicly available summary GWAS results 66.
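As an illustration of how a single mQTL could serve as an instrument in two-sample MR, the sketch below computes a Wald ratio estimate from summary statistics. The text does not state which estimator was intended, so the Wald ratio, the first-order standard error (which ignores uncertainty in the mQTL-methylation association) and all names are assumptions made here.

```python
import math


def wald_ratio(beta_mqtl_cpg, beta_mqtl_outcome, se_mqtl_outcome):
    """Wald ratio estimate of the effect of methylation on an outcome.

    beta_mqtl_cpg:     mQTL association with methylation at the CpG (sample 1)
    beta_mqtl_outcome: mQTL association with the outcome, e.g. birthweight (sample 2)
    se_mqtl_outcome:   standard error of beta_mqtl_outcome
    """
    if beta_mqtl_cpg == 0:
        raise ValueError("the instrument must be associated with methylation")

    estimate = beta_mqtl_outcome / beta_mqtl_cpg

    # First-order approximation: treats the mQTL-methylation effect as fixed.
    se = se_mqtl_outcome / abs(beta_mqtl_cpg)

    # Two-sided p-value from a normal approximation.
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, se, p
```

When several independent mQTLs are available for one CpG, per-instrument ratios of this kind are commonly combined by inverse variance weighting.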
Reporting summary. Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All relevant data supporting the key findings of this study are available within the article and its Supplementary Information files or from the corresponding authors upon reasonable request. All summary statistics from this EWAS meta-analysis are available via doi: 10.5281/zenodo.2222287. A reporting summary for this Article is available as a Supplementary Information file.

Code availability
The code used for this EWAS meta-analysis is available from the authors upon request.