Introduction

Chronic lung disease of prematurity (CLD; also called bronchopulmonary dysplasia) is the most common pulmonary morbidity in extremely preterm infants. Generally, clinical factors have been found to be more predictive of CLD than genetic markers.1 For example, fetal growth restriction (FGR) is highly predictive of CLD.2 It is unclear to what extent prenatal exposures, including chronic hypoxia, maternal socioeconomic status, and placental inflammation, influence the risk of postnatal outcomes, such as CLD. Placental epigenetic programming, as measured by placental DNA methylation, has been proposed as an intermediate linking prenatal exposures to later health outcomes in premature infants.3

Consistent with the placenta’s critical role in fetal development through both the exchange of nutrients and mediation of environmental influences, the fetal origins hypothesis of CLD posits that an adverse intrauterine milieu alters lung development and increases the risk of CLD.4 Several epidemiological studies demonstrating an association between placental disorders and CLD support this hypothesis.2,5 Still, a knowledge gap remains concerning whether the placental epigenome plays a role in fetal origins of CLD in extremely preterm infants.

Preclinical studies suggest that epigenetic mechanisms underlying fetal lung development may occur in a sex-specific manner. For example, sex-specific impairments in alveolarization were found in rats in response to FGR via alterations in the transcription of peroxisome proliferator-activated receptor-gamma (PPARγ).6 Likewise, in murine models, mechanical ventilation and hyperoxia lead to epigenetic changes in the lung related to histone deacetylase activity, which inhibits alveolar formation and promotes lung remodeling.7 While sexual dimorphism exists for CLD, with a higher incidence in males than in females,8 whether placental epigenetic variation contributes to the pathogenesis of CLD in a sex-specific manner in humans remains unclear.

The objective of this study was to assess the relationship between placental CpG methylation and CLD in extremely preterm infants. We hypothesized that genes involved in pathways related to placental hypoxia and oxidative stress, which may adversely affect fetal lung and pulmonary vascular development,9 would be differentially methylated in placentas collected from infants who later went on to develop CLD versus those who did not. Furthermore, based on significant sex-based differences in placental methylation in response to prenatal exposures as documented by Martin et al.10 and sex-based differences in the incidence of CLD,8,11 we hypothesized that patterns of placental DNA methylation would differ by sex.

Methods

Study sample

Participants in this study represent a subset of individuals enrolled in the Extremely Low Gestational Age Newborn (ELGAN) Study. The ELGAN cohort comprises newborns delivered prior to 28 weeks of gestation who were enrolled at 14 United States hospitals located in five states (Connecticut, Massachusetts, North Carolina, Michigan, and Illinois).12 Of the 1506 infants enrolled, 1251 infants survived to 36 weeks postmenstrual age (PMA) and 1241 (99.2%) were evaluated at 36 weeks PMA for CLD. Epigenetic data were available from the placentas of 423 (34%) of the infants evaluated for CLD. The institutional review boards of all participating institutions approved enrollment and consent procedures for the ELGAN Study.

Classification of chronic lung disease

We classified infants as having CLD if they were receiving either supplemental oxygen or mechanical ventilation, or both, at 36 weeks PMA. In addition, we recorded the type of respiratory support infants were receiving at 36 weeks PMA, classified as follows: none, increased ambient oxygen, nasal cannula, nasal continuous positive airway pressure, conventional mechanical ventilation, or high-frequency ventilation.

Placenta tissue collection

As previously described, placental biopsies were taken shortly after delivery.13 Briefly, a tissue sample from the fetal side of the placenta was collected by applying traction to the chorion and the underlying trophoblast tissue and cutting a sample out at the base of this tissue structure. The tissue sample was immediately frozen in liquid nitrogen and stored at −80 °C until further processing.

DNA extraction and assessment of epigenome-wide DNA methylation

DNA was extracted and epigenome-wide DNA methylation assessed using the EZ DNA Methylation kit (Zymo Research, Irvine, CA) and Infinium MethylationEPIC BeadChip (Illumina, San Diego, CA), as previously described.13 The Infinium MethylationEPIC array measures more than 850,000 CpG loci at single-nucleotide resolution. Samples were randomly allocated to different plates and chips to minimize confounding. Quality control was performed at the sample level using the minfi package.14 Normalization began with the normal-exponential out-of-band (noob) correction method15 for background subtraction and dye normalization, followed by functional normalization to the top two principal components of the control matrix.16 Quality control was performed on individual probes by computing a detection p-value; 806 (0.09%) probes with non-significant detection (p > 0.01) in 5% or more of the samples were excluded. The ComBat function from the sva package was used to adjust for batch effects from the sample plate.17 A total of 856,832 CpG probes were retained for analyses. Average methylation levels at each CpG probe were measured and expressed as β-values, β = M / (U + M + 100), where M is the intensity of the methylated allele and U is the intensity of the unmethylated allele. β-values were logit transformed to M-values for statistical analyses.18
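The β-value definition, the logit transformation, and the probe-level detection filter described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which relied on the minfi and sva packages. The function names are hypothetical, and the base-2 logarithm is an assumption that follows the common convention for M-values, since the text states only that a logit transformation was applied.

```python
import math


def beta_value(meth, unmeth, offset=100.0):
    """Beta-value: methylated intensity over total intensity plus an offset."""
    return meth / (meth + unmeth + offset)


def m_value(beta):
    """Logit transform of a beta-value (base 2 is assumed here)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def failed_probe(detection_pvals, p_cutoff=0.01, max_fail_frac=0.05):
    """True if detection p > p_cutoff in max_fail_frac or more of the samples."""
    n_fail = sum(1 for p in detection_pvals if p > p_cutoff)
    return n_fail / len(detection_pvals) >= max_fail_frac
```

The offset of 100 keeps β-values stable when both intensities are low. The M-value scale is unbounded, which is one reason it is commonly preferred for regression modeling.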

Statistical analysis

The demographic and clinical characteristics of the participants are reported using means, with standard deviations, or proportions. A directed acyclic graph was created using the web-based software DAGitty (www.dagitty.net) to delineate relationships between a priori selected covariates with placental CpG methylation as the exposure and CLD as the outcome (Supplemental Figure).13,19 Based on this diagram, the minimally sufficient set of covariates for adjustment included: gestational age, birth weight Z-score, maternal age, race, infant sex, inflammation level in the placenta, cell-type heterogeneity, and maternal socioeconomic status, based on marital status, eligibility for public insurance (Medicaid), and the highest level of education completed. Maternal age was treated as a continuous variable. Race was dichotomized as white and non-white. Linear mixed models were used to assess the impact of family structure (e.g., twins), and results were found to be consistent with the standard linear models. To account for the possibility that methylation levels were affected by the infiltration of inflammatory cells within the placenta, acute inflammation status of the chorion/decidua was used as an adjustment factor in all models. Inflammation of the placenta was defined as greater than 10,000 neutrophils per cubic millimeter in the chorion or decidua.20 To control for cell-type heterogeneity, the top 10 component variables were selected based on a surrogate variable analysis (SVA), a reference-free method that efficiently estimates cell-type mixture in heterogeneous tissues using iteratively re-weighted least squares.21 Missing covariates, both continuous and categorical, were simultaneously imputed using a random forest trained on observed values of the data matrix, assessing for complex interactions and non-linear relations (missForest package).22 Imputation was assessed with an out-of-bag imputation error estimate.

An epigenome-wide association study (EWAS) was performed by fitting a robust linear regression model for each CpG probe adjusted for a priori selected covariates with DNA methylation as the response variable on the M-value scale and CLD as the main predictor. Robust linear regression was used to protect against potential heteroscedasticity.23 In these models, test statistics were modified using Phipson’s robust empirical Bayes procedure, shrinking probe-wise sample variances towards a common value and controlling for test-statistic inflation.24 Statistically significant associations were identified based on Benjamini–Hochberg false discovery rate (FDR) q-values using a significance level of <0.05.25 As a secondary analysis, sex-specific associations focusing on DNA methylation in autosomes were assessed via stratification of adjusted robust linear regression models, excluding infant sex as a covariate.
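As an illustration of the multiple-testing step, the Benjamini–Hochberg adjustment can be sketched in a few lines of Python. This is a generic sketch of the standard procedure and not the code used in the study. The function name is hypothetical, and the input is assumed to be a plain list of probe-wise p-values.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals
```

A probe would then be called significant when its q-value falls below the chosen threshold, 0.05 in the analysis described here.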

Manhattan plots and QQ-plots were used to report results. QQ-plots show potential test-statistic inflation at tails of the inferential distribution by plotting theoretical quantiles of the t-distribution on the x axis against the empirical modified t-statistics estimated using the empirical Bayes procedure. The interpretation of these QQ-plots is equivalent to those that plot theoretical and sample p-values. Traditional inflation factors are not interpretable in this empirical Bayes setting, often underrepresenting inflation in test statistics.

Pathway analysis

Ingenuity Pathway Analysis (IPA), a web-based software application (available at http://www.ingenuity.com), was used to determine whether the genes that contained the differentially methylated probes are enriched for specific biological functions. IPA relies on a repository of biological pathways based on curated findings from the research literature. For this analysis, we focused on genes associated with CLD identified in both the overall and sex-stratified analyses at FDR values < 0.05. IPA provided associated pathways for each focus gene, as well as p-values calculated using the right-tailed Fisher exact test.
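The right-tailed Fisher exact test used for pathway enrichment is equivalent to the upper tail of a hypergeometric distribution. The sketch below is a minimal Python illustration under stated assumptions. IPA's reference gene universe and pathway definitions are proprietary, so the function name and all four arguments are hypothetical inputs and do not reproduce IPA's internal calculation.

```python
from math import comb


def fisher_right_tail(overlap, pathway_size, focus_size, universe_size):
    """P(X >= overlap) when focus_size genes are drawn from universe_size genes,
    of which pathway_size belong to the pathway (hypergeometric upper tail)."""
    upper = min(pathway_size, focus_size)
    total = comb(universe_size, focus_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, focus_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small p-value indicates that the focus genes overlap the pathway more often than expected if the same number of genes were drawn at random from the universe.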

Results

Study cohort

Maternal and newborn characteristics are presented for the ELGAN subjects used in this analysis (n = 217 subjects with CLD and n = 206 subjects without CLD) and the overall ELGAN cohort evaluated for CLD at 36 weeks PMA (n = 1241) in Table 1. Infants with CLD had lower mean gestational age and lower mean birth weight, and were more likely to be white, male, and growth-restricted (i.e., birth weight Z-score < −1) compared to infants without CLD. Mothers of ELGANs with placental tissue available (n = 423) were slightly older, more likely to be non-Hispanic white, to have higher educational attainment, and to have a multiple-gestation pregnancy, and less likely to have public insurance and to smoke during pregnancy compared to mothers without placentas available. CLD frequency was similar between infants with and without placental samples available, and similar to CLD frequency in the overall sample (Supplemental Table).

Table 1 Maternal and newborn characteristics by CLD diagnosis at 36 weeks PMA within the ELGAN cohort.

Differences in placental DNA methylation based on CLD diagnoses

EWAS analyses for CLD, after adjusting for confounders, identified 49 differentially methylated CpG probes, corresponding to 46 unique genes, at FDR adjusted q-values < 0.05 (Table 2). CLD was associated with increased methylation levels in 3 probes (6%) and decreased methylation levels in 46 probes (94%). The Manhattan plot displays the genomic location of CpG methylation probes based on their association with CLD using threshold FDR values of 0.05 and 0.01 (Fig. 1). A comprehensive list of all significant CpG probes is included as a Supplemental File. Analysis was performed on the genes associated with CLD to explore their involvement/enrichment in canonical pathways. Among the top significant CpG probes associated with top canonical pathways, we identified cg24455359 located on chromosome 17 within the 1st exon of the Acetyl-CoA Carboxylase Alpha (ACACA) gene, cg09529537 located on chromosome 5 within the TSS200 region of the Junction Mediating and Regulatory Protein, P53 Cofactor (JMY) gene, cg214664091 located on chromosome 20 within the TSS220 region of the Proliferating Cell Nuclear Antigen (PCNA) gene, cg09089417 located on chromosome 12 within the body of the Phosphoglycerate Mutase Family Member 5 (PGAM5) gene, and cg13564825 located on chromosome 19 within the TSS200 region of the Protein Phosphatase 1 Regulatory Subunit 14A (PPP1R14A) gene (Table 3).

Table 2 Number of CLD-associated CpG probes in all infants, in male infants only, and in female infants only.
Fig. 1: Manhattan plots of CpG methylation probes associated with CLD.

The chromosomal locations of CpG probes are presented on the x axis and the −log10(p-value) associated with CLD is shown on the y axis. The models are adjusted for gestational age, birth weight Z-score, maternal age, race, infant sex, inflammation level in the placenta, cell-type heterogeneity, and maternal socioeconomic status. We added lines to indicate FDR thresholds at 0.05 (bottom) and 0.01 (top). On the right side of the figure, we include QQ-plots to indicate theoretical quantiles of the t-distribution on the x axis against the empirical modified t-statistics estimated using the empirical Bayes procedure.

Table 3 CpG probes involved in top canonical pathways associated with CLD with their corresponding chromosome and gene locations, and directional change of methylation.

Table 4 shows the top canonical pathways associated with differentially methylated genes in infants with CLD. The top canonical pathways enriched within the CLD-associated CpGs include biotin-carboxyl carrier protein assembly (p = 3.94 × 10^−3), p53 signaling (p = 7.34 × 10^−3), mismatch repair in eukaryotes (p = 2.08 × 10^−2), and d-myo-inositol (1,4,5,6)- and (3,4,5,6)-tetrakisphosphate biosynthesis (p = 2.23 × 10^−2).

Table 4 Top canonical pathways associated with differentially methylated genes in infants with CLD, male infants with CLD, and female infants with CLD.

Sex-stratified analyses of differentially methylated genes

EWAS analyses stratified by the sex of the infant revealed distinct CpG methylation patterns, with male-derived placentas having more differentially methylated CpG probes than female-derived placentas in relation to CLD. In the male-derived placentas (n = 222), there were 518 differentially methylated CpG probes corresponding to 414 unique genes that were associated with CLD. Decreased methylation levels were observed in 440 (85%) CpG sites that corresponded to 358 genes. Increased methylation levels were observed in 78 (15%) CpG sites that corresponded to 56 genes (Table 2). Among the female-derived placentas (n = 201), placental CpG methylation of 12 sites was associated with CLD. Decreased methylation was observed in 7 (58%) CpG probes, corresponding to five genes, and increased methylation was observed in 5 (42%) CpG sites, corresponding to three genes (Table 2). Manhattan and QQ-plots for the sex-stratified EWAS are shown in Fig. 2. There were no overlapping CpG methylation probes between the sexes that displayed significance at q < 0.05. In male placentas, the top canonical pathways included AMP-activated protein kinase signaling pathway (p = 9.25 × 10^−4), spliceosomal cycle (p = 1.59 × 10^−3), apelin cardiac fibroblast signaling pathway (p = 1.62 × 10^−3), Huntington’s disease signaling (p = 2.58 × 10^−3), and pyridoxal 5′-phosphate salvage pathway (p = 4.73 × 10^−3). In female placentas, the top canonical pathways identified were the role of BRCA1 in DNA damage response (p = 6.77 × 10^−3) and hereditary breast cancer signaling (p = 1.20 × 10^−2) (Table 4).

Fig. 2: Manhattan plots of associations between placental CpG methylation and CLD in males (a) and females (b).

The chromosomal locations of CpG probes are presented on the x axis and the −log10(p-value) associated with CLD is shown on the y axis. The models are adjusted for gestational age, birth weight Z-score, maternal age, race, infant sex, inflammation level in the placenta, cell-type heterogeneity, and maternal socioeconomic status. We added lines to indicate FDR thresholds at 0.05 (bottom) and 0.01 (top). On the right side of the figure, we include QQ-plots to indicate theoretical quantiles of the t-distribution on the x axis against the empirical modified t-statistics estimated using the empirical Bayes procedure.

Discussion

This study is among the first to examine CpG methylation levels in the placenta at more than 850,000 sites in relation to CLD in extremely preterm infants. The placenta is the key organ for this study due to its critical role in the development of the fetus.26 In addition, DNA methylation in the placenta has been associated with childhood cognitive impairment and body mass index, suggesting a potential role as a mediator in the developmental origins of health and disease.27,28 Based on evidence that oxidative stress may adversely affect fetal lung development, we hypothesized that genes involved in pathways related to placental hypoxia would be differentially methylated in infants based on CLD status, and that the patterns of methylation would differ based on fetal sex.

A total of 49 differentially methylated CpG probes corresponding to 46 unique genes were associated with CLD at FDR adjusted q-value < 0.05. CLD was associated with decreased methylation at a CpG site within the PCNA gene and increased methylation at a CpG site within the JMY gene. The PCNA gene encodes a cofactor for DNA polymerase and the JMY gene encodes an important nuclear cofactor for p53 signaling, which regulates cell growth.29 Both of these genes are involved in p53 signaling, regulating syncytiotrophoblast apoptosis and cell-cycle arrest in the placenta in response to hypoxia. Placental p53 expression is upregulated by increased concentrations of hypoxia-inducible factors in response to chronic hypoxia.30,31 Chronic fetal hypoxia leads to impaired fetal lung and vascular development through altered expression of fibroblast growth factor and vascular endothelial growth factors.32 Impaired fetal lung development leads to alveolar simplification, impaired pulmonary angiogenesis, and pulmonary vascular remodeling, which can increase the need for invasive respiratory support following birth.33 Prolonged exposure to mechanical ventilation and supplemental oxygen causes additional lung injury, further increasing the risk of CLD.

CLD was associated with differential methylation levels of CpG sites within genes that are part of the canonical pathway myo-inositol (1,4,5,6)-tetrakisphosphate biosynthesis based on decreased methylation at CpG sites within the PGAM5 and PPP1R14A genes. Myo-inositol is an important cell-signaling molecule that in the placenta acts via abundant inositol transporters to regulate fetal growth.34 In the fetus, myo-inositol plays a major role in lung development by promoting the maturation of phospholipids involved in surfactant production.35 In premature infants, low serum inositol levels are associated with more severe respiratory symptoms.36 This observation has led to multiple randomized controlled trials of supplemental inositol in infants with respiratory distress syndrome. However, a Cochrane meta-analysis found that inositol supplementation did not reduce death, CLD, or other clinically important outcomes.37

We found unique CpG methylation patterns associated with CLD based on infant sex. CLD in male placentas was associated with differential methylation levels at CpG sites within 9 genes in the AMP-activated protein kinase (AMPK) signaling pathway: decreased methylation at CpG sites within the AK2, ARID2, CHRNA10, GNG7, PFKL, PRKAA1, RAB39, and TSC2 genes, and increased methylation at CpG sites within the PIK3CD gene. Interestingly, single nucleotide polymorphisms leading to increased maternal expression of PRKAA1 in response to altitude-related hypoxia, resulting in increased uterine artery blood flow via AMPK signaling activation, are thought to be an adaptive response to protect against FGR in populations living in the Andean and Tibetan mountains.38 In our study, decreased methylation at CpG sites within 8 genes in the AMPK signaling pathway, a pattern typically indicative of increased gene expression,39 may reflect simultaneous compensatory mechanisms occurring in the maternal circulation to promote uterine blood flow in response to hypoxia in pregnancy.40 Activation of the AMPK signaling pathway has also been associated with impaired trophoblast invasion in placentas from pregnancies complicated by pre-eclampsia.41 Our observation of distinct genes and pathways associated with CLD based on fetal sex is consistent with evidence that sex influences gene expression in human placental villi, which may contribute to worse outcomes in male fetuses.42

While this study is among the first to highlight the relationship between placental CpG methylation and CLD, a few limitations should be noted when interpreting the results. First, the study may not be sufficiently powered to draw definite conclusions using an EWAS approach. While the placenta plays a critical role in fetal development, we are mindful that tissue-specific patterns of CpG methylation exist, and that changes that are observed within the placenta may not reflect patterns in fetal lung tissue.43 Despite this limitation, studies examining fetal and postnatal samples from multiple tissues over time demonstrate that limited patterns of differentially methylated regulatory regions of select genes may be highly conserved across placental, perinatal (i.e., umbilical cord blood), and postnatal human tissues.44,45 An additional limitation of the study is that CpG methylation does not directly indicate the direction or extent of gene expression change.46 Further studies incorporating messenger RNA levels could enhance understanding of the regulatory mechanisms involved. We used the SVA approach to address cell-type heterogeneity. However, this approach may not have adequately controlled for the mix of villous and vascular structures present in fetal-side placenta samples, and alternative methods could be used in future studies to address cell-type heterogeneity. Our analyses would also be more robust with the inclusion of ancestry data. In our EWAS analyses, we considered FDR adjusted values < 0.05 as significant. However, we acknowledge that no CpG sites in the overall sample were significant at the more stringent level of <0.01. Finally, CLD was operationally defined by the ELGAN Study as supplemental oxygen therapy with or without mechanical ventilation at 36 weeks PMA, which is not consistent with the most recent NIH consensus definition of CLD.47

It is important to note that several mechanisms may explain the association of placental methylation and CLD. CpG methylation is correlated with gene expression,48 and altered gene expression may be an indicator of placental health. Therefore, changes in gene expression involved in pathways related to trophoblast migration and implantation may reflect reduced nutrient and oxygen delivery to the developing fetus, resulting in altered fetal lung programming affecting alveolar and pulmonary vascular development. Altered placental methylation may also be an indicator of disrupted inflammation and signaling pathways that lead to cytokine-mediated deleterious effects on the fetal lung, or an indicator of FGR. We included placental inflammation and FGR (as defined by birth weight Z-score) as covariates in our model due to their independent associations with altered placental methylation and CLD.49,50,51 Finally, our results may reflect residual confounding of covariates related to CLD that were not included in the model.

Our findings suggest that prenatal factors may play a role in the development of CLD, and, if replicated in future studies, might explain why CLD incidence has not decreased despite multiple intervention trials to reduce its frequency.52 Further analyses measuring gene expression in the placenta and correlation with blood sampling from infants may provide further insights into the complex interaction between the placenta and fetal lung development.