Maternal biological age assessed in early pregnancy is associated with gestational age at birth

Maternal age is an established predictor of preterm birth independent of other recognized risk factors. The use of chronological age makes the assumption that individuals age at a similar rate. Therefore, it does not capture interindividual differences that may exist due to genetic background and environmental exposures. As a result, there is a need to identify biomarkers that more closely index the rate of cellular aging. One potential candidate is biological age (BA) estimated by the DNA methylome. This study investigated whether maternal BA, estimated in early and/or late pregnancy, predicts gestational age at birth. BA was estimated from a genome-wide DNA methylation platform using the Horvath algorithm. Linear regression methods assessed the relationship between BA and pregnancy outcomes, including gestational age at birth and prenatal perceived stress, in a primary and a replication cohort. Prenatal BA estimates from early pregnancy explained variance in gestational age at birth above and beyond the influence of other recognized preterm birth risk factors. Sensitivity analyses indicated that this signal was driven primarily by self-identified African American participants. This predictive relationship was sensitive to small variations in the BA estimation algorithm. Benefits and limitations of using BA in translational research and clinical applications for preterm birth are considered.


Results
Participant demographics. After filtering for all inclusion/exclusion criteria, the PREG cohort consisted of 177 women who self-identified as non-Hispanic Black (n = 89) or non-Hispanic White (n = 88). Meeting the same criteria, the GAPPS cohort included 52 women who all self-identified as non-Hispanic Caucasian and not as AA (see Table 1 for additional participant demographics). In order to maintain consistent terms across cohorts, EA and AA will be used to describe women who self-identified as non-Hispanic White/Caucasian or as Black/AA, respectively. The demographic attributes of the PREG EA subset and GAPPS cohort were, for the most part, more similar to each other than to the PREG AA subset. Overall, PREG AA women were more likely to be younger and to report higher levels of perceived stress, and were less likely to report taking daily prenatal vitamins. The preterm birth (PTB) rates for the PREG and GAPPS cohorts were similar (PREG = 5.1%, GAPPS = 5.8%), but PREG AA women had significantly earlier gestational age at delivery (GAAD) (Table 1; see Supplemental Figs. S1 and S2 for full distribution of GAAD).
After filtering DNAm data based on quality metrics, 262 and 94 person time points of data remained for the PREG and GAPPS cohorts, respectively. Subsequent division of measures based on gestational age (GA) at assessment resulted in 95 early pregnancy time points (EA = 49, AA = 46) and 167 late pregnancy time points (EA = 85, AA = 82) in PREG (Fig. 1). The GAPPS cohort consisted of 45 early pregnancy measurements and 49 late pregnancy measurements (Fig. 1). Since all participants provided a minimum of 2 samples during pregnancy, early and late pregnancy measurements were available for the majority of women (PREG = 86, GAPPS = 42). However, some participants had only early pregnancy time points (PREG = 9, GAPPS = 3) and some had only late pregnancy time points (PREG = 81, GAPPS = 7), since measurements collected mid-pregnancy did not meet early/late definitions. The mean GA at collection was 72.3 days for early pregnancy measurements and 213.2 days for late pregnancy measurements (standard deviation of 16.8 and 23.5 days, respectively; see Supplemental Fig. S3 for full distribution). The mean GAAD was not significantly different between those participants with early (PREG = 273.9, GAPPS = 275.6) and late (PREG = 275.9, GAPPS = 276.5) measures (p = 0.119 and p = 0.637 for PREG and GAPPS, respectively). BA estimates were nominally higher than chronological age (Table 1 and Fig. 4). Maternal chronological age and BA were moderately correlated in the PREG study (Pearson's r = 0.63 and 0.74 [EA = 0.42 and 0.62, AA = 0.67 and 0.73] in early and late pregnancy, respectively). In the GAPPS cohort, the Pearson correlation between chronological age and BA was 0.71 in early pregnancy and 0.66 in late pregnancy. Intraindividual variation in BA measurements was relatively low, with mean absolute differences between early and late pregnancy estimates of 3.1 years in PREG (standard deviation = 3.3) and 2.6 years in GAPPS (standard deviation = 1.5).
www.nature.com/scientificreports/
During preprocessing steps, 13 of the Horvath probes were identified as poor quality and removed from PREG, and 33 probes were removed in GAPPS (46 total unique Horvath probes between both cohorts). To assess the impact of different probe subsets, analyses were performed with both the largest possible Horvath probe set for each cohort (PREG = 340 [96%], GAPPS = 320 [91%]; see Fig. 1) and with the subset of Horvath probes shared in common between the two cohorts (n = 307 [87%]).
Association between BA and GAAD. The coefficients, standard errors, and p-values for all models tested with the PREG cohort are reported in Table 2. For each model, BA and GAAD were the predictor and response variables, respectively. In the full PREG sample, BA estimates outperformed chronological age in predicting GAAD (adjusted R-squared = 7.67% and 3.57%, respectively). The full PREG sample showed a significant relationship between the early pregnancy Horvath-derived BA estimates and GAAD (p-value threshold < 0.008 after Bonferroni adjustment for multiple testing). Higher BA estimates had a positive relationship with GAAD, indicating that an earlier GAAD is associated with younger BAs. Although the relationship between BA and GAAD was primarily supported by the AA subset, the significant relationship between BA and GAAD in the full sample remained after including a self-reported race variable in the model (p = 0.006). However, the relationship between early prenatal BA and GAAD was attenuated when retaining the maximum number of probes available (p = 0.005 in n = 340 probes [Supplementary Table S1]; p = 0.003 in n = 320 probes [Table 2]). There were no significant findings between GAAD and late pregnancy BA estimates. A marginally significant relationship between prenatal PSS and BA estimates in early pregnancy was identified (p = 0.009) in the full sample. Similar to the direction of the relationship identified in the GAAD analyses, a higher PSS was associated with a lower BA. A nominally significant relationship between BA and GAAD remained even after adjusting for perceived stress in early pregnancy (p = 0.012). A follow-up analysis in the GAPPS sample, composed entirely of women with EA ancestry, showed no significant relationships between BA and GAAD or between perceived stress and BA (Table 3). Given previously identified associations between tobacco use and DNAm 18 , the effect of smoking status on the BA-GAAD relationship was similarly considered. A nominally significant relationship between BA and GAAD remained after including smoking history (i.e., never, former, current) as a covariate in the sensitivity analyses (p = 0.015).
Table 2. Relationships between gestational age at delivery, perceived stress, and biological age estimates in the PREG cohort. Horvath probe sets were reduced to match the probes available for GAPPS. coef = coefficient, SE = standard error, EA = European American, AA = African American, PSS = perceived stress scale total score, GAAD = gestational age at delivery, BA = Horvath-derived biological age estimates. Maternal chronological age was included as a covariate in all models. * Survives Bonferroni adjustment for 6 tests, p-val < 0.008.
Table 3. Relationships between gestational age at delivery, perceived stress, and biological age estimates in the GAPPS replication cohort. Horvath probe sets were reduced to match the probes available for PREG. coef = coefficient, SE = standard error, PSS = perceived stress scale total score, GAAD = gestational age at delivery, BA = Horvath-derived biological age estimates. Maternal chronological age was included as a covariate in all models. * Survives Bonferroni adjustment p-val < 0.008.
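The Bonferroni threshold of 0.008 quoted for six tests can be reproduced with a short calculation. The sketch below assumes a family-wise alpha of 0.05, which is not stated explicitly in the text but is consistent with 0.05 / 6 ≈ 0.0083; the function names are illustrative and are not taken from the study's analysis code.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def survives(p_value: float, alpha: float = 0.05, n_tests: int = 6) -> bool:
    """True when p_value falls below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    # alpha = 0.05 is an assumption; six tests matches the Table 2 footnote.
    print(f"adjusted threshold = {bonferroni_threshold(0.05, 6):.4f}")
    # p-values quoted in the Results text above.
    for p in (0.003, 0.006, 0.009, 0.012):
        print(p, "survives" if survives(p) else "does not survive")
```

Under this assumption, the reported p-values of 0.003 and 0.006 fall below the threshold, while 0.009 and 0.012 do not, which matches the "marginally" and "nominally" significant wording used in the text.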
Evaluation of BA as a potential clinical marker for GAAD. Residualized BA scores were calculated by regressing BA onto chronological age and reflect the deviation between chronological age and BA. For PREG, residualized BA scores were calculated using the largest possible Horvath probe subset (n = 340). Overall, BA residualized scores were relatively stable over the course of pregnancy regardless of self-identified race and had significant between-person heterogeneity (Fig. 2). A significant relationship between BA baseline measurement (i.e., the model intercept), but not rate of change across pregnancy (i.e., the slope of the model), and GAAD was identified (see Supplement). This finding is in agreement with the results from the linear regression models showing early BA associated with GAAD. Critically, there was greater variability in the residualized BA scores in the PREG AA subset compared to the EA subset (Fig. 3a). Follow-up analyses revealed that BA residualized scores were sensitive to probe subset size and self-identified race. Residualized scores were calculated for both the full PREG (n = 340) and shared (n = 307) Horvath probe subsets, and self-identified Census-based race significantly predicted BA residuals for the shared probe set above and beyond the BA residuals for the full PREG BA subset (t-value = −5.89; Fig. 3b). The sensitivity of BA estimation to probe subset size and composition was further highlighted by comparing the correlation between BA and chronological age in the PREG and GAPPS cohorts, which had different subsets of Horvath probes available (Fig. 4).
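Residualized BA scores, as described in this section, are the residuals from a regression of BA on chronological age. The following is a minimal sketch of that idea in plain Python, not the authors' R code; the function names and the toy ages are invented for illustration only.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residualize_ba(chronological_age, biological_age):
    """Residuals of BA regressed on chronological age.

    A positive residual means BA is higher than expected for that
    chronological age; a negative residual means it is lower.
    """
    intercept, slope = fit_simple_ols(chronological_age, biological_age)
    return [ba - (intercept + slope * age)
            for age, ba in zip(chronological_age, biological_age)]


if __name__ == "__main__":
    # Invented toy values, not study data.
    ages = [22.0, 25.0, 29.0, 31.0, 36.0]
    bas = [27.5, 26.0, 33.0, 37.5, 38.0]
    for age, res in zip(ages, residualize_ba(ages, bas)):
        print(f"age {age:.0f}: residualized BA = {res:+.2f} years")
```

By construction these residuals average zero and are uncorrelated with chronological age in the sample used to fit the line, which is why they are read as age-independent deviations.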

Discussion
Excitement over the potential benefits associated with using BA to index personal risk liability for adverse health outcomes has prompted dozens of studies 7 . Indeed, such a biological marker could improve the accuracy of screening algorithms for multifactorial disorders. To our knowledge, this study is the first to examine the relationship between longitudinal measurements of prenatal maternal BA and GAAD. The results of this study highlight both potential benefits and caveats associated with using BA in translational research and clinical applications. Several characteristics of maternal prenatal BA are appealing for future follow-up studies assessing clinical utility. Importantly, early prenatal BA was most strongly associated with GAAD, which means that PTB risk assessments could occur in time to consider medical interventions and preventative measures. Further, this study observed large interindividual variation in baseline BA estimates, which remained relatively stable throughout pregnancy. Early prenatal BA was significantly associated with GAAD above and beyond other risk factors like maternal prenatal perceived stress and chronological age. These findings suggest that early prenatal BA may be a promising candidate for inclusion in a precision clinical obstetrics screening algorithm.
Although results from this study support the possibility of adopting BA for estimating risk for PTB, some critical observations also were noted. First, sensitivity analyses revealed that the relationship between early prenatal BA and GAAD was impacted by probe set composition. Based on these findings, researchers should take care when estimating BA and clearly report the number of probes used in BA calculations. Second, the strongest association signal was found in the AA subset of the PREG sample. Although this relationship remained significant in the full PREG cohort after adjusting for self-identified Census-based race and multiple testing correction, sensitivity analyses using residualized BA scores suggest that the reliability of BA may vary by genetic ancestry and/or demographic factors. These findings suggest that cryptic, currently unidentified factors may be influencing the predictive validity and reliability of DNAm-based BA estimation. The problem of genomically-informed risk assessments failing to generalize to non-European populations has received increasing attention not only because such results limit the utility of clinical assessments but also because they threaten to exacerbate existing racial health disparities 36 . Another issue is that the biological significance of the individual sites of DNAm included in BA algorithms is poorly understood 37,38 , which obscures identifying the specific molecular processes BA actually reflects 7 . This knowledge gap makes predicting factors that will influence generalizability challenging. Researchers must be careful when studying populations that include individuals from diverse backgrounds, especially given that most DNAm-based BA estimation algorithms work analogously to other methods that exhibit variable predictive validity by genetic ancestry (i.e., polygenic risk score calculation) 36 .
Although significant relationships were identified, the direction of the relationship between BA and GAAD was unexpected. Advanced biological aging is a putative driver of increased risk for negative health outcomes and would be expected in individuals with higher levels of perceived stress and pregnancies with a lower GAAD. In this case, the algorithm predicts that, on average, AA participants are biologically younger than their EA counterparts despite group differences in lifetime exposure to stressors that would predict greater positive deviations from chronological age. Given that a younger BA is associated with adverse outcomes during pregnancy, the results from this study may not support the traditional weathering hypothesis. The interpretation of BA-disease relationships may be complicated by the fact that risk for PTB is increased among both the youngest and oldest mothers 39 , rather than increasing over the lifetime like other age-related disorders. This nonlinear distribution between maternal chronological age and PTB could be similarly reflected in BA, so that any prominent deviations from mean BA, rather than advanced BA alone, may highlight those pregnancies at higher risk.
These findings contradict results from another study, which did not find a significant relationship between Horvath BA and GAAD, but did identify an inverse relationship between maternal BA estimated using another DNAm-derived BA algorithm and length of gestation 40 . However, other studies have similarly noted an unexpected direction of the association between DNAm-based BA and adverse pregnancy outcomes, including research assessing the relationships between the BA of infants at birth and maternal antenatal depression, PTB, and future psychiatric problems 41 . Contradictory relationships between fetal and placental telomere length, an alternative measure of cellular aging, and GAAD are also prevalent in the literature 34,35,42 . These results could arise from measurement variance that leads to unreliable BA estimates due to genetic and/or physiological status (e.g., pregnancy). The generalizability and reliability of genomic risk scores depend on the diversity and size of the training dataset, respectively. To our knowledge, no existing BA algorithm includes blood samples from pregnant women. As a result, BA estimates could be influenced by pregnancy-related DNAm remodeling. As the epigenetic aging field advances, BA estimators for specific populations have been established 31,43,44 , and the development of future algorithms should be tailored for birth outcomes research and include pregnant women.
Integrating DNAm-derived BA with other indices of cellular senescence (e.g., telomere length) could further increase our understanding of the molecular processes reflected in BA.
Overall, these results suggest that BA estimates hold potential to serve as a biomarker for PTB, but extreme care must be taken to assess the accuracy and generalizability of BA across a wide variety of genetic and demographic backgrounds. The ability to assess risk for PTB at the beginning of pregnancy would provide opportunities for early intervention and targeted medical care throughout gestation. Logistically, many attributes of DNAm-based BA make for a good candidate biomarker 45,46 . DNAm is a stable mark that can be measured reliably, and BA estimates are easily calculated using the Horvath method. In this study, DNAm was measured in peripheral blood, a tissue with a minimally invasive collection procedure that is already a normal part of pregnancy monitoring, posing no additional risk to patients. While more research is necessary to examine how reliably BA predicts GAAD in other samples, in the future BA should be considered for potential clinical applications.

Strengths and limitations.
To our knowledge, this is the largest study to investigate maternal BA during pregnancy and the first to examine the stability of prenatal BA and its relationship across time with GAAD. Major strengths of this study include the use of a primary and a replication cohort, both containing longitudinal measurements during pregnancy. The inclusion of a diverse cohort allowed for the investigation of BA differences by self-reported race. Finally, all analyses and hypotheses examined in this study were preregistered on the Open Science Framework 47 using the AsPredicted format.
The results of this study should be considered in the context of four primary study limitations. First, cross-study comparisons were complicated by variation in data collection protocols. Perceived stress was assessed at four study visits in PREG while only two measures were collected in GAPPS. This limitation would have been easier to resolve if more detailed information about GA at assessment were available for GAPPS participants (e.g., GA in days). Second, the two study populations differed significantly in demographic composition (Table 1). These differences were particularly problematic given that main effects of BA on GAAD were seen primarily in the PREG AA subsample. Additionally, notable demographic differences were observed between the PREG AA subsample, the PREG EA subsample, and the GAPPS EA cohort. It is possible that both measured and unmeasured demographic differences (e.g., differences in parity and personal pregnancy history) contributed to differences in GAAD and BA residuals. Future work will be needed to assess the impact of reproductive history characteristics (e.g., prior history of preterm delivery, parity) on biological aging. Third, neither the PREG nor the GAPPS samples had complete probe data for the full Horvath algorithm. The GAPPS sample was measured using a newer technology missing seventeen of the Horvath probes, and both samples had probes removed during quality control. It is not clear if and how these missing probes influenced the final results, but the strength of the association between early prenatal BA and GAAD was slightly attenuated in the maximum possible probe subset (n = 340) compared to the smaller probe subset for PREG (n = 307; see Supplement for results from analyses including all available probes). Finally, the PREG and GAPPS participants were generally healthy women with uncomplicated pregnancies due, in part, to exclusion criteria related to placental and amniotic abnormalities and hypertensive disorders.
The exclusion of heterogeneous causes of PTB putatively increases statistical power for genetic research at the cost of limiting observed biological variability. Future studies will be needed to characterize maternal BA stability and correlates in high-risk pregnancies.

Study cohort. Pregnancy, Race, Environment, Genes (PREG). The Pregnancy, Race, Environment, Genes (PREG) Study is a prospective longitudinal cohort assessing the relationship between epigenetic factors, environmental exposures, and pregnancy outcomes 48 . Self-report questionnaires and maternal peripheral blood samples were collected up to four times throughout pregnancy. Inclusion criteria at enrollment were (1) singleton pregnancy conceived without assisted reproductive technology, (2) mother was 18-40 years old with no diagnosis of diabetes, (3) enrollment before 24 completed weeks of gestation, and (4) mother and father both self-identified as White or both as Black, without Hispanic or Middle Eastern ancestry. The rationale for limiting the cohort by ancestry was to maximize the statistical power for genetic/epigenetic analyses and to investigate the contribution of environmental and epigenetic factors to perinatal health disparities. Exclusion criteria included diagnosis of maternal blood pressure disorders (e.g., preeclampsia), fetal congenital anomalies, placental or amniotic anomalies (e.g., placenta previa, polyhydramnios), fewer than three study time points completed, or use of a cerclage. GA was confirmed by ultrasound. GA at each study visit and GAAD were recorded in days since conception.

Replication cohort. Global Alliance to Prevent Prematurity and Stillbirth (GAPPS).
Maternal blood specimens were obtained from the Global Alliance to Prevent Prematurity and Stillbirth (GAPPS) BioServices repository. GAPPS participant selection criteria matched most PREG study inclusion and exclusion criteria to facilitate cross-study comparisons. AA samples were not available from GAPPS at the time of study initiation. Maternal peripheral blood samples were collected along with self-report questionnaires up to three times across pregnancy. Due to the smaller number of total possible study visits, GAPPS participants were included if they had at least two time points of data. GAAD was reported in days since conception, but GA at each study visit was reported as trimester (i.e., 1, 2, or 3).
Biological age measurement. BA was estimated from genome-wide DNAm measurements using the Horvath method 12 . The Horvath algorithm calculates BA from DNAm levels at 353 genomic loci, each measured by a single probe. Most of the loci contribute only modestly to the final age estimate (i.e., median weight is 6 weeks; range is 0.00000594 to 3.07 years) 12 . Both PREG and GAPPS measured DNAm from peripheral blood specimens using Illumina microarray technology. The PREG study used the Infinium HumanMethylation450 BeadChip (450k); GAPPS, the Infinium EPIC BeadChip (850k). The 850k array is a newer sister technology to the 450k and includes 92% of the 450k probe set. The newer 850k array design omits 17 of the Horvath probes (4.8%). Despite the probe set differences, previous reports have suggested that the Horvath age estimates are only slightly underestimated in peripheral blood when these probes are missing (r > 0.91, n = 172) 49 . [...] specimen placement were randomized on the array, but all specimens from a single participant were loaded onto a single array to minimize potential batch effects (see Supplement). Before calculating BA, the quality of DNAm microarrays was assessed (Fig. 1) using the Bioconductor R package minfi 51 . Probes with either poor signal intensity or known cross-hybridization activity were removed in accordance with established best practices (see Supplement for additional details). Principal components analysis was used to identify potential experimental artifacts (e.g., batch effects), and based on this analysis, probe Beta-values were adjusted for positional effects using ComBat 52 . BA estimates for each specimen were calculated from adjusted Beta-values using the wateRmelon R package 53 . All statistical analyses were conducted in the R environment (version 3.5) 54 .
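The weighted-sum structure of a DNAm clock, and the way missing probes remove terms from that sum, can be illustrated with a toy sketch. Everything here is hypothetical: the probe names, weights, and intercept are invented, and the published Horvath clock additionally applies a calibration transform to the weighted sum, which this sketch omits. Simply dropping the terms for missing probes is an assumption made for illustration and is not a description of how the wateRmelon package handles them.

```python
def weighted_methylation_score(betas, weights, intercept=0.0):
    """Weighted sum of Beta-values over the clock probes that were measured.

    betas:   mapping of probe ID -> Beta-value (0 to 1) for one specimen
    weights: mapping of probe ID -> clock coefficient
    Returns (score, used_probes, missing_probes).
    """
    used = sorted(p for p in weights if p in betas)
    missing = sorted(p for p in weights if p not in betas)
    score = intercept + sum(weights[p] * betas[p] for p in used)
    return score, used, missing


if __name__ == "__main__":
    # Hypothetical three-probe "clock"; real clocks use hundreds of probes.
    toy_weights = {"probe_a": 1.5, "probe_b": -0.75, "probe_c": 0.25}
    # probe_c failed quality control in this invented specimen.
    toy_betas = {"probe_a": 0.5, "probe_b": 0.25}
    score, used, missing = weighted_methylation_score(
        toy_betas, toy_weights, intercept=2.0
    )
    coverage = len(used) / len(toy_weights)
    print(f"score = {score}, probes used = {len(used)} ({coverage:.0%})")
    print("missing:", missing)
```

Reporting the number of probes used alongside each estimate, as this function does, is one way to make probe-subset differences between cohorts visible.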
Perceived stress measurement. The Perceived Stress Scale (PSS) is a ten-question validated self-report instrument for assessing the magnitude and severity of recent stress levels 55 . Each item is a 5-point Likert-type question, with 0 indicating "never" and 4 indicating "very often". Possible scores range from 0 to 40 with higher scores indicating greater levels and interference of perceived stress. The PSS was administered at every visit for the PREG study and in the second and third trimester health questionnaires for the GAPPS study. PSS scores have been associated with advanced BA and with greater vulnerability to depressive symptoms precipitated by stressful life events. For this study, PSS scores were used to index each participant's feelings of cumulative stress and control over the events in her life.
Data analysis. Linear regression was used to test the relationship between BA estimates from early and late pregnancy with GAAD and prenatal perceived stress. To harmonize the data across studies while maintaining sample size, early and late prenatal DNAm measurements were defined in PREG as blood specimens obtained at a GA less than 100 days and after 180 days, respectively. In GAPPS, early pregnancy was defined as measurements collected in the first trimester while late pregnancy measurements were those obtained in the third trimester. To control for individual differences in chronological age, maternal age (collected at the time of study enrollment) was included as a covariate in all analyses. Lifetime smoking status (i.e., never, former, current), self-reported race, and prenatal perceived stress levels were included as covariates in the regression models for sensitivity analyses. Cell-type proportion estimates were not included because the Horvath BA algorithm is robust to biases related to cell-type heterogeneity 12 .
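The two analysis steps described above, assigning PREG visits to early or late windows and regressing GAAD on BA with maternal age as a covariate, can be sketched as follows. This is an illustrative plain-Python version, not the authors' R code: the function names, the hand-rolled least-squares solver, and all toy records are assumptions made for the example.

```python
def pregnancy_window(ga_days):
    """PREG-style window: 'early' below 100 days, 'late' above 180 days."""
    if ga_days < 100:
        return "early"
    if ga_days > 180:
        return "late"
    return None  # mid-pregnancy visits meet neither definition


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Invented toy records: (GA at visit in days, BA, maternal age, GAAD in days).
    visits = [(70, 30.1, 25, 272), (85, 35.4, 28, 280), (150, 31.0, 27, 275),
              (60, 28.2, 30, 268), (95, 40.3, 33, 284), (210, 33.0, 22, 277),
              (75, 33.5, 22, 279)]
    early = [v for v in visits if pregnancy_window(v[0]) == "early"]
    # GAAD ~ BA + maternal age, fit on early-pregnancy visits only.
    coefs = fit_ols([(ba, age) for _, ba, age, _ in early],
                    [gaad for *_, gaad in early])
    print(dict(zip(("intercept", "ba", "maternal_age"), coefs)))
```

Sensitivity covariates such as smoking status or self-reported race would enter as additional columns in `rows`; standard errors and p-values are omitted here to keep the sketch short.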
Prenatal BA trajectories were characterized using linear latent growth curve models evaluated in Mplus and built using the R package MplusAutomation 56 . The purpose of the growth curve model was to quantify the interindividual difference in the baseline and rate of change of BA estimates across pregnancy.
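For orientation, a linear latent growth curve model of this kind is commonly written in the generic form below. This is a textbook-style sketch under stated assumptions, not the authors' Mplus specification (which is described only in the Supplement); the symbols are introduced here for illustration. BA for participant i at measurement occasion t is decomposed into a person-specific baseline and rate of change, and GAAD is then regressed on both latent factors.

```latex
% Within-person trajectory: \lambda_t is the time score for occasion t
\mathrm{BA}_{ti} = \eta_{0i} + \eta_{1i}\,\lambda_t + \varepsilon_{ti}

% Between-person variation in baseline (intercept) and rate of change (slope)
\eta_{0i} = \alpha_0 + \zeta_{0i}, \qquad \eta_{1i} = \alpha_1 + \zeta_{1i}

% Distal outcome regressed on the latent intercept and slope
\mathrm{GAAD}_i = \gamma_0 + \gamma_1\,\eta_{0i} + \gamma_2\,\eta_{1i} + \delta_i
```

In this notation, the interindividual differences mentioned above correspond to the variances of \zeta_{0i} and \zeta_{1i}, and the finding reported in the Results (baseline, but not rate of change, related to GAAD) would correspond to a nonzero \gamma_1 with \gamma_2 not distinguishable from zero.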
Informed consent and ethical approvals. The PREG study received Virginia Commonwealth University Institutional Review Board approval (14000) and all research was performed in accordance with relevant guidelines and regulations. Written confirmation of informed consent was obtained from each participant.
Preregistration. Analyses presented in this manuscript were preregistered on the Open Science Framework and are available at https://osf.io/6a9db. All of the original preregistered study questions were addressed in these analyses. However, there are several notable deviations from the analyses outlined in the preregistration document. Originally, two BA algorithms prominently featured in the literature, the Horvath and Hannum methods, were selected for this study. However, several probes included in the Hannum algorithm were removed during quality control processing steps. The Hannum method is known to be more sensitive to missing probes, potentially leading to biased BA estimates 50 . As per the original preregistered study design, the same analyses were completed with the Hannum clock (see Supplement for relevant methods and results). Interestingly, both epigenetic clocks performed similarly in these samples, suggesting they are capturing the same biological phenomenon. Additionally, methods and results for secondary analyses examining the use of Y chromosome probes to detect cell-free DNA contamination of maternal samples are available in the Supplement. Finally, a more parsimonious model was selected to adjust for chronological age variability in the models. Rather than adopting a two-step approach in which BA is first regressed on chronological age before modeling the resulting residual, maternal age was simply included as a covariate in all analyses.

Data availability
The preregistration document and R code used to analyze the data and generate figures are available on the Open Science Framework (OSF) project landing page (https://osf.io/sqmzg). Sharing PREG and GAPPS study data is limited by Institutional Review Board agreements and participant consent forms, which restrict openly sharing individual-level DNAm measures. Anyone interested in data access or collaboration is encouraged to contact Dr. Timothy P. York (timothy.york@vcuhealth.org) for more information.