Introduction

In the United States, more than 10% of all live births are preterm (defined as <37 completed weeks of gestation). Preterm birth (PTB) remains a leading cause of infant mortality and morbidity in the US and worldwide,1,2 and can lead to a cascade of health problems later in life.3,4 Of all PTBs, about 70% are spontaneous, following either preterm labor (i.e., regular contractions and cervical changes at <37 weeks of gestation) or preterm premature rupture of membranes; the remaining PTBs are medically indicated and occur largely because of gestational complications. Women of African ancestry are known to bear a disproportionately higher rate of PTB than women of other ethnicities.5 However, the risk factors for PTB that may underpin this disparity remain largely unknown.

Maternal perceived stress, in the form of acute, chronic, pregnancy-related and/or other life event-related stressors, is widespread in the general population and even more so among African Americans (AAs).6 Maternal stress is potentially an important modifiable social determinant of maternal and child health in the US.7 Although many epidemiologic studies have attempted to link maternal stress with risk of PTB, the results have been inconclusive.7,8,9,10,11,12,13,14 Such inconsistencies may be due to multiple factors. Aside from study design or methodological issues, one possible contributor is effect modification by maternal genetic susceptibility. It is likely that women of different genetic backgrounds vary in their biological vulnerability to stress, leading to differences in stress−PTB associations. Boyce has proposed the “orchid vs dandelion” theory,15 which also suggests that certain genetic variants can increase a person’s susceptibility to stressors. This possibility is further supported by previous epidemiological studies that demonstrated a significant impact of the interaction between maternal genes and perceived stress on multiple child health outcomes.16,17,18 Although there is an increasing number of genome-wide association studies (GWAS) of PTB,19,20 few studies have systematically investigated maternal gene × stress interactions on PTB, especially on a genome-wide scale.

To fill the aforementioned research gap, in this study, we performed genome-wide interaction analyses to explore common single nucleotide polymorphisms (SNPs) that may interact with maternal perceived stress (lifetime and during pregnancy) to affect PTB risk in AA women. We used a two-stage case−control study design (including the discovery and the replication stages) and focused on spontaneous PTB (sPTB) to reduce phenotypic heterogeneity. We further performed a meta-analysis combining the discovery and replication samples.

Materials and methods

The study cohort and study population in the discovery stage

The study participants in the discovery stage were enrolled in the Boston Birth Cohort (BBC), an ongoing longitudinal, multiethnic, predominantly urban, low-income minority birth cohort designed to identify gene−environment interactions associated with prematurity and other adverse birth outcomes.21 Briefly, since 1998, mother−infant dyads have been recruited 1–3 days post delivery and interviewed using a standard questionnaire at Boston Medical Center. Pregnancies resulting from in vitro fertilization, multiple gestations, and/or pregnancies with fetal chromosomal abnormalities or major birth defects were excluded. After giving written informed consent, each enrolled mother was interviewed using a standard questionnaire to collect data on demographic variables, lifestyle and dietary intake. A maternal blood sample was obtained within 24–72 h after delivery and an umbilical cord blood sample was obtained at delivery. The study protocol was approved by the Institutional Review Boards of Boston University Medical Center, and of the Johns Hopkins Bloomberg School of Public Health.

As we reported previously,22 698 biologically unrelated AA mothers of preterm babies (PTB, <37 weeks of gestation) and 1035 AA mothers of term babies (TB, ≥37 weeks of gestation), who were frequency matched with the cases on maternal country of origin (Haitian or non-Haitian), maternal age at delivery (±5 years), parity and year of delivery, were successfully genotyped for the GWAS of PTB in the BBC. After removing 237 mothers with medically indicated PTB and 6 mothers with missing data for maternal perceived stress, this study included 457 mothers of sPTB (cases) and 1033 mothers of TBs (controls) as the discovery sample.

Phenotype definition and covariates

Gestational age was assessed by early (<20 weeks) prenatal ultrasound and/or the first day of the last menstrual period. Spontaneous PTB (sPTB) was defined as a birth occurring secondary to documented active preterm labor (uterine contractions with cervical effacement and dilation at <37 weeks) or premature rupture of membranes at <37 weeks without uterine contractions or both. Early sPTB was defined as a birth occurring <33 weeks of gestation, and late sPTB as a birth occurring from 33 to 36 6/7 weeks.
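The gestational-age cut-offs above can be written down as a small helper. This is only an illustrative sketch: the function name and the labels are hypothetical, and the function bins gestational age only. Whether a preterm birth counts as spontaneous still depends on the clinical documentation described above, which the code does not model.

```python
def classify_gestational_age(weeks: int, days: int = 0) -> str:
    """Bin a gestational age (completed weeks plus extra days) by the cut-offs in the text.

    Hypothetical helper for illustration; it does not decide whether a
    preterm birth was spontaneous or medically indicated.
    """
    if weeks < 0 or not 0 <= days <= 6:
        raise ValueError("weeks must be >= 0 and days must be between 0 and 6")
    total_days = weeks * 7 + days
    if total_days < 33 * 7:   # <33 weeks
        return "early preterm"
    if total_days < 37 * 7:   # 33 0/7 to 36 6/7 weeks
        return "late preterm"
    return "term"             # >=37 completed weeks
```

For example, a birth at 36 weeks and 6 days falls in the late preterm bin, while 37 weeks and 0 days is term.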

Maternal perceived lifetime stress and stress during pregnancy were self-reported by mothers using a standard questionnaire interview, with the following two questions: (1) “How would you characterize the amount of stress in your life in general?” and (2) “How would you characterize the amount of stress in your life during this pregnancy?” The response options for both questions included: 0 = not stressful (or low), 1 = average, or 2 = very stressful (or high). Other maternal characteristic variables were also collected through a questionnaire interview,23 including: smoking during pregnancy, which was classified as “never smoker” (did not smoke cigarettes throughout the index pregnancy), “former smoker” (only smoked in the 3 months before pregnancy or during the first trimester), or “continuous smoker” (smoked continuously from pre-pregnancy to delivery); and social support from the baby’s father, which was classified as “none”, “a little”, “a good amount”, or “an excellent amount.” Pre-pregnancy body mass index (BMI) was calculated as self-reported pre-pregnancy weight (kg) divided by height squared (m²).
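As a minimal sketch of the variable coding described above, the snippet below encodes the three stress levels and computes BMI from weight and height. The names `STRESS_CODES` and `pre_pregnancy_bmi` are hypothetical and are not taken from the study's analysis code.

```python
# Ordinal coding of perceived stress, as described in the text (0 = low, 1 = average, 2 = high).
STRESS_CODES = {"low": 0, "average": 1, "high": 2}


def pre_pregnancy_bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2
```

A mother reporting 70 kg and 1.75 m would have a BMI of 70 / 1.75², roughly 22.9 kg/m².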

Genome-wide genotyping and genetic ancestry estimation

DNA samples from the proposed mothers were quantified and then genotyped by the Center for Inherited Disease Research using the Illumina HumanOmni 2.5 array. The raw genome-wide genotyping and phenotypic data from the BBC have been deposited in the NIH dbGaP database (entry # phs000332.V3.P2). Detailed information on genotyping and quality control steps can be found in our previous publication.22 A total of 2,160,368 SNPs from 1733 biologically unrelated mothers passed the quality control steps and were available for SNP imputation. Phasing was performed using SHAPEIT24 and SNP imputation was done using IMPUTE225 software, with all individuals in the 1000 Genomes Project (1000GP) as the reference panel.

Genetic ancestry for each subject was then computed by principal component analysis (PCA) using Eigenstrat,26 with all individuals in the 1000GP as the reference. Those mothers (n = 9) whose estimated genetic ancestry was inconsistent with self-reported AA ancestry were removed from the subsequent data analyses, as we previously reported.22 The estimated genetic ancestry, represented by the first three principal components from PCAs, was then included as three covariates in subsequent analyses.

Statistical analyses in the discovery stage

Population characteristics in the PTB case and control groups were compared using t tests for continuous variables or chi-square tests for categorical variables. The genome-wide genotyped SNP × stress interactions were tested using the conventional 1-degree of freedom (d.f.) interaction test. We added each SNP (under an additive genetic model, all with minor allele frequency [MAF] > 2%), maternal perceived stress (lifetime or during pregnancy, coded as 0 = low, 1 = average, 2 = high, and treated as an ordered discrete variable) and their interaction term into a logistic regression model in PLINK (v1.07),27 with adjustment for covariates including genotyping batch, maternal genetic ancestry, age at delivery, marital status, parity, social support from the baby’s father and newborn sex. The genome-wide suggestive and significance thresholds were set as P < 5.0 × 10⁻⁷ and P < 5.0 × 10⁻⁸, respectively. Manhattan and quantile−quantile (Q−Q) plots were generated using the R package GWASTools28 to present genome-wide interaction associations. For any genomic loci having significant or suggestive interactions with maternal stress, imputed SNPs nearby (±1 Mb) with MAF > 0.02 were further analyzed for their interactions with maternal stress on risk of sPTB using the logistic regression model as described above. The LocusZoom plot was then generated using a published web tool29 to locate the most significant locus or SNP.
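For readers who prefer a formula, the 1-d.f. interaction test described above corresponds to a logistic model of the following general form. The notation (G, E, C and the coefficient symbols) is introduced here for illustration only and is an assumption about how the model would usually be written, not a transcription of the analysis code.

```latex
\operatorname{logit}\,\Pr(\mathrm{sPTB}=1)
  = \beta_0 + \beta_G\,G + \beta_E\,E + \beta_{GE}\,(G \times E)
    + \boldsymbol{\gamma}^{\top}\mathbf{C},
\qquad
H_0:\ \beta_{GE} = 0 \quad (1\ \text{d.f.})
```

Here G is the minor-allele count (0, 1 or 2) under the additive model, E is the stress score (0, 1 or 2), and C collects the adjustment covariates (genotyping batch, genetic ancestry, age at delivery, marital status, parity, paternal social support and newborn sex). The reported interaction P value is the one attached to the single coefficient on the product term.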

The identified interactions in the discovery sample were validated in the replication samples as described below. For the validated gene × stress interactions, additional analyses were also performed to test whether the identified interaction varied by different PTB subtypes (i.e., early vs late sPTB) or by newborn sex.

Replication studies

The replication study was conducted using the epidemiological and GWAS data from the NICHD Genomic and Proteomic Network for Preterm Birth Research (the GPN study, dbGaP entry #phs000714.v1.p1). The GPN study, as reported previously by Zhang et al.,30 was designed to investigate genome-wide associations of sPTB in multiethnic populations. Mothers of sPTB (defined as birth at 20–33 6/7 weeks) and of TB controls (defined as birth at 39–41 6/7 weeks) were matched on race/ethnicity, maternal age, and parity, and genotyping was performed using the Affymetrix Genome-wide Human SNP Array 6.0. After data cleaning using similar criteria as in the BBC samples and removing individuals with missing data on maternal stress or on the targeted SNP, the GPN study had 337 AA mothers and 738 Caucasian mothers for replication. Maternal stress was self-reported based on a questionnaire interview that asked how often the mother “felt nervous and stressed.” Due to the limited number of AA mothers, we recoded maternal stress into three categories: “low” (if the mothers reported “never” or “almost never” stressed), “average” (if the mothers reported “sometimes” or “fairly often”), or “high” (if the mothers reported “very often”). The gene × stress interactions were analyzed using the conventional 1-d.f. test based on the logistic regression model, with adjustment for genetic ancestry, maternal age at enrollment, parity, marital status, and newborn sex. For the imputed SNP rs35331017, the best-guessed genotype was applied for interaction tests.

Results

Population characteristics of the discovery sample

After data-cleaning steps (see “Materials and methods”), the discovery sample of this study included 457 AA mothers of sPTB (cases) and 1033 AA mothers delivering at term (controls) from the BBC. Their population characteristics are presented in Table 1. Compared to controls, mothers of sPTB were more likely to have high lifetime stress (16.2% vs 10.2%, P = 0.001), to be unmarried (72.9% vs 66.4%, P = 0.013) and to smoke during pregnancy (17.1% vs 9.4%, P < 0.001); and mothers of sPTB were less likely to receive substantial support from the baby’s father (P = 0.001) or from other family members/friends (P = 0.003). In comparison, maternal stress during pregnancy was only marginally different between these two groups (23.6% vs 18.1%, P = 0.071).

Table 1 Population characteristics of 1490 African-American mothers from the Boston Birth Cohort.

Genome-wide screening for maternal SNP × maternal perceived stress interactions

At the discovery stage, we analyzed genome-wide SNP interactions with maternal lifetime stress and with maternal stress during pregnancy, separately, on risk of sPTB. After adjustment for covariates (see “Materials and methods”), we found a suggestive genome-wide interaction between rs11795005 at 9p24.1-p23 and maternal lifetime stress on risk of sPTB (PG × E = 1.4 × 10⁻⁷, Fig. 1a). There was no evidence of genomic inflation in our data analyses, as demonstrated by the Q−Q plot (Fig. 1b). The identified SNP, rs11795005, is located within an intronic region of the protein-tyrosine phosphatase receptor Type D (PTPRD) gene. Following SNP imputation of this genomic region, we identified another SNP, rs35331017, that demonstrated a genome-wide significant interaction with maternal lifetime stress (PG × E = 4.7 × 10⁻⁸, Fig. 1c) on risk of sPTB. Of note, rs35331017 and rs11795005 are in strong linkage disequilibrium (r = 0.97). In comparison, we did not identify any genome-wide significant or suggestive regions interacting with maternal stress during pregnancy on risk of sPTB (Supplementary Fig. S1).
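Genomic inflation is judged here from the Q−Q plot. A common numeric companion to that plot is the genomic inflation factor (often written λ), which is not reported in the text. The sketch below shows how it is typically computed from a list of P values from 1-d.f. tests. The function name is hypothetical, and values close to 1 are conventionally read as little or no inflation.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Median observed 1-d.f. chi-square statistic divided by its expected median.

    Each two-sided P value (0 < p <= 1) is converted back to a chi-square
    statistic via the squared standard-normal quantile.
    """
    nd = NormalDist()
    chi2_stats = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

In practice the input would be the full vector of genome-wide interaction P values, not a handful of numbers.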

Fig. 1: Manhattan, quantile−quantile (Q−Q), and LocusZoom plot of the genome-wide interaction associations with maternal lifetime stress on spontaneous PTB, in 1490 African-American mothers from the Boston Birth Cohort.

a–c Manhattan plot, Q−Q plot and LocusZoom plot, respectively, for the genome-wide interaction analyses performed using the conventional 1-degree of freedom interaction test based on the multiple logistic regression models, adjusted for genotyping batch, maternal genetic ancestry, age at delivery, parity, marital status, social support from the baby’s father and newborn sex.

Rs35331017 × maternal stress interaction and sensitivity analyses

SNP rs35331017 is a T-nucleotide insertion (I)/deletion (D) polymorphism with a minor allele (or D allele) frequency of 5% in the BBC. This variant was analyzed under a dominant model (the II genotype vs the ID/DD genotype) in subsequent analyses, given that only four women carried the DD genotype. Table 2 presents the odds ratios (ORs) of maternal lifetime stress on risk of sPTB, stratified by the rs35331017 genotypes. Among women carrying the rs35331017-II genotype, those reporting average and high lifetime stress had 1.5-fold (95% CI = 1.1–1.9, P = 0.007) and 2.1-fold (95% CI = 1.4–3.1, P = 0.0003) higher odds of sPTB, respectively, compared to mothers reporting low lifetime stress. However, in women carrying the rs35331017-ID/DD genotype, those reporting high lifetime stress had lower odds of sPTB than women reporting low lifetime stress (OR = 0.05, 95% CI = 0.01–0.32; P = 0.001). The interacting effects of the rs35331017 genotype and maternal stress are presented in Fig. 2, which further demonstrates a positive dose−response association between maternal lifetime stress and risk of sPTB only in mothers carrying the rs35331017-II genotype. We then carried out similar analyses to explore whether there was an interaction effect between the rs35331017 genotype and maternal stress during pregnancy on the risk of sPTB. We found a similar pattern, although the effect size was relatively modest (Table 2 and Fig. 2, PG × E = 1.2 × 10⁻⁵).
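To make the link between stratified odds ratios and an interaction test concrete, the sketch below works through a deliberately simplified case: a binary genotype (as in the dominant model), a binary exposure (for example high vs low stress), and no covariate adjustment. In that setting the multiplicative interaction equals the ratio of the two stratum-specific ORs, with a large-sample Wald test on its logarithm. The function names are hypothetical and the counts in the usage note are made up. They are not the counts behind Table 2, and this is not the adjusted model used in the study.

```python
import math


def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Unadjusted odds ratio from a 2x2 table (all four counts must be non-zero)."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


def interaction_from_strata(stratum_ref, stratum_alt):
    """Ratio of stratum-specific ORs with a Wald z statistic and two-sided P value.

    Each stratum is a tuple:
    (exposed cases, exposed controls, unexposed cases, unexposed controls).
    """
    log_ratio = math.log(odds_ratio(*stratum_alt)) - math.log(odds_ratio(*stratum_ref))
    # Large-sample variance of the log ratio: sum of reciprocals of all eight cell counts.
    se = math.sqrt(sum(1.0 / n for n in tuple(stratum_ref) + tuple(stratum_alt)))
    z = log_ratio / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(log_ratio), z, p_two_sided
```

With hypothetical counts `(20, 10, 10, 20)` in one genotype stratum (OR = 4) and `(10, 20, 20, 10)` in the other (OR = 0.25), the ratio of ORs is 0.0625, meaning the exposure-outcome association runs in opposite directions in the two strata.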

Table 2 Stratified analyses by genotypes of the PTPRD rs35331017 variant for the association between lifetime stress and spontaneous PTB in the mothers from the Boston Birth Cohort.
Fig. 2: Joint associations between rs35331017 in the PTPRD gene and maternal perceived stress on sPTB in African-American mothers from the BBC.

Y axis reflects the odds ratio (OR) and 95% confidence interval (CI) of sPTB risk for each subgroup stratified by the genotype of rs35331017 and maternal lifetime stress (a) or stress during pregnancy (b), with low-stress mothers carrying the rs35331017-II genotype as the reference group. This analysis was conducted based on multiple logistic regression models adjusted for genotyping batch, maternal genetic ancestry, age at delivery, marital status, parity, social support from the baby’s father and newborn sex. #0.05 < P < 0.1; **P < 0.01; ***P < 0.001; ****P < 0.0001.

We further performed sensitivity analyses to assess the robustness of the maternal rs35331017 × lifetime stress interaction across PTB subtypes. As presented in Supplementary Table 1, the effect size and direction of the rs35331017 × maternal lifetime stress interaction were comparable for early sPTB (<33 weeks; PG × E = 0.0005) and late sPTB (33–36 weeks; PG × E = 1.3 × 10⁻⁶). Lastly, we stratified our interaction analyses by newborn sex, which revealed that the rs35331017 × maternal lifetime stress interaction on risk of sPTB tended to be stronger among females (PG × E = 1.8 × 10⁻⁶) than males (PG × E = 0.046) (Supplementary Table 2).

Replication studies and meta analyses

The replication sample includes 337 AA mothers and 738 Caucasian mothers from the GPN study, with their population characteristics shown in Supplementary Table 3. Among the 337 AA mothers, a marginally significant rs35331017 × maternal stress interaction was observed on sPTB (PG × E = 0.088), with the association in the same direction as in the BBC. Interestingly, we found an rs35331017 × maternal stress interaction for sPTB risk in the 738 Caucasian mothers (PG × E = 0.023, Table 3 and Supplementary Fig. 2), as well as in the combined cohort of AA and Caucasian mothers (P = 0.009) from the GPN study (which included adjustment for maternal ancestry). In the meta analyses combining the discovery and replication samples, we identified an even stronger rs35331017 × maternal stress interaction on risk of sPTB (P = 4.5 × 10⁻¹⁰).
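The text does not state which meta-analysis method was used. One common choice for combining interaction estimates across cohorts is a fixed-effect, inverse-variance-weighted meta-analysis, sketched below under that assumption. The function name is hypothetical; the inputs would be per-cohort interaction coefficients (log odds scale) and their standard errors, none of which are given in this section.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted pooling of per-study estimates.

    Returns (pooled beta, pooled standard error, z statistic, two-sided P value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]          # weight = 1 / variance
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return pooled_beta, pooled_se, z, p_two_sided
```

Because the pooled standard error shrinks as studies are added, estimates pointing in the same direction can yield a smaller combined P value than any single cohort, which is the pattern reported above.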

Table 3 The main effects of maternal rs35331017, maternal stress and their interaction effects on spontaneous PTB in the mothers from the Boston Birth Cohort and from the GPN study.

Discussion

This is the first study to investigate maternal genome-wide gene × maternal perceived stress interactions on the risk of sPTB in AAs, a high-risk population for PTB. Specifically, we have identified a genome-wide significant interaction between maternal PTPRD genetic variants and maternal lifetime stress in AA women from the BBC, and further replicated this interaction in AA and Caucasian women from the GPN study. These findings, if further validated in other large populations, underscore the importance of considering maternal genetic factors and G × E interactions when assessing socio-environmental determinants of sPTB.

Our study may contribute new insight into the relationship between maternal stress and sPTB. To our knowledge, previous studies evaluating associations between maternal perceived stress and risk of sPTB are inconclusive. Some studies have reported maternal perceived stress as a risk factor for sPTB,8,11,14 while others demonstrated a negative association between them.9,10,13 Findings from the current study may help to explain such inconsistent findings. Our data indicated that the associations between maternal lifetime stress and risk of PTB may vary in women with different genetic backgrounds. Among mothers carrying the rs35331017-II genotype, those reporting high lifetime stress were at about twofold higher risk of sPTB than those reporting low lifetime stress. However, this association was reversed among mothers carrying the DD or ID genotype at rs35331017. A similar pattern, but with a relatively modest effect size, was observed for the interaction between rs35331017 genotypes and maternal stress during pregnancy on risk of sPTB. Furthermore, we observed that, among all the genotyped mothers, high lifetime stress was more prevalent in mothers of sPTB than in mothers of TB, while maternal stress during pregnancy was only marginally different between these two groups. One possible explanation is that lifetime stress is better able to capture cumulative stress that may affect maternal preconception health as well as health during pregnancy, which is consistent with a life course health framework.

Although the underlying mechanism linking maternal perceived stress with risk of sPTB is not yet clear, several biological pathways have been implicated. First, the hypothalamic-pituitary-adrenal axis, a major endocrine pathway, is activated in response to stress. Maternal stress triggers norepinephrine and cortisol release, which results in activation of placental corticotrophin-releasing hormone (CRH) gene expression. CRH gradually increases as the pregnancy progresses, and serves as a “placental clock” determining the timing of parturition.31 It has been hypothesized that in mothers experiencing high levels of stress, increased CRH levels may alter the timing of parturition and ultimately promote PTB.32,33 Another biological pathway implicated in the pathophysiology of sPTB is inflammation/infection. Higher stress levels are associated with higher circulating levels of inflammatory markers (such as C-reactive protein, interleukin (IL) 1 beta and IL-6), and lower levels of anti-inflammatory cytokines such as IL-10,34,35 leading to T-helper 1 skewing and increased sPTB risk.

The rs35331017 × stress interaction was further replicated in Caucasian women from the GPN study, suggesting a shared effect across different ethnic populations. It is largely unknown how the identified maternal rs35331017 × maternal perceived stress interaction may affect sPTB risk. SNP rs35331017 is a T-nucleotide insertion/deletion polymorphism located in the intronic region of the gene coding protein-tyrosine phosphatase receptor Type D (PTPRD). Protein PTPRD, which is highly expressed in the human brain, plays a potential role in psychopathology because it can bidirectionally induce pre- and post-synaptic differentiation of neurons by mediating interaction with IL1 receptor accessory protein and IL1 receptor accessory protein like 1.36 Previous studies have linked the PTPRD gene with multiple traits, such as substance addiction37,38 and gestational diabetes,39 which show some degree of association with PTB. For example, it is likely that substance addiction during pregnancy, which may directly or indirectly target placental transport systems between maternal and fetal circulation, may lead to accumulated levels of serotonin and norepinephrine in the intravillous space.40 Increased levels of these neurotransmitters may be associated with an increased risk of sPTB via altering CRH levels, a shared pathway underlying maternal stress−PTB associations, which may help to explain the interaction effect identified in this study. It would be interesting to further investigate whether the identified rs35331017 × maternal stress interactions on sPTB risk are mediated by CRH levels.

The implication of G × E interactions in sPTB could be far-reaching. In addition to further elucidating the “missing heritability” of PTB, the discovery of significant G × E interactions creates the opportunity for translational application. For example, our findings, if further confirmed, could prove valuable for the prediction and prevention of sPTB, since rs35331017 genotypes and maternal perceived lifetime stress can be measured well before pregnancy and maternal stress is, to an extent, a modifiable factor. Our findings also indicate that effective prevention strategies targeting sPTB may differ for women with different genetic backgrounds. Specifically, women carrying the rs35331017-II genotype are at particular risk for stress-related sPTB, and they may therefore benefit from prioritizing stress reduction before and during pregnancy to decrease the risk of sPTB. Conversely, women carrying the rs35331017 DD or ID genotype may demonstrate more resilience in the face of acute and chronic stress, and for these women, emphasis on avoiding and mitigating other risk factors may be more effective in preventing PTB. However, these strategies require further investigation.

Several limitations should be acknowledged. First, perceived stress levels during a mother’s lifetime and during pregnancy were self-reported by the mothers, which may lead to recall bias. We believe that this bias is largely random and does not significantly affect our G × E findings, since mothers were not aware of their genotypes at the time of report, and the findings were successfully replicated in an independent cohort. Second, other stress-related variables, such as depressive symptoms and adverse life events, are less frequent and thus were not analyzed in the current study due to limited statistical power. Third, the current study may have a limited power to test G × E for SNPs with low minor allele frequencies (i.e., MAF < 2%) and/or with relatively modest interaction effects with maternal stress. Fourth, the results of the study remain purely associative, with unknown mechanisms, and warrant future investigation. The identified variant, rs35331017, is an insertion/deletion polymorphism in an intronic region, whose function is not known. Finally, the BBC was specifically designed to study PTB in a predominantly urban, low-income minority population. Caution is warranted when generalizing our findings to other populations.

In conclusion, this is the first study to demonstrate a significant genome-wide maternal gene × stress interaction on sPTB risk in a high-risk AA population in Boston, MA. Our study highlights the importance of considering genetic factors and G × E interactions when investigating socio-environmental determinants of sPTB. Our findings, if further confirmed, may provide new insight into individual susceptibility to stress-induced sPTB; help advance prediction and prevention of sPTB; and stimulate future studies on the functionality of the identified target gene and SNPs, and validations of this finding in other ethnic groups.