Introduction

Advances in obstetric and neonatal intensive care practices have led to dramatic increases in survival of preterm infants1,2,3 over the past 50 years. However, preterm infants are subject to an elevated risk of multiple types of long-term morbidities, particularly neurodevelopmental disorders,4 including behavioral difficulties,5 motor deficits,6 cerebral palsy,7 and seizure disorders.8,9 Neurodevelopmental disorders of preterm infants also result in tremendous economic losses.10 Significantly, there is no cure for most neurodevelopmental disorders. Both primary prevention and early detection that can lead to more effective treatment are therefore crucial.11

Although a large number of studies focused on preterm infant neurodevelopment have identified risk factors for adverse outcomes,12,13,14 few have focused on the reduction of environmental stress in the neonatal intensive care unit (NICU). Both the built environment (e.g., light and sound) and NICU care practices that may cause pain and stress are known to activate the hypothalamic–pituitary–adrenal axis (HPA axis). Animal and human studies have found that short-term pain experienced early in life can alter neurodevelopmental and behavioral outcomes later in life.15 A review article16 summarized the small number of studies focused on the association between individual components of the NICU environment (e.g., chemical exposures, light, and sound) and preterm infant neurodevelopmental outcome. The review concluded that NICU stressors such as chemical exposure, excess light, and loud noise negatively impact preterm infant neurodevelopment. Specifically, reduced exposure to bright light is associated with improved motor function and reduced exposure to noise is associated with less fussy behavior.16 To date, only small studies have evaluated the impact of individual components of the NICU environment on preterm infant neurodevelopment. To our knowledge, the biologic response to concurrent exposure to multiple NICU stressors has not been addressed.

Cortisol, a biomarker of HPA activation and stress response, has been extensively used as a measure of biologic response to diverse environmental stressors. The concentration of cortisol in the saliva is highly correlated with plasma cortisol level.17 Salivary cortisol has a short half-life17 and has previously been validated as a biomarker of acute exposure to environmental stress in preterm infants.18 Cortisol has previously been used in studies of the impact of stress on neurodevelopment in non-premature infant populations.19,20,21 To our knowledge, studies evaluating stress biomarker levels during the NICU stay, a period with intensive stress exposure, and neurobehavioral outcomes among preterm infants are lacking.

It has been shown in non-NICU populations that the association between environmental stress exposure and neurodevelopment can be altered by sex due to substantial differences in stress response22 and risk of neurodevelopmental disorders4 between males and females. Specifically, associations between maternal stress during pregnancy and offspring neurodevelopmental outcome in early to middle childhood are different based on offspring sex, although no consensus exists on which sex is more susceptible.19,23,24,25,26,27,28,29 For example, stress-exposed girls appear to have worse cognitive performance,23,25,26,27,28 and stress-exposed boys appear to have an elevated risk of attention deficit/hyperactivity disorder (ADHD).24,29 One group demonstrated sexual dimorphism in the association between stress biomarkers (salivary cortisol and alpha-amylase) measured at 14-months old and multiple neurodevelopment outcomes at four years old.23 Another small study reported that former preterm infants had a higher waking cortisol concentration and exaggerated response to stress compared to full-term children, with differences more pronounced among girls than boys.30 Evidence in early infancy is relatively limited but worth further investigation.

The neurotoxic effects of several environmental exposures (e.g., air pollution, metals, etc.) are known to vary by the developmental period in which the exposure occurs.31,32,33,34 Studies identifying the critical developmental period for stress exposure on premature infant development are lacking. For preterm infants, where much of the third trimester equivalent stage of neurodevelopment is lived ex utero, and where interventions in the highly controlled NICU environment could be altered to decrease impactful stress, the concept of a critical window for exposure has heightened importance. Additionally, studying preterm infants has the potential to differentiate between developmental critical windows—those defined by time from conception—and chronological critical windows, those defined as the time from birth.

In the present study, we hypothesized that salivary cortisol level measured during the NICU hospitalization would be associated with early neurobehavioral performance in a cohort of preterm infants. Neurodevelopment was assessed with the NICU Network Neurobehavioral Scale (NNNS), a multi-domain neurobehavioral assessment, performed prior to NICU discharge. We explored susceptible critical developmental windows associated with cortisol level, and the impact of infant sex on the relationship between cortisol and neurobehavior.

Methods

Study participants

Participants of this study were preterm infants enrolled in the NICU Hospital Exposures and Long-Term Health (NICU-HEALTH) cohort.35 NICU-HEALTH is a preterm birth cohort focused on defining the impact of hospital-based environmental exposures between birth and NICU discharge. NICU-HEALTH participants represent the large population of preterm infants born between 28-0/7 and 32-6/7 weeks of gestation who require hospitalization in the NICU following birth, but who have low rates of physiological derangement, such as hypoxia or hypotension, or other medical predictors of poor outcome.36,37 NICU-HEALTH enrolled infants with birth weight <1500 grams or gestational age 28-0/7 through 32-6/7 weeks at the Mount Sinai Hospital from September 2011 until March 2020. Under the study protocol, detailed information about clinical events, specific exposure to medical equipment, biological samples including urine, stool, hair, saliva, and blood, and detailed maternal survey information were collected throughout the infants’ NICU stay. Outcomes including early neurodevelopmental assessments, growth parameters, and medical morbidities were ascertained. Longer-term outcomes after NICU discharge are collected as part of the DINE (Developmental Impact of NICU Exposures) cohort of the ECHO (Environmental influences on Child Health Outcomes) program (ClinicalTrials.gov NCT01420029, NCT01963065, NCT03061890; http://www.nih.gov/echo). NICU-HEALTH was approved by the Mount Sinai Program for the Protection of Human Subjects, and parents of NICU inpatients provided informed consent for participation.

Saliva collection and cortisol measurement

Saliva specimens were collected for NICU-HEALTH participants in the morning and afternoon on a single day each week during the NICU stay. Collections were conducted by trained study staff. The methods of sample collection, storage, and cortisol measurement have been described previously.18 In brief, saliva specimens were collected using a cotton swab (SalivaBio Infant Swab, Salimetrics) placed in the infant’s mouth for 2–5 min until the cotton swab was saturated. The swab was then centrifuged at 4,000 rotations per minute for 30 min, and extracted saliva was aliquotted to storage vials and frozen at −80 °C pending batch analysis. Saliva specimens were sent frozen to the Technical University of Dresden for cortisol measurement using a commercially available chemiluminescence assay.38,39

NICU Network Neurobehavioral Scale (NNNS)

Neurobehavioral performance of NICU-HEALTH participants was assessed using the NNNS40 administered by a certified examiner just prior to NICU discharge. The NNNS is a structured physical exam designed to measure behavioral and motor development and reactivity, consisting of 13 individual subscales that each score an infant’s performance in a single aspect of motor function, neurological reflex, attention, social reactivity, or stress response. The NNNS has been used in epidemiological studies as a measure of infants’ neurobehavior41,42,43,44,45 and is used clinically in some centers as a predictor of long-term developmental risk.

Descriptive statistics

Descriptive statistics were calculated for salivary cortisol levels, subscales of the NNNS, and covariates. We compared study variables between male and female infants using Student’s t test for continuous variables and the chi-square test for categorical variables.
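The two comparisons named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the software used in the study; the function names and any example inputs are hypothetical. The chi-square version shown is the uncorrected Pearson statistic for a 2×2 table, for which the p-value with one degree of freedom has a closed form; the t-test p-value would normally be taken from a statistics package and is omitted here.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student's) t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns the statistic and its p-value; with 1 degree of freedom the
    survival function equals erfc(sqrt(chi2 / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

For example, `student_t` could be applied to gestational ages of male and female infants, and `chi_square_2x2` to a sex-by-SGA contingency table.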

Reverse distributed lag model (rDLM)

We used rDLM46,47 to estimate time-specific associations between salivary cortisol and NNNS performance. The existing approach to rDLM was adapted from a typical DLM,48 which has been widely used to identify “critical windows” for repeatedly measured exposures and outcomes measured at a single later time point. The concept of a “critical window” identifies a specific sensitive period of development during which a repeated or continuous exposure may impact a health outcome.49 In contrast to the typical DLM, rDLM46,47 reverses the position of exposure and outcome in the model, such that the exposure becomes the dependent variable and the outcome becomes the independent variable. This approach can prevent the loss of sample size due to missing data.50 In a typical DLM approach, participants with any missing exposure data at a single time point would be dropped from modeling due to the requirement of complete and consistent data for the DLM. rDLM is particularly useful in our study because our participants are preterm infants born at different gestational ages who spent varying periods of development in the NICU. The SAS macro of rDLM was obtained from the supplementary materials of Gennings et al.47 We revised the obtained SAS macro by adding an interaction term between the dependent variable and the indicator variable of effect modifier, which allowed the SAS macro program to examine the modification effect. The revised rDLM SAS macro is available in our supplementary material.
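The sample-size argument above can be illustrated with a toy data layout. The sketch below is not the SAS macro of Gennings et al. or our revised version of it; the infant identifiers, weeks, and values are hypothetical. It only contrasts a complete-case requirement across all weeks with a long layout in which each available measurement is its own row and the exposure is the dependent variable.

```python
# Hypothetical weekly ln-cortisol records: infant id -> {PMA week: value}.
records = {
    "A": {31: 2.1, 32: 1.9, 33: 1.7, 34: 1.6},
    "B": {33: 2.4, 34: 2.0},           # born later: no weeks 31-32 in the NICU
    "C": {31: 2.2, 33: 1.8, 34: 1.5},  # week 32 sample missed
}
nnns = {"A": 5.0, "B": 6.5, "C": 4.0}  # hypothetical NNNS subscale scores
weeks = [31, 32, 33, 34]

# Typical DLM layout: the outcome is regressed on the full exposure vector,
# so only infants with a value at every week can be used.
complete_cases = [i for i, r in records.items() if all(w in r for w in weeks)]

# Reversed layout: one row per (infant, week) measurement, with cortisol as
# the dependent variable and the NNNS score as a predictor.
long_rows = [
    (infant, week, cortisol, nnns[infant])
    for infant, r in records.items()
    for week, cortisol in sorted(r.items())
]
infants_retained = sorted({row[0] for row in long_rows})

print(complete_cases)    # infants usable under the complete-case requirement
print(infants_retained)  # infants contributing rows in the reversed layout
```

In this toy example only one infant has complete weekly data, whereas all three contribute measurements in the reversed layout.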

As in many studies of preterm infants, we initially focused our analyses on postmenstrual age (PMA), or the developmental age, rather than the chronological age of the infant. However, previous studies of full-term infants reported critical windows in chronological age (CA).32,33 To allow comparison to that literature, we additionally computed rDLM by CA (weeks after birth). Both the rDLM results assessed by PMA and by CA were used in this study to identify critical windows of the impact of the stress response on neurobehavior.

Because the numbers of saliva samples collected before 30 weeks PMA (N = 6), after 38 weeks PMA (N = 9), and after the eighth week of life (N = 11) were small, we grouped the samples within each of these sparse ranges. The salivary cortisol concentration (nmol/l) was natural log transformed (ln nmol/l) to satisfy the linearity assumption of regression models. The results of rDLM were presented as coefficients and 95% confidence intervals (CIs), indicating the unit change in outcome (1 unit = 1 unit score of NNNS subscale) per unit change in exposure [1 unit = one natural log-transformed concentration (ln nmol/l) of salivary cortisol] at each week of PMA or CA. The Holm–Bonferroni (HB)-adjusted CIs were additionally estimated at each time point to account for multiple comparisons among time points. A critical window was defined as a period in which the time-specific coefficient was statistically significant, illustrated as an estimated 95% CI lying completely above or below zero.
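The preprocessing and multiplicity steps described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the study code: the function names are hypothetical, the bin edges (30 and 38 weeks PMA as the pooled end categories) are our reading of the grouping described here, and the Holm–Bonferroni step is shown for p-values, whereas the analysis reported adjusted CIs.

```python
import math


def pma_bin(week, low=30, high=38):
    """Pool sparse tails into end categories (assumed edges: <=30 and >=38)."""
    return max(low, min(high, week))


def ln_cortisol(nmol_per_l):
    """Natural-log transform of a cortisol concentration given in nmol/l."""
    return math.log(nmol_per_l)


def holm_bonferroni(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so
        # on; the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted
```

With one p-value per week of PMA or CA, a week would be flagged only if its adjusted value stays below the chosen significance level.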

Sex modification effect

We estimated time-specific associations between salivary cortisol and each NNNS subscale stratified by sex. The differences in associations between the two sexes were estimated as the time-specific coefficient and 95% CI of the interaction term adjusted in rDLM models.

Covariates

Model covariates included gestational age at birth, small for gestational age (SGA) status defined as birth weight percentile less than ten, race/ethnicity, maternal smoking status, maternal insurance type, delivery type, antenatal steroids use, and an interaction term of infant sex.

Sensitivity analysis

In a sensitivity analysis, we further examined the association between average salivary cortisol concentration measured during the NICU stay and NNNS performance using multivariate linear regression including the same covariates as the rDLM. However, we did not stratify linear regression models by infant sex because sex did not significantly modify the association between average salivary cortisol and NNNS.

Results

Characteristics of the study population

One hundred thirty-nine preterm infants who had both salivary cortisol measured and NNNS examination during the NICU hospitalization were included in this analysis. Key clinical and demographic characteristics of the 139 included infants were compared to the entire NICU-HEALTH cohort in Table 1. The average gestational age of the 139 preterm infants was 30 ± 1.4 weeks. Fifteen percent of infants were SGA at birth. There were no significant differences between male and female infants (all P values > 0.05).

Table 1 Characteristics of the 139 preterm infants included in this study.

Among the 139 preterm infants included in this study, the average PMA at NNNS examination was 36.3 ± 1.4 weeks. Table 2 shows the mean, standard deviation (SD), and range of scores on the 13 NNNS subscales.

Table 2 NNNS performance in the 139 preterm infants included in this study.

Salivary cortisol concentrations

Eight hundred forty salivary cortisol measurements representing 502 person-days were included in our analysis. Figure 1 depicts salivary cortisol levels for male and female infants based on age and gestational maturity. Consistent with past research demonstrating the lack of cortisol diurnal cycling in preterm infants before 44 weeks PMA,51 we did not observe diurnal cycling of salivary cortisol level (Fig. 1a). This finding allowed us to use the average salivary cortisol concentration measured at multiple times in a day. Our group showed previously that salivary cortisol levels in preterm infants are associated with the intensity of stressful events in the NICU as recorded by an accepted survey tool.18 These stressful events are more common in the highly medicalized period soon after birth than in the subsequent convalescent phase of NICU hospitalization, resulting in lower cortisol levels later in the course of the NICU hospitalization. Figure 1b, c show that salivary cortisol levels in our study population were, as expected, higher in the first several weeks after birth than near discharge. This decreasing trend in male infants was more pronounced than among female infants, but the difference was not statistically significant (ANOVA P value = 0.07). Also, as expected, there was no clear change in salivary cortisol level across gestational ages (Fig. 1d).
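Collapsing several same-day measurements into one person-day value, as described above, amounts to a grouped mean. The following is a minimal sketch with hypothetical identifiers, dates, and values; whether averaging is applied before or after the log transformation is not specified here, and the sketch averages raw concentrations as an assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements: (infant id, collection date, time of day, nmol/l).
measurements = [
    ("A", "2015-03-02", "am", 12.0),
    ("A", "2015-03-02", "pm", 8.0),
    ("A", "2015-03-09", "am", 6.0),
    ("B", "2015-03-02", "pm", 10.0),
]

# Group values by person-day, ignoring the time of day.
by_person_day = defaultdict(list)
for infant, day, _time_of_day, value in measurements:
    by_person_day[(infant, day)].append(value)

# One averaged concentration per person-day.
daily_mean = {key: mean(values) for key, values in by_person_day.items()}

print(len(measurements), "measurements ->", len(daily_mean), "person-days")
```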

Fig. 1: Salivary cortisol concentration (nmol/l) by the timing of specimen collection.

a By the time of day; b by postmenstrual age; c by chronological age; d by gestational age at birth. Smooth lines representing trends for male and female infants were fitted using the loess method.

Critical windows identified by rDLM

Using the revised rDLM model, we examined sex-stratified associations between salivary cortisol and NNNS performance by week of cortisol measurement (represented by PMA and CA). The identified critical windows varied by NNNS subscale, infant sex, and developmental period (PMA or CA). To better illustrate the statistical results, we present here only the subscales with significant HB-adjusted 95% CIs (Figs. 2 and 3), while the full results are available in the Supplementary Materials (Figs. S1 and S2).

Fig. 2: The results of rDLM examining associations between salivary cortisol and NNNS Habituation and Lethargy scores.

The Y axis is the time-specific coefficient between salivary cortisol and performance on each subscale. The X axis is the developmental stage at salivary cortisol measurement. First column (in pink): female infants. Second column (in blue): male infants. Third column (in black): differential effect, males–females. Bold line: time-specific coefficient; colored area: 95% CI not adjusted for multiple comparisons; error bar: Holm–Bonferroni-adjusted 95% CI. The full results for all 13 NNNS subscales are available in Supplementary Fig. S1.

Fig. 3: The results of rDLM examining associations between salivary cortisol and the NNNS Attention and Regulation scores.

The Y axis is the time-specific coefficient between salivary cortisol and performance on each subscale. The X axis is the week of chronological age at salivary cortisol measurement. First column (in pink): female infants. Second column (in blue): male infants. Third column (in black): differential effect, males–females. Bold line: time-specific coefficient; colored area: 95% CI not adjusted for multiple comparisons; error bar: Holm–Bonferroni-adjusted 95% CI. The full results for all 13 NNNS subscales are available in Supplementary Fig. S2.

Figure 2 shows the adjusted coefficients and 95% CIs (including both original CIs and HB-adjusted CIs) of the statistically significant associations between salivary cortisol and the Habituation and Lethargy subscales by PMA. Statistically significant associations between salivary cortisol and both Habituation and Lethargy were observed for both sexes at the least mature PMA category (30 weeks and less). For cortisol levels measured at 30 PMA weeks and less, a one-unit (ln nmol/l) increase of salivary cortisol was associated with a 0.87 (95% CIhb: 0.07, 1.68) point increase of the Habituation score among female infants, and a 1.04 (0.22, 1.86) point increase among male infants. A one-unit (ln nmol/l) increase of salivary cortisol was associated with a 0.63 (0.02, 1.23) point increase of the Lethargy score in female infants, and a 0.73 (0.13, 1.32) point increase in male infants. Habituation and Lethargy quantify different neurobehavioral domains:52 higher Habituation scores indicate greater response decrement to recurring auditory and visual stimuli, and higher Lethargy scores indicate lower levels of motor, state and physiological reactivity. Our results suggested that these behaviors measured at discharge were significantly associated with salivary cortisol at less mature PMAs, and that the strength and magnitude of this association decreased for stress exposure at more mature PMAs.
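As a reading aid for the coefficients above, and assuming the log-linear interpretation stated in the Methods (change in subscale score per one ln nmol/l), a doubling of salivary cortisol corresponds to an increase of ln 2 ≈ 0.693 on the log scale. The arithmetic below is an illustration derived from the reported Habituation point estimates, not an additional result of the study.

```latex
\begin{align*}
\Delta\,\text{Habituation}_{\text{female}} &\approx 0.87 \times \ln 2 \approx 0.87 \times 0.693 \approx 0.60 \\
\Delta\,\text{Habituation}_{\text{male}}   &\approx 1.04 \times \ln 2 \approx 1.04 \times 0.693 \approx 0.72
\end{align*}
```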

Figure 3 shows the adjusted coefficients and 95% CIs (including both original CIs and HB-adjusted CIs) of statistically significant associations between salivary cortisol and the Attention and Regulation subscales by CA. Statistically significant associations between salivary cortisol and Attention were observed for both sexes during the first week after birth: a one-unit (ln nmol/l) increase of cortisol was associated with a −1.01 (95% CIhb: −1.81, −0.21) point change in the Attention score among female infants and a −0.93 (−1.75, −0.11) point change among male infants. There was a consistently inverse association between cortisol level and Regulation subscale score regardless of CA. This association was statistically significant for cortisol measured in the first week and the 6–7th week of life in both sexes. Of note, the average NICU length of stay in our cohort was 6 weeks. For female infants, a one-unit (ln nmol/l) increase of cortisol measured at Week 0, Week 6, and Week 7 was associated with a −1.14 (−2.17, −0.11), −1.22 (−2.30, −0.13), and −1.25 (−2.37, −0.13) point change in the Regulation score, respectively. In male infants, a one-unit (ln nmol/l) increase of cortisol measured at Week 0, Week 6, and Week 7 was associated with a −1.07 (−2.10, −0.03), −1.21 (−2.29, −0.13), and −1.19 (−2.32, −0.06) point change, respectively.

Sexual dimorphism

Associations between salivary cortisol and NNNS subscales are shown separately for male and female infants in Figs. 2 and 3. In the third column of each figure, the Y axis shows the difference in the coefficient between male and female infants. Each p-value was adjusted for multiple comparisons. Among the four subscales, the most prominent indication of sex difference was observed for Handling, shown in Fig. 4. Although the direction of the association between salivary cortisol level and Handling score was opposite for the two sexes, trending negative among females and positive among males across all observational times, statistically significant sex differences were not observed. Supplementary Figs. S1 and S2 demonstrate similar results for other subscales.

Fig. 4: The results of rDLM examining the association between salivary cortisol level and Handling subscale score by PMA and CA, respectively.

The Y axis is the week-specific coefficient between salivary cortisol level and the Handling subscale score; the X axis is the week in which salivary cortisol was collected. First column (in pink): female infants. Second column (in blue): male infants. Third column (in black): differential effect, males–females. Bold line: time-specific coefficient; colored area: 95% CI not adjusted for multiple comparisons; error bar: Holm–Bonferroni-adjusted 95% CI.

Sensitivity analysis results

In sensitivity analysis, we estimated the association between average salivary cortisol concentration and NNNS scores using linear regression models (Table 3). For comparison, we included the time-specific associations identified by rDLM in Table 3. In linear regression modeling, salivary cortisol was significantly associated with the Regulation subscale score (P = 0.03), but this association lost significance after adjusting for multiple comparisons (PBH = 0.32). The direction of this association was consistent with the rDLM result that salivary cortisol level in multiple weeks after birth was associated with the Regulation score in models adjusted for multiple comparisons (Fig. 3).

Table 3 Association of NICU-based cortisol and NNNS subscale performance modeled by multivariate linear regression analysis and rDLM.

Discussion

We examined the association between salivary cortisol measured between birth and term-equivalent life stage and neurobehavioral performance assessed by NNNS prior to NICU discharge in a group of preterm infants. In our study, salivary cortisol levels at specific PMAs and CAs were associated with specific NNNS subscales. These findings indicate that stress during specific critical windows of the NICU stay may impact preterm infant neurodevelopment. Specifically, stress response at lower PMA was associated with performance on the Habituation and Lethargy subscales at term-equivalent age, while stress response early and late in the NICU hospitalization was associated with performance on the Attention and Regulation subscales.

To our knowledge, this study is the first to leverage a critical window analysis for neurobehavioral outcomes in preterm infants. Our findings are consistent with a generally accepted concept that environmental exposures during the third trimester developmental period particularly impact human central nervous system development.32,33 Our study is unique in its ability to pinpoint the critical windows for stress response during this period of development as we were able to measure biomarkers of stress response directly from preterm infants, whose “third trimester” developmental window overlaps with the early postnatal life course.

We did not observe statistically significant sex differences in the association between salivary cortisol and the NNNS subscale scores, possibly because of the small sample size. This is important as sexual dimorphism has been reported in several studies of childhood behavioral outcome after exposure to stress during pregnancy. For example, in the Queensland Flood Study,27 in utero exposure to maternal distress during a flooding disaster altered offspring temperament measured in early childhood for boys but not for girls. There is known to be a stronger association between in utero exposure to maternal depression and unfavorable behavioral outcome in boys compared to girls.25,53 Notably, there are conflicting data about whether sexual dimorphism in behavioral outcome related to stress exposure is due to biological differences, differing socialization of the two sexes, or both.54,55 Our study was underpowered to detect a sexually dimorphic impact of early life stress on behavioral development. This topic should be further explored in larger studies, as newborn-based outcomes may be able to distinguish between the biological and sociological underpinnings of previously reported sex differences in stress response: differences already present in newborns are likely driven by biology rather than by later-life socialization.

The results of sensitivity analysis further confirm our findings of the inverse association between salivary cortisol level and performance on the NNNS Regulation subscale. The estimated association is particularly robust, as time-specific associations were statistically significant across different models (i.e., PMA and chronological age) and across multiple time points.

A major strength of our study is the unique, modifiable environment of the NICU, where stressful exposures, stress response, and outcomes are easily measured and confounders are assessed. Unlike studies that use a single biomarker measured in response to an isolated stressful event as a measure of stress reactivity, our study did not target biospecimen collection to follow an identified stressful event. Cortisol levels in our study reflect the individual’s response to the totality of environmental stressors in the NICU in a chronic resting state. Cortisol levels do not vary diurnally in the NICU population.51 Single measurements in preterm infants thus provide similar information to the slope of a cortisol curve in a more mature individual exhibiting diurnal cycling and may reflect chronic HPA axis function that has been linked to neurodevelopment.56

That said, our study has several limitations. First, although large for studies of preterm infant stress, our study had a small absolute number of participants, which might lead to uncertainties in analysis results. Second, cortisol, an integral part of the HPA axis, is only one biomarker of stress. Damage to the HPA axis could impact salivary cortisol response, thus confounding response for some critically ill neonates. Although our NICU-HEALTH population was drawn from the moderately preterm population—rather than the critically ill extremely preterm population—and had overall low rates of medical morbidity, this potential confounder remains. Last but not least, preterm infants’ HPA axis develops during the period of time spent in the NICU. Their cortisol response might change between birth and NICU discharge. Nevertheless, we found strong associations between the salivary cortisol and neurobehavioral performance at various time points through the NICU stay.

In summary, we used a sophisticated statistical approach to examine the association between a biomarker of stress response and preterm infant neurobehavioral performance at a fine temporal scale (weekly). The application of rDLM prevented sample size loss caused by an incomplete set of time-dependent variables, thereby enabling the identification of critical windows in a dynamic cohort.

Conclusion

We found that salivary cortisol level at specific time points during NICU-based development was associated with specific domains of neurobehavioral performance in preterm infants. This finding suggests that environmental stressors during NICU hospitalization may impact preterm infant neurodevelopment, particularly in domains related to Habituation, Attention, Regulation, and Lethargy. Future, larger studies may be able to further elucidate the role of exposure timing and sex in the association between stress, stress response, and neurodevelopment among preterm infants.