Introduction

Respiratory support is a major component of care for premature infants in the neonatal intensive care unit (NICU); it decreases the work of breathing and helps maintain optimal oxygen saturation (SpO2). Clinicians rely heavily on vital signs to guide weaning or escalation of non-invasive respiratory support. Some studies have attempted to improve weaning of non-invasive respiratory support by comparing weaning protocols for stable preterm infants at 34 weeks, and found no significant difference in length of stay between weaning methods.1,2,3 Other studies have examined diaphragmatic activity as a potential indicator of whether a wean in respiratory support will succeed, with promising results.4 However, assessing diaphragmatic activity requires additional, invasive equipment. Studies comparing different target SpO2 ranges clearly demonstrate that long-term outcomes, including bronchopulmonary dysplasia, retinopathy of prematurity, and death, are significantly affected by lower saturation targets.5,6,7 Little attention has been paid to the wealth of information available on vital signs monitors for guiding daily management of respiratory support.

Continuously monitored vital signs are a critical source of information for clinicians deciding when and how to wean non-invasive respiratory support, but they have not been systematically studied as a means of improving respiratory care in preterm neonates. Bedside nurses are typically responsible for monitoring and recording the vital signs that become a permanent part of the patient’s chart. It has been shown, at least in the adult population, that even a single nursing recording of abnormal vital signs can signal an increased risk of a critical event.8 Although these vital signs are read from the bedside monitors, they capture only a few moments (range 8–24 data points per day) of a patient’s clinical status, which may be misleading given the variability in vital signs throughout the day, especially in the neonatal population. This approach is also subject to the healthcare provider’s subjective impression of the infant’s overall clinical status. In preterm neonates, cumulative analysis of heart rate (HR) variability trends obtained through continuous monitoring has been shown to help identify neonatal sepsis or major adverse events in the neonatal intensive care unit.9 The benefit and potential use of continuous monitoring of respiratory vital signs to inform short-term respiratory support management have not been explored.

Because bedside monitors record data constantly from patients in the NICU, using this readily available continuous data directly from the monitors provides an opportunity to improve weaning of non-invasive respiratory support and the care of preterm infants. The objective of this study was to investigate the correlation between interval recording of respiratory vital signs by nurses and continuous monitor-recorded events in detecting the burden of hypoxia and tachypnea, and to determine whether the respiratory rate (RR) and SpO2 data collected by the bedside monitors could be used to predict success in respiratory weaning. We hypothesized that monitor-recorded vital signs are superior to nursing-recorded vital signs in predicting trends for escalation/de-escalation of short-term respiratory support.

Methods

Study design

This was a prospective study approved by our local institutional review board and conducted in a level III NICU at Hurley Children’s Hospital, Flint, Michigan. The study was blinded: neither the medical nor the nursing team had access to the real-time daily reports generated from monitor data for individual patients.

Respiratory guidelines: All respiratory management was based on daily clinical assessment and nursing-recorded data reviewed by the team rounding that day. In our NICU, non-invasive respiratory support is generally provided through BiPAP, CPAP, or, rarely, high flow nasal cannula until the infant’s FiO2 requirement comes down to 21%, or at least to 30%, for 24 h. Infants are then weaned to low flow nasal cannula, or straight to room air if FiO2 is 21%, based on the neonatal team’s assessment. The decision to escalate support is based on SpO2 falling below the target range of 88–94% or on bedside assessment of increased work of breathing. Overall respiratory care in relatively stable infants on non-invasive support is largely driven by both subjective bedside nursing assessment and objective nursing documentation of vital signs. Any increase in flow, pressure, or mode of respiratory support was defined as an escalation event. We did not include an increase in FiO2 as an escalation event, because FiO2 was frequently titrated by the bedside nurses, leading to more than one escalation or de-escalation of FiO2 in a day; it was therefore difficult to capture and define as an “escalation event”.

The study population consisted of preterm infants born at < 36 weeks’ gestational age and admitted to the NICU between November 2016 and March 2017. Data were collected on all infants from birth until their initial discharge (December 2017). Patients were excluded if they stayed less than 48 h in the Hurley NICU or if they remained on invasive ventilation for the entire length of stay. We also excluded infants with a known congenital malformation or chromosomal abnormality. The Human Research and Ethics Committees of our institution approved this study.

Data collection

Our research team prospectively collected data from electronic medical records from the day of admission until discharge. HR, RR, and peripheral SpO2 were monitored continuously by Masimo probes and Philips IntelliVue MP50 Neonatal monitors. One data point per vital sign was sampled and recorded every minute, for a total of 1440 data points per vital sign per day, and compiled into a daily report. The monitor data were reported as histograms (2.5% bins for SpO2; 10 bpm bins for RR). The total percentage of time spent at different threshold levels was calculated daily for each patient for both SpO2 and RR. All Masimo and Philips data were collected by Crystal Reports and stored for 24 h in the central “STAR” data warehouse through Epic. We received the 24 h data every day from Epic in Microsoft Excel format. After each patient was discharged, a report of all the vital signs recorded by nursing staff in the Epic electronic medical record (EMR) was collected. Each patient’s Epic EMR chart was reviewed, and gestational age, birth weight, gender, race, daily weight, respiratory support type and settings, positive blood cultures, number of transfusions, medications, length of stay, and discharge support were recorded.
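
The daily summary described above can be illustrated with a minimal Python sketch. It is not the study's actual pipeline (which used Crystal Reports and Excel exports); the function names, the use of None for missing minutes, and the sample values are assumptions made for illustration. The default limits of 90% and 70 mirror the definitions adopted later in the Results.

```python
from collections import Counter


def percent_time(values, predicate):
    """Percentage of valid (non-missing) minute samples that satisfy predicate."""
    valid = [v for v in values if v is not None]  # None marks a missing minute (assumption)
    if not valid:
        return None
    return 100.0 * sum(1 for v in valid if predicate(v)) / len(valid)


def daily_burden(spo2, rr, spo2_limit=90.0, rr_limit=70.0):
    """Return (% of time SpO2 < spo2_limit, % of time RR > rr_limit) for one day."""
    pct_hypoxia = percent_time(spo2, lambda v: v < spo2_limit)
    pct_tachypnea = percent_time(rr, lambda v: v > rr_limit)
    return pct_hypoxia, pct_tachypnea


def histogram(values, bin_width):
    """Count samples per bin; each key is the lower edge of its bin."""
    counts = Counter()
    for v in values:
        if v is not None:
            counts[(v // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


# Hypothetical day: 1440 one-minute samples per vital sign.
spo2_day = [95] * 1368 + [88] * 72   # 72 minutes below 90%
rr_day = [50] * 1152 + [75] * 288    # 288 minutes above 70 breaths per minute
print(daily_burden(spo2_day, rr_day))
print(histogram(spo2_day, 2.5))
```

Dividing by the number of valid samples rather than a fixed 1440 is a design choice of this sketch, so that days with incomplete monitoring are not biased toward a lower burden.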

Statistical analysis

Infants were categorized into three groups based on the respiratory support type last active on each day of life: RA (room air), NC (nasal cannula), and HCN (high flow nasal cannula, CPAP, NIPPV). Each day was further categorized by baseline flow or positive end-expiratory pressure (PEEP). Baseline characteristics were analyzed for the entire population. We performed receiver operating characteristic (ROC) curve analysis to determine the best cut-off of lowest saturation and highest RR on monitor data for any change in respiratory support within three days. The correlation between nursing and monitor data was analyzed by Kendall’s Tau rank correlation coefficient. Continuous variables were compared by the Kruskal–Wallis test and categorical variables by the Chi-square test. To assess baseline variability, the daily average and daily standard deviation of RR and SpO2 were compared among stable infants on RA, NC, and HCN. We also compared the cumulative frequency of tachypneic and hypoxic events in this group. These parameters were then compared with any escalation in respiratory status within the next three days. The effects of potential confounders such as gestational age, birth weight, gender, and race were examined by logistic regression. A p value < 0.05 was considered statistically significant. All statistical analyses were performed using MedCalc version 17.9 (MedCalc Software, Ostend, Belgium).

Results

Study population and demographics

Of the 101 preterm infants initially recruited, seven were excluded due to death, length of stay less than 48 h, or transfer without any time on non-invasive ventilation. Of the 94 infants included in the study, the median gestational age at birth was 32 weeks (26w 3d–35w 5d) and the median birth weight was 1848 g (709 g–3800 g). About 46% of our study subjects were male, and 70% were white. The total number of patient days observed was 2204 (1085 on RA, 424 on NC, and 695 on HCN). The median observation period per infant was 21 days (10d–91d). The total number of escalation events was 150 (6.8% of the patient days observed) in 20 infants (21% of the total infants). Among all patients enrolled, there were only two positive blood cultures, and 14% of patients received at least one platelet or packed blood transfusion during their stay.

Definition of tachypnea and hypoxia

The thresholds for tachypnea and hypoxia were defined using receiver operating characteristic (ROC) analysis (Table 1). ROC analysis of monitor-recorded vital signs showed that SpO2 of less than 90% was most sensitive and specific for an increase in respiratory support within three days (AUC: 0.814, SE: 0.0147, 95% CI: 0.79–0.83), with a cut-off value of > 5% of daily recordings. ROC analysis also showed that an RR greater than 70 was most sensitive and specific for an increase in respiratory support within three days, though less so than hypoxia (AUC: 0.652, SE: 0.0229, 95% CI: 0.63–0.68), with a cut-off value of > 20%. Other threshold values analyzed were less sensitive and specific, with lower AUC values (Table 1). The monitor-derived thresholds were also applied to the nursing data, but they did not yield robust sensitivity or specificity for either SpO2 < 90% (AUC: 0.553, SE: 0.0339, 95% CI: 0.51–0.59) or RR > 70 (AUC: 0.538, SE: 0.0288, 95% CI: 0.51–0.57). When we tested other threshold values on the nursing-recorded data, we found SpO2 of less than 88% (AUC: 0.565, SE: 0.0271, 95% CI: 0.53–0.59; cut-off value > 5%) and RR greater than 80 (AUC: 0.593, SE: 0.0336, 95% CI: 0.57–0.52; cut-off value > 0%) to be most sensitive and specific. Because the area under the curve for nursing data was less robust, we used the ROC curves from the monitor data for our definitions: tachypnea was defined as RR greater than 70 breaths per minute, and hypoxia as SpO2 less than 90%.
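
As a rough illustration of how an AUC and a cut-off of this kind can be derived, the sketch below computes the AUC as the probability that an escalation day has a higher hypoxia burden than a non-escalation day, and picks the cut-off that maximizes Youden's J (sensitivity + specificity − 1). The use of Youden's J is an assumption; the text states only that the chosen values were "most sensitive and specific". The function names and the sample burdens are hypothetical and are not study data.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), counting ties as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def best_cutoff(scores_pos, scores_neg):
    """Cut-off c for the rule 'score > c' that maximizes Youden's J.

    Returns (cutoff, sensitivity, specificity). Youden's J is an assumed
    criterion, not one confirmed by the source.
    """
    best = None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(1 for s in scores_pos if s > c) / len(scores_pos)
        spec = sum(1 for s in scores_neg if s <= c) / len(scores_neg)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical daily % of time with SpO2 < 90% (illustrative values only).
escalation_days = [8, 12, 6, 3]         # days followed by escalation within three days
stable_days = [1, 2, 4, 5, 0, 7]        # days with no change in support
print(auc(escalation_days, stable_days))
print(best_cutoff(escalation_days, stable_days))
```

The pairwise AUC is quadratic in the number of days, which is acceptable for a sketch; a rank-based formulation would scale better on the full dataset.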

Table 1 Comparison of ROC analysis threshold parameters based on monitor-recorded vital parameters in infants with no change vs escalation in respiratory status categorized by level of baseline respiratory support

Comparison of nursing- and monitor-collected vitals

The median number of events recorded per day was eight (4–12) from the nursing chart vs. 1424 (701–1440) from the monitors. The total median number of events recorded per infant was 168 (84–252) from the nursing chart and 29,904 (14,721–30,240) from the monitors. To compare nursing- and monitor-collected SpO2 and RR, we used the percentage of daily recordings (7:00 am–6:59 am) showing tachypnea or hypoxia. The overall Kendall’s Tau rank correlation coefficient between nursing and monitor records was 0.507 for hypoxia events and 0.431 for tachypnea (RR > 70). The correlation was much lower in infants on either RA or NC (Table 2). Moreover, among infants who had an escalation of support, hypoxia recorded by the monitors was absent from nursing documentation in 46.6% (p < 0.0001), and tachypnea in 27.8% (p < 0.0001).
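
For readers unfamiliar with the statistic, the sketch below shows one way to compute a Kendall rank correlation between paired daily percentages. It implements the tau-b variant, which corrects for ties; which variant the statistical software reported is not stated in the text, so this choice is an assumption, and the paired values are hypothetical.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for paired observations, with correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied in both variables: contributes nothing
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical paired daily % hypoxia for five patient days (illustrative only).
monitor_pct = [0.0, 2.0, 5.0, 8.0, 12.0]
nursing_pct = [0.0, 0.0, 0.0, 12.5, 12.5]   # 8 charted values per day give coarse steps
print(kendall_tau_b(monitor_pct, nursing_pct))
```

With only about eight charted values per day, nursing percentages can take few distinct values, so ties are common; that is the reason a tie-corrected variant is used in this sketch.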

Table 2 Correlation of nursing- and monitor recorded hypoxia and tachypnea episodes

Comparison of monitor-collected vitals at baseline

To determine normal trends in respiratory vital signs, we aggregated RR and saturation data from all monitor-recorded events in our cohort. Infants on RA had a significantly lower average RR, lower RR variability, and fewer tachypnea events than infants on any non-invasive support (p < 0.0001). The average SpO2 and the variability in SpO2 decreased significantly, whereas hypoxic events increased, with increasing respiratory support at baseline (p < 0.0001) (Table 3).

Table 3 Comparison of monitor-recorded vital parameters in infants with no change vs escalation in respiratory status categorized by level of baseline respiratory support

SpO2 and change in respiratory status

Average SpO2 decreased in infants requiring subsequent escalation of respiratory support from RA (p = 0.0048), NC, and HCN (p < 0.0001), whereas a decrease in daily variability was significant only in the RA (p = 0.0071) and NC (p < 0.0001) groups (Table 3). The incidence of hypoxia was significantly greater in all three groups among infants who required an increase in respiratory support within the subsequent three days (Fig. 1). After correcting for confounders, we found that patients who were hypoxic 5–10% of the time were 3.70 (95% CI 2.04–6.70) times more likely to have worsening respiratory status (p < 0.0001), while infants who were hypoxic more than 10% of the time were 4.56 (95% CI 2.80–7.43) times more likely to require an increase in respiratory support (p < 0.0001) within three days. The other significant variables in our model were GA < 28 weeks and birth weight < 1500 g (Table 4).
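
To make the form of these estimates concrete, the sketch below groups a daily hypoxia burden into the three bands used above and computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2 × 2 table. This is a simplification: the reported values come from a logistic regression adjusted for confounders, which this sketch does not reproduce. The function names and all counts are hypothetical.

```python
from math import exp, log, sqrt


def burden_category(pct_hypoxia):
    """Band a daily % of time with SpO2 < 90% into <5%, 5-10%, or >10%."""
    if pct_hypoxia < 5:
        return "<5%"
    if pct_hypoxia <= 10:
        return "5-10%"
    return ">10%"


def odds_ratio(exposed_events, exposed_total, ref_events, ref_total, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Assumes every cell of the 2 x 2 table is non-zero.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    a = exposed_events
    b = exposed_total - exposed_events
    c = ref_events
    d = ref_total - ref_events
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    return estimate, exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)


# Hypothetical counts (not study data): escalation days among patient days
# in the 5-10% band versus the <5% reference band.
print(burden_category(7.5))
print(odds_ratio(20, 100, 10, 200))
```

Because the same infant contributes many patient days, a real analysis would also need to account for repeated measures, which neither this sketch nor a plain 2 × 2 table does.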

Fig. 1

Incidence of hypoxia and increasing respiratory support. The plot depicts the daily percentage of monitor saturation recordings less than 90% for infants in room air, nasal cannula, and HCN (HFNC, CPAP, and NIPPV). Data representative of 94 patients with 2204 observations (RA: 1085; NC: 424; HCN: 695) (data presented as median ± 95% CI)

Table 4 Logistic regression for increasing respiratory support for all patients in the study once off mechanical ventilation

RR and change in respiratory status

Average RR increased in infants needing subsequent escalation of respiratory support from RA (p = 0.05), NC (p = 0.01), or HCN (p = 0.001), whereas an increase in variability was significant only in the RA group (p = 0.01) (Table 3). The incidence of tachypnea was significantly greater in the RA (p = 0.04) and NC (p = 0.002) groups requiring escalation of support within three days. After correcting for confounders, we found that infants who were tachypneic more than 30% of the time were 2.80 (95% CI 1.69–4.67) times more likely to require an increase in respiratory support (p < 0.0001). The other significant variables in our model were GA < 28 weeks and birth weight < 1500 g (Table 4).

Weight gain in relation to work of breathing and hypoxia

Hypoxia and tachypnea, in particular, are widely believed to be associated with poor weight gain in preterm babies. We therefore examined the association between daily weight gain, analyzed as Fenton weight z-scores, and hypoxia/tachypnea. Interestingly, infants in RA who were tachypneic more than 30% of the time had significantly lower Fenton weight z-scores after regaining their birth weight (p < 0.001) (Fig. 2). While this correlation was most striking in room air, we found a significant negative linear correlation between the percentage of monitor-recorded tachypnea episodes and Fenton weight z-scores for all infants on non-invasive respiratory support (p < 0.001) (Fig. 3). Interestingly, there was a more significant negative linear correlation between the percentage of monitor-recorded hypoxia episodes and Fenton weight z-score (p < 0.001) (Fig. 4).

Fig. 2

Weight gain and frequency of monitor-recorded tachypnea on room air. The plot depicts the daily Fenton weight Z-score after regaining birth weight of infants in room air. Data representative of 94 patients with 701 observations. (Data presented as median ± 95% CI)

Fig. 3

Linear regression of weight gain and frequency of monitor-recorded tachypnea on non-invasive ventilation. The plot depicts the daily Fenton weight Z-score after regaining birth weight of infants on room air and non-invasive respiratory support and occurrence of monitor-recorded tachypnea. Data representative of 94 patients with 1074 observations

Fig. 4

Linear regression of weight gain and frequency of monitor-recorded hypoxia on non-invasive ventilation. The plot depicts the daily Fenton weight Z-score after regaining birth weight of infants on room air and non-invasive respiratory support and occurrence of monitor-recorded hypoxia. Data representative of 94 patients with 1043 observations

Discussion

One of the major challenges in caring for premature infants in the NICU is providing adequate respiratory support. To the best of our knowledge, our study is the first to report the importance of using bedside monitor-recorded data in day-to-day respiratory management in the NICU. Clinical decision making at the bedside is a complex process that involves evaluation of vital sign trends, bedside nursing assessment, the clinician’s judgment, and results of tests such as blood gases. Although respiratory vital sign trends guide decision making, in relatively stable babies these assessments occur every 2–4 h, resulting in ~8 data points per day. The overall quality of these assessments also tends to be subpar.10 In stable adult patients, continuous monitoring has been shown to decrease both ICU admissions11 and length of stay in the ICU.12 Although interval vital signs are collected from the monitors, there is an element of subjective bias in these assessments. These evaluations also tend to miss the significant variability present in the respiratory pattern of premature babies. The significance of continuous objective monitoring of vital signs is increasingly being recognized,13,14 especially in the settings of sepsis and NEC.9,15 In addition, recent studies of continuous HeRO monitoring have shown decreased mortality in the very low birth weight population.16,17 Similar to these previous studies, our data demonstrate that monitor-recorded vital signs can be a useful adjunct to bedside assessment and can help improve decision making on escalation of respiratory support.

We found that, at baseline, decreasing average SpO2 and SpO2 variability, as well as increasing hypoxia events, were associated with increasing respiratory support in the subsequent three days. This probably reflects the higher degree of instability in infants on higher respiratory support. We also noted a significant reduction in variability in the event of escalation, similar to the HeRO trial, in which higher HR and lower variability were associated with sepsis.9 The cumulative amount of time that monitors registered SpO2 less than 90% was significantly higher in infants on RA, NC, and HCN who required an increase in respiratory support within three days. There was a 3.7-fold increased risk of escalating respiratory support if hypoxia events accounted for 5–10% of the time, and this risk increased by 20% if the cumulative hypoxia time was greater than 10%. We used SpO2 of less than 90% to define hypoxia. This is consistent with the targets of 90–94% currently followed across the majority of NICUs. Keeping this threshold also makes sense given the substantial evidence of increased mortality in infants on lower target SpO2 (85–89%) in both the SUPPORT and COT trials.5,7 We also considered hypoxic events significant at a cut-off of 5% of the day, as this is considered an acceptable clinical threshold according to British Thoracic Society guidelines.9 The 5% cut-off is also comparable to the percentage of hypoxia events reported in the higher target saturation groups of the SUPPORT trial (SpO2 < 90%; 6%) and the COT trial (SpO2 < 85%; 8.2%). Our data from continuous monitor recordings reveal greater variability and more hypoxia events in infants on respiratory support, and these data can guide optimization of respiratory support. During the study period, all clinical decisions were made by the rounding team, who did not have access to the monitor trend data. This was in accordance with our study design, as we did not want to influence the daily care of our infants.

Very few studies have examined the patterns of respiratory rates and saturation targets in relation to various degrees of respiratory support and short-term respiratory outcomes. We found a direct correlation between increasing level of nasal respiratory support and tachypnea, and an inverse relationship between level of support and RR variability. We speculate that tachypnea associated with increasing respiratory support may suggest inadequate support. This assumption is supported by the fact that infants with the highest respiratory rates at any level of respiratory support were at increased risk of respiratory support escalation in the subsequent three days. The decreasing variability seen with increasing respiratory support could reflect the influence of respiratory disease severity in limiting lower respiratory rates. A strength of our study is that we determined RR > 70 as significant tachypnea not arbitrarily but by performing ROC analysis on > 3 × 10⁶ respiratory observations over > 2200 patient days. These data suggest that a cumulative assessment of monitor-recorded tachypnea provides useful information for a more educated decision on continuing or escalating support. We also considered using an increase in supplemental FiO2 as an escalation event, but typically found too many FiO2 changes in a 3-day window to define a clear “escalation event”. Therefore, while it would have been interesting to examine this, the data would be very granular and difficult to analyze.

Our data were obtained from a single center. While validation of these results in other centers would have strengthened our conclusions, our respiratory management was not dissimilar to practices in other centers, and nursing vital signs are routinely reported during rounds and contribute to decision making. There is clearly potential for using this untapped resource of monitor-recorded events to help in the management of premature infants. Another limitation of our study is that we did not perform a power calculation before data collection. We attempted to use pre-existing data from previous studies on vital signs to calculate the sample size; however, given the paucity of published data, along with variation in hospital-specific policies for monitor settings and practice variation in respiratory management, a priori sample size calculations were not feasible. In a post hoc power analysis (from the ROC analysis), we found that 99 patient days (for hypoxia, i.e., SpO2 < 90%) and 605 patient days (for tachypnea, i.e., RR > 70) would result in a power of 90% with an alpha of 0.025. As far as we are aware, this is the first study intended to examine the feasibility of using monitor-recorded vital signs to predict respiratory escalation events. Each infant in our study served as his or her own control (non-escalation event days) in the analysis of saturation and respiratory parameters. However, there is potential statistical non-independence in our data, as we analyzed multiple measurements and escalation events in the same patients.

Generally, infants on RA or NC are considered relatively stable and are therefore at risk of being overlooked by bedside nurses, especially in busy NICU settings. Subtle and variable changes in vital signs also tend to be missed in nursing documentation. It has been shown that such subtle signs are present, but missed, in the detection of late-onset sepsis and NEC in preterm infants.18 Moreover, one also needs to be aware that infant oxygen levels can drop quickly (up to 8% per second), partly due to their small pulmonary reserve, and such drops can be missed by spot interval nursing documentation.19 In our study, we did not find a good correlation between nursing documentation and monitor-recorded data for the percentage of hypoxia and tachypnea events. This correlation was even poorer in babies considered relatively more stable, such as those on RA or NC. It is also important to note that nursing documentation failed to identify any hypoxia event in close to 50% of infants, and any tachypnea event in more than 25% of infants, who required escalation in respiratory support. In summary, we did not find a strong association between documented nursing assessment and monitor-recorded events, and documented nursing data were not able to predict future escalation events. However, it is possible that bedside assessment on the day contributed to the change in respiratory care. Whether decision making driven by information from continuous monitoring is better than the current standard of care requires a separate study and is a logical next step. A minor limitation of our study is that we used raw data from the monitor and did not edit it for possible artifacts. However, in previous studies the reported difference between raw and edited data was very low. The difference in median hypoxia time (SpO2 < 85%) during NICU stay between original and revised data in the COT trial was reported to be only 0.90%,7 and the difference between raw and edited data from 24-hour SpO2 monitoring in infants ready for discharge was only 0.70%.20 Given these published differences of less than 1%, we do not think this would affect our findings. We also found that the odds of increasing respiratory support were significantly higher in more premature infants (GA < 28 weeks) and in infants with lower birth weights (birth weight < 1500 g), which suggests that they may benefit from more objective monitoring.

It is essential to minimize the work of breathing of premature babies and thereby decrease their overall energy expenditure. This, in turn, can support better growth in premature babies. While it is generally assumed that tachypnea or increased work of breathing causes poor weight gain, we are not aware of any study addressing this prospectively. We found an overall negative correlation between tachypnea and rate of weight gain. Although there was no linear correlation between tachypnea and weight gain on RA, we did find a very significant drop in z-scores when tachypnea was present more than 30% of the time. This suggests that these infants are probably diverting a good portion of their calories to breathing and less to growth. Premature infants with active respiratory disease do seem to have significantly higher energy expenditure, as shown by urinary hypoxanthine levels, a marker of ATP breakdown.21 This is important, as the overall respiratory pattern from the monitors can be an additional tool in assessing poor weight gain. Hypoxia has been shown to negatively affect growth, at least in BPD patients post discharge.22,23 Multiple desaturation events can deplete energy stores by reducing the rate of oxidative phosphorylation in aerobic respiration and thereby ATP synthesis.24 Our finding of lower weight z-scores with more hypoxia events is in line with these studies. These findings suggest that monitor-recorded vital signs are a potentially rich source of information when assessing the success of respiratory weaning and planning patient care in the context of the patient’s complete clinical status.

Conclusion

This is the first study to compare continuous and interval vital signs monitoring in the NICU. Vital signs as recorded by the monitor are an underutilized resource that has the potential to improve short-term respiratory outcomes and should not be discarded without consideration. While trends in RR and peripheral SpO2 could provide a useful adjunct to clinical presentation, these data are currently available only temporarily, for 24 h. Nursing documentation is the only vital signs information stored in the medical record. Integrating the 1440 data points recorded daily into an easily interpreted histogram can provide valuable information on the time each infant spends in a stable respiratory zone vs. showing signs of respiratory distress. The use of computerized algorithms for analyzing monitor events and oxygen delivery is gradually gaining application in clinical practice, both in preterm neonates25,26 and in adults,27 paving the way for more precise targeting of current therapies. Although it would be premature to suggest a change in practice, the findings of our study emphasize the potential underutilization of this readily available resource as an adjunct to bedside assessment in the care of our neonatal population. Future studies, preferably randomizing infants to monitor-based interventions versus care with usual data, will be needed to see whether clinical outcomes are affected.