Introduction

The risks of mechanical ventilation (MV) to premature lungs are well known. Even a brief exposure to large volume breaths can initiate an inflammatory cascade leading to bronchopulmonary dysplasia (BPD) [1,2,3]. Avoiding intubation in the delivery room (DR) and stabilizing infants with continuous positive airway pressure (CPAP) improve outcomes [4,5,6]. In recent years, preterm infants have increasingly been admitted to the neonatal intensive care unit (NICU) on CPAP [7, 8]. However, 40–65% of these infants require intubation and mechanical ventilation during the first three days of life [8,9,10]. Importantly, infants failing CPAP have worse outcomes compared to those who successfully remain on CPAP [11,12,13]. Identifying infants at risk for CPAP failure could help target early interventions to avoid intubation and MV.

Previous studies that evaluated both antenatal and immediate postnatal factors to predict CPAP failure have provided varying results [11, 13,14,15]. A single center retrospective study by Ammari et al. reported that birth weight ≤750 g, gestational age (GA) <26 weeks, alveolar-arterial oxygen gradient (A-a DO2) >180 mmHg, and severe respiratory distress syndrome (RDS) on the initial chest radiograph were predictive of CPAP failure among infants born ≤1250 g. However, the positive predictive value (PPV) of these variables ranged from 50 to 55% [14]. A subsequent study from the same center reported a much higher PPV with severe RDS on initial chest radiograph (0.81), but the model lacked sensitivity (32%) [16]. De Jaegere et al. reported that a combination of male gender, birth weight <800 g and a fraction of inspired oxygen (FiO2) threshold of ≥0.25 at 1 and 2 h of life (HOL) was predictive of CPAP failure in infants born <30 weeks’ GA [17]. A two-center prospective study by Dargaville et al. suggested that a maximum FiO2 ≥ 0.3 at 2 HOL predicted CPAP failure in 25–28 week GA infants [11]. Although a model based on FiO2 is attractive for clinicians, inconsistencies between studies raise questions about the generalizability of such a threshold. In addition, despite several studies evaluating early predictors of CPAP failure, none of the proposed prediction models has been validated [18].

The objective of our study is to first establish early predictors of CPAP failure using a modeling cohort (MC) to identify a population of infants at risk of requiring intubation within 72 HOL, and then to validate the model using a separate validation cohort (VC) of infants with similar demographics.

Methods

Setting

Parkland Hospital and Health System (PHHS) is a large public hospital with over 13,000 deliveries annually and a dedicated resuscitation team that attends high risk deliveries. All infants born ≤32 weeks’ GA who demonstrate respiratory distress are immediately started on CPAP using a T-Piece resuscitator (Neopuff™, Fisher & Paykel, Auckland, NZ). Face mask positive pressure ventilation (Fm-PPV) is provided when indicated per Neonatal Resuscitation Program (NRP) guidelines. Resuscitation is started at 0.21 FiO2 and titrated to meet NRP goal saturation parameters. Peak inspiratory pressure (PIP) is usually set at 25 cm H2O and positive end expiratory pressure (PEEP) at 5 cm H2O. Infants stabilized on face mask CPAP are then started on bi-nasal prongs (Hudson Prongs, Hudson RCI, Morrisville, NC) connected to a PEEP valve, running 8–10 l min−1 flow before transport to the NICU. Resuscitation details are verbalized by the team members and entered in a paper form in real time by a trained obstetric nurse. A peripheral arterial blood gas (ABG) is obtained upon admission (usually from the right radial artery), and a chest radiograph is obtained upon admission or soon after placing the umbilical catheters. Hudson prongs are used exclusively as the nasal interface. Starting May 2014, the CPAP generator was changed from a ventilator to a bubble device (Fisher & Paykel Healthcare, Auckland, NZ). Regular training of nurses and respiratory therapists and bedside auditing of the CPAP delivery system, based on previously published guidelines [19], were conducted throughout the study period. There was no set algorithm for escalation of CPAP level, and clinical decisions were left to the individual treatment teams. Infants were intubated in the NICU for surfactant therapy if they required 0.45–0.5 FiO2 at a CPAP level of 5–7 cm H2O or for frequent apnea. The pulse oximeter oxygen saturation target limits in the PHHS NICU are maintained between 88 and 94%.
All infants admitted on CPAP are started on ampicillin and gentamicin after initiation of an evaluation for sepsis, and antibiotics are stopped at 48 h if the blood cultures are negative.

The antenatal steroid (ANS) administration policy was changed in September 2015 from the previous practice of excluding women with pregnancy-induced hypertension (PIH) and diabetes mellitus to now include all mothers with an impending preterm delivery between 24 and 34 weeks’ GA [20, 21]. All mothers at risk of preterm delivery <28 weeks’ GA received magnesium sulfate (MgSO4) for neonatal neuroprotection [22].

Study population

All preterm infants 23–29 weeks’ GA admitted to the NICU on CPAP between January 2013 and April 2018 were included in the study. Infants intubated in the DR and those receiving only comfort care measures were excluded. Infants were randomly assigned to either the MC or the VC using a random number generator with a 2:1 ratio.
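As an illustration only, a 2:1 random assignment of this kind could be sketched as follows. The function name, the seed, and the shuffle-then-split approach are assumptions made for the example; the text does not specify how the random number generator was applied. The cohort size of 284 is the number of infants admitted on CPAP reported in the Results.

```python
import random


def assign_cohorts(infant_ids, ratio=(2, 1), seed=2018):
    """Randomly split IDs into a modeling and a validation cohort.

    Hypothetical helper: shuffles the IDs with a seeded generator and cuts
    the shuffled list at the requested ratio (default 2:1).
    """
    ids = list(infant_ids)
    rng = random.Random(seed)  # arbitrary seed, fixed only for reproducibility
    rng.shuffle(ids)
    n_modeling = round(len(ids) * ratio[0] / sum(ratio))
    return ids[:n_modeling], ids[n_modeling:]


# 284 placeholder IDs stand in for the infants admitted on CPAP.
modeling, validation = assign_cohorts(range(1, 285))
print(len(modeling), len(validation))  # 189 95
```

Shuffling and then cutting the list fixes the cohort sizes in advance. Assigning each infant independently with probability 2/3 would give only approximately a 2:1 split.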

Ethical consideration

The study was approved by the Institutional Review Board of the University of Texas Southwestern Medical Center with a waiver of consent for retrospective data collection.

Data collection

Details of DR resuscitation such as the need for Fm-PPV, intubation, maximum FiO2, Apgar scores, umbilical cord blood gas values, CPAP levels, and FiO2 at admission and at one and two HOL, admission blood gas values and the CPAP level at that time, and the FiO2 and blood gas values at the time of CPAP failure were collected retrospectively from the electronic medical record (EMR) by members of the study team. Outcome measures such as rates of pneumothorax, BPD and intraventricular hemorrhage (IVH) were obtained from the NICU database. The first chest radiograph was extracted without patient identifiers, and a unique identifier code was applied before it was assigned to a blinded radiologist for evaluation.

Study outcomes

Maternal details and neonatal characteristics were compared between the CPAP failure group (CFG) and the CPAP success group (CSG) in both the MC and VC. CPAP failure was defined as the need for intubation within 72 h of life. Administration of a single dose of betamethasone two hours prior to delivery was considered as exposure to ANS. Maximum FiO2 was calculated from the recorded FiO2 at 1 and 2 HOL. The following indices were calculated: respiratory severity score = CPAP level × FiO2; A-a DO2 = (FiO2 × 713) − (PaCO2/0.8) − PaO2; and arterial/alveolar PO2 ratio (a/A PO2) = PaO2/[(FiO2 × 713) − (PaCO2/0.8)]. A radiologist who was blinded to all clinical information and study design evaluated the first chest radiograph obtained upon admission for lung volume, granularity of the lung parenchyma, degree of atelectasis, visibility of air-bronchograms, presence of interstitial emphysema, and silhouetting of heart borders. The radiologic diagnosis of severe RDS was made in the presence of low lung volumes, diffuse reticular granular pattern and prominent peripheral air bronchograms, or of low lung volumes, diffuse opacification of the lungs and indistinct cardiac borders [23]. BPD was defined as the need for supplemental oxygen at 36 weeks’ postmenstrual age [24]. Severe IVH was defined as grade III or greater, unilaterally or bilaterally, on any ultrasound scan of the head as per Papile criteria [25]. Severe retinopathy of prematurity (ROP) was defined as Stage 3 or greater based on the international classification of ROP [26].
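The three derived indices defined above can be written out as a short worked example. This is a minimal sketch: the function names and the sample values are hypothetical and are not study data. The constants 713 and 0.8 come directly from the formulas in the text.

```python
# Constants taken from the formulas in the text.
PIO2_FACTOR = 713           # mmHg, multiplies FiO2 in the alveolar gas equation
RESPIRATORY_QUOTIENT = 0.8  # divisor applied to PaCO2


def alveolar_po2(fio2, paco2):
    """Alveolar PO2 (mmHg): (FiO2 x 713) - (PaCO2 / 0.8)."""
    return fio2 * PIO2_FACTOR - paco2 / RESPIRATORY_QUOTIENT


def respiratory_severity_score(cpap_level, fio2):
    """Respiratory severity score: CPAP level (cm H2O) x FiO2."""
    return cpap_level * fio2


def aa_do2(fio2, paco2, pao2):
    """Alveolar-arterial oxygen gradient (mmHg)."""
    return alveolar_po2(fio2, paco2) - pao2


def a_to_a_ratio(fio2, paco2, pao2):
    """Arterial/alveolar PO2 ratio (dimensionless)."""
    return pao2 / alveolar_po2(fio2, paco2)


# Hypothetical admission values, chosen only to illustrate the arithmetic.
fio2, cpap, paco2, pao2 = 0.40, 6, 48.0, 60.0
print(f"RSS     = {respiratory_severity_score(cpap, fio2):.1f}")  # 2.4
print(f"A-a DO2 = {aa_do2(fio2, paco2, pao2):.1f} mmHg")          # 165.2 mmHg
print(f"a/A PO2 = {a_to_a_ratio(fio2, paco2, pao2):.2f}")         # 0.27
```

For these values the alveolar PO2 is (0.40 × 713) − (48/0.8) = 225.2 mmHg, from which both the gradient and the ratio follow.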

Statistical analysis

Analysis was performed using SPSS version 19 (IBM). Categorical variables were analyzed by chi-square or Fisher’s exact test as applicable. Continuous variables were analyzed by Student’s t-test or Mann–Whitney U-test. The level of statistical significance (alpha) for all univariate tests was 0.05. Any significant missing data (≥5%) were reported separately.

Creation of the prediction model

All variables with a P value < 0.1 on univariate analysis were included in the multivariate forward stepwise logistic regression analysis. Correlation analysis was conducted to identify any collinearity among these variables. If two variables had high collinearity (Pearson r ≥ 0.6), one of them was removed. For categorical variables, Phi and Cramér’s V were used to evaluate the strength of association between variables. Continuous variables were dichotomized using the receiver operating characteristic (ROC) curve at the point of intersection of maximum sensitivity and specificity. The forward stepwise multivariate logistic model allowed only statistically significant variables into the model.
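The dichotomization step can be illustrated with a small sketch. It assumes one reading of the rule above, namely choosing the observed value at which sensitivity and specificity are closest to each other (where the two curves cross). The function names and the ten data points are hypothetical and are not study data.

```python
def sens_spec(values, outcomes, cutoff):
    """Sensitivity and specificity of the rule `value > cutoff` predicts outcome 1."""
    pairs = list(zip(values, outcomes))
    tp = sum(1 for v, y in pairs if v > cutoff and y == 1)
    fn = sum(1 for v, y in pairs if v <= cutoff and y == 1)
    tn = sum(1 for v, y in pairs if v <= cutoff and y == 0)
    fp = sum(1 for v, y in pairs if v > cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_cutoff(values, outcomes):
    """Observed value at which sensitivity and specificity are closest."""
    def gap(cutoff):
        sens, spec = sens_spec(values, outcomes, cutoff)
        return abs(sens - spec)
    return min(sorted(set(values)), key=gap)


# Made-up maximum FiO2 values and outcomes (1 = CPAP failure), for illustration.
max_fio2 = [0.21, 0.25, 0.25, 0.30, 0.30, 0.35, 0.40, 0.45, 0.50, 0.60]
failed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

cutoff = balanced_cutoff(max_fio2, failed)
print(cutoff, sens_spec(max_fio2, failed, cutoff))  # 0.3 (0.8, 0.8)
```

Other conventions exist, such as maximizing the sum of sensitivity and specificity (Youden's index), and they can select a different cut-off on the same data.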

Results

Of the 529 infants who were actively resuscitated, 284 infants (54%) were admitted to the NICU on CPAP. Of these, 189 infants (66%) were assigned to the MC and 95 (34%) to the VC. Infants in both cohorts were further grouped as CSG or CFG (Fig. 1). Among the entire cohort, nine infants (3%) were intubated within two HOL and 15 infants (5%) did not have an ABG on admission.

Fig. 1 Flow diagram of study population showing number of infants included in CPAP success group (CSG) and CPAP failure group (CFG) in both modeling cohort (MC) and validation cohort (VC)

Modeling cohort

Table 1 describes the MC. Of the 189 infants in this cohort, 95 were in the CSG and 94 in the CFG. There were no differences in maternal age and ethnicity. The proportion of mothers receiving ANS was lower in the CFG than in the CSG. Mothers in the CFG had a higher proportion of PIH, exposure to intrapartum MgSO4 and Cesarean section. Infants in the CFG also had a lower median birth weight than those in the CSG. In addition, median maximum FiO2 within two HOL and A-a DO2 were higher and median a/A PO2 was lower in the CFG than in the CSG. A higher proportion of infants in the CFG had severe RDS on the first chest radiograph. The CFG also had a higher incidence of BPD than the CSG (17% vs. 4%, P ≤ 0.01).

Table 1 Comparison between CPAP success group (CSG) and CPAP failure group (CFG) in the modeling cohort (MC)

Among infants who failed CPAP in the MC, the median time to CPAP failure was 5.9 h. There were no differences between the 23–26 weeks’ and 27–29 weeks’ GA categories in the time to CPAP failure, median FiO2, CPAP level, A-a DO2, a/A PO2, and PaCO2 (Table 2).

Table 2 Respiratory status at the time of CPAP failure in the modeling cohort (MC)

The proportion of infants with PIH and exposure to magnesium sulfate were collinear with Cesarean section. Similarly, median PaO2, A-a DO2, and birth weight were collinear with median maximum FiO2 within two HOL, a/A PO2 and GA, respectively. Continuous variables were dichotomized at maximum FiO2 > 0.3 within two HOL (sensitivity: 0.75, specificity: 0.68, PPV 0.71) and a/A PO2 ≤ 0.22 (sensitivity: 0.39, specificity: 0.93, and PPV 0.66).
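The pairwise collinearity screen described in the Methods (Pearson r ≥ 0.6) can be sketched as below. The helper names and the six data points per variable are hypothetical and are not study data. Using the absolute value of r is an assumption made here so that strong negative correlations are also flagged.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(variables, threshold=0.6):
    """Names of variable pairs whose |r| meets the threshold (assumed rule)."""
    flagged = []
    for a, b in combinations(variables, 2):
        if abs(pearson_r(variables[a], variables[b])) >= threshold:
            flagged.append((a, b))
    return flagged


# Made-up values for six infants, for illustration only.
candidates = {
    "birth_weight": [600, 750, 800, 950, 1000, 1200],
    "gestational_age": [24, 25, 26, 27, 28, 29],
    "max_fio2": [0.30, 0.21, 0.45, 0.25, 0.40, 0.21],
}
print(collinear_pairs(candidates))  # [('birth_weight', 'gestational_age')]
```

Only one variable from each flagged pair would then be carried forward into the stepwise regression.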

The forward stepwise logistic regression model showed that a combination of three factors (maximum FiO2 > 0.3 within two HOL, radiographic severe RDS on the first chest radiograph, and a/A PO2 ≤ 0.22) predicted CPAP failure, with an area under the receiver operating characteristic curve (AUC) of 0.78 (Table 3).

Table 3 Forward stepwise multivariate logistic regression model for CPAP failure in both modeling cohort and validation cohort

Validation cohort

Table 4 describes the VC. Of the 95 infants in the VC, 46 were in the CSG and 49 were in the CFG. There were no differences between the two groups in maternal age, ethnicity, exposure to ANS and Cesarean section rate. However, mothers in the CFG had a significantly higher rate of PIH than those in the CSG. In addition, compared to the CSG, infants in the CFG had lower median birth weight and a/A PO2, higher median maximum FiO2 within two HOL and a higher proportion of radiographic severe RDS.

Table 4 Comparison between CPAP success group (CSG) and CPAP failure group (CFG) in the validation cohort (VC)

Forward stepwise logistic regression of the VC confirmed maximum FiO2 within two HOL and radiographic severe RDS as the two most important predictors of CPAP failure (AUC 0.81). However, a/A PO2 ≤ 0.22 was not a significant predictor of CPAP failure in the VC (Table 3).

Discussion

Our single center study shows that maximum FiO2 > 0.3 within two HOL and severe RDS on the initial chest radiograph are, in combination, the most important predictors of CPAP failure among infants born ≤ 29 weeks’ GA. To the best of our knowledge, this is the first study to establish a validated prediction model for CPAP failure.

Our study differs from the previous reports by Dargaville et al., in which a maximum FiO2 ≥0.3 within two HOL was the single most important predictor of CPAP failure [11], and by Tagliaferro et al., in which severe RDS on initial chest radiograph predicted CPAP failure [16]. In our study, although FiO2 was the primary driver of the predictive model in both cohorts, severe RDS on the first radiograph was also a significant predictor, adding further strength to the model. The differences between the previous studies and our study could be attributed primarily to study design. Tagliaferro et al. did not evaluate the predictive value of FiO2 in the first few hours of life, and Dargaville et al. did not evaluate the severity of RDS on the first radiograph.

The success rates of CPAP vary between centers [27]. Several factors may contribute to the success of CPAP therapy in an individual unit [28]. The role of higher versus lower FiO2 thresholds and CPAP level in the success of CPAP therapy is not clear. De Jaegere et al. proposed a predictive model including male gender, birth weight <800 g and FiO2 ≥ 0.25 at 1 and 2 HOL for infants born <30 weeks’ GA from a center that intubated infants at 0.4 FiO2 at a CPAP level of 7 cm H2O [17]. The Dargaville et al. study used an FiO2 threshold of 0.5 at a CPAP level of 8 cm H2O. Our center used a CPAP level of 5–7 cm H2O at an FiO2 of 0.45–0.5. The average time to failure in our study is comparable to that in the Dargaville et al. study, although the median FiO2 at the time of intubation was lower (0.4 vs. 0.5) [11]. However, the CPAP failure rates were similar in these studies (40–45%). Interestingly, Fuchs et al. showed that targeting 0.35–0.45 FiO2 can decrease the time to intubation without significantly increasing the risk of unnecessary intubation compared to a threshold of 0.6 FiO2 at a CPAP level of 8 cm H2O [13]. This could explain the similar CPAP failure rates between studies, despite differences in the thresholds used. Our proposed model incorporates the FiO2 threshold suggested by Dargaville et al. and the radiographic model suggested by Tagliaferro et al. and therefore will mitigate the bias introduced by center differences in FiO2 thresholds and improve the sensitivity of the prediction model. The level of CPAP did not add strength to the predictive model beyond that provided by FiO2 > 0.3 alone. This is in concordance with the findings of Dargaville et al. [11].

Identifying the group of infants that will fail CPAP by two HOL is important to target interventions to decrease CPAP failure. Increasing the CPAP level improves functional residual capacity [29], decreases work of breathing [30], and decreases intrapulmonary shunting and ventilation perfusion mismatch in the setting of RDS, thus decreasing oxygen requirement [31]. Concerns have been raised about the risk of pneumothorax when surfactant administration is delayed by using higher CPAP levels [9], but large studies using levels up to 7 cm H2O show no increase in air leak [10]. In addition to the higher FiO2 thresholds used for intubation, the higher success rate reported by a single center compared to all other centers may be related to additional factors such as better positioning, use of a chin strap, appropriate prong size, airway clearance, frequent monitoring and buy-in from the multidisciplinary team [28]. In order to minimize exposure to MV, prophylactic rescue surfactant therapy using an Intubation SURfactant and rapid Extubation (INSURE) strategy was studied [32,33,34]. This strategy did not decrease the need for MV compared to early CPAP and rescue surfactant [35]. Early rescue INSURE using a lower FiO2 threshold has been suggested as an alternative to rescue INSURE or rescue surfactant and continued MV for infants at risk for CPAP failure [36]. The Less Invasive Surfactant Administration (LISA) strategy of tracheal instillation of surfactant using a thin catheter while continuing CPAP decreases the need for MV [37,38,39]. Two randomized controlled trials (RCTs) used a threshold FiO2 of 0.3 at a CPAP level of 4–6 cm H2O [37, 38, 40]. A single RCT compared LISA with INSURE at FiO2 0.4 at a CPAP level of 5–7 cm H2O [39]. Currently, a large RCT is underway comparing LISA with rescue surfactant and MV using a threshold of FiO2 > 0.3 with a CPAP level ≥7 cm H2O [41]. Further, a recent survey conducted in the United States indicates that the use of LISA is currently very limited [42].

Our study has several strengths. First, this is the first study to validate the predictive model using a separate VC. Infants were randomly assigned to each cohort to minimize the effect of any temporal changes in practice. In addition, a large number of variables that are routinely available during the first few hours of life were evaluated for inclusion in the prediction model. Second, the predictors of CPAP failure in our study, although different, are in alignment with the previous models that suggested FiO2 and severity of RDS by radiograph as the most important predictors [11]. The lower FiO2 threshold and CPAP level used for intubation for surfactant therapy in our cohort, compared to the two previous studies evaluating predictors of CPAP failure, provided the opportunity to test the proposed model in a different setting. Third, 95% of infants in the study cohort had an ABG upon admission to the NICU. This provided a unique opportunity to evaluate the utility of a/A PO2 and A-a DO2 as predictors of CPAP failure. A previous study by Ammari et al. suggested that an A-a DO2 of 180 and severe RDS on initial chest radiograph are predictors of CPAP failure [14, 43]. In our study, A-a DO2 was highly collinear with FiO2 and therefore was not included in the model. This confirms the observation from previous studies which showed that SpO2 is highly correlated with PaO2 in preterm neonates with acute and chronic lung disease [33]. Lastly, the initial chest radiograph was read by a single radiologist who was blinded to patient information and study design. This study design decreases the possibility of bias.

There are several limitations to this study. First, the ANS rate in our study population is lower compared to previous reports. The lower ANS rate noted among the CFG compared to the CSG in the MC could be explained by the lower ANS rate in our cohort overall. However, ANS exposure did not add further strength to the predictive model in either cohort. Second, although the threshold for intubation and surfactant is generally uniform in our unit, variations in practice exist between treating teams. This limitation was also noted in other similar studies [16]. Third, although by protocol ABGs were drawn from the right radial artery (pre-ductal), in some situations the ABG may have been drawn from the left radial artery or from umbilical arterial lines. This may affect the a/A PO2 and A-a DO2 values. Fourth, infants of 23–24 weeks’ GA accounted for only 5% of the entire cohort. Therefore, caution should be exercised when using these thresholds for infants at the edge of viability. Further studies in this small group of neonates are required to define the right strategy. Lastly, all radiographs were read by a single pediatric radiologist, and the inter-rater reliability of the evaluation of chest radiographs between radiologists and neonatologists has not been tested. However, identification of severe RDS on radiographs is routinely based on standards widely described in the literature and is less likely to be subjective than identification of radiographic mild or moderate RDS [44].

In conclusion, this retrospective study shows that an FiO2 > 0.30 within two HOL and severe RDS on the first chest radiograph predict CPAP failure in neonates born <30 weeks’ GA. This threshold needs to be further evaluated in a multicenter study involving a larger cohort of infants cared for in different settings. Such a threshold can help target early interventions to decrease exposure to mechanical ventilation.