Introduction

Oxygen has been used for the treatment of many conditions since the eighteenth-century English scientist Joseph Priestley first liberated this ‘dephlogisticated air’ from mercuric oxide. However, it was not widely used for the care of neonates until the development of sealed isolettes in the 1940s made it feasible to maintain high concentrations for extended periods of time.1 The excitement generated by this innovation was quickly dampened by an explosion of blindness from retinopathy of prematurity (ROP), affecting an estimated 10 000 infants worldwide over the decade following the introduction of routine high oxygen concentration therapy.2, 3

Although oxygen was implicated in the ROP epidemic, the perception among many pediatricians that high oxygen concentrations decreased mortality allowed this practice to continue. It would take until 1955 for the pattern to change, when a landmark randomized controlled trial was published demonstrating a 66% reduction in ROP for the low oxygen concentration group, with no increased risk of mortality.4 This study set off a cascade of fierce debate and a series of seemingly contradictory studies,5, 6, 7 highlighted by the SUPPORT trial, published in 2010, that demonstrated increased mortality but decreased incidence of ROP in the lower oxygen saturation group,8 and the COT trial, published 3 years later, that demonstrated no difference in ROP or mortality between the high and low saturation groups.9

Over the past two decades, the focus of investigation has shifted from measurement of the partial pressure of oxygen in arterial blood (PaO2) to the oxygen saturation of hemoglobin measured using simultaneous pulse oximetry (SpO2), a technology that provides a continuous noninvasive measure highly correlated with PaO2. Although this technology rapidly became the standard of care, owing largely to its relatively low cost and simplicity, it has a narrow range of acceptable saturations, the consensus being 90 to 95%,10 although a solid definition of these limits remains elusive.

Near-infrared spectroscopy (NIRS) functions in a manner similar to pulse oximetry, using the difference in absorptive qualities of oxy- and deoxyhemoglobin to infrared light to quantify the percent saturation. However, NIRS measurements are not pulse synchronized and are more heavily weighted to the venous compartment, essentially a weighted average of the arterial and venous saturations.11 NIRS has been used extensively12, 13, 14 to study cerebral hemodynamics in preterm infants with normative data suggesting a much broader range of values within the normal range (55 to 85%).15, 16 Given that the retinal vascular bed arises from the cerebral vasculature, cerebral NIRS (cNIRS) measurements may also be reflective of the retinal circulation. Fractional tissue oxygen extraction (FTOE), a measure of the proportion of intra-arterial oxygen extracted and consumed by the target tissue,14 can be calculated from the cNIRS and pulse oximetry values, affording an opportunity to assess the balance between oxygen delivery and consumption and thereby detecting a hyperoxic state.

In this study, we report a direct comparison of hyperoxia burden, calculated using 8.9 million cerebral NIRS and pulse oximetry measurements, in order to predict severe ROP in preterm infants. We hypothesize that the broader range of acceptable NIRS measurements will allow for better discrimination between dangerous and acceptable levels of oxygenation during this developmentally vulnerable period.

Methods

Patient selection

Infants born at <30 weeks of gestation by best obstetrical estimate, weighing <1500 g and who were admitted to the neonatal intensive care unit at St Louis Children’s Hospital were enrolled as close to birth as possible for a prospective monitoring study conducted between 2012 and 2015. This population was chosen to represent preterm infants at highest risk for ROP; all infants meeting these criteria routinely undergo ophthalmologic screening for ROP at our institution. Infants were excluded from recruitment if there was an antenatally diagnosed chromosomal anomaly, if the infant was >24 h old at the time of recruitment or if the infant was medically unstable and not expected to survive the first week of life. Infants were excluded from analysis if they died before ophthalmologic examination or developed grade III/IV intraventricular hemorrhage (a potential confounder known to alter cNIRS values14). The study protocol was reviewed and approved by the Human Research Protection Office at the Washington University School of Medicine and informed written consent was obtained from the parents before the start of the study procedure.

Maternal and infant clinical characteristics

Perinatal factors were collected including antenatal steroid or magnesium sulfate administration, presence or absence of pathologic diagnosis of chorioamnionitis and method of delivery (Cesarean or vaginal). Infant clinical characteristics were also collected from the infant’s medical record including gestational age, birth weight centile (calculated from the Fenton growth charts17), sex, race, Apgar score, length and method of respiratory support and any diagnosis of bronchopulmonary dysplasia (defined as need for supplemental oxygen past 36 weeks of postmenstrual age18). A CRIB-II (revised Clinical Risk Index for Babies) score was calculated for each infant using the algorithm defined by Parry et al.19 The highest stage of ROP disease in either eye was obtained from the Ophthalmology record. Severe ROP was defined as the need for diode laser treatment in the setting of high-risk prethreshold or threshold disease as determined by the pediatric ophthalmologist caring for the patient. The fraction of inspired oxygen (FiO2) during the study period was obtained from the electronic medical record.

Institutional practices

During the study period, our institutional practice was to maintain oxygen saturations, measured by pulse oximetry, between 88 and 93% by adjusting the ambient FiO2. The bedside pulse oximeter was programmed with an alarm that would notify the bedside nurse when the saturations were out of range and adjustment was indicated.

At our institution, high-risk infants (estimated gestational age <30 weeks, birth weight <1500 g) undergo ROP screening at the bedside starting in the fourth postnatal week or 31 weeks postmenstrual age, whichever is later,20 and the diagnosis of ROP is made consistent with the most recent revision of the International Classification of Retinopathy of Prematurity.21 Consistent with the findings of the ETROP study, infants with threshold (5 contiguous clock hours or 8 total clock hours of stage 3 ROP and plus disease in zone I or II) or high-risk prethreshold disease (any stage ROP with plus disease in zone I, stage 3 ROP without plus disease in zone I or stage 2/3 ROP with plus disease in zone II) undergo laser therapy.22

Procedure

NIRS data collection

Cerebral tissue oxygen saturation (SctO2) was obtained using 4-wavelength (690, 780, 805 and 850 nm) near-infrared spectroscopy (FORE-SIGHT, CAS Medical Systems, Branford, CT, USA) with a transducer containing a fiber optic emitter and one detector located 25 mm from the light source. A nonadhesive optode (FORE-SIGHT sensor kit small, CAS Medical Systems) was placed on the frontoparietal scalp, secured by a soft head band and recording was conducted over the first 96 h after birth.

Maintenance of skin integrity in this highly vulnerable population was a high priority; therefore, a ‘safe-skin’ protocol was developed in order to prevent the occurrence of bruising/pressure ulcers. This plan included limiting individual recordings to no more than 12 consecutive hours, after which time the optode was displaced 1 to 2 cm laterally or medially. If redness or irritation developed, the infant was given a 6 to 12 h ‘rest period’ to prevent worsening of the condition. Any interruption in recording was noted by the bedside nurse or research team member.

Pulse oximetry data collection

Pulse oximetry data (SpO2) were collected in a time-synchronized manner with the cNIRS data using the Nellcor OxiMax algorithm integrated into the bedside patient monitor (Philips MP70 equipped with multi-measurement module M3001A-A04, Philips Healthcare, Andover, MA, USA) using an adhesive probe placed on the hand or foot (Neonatal-Adult SpO2 Sensor, Covidien, Mansfield, MA, USA).

Analysis

Preprocessing

The cNIRS and SpO2 data streams were extracted from the source data file. Both data streams underwent multistep preprocessing to eliminate missing or invalid data. The data were partitioned into 1 min epochs (30 serial, non-overlapping samples) and inspected for (1) interrupted regions of the recording (as noted in the research record), (2) regions of the recording tagged by the NIRS or SpO2 device where it was not able to properly measure saturations (for example, probe not in contact with the skin) and (3) regions with sudden, nonphysiologic changes in the baseline or excessive variance, based on the sliding-window motion artifact rejection algorithm proposed by Ayaz et al.23 The entire data epoch was rejected if either data stream failed one or more of these checks or if continuous measurements were not available for both data sources.
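The epoching and paired-rejection rule described above can be sketched as follows. This is a minimal Python illustration and not the in-house MATLAB package used in the study: the function names, the use of None to mark a sample flagged as invalid and the 0 to 100% plausibility check are assumptions made for the example, and the motion artifact algorithm of Ayaz et al. is not reproduced.

```python
from typing import List, Optional, Sequence, Tuple

# 1 min epoch at a 0.5 Hz sampling rate = 30 samples (from the text).
SAMPLES_PER_EPOCH = 30

Epoch = List[Optional[float]]


def split_epochs(samples: Sequence[Optional[float]],
                 size: int = SAMPLES_PER_EPOCH) -> List[Epoch]:
    """Partition a stream into serial, non-overlapping epochs.

    A trailing partial epoch is dropped (an assumption of this sketch).
    """
    n_full = len(samples) // size
    return [list(samples[i * size:(i + 1) * size]) for i in range(n_full)]


def epoch_is_valid(epoch: Sequence[Optional[float]]) -> bool:
    """Hypothetical stand-in for the three checks in the text:
    every sample must be present (not None) and lie within 0-100 %."""
    return all(v is not None and 0.0 <= v <= 100.0 for v in epoch)


def paired_valid_epochs(spo2: Sequence[Optional[float]],
                        scto2: Sequence[Optional[float]]
                        ) -> List[Tuple[Epoch, Epoch]]:
    """Keep an epoch only if BOTH data streams pass the validity check."""
    pairs = zip(split_epochs(spo2), split_epochs(scto2))
    return [(a, b) for a, b in pairs if epoch_is_valid(a) and epoch_is_valid(b)]
```

In this sketch a single flagged sample in either stream discards the whole 1 min epoch for both streams, mirroring the rule that the entire epoch is rejected if either data stream fails a check.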

FTOE calculation

In order to evaluate the balance of oxygen delivery and consumption in the vascular region that includes the retinal artery, the fractional tissue oxygen extraction was calculated. As previously noted, the FTOE represents the proportional difference in hemoglobin oxygen saturation between the arterial and venous systems and is calculated as (SpO2−SctO2)/SpO2. This approach has successfully been used by other researchers to investigate patterns of FTOE in premature infants and has an established normative range.14, 15 For the purposes of this study, the FTOE was calculated for all error-corrected data pairs of SpO2 and SctO2.
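As a worked illustration of the formula (SpO2−SctO2)/SpO2, the short Python sketch below computes the FTOE for a single paired sample. The function name and the example saturations are hypothetical and are not taken from the study data.

```python
def ftoe(spo2: float, scto2: float) -> float:
    """Fractional tissue oxygen extraction: (SpO2 - SctO2) / SpO2.

    Both inputs are percent saturations; the result is a fraction
    (multiply by 100 to express it as a percentage).
    """
    if spo2 <= 0:
        raise ValueError("SpO2 must be positive")
    return (spo2 - scto2) / spo2


# Hypothetical paired sample: SpO2 of 92% with a cerebral SctO2 of 80%.
example = ftoe(92.0, 80.0)  # 12 / 92, roughly 0.13 (13%)
```

With these illustrative values the FTOE is approximately 13%, meaning that the tissue is extracting a relatively small share of the delivered oxygen.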

Hyperoxia burden calculation

Hyperoxia burden was calculated as a percentage of the cumulative, error-corrected SpO2 and FTOE recordings with measurements exceeding defined thresholds (>90/93/95% and <20/15/10%, respectively). These threshold values were chosen a priori based on published empiric data15, 24 to represent a ‘low acceptable’ level (90% for SpO2 and 20% for FTOE), a ‘high acceptable’ level (93% for SpO2 and 15% for FTOE) and beyond the acceptable limit (95% for SpO2 and 10% for FTOE). Only those infants with >12 h of cumulative error-free recording time were considered for hyperoxia burden calculation. All signal processing was conducted using an in-house software package developed in MATLAB 8.6 (The Mathworks, Inc., Natick, MA, USA).
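The burden calculation described above amounts to counting the share of error-free samples beyond a threshold. The Python sketch below illustrates this under stated assumptions: the function and constant names are hypothetical, FTOE values are held as fractions (0.15 for 15%), comparisons are strict and the 12 h minimum is converted to a sample count using the 0.5 Hz sampling rate. It is not the study's MATLAB implementation.

```python
from typing import Optional, Sequence

SAMPLE_RATE_HZ = 0.5
# More than 12 h of cumulative error-free recording is required (from the text).
MIN_SAMPLES = int(12 * 3600 * SAMPLE_RATE_HZ)  # 21600 samples


def burden(values: Sequence[float], threshold: float, above: bool,
           min_samples: int = MIN_SAMPLES) -> Optional[float]:
    """Percentage of samples beyond the threshold.

    above=True  -> count values strictly greater than threshold (SpO2).
    above=False -> count values strictly less than threshold (FTOE).
    Returns None when the recording is too short to be considered.
    """
    if len(values) <= min_samples:
        return None
    if above:
        hits = sum(1 for v in values if v > threshold)
    else:
        hits = sum(1 for v in values if v < threshold)
    return 100.0 * hits / len(values)


# Tiny hypothetical series; min_samples=0 disables the 12 h rule for the demo.
spo2_demo = [91.0, 94.0, 96.0, 89.0]
ftoe_demo = [0.12, 0.18, 0.25, 0.09]
spo2_burden = burden(spo2_demo, 93.0, above=True, min_samples=0)
ftoe_burden = burden(ftoe_demo, 0.15, above=False, min_samples=0)
```

Note the opposite direction of the two comparisons: hyperoxia corresponds to high SpO2 values but to low FTOE values.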

Statistical approach

Univariate comparisons for key perinatal and clinical characteristics were made between infants with and without severe ROP using the Mann–Whitney U-test for continuous variables and a two-sided Fisher’s exact test for categorical variables. In order to account for the differences in antenatal and postnatal exposures, the association between hyperoxia burden at each threshold and severe ROP was calculated using binary logistic regression, adjusting for important covariates. Covariate selection was undertaken by combining factors known to be associated with the development of ROP (gestational age, low birth weight and male sex25) with other factors that would be plausible modifiers of ROP risk (race, antenatal steroid and magnesium sulfate administration, method of delivery, inotrope exposure, bronchopulmonary dysplasia and mean FiO2 during the study period). Results were considered statistically significant if P<0.05 for any comparison. Given the novel nature of this metric, a priori calculation of the sample size was not performed.

Correlation between predictors was assessed using the variance inflation factor, a measure of the degree of multicollinearity, where variance inflation factor of >5 is indicative of highly correlated predictors. The CRIB-II score was calculated to provide a broad comparison of infants included in the study, but was not used in the regression model because of concerns of collinearity and the inability to assess the effect of the individual components of the score. Statistical analysis including descriptive statistics and regression modeling was conducted using R version 3.2.3 (R Project for Statistical Computing, Vienna, Austria).
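For readers unfamiliar with the variance inflation factor, it is defined for predictor j as 1/(1−Rj²), where Rj² is the coefficient of determination obtained by regressing predictor j on the remaining predictors. The Python sketch below illustrates only the simplest case of two predictors, where Rj² reduces to the squared Pearson correlation; the function names and the data are hypothetical, and the study's models, which were fitted in R, included more covariates than this.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x: Sequence[float], y: Sequence[float]) -> float:
    """VIF = 1 / (1 - R^2); with exactly two predictors, R^2 = r^2."""
    r2 = pearson_r(x, y) ** 2
    if r2 >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r2)


# Hypothetical predictors: uncorrelated (VIF of 1) and strongly correlated.
uncorrelated = vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1])
correlated = vif_two_predictors([1, 2, 3, 4], [1, 2, 3, 5])
```

A value of 1 indicates no shared variance between the predictors, whereas the strongly correlated pair in this example exceeds the threshold of 5 used in the text.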

Results

Sample characteristics

A total of 113 infants were initially recruited for the study. Of these, 17 infants (15%) were excluded from analysis because of death before ophthalmologic examination, 25 (22%) because of short or corrupted recordings (<12 cumulative hours) and 8 (7%) owing to the development of grade III/IV intraventricular hemorrhage. For the remaining 63 infants in the analysis, the mean estimated gestational age was 25.8±1.5 weeks, the mean birth weight was 898.5.2±207 g and 39/63 (62%) were male. Of the 63 infants, 13 (20%) underwent laser therapy for treatment of severe ROP. Three infants were treated with bevacizumab before laser therapy. There were no infants who received bevacizumab without also undergoing laser therapy.

Data quality

The median postnatal age at the start of recording was 16 h (range: 4 to 23) and the mean cumulative recording length was 36±18 h. Recording time was spread roughly equally across the 4 days of recording with valid data available for 75%, 95%, 84% and 57% of participants on postnatal days 1 to 4, respectively. Preprocessing resulted in a median rejection of 0.4% of collected data epochs (range 0.2 to 2.4%). Approximately 1% of rejected data epochs occurred because of motion artifact, 61% because of invalid cNIRS data and 38% owing to invalid SpO2 data. No infants were excluded on the basis of error correction. This process yielded a total of 8.9 million data points for use in hyperoxia burden calculations.

Skin safety data

Three infants (5%) developed erythema at the location of NIRS probe placement, necessitating a ‘rest period.’ In all cases, the erythema resolved without further intervention after 24 h and monitoring was resumed.

Univariate comparison between those with and without severe ROP

Infants who developed severe ROP were less mature (25.0 vs 26.0 weeks of gestation), had lower birth weight (656.5 vs 961.5 g), had greater inotrope exposure (54% vs 10%) and were more likely to be diagnosed with bronchopulmonary dysplasia (100% vs 64%). There were no differences in antenatal steroid exposure, average %FiO2 and average %SctO2 during the first 96 h. There was no difference between the two groups for the average percentage of SpO2 values above any of the three thresholds. In addition, there was no difference between the two groups in the average percentage of FTOE values below any of the three thresholds. A comparison of infants with and without severe ROP can be found in Table 1.

Table 1 Univariate comparison between those with and without severe ROP

Hyperoxia burden regression model

After adjusting for important covariates, there was no significant association between the development of severe ROP and hyperoxia burden at any threshold when measured using pulse oximetry. In contrast, hyperoxia burden measured using FTOE was not significant at the 20% threshold, but was a statistically significant independent predictor of severe ROP at the 15% threshold (P=0.04) and the 10% threshold (P=0.03). Gestational age at birth, birth weight centile, inotrope exposure and delivery method were also significant independent predictors of severe ROP in the models. Complete output of the regression models can be found in Tables 2 and 3.

Table 2 Severe ROP regression models: FTOE
Table 3 Severe ROP regression models: SpO2

Discussion

In this direct comparison of hyperoxia burden in the first 96 h of life, calculated using 8.9 million measurements of oxygen saturation, increasing periods of hyperoxia, measured as a FTOE of <15%, were predictive of ROP requiring laser therapy, whereas hyperoxia measured using pulse oximetry was not. Indeed, hyperoxia burden calculated using pulse oximetry was not able to predict severe ROP at any threshold, including a threshold that would universally be considered too high (>95%). Infants who developed severe ROP spent 20% more time with FTOE values <15%, which, given a mean recording length of 36 h, equates to an additional >40 min of hyperoxia exposure.

Although retinal arterial oxygenation has been directly measured in adults using a specialized fundal camera,26, 27 this technique has never been evaluated in preterm infants, is invasive and uncomfortable and is impractical for longitudinal monitoring. Our data suggest that cNIRS is a feasible surrogate measure and is more strongly associated with severe ROP than pulse oximetry. The strength of this approach lies in the nature of the FTOE measurement, namely that it represents both arterial and venous saturations in the region below the sensor. As this measurement provides the ability to approximate the relative difference between oxygen delivery and consumption, changes in the FTOE are more likely to be the result of changes in oxygen delivery rather than consumption and represent an overabundance of oxygen during this vulnerable period. This stands in contrast to pulse oximetry measurements that are devoid of the oxygen consumption context.

As the results of this study demonstrate, there is not a 1:1 relationship between the pulse oximetry and NIRS data, suggesting the likelihood that the same value of SpO2 may yield very different cNIRS values in different patients, a consequence of different oxygen delivery/consumption ratios. Further reinforcing this point is that both groups of infants were exposed to similar fractions of inspired oxygen and had similar mean cNIRS values, another indication that excessive oxygenation can be best detected as an imbalance of oxygen delivery and consumption. The use of the FTOE measurement overcomes these problems and provides a means to standardize hyperoxia between subjects when absolute cNIRS and SpO2 may vary significantly.

Generalizability and limitations

Several factors should be taken into consideration when applying the results of this study to other centers. First, there is known variation in measured values between NIRS monitors from different manufacturers and differently sized sensors, likely the result of different proprietary internal algorithms.28, 29 Given that prior studies have shown a high degree of interdevice correlation (although not necessarily the same absolute values), we do not anticipate that this would alter the general finding of this study, namely that the hyperoxia burden calculated using FTOE measurements is associated with severe ROP. However, threshold determination should be repeated using equipment from other manufacturers.

cNIRS and pulse oximetry data were not made available to the ophthalmologist so as not to bias treatment decisions. In addition, intravitreal injections of bevacizumab have been adopted by some pediatric ophthalmologists (including those at our center) with the hope that the anti-vascular endothelial growth factor properties will reduce the rate of poorly controlled vessel growth and reduce the need for laser therapy.30 In this study, 25% of infants received bevacizumab therapy, but all of them went on to require laser treatment. In this small cohort, the use of bevacizumab does not appear to be a confounding factor.

The general practice of our institution for infants of the gestational age and birth weight in this study is intubation, prophylactic surfactant administration and extubation from low ventilator support.31 Differences in oxygen exposure, both concentration and pressure, between this approach and immediate nasal continuous positive airway pressure with rescue surfactant (as suggested by the SUPPORT trial32) or prophylactic surfactant administration followed by immediate extubation (that is, the InSurE approach33) have not been evaluated and should be taken into consideration when comparison is made with other institutions.

Finally, although the incidence of severe ROP in this cohort (20%) is somewhat higher than might be expected, the reported incidence of severe ROP in the preterm population is quite variable, ranging between 5 and 43%, and appears to be increasing over time.34, 35 Given that gestational age at birth is one of the strongest predictors of ROP risk, it is not surprising that a cohort with a mean estimated gestational age of 25.8±1.5 weeks would have a ROP incidence that tends toward the higher end of the estimated incidence.

A surprising finding in this study was how often the SpO2 exceeded the institutional upper limit of 93%: more than one-quarter of the recorded values were above this threshold. Alarm fatigue,36 the concept that a high frequency of false alarms contributes to a delayed response to true alarms, is described in an emerging literature and may be an underlying factor contributing to this result. Developing alternative strategies to maintain adherence to alarm limits, such as a daily review of alarm limit violations by the medical team or automated oxygen titration systems, should be a priority.

Future directions

An important modifying factor not addressed by this study is oxygen exposure during resuscitation in the delivery room. The exposure is not routinely monitored or recorded, yet there is clear evidence that even brief exposure to high oxygen concentrations can result in long-lasting oxidative stress.37 Another potential modifying factor is the method of quantification for the FiO2. Although NIRS and SpO2 data were captured continuously at a relatively high resolution (0.5 Hz), the FiO2 data were obtained from the electronic medical record, where it was recorded intermittently, generally once per hour. Although these data provide a high-level overview of oxygen exposure, there is certainly considerable minute-to-minute variability in the actual delivered FiO2 not captured by the current approach. An automated method for capturing this information, both in the delivery room and the neonatal intensive care unit, at a similar resolution to the cNIRS/SpO2 data should be considered in future studies and will allow for more accurate quantification of the complete system: oxygen delivery to the lungs, systemic absorption and the balance of delivery/consumption at the end-organ.

The increased risk of mortality associated with lower pulse oximetry saturations noted in the SUPPORT trial8 was not examined in this study that was focused only on the effects of excessive oxygenation. Indeed, as most preterm infants who die do so in the first week of life, they were specifically excluded from the analysis, given they had not yet undergone ophthalmologic examination to identify the presence or extent of ROP disease. Future studies should examine a potential role for cNIRS measures of hypoxia and the risk of mortality, again taking advantage of the ability of cNIRS to measure the balance of oxygen delivery and consumption. It is possible that a similar pattern of individual variation may emerge and allow for quantification of differing mortality risk in infants with similar systemic saturations.

An important caveat to this study is that it encompasses only the first 4 days of life. Given the natural history of ROP, namely the initial hyperoxia after birth suppressing the production of vascular endothelial growth factor and delaying retinal vascular maturation, followed by later retinal tissue hypoxia that promotes the unregulated production of vascular endothelial growth factor and stimulates vascular overgrowth, there are likely several key developmental epochs requiring targeted care to reduce the risk of ROP.38 The first 96 h are certainly part of the initial stage of development where maintenance of lower oxygen saturations in the retina is important; however, the length of this particular epoch is not currently known. Future projects should continue monitoring for much longer periods of time, potentially over the first 4 to 6 weeks of life in order to identify these inflection points.

In conclusion, although these results represent a preliminary stage of investigation, they open the door to a new strategy for monitoring tissue oxygen delivery in this vulnerable population. Longitudinal evaluation of hyperoxia burden using cNIRS over the first weeks of hospitalization in a larger population will allow for better delineation of key epochs in retinal development. This information can then be used as a part of an empirically based, developmentally linked oxygen saturation targeting plan with the aim of resolving the current state of confusion about ideal oxygen saturations in premature infants.