Introduction

Retinopathy of prematurity (ROP) is the most common disease leading to childhood blindness among preterm infants [1,2,3]. ROP is characterized by pathological retinal neovascularization in the maturing retina, detectable weeks after preterm birth. Preterm infants undergo repeated ophthalmological examinations to identify ROP requiring treatment, defined as ROP type 1 according to ETROP guidelines [4]. Current ROP screening is based on a simple prediction model with two dichotomized predictors, birth weight (BW) and gestational age at birth (GA). The Australia and New Zealand guideline recommends that all infants with BW under 1250 g or GA under 31 weeks should be screened for ROP [5]. Current ROP screening guidelines have high sensitivity but low specificity. In fact, multiple large studies show that very few of the preterm infants examined require treatment [6,7,8]. The Australian data suggest that only 6.3% of all screened infants had severe ROP (stage 3 and above) and fewer than half of these infants with severe ROP received treatment [5, 9]. Further, repeated ophthalmological examinations cause stress and discomfort in these fragile preterm infants even when performed by an experienced ophthalmologist [10]. These screening sessions are time consuming and uncomfortable. They also cause stress and anxiety to parents. Further, there is a dearth of experienced ophthalmologists in both high-income and low-income countries to carry out these ROP examinations.

There is some recent evidence to support the use of prediction models that include postnatal weight gain, which may reduce the number of infants requiring examinations while still accurately identifying infants who require treatment [11,12,13,14,15,16]. The scientific rationale is that low postnatal weight gain acts as a surrogate marker for a slower-than-expected rise in serum insulin-like growth factor-1 (IGF-1), resulting in insufficient activation of retinal vascular endothelial growth factor by IGF-1 and poor retinal vascular growth [17, 18].

Based on the above rationale, an online prediction model, WINROP, was developed in Sweden [19, 20]. Using the infant’s BW and GA along with weekly weight measurements, the WINROP model calculates the infant’s cumulative risk of developing severe ROP requiring treatment. It also has an alarm function that signals if an infant is at high risk of developing severe ROP requiring treatment. The infants’ data are recorded anonymously in the online system, identified only by a WINROP identification number. The WINROP model was developed as a supplement to, rather than a substitute for, established ROP screening examinations. It aims to safely minimize the number of ROP screening examinations in infants at low risk of ROP requiring treatment and to alert physicians to pay special attention to infants who are at high risk. The WINROP model has been validated with high sensitivity in several Swedish cohorts, as well as in other high-income and low-income countries [11,12,13,14,15,16]. However, the sensitivity and specificity of the WINROP model vary across these cohorts, reflecting the inherent characteristics of the preterm infants and the neonatal care rendered to them in these diverse settings. It has been suggested that the WINROP model should be validated in different populations across the world. The aim of the present study was to validate the WINROP prediction model in Australian preterm infants.

Materials and methods

This was a retrospective study carried out in a level 3 NICU of a tertiary hospital in Western Australia to evaluate the ability of an online postnatal weight-gain prediction model (WINROP) to identify severe ROP in an Australian preterm population.

Objective

The objective was to measure the sensitivity, specificity, and positive and negative predictive values of WINROP in identifying severe ROP requiring treatment in an Australian preterm population.

WINROP model

Use of the WINROP prediction model requires a GA of 23 to 32 weeks at birth, weekly weight measurements, and physiological weight gain. Infants with incomplete data regarding BW, GA, or final ROP outcome were excluded. Infants were also excluded if their weekly weight measurements were incomplete or were judged to inaccurately reflect physiological postnatal weight gain (e.g., when the weight reflects accumulated fluid, as in hydrocephalus). Preterm infants who met the above criteria at King Edward Memorial Hospital, Western Australia, over a period of 3 years (January 2014 to December 2016) were entered into the WINROP prediction model (https://winrop.com/). The infants’ BW, GA, and weekly weight measurements until the postmenstrual age of 35 weeks were retrieved from the database along with the final ROP outcomes. ROP was classified according to the Early Treatment for ROP (ETROP) trial into type 1, type 2, non-type 1/2, or no ROP, based on the zone involved and the staging as per the revised International Classification of ROP [4, 21]. All treatments were done according to the ETROP study guidelines [4]. Data were anonymously recorded in the WINROP system by an investigator unaware of the final ROP outcome. The WINROP outcome (alarm or no alarm) was recorded in a separate data file. An alarm means that the infant is at risk of developing severe ROP (requiring treatment according to ETROP guidelines). The infants’ ROP outcomes were then added to this separate data file and the calculations performed. No additional interventions on the participants were necessary. ROP screening was continued until treatment was required or complete vascularization of the retina occurred.
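The inclusion and exclusion criteria described above can be sketched as a simple record filter. This is a minimal illustration only, not the study's actual data pipeline: the `InfantRecord` class, its field names, and the `is_eligible` helper are hypothetical, and treating the 23 to 32 week GA range as inclusive at both ends is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class InfantRecord:
    """Hypothetical record layout; field names are illustrative assumptions."""
    birth_weight_g: Optional[float]
    gestational_age_weeks: Optional[float]
    weekly_weights_g: Sequence[Optional[float]]  # one entry per postnatal week
    nonphysiological_weight_gain: bool           # clinical judgment, e.g. hydrocephalus
    final_rop_outcome: Optional[str]             # e.g. "type 1", "type 2", "none"


def is_eligible(rec: InfantRecord) -> bool:
    """Apply the exclusion criteria listed in the text, in order."""
    # Incomplete BW, GA, or final ROP outcome -> excluded.
    if rec.birth_weight_g is None or rec.gestational_age_weeks is None:
        return False
    if rec.final_rop_outcome is None:
        return False
    # GA must lie in the range accepted by the model (inclusive bounds assumed).
    if not 23.0 <= rec.gestational_age_weeks <= 32.0:
        return False
    # Missing or incomplete weekly weights -> excluded.
    if not rec.weekly_weights_g or any(w is None for w in rec.weekly_weights_g):
        return False
    # Weights judged not to reflect physiological growth -> excluded.
    if rec.nonphysiological_weight_gain:
        return False
    return True
```

Keeping each criterion as a separate early return makes it straightforward to count exclusions by reason, as a study flow chart would require.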

Statistical analysis

All analyses were performed in SPSS version 25 (IBM, Armonk, NY, USA). Based on the actual ROP outcome, the sensitivity and specificity of the WINROP alarm in predicting severe ROP were calculated. The prevalence of severe ROP in the study cohort was further used to calculate the negative and positive predictive values of WINROP. For all estimates, 95% confidence intervals (CIs) were calculated.
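The four measures named above follow directly from a 2×2 table of alarm status against ROP outcome. The sketch below shows one way to compute them, together with a 95% interval. It is not the SPSS procedure used in the study: the function names are hypothetical, and the Wilson score interval is an assumed choice, since the text does not state which CI method was applied, so the intervals it produces need not match the reported ones.

```python
from math import sqrt


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from 2x2 counts.

    tp: alarm and disease, fp: alarm without disease,
    fn: no alarm but disease, tn: no alarm and no disease.
    Raises ZeroDivisionError if any row or column total is zero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (assumed method, z=1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Illustrative counts only; these are not study data.
    metrics = diagnostic_metrics(tp=8, fp=30, fn=2, tn=60)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    print("sensitivity 95% CI:", wilson_interval(8, 10))
```

With few disease-positive infants the interval for sensitivity is necessarily wide, whichever interval method is chosen.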

Results

The study included data on 221 preterm infants (123 males and 98 females). The study flow chart is shown in Fig. 1. A total of 19 infants were excluded as per the study criteria; the reasons are given in the flow chart (Fig. 1). A total of 202 infants were included in the final analysis. None of these infants had nonphysiological weight gain. The median BW was 1040 g (range, 459–1915 g), and the median GA was 27.9 weeks (range, 23.4–31.9 weeks) in the included infants. A detailed BW and GA distribution of these infants with their ROP outcomes is given in Table 1. All included infants completed their final ROP examination. No ROP was detected in 129 infants (63.9%), less severe ROP was diagnosed in 64 infants (31.7%), and 9 infants (4.5%) received laser treatment for severe ROP. Of the nine infants receiving treatment, seven fulfilled the treatment criteria, i.e., developed ROP type 1; the other two developed ROP type 2 and were treated because examination findings were suggestive of pre-plus disease. The infants’ median postmenstrual age at first ROP treatment was 36.4 weeks (range, 33.3–44.6 weeks). None of the preterm infants developed aggressive posterior ROP.

Fig. 1: Study flow chart showing the number of eligible infants, the number of infants excluded, and their reasons for exclusion.

Table 1 ROP outcome in relation to birth weight and gestational age distribution.

A ROP alarm was signaled in 86 (42.6%) of all infants and in six of the seven infants developing type 1 ROP (85.7%). In all infants receiving an alarm, the median postmenstrual age at alarm was 30.5 weeks (range, 27–35 weeks) and the median time to alarm was 2.5 weeks from birth (range, 0–11 weeks). Fifty-nine (68.6%) of the infants with an alarm received it within the first 3 weeks of life (Fig. 2). In infants developing ROP type 1 and receiving treatment, the median postmenstrual age at alarm was 27.0 weeks (range, 27–32 weeks) and the median time to alarm was 3.5 weeks from birth (range, 2–9 weeks). The median time from alarm to treatment for type 1 ROP was 7.8 weeks (range, 1–11 weeks). The sensitivity and specificity of WINROP for type 1 ROP were 85.7% (95% CI, 42.0–99.2%) and 59.0% (95% CI, 51.7–65.9%), respectively. The positive predictive value was 6.98% (95% CI, 2.88–15.1%) and the negative predictive value (NPV) was 99.1% (95% CI, 94.6–99.9%) (Table 2).
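As a check, the reported point estimates can be reproduced from the counts given in the text. The 2×2 cell counts below are not stated explicitly in the paper; they are inferred from 202 included infants, 86 alarms, seven type 1 cases, and six type 1 cases with an alarm, under the assumption that the two treated type 2 infants are counted as non-type 1.

```latex
\begin{align*}
\text{TP} &= 6, \qquad \text{FN} = 7 - 6 = 1, \\
\text{FP} &= 86 - 6 = 80, \qquad \text{TN} = 202 - 86 - 1 = 115, \\[4pt]
\text{Sensitivity} &= \frac{\text{TP}}{\text{TP} + \text{FN}} = \frac{6}{7} \approx 85.7\%, \\
\text{Specificity} &= \frac{\text{TN}}{\text{TN} + \text{FP}} = \frac{115}{195} \approx 59.0\%, \\
\text{PPV} &= \frac{\text{TP}}{\text{TP} + \text{FP}} = \frac{6}{86} \approx 6.98\%, \\
\text{NPV} &= \frac{\text{TN}}{\text{TN} + \text{FN}} = \frac{115}{116} \approx 99.1\%.
\end{align*}
```

The very low PPV alongside the very high NPV follows from the low prevalence of type 1 ROP in the cohort (7 of 202 infants, about 3.5%).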

Fig. 2: Time from birth to a WINROP alarm indicating high risk of severe retinopathy. The majority of infants who had an alarm received it within the first 3 weeks after birth.

Table 2 WINROP sensitivity, specificity, positive and negative predictive values in identifying type 1 ROP.

By using the WINROP algorithm, there could have been a 17.7% reduction in the number of direct ophthalmic examinations, and around 30% of the total examinations could have been safely delayed in our cohort. This approach did not miss any type 1 or type 2 ROP.

WINROP did not signal an alarm in one infant diagnosed with and treated for type 1 ROP. This infant developed intraventricular hemorrhage (IVH) grade 3–4 and temporary ventricular dilatation, which resolved after resolution of the clot. WINROP also did not signal an alarm in two infants with ROP type 2 (ROP stage 3 in zone II without plus disease). These two infants did not have any nonphysiological weight gain.

Discussion

Earlier studies have shown that prolonged early IGF-1 deficits and slow postnatal weight gain are associated with a higher risk of severe ROP. Serum IGF-1 levels correlate with fetal and postnatal growth, so postnatal weight gain is a good surrogate marker for serum IGF-1 [17,18,19]. Clinical prediction models such as WINROP, using postnatal weight gain, have been used to identify preterm infants with risk of severe ROP requiring treatment in different setups around the world [11,12,13,14,15,16].

To our knowledge, this is the first study from Australia using a weight gain-based online model for prediction of severe ROP. In the present study, the sensitivity of the WINROP alarm for type 1 ROP was 85.7% which is comparable with the previously mentioned literature.

In this retrospective study, the WINROP algorithm did not identify an infant with suspected nonphysiological weight gain due to temporary ventricular dilatation. This emphasizes the importance of clinical judgment when prospectively recording the infants’ weekly weight measurements in WINROP. WINROP demonstrated a very high sensitivity for detecting severe ROP in some high-income countries: 100% in a Swedish cohort (353 infants) and in an American cohort (318 infants) [12, 22]. However, when the WINROP model was studied in some developing countries, the sensitivity ranged from 55% in a Mexican cohort (352 infants) to 91% in a Brazilian cohort (366 infants) [13, 23]. In the present study, the specificity was 59%. The highest specificity, 81.7%, was noted in the American cohort [12]. Because of the low specificity and high false-positive rate, infants with a positive alarm still need ROP screening as usual. The differences in these values could be partly explained by the differing characteristics of the cohorts studied in different parts of the world. They also partly reflect differences in perinatal and postnatal care in different parts of the world. An alarm was triggered at birth in six infants. Most alarms (68.6%) occurred in the first 3 weeks after birth, and the median time from alarm to treatment was around 8 weeks. This was similar to a study from India that enrolled 70 preterm infants [24].

By using the WINROP algorithm, there could have been a 17.7% reduction in the number of direct ophthalmic examinations, and around 30% of the total examinations could have been safely delayed in our cohort, without missing a single case of type 2 ROP. This is similar to previous studies from the developed world [12, 22].

The present study has a few limitations. First, for any prediction model to be considered robust for screening, the CIs should be narrow. However, the CIs for both sensitivity and specificity in the present study were not narrow enough for the model to be considered for routine use. Second, the overall specificity of the WINROP alarm was low because of a high false-positive rate. The positive predictive value was very low at 6.98%. Hence, infants with a positive alarm would need repeated ROP screening as usual. Third, this was a single-center retrospective study.

Our study has a few merits, such as including infants across all eligible GAs in a tertiary care setting. The WINROP prediction model in the present study had a very high NPV and thus could potentially be helpful in reducing the number of infants needing repeated ROP screening.

In light of the present study findings, the authors suggest that the WINROP model could be used alongside the standard ROP screening criteria, rather than replacing them. This could potentially help in changing the examination frequency or timing based on predicted risk. In future, multicenter studies with larger sample sizes in different tertiary care neonatal units in Australia are warranted to validate the above findings. Further, the results could be improved by modifying the WINROP model according to the characteristics of different populations.

In this study the WINROP model had a moderately high sensitivity of 85.7% and a very high NPV of 99%. Hence, it could potentially be used along with the current ROP screening criteria to reduce the number of ROP examinations needed in high-risk infants. However, more multicentric studies are needed before modifying existing guidelines.

Summary

What was known before

  • Current ROP screening guidelines (based on birth weight and gestational age) have high sensitivity, but low specificity. This leads to screening a large number of infants with very few needing treatment (Australian data: only 6.3% of total screened infants had severe ROP and <50% of these infants received treatment).

  • Repeated ophthalmological examinations can be stressful to fragile preterm infants and their parents.

What this study adds

  • The WINROP model had a moderately high sensitivity of 85.7% and a very high negative predictive value of 99% in an Australian preterm infant cohort.

  • It could potentially be used along with the current ROP screening criteria to reduce the number of ROP examinations needed in high-risk infants.