Introduction

Although therapeutic hypothermia reduces the incidence of adverse outcomes in moderate to severe neonatal encephalopathy following perinatal asphyxia, the rate of mortality and severe disabilities is still high.1 In addition to motor and cognitive disabilities, feeding and control of spontaneous breathing may also be affected among survivors of neonatal encephalopathy.1,2 The need for special medical care such as gastrostomy or tracheostomy owing to difficulties in feeding and spontaneous breathing significantly affects the quality of life of children with severe disabilities.3,4 Thus, withdrawal of life-sustaining treatment is considered in infants with neonatal encephalopathy expected to have severe neurological sequelae with special health care needs.5 Therefore, this decision must be made based on a reliable prediction of poor outcomes.6,7,8

However, prognostication in neonatal encephalopathy is complex and often relies on a combination of factors. Death and motor and cognitive impairments are predicted based on several factors including clinical course, neurological examination,9,10,11 brain magnetic resonance imaging (MRI) findings,12,13 and electroencephalogram/amplitude-integrated electroencephalogram.14 Parents often ask not only about motor and cognitive impairment but also about the ability for independent feeding and breathing. However, there are few reports on predictive measures of difficulty in feeding or spontaneous breathing in the very early period that would facilitate earlier clinical decision making. As such, the development of an early predictive tool for spontaneous breathing and feeding impairment is essential.

The Thompson score11,15 assesses the severity of neonatal encephalopathy based on nine clinical signs. Each sign is rated on a scale of 0–3, then the total score is calculated (Table 1). The Thompson score and other encephalopathy scoring systems9,10,16 correlate well with subsequent neurodevelopmental outcomes in infancy.

Table 1 Thompson score.

Therefore, this study aimed to investigate the performance of the Thompson score during the first 4 days of life for predicting long-term respiratory or feeding impairment requiring tracheostomy or gavage feeding. To achieve this goal, we evaluated a large, multicenter cohort of cooled infants with neonatal encephalopathy in Japan.

Methods

Study design and patients

This observational study reviewed infants with neonatal encephalopathy enrolled in the Baby Cooling Registry of Japan between 2012 and 2016. Briefly, the Baby Cooling Registry of Japan is an online case registry of neonatal encephalopathy patients treated with therapeutic hypothermia. It was established in 2012, and all registered level II/III neonatal intensive care units in Japan were invited to participate.3 All enrolled infants had perinatal asphyxia, defined as the fulfillment of at least one of the following criteria: (1) 10-min Apgar score ≤5; (2) continued need for resuscitation over 10 min after birth; and (3) pH <7.00 or base deficit ≥16 mmol/L.17 Mild to severe encephalopathy was defined as stage 1–3 encephalopathy as per Sarnat et al.18 All infants underwent therapeutic hypothermia for up to 72 h. The Thompson score9 of each infant was assessed by neonatologists at each cooling center four times: 0–24, 24–48, 48–72, and 72–90 h after birth. Detailed information on perinatal factors, clinical course, administration of analgesics or sedatives during cooling, and outcomes was also collected.

In the current study, infants born at ≥36 gestational weeks without congenital anomalies were considered eligible. Infants with no Thompson score registration at all four time periods or with no outcome registration were excluded from the study. We also excluded those who received muscle relaxants during cooling because of the difficulty of Thompson score assessment in these patients. A score change of ≥10 between two sequential measurements was considered erroneous (data entry error), and these scores were excluded from the analysis. For example, if the scores were 15, 15, 15, and 0 at 0–24, 24–48, 48–72, and 72–90 h, respectively, the score at 72–90 h was excluded.
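The screening rule for data entry errors can be sketched as follows. This is a minimal illustration, not the registry's actual procedure: the function name `drop_implausible_jumps` is hypothetical, and it assumes that each score is compared with the most recent retained score and that the later score of the pair is the one discarded, as in the worked example above.

```python
from typing import List, Optional


def drop_implausible_jumps(
    scores: List[Optional[int]], max_jump: int = 10
) -> List[Optional[int]]:
    """Return a copy of `scores` with implausible later values set to None.

    `scores` holds the Thompson scores for the four assessment windows
    (0-24, 24-48, 48-72, 72-90 h); None marks a missing assessment.

    Assumption (not stated in the source): a score is compared with the
    most recent retained score, and the later score is the one excluded.
    """
    cleaned: List[Optional[int]] = []
    last_kept: Optional[int] = None
    for score in scores:
        if score is None:
            cleaned.append(None)  # missing assessment stays missing
            continue
        if last_kept is not None and abs(score - last_kept) >= max_jump:
            cleaned.append(None)  # change of >= max_jump treated as entry error
            continue
        cleaned.append(score)
        last_kept = score
    return cleaned


# The example from the text: only the 72-90 h score is excluded.
print(drop_implausible_jumps([15, 15, 15, 0]))  # [15, 15, 15, None]
```

Other readings of the rule are possible (for instance, discarding both scores of an implausible pair), so the sketch should be adapted to whichever convention a given registry uses.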

Outcomes

Outcomes were assessed at hospital discharge. Adverse outcomes were defined as (1) death before discharge or survival with long-term respiratory impairment (requiring tracheostomy or endotracheal intubation for ≥1 month) and (2) death before discharge or survival with long-term respiratory or feeding impairment (requiring gavage feeding; gastrostomy or tube feeding for ≥1 month) at discharge or referral to other hospitals. The indications for tracheostomy and gavage feeding were determined by the physician. We did not independently analyze the three outcomes (i.e., death, respiratory impairment, and feeding impairment). Instead, we analyzed two composite outcomes: “death or respiratory impairment” and “death, respiratory or feeding impairment,” because not only the severity of encephalopathy but also the decision making regarding life-sustaining treatment would affect the outcomes of severely injured infants. Thus, analyzing “death” and “survival with special care needs” as different outcomes would not have allowed us to meet our objective of investigating the predictive value of the Thompson encephalopathy score.

Statistical analysis

The baseline patient characteristics were presented using descriptive analysis. The predictive performance of the Thompson score in each period was evaluated using a receiver operating characteristic curve, and the area under the curve (AUC) was calculated. In addition, the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each outcome were calculated. The optimal cutoff value was obtained by maximizing the F1-score, which was defined as the harmonic mean of sensitivity and PPV. This is because using other indices based on accuracy, sensitivity, or specificity would be inappropriate when there is an imbalance in classification (i.e., the prevalence is very high or low).
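The cutoff selection described above can be illustrated with a short sketch. It is not the authors' SPSS/R code: the function names and the toy data are hypothetical, and it assumes that a score at or above the cutoff counts as a positive prediction and that ties in F1 are resolved in favor of the lowest cutoff.

```python
from typing import Dict, Sequence, Tuple


def confusion_metrics(
    scores: Sequence[int], outcomes: Sequence[int], cutoff: int
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and F1 for 'score >= cutoff'."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if s >= cutoff and y == 1)
    fp = sum(1 for s, y in pairs if s >= cutoff and y == 0)
    fn = sum(1 for s, y in pairs if s < cutoff and y == 1)
    tn = sum(1 for s, y in pairs if s < cutoff and y == 0)

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0  # avoid division by zero

    sens = ratio(tp, tp + fn)
    ppv = ratio(tp, tp + fp)
    return {
        "sensitivity": sens,
        "specificity": ratio(tn, tn + fp),
        "ppv": ppv,
        "npv": ratio(tn, tn + fn),
        # F1 is the harmonic mean of sensitivity and PPV.
        "f1": ratio(2 * sens * ppv, sens + ppv),
    }


def best_f1_cutoff(
    scores: Sequence[int], outcomes: Sequence[int]
) -> Tuple[int, Dict[str, float]]:
    """Try every observed score as a cutoff and keep the one maximizing F1."""
    candidates = sorted(set(scores))  # ascending, so ties pick the lowest
    best = max(
        candidates, key=lambda c: confusion_metrics(scores, outcomes, c)["f1"]
    )
    return best, confusion_metrics(scores, outcomes, best)


# Hypothetical toy data (not registry data): 1 = adverse outcome.
toy_scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
toy_outcomes = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
cutoff, metrics = best_f1_cutoff(toy_scores, toy_outcomes)
print(cutoff)  # 14
```

Because F1 ignores true negatives, it is not inflated by the large number of infants without adverse outcomes, which is the stated reason for preferring it here over accuracy-based indices.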

Internal validation was performed by bootstrapping, and 150 simulations were performed to obtain a bootstrapped AUC. In addition, to evaluate the impact of the use of analgesics/sedatives on the predictive performance of the Thompson score at 72–90 h, we compared the AUCs between two models, with and without analgesics/sedatives as a covariate, using a bootstrap test. All statistical analyses were performed using IBM SPSS Statistics for Windows, version 25.0 (Armonk, NY) and R version 4.0.3 (The R Foundation for Statistical Computing, Vienna, Austria). A p value <0.05 was considered statistically significant.
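As a rough illustration of a bootstrapped AUC, the sketch below resamples infants with replacement and averages the AUC over 150 replicates. It is a simplified stand-in under stated assumptions, not the authors' R procedure, which may have used a different bootstrap scheme (for example, optimism correction). The function names, the seed, and the toy data are hypothetical; the AUC is computed through its rank (Mann-Whitney) interpretation.

```python
import random
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as the probability that a case outscores a non-case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc(
    scores: Sequence[float],
    outcomes: Sequence[int],
    n_boot: int = 150,
    seed: int = 0,
) -> Tuple[float, List[float]]:
    """Mean AUC over `n_boot` resamples drawn with replacement."""
    if len(set(outcomes)) < 2:
        raise ValueError("outcomes must contain both classes")
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(scores)
    replicates: List[float] = []
    while len(replicates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; draw again
        replicates.append(auc([scores[i] for i in idx], ys))
    return sum(replicates) / len(replicates), replicates


# Hypothetical toy data (not registry data).
mean_auc, reps = bootstrap_auc([1, 3, 5, 7, 9, 11, 13, 15], [0, 0, 0, 0, 1, 0, 1, 1])
print(len(reps))  # 150
```

A bootstrapped AUC close to the apparent AUC, as reported in the Results, suggests that the estimate is not driven by a small number of influential infants.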

Results

Patient characteristics

Among the 727 infants enrolled in the registry, we excluded 95 infants, including those who had no Thompson score in all four periods (n = 85), no registered outcomes (n = 5), and who received muscle relaxants (n = 5). Finally, 632 infants were evaluated, of whom 629 had a recorded Thompson score at 0–24 h, 605 at 24–48 h, 594 at 48–72 h, and 583 at 72–90 h (Fig. 1). The baseline characteristics of the included and excluded patients are shown in Table 2. The Sarnat grade was higher in the study patients (n = 632) than in the excluded patients (n = 95) (p = 0.03). In total, 21 (3.3%) infants died before discharge, 59 (9.3%) survived with respiratory impairments (requiring tracheostomy or endotracheal intubation for ≥1 month), and 113 (17.9%) survived with feeding impairments (requiring gavage feeding for ≥1 month) at discharge or were referred to other hospitals. The remaining 498 patients (78.8%) were discharged without special medical care needs.

Fig. 1: Patient inclusion flowchart.

All infants who required a tracheostomy or long-term endotracheal intubation also required gavage feeding, and thus, infants with respiratory impairment had both respiratory and feeding impairments.

Table 2 Patient characteristics.

The median length of hospital stay was 21.0 (range, 16.0–29.0), 207.0 (range, 115.0–318.5), and 105.5 (range, 57.0–231.0) days for those who survived without special care needs, survived with respiratory impairments, and survived with feeding impairments, respectively.

Predictive value for death or long-term respiratory impairment

The prognostic performance of the Thompson score stratified by the examination time is shown in Table 3. The Thompson scores in the 0–24, 24–48, 48–72, and 72–90 h time periods had AUCs of 0.918 (95% CI: 0.893–0.943), 0.925 (95% CI: 0.900–0.949), 0.932 (95% CI: 0.906–0.958), and 0.947 (95% CI: 0.926–0.968) for predicting death or respiratory impairment, respectively. All models maintained high accuracy upon internal validation; the bootstrapped AUCs were 0.919 at 0–24 h, 0.924 at 24–48 h, 0.932 at 48–72 h, and 0.946 at 72–90 h. The 72–90 h Thompson score had the highest AUC; at a cutoff of ≥15, it had a sensitivity of 0.85, specificity of 0.92, PPV of 0.57, and NPV of 0.98 for death or long-term respiratory impairment.

Table 3 Prognostic utility of the Thompson score for adverse outcomes stratified by assessment period.

Predictive value for death, respiratory or feeding impairment

The 0–24, 24–48, 48–72, and 72–90 h Thompson score showed AUCs of 0.880 (95% CI: 0.848–0.912), 0.869 (95% CI: 0.835–0.902), 0.880 (95% CI: 0.848–0.912), and 0.913 (95% CI: 0.885–0.941) for predicting death, respiratory or feeding impairment, respectively (Table 3). Internal validation using bootstrap resampling for the models also indicated good performance as evidenced by the similarity of AUCs between the original and bootstrapped samples (0.879 at 0–24 h, 0.869 at 24–48 h, 0.880 at 48–72 h, and 0.912 at 72–90 h). The 72–90 h Thompson score showed the highest AUC, and a cutoff score of ≥14 showed that it had a sensitivity of 0.71, specificity of 0.92, PPV of 0.68, and NPV of 0.93 for death, respiratory or feeding impairment.

Distribution of outcomes according to the 72–90 h Thompson score

The Thompson score at 72–90 h showed the highest predictive performance. Therefore, the proportions of death, survival with respiratory impairment, survival with respiratory or feeding impairment, and survival without special care needs were analyzed according to the Thompson score at 72–90 h (Fig. 2). In general, the incidence of death, respiratory impairment, or feeding impairment increased with increasing Thompson score at 72–90 h. A Thompson score of <9 was associated with a 0% incidence of death or respiratory impairment. Among the 98 patients with a Thompson score of ≥15 at 72–90 h, 43% (42/98) survived without requiring tracheostomy or intubation, 9% (9/98) died, 48% (47/98) survived after undergoing tracheostomy or long-term intubation, and 67% (66/98) required gavage feeding.
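As a consistency check, the counts for the 98 infants with a 72–90 h Thompson score of ≥15 reproduce the PPV reported for that cutoff. The worked arithmetic below uses only the numbers given in the text and assumes that these 98 infants are the same group on which the PPV for death or respiratory impairment was calculated.

```latex
\[
\mathrm{PPV}_{\geq 15}
  = \frac{\text{died} + \text{survived with respiratory impairment}}
         {\text{all infants with score} \geq 15}
  = \frac{9 + 47}{98}
  = \frac{56}{98}
  \approx 0.57
\]
```

The remaining 42 of the 98 infants (43%) are those who survived without tracheostomy or long-term intubation.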

Fig. 2: Outcome distribution by Thompson score at 72–90 h of age.

Effect of analgesics/sedatives on predictive performance of Thompson score

In total, 44 (7.3%) infants received no analgesic/sedative treatment, 423 (70.3%) received one analgesic or sedative (morphine, n = 78; fentanyl, n = 225; midazolam, n = 120), and 135 (22.4%) received two analgesics or sedatives (morphine and midazolam, n = 15; fentanyl and midazolam, n = 119; morphine and fentanyl, n = 1). There were 30 infants with no data on the administration of analgesics or sedatives.

We compared the AUCs of the 72–90 h Thompson scores between two models, with and without analgesics/sedatives as a covariate, in 559 infants who had information about sedative use at 72–90 h. Although the differences were not significant, the model combining the 72–90 h Thompson score with analgesic/sedative use had a higher AUC than the model using the Thompson score alone, both for death or respiratory impairment (0.947 [95% CI: 0.924–0.970] vs. 0.941 [95% CI: 0.918–0.965], p = 0.11) and for death, respiratory, or feeding impairment (0.914 [95% CI: 0.884–0.944] vs. 0.909 [95% CI: 0.880–0.939], p = 0.239).

Discussion

Predictive measures of difficulty in feeding or spontaneous breathing in neonatal encephalopathy patients are lacking. In this study, the Thompson score from 1 to 4 days of age was found to be useful in predicting death, respiratory impairment, or feeding impairment. Importantly, the 72–90 h Thompson score showed the highest predictive capability.

The decision to withdraw life-sustaining treatment among neonatal encephalopathy patients is made based on the probability of mortality or survival with severe disability.5,6,7,8 The hospital mortality rate in this study was low, but the proportion of surviving infants requiring tracheostomy, prolonged intubation, or gavage feeding was high. We have previously reported that the mortality rate for the most severely asphyxiated children with a 10-min Apgar score of 0 is lower in Japan than in other countries; however, the proportion of surviving infants with very severe disabilities is higher.19 This is because withdrawal of life-sustaining treatment is allowed in many countries,5,20,21 but rarely in Japan.19,22 There is no guarantee of legal protection for withdrawal of treatment in Japanese guidelines,22 and public insurance5 covers almost all neonatal care in Japan. This is a potential explanation for the low rate of treatment withdrawal in Japan.

Withdrawal of life-sustaining treatment is likely to lead to the rapid death of infants with severe encephalopathy.7 Death owing to the withdrawal of life-sustaining treatment makes it impossible to determine the quality of life of surviving infants undergoing prolonged life support. However, withdrawal of life-sustaining treatment is rare in Japan, allowing us to examine the quality of life of infants who would need prolonged life support if treatment is continued. In this study, the hospital mortality rate for cooled neonates was 3.3%, which was markedly lower than that in a multicenter contemporary cohort of cooled infants with neonatal encephalopathy in the United States (15%)2 and other previous clinical trials and registers (11–25%).1,23

Furthermore, the proportions of infants who survived with tracheostomy and with gavage feeding were 9.3 and 17.9%, respectively, which were higher than those in the United States cohort (<1 and 6%, respectively).2 Meanwhile, the event rate for overall adverse outcomes, that is, either death or survival with prolonged life support, was similar in our study and in the United States study.2 This difference in the mortality rate between Japan and other countries is primarily caused by social differences influencing the withdrawal of life-sustaining treatment, given that there is no significant difference in disease severity between our study population and that in other recent studies of neonatal encephalopathy.2,23

Several studies have previously investigated the predictive value of encephalopathy scores for estimating death or neurodevelopmental outcomes.9,11,15,16 Thompson et al. investigated the predictive value of the Thompson score for estimating death or neurodevelopmental outcome in non-cooled infants with neonatal encephalopathy.11 They reported a sensitivity of 0.53 and specificity of 0.96 at a cutoff score of ≥15 on day 4. In another study, a maximum Thompson score of ≥15 had a sensitivity of 0.60 and specificity of 0.94 for cognitive impairment in cooled infants with neonatal encephalopathy.15 Despite the difference in predicted adverse outcomes, our results showed similar cutoff values and high sensitivity and specificity. Our finding that severe encephalopathy that persists through postnatal day 4 is strongly predictive of adverse outcomes is consistent with that of a previous multicenter study that evaluated the serial modified Sarnat score.9

Withdrawal of life-sustaining interventions may be more complicated when a neonate with a severe brain injury recovers his or her respiratory drive and can be extubated if life-sustaining treatment is continued.6,7 In very severe cases, parents and clinicians must be able to integrate adequate medical information and parental values to develop a shared decision-making approach.6 However, there is little available evidence regarding the return of spontaneous respiration in severely asphyxiated infants when life-sustaining treatment is continued.7 Our results showed that when life-sustaining treatment is continued, many infants with persistent severe encephalopathy with a Thompson score of ≥15 at 72–90 h of age survived; however, by this time the decision to withdraw life-sustaining treatment may already have been made in the United States for severe encephalopathy.5 Nevertheless, 43% of such cases have the potential to regain spontaneous breathing, be extubated, and survive without tracheostomy. Meanwhile, approximately 50% of infants who survived without tracheostomy required gavage feeding. Although decision making concerning life-sustaining treatment may vary among settings and families, access to information on the resulting prognosis if life-sustaining treatment is continued remains important for both parents and clinicians worldwide.

It is essential to accurately predict post-discharge motor and cognitive outcomes to make an appropriate decision on withdrawing life-sustaining treatment. Unfortunately, the present study does not include data on the neurodevelopmental outcomes of severe encephalopathy infants with continued life-sustaining treatment. Many studies have reported that brain MRI/MR spectroscopy findings within the first week of life can accurately predict motor and cognitive functions.12,13,24 Findings of lesions in the basal ganglia, thalamus, and brain stem on brain MRI scans at 1 week of age have been reported to be useful predictors of feeding disorders in infants with neonatal encephalopathy.25,26,27 In this regard, the combination of the Thompson score, which predicts the most severe disabilities in the very acute phase, and MRI/MR spectroscopy findings could possibly provide more relevant information.

We examined the association between analgesic/sedative use and the predictive performance of the 72–90 h Thompson score because these drugs are known to affect neurological function.28 The results showed that although the difference was not significant, the addition of analgesic/sedative information slightly increased the predictive performance of the Thompson score. This indicates that the Thompson score remains a useful predictive tool in patients under sedation. The analgesic/sedative medications were not standardized but were given according to institutional guidelines. Meanwhile, we could not obtain information on anticonvulsants (e.g., phenobarbital), nor were the serum concentrations of the medications obtained.

This study has some limitations. First, missing data, such as the Thompson score in particular time periods for some patients, may have affected the quality of the research. This could not be avoided owing to the use of a large database. However, we screened for erroneous entries to mitigate data errors, and established exclusion criteria to avoid selection bias in ruling out erroneous Thompson scores that contradict the known clinical course. Second, the quality and timing of the Thompson score evaluation could not be standardized in clinical practice. Since 2012, we have been conducting training sessions for all cooling centers participating in this registry study, to ensure proper evaluation of the Thompson score. Nevertheless, the differences in Thompson score evaluation among different centers and physicians may have influenced the results.

Conclusion

The Thompson score from 1 to 4 days of life is a useful predictor of death during hospitalization and respiratory or feeding impairments requiring tracheostomy or gavage feeding at discharge among neonatal encephalopathy patients. Particularly, severe encephalopathy that persists 72–90 h after birth accurately predicts death and impaired breathing or feeding. Among patients with a Thompson score of ≥15 and who received continued life-sustaining treatment, 43% regained spontaneous breathing and survived without requiring a tracheostomy. Meanwhile, approximately 50% of extubated infants required gavage feeding. Our results could provide useful information for clinical decision making concerning infants with persistent severe encephalopathy during the first 4 days of life.