Main

Fever and neutropenia (FN) is a common complication of the treatment of childhood cancer and the leading cause of unplanned hospital admissions (Mueller et al, 2015). A severe infection is documented in fewer than half of all episodes (Hann et al, 1997). Many paediatric clinical decision rules (CDRs) are available to identify children at low risk for severe infection who may qualify for reduced-intensity treatment (Phillips et al, 2010, 2012a). However, because of a paucity of validation and implementation studies, most centres admit all children with FN for intravenous antibiotics irrespective of risk status. This approach, although safe, contributes to overtreatment, negatively affects quality of life and increases health-care expenditure (Haeusler et al, 2015b).

Before a CDR can be implemented it should undergo evaluation in a population external to the derivation data set to ensure it is safe and reproducible (McGinn et al, 2000; Phillips, 2010). Seven paediatric FN CDRs (Rackoff et al, 1996; Klaassen et al, 2000; Baorto et al, 2001; Santolaya et al, 2001; Alexander et al, 2002; Ammann et al, 2003, 2010) have undergone formal validation in centres external to the derivation studies (Madsen et al, 2002; Dommett et al, 2009; Ammann et al, 2010; Macher et al, 2010; Miedema et al, 2011). Of these, only two have been shown to be reproducible, highlighting the importance of validation before implementation (Klaassen et al, 2000; Ammann et al, 2010).

Contributing to discordant derivation and validation results are differences in study methodology and definitions and insufficient sample sizes (Phillips et al, 2012b). To overcome this, the ‘Predicting Infectious ComplicatioNs in Children with Cancer’ (PICNICC) collaboration completed an individual participant data (IPD) meta-analysis of data from 22 groups and derived a new predictive model. The PICNICC model uses the following weighted variables to predict microbiologically documented infection (MDI): malignancy, maximum temperature, clinically severely unwell, haemoglobin, white cell count and absolute monocyte count (Phillips et al, 2016). Although subjective, ‘severely unwell’ was shown to be a highly sensitive predictor of MDI and is included in the adult Multinational Supportive Care in Cancer low-risk scoring system (Klastersky et al, 2000; Phillips et al, 2016).

The aim of this study was to validate the PICNICC CDR in children with cancer presenting to the emergency department (ED) with FN at an Australian hospital. To extend our understanding of the CDR, originally developed for implementation at FN presentation, we also explored the predictive performance of the PICNICC rule at day 2. Finally, to provide a baseline understanding of the potential impact of low-risk management strategies, we investigated the costs associated with current FN management strategies.

Materials and methods

Data collection

Consecutive episodes of outpatient-onset FN in children (age <19 years) with cancer and receiving chemotherapy or haematopoietic stem cell transplant (HSCT) at the Royal Children’s Hospital (RCH), Melbourne, were retrospectively identified from electronic databases. The RCH is a tertiary paediatric hospital with a 26-bed haematology/oncology and HSCT unit, and the majority of patients are treated on Children’s Oncology Group chemotherapy protocols. Multiple, discrete FN episodes per patient were allowed. Episodes were excluded if the patient was receiving antibiotics for treatment of a documented infection. Inpatient-onset FN was also excluded as these events are difficult to consistently identify retrospectively from hospital records and because they tend to occur in very high-risk patients (including AML or HSCT recipients) who would not be considered for early discharge at our institution.

Demographic, episode and clinical outcome data were obtained from scanned electronic medical records and entered into a REDCap database (Harris et al, 2009). Data were collected by a research assistant blinded to the PICNICC CDR. In-patient costs were obtained from the hospital activity-based costing system. This captures direct and indirect medical costs that are collated into resource use groups including medical, nursing, diagnostics and pharmacy.

The six PICNICC variables were collected at two time points: presentation (0–4 h) and day 2 (D2). For maximum temperature, the highest temperature in the preceding 12 h or in the ED was used. Data for D2 assessment were taken between 0900 and 1100 h to replicate existing practice. Outcome data were collected at the end of the FN episode. The date and time at which bacteraemia episodes became known were extracted from the electronic pathology database. For all other MDI, the date and time at which the infection was documented in the medical record were used. An infectious diseases physician (GMH) reviewed microbiological results to ensure the correct diagnosis was assigned.

There were no changes to empiric antibiotic protocols during the study period; these comprised piperacillin–tazobactam (all patients) plus amikacin (if high-risk cancer protocol, inpatient-onset FN or systemic compromise) (Haeusler et al, 2013). At RCH, patients with acute myeloid leukaemia (AML) and those immediately post HSCT are routinely admitted during the neutropenic phase for FN observation and are therefore unlikely to present to the emergency department with fever. A formal low-risk FN pathway was not in clinical use during the study period, and discharge was typically considered in patients with evidence of marrow recovery, negative blood cultures at 48 h and who had been afebrile for at least 24 h.

Microbiological investigation of FN routinely includes two blood culture sets and a urine for culture (all patients) as well as nasal swab for respiratory virus PCR; chest X-ray; stool for culture, Clostridium difficile toxin assay and viral PCR; and skin or wound swab for culture and viral PCR (as clinically indicated and according to local guidelines and international recommendations (Lehrnbecher et al, 2012)).

Definitions

Fever was defined as a single tympanic temperature of ≥38 °C and neutropenia as an absolute neutrophil count of <1000/mm³. ‘Severely unwell’ was any of severe sepsis or septic shock (defined according to Goldstein et al, 2005), altered conscious state (Glasgow Coma Score <15 or only responsive to voice or pain) or documented as ‘severely unwell’ or equivalent in the patient record (Goldstein et al, 2005).

Outcomes were defined according to international consensus recommendations (Goldstein et al, 2005; Haeusler et al, 2015a). An MDI was defined as an infection that was clinically detectable and microbiologically proven. Bacteraemia was defined as a recognised pathogen (including viridans group streptococci in the setting of mucositis or neutropenia) from ≥1 blood culture or common commensals from ≥2 blood cultures drawn on separate occasions (Haeusler et al, 2015a). Bacterial MDI was any of bacteraemia, bacterial respiratory infection, urinary tract infection (UTI) or skin and soft tissue infection (SSTI).

Data analysis

Validation of the PICNICC rule consisted of two components: statistical validation and clinical utility (Altman and Royston, 2000).

Statistical validation comprised assessment of the discrimination and calibration of the new data set compared with the derivation data set (Steyerberg, 2009; Phillips et al, 2016). Calibration is assessed by comparing how accurately the predicted risk of MDI fits with the observed rate of MDI, and discrimination is the ability of predicted values to categorise the episode correctly, given different threshold values. These statistical assessments were made by comparing the area under the receiver operating characteristic (AUC-ROC) curve to assess the overall discriminatory ability and calibration slope that estimates how precisely the predicted probability of infection meets the measured values (Steyerberg, 2009).
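As an illustration of the discrimination measure described above, the following minimal sketch computes the AUC-ROC as the probability that a randomly chosen episode with MDI receives a higher predicted risk than a randomly chosen episode without MDI. The function name and the six data points are hypothetical and are not taken from the study data; the sketch does not cover the calibration slope.

```python
def auc_roc(predicted, observed):
    """AUC-ROC by pairwise comparison (ties count as half a win).

    predicted: predicted risk of MDI per episode (0 to 1)
    observed:  1 if the episode had an MDI, otherwise 0
    """
    positives = [p for p, y in zip(predicted, observed) if y == 1]
    negatives = [p for p, y in zip(predicted, observed) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one episode with and one without MDI")

    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and outcomes, for illustration only.
predicted = [0.05, 0.08, 0.12, 0.30, 0.40, 0.22]
observed = [0, 0, 1, 1, 0, 1]
auc = auc_roc(predicted, observed)
print(f"AUC-ROC = {auc:.3f}")
```

An AUC-ROC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.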

To establish baseline calibration variables, a separate, initial survey of 101 episodes was performed using the same methodology and recalibration was undertaken as described by Steyerberg (2009). First, the mean observed outcomes in the Australian data set (n=101) were compared with those in the derivation data set. These were similar (23.5 vs 21.5%), and simple recalibration using this change in intercept did not markedly affect the calibration. Second, the predictor coefficients were multiplied by the estimated overall calibration slope, and the essentially unchanged intercept was retained. Both the intercept and variable coefficients of the original PICNICC model were adjusted during the recalibration process. Details of the model, including original and adjusted variable coefficients, are available in the Supplementary Information (Phillips et al, 2016). There was no significant difference in type of cancer or temperature between the validation group (n=650) and the baseline calibration group (n=101). The validation group had a marginally higher mean total white cell count (0.56 vs 0.48, difference 0.08 CI 0.07–0.9) than the calibration group.
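The general form of logistic recalibration (an intercept adjustment plus a slope applied to the linear predictor) can be sketched as follows. This is a generic illustration under stated assumptions, not the adjusted PICNICC model: the function names, the intercept shift and the slope value are hypothetical, and the actual coefficients are those given in the Supplementary Information.

```python
import math


def logit(p):
    """Convert a probability to the log-odds (linear predictor) scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate(original_risk, intercept_shift, slope):
    """Recalibrated risk = inv_logit(intercept_shift + slope * logit(original_risk)).

    intercept_shift and slope would be estimated in a local calibration
    sample; the values used below are hypothetical.
    """
    return inv_logit(intercept_shift + slope * logit(original_risk))


# Hypothetical example: a slope below 1 pulls extreme predictions towards 50%.
for risk in (0.05, 0.10, 0.30):
    print(risk, round(recalibrate(risk, intercept_shift=0.0, slope=0.5), 3))
```

With a slope of 1 and an intercept shift of 0 the predictions are unchanged, which is the behaviour expected of a model that is already well calibrated in the new population.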

Clinical utility was assessed by calculating the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of dichotomising at 10% chance of MDI, and comparing this with the derivation data set (Steyerberg, 2009). Data were presented for non-recalibrated and recalibrated PICNICC CDR for any MDI and the recalibrated CDR for bacteraemia and bacterial MDI.
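A minimal sketch of the dichotomisation step is given below. It assumes that an episode is classed as high risk when its predicted chance of MDI is 10% or more (whether the boundary value itself counts as high or low risk is an assumption here), and the function name and data are hypothetical.

```python
def classify_and_score(predicted, observed, threshold=0.10):
    """Dichotomise predicted risk and return sensitivity, specificity, PPV, NPV.

    Assumption: predicted risk >= threshold is treated as high risk.
    """
    tp = fp = tn = fn = 0
    for risk, has_mdi in zip(predicted, observed):
        high_risk = risk >= threshold
        if high_risk and has_mdi:
            tp += 1  # high risk, MDI present
        elif high_risk and not has_mdi:
            fp += 1  # high risk, no MDI
        elif not high_risk and has_mdi:
            fn += 1  # low risk, MDI present (missed)
        else:
            tn += 1  # low risk, no MDI

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical predicted risks and outcomes, for illustration only.
predicted = [0.04, 0.08, 0.09, 0.15, 0.25, 0.50, 0.12]
observed = [0, 0, 1, 0, 1, 1, 0]
metrics = classify_and_score(predicted, observed)
print(metrics)
```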

The clinical utility of the PICNICC rule at D2 was assessed in two ways. First, as described by the Swiss Paediatric Oncology Group (SPOG), using variables collected at presentation, the sensitivity of the rule at D2 (between 0900 and 1100 h) was determined by combining the information on episodes with MDI known at that time ($\text{Proportion}_{\text{known}} = \text{MDI}_{\text{known}} / \text{MDI}_{\text{total}}$) with the results of prediction on the remaining episodes ($\text{Sensitivity}_{\text{combined}} = \text{Proportion}_{\text{known}} + \text{Sensitivity}_{\text{predicted}} \times (1 - \text{Proportion}_{\text{known}})$) (Ammann et al, 2010). Second, variables collected on D2 were used to determine the chance of MDI. Episodes that had already been shown to have an MDI or severe sepsis/septic shock, or that had been transferred to ICU before D2 assessment, were excluded from analysis, assuming that they would be classified as high risk.

Outcomes were presented according to underlying risk status. Continuous data were presented as median and interquartile range. Mann–Whitney U-test was used to estimate P-values for continuous data and Fisher’s exact test for categorical data. For between-patient utility of the score we undertook a hierarchical logistic regression analysis nesting the individual episodes within the separate patients to examine the contribution of the patient identifier to variation in accuracy of predicting MDI. The analysis was restricted to patients with multiple episodes, and modelled the predictive ability of the presentation PICNICC score with the individual as a random effect modifier, using the lme4 package in R (R Foundation for Statistical Computing, Vienna, Austria). All tests were two tailed, and a P-value of <0.05 was considered statistically significant.

Cost data were presented as mean and s.d. Differences in mean and median were compared using parametric (t-test) and the nonparametric (Mann–Whitney U) tests.

Sample size

International data indicate that MDI occurs in between 18 and 25% of episodes of FN (Hann et al, 1997; Phillips et al, 2010, 2012a). For validation, 650 FN episodes, with an estimated event rate of 18%, were required for 80% power to show AUC-ROC of the PICNICC model is 0.660. This was the AUC-ROC of the simplest version of the original PICNICC CDR that included only two variables: tumour type and temperature.

Ethics

The study was approved by the RCH Human Research Ethics Committee (35034B) and, given the retrospective nature, informed consent was not required.

Results

A total of 294 324 children presented to the RCH ED between November 2011 and June 2015, of whom 3854 (1.3%) had a diagnosis of cancer. From this cohort, there were 650 episodes of FN occurring in 327 patients (median 2 episodes, range 1–7).

Demographic data are summarised in Table 1. The most common malignancy was acute lymphoblastic leukaemia, of which 39, 72, 77, 19 and 121 FN episodes occurred in induction, consolidation, delayed intensification, interim maintenance and maintenance phases, respectively (4 unknown).

Table 1 Demographic data for paediatric FN episodes

A pre-antibiotic blood culture was performed in all episodes, with 448 (68.9%) having two or more. Urine was taken for culture in 328 (50%) episodes, including 102 episodes pre-antibiotics. Of these pre-antibiotic urine cultures, a pathogen was detected in 14 (14%). Additional microbiological investigations that were performed according to clinical symptoms included viral respiratory PCR in 185 (28%), chest X-ray in 145 (22%), stool for culture and PCR in 100 (15%) and a skin swab in 69 (11%) FN episodes.

Of the 650 FN episodes, 244 had one or more microbiologically or clinically documented infections (209 had 1; 31 had 2; and 4 had 3). The primary cause of fever was attributed to an MDI in 153 (23.5%) episodes (bacteraemia in 61; other MDI in 92). Of the 202 episodes with a single pre-antibiotic blood culture, 8 (1.2%) had a common commensal identified (4 with Micrococcus spp.; 4 with coagulase-negative staphylococci). An alternative source of infection was identified in four of these, and in the remaining four, ‘likely contaminant’ was documented.

Validation at presentation

The recalibrated PICNICC rule had an AUC-ROC of 0.638 (95% CI 0.590–0.685) and a calibration slope of 0.24. A total of 231 episodes (35.5%) were identified as low risk for MDI (Table 2). Of these, 33 had an MDI, including 9 episodes with bacteraemia.

Table 2 The PICNICC rule predicted risk status vs microbiologically documented infection diagnosis

The median time to diagnosis of any MDI was 23.2 h (IQR 18.1–47.5 h) in the low-risk group and 23.3 h (IQR 15.3–41.4 h) in the high-risk group (P=0.56). The median time to diagnosis of a bacteraemia was 20.0 h (IQR 19.7–26.2 h) in the low-risk group and 22.0 h (IQR 16.8–28.1 h) in the high-risk group (P=0.85).

Detailed comparison of the clinical utility of the PICNICC CDR in the derivation and validation study is presented in Table 3. Without recalibration, and using the original PICNICC coefficients, the CDR performed poorly, only classifying 3.2% of episodes as low risk. For bacteraemia alone, the sensitivity of the recalibrated PICNICC CDR improved from 78.4% to 85.2% and the AUC-ROC from 0.64 to 0.71. With regard to multiple episodes per patient, individual ID accounted for 10% of variation and was a highly nonsignificant predictor (P=0.999).
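The reported sensitivities of the recalibrated rule at presentation can be reproduced from the episode counts given in the text (650 episodes, 231 low risk, 153 MDI of which 33 were in the low-risk group, and 61 bacteraemia episodes of which 9 were in the low-risk group). The following sketch is an arithmetic check only; the function and variable names are illustrative.

```python
def sensitivity_from_low_risk(total_events, events_in_low_risk):
    """Sensitivity = events classified as high risk / all events."""
    return (total_events - events_in_low_risk) / total_events


episodes = 650
low_risk_episodes = 231

mdi_total, mdi_in_low_risk = 153, 33
bacteraemia_total, bacteraemia_in_low_risk = 61, 9

low_risk_pct = round(100 * low_risk_episodes / episodes, 1)
sens_mdi_pct = round(100 * sensitivity_from_low_risk(mdi_total, mdi_in_low_risk), 1)
sens_bact_pct = round(
    100 * sensitivity_from_low_risk(bacteraemia_total, bacteraemia_in_low_risk), 1
)

print(low_risk_pct)   # proportion of episodes classified as low risk
print(sens_mdi_pct)   # sensitivity for any MDI (120/153)
print(sens_bact_pct)  # sensitivity for bacteraemia (52/61)
```

These values agree with the 35.5%, 78.4% and 85.2% quoted above.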

Table 3 Comparison of the discrimination and clinical utility of the PICNICC rule in the original derivation study and the external validation study (with 95% confidence intervals)

Validation at day 2

Day 2 assessment occurred a median of 19.1 h after presentation (IQR 15.1–24.0 h). Sixty-six (43%) MDI episodes were diagnosed before D2 assessment: 11 in the low-risk group including 3 episodes with bacteraemia (Pseudomonas aeruginosa in 2; Escherichia coli in 1) and 55 in the high-risk group, including 25 episodes with bacteraemia.

Taking into consideration proportion of MDI known at reassessment, the sensitivity of the PICNICC rule applied to presentation variables was 87.7% (Ammann et al, 2010). Pathogens responsible for the bacteraemia episodes in low-risk group that were not identified before D2 included Staphylococcus epidermidis (2 episodes), Staphylococcus aureus and Granulicatella adiacens, Citrobacter braakii and viridans group streptococci (2 episodes). There were four viral respiratory tract infections, four bacterial UTI, three bacterial SSTI, three episodes of C. difficile colitis and two of HSV gingivostomatitis in the low-risk group that were also not identified before D2.
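The combined sensitivity can be checked against the counts reported in the text. The sketch below assumes that the predicted sensitivity entering the SPOG formula is the presentation sensitivity for any MDI (120/153); this assumption is not stated explicitly in the text but reproduces the reported 87.7%. Function and variable names are illustrative.

```python
def combined_sensitivity(mdi_known, mdi_total, sensitivity_predicted):
    """Sensitivity_combined = Proportion_known
    + Sensitivity_predicted * (1 - Proportion_known)."""
    proportion_known = mdi_known / mdi_total
    return proportion_known + sensitivity_predicted * (1.0 - proportion_known)


mdi_total = 153          # episodes with MDI
mdi_known_by_d2 = 66     # MDI already diagnosed before D2 assessment
mdi_in_low_risk = 33     # MDI classified as low risk at presentation

# Assumption: presentation sensitivity for any MDI is used as Sensitivity_predicted.
sensitivity_predicted = (mdi_total - mdi_in_low_risk) / mdi_total

combined = combined_sensitivity(mdi_known_by_d2, mdi_total, sensitivity_predicted)
combined_pct = round(100 * combined, 1)
print(combined_pct)
```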

Day 2 variables were available for 514 FN episodes (Table 2): full blood count was unavailable in 131 and temperature not documented in 5. There was no significant difference in tumour type, maximum temperature, white cell count, severity of illness and MDI between episodes with and without missing data on D2. After exclusion of 71 FN episodes that had been shown to have any of MDI, severe sepsis/septic shock or ICU admission by D2 assessment, 443 episodes were available for analysis. In all, 193 episodes were identified as low risk. Of these, 28 had an MDI, including 8 episodes with bacteraemia. There were 65 episodes identified as low risk at presentation and D2 (Figure 1), of which 5 had an MDI (S. epidermidis bacteraemia, S. aureus SSTI, herpes simplex virus gingivostomatitis, respiratory syncytial virus respiratory infection and C. difficile colitis). The remaining 23 misclassified MDI were identified as high risk at presentation. There were 46 episodes identified as low risk at presentation and high risk on D2, of which 9 (19.6%) had an MDI. Conversely, 128 episodes were high risk at presentation and low risk on D2, of which 23 (17.9%) had an MDI. Finally, 204 remained high risk, including 36 (17.6%) with an MDI.

Figure 1

Risk classification of FN episodes using variables collected on day 2 (risk classification using presentation variables available for comparison).

The sensitivity of the PICNICC rule applied to clinical variables collected on D2 was 61.6% (Table 3).
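The D2 sensitivity follows from the cross-classification of presentation and D2 risk status reported in the text. The sketch below tabulates those counts and recomputes the totals; the dictionary layout and variable names are illustrative.

```python
# (presentation risk, D2 risk) -> (episodes, episodes with MDI), from the text.
cross_tab = {
    ("low", "low"): (65, 5),
    ("low", "high"): (46, 9),
    ("high", "low"): (128, 23),
    ("high", "high"): (204, 36),
}

total_episodes = sum(n for n, _ in cross_tab.values())
total_mdi = sum(m for _, m in cross_tab.values())

# Episodes and MDI classified as low risk using D2 variables.
low_risk_d2 = sum(n for (_, d2), (n, _) in cross_tab.items() if d2 == "low")
mdi_low_risk_d2 = sum(m for (_, d2), (_, m) in cross_tab.items() if d2 == "low")

# Sensitivity at D2 = MDI classified as high risk on D2 / all MDI analysed.
sensitivity_d2_pct = round(100 * (total_mdi - mdi_low_risk_d2) / total_mdi, 1)

print(total_episodes, low_risk_d2, mdi_low_risk_d2, sensitivity_d2_pct)
```

The totals match the 443 episodes analysed, the 193 low-risk episodes and the 28 MDI among them, and 45 of 73 MDI classified as high risk gives the reported 61.6%.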

Clinical outcomes

Outcomes according to presentation risk status are outlined in Table 4. There were four episodes of Gram-negative bacteraemia and six with Gram-positive bacteraemia in the low-risk group. There were no serious medical complications (any of late-onset severe sepsis, admission to ICU or death within 30 days) in low-risk episodes with bacteraemia.

Table 4 Detailed outcome data according to risk status on presentation

One episode (0.4%) in the low-risk group had late-onset severe sepsis/septic shock (time to diagnosis 4.5 h) and was subsequently admitted to ICU (time to admission 8.9 h). There was one further ICU admission in the low-risk group 3.3 h after presentation. The median hospital length of stay (LOS) was significantly lower in the low-risk group compared with the high-risk group (3.5 days vs 6.9 days). No deaths occurred in the low-risk group.

Resource use

Admission costs according to resource use are presented in Table 5. The total average cost assigned to low-risk episodes was significantly less than high-risk episodes (mean difference AUD 10 758). However, the total average cost per day of in-patient services was not significantly different between the groups (Table 6). Inpatient medical and nursing/allied health were the highest resource use categories across both groups.

Table 5 Resource use by activity-based hospital cost category
Table 6 Resource use per day by activity-based inpatient cost category

Discussion

This is the first time the PICNICC CDR, developed from IPD meta-analysis of existing paediatric FN CDRs, has undergone validation in a population entirely external to the derivation studies. For prediction of any MDI, both the non-recalibrated and the recalibrated rule performed poorly, as reflected by the low AUC-ROC and calibration slope. This is in keeping with previously published validation studies where the sensitivity and specificity are frequently lower than in the derivation study (Macher et al, 2010; Miedema et al, 2011; Delebarre et al, 2014). However, for bacteraemia, arguably the most serious MDI, the AUC-ROC showed moderate discrimination and sensitivity was similar to that of the derivation study with an even greater specificity. In fact, the performance of the recalibrated PICNICC rule in our population in the prediction of bacteraemia exceeded the performance of many of the published CDRs that have undergone external validation (Macher et al, 2010; Phillips et al, 2012a). Whereas the non-recalibrated PICNICC CDR had the greatest sensitivity, the specificity was compromised with very few episodes being identified as low risk.

Reassuringly, although 33 episodes identified as low risk at presentation subsequently had an MDI, including 9 with a bacteraemia, there were very few serious medical complications in this group. There were no infection relapses or deaths and only two episodes required ICU admission, both of which occurred within the first 9 h of presentation. Although the impact of hospital admission and IV antibiotics in the prevention of these serious outcomes remains unknown, this low rate is in keeping with studies of oral and outpatient antibiotic management of FN (Morgan et al, 2016). At D2, 11 (33.3%) of the MDI in the low-risk group were known including the three serious Gram-negative bacteraemia episodes. Of the remaining unknown MDI, at least six did not require broad-spectrum IV antibiotics. This suggests that some missed MDI may be of less clinical significance and therefore less likely to result in readmission as compared with bacteraemia.

When the rule was applied at D2 using variables collected at presentation (therefore unaffected by hospital admission and antibiotics), and taking into consideration the proportion of episodes with an MDI known at reassessment, the sensitivity of the PICNICC rule improved to 88%, lying within the 95% confidence interval of the derivation study (Phillips et al, 2016). This improvement is in keeping with results of the SPOG rule predicting adverse outcome in children with FN that also performed better at D2 (Ammann et al, 2010). Conversely, the sensitivity of the rule applied to clinical variables collected at D2 dropped to just 61.6%, despite the exclusion of episodes where MDI was already known. The impact of admission, observation and treatment with broad-spectrum antibiotics on the patient’s clinical status may, in part, explain this marked reduction in sensitivity. These data suggest that the methodology used by SPOG yields more sensitive, and hence more clinically meaningful, results as compared with using variables collected after a period of inpatient treatment.

Variability in results obtained in derivation and validation of clinical decision rules is well recognised and underpins the rationale for vigorous testing in a population external to the original derivation set (Phillips, 2010). Factors contributing to this variability include the unavoidable differences in time and geography as well as differences in the type of chemotherapy treatment protocols used, all of which are relevant to our results. The PICNICC CDR was developed from a global collaboration of 15 countries, none of which included Australia, and from studies that were published in the decades before this validation study (Phillips et al, 2016). The impact of geography is most evident in the results of the PINDA rule where validation data support its use in Chile but not Europe and highlights the importance of local validation before implementation irrespective of results obtained at other centres (Santolaya et al, 2001; Phillips et al, 2012a). Furthermore, although the influence of region-specific chemotherapy protocols, which adapt and change over time, has not been formally explored, it is likely to also contribute to discordant derivation and validation results. Rule recalibration, a unique feature of our study, is one way the impact of these inevitable differences can be reduced.

Little is known about the costs of treating FN in children in Australia. International data support the potential cost savings of oral or intravenous outpatient therapy for low-risk FN, with reductions in hospital bed-days accounting for the highest savings (Wiernikowski et al, 1991; Santolaya et al, 2004; Teuffel et al, 2011). Our data similarly show that LOS is the main contributor to overall cost. Accordingly, total cost is higher in the high-risk group because of longer LOS, with no difference in cost per day between the groups. In the absence of a low-risk pathway at our centre, these patients remain in hospital for a median of 3.5 days. With increasing evidence that discharge, as early as 24 h, is safe and feasible, these data also provide an estimate of the potential for reductions in in-patient hospital expenditure of up to AUD 2183 per day following implementation of a formal low-risk pathway (Morgan et al, 2016).

Although retrospective, this is one of the largest external paediatric FN validation studies conducted to date and had a sufficient sample size to compare derivation and validation results. To avoid recruitment bias, we collected all consecutive episodes of outpatient-onset, chemotherapy-induced FN. However, because inpatient-onset FN was excluded, very few patients with AML or immediately post-HSCT were included. Although these patients are traditionally considered high risk for MDI, it is unclear from our study how the PICNICC rule applies to them. Another potential limitation was in the allocation of MDI to FN episodes. Although this would not have affected bacteraemia, it may have resulted in incorrect allocation of viral respiratory infections based on highly sensitive PCR testing. Reassuringly, this error would have underplayed the true performance of the PICNICC rule in our population. It is also possible that the date and time that non-bacteraemia MDIs were known were earlier than what was documented in the medical records. This would have underestimated the sensitivity of the PICNICC rule using the SPOG methodology that takes into account the number of MDI known at time of assessment. Finally, although all episodes had a blood culture as part of the diagnostic work-up for MDI, only 50% had a urine culture. The clinical impact of this is likely to be low as no patient, including those without a sample taken, had a relapsed UTI within 30 days of the FN episode.

This comprehensive validation of the internationally derived PICNICC CDR highlights the importance of CDR recalibration using local data, and provides a simple framework to achieve this. Although the recalibrated PICNICC rule did not perform as well in our cohort as compared with the derivation study for prediction of any MDI, the performance is promising for bacterial MDI and after an overnight period of observation. Our study also provides a contemporary and detailed understanding of the variety of causes of FN, the potential limitations of the PICNICC CDR and reassurance as to the low rate of serious medical complications. Implementation of a low-risk pathway, using the recalibrated and validated PICNICC CDR after an overnight period of inpatient observation and in context of a structured low-risk outpatient program with careful follow-up, is likely to be safe and has the potential to reduce health-care expenditure. Centres planning low-risk FN management strategies should consider using the PICNICC rule after local validation and recalibration. Following implementation, further research is required to assess the clinical, psychosocial and economic impact of this model of care.