Introduction

Despite advances in medical care, infectious diseases remain a major disease burden worldwide, especially in children aged five years and younger.1 Sepsis, defined as a life-threatening syndrome caused by a dysregulated host immune response to infection, is a principal cause of morbidity and mortality in children.2,3 Globally, the estimated incidence of pediatric sepsis is roughly 48 per 100,000 person-years (1.2 million cases per year).4 Mortality for children with sepsis ranges from 4% to 50%, depending on severity of illness, medical resources, and geographic location.5

Early recognition and early initiation of treatment are critical to improving outcomes for patients with sepsis.6 Systemic inflammatory response syndrome (SIRS) criteria have been used to screen children with infection for the development of sepsis since 2005.7 However, the SIRS criteria have been criticized as screening and prognostic tools because of inadequate sensitivity and specificity (sensitivity ranged from 0.21 to 0.83, specificity from 0.29 to 0.62).8,9,10 Recently, the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3) Task Force replaced the SIRS criteria with the new Sequential (Sepsis-related) Organ Failure Assessment (SOFA) scoring system.11 The SOFA score is used to evaluate the severity of organ dysfunction in adult patients with infection.11,12 In addition, the quick SOFA (qSOFA) was suggested as a simple bedside tool to identify patients with sepsis outside the intensive care unit (ICU) when no laboratory tests are available.11,12

Regrettably, neither SOFA nor qSOFA was developed for children, implying an urgent need for pediatric scores. To fill the gap, researchers have attempted to develop age-adjusted SOFA and age-adjusted qSOFA scores for pediatric patients by using age-specific variables.10,13 Recently, several studies have been conducted to validate the age-adjusted SOFA and age-adjusted qSOFA criteria by assessing their performance in identifying pediatric patients with confirmed or suspected infection who are likely to have adverse outcomes.14,15,16,17 However, these studies have produced conflicting evidence regarding the predictive accuracy of the age-adjusted SOFA and age-adjusted qSOFA scores. It is currently unclear whether the age-adjusted SOFA and age-adjusted qSOFA scores have prognostic value for unfavorable outcomes in children with infection.

Thus, the purpose of this study was to evaluate the prognostic value of age-adjusted SOFA and age-adjusted qSOFA and compare them with the SIRS criteria for predicting mortality in children with infection in the hospital setting.

Methods

This meta-analysis is reported according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines.18 The study protocol was registered with PROSPERO (CRD42021232441).

Data sources and search strategy

We searched the PubMed, Web of Science, Cochrane Library, and Embase databases from inception to July 10, 2021. The detailed search strategy is presented in Supplementary file 1. We did not restrict our search by year of publication. Only English-language articles were included. To capture gray literature, we manually searched the bibliographies of all included studies to identify potentially eligible studies.

Study selection and outcomes

Studies were included if they: (1) involved children (aged 29 days-18 years) with suspected or confirmed infection in the ED, hospital wards, or the ICU; (2) evaluated the age-adjusted SOFA score, the age-adjusted qSOFA score, or the SIRS criteria for predicting mortality; (3) reported enough data to construct a 2 × 2 table (true positive (TP), false positive (FP), true negative (TN), and false negative (FN)) for further analysis; and (4) were performed retrospectively or prospectively and published as full-length articles in peer-reviewed journals.

Articles were excluded: (1) if they were case reports, case series, abstracts or letters; (2) if there were insufficient data to calculate the outcome data (TP, FP, TN, FN).

The primary outcome was in-hospital mortality. For studies in which researchers did not investigate in-hospital mortality, we used the 28- or 30-day mortality instead. The primary outcome measures were sensitivity and specificity; secondary measures included the positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and area under the summary receiver operating characteristic curve (AUSROC).
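The per-study versions of these measures follow directly from a 2 × 2 table. The block below is a minimal sketch of the standard definitions only; the class and function names are hypothetical, the counts are invented for illustration, and the pooled estimates reported in this review come from a bivariate model rather than from this calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one study: index test result versus death."""
    tp: int  # test positive, died
    fp: int  # test positive, survived
    fn: int  # test negative, died
    tn: int  # test negative, survived


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return per-study sensitivity, specificity, PLR, NLR and DOR.

    Assumes every cell is non-zero; a study with an empty cell would
    need a continuity correction, which this sketch does not apply.
    """
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    plr = sens / (1 - spec)
    nlr = (1 - sens) / spec
    dor = (t.tp * t.tn) / (t.fp * t.fn)
    return {"sensitivity": sens, "specificity": spec,
            "plr": plr, "nlr": nlr, "dor": dor}


# Hypothetical counts, not taken from any included study.
example = accuracy_measures(TwoByTwo(tp=40, fp=120, fn=10, tn=180))
```

The DOR equals PLR divided by NLR, so it summarizes both likelihood ratios in a single number.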

Data extraction, definitions, and quality assessment

Two reviewers (ZW, YH) independently screened the titles and abstracts and retrieved full-text articles for potentially relevant references. All selections were decided by consensus.

Two investigators (ZW and YH) independently collected the following variables from the included articles: authors, year of publication, country, study design, study setting, cut-off, mean or median age, number of patients included, criteria for diagnosing infection, variables used by the individual authors to develop the age-adjusted SOFA and age-adjusted qSOFA criteria, mortality (in-hospital, 28-day, or 30-day), and outcome data (TP, FP, FN, and TN). Discrepancies were resolved by consensus.

The majority of the included studies of the age-adjusted SOFA used the scoring system proposed by Matics et al.13, which was termed pediatric SOFA (pSOFA). The pSOFA score was developed by adapting the original SOFA. Specifically, the mean arterial pressure and serum creatinine level cut-offs for the first score of the pediatric logistic organ dysfunction score-2 (PELOD-2)19,20,21 were used to assign a score of 1 in the pSOFA cardiovascular and renal subscores, respectively. The cardiovascular subscores 2 to 4 were kept identical to the original SOFA score. The renal subscores 2 to 4 were modified by increasing the cut-off values for each score by the same factor as the original SOFA score. For the respiratory subscore, the arterial partial pressure of oxygen (PaO2):fraction of inspired oxygen (FiO2) ratio cut-offs were kept identical to the original score, and the peripheral oxygen saturation (SpO2):FiO2 ratio was used as an alternative surrogate of respiratory dysfunction. Additionally, two studies used another age-adapted SOFA scoring system, developed by Schlapbach et al.10, which differs slightly from the pSOFA score. In brief, only the renal and cardiovascular variables of the original SOFA score were modified, using age-specific cut-offs from the PELOD-2, and the respiratory subscore was kept identical to the original SOFA score. To derive a pediatric version of the qSOFA score, the original tachypnoea and hypotension criteria, based on respiratory rate and systolic blood pressure, respectively, were modified using age-specific cut-offs as per the 2005 Pediatric Sepsis consensus.3,22
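The age adjustment of qSOFA described above can be sketched as a lookup of age-band cut-offs followed by a three-item count. This is an illustrative sketch, not the scoring code of any included study: the names are hypothetical, the strict inequalities are an assumption, the third item (altered mentation) is carried over from the original qSOFA, and the cut-offs must be supplied by the caller. The numbers in `PLACEHOLDER_BANDS` are arbitrary placeholders, not clinical thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgeBand:
    """Age-specific cut-offs supplied by the caller (hypothetical container)."""
    max_age_years: float      # upper bound of the age band, exclusive
    tachypnea_above: float    # breaths/min above which tachypnea is scored
    hypotension_below: float  # systolic mmHg below which hypotension is scored


def age_adjusted_qsofa(age_years, resp_rate, systolic_bp, altered_mentation, bands):
    """Score 0-3: one point each for tachypnea, hypotension, altered mentation."""
    # Pick the first band whose upper age bound exceeds the patient's age.
    for band in sorted(bands, key=lambda b: b.max_age_years):
        if age_years < band.max_age_years:
            break
    else:
        raise ValueError("age outside the supplied bands")
    return (
        int(resp_rate > band.tachypnea_above)
        + int(systolic_bp < band.hypotension_below)
        + int(bool(altered_mentation))
    )


# Arbitrary placeholder values for illustration only; NOT clinical thresholds.
PLACEHOLDER_BANDS = [
    AgeBand(max_age_years=5, tachypnea_above=30, hypotension_below=80),
    AgeBand(max_age_years=18, tachypnea_above=20, hypotension_below=90),
]
```

Keeping the cut-offs outside the function makes it possible to swap in a different age-specific table without changing the scoring logic.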

Two investigators (ZW and YH) independently assessed the risk of bias of the included studies using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool23 in Review Manager (RevMan, version 5.3).

Data synthesis and statistical analysis

A 2 × 2 diagnostic table containing the TP, FP, FN, and TN values was constructed for every included study. For studies that did not report these four values, they were calculated in RevMan 5.3 from the reported sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Because high between-study heterogeneity was anticipated, the bivariate random-effects regression model24 was used to calculate the pooled sensitivity, specificity, PLR, NLR, and DOR with corresponding 95% confidence intervals (CIs) in STATA 16.0. We also plotted summary receiver operating characteristic (SROC) curves and calculated the AUSROC. Between-study heterogeneity was assessed using the I2 statistic.25 An I2 > 50% indicated substantial heterogeneity. Meta-regression analyses were conducted to explore potential sources of heterogeneity among the included studies, using the following covariates where data were available: study design (prospective vs. retrospective), disease severity (suspected or confirmed infection vs. sepsis or septic shock), study location (ICU vs. ED or general ward), overall mortality (≥10% vs. <10%), cut-off (predetermined vs. optimal), region, and outcome definitions (in-hospital mortality vs. 28- or 30-day mortality). Deeks' funnel plot26 was used to detect publication bias, with P < 0.05 indicating potential publication bias.
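Two of the steps above lend themselves to a short illustration: back-calculating the 2 × 2 cells when a study reports only accuracy figures, and deriving I2 from Cochran's Q. The sketch below is not the RevMan or STATA procedure used in this review. It assumes the numbers of deaths and survivors are known, uses the usual univariate definition of I2, and all function names and input values are hypothetical.

```python
def reconstruct_two_by_two(sensitivity, specificity, n_died, n_survived):
    """Back-calculate TP, FN, TN, FP from reported accuracy and outcome counts.

    Rounding is needed because published sensitivity and specificity are
    themselves rounded, so the resulting counts are approximations.
    """
    tp = round(sensitivity * n_died)
    fn = n_died - tp
    tn = round(specificity * n_survived)
    fp = n_survived - tn
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def i_squared(q, df):
    """I2 (percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    # Negative values are truncated at zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical study: 50 deaths, 300 survivors, sensitivity 0.80, specificity 0.60.
cells = reconstruct_two_by_two(0.80, 0.60, n_died=50, n_survived=300)
```

With Q = 20 on 7 degrees of freedom, for example, `i_squared` gives 65%, which would count as substantial heterogeneity under the 50% threshold used here.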

The overall rating of confidence in pooled prognostic accuracy estimates was assessed using the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach.27 Assessments were based on the following criteria: precision, risk of bias of the included studies, consistency, directness of the evidence, and risk of publication bias. Confidence in prognostic accuracy estimates was rated as high, moderate, low, or very low. A GRADE evidence profile was created using the guideline development tool by GRADEpro (gdt.gradepro.org).

Results

Study search

The PRISMA flow diagram in Fig. 1 shows the literature search process. The initial search identified 3,452 citations. After removing duplicate articles, we screened 2,766 potentially eligible articles. Of these articles, 2,571 were excluded based on title and abstract. A total of 195 articles underwent full-text review, of which 181 were excluded for the reasons presented in Fig. 1. Finally, a total of 14 articles9,10,14,15,16,17,28,29,30,31,32,33,34,35 (70,194 participants) met our inclusion criteria.

Fig. 1: Flow diagram of study inclusion.

PRISMA flow diagram for literature search and screening.

Study characteristics and quality assessment

Demographic variables of the included studies are summarized in Table 1. The number of patients in each study ranged from 30 to 40,228, and the overall mortality rate in each study ranged from 0.04% to 46.7%. Of the included studies, 42.9% were conducted in Asia, 21.4% in North America, 14.3% in Europe, 14.3% in Africa, and 7.1% in Oceania. All studies were observational. Six studies were prospective, and the remaining eight were retrospective. Nine studies evaluated patients at pediatric ICU (PICU) admission,10,14,15,17,29,30,32,33,34 four studies9,16,28,35 evaluated patients in the ED, and one study31 evaluated patients in a general ward. Mortality prediction was evaluated in 4 studies10,16,17,35 using the age-adjusted qSOFA score and in 8 studies9,10,15,28,31,32,34,35 using the SIRS criteria. Eight studies10,14,15,29,30,32,33,34 assessed age-adjusted SOFA for mortality prediction, of which six14,15,29,30,33,34 employed the age-adapted SOFA scoring system proposed by Matics et al.13 and two10,32 employed the scoring system developed by Schlapbach et al.10 For age-adjusted SOFA, six studies used a pre-specified cut-off value (≥2), one study used a cut-off value of ≥5, and one used a cut-off value of ≥8. The included studies used different criteria to define infection in selected patients. Eight studies10,17,28,30,31,32,34,35 considered clinically presumed or suspected infection, four14,15,29,33 considered the diagnosis of sepsis or septic shock at ICU admission, one16 considered the suspicion of infection resulting in obtaining blood tests, and one9 considered SIRS-positive patients (SIRS ≥ 2). In terms of outcome measures, 10 studies analyzed in-hospital mortality, 2 evaluated 28-day mortality, and 2 examined 30-day mortality.

Table 1 Characteristics of studies included in the meta-analysis.

Quality assessments using the QUADAS-2 criteria are summarized in Fig. 2. The majority of studies had a high or unclear risk of bias in patient selection and index test because of the retrospective nature of their analyses. All included studies were judged as having a low risk of bias for the reference standard, and 9 studies were judged as having a low risk of bias in terms of flow and timing. Regarding applicability, all included studies were judged as raising low concern in relation to the reference standard, whereas the index test raised questions about applicability in several studies.

Fig. 2: Methodological quality assessment of included studies based on the QUADAS-2 scale.

Assessment of risk of bias for individual studies (left) and risk of bias for studies of overall prognosis (right).

Diagnostic accuracy for mortality using age-adjusted SOFA, age-adjusted qSOFA scores and SIRS criteria

Figure 3 shows forest plots and heterogeneity tests of the sensitivity and specificity of age-adjusted SOFA, age-adjusted qSOFA and the SIRS criteria reported in the 14 included studies. Figure 4 shows the SROC curves for the three scoring systems in predicting mortality. Summary estimates of all diagnostic accuracy measures are shown in Supplementary file 1: Table S1.

Fig. 3: Forest plots of sensitivity and specificity.

a Age-adjusted SOFA, b age-adjusted qSOFA, c SIRS. Point estimates for sensitivity and specificity and 95% confidence intervals are shown with pooled estimates. Q = Cochran Q statistic.

Fig. 4: Summary receiver operating characteristic (SROC) curves for predicting mortality in children with suspected infection.

a Age-adjusted SOFA, b age-adjusted qSOFA, c SIRS. AUC = area under the curve.

The pooled sensitivity and specificity for age-adjusted SOFA score were 0.82 (95% CI, 0.74–0.88, I2 = 91.5%) and 0.62 (95% CI, 0.45–0.77, I2 = 99.6%), respectively. The PLR, NLR, and pooled DOR were 2.18 (95% CI, 1.46–3.25), 0.29 (95% CI, 0.22–0.38), and 7.50 (95% CI, 4.45–12.63), respectively. The pooled sensitivity of the age-adjusted qSOFA score across all included studies was 0.46 (95% CI, 0.22–0.71, I2 = 97.4%), and the specificity was 0.90 (95% CI, 0.66–0.98, I2 = 99.9%). The PLR, NLR, and the pooled DOR were 4.63 (95% CI, 1.34–16.02), 0.60 (95% CI, 0.39–0.93), and 7.68 (95% CI, 1.88–31.42), respectively. The pooled sensitivity and specificity for positive SIRS criteria were 0.79 (95% CI, 0.66–0.88, I2 = 92.4%) and 0.39 (95% CI, 0.26–0.54, I2 = 99.5%), respectively. The PLR, NLR, and the pooled DOR were 1.30 (95% CI, 1.15–1.48), 0.53 (95% CI, 0.40–0.69), and 2.48 (95% CI, 1.82–3.38), respectively. The AUSROC and corresponding 95% CI of age-adjusted SOFA, age-adjusted qSOFA, and the SIRS were 0.82 (95% CI, 0.79–0.85), 0.66 (95% CI, 0.62–0.70), and 0.64 (95% CI, 0.60–0.68), respectively.
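As a rough plausibility check, the likelihood ratios and DOR can be approximated from the pooled sensitivity (Se) and specificity (Sp) reported above. This is only a back-of-the-envelope sketch: the bivariate model estimates these quantities jointly, so the reported values need not match the simple ratios exactly.

```latex
\[
\mathrm{PLR} = \frac{\mathrm{Se}}{1-\mathrm{Sp}}, \qquad
\mathrm{NLR} = \frac{1-\mathrm{Se}}{\mathrm{Sp}}, \qquad
\mathrm{DOR} = \frac{\mathrm{PLR}}{\mathrm{NLR}}
\]
\[
\begin{array}{lccc}
 & \mathrm{PLR} & \mathrm{NLR} & \mathrm{DOR} \\
\text{Age-adjusted SOFA}  & 0.82/0.38 \approx 2.16 & 0.18/0.62 \approx 0.29 & \approx 7.4 \\
\text{Age-adjusted qSOFA} & 0.46/0.10 = 4.60       & 0.54/0.90 = 0.60       & \approx 7.7 \\
\text{SIRS}               & 0.79/0.61 \approx 1.30 & 0.21/0.39 \approx 0.54 & \approx 2.4
\end{array}
\]
```

These approximations are close to the reported estimates (2.18, 0.29, and 7.50 for age-adjusted SOFA; 4.63, 0.60, and 7.68 for age-adjusted qSOFA; 1.30, 0.53, and 2.48 for SIRS), with the small differences attributable to rounding and to joint estimation in the bivariate model.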

The GRADE evidence profiles for pooled prognostic accuracy results of age-adjusted SOFA, age-adjusted qSOFA and the SIRS criteria are displayed in Supplementary file 1: Tables S2–S4. The overall certainty of evidence for all scoring systems assessed was judged as low to very low due to risk of bias, imprecision and inconsistency.

Meta-regression and subgroup analyses

A high degree of heterogeneity in sensitivity and specificity was present in the included studies (Fig. 3). Univariate meta-regression analysis of sensitivity and specificity was performed to find potential sources of heterogeneity for the age-adjusted SOFA score and the SIRS criteria (Supplementary file 1: Tables S5-S6). We could not perform further analysis for the age-adjusted qSOFA score due to the limited number of studies.

For studies in which researchers evaluated the prognostic accuracy of age-adjusted SOFA score, cut-off value was the probable source of heterogeneity. The pooled specificity for age-adjusted SOFA was higher in studies using an optimal cut-off (0.74 (95% CI 0.57–0.91), P < 0.01) than in studies using a predetermined cut-off (0.21 (95% CI 0.06–0.35), P < 0.01).

For studies that examined the prognostic performance of the SIRS criteria, region was the possible source of heterogeneity. The sensitivity of SIRS in North America (0.61 (95% CI 0.36–0.86), P = 0.03) was significantly lower than in Africa (0.86 (95% CI 0.70–1.00), P = 0.73), Asia (0.85 (95% CI 0.68–1.00), P = 0.83), Europe (0.84 (95% CI 0.46–1.00), P = 0.49), and Oceania (0.85 (95% CI 0.61–1.00), P = 0.67).

Publication bias

Supplementary file 1: Figure S1 shows the assessment of publication bias. Based on the P values for age-adjusted SOFA, age-adjusted qSOFA and the SIRS criteria (0.07, 0.73, and 0.78, respectively) and the corresponding Deeks' funnel plots, there was no evidence of publication bias.
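For readers unfamiliar with the test behind these P values, Deeks' approach regresses the log DOR of each study on the inverse square root of its effective sample size (ESS), weighting by ESS; a slope that differs from zero suggests small-study effects. The block below is a minimal standard-library sketch of that regression, not the STATA routine used in this review. The function name is hypothetical, the study counts are invented, and applying a 0.5 continuity correction to every cell is an assumption of this sketch.

```python
import math


def deeks_regression(studies):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), weighted by ESS.

    `studies` is a list of (tp, fp, fn, tn) tuples. Returns the slope, its
    standard error, the t statistic and the residual degrees of freedom.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        # 0.5 continuity correction so that empty cells do not break the log.
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        n_died, n_survived = tp + fn, fp + tn
        ess = 4 * n_died * n_survived / (n_died + n_survived)
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)

    k = len(studies)
    if k < 3:
        raise ValueError("need at least three studies")
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2 for w, x, y in zip(ws, xs, ys))
    df = k - 2
    se = math.sqrt(rss / df / sxx)
    return {"slope": slope, "se": se, "t": slope / se, "df": df}


# Hypothetical (tp, fp, fn, tn) counts, not taken from any included study.
example_studies = [(8, 2, 2, 8), (30, 15, 10, 45), (100, 60, 30, 110), (300, 200, 100, 400)]
result = deeks_regression(example_studies)
```

The P value would then be obtained by comparing the t statistic with a t distribution on `df` degrees of freedom, a step omitted here to keep the sketch dependency-free.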

Discussion

In the present meta-analysis, we evaluated the prognostic value of age-adjusted SOFA, age-adjusted qSOFA and the SIRS criteria for predicting mortality in children with infection. The certainty of evidence ranged from low to very low for all scoring systems. We found that age-adjusted SOFA is more sensitive and specific than SIRS for mortality prediction. The age-adjusted qSOFA score is more specific, but less sensitive, than the SIRS criteria. However, due to the significant heterogeneity among the included studies, our findings should be interpreted with caution.

Our pooled estimates showed that the age-adjusted SOFA score had high sensitivity with relatively low specificity. Sepsis is a common cause of multi-organ dysfunction; the SOFA score evaluates the dysfunction of six organ systems, making it more sensitive in predicting poor outcomes.36 The relatively low specificity of the age-adjusted SOFA score indicates that some patients at low risk of mortality may be misclassified as severe. However, early recognition of sepsis and prompt disease management are crucial for decreasing sepsis-related mortality.6 High sensitivity is particularly important in identifying critically ill patients. Thus, the age-adjusted SOFA score might be a useful tool to identify children with infection who are likely to have a poor prognosis. However, there is substantial heterogeneity among the included studies. Instead of using a predetermined cut-off, some studies used an optimal cut-off, which could result in overestimation of prognostic accuracy. Moreover, two age-adapted SOFA scoring systems were employed by the included studies, which might be another source of heterogeneity. However, owing to the small number of eligible studies, we could not conduct a further subgroup analysis. Notably, all studies that assessed the prognostic value of age-adjusted SOFA in this review were conducted in the ICU. It remains uncertain whether age-adjusted SOFA has similar prognostic value in patients outside the ICU. The predictive power of the age-adjusted SOFA score for poor prognosis needs to be further confirmed in different settings. A previous meta-analysis37 on the prognostic accuracy of qSOFA in adult patients showed that the specificity of the qSOFA score for predicting mortality was higher in studies with overall mortality < 10% than in studies with overall mortality ≥ 10%. We noticed that mortality rates varied widely between included studies (0.04% to 46.7%), which might modify the prognostic accuracy of age-adjusted SOFA. However, our subgroup analysis based on mortality rates did not reveal differences in the prognostic value of the age-adjusted SOFA score (Supplementary file 1: Table S5).

Our results showed that age-adjusted qSOFA had high specificity for mortality prediction but very low sensitivity. There could be several possible reasons for these results. First, hypotension is known to be a very late sign of pediatric sepsis.38 Thus, a predictive tool that incorporates late shock signs, such as hypotension, could gain specificity at the cost of sensitivity. Second, the analyses of prognostic performance for age-adjusted qSOFA were conducted by using data predominantly from ICU patients.10,17 There may be differences in characteristics between patients outside the ICU and those in the ICU. For example, respiratory rate and systolic blood pressure in ICU patients may be influenced by mechanical ventilation and vasoactive medications, and mental status may be influenced by sedative agents. This heterogeneity may contribute to a lower sensitivity of the age-adjusted qSOFA for mortality prediction. We attempted to perform a subgroup analysis to determine whether the predictive performance of the age-adjusted qSOFA score differed between the ED and the ICU. However, the number of eligible studies was insufficient for this analysis. Further studies are warranted to verify the prognostic and predictive value of age-adjusted qSOFA in different clinical settings, especially outside the ICU.

Given that the “quick” organ dysfunction criteria were developed as a simple tool to aid physicians in making rapid clinical decisions, the very low sensitivity of the age-adjusted qSOFA score in the identification of sepsis raises concerns about potential delays in recognition and management of sepsis. It is debatable whether the existing qSOFA parameters are suitable and adequate for predicting poor outcomes in pediatric patients with infection. First, respiratory rate can be affected by multiple non-infectious causes, such as discomfort and pain. Moreover, fever alone might be a significant determinant of the presence of tachypnea.8 Second, as described previously, hypotension is a late sign of pediatric sepsis. Thus, the respiratory rate and blood pressure variables included in the pediatric version of qSOFA may not be highly specific for children. Researchers found that heart rate might be superior to respiratory rate in predicting poor outcomes.8 Furthermore, the sensitivity of age-adjusted qSOFA could be improved by replacing the hypotension criterion of the original qSOFA with capillary refill time.16 More work is needed to determine which variables are useful in predicting poor outcomes and to develop a robust and simple screening tool for pediatric patients.

The SIRS criteria have been used to diagnose pediatric sepsis since 2005, and have shown limited ability to predict poor prognosis in children with infection.8 The SIRS criteria were met by >90% of children with fever in the emergency department, but <5% of those needed intensive care.8 Further, a multicenter cohort study found that positive SIRS criteria were present in over 80% of children admitted to the ICU with infection, leading to poor specificity for capturing children at high risk of poor prognosis.10 Consistent with previous studies, we found that although the discriminative ability for mortality using positive SIRS criteria was moderately good, the specificity was very low.

The present study has several strengths. It included a comprehensive search strategy across multiple databases, and multiple subgroup and regression analyses to identify potential sources of bias. The large number of patients included in our study is an additional strength. Moreover, there is a low risk of publication bias and a high degree of applicability. There are also several limitations. First, the heterogeneity among the included studies was significant, although heterogeneity is common in meta-analyses of diagnostic studies.39 Second, although we performed multiple meta-regression analyses and the findings of most of them were in keeping with those in the overall study cohort, the results of these analyses should be interpreted with some caution due to a lack of statistical power. Because we included patients from different settings, some heterogeneity is inevitable. Due to a lack of access to individual patient data, it was impossible to adjust for pre-existing clinical risk factors for mortality in these patients. Third, we found only four studies using the age-adjusted qSOFA score for predicting mortality. Although our results suggest that the discriminative ability of the age-adjusted qSOFA score was higher than that of the SIRS criteria, the limited evidence did not allow us to draw a firm conclusion.

Conclusion

The age-adjusted SOFA score is a useful tool for predicting mortality in children with sepsis or suspected sepsis.