Introduction

Neonatal jaundice is common: up to two-thirds of all newborns develop it within the first two weeks of life1. While most cases are classified as physiological jaundice that is self-limiting without long-term sequelae, more serious disorders are not uncommon. For example, biliary atresia (BA), choledochal cyst, Alagille syndrome and progressive familial intrahepatic cholestasis are common pathological causes of neonatal cholestasis requiring prompt surgical treatment. Approximately 80% of infants with pathological jaundice in need of surgical intervention have BA2, which requires the most urgent surgical attention because its treatment window is narrow. Therefore, distinguishing BA from other causes of neonatal cholestasis (non-BA) is essential for an optimal outcome.

The occurrence of BA varies widely among different populations, with the highest incidence rates in Asia3. If untreated, affected infants develop progressive liver disease and die within the first two years of life4. A timely and uneventful Kasai portoenterostomy (KPE) can potentially restore bile flow in 30%–80% of BA patients, but complications do occur5. It is estimated that 56%–74% of post-KPE BA patients at 10 years of age require liver transplantation6. Despite significant advances in the management of BA, establishing an early diagnosis and predicting post-KPE outcomes remain two major challenges. An early diagnosis leads to a timely operation, which may improve the surgical success rate7. Currently, an accurate diagnosis can only be established by invasive methods such as intraoperative cholangiography and liver biopsy8; the diagnosis can be confirmed by examination of the liver tissue obtained at the time of KPE9,10. Postoperatively, however, there is a lack of objective parameters for predicting a poor prognosis. Our objective in this study was to identify noninvasive biomarkers that support early diagnosis and identify patients with a poor prognosis, based on a systematic review and meta-analysis of reported serum biomarkers.

Materials and methods

This systematic review and meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines11,12. No Ethical or Institutional Review Board approval was required for the study design.

Literature search

We conducted a computerized search in the PubMed, Web of Science, Embase, Scopus and OVID databases to identify English-language articles relevant to our objective up to August 1st, 2020. The following terms were used: ((“biliary atresia”[MeSH]) OR (biliary atresia)) AND (biomarker) AND (diagnosis OR prognosis OR portoenterostomy).

Inclusion and exclusion criteria

Studies evaluating the application of serum biomarkers for early diagnosis or post-KPE prognosis were considered eligible for our analysis. In addition, the articles had to meet the following inclusion criteria: (1) populations: for diagnosis, BA patients were compared with non-BA and/or healthy control (HC) groups, and for prognosis, BA patients were assessed after KPE; (2) reference standard: BA diagnoses or post-KPE prognostic outcomes were compared using standard measurements, including liver function and native liver survival rate. Potential citations that met any of the following criteria were excluded: (1) article type: animal experiments, reviews, case reports and case series including fewer than 10 patients, editorials, letters, comments and conference papers; (2) biomarkers that required liver biopsy, intraoperative cholangiography or laparoscopy; and (3) overlapping study populations.

Data extraction

The following data were extracted from the included studies using a standardized form: (1) study characteristics—last name of the first author, publication year, study duration, country of origin, study type, and number of patients; (2) demographic characteristics—age, percentage of males; (3) biomarker characteristics—name, test sample, test method, test timeframe and cutoff value; and (4) outcome characteristics—positive or negative correlation of biomarkers and diagnostic performance or prognostic outcomes, and the sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR) and area under the curve (AUC) of the biomarkers with/without an identified cutoff value.
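
The accuracy measures listed above all derive from a single 2 × 2 table per study. The following is a minimal sketch of that derivation, not the extraction form or software used in this study; the class and function names, the 0.5 continuity correction for zero cells, and the example counts are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one study: index test result versus reference standard."""
    tp: int  # test positive, condition present (e.g. BA confirmed)
    fp: int  # test positive, condition absent (e.g. non-BA)
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent


def accuracy_measures(t: TwoByTwo, correction: float = 0.5) -> dict:
    """Return sensitivity, specificity, PLR, NLR and DOR for one table.

    Assumption: if any cell is zero, `correction` is added to every cell
    so that the ratios stay finite.
    """
    tp, fp, fn, tn = t.tp, t.fp, t.fn, t.tn
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    plr = sens / (1 - spec)      # positive likelihood ratio
    nlr = (1 - sens) / spec      # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)  # diagnostic odds ratio
    # Approximate 95% CI for the DOR, computed on the log scale.
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    log_dor = math.log(dor)
    ci = (math.exp(log_dor - 1.96 * se_log_dor),
          math.exp(log_dor + 1.96 * se_log_dor))
    return {"sensitivity": sens, "specificity": spec,
            "plr": plr, "nlr": nlr, "dor": dor, "dor_ci95": ci}


# Hypothetical counts, not taken from any included study.
example = accuracy_measures(TwoByTwo(tp=90, fp=10, fn=10, tn=90))
```

With these hypothetical counts, sensitivity and specificity are both 0.90, and the DOR equals the PLR divided by the NLR. This identity holds within a single table but need not hold for pooled estimates, which are summarized separately for each measure.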

Quality assessment

The methodological quality of the articles included in the meta-analysis was assessed using tailored questionnaires based on the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria13. Two reviewers (PHY Chung and L He) independently performed the literature search, study selection, data extraction, and quality assessment. Any inconsistencies were resolved by a third reviewer (PKH Tam).

Data synthesis and statistical analysis

The primary outcome of our study was the performance of biomarkers for the early diagnosis of BA and/or post-KPE prognosis. The ultimate purpose of this study was to identify biomarkers that promote an early diagnosis or prediction of post-KPE prognosis. We constructed 2 × 2 contingency tables based on the true and false positives and negatives extracted from the available studies. Summary estimates of diagnostic test accuracy, including sensitivity, specificity, PLR, NLR and DOR with their 95% confidence intervals (CIs), were calculated by the Mantel–Haenszel method (fixed-effect model) or the DerSimonian–Laird method (random-effects model)14. A hierarchical summary receiver operating characteristic (SROC) curve with its 95% confidence region was plotted. Several statistical methods were employed to evaluate possible bias in our meta-analysis. (1) Threshold effect: the Spearman correlation coefficient between the logit of sensitivity and the logit of 1-specificity was computed; a strong positive correlation (Spearman correlation coefficient > 0.6; p < 0.05) would indicate a considerable threshold effect15,16. (2) Heterogeneity: the degree of variability in results across the included studies was evaluated by Cochran’s Q test and the I² statistic17. A p value < 0.10 in Cochran’s Q test suggested significant heterogeneity, and I² values of 0–25%, 25–50%, 50–75% and 75–100% corresponded to nonsignificant, moderate, substantial and considerable heterogeneity, respectively. The heterogeneity of the hierarchical SROC curve was calculated by weighted regression with the inverse variance method (Moses’ model), and the result of the heterogeneity test determined the choice of pooling model. (3) Publication bias could not be assessed because of the small number of studies included. An area under the SROC curve greater than 0.7 was taken to indicate high predictive accuracy for the biomarker18.
The diagnostic meta-analysis was conducted with Meta-Disc software, version 1.4 (http://www.hrc.es/investigacion/metadisc_en.htm).
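
As an illustration of the heterogeneity statistics described above, the sketch below computes Cochran's Q, I² and the DerSimonian–Laird between-study variance from per-study effects (for example, log DORs) and their variances. It is a simplified sketch and not the Meta-Disc implementation; the function names and example values are assumptions, the Q-test p value is omitted, and because the I² bands in the text share their boundaries, the sketch assigns each boundary value to the higher band.

```python
def pool_with_heterogeneity(effects, variances):
    """Inverse-variance pooling with Cochran's Q, I^2 and DL tau^2.

    effects   -- per-study effect sizes on an additive scale (e.g. log DOR)
    variances -- the corresponding within-study variances
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variability attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird moment estimator of the between-study variance.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return {"pooled_fixed": pooled, "q": q, "df": df, "i2": i2, "tau2": tau2}


def i2_band(i2):
    """Map I^2 (in percent) to the verbal bands used in the text."""
    if i2 < 25:
        return "nonsignificant"
    if i2 < 50:
        return "moderate"
    if i2 < 75:
        return "substantial"
    return "considerable"


# Hypothetical log-DOR values and variances for two studies.
result = pool_with_heterogeneity([0.0, 2.0], [0.5, 0.5])
band = i2_band(result["i2"])
```

A tau² of zero reduces the random-effects weights to the fixed-effect weights, which is why the heterogeneity result can drive the choice between the two pooling models.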

Results

Literature search

A flow diagram of the literature screening and selection is outlined in Fig. 1. Of the 1,189 citations obtained from the PubMed, Web of Science, Embase, Scopus and OVID databases, 263 duplicates, 21 conference papers, 196 reviews and 4 case reports were excluded. The remaining 705 citations were examined by title and abstract screening, and 600 of them were removed. After full-text assessment of the remaining 105 studies, 54 were further omitted for the following reasons: (1) 7 articles had no useful information; (2) 12 involved BA alone without prognosis; (3) 5 were animal experimental studies; (4) 28 used biomarkers for the liver or biliary tract; and (5) 2 were reviews. Ultimately, 51 articles were considered eligible for systematic review19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69. Thirty-one studies20,21,22,23,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,67 reported single-center evidence of biomarkers related to BA diagnosis or post-KPE prognosis, and 8 studies24,25,26,27,28,29,64,65 provided multicenter evidence but could not be included in the meta-analysis. The data from the remaining 12 studies were entered into 2 × 2 contingency tables, and the performance of biomarkers for early diagnosis19,56,57,58,59,60,66,68,69 or post-KPE prognosis of BA61,62,63 was evaluated. All biomarkers that are useful for BA diagnosis or post-KPE prognosis are summarized in eTable 1 (Supplementary materials, pages 1–2). More studies are needed to confirm the performance of these biomarkers because only evidence from individual case–control studies or case series was documented.

Figure 1

Flow diagram of study selection. Abbreviations: BA, biliary atresia.

Quality assessment of the studies included for meta-analysis

The quality assessment of the 12 included studies based on the QUADAS-2 criteria is shown in Fig. 2. In the patient selection domain, there was a high risk of bias in 7 studies because they did not avoid a case–control design19,56,59,63,66,68,69. Four studies showed a high risk of bias in the index test domain because their index test results were interpreted with knowledge of the results of the reference standard19,59,63,69. In addition, five studies showed a high risk of bias in the reference standard domain due to a lack of reciprocal blinding between the index test and the reference standard19,56,59,63,69. For the flow and timing domain, there was a high risk of bias for six articles because the intervals between the index test and the reference standard were not appropriate19,56,58,59,66,69; another 3 articles showed an unclear risk of bias because these intervals were not clearly reported57,63,68. No applicability concerns were found for any of the 12 studies.

Figure 2

Quality assessment of the included studies for meta-analysis based on QUADAS-2 criteria. (A) Methodological quality graph; (B) Methodological quality summary. ‘ − ’ in red, ‘ + ’ in green and ‘?’ in yellow mean high risk, low risk and unclear risk, respectively.

Characteristics of the studies included for meta-analysis

The study and biomarker characteristics of the included studies are presented in Table 1. The study characteristics in “patient-level” analysis and the demographic characteristics of the included studies are summarized in Table 2. The publication dates of the 12 included articles ranged from 2008 to 2020 (median: 2018); the number of case–control studies was identical to that of retrospective cohort studies (n = 5), and more than half of the included studies were conducted in China19,57,58,60,62,68,69. We retrieved 4 studies investigating the diagnostic performance of matrix metallopeptidase-7 (MMP-7)57,58,59,60, 2 investigating that of interleukin 33 (IL-33)19,56 and 3 investigating that of γ-glutamyl transferase (GGT)59,68,69 in the early differentiation of BA from non-BA, as well as 3 studies evaluating that of the aspartate aminotransferase to platelet ratio index (APRi) for post-KPE prognosis (specifically, 2 for significant liver fibrosis and 3 for cirrhosis)61,62,63. Additionally, Table 1 summarizes the test method and test timeframe of the biomarkers, and Table 2 presents the percentage of males and mean age of the analyzed subjects in each biomarker analysis.

Table 1 Characteristics of included studies in the “study-level” analysis.
Table 2 Characteristics of the included studies in the “patient-level” analysis.

Biomarkers for differentiating BA from non-BA

MMPs are key enzymes responsible for the degradation and deposition of all protein components in the extracellular matrix and basement membrane and participate in liver fibrosis caused by BA as well as other liver diseases70. Three retrospective cohort studies57,58,60 and one case–control study59 suggested that BA infants had a significantly higher level of serum MMP-7 than non-BA infants. Although the cutoff values used in these 4 articles were discordant, serum MMP-7 showed excellent diagnostic performance in distinguishing BA from non-BA: the sensitivity, specificity and AUC ranged from 95% to 99%, 83% to 95% and 0.96 to 0.99, respectively (Table 3). The summary sensitivity and specificity of serum MMP-7 were 96% (95% CI: 94–98%) and 91% (95% CI: 87–94%), respectively (Fig. 3A and 3B), and the PLR, NLR and DOR were 10.60 (95% CI: 5.48–20.52), 0.04 (95% CI: 0.02–0.07), and 313.42 (95% CI: 138.89–707.28), respectively (eFigure 1A, B and C in Supplementary materials, page 4). The AUC of MMP-7 for the diagnosis of BA was 0.9847 (eFigure 6A in Supplementary materials, pages 9–10), indicating strong predictive accuracy.
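
To show how the pooled likelihood ratios above can be read clinically, the sketch below converts a pre-test probability into a post-test probability through the odds form of Bayes' theorem. This step is not part of the reported analysis; the function name is illustrative, and the 30% pre-test probability is a hypothetical value chosen only for the example.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability.

    Uses post-test odds = pre-test odds x likelihood ratio.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Pooled MMP-7 likelihood ratios reported in the text.
PLR_MMP7 = 10.60
NLR_MMP7 = 0.04

# Hypothetical pre-test probability of BA in a referred cholestatic infant.
pretest = 0.30
after_positive = post_test_probability(pretest, PLR_MMP7)  # about 0.82
after_negative = post_test_probability(pretest, NLR_MMP7)  # about 0.017
```

Under this assumed pre-test probability, a positive MMP-7 result would raise the probability of BA to roughly 82%, and a negative result would lower it to below 2%. The figures shift with the pre-test probability of the population being tested.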

Table 3 Summary of diagnostic test accuracy data in individual studies.
Figure 3

Coupled forest plots of the sensitivity and specificity of MMP-7 for BA diagnosis. (A) Sensitivity of MMP-7 for BA diagnosis; (B) Specificity of MMP-7 for BA diagnosis.

One of the pivotal etiologies of BA is associated with inflammatory processes, whereby activated immune cells release proinflammatory cytokines that result in ongoing damage and obstruction of bile ducts and ductules71. Several case–control studies indicated that serum levels of IL-33 and IL-18 in BA patients were significantly greater than those in HC children24,25,26,56. We calculated the diagnostic performance of IL-33 for distinguishing BA from non-BA using data from 2 of 9 selected articles19,56. With different cutoff values for each, the sensitivity, specificity and AUC of the included articles ranged from 61% to 94%, 78% to 97% and 0.67 to 0.995, respectively (Table 3). The summary sensitivity and specificity of IL-33 were 77% (95% CI: 65–87%) and 85% (95% CI: 75–92%), respectively (Fig. 4A and B), and the PLR, NLR and DOR were 7.78 (95% CI: 0.48–127.48), 0.20 (95% CI: 0.02–1.90) and 41.76 (95% CI: 0.56–3,124.11), respectively (eFigure 2A, 2B and 2C in Supplementary materials, page 5).

Figure 4

Coupled forest plots of the sensitivity and specificity of IL-33 for BA diagnosis. (A) Sensitivity of IL-33 for BA diagnosis; (B) Specificity of IL-33 for BA diagnosis.

GGT has been the most widely used biomarker for BA diagnosis in recent decades. Four retrospective cohort studies suggested that serum levels of GGT in BA patients were significantly higher than those in non-BA infants59,66,68,69, and they also provided data to calculate the diagnostic performance of GGT in the differentiation of BA from non-BA. Depending on the cutoff value, the sensitivity, specificity and AUC of GGT ranged from 79% to 83%, 71% to 81% and 0.843 to 0.9, respectively (Table 3). The summary sensitivity and specificity of GGT were 80% (95% CI: 79–82%) and 79% (95% CI: 74–83%), respectively (Fig. 5A and B), and the PLR, NLR and DOR were 3.76 (95% CI: 3.09–4.57), 0.25 (95% CI: 0.23–0.28) and 15.06 (95% CI: 11.67–19.43), respectively (eFigure 3A, B and C in Supplementary materials, pages 5–6). The AUC of GGT for the diagnosis of BA was 0.9645 (eFigure 6B in Supplementary materials, pages 9–10), indicating high predictive accuracy.

Figure 5

Coupled forest plots of the sensitivity and specificity of GGT for BA diagnosis. (A) Sensitivity of GGT for BA diagnosis; (B) Specificity of GGT for BA diagnosis.

Biomarkers for post-KPE persistent jaundice

Similar to the inflammatory processes that trigger BA, inflammation also plays a key role in the postoperative period. Two case–control studies found that a postoperative increase in serum IL-18 levels was positively associated with post-KPE jaundice25,26. The findings suggest that serum IL-18 may serve as a marker for postoperative jaundice in BA patients.

Biomarkers for post-KPE significant liver fibrosis

MMPs are not only involved in the pathogenesis of BA but also predict post-KPE clinical outcomes. Evidence from 4 case–control studies showed that a high serum concentration of preoperative MMP-7 correlates positively with the severity of post-KPE liver fibrosis27,57,60,65. Serum hyaluronic acid (HA) is a high-molecular-weight glycosaminoglycan that is produced and present in the extracellular matrix. Progressive liver diseases impair uptake of HA and raise the concentration of serum HA72. Elevated serum HA levels are a sensitive predictor of liver impairment. The results of some studies collectively suggest that the concentration of postoperative serum HA is associated with the severity of liver fibrosis27,28,29.

Three case–control studies concluded that APRi was a sensitive biomarker for predicting post-KPE liver fibrosis62,63,73. We calculated the diagnostic performance of APRi in predicting post-KPE significant liver fibrosis from 2 of these 3 studies62,63. With different cutoff values of APRi, the sensitivity, specificity and AUC of the 2 included studies ranged from 61% to 62%, 76% to 88% and 0.75 to 0.88, respectively (Table 3). The summary sensitivity and specificity were 61% (95% CI: 49–72%) and 80% (95% CI: 67–90%), respectively (Fig. 6A and B), and the PLR, NLR and DOR were 3.09 (95% CI: 1.73–5.51), 0.49 (95% CI: 0.36–0.66) and 6.34 (95% CI: 2.89–13.90), respectively (eFigure 4A, B and C in Supplementary materials, pages 6–7).

Figure 6

Coupled forest plots of the sensitivity and specificity of APRi for post-KPE significant liver fibrosis. (A) Sensitivity of APRi for significant fibrosis; (B) Specificity of APRi for significant fibrosis.

Biomarkers for post-KPE cirrhosis

The progression of liver fibrosis eventually leads to cirrhosis, for which liver transplantation is the only solution. Noninvasive methods to predict the occurrence of cirrhosis would help prevent the development of liver failure and the need for liver transplantation in BA patients. Three of our 9 included studies found that APRi was a useful noninvasive tool to predict post-KPE cirrhosis61,62,63. The sensitivity, specificity and AUC of APRi with different cutoff values in these studies ranged from 71% to 91%, 81% to 84% and 0.81 to 0.88, respectively (Table 3). The summary sensitivity and specificity were 78% (95% CI: 63–88%) and 83% (95% CI: 79–87%), respectively (Fig. 7A and B); the PLR, NLR and DOR were 4.56 (95% CI: 3.37–6.18), 0.28 (95% CI: 0.17–0.46) and 16.69 (95% CI: 8.34–33.38), respectively (eFigure 5A, B and C in Supplementary materials, pages 7–8). The AUC of APRi for the prediction of post-KPE cirrhosis was 0.8729 (eFigure 6C in Supplementary materials, pages 9–10), indicating high predictive accuracy.

Figure 7

Coupled forest plots of the sensitivity and specificity of APRi for post-KPE cirrhosis. (A) Sensitivity of APRi for cirrhosis; (B) Specificity of APRi for cirrhosis.

M2BPGi, also known as Wisteria floribunda agglutinin-positive human Mac-2-binding protein, has been used as a glycobiomarker of liver fibrosis in patients with chronic hepatitis C74. Recently, two studies from Japan reported that M2BPGi was capable of predicting post-KPE cirrhosis in BA63,64.

Heterogeneity and threshold effect

Most analyses showed no significant heterogeneity on Cochran’s Q test and the Higgins I² statistic, whereas the following analyses showed substantial to considerable heterogeneity: (1) PLR (p = 0.0499, I² = 61.0%) of serum MMP-7 in BA diagnosis; and (2) sensitivity (p = 0.0015, I² = 90.0%), specificity (p = 0.0136, I² = 83.6%), PLR (p = 0.0055, I² = 87.0%), NLR (p = 0.0014, I² = 90.2%) and DOR (p = 0.0011, I² = 90.6%) of serum IL-33 in BA diagnosis (eTable 2 in Supplementary materials, page 3). The Spearman correlation coefficient revealed no threshold effects in the analyses of MMP-7 for BA diagnosis or APRi for predicting cirrhosis, and the weighted regression of their SROC curves showed no heterogeneity (eTable 3 in Supplementary materials, page 3).
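
The threshold-effect check reported above can be sketched as follows: sensitivity and 1-specificity are logit-transformed for each study, and the Spearman coefficient is taken as the Pearson correlation of their ranks. This is a plain-Python sketch under stated assumptions rather than the Meta-Disc routine; the helper names and input values are hypothetical, and the accompanying p value is not computed.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)


def threshold_effect_rho(sensitivities, specificities):
    """Spearman rho between logit(sensitivity) and logit(1 - specificity)."""
    x = [logit(s) for s in sensitivities]
    y = [logit(1.0 - s) for s in specificities]
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical studies in which sensitivity rises as specificity falls,
# the pattern expected when studies differ mainly in their cutoff value.
rho = threshold_effect_rho([0.6, 0.7, 0.8, 0.9], [0.9, 0.8, 0.7, 0.6])
```

With only a handful of studies per biomarker, a rank correlation of this kind is unstable, which is consistent with the threshold effect not being obtainable for the two-study analyses.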

Discussion

Existing methods for diagnosing BA rely on invasive procedures such as surgical exploration and operative cholangiography, and all preoperative tests are unreliable. Although the Kasai operation offers potential bile drainage, it has a narrow treatment window, and a large proportion of patients experience ongoing problems, including persistent cholestasis, portal hypertension, liver fibrosis and cirrhosis. The monitoring of persistent jaundice requires continued measurements of total bilirubin, while liver fibrosis is traditionally evaluated by liver biopsy. Therefore, noninvasive measures that promote early diagnosis and recognition of complications are beneficial. Serum biomarkers have been widely used as screening or prognostic tools for other conditions, such as degenerative and malignant diseases75,76. Various biomarkers for BA have been reported, but there is a lack of high-level evidence to confirm their value. Herein, we conducted a systematic review and meta-analysis to investigate the most appropriate noninvasive biomarkers for diagnosing BA and predicting post-KPE outcomes.

Inflammation is a trigger factor that can cause an autoimmune response against antigens from the biliary epithelium77. Immune-mediated biliary injury is characterized by overexpression of histocompatibility antigens on bile ducts and obvious infiltration of immunologically active T lymphocytes, which is a central feature of adult liver diseases and is also related to obstruction of neonatal extrahepatic bile ducts. Our study confirms the diagnostic performance of serum IL-33 in the early diagnosis of BA and provides good evidence that IL-18 can predict post-KPE persistent jaundice. IL-33 and IL-18 are the two most closely related and best-characterized members of the IL-1 family78. Dong et al.24 found that the serum IL-33 level was significantly higher in BA patients (791.0 ± 22.22 pg/mL) than in non-BA (607.1 ± 20.68 pg/mL) and HC (588.5 ± 27.71 pg/mL) groups (both p < 0.001), but no significant difference was observed between the latter two groups (p > 0.05). In addition, one study by Vejchapipat and colleagues25 found elevated serum IL-18 in medium-term BA survivors, which increased significantly with the severity of post-KPE persistent jaundice (p = 0.004). Taken together, inflammatory factors and the autoimmune response are both involved in the etiology and disease progression of BA.

The diagnostic performance of serum MMP-7 in the early diagnosis of BA and its good level of evidence for predicting post-KPE significant liver fibrosis are presented here. Of note, serum for MMP-7 measurement was collected before KPE. Overall, serum MMP-7 is a reliable biomarker to diagnose BA in a clinical setting due to its high sensitivity (95–99%) and specificity (83–95%)57,58,59,60. γ-Glutamyl transferase (GGT), one of the factors measured in biochemical liver function tests, is also utilized to differentiate BA from non-BA; at a cutoff value of 250–303 IU/L, the sensitivity and specificity were 82.8–83.3% and 70.6–81.6%, respectively59,66,69, suggesting that the diagnostic accuracy of MMP-7 is higher than that of GGT. In addition, Wu et al.57 showed a positive correlation between serum MMP-7 and the severity of liver fibrosis in infants with cholestasis at a mean age of 1.5 months, indicating that MMP-7 is likely useful for predicting post-KPE liver fibrosis in young BA patients. Furthermore, the MMP-7 protein was found to be significantly elevated in BA patients with persistent cholestasis and liver fibrosis79,80,81. Therefore, MMP-7 is a very valuable biomarker for both BA diagnosis and post-KPE prognosis.

Although there are no biomarkers with good evidence levels for post-KPE portal hypertension, we found serum HA and APRi to have good evidence levels in association with post-KPE liver fibrosis. HA, a linear polysaccharide, is involved in liver fibrogenesis; hepatic production of HA is predominantly carried out by hepatic stellate and myofibroblast-like cells82. The serum HA level in HC children is low, whereas it is significantly increased in BA patients34, suggesting that HA is a potential biomarker for differentiating BA from HC. Additionally, the concentration of serum HA correlates positively with the severity of post-KPE cirrhosis and its complications in BA patients28, such as ascites or esophageal varices, both of which reflect the clinical characteristics of portal hypertension.

APRi was first introduced as a noninvasive tool by Wai and colleagues in 2003 to predict significant liver fibrosis and cirrhosis in adults with chronic hepatitis C83, and it has been employed as a biomarker in the evaluation of liver fibrosis in BA patients61,84. This measure is calculated as [serum aspartate aminotransferase level (U/L)/upper limit of normal] × 100/platelet count (10³/μL)83. Our current meta-analysis revealed a high diagnostic accuracy of APRi for predicting post-KPE significant liver fibrosis and cirrhosis. Early reports suggested that postoperative APRi might help predict post-KPE esophageal varices and native liver survival in BA patients85,86. Nevertheless, it is worth mentioning that no correlation between preoperative APRi and native liver survival has been observed67. One study by Yang et al.62 further showed that preoperative APRi correlated significantly with post-KPE METAVIR scores (a scoring system to quantify liver fibrosis) in BA patients and could predict post-KPE persistent jaundice and cirrhosis, despite the different reference values among centers. In the clinical setting, these results should be interpreted together with other clinical parameters. Similarly, serum M2BPGi has been confirmed as a novel biomarker for predicting post-KPE cirrhosis in BA patients; its AUC (0.93) is higher than that of APRi (0.81) and the fibrosis-4 index (FIB-4) (0.59)63,64. A study by Kuno74 found that serum M2BPGi was also a sensitive biomarker for predicting significant liver fibrosis and cirrhosis in adults with chronic viral hepatitis, and its performance for cirrhosis prediction was superior to that of FIB-4 and HA. Our present study consistently demonstrates a high diagnostic accuracy of M2BPGi in the prediction of post-KPE cirrhosis in BA patients.
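
The APRi calculation described above can be written as a short function. This is a sketch only: the function and parameter names are illustrative, the platelet count is taken in units of 10³/μL (numerically equal to 10⁹/L), and the input values in the example are hypothetical rather than patient data.

```python
def apri(ast_u_per_l, ast_upper_limit_normal_u_per_l, platelets_1e3_per_ul):
    """Aspartate aminotransferase to platelet ratio index.

    APRi = (AST / upper limit of normal AST) x 100 / platelet count,
    with AST in U/L and the platelet count in 10^3 per microlitre.
    """
    if ast_u_per_l < 0:
        raise ValueError("AST must be non-negative")
    if ast_upper_limit_normal_u_per_l <= 0 or platelets_1e3_per_ul <= 0:
        raise ValueError("upper limit of normal and platelet count must be positive")
    return (ast_u_per_l / ast_upper_limit_normal_u_per_l) * 100.0 / platelets_1e3_per_ul


# Hypothetical values: AST 80 U/L, laboratory upper limit 40 U/L,
# platelet count 200 x 10^3 per microlitre.
example_apri = apri(80.0, 40.0, 200.0)
```

Because the upper limit of normal for AST is laboratory-specific, the same patient can yield different APRi values at different centers, which is one reason cutoff values reported across studies are not directly interchangeable.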

Based on our analysis, we propose using biomarkers to assist in the diagnosis of BA and in the monitoring of postoperative outcomes (Fig. 8). The best cutoff value of biomarkers, if obtainable, can be selected in terms of the largest AUC value. A high concentration of serum IL-33 or IL-18 distinguishes newborns with BA from HCs, and differentiation of BA from non-BA can be suggested by measuring levels of serum MMP-7, IL-33 and GGT. Regarding the prediction of post-KPE prognosis, a higher serum IL-18 level indicates the occurrence of persistent jaundice; an increased value of APRi, preoperative serum MMP-7 level or postoperative serum HA level predicts the occurrence of significant liver fibrosis. A value of APRi or postoperative serum M2BPGi higher than the best cutoff value suggests liver cirrhosis.

Figure 8

Flow diagram of practical strategy using biomarkers for diagnosing BA and prognosing post-KPE clinical outcomes. Abbreviations: MMP-7, matrix metallopeptidase-7; IL, interleukin; GGT, γ-glutamyl transferase; HA, hyaluronic acid; APRi, aspartate aminotransferase to platelet ratio index; M2BPGi, Mac-2-binding protein glycosylation isomer. Selecting the best cut-off value is based on the largest AUC. The cut-off value is not presented in the investigations of serum IL-18 for diagnosing BA and prognosing post-KPE persistent jaundice and serum HA for prognosing post-KPE significant fibrosis. The best cut-off values of MMP-7 for diagnosing BA and prognosing post-KPE significant liver fibrosis both are 52.85 ng/mL with the largest AUC of 0.9958; the best cut-off value of IL-33 for diagnosing BA is 20.8 pg/mL with the largest AUC of 0.99556; the best cut-off value of GGT for diagnosing BA is 300 U/L with the largest AUC of 0.960; the best cut-off value of APRi for prognosing liver fibrosis and cirrhosis is 0.88 with the largest AUC of 0.8862.

Nonetheless, biomarkers are not without limitations. The universal use of biomarkers for BA screening may increase the health-care budget; on the other hand, it might reduce the socioeconomic burden of BA by improving outcomes. Furthermore, such an approach might replace other screening policies, such as stool color cards, as already practiced in mainland China, Japan and Taiwan87. In general, the clinical application of some biomarkers is limited by availability. Last, the best cutoff values of some of the biomarkers, including MMP-7, IL-33 and IL-18, remain unclear, and further study will be required to determine them.

We acknowledge a number of limitations in our study. Our meta-analysis included all currently available relevant English-language literature, but publication bias may exist due to the small number of papers analyzed. In addition, some analyses detected significant heterogeneity, which might lead to further bias. Only 12 studies had complete data available for meta-analysis, and the others could be included in the systematic review only. Moreover, because only a few publications were included in the analysis of BA diagnosis based on serum IL-33 and the prediction of post-KPE significant liver fibrosis with APRi, the threshold effect and SROC curves were not obtained in either analysis. Last, the present work was a diagnostic meta-analysis, and all the included studies were either case–control or cohort studies rather than randomized controlled trials, which limited the calculation of predictive values and the evidence level of the biomarkers studied.

Conclusions

Serum IL-33 and IL-18 are both useful biomarkers for differentiating BA from HC, and serum MMP-7, IL-33 and GGT are applicable biomarkers to distinguish BA from non-BA. After KPE, biomarkers predicting the prognosis may include (1) the serum IL-18 level to predict persistent jaundice; (2) APRi, postoperative serum HA and preoperative serum MMP-7 to predict significant liver fibrosis; and (3) APRi and postoperative serum M2BPGi to predict cirrhosis. These noninvasive biomarkers should be incorporated into the management strategy for BA.