Diagnostic performance of serum IgG4 level for IgG4-related disease: a meta-analysis

An elevated serum IgG4 level is one of the most useful factors in the diagnosis of IgG4-related disease (IgG4-RD). In this study, we performed a meta-analysis of published articles assessing the diagnostic accuracy of serum IgG4 concentrations for IgG4-RD. The databases of MEDLINE/PubMed, EMBASE and Web of Science were systematically searched for relevant studies. Sensitivities and specificities of serum IgG4 in each study were calculated, and the hierarchical summary receiver operating characteristic (HSROC) model with a random effects model was employed to obtain the individual and pooled estimates of sensitivity and specificity. In total, twenty-three studies comprising 6048 patients with IgG4-RD were included in the meta-analysis. The pooled sensitivity was 85% with a 95% confidence interval (CI) of 78–90%; the pooled specificity was 93% with a 95% CI of 90–95%. The HSROC curve for quantitative serum IgG4 lies close to the upper left corner of the plot, and the area under the curve (AUC) was 0.95 (95% CI 0.93, 0.97), which suggests a high diagnostic accuracy of serum IgG4 for the entity of IgG4-RD. Our study suggests that serum IgG4 has high sensitivity and specificity in the diagnosis of IgG4-RD.

patients with IgG4-RD 6,9,15 . Notably, IgG4-RD is a systemic fibro-inflammatory disease with a similar pathogenesis and cardinal features across affected organs; however, it has rarely been analysed as a single systemic entity, but rather as an individual disorder of a single organ, especially the pancreas and the salivary or lacrimal glands 2,11,16-19 . To date, no comprehensive overview of the accuracy and precision of the serum IgG4 concentration for the diagnosis of IgG4-RD as a whole has been performed. We aimed to establish the diagnostic performance of the serum IgG4 concentration in distinguishing IgG4-RD involving the pancreas, bile duct, salivary gland, and lacrimal gland from non-IgG4-RD and/or healthy controls.

Methods
Search strategy. We searched the electronic databases of MEDLINE (via PubMed), EMBASE and the Web of Science from 2000 to September 2015 in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines 20 . The systematic search combined the terms "serum" and "immunoglobulin g4" OR "igg4" with the terms "sclerosing pancreatitis" OR "autoimmune chronic pancreatitis" OR "autoimmune pancreatitis" OR "cholangitis" OR "sclerosing cholangitis" OR "Küttner tumor" OR "sialadenitis" OR "sclerosing sialadenitis" OR "sclerosing dacryoadenitis" OR "Mikulicz's disease" OR "igg4-rd" OR "igg4-related disease", with a species restriction to humans and a language restriction to English. The reference lists of relevant review articles were also screened to identify additional eligible articles not retrieved by the database searches.
Data extraction and quality assessment. Prospective or retrospective case-control studies on the utility of serum IgG4 concentration in the diagnosis of IgG4-RD were deemed eligible for inclusion in the meta-analysis. Eligible studies also had to evaluate the serum IgG4 concentration, with an unambiguous cut-off value, in IgG4-RD involving a wide variety of organs versus other diseases and healthy controls. When several articles used the same case series, the article with the larger sample size or the more recent publication date was included. Studies with inadequate data for confirming the diagnosis of IgG4-RD and those assessing the role of IgG4 in the pathogenic mechanism were excluded. Conference or poster abstracts without sufficient clinical information or subsequent publication in full text were excluded. Studies with fewer than 10 included patients or based on animal or cell cultures were also excluded.
Risk of bias and applicability were critically assessed according to the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool, which has 4 key domains: patient selection, index test, reference standard, and flow and timing 21 . Risk of bias and applicability concerns were judged as "low," "high," or "unclear". Data extraction and quality assessment were performed by two reviewers (Wen-long Xu and Ying-chun Ling), and disagreements were resolved by discussion.
Statistical analysis. Forest plots of the individual and summarised sensitivity and specificity, along with 95% confidence intervals (95% CIs), of the included studies were generated to graphically represent the diagnostic value of serum IgG4 in IgG4-RD. Subsequently, a hierarchical summary receiver operating characteristic (HSROC) model with a corresponding 95% confidence contour and 95% prediction contour was calculated. A bivariate random effects model following the DerSimonian-Laird method with a corresponding test of heterogeneity was used for data pooling. Heterogeneity across the included studies was assessed using the Q test and the I² statistic, which ranges from 0 to 100%, with values of ≤ 25%, ≤ 50% and ≤ 75% interpreted as low, medium and high inconsistency, respectively, in accordance with the proposal of Higgins and Thompson 22 . Stratified analysis and meta-regression based on ethnicity, spectrum of IgG4-RD and detection method were performed to explore potential sources of heterogeneity. Publication bias was tested using Egger's precision-weighted linear regression test and sensitivity analysis, and displayed graphically using funnel plots. The causes of heterogeneity were further assessed using a sensitivity analysis in which individual studies were omitted sequentially to analyse the influence of each single study on the overall detection rate of IgG4-RD. The meta-analysis was conducted using Stata/SE version 13.1 (StataCorp LP, Texas, USA). P < 0.05 was considered statistically significant.
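The pooling and heterogeneity steps described above can be illustrated with a minimal Python sketch. It is a simplified, univariate stand-in and not the bivariate/HSROC model fitted in Stata for this study: it pools logit-transformed sensitivities with the DerSimonian-Laird estimator and reports Cochran's Q and I². The 2x2 counts, the 0.5 continuity correction and all function names are assumptions made for illustration only.

```python
import math

def sens_spec(tp, fp, fn, tn):
    """Sensitivity and specificity from a 2x2 (fourfold) table."""
    return tp / (tp + fn), tn / (tn + fp)

def logit_with_variance(events, non_events):
    """Logit of a proportion and its approximate variance (0.5 continuity correction)."""
    e, n = events + 0.5, non_events + 0.5
    return math.log(e / n), 1.0 / e + 1.0 / n

def inv_logit(x):
    """Map a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def dersimonian_laird(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled effect, its standard error, Cochran's Q, I^2 in percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, q, i2

# Hypothetical (TP, FP, FN, TN) counts, NOT taken from the included studies.
studies = [(45, 6, 5, 94), (30, 4, 10, 76), (80, 15, 12, 150), (22, 2, 8, 60)]

# Sensitivity uses true positives (events) and false negatives (non-events).
sens_logits = [logit_with_variance(tp, fn) for tp, fp, fn, tn in studies]
pooled, se, q, i2 = dersimonian_laird(
    [y for y, _ in sens_logits], [v for _, v in sens_logits]
)
lo, hi = inv_logit(pooled - 1.96 * se), inv_logit(pooled + 1.96 * se)
print(f"pooled sensitivity {inv_logit(pooled):.3f} "
      f"(95% CI {lo:.3f}-{hi:.3f}), Q = {q:.2f}, I2 = {i2:.1f}%")
```

Specificity can be pooled in the same way by passing true negatives and false positives to `logit_with_variance`. A bivariate model additionally estimates the correlation between sensitivity and specificity across studies, which this sketch ignores.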

Results
Study identification and selection. The initial keyword search yielded 2071 potentially relevant studies from the databases of PubMed (n = 533), Embase (n = 708), and the Web of Science (n = 830). After 464 duplicates were discarded, 1607 articles remained, whose titles and abstracts were screened for eligibility. In accordance with the predefined inclusion criteria, 1472 articles were removed, and the remaining 135 articles were deemed potentially relevant. Following further review of the full-text articles, 111 studies were excluded due to improper design for data extraction (n = 70), insufficient data for fourfold table construction (n = 38), duplicated publication (n = 2) and publication in a language other than English (n = 1). Finally, 23 articles were considered eligible and included in this meta-analysis (Fig. 1). No disagreement occurred between the two reviewers.
Study characteristics and methodological quality. In total, 6048 patients in 23 studies were included, covering a spectrum of IgG4-RD that comprised autoimmune pancreatitis (AIP; 13 studies), Mikulicz's disease (MD; 3 studies), IgG4-associated cholangitis (6 studies) and IgG4-RD without further classification according to the organ involved (4 studies). The study types included 10 retrospective studies, 6 prospective studies and 7 studies that did not report the study type. Ethnicity was Asian in 15 studies with 3931 patients and Caucasian in 8 studies with 2117 patients (Table 1).
Overall, the included studies were of moderate methodological quality according to the QUADAS-2 tool. A high risk of bias and applicability concerns regarding patient selection arose because all included studies were of a non-randomised or case-control design. The risk of bias for the index test and reference standard was either (1) unclear, because the index test and/or reference standard were not interpreted double-blindly in any included study except one 23 , or (2) high, because the details of the serum IgG4 index test were not reported in five studies 24-28 . There were no major concerns regarding the applicability of the reference standard for the included studies. The risk of bias for flow and timing arose because the interval and any interventions between the index test and the reference standard were not described in all studies (Table 1).
The diagnostic values of the studies are shown in an HSROC graph in which the summary operating point represents the pooled sensitivity and specificity, and the 95% confidence region and 95% prediction region represent the 95% CIs of the pooled and individual sensitivity and specificity, respectively. The HSROC curve for quantitative serum IgG4 lies close to the upper left corner of the HSROC plot, and the area under the curve (AUC) was 0.95 (95% CI 0.93, 0.97), which suggests a high diagnostic accuracy of serum IgG4 for the entity of IgG4-RD. Finally, the curve is symmetrical, with a Z statistic of −0.58 (p = 0.564), which also indicates a high diagnostic accuracy for serum IgG4 (Fig. 4).
Subgroup analysis and publication bias. The subgroup analysis was performed according to study period (published before vs. after 2011), study design (prospective vs. retrospective), sample size (fewer than 150 vs. more than 150), ethnicity (Asian vs. Caucasian), and serum IgG4 concentration detection assay (nephelometry vs. another method). The sensitivity of serum IgG4 was higher for (1) the studies published after 2011, (2) retrospective studies performed before 2011 that used detection methods other than nephelometry, and (3) prospective studies using nephelometry, whereas specificity was significantly different between the subgroups of all variables for the diagnosis of IgG4-RD (Table 3).
Egger's regression test did not reveal any publication bias arising from small-study effects (p = 0.30) (Fig. 4). A sensitivity analysis suggested that omitting single-study estimates had only a minor influence on diagnostic accuracy, attributable to 3 studies with larger sample sizes 2,29,30 ; however, the estimates still fell within the lower and upper CI limits (Fig. 5).
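The two checks reported above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Stata routine used in the study: Egger's test is written in its classic form (standardised effect regressed on precision, with the intercept measuring asymmetry), the leave-one-out step uses simple inverse-variance pooling, and the log diagnostic odds ratios and standard errors are hypothetical values. The function names are likewise illustrative.

```python
import math

def egger_test(effects, std_errors):
    """Egger's regression: ordinary least squares of effect/SE on 1/SE.

    Returns (intercept, standard error of the intercept, t statistic).
    An intercept far from zero suggests funnel-plot asymmetry; the t statistic
    is compared with a t distribution on (number of studies - 2) degrees of freedom.
    """
    x = [1.0 / s for s in std_errors]                  # precision
    y = [e / s for e, s in zip(effects, std_errors)]   # standardised effect
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)                             # residual variance
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mx ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept

def inverse_variance_pool(effects, std_errors):
    """Inverse-variance weighted mean of the study effects."""
    w = [1.0 / s ** 2 for s in std_errors]
    return sum(wi * e for wi, e in zip(w, effects)) / sum(w)

def leave_one_out(effects, std_errors):
    """Pooled estimate recomputed with each study omitted in turn."""
    return [
        inverse_variance_pool(effects[:i] + effects[i + 1:],
                              std_errors[:i] + std_errors[i + 1:])
        for i in range(len(effects))
    ]

# Hypothetical log diagnostic odds ratios and standard errors (illustrative only).
log_dor = [4.1, 3.6, 4.4, 3.9, 4.8, 3.5]
se = [0.45, 0.30, 0.60, 0.25, 0.80, 0.35]

b0, se_b0, t = egger_test(log_dor, se)
print(f"Egger intercept {b0:.3f} (SE {se_b0:.3f}), t = {t:.2f}")
for i, estimate in enumerate(leave_one_out(log_dor, se), start=1):
    print(f"omitting study {i}: pooled log DOR = {estimate:.3f}")
```

If the leave-one-out estimates stay close to the overall pooled value, no single study dominates the result, which is the pattern the sensitivity analysis describes.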

Discussion
Accurately differentiating IgG4-RD from malignancies such as pancreatic cancer is very important to avoid unnecessary surgery. Compared with histopathological examination, the detection of serum IgG4 is one of the most convenient and valuable non-invasive examinations in clinical practice for the diagnosis of IgG4-RD, especially for early-stage disease screening. However, because the positive rate of serum IgG4 varies between studies, the diagnostic accuracy of this method is controversial. A previous meta-analysis published in 2009 demonstrated that serum IgG4 is a good marker for AIP as a single disease, with a pooled sensitivity ranging from 82.3% to 89.3% and a specificity ranging from 94.6% to 95.8%, depending on the control group 31 . Beyond that, no study has systematically summarised the currently available data on the performance of serum IgG4 for the diagnosis of IgG4-RD as an entity.
In the last decade, numerous studies have attempted to evaluate the diagnostic value of serum IgG4 in IgG4-RD. To summarise these studies, we conducted this meta-analysis, which included 23 studies comprising a total of 6048 patients with IgG4-RD diagnosed with different criteria. The key finding of our analysis is that serum IgG4 has high sensitivity (85%) and specificity (93%) in distinguishing IgG4-RD involving the pancreas, bile duct, salivary gland, or lacrimal gland from non-IgG4-RD involving the same organs and/or healthy controls. Serum IgG4 has a higher summarised sensitivity and specificity than the histopathological method based on the infiltration of IgG4-positive plasma cells, for which the pooled sensitivity and specificity were 58.8% and 90.2%, respectively 32 . The diagnostic value of serum IgG4 remained significant, with sensitivity ranging from 78% to 88% and specificity ranging from 90% to 95%, when analysed separately in different subgroups.
Associations between the sensitivity or specificity and the study period, study design, sample size, ethnicity, or detection assay of serum IgG4 were also identified using meta-regression and subgroup analysis. The findings of such an assessment are useful both for providing evidence-based patient information in clinical practice and for further investigation.
Our meta-analysis had several limitations. Firstly, there was inevitable heterogeneity in the meta-analysis of diagnostic accuracy due to the variability in design characteristics and the poor quality of reporting in the primary studies. The quality of the included studies was modest because ten studies were retrospectively designed 19,23-29,33,34 , six studies were prospectively designed 11,16,18,30,35,36 , and seven studies did not report the study design 2,9,10,15,17,37,38 . The QUADAS-2 assessment indicated that other potential contributors to heterogeneity across the studies were the risk of bias in patient selection, because IgG4-RD was diagnosed on the basis of multiple criteria, and the detection method for serum IgG4, which remained a concern despite the clear cut-off values given in several studies 24-28 . Regarding flow and timing, the interval and whether there were any interventions between the index test and the reference standard were not described in all studies. Secondly, because differing cut-off values were used within the same primary studies 2,11,18,35 , widely accepted values of around 135 mg/dL (range 130 mg/dL to 200 mg/dL) were employed in the meta-analysis, and it was not possible to assess a threshold effect for an optimum serum IgG4 concentration. Finally, several high-quality studies were excluded because their results were reported only as means and SD, which may have contributed to the heterogeneity and impaired the stringency of the meta-analysis.
Table columns: Author; Year; Assay for IgG4; Cut-off (mg/dL); Participants, n; True positive; False positive; False negative; True negative; Sensitivity; Specificity.
Measurement of serum IgG4 is a cost-effective, simple and time-efficient assay for detecting IgG4-RD. We conclude that this meta-analysis has achieved its primary objective by demonstrating that serum IgG4 is a sensitive and specific indicator for the detection of IgG4-RD. However, this does not necessarily mean that a positive or negative serum IgG4 result alone can be used to confirm or rule out a diagnosis of IgG4-RD.
A diagnosis of IgG4-RD should be made after a comprehensive analysis that considers clinical symptoms, as well as histopathological, haematological, and imaging findings. The histopathological assessment of biopsy specimens from the involved tissues remains the cornerstone of both the definite diagnosis of IgG4-RD and the exclusion of malignancies. In summary, appropriate consideration and cautious interpretation of these findings in combination with other parameters (especially pathological examination) are highly recommended, and additional studies that evaluate the accuracy of IgG4 in a wider spectrum of IgG4-RD are needed.
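The caution against relying on serum IgG4 alone can be made concrete with likelihood ratios derived from the pooled estimates (sensitivity 85%, specificity 93%). The sketch below is illustrative only: the pre-test probabilities are hypothetical values chosen for the example, not prevalences reported in the included studies, and the function names are assumptions.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a dichotomous test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes' rule on the odds scale: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Pooled estimates reported in this meta-analysis.
lr_pos, lr_neg = likelihood_ratios(0.85, 0.93)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")

# Hypothetical pre-test probabilities, chosen only to illustrate the calculation.
for pre in (0.05, 0.20, 0.50):
    after_positive = post_test_probability(pre, lr_pos)
    after_negative = post_test_probability(pre, lr_neg)
    print(f"pre-test {pre:.0%}: positive result -> {after_positive:.1%}, "
          f"negative result -> {after_negative:.1%}")
```

Under these assumptions, a positive result at a 5% pre-test probability gives a post-test probability of roughly 39%, and a negative result at a 50% pre-test probability still leaves roughly 14%, which is consistent with interpreting serum IgG4 alongside histopathological and imaging findings.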