Diagnostic accuracy of PLA2R autoantibodies and glomerular staining for the differentiation of idiopathic and secondary membranous nephropathy: an updated meta-analysis

The diagnostic performance of M-type phospholipase A2 receptor (PLA2R) autoantibodies and PLA2R glomerular staining in discriminating between idiopathic membranous nephropathy (iMN) and secondary membranous nephropathy (sMN) has not been fully evaluated. We conducted an updated meta-analysis to investigate the accuracy and clinical value of the serological anti-PLA2R test and histological PLA2R staining for differentiating iMN from sMN. A total of 19 studies involving 1160 patients were included in this meta-analysis. The overall sensitivity, specificity, diagnostic odds ratio (DOR) and area under the receiver operating characteristic curve (AUROC) of serum anti-PLA2R were 0.68 (95% CI, 0.61–0.74), 0.97 (95% CI, 0.85–1.00), 73.75 (95% CI, 12.56–432.96) and 0.82 (95% CI, 0.78–0.85), respectively, with substantial heterogeneity (I² = 86.42%). Subgroup analyses revealed that the study design, publication type, study origin and assay method might account for the heterogeneity. Additionally, the overall sensitivity, specificity, DOR and AUROC of glomerular PLA2R staining were 0.78 (95% CI, 0.72–0.83), 0.91 (95% CI, 0.75–0.97), 34.70 (95% CI, 9.93–121.30) and 0.84 (95% CI, 0.81–0.87), respectively, without heterogeneity (I² = 0%). Serological anti-PLA2R testing has diagnostic value, but it must be interpreted in the context of patient clinical characteristics, and histological PLA2R staining is recommended in seronegative patients.

Membranous nephropathy (MN), a common cause of adult nephrotic syndrome worldwide, can be idiopathic or secondary to various clinical conditions, including systemic autoimmune disease, infections, neoplasia and drug intoxications 1. Discriminating between these two groups of patients is of utmost clinical importance, since therapy in sMN must be directed at the underlying cause and some of the treatments for iMN are potentially toxic both to the patient and the kidney 2,3. To date, the diagnosis of iMN is still made by the exclusion of secondary causes using a detailed medical history, physical examination, laboratory studies and often invasive procedures 4. However, in reality, differentiating iMN from sMN is difficult, especially in elderly patients in whom malignancies tend to occur 5,6. Therefore, the need for an accurate biomarker to differentiate iMN from sMN is urgent.
In 2009, M-type phospholipase A2 receptor (PLA2R), a 185 kDa type I transmembrane glycoprotein expressed on glomerular podocytes, was identified as a major target antigen of the autoantibodies involved in iMN 7. Circulating PLA2R autoantibodies were found in a majority (52–82%) of serum samples from patients with iMN, but were absent in patients with sMN and other glomerular or autoimmune diseases, so these autoantibodies were suggested not only to play a direct pathogenic role but also to be a promising marker for the differential diagnosis [8][9][10][11][12][13][14]. Furthermore, PLA2R staining was assessed in renal biopsies and showed a good correlation with the serological test, although there was discordance in rare cases 12,13,15,16.
However, with accumulating evidence, conflicting results have raised concerns about the clinical performance of serological anti-PLA2R and histological PLA2R staining tests for the diagnosis of iMN across various clinical situations. Thus, we performed a systematic review and meta-analysis to comprehensively investigate the diagnostic accuracy of the serological and histological tests to differentiate between iMN and sMN.

Results
Search results and study characteristics. As shown in Figure 1, our search initially yielded 432 publications in total, with 162 duplicates. After screening titles and/or abstracts, another 181 articles were excluded, including reviews, case reports and basic research. 89 studies remained for evaluation via detailed reading. Among them, 27 studies did not focus on diagnosis, and we could not extract data for a 2 × 2 table in 12 studies. The other 31 studies did not match the inclusion criteria. An additional search of the reference lists of included studies and previous relevant reviews did not identify any further articles. Finally, 19 studies were included in the analysis. 13 [7][8][9][10][11]14,[21][22][23][24][25]27,28 of them only investigated the diagnostic value of anti-PLA2R detection, 3 studies 15,16,29 only provided complete data for PLA2R glomerular deposits in the discernment between iMN and sMN, and 3 studies 12,13,26 contained both serological and histological tests. Characteristics of included studies are listed in Table 1. A total of 1160 patients with MN were enrolled, and all the studies were conducted in adult patients.
Quality assessment. The quality of the included studies according to the QUADAS standard is summarized in Figure 2. The inter-rater reliability for 14 items of QUADAS was 0.21 (p < 0.01). Overall, the methodological quality was moderate, with the scores ranging from 8 to 11. These studies were performed in Europe, America and Asia during 2009–2014, representing an international experience. 4 [8][9][10]12 of the studies were designed as prospective research, which possessed sufficient clinical information, whereas the other studies were retrospective. 8 studies 8,14,21,22,[25][26][27][28] did not state the time of serum sampling relative to the biopsy or whether immunosuppressors were administered prior to the serological tests. 10 7-14,24-26 of the 19 studies reported classification of the secondary causes of MN, such as lupus, hepatitis B and malignant tumors. The levels of proteinuria and serum creatinine were reported in 9 studies 7,9-16. 7 studies 8,[11][12][13]21,24,25 used an indirect immunofluorescence (IIF) assay to detect anti-PLA2R in serum, and only 4 7,9,10,23 and 3 14,26,28 studies used western blot (WB) and ELISA, respectively. However, most studies did not provide cutoff values.
Data analysis and calculations. The true positive (TP), false positive (FP), true negative (TN) and false negative (FN) counts and the sensitivity and specificity of each study are listed in Table 2. Studies were stratified by the sample type used to differentiate between iMN and sMN. As shown in Table 3, we found a DOR of 73.75 (95% CI, 12.56–432.96) for anti-PLA2R to differentiate iMN from sMN at a pooled sensitivity and specificity of 0.68 and 0.97, respectively (Figure 3). The I² statistic was 83.70%, indicating significant heterogeneity across these studies. When the analysis of serum anti-PLA2R was restricted to patients with >3.5 g/24 h proteinuria before immunosuppressant treatment at the time of renal biopsy (raw data shown in Table S1), the sensitivity was 0.78, the specificity 0.82, the DOR 16.54 and the AUC 0.82, and the I² statistic decreased to 0.00%. PLA2R staining in biopsy showed a DOR of 34.70 (95% CI, 9.93–121.30), a sensitivity of 0.78 and a specificity of 0.91 without significant heterogeneity (I² = 0.00%). The area under the receiver operating characteristic curve (AUROC) was 0.82 (95% CI, 0.78–0.85; Figure 4) for the serological test versus 0.84 (95% CI, 0.81–0.87) for the histological test.
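For readers less familiar with these measures, the following minimal Python sketch shows how sensitivity, specificity and the DOR are obtained from a single 2 × 2 table. The function names, the counts and the 0.5 continuity correction for empty cells are illustrative assumptions, not data or code from the included studies. The pooled estimates reported here come from a bivariate mixed-effects model, so they will generally differ from values computed on any one table.

```python
def sensitivity(tp, fn):
    """Proportion of iMN patients with a positive test."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of sMN patients with a negative test."""
    return tn / (tn + fp)


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN).

    Assumption for this sketch: if any cell is zero, 0.5 is added to
    every cell so that the ratio stays finite.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    return (tp * tn) / (fp * fn)


# Hypothetical single-study counts (not taken from Table 2)
tp, fp, fn, tn = 68, 3, 32, 97
print(sensitivity(tp, fn), specificity(tn, fp))
print(diagnostic_odds_ratio(tp, fp, fn, tn))
```

With these hypothetical counts the sensitivity and specificity match the pooled point estimates (0.68 and 0.97), yet the single-table DOR is not 73.75, which illustrates that the pooled DOR is a model-based summary rather than a simple ratio of pooled proportions.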
Threshold effect and publication bias. The Spearman correlation coefficient between sensitivity and 1-specificity of the serological test was 0.16 (p = 0.55 > 0.05), suggesting that there is no threshold effect. Deeks' funnel plot (Figure 5; t = −3.41, P = 0.004) showed asymmetry and a non-zero slope coefficient, indicating that publication bias exists in the studies of serological tests. Several studies involved fewer than 100 patients, which may be the main cause of the publication bias.
Subgroup analyses. Studies of serological tests were stratified into several subgroups (summary data shown in Table 4). The results revealed that the study design, publication type, study origin and assay method might account for the heterogeneity.
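The threshold-effect check can be illustrated with a short, dependency-free Python sketch of the Spearman rank correlation between per-study sensitivity and 1-specificity. The helper names and the example values are hypothetical, and the sketch computes only the coefficient, not its p value. A strong positive coefficient would suggest a threshold effect; a weak one, as reported here, would not.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-study values (not taken from Table 2)
sens = [0.70, 0.57, 0.82, 0.64, 0.75]
one_minus_spec = [0.00, 0.05, 0.11, 0.02, 0.00]
print(spearman(sens, one_minus_spec))
```

Because the coefficient depends only on ranks, applying it to logit-transformed sensitivity and 1-specificity would give the same value.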
Serum anti-PLA2R and type of sMN. Three major causes of sMN involved in the included studies were SLE, hepatitis B and presence of a tumor (raw data shown in Table S2). The sensitivity, specificity, DOR and AUROC, respectively, were 0.70, 0.97, 65. 19
Serum anti-PLA2R and proteinuria levels. We divided the patients into a nephrotic syndrome group and a non-nephrotic syndrome group according to proteinuria levels (raw data shown in Table S3). The sensitivity, specificity and DOR, respectively, were 0.77, 0.91 and 34.44 for patients with nephrotic syndrome and 0.32, 0.91 and 4.77 for patients with non-nephrotic syndrome. The AUROC was much higher in the patients with nephrotic syndrome (0.83 vs 0.47).
Serum anti-PLA2R and immunosuppressant therapy. To assess the effects of immunosuppressant therapy (data shown in Table S4), we compared the sensitivity, specificity and DOR in patients who were treated with and without immunosuppressant therapies (0.44, 0.93 and 10.79 versus 0.72, 0.89 and 20.21, respectively).
Serum anti-PLA2R and sampling time from biopsy. The sensitivity, specificity and DOR, respectively, were 0.73, 0.87 and 17.99 for patients sampled on the day of biopsy and 0.52, 0.95 and 19.37 for patients sampled some time after the day of the biopsy (data shown in Table S5).

Discussion
The diagnostic accuracy of serum anti-PLA2R levels and glomerular PLA2R staining in iMN has been investigated previously, but with heavily biased results 26,27. One meta-analysis 26 enrolled healthy individuals and patients without MN (non-kidney diseases or other glomerular diseases) as controls, possibly increasing the overall diagnostic accuracy estimate. We included only patients with sMN as controls, directly assessing whether serum anti-PLA2R and histological PLA2R staining can discriminate between iMN and sMN correctly. The quantitative data analysis by Hu and colleagues 26 summarized sensitivity and specificity into one diagnostic accuracy measure. To retain the two-dimensional character of the data, we combined the studies using bivariate mixed-effects regression models and HSROC. Moreover, they included only three studies for predicting active-stage iMN, two studies for biopsy testing, and two datasets from the same patient group to combine effects; therefore, their conclusion should be considered with caution. They also said little about the accuracy of high-heterogeneity summary estimates, which we investigated using subgroup analyses. Du and colleagues 27 also used a healthy population and patients with other kidney diseases as controls. Some data extracted for their meta-analysis did not match their inclusion criteria: they excluded patients with non-kidney diseases or who received immunosuppressive therapy, but such patients were included from the studies of Beck 7 and Hoxha 8. Additionally, their conclusion was contradictory in stating that the possible confounders, therapeutic intervention and disease progression during the testing interval, may have contributed to the heterogeneity. Furthermore, the study of Dähnrich and colleagues 28 should not be used to calculate sensitivity data, as their "iMN" patients were pre-selected to be anti-PLA2R-positive, leading to overestimation in the given calculation. Thus, the issues mentioned above greatly affected their overall conclusions.
Overall, serological anti-PLA2R testing had diagnostic value in discriminating iMN from sMN. According to the proteinuria subgroup analysis, anti-PLA2R testing had much greater diagnostic accuracy for nephrotic syndrome (AUC = 0.83) than for non-nephrotic syndrome (AUC = 0.47). Anti-PLA2R levels may fluctuate with disease activity 14,29,30; serum anti-PLA2R decreases spontaneously and even faster under immunosuppressive therapy (the positive rate decreases to 15.79%–28.57%) (Table S6). Primary and secondary MN then cannot be distinguished once anti-PLA2R becomes negative. Subgroup analyses also showed that the pooled sensitivity was low, only 0.44 and 0.52 for patients who were treated and for those sampled after biopsy, respectively, indicating that immunosuppressive therapy could have removed the antibodies or that spontaneous remission with consequent antibody disappearance could have occurred during a long interval following renal biopsy. Our subsequent subgroup analyses included only patients with nephrotic-range proteinuria before immunosuppressive therapy at renal biopsy, a more homogeneous group (I² = 0.00%): serum anti-PLA2R testing performed well (AUC = 0.82) (Table 3). However, including these patients may accord more closely with actual clinical situations; such investigations regarding serum anti-PLA2R diagnostic value should be interpreted with caution.
Possible explanations for biopsy testing being more sensitive than serological testing for diagnosing iMN are rapid antibody clearance from the blood and deposition in the glomeruli, or late sampling when proteinuria persisted because of irreversible ultrastructural changes 8,31-33. Therefore, extended screening for PLA2R staining in the glomeruli is recommended in seronegative patients.
The data differed significantly between different forms of sMN. An AUC of 0.97 represented the highest diagnostic accuracy, obtained when sMN was lupus MN. Moreover, the form of sMN significantly influenced the FP rate, although iMN coinciding with the associated disease cannot be excluded. Anti-PLA2R FNs would not have resulted in severe outcomes in all patients, as spontaneous remission occurs in 30–40% of patients, but anti-PLA2R FPs could result in harmful treatments and delayed or missed detection of primary diseases, especially underlying malignancy in older patients with MN 34. Hence, routine age- and sex-appropriate malignancy screening is necessary for older patients with newly diagnosed MN, even in serological anti-PLA2R- and/or histological PLA2R antigen-positive patients. In our study, positive LRs of 24.48 and 8.40 implied that a person with iMN was 24.48 (serological testing) and 8.40 (biopsy testing) times likelier, respectively, to have a positive result than a person with sMN. Given a 25% pretest probability, the post-test probability after a positive test result was 89% (serological testing) and 74% (biopsy testing). Meanwhile, negative LRs of 0.33 and 0.24 reduced the post-test probability after a negative result to 10% (serological testing) and 7% (biopsy testing) (Table 3). However, these LRs were calculated from dichotomized data: the result is either positive or negative. The disadvantage is that useful information is lost. To obtain more precise information, we suggest calculating LRs using multiple cutoffs.
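The conversion from pretest to post-test probability can be reproduced with the standard odds form of Bayes' theorem. The short Python sketch below is illustrative only (the function name is an assumption); it takes the 25% pretest probability and the likelihood ratios quoted above.

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


pretest = 0.25
for label, lr in [
    ("serology, positive result", 24.48),
    ("biopsy, positive result", 8.40),
    ("serology, negative result", 0.33),
    ("biopsy, negative result", 0.24),
]:
    print(f"{label}: {post_test_probability(pretest, lr):.2f}")
```

Rounded to two decimals, these four calls give 0.89, 0.74, 0.10 and 0.07, consistent with the percentages reported in the text.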
Our study has several limitations. Firstly, the methodological quality of included studies was moderate: many had potential verification or disease progression biases. In some studies, implementation was poorly reported, especially test review bias, uninterpretable results and withdrawals. Secondly, as there is currently no common validated measurement platform, inter-study assay methods varied. Furthermore, WB and IIF signal intensity are difficult to standardize because there are no guidelines or criteria for establishing a diagnostic cutoff value. Future diagnostic accuracy studies require ELISA standardization, which defines the normal range and objective threshold for discriminating positive and negative results in clinical studies.
Thirdly, only six histological testing studies were included, and the available information was insufficient for subgroup analyses, therefore it was difficult to draw a definitive conclusion for its ability to discriminate. Hence, biopsy testing requires further study. Additionally, we only included English-language articles, thus language bias may have influenced the results.
In conclusion, this meta-analysis suggests that serological anti-PLA2R testing has diagnostic value for differentiating iMN from sMN, but it must be interpreted in the context of patient clinical characteristics (degree of proteinuria, immunosuppressive treatment, time of detection). Histological PLA2R staining in seronegative patients is recommended. The included studies were nonrandomized, and potential confounders cannot be strictly controlled. Thus, well-designed prospective studies with large patient cohorts are required to reliably evaluate the value of anti-PLA2R and PLA2R antigen for identifying iMN.

Methods
This systematic review and meta-analysis was performed in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines 35 .
Data sources and searches. We searched the MEDLINE (PubMed), EMBASE, Web of Science and Cochrane Library databases until October 2014, restricted to English-language publications, using the following search terms: PLA2R (phospholipase A2 receptor or M-type phospholipase A2 receptor) and MN (membranous nephropathy, membranous glomerulonephritis or membranous glomerulopathy). Manual searches of the reference lists of all selected articles were conducted to identify any missing studies.
Study selection, data extraction, quality assessment and data synthesis. Studies were included if they assessed anti-PLA2R in serum and/or PLA2R in glomeruli for differentiation between iMN and sMN. To be eligible, studies had to have a well-defined reference standard, in which patients were designated as sMN if they had confirmed etiologies of MN and designated as idiopathic after exclusion of known secondary etiologies through history, physical examination, laboratory tests and kidney biopsy (including light microscopy, electron microscopy and immunofluorescence). Moreover, the studies had to provide sufficient information for the 2 × 2 contingency table. Conference abstracts could be included if they contained available data. Study selection, data extraction, quality assessment and data synthesis were independently performed by two reviewers, Dai and Zhang. Any disagreements were resolved through discussion or by a third reviewer, He. Characteristics of included studies and data for the 2 × 2 contingency tables were extracted. The Quality Assessment of Studies of Diagnostic Accuracy included in Systematic Review (QUADAS) assessment tool 36, which contains 14 items, was applied for the quality assessment of the included studies.
Figure legend: The summary point represents the summary sensitivity and specificity, the 95% confidence region represents the 95% confidence intervals of the summary sensitivity and specificity and the 95% prediction region represents the 95% confidence interval of sensitivity and specificity of each individual study included in the analysis. The plot also includes study estimates indicating the sensitivity and specificity estimated using the data from each study separately. The size of the marker is scaled according to the total number in each study.
A bivariate mixed-effects model, implemented in the meta-analytical integration of diagnostic accuracy studies (MIDAS) module in STATA (version 12.0), was used to calculate sensitivity, specificity, diagnostic odds ratio (DOR), area under the receiver operating characteristic curve (AUROC) and positive and negative likelihood ratios (LRs). A hierarchical summary receiver operating curve (HSROC) was constructed with the derived logit estimates of sensitivity and specificity 37. The post-test probability with an assigned pre-test probability of 25% was calculated based on the pooled sensitivity and specificity. The κ statistic was calculated for the inter-rater reliability between two investigators for quality assessment. The I² statistic was used to assess heterogeneity, with I² > 50% indicating the presence of significant heterogeneity. For the Q test, a P value less than 0.05 indicates significant heterogeneity. To explore the potential sources of heterogeneity, we stratified the studies into several subgroups (according to characteristics of studies and patients, including type of publication, country of origin, study design, sample size, proteinuria, treatment with or without immunosuppressor, type of sMN, and interval between biopsy and serum sample) and calculated specificity, sensitivity and relevant parameters. Publication bias was investigated by Deeks' funnel plot and considered to be present if there was a non-zero slope coefficient (P < 0.05) 38.
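As an illustration of the heterogeneity measure, the following Python sketch computes Cochran's Q from study-level effect estimates and their variances, and derives I² using the standard Higgins formulation, I² = max(0, (Q − df)/Q) × 100%. This is a univariate sketch under stated assumptions: the function names and input values are hypothetical, and the MIDAS module may compute its heterogeneity statistics for the bivariate model differently.

```python
def cochran_q(effects, variances):
    """Cochran's Q with inverse-variance weights around the pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, df):
    """I-squared in percent; truncated at 0 when Q does not exceed df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical log-DOR estimates and variances for five studies
log_dor = [2.1, 4.0, 3.2, 1.5, 5.1]
var = [0.40, 0.90, 0.55, 0.35, 1.20]
q = cochran_q(log_dor, var)
print(q, i_squared(q, df=len(log_dor) - 1))
```

Under the rule used here, an I² above 50% from such a calculation would be read as significant heterogeneity.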