Epidemiology Characteristics, Methodological Assessment and Reporting of Statistical Analysis of Network Meta-Analyses in the Field of Cancer

Because of their methodological complexity, network meta-analyses (NMAs) may be more vulnerable to methodological risks than conventional pairwise meta-analyses. Our study aimed to investigate the epidemiological characteristics, conduct of literature search, methodological quality and reporting of the statistical analysis process of NMAs in the field of cancer, based on the PRISMA extension statement and a modified AMSTAR checklist. We identified and included 102 NMAs in the field of cancer. Sixty-one NMAs were conducted using a Bayesian framework. Of these, more than half did not report an assessment of convergence (60.66%). Inconsistency was assessed in 27.87% of NMAs. Assessment of heterogeneity in traditional meta-analyses was more common (42.62%) than in NMAs (6.56%). Most NMAs did not report an assessment of similarity (86.89%) and did not use the GRADE tool to assess the quality of evidence (95.08%). Forty-three NMAs were adjusted indirect comparisons; the methods used were described in 53.49% of them. Only 4.65% of these NMAs described the handling of multi-group trials and 6.98% described the methods of similarity assessment. The median total AMSTAR score was 8.00 (IQR: 6.00–8.25). Methodological quality and reporting of statistical analysis did not substantially differ by selected general characteristics. Overall, the quality of NMAs in the field of cancer was generally acceptable.

General characteristics of included NMAs. The first NMA in the field of cancer was published in 2006 20.
Reporting of literature search. Thirteen NMAs did not report any information on the literature search, and one NMA was conducted based on previous meta-analyses without additional searching. 98.90% (88/89) of NMAs searched only English databases. The median number of Chinese databases searched was 5 (IQR: 3-6), and it was 3 (IQR: 3-4) for English databases. 22.50% (20/89) of NMAs reported the search strategy, and the median number of search strategies reported was 2 (IQR: 1-3). 27.00% (24/89) of NMAs searched previously published meta-analyses as a supplemental literature search. Other supplemental search methods included reference list checking, clinical trial registration platforms, conference abstracts or web sites, and the Google search engine (Table 2). PubMed/MEDLINE was the most common single database searched, and it was often combined with a search of the Cochrane Library. The details of the databases searched are shown in Table 3.
Reporting of statistical analysis processes. Sixty-one (59.80%) NMAs were conducted using a Bayesian framework (2 of these were adjusted indirect comparisons). Forty-three reviews were adjusted indirect comparisons (2 of which used a Bayesian framework).
Scientific Reports | 6:37208 | DOI: 10.1038/srep37208
For adjusted indirect comparisons, the majority of NMAs (42/43, 97.67%) also conducted traditional meta-analyses, and 53.49% (23/43) of adjusted indirect comparisons were performed using the methods described by Bucher 21. 58.14% (25/43) assessed the heterogeneity of direct comparisons, but none assessed the heterogeneity of indirect comparisons. Only two (4.65%) NMAs described the handling of multi-group trials and three (6.98%) described the methods of similarity assessment. Most NMAs did not report whether sensitivity analyses were performed (38/43, 88.37%) or whether subgroup analyses or meta-regression were performed (34/43, 79.07%). These results did not differ by journal quality or year of publication. The details of statistical reporting for adjusted indirect comparisons are shown in Table 5.
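The adjusted indirect comparison attributed to Bucher above can be sketched in a few lines. This is a minimal illustration, not code from any included NMA: it assumes the effect measure is a log odds ratio, that treatments A and B were each compared with a common comparator C, and the function name `bucher_indirect` and the input values are made up for the example.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def bucher_indirect(log_or_ac, se_ac, log_or_bc, se_bc):
    """Indirect A-vs-B log odds ratio through the common comparator C.

    The point estimate is the difference of the two direct log odds ratios;
    the variances of the two direct estimates add.
    """
    log_or_ab = log_or_ac - log_or_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    ci = (math.exp(log_or_ab - Z_95 * se_ab),
          math.exp(log_or_ab + Z_95 * se_ab))
    return log_or_ab, se_ab, ci


# Made-up inputs for illustration only (hypothetical, not from the source).
log_or_ab, se_ab, ci_ab = bucher_indirect(math.log(0.80), 0.10,
                                          math.log(0.90), 0.12)
print(f"indirect OR = {math.exp(log_or_ab):.3f}, "
      f"95% CI {ci_ab[0]:.3f} to {ci_ab[1]:.3f}")
```

Because the variances add, the indirect estimate is always less precise than either direct estimate it is built from, which is one reason the similarity of the contributing trials matters.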
Methodological quality assessment. The results of the methodological quality assessment based on the modified AMSTAR checklist are presented in Fig. 5. The median total score was 8.00 (IQR: 6.00-8.25). Approximately half of the included NMAs did not perform a comprehensive literature search (Item 3, 42.31%). More than half of NMAs (69.61%) did not consider the scientific quality of the included studies in formulating conclusions, and 84.31% of NMAs did not assess the likelihood of publication bias. Table 6 presents the results of stratified analyses of the methodological quality assessment. NMAs published in journals with higher impact factors more often performed a comprehensive literature search (78.13% versus 45.45%, p = 0.002), reported appropriate methods to combine the findings of studies (81.25% versus 58.18%, p = 0.019), and assessed the likelihood of publication bias (25.00% versus 5.45%, p = 0.017). NMAs published after December 31st 2013 more often assessed the scientific quality of the included studies (86.36% versus 55.17%, p = 0.001) and considered the scientific quality in formulating conclusions (43.18% versus 20.69%, p = 0.015). Most of these items did not differ between NMAs with and without funding support. NMAs published in China more often reported two independent reviewers for study selection and data extraction (89.66% versus 65.75%, p = 0.015), assessed the scientific quality of the included studies (86.21% versus 61.64%, p = 0.016) and considered the scientific quality in formulating conclusions (68.97% versus 15.07%, p < 0.001). Moreover, Bayesian NMAs more often reported two independent reviewers for study selection and data extraction (81.97% versus 60.47%, p = 0.015), performed a comprehensive literature search (70.49% versus 39.53%, p = 0.002), considered the status of publication (i.e. grey literature) as an inclusion criterion (86.89% versus 67.44%, p = 0.017), and assessed the scientific quality of the included studies (78.69% versus 55.81%, p = 0.013). Table 7 presents the association between total AMSTAR score and selected general characteristics. Although the AMSTAR score of NMAs published in China was higher than that of NMAs published elsewhere (p = 0.023), AMSTAR scores did not differ significantly across individual countries (p = 0.465). AMSTAR scores also did not differ significantly by the other selected general characteristics.

Discussion
We identified 102 NMAs involving 24 kinds of cancer. Methodological quality and statistical reporting were assessed based on the PRISMA extension statement and a modified AMSTAR checklist. In addition, we assessed the conduct of the literature search in the included NMAs. Some key methodological components, including the literature search and statistical analysis, were missing or inadequate in most of the included NMAs: for example, only 22.50% of NMAs reported a search strategy and only 6.56% assessed heterogeneity in the NMA. Methodological quality and reporting of statistical analysis did not substantially differ by selected general characteristics of NMAs.
NMAs can provide useful evidence on the relative effectiveness of different interventions for decision-making when there are no or insufficient direct comparison trials 11. The methodological quality of NMAs is a crucial point for health care decision-makers and researchers. We assessed the methodological quality of NMAs in the field of cancer based on a modified AMSTAR checklist. Some methodological flaws were identified, especially regarding the literature search (Item 3), assessment of scientific quality (Item 7), appropriate use of scientific quality in formulating conclusions (Item 8), the methods used to combine the findings of studies (Item 9), and assessment of publication bias (Item 10).

NMAs aim to rank the benefits (or harms) of interventions, based on all available RCTs. Thus, the identification of all relevant data is critical 7. Most of the included NMAs (80.39%, 82/102) did not report a database search strategy. For those that reported a search strategy, 26.96% only searched previously published meta-analyses. It is important to search, track, and include previous systematic reviews and meta-analyses when conducting NMAs 22. PubMed/MEDLINE was the most commonly used database, and the most common combination of databases was PubMed/MEDLINE and EMBASE. The majority of NMAs did not search Chinese databases. Cohen et al.'s study showed that searching Chinese databases might lead to the identification of a large amount of additional clinical evidence, and suggested that Chinese biomedical databases should be searched when performing systematic reviews 23.
The assessment of the scientific quality of individual studies could affect the findings of NMAs 24. However, 31.37% of the included NMAs did not report methods for assessing the risk of bias of individual studies in their methods sections, and 69.61% did not consider the scientific quality of the included studies in formulating conclusions. Although reporting bias could have a substantial effect on the conclusions of an NMA 12, most of the included NMAs (84.31%) did not report a method to assess publication bias.
The complex nature of NMA is mainly reflected in the diversity of interventions and the complex statistical analysis process. Homogeneity and consistency assumptions underlie NMA 25. Although assessment of heterogeneity in traditional meta-analyses was common, only 4 NMAs (3.92%) assessed the heterogeneity in the entire network using the heterogeneity variance parameter (τ²). Eleven (10.78%) explicitly reported the methods used to assess similarity. Among those using a Bayesian framework, 17 (27.87%) assessed the inconsistency between direct and indirect comparisons. The GRADE tool was proposed for assessing the quality of evidence from NMAs in 2014 26. However, it was still rarely used to assess the quality of evidence in NMAs related to cancer.
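To make the heterogeneity variance parameter concrete, the sketch below computes Cochran's Q, the DerSimonian-Laird estimate of the between-study variance, and I² for a single pairwise comparison. This is an assumption-laden illustration rather than the approach of any included NMA: in an NMA the heterogeneity variance is usually estimated within the network model (often as one common value across comparisons), and the function name and inputs here are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Return (Q, tau2, I2 in percent) for one pairwise meta-analysis.

    effects   -- study effect estimates on an additive scale (e.g. log OR)
    variances -- their within-study variances
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)              # truncated at zero
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return q, tau2, i2


# Made-up inputs for illustration only (hypothetical, not from the source).
q, tau2, i2 = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
print(f"Q = {q:.2f}, tau^2 = {tau2:.2f}, I^2 = {i2:.0f}%")
```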
To the best of our knowledge, this is the first review to comprehensively assess methodological quality using a modified AMSTAR checklist while simultaneously assessing the quality of reporting of literature search and statistical analysis methods. Two recent reviews that also focused on the methodological problems of published network meta-analyses 12,15 covered a wide range of medical areas, and some details of the reporting of literature search and statistical analysis were missing. Bafeta A et al. 12 included 121 NMAs to examine the methodological reporting of NMAs; their results showed that 73% did not report the electronic search strategy for each database, compared with 77.5% in our study. Most NMAs did not assess the quality of evidence using the GRADE tool (3% vs. 4.92%). Their results on methodological reporting were similar to those of our study. Chambers J et al. 15 also showed that there were similar methodological quality problems in their included NMAs. However, the AMSTAR checklist had not been used to systematically assess the methodological quality of NMAs. Furthermore, we explored potential factors influencing methodological quality and statistical reporting according to general characteristics of the included NMAs. There were no substantial differences by selected general characteristics of NMAs.
Our study also has some limitations. There is no standard tool to assess the methodological quality of NMAs. We slightly modified three of the 11 AMSTAR items (Item 1, Item 5, and Item 9) to assess the methodological quality of NMAs. However, some problems and uncertainties remain, such as the difficulty of defining the types of interventions and comparisons for inclusion in NMAs, how to draw the geometry of the network, how to handle multi-group trials, how to decide whether the assessment of similarity and consistency was appropriate, and whether the statistical analysis methods were appropriate for NMAs. The complex nature of the statistical analysis of NMAs points to the need for a guideline on the reporting of statistical analysis of NMAs. As with other methodological studies, assessing methodological quality and reporting quality from published reports alone could be misleading. The study authors may have used adequate methods but omitted important details from published reports 12, or a report may satisfy the relevant reporting guidelines even though the underlying conduct was not rigorous. For example, while we recorded whether study selection and data extraction were reported as performed by at least two independent reviewers, we could not know whether these processes were really performed by two independent reviewers. Finally, we did not identify any eligible NMAs related to diagnostic test accuracy or animal studies. We also did not include reviews based on individual patient data (IPD) because the methods and statistical analysis processes differ between IPD and aggregated data.
Overall, the methodological quality of NMAs in the field of cancer was generally acceptable. However, some methodological flaws were identified in published NMAs, especially regarding the literature search, assessment of scientific quality, appropriate use of scientific quality in formulating conclusions, the methods used to combine findings of studies, and assessment of publication bias. Methodological quality and statistical reporting did not substantially differ by general characteristics.
Eligibility criteria. We included any NMAs in the field of cancer published in English or Chinese, regardless of interventions. NMAs were defined as meta-analyses that used network meta-analytic methods to analyze, simultaneously, three or more different interventions 7; adjusted indirect comparisons were also included. If the same NMA had duplicate publications, the latest was included. We excluded methodological articles, conference abstracts, letters, editorials, correspondences, cost-effectiveness reviews, and reviews based on individual patient data.

Study selection.
Literature search records were imported into EndNote X6 literature management software. Two independent reviewers (LG, LL) examined the titles and abstracts of retrieved studies to identify potentially relevant studies according to the eligibility criteria. Then, full-text versions of all potentially relevant studies were obtained. Excluded trials and the reasons for their exclusion were listed, and conflicts were resolved by a third reviewer (J-HT, or K-HY).
General characteristics. The following general characteristics were collected by one reviewer (LG): first author, year of publication, country of corresponding author, journal name, publishing period (time from received to accepted), funding source (industry-supported, non-industry-supported, unfunded or not reported), number of authors, language of publication (English or Chinese), number and type of included original studies, sample size of included original studies, number of study arms, type of outcome (dichotomous, continuous, or survival time), categories of disease, and number of interventions included in the network. We categorised journal types into Science Citation Index (SCI) or non-SCI; we also identified journals with high impact factors (IF ≥ 5.000, as reported in Journal Citation Reports 2014) 27 or low impact factors (IF < 5.000). We also categorised NMAs into older or recent studies by splitting the included NMAs at the median publication date.
Reporting of literature search. One reviewer (XQ) extracted the following information regarding reporting of the literature search: number of databases searched (Chinese, English, or both), names of databases searched, whether the search strategy was provided, whether previous systematic reviews/meta-analyses were searched, and names and number of other sources searched (e.g., reference list checking, clinical trial registration platforms, conference abstracts or web sites, the Google search engine).

Reporting of statistical analysis processes.
We assessed the reporting and quality of statistical analysis processes in the methods sections of each NMA report according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) extension statement for NMAs 28.
Table 6. Subgroup analyses of methodological quality assessment (n/%). * 6 studies were published in journals with no associated impact factor. # Based on the median division of the number of included NMAs, December 31st 2013 is the cut-off point. & 2 adjusted indirect comparisons were also conducted using a Bayesian framework.
The following items, based on the statistical analysis section of the PRISMA extension statement, were extracted by two independent reviewers (LG, JZ), and conflicts were resolved by a third reviewer (J-HT, or K-HY):
• Was traditional meta-analysis conducted?
• Were summary measures reported? State the principal summary measures (e.g., risk ratio, odds ratio, mean difference, hazard ratio). Also describe the use of additional summary measures assessed, such as treatment rankings (e.g., treatment rankings, best, or surface under the cumulative ranking curve (SUCRA) values), shape and scale parameters for survival data 29.
• Was a sensitivity analysis performed? (e.g., excluding studies, alternative prior distributions for Bayesian analyses, alternative formulations of the treatment network).
• Was subgroup analysis or meta-regression performed?
• Was the Grading of Recommendations Assessment, Development and Evaluation (GRADE) tool used to assess quality of evidence 26?
Methodological quality assessment. There is no consensus on how to assess the methodological quality of NMAs. We assessed the methodological quality of included NMAs using a modified AMSTAR checklist. This checklist included 11 items, with possible responses of "Yes" (item/question fully addressed), "No" (item/question not addressed), "Cannot answer" (not enough information to answer the question), and "Not applicable". Two reviewers (XQ, G-QP) independently extracted data, and conflicts were resolved by a third reviewer (LG, or J-HT). The total AMSTAR score was obtained by assigning one point for each "Yes" and no points for any other response ("No", "Cannot answer" and "Not applicable"), giving a range from 0 to 11. In our study, three of the 11 items were slightly modified as follows (Appendix 3):
• "Was an 'a priori' design provided?" was amended to "Was the research question (i.e., research purpose, inclusion and exclusion criteria) clarified?" The reason for this modification was that only a small minority of published non-Cochrane reviews reported a protocol 38. Where a protocol providing this information was available, the answer to this question was "Yes". Where no protocol was available but detailed information about the research purpose and the inclusion and exclusion criteria (patients, interventions, comparators, outcome, and study design) was supplied, we also answered this question "Yes".
• "Was a list of studies (included and excluded) provided?" was amended to "Were a list of included studies and flow diagram provided?" The reason for this modification was that most published systematic reviews did not provide a list of excluded studies. Where a list of included studies and a flow diagram of literature selection were provided (as references, electronic link, or supplement), we answered this question "Yes".
• Were the methods used to combine the findings of studies appropriate?
For pairwise meta-analysis, we scored "Yes" if the review mentioned or described heterogeneity and reported how heterogeneity was handled. For NMA, in addition to heterogeneity, the following factors (among others) should be taken into consideration: summary measures, model used, model fit, prior distributions (Bayesian analysis), convergence (Bayesian analysis), and inconsistency.
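The AMSTAR scoring rule described above (one point per "Yes" across the 11 items, zero for every other response) can be sketched as follows. This is an illustrative helper only; the function name `amstar_total` and the example responses are hypothetical and do not come from the study's data.

```python
VALID_RESPONSES = {"Yes", "No", "Cannot answer", "Not applicable"}
N_ITEMS = 11  # number of items on the (modified) AMSTAR checklist


def amstar_total(responses):
    """Total AMSTAR score: one point per "Yes", zero otherwise (range 0-11)."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    invalid = [r for r in responses if r not in VALID_RESPONSES]
    if invalid:
        raise ValueError(f"unrecognised responses: {invalid}")
    return sum(1 for r in responses if r == "Yes")


# Hypothetical review: 8 items fully addressed, 3 not scored.
example = ["Yes"] * 8 + ["No", "Cannot answer", "Not applicable"]
print(amstar_total(example))
```

Treating "Cannot answer" and "Not applicable" the same as "No" keeps the score simple, but it means a low total can reflect missing reporting as well as poor conduct.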
Statistical analysis. Quantitative data were summarised by medians and interquartile ranges (IQR), and categorical data were summarised by numbers and percentages. The association between methodological quality and the following characteristics was explored using the Mann-Whitney U test and the Kruskal-Wallis test: journal impact factor, year of publication, funding source, country of corresponding author, type of NMA, and categories of disease. Moreover, subgroup analyses for statistical reporting were performed according to journal impact factor (high vs. low impact factors) and year of publication (older vs. recent studies). Proportions were compared by Chi-square test using Stata version 12.0 39. All tests were two-sided, and P ≤ 0.05 was considered statistically significant.
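As an illustration of the Chi-square comparison of proportions, the sketch below applies an uncorrected Pearson test to a 2×2 table using only the Python standard library. It is not the authors' Stata code. The cell counts are an assumption: they are back-calculated from the reported percentages for two independent reviewers (89.66% versus 65.75%) as 26/29 and 48/73, which are not stated in the text.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts (back-calculated, not given in the source):
# row 1 = 26 of 29 NMAs reporting the item, row 2 = 48 of 73.
chi2, p = chi_square_2x2(26, 3, 48, 25)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With small expected cell counts, an exact test would usually be preferred over this approximation.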