The evolution of evidence synthesis

Increasing interest in promoting evidence-based clinical practice has led to methodological advancements in evidence synthesis [1, 2]. Narrative reviews have been superseded by systematic reviews, which may include meta-analysis: the statistical pooling of treatment effect estimates across similar trials to improve precision [3, 4, 5]. Systematic reviews minimize the risk of selection bias by considering all evidence relevant to a clinical question; however, an important limitation of conventional meta-analyses is that they can only inform comparisons between treatments that have been directly compared in clinical trials. Moreover, many trials compare active interventions against placebo, usual care, or standard care, whereas patients and clinicians are typically concerned with the relative effectiveness of competing interventions. Network meta-analysis (NMA) has emerged to address these limitations by allowing calculation of the comparative effects of more than two competing interventions, even when they have not been directly compared in clinical trials [6, 7].

What is network meta-analysis?

NMA requires the same steps as a conventional meta-analysis, which include a systematic search of the literature, assessment of risk of bias among eligible trials, statistical pooling of reported pairwise comparisons for all outcomes of interest, and assessment of the overall certainty of evidence on an outcome-by-outcome basis. These steps provide the “direct” evidence for treatments that have been compared against each other, which is graphically represented by a network map. An NMA then identifies all interventions that are connected by virtue of a common comparator. For example, two different active treatments may each have been compared against placebo in separate trials. An NMA allows a theoretical trial to be constructed that compares these active treatments against each other, based on their effects against the common comparator (placebo); this provides “indirect” evidence. Indirect comparisons provide an opportunity to fill knowledge gaps within the available evidence, giving clinicians a more comprehensive understanding of treatment options. The network estimate is the pooled result of the direct and indirect evidence for a given comparison, or only the indirect evidence if no direct evidence is available [6, 8, 9]. Once all treatments have been compared within a network, different methods exist for ranking treatments to convey their relative effectiveness. Limitations of, and advances in, ranking methodology are discussed in greater detail in the example below.
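The arithmetic behind an indirect comparison can be sketched with the Bucher method: the effect of treatment A versus treatment B is the difference of their effects against the common comparator C, and the variances of the two direct estimates add. A minimal Python sketch, using hypothetical IOP-reduction figures purely for illustration (the function name and numbers are not from the source):

```python
import math

def bucher_indirect(d_ac, se_ac, d_bc, se_bc):
    """Indirect estimate of A vs B via a common comparator C (Bucher method).
    d_* are treatment effects (e.g., mean differences or log odds ratios)
    and se_* their standard errors."""
    d_ab = d_ac - d_bc                       # indirect effect of A vs B
    se_ab = math.sqrt(se_ac**2 + se_bc**2)   # variances of independent estimates add
    ci95 = (d_ab - 1.96 * se_ab, d_ab + 1.96 * se_ab)
    return d_ab, se_ab, ci95

# Hypothetical: drug A lowers IOP by 5.0 mmHg vs placebo (SE 0.6),
# drug B lowers IOP by 3.0 mmHg vs placebo (SE 0.8)
d, se, (lo, hi) = bucher_indirect(-5.0, 0.6, -3.0, 0.8)
# indirect A-vs-B effect: -2.0 mmHg, SE 1.0, 95% CI (-3.96, -0.04)
```

Note that the indirect estimate is necessarily less precise than either direct estimate, since both sources of uncertainty contribute to its standard error.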

Network meta-analysis in practice

An example network map of the effects of first-line medications on intra-ocular pressure (IOP) in primary open angle glaucoma (POAG) is shown in Fig. 1; it represents all pharmacologic treatments that have been directly evaluated in 114 clinical trials for this condition [10]. A traditional meta-analysis would be limited to comparing two of these treatments at a time and could not inform the effectiveness of treatments that have not been directly compared; this NMA, however, provides the relative effectiveness of all 15 treatments in a single investigation, even when no RCT is available to make a direct comparison between two treatments. The network map uses circles, or nodes, for each included treatment, sized in proportion to the number of patients treated with that medication within the included RCTs. The lines connecting treatments are weighted by the number of RCTs comparing them (i.e., thicker lines convey more direct trials) [10]. In this particular study, the authors color-coded their treatment nodes by drug class to improve interpretation. The network is specific to one outcome, in this case IOP, and assumes that the baseline characteristics of patients enrolled across trials are similar.

Fig. 1: Network diagram from Li et al. (2016) [10] comparing medications for POAG.

Size of nodes represents the number of patients, line thickness represents number of trials.

As Fig. 1 demonstrates, there are many RCTs assessing pharmacotherapy for POAG. Some treatments, such as timolol or latanoprost, have large bodies of evidence, while many others have far fewer, and smaller, trials assessing their efficacy [10]. This network enables the comparison of 14 active medications, as well as placebo, for POAG.

While the ability to summarize large bodies of evidence is shared with traditional meta-analyses, NMAs additionally provide comparative effectiveness data between competing treatments. It is important to note that the evidence provided by an NMA is subject to the limitations of the individual RCTs included within the network [11]. In addition, ranking interventions with methods such as the Surface Under the Cumulative Ranking curve (SUCRA) approach is problematic, despite SUCRA currently being the most common form of treatment ranking in NMAs. This approach ranks all treatments within a network from “best” to “worst” for each analyzed outcome, but considers only the effect estimate, not its precision or the certainty of evidence [12]. Thus, interventions supported by small, low-quality trials that report large effects can be ranked highly. In contrast, minimally or partially contextualized approaches consider the magnitude of effect relative to thresholds important to patients, as well as the certainty of evidence [13, 14].

How can you have certainty in the findings of an NMA?

As with all study designs, there are considerations when evaluating the credibility of the findings of an NMA. These include the same issues that should be considered when evaluating a traditional pairwise meta-analysis, such as the rigor of the literature search, risk of bias among included trials, consistency of effect estimates contributing to pooled effects (heterogeneity), precision of the pooled effect estimate, publication bias, and directness of the included evidence in relation to the primary research question [8, 9, 15, 16]. However, there are two additional considerations that are specific to NMAs: incoherence and transitivity [8, 9, 15, 17].

Incoherence exists when the direct and indirect estimates for a comparison are not consistent with one another [6]. A meta-epidemiological study of 112 published NMAs found inconsistent direct and indirect treatment effects in 14% of the comparisons made [18]. This means that while in most cases it is appropriate to combine indirect and direct evidence, this is not always the case, and review authors should formally explore this issue. In the presence of incoherence, the higher certainty evidence should be presented rather than the network estimate. If the direct and indirect effects are both supported by the same certainty of evidence, then the network estimate can be used but should be downgraded one level for incoherence. The GRADE approach, which incorporates the aforementioned criteria, is increasingly used for rating the certainty of evidence for network estimates [11, 15,16,17]. A GRADE rating can assign high, moderate, low, or very low certainty to the evidence [11, 15,16,17]. Clinicians should take the certainty of the evidence into consideration when determining the impact findings would have on their clinical practice, as lower certainty evidence provides less confidence in the results.
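The formal check for incoherence described above can be sketched as an approximate two-sided z-test on the difference between the direct and indirect estimates of the same comparison (a simplified, node-splitting-style test; the function name and numbers below are illustrative assumptions, not from the source):

```python
import math

def incoherence_z_test(d_direct, se_direct, d_indirect, se_indirect):
    """Approximate two-sided z-test for disagreement between the direct
    and indirect estimates of the same treatment comparison."""
    diff = d_direct - d_indirect
    # the two estimates come from separate bodies of evidence,
    # so their variances add
    se_diff = math.sqrt(se_direct**2 + se_indirect**2)
    z = diff / se_diff
    # two-sided p-value from the standard normal distribution
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Hypothetical IOP effects (mmHg) for one comparison:
z, p = incoherence_z_test(-2.5, 0.5, -1.0, 0.7)
# a small p-value would signal incoherence between the two sources
```

A significant result would prompt the authors to report the direct and indirect estimates separately, or to downgrade the network estimate, as described above.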

Transitivity refers to the similarity between study characteristics that allows indirect effect comparisons to be made with the assurance that there are limited factors that could modify treatment effects, aside from the intervention under investigation [6, 15]. Essentially, transitivity refers to the inclusion of studies that fundamentally address the same research questions within the same population [6]. Intransitivity can result in biased indirect estimates, which would then affect the overall findings of the network estimates [15, 17]. As previously discussed, incoherence exists when discrepancies between direct and indirect estimates are present; intransitivity is thus a common cause of incoherence [17].

Clinicians cannot be expected to evaluate transitivity and incoherence within an NMA themselves, and authors should clearly report on these two important aspects. Indeed, the absence of such reporting should lead readers to question the findings. Table 1 provides an example and overview of the core items readers should identify when critically appraising published NMAs, as applied to the Li et al. (2016) POAG study [10, 19]. These criteria are based on the Users’ Guides to the Medical Literature: Essentials of Evidence-Based Clinical Practice [19].

Table 1 Example appraisal of the Li et al. (2016) POAG NMA.


Rigorously conducted and reported NMAs may provide helpful information for advancing evidence-based ophthalmology, specifically in the common scenario in which multiple treatment options exist. However, clinicians should appraise the quality of NMAs before accepting their results, and even rigorously conducted NMAs cannot provide high certainty evidence if the primary trials eligible for review are flawed.