What is a noninferiority trial?

Traditionally, randomised trials address whether an intervention is superior to an alternative intervention, placebo, or standard care. Noninferiority trials are an alternative type of randomised trial. They compare an intervention against standard care to demonstrate that the intervention is noninferior to (no worse than) standard care, often because standard care is associated with greater burdens or costs than the intervention being studied.

The appraisal and interpretation of noninferiority trials require many of the same considerations as superiority trials. For noninferiority trials to produce valid and reliable results, they must, like superiority trials, maintain allocation concealment, ensure patients in different arms are treated similarly and receive identical care apart from the intervention being tested, follow up all patients for outcome assessment and analysis, and avoid selective reporting. In this editorial, we highlight some unique considerations involved in appraising and interpreting noninferiority trials using examples within ophthalmology.

Establishing a noninferiority margin

The objective of a noninferiority trial is to demonstrate that the intervention being evaluated achieves the benefit of standard care within a noninferiority margin. This noninferiority margin is critical to the design and interpretation of noninferiority trials, but there is no consensus on the optimal method to establish the noninferiority margin or its clinical relevance. General guidance suggests that it should be specified a priori and reflect both statistical and clinical considerations [1, 2]. If investigators choose a noninferiority margin that is too broad, they may claim noninferiority even when patients and clinicians would prefer standard care. Conversely, if a noninferiority margin is too narrow, investigators may be unable to claim noninferiority for a treatment that is clinically acceptable. We recommend that clinicians use their expertise to assess investigators’ choice of the noninferiority margin and consider whether patients would regard the magnitude of loss of treatment efficacy as important.

The TENAYA and LUCERNE trials tested the noninferiority of faricimab to aflibercept and specified a 4-letter reduction as the noninferiority margin for best corrected visual acuity, with both statistical and clinical justification [3]. Statistically, the investigators report that a 4-letter margin preserves 70% of the least estimated benefit of ranibizumab, based on a previous placebo-controlled trial [4]. Clinically, the trials considered a loss of 5 letters of vision (one line on the Early Treatment Diabetic Retinopathy Study chart) or more to be the minimal important difference.
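The statistical justification above follows the general logic of a fixed-margin calculation: the margin is the share of the comparator's conservatively estimated benefit that investigators are willing to give up. The sketch below illustrates that arithmetic only. The function names are hypothetical, and the implied comparator benefit is a back-calculation that assumes the margin is exactly 4 letters and the preserved fraction exactly 70%; it is not a figure reported by the trials.

```python
def fixed_margin(control_effect_lower_bound: float, fraction_preserved: float) -> float:
    """Margin that preserves a given fraction of the comparator's benefit.

    control_effect_lower_bound: conservative (least) estimate of the benefit
        of the active comparator over placebo, in outcome units (e.g. letters).
    fraction_preserved: share of that benefit the new treatment must retain.
    """
    if control_effect_lower_bound <= 0:
        raise ValueError("comparator benefit must be positive")
    if not 0.0 <= fraction_preserved < 1.0:
        raise ValueError("fraction_preserved must be in [0, 1)")
    return (1.0 - fraction_preserved) * control_effect_lower_bound


def implied_control_effect(margin: float, fraction_preserved: float) -> float:
    """Back-calculate the comparator benefit implied by a margin (illustrative)."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    if not 0.0 <= fraction_preserved < 1.0:
        raise ValueError("fraction_preserved must be in [0, 1)")
    return margin / (1.0 - fraction_preserved)


# Assumed inputs: a 4-letter margin that preserves 70% of the benefit.
implied = implied_control_effect(margin=4.0, fraction_preserved=0.70)
print(round(implied, 1))  # 13.3 letters (back-calculated, not a reported value)
```

A reader can use the same arithmetic in reverse: given the smallest plausible benefit of standard care, a larger preserved fraction yields a narrower, more demanding margin.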

Conclusions of noninferiority are inappropriate in superiority trials. Investigators may incorrectly interpret results that are not statistically significant as evidence of noninferiority. For example, a trial comparing bromfenac with dexamethasone for anterior chamber inflammation after cataract surgery concluded that bromfenac was as effective as dexamethasone on laser flare photometry, based on the absence of a statistically significant difference between interventions [5]. Although there is no statistically significant difference between the two groups, the 95% confidence interval may include values suggesting that bromfenac is not a clinically acceptable alternative.
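To make this distinction concrete, the following minimal sketch computes a two-sided p-value and a 95% confidence interval for a between-group difference using a normal approximation. All numbers are hypothetical and are not taken from the bromfenac trial or any other study; the difference is defined as new treatment minus standard care, with higher values better, and the 4-unit margin is an assumption chosen for illustration.

```python
import math


def p_value_and_ci(diff: float, se: float, z_crit: float = 1.96):
    """Two-sided p-value and 95% CI for a difference (normal approximation)."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = diff / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return p, (diff - z_crit * se, diff + z_crit * se)


# Hypothetical result: new treatment is 2 units worse, standard error 2 units.
MARGIN = 4.0  # assumed noninferiority margin (units of the outcome)
p, (lower, upper) = p_value_and_ci(diff=-2.0, se=2.0)

not_significant = p > 0.05            # superiority test finds "no difference"
noninferiority_shown = lower > -MARGIN  # the CI must exclude the margin

print(f"p = {p:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("not significant:", not_significant)
print("noninferiority shown:", noninferiority_shown)
```

In this hypothetical case the difference is not statistically significant, yet the lower confidence limit falls beyond the margin, so the data remain compatible with a clinically unacceptable loss of efficacy.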

What are specific considerations in appraising noninferiority trials?

When appraising noninferiority trials, clinicians should look for deficiencies in trial design that may diminish the difference in outcomes between the intervention and standard care.

For superiority trials, analysing results based on the intention-to-treat principle is considered a robust approach to demonstrate superiority, as it maintains prognostic balance between arms by including patients who may have stopped treatment (e.g., due to side effects) or may have been lost to follow-up. In noninferiority trials, however, intention-to-treat analyses may produce misleading results by reducing the difference in treatment effect between groups, thus making it easier to demonstrate noninferiority. For example, high rates of non-adherence and loss to follow-up may render the effects of the intervention and standard care more similar and closer to the null. No consensus exists on the optimal approach to analysing noninferiority trials. However, demonstrating noninferiority in both intention-to-treat and per-protocol analyses strengthens the overall conclusion.
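The dilution described above can be illustrated with simple arithmetic. The sketch below assumes a deliberately simplified model that is not drawn from any trial: the same fraction of patients in each arm stops treatment, and patients who stop have the same expected outcome regardless of arm. It compares point estimates only and ignores sampling variability; all values and the function name are hypothetical.

```python
def itt_difference(mean_new: float, mean_std: float,
                   mean_off_treatment: float, nonadherence: float) -> float:
    """Expected intention-to-treat difference (new minus standard).

    Each arm's mean is a mixture of adherent patients, who get the arm's
    true effect, and non-adherent patients, who get the off-treatment outcome.
    """
    if not 0.0 <= nonadherence <= 1.0:
        raise ValueError("nonadherence must be in [0, 1]")
    arm_new = (1 - nonadherence) * mean_new + nonadherence * mean_off_treatment
    arm_std = (1 - nonadherence) * mean_std + nonadherence * mean_off_treatment
    return arm_new - arm_std


# Hypothetical: standard care gains 10 units, new treatment gains only 4,
# so the true difference (-6) is worse than an assumed 4-unit margin.
MARGIN = 4.0
full_adherence = itt_difference(4.0, 10.0, 0.0, nonadherence=0.0)
with_dropout = itt_difference(4.0, 10.0, 0.0, nonadherence=0.4)

print(full_adherence)          # -6.0, outside the margin
print(round(with_dropout, 2))  # -3.6, now inside the margin
```

Under these assumptions, 40% non-adherence shrinks the expected difference from 6 units to about 3.6 units, moving a truly inferior treatment inside the margin.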

In addition to non-adherence and attrition, other trial design factors may also make the intervention and standard care appear more similar than they are. For instance, suboptimal administration of standard care (e.g., a subtherapeutic dose) can reduce its effects, making noninferiority easier to achieve. Similarly, enrolling a population at low risk of the outcome or at high risk of non-adherence, or terminating follow-up before treatment effects are fully manifest, will also make the outcomes of the intervention and standard care groups appear more similar.

How should we interpret noninferiority trial results?

Results from noninferiority trials can be interpreted based on the “line of no difference” and the noninferiority margin (Fig. 1). If the confidence interval lies entirely beyond the line of no difference in favour of the new treatment (p < 0.05), the intervention is considered statistically superior to standard care (Fig. 1E). If the confidence interval lies entirely below the noninferiority margin (Fig. 1A), the new intervention is considered inferior to standard care. If the confidence interval lies between the noninferiority margin and the line of no difference (Fig. 1D), the intervention is noninferior to standard care.

Fig. 1: Illustration of possible outcome scenarios in noninferiority trials.

Possible outcome scenarios in noninferiority trials (A–E), interpreted based on the “line of no difference” and the noninferiority margin.

The challenge is interpreting confidence intervals that lie between the line of no difference and the noninferiority margin (Fig. 1C) or those that cross the noninferiority margin but not the line of no difference (Fig. 1B). In the former, the overall effect of the intervention is considered noninferior. In the latter, the intervention may be noninferior, but the results are too imprecise to draw confident conclusions.
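The interpretation rules described in this section can be sketched as a small decision function. This is an illustration under stated assumptions rather than a standard algorithm: the difference is taken as new treatment minus standard care with higher values better, the margin is a positive number, the category labels are our own wording and are not mapped to the panels of Fig. 1, and an interval that crosses the margin is labelled inconclusive.

```python
def classify_result(ci_lower: float, ci_upper: float, margin: float) -> str:
    """Classify a confidence interval against zero and the margin.

    The interval is for (new - standard), higher is better; `margin` is the
    largest acceptable loss of efficacy, given as a positive number.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")

    ni_bound = -margin
    if ci_lower > 0:
        return "superior"      # whole interval favours the new treatment
    if ci_upper < ni_bound:
        return "inferior"      # whole interval lies beyond the margin
    if ci_lower > ni_bound:
        if ci_upper < 0:
            # Within the margin, but the interval excludes no difference.
            return "noninferior, but statistically worse than standard care"
        return "noninferior"
    return "inconclusive"      # interval crosses the margin: too imprecise


# Hypothetical intervals with an assumed 4-unit margin.
for ci in [(0.5, 3.0), (-1.5, 2.0), (-3.0, -1.0), (-6.0, -1.0), (-7.0, -5.0)]:
    print(ci, "->", classify_result(*ci, margin=4.0))
```

The function makes explicit that a conclusion of noninferiority depends only on where the lower confidence limit sits relative to the margin, whereas superiority depends on where it sits relative to zero.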

Both TENAYA and LUCERNE yielded best corrected visual acuity results in which the point estimate and confidence interval included neither the null effect nor the noninferiority margin, suggesting that faricimab is noninferior to aflibercept [4].

Conclusion

Noninferiority trials evaluate a new treatment against standard care to demonstrate that the new treatment is no worse than standard care [6]. Interpretation of noninferiority trials requires consideration of the choice of the noninferiority margin and of deficiencies in trial design that may diminish the estimated difference between the intervention and standard care arms. Clinicians should also be careful to draw appropriate inferences from noninferiority trials and not to mistake a lack of statistical significance in superiority trials (which in some cases may be due to a lack of statistical power) for evidence of noninferiority.