Evidence-based medicine is the integration of the best evidence to guide management decisions for patients. Randomised controlled trial (RCT) data provide gold-standard evidence on treatment efficacy and form the basis for the widespread use of vascular endothelial growth factor (VEGF) inhibitors in treating neovascular age-related macular degeneration. However, outcomes of VEGF inhibitor therapy in real-world practice are variable, and studies often report poorer results than those of RCTs [1]. RCT and real-world evidence (RWE) data thus reflect different but complementary aspects of therapeutic outcomes. Whereas RCTs aim to evaluate efficacy by controlling as many variables as possible, RWE measures overall effectiveness, which incorporates the multitude of variables in clinical practice [2]. Unlike RCTs, which have well-established design, analysis and reporting structures, RWE has often had its rigour questioned because of inherent challenges in study design: the lack of randomisation, variability in the quality of data and non-uniform analytical methods. In this editorial, several principles are recommended for the analysis and reporting of RWE. These are summarised in Table 1.

Table 1 Strategies for the analysis and interpretation of real-world data sets.

Principle 1: Clearly defined clinical question

A clear, clinically relevant question with a specific outcome should be pre-defined. This question should be hypothesis-driven and comprise: (1) the comparison to be made, (2) the study population, (3) a clinically impactful measure and (4) the duration of the study or intervention. These should be set out clearly in the final report to ensure that the reader is aware of the main clinical question.

Principle 2: Interrogate the data source

The highly variable nature of real-world data requires that the data source be reviewed thoroughly before embarking on any analysis. Incomplete data are a major pitfall of RWE. Completeness should be assessed to ensure that there is no systematic bias from under-reporting. One strategy to address completeness is the use of a disease registry that mandates inclusion of all patients and all relevant data. This avoids selecting only patients with favourable outcomes. Participating physicians should also conform to uniform reporting standards to ensure consistency of the data entered [3].
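
As an illustration of how completeness might be assessed, the following minimal Python sketch compares the proportion of missing outcome entries between contributing sites. The records, site labels and field layout are hypothetical and are not drawn from any registry; a marked difference between sites would suggest under-reporting that is systematic rather than random.

```python
# Hypothetical registry rows: (site, baseline VA, month-12 VA) in letters.
# None marks an entry that was never recorded.
records = [
    ("A", 55, 60), ("A", 50, None), ("A", 62, 65), ("A", 48, 52),
    ("B", 58, None), ("B", 45, None), ("B", 60, 61), ("B", None, None),
]


def missing_rate_by_site(rows, field_index):
    """Return, for each site, the proportion of rows missing the given field."""
    totals, missing = {}, {}
    for row in rows:
        site = row[0]
        totals[site] = totals.get(site, 0) + 1
        if row[field_index] is None:
            missing[site] = missing.get(site, 0) + 1
    return {site: missing.get(site, 0) / totals[site] for site in totals}


# Field index 2 is the month-12 outcome in this hypothetical layout.
rates = missing_rate_by_site(records, field_index=2)
for site, rate in sorted(rates.items()):
    print(f"site {site}: {rate:.0%} of month-12 outcomes missing")
```

In this invented example, site A is missing 25% of month-12 outcomes and site B 75%, the kind of imbalance that would warrant review of the data source before any analysis.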

Once the overall dataset has been standardised, inclusion criteria related to the clinical question in principle 1 should be applied. Restricting the study population may seem to run counter to the generalisability of RWE, but it may be necessary because practice patterns change over time. For example, analysing a treat-and-extend regimen prior to 2014 is fruitless as the first large study to describe the efficacy of this regimen was only reported in 2015 [4].

Principle 3: A pre-specified statistical analysis plan and addressing bias

A pre-specified analysis plan is important to avoid picking a method post hoc that gives a ‘desirable’ result. This plan should identify relevant independent, or predictor, variables that are available from the dataset. The plan should also aim to address biases inherent to real-world data due to the lack of randomisation.

Covert and overt biases exist in RWE. Covert biases result from imbalances between groups due to factors that are not observed by investigators. Methods to address this type of bias have limited success [5, 6]. Disease registries, which consist of a large heterogeneous data pool, can help to address this by ensuring enough variability to mitigate the effects of covert bias. Overt biases, which are apparent and identifiable differences, can be corrected through matching or statistical adjustments. Selection bias, which can manifest as high dropout rates, is a major concern that can invalidate RWE [7, 8]. One method of addressing selection bias from dropout is to take the last observation and carry it forward (LOCF) to the pre-specified time point. This method ensures that patients who may have dropped out because of poor outcomes are still included in the analysis. If a LOCF strategy is used, a sensitivity analysis comparing completers and non-completers should be included to address any confounding of the primary outcome by the non-completer group [9]. Statistical methods such as a generalised additive model (GAM) can further help to address bias from dropout. These models analyse all data accrued, including data from subjects who do not reach the pre-specified time point.
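
The LOCF strategy and its accompanying sensitivity analysis can be sketched in a few lines of Python. This is an illustrative sketch only: the patients, visit months and visual acuity values are invented, and a patient is treated as a completer only when a visit falls exactly on the pre-specified time point, whereas a real analysis would normally allow a visit window.

```python
from statistics import mean

# Hypothetical visits: patient id -> list of (month, visual acuity in letters).
visits = {
    "p1": [(0, 55), (6, 60), (12, 62)],
    "p2": [(0, 50), (3, 44)],            # dropped out after month 3
    "p3": [(0, 60), (6, 63), (12, 65)],
    "p4": [(0, 48), (6, 41)],            # dropped out after month 6
}
TIME_POINT = 12  # pre-specified analysis time point in months (assumed)


def locf(records, time_point):
    """Return (value, completed): the last observation at or before
    time_point, and whether a visit fell exactly on time_point."""
    eligible = [(m, v) for m, v in sorted(records) if m <= time_point]
    last_month, last_value = eligible[-1]
    return last_value, last_month == time_point


def summarise(all_visits, time_point):
    """Mean change from baseline for everyone (LOCF), completers only,
    and non-completers only. Assumes both groups are non-empty."""
    completers, non_completers = [], []
    for records in all_visits.values():
        baseline = sorted(records)[0][1]
        value, completed = locf(records, time_point)
        change = value - baseline
        (completers if completed else non_completers).append(change)
    return {
        "all_locf": mean(completers + non_completers),
        "completers": mean(completers),
        "non_completers": mean(non_completers),
    }


result = summarise(visits, TIME_POINT)
print(result)
```

With these invented values, completers gain a mean of 6 letters, non-completers lose a mean of 6.5 letters and the LOCF estimate for the whole cohort is a loss of 0.25 letters, which shows how a completers-only analysis could overstate the benefit.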

Statistical methods in RWE require careful deliberation as comparators are not as straightforward as in RCTs. Comparator groups in RWE may not always be categorical (as with the treated versus untreated groups of an RCT), and an accepted method is to group the population into quartiles (or tertiles) of the continuous variable of interest. This categorisation method is an easy way to communicate findings to the wider scientific community but requires careful consideration to ensure that the groupings are clinically significant [10].
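
A minimal Python sketch of quartile grouping is given below, using an invented continuous variable (number of injections in the first year). The cut points come from the standard library function `statistics.quantiles` with its default method; other definitions of quantiles exist and give slightly different cut points, and the rule used here for values that fall exactly on a cut point is an arbitrary choice.

```python
from bisect import bisect_left
from statistics import quantiles

# Hypothetical continuous variable: injections received in the first year.
injections = [3, 4, 5, 5, 6, 7, 7, 8, 9, 10, 11, 12]

# n=4 returns three cut points, which divide the population into four groups.
cut_points = quantiles(injections, n=4)


def quartile(value, cuts):
    """Return the quartile group (1 to 4) for a value. A value equal to a
    cut point is placed in the lower group (an arbitrary convention)."""
    return bisect_left(cuts, value) + 1


groups = [quartile(x, cut_points) for x in injections]
for q in range(1, 5):
    members = [x for x, g in zip(injections, groups) if g == q]
    print(f"quartile {q}: {members}")
```

Tied values can make the groups unequal in size, and data-driven cut points may not coincide with clinically meaningful thresholds, so the resulting groups should still be checked against clinical judgement.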

Traditional hypothesis testing methods applied to RCTs may not be applicable to RWE, where relationships may often be non-linear [10]. Advanced non-linear techniques such as GAM or locally weighted regression (‘lowess’) may be preferred, as they allow all available data to be ‘fitted’ to establish a predictor-outcome relationship [11, 12]. This allows for the visualisation of relationships and can often reveal insightful patterns that are missed if linear models are applied [13].
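
To illustrate the idea behind locally weighted regression, the Python sketch below fits a weighted straight line around each data point using tricube weights. It is a simplified version written for illustration: it omits the robustness iterations of the full lowess procedure, it is not the implementation of any particular statistical package, and the data and the `frac` smoothing parameter are assumptions.

```python
def lowess_fit(x, y, frac=0.5):
    """Locally weighted linear regression with tricube weights.
    Returns one fitted value per point in x. No robustness iterations."""
    n = len(x)
    k = max(2, int(round(frac * n)))  # points spanned by each local window
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point (itself included).
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))  # weighted line evaluated at x0
    return fitted


# Hypothetical data: visual acuity change (letters) rises with the number of
# injections and then plateaus, a pattern a single straight line would miss.
injections = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
va_change = [-4, -2, 1, 3, 5, 6, 7, 7, 8, 8, 8, 8]

smooth = lowess_fit(injections, va_change, frac=0.5)
for xi, si in zip(injections, smooth):
    print(f"{xi:2d} injections: fitted change {si:+.1f} letters")
```

Plotting the fitted values against the predictor gives the kind of visualisation described above; a smaller `frac` follows the data more closely, whereas a larger one gives a smoother curve.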

Principle 4: Structured interpretation of results

The interpretation of the results should be targeted toward answering the clinical question defined in principle 1. The discussion should highlight the primary outcome and contextualise this outcome for stakeholders and decision-makers. The plausibility of the comparison should be discussed, as well as the magnitude of the relative benefits (or harms). Careful attention should be paid to reporting findings accurately as causal inference or as correlation, and to avoiding over-generalisation. The STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) statement provides guidance on transparent reporting of observational studies and should be applied to real-world data analysis.


Real-world evidence is an important form of research for informing medical and health care decisions in clinical practice. While it can help answer questions that RCTs cannot, it is vital to apply thoughtful methodology to improve its scientific rigour.