A person’s lifetime risk of developing disease is influenced by a combination of genetic, demographic, behavioral, physiological and socioeconomic factors. An estimated 48% of the global disease burden is associated with modifiable risk factors, so understanding these risk–outcome relationships can help in developing effective interventions to improve health and wellbeing. Clinical and public health guidelines are aimed at modifying these risk factors, whether by tackling harmful risks or by promoting protective factors.

Tobacco control is a prominent example of global policy implementation that stems from studying the health risks of tobacco smoking. The evidence underlying the harmful effects of smoking on health is well established and has led to the introduction of public policies, such as the World Health Organization’s Framework Convention on Tobacco Control, which has achieved success in reducing global smoking prevalence since its adoption in 2003 and in mitigating lung cancer mortality in some countries. However, the evidence base for other risk–outcome associations is less robust, and some guidelines have been developed predominantly on the basis of expert knowledge rather than quantitative evidence, which has left them open to criticism. A lack of quantitative evidence of risk has also resulted in guidelines with conflicting recommendations, such as the controversial guideline from the NutriRECS consortium that recommends maintaining current levels of red meat consumption, in contrast to other dietary guidelines that advise a lower intake.

Primary studies that report evidence of risk factor and outcome relationships use varying study designs, such as randomized controlled trials and cohort, case-control and cross-sectional studies, findings from which can be incorporated into meta-analyses to provide a summary level of evidence. Each study design has inherent strengths and weaknesses, and biases in data collection and analysis can give rise to heterogeneous findings. Meta-analytical methods such as Grading of Recommendations Assessment, Development and Evaluation (GRADE) interpret evidence through evaluation of the certainty of evidence and risk of bias. Although useful, GRADE is limited in that it provides only a qualitative assessment of a given analysis.

A new set of papers in this issue of Nature Medicine describes the development and implementation of a novel meta-analytical tool for assessing the quality of evidence for risk–outcome relationships. In the flagship paper, Murray and colleagues describe the burden of proof risk function (BPRF), an open-source, quantitative approach to examining the quality of evidence that supports risk–outcome relationships. The risk–outcome relationship is then converted into a ‘star’ rating on the basis of the most conservative interpretation of evidence, wherein one star refers to no true association and two or more stars suggest an association, with five stars supporting a very strong association of harmful or protective factors.

Demonstrating proof of concept of the BPRF tool, Dai and colleagues re-assess the evidence behind smoking and health and identify consistent harmful associations of smoking for most health outcomes, with very strong (five-star) associations for lung, laryngeal and pharyngeal cancers, aortic aneurysms and peripheral artery disease. Next, Razo and colleagues report a very strong (five-star), continuous relationship between high systolic blood pressure and the risk of ischemic heart disease. In two separate studies, Stanaway and colleagues and Lescinsky and colleagues find that intake of vegetables and intake of unprocessed red meat are associated with weak to moderate (two- to three-star) increased protection against, and increased risk of, various cardiometabolic disease and cancer outcomes, respectively.

The BPRF tool is not without its methodological limitations and is intended to be a complementary metric for the synthesis of evidence alongside existing metrics, such as GRADE. Notably, it does not provide a solution to the inherent weaknesses of traditional data-collection methods. Traditional methods of assessing risk–outcome relationships, such as lifestyle questionnaires for recording risk factors and self-reporting of health outcomes, are subject to bias. Using nutritional epidemiology as an example, in an accompanying News and Views, Tong and colleagues discuss the impact of residual confounding and differing global dietary patterns, which illustrates the challenges of interpreting evidence in observational studies. With advances in technology, data-collection methods such as electronic health records, wearable devices and ‘-omics’ data are now incorporated into clinical and epidemiological studies. These advances in data-collection methods could provide more-objective reporting of existing risk factors and health outcomes, or emerging risk factors such as exposure to electronic cigarettes.

Fundamentally, most of the published studies are from North America, Europe and East Asia, with scarce evidence from under-resourced settings. The risk factors and disease burdens in these regions are not generalizable to all populations globally. In the case of blood pressure, Jeemon and Harikrishnan caution against a one-size-fits-all approach to defining high blood pressure and highlight the variability in blood pressure in populations of different ethnicities. Too often, clinical practice guidelines make recommendations based on studies from well-resourced regions and lack evidence on the regional healthcare systems and local factors of under-resourced regions, so implementation in the latter is unlikely to be feasible. In her World View, Lia Tadesse, Minister of Health in Ethiopia, calls for increased investment in health data and enhanced global collaboration and capacity building to better equip under-resourced regions and thereby enhance the global diversity of research.

The Burden of Proof studies bring a new tool that can be used for critical interpretation of evidence. However, higher-quality study designs and increased global diversity of studies are also essential to providing accurate assessments of risk–outcome relationships to inform policy recommendations that, ultimately, improve global health and wellbeing.