The rich diversity of human populations reflects the biological consequences of DNA recombination and sexual reproduction, coupled with factors such as migration, admixture and selection. Our genomes record the history of previous generations, and the meetings, matings and movements of our ancestors are responsible for the current makeup and distribution of the global human population. We are all descended from a common African lineage, and indeed we have vastly more genetic similarities than differences across different backgrounds. However, the underlying genetic variation that exists between populations can greatly affect genetic analyses of diseases or complex traits, and associations found in one population may not be valid for another. Therefore, to ensure that the benefits obtained from genetic studies can be universally shared, genome-wide association studies (GWAS) and other genome-sequencing projects must take a global approach and analyze cohorts of diverse ancestry.

The current skew of population cohorts used in genetic studies and the potential consequences stemming from this imbalance are highlighted in a Perspective from Martin and colleagues in this issue. Although the authors are enthusiastic about recent advances in genomic medicine, especially the development of methods to derive polygenic risk scores for disease prediction, they caution that, if current trends prevail, the existing gap in the applicability and clinical relevance of genetic discoveries across different populations will widen even further. At present, most individuals who participate in, and thus benefit from, large genetic studies are of European descent. Consequently, the rest of the world is left understudied and potentially excluded from being able to take full advantage of resulting discoveries that inform disease prevention or treatment. Figure 1 of the Perspective illustrates the extreme disparity among studied populations: individuals of European descent account for approximately 16% of the global population but represent almost 80% of all GWAS participants. Troublingly, the authors also highlight that little progress has been made in rectifying this imbalance: in recent years, cohort compositions have largely stayed the same or declined in diversity even further.

Genetic variation can lead to differential effects on disease susceptibility or manifestation across populations. Additionally, parameters that affect the outcomes of genetic studies, such as allele frequencies and linkage disequilibrium, vary across populations. Thus, because most GWAS findings are being discovered in European-derived populations, the results will have the most relevance for those populations. Using the specific example of polygenic risk scores, Martin and colleagues discuss how the clinical utility of genetic risk scores generated from existing GWAS data will disproportionately benefit people with European ancestry. They posit that this lack of generalizability of risk prediction can be remedied only through concerted efforts to expand and diversify the cohorts used for genetic research.

Another important point emphasized by Martin and colleagues is that the restriction of studies to predominantly European-derived populations severely limits understanding of how genetics influences disease. More genetic variation is found in non-European populations, such as those with African ancestry, and this variation can provide a rich resource for finding new genetic associations. In addition, comparative analysis across different populations can help fine-map associated loci and lead to the identification of causal genes or pathways that increase mechanistic understanding of disease. Thus, apart from the ethical reasons to increase the inclusion of individuals of non-European ancestry in genetic studies, there is solid scientific motivation as well.

The historical (and current) imbalance in genetic cohorts can be explained by disparate features of geography, healthcare and funding, among other factors. With an increasingly globalized and connected research community, we hope to see prioritization of recruitment and analysis of diverse cohorts. As some of the barriers that prevent the establishment of large cohorts from understudied populations diminish, we will start to see a much-needed redress of the imbalance in population-specific genetic findings. In addition, the knowledge gained through in-depth analysis of non-European-ancestry populations will advance understanding of genetics, variation, evolution and disease, to the benefit of all.