Precision medicine is based on a vision of effective preventive and therapeutic strategies grounded in precise understandings of the genetic and environmental determinants of disease. Recent advances in the science of precision medicine have yielded biomedical discoveries and pharmaceutical innovations.1 At the same time, critics of precision medicine suggest that because the primary drivers of health inequalities are social factors, precision medicine is unlikely to result in widespread benefit for population health.2,3 Even so, for precision medicine to realize its full potential (at the individual level for diagnosis and treatment, and at the population level for health promotion and disease prevention), risks and benefits must be shared across societal groups, a key principle of distributive justice. Without careful attention to the structure and design of studies, benefits may be realized by only select groups, exacerbating current health disparities.4 In the current climate of accelerating precision medicine, the question of how to measure and achieve genetic, environmental, and social diversity of participants must receive the attention it requires. (Although we focus here on ensuring genetic diversity in precision-medicine studies, as the number of studies that focus on social and environmental variables increases, many of the issues identified will be relevant to achieving diversity in exposure to those variables as well.)

Precision-medicine studies are designed to identify genetic variants that contribute to disease risk or affect treatment, the frequencies of which differ substantially within (and across) populations of varying ancestries.4 However, improvements in diagnosis and treatment may be distributed unevenly, with majority populations benefiting in higher proportions if studies are based on data from existing genomic databases, which predominantly contain samples from persons of European ancestry.4 Studies of such data are likely to miss disease risk variants that are rare among European populations but common among other groups, and to distort effect size estimates.5 Similar considerations affect medication responses: for example, some genetic variants related to drug metabolism are more common among individuals of African ancestry and may produce increased sensitivity or diminished response to β-agonists, warfarin, chemotherapy, and other medications.6 The implications of such variants for dosing strategies may not be understood until a sufficient number of people with African ancestry are included in relevant study samples to account for within-group and between-group variability. This selection problem is not new. Scientists have completed and analyzed seminal studies only to discover that the proportion of minorities enrolled was too small for meaningful subgroup analysis.5 Precision-medicine studies risk a similar fate unless three major challenges are addressed.

The first challenge is deciding prospectively which groups to include to ensure a meaningful degree of diversity. No single study is likely to be large enough to encompass the full diversity of the United States, much less the world, with sufficient power for meaningful findings to emerge, and no single approach is right for all studies. Options include oversampling groups with greater disease burdens, groups that have been historically underrepresented, and populations with higher levels of within-group genetic diversity. Each choice involves deciding how best to balance redress of past wrongs with important, ongoing population-based science. Given the realities of American society, these decisions will encompass groups that differ in ancestry from Americans of European descent. Although the groups probably overlap, investigators may still not be able to sample all eligible groups adequately for meaningful analysis.

Insofar as some groups, such as persons of African descent, have higher levels of within-group genetic diversity,4 a larger number of samples may be required to generate significant findings. Similar considerations arise for Mexican Americans with 35–64% Native American ancestry and Hispanics and Latinos with variable proportions of European, Native American, and African ancestry.4 Because existing data sets overrepresent people of European ancestry, some scientists have suggested that newer studies should sample predominantly participants of color to provide data for the accurate assessment of pathogenicity and penetrance of identified variants uncommon among those of European descent.4 This type of oversampling has been shown to provide significant data that would not have been obtained otherwise. Whatever approach is selected, additional efforts to involve genetically and socially diverse groups will be necessary to satisfy the desideratum of fairly distributing the benefits of precision medicine.
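The concern that variants uncommon among those of European descent can go undetected in existing data sets can be illustrated with simple arithmetic. The sketch below is only an illustration: the function name, the sample sizes, and the allele frequencies are hypothetical and are not taken from any study, and the calculation assumes Hardy-Weinberg proportions within each group.

```python
def expected_carriers(n_participants, allele_freq):
    """Expected number of participants carrying at least one copy of a variant.

    Assumes Hardy-Weinberg proportions, so the probability that a person
    carries at least one copy is 1 - (1 - allele_freq) ** 2.
    """
    carrier_prob = 1.0 - (1.0 - allele_freq) ** 2
    return n_participants * carrier_prob


# Hypothetical data set: 9,000 participants of European ancestry and
# 1,000 participants from another ancestry group.
# Hypothetical variant: allele frequency 0.001 in the first group, 0.05 in the second.
carriers_european = expected_carriers(9000, 0.001)
carriers_other = expected_carriers(1000, 0.05)

print(f"Expected carriers, European-ancestry subsample: {carriers_european:.1f}")
print(f"Expected carriers, other-ancestry subsample: {carriers_other:.1f}")
```

Under these assumed numbers, the far larger subsample would be expected to contain fewer than 20 carriers, too few for a reliable estimate of effect size, whereas the smaller subsample would be expected to contain roughly five times as many.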

The second challenge involves selecting appropriate criteria for individual inclusion. Current approaches in the United States emphasize self-reported race, using categories developed by the Office of Management and Budget. Self-report, however, largely reflects societal and individual constructions of identity, not the genetic variation of the population (although data about this are conflicting).7 Moreover, relying on these traditional racial categories in genomic studies carries the additional risk of reifying race as a genetic concept, despite a growing consensus that such a concept is incoherent.7 Despite these limitations and risks, eliminating self-reported race categories would reduce our ability to discern the health consequences of racial status in our society. Furthermore, some studies suggest that perceived racial status and its social consequences have possible biologic implications, such as variation in DNA methylation8 and telomere length.9

A number of alternatives to self-reported race have been suggested, such as the use of genetic markers,4 admixture analysis,10 or ancestry-informative regions.4 These approaches, however, all depend on the analysis of genomic data. Thus, unless existing databases that contain genomic information are being used to identify potential participants, these measures can only be applied after enrollment and genetic analysis have occurred. Discarding large numbers of potential participants because of post facto failure to meet the criteria for stratification of the sample is expensive and arguably wasteful.

To account for both the health effects of race in our society and the genetic variation within and between ancestries, a reasonable interim approach might include the following: use of self-reported race and ethnicity for initial enrollment and of genetic markers of ancestry to adjust the samples when those data become available; iterative updating of recruitment targets and outreach strategies based on these combined measures; and coordination of sampling strategies across studies to achieve diversity of groups, so that representation is adequate and the data are robust and sufficient for analysis within and between subgroups. Whether this is the best approach to achieve the desired diversity is still an open question, but precision-medicine funders and investigators should establish a priori how genetic diversity will be defined and measured in their studies.
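The iterative updating of recruitment targets described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a recommended protocol. The function name, the group labels, and all counts are hypothetical. In particular, the rule that a stratum's yield is estimated from the share of genotyped participants who remain in it after ancestry-based adjustment is our own simplification, since how samples should be adjusted remains an open question.

```python
import math
from fractions import Fraction


def additional_recruits(targets, enrolled, genotyped, retained):
    """Estimate how many more participants to recruit in each stratum.

    targets   -- desired analysis-sample size per group
    enrolled  -- participants enrolled so far, by self-reported group
    genotyped -- enrolled participants whose genetic data are available
    retained  -- genotyped participants who remain in the stratum after
                 adjustment using genetic markers of ancestry (hypothetical rule)
    """
    needed = {}
    for group, target in targets.items():
        n_enrolled = enrolled.get(group, 0)
        n_genotyped = genotyped.get(group, 0)
        if n_genotyped == 0:
            # No genetic data yet: rely on self-report alone.
            rate = Fraction(1)
        else:
            rate = Fraction(retained.get(group, 0), n_genotyped)
        if rate == 0:
            # No basis for an estimate; flag the stratum for manual review.
            needed[group] = None
            continue
        expected_in_group = n_enrolled * rate
        shortfall = max(Fraction(0), target - expected_in_group)
        # Assume future recruits are retained at the same observed rate.
        needed[group] = math.ceil(shortfall / rate)
    return needed


# Hypothetical interim snapshot of a study with three strata.
targets = {"group_a": 1000, "group_b": 1000, "group_c": 500}
enrolled = {"group_a": 800, "group_b": 400, "group_c": 100}
genotyped = {"group_a": 200, "group_b": 100, "group_c": 0}
retained = {"group_a": 200, "group_b": 80}

print(additional_recruits(targets, enrolled, genotyped, retained))
```

Rerunning such a calculation at each interim review, as genetic data accumulate, is one way recruitment targets and outreach effort could be redirected toward strata that are falling short. Exact fractions are used here only to keep the rounding predictable.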

The third challenge is designing outreach to the targeted populations. Because of an egregious history of exclusion and exploitation in research, some groups, most notably African Americans and Native Americans, have been absent or underrepresented in study samples. Recently, however, additional opportunities for respectful, meaningful participation have been developed and have contributed to a growing body of literature that includes successful and promising methods to increase partnership, engagement, and enrollment with minority populations. Among strategies that have proven effective are early involvement in planning and defining study goals; understanding how participants would like to receive individual and aggregate study results; methods to facilitate two-way communication, including overcoming language barriers; and long-term commitment to the well-being of a population from inception to study completion and beyond. Approaches will vary in different contexts, but the need to include thoughtful strategies for community engagement in research planning, and to link local, regional, and national stakeholders, is constant.

Some questions regarding how best to promote distributive justice in precision medicine remain unanswered:

  • Given that precision medicine cannot address the needs of every group, what is a substantively fair set of goals for precision-medicine research?

  • Which groups’ inclusion should be prioritized and on what basis? How can other aspects of social diversity (e.g., socioeconomic position, which may affect epigenetic diversity) be taken into account?

  • Should genetic markers be used? If so, how, and which markers make sense and under what circumstances?

  • Will effective practices for recruitment in discrete geographic areas work when scaled up to a national (or international) level?

This list, although partial and only suggestive, indicates the interplay between normative and empirical questions, as well as the data that are needed to achieve just distribution of benefits in precision medicine. Without the development of scientific and practical guidance on diversity and inclusion, precision-medicine discoveries may improve health for some, but not all, and almost certainly least for those already disadvantaged. Advances in precision medicine cannot wait until all of these questions are answered. But to move ahead in both good faith and good form, it is wise to consider interim measures and monitor outcomes for the discovery and fair distribution of benefits alike.

Disclosure

The authors declare no conflict of interest.