Science and Society | Published:

Prioritizing diversity in human genomics research

Nature Reviews Genetics volume 19, pages 175–185 (2018)


Recent studies have highlighted the imperatives of including diverse and under-represented individuals in human genomics research and the striking gaps in attaining that inclusion. With its multidecade experience in supporting research and policy efforts in human genomics, the National Human Genome Research Institute is committed to establishing foundational approaches to study the role of genomic variation in health and disease that include diverse populations. Large-scale efforts to understand biology and health have yielded key scientific findings, lessons and recommendations on how to increase diversity in genomic research studies and the genomic research workforce. Increased attention to diversity will increase the accuracy, utility and acceptability of using genomic information for clinical care.

Key points

  • Knowledge of how genomic variants vary by population increases our ability to understand genomic contributions to health and disease and to apply this knowledge to clinical care.

  • In addition to producing more robust science, studies involving diverse participants facilitate a more equitable distribution of resulting benefits.

  • Existing obstacles related to study enrolment and analysis can be overcome by rigorous attention to community engagement and analytic strategies, although this may come at the expense of expediency and convenience.

  • Researchers, funding agencies and journal editors have roles to play in increasing the inclusion of diverse participants and populations, prioritizing diversity-related research and raising publication standards, respectively.





The authors thank L. Brooks, A. Felsenfeld, T. Gatlin, G. Ginsburg, B. Graham, M. Hahn, G. Jarvik, D. Kaufman, R. Li, N. Lockhart, E. Madden, J. McEwen, J. Mulvihill, G. Petersen, D. Roden, L. Rodriguez, C. Rotimi, H. Sofia, J. Troyer, M. Williams and A. Wise for valuable discussion and feedback. The authors are grateful to the investigators supported by the US National Human Genome Research Institute (NHGRI) and the individuals who have participated in NHGRI-supported research for their contributions to further diversity-related efforts in genomics.

Author information


  1. National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892–2152, USA.

    • Lucia A. Hindorff
    • Vence L. Bonham
    • Lawrence C. Brody
    • Margaret E. C. Ginoza
    • Carolyn M. Hutter
    • Teri A. Manolio
    • Eric D. Green




L.A.H. and M.E.C.G. researched data for the article. L.A.H., T.A.M., V.L.B., L.C.B., C.M.H. and E.D.G. substantially contributed to discussions of the content. L.A.H. and E.D.G. wrote the article. All authors reviewed and/or edited the manuscript before submission.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Lucia A. Hindorff.



Admixture

The interbreeding of individuals from two isolated populations; often used in the context of ancestry arising from two or more continents of origin (for example, admixed populations).

Allele frequency

A measure of the frequency of a particular allele relative to all alleles in a population; typically expressed as a percentage.
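As an illustration of this definition, the following is a minimal sketch that computes allele frequencies as percentages from diploid genotypes at a single site. The function name `allele_frequencies` and the five-person sample are hypothetical and are not taken from the article.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return each allele's share of all allele copies, as a percentage.

    `genotypes` is a list of tuples, one tuple of alleles per individual
    (two alleles per tuple for a diploid site).
    """
    # Count every allele copy across all individuals.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: 100.0 * n / total for allele, n in counts.items()}


# Hypothetical genotypes for five individuals at one biallelic site.
sample = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G"), ("A", "A")]
freqs = allele_frequencies(sample)
print(freqs)
```

Running the same calculation separately for each population in a study is one simple way to see how the frequency of a variant differs by population.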

Genome-wide association studies

(GWAS). An approach used to associate specific genomic variants with particular diseases by scanning the genomes from many different people and looking for genomic markers that can be used to predict the presence of a disease.

Haplotype structure

A pattern or block-like structure comprising a set of DNA variations, or polymorphisms, that tend to be inherited together. A haplotype can refer to a combination of alleles or to a set of single nucleotide polymorphisms found on the same chromosome.


Imputation

A statistical approach to predicting unobserved genotypes in a study population by use of known genotypes from a reference population.

Linkage disequilibrium

The nonrandom association of alleles at different loci; a sensitive indicator of the population genetic forces that structure a genome.
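To make the idea of nonrandom association concrete, here is a minimal sketch of two commonly used summary measures for a pair of biallelic loci: the disequilibrium coefficient D (observed haplotype frequency minus the frequency expected under independence) and the squared correlation r². The function name, the 0/1 allele coding and the haplotype sample are assumptions made for illustration; the article does not specify a particular measure.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, where a and b are 0 or 1 and
    indicate whether a chosen allele is present at locus A and at locus B
    on the same chromosome.
    """
    n = len(haplotypes)
    p_a = sum(1 for a, _ in haplotypes if a == 1) / n
    p_b = sum(1 for _, b in haplotypes if b == 1) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D is zero when the two alleles occur together as often as chance predicts.
    d = p_ab - p_a * p_b

    # r^2 rescales D^2 by the allele frequencies; undefined if a locus is fixed.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Hypothetical sample of ten haplotypes in which the alleles tend to co-occur.
linked = [(1, 1)] * 4 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 4
d, r_squared = linkage_disequilibrium(linked)
print(d, r_squared)
```

Because these quantities depend on haplotype and allele frequencies in the sample, the same pair of loci can give different values in different populations.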


Pathogenic

Pathogenicity classification for a genomic alteration that increases an individual's susceptibility or predisposition to a certain disease or disorder.

Population stratification

Differences in allele frequencies between cases and controls due to systematic differences in ancestry rather than association of genes with disease.

Reference sequence

A genomic sequence representative of a particular species' sequence, often used to align and analyse genome sequences from participants in human genomic studies.

Secondary findings

Genomic test results that do not pertain to the primary diagnostic question or reason for testing; also referred to as incidental or additional findings.

Trans-ethnic fine mapping

An approach to refine initial GWAS results by leveraging differences in the degree of linkage disequilibrium among multiethnic populations, narrowing the genomic region in which a causal variant may reside.
