Strategic vision for improving human health at The Forefront of Genomics


Starting with the launch of the Human Genome Project three decades ago, and continuing after its completion in 2003, genomics has progressively come to have a central and catalytic role in basic and translational research. In addition, studies increasingly demonstrate how genomic information can be effectively used in clinical care. In the future, the anticipated advances in technology development, biological insights, and clinical applications (among others) will lead to more widespread integration of genomics into almost all areas of biomedical research, the adoption of genomics into mainstream medical and public-health practices, and an increasing relevance of genomics for everyday life. On behalf of the research community, the National Human Genome Research Institute recently completed a multi-year process of strategic engagement to identify future research priorities and opportunities in human genomics, with an emphasis on health applications. Here we describe the highest-priority elements envisioned for the cutting-edge of human genomics going forward—that is, at ‘The Forefront of Genomics’.


In October 1990, a pioneering group of international researchers began an audacious journey to generate the first map and sequence of the human genome, marking the start of a 13-year odyssey called the Human Genome Project1,2,3. The successful and early completion of the Project in 2003, which included parallel studies of a set of model organism genomes, catalysed enormous progress in genomics research. Leading the signature advances has been a greater than one million-fold reduction in the cost of DNA sequencing4. This decrease has allowed the generation of innumerable genome sequences, including hundreds of thousands of human genome sequences (both in research and clinical settings), and the continuous development of assays to identify and characterize functional genomic elements5,6. These new tools, together with increasingly sophisticated statistical and computational methods, have enabled researchers to create rich catalogues of human genomic variants7,8, to gain an ever-deepening understanding of the functional complexities of the human genome5, and to determine the genomic bases of thousands of human diseases9,10. In turn, the past decade has brought the initial realization of genomic medicine11, as research successes have been converted into powerful tools for use in healthcare, including somatic genome analysis for cancer (enabling development of targeted therapeutic agents)12, non-invasive prenatal genetic screening13, and genomics-based tests for a growing set of paediatric conditions and rare disorders14, among others.

In essence, with growing insights about the structure and function of the human genome and ever-improving laboratory and computational technologies, genomics has become increasingly woven into the fabric of biomedical research, medical practice, and society. The scope, scale, and pace of genomic advances so far were nearly unimaginable when the Human Genome Project began; even today, such advances are yielding scientific and clinical opportunities beyond our initial expectations, with many more anticipated in the next decade.

Embracing its leadership role in genomics, the National Human Genome Research Institute (NHGRI) has developed strategic visions for the field at key inflection points, in particular at the end of the Human Genome Project in 2003 (ref. 15) and then again at the beginning of the last decade in 2011 (ref. 16). These visions outlined the most compelling opportunities for human genomics research, in each case informed by a multi-year engagement process. NHGRI endeavoured to start the new decade with an updated strategic vision for human genomics research. Through a planning process that involved more than 50 events (such as dedicated workshops, conference sessions, and webinars) over the past two years, the institute collected input from a large number of stakeholders, with the resulting input catalogued and synthesized using the framework depicted in Fig. 1.

Fig. 1: Four-area strategic framework at The Forefront of Genomics.

Together, the indicated progressive and interrelated areas serve to organize the major elements in the strategic vision described here.

Unlike in the past, this round of strategic planning was greatly influenced by the now widely disseminated nature of genomics across biomedicine. A representative glimpse into this historic phenomenon is illustrated in Fig. 2. During the Human Genome Project, NHGRI was the primary funder of human genomics research at the US National Institutes of Health (NIH), but the past two decades have brought a greater than tenfold increase in the relative fraction of funding coming from other parts of the NIH.

Fig. 2: Funding trends of NIH and NHGRI over the past 30 years.

The total funding levels for the NIH (top) and NHGRI (middle) are indicated for 1990, 2010, and 2020. Also shown (bottom) is the relative proportion of funds supporting human genomics research provided by NHGRI versus all of the NIH for the three corresponding time intervals (as derived from queries of the internal NIH Research, Condition, and Disease Categorization database for funds assigned to the ‘human genome’ category). During the 30-year period when the NHGRI budget increased roughly tenfold (middle), the proportion of total NIH funding for human genomics research that came from parts of the NIH other than NHGRI increased more markedly, from less than 5% during the Human Genome Project to around 90% at the beginning of the current decade (bottom). In essence, these trends reflect a leveraging of NHGRI’s funds that increased NIH’s overall human genomics research funding by greater than tenfold.

The planning process continually encountered the realities associated with the broad and extensive use of genomics and the impracticality of being comprehensive, which together served to focus attention on the most cutting-edge opportunities in human genomics. This experience affirmed NHGRI’s recently rearticulated role in providing genomics leadership at the NIH, embodied by our newly conceived organizational mantra: ‘The Forefront of Genomics’. We ultimately linked this mantra to the strategic planning process to help guide the formulation of input. From the ensuing discussions, it became apparent that responsible stewardship is a central aspect of being at (and pushing forward) The Forefront of Genomics, specifically in the four major areas detailed in Fig. 1, Boxes 1–4, and below.

Principles and values for human genomics

As genomics has matured as a discipline, the field has embraced a growing set of fundamental principles and values that together serve as a guiding compass for the research efforts—some of these emerged organically within the field, whereas others have been adopted from the broader scientific community. The growing complexities of human genomics and its many applications (especially in medicine) at The Forefront of Genomics make it imperative to reaffirm, sharpen, and even extend these tenets, such as those highlighted in Box 1.

Many of these principles and values have been informed by the recognized area of genomics that focuses on ethical, legal, and social implications (ELSI) research17, which was established at the beginning of the Human Genome Project to ensure that the eugenics movement and other misuses of genetics are not repeated. ELSI research has since grown to encompass a broad portfolio of studies that examine issues at the interface of genomics and society, the results of which have informed policies and laws related to genetic discrimination, intellectual property, data sharing, and informed consent18. Similar efforts seek to ensure that the benefits of genomics are available to all members of society19. Genomics, like other scientific fields, must reckon with systematic injustices and biases, fully mindful of their importance for health equity. In the future, ELSI research needs to focus on aspects of genomic medicine implementation that present challenging questions about legal boundaries, study governance, data control, privacy, and consent. Complex societal issues must also be studied, including the expanded application of genomics in non-medical realms (for example, ancestry testing, law enforcement, and genetics-based marketing of consumer goods)20. Finally, ELSI research should also examine the implications of studying genetic associations with bio-behavioural traits (such as intelligence, sexual behaviour, social status, and educational attainment)21 and of a future in which machine learning and artificial intelligence are used to adapt risk communication and clinical decisions based on analysing an individual’s genome sequence22.

Robust foundation for genomics

Genomics is now routinely and broadly used throughout biomedical research, with widespread reliance on a robust foundation for facilitating genomic advances. The foundation’s integrity depends on several key elements, including infrastructure, resources, and dynamic areas of technology development and research. Sustaining and improving that foundation are key responsibilities at The Forefront of Genomics, the major elements of which are highlighted in Box 2 and detailed in corresponding paragraphs below.

Genome structure and function

The past two decades have brought a greater than million-fold reduction in the cost of DNA sequencing23 along with marked advances in technologies for functional genomics6,24,25 (that is, the study of how elements in the genome contribute to biological processes). Further opportunities are anticipated as the generation and analysis of genomic data become even faster, cheaper, and more accurate. Near-term expectations include enhanced capabilities for generating high-quality and complete (for example, telomere-to-telomere and phased) genome sequences26,27, and continued refinement and enhanced utilization of a human genome reference sequence(s) that increasingly reflects human variation and diversity on a global scale28 and that serves as a substrate for genome annotation29. Technologies for generating DNA sequence and other data types (for example, transcriptomic data, epigenetic data, and functional readouts of DNA sequences) need to be enabled at orders-of-magnitude lower costs, at single-cell resolution, at distinct spatial locations within tissues, and longitudinally over time30,31,32. These genomic data should be integrated with other multi-omic data (for example, proteomes, metabolomes, lipidomes, and/or microbiomes) in sophisticated ways, including methods that collect many data types from a single sample32. Transformative approaches will become increasingly vital for assimilating, sharing, and analysing these complex and heterogeneous data types33 and must expand to include the integration of environmental, lifestyle, clinical, and other phenotypic data. These capabilities should be incorporated into browsers, portals, and visualization tools for use by a broadening community of researchers and clinicians.

Genome sequences have now been generated for more than 1,000 vertebrate species and are increasingly accompanied by multi-species annotations34. Understanding natural genomic variation, the conservation of genomic elements, and the rapid evolutionary changes in genomic regions associated with specific traits is crucial for attaining a comprehensive view of genome structure and function. The study of a wide range of organisms continues to be instrumental for determining the effect of genomic variation on biological processes and phenotypes, providing insights about the interplay of genomic variants and environmental pressures35 and the relevance of putative pathogenic variants identified in clinical studies36. It is essential that the generation of high-quality multi-species genomic data is accompanied by community-accepted standards for data, metadata, and data interoperability. New methods would allow for integrating functional data from diverse species with human data and visualizing increasingly complex comparative genomic datasets. Continued progress in this area would move the field closer to the long-term aspirational goal of understanding the evolutionary history of every base in the human genome.

Genomic data science

All major genomics breakthroughs so far have been accompanied by the development of ground-breaking statistical and computational methods. Accordingly, continued innovations in both traditional and advanced methods (including machine learning and artificial intelligence) should be prioritized37. These approaches must be considered from the early stages of study planning and data collection in ways that complement and enhance, rather than inhibit, technical progress. Furthermore, the biomedical research community requires accurate, curated, accessible, secure, and interoperable genomic data repositories and informatics platforms that benefit all populations. Approaches for improving the efficiency of such resources include the use of shared storage and computing infrastructure, the adoption of common data-management processes, and the development of increasingly automated data-curation methods38. Carefully considered funding strategies must be designed to support these methods and resources, including a global, multi-funder model that ensures their development, enhancements, and long-term sustainability39.

Recent progress has brought substantial transformations in how the petabytes of genomic data being generated each year are assimilated and analysed, including the emergence of cloud-based and federated approaches. Effective and efficient management of increasingly complex genomic datasets requires addressing challenges with these emerging approaches as well as innovations in the use of hardware, algorithms, software, standards, and platforms40. Current barriers include the lack of interoperable genomic data resources (which limits downstream access, integration, and analyses) and the absence of controlled and consistently adopted data and metadata vocabularies and ontologies41,42. User-friendly systems that capture metadata in a scalable, intelligent, and cost-effective manner and that allow for intuitive data visualizations are essential. Ever-improving routines and guidelines should be formulated to continue and even enhance responsible data sharing, requiring the collective efforts of researchers, funders, and publishers alike; similar attention should focus on ensuring the use of FAIR (findable, accessible, interoperable, and reusable) data standards and the reproducibility of data analyses38. Innovations in technology and policy must be integrated to develop data-stewardship models that ensure open science and reduce data-access burdens to advance research, including the use of optimally balanced and ethically sound approaches for respecting participant preferences and consent as well as engaging communities. Such developments should be done in an open-source culture to build consensus and enable the development, maintenance, and use of best-in-class tools, pipelines, and platforms that can be applied to all datasets.
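As a minimal sketch of what a controlled metadata vocabulary means in practice, the following Python fragment checks a sample record against a small set of required fields and permitted terms. The field names, the permitted terms, and the `validate_record` helper are all hypothetical and chosen only for illustration; they are not drawn from any existing standard or repository.

```python
# Hypothetical required fields and controlled vocabularies (illustrative only;
# a real resource would draw these from community-accepted ontologies).
REQUIRED_FIELDS = ("sample_id", "assay", "tissue")
CONTROLLED_VOCAB = {
    "assay": {"WGS", "RNA-seq", "ATAC-seq"},
    "tissue": {"blood", "liver", "brain"},
}


def validate_record(record):
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    # Every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    # Fields with a controlled vocabulary must use one of the permitted terms.
    for field, allowed in CONTROLLED_VOCAB.items():
        value = record.get(field)
        if value and value not in allowed:
            problems.append(f"{field}={value!r} is not in the controlled vocabulary")
    return problems


good = {"sample_id": "S1", "assay": "WGS", "tissue": "blood"}
bad = {"sample_id": "S2", "assay": "wgs"}  # wrong case, and tissue is missing

good_problems = validate_record(good)
bad_problems = validate_record(bad)
```

Checks of this kind are simple to run at the point of data capture, which is one way a submission system could enforce consistent terms before records reach a shared repository.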

The full integration of genomics into medical practice will require informatics and data-science advances that effectively connect the growing body of genomic knowledge to clinical decision-making. To make genomic information readily accessible and broadly useful to clinicians, user-friendly electronic health record-based clinical decision support tools must be created to interact with a variety of clinical data from electronic health record and other data systems (for example, laboratory, pharmacy, and radiology) as well as non-computable reports, such as those provided as portable document format (PDF) files43,44. These efforts require well-curated, highly integrated, and up-to-date knowledgebases that connect genomic information to clinical characteristics, other phenotypic data, and information on family health history45. Reliable risk-stratification and prevention algorithms, including polygenic risk scores (PRSs)46, must be developed and should incorporate both common and rare genomic variants from a broad range of population subgroups, phenotypic data, and environmental information into the risk modelling47. Such algorithms should be evaluated both for their validity across many populations and for their effect on patient outcomes and subsequent healthcare utilization. Finally, it will be important to evaluate new genomics-oriented clinical decision support tools to ensure that they are acceptable to practitioners across the spectrum of clinical disciplines.
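For readers unfamiliar with PRSs, the simplest additive form is a weighted sum of effect-allele counts, with each variant weighted by its estimated per-allele effect size. The Python sketch below illustrates only that basic calculation under stated assumptions: the variant identifiers, weights, and the `polygenic_risk_score` function are hypothetical, and the sketch does not attempt the rare-variant, phenotypic, or environmental modelling called for above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantWeight:
    variant_id: str     # hypothetical identifier, not a real variant
    effect_allele: str  # allele to which the weight applies
    beta: float         # assumed per-allele effect size (for example, a log odds ratio)


def polygenic_risk_score(weights, dosages):
    """Sum beta * dosage over variants that have a genotype.

    `dosages` maps variant_id to the number of effect alleles carried (0 to 2).
    Returns the score and the number of variants that contributed to it.
    """
    score = 0.0
    used = 0
    for w in weights:
        dosage = dosages.get(w.variant_id)
        if dosage is None:
            continue  # variant not genotyped for this individual; skipped
        if not 0.0 <= dosage <= 2.0:
            raise ValueError(f"dosage for {w.variant_id} must lie between 0 and 2")
        score += w.beta * dosage
        used += 1
    return score, used


# Illustrative weights and genotypes (made up for this example).
weights = [
    VariantWeight("var_A", "T", 0.20),
    VariantWeight("var_B", "G", -0.10),
    VariantWeight("var_C", "A", 0.05),
]
dosages = {"var_A": 2, "var_B": 1}  # var_C is missing and is skipped

score, used = polygenic_risk_score(weights, dosages)
```

Reporting the number of contributing variants alongside the score is a deliberate choice here: a raw score is hard to interpret when many weighted variants are missing, and scores are in any case meaningful only relative to a reference distribution from an appropriate population.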

Genomics and society

Understanding the role of genomics in human health requires knowledge and insights about how social, environmental, and genomic risk factors interact to produce health outcomes48,49 (Box 1). Given that such interactions are, in general, poorly understood, it is crucial that studies of genomic risk (particularly of common, complex diseases) account for the social and environmental factors that influence health and disease50. These factors must be properly described, measured, and incorporated in genomic studies51. Optimal implementation of genomic medicine will require an understanding of how the intersectional aspects of people’s social and political identities influence the ways in which populations are described in research. Such knowledge will, in turn, provide clarity about the interrelationships among these many influences on health and disease.

People want to be able to make well-informed decisions about their genomic data, leading to the engagement efforts in initiatives such as the UK Biobank52 and the ‘All of Us’ Research Program53. Partnering with communities and individuals is fundamental to engaging participants in such large-scale research. Genomics researchers must incorporate models and methods of community engagement in their experimental design. Such studies must be appropriately adapted for different cultures and designed to reduce inequities and healthcare disparities; they must also be accompanied by effective information dissemination54. An unrelenting focus on the optimal ways to conduct research in partnership with data stakeholders and communities would ensure the identification of the key issues and values influencing people’s choices about the provision of personal data for research55,56. Data-stewardship infrastructures that integrate appropriate policies, technologies57, and governance and legal frameworks must be developed and assessed to ensure alignment between communities’ and individuals’ decisions about their data and the practices of researchers and clinicians.

To fully realize the benefits of genomic advances, a working understanding of the basic concepts of genomics will be important for science educators58, healthcare professionals59, policymakers, and the public60. Several educational strategies will inevitably be required to enhance the genomic literacy of these heterogeneous groups, which points to the need for innovative approaches that are shared, assessed, and improved over time58. A growing evidence base shows that increasing the understanding of key genomics concepts and applications attracts students to careers in genomics61, assists with the use of genomics for addressing health disparities62, and facilitates the uptake of genomic medicine63. Curricula for enhancing genomic literacy must be designed to be accessible, effective, and scalable for use in the full range of settings where genomics education is provided, including primary and secondary schools, science museums, and informal science-education venues. Researchers and educators must also disseminate information about both the science of genomics and the key ethical and societal implications of genomics64.

Training and genomics workforce development

Appropriate skills in data science and data stewardship are now prerequisites for becoming a genomics researcher65. Furthermore, given the ever-expanding use of genomics in basic, translational, social, behavioural, and clinical research, a greater number of scientists will require fundamental data-science skills that are appropriate for the genomic applications being used66. Establishing and maintaining data-science competencies for conducting genomics research requires a series of interrelated educational and training efforts67, including the recruitment of many data scientists into genomics and the reciprocal exchange of expertise between genomics researchers and data scientists.

Moving into healthcare, providers must be poised to manage questions from patients who receive genomic information, including that from direct-to-consumer (DTC) testing, and this applies to the full spectrum of medical professionals (including nurses, pharmacists, physicians, and other clinicians)68. Education modules tailored to specific user groups should be designed to adapt rapidly to advances in genomics and data-science technologies; these should be available on demand and, where appropriate, integrated into existing clinical systems69. Research on the methodologies for train-the-trainer approaches, implementation of standards and competency-based education, and strategies for enhancing genomic literacy among all healthcare providers at all career stages70 should also be pursued. The involvement of patients, caregivers, educators, professional organizations71, and accreditation boards will be crucial to ensure success. Importantly, cross-training in relevant aspects of genomics must also be available for specialists working in or around healthcare systems, including (but not limited to) those involved in health services research, health economics, law, bioethics, and social and behavioural sciences.

In both research and clinical settings, the global genomics workforce—as with the general biomedical research workforce—falls considerably short of reflecting the diversity of the world’s population (a vivid example of this is seen in the United States72), which limits the opportunity of those systematically excluded to bring their unique ideas to scientific and clinical research73. To attain a diverse genomics workforce, new strategies and programs to reduce impediments to career opportunities in genomics are required, as are creative approaches to promote workforce diversity, leadership in the field, and inclusion practices. Efforts must intentionally include women, underrepresented racial and ethnic groups, disadvantaged populations, and individuals with disabilities. Initiatives should not focus exclusively on early-stage recruitment; instead, they must also include incentives to recruit and retain a diverse workforce at all career stages74 as well as new approaches for cultivating the next generation of genomics practitioners.

Breaking down barriers in genomics

Genomics has benefited enormously from the proactive identification of major obstacles impeding progress and the subsequent focused efforts to break down those barriers. Prototypic successes include the call for a ‘[US]$1,000 human genome sequence’ after completion of the Human Genome Project15 and proposed actions to facilitate genomic medicine implementation in 2011 (ref. 16); in these cases, both the risks of failure and the benefits of success were high. Once again, breaking down barriers, as highlighted in Box 3 and detailed below, would accelerate progress and create new research and clinical opportunities at The Forefront of Genomics.

Laboratory and computational technologies

Advances in DNA synthesis and genome editing allow the field of genomics to progress from largely observational (‘reading DNA’) to more experimental (‘writing’ and ‘editing’ DNA) approaches. Enabling true ‘synthetic genomics’ (that is, the synthesis, modification, and perturbation of nucleic acid sequences at any scale) will allow for more powerful experimental testing of hypotheses about genome variation and function and improve opportunities for linking genotypes to phenotypes75. Genome editing is increasingly being used for practical applications in medicine (such as in gene therapy76), biotechnology, and agriculture. Despite recent triumphs, however, the current approaches are limited in their ability to interrogate genome function at the pathway or network level and to study important phenomena, such as gene regulation and chromosome organization and mechanics, that involve factors that act across large chromosomal (or genomic) distances. Furthermore, radically new capabilities for understanding how the full complement of genomic variation within any individual genome contributes to phenotype should be pursued. Innovative approaches for generating nucleic acid molecules with defined sequences and of any size, coupled with technologies that allow for the concurrent and large-scale perturbation of many genes or simultaneous examination of multiple genomic variants, would be transformative. These advances would benefit from the development of methods to introduce large synthetic constructs into mammalian cells.

In recent years, large human genomics projects have often relied on data generated as part of existing research studies, and emerging approaches involve developing biobanks and organized cohorts77,78,79. Meanwhile, DTC companies are generating substantial amounts of genomic data, and that output is rapidly being eclipsed by the data generated in clinical care settings80. Properly leveraged, these DTC and clinical data offer opportunities for genomics-based studies at unprecedented scales; however, these data are often heavily fragmented, siloed, and mostly outside the purview of genomics researchers and their typical funders81. Eliminating the barriers to accessing these sources of data for conducting research is essential, but this will require resolving issues related to governance, policy infrastructure, and informatics and workflow solutions. Approaches are needed to mitigate the resulting gaps, limitations, and biases within this highly distributed data environment (for example, with regard to population diversity, data-collection strategies, data standards, and data privacy), all while addressing concerns of the patients, participants, and groups. These challenges must be addressed globally81 (Box 1), so as to accommodate differences in healthcare systems and views about data privacy. In addition, the healthcare stakeholders should take advantage of opportunities offered by genomics, thereby enabling virtuous-cycle routes between genomic learning healthcare systems and basic genomics research82 (Fig. 3).

Fig. 3: Virtuous cycles in human genomics research and clinical care.

As human genomics has matured as a discipline, productive and connected virtuous cycles of activity have emerged, each self-improving with successive rounds of new advances. The cycle on the left reflects basic genomics research, in which technology innovations spur the collection and analysis of genomics research data, often yielding new knowledge and further hypotheses for testing. The cycle on the right reflects a genomic learning healthcare system, in which the implementation of new genomic medicine practice innovations allows for the collection and analysis of outcomes data, often yielding new genomic knowledge and additional genomics-based strategies for improving the quality of clinical care. Note that the new knowledge emerging from either the left or the right cycle has the potential to feed into the other, creating opportunities for ‘bench to bedside’ and ‘bedside back to bench’ progressions82—both of which are expected to grow in the coming decade.

Biological insights

Despite progress in identifying genomic variants that cause monogenic traits or are statistically associated with complex phenotypes, determining the connection of specific variants to phenotypes remains challenging83. Systematic approaches, including tactics that connect high-throughput molecular readouts of functional genomic assays to organismal phenotypes, are required to establish the phenotypic consequences of all genomic variants—individually and in combination—in a cell-type context across the life span84. Progress in this area requires global collaboration85, advances in integrating several data types and performing perturbation assays, protein localization or interaction experiments, and animal models, as well as resources cataloguing information about the fitness consequences of de novo mutations and the clinical relevance of genomic variants83. Because it is not possible to directly test every variant in all cell types and states, developmental stages, and disease processes, new data-collection strategies and analytical approaches are needed that can generalize and adapt predictions to new contexts, handle sparse data, and prioritize variants for experimental follow-up.

Recent advances have led to a greater appreciation of the extent of mosaicism—that is, genomic variation among cells (both somatic and germline) within an individual. Although there have been remarkable advances in understanding the somatic genomic changes encountered in cancer86, there is a paucity of detailed knowledge about other effects of mosaicism beyond a few well-studied examples87. Important areas of future research include investigating the prevalence and extent of different forms of mosaic variation in both nuclear and mitochondrial DNA, the mechanisms that generate mosaicism, and the roles of mosaicism in physiology and human disease. Such efforts might reveal whether this form of genomic variation contributes to variable penetrance and expressivity, comprises a form of genetic epistasis, explains any currently undiagnosed diseases or sporadic cases (or apparent phenocopies) of known inherited diseases9, or can inform the design of therapies for genetic diseases. Single-cell genomic technologies have extended knowledge about the functional effects of mosaicism in different experimental systems88,89, with the next challenge being to translate such single-cell understanding to in vivo settings. The development of laboratory and clinical approaches to readily detect genomic mosaicism at high spatial and temporal resolutions, especially in non-invasive ways (for example, requiring minimal amounts of tissue), would be catalytic.

Implementation science

A crucial barrier to using genomics for improving health and preventing disease is the lack of clinical uptake of proven genomic interventions. Implementation science approaches are needed to identify the most effective methods and strategies for facilitating the use of evidence-based genomic applications, most notably pharmacogenomics-based selection of medications90, in routine clinical care. New experimental designs, such as genotype-specific participant recruitment91 or integration of patient-provided genomic data92 (captured during previous healthcare encounters or from DTC sources), should be explored for their potential to speed adoption and limit costs. The effectiveness of centralized resources for genomic referrals (for example, genomic medicine specialists, consult services93,94, and centres of excellence in undiagnosed diseases, akin to transplantation centres or cancer centres) should be explored as potential stepping stones to the more generalized uptake of genomics in clinical care. Strategies for deploying the limited workforce of highly trained genetics or genomics specialists (for example, systematic referral networks or telemedicine or telecounseling) should also be evaluated for their effectiveness at increasing the availability of services broadly, as opposed to being limited to select, highly specialized centres.

Universal newborn genetic screening may represent the most visible and successful approach to population-based identification of serious and treatable inherited conditions, but population screening across the lifespan for other genetic conditions is less widely accepted. Standard public health screening approaches for the US Centers for Disease Control and Prevention Tier 1 conditions95,96 (for example, Lynch syndrome, hereditary breast and ovarian cancer, and familial hypercholesterolemia) identify people at risk through blood relatives of affected individuals (referred to as ‘cascade testing’ by geneticists97). Implementation research methods, coupled with effective science communication, are primed for optimizing approaches to engage individuals in genetic testing for these disorders, in addition to other emerging indications, such as genetic predisposition to adverse drug effects (pharmacogenomics), carrier testing of prospective parents, use of PRSs in disease detection and prevention46, and genomic indicators (for example, gene-expression and epigenetic patterns) of exposure to infectious pathogens98 and other environmental agents.

Compelling genomics research projects

The field of genomics has routinely benefited from a willingness to articulate ambitious—often audacious—research efforts that aim to address questions and acquire knowledge that (at the time) may seem out of reach. Such boldness has served to stimulate interest in emerging opportunities, recruit new expertise, galvanize international collaborations involving several funders, and propel the field forward. Although by no means comprehensive, the areas highlighted in Box 4 and detailed below illustrate the broadening range of compelling research projects that are ripe for pursuit at The Forefront of Genomics.

Advances in understanding gene regulation5,24, the myriad functional roles of RNA99, and the multi-dimensional nature of the nucleome100, coupled with the use of single-cell genomic approaches30,31 and anticipated new technological and computational capabilities for analysing genomic datasets and variants, provide an unprecedented opportunity to decipher the individual and combined roles of each gene and regulatory element. This must start with establishing the function of each human gene, including the phenotypic effects of human gene knockouts. Because genes and regulatory elements do not function in isolation, it is imperative to build robust experimental and computational models that deduce causal relationships and accurately predict cellular and organismal phenotypes using pathway and network models101,102. Analysis methods must address functional redundancy as well as the nearly boundless experimental space and complexity, including cell states and fates, temporal relationships, environmental conditions, and individual genetic background.

Building on the recent successes in unravelling the genetic underpinnings of rare and undiagnosed diseases9, the field is poised to gain a more comprehensive understanding of the genetic architecture of all human diseases and traits10,85. However, myriad complexities can be anticipated. For example, any given genomic variant(s) may affect more than one disease or trait (that is, pleiotropy); can confer disease risk or reduce it; and can act additively, synergistically, and/or through intermediates. New methods to analyse data that account for human diversity103, coupled with a growing clarity about genotype–phenotype relationships, must be developed to deduce associations and interactions among genomic variants and environmental factors, improve estimates of penetrance and expressivity, and enhance the clinical utility of genomic information for predicting risk, prognosis, treatment response, and, ultimately, clinical outcomes.

Prioritizing the generation of genomic and corresponding phenotypic data from ancestrally diverse participants is a scientific imperative104 and essential for achieving equitable benefits from genomic advances105 (Box 1). However, this is an area in which genomics has repeatedly fallen short19, leading to missed opportunities for understanding genome structure and function, identifying variants conferring risk for common diseases106, and implementing genomic medicine for the benefit of all107,108,109. Ideally, studies should be designed for different groups, adapted for local sensibilities and situations, and consistent in capturing key information beyond participants’ ancestry (for example, the physical and social environments in which they live and receive healthcare110). Leveraging new insights from studies of diverse populations will require the development of robust methods for identifying signatures of natural selection, performing genotype imputation, mapping disease loci, characterizing genomic variant pathogenicity, and calculating PRSs103,109. Success in these efforts will yield a more-complete understanding of how the human genome functions in different environments and offer benefit to those participating in genomics research. Attaining the level of population diversity that will truly benefit all people requires bold scientific and community-based leadership, dedicated resources from funders, highly committed researchers, and effective partnerships that earn the trust of diverse groups of participants and their communities.

As genomics has grown in medicine and society, its potential to influence people’s actions has also expanded. Increasingly, genomics has affected concepts of health, disease, responsibility, family, identity, and community, raising many important and changing questions. When and how is genomic information shared and communicated within families111? Will the identification of a strong genetic risk for a disease change a person’s perception of their health or others’ perception of that person? As some genetic risks are more common in certain identifiable populations, what role does group affiliation have in how risk is communicated and perceived, including potential group stigmatization? Research that catalogues, analyses, and measures the effect of genomics on individuals, families, and communities is important to provide a more informed context to avoid future misrepresentations, misunderstandings, and misuses of genomics54. Finally, researchers must appreciate how their own backgrounds and experiences shape their interpretations of genomic data112.

Extending genomics research in clinical settings beyond DNA sequence to include other multi-omic data, together with clinical variables and outcomes, would advance understanding of disease onset and progression and may also prove important for drug-discovery efforts113,114. This would require tissue- and cell-specific analyses that integrate these data, providing real-time snapshots of biological and disease processes. For clinical applicability and adoption, these high-dimensional, multi-omic data should be integrated with clinical decision support tools and electronic health records. Ultimately, such efforts could reveal important relationships among genomic, environmental, and behavioural variation and facilitate a transition of the use of genomics in medicine from diagnosing and treating disease to maintaining health.

Sharp barriers between research and clinical care obstruct the virtuous cycle of moving scientific discoveries rapidly into clinical care and bringing clinical observations back to the research setting82 (Fig. 3). Learning healthcare systems—in which real-time data on outcomes of healthcare delivery are accessed and used to enhance clinical practice—can lead to continuous care improvement, but only if the barriers between research and clinical care are reduced115. For example, offering genome sequencing to all members of a healthcare system, performed in conjunction with research and participant engagement and provided in real time81, could help to assess the clinical utility of genomic information and may allow providers to improve disease diagnosis and management. System-wide implementation of such an experiment requires not only extensive patient and provider education, sophisticated informatics capabilities, and genomics-based clinical decision support, but also the development and evaluation of data security and privacy protections to ensure patient confidentiality116. Patients should be engaged in the design of such systems and informed at entry to them (and periodically thereafter), so as to be fully aware of the nature of the ongoing research with their clinical data and the goals and potential risks of their participation117. Extending such studies across many healthcare systems should reveal common challenges and solutions118,119, thereby enhancing the learning healthcare model for genomic medicine more broadly (Fig. 3).

Concluding thoughts

The dawn of genomics featured the launch of the Human Genome Project in October 1990 (ref. 1). Three decades later, the field has seen stunning technological advances and high-profile programmatic successes, which in turn have led to the widespread infusion of genomic methods and approaches across the life sciences and, increasingly, into medicine and society.

NHGRI has for the third time15,16 since the Human Genome Project undergone an extensive horizon-scanning process to capture, synthesize, and articulate the most compelling strategic opportunities for the next phase of genomics—with particular attention to elements that are most relevant to human health. The now near-ubiquitous nature of genomics (including in the complex healthcare ecosystem) presented practical challenges for attaining a holistic assessment of the field. Another reality was that the NHGRI investment in genomics has now been multiplied many-fold by the seeding of human genomics throughout the broader research community. These changes reflect a continued maturation of both the field (in general) and NHGRI (more specifically), nicely aligning with the institute’s evolving leadership role at The Forefront of Genomics.

Embracing that role, NHGRI formulated the strategic vision described here, which provides an optimistic outlook that the successes in human genomics over the past three decades will be amplified in the coming decade. Many of the details about what is needed to fulfil the promise of genomics have now come into focus. Major unsolved problems remain—among them determining the role for the vast majority of functional elements in the human genome (especially those outside of protein-coding regions), understanding the full spectrum of genomic variation (especially that implicated in human disease), developing data-science capabilities (especially those that keep pace with data generation), and improving healthcare through the implementation of genomic medicine (especially in the areas of prevention, diagnosis, and therapeutic development). The new decade also brings research questions related to the societal implications of genomics, including those related to social inequities, pointing to the continued importance of investigating the ethical, legal, and social issues related to genomics. But now more than ever, solutions to these problems seem to be within striking distance. Towards that end (and with the characteristic spirit of genomics audacity), we offer ten bold predictions of what might be realized in human genomics by 2030 (Box 5).

The strategic vision articulated here was crafted on behalf of the field of human genomics and emphasizes broad strategic goals as opposed to implementation tactics. The realization of these goals will require further planning in conjunction with the collective creativity, energies, and resources of the global community of scientists, funders, and research participants. NHGRI has taken some initial steps to implement this vision, although these will inevitably need to be adapted as advances occur and circumstances change. Indeed, the final words of this strategic vision were formulated as the world moved urgently to deal with the coronavirus disease 2019 (COVID-19) pandemic (see below), providing a vivid reminder of the need to be nimble and the importance of nurturing all parts of the research continuum—from basic to translational to clinical—for protecting public health and advancing medical science.

Despite the seismic changes seen in genomics since the inception of the field, the fundamental sense of curiosity, marvel, and purpose associated with genome science seems to be timeless. In concluding NHGRI’s previous strategic vision16 (published just under a decade ago), the then-envisioned opportunities and challenges were provided with “… a continuing sense of wonder, a continuing need for urgency, a continuing desire to balance ambition with reality, and a continuing responsibility to protect individuals while maximizing the societal benefits of genomics….” With the 2020 strategic vision described here providing a thoughtful guide and with enduring feelings of wonder, urgency, ambition, and social consciousness providing unfettered momentum, we are ready to embark on the next exciting phase of the human genomics journey.

Epilogue: COVID-19 and genomics

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged as a global threat to public health at the end of the multi-year process that generated the above strategic vision. Nonetheless, the COVID-19 pandemic provides a potent lesson about how a tiny string of nucleic acids can wreak global havoc on humankind. Understanding the mechanisms involved in the transmission of the virus and in viral invasion and clearance, as well as the highly variable and at times disastrous physiological responses to infection, is fertile ground for genomics research. Genomics rapidly assumed crucial roles in COVID-19 research and clinical care in areas such as (1) the deployment of DNA- and RNA-sequencing technologies for diagnostics, tracking of viral isolates, and environmental monitoring; (2) the use of synthetic nucleic acid technologies for studying SARS-CoV-2 virulence and facilitating vaccine development; (3) the examination of how human genomic variation influences infectivity, disease severity, vaccine efficacy, and treatment response; (4) the adherence to principles and values related to open science, data sharing, and consortia-based collaborations; and (5) the provision of genomic data science tools to study COVID-19 pathophysiology. The growing adoption of genomic approaches and technologies into myriad aspects of the global response to the COVID-19 pandemic serves as another important and highly visible example of the integral and vital nature of genomics in modern research and medicine.


  1. The Human Genome Project; (accessed 28 June 2020)

  2. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

  3. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 431, 931–945 (2004).

  4. NHGRI. The cost of sequencing a human genome; (accessed 12 June 2020)

  5. Moore, J. E. et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).

  6. Shema, E., Bernstein, B. E. & Buenrostro, J. D. Single-cell and single-molecule epigenomics to uncover genome regulation at unprecedented resolution. Nat. Genet. 51, 19–25 (2019).

  7. The 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  8. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020). Analysis of a large dataset of exome sequences, yielding important descriptions of the extent and nature of human genomic variation and insights into protein evolution.

  9. Posey, J. E. et al. Insights into genetics, human biology and disease gleaned from family based genomic studies. Genet. Med. 21, 798–812 (2019).

  10. Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).

  11. Manolio, T. A. et al. Opportunities, resources, and techniques for implementing genomics in clinical care. Lancet 394, 511–520 (2019).

  12. Mardis, E. R. The impact of next-generation sequencing on cancer genomics: from discovery to clinic. Cold Spring Harb. Perspect. Med. 9, a036269 (2019).

  13. Bianchi, D. W. & Chiu, R. W. K. Sequencing of circulating cell-free DNA during pregnancy. N. Engl. J. Med. 379, 464–473 (2018).

  14. Wright, C. F., FitzPatrick, D. R. & Firth, H. V. Paediatric genomics: diagnosing rare disease in children. Nat. Rev. Genet. 19, 253–268 (2018).

  15. Collins, F. S., Green, E. D., Guttmacher, A. E. & Guyer, M. S. A vision for the future of genomics research. Nature 422, 835–847 (2003).

  16. Green, E. D. & Guyer, M. S. Charting a course for genomic medicine from base pairs to bedside. Nature 470, 204–213 (2011).

  17. McEwen, J. E. et al. The Ethical, Legal, and Social Implications Program of the National Human Genome Research Institute: reflections on an ongoing experiment. Annu. Rev. Genomics Hum. Genet. 15, 481–505 (2014).

  18. Burke, W. et al. The translational potential of research on the ethical, legal, and social implications of genomics. Genet. Med. 17, 1–9 (2014).

  19. Popejoy, A. B. & Fullerton, S. M. Genomics is failing on diversity. Nature 538, 161–164 (2016). Comprehensive analysis of genome-wide association studies, demonstrating continued severe underrepresentation of individuals of African and Latin American ancestry and Indigenous peoples.

  20. Wolf, S. M. et al. Integrating rules for genomic research, clinical care, public health screening and DTC testing: creating translational law for translational genomics. J. Law Med. Ethics 48, 69–86 (2020).

  21. Adam, D. The promise and peril of the new science of social genomics. Nature 574, 618–620 (2019). Summary of recent studies examining the genetics of bio-behavioural traits, highlighting dangers to groups and society of over-interpreting results in this new field.

  22. Dias, R. & Torkamani, A. Artificial intelligence in clinical and genomic diagnostics. Genome Med. 11, 70 (2019).

  23. Schloss, J. A., Gibbs, R. A., Makhijani, V. B. & Marziali, A. Cultivating DNA sequencing technology after the human genome project. Annu. Rev. Genomics Hum. Genet. 21, 117–138 (2020). Retrospective overview of the NHGRI program for advancing DNA-sequencing technologies, the goal of which was to reduce the cost of sequencing a human genome to $1,000.

  24. ENCODE: Encyclopedia of DNA Elements; (accessed 24 June 2020).

  25. Risca, V. I. & Greenleaf, W. J. Unraveling the 3D genome: genomics tools for multiscale exploration. Trends Genet. 31, 357–372 (2015).

  26. Logsdon, G. A., Vollger, M. R. & Eichler, E. E. Long-read human genome sequencing and its applications. Nat. Rev. Genet. (2020).

  27. Miga, K. H. et al. Telomere-to-telomere assembly of a complete human X chromosome. Nature 585, 79–84 (2020). Demonstration of the use of emerging DNA-sequencing technologies, analysis methods, and validation routines to produce the first gapless de novo assembly of a human chromosome sequence.

  28. Human Pangenome Reference Consortium. Diverse human references drive genomic discoveries for everyone; (accessed 29 June 2020)

  29. Zerbino, D. R., Frankish, A. & Flicek, P. Progress, challenges, and surprises in annotating the human genome. Annu. Rev. Genomics Hum. Genet. 21, 55–79 (2020).

  30. Rood, J. E. et al. Toward a common coordinate framework for the human body. Cell 179, 1455–1467 (2019).

  31. Stuart, T. & Satija, R. Integrative single-cell analysis. Nat. Rev. Genet. 20, 257–272 (2019).

  32. Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).

  33. Schreiber, J., Durham, T., Bilmes, J. & Noble, W. S. Avocado: a multi-scale deep tensor factorization method learns a latent representation of the human epigenome. Genome Biol. 21, 81 (2020).

  34. Cunningham, F. et al. Ensembl 2019. Nucleic Acids Res. 47 (D1), D745–D751 (2019).

  35. Lewin, H. A. et al. Earth BioGenome Project: Sequencing life for the future of life. Proc. Natl Acad. Sci. USA 115, 4325–4333 (2018).

  36. Lindblad-Toh, K. What animals can teach us about evolution, the human genome, and human disease. Ups. J. Med. Sci. 125, 1–9 (2020).

  37. Schatz, M. C. Biological data sciences in genome research. Genome Res. 25, 1417–1422 (2015).

  38. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016). Description of foundational principles to improve data sharing and stewardship by ensuring that biomedical research data (including genomic data) are findable, accessible, interoperable, and reusable.

  39. Anderson, W. et al. Towards coordinated international support of core data resources for the life sciences. Preprint at (2017).

  40. Grossman, R. L. Data lakes, clouds, and commons: a review of platforms for analyzing and sharing genomic data. Trends Genet. 35, 223–234 (2019).

  41. Haendel, M. A., Chute, C. G. & Robinson, P. N. Classification, ontology, and precision medicine. N. Engl. J. Med. 379, 1452–1462 (2018).

  42. Martínez-Romero, M. et al. Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases. Database (Oxford) 2019, 59 (2019).

  43. Levy, K. D. et al. Opportunities to implement a sustainable genomic medicine program: lessons learned from the IGNITE Network. Genet. Med. 21, 743–747 (2019).

  44. Williams, M. S. et al. Genomic information for clinicians in the electronic health record: Lessons learned from the clinical genome resource project and the electronic medical records and genomics network. Front. Genet. 10, 1059 (2019).

  45. Lemke, A. A. et al. Primary care physician experiences utilizing a family health history tool with electronic health record-integrated clinical decision support: an implementation process assessment. J. Community Genet. 11, 339–350 (2020).

  46. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018). Development and validation of genome-wide polygenic scores that identify population subsets with risk levels equivalent to monogenic genomic variants that are commonly reported and acted upon.

  47. Zeggini, E., Gloyn, A. L., Barton, A. C. & Wain, L. V. Translational genomics and precision medicine: Moving from the lab to the clinic. Science 365, 1409–1413 (2019).

  48. Koehly, L. M. et al. Social and behavioral science at the forefront of genomics: discovery, translation, and health equity. Soc. Sci. Med. 112450, 112450 (2019).

  49. Khan, S. S., Cooper, R. & Greenland, P. Do polygenic risk scores improve patient selection for prevention of coronary artery disease? J. Am. Med. Assoc. 323, 614–615 (2020).

  50. Mostafavi, H. et al. Variable prediction accuracy of polygenic scores within an ancestry group. eLife 9, 1–52 (2020).

  51. Morris, T. T., Davies, N. M., Hemani, G. & Smith, G. D. Population phenomena inflate genetic associations of complex social traits. Sci. Adv. 6, eaay0328 (2020).

  52. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  53. Denny, J. C. et al. The “All of Us” Research Program. N. Engl. J. Med. 381, 668–676 (2019).

  54. Garrison, N. A. et al. Genomic research through an indigenous lens: understanding the expectations. Annu. Rev. Genomics Hum. Genet. 20, 495–517 (2019). Discussion of issues related to conducting genomics research with Indigenous peoples, coupled with suggestions for respecting tribal governance and protecting Indigenous people from group harms.

  55. Sanderson, S. C. et al. Public attitudes toward consent and data sharing in biobank research: a large multi-site experimental survey in the US. Am. J. Hum. Genet. 100, 414–427 (2017). Survey results from 13,000 individuals regarding participation in research in which their data are shared with others, yielding insight into factors that predict a willingness of people to participate in research and concerns about data privacy.

  56. Milne, R. et al. Trust in genomic data sharing among members of the general public in the UK, USA, Canada and Australia. Hum. Genet. 138, 1237–1246 (2019).

  57. Grishin, D., Obbad, K. & Church, G. M. Data privacy in the age of personal genomics. Nat. Biotechnol. 37, 1115–1117 (2019).

  58. Genomic Literacy, Education and Engagement Initiative; (accessed 29 June 2020)

  59. Manolio, T. A. & Murray, M. F. The growing role of professional societies in educating clinicians in genomics. Genet. Med. 16, 571–572 (2014).

  60. Krakow, M., Ratcliff, C. L., Hesse, B. W. & Greenberg-Worisek, A. J. Assessing genetic literacy awareness and knowledge gaps in the US population: results from the health information national trends survey. Public Health Genomics 20, 343–348 (2017).

  61. LaRue, K. M., McKernan, M. P., Bass, K. M. & Wray, C. G. Teaching the genome generation: bringing modern human genetics into the classroom through teacher professional development. J. STEM Outreach 1, 48–60 (2018).

  62. Mboowa, G. & Sserwadda, I. Role of genomics literacy in reducing the burden of common genetic diseases in Africa. Mol. Genet. Genomic Med. 7, e00776 (2019).

  63. Veilleux, S., Bouffard, M. & Bourque Bouliane, M. Patient and health care provider needs and preferences in understanding pharmacogenomic and genomic testing: a meta-data analysis. Qual. Health Res. 30, 43–59 (2020).

  64. Kung, J. & Wu, C.-T. Leveling the playing field: closing the gap in public awareness of genetics between the well served and underserved. Hastings Cent. Rep. 46, 17–20 (2016).

  65. Stephens, Z. D. et al. Big data: astronomical or genomical? PLoS Biol. 13, e1002195 (2015).

  66. Attwood, T. K., Blackford, S., Brazas, M. D., Davies, A. & Schneider, M. V. A global perspective on evolving bioinformatics and data science training needs. Brief. Bioinform. 20, 398–404 (2019).

  67. Genomics Education Partnership; (accessed 16 June 2020).

  68. Campion, M., Goldgar, C., Hopkin, R. J., Prows, C. A. & Dasgupta, S. Genomic education for the next generation of health-care providers. Genet. Med. 21, 2422–2430 (2019).

  69. McClaren, B. J. et al. Development of an evidence-based, theory-informed national survey of physician preparedness for genomic medicine and preferences for genomics continuing education. Front. Genet. 11, 59 (2020).

  70. Dougherty, M. J., Wicklund, C. & Johansen Taber, K. A. Challenges and opportunities for genomics education: Insights from an Institute of Medicine Roundtable Activity. J. Contin. Educ. Health Prof. 36, 82–85 (2016).

  71. NHGRI. Inter-Society Coordinating Committee for Practitioner Education in Genomics; (accessed 16 June 2020).

  72. Valantine, H. A., Collins, F. S. & Verma, I. M. National Institutes of Health addresses the science of diversity. Proc. Natl Acad. Sci. USA 112, 12240–12242 (2015).

  73. Hofstra, B. et al. The diversity–innovation paradox in science. Proc. Natl Acad. Sci. USA 117, 9284–9291 (2020). Study of the US doctorate recipients from 1977 to 2015, identifying new contributions by gender and racial or ethnic minority scholars, evidence for lower rates of recognition by majority scholars, and the resulting diversity–innovation paradox in science.

  74. Martinez, L. R., Boucaud, D. W., Casadevall, A. & August, A. Factors contributing to the success of NIH-designated underrepresented minorities in academic and nonacademic research positions. CBE Life Sci. Educ. 17, ar32 (2018).

  75. Schindler, D., Dai, J. & Cai, Y. Synthetic genomics: a new venture to dissect genome fundamentals and engineer new functions. Curr. Opin. Chem. Biol. 46, 56–62 (2018).

  76. Doudna, J. A. The promise and challenge of therapeutic genome editing. Nature 578, 229–236 (2020). Review of the scientific, technical, and ethical aspects of using CRISPR technology for therapeutic applications in humans.

  77. UK Biobank; (accessed 14 June 2020).

  78. NIH. All of Us; (accessed 14 June 2020).

  79. International HundredK+ Cohorts Consortium (IHCC). Linking cohorts, understanding biology, improving health; (accessed 14 June 2020).

  80. Birney, E., Vamathevan, J. & Goodhand, P. Genomics in healthcare: GA4GH looks to 2022. Preprint at (2017).

  81. Stark, Z. et al. Integrating genomics into healthcare: a global responsibility. Am. J. Hum. Genet. 104, 13–20 (2019).

  82. Manolio, T. A. et al. Bedside back to bench: building bridges between basic and clinical genomic research. Cell 169, 6–12 (2017).

  83. 83.

    Rehm, H. L. et al. ClinGen — The clinical genome resource. N. Engl. J. Med. 372, 2235–2242 (2015).

    CAS  PubMed  PubMed Central  Google Scholar 

  84. 84.

    Starita, L. M. et al. Variant interpretation: functional assays to the rescue. Am. J. Hum. Genet. 101, 315–325 (2017).

    CAS  PubMed  PubMed Central  Google Scholar 

  85. 85.

    International Common Disease Alliance; (accessed 24 June 2020).

  86. 86.

    Welcome to the Pan-Cancer Atlas; (accessed 19 June 2020).

  87.

    Steensma, D. P. et al. Clonal hematopoiesis of indeterminate potential and its distinction from myelodysplastic syndromes. Blood 126, 9–16 (2015).

  88.

    Baslan, T. & Hicks, J. Unravelling biology and shifting paradigms in cancer with single-cell sequencing. Nat. Rev. Cancer 17, 557–569 (2017).

  89.

    D’Gama, A. M. & Walsh, C. A. Somatic mosaicism and neurodevelopmental disease. Nat. Neurosci. 21, 1504–1514 (2018).

  90.

    Roden, D. M. et al. Pharmacogenomics. Lancet 394, 521–532 (2019).

  91.

    Corbin, L. J. et al. Formalising recall by genotype as an efficient approach to detailed phenotyping and causal inference. Nat. Commun. 9, 711 (2018).

  92.

    Savatt, J. M. et al. ClinGen’s GenomeConnect registry enables patient-centered data sharing. Hum. Mutat. 39, 1668–1676 (2018).

  93.

    Eadon, M. T. et al. Implementation of a pharmacogenomics consult service to support the INGENIOUS trial. Clin. Pharmacol. Ther. 100, 63–66 (2016).

  94.

    Darnell, A. J. et al. A clinical service to support the return of secondary genomic findings in human research. Am. J. Hum. Genet. 98, 435–441 (2016).

  95.

    CDC. Public Health Genomics and Precision Health Knowledge Base (v6.4); (accessed 17 June 2020).

  96.

    Dotson, W. D. et al. Prioritizing genomic applications for action by level of evidence: a horizon-scanning method. Clin. Pharmacol. Ther. 95, 394–402 (2014).

  97.

    Hopkins, P. N. Genotype-guided diagnosis in familial hypercholesterolemia: population burden and cascade screening. Curr. Opin. Lipidol. 28, 136–143 (2017).

  98.

    Bierne, H., Hamon, M. & Cossart, P. Epigenetics and bacterial infections. Cold Spring Harb. Perspect. Med. 2, a010272 (2012).

  99.

    Bhat, A. A. et al. Role of non-coding RNA networks in leukemia progression, metastasis and drug resistance. Mol. Cancer 19, 57 (2020).

  100.

    Sparks, T. M., Harabula, I. & Pombo, A. Evolving methodologies and concepts in 4D nucleome research. Curr. Opin. Cell Biol. 64, 105–111 (2020).

  101.

    Young, A. I., Benonisdottir, S., Przeworski, M. & Kong, A. Deconstructing the sources of genotype-phenotype associations in humans. Science 365, 1396–1400 (2019).

  102.

    Mitra, K., Carvunis, A.-R., Ramesh, S. K. & Ideker, T. Integrative approaches for finding modular structure in biological networks. Nat. Rev. Genet. 14, 719–732 (2013).

  103.

    Bien, S. A. et al. The future of genomic studies must be globally representative: perspectives from PAGE. Annu. Rev. Genomics Hum. Genet. 20, 181–200 (2019).

  104.

    Bentley, A. R., Callier, S. L. & Rotimi, C. N. Evaluating the promise of inclusion of African ancestry populations in genomics. NPJ Genom. Med. 5, 5 (2020).

  105.

    Hindorff, L. A. et al. Prioritizing diversity in human genomics research. Nat. Rev. Genet. 19, 175–185 (2018).

  106.

    Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).

  107.

    Landry, L. G., Ali, N., Williams, D. R., Rehm, H. L. & Bonham, V. L. Lack of diversity in genomic databases is a barrier to translating precision medicine research into practice. Health Aff. (Millwood) 37, 780–785 (2018).

  108.

    Manrai, A. K. et al. Genetic misdiagnoses and the potential for health disparities. N. Engl. J. Med. 375, 655–665 (2016). Demonstration of frequent erroneous classification of genomic variants as pathogenic among patients of African or unspecified ancestry that were subsequently re-categorized as benign, with considerable health implications of those misclassifications.

  109.

    Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  110.

    Horowitz, C. R. et al. Successful recruitment and retention of diverse participants in a genomics clinical trial: a good invitation to a great party. Genet. Med. 21, 2364–2370 (2019).

  111.

    Botkin, J. R., Mancher, M., Busta, E. R. & Downey, A. S. Returning Individual Research Results to Participants (National Academies Press, 2018).

  112.

    Lázaro-Muñoz, G. et al. Issues facing us. Am. J. Med. Genet. B. Neuropsychiatr. Genet. 180, 543–554 (2019).

  113.

    Lloyd-Price, J. et al. Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases. Nature 569, 655–662 (2019).

  114.

    Hasin, Y., Seldin, M. & Lusis, A. Multi-omics approaches to disease. Genome Biol. 18, 83 (2017).

  115.

    Chambers, D. A., Feero, W. G. & Khoury, M. J. Convergence of implementation science, precision medicine, and the learning health care system: a new model for biomedical research. J. Am. Med. Assoc. 315, 1941–1942 (2016).

  116.

    Sugano, S. International code of conduct for genomic and health-related data sharing. HUGO J. 8, 1 (2014).

  117.

    Clayton, E. W., Halverson, C. M., Sathe, N. A. & Malin, B. A. A systematic literature review of individuals’ perspectives on privacy and genetic information in the United States. PLoS One 13, e0204417 (2018).

  118.

    Cavallari, L. H. et al. Multi-site investigation of strategies for the clinical implementation of CYP2D6 genotyping to guide drug prescribing. Genet. Med. 21, 2255–2263 (2019).

  119.

    Ginsburg, G. S. A global collaborative to advance genomic medicine. Am. J. Hum. Genet. 104, 407–409 (2019).

Acknowledgements

The strategic vision described here was formulated on behalf of the NHGRI. We are grateful to the many members of the institute staff for their contributions to the associated planning process (see for details) as well as to the numerous external colleagues who provided input to the process and draft versions of this strategic vision. The National Advisory Council for Human Genome Research (current members are J. Botkin, T. Ideker, S. Plon, J. Haines, S. Fodor, R. Irizarry, P. Deverka, W. Chung, M. Craven, H. Dietz, S. Rich, H. Chang, L. Parker, L. Pennacchio, and O. Troyanskaya) ratified the strategic planning process, themes, and priorities associated with this strategic vision.

Author information




All authors contributed to the concepts, writing, and/or revisions of the manuscript.

Corresponding author

Correspondence to Eric D. Green.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Jantina de Vries, Eleftheria Zeggini and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Green, E.D., Gunter, C., Biesecker, L.G. et al. Strategic vision for improving human health at The Forefront of Genomics. Nature 586, 683–692 (2020).
