Introduction

Genomic information is increasingly finding its way into clinical and public health practice, from diagnosis of rare genetic diseases to diagnosis and treatment of common chronic diseases and infections. The new Precision Medicine Initiative1 and related developments will increase the number of people whose genomes are sequenced in the next decade. Meaningful clinical interpretation of emerging information requires the integration of data from basic, clinical, and population studies. Although such data are abundant, they are widely dispersed across the peer-reviewed literature and other online resources. Clinicians and public health professionals need credible, readily accessible information to apply genomic findings in practice.

In 2001, the Centers for Disease Control and Prevention (CDC) Office of Public Health Genomics began to systematically compile and curate an online catalog of published population-based studies of human gene–disease associations. In 2008, the Office of Public Health Genomics launched the Human Genome Epidemiology Navigator (HuGE Navigator)2 as an application for mining the rapidly growing database. By 2015, it contained citations for more than 100,000 scientific publications and had acquired more than 150,000 users (i.e., unique IP addresses). At that time, the scientific agenda and public interest were shifting increasingly from gene discovery to translation, that is, the use of genomic information to develop genome-based tests, drugs, and other applications.3 The Genomic Applications in Practice and Prevention Network was one of several government-sponsored, interdisciplinary efforts to address a perceived “lack of readily accessible information about the utility of most genomic applications and the lack of necessary knowledge by consumers and providers to implement what is known.”4

To help address this need, we have taken what we view as the next step in showing how epidemiologic and other information can be used to improve population health: launching the Public Health Genomics Knowledge Base (PHGKB) (http://phgkb.cdc.gov). Our goal is to organize the information, drawn from a wide variety of sources and in varying formats, that is needed to describe the translational trajectories of genomic discoveries. Thus, although PHGKB’s component databases have different formats and data structures and can be searched individually, a search of PHGKB as a whole returns a single integrated set of results. Here, we briefly describe PHGKB and present an initial cross-sectional analysis of its contents.

Materials and Methods

PHGKB includes publications and other relevant Web-based resources captured by our weekly horizon scan. Some of our methods and early results have been described in previous publications.5,6 The content is indexed and grouped into categories that include practice guidelines, systematic reviews, implementation studies, and applications of genomic tests and family health history classified according to the level of available evidence.7 PHGKB was built using J2EE technology8 and other Java open-source frameworks, including Hibernate9 and Struts.10 As the largest constituent of PHGKB content, the scientific literature is represented by PubMed abstracts indexed with Medical Subject Headings (MeSH) terminology. Use of the MeSH tree hierarchies and the Unified Medical Language System metathesaurus enhances the system’s search capacity.
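One way a MeSH tree hierarchy can broaden a search is by expanding a query term to include every descriptor located beneath it in the tree. The following Python sketch illustrates that general idea only; it is not the PHGKB implementation, which is Java-based. The function names, the record identifiers, and the small descriptor table are hypothetical, and the tree numbers shown are illustrative values that should be checked against the current MeSH release.

```python
# Illustrative subset of MeSH descriptors mapped to tree numbers.
# Assumption: these values are for demonstration and may not match
# the current MeSH release.
MESH_TREE = {
    "Neoplasms": ["C04"],
    "Neoplasms by Site": ["C04.588"],
    "Breast Neoplasms": ["C04.588.180"],
    "Ovarian Neoplasms": ["C04.588.322.455"],
    "Cystic Fibrosis": ["C16.320.190"],
}


def expand_term(term, tree=MESH_TREE):
    """Return the term plus every descriptor beneath it in the tree.

    A descriptor is treated as a descendant when one of its tree numbers
    equals, or extends with a dot-separated suffix, one of the query
    term's tree numbers. Unknown terms yield an empty list.
    """
    prefixes = tree.get(term, [])
    expanded = set()
    for name, numbers in tree.items():
        for number in numbers:
            if any(number == p or number.startswith(p + ".") for p in prefixes):
                expanded.add(name)
    return sorted(expanded)


def search(term, records, tree=MESH_TREE):
    """Return ids of records indexed with the term or any of its descendants."""
    wanted = set(expand_term(term, tree))
    return [r["id"] for r in records if wanted & set(r["mesh"])]


# Hypothetical records standing in for MeSH-indexed abstracts.
RECORDS = [
    {"id": "rec-1", "mesh": ["Breast Neoplasms"]},
    {"id": "rec-2", "mesh": ["Cystic Fibrosis"]},
    {"id": "rec-3", "mesh": ["Ovarian Neoplasms", "Genetic Testing"]},
]

if __name__ == "__main__":
    print(expand_term("Neoplasms"))
    print(search("Neoplasms", RECORDS))
```

In this sketch, a query for a broad heading also retrieves records indexed only with narrower headings, which a plain keyword match on the broad term would miss.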

Results

PHGKB is an open access, Web-based, searchable database that provides access to a spectrum of information on genomics and population health, from basic research to implementation. PHGKB currently consists of nine component databases, including HuGE Navigator (Table 1). Users interested in specific topics can perform a global search of the entire knowledge base or search one or more component databases directly and choose options for customizing the display of their search results. The results of searching all databases in PHGKB are displayed according to steps in the translational pathway from discovery to implementation.11 For example, results of a search for breast cancer (Figure 1) are arrayed from discovery to implementation, with special emphasis on evidence synthesis and guidelines, as well as on CDC products. Each category also includes links to specialized external resources.

Table 1 PHGKB component databases
Figure 1 Results of a search performed on 24 April 2016 for breast cancer information using the Public Health Genomics Knowledge Base.
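The pathway-ordered display of global search results can be thought of as grouping hits from different component databases under an ordered list of stages. The Python sketch below is a minimal illustration of that grouping, not the PHGKB code. The stage names are an assumption loosely based on the categories mentioned in the text, and the hit fields (`source`, `stage`, `title`) and example entries are hypothetical.

```python
# Assumed stage order, from discovery to implementation.
STAGES = ["discovery", "evidence synthesis", "guidelines", "implementation"]


def arrange_by_stage(hits, stages=STAGES):
    """Group hits by stage, keeping stages in pathway order.

    Hits whose stage is not in the list are collected under "other",
    which is placed last. Within a stage, hits keep their input order.
    """
    grouped = {stage: [] for stage in stages}
    grouped["other"] = []
    for hit in hits:
        key = hit["stage"] if hit["stage"] in grouped else "other"
        grouped[key].append(hit["title"])
    return grouped


# Hypothetical hits returned by different component databases.
HITS = [
    {"source": "db-a", "stage": "implementation", "title": "Program evaluation"},
    {"source": "db-b", "stage": "discovery", "title": "Association study"},
    {"source": "db-c", "stage": "guidelines", "title": "Screening recommendation"},
    {"source": "db-b", "stage": "discovery", "title": "Meta-analysis of variants"},
    {"source": "db-d", "stage": "news", "title": "Press item"},
]

if __name__ == "__main__":
    for stage, titles in arrange_by_stage(HITS).items():
        print(stage, titles)
```

Keeping the stage order in one list means the display order can be changed without touching the grouping logic.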

PHGKB also offers users several ways to keep abreast of new information. First, the PHGKB main page features two sections that are updated almost daily: Hot Topics of the Day, curated by domain experts, and What’s New, which displays recent additions to the database and summary statistics. In addition, two weekly e-mail newsletters are available by subscription—Genomics & Health Impact Weekly Scan and Advanced Molecular Detection Clips (focused on human and pathogen genomics, respectively)—which direct users to new content posted on the PHGKB website.

Discussion

Genomics has given rise to many specialized online databases that were designed primarily for use by researchers and other expert users. PHGKB is unique in providing systematically curated and updated information that bridges population-based research with clinical and public health applications. We acknowledge that PHGKB is not comprehensive, especially given the fluid state of translation and implementation research. Most genomic research is still focused on new discoveries; however, the focus of PHGKB is the small fraction—perhaps 1%—of genomics-related publications that address epidemiology, evaluation and evidence synthesis, implementation, and outcomes, as we have described elsewhere.5 Finding these needles in a haystack is important because they are most relevant to population health. As the knowledge base grows, it will become useful for tracing the translational trajectories of specific discoveries into clinical application and population health outcomes. So far, PHGKB has undergone limited pilot testing by selected users at the CDC and in state health departments. With this report, we invite potential users to explore the resource. We intend to conduct additional evaluation studies in the near future.

Genomic literacy is becoming a fundamental requirement for clinical and public health decision makers who have the power to improve patient and population health. We hope that PHGKB offers a useful resource to researchers, policy makers, practitioners, and members of the public who are interested in understanding how genomic research can contribute to better health.

Disclosure

The authors declare no conflict of interest.