Main

Electronic health records (EHRs) provide a number of benefits to the field of clinical genomics. These range from the ability to return results to the practitioner, to the ability to use genetic findings in clinical decision support, to having data collected in the EHR that serve as a source of phenotypic information for analysis purposes. Not all EHRs are created equal, however. They differ in terms of features, capabilities, and ease of use. Even within a given EHR product, no two customers’ implementations are exactly alike because they have different institutional characteristics and clinical workflows. Therefore, to understand the potential of the EHR as a tool to support the practice of clinical genomics and to serve as a source of data for genomics research, it is necessary to understand the capabilities of the EHR as well as the impact that both implementation strategy and institutional characteristics have on usability. We provide an overview of these topics by focusing on the following areas: (i) how the EHR is used to capture data in clinical practice settings; (ii) how the implementation and configuration of the EHR affect the quality and availability of data; (iii) the management of clinical genetic test results and the feasibility of EHR integration; and (iv) the challenges of implementing an EHR in a research-intensive environment. This is followed by a discussion of the minimum functional requirements that an EHR must meet to enable the satisfactory integration of genomic results and the open issues that remain.

Use of the EHR in Clinical Practice Settings

The EHR has become the primary source of data for the practice of clinical genetics and an important source of information for genomic research.

Role of the EHR in the US health-care environment

Although the use of EHR systems to manage patient information is not required by US law, the Meaningful Use (MU) incentives from the Health Information Technology for Economic and Clinical Health (HITECH) Act, enacted as part of the American Recovery and Reinvestment Act of 2009,1 have compelled widespread adoption.2 The net effect of MU3 and new electronic reporting requirements for quality measures4 is that EHRs are becoming de facto requirements for medical practice in the United States. Because the MU program is specifically aimed at the goal of meaningful information exchange (“interoperability”), the adoption of these systems has accelerated the exchange of phenotypic and genetic data. The MU ideal is that all medical information about a patient can travel with the patient regardless of care site. Such ubiquitous, long-term access has important implications for clinical genetics laboratory results, which have been relatively immovable in their traditional format of scanned documents accessible only to the ordering clinician.

As of 2011, 35% of hospitals and 39% of providers had adopted fully functional EHR systems.5 Given that academic centers and large group practices are the most likely to adopt EHRs, it is highly likely that a patient with a genetic condition will have some of his or her genetic information stored in an EHR and will need to transmit those data across different sites of care. The MU program focuses heavily on the requirement to exchange medical records electronically and, in particular, to exchange laboratory results as structured data.3 The net result of the MU program is that there will be a sharp rise in the availability of computable genetic data derivable from clinical sources, as compared with the former state, in which genetic test results existed only in the form of scanned document images.

One might be tempted to imagine that these genetic laboratory data would be accompanied by a large amount of discrete phenotypic data (historical findings, physical examination descriptions, family history data, etc.), but the MU program does not require that. In fact, such data are likely to remain in free-text form indefinitely. This is because the methods required to record such findings as discrete data are both time consuming and expressively restrictive. Stage 2 of the MU requirements (for which providers will begin qualifying in 2014) requires discrete family history data in primary relatives of only 20% of patients and requires no discrete data capture of the patient’s historical information or physical examination findings. Factor in the highly variable extent to which providers implement fully functional systems, and it appears that a totally computable medical record may never exist in the general clinical setting.

One factor that affects the variability in the implementation of fully functional systems is the varying level of enthusiasm for this method of managing patient data. Providers seem generally satisfied with their EHR implementations,6,7,8 and there are scattered reports on the value of various subcomponents of EHRs;9 however, there remains a lack of evidence that these systems result in direct positive effects on patient care.10,11 Most of the literature on the positive effects of EHRs comes from limited trials of homegrown systems designed specifically to solve a problem that is the focus of the study.9 Such results have not generally been replicated in the more typical case of the vendor-provided system, but studies are beginning to emerge.12 Another factor is cost. Despite large incentives available through the MU program, the cost (for software licensing, hardware, and training) is still substantial. The American Academy of Family Physicians estimates that the average cost to implement an EHR in primary-care practice is $40,000 per provider, with additional maintenance costs.13 Estimates for specialty providers do not exist.

Variation in the quality of data in the EHR

As mentioned above, documentation within the EHR generally falls into two categories: discrete data and free text. Discrete elements are those that can be captured within a structured data collection form. Common discrete elements include allergies, immunizations, encounter diagnoses, and problems. Less commonly available as discrete data elements are presenting findings and family, surgical, and social histories. By capturing data in a discrete field, it is possible to use the information for reporting, analytics, and decision support. Even so, a large amount of information within the EHR is captured in the form of free-text notes. This leads to a large amount of variation in the quality and completeness of the documentation because practitioners often exercise choice in whether to record a finding as data. Only in the case in which there is a financial penalty for missing data (e.g., failing to record an encounter diagnosis) can one assume reliable data entry. Natural-language processing is emerging as a method for EHR data analysis14 but is still far from being reliable enough to replace prospective data collection for particular purposes.15

Even in areas in which EHR adoption is widespread, the only clinical data that are collected reliably enough to permit analysis remain the data that are submitted to payers in the form of claims (and the reliability of these data is contingent on the accuracy of the provider). Other data in the record, such as clinical descriptions of phenotypes, remain in free text or are gathered so sporadically that valid analysis is not possible. Claims data within the EHR will typically be coded to one of the following terminologies, depending on domain:

  • ICD-9-CM:16 The International Classification of Diseases, Ninth Revision, Clinical Modification has been used in the United States for decades to encode diagnoses in claims. This system is a classification system, which means that an item is a member of only one hierarchy. For example, streptococcal meningitis is a type of meningitis but not a type of streptococcal infection. The inflexibility of a classification system reduces ambiguity—a desirable feature in a system used to submit administrative data efficiently—but severely limits expressivity. Still, because of the long familiarity we have with ICD-9, it is often used to describe phenotypes.

  • ICD-10: As of this writing, all diagnosis data submitted with claims for payment must be encoded with the World Health Organization’s ICD-10 system beginning 1 October 2014.17 Although the rest of the industrial world moved to ICD-10 long ago, the United States has delayed its adoption due to concerns about the cost of modifying systems to accept these codes. Once ICD-10 is in place, researchers hoping to use EHR data will have a somewhat richer terminology system to use (the 6,969 diagnosis terms in ICD-9 go to 12,420 in ICD-10).18 Still, it is a classification scheme, so expressivity is limited.

  • SNOMED-CT:19 The Systematized Nomenclature of Medicine—Clinical Terms is an extensive clinical terminology standard endorsed by the MU program as a general-purpose reference terminology system20 that includes diagnoses, conditions, historical findings, examination findings, and test results. EHRs typically map ICD codes to SNOMED-CT (through proprietary terminology systems) to provide a more flexible method of expressing diagnostic terms. These codes may not be visible to the clinical user, but they are present in the data that one may extract from an EHR.

  • Current Procedural Terminology:21 Current Procedural Terminology is a code set maintained by the American Medical Association that is used to describe medical, surgical, and diagnostic services that are rendered during a visit or hospital stay.

  • Diagnosis-related groups:22 These are used to classify hospital cases into groups. A case is assigned to a diagnosis-related group by a “grouper” program that makes decisions on the basis of factors such as ICD diagnosis, Current Procedural Terminology procedure, age, sex, discharge status, and comorbidities.

  • LOINC:23 Logical Observation Identifiers Names and Codes is a set of standards for identifying laboratory and clinical observations. The certification standards for EHRs promulgated by the Office of the National Coordinator for Health IT require that EHRs be able to accept LOINC. This will allow the exchange of laboratory results for the purposes of MU.

  • RxNorm:24 The National Library of Medicine developed this standard naming terminology, which assigns a unique concept identifier to each set of synonymous drug terms compiled from several source drug vocabularies. Although adoption of RxNorm is not yet widespread, it is endorsed in EHR certification guidelines for MU, so it is expected to be widely available in the next few years. RxNorm includes semantic relationships between concepts, such as “has-ingredient” for combination preparations or “has-trade name” to identify brands.
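
The practical difference between a classification such as ICD and a reference terminology such as SNOMED-CT can be sketched in a few lines of Python. This is a minimal illustration only: the term labels stand in for real codes, and the parent links are simplified assumptions rather than extracts from either system. It shows why a query for "all streptococcal infections" misses streptococcal meningitis when each term is allowed only one parent.

```python
# Illustrative only: labels stand in for real codes, and the parent links
# are simplified assumptions, not extracts from ICD or SNOMED-CT.

# Classification (ICD-style): every term has exactly one parent.
CLASSIFICATION_PARENT = {
    "streptococcal meningitis": "bacterial meningitis",
    "bacterial meningitis": "meningitis",
    "meningitis": "nervous system disease",
    "streptococcal infection": "bacterial infection",
}

# Reference terminology (SNOMED-CT-style): a term may have several parents.
POLYHIERARCHY_PARENTS = {
    "streptococcal meningitis": {"bacterial meningitis", "streptococcal infection"},
    "bacterial meningitis": {"meningitis", "bacterial infection"},
    "streptococcal infection": {"bacterial infection"},
    "meningitis": {"nervous system disease"},
}


def ancestors_single(term, parent_of):
    """Follow the single chain of parents in a classification."""
    found = set()
    while term in parent_of:
        term = parent_of[term]
        found.add(term)
    return found


def ancestors_multi(term, parents_of):
    """Collect every ancestor reachable through any parent."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


icd_style = ancestors_single("streptococcal meningitis", CLASSIFICATION_PARENT)
snomed_style = ancestors_multi("streptococcal meningitis", POLYHIERARCHY_PARENTS)

print("streptococcal infection" in icd_style)     # False: only one hierarchy
print("streptococcal infection" in snomed_style)  # True: both hierarchies
```

A researcher building a phenotype definition from ICD-coded claims would therefore need to enumerate such cross-cutting terms by hand, whereas a polyhierarchy allows them to be retrieved by a single ancestor query.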

Shortcomings of terminology systems inherent in the domain of human genetics

In any area of medicine in which new syndromes are described frequently, standard terminology systems are going to lag medical knowledge. Another area in which terminology systems fall short is that of rare syndromes. Insofar as human genetics involves many rare syndromes, with many new ones described every year, no terminology system is going to be perfectly adequate at any time. The solution is to use less-specific terms or enter syndrome names in free text. This limits the ability to analyze data, of course.

Another factor that makes terminology systems seem inadequate for human genetics is the need to assign diagnoses to patients that consist of descriptions of chromosomal abnormalities (e.g., deletions) and specific mutations. In many cases, these descriptions may not be associated with a particular clinical syndrome. In addition, patients may need to be described in terms of a mutation that they personally may not have, but for which a family member is positive. Even the family member may not have the clinical syndrome, but only the mutation, as in the sibling of a patient discovered to have a mutation for familial adenomatous polyposis coli. A pediatric genetics provider is therefore at a loss when a diagnostic terminology system does not have a “family history of” term for a specific gene mutation.
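
One way to picture the missing "family history of" concept is as a record that separates the variant from the person who carries it and from the presence of a clinical syndrome. The sketch below is hypothetical: the class, field names, and label format are our own illustration and do not correspond to any existing terminology or EHR data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VariantFinding:
    """Hypothetical record; field names are illustrative assumptions."""
    gene: str
    variant: str                          # free-form description of the variant
    carrier: str                          # "patient" or "relative"
    relationship: Optional[str] = None    # e.g. "sibling" when carrier is a relative
    carrier_has_syndrome: Optional[bool] = None  # None means unknown


def problem_list_label(finding: VariantFinding) -> str:
    """Build the kind of label a diagnostic terminology would need to offer."""
    if finding.carrier == "patient":
        label = f"{finding.gene} variant ({finding.variant})"
    else:
        label = (f"Family history of {finding.gene} variant "
                 f"({finding.variant}) in {finding.relationship}")
    if finding.carrier_has_syndrome is False:
        label += ", carrier without clinical syndrome"
    return label


sibling_finding = VariantFinding(
    gene="APC",
    variant="unspecified pathogenic variant",
    carrier="relative",
    relationship="sibling",
    carrier_has_syndrome=False,
)
print(problem_list_label(sibling_finding))
```

The point of the sketch is that three independent facts (which variant, who carries it, and whether a syndrome is present) must be expressible together, which a single precoordinated diagnosis code rarely allows.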

Implementation Decisions Affect Both the Quality and Availability of EHR Data

Having complete, high-quality data in the EHR is ultimately reliant on having someone enter it, but there are factors that influence whether robust data collection is even possible. One is the particular category of EHR—whether it is a homegrown, best-of-breed, or enterprise (i.e., vendor) system. Another relates to the configuration of the EHR—whether the system was customized for each clinic or simply deployed in a “big bang” fashion. We discuss the impact of these factors in turn.

Homegrown systems

Homegrown solutions are those that were developed at a single institution (or health-care system) and customized for that institution’s specific workflows and clinical needs. In some cases, they may have been sold to a vendor and integrated into a commercial product, but they are typically maintained using internal institutional resources. Examples of homegrown EHRs include Vanderbilt’s StarPanel and the Massachusetts General Hospital’s Longitudinal Medical Record. Although homegrown EHRs are appealing because they can be uniquely tailored to the needs of an institution,6,9 they are falling out of favor as the cost of maintenance and certification grows.

Best of breed

Best-of-breed solutions were popular in the 1990s, before enterprise EHR systems reached the level of maturity they have today. In a best-of-breed system, an institution simply selects the product considered to best meet the needs of each specialty or function. Because no one sold a truly integrated, institution-wide system anyway, this was a rational approach. An organization could start by implementing systems for administrative functions, then add a laboratory information management system (LIMS) and a picture archiving and communication system, and defer the decision and expense of an EHR until later, when the clinical culture was more prepared. Such a staged approach to building health information technology is described by the Healthcare Information and Management Systems Society in its survey data, which describe the seven stages of a hospital’s maturation.25 At present (Q4 2012), only 2% of US hospitals are described as being at the highest level of implementation (stage 7), which requires a fully integrated system, whereas over 80% are at stage 3 or higher (stages 3–7). Stage 3 requires only the installation of ancillary systems (laboratory, radiology, pharmacy), physician access to a clinical data repository for the review of orders and results, nursing/clinical documentation, order-based decision support, and picture archiving and communication system image access.26 (The Healthcare Information and Management Systems Society has launched a similar staging system for ambulatory care.) The cost of maintaining the interfaces of best-of-breed systems usually drives larger organizations to seek an integrated solution, for which there is a burgeoning market in the MU age.

Enterprise

With enterprise EHRs, the various components (e.g., registration, billing, laboratory, emergency department, inpatient, outpatient) are integrated into a single product that uses a common data repository for all functions. Although the concept of total integration is appealing, the reality is that no one is 100% integrated onto a single product. This is especially true when some of the care occurs in highly specialized clinical areas, like neonatal intensive care or genetics. The driver of information exchange promoted by MU will tend to drive organizations to more integration, which can be seen as a threat to surviving niche applications in clinical specialty areas. The MU driver of patient engagement will also tend to encourage integration because patients will become less tolerant of the gaps in access to their information that niche systems create.

This also means that any other activity that involves modifying the EHR, such as the integration of genomic data and results, must compete for a finite set of information technology (IT) resources. Those with the best chances of having resources successfully allocated will be those that are complementary to MU priorities.

EHR configuration

Another major factor that affects the availability and usability of EHR data is the level of customization of the build and deployment of the EHR. When implementing an EHR, institutions tend to follow one of two paths: customized builds for specific conditions and specialties, or generic builds, in which each clinic or specialty receives the default, or “model,” system. With a customized build, one can create screens that allow for the collection of condition-specific information. This can often lead to increased clinician engagement because of the ability to capture the data that matter most to the particular specialty. It can also lead to more robust clinical phenotypes. There are several drawbacks, however. The creation of clinic-specific builds often leads to clinic-specific workflows within the EHR. In many EHRs, the workflow used to collect the data has an impact on where those data are stored in the underlying reporting database. Knowledge of those workflows is therefore critical when trying to work with the EHR data for analytical or research purposes. Otherwise, there is significant risk that portions of the clinic population will be excluded. Another challenge posed by customized builds is that strong institutional governance is needed to ensure that data elements are defined consistently across clinics. The same data elements should not be defined in multiple places with different names. At the same time, it might be necessary to create additional elements to capture more nuanced versions of standard data elements (e.g., blood pressure at the ankle for nephrology, instead of the standard measure at the arm). If there is a desire to share this information across institutions (for research or quality improvement purposes, for instance), then the custom data elements are typically mapped to a standard terminology (if one exists). This mapping can also be expensive, especially for large institutions or clinics that frequently add, modify, or delete data elements from their EHR data collection forms. As one might imagine, customized builds are also time consuming, making them more expensive in terms of resources. Most institutions that opt for customized EHR builds elect to implement the EHR in a phased rollout, going live with a few clinics every quarter to make the process more manageable.
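
The governance problem described above (the same element defined under different names, and custom elements with no standard mapping) lends itself to a simple automated audit. The following is a minimal sketch under stated assumptions: the clinic names, local element names, and "STD:" concept identifiers are all hypothetical placeholders rather than entries from any real data dictionary or terminology.

```python
from collections import defaultdict

# Hypothetical data dictionary: (clinic, local element name, standard concept).
# A concept of None means the element has not been mapped to any standard term.
LOCAL_ELEMENTS = [
    ("cardiology", "bp_systolic_arm", "STD:SYSTOLIC-BP-ARM"),
    ("primary_care", "sys_bp", "STD:SYSTOLIC-BP-ARM"),
    ("nephrology", "bp_systolic_ankle", "STD:SYSTOLIC-BP-ANKLE"),
    ("genetics", "dysmorphology_score", None),
]


def audit_data_elements(elements):
    """Return (duplicates, unmapped) for a list of local data elements.

    duplicates: standard concept -> sorted local names, where more than one
                local name points at the same concept.
    unmapped:   (clinic, local name) pairs that lack a standard mapping.
    """
    names_by_concept = defaultdict(set)
    unmapped = []
    for clinic, local_name, concept in elements:
        if concept is None:
            unmapped.append((clinic, local_name))
        else:
            names_by_concept[concept].add(local_name)
    duplicates = {
        concept: sorted(names)
        for concept, names in names_by_concept.items()
        if len(names) > 1
    }
    return duplicates, unmapped


duplicates, unmapped = audit_data_elements(LOCAL_ELEMENTS)
print(duplicates)
print(unmapped)
```

In this toy dictionary the audit reports that two clinics record the same arm measurement under different names, while the ankle measurement is correctly kept as a distinct concept and the genetics element still needs a mapping before it can be shared.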

Generic builds, on the other hand, simply use default workflows from the vendor’s preconfigured system. Generic builds are typically used when an institution deploys the EHR in a “big bang” fashion, going live with all clinics at once. Although generic builds require fewer resources and yield a shorter implementation cycle, there are drawbacks to deploying the EHR with little in the way of customization. For institutions with subspecialties or clinics with specialized workflows, the standard EHR build may be a poor fit and require condition-specific variables to be captured in free text instead of in discrete fields. If those institutions also have a research focus, clinicians may be disappointed when they realize that the only way they can “get the data out” of the EHR is in a blob of text.

The Impact of the Interface Between the Genetic Laboratory Management System and the EHR

To be useful as a tool for the management of genetic data, EHRs need to contain information on both patient phenotypes and genetic results. We have illustrated why the quantity and quality of phenotypic data in the EHR will depend on both the type of EHR and the level of customization. In a similar manner, the ability to manage genetic results and leverage them for decision support also depends on the configuration of the EHR. With genetic results, the EHR can be used to store individual mutations and/or variations, a text report of the findings along with an interpretation, or both. At this point, EHRs are only able to store a handful of variants for a given test, which poses a problem for next-generation or whole-genome sequence results.

Because the data management and operation of a hospital’s clinical laboratory differ greatly from those of a clinic or hospital, it is typical that laboratories have their own information systems and even their own administrative infrastructure for managing them. Major vendors of EHRs like Cerner (Kansas City, MO) and Epic (Madison, WI) have LIMS that integrate with the EHR, but full implementation of both clinical information systems and LIMS under the same vendor is still rare. For this reason, interfaces are required to move orders from the EHR to the LIMS and for results to flow back to the EHR. As with all data interfaces, it is impractical or even impossible to represent the data from one system with exact fidelity in the other. For example, the LIMS may store some results as discrete data that are incorporated into a free-text report on the EHR end for the sake of feasibility. From the clinician’s point of view, this free-text representation may be perfectly adequate for care of individual patients, but it is inadequate to support population management or automated decision support.

EHRs have a number of ways to accept these results. The most common method would be to transmit the results via an HL7 (ref. 27) message as part of a two-way, real-time interface. HL7 specifies message formats that can store laboratory results of any type and other clinical information. In the LIMS world, HL7 message formats are used almost universally in these interfaces. Health information exchanges can also use HL7 messages to transmit data between EHRs and between EHRs and other systems. Service-oriented architecture, or “Web services,” is a newer way that LIMS and EHRs may be connected. Such service-oriented architecture systems may be proprietary, however, and therefore are not available for general use.
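
To make the distinction between discrete and free-text results concrete, the sketch below parses a small pipe-delimited HL7 version 2 result message. The message is entirely hypothetical: the observation identifiers are local placeholders (not real LOINC codes), and the segment layout is a simplified assumption rather than the format of any particular laboratory or vendor. The parser is deliberately naive and ignores escape sequences and repeating fields.

```python
# Hypothetical HL7 v2 result message. Every identifier is a placeholder,
# not a real LOINC code and not a real laboratory's message format.
SEGMENTS = [
    r"MSH|^~\&|LIMS|LAB|EHR|HOSP|201301010830||ORU^R01|MSG0001|P|2.5.1",
    "PID|1||12345^^^HOSP^MR||DOE^JANE",
    "OBR|1||ORDER-1|PANEL^Hypothetical gene panel^L",
    "OBX|1|ST|GENE^Gene studied^L||GENE_A||||||F",
    "OBX|2|ST|VARIANT^Variant description^L||placeholder variant||||||F",
    "OBX|3|TX|REPORT^Narrative report^L||Free-text interpretation.||||||F",
]
MESSAGE = "\r".join(SEGMENTS)  # HL7 v2 separates segments with carriage returns

NARRATIVE_TYPES = {"TX", "FT"}  # HL7 v2 value types used for text


def extract_observations(message):
    """Return one dict per OBX segment (naive parse, no escape handling)."""
    observations = []
    for segment in message.split("\r"):
        fields = segment.split("|")
        if fields[0] != "OBX":
            continue
        code, name = fields[3].split("^")[:2]   # OBX-3: observation identifier
        observations.append({
            "code": code,
            "name": name,
            "value_type": fields[2],            # OBX-2: value type
            "value": fields[5],                 # OBX-5: observation value
        })
    return observations


observations = extract_observations(MESSAGE)
discrete = [o for o in observations if o["value_type"] not in NARRATIVE_TYPES]
narrative = [o for o in observations if o["value_type"] in NARRATIVE_TYPES]

print([o["code"] for o in discrete])   # ['GENE', 'VARIANT']
print(narrative[0]["value"])           # Free-text interpretation.
```

Only the first two observations could drive population queries or decision support on the receiving side; if the interface collapses everything into the narrative segment, that information survives for the reader but is lost to the computer.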

A very common way that genetic results are transferred to an EHR is via a document-scanning system, in which the images of printed reports are stored in the LIMS and transmitted to the EHR. This method is the least satisfactory from the point of view of interoperability, data access, computability, or security but has been the standard method for years for dealing with the kind of complexity reflected within a genetic testing report. In some cases, EHRs allow manual entry of discrete data from results. For genetic testing, the volume of the data is not likely to lend itself to that kind of workflow.

Implementing the EHR in a Research-Intensive Environment

When an EHR is implemented in a research-intensive environment, its data must be integrated with a variety of sources.

Research data management versus clinical data management

The objective of clinical data management is first to satisfy the security requirements imposed by the Health Insurance Portability and Accountability Act and state law. After that, the big driver of clinical data management is to offer highly reliable, fast access in all environments where it is reasonable to expect a need for clinical access. Research data management entails the same security concerns but entails far more complex requirements for novel data access, analytic tools, and the ability to access data without individual identifiable health information (“protected health information” as defined by the Health Insurance Portability and Accountability Act). Because of these differing missions, the research IT administration and the clinical IT administration are often at odds on priorities. They may even be in different organizational structures—clinical IT on the hospital or practice plan side, and research IT on the academic side. To complicate matters further, formal organizational structures for quality improvement often straddle the academic and operational sides of an organization.

To succeed in an environment that is attempting to achieve ambitious clinical, research, and improvement goals, one must either consolidate all these operations into one (not usually possible) or work out boundaries of responsibility within a general agreement of cooperation. When the responsible organizations are independent, such an agreement is largely dependent on goodwill but should be cast as a formula for mutual success. Because techniques for creating meaningful genetic data are often the topic of research, those who work with those data must engage effectively with both clinical and research IT administration—an impossible task unless the two sides work together.

Discussion: Minimum Functional Requirements of the EHR and Open Issues

The category and configuration of the EHR have a large influence on the EHR’s suitability for the optimal practice of clinical genetics; nonetheless there remains a basic set of functional requirements that every EHR must be able to satisfy. These requirements are intended to be general and not focused on specific technical details, which have been covered elsewhere.28 They are:

  • It must be possible to store genetic test results in a discrete, computable format. Storing the various components of the result—e.g., the gene, protein/nucleotide variant, single-nucleotide polymorphism, copy-number variant, existence of the test result itself—in a discrete format allows them to be reasoned on within the EHR.

  • The results of a genetic test must be output in such a way that they are suitable for EHR interoperability (i.e., so that they can follow the patient as they go through transitions of care). Initially, this may just mean that it is possible to transmit the text version of the test results, but as EHRs develop the capacity to better accept discrete results, those components should be transmittable as well.

  • It should be possible to record phenotypic data in a sufficiently expressive terminology system to enable robust analysis. It should also be possible to crosswalk from the source terminology to alternative ones, should the need arise. This analysis may or may not occur within the EHR itself. Ideally, EHR interoperability would evolve to the point that phenotypic data on a single patient could be pulled from multiple institutions.

  • It should be possible to expose the genetic data to clinical decision support processes in the EHR. If the EHR has a sophisticated enough rules engine, having the various components of the test result stored in a discrete format allows them to trigger various rules and processes—dosing alerts, consults, study recruitment, appointment follow-ups, etc.

  • The EHR should be able to retrieve/display external content to assist in patient/provider education about the findings. This is particularly important as the knowledge about a test or condition evolves over time. Having this information external from the EHR frees an institution from having to maintain a copy locally and keep it up-to-date.
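
As a rough illustration of the first and fourth requirements, the sketch below stores a genetic result as discrete fields and lets a simple rule table react to a medication order. It is a toy under stated assumptions: the gene, drug, rule table, and function names are hypothetical placeholders, and a real rules engine would be considerably more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneticResult:
    """Discrete representation of one result; field names are illustrative."""
    patient_id: str
    gene: str
    variant: str
    interpretation: str  # e.g. "pathogenic", "benign", "uncertain"


# Hypothetical rule table: (gene, interpretation, ordered drug) -> alert text.
RULES = {
    ("GENE_A", "pathogenic", "drug_x"):
        "Review dosing of drug_x: actionable GENE_A result on file.",
}


def alerts_for_order(patient_id, drug, results):
    """Return alert messages triggered by ordering `drug` for a patient."""
    alerts = []
    for result in results:
        key = (result.gene, result.interpretation, drug)
        if result.patient_id == patient_id and key in RULES:
            alerts.append(RULES[key])
    return alerts


stored_results = [
    GeneticResult("P001", "GENE_A", "placeholder variant", "pathogenic"),
    GeneticResult("P002", "GENE_A", "placeholder variant", "benign"),
]

print(alerts_for_order("P001", "drug_x", stored_results))
print(alerts_for_order("P002", "drug_x", stored_results))
```

The rule fires for the first patient and not the second because the interpretation is available as a discrete field; the same result stored only as a scanned report could not trigger anything.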

If an EHR is able to satisfy the above criteria, then it will be possible to achieve a basic level of integration. Of course, there is a significant difference between possible and feasible, given available resources; the costs of achieving high-quality discrete data in an EHR are significant. As the field evolves, additional requirements are likely to emerge, but over time, EHRs will reach a level of maturity, such that most, if not all, are able to satisfy them.

Although the technical barriers to EHR integration have been identified, there are a number of open social issues that remain. It will be imperative to find solutions to these problems to truly achieve the integration of clinical genetics into the EHR.

  • Transitions of care. Genetic test results remain relevant far longer than most other types of testing. Whereas a peripheral white blood cell count usually loses its clinical significance after a day or so, genetic information may actually become more important years later, after new knowledge can be brought to bear on it. For this reason, it will be necessary to have the ability to transmit computable genetic data that endure throughout the lifetime of the patient, and perhaps through the lifetimes of his or her offspring. There is a lack of clarity on how to effectively maintain accessibility to this information in the long term. It is also unclear how to determine to whom this information should be transmitted as patients change locations and providers. Perhaps the responsibility should lie with the patient instead of the provider. We have not begun to answer these questions in Western societies.

  • Managing results over the pediatric–adult transition. As children with chronic, heritable disease receive better treatment, they increasingly grow to an age at which they need to transition from their pediatric specialty provider to an adult specialty provider.29 In the past, which dovetailed with the days of paper medical records, such transitions were less common, but the nature of paper lessened the complexity and volume of the information transfer. Because there was no better alternative, such a throttled information flow was accepted and workarounds were developed. Now, with widespread EHRs, there is no limit on the amount of text and computable information that can be transmitted. The higher bandwidth of these transitions, coupled with their increased frequency, creates special challenges to providers burdened with increased care-management duties imposed by the current accountable-care trend. Given that we already have a difficult time digesting the decision support that EHR systems provide for simple matters like acute medication ordering, it is difficult to imagine how to construct effective decision support for highly complex genetic data that land in the hands of an adult provider unaccustomed to the care of patients with pediatric illnesses.30

  • Changes in genetic testing technology. When tests become more sensitive or more tests are run, the clinical significance of earlier findings can change. This makes older results less useful. For these tests, should interpretations be updated as more tests are generated and more clinical information becomes available? If so, who bears the responsibility for communicating the change to the patient and educating him or her about it?

It is likely that the clinical genetics community will grapple with these issues for some time. They make many of the technical issues with integration, which are daunting, appear much more tractable. The interoperability requirements of MU may enable progress in these areas, but effective solutions may require larger re-engineering.

Disclosure

The authors declare no conflict of interest.