When genetic testing detects a candidate variant (CV) in a gene that is not known to cause disease in humans, clinicians face a dilemma. Where can they find the additional evidence needed to make a diagnosis? One action a clinician can take is to search for additional patients with a deleterious variant in the same gene. Here we present a case study where the electronic health record (EHR) was used to facilitate genotypic matchmaking to help diagnose a patient at Vanderbilt’s Undiagnosed Diseases Network (UDN)1 clinical site.

A 26-year-old female with unexplained mild intellectual disability (ID), autism spectrum disorder (ASD), obsessive compulsive disorder (OCD), high myopia, and joint hypermobility had exome sequencing (ES) as part of her work-up for the UDN. Interestingly, a de novo MSL2 p.S232Tfs*10 variant was included in the ES report along with its possible association with ASD.2

The MSL2 protein had been reported to be part of a complex that plays a major role in chromatin regulation and structure through histone H4 acetylation and H2B ubiquitylation.3,4 Because the participant’s CV was a de novo frameshift, and given the function of the MSL2 protein, the Vanderbilt University Medical Center (VUMC) UDN team considered it an attractive candidate, but the lack of a previously reported role in Mendelian disease precluded it from being diagnostic.

The UDN’s process to find additional cases is to upload CVs to PhenomeCentral, a founding member of Matchmaker Exchange.5 A widely used, similar strategy is for the provider to post the candidate gene in GeneMatcher, MyGene2, and/or Matchmaker Exchange.6,7,8 These online connection tools rely on other providers using the same platforms, involve a waiting period of uncertain length, and depend on communication between providers to share the additional information that might solidify a match. To date, the VUMC team has not had any MSL2 matches through PhenomeCentral.

Having failed to find a match through available matchmaking platforms, the Vanderbilt UDN team searched for “MSL2” in VUMC’s Synthetic Derivative (SD), a continuously updated, de-identified database of EHRs for more than 3 million patients.9 This revealed two patients whose charts were subsequently reviewed using SD Discover, a browser-based application developed at Vanderbilt to facilitate chart review. Both patients were found to have de novo MSL2 variants identified through clinical trio ES and had strikingly similar phenotypes to the proband.

The first patient identified through SD was a 15-year-old male with a history of ASD, mild ID, OCD, hypermobile Ehlers–Danlos syndrome (EDS), and high myopia. He was heterozygous for a de novo p.Pro25Ser MSL2 CV, a variant affecting a well-conserved residue that is predicted to be damaging based on in silico models. The second patient was a 13-year-old female with a history of global developmental delay, ASD, attention deficit disorder (ADD), visual and language processing disorder, OCD, high myopia, and hypermobile EDS. She was heterozygous for a de novo p.Thr217AspfsX2 MSL2 CV, a variant that created a premature stop codon and was predicted to cause loss of normal protein function. The variants found in both patients were absent from large population cohorts.

In both cases, clinical genetics providers suspected the MSL2 CV as a cause of their patients’ ASD, behavioral problems, and possibly their hypermobility, but documented that there was not enough evidence to consider it diagnostic.

All three individuals had a de novo, likely gene-disrupting MSL2 CV. Because all three also had similar phenotypes (ASD, OCD, mild ID or learning disabilities, and EDS type III or joint hypermobility), the findings suggest a causative role for MSL2 in a new disorder. These additional patients gave the team the confidence to initiate contact with clinical laboratories to find more individuals with MSL2 as a candidate gene and to connect with a team doing functional studies. Finding these two additional individuals was an integral part of giving the UDN participant a molecular diagnosis for a new disease.

The American College of Medical Genetics and Genomics (ACMG) recently published a statement encouraging providers to optimize their use of genomic testing results in the EHR.10 Based on the experience with MSL2, VUMC’s team proposes another use for genomic testing results in the EHR: a full text search for a gene name to quickly ascertain additional individuals with shared phenotype(s) and candidate gene(s). We have termed this strategy “EHR-based genotype matchmaking” to indicate the process of finding a genotype match that has been clinically tested and might provide evidence for causality.

EHR-based genotype matchmaking offers several benefits beyond what is realized by current methods. EHRs contain dense, longitudinal medical histories that are not always available to clinical laboratories, and they cover a large and growing population of patients who have received genetic testing, larger than the patient cohorts whose data have been uploaded to matchmaking platforms. Motivated by the successful resolution of the case above, the Vanderbilt UDN team has added EHR-based genotype matchmaking to its variant analysis pipeline and is in the process of reanalyzing past cases.

The VUMC UDN team recognizes that this powerful tool has potential barriers that complicate implementation. Like many medical centers, VUMC does not currently store genetic data in an easily accessible format. Genetic testing reports are mainly uploaded to the EHR as PDF files that are currently not searchable in the text-based SD. Our geneticists and genetic counselors record genetic results as text in the EHR, but this is a time-consuming process that is subject to typos and errors, and not all VUMC providers record genetic test results in this way. Fortunately, many genes have fairly distinctive names—the search for “MSL2” easily found the two records above. However, searching for gene names that are short (e.g., TR), overlap with common terms, or that are frequently reported as somatic variants (e.g., EGFR) can yield thousands of irrelevant records.
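The short-name problem described above can be reduced, though not eliminated, by matching gene symbols as whole, case-sensitive tokens rather than as raw substrings. The following is a minimal sketch under stated assumptions: the function name `find_gene_mentions`, the record IDs, and the in-memory dictionary of note text are all hypothetical, and this is not how the SD or SD Discover actually implements search.

```python
import re


def find_gene_mentions(notes, symbol, context=40):
    """Return (record_id, snippet) pairs where `symbol` occurs as a whole token.

    `notes` is a hypothetical mapping of de-identified record IDs to free text.
    Matching is case-sensitive, so lowercase words that happen to spell a
    gene symbol are not matched.
    """
    # Lookarounds reject matches embedded in longer alphanumeric tokens,
    # e.g. "MSL2" inside "MSL2P1".
    pattern = re.compile(
        rf"(?<![A-Za-z0-9]){re.escape(symbol)}(?![A-Za-z0-9])"
    )
    hits = []
    for record_id, text in notes.items():
        match = pattern.search(text)
        if match:
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            # Keep a short snippet so a reviewer can triage the hit quickly.
            hits.append((record_id, text[start:end]))
    return hits


# Illustrative, made-up note text.
example_notes = {
    "r1": "Trio ES: de novo MSL2 variant p.Pro25Ser.",
    "r2": "Report mentions MSL2P1 only.",
    "r3": "No genetic testing on file.",
}
matches = find_gene_mentions(example_notes, "MSL2")
```

Token matching does not help with symbols that are themselves common abbreviations, or with genes that are mostly reported as somatic findings; those hits would still need manual chart review or additional filtering.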

Major EHR vendors are beginning to develop and market ancillary systems and modules to store structured genetic test results. If adopted, these new features, along with standard reporting of results from laboratories, will greatly simplify the process of identifying patients with matching genotypes. In the meantime, the UDN team plans to continue to index medical records based on a string-matching approach for new cases, and to reanalyze previous undiagnosed cases. The database that stores these string matches has become an invaluable resource for rapidly assessing new UDN cases, and its growth illustrates the rapid increase over the last decade of genetic data in the EHR.
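One way to picture a string-match index of this kind is as a mapping from each candidate gene symbol to the records that mention it, built in one pass over the note text. The sketch below is an assumption-laden illustration: `build_gene_index`, the symbol list, and the in-memory `notes` dictionary are hypothetical and do not describe the team's actual database.

```python
import re
from collections import defaultdict


def build_gene_index(notes, symbols):
    """Map each gene symbol to a sorted list of record IDs that mention it.

    `notes` is a hypothetical mapping of record IDs to free text, and
    `symbols` is the list of candidate gene symbols to index.
    """
    unique_symbols = sorted(set(symbols))
    if not unique_symbols:
        return {}
    # One combined pattern; lookarounds require whole-token matches.
    alternation = "|".join(re.escape(s) for s in unique_symbols)
    pattern = re.compile(
        rf"(?<![A-Za-z0-9])({alternation})(?![A-Za-z0-9])"
    )
    index = defaultdict(set)
    for record_id, text in notes.items():
        for match in pattern.finditer(text):
            index[match.group(1)].add(record_id)
    return {gene: sorted(ids) for gene, ids in index.items()}


# Illustrative, made-up note text.
example_notes = {
    "a": "de novo MSL2 frameshift on trio ES",
    "b": "EGFR somatic mutation; MSL2 variant of uncertain significance",
    "c": "normal chromosomal microarray",
}
gene_index = build_gene_index(example_notes, ["MSL2", "EGFR"])
```

With an index like this, a candidate gene from a new case becomes a dictionary lookup, and every hit is still a lead for chart review rather than a confirmed genotype match.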

More work is needed to streamline the dissemination of knowledge gained from EHR-based patient matching. The Vanderbilt UDN team is working on an institutional review board (IRB) application that would allow them to inform the ordering provider of the new connections regarding their patients’ genetic testing results. While the research team is not the clinical laboratory that performed the testing, there is still an ethical duty to reinterpret unsolved exomes and inform the ordering provider/patients of new conclusions.11,12 This IRB permission would also allow the research team to ask clarifying phenotype questions that are unanswered in the EHR. Ideally, VUMC’s search process, which is currently done on a research basis, could be done as part of routine clinical care in the future.

Further work may be directed toward more automated phenotyping methods that use the EHR. Combining structured genetic data with automated phenotyping may make it possible to scale EHR-based matchmaking, similar to approaches used by matchmaking platforms.

The problem with candidate genes is not restricted to a research setting. Clinical ES, which is increasing in popularity, often reports CVs to clinicians. Although it may not be standard for all clinical laboratories to report variants in candidate genes, two large clinical laboratories have reported that ~8% of their diagnostic ES testing reports a candidate gene.13,14 One laboratory shared that a candidate gene was included on a total of 24% of their reports.14 Linking candidate genes found in these tests with the medical phenome described in the EHR represents an opportunity to improve interpretation of exomes, which will in turn improve patient care, increase the utility of clinical genetic testing, and further the discovery of new Mendelian diseases. The institutional and technological changes necessary to enable this function in other institutions will take extensive effort and funding, but we hope that this report creates additional incentive for change.