Introduction

The increased capacity for high-throughput DNA sequencing, including exome and whole-genome analysis, has driven the use of large-scale sequencing strategies earlier in the diagnostic pipeline when a genetic disorder is suspected. Although casting a wider net to identify causative genetic variation can be beneficial, the massive amount of data produced by these approaches makes interpretation and reporting a challenge, particularly when the tested individual is so young that the ultimate phenotype may not be apparent. Pushing this envelope, a recent study has compared the results of standard newborn screening, which is typically performed within 48 h of birth, to exome sequencing, although this is not yet feasible on a population scale.1 Additional studies are underway to further this comparison and to analyze the ethical, legal, and social implications of such testing in a newborn population.2,3,4 Clinical exome sequencing, removed from the newborn screening setting, relies on a combination of accurate clinical information submitted by the ordering provider and careful variant interpretation from the reporting laboratory. Current clinical exome offerings have reported a yield of approximately 25%.5 Higher success rates have been reported for specific phenotypes, trios (proband and parents), or targeted gene panels.6,7,8,9 No matter the application, exome and genome sequencing produces massive amounts of data that must be processed quickly and accurately and interpreted by the laboratory.

In an effort to standardize clinical variant interpretation, the American College of Medical Genetics and Genomics published guidelines that can be used by all clinical laboratories. These guidelines incorporate five categories of variant classification: benign, likely benign, uncertain (variant of uncertain significance, VUS), likely pathogenic, and pathogenic.10 An uncertain finding can be frustrating for clinicians and patients,11 but it is a necessary result when there is insufficient information to determine whether a variant is benign or pathogenic. Reports with uncertain findings can result in additional medical costs and a continued diagnostic odyssey when a provider lacks sufficient information to rule in or rule out a diagnosis.

As additional information becomes available, VUS can sometimes be reclassified. When this occurs after a VUS has been reported to a patient, laboratories have been advised to recontact the provider to communicate the revised classification. Much has been written about the duty to recontact, with very little consensus reached on the specifics of how and when to do so.12,13 Most research agrees that, in theory, recontacting patients and providers with updated information is optimal; however, the process by which this should occur is not agreed on.14 These complications argue that minimizing the need to recontact by providing accurate and complete information as early as possible is the best practice.

We analyzed the current process of variant classification to identify problems and propose solutions for the future. To facilitate the analysis, we focused on a relatively simple model: metabolic disease. Specifically, we examined variants found during clinical sequence analysis for three common and well-characterized metabolic disorders: classic galactosemia, phenylketonuria, and medium-chain acyl-CoA dehydrogenase (MCAD) deficiency. We selected these three disorders because they have readily available clinical biochemical testing, they are common for Mendelian genetic disorders, there is clear evidence of the benefits of early identification and treatment, and they are included in newborn screening. Our analysis focused on VUS reported in the relevant genes in our laboratory and determined the information that would be necessary to minimize these findings. To reflect the reality of clinical laboratories, we did not specifically seek out any additional information for this research study. We worked only with the information that was submitted along with the specimen or obtained before the initial report was released.

Materials and Methods

We investigated sequence variants associated with three metabolic disorders that are included in newborn screening in all 50 states: classic galactosemia due to galactose-1-phosphate uridyltransferase (GALT) deficiency, caused by GALT variants;15 phenylketonuria, caused by PAH variants;16 and MCAD deficiency, caused by ACADM variants.17 These conditions were selected based on their incidence, well-defined biochemical phenotype, availability of testing, and the need for treatment to be instituted soon after birth to avoid significant morbidity and mortality.

All variants identified by sequence analysis (targeted gene analysis, gene panels, and exome sequencing) of clinical specimens at Emory Genetics Laboratory between 1 January 2005 and 1 January 2015 were included in this study and extracted from our in-house variant tracking database, EmVar, which currently holds more than 3 million variant observations. To accurately reflect the situations encountered in a clinical laboratory, we did not perform any additional analyses (segregation studies or copy-number analysis) beyond what was requested by ordering providers. Clinical information used to classify variants was limited to what was provided by the ordering provider at the time of the initial classification. As part of this study, we did not recontact providers for any additional information. Basic information about all variants discussed in this study, including current classification, is publicly available online through EmVClass.18 The design of this study was reviewed and approved by Emory University’s Institutional Review Board.

Variants in the three targeted genes were analyzed and sorted by classification (benign, likely benign, VUS, likely pathogenic, or pathogenic)10 at the outset of this project. We identified unique variants on a per-family basis to avoid overestimating population prevalence due to familial testing. Although there are common pathogenic variants in each of the three genes accounting for as much as 85% of affected alleles for GALT, the majority of the identified variants in all three genes were detected by sequence analysis rather than targeted testing. Although copy-number analysis is necessary to detect an estimated 6% of Human Gene Mutation Database–reported variants in PAH, 3% in GALT, and 2% in ACADM, the majority of individuals included in this study did not have copy-number analysis completed. In-house classifications were compared with those of outside laboratories that had submitted the same variant to ClinVar at the time of data analysis. We also recorded whether there were other publicly accessible reports of a variant. For rare variants, no new information may be available since the initial classification. Each VUS was reviewed by personnel experienced in variant interpretation to update variant classifications for this study. As variants were reviewed, we identified key types of information that allowed reclassification, including population incidence, publication of additional cases, functional studies, and additional clinical testing (parental testing to confirm inheritance and copy-number analysis).
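The per-family counting described above can be illustrated with a minimal Python sketch. It is not the EmVar implementation: the record layout, the family identifiers, and the placeholder variant name "c.EXAMPLE" are assumptions made purely for illustration. The idea is that each variant is counted once per family, so that testing several relatives does not inflate its apparent frequency.

```python
from collections import defaultdict

# Hypothetical observation records: (family_id, gene, variant, classification).
# The layout and the placeholder "c.EXAMPLE" are illustrative assumptions only.
observations = [
    ("FAM1", "GALT", "p.Q188R", "pathogenic"),
    ("FAM1", "GALT", "p.Q188R", "pathogenic"),  # relative tested in the same family
    ("FAM2", "GALT", "p.Q188R", "pathogenic"),
    ("FAM2", "GALT", "c.EXAMPLE", "VUS"),
    ("FAM3", "ACADM", "p.K329E", "pathogenic"),
]


def families_per_variant(records):
    """Map each (gene, variant) pair to the set of families in which it was seen."""
    families = defaultdict(set)
    for family_id, gene, variant, _classification in records:
        families[(gene, variant)].add(family_id)
    return families


def count_unique_families(records):
    """Count each variant once per family rather than once per tested individual."""
    return {key: len(fams) for key, fams in families_per_variant(records).items()}


family_counts = count_unique_families(observations)
for (gene, variant), n in sorted(family_counts.items()):
    print(f"{gene} {variant}: seen in {n} unrelated families")
```

In this toy input, p.Q188R appears three times but is counted in only two families, which is the behavior the per-family rule is meant to enforce.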

Results

Summary information for all variants seen in our laboratory for these three genes is shown in Table 1. For each, the fraction of variants classified as VUS ranged from 13% for PAH to as high as 50% for ACADM. In total, we have classified 134 unique variants in GALT, 46 (34%) of which were of uncertain significance (VUS) and 78 (58%) of which were pathogenic or likely pathogenic. In PAH, we observed 132 variants, including 17 VUS (13%), 98 pathogenic or likely pathogenic (74%), and 17 benign variants (13%). In ACADM, 32 of 64 unique variants (50%) were uncertain, whereas 22 (34%) were pathogenic. After surveying all variants, we focused on VUS to determine why we had a large number of uncertain findings for conditions with clear clinical significance and for which biochemical testing is diagnostic. Most VUS for all three conditions were identified in individuals who also carried a known pathogenic variant. Consistent with the previously reported GALT mutation spectrum, the most common pathogenic variant found with a VUS was p.Q188R (seen in 14 individuals). In ACADM, of 32 individuals with a VUS, 23 were found to also have the p.K329E variant (previously known as p.K304E), which is responsible for more than 50% of pathogenic alleles.17 In contrast to GALT and ACADM, the most common pathogenic variant in PAH accounts for less than 25% of alleles in the BioPKU database,19 so it was not unexpected to find that no single PAH pathogenic variant was seen more than three times with a VUS.
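The percentages quoted above follow directly from the reported counts. As a small arithmetic check, the following Python sketch recomputes them; the dictionary layout and the helper name `percent` are illustrative choices, and only the counts themselves are taken from the text.

```python
# Counts as reported in the text; category labels mirror the wording used there.
counts = {
    "GALT": {"total": 134, "VUS": 46, "pathogenic or likely pathogenic": 78},
    "PAH": {"total": 132, "VUS": 17, "pathogenic or likely pathogenic": 98, "benign": 17},
    "ACADM": {"total": 64, "VUS": 32, "pathogenic": 22},
}


def percent(part, total):
    """Return part/total as a percentage rounded to the nearest whole number."""
    return round(100 * part / total)


for gene, by_class in counts.items():
    total = by_class["total"]
    for label, n in by_class.items():
        if label != "total":
            print(f"{gene}: {n}/{total} {label} = {percent(n, total)}%")
```

Rounded to whole numbers, the sketch reproduces the stated fractions (34% and 58% for GALT, 13%, 74%, and 13% for PAH, and 50% and 34% for ACADM).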

Table 1 Initial classifications of all variants seen in targeted genes

In our survey, GALT had the largest number of VUS identified, possibly because sequence analysis of GALT is performed early in the diagnostic process for galactosemia. Classification of GALT variants is further complicated by the large number of variant alleles, including the common Duarte allele, which causes reduced activity in vivo and in vitro, but is not associated with classic galactosemia, even when inherited in trans with a classic allele. Of the 46 unique VUS in GALT, 41 were seen in a single family, whereas 5 VUS were each seen in two unrelated families. Twenty-four VUS were seen with one of the common pathogenic GALT variants. Five individuals were compound heterozygous for two unique VUS. One individual was homozygous for a VUS, and two siblings in a different family were homozygous for a VUS as well as the LA variant, which does not cause classic galactosemia.15 No biochemical results or indications for testing were provided in the cases with homozygous VUS, impeding the classification. Fifteen of the VUS observed in this study were compound heterozygous with a Duarte allele, the most common nonclassic variant, with an estimated carrier frequency of 5–15%.20 In 12 of these individuals, biochemical testing measured enzyme levels suggestive of Duarte galactosemia. In these individuals, it is possible that the variant allele is acting like a classic allele, but this cannot be proven from the evidence available with the case.

ACADM had the second most VUS of the genes included in our study, and it also has several common pathogenic variants. Although p.K329E is by far the most common pathogenic variant in MCAD deficiency, the advent of newborn screening has led to the identification of other common variants in this gene. Of the 30 patients with a VUS and a known pathogenic variant, 23 were observed with p.K329E, which is not an unexpected finding. Six VUS in ACADM were seen in more than one family (two to five families each). This includes five unrelated individuals found to carry c.600-18G>A, a variant that was recently shown to affect splicing.21 Six individuals were compound heterozygous for two VUS. Among these VUS was one deletion identified by copy-number analysis. Biochemical results (plasma acylcarnitine, urine organic acid, or acylglycine analysis) were available to our molecular laboratory in only one of these cases. Copy-number analysis is important for complete ACADM molecular analysis when two known pathogenic variants are not detected on sequence analysis. Despite this, only 7 of the 30 individuals who were compound heterozygous for ACADM variants underwent copy-number analysis; other than the VUS mentioned, the results were negative. Two of these seven were compound heterozygous for two different VUS, whereas five had one VUS and p.K329E. Two individuals with only a single VUS had deletion/duplication analysis; however, we were unable to determine whether these two individuals had biochemical testing suggestive of MCAD deficiency. During this review, functional studies of our most common ACADM VUS (c.600-18G>A; seen in five unique families) were published, allowing this variant to be reclassified.21

In PAH, we identified 17 VUS, 15 of which were only seen in a single individual or family. One of the recurrent VUS, c.1200-35C>T, has been seen in three affected individuals in our laboratory, each of whom also had two known pathogenic variants, and was reclassified as likely benign. All of the 15 private VUS were compound heterozygous with a pathogenic or likely pathogenic PAH variant. In four of these cases, the results of plasma amino acid analysis were available for consultation when interpreting the sequence analysis results. In an additional four individuals, the reason for referral was stated as “hyperphe (hyperphenylalaninemia).” Copy-number analysis was performed in only two cases and was negative in both.

Interpretation of ACADM and GALT variants is complicated by the fact that, although both disorders are recessive, unaffected heterozygous individuals can display a phenotype by biochemical testing. For classic galactosemia, the most common diagnostic assay is enzyme analysis, in which true heterozygous carriers are expected to demonstrate approximately 50% residual enzyme activity. For ACADM, the difference in acylcarnitine or acylglycine profiles can be more subtle, but true-positive profiles should be apparent to experienced laboratory technicians and differentiated from heterozygosity and other false-positive profiles.22,23,24 Although these profiles can be reliably identified by biochemical geneticists, the interpretation of this testing still needs to be communicated accurately to the laboratory performing molecular testing. Indications on a testing requisition such as “abnormal acylcarnitines” or “decreased GALT enzyme” are not enough to make an assumption about the biochemical phenotype that was observed. Laboratories performing molecular analysis should receive copies of relevant clinical notes and precise abnormal results that triggered the need for molecular confirmation.

There are fewer issues distinguishing carriers of pathogenic PAH variants based on their biochemical phenotype. However, recurring false positives by amino acid analysis may impact PAH analysis. False-positive biochemical screening (elevated phenylalanine) can be due to parenteral nutrition in infants or nonfasting samples, but such profiles are usually easily distinguished from those of affected individuals and should be reported appropriately by the biochemical genetics laboratory to establish that further testing is unnecessary.25

During our review, we reclassified 17 VUS (Table 2; nine in GALT, seven in ACADM, one in PAH). The time required for reclassification of a variant by a laboratory director experienced in variant interpretation ranged from 5 to 60 min. Most reclassifications to pathogenic or likely pathogenic were the result of published information of functional studies or large case cohorts. The majority of reclassifications to benign or likely benign were the result of newly available population data from exome and genome studies, such as the Exome Aggregation Consortium.26 For most variants that could not be reclassified, there was no new information available since the initial classification. Variants that remained uncertain, the information available at the time of interpretation, and the remaining type of information that would be needed to reclassify each are shown in Supplementary Table S1 online. Our laboratory benefits from having a biochemical genetics laboratory in-house, allowing easy review of records and consultation with laboratory directors. A majority of our cases for which biochemical test results were available were those for which the testing had been performed in-house. The presence of positive biochemical test results is an important factor to ensure that the correct gene is being analyzed. As is apparent in Supplementary Table S1 online, for many variants that remained VUS, additional functional studies had not been performed (or reported), no new cases were identified, and the variant was not observed in any population databases at significant frequency.

Table 2 VUS reclassified in this study

Discussion

This study revealed recurrent missing information that could have facilitated the original variant classifications. We have categorized this information according to the likelihood of prior availability and the level of difficulty to obtain (Table 3). In many cases, information we have categorized as level 1 either is already available to the diagnostic laboratory or could be made available with a moderate amount of additional effort and expense. The single most important category of information for variant reclassification is designated level 2. This information would require additional effort (e.g., specimen collection from parents) and cost (e.g., copy-number analysis, known variant testing for parents or siblings). Although these additional steps are justifiable even in the context of an individual family, it is important to note that they will cease to be required once a VUS is classified as either benign or pathogenic, and future individuals identified to have the variants will receive a definitive classification without additional testing. For many of the other variants, levels 1 and 2 data would be enough to shift a classification to either likely benign or likely pathogenic.

Table 3 Missing information in variant classification

Parental testing required to confirm that variants are inherited on opposite alleles often seems frustratingly easy to obtain, but practical barriers exist. Death, adoption, and family conflict are unavoidable obstacles; however, even when both parents are available, the associated costs may be too much for families to absorb. Children are often eligible for insurance programs not available to adults, and thus the child’s testing and care are covered, but the additional testing of parents may not be. When additional testing may prove that a child is unaffected and that detected biochemical abnormalities are due to carrier status, such testing may save the health-care system a significant amount of money going forward and avoid needless treatment of a healthy child.

The types of information listed in our level 3 category are more difficult to obtain. There needs to be a commitment from laboratories and funding agencies to direct time, effort, and money to understanding the vast amounts of genetic data generated from exome and genome studies, rather than just its production. As a relevant example, researchers and clinicians in Canada with the FORGE collaborative project use exome and genome sequencing in an attempt to diagnose clinically significant cases while at the same time collaborating with basic science groups across the country to develop functional assays to evaluate these novel variants and newly identified genes.27 If funding and regulatory agencies required and supported similar efforts at the level of interpretation and functional annotation, then many patients would benefit now and in the future. Laboratories would be able to provide more conclusive information, and efforts to recontact patients with updated information could become exceptional rather than routine. In our laboratory, when a variant is reclassified, an internal database generates a list of cases that require updated reports. Reports in which a VUS is updated to be either pathogenic or likely pathogenic are issued by a laboratory director; reports in which a VUS is classified as benign or likely benign are queued for review and release. Revised reports are communicated to the original ordering physician.
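The report-update workflow described above can be sketched in a few lines of Python. This is only an illustration of the routing rule as stated in the text, not our internal database: the case records, the field name `reported_vus`, the route labels, and the placeholder variant name "VARIANT_A" are all hypothetical.

```python
PATHOGENIC_CLASSES = {"pathogenic", "likely pathogenic"}
BENIGN_CLASSES = {"benign", "likely benign"}


def route_updated_reports(cases, variant, new_classification):
    """List (case_id, route) for every case whose report included the reclassified VUS.

    Routes follow the rule in the text: upgrades to (likely) pathogenic are issued
    by a laboratory director; downgrades to (likely) benign are queued for review
    and release. The route labels themselves are hypothetical.
    """
    if new_classification in PATHOGENIC_CLASSES:
        route = "issue_by_laboratory_director"
    elif new_classification in BENIGN_CLASSES:
        route = "queue_for_review_and_release"
    else:
        raise ValueError(f"not a reclassification out of VUS: {new_classification!r}")
    return [(case["id"], route) for case in cases if variant in case["reported_vus"]]


# Hypothetical cases; "VARIANT_A" and "VARIANT_B" are placeholders, not real variants.
cases = [
    {"id": "CASE-001", "reported_vus": {"VARIANT_A"}},
    {"id": "CASE-002", "reported_vus": {"VARIANT_B"}},
    {"id": "CASE-003", "reported_vus": {"VARIANT_A", "VARIANT_B"}},
]

worklist = route_updated_reports(cases, "VARIANT_A", "likely benign")
for case_id, route in worklist:
    print(case_id, route)
```

In either route, the revised report would then be communicated to the original ordering physician, as described above.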

The variants that we describe here were identified through a traditional process in which gene sequencing was performed once there was clinical or biochemical suspicion of a certain disorder. If whole-genome or exome sequencing is used as a newborn screen, then there will be an almost complete absence of clinical and biochemical information to put together with the variants identified in each baby. Our analysis emphasizes the challenge of variant interpretation in this scenario. With this DNA-based screen, the only affected babies likely to be identified will be those with previously identified pathogenic variants. Thus, we would predict that this approach would reduce screen sensitivity or, if VUS are included in positive screens, that the specificity for many disorders would decrease as well.

Conclusions

VUS are unavoidable results in clinical genetic testing, whether a single gene is sequenced by traditional methods or whether a large gene panel or exome is produced by next-generation sequencing. As laboratory directors and clinicians, we should make every effort to minimize uncertain findings and, as a result, provide conclusive results to as many patients as possible. This will require providers to accurately relay clinical information and indications for testing to the performing laboratory. Testing laboratories must commit to accurately reporting variants and clearly explaining the rationale of any additional testing requested and the possible changes in variant interpretation that may result. The data presented here show that, even for well-understood disorders, VUS are being reported either because of incomplete communication between the laboratory and the ordering provider or because of missing test results, such as copy-number variant or parental testing, which can be due to financial constraints. With the cooperation of ordering providers and payers, simple modifications, such as implementing a panel approach to testing that includes copy-number analysis and confirmation of inheritance, may allow laboratories to obtain crucial information with a minimal amount of additional effort. More involved studies, such as those required to prove pathogenicity of a specific variant through functional analysis, will require a commitment by clinical laboratories to share their findings with research groups who are able to perform these studies. Combined efforts of all parties involved are necessary to provide the best and most conclusive results for all patients in a timely manner.

Disclosure

K.B.G., S.H.A., M.H., and P.L.H. are employed by EGL Genetic Diagnostics, LLC, which performs clinical testing for the disorders included in this study. The other authors declare no conflict of interest.