Phenotypic and clinical information are often critical to interpreting the significance of genetic variants discovered through genome-scale sequencing. However, the amount of information necessary and the ability of clinical laboratories to utilize additional information to clarify variants of uncertain significance (VUS) are unclear. The clinical and molecular expertise required to determine the molecular diagnosis in an individual case may also vary. These are among the many challenges that face our community as we move along the path toward routine use of genomic sequencing in clinical care. The article by Narravula and colleagues, published in this issue of Genetics in Medicine, addresses the types of “missing information” that were useful for reclassifying variants initially classified as VUS.1 These findings provide insight into the real-world application of variant classification guidelines.2

When a variant is known to be pathogenic or benign, based on abundant data in the medical literature or the previous experience of the laboratory, phenotypic information is not needed to determine pathogenicity (e.g., CFTR F508del). A clinical laboratory can confidently assert that a variant or combination of variants with known pathogenicity would be consistent with carrier status for a particular genetic condition, or with that individual having a high likelihood of developing the condition. Assessing the individual’s phenotype, contextualizing the genotype/phenotype relationship, and incorporating it into the patient’s clinical care (the “clinical correlation” often recommended) are the responsibilities of the clinician. Unfortunately, even when a variant is “known” to be pathogenic, details about penetrance and expressivity of that variant often remain unpublished because of lack of novelty or because it is difficult to aggregate sufficient numbers of patients with complete genotypic and phenotypic data to merit publication. This will probably be the case in most instances because few variants are as intensively studied and well known as CFTR F508del. Characterizing penetrance and expressivity of such variants may require well-curated databases containing case-level data aggregated from multiple sources.

When a variant is novel, poorly characterized, or controversial in terms of pathogenicity, additional data are required to make a definitive classification. Rare uncharacterized variants are often classified as VUS because, de facto, not enough is known to allow determination of pathogenicity. The Exome Aggregation Consortium3 resource contains data from large numbers of individuals from multiple populations, thus providing valuable population-level data that can be used to adjudicate variant pathogenicity. In many cases, this new information allows an assertion of benign or likely benign because the allele frequency in the population exceeds what would be expected for a variant that causes a particular monogenic condition, given the prevalence of the condition, the proportion of cases attributable to that gene, and the penetrance of the disorder. The value of having population-level data at this scale cannot be overstated; Exome Aggregation Consortium population frequency data are cited as “key information” in 5 of the 17 variants reclassified by Narravula and colleagues.1 Continued development of high-quality population reference databases, including the Precision Medicine Initiative, will greatly facilitate reclassification of variants over time.
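The frequency argument above can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from Narravula and colleagues: it assumes a simplified dominant model in which roughly two carriers exist per copy of a rare allele, and all function names, parameter names, and example values are hypothetical.

```python
def max_credible_allele_frequency(prevalence, gene_fraction, allele_fraction, penetrance):
    """Upper bound on the population frequency of a causal allele.

    Assumes a simplified dominant model (an assumption of this sketch):
    carriers are about 2 * allele frequency, so the share of the population
    affected because of this allele is about 2 * AF * penetrance, which cannot
    exceed prevalence * gene_fraction * allele_fraction.
    """
    if not 0 < penetrance <= 1:
        raise ValueError("penetrance must be in (0, 1]")
    return prevalence * gene_fraction * allele_fraction / (2 * penetrance)


def is_too_common(observed_af, prevalence, gene_fraction, allele_fraction, penetrance):
    """True if the observed frequency exceeds the bound for a causal allele."""
    bound = max_credible_allele_frequency(
        prevalence, gene_fraction, allele_fraction, penetrance
    )
    return observed_af > bound


# Hypothetical inputs: condition prevalence 1 in 500, 40% of cases attributable
# to the gene, at most 10% of those to any single variant, penetrance 50%.
bound = max_credible_allele_frequency(0.002, 0.4, 0.1, 0.5)
```

With these illustrative inputs, a variant seen at a frequency of 1 in 1,000 would be more common than the bound allows, whereas one seen at 1 in 100,000 would not. Uncertainty in any of the inputs translates directly into uncertainty in the bound.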

As powerful as population allele frequency data are, there are nuances to consider when using such information to inform pathogenicity assessments. Incomplete knowledge about disease prevalence, locus heterogeneity, penetrance, and expressivity impairs confident classification of a variant as benign. In many cases, variants previously reported as being causally implicated in disease are present in population databases (among individuals whose phenotypic state is typically unknown) at frequencies that raise substantial concerns about their pathogenicity but are not high enough to definitively allow classification as benign. Furthermore, extreme rarity is generally necessary but not sufficient to establish pathogenicity. In these scenarios, the quality of the available case-level evidence plays a critical role in determining pathogenicity. In some instances, laboratories are able to acquire family segregation data or details about a patient’s phenotypic state, whereas in others additional evidence is obtained from published works. In the work by Narravula et al., private laboratory data and recently published data were each cited as “key information” in 10 reclassified variants (7 of which cited both in-house and published data together).1 How many of these variants would remain VUS if the key publication had not been available in this interval? How many other clinical laboratories might have in-house data that would help in the interpretation of other variants, and how many would have VUS of their own that could be reclassified if the in-house data of Narravula et al. were available to them?
It is in data sharing that the ClinGen consortium is filling a critical gap by facilitating variant adjudication through deposition of clinical assertions into the ClinVar database, conflict resolution between submitters, and expert curation.4 Nevertheless, even when laboratories are evaluating the same data, conflicting assertions will still exist,5 necessitating further data collection and consensus building.

Such examples emphasize the considerable value of widely shared data in the clinical interpretation of variants. However, there remains the question of how phenotypic data help to interpret the “case,” i.e., how the overall combination of variants leads to a conclusion about the possible cause(s) of a patient’s symptoms. This domain is where variant pathogenicity assessment and clinical correlation (areas in which the clinical laboratorian and the clinician have overlapping expertise) collide, and where communication is critical for arriving at a correct diagnosis and thus for optimal patient care. Whereas family history and segregation data to define inheritance or de novo occurrence can provide evidence to support the assessment of an individual variant, detailed clinical information is also needed to determine whether any variant or combination of variants is likely to provide a molecular explanation (as opposed to simply being an “incidental” or “secondary” finding). That said, it is important to acknowledge the pitfalls of genotype/phenotype correlation, in which molecular analysts may be misled by phenotypic features from a checklist and clinicians may be overly influenced by a genomic result and ascribe significance to mere coincidence.

Technology has dramatically changed how clinicians diagnose rare disorders. In the past, when testing was predominantly one gene at a time, clinicians needed to generate a differential diagnosis and then evaluate those conditions with a combination of diagnostic tests, often sequentially. In that era, laboratories often developed expertise around one gene or a limited set of genes, reporting back a wide range of variants (including, in some cases, a list of benign variants detected). Advances in array technology revolutionized genetic testing for chromosomal gains and losses, and massively parallel sequencing technology has enabled genetic testing of large panels or the entire genome, altering the way that clinicians are able to evaluate patients. The differential diagnosis remains no less important; however, the roles of the clinician and the laboratory in developing that list of possible diagnoses have changed. Now that gene panels routinely exceed 50–100 genes, and now that genome-scale sequencing yields hundreds of rare variants across thousands of genes implicated in Mendelian disorders, laboratories can no longer have expertise in every gene, nor can they easily evaluate every variant. Further complicating the landscape is the fact that untargeted approaches will discover numerous variants that are irrelevant to the diagnostic question, diluting the analysis and potentially clouding the results with secondary findings. Yet, there is little doubt that genome-scale sequencing will be the most efficient way to generate the data needed to ultimately yield a molecular diagnosis for many patients with rare genetic disorders.

This new state of affairs has led to the development of informatic variant filtering and prioritization schemes that take into account the characteristics of the variant (e.g., protein effect, allele frequency, in silico predictions, and trio information) as well as phenotypic characteristics of the patient from clinical details provided on a test requisition form or even copies of medical records. Results returned to clinicians are often a highly selected subset of variants that could provide a plausible explanation; therefore, the combination of possible pathogenicity and possible relationship to the patient’s symptoms determines the return of results. There is great potential for heterogeneity in returned results because laboratories may have different criteria and thresholds that guide return.6 The clinician is left to wonder which variants fell below the laboratory’s threshold for return (and why), or which genes were carefully evaluated and excluded as a cause. Clearly, there is room for enhanced communication between clinicians and laboratorians, whose roles are no longer so one-sided; both parties can apply their domains of expertise, contribute to the development of the differential diagnosis, and evaluate the plausible roles of variants in the patient’s condition. Phenotype, family history, genotype/phenotype segregation data, variant assessment, and knowledge about rare inherited conditions must all be considered to enable optimal molecular diagnosis. Considering all of these critical variables, it is clear that we need novel ways to facilitate communication and interaction between clinicians and laboratories, new models for how this shared activity should be reimbursed, and equitable acknowledgement of contributions in subsequent academic publications.
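To illustrate why laboratories applying different criteria and thresholds can return different results, the filtering and prioritization step described above can be sketched as follows. This is an illustrative toy, not any laboratory’s actual pipeline: the field names, consequence categories, frequency cutoff, and scoring weights are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical set of protein effects treated as potentially damaging.
DAMAGING = {"nonsense", "frameshift", "splice_site", "missense"}


@dataclass
class Variant:
    gene: str
    consequence: str        # predicted protein effect
    population_af: float    # allele frequency in a reference database
    in_silico_score: float  # illustrative scale: 0 (tolerated) to 1 (damaging)
    de_novo: bool = False   # trio information, when available


def prioritize(variants, phenotype_genes, max_af=0.001):
    """Filter out common or non-damaging variants, then rank the remainder.

    The cutoff `max_af` and the additive weights below are arbitrary choices
    for this sketch; changing them changes which variants are returned.
    """
    kept = [
        v for v in variants
        if v.population_af <= max_af and v.consequence in DAMAGING
    ]

    def score(v):
        s = v.in_silico_score
        if v.de_novo:
            s += 1.0  # de novo occurrence supports pathogenicity
        if v.gene in phenotype_genes:
            s += 1.0  # gene matches the clinical details supplied
        return s

    return sorted(kept, key=score, reverse=True)


# Hypothetical call set and phenotype-derived gene list.
candidates = [
    Variant("GENE_A", "missense", 0.0001, 0.9, de_novo=True),
    Variant("GENE_B", "synonymous", 0.00001, 0.1),
    Variant("GENE_C", "frameshift", 0.05, 1.0),
    Variant("GENE_D", "nonsense", 0.0002, 0.8),
]
ranked = prioritize(candidates, phenotype_genes={"GENE_D"})
```

In this toy example, the synonymous variant and the common frameshift variant never reach the clinician. Raising `max_af` or supplying a different phenotype gene list would change the returned set, which is one way the heterogeneity described above can arise.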

Finally, we must consider implications for the potential use of genomic sequencing as a means of primary screening in the general population for rare but actionable disorders. As noted by Narravula et al., the genes they evaluated were all involved in conditions screened for in all infants. Although they did not specifically address the predictive value of genetic variant data in a screening context, their data highlight the significant challenges of this approach and raise the question of whether VUS results should be returned in a nondiagnostic setting. We have come to the conclusion that the thresholds for returning variants of different levels of known or unknown pathogenicity depend on a number of factors, including the inherent burden of rare variants in a gene among the general population, downstream consequences of false-positive and false-negative results, and availability of subsequent diagnostic tests to eliminate false-positive results or assess the chance of developing symptoms of the condition.7 The application of genome-scale sequencing in a population screening mode, in which the additional information that aids in evaluating variant pathogenicity is inherently absent, requires a great deal of additional study before it can be broadly implemented.

In conclusion, the experience of Narravula and colleagues probably occurs in all clinical genetics laboratories on a nearly continual basis. New models of interaction between clinicians and laboratories may be necessary, similar to the way in which physicians visit the radiology reading room or call the radiologist to discuss radiographic findings. Furthermore, databases such as ClinVar, which aggregate curated variants with clinical assertions in the public domain, will be critical for variant interpretation, and efforts such as ClinGen will contribute to the expert curation process. Ultimately, detailed phenotypic and genotypic information will need to be combined to fully understand the molecular basis of human disease.

Disclosure

J.S.B.’s work has been funded by the NIH.