Abstract
Purpose:
Using exome sequence data from 159 families participating in the National Institutes of Health Undiagnosed Diseases Program, we evaluated the number and inheritance mode of reportable incidental sequence variants.
Methods:
Following the American College of Medical Genetics and Genomics recommendations for reporting of incidental findings from next-generation sequencing, we extracted variants in 56 genes from the exome sequence data of 543 subjects and determined the reportable incidental findings for each participant. We also defined variant status as inherited or de novo for those with available parental sequence data.
Results:
We identified 14 independent reportable variants in 14 of 159 (8.8%) families. For nine families with parental sequence data in our cohort, a parent transmitted the variant to one or more children (nine minor children and four adult children). The remaining five variants occurred in adults for whom parental sequences were unavailable.
Conclusion:
Our results are consistent with the expectation that a small percentage of exomes will result in identification of an incidental finding under the American College of Medical Genetics and Genomics recommendations. Additionally, our analysis of family sequence data highlights that genome and exome sequencing of families has unavoidable implications for immediate family members and therefore requires appropriate counseling for the family.
Genet Med 16(10), 741–750.
Introduction
“Incidental findings” are defined as genetic variants with medical or social implications that are discovered during genetic testing for an unrelated indication.1 On the basis of recent publications,2 the American College of Medical Genetics and Genomics (ACMG) Working Group on Incidental Findings in Clinical Exome and Genome Sequencing determined that looking for and reporting some incidental findings would probably have medical benefit for patients and their families. The working group therefore recommended reporting incidental findings from a “minimum list” of 56 genes for individuals having clinical exome or genome sequencing.3 This recommendation has been widely debated and openly challenged.4
Although the return of incidental findings represents an important step forward in the use of sequencing for medical benefit,5 implementing these recommendations requires the development of infrastructure to support evaluation and reporting.3 Family members other than the proband are often included in diagnostic exome sequencing, so reporting also has implications for unaffected family members. The typical number of reportable variants that will be generated in practice has not been widely studied. One study of 572 subjects, selected for atherosclerosis phenotypes, found that ~1% of exomes may require disclosure of an incidental genetic finding, but the set of genes analyzed in that study did not include all the genes in the ACMG list, and the cohort was nonfamilial.2 A more recent study found that ~3.4% of European ancestry exomes and 1.2% of African ancestry exomes in the National Heart, Lung, and Blood Institute Exome Sequencing Project bear actionable pathogenic or likely pathogenic incidental findings in 114 genes.6 More data are needed to assess the possible impact of the ACMG recommendations in a variety of clinical settings. This is an important issue because resources are required to implement the recommendations.
We analyzed research exome sequence data from 543 individuals derived from 159 families. For the recommended 56 genes, this analysis identified 14 independent reportable variants in the exome sequence data of 27 participants. In nine families with parental sequence data, a parent transmitted the variant to one or more children. These analyses provide data that may be used to refine strategies for the reporting of incidental findings.
Materials and Methods
Subject cohort
Family members gave informed consent or assent to protocol 76-HG-0238, “Diagnosis and Treatment of Patients With Inborn Errors of Metabolism and Other Genetic Disorders,” approved by the institutional review board of the National Human Genome Research Institute. The exome sequence data were derived from a 159-family cohort consisting of 543 subjects, with 188 affected subjects, 137 unaffected siblings, and 218 parents. The average and median ages of the 543 subjects at the time of sequencing were 34.0 (SD: 20.8 years) and 37 years, respectively. Some subjects were deceased at the time of sequencing, and for those subjects projected age at time of sequencing was used because it is anticipated that incidental findings will only be sought in living subjects. Self-reported ancestry was white/European (89.1%), black/African American (4.1%), unknown (3.3%), Asian (2.2%), and multiracial (1.3%) (Supplementary Table S1 online). These families included all those admitted to the National Institutes of Health (NIH) Undiagnosed Diseases Program (UDP) and selected for exome analysis as previously described.7 The sequencing was performed on a research basis, not in a Clinical Laboratory Improvement Amendments–certified fashion.
Exome sequencing
Genomic DNA was extracted from peripheral whole blood using the Gentra Puregene Blood Kit (Qiagen, Germantown, MD) as per the manufacturer’s protocol. The Illumina TruSeq exome capture kit (Illumina, San Diego, CA), which targets ~60 million bases consisting of the Consensus Coding Sequence annotated gene set as well as some structural RNAs, was used. Captured DNA was sequenced on the Illumina HiSeq platform until coverage was sufficient to call high-quality genotypes at 85% or more of targeted bases.
Alignment and genotype calling
Reads were mapped to National Center for Biotechnology Information (NCBI) build 37 (hg19) using the Illumina ELAND aligner. When at least one read in a pair mapped to a unique location in the genome, that read and its pair were then aligned with Novoalign (Novocraft, Selangor, Malaysia). These alignments were stored in BAM format and then fed as input to bam2mpg (http://research.nhgri.nih.gov/software/bam2mpg/index.shtml), which called genotypes using a Bayesian algorithm (most probable genotype, or MPG).8
Coverage
Using the UCSC Genome Browser’s hg19 human genome reference exon annotations for the 56 genes, we identified 1,257 discrete exon regions, including the untranslated regions. We recorded base-by-base coverage (Supplementary Table S2 online) and calculated the percentage of each exon with 10-, 20-, or 30-fold coverage (Supplementary Tables S3–S5 online). We also summarized how many exons had at least 90% of their bases covered to at least each of these coverage thresholds (Table 1).
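The coverage summary described above can be sketched as follows. This is a minimal illustration, assuming per-base read depths are already available as lists of integers per exon; the function names and the toy depths are hypothetical and are not taken from the study's pipeline.

```python
from typing import Dict, List, Sequence

THRESHOLDS = (10, 20, 30)  # fold-coverage cutoffs named in the text


def fraction_covered(depths: Sequence[int], threshold: int) -> float:
    """Fraction of an exon's bases with read depth >= threshold."""
    if not depths:
        return 0.0
    return sum(d >= threshold for d in depths) / len(depths)


def summarize_exons(
    exon_depths: Dict[str, List[int]], min_fraction: float = 0.9
) -> Dict[int, int]:
    """For each threshold, count exons with >= min_fraction of bases covered."""
    return {
        t: sum(
            fraction_covered(depths, t) >= min_fraction
            for depths in exon_depths.values()
        )
        for t in THRESHOLDS
    }


# Toy per-base depths for two hypothetical exons (not real data).
example = {
    "exon_A": [35] * 10,       # uniformly deep coverage
    "exon_B": [25] * 9 + [5],  # one poorly covered base
}
summary = summarize_exons(example)
print(summary)
```

In this toy input, both exons reach the 90% mark at 10- and 20-fold coverage, but only the first does at 30-fold.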
Annotations
The variants were annotated using Annovar.9 Variants and genes listed in Human Gene Mutation Database Professional were added to the annotations. We also used annotations extracted from the Supplementary Data online published by Johnston et al.2 and added annotations for variants listed in ClinVar10 and locus-specific databases (LSDBs) registered in the Leiden Open Variation Database.11 For LSDBs not registered in Leiden Open Variation Database, annotations were manually collected from the individual LSDBs and used to annotate the variants on the basis of matching Human Genome Variation Society nomenclature.
Data extraction
Variants within the 56 genes recommended by the ACMG were considered if they had at least one minor allele call with a minimum coverage of 20 and a minimum most probable genotype (mpg)/coverage ratio of 0.5.12
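The quality filter described above can be expressed as a short sketch. The `GenotypeCall` record and its field names are assumptions made for illustration; only the two cutoffs (coverage of 20 and an mpg/coverage ratio of 0.5) come from the text.

```python
from dataclasses import dataclass
from typing import Iterable

MIN_COVERAGE = 20     # minimum read depth stated in the text
MIN_MPG_RATIO = 0.5   # minimum mpg/coverage ratio stated in the text


@dataclass
class GenotypeCall:
    sample: str
    coverage: int           # read depth at the site
    mpg_score: float        # most probable genotype score from the caller
    has_minor_allele: bool  # True if the call carries a nonreference allele


def passes_quality(call: GenotypeCall) -> bool:
    """Apply the depth and mpg/coverage cutoffs to one genotype call."""
    if call.coverage < MIN_COVERAGE:
        return False  # also guards against division by zero below
    return call.mpg_score / call.coverage >= MIN_MPG_RATIO


def variant_is_considered(calls: Iterable[GenotypeCall]) -> bool:
    """A variant is kept if at least one minor-allele call passes both cutoffs."""
    return any(c.has_minor_allele and passes_quality(c) for c in calls)


# Hypothetical calls at one site in three subjects.
calls = [
    GenotypeCall("s1", coverage=12, mpg_score=30.0, has_minor_allele=True),
    GenotypeCall("s2", coverage=40, mpg_score=10.0, has_minor_allele=True),
    GenotypeCall("s3", coverage=40, mpg_score=25.0, has_minor_allele=True),
]
print(variant_is_considered(calls))
```

Here only the third call satisfies both cutoffs, which is enough for the variant to be retained.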
Data analysis
The ACMG recommendations state that “known pathogenic” variants in 56 genes (and “expected pathogenic” variants in a subset of those 56) should be reported to subjects sequenced for unrelated clinical reasons. The LSDBs and catalogs of clinically relevant variants, such as Human Gene Mutation Database and ClinVar, catalog variants identified in a gene, together with annotations of each variant as “pathogenic,” “probable pathogenic,” “variant of unknown significance,” “probable nonpathogenic,” or “nonpathogenic” (or similar categories). Such annotations can serve as a foundation for determining whether a variant is “known pathogenic.”
An accepted standard for determination of variant pathogenicity (with or without consultation of the databases described above) has not emerged, although several have been proposed.13 Various methods have been proposed to evaluate the likelihood of pathogenicity for variants of unknown significance in genes associated with disease,14,15,16 but we did not use them because they depend on data unavailable to us, i.e., defined penetrance15,16 or population frequency and phenocopy rate.14 Additionally, we did not use allele prevalence as supporting criteria because (i) the phenotyping of subjects included in the 1000 Genomes and Exome Sequencing Project cohorts is incomplete17; (ii) many of the disorders are of adult-onset type and therefore might not be expressed fully among subjects in the 1000 Genomes and Exome Sequencing Project cohorts17; (iii) some disorders have environmentally dependent expressivity (e.g., malignant hyperthermia susceptibility) and therefore might not be expressed fully among subjects in the 1000 Genomes and Exome Sequencing Project cohorts17; and (iv) large control cohorts (>10,000) are needed to properly evaluate case–control disparities for rare variants.13
Understanding that potential harm is posed both by false-positive and false-negative incidental findings and that variants discovered in sporadic cases may have a high false-positive rate,18,19,20 we chose the following criteria for accepting variants as “known pathogenic”: (i) designation in at least one variant database as “pathogenic” or “probable pathogenic” and supporting evidence such as experimental assays or segregation with disease or (ii) meeting the criteria for “expected pathogenic” (see below) and a listing in at least one variant database as “pathogenic.” This process required review of the literature and ~320 person-hours from individuals knowledgeable in genetics, experimental methodology, and medicine. Approximately 200 hours were spent intersecting LSDBs with our variant set and flagging variants for further review. The remaining ~120 hours were spent reviewing literature and splice predictions for individual variants under consideration for reporting.
Our minimum acceptable segregation patterns for autosomal dominant disorders were either a confirmed de novo variant in an affected child with two unaffected parents or segregation of the variant to three affected family members in two generations. We judged requiring five informative meioses or positive evidence of linkage as unreasonably stringent criteria21 and only requiring two affected family members in two generations as too lax a criterion for association of a variant with disease.18,19 We did not accept clinically identified variants claimed to cause disease as pathogenic without reported functional data or familial segregation.
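The minimum segregation criteria for autosomal dominant disorders can be written as a small predicate. This is only a sketch: the parameter names are hypothetical, and in practice these inputs came from manual literature review rather than structured data.

```python
def meets_min_segregation(
    confirmed_de_novo: bool,
    both_parents_unaffected: bool,
    affected_carriers: int,
    generations: int,
) -> bool:
    """Minimum acceptable segregation evidence for an autosomal dominant disorder.

    Path 1: a confirmed de novo variant in an affected child of two
            unaffected parents.
    Path 2: the variant segregates to at least three affected family
            members across at least two generations.
    """
    de_novo_path = confirmed_de_novo and both_parents_unaffected
    familial_path = affected_carriers >= 3 and generations >= 2
    return de_novo_path or familial_path


# Two affected carriers in two generations falls short of the criteria.
print(meets_min_segregation(False, False, affected_carriers=2, generations=2))
# Three affected carriers in two generations meets them.
print(meets_min_segregation(False, False, affected_carriers=3, generations=2))
```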
To define variants as “expected pathogenic,” we used previously described criteria.22 Briefly, these include mutations leading to premature translation termination, loss of a translation termination codon, loss of a translation initiation codon, and alteration of canonical splice donor or acceptor sites.
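Taken together, the “known pathogenic” and “expected pathogenic” criteria amount to a small decision rule, sketched below. The consequence labels, the `Variant` record, and the returned category strings are hypothetical names chosen for illustration. The sketch does not model the restriction of “expected pathogenic” reporting to a subset of the 56 genes, and it treats the literature review as a single boolean.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical labels for the consequence types listed in the text.
EXPECTED_PATHOGENIC_CONSEQUENCES = {
    "premature_termination",   # premature translation termination
    "stop_loss",               # loss of a translation termination codon
    "start_loss",              # loss of a translation initiation codon
    "canonical_splice_site",   # altered canonical donor or acceptor site
}


@dataclass
class Variant:
    consequence: str
    database_labels: Set[str] = field(default_factory=set)
    has_supporting_evidence: bool = False  # functional assay or segregation


def classify(v: Variant) -> str:
    """Return 'known pathogenic', 'expected pathogenic', or 'not reportable'."""
    expected = v.consequence in EXPECTED_PATHOGENIC_CONSEQUENCES
    # Criterion (i): database designation plus supporting evidence.
    rule_i = (
        bool(v.database_labels & {"pathogenic", "probable pathogenic"})
        and v.has_supporting_evidence
    )
    # Criterion (ii): expected pathogenic and listed as "pathogenic".
    rule_ii = expected and "pathogenic" in v.database_labels
    if rule_i or rule_ii:
        return "known pathogenic"
    if expected:
        return "expected pathogenic"
    return "not reportable"


print(classify(Variant("missense", {"probable pathogenic"}, True)))
print(classify(Variant("premature_termination")))
print(classify(Variant("missense", {"pathogenic"}, False)))
```

Under these assumptions, a database listing without supporting evidence is not sufficient for a missense variant, which mirrors the conservative stance described in the text.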
Missense variants not previously associated with disease are considered to comprise a class of variants that may or may not cause disease and therefore are not automatically disclosed to the patient.22 Furthermore, the lack of information regarding these variants in an LSDB, Human Gene Mutation Database, or ClinVar indicates that they are unlikely to be recognized by the medical genetics community as known pathogenic variants. We therefore designated missense variants not present in these databases as nonreportable.
Both alleles of MUTYH must be mutated to meet the ACMG reporting recommendations. We therefore selected homozygous nonreference variants and paired compound-heterozygous variants. We deemed a variant pair reportable only if each variant of the pair met the criteria of being listed as “pathogenic” in at least one variant database and having supporting evidence such as experimental assays or segregation with disease.
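The biallelic requirement for MUTYH can be sketched as below. The record type and the `parental_origin` field are assumptions: the sketch presumes that heterozygous variants have already been phased (for example, from parental genotypes), which the text does not describe in detail.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional


@dataclass
class MutyhVariant:
    variant_id: str
    zygosity: str                   # "hom" or "het"
    qualifies: bool                 # listed "pathogenic" plus supporting evidence
    parental_origin: Optional[str]  # "maternal", "paternal", or None if unphased


def mutyh_reportable(variants: List[MutyhVariant]) -> bool:
    """True if both MUTYH alleles carry a qualifying variant."""
    # A homozygous nonreference qualifying variant affects both alleles.
    if any(v.zygosity == "hom" and v.qualifies for v in variants):
        return True
    # Compound heterozygote: two qualifying variants on opposite alleles.
    hets = [v for v in variants if v.zygosity == "het" and v.qualifies]
    return any(
        a.parental_origin is not None
        and b.parental_origin is not None
        and a.parental_origin != b.parental_origin
        for a, b in combinations(hets, 2)
    )


# Hypothetical subject with two qualifying variants, one from each parent.
pair = [
    MutyhVariant("var1", "het", True, "maternal"),
    MutyhVariant("var2", "het", True, "paternal"),
]
print(mutyh_reportable(pair))
```

A single heterozygous variant, or a pair in which either member fails the evidence criteria, is not reported under this rule.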
To count the number of reportable incidental findings per independent exome, one subject per family was selected randomly, and the number of incidental findings in those subjects was counted. We also counted the number of reportable incidental findings in subjects who are currently minors and noted whether the disease associated with the variant in question was of adult-onset or childhood-onset type.
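Counting findings per independent exome by random selection of one subject per family could look like the following sketch. The subject and family identifiers are invented, and the optional seed is an addition for reproducibility that the text does not mention.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional


def one_subject_per_family(
    subject_to_family: Dict[str, str], seed: Optional[int] = None
) -> Dict[str, str]:
    """Pick one subject at random from each family."""
    rng = random.Random(seed)
    families: Dict[str, List[str]] = defaultdict(list)
    for subject, family in sorted(subject_to_family.items()):
        families[family].append(subject)
    return {fam: rng.choice(members) for fam, members in sorted(families.items())}


def count_findings(
    selected: Dict[str, str], findings: Dict[str, List[str]]
) -> int:
    """Total reportable findings among the selected subjects."""
    return sum(len(findings.get(subject, [])) for subject in selected.values())


# Hypothetical cohort: two families, one finding shared by a parent and child.
cohort = {"p1": "famA", "c1": "famA", "p2": "famB", "c2": "famB"}
findings = {"p1": ["variant_x"], "c1": ["variant_x"]}
selected = one_subject_per_family(cohort, seed=0)
print(selected, count_findings(selected, findings))
```

Because both sequenced members of the first hypothetical family carry the finding, the count is the same whichever of them is drawn.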
Phenotype correlation
Family and medical history and pertinent laboratory findings were reviewed where available for individuals with a reportable variant.
Results
Across the 543 exomes of the UDP cohort, there were 5,948 variants in the 56 ACMG-recommended genes (Figure 1; see Supplementary Table S2 online for a complete list of all variants with annotations) when compared with the human reference sequence (NCBI build 37; hg19) (Table 2). To select variants of sufficient quality, we limited further analyses to those variants with a minimum coverage of 20 reads and a minimum mpg/coverage ratio of 0.5. Of the 5,928 variants that remained, 4,932 were judged highly unlikely to be reportable under ACMG recommendations because they were not present in LSDBs and were localized to introns outside of the canonical splice sites (67%), resided in 3′-untranslated regions (13%), encoded synonymous amino acid changes (7.5%), or resided in other non–protein-coding regions such as 5′-untranslated regions or the kilobase flanking the gene (6%) (Figure 1). Two other classes of variants that we excluded on the basis of absence from LSDBs, predicted functional impact, and per ACMG recommendations22 were missense variants of unknown significance (6.5%) and variants predicted to affect splicing but outside of the canonical splice sites.
Each of the remaining 996 variants was then annotated with information available from Human Gene Mutation Database, ClinVar, and LSDBs and for the predicted consequence (e.g., frameshift, splicing, and termination). Of these, 250 variants were listed as known pathogenic or probable pathogenic in at least one database or were known to cause a premature translation termination, loss of a translation termination codon, loss of a translation initiation codon, or alteration of canonical splice donor or acceptor site. After reviewing the literature for supporting evidence to justify designating these 250 variants as pathogenic, 3 variants met criteria as “expected pathogenic” and 11 as “known pathogenic” (Table 3 and Figure 1c). These 14 variants were present in 27 subjects from 14 families. No reportable variant was observed in more than one family. Thus, 5.0% (27/543) of the exomes in our cohort had a finding that would result in disclosure under the ACMG recommendations.
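The counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text, including the family-level rate quoted in the abstract; the variable names are illustrative.

```python
def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Filtering funnel: quality-passed variants split into excluded and retained.
quality_passed = 5928
excluded = 4932
retained = 996
funnel_consistent = excluded + retained == quality_passed

# Reportable variants: "expected pathogenic" plus "known pathogenic".
reportable_variants = 3 + 11

exome_rate = percent(27, 543)    # subjects with a reportable finding
family_rate = percent(14, 159)   # families with a reportable finding

print(funnel_consistent, reportable_variants, exome_rate, family_rate)
# prints: True 14 5.0 8.8
```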
To determine how many of the variants arose de novo as opposed to being inherited, we analyzed the parental sequences in 9 of the 14 families where parental sequences were available. For all nine families (nine minor children and four adult children), one parent transmitted the variant to one or more children. The remaining five variants were identified in an adult for whom parental sequence was not available.
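The distinction between inherited and de novo variants depends on which parental genotypes are available, as the sketch below illustrates. The function name and return labels are hypothetical, and `None` is used here to mark a parent who was not sequenced.

```python
from typing import Optional


def variant_origin(
    child_carries: bool,
    mother_carries: Optional[bool],
    father_carries: Optional[bool],
) -> str:
    """Classify a child's variant from parental carrier status.

    A parent value of None means that parent's sequence is unavailable.
    """
    if not child_carries:
        return "absent"
    if mother_carries or father_carries:
        return "inherited"        # at least one sequenced parent carries it
    if mother_carries is None or father_carries is None:
        return "undetermined"     # cannot exclude transmission from an unsequenced parent
    return "de novo"              # both parents sequenced, neither carries it


print(variant_origin(True, True, False))   # transmitted by the mother
print(variant_origin(True, None, None))    # no parental sequence available
print(variant_origin(True, False, False))  # absent from both parents
```

The "undetermined" case corresponds to the five adults in the cohort for whom parental sequence was not available.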
We identified a reportable incidental finding in nine minor subjects in our cohort. For these nine subjects, five had incidental findings associated with adult-onset conditions, and four had incidental findings associated with childhood-onset conditions.
A review of family and personal medical history revealed pertinent medical findings in only two cases. An adult subject with an SCN5A mutation had a history of exercise-induced fatigue and a first-degree relative with an unspecified early-onset cardiac condition; this relative was not enrolled in our study and, therefore, we could not evaluate segregation of the variant or verify phenotypic relevance. Another adult subject had an APOB mutation with a normal lipid profile: serum cholesterol = 161 mg/dl (normal: <200 mg/dl), low-density lipoprotein = 93 mg/dl (normal: <100 mg/dl), and high-density lipoprotein = 56 mg/dl (high risk: <40 mg/dl, low risk: ≥60 mg/dl).
Discussion
By analyzing exome sequence data from 543 individuals distributed among 159 families, we clarify the reporting burden for the recommendations of the ACMG Working Group on Incidental Findings in Clinical Exome and Genome Sequencing.3 We discovered 14 reportable variants for 27 individuals in 14 families. Therefore, 8.8% of families enrolled for exome sequencing under the NIH UDP protocol had incidental findings that would have required disclosure had the sequencing been performed by a Clinical Laboratory Improvement Amendments–certified laboratory.
Compared with the 1% rate of reportable incidental findings observed for 23 of the 56 genes analyzed by Johnston et al.2 and the rate of 1.2–3.4% for 114 genes analyzed by Dorschner et al.,6 we found a higher rate of reportable incidental findings. This increased rate of reportable incidental findings could arise for several reasons, including (i) increased coverage and quality of sequencing of the exome, (ii) differences in variant selection, (iii) differences in the subject cohort, or (iv) higher frequency of reportable variants in the ACMG-recommended genes compared with the previously studied genes.
Regarding the sequence coverage and quality, the study of Johnston et al.2 analyzed a smaller portion of the exome and aligned the sequences against an earlier version of the human reference genome. These two factors suggest that inclusion of more of the human exome and refinement of the reference genome might increase the number of detectable reportable variants. Testing of this proposal by a detailed analysis of exons—both sequenced and not sequenced—in the two data sets was, however, beyond the scope of this work because we did not have access to the exome sequences studied by Johnston et al.2 To enable future comparative investigations, we have provided details of coverage for our exome sequence data (Supplementary Tables S3–S6 online).
Regarding differences in variant selection, the ACMG’s estimation of a 1% rate of reportable incidental findings was based on an allele frequency of >0.5% within the cohort and an allele frequency of >0.015% in dbSNP (Single-Nucleotide Polymorphism Database) as exclusionary criteria for a pathogenic designation.2 We did not use allele frequency as an exclusionary criterion for pathogenicity for two reasons. First, deleterious alleles occasionally exhibit higher prevalence in some populations.23,24 Second, as discussed above, phenotyping is incomplete in cohorts from which most frequency data are derived.
To classify a variant as reportable, Dorschner et al.6 required an allelic frequency of less than a predetermined disease-specific maximum prevalence plus various permutations of independently observed segregation with disease. Compared with our study, their criterion was 4 vs. 3 segregations of the variant with disease; however, they did not consider functional assays as evidence for pathogenicity and only considered protein truncation as pathogenic if it occurred in the first 90% of the amino acid sequence. These differences probably contributed to the differences in our rates (5% vs. 1.2–3.4%) of incidental findings. For example, their more stringent segregation requirements and lack of consideration of functional experimental evidence (e.g., patch-clamp results) probably led to their classification of three variants—CACNA1S p.T1354S, SCN5A p.T220I, and SCN5A p.E428K—that we considered “known pathogenic” as “variants of unknown significance.”
In this context, we expect that judicious comparison of variant classification may demonstrate that even reasonable parties disagree regarding the benefits and risks of reporting such variants as incidental findings. The ACMG recommendations try to balance the need and ability to return highly beneficial risk information to the patients (true positives) while at the same time limiting the potential harm by not returning false-positive results. The recommendations are written quite conservatively to strike a good balance between these two competing goals. Consequently, the recommendations clearly state that “variants that are previously unreported but are of the type which is expected to cause the disorder, as defined by prior ACMG guidelines, should be reported.” The aforementioned guidelines are from the ACMG Recommendations for Standards for Interpretation and Reporting of Sequence Variations: Revisions 2007 (ref. 3) and can be found at https://www.acmg.net/StaticContent/SGs/ACMG_recommendations_for_standards_for.9.pdf. These guidelines state that if a variant is not previously reported to cause the disease, only two paths lead to classification of a variant as reportable. On detecting predicted deleteriousness (stops, indels, and some splice sites) or in case of uncertainty (missense, potential splice site, in-frame indels, single-nucleotide polymorphism association only), the researchers need to collect supporting evidence to favor the deleteriousness of the variant.
Although one might advocate for even stricter criteria, the criteria that we have selected for our study are more stringent than those provided by both the ACMG Recommendations for Reporting of Incidental Findings in Clinical Exome and Genome Sequencing and ACMG Recommendations for Standards for Interpretation and Reporting of Sequence Variations: Revisions 2007. We also acknowledge that the supporting evidence for these uncertain variants will vary in its quality and quantity and that the evidence will never be unequivocal for the simple fact that in light of unequivocal evidence, the variant in question would otherwise have been previously reported as disease causing. These variants and supporting evidence need to be returned to the clinician who ordered the sequencing, and it is the clinician’s duty to put these test results in the context of the patient’s clinical background. Clinicians do this for other tests, and the clinician’s understanding of the test characteristics is more important in the correct interpretation of the test than the test characteristics themselves. A test with high false-positive rate but also with high sensitivity can be quite useful and desirable if used in the correct context with the right information to interpret the results. Our approach is therefore in agreement with the ACMG Recommendations for Reporting of Incidental Findings in Clinical Exome and Genome Sequencing, although until all possible changes in the human genome are annotated with unequivocal evidence to either support or refute the pathogenicity of each variant, there will always be a risk of making a false-positive call. A priori, the sensitivity or specificity of our methods cannot be determined, although higher specificity might be achieved with the use of very demanding requirements with respect to segregation or case–control disparities. 
The higher rate of incidental findings in our cohort as compared with those of the studies by Johnston et al.2 and Dorschner et al.6 highlights a possible limitation of our study in that our criteria may have a high false-positive rate. More research is needed to compare the sensitivity and specificity of different filtering strategies, ideally with long-term follow-up. In any case, incidental findings should be worked up in accordance with the degree of confidence in their deleteriousness, with a conservative approach taken to those variants with a minimum of evidence supporting pathogenicity.
With respect to differences in the study populations, the cohort reported by Johnston et al.2 was selected for atherosclerotic phenotypes (including unrelated controls) and was not a familial cohort. The cohort reported by Dorschner et al.6 was selected from among the National Heart, Lung, and Blood Institute Exome Sequencing Project on the basis of European and African ancestries. Our cohort is largely of European ancestry. Transmission within our cohort increased the number of individuals at risk from 14 to 27. With undiagnosed disorders, there is also the possibility of an antecedent hypermutable disorder; however, no one individual in our cohort had an increased number of reportable variants, and our previous analyses of numbers of exome sequence variants within the UDP families did not identify marked differences from those reported for other cohorts.25
Regarding differences in the gene lists used, Johnston et al.2 analyzed only a subset of the genes recommended by the ACMG Working Group on Incidental Findings in Clinical Exome and Genome Sequencing, i.e., the 23 associated with cancer syndromes. By contrast, the ACMG list also encompasses genes associated with cardiac arrhythmias, myopathies, connective tissue disorders, familial hypercholesterolemia, and malignant hyperthermia susceptibility. Dorschner et al.6 analyzed 114 genes, including 52 of the 56 genes on the ACMG list.
Another variable in estimating the rate of reportable incidental findings is the thoroughness with which a disease and gene have been studied. In other words, the more individuals who have been identified with a disorder and checked for mutations in a gene, the more disease-causing mutations are likely to have been characterized. Reviewing our data, SCN5A (n = 4) and BRCA2 (n = 2) had the most reportable variants. For SCN5A, this may reflect the fact that more variants are entered in databases because (i) both gain- and loss-of-function variants in SCN5A can cause disease and (ii) functional testing for pathogenicity is relatively accessible using patch-clamping experiments.
Four additional issues arising during our analysis were as follows: (i) defining the level of disease penetrance warranting reporting of a potential disease-causing variant, (ii) determining how to weight variants deposited by clinical laboratories without corroborating evidence of pathogenicity, (iii) the need for clinical correlation, and (iv) obligations to extended family members. Relevant to the first issue, the ACMG recommendations state that variants with “higher” penetrance should be reported, but they leave the determination of “higher” to the clinical laboratory. For example, we identified a TP53 variant (p.R337H/chr17:g.7574017C>T, see Table 3) with 2.5–9.9% penetrance for pediatric adrenocortical carcinoma,26,27 and newborn screening programs in Brazil have shown that screening for carriers of this mutation reduces morbidity and mortality.26 This reporting conundrum was not resolved by the relationship of TP53 to Li–Fraumeni syndrome because this variant has not been associated with Li–Fraumeni syndrome. Consequently, the reporting of a variant is difficult to code bioinformatically and will require human interpretation and possibly clinical consultation.
Regarding delineation of the pathogenicity of variants deposited by clinical laboratories, BRCA1 and BRCA2 variants provide an excellent illustration. Although our criteria for pathogenicity are scientifically sound, many BRCA1 and BRCA2 variants in public databases lack information on segregation with disease or experimental functional assays. Because variants lacking this information would not be considered pathogenic according to our paradigm, our approach may well underreport the BRCA1- and BRCA2-associated cancer risks.
Another issue arising from this analysis is that a molecular finding is not a clinical diagnosis. Clinical records are often not available to testing laboratories, although, in some cases, they may substantiate or cast doubt on a variant’s pathogenicity. The subject in whom we identified a pathogenic APOB mutation (p.R3527W/chr2:g.21229161G>A), a conclusion supported by functional assays demonstrating reduced low-density lipoprotein receptor binding,28 had a favorable serum cholesterol and lipoprotein profile. A similar finding was also reported by Andreasen et al.20 on “causative variants” for cardiomyopathies. This highlights that even conservative standards to determine pathogenicity do not obviate the need for clinical interpretation and correlation.
The last issue is that of obligation to provide potentially helpful medical information to extended family members. For example, the person with an SCN5A variant and exercise-induced fatigue had a brother with an unspecified early-onset cardiac condition. If this brother carried the SCN5A variant, then this information might be diagnostically and therapeutically useful to him. Possible ethical approaches to notification include encouraging the subject in our cohort to discuss this finding with his brother, with or without provision of counseling to the brother, or direct notification of the brother. The American Medical Association’s Code of Medical Ethics endorses encouraging the subject to notify at-risk relatives, with provision of assistance to the subject regarding communication of opportunities for testing and counseling.29 This serves as a reminder that genetic testing may generate professional ethical obligations extending beyond the subject being tested.
Discussion on whether to inform individuals enrolled under the NIH UDP protocol about the identified variants focused on the delineated and perceived obligations defined by the language of the consent document and the process by which the consent was explained. In conclusion, whether to return or not return the incidental findings was deferred to the choices the individual or guardian had made when completing the written informed consent.
An issue raised by our study was the amount of work needed to determine the variants that are reportable. We found that variants were listed occasionally as mutations or known pathogenic alleles in LSDBs without published evidence of segregation with disease or functional assays to support pathogenicity. Consequently, it is incumbent on the reporting laboratory to assemble and determine the credibility of the evidence used to determine the pathogenicity of a variant. Confounding this is the failure of many LSDBs to provide access to variants in a format that is easily applied to data sets derived from exome and genome sequencing. In contrast, ClinVar provides the required annotations as readily usable variant call files. Deposition of variants and their clinical significance in ClinVar would improve the efficiency of the recommended analysis.
Our analysis had some limitations. First, the exome sequencing that produced the variants for analysis was a research-grade exercise rather than a clinical-grade investigation, and therefore not all exons in the 56 recommended genes had sufficient sequence coverage to call variants in all individuals. In addition, we did not validate the variants by Sanger sequencing but rather inspected the alignments of short reads using Integrative Genomics Viewer, a method that we have found more sensitive than Sanger sequencing. Second, our curation of variants was limited by the availability of annotations in public databases; we expect that the number and quality of these annotations will improve with time, as will the number of reportable variants. This raises the question of whether exome and genome sequence data should be reanalyzed at regular intervals to take into account the increasing information.
In summary, clinical exome and genome sequencing are cost-effective methods for identifying the molecular bases of genetic conditions. These untargeted approaches, however, also uncover genetic variants with medical or social implications unrelated to the indication for testing. In this context, the ACMG Working Group on Incidental Findings in Clinical Exome and Genome Sequencing recently recommended reporting “known pathogenic” and “expected pathogenic” mutations for 56 genes. Approximately 5% of all exomes in the NIH UDP familial cohort and 8.8% of the families in our cohort had a reportable finding. The most time-consuming aspect of fulfilling these recommendations was assembling the evidence for “pathogenicity” or “probable pathogenicity” because no well-curated comprehensive public database is currently available.
Disclosure
The authors declare no conflict of interest.
References
Wolf SM, Crock BN, Van Ness B, et al. Managing incidental findings and research results in genomic research involving biobanks and archived data sets. Genet Med 2012;14:361–384.
Johnston JJ, Rubinstein WS, Facio FM, et al. Secondary variants in individuals undergoing exome sequencing: screening of 572 individuals identifies high-penetrance mutations in cancer-susceptibility genes. Am J Hum Genet 2012;91:97–108.
Green RC, Berg JS, Grody WW, et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet Med 2013;15:565–574.
Burke W, Matheny Antommaria AH, Bennett R, et al. Recommendations for returning genomic incidental findings? We need to talk! Genet Med 2013;15:854–859.
Christenhusz G, Devriendt K, Dierickx K. To tell or not to tell? A systematic review of ethical reflections on incidental findings arising in genetics contexts. Eur J Hum Genet 2012;21:248–255.
Dorschner MO, Amendola LM, Turner EH, et al. Actionable, pathogenic incidental findings in 1,000 participants’ exomes. Am J Hum Genet 2013;93:631–640.
Gahl WA, Markello TC, Toro C, et al. The National Institutes of Health Undiagnosed Diseases Program: insights into rare diseases. Genet Med 2012;14:51–59.
Teer JK, Bonnycastle LL, Chines PS, et al.; NISC Comparative Sequencing Program. Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing. Genome Res 2010;20:1420–1431.
Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 2010;38:e164.
ClinVar. ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf/clinvar_00-latest.vcf.gz. Accessed 17 June 2013.
Fokkema IF, Taschner PE, Schaafsma GC, Celli J, Laros JF, den Dunnen JT. LOVD v.2.0: the next generation in gene variant databases. Hum Mutat 2011;32:557–563.
Ajay SS, Parker SC, Abaan HO, Fajardo KV, Margulies EH. Accurate and comprehensive sequencing of personal genomes. Genome Res 2011;21:1498–1505.
Sunyaev SR. Inferring causality and functional significance of human coding DNA variants. Hum Mol Genet 2012;21(R1):R10–R17.
Petersen GM, Parmigiani G, Thomas D. Missense mutations in disease genes: a Bayesian approach to evaluate causality. Am J Hum Genet 1998;62:1516–1524.
Thompson D, Easton DF, Goldgar DE. A full-likelihood method for the evaluation of causality of sequence variants from family data. Am J Hum Genet 2003;73:652–655.
Mohammadi L, Vreeswijk MP, Oldenburg R, et al. A simple method for co-segregation analysis to evaluate the pathogenicity of unclassified variants; BRCA1 and BRCA2 as an example. BMC Cancer 2009;9:211.
Abecasis GR, Altshuler D, Auton A, et al. A map of human genome variation from population-scale sequencing. Nature 2010;467:1061–1073.
Norton N, Robertson PD, Rieder MJ, et al.; National Heart, Lung and Blood Institute GO Exome Sequencing Project. Evaluating pathogenicity of rare variants from dilated cardiomyopathy in the exome era. Circ Cardiovasc Genet 2012;5:167–174.
Cassa CA, Tong MY, Jordan DM. Large numbers of genetic variants considered to be pathogenic are common in asymptomatic individuals. Hum Mutat 2013;34:1216–1220.
Andreasen C, Nielsen JB, Refsgaard L, et al. New population-based exome data are questioning the pathogenicity of previously cardiomyopathy-associated genetic variants. Eur J Hum Genet 2013;21:918–928.
Jordan DM, Kiezun A, Baxter SM, et al. Development and validation of a computational method for assessment of missense variants in hypertrophic cardiomyopathy. Am J Hum Genet 2011;88:183–192.
Richards CS, Bale S, Bellissimo DB, et al.; Molecular Subcommittee of the ACMG Laboratory Quality Assurance Committee. ACMG recommendations for standards for interpretation and reporting of sequence variations: revisions 2007. Genet Med 2008;10:294–300.
Roa BB, Boyd AA, Volcik K, Richards CS. Ashkenazi Jewish population frequencies for common mutations in BRCA1 and BRCA2. Nat Genet 1996;14:185–187.
Miserez AR, Laager R, Chiodetti N, Keller U. High prevalence of familial defective apolipoprotein B-100 in Switzerland. J Lipid Res 1994;35:574–583.
Adams DR, Sincan M, Fuentes Fajardo K, et al. Analysis of DNA sequence variants detected by high-throughput sequencing. Hum Mutat 2012;33:599–608.
Custódio G, Parise GA, Kiesel Filho N, et al. Impact of neonatal screening and surveillance for the TP53 R337H mutation on early detection of childhood adrenocortical tumors. J Clin Oncol 2013;31:2619–2626.
Figueiredo BC, Sandrini R, Zambetti GP, et al. Penetrance of adrenocortical tumours associated with the germline TP53 R337H mutation. J Med Genet 2006;43:91–96.
Fisher E, Scharnagl H, Hoffmann MM, et al. Mutations in the apolipoprotein (apo) B-100 receptor-binding region: detection of apo B-100 (Arg3500–>Trp) associated with two new haplotypes and evidence that apo B-100 (Glu3405–>Gln) diminishes receptor-mediated uptake of LDL. Clin Chem 1999;45:1026–1038.
Code of Medical Ethics: Opinion 2.131 - Disclosure of Familial Risk in Genetic Testing. http://www.ama-assn.org/ama/pub/physician-resources/medical-ethics/code-medical-ethics/opinion2131.page. Accessed 22 August 2013.
DiGiammarino EL, Lee AS, Cadwell C, et al. A novel mechanism of tumorigenesis involving pH-dependent destabilization of a mutant p53 tetramer. Nat Struct Biol 2002;9:12–16.
Makita N, Behr E, Shimizu W, et al. The E1784K mutation in SCN5A is associated with mixed clinical phenotype of type 3 long QT syndrome. J Clin Invest 2008;118:2219–2229.
Wei J, Wang DW, Alings M, et al. Congenital long-QT syndrome caused by a novel mutation in a conserved acidic domain of the cardiac Na+ channel. Circulation 1999;99:3165–3171.
Wang Q, Chen S, Chen Q, et al. The common SCN5A mutation R1193Q causes LQTS-type electrophysiological alterations of the cardiac sodium channel. J Med Genet 2004;41:e66.
Hwang HW, Chen JJ, Lin YJ, et al. R1193Q of SCN5A, a Brugada and long QT mutation, is a common polymorphism in Han Chinese. J Med Genet 2005;42:e7; author reply e8.
Darbar D, Kannankeril PJ, Donahue BS, et al. Cardiac sodium channel (SCN5A) variants associated with atrial fibrillation. Circulation 2008;117:1927–1935.
Olesen MS, Yuan L, Liang B, et al. High prevalence of long QT syndrome-associated SCN5A variants in patients with early-onset lone atrial fibrillation. Circ Cardiovasc Genet 2012;5:450–459.
Benson DW, Wang DW, Dyment M, et al. Congenital sick sinus syndrome caused by recessive mutations in the cardiac sodium channel gene (SCN5A). J Clin Invest 2003;112:1019–1028.
Gui J, Wang T, Jones RP, Trump D, Zimmer T, Lei M. Multiple loss-of-function mechanisms contribute to SCN5A-related familial sick sinus syndrome. PLoS ONE 2010;5:e10985.
Kapplinger JD, Tester DJ, Alders M, et al. An international compendium of mutations in the SCN5A-encoded cardiac sodium channel in patients referred for Brugada syndrome genetic testing. Heart Rhythm 2010;7:33–46.
Olson TM, Michels VV, Ballew JD, et al. Sodium channel mutations and susceptibility to heart failure and atrial fibrillation. JAMA 2005;293:447–454.
ClinVar (2013). http://www.ncbi.nlm.nih.gov/clinvar/RCV000030362/#evidence. Accessed 14 July 2013.
Dalal D, James C, Devanagondi R, et al. Penetrance of mutations in plakophilin-2 among families with arrhythmogenic right ventricular dysplasia/cardiomyopathy. J Am Coll Cardiol 2006;48:1416–1424.
Andersen PS, Hedley PL, Page SP, et al. A novel Myosin essential light chain mutation causes hypertrophic cardiomyopathy with late onset and low expressivity. Biochem Res Int 2012;2012:685108.
Maron BJ. Hypertrophic cardiomyopathy: a systematic review. JAMA 2002;287:1308–1320.
Gersh BJ, Maron BJ, Bonow RO, et al.; American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines; American Association for Thoracic Surgery; American Society of Echocardiography; American Society of Nuclear Cardiology; Heart Failure Society of America; Heart Rhythm Society; Society for Cardiovascular Angiography and Interventions; Society of Thoracic Surgeons. 2011 ACCF/AHA guideline for the diagnosis and treatment of hypertrophic cardiomyopathy: executive summary: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines. Circulation 2011;124:2761–2796.
Spada M, Pagliardini S, Yasuda M, et al. High incidence of later-onset Fabry disease revealed by newborn screening. Am J Hum Genet 2006;79:31–40.
De Brabander I, Yperzeele L, Ceuterick-De Groote C, et al. Phenotypical characterization of α-galactosidase A gene mutations identified in a large Fabry disease screening program in stroke in the young. Clin Neurol Neurosurg 2013;115:1088–1093.
Terryn W, Vanholder R, Hemelsoet D, et al. Questioning the pathogenic role of the GLA p.Ala143Thr “mutation” in Fabry disease: implications for screening studies and ERT. JIMD Rep 2013;8:101–108.
Pirone A, Schredelseker J, Tuluc P, et al. Identification and functional characterization of malignant hyperthermia mutation T1354S in the outer pore of the Cavalpha1S-subunit. Am J Physiol, Cell Physiol 2010;299:C1345–C1354.
Sharing Clinical Reports (2013). http://sharingclinicalreports.org/. Accessed 16 July 2013.
Meindl A; German Consortium for Hereditary Breast and Ovarian Cancer. Comprehensive analysis of 989 patients with breast or ovarian cancer provides BRCA1 and BRCA2 mutation profiles and frequencies for the German population. Int J Cancer 2002;97:472–480.
Gaffney D, Reid JM, Cameron IM, et al. Independent mutations at codon-3500 of the apolipoprotein-B gene are associated with hyperlipidemia. Arterioscler Thromb Vasc Biol 1995;15:1025–1029.
Choong ML, Koay ES, Khoo KL, Khaw MC, Sethi SK. Denaturing gradient-gel electrophoresis screening of familial defective apolipoprotein B-100 in a mixed Asian cohort: two cases of arginine3500–>tryptophan mutation associated with a unique haplotype. Clin Chem 1997;43(6 Pt 1):916–923.
Tai DY, Pan JP, Lee-Chen GJ. Identification and haplotype analysis of apolipoprotein B-100 Arg3500–>Trp mutation in hyperlipidemic Chinese. Clin Chem 1998;44(8 Pt 1):1659–1665.
Acknowledgements
We thank Patricia Birch and Shelin Adam for critical review of the manuscript. We thank the staff at the National Human Genome Research Institute (NHGRI) Intramural Sequencing Center for their sequencing, alignment, genotyping, and annotation services. This work was supported in part by the Common Fund, Office of the Director, and the Intramural Research Program of the NHGRI (National Institutes of Health, Bethesda, MD).
Supplementary information
Supplementary Table S1
(XLS 84 kb)
Supplementary Table S2
(DOC 952 kb)
Supplementary Table S3
(XLS 102221 kb)
Supplementary Table S4
(DOC 1597 kb)
Supplementary Table S5
(DOC 1265 kb)
Supplementary Table S6
(DOC 1260 kb)
Supplementary Data
(DOC 366 kb)
Cite this article
Lawrence, L., Sincan, M., Markello, T. et al. The implications of familial incidental findings from exome sequencing: the NIH Undiagnosed Diseases Program experience. Genet Med 16, 741–750 (2014). https://doi.org/10.1038/gim.2014.29
DOI: https://doi.org/10.1038/gim.2014.29