Settling the score: variant prioritization and Mendelian disease

Eilbeck, Karen; Quinlan, Aaron; Yandell, Mark

doi:10.1038/nrg.2017.52

Review Article
Published: 14 August 2017

Karen Eilbeck¹, Aaron Quinlan¹,² & Mark Yandell²

Nature Reviews Genetics volume 18, pages 599–612 (2017)

Key Points

Exome and genome sequencing reveal thousands to millions of genetic variants in a typical individual. A fundamental challenge in human genetics is isolating the small subset (typically one or two) of variants that cause a Mendelian disease phenotype. This Review describes the computational approaches used to prioritize variants in Mendelian disease.
A multitude of tools rank variants on the basis of biochemical, evolutionary, allele segregation and population frequency characteristics in an attempt to shorten the list of potential causative variants. The strategies and caveats associated with these tools are outlined in this Review.
Burden tests take prioritization to the next level by aggregating the variants observed at a given locus to calculate a burden score for the gene. Most burden testing software tools also evaluate potentially damaging genotypes in the context of other genotypes observed at the same locus in a control population.
Variant interpretation is the process of drawing direct connections from individual variants to disease phenotypes. This process is central to clinical reporting of results and incidental findings, and to research endeavours that include variant discovery and return of results.
Variant prioritization and interpretation are especially challenging for non-coding variants, structural variants and synonymous exonic variants. Furthermore, increasingly complex reference genomes introduce new demands for variant discovery tools. Each of these challenges drives increasingly sophisticated software solutions.
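
As a rough illustration of the frequency-based filtering and burden testing summarized in the points above, the following Python sketch keeps rare, potentially damaging variants, collapses them per gene and compares carrier counts between cases and controls. It is not the method of any specific tool discussed in this Review. The record layout, the 1% allele frequency cutoff, the set of consequence terms treated as damaging, the gene names and the use of a one-sided Fisher's exact test as the collapsing statistic are all assumptions made for the example.

```python
from math import comb

# Hypothetical cutoffs chosen for illustration only.
MAX_POP_AF = 0.01
DAMAGING = frozenset({"missense_variant", "stop_gained", "frameshift_variant"})


def fisher_one_sided(case_hits, n_cases, control_hits, n_controls):
    """One-sided Fisher's exact test: P(at least `case_hits` carriers among cases)."""
    total = n_cases + n_controls
    hits = case_hits + control_hits
    upper = min(hits, n_cases)
    favourable = sum(
        comb(n_cases, k) * comb(n_controls, hits - k)
        for k in range(case_hits, upper + 1)
    )
    return favourable / comb(total, hits)


def rank_genes_by_burden(variants, carried, cases, controls):
    """Collapse qualifying variants per gene and rank genes by p-value.

    variants: list of dicts with keys id, gene, consequence, pop_af
    carried:  dict mapping sample name -> set of variant ids the sample carries
    """
    # Step 1: variant-level filter on population frequency and consequence.
    qualifying = {}
    for v in variants:
        if v["pop_af"] <= MAX_POP_AF and v["consequence"] in DAMAGING:
            qualifying.setdefault(v["gene"], set()).add(v["id"])

    # Step 2: gene-level burden, counting carriers of any qualifying variant.
    results = {}
    for gene, ids in qualifying.items():
        a = sum(1 for s in cases if carried.get(s, set()) & ids)
        c = sum(1 for s in controls if carried.get(s, set()) & ids)
        results[gene] = {
            "case_carriers": a,
            "control_carriers": c,
            "p_value": fisher_one_sided(a, len(cases), c, len(controls)),
        }
    return sorted(results.items(), key=lambda kv: kv[1]["p_value"])


# Toy data: gene names, variant ids and frequencies are invented.
variants = [
    {"id": "v1", "gene": "GENE_A", "consequence": "stop_gained", "pop_af": 0.0001},
    {"id": "v2", "gene": "GENE_A", "consequence": "missense_variant", "pop_af": 0.002},
    {"id": "v3", "gene": "GENE_B", "consequence": "missense_variant", "pop_af": 0.2},
    {"id": "v4", "gene": "GENE_B", "consequence": "synonymous_variant", "pop_af": 0.0005},
    {"id": "v5", "gene": "GENE_B", "consequence": "missense_variant", "pop_af": 0.001},
]
cases = ["c1", "c2", "c3", "c4"]
controls = ["k1", "k2", "k3", "k4", "k5", "k6"]
carried = {
    "c1": {"v1"}, "c2": {"v2"}, "c3": {"v1", "v3"}, "c4": {"v5"},
    "k1": {"v3"}, "k2": {"v3", "v4"}, "k3": {"v5"},
}

ranked = rank_genes_by_burden(variants, carried, cases, controls)
for gene, stats in ranked:
    print(gene, stats)
```

In this toy data set the common variant (v3) and the synonymous variant (v4) are discarded before the gene-level comparison, which mirrors the distinction drawn above between variant-level prioritization and gene-level burden scoring. Real burden tests typically also weight variants and account for ancestry, which this sketch does not attempt.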

Abstract

When investigating Mendelian disease using exome or genome sequencing, distinguishing disease-causing genetic variants from the multitude of candidate variants is a complex, multidimensional task. Many prioritization tools and online interpretation resources exist, and professional organizations have offered clinical guidelines for review and return of prioritization results. In this Review, we describe the strengths and weaknesses of widely used computational approaches, explain their roles in the diagnostic and discovery process and discuss how they can inform (and misinform) expert reviewers. We place variant prioritization in the wider context of gene prioritization, burden testing and genotype–phenotype association, and we discuss opportunities and challenges introduced by whole-genome sequencing.

**Figure 1: A demonstration of the multiple possible effects of a single variant across transcripts and genes.**

**Figure 2: Population stratification and regional constraint within a gene are critical to variant interpretation.**

**Figure 3: Phenotypes are described across a spectrum of granularity, and different terminologies are used to define these features.**

Acknowledgements

The authors thank J. Chong for insightful discussions about the challenges of rare disease research at the University of Washington Center for Mendelian Genomics Workshop. This Review was supported by US National Institutes of Health awards to A.Q. (NIH R01HG006693, NIH U24CA209999), K.E. (NIH U41HG006834 (subcontract), NIH U01HG007437 (subcontract), NIH R01HG008628) and M.Y. (NIH R01GM104390, NIH UM1HL128711, NIH U01HL131698 and NSF IOS-1561337).

Author information

Karen Eilbeck and Aaron Quinlan: These authors contributed equally to this work.

Authors and Affiliations

Department of Biomedical Informatics, School of Medicine, University of Utah, 421 Wakara Way, Suite 120, Salt Lake City, 84108, Utah, USA
Karen Eilbeck & Aaron Quinlan
Department of Human Genetics, Eccles Institute of Human Genetics, School of Medicine, University of Utah, 15 S 2030 E, Salt Lake City, 84112, Utah, USA
Aaron Quinlan & Mark Yandell

Corresponding author

Correspondence to Mark Yandell.

Ethics declarations

Competing interests

A.Q. is a co-founder of Base2 Genomics, LLC. M.Y. is on the Scientific Advisory Board of Fabric Genomics.

Glossary

Mendelian disorders: Diseases or conditions that result from mutation at a genomic locus and are inherited according to Mendel's laws.
Variant prioritization: The process of ranking the variants observed in an individual genome on the basis of factors such as the predicted consequence of each variant and the observed frequency in a population.
Population allele frequencies: The proportion of chromosomes within a population that carry a particular change at a given locus.
Gene prioritization: The process of associating a gene with a disease phenotype; this strategy is often used during variant prioritization.
Burden testing: A gene prioritization approach that scores, ranks and prioritizes genes based on genotypes rather than on single variants. The observed (or for some methods, the theoretical) distribution of burden scores within the wider population is often used to rank a proband's genotype score. Many burden tests can also incorporate adjunct information into their calculations such as phylogenetic conservation, mode of inheritance and variant frequency data. Unlike variant prioritization tools, burden tests require access to genotype data for their calculations.
Decision support frameworks: Interactive, dynamic tools to guide medical decision-making by displaying and integrating patient data.
Nonsense-mediated decay (NMD): A conserved eukaryotic pathway that detects mRNAs containing premature stop codons and eliminates them, preventing their translation.
Variant of uncertain significance (VUS): Also known as a variant of unknown significance. The canonical definition of a VUS is a variant in a disease-associated gene whose specific effect is unknown or uncertain. More generally, the term can also be applied to variants in genes that lack a direct disease association but are plausible candidates given the biological function of the resulting protein.
Controlled vocabularies: Sets of agreed upon terms and definitions.
Exome: Generally, the portion of the genome that is translated into proteins.
Population stratification: The difference in allele frequencies across subpopulations.
Balancing selection: Under balancing selection, multiple alleles exist in a population when natural selection favours heterozygous genotypes.
Disease prevalence: The number of cases of a disease that are present in a population at a given point in time.
Purifying selection: Under purifying selection, deleterious alleles are selectively removed from a population.
Functional variants: Variants that alter gene function or expression.
Probands: The proband is the initial person of study in a genetics investigation. In the case of a family trio, the proband is usually the affected child.
De novo variant: A spontaneous mutation in a proband that is absent from both parents.
Phase: For a single variant, phase involves the determination of the parental chromosome on which a variant allele exists. When a proband and both parents have been sequenced, this can be directly determined for 'informative sites' where the allele transmission is unambiguous (for example, the proband is heterozygous A/G, the father is homozygous A/A, and the mother heterozygous A/G; in this case the G allele was clearly transmitted from the mother). More generally, phasing refers to the assignment of alleles from multiple variant sites to parental haplotypes.
Population genotype frequency: The proportion of individuals with a particular genotype at a given locus.
Incidental findings: In whole-exome sequencing (WES) or whole-genome sequencing (WGS), pathogenic and likely pathogenic variants in genes that are not relevant to the initial reason for sequencing may be found and reported back to the patient. These variants may relate to rare disease, disease risk, pharmacogenetic response, and status relating to prenatal screening.
Return of results: The process of returning findings from a research study, or incidental findings from a genetic test, back to the participant or patient.
Compound heterozygous inheritance: The situation in which a proband inherits a different damaging allele of the same gene from each parent, so that both copies of the gene are affected.
Topologically associating domains (TADs): Genomic regions in which loci have a higher probability of physical interaction.
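
The trio-based rule described in the Phase entry can be illustrated with a short Python sketch. It is not taken from any tool discussed in this Review. The function names `transmissions` and `phase_trio_site` and the tuple representation of genotypes are assumptions made for illustration, and the sketch considers a single site with error-free genotypes.

```python
from typing import Optional, Set, Tuple

# Hypothetical representation: an unordered pair of alleles, e.g. ("A", "G").
Genotype = Tuple[str, str]


def transmissions(
    child: Genotype, father: Genotype, mother: Genotype
) -> Set[Tuple[str, str]]:
    """Return every (paternal_allele, maternal_allele) pair that reproduces
    the child's genotype under Mendelian inheritance."""
    return {
        (p, m)
        for p in set(father)
        for m in set(mother)
        if sorted((p, m)) == sorted(child)
    }


def phase_trio_site(
    child: Genotype, father: Genotype, mother: Genotype
) -> Optional[Tuple[str, str]]:
    """Return (paternal_allele, maternal_allele) when the site is informative,
    that is, when exactly one transmission is possible. Return None when the
    site is ambiguous or when no transmission is possible."""
    options = transmissions(child, father, mother)
    return next(iter(options)) if len(options) == 1 else None


if __name__ == "__main__":
    # Example from the Phase entry: proband A/G, father A/A, mother A/G.
    print(phase_trio_site(("A", "G"), ("A", "A"), ("A", "G")))
    # All three heterozygous: two transmissions are possible, so no phase.
    print(phase_trio_site(("A", "G"), ("A", "G"), ("A", "G")))
```

An empty result from `transmissions` means that the proband's genotype cannot be explained by the parental genotypes. In real data this would flag either a candidate de novo variant or a genotyping error, and the sketch makes no attempt to tell these apart.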

About this article

Cite this article

Eilbeck, K., Quinlan, A. & Yandell, M. Settling the score: variant prioritization and Mendelian disease. Nat Rev Genet 18, 599–612 (2017). https://doi.org/10.1038/nrg.2017.52

Published: 14 August 2017
Issue Date: October 2017
DOI: https://doi.org/10.1038/nrg.2017.52
