Key Points
-
Exome and genome sequencing reveal thousands to millions of genetic variants in a typical individual. A fundamental challenge in human genetics is isolating the small subset (typically one or two) of variants that cause a Mendelian disease phenotype. This Review describes the computational approaches used to prioritize variants in Mendelian disease.
-
A multitude of tools rank variants on the basis of biochemical, evolutionary, allele segregation and population frequency characteristics in an attempt to shorten the list of potential causative variants. The strategies and caveats associated with these tools are outlined in this Review.
-
Burden tests extend prioritization from single variants to genes by aggregating the variants observed at a given locus to calculate a burden score for the gene. Most burden testing software tools also evaluate potentially damaging genotypes in the context of other genotypes observed at the same locus in a control population.
-
Variant interpretation is the process of drawing direct connections from individual variants to disease phenotypes. This process is central to the clinical reporting of results and incidental findings, and to research endeavours that include variant discovery and return of results.
-
Variant prioritization and interpretation are especially challenging for non-coding variants, structural variants and synonymous exonic variants. Furthermore, increasingly complex reference genomes introduce new demands for variant discovery tools. Each of these challenges drives increasingly sophisticated software solutions.
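As a rough illustration of the frequency-based filtering and burden scoring described in the points above, the following Python sketch scores rare variants by predicted consequence, sums them per gene and compares the proband's burden with a control set. It is a minimal sketch under stated assumptions, not the method of any tool discussed in this Review: the `Variant` fields, the `CONSEQUENCE_WEIGHT` values, the 1% frequency cutoff and the function names are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical severity weights per predicted consequence (illustrative values only).
CONSEQUENCE_WEIGHT = {
    "stop_gained": 1.0,
    "missense_variant": 0.5,
    "synonymous_variant": 0.0,
}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str       # Sequence Ontology-style term
    population_af: float   # allele frequency in a reference population
    alt_allele_count: int  # copies of the alternate allele carried (0, 1 or 2)


def variant_score(v, max_af=0.01):
    """Score one variant: zero if too common, else weight times allele count."""
    if v.population_af > max_af:
        return 0.0
    return CONSEQUENCE_WEIGHT.get(v.consequence, 0.0) * v.alt_allele_count


def gene_burden(variants, max_af=0.01):
    """Aggregate per-variant scores into one burden score per gene."""
    burden = defaultdict(float)
    for v in variants:
        burden[v.gene] += variant_score(v, max_af)
    return dict(burden)


def rank_genes(proband, controls, max_af=0.01):
    """Rank the proband's genes by how unusual their burden is among controls.

    Returns (gene, burden, fraction) tuples, where fraction is the add-one
    smoothed share of controls whose burden is at least the proband's.
    Smaller fractions come first.
    """
    proband_burden = gene_burden(proband, max_af)
    control_burdens = [gene_burden(c, max_af) for c in controls]
    ranked = []
    for gene, score in proband_burden.items():
        if score == 0.0:
            continue  # nothing rare and damaging observed in this gene
        as_extreme = sum(1 for cb in control_burdens if cb.get(gene, 0.0) >= score)
        fraction = (as_extreme + 1) / (len(control_burdens) + 1)
        ranked.append((gene, score, fraction))
    return sorted(ranked, key=lambda r: (r[2], -r[1]))


if __name__ == "__main__":
    proband = [
        Variant("GENE_A", "stop_gained", 0.0001, 2),
        Variant("GENE_B", "missense_variant", 0.2, 1),   # common, so filtered out
        Variant("GENE_C", "missense_variant", 0.001, 1),
    ]
    controls = [
        [Variant("GENE_C", "missense_variant", 0.001, 1)],
        [],
        [Variant("GENE_A", "stop_gained", 0.0001, 1)],
    ]
    for gene, score, fraction in rank_genes(proband, controls):
        print(gene, score, fraction)
```

Real burden tests typically use far larger control populations and formal statistical models; the empirical fraction here only stands in for that comparison.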
Abstract
When investigating Mendelian disease using exome or genome sequencing, distinguishing disease-causing genetic variants from the multitude of candidate variants is a complex, multidimensional task. Many prioritization tools and online interpretation resources exist, and professional organizations have offered clinical guidelines for review and return of prioritization results. In this Review, we describe the strengths and weaknesses of widely used computational approaches, explain their roles in the diagnostic and discovery process and discuss how they can inform (and misinform) expert reviewers. We place variant prioritization in the wider context of gene prioritization, burden testing and genotype–phenotype association, and we discuss opportunities and challenges introduced by whole-genome sequencing.
Acknowledgements
The authors thank J. Chong for insightful discussions about the challenges of rare disease research at the University of Washington Center for Mendelian Genomics Workshop. This Review was supported by US National Institutes of Health awards to A.Q. (NIH R01HG006693, NIH U24CA209999), K.E. (NIH U41HG006834 (subcontract), NIH U01HG007437 (subcontract), NIH R01HG008628) and M.Y. (NIH R01GM104390, NIH UM1HL128711, NIH U01HL131698 and NSF IOS-1561337).
Ethics declarations
Competing interests
A.Q. is a co-founder of Base2 Genomics, LLC. M.Y. is on the Scientific Advisory Board of Fabric Genomics.
Glossary
- Mendelian disorders
-
Diseases or conditions that result from mutation at a genomic locus and are inherited according to Mendel's laws.
- Variant prioritization
-
The process of ranking the variants observed in an individual genome on the basis of factors such as the predicted consequence of each variant and the observed frequency in a population.
- Population allele frequencies
-
The proportion of chromosomes within a population that carry a particular change at a given locus.
- Gene prioritization
-
The process of associating a gene with a disease phenotype; this strategy is often used during variant prioritization.
- Burden testing
-
A gene prioritization approach that scores, ranks and prioritizes genes based on genotypes rather than on single variants. The observed (or for some methods, the theoretical) distribution of burden scores within the wider population is often used to rank a proband's genotype score. Many burden tests can also incorporate adjunct information into their calculations such as phylogenetic conservation, mode of inheritance and variant frequency data. Unlike variant prioritization tools, burden tests require access to genotype data for their calculations.
- Decision support frameworks
-
Interactive, dynamic tools to guide medical decision-making by displaying and integrating patient data.
- Nonsense-mediated decay
-
(NMD). A conserved eukaryotic pathway that detects mRNAs containing premature stop codons and prevents their translation.
- Variant of uncertain significance
-
(VUS). Also known as variant of unknown significance. The canonical definition of a VUS is a variant in a disease-associated gene, the specific effect of which is unknown or uncertain. More generally, VUS can also be applied to variants in genes that lack direct disease association but are plausible given the biological function of the resulting protein.
- Controlled vocabularies
-
Sets of agreed-upon terms and definitions.
- Exome
-
Generally, the portion of the genome that is translated into proteins.
- Population stratification
-
The difference in allele frequencies across subpopulations.
- Balancing selection
-
Under balancing selection, multiple alleles exist in a population when natural selection favours heterozygous genotypes.
- Disease prevalence
-
The number of cases of a disease that are present in a population at a given point in time.
- Purifying selection
-
Under purifying selection, deleterious alleles are selectively removed from a population.
- Functional variants
-
Variants that alter gene function or expression.
- Probands
-
The proband is the initial person of study in a genetics investigation. In the case of a family trio, the proband is usually the affected child.
- De novo variant
-
A spontaneous mutation in a proband that is missing from the parents.
- Phase
-
For a single variant, phase involves the determination of the parental chromosome on which a variant allele exists. When a proband and both parents have been sequenced, this can be directly determined for 'informative sites' where the allele transmission is unambiguous (for example, the proband is heterozygous A/G, the father is homozygous A/A, and the mother heterozygous A/G; in this case the G allele was clearly transmitted from the mother). More generally, phasing refers to the assignment of alleles from multiple variant sites to parental haplotypes.
- Population genotype frequency
-
The proportion of individuals with a particular genotype at a given locus.
- Incidental findings
-
In whole-exome sequencing (WES) or whole-genome sequencing (WGS), pathogenic and likely pathogenic variants in genes that are not relevant to the initial reason for sequencing may be found and reported back to the patient. These variants may relate to rare disease, disease risk, pharmacogenetic response, and status relating to prenatal screening.
- Return of results
-
The process of returning findings from a research study, or incidental findings from a genetic test, back to the participant or patient.
- Compound heterozygous inheritance
-
The situation in which a proband receives a different damaging allele of the same gene from each parent, so that both copies of the gene are affected.
- Topologically associating domains
-
(TADs). TADs are genomic regions in which loci have a higher probability of physical interaction.
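The glossary entries for de novo variants, phase and compound heterozygous inheritance can be made concrete with a small trio example. The Python sketch below is illustrative only: it assumes unphased diploid genotypes stored as two-element tuples, ignores genotyping error and mosaicism, and uses hypothetical function names that do not correspond to any published tool.

```python
def is_de_novo(proband, father, mother):
    """True if the proband carries an allele that is seen in neither parent."""
    parental_alleles = set(father) | set(mother)
    return any(allele not in parental_alleles for allele in proband)


def phase_by_transmission(proband, father, mother):
    """Return (paternal_allele, maternal_allele) for the proband.

    Returns None when the site is uninformative (more than one assignment is
    consistent with the parents) or when no assignment is consistent at all.
    """
    first, second = proband
    candidates = set()
    for paternal, maternal in ((first, second), (second, first)):
        if paternal in father and maternal in mother:
            candidates.add((paternal, maternal))
    if len(candidates) == 1:
        return candidates.pop()
    return None


def is_compound_heterozygous(sites):
    """Check one gene for damaging alleles inherited from both parents.

    sites: list of (alt_allele, proband, father, mother) tuples, one per
    damaging variant site in the gene. Sites that cannot be phased are skipped.
    """
    parents_of_origin = set()
    for alt, proband, father, mother in sites:
        phased = phase_by_transmission(proband, father, mother)
        if phased is None:
            continue
        paternal, maternal = phased
        if paternal == alt and maternal != alt:
            parents_of_origin.add("father")
        elif maternal == alt and paternal != alt:
            parents_of_origin.add("mother")
    return parents_of_origin == {"father", "mother"}


if __name__ == "__main__":
    # The informative site from the Phase entry: proband A/G, father A/A, mother A/G.
    print(phase_by_transmission(("A", "G"), ("A", "A"), ("A", "G")))
    # A site where every member of the trio is heterozygous cannot be phased.
    print(phase_by_transmission(("A", "G"), ("A", "G"), ("A", "G")))
    # An allele absent from both parents is flagged as de novo.
    print(is_de_novo(("A", "T"), ("A", "A"), ("A", "A")))
```

For the informative site in the Phase entry, the function assigns A to the paternal haplotype and G to the maternal haplotype, in agreement with the glossary's statement that the G allele was transmitted from the mother.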
Cite this article
Eilbeck, K., Quinlan, A. & Yandell, M. Settling the score: variant prioritization and Mendelian disease. Nat Rev Genet 18, 599–612 (2017). https://doi.org/10.1038/nrg.2017.52
DOI: https://doi.org/10.1038/nrg.2017.52