Abstract
Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15–90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.
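As an illustration of the kind of filtering and ranking summarized above, the following minimal Python sketch keeps rare, predicted-damaging variants and then orders genes by a combined score. It is not Exomiser's actual algorithm or API: the `Variant` class, the function names, the thresholds (1% allele frequency, 0.5 pathogenicity) and the simple averaging of variant and phenotype scores are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical, simplified variant record (not Exomiser's data model)."""
    gene: str
    allele_frequency: float  # population frequency in [0, 1]
    pathogenicity: float     # predicted score in [0, 1]; higher = more damaging


def filter_variants(variants, max_frequency=0.01, min_pathogenicity=0.5):
    """Keep variants that are rare and predicted to be damaging.

    Both thresholds are illustrative assumptions.
    """
    return [
        v for v in variants
        if v.allele_frequency <= max_frequency
        and v.pathogenicity >= min_pathogenicity
    ]


def rank_genes(variants, phenotype_scores):
    """Rank genes by the mean of their best variant score and a phenotype score.

    `phenotype_scores` maps gene symbol -> phenotype relevance in [0, 1];
    genes without an entry get 0.0. The averaging is an assumed toy scheme.
    """
    best_variant_score = {}
    for v in variants:
        current = best_variant_score.get(v.gene, 0.0)
        best_variant_score[v.gene] = max(current, v.pathogenicity)

    combined = {
        gene: (score + phenotype_scores.get(gene, 0.0)) / 2
        for gene, score in best_variant_score.items()
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    variants = [
        Variant("GENE_A", 0.0001, 0.9),
        Variant("GENE_B", 0.2, 0.95),   # common variant, removed by the filter
        Variant("GENE_C", 0.001, 0.6),
    ]
    phenotype_scores = {"GENE_A": 0.4, "GENE_C": 0.9}
    for gene, score in rank_genes(filter_variants(variants), phenotype_scores):
        print(gene, round(score, 2))
```

In this toy example, the common variant in GENE_B is filtered out despite its high pathogenicity score, and GENE_C ranks above GENE_A because its stronger phenotype match outweighs its lower variant score.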
Acknowledgements
This project was supported by the Bundesministerium für Bildung und Forschung (BMBF; project no. 0313911), the European Community's Seventh Framework Programme (grant agreement no. 602300; SYBIL) and NIH grant no. 5R24OD011883 (Monarch Initiative).
Author information
Contributions
P.N.R. and D.S. conceived the project, programmed the prototype code and wrote the manuscript. J.O.B.J., M.J., S.K., M.S., N.L.W. and E.S. developed software. T.Z., O.J.B. and W.P.B. tested code and contributed to the development of analysis strategies. S.K., M.A.H. and P.N.R. developed the phenotype analysis framework. M.A.H. helped develop the ontologies and the HPO curation standard. All authors reviewed and approved the manuscript.
Ethics declarations
Competing interests
S.K. and P.N.R. are holders of a patent for an ontology-based search methodology.
About this article
Cite this article
Smedley, D., Jacobsen, J., Jäger, M. et al. Next-generation diagnostics and disease-gene discovery with the Exomiser. Nat Protoc 10, 2004–2015 (2015). https://doi.org/10.1038/nprot.2015.124
DOI: https://doi.org/10.1038/nprot.2015.124