Article | Published:

Phrank measures phenotype sets similarity to greatly improve Mendelian diagnostic disease prioritization

Genetics in Medicine volume 21, pages 464–470 (2019)




Exome sequencing and diagnosis are beginning to spread across the medical establishment. The most time-consuming part of genome-based diagnosis is the manual step of matching the potentially long list of patient candidate genes to patient phenotypes to identify the causative disease.


We introduce Phrank (for phenotype ranking), an information theory–inspired method that utilizes a Bayesian network to prioritize candidate diseases or genes, as a stand-alone module that can be run with any underlying knowledgebase and any variant filtering scheme.
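The general idea of scoring phenotype set similarity by information content can be illustrated with a toy sketch. This is not the published Phrank algorithm: it does not build a Bayesian network, and it simply sums a per-term information content over the phenotype terms (and ancestors) shared by the patient and each candidate disease. The ontology, the disease annotations, and all function names below are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms.
PARENTS = {
    "root": set(),
    "neuro": {"root"},
    "seizure": {"neuro"},
    "ataxia": {"neuro"},
    "cardiac": {"root"},
    "arrhythmia": {"cardiac"},
}

# Hypothetical knowledgebase: disease -> annotated phenotype terms.
DISEASE_TERMS = {
    "disease_A": {"seizure", "ataxia"},
    "disease_B": {"arrhythmia"},
    "disease_C": {"seizure", "arrhythmia"},
    "disease_D": {"ataxia"},
}


def closure(terms):
    """Return the given terms together with all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def information_content():
    """IC(t) = -log2(fraction of diseases annotated with t or a descendant)."""
    closures = [closure(terms) for terms in DISEASE_TERMS.values()]
    total = len(closures)
    ic = {}
    for term in PARENTS:
        count = sum(term in c for c in closures)
        ic[term] = -math.log2(count / total) if count else 0.0
    return ic


def similarity(patient_terms, disease_terms, ic):
    """Sum the information content of terms shared by both ancestor closures."""
    shared = closure(patient_terms) & closure(disease_terms)
    return sum(ic[term] for term in shared)


def rank_diseases(patient_terms):
    """Rank candidate diseases by descending similarity (ties broken by name)."""
    ic = information_content()
    scores = {
        disease: similarity(patient_terms, terms, ic)
        for disease, terms in DISEASE_TERMS.items()
    }
    return sorted(scores, key=lambda d: (-scores[d], d))


ranking = rank_diseases({"seizure", "ataxia"})
print(ranking)
```

In this toy setting, rare and specific terms contribute more to the score than common or general ones, so a disease sharing both specific patient phenotypes ranks ahead of one sharing only the uninformative root term. Because the knowledgebase is just a dictionary passed to the scoring functions, the same sketch could be run over any annotation source or any pre-filtered candidate list.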


Phrank outperforms existing methods at ranking the causative disease or gene when applied to 169 real patient exomes with Mendelian diagnoses. Phrank’s greatest improvement is in disease space, where across all 169 patients it ranks only 3 diseases on average ahead of the true diagnosis, whereas Phenomizer ranks 32 diseases ahead of the causal one.


Busy clinicians who use Phrank to rank all patient candidate genes or diseases when they start working through a new case can save considerable time in deriving a genetic diagnosis.





We thank Yosuke Tanigawa, Ethan Dyer, Golan Yona, and all other members of the Bejerano Lab for valuable discussions and project feedback. We would also like to thank the European Genome-Phenome Archive (EGA) and the Deciphering Developmental Disorders (DDD) project. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund (grant number HICF-1009-003), a parallel funding partnership between the Wellcome Trust and the Department of Health, and the Wellcome Trust Sanger Institute (grant number WT098051). The views expressed in this publication are those of the author(s) and not necessarily those of the Wellcome Trust or the Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12, granted by the Republic of Ireland REC). The research team acknowledges the support of the National Institute for Health Research, through the Comprehensive Clinical Research Network, as well as the patients and professionals involved in the DDD study deposited in the EGA. This work was funded in part by the Stanford Graduate Fellowship and CEHG Fellowship to K.A.J., a Bio-X Stanford Interdisciplinary Graduate Fellowship to J.B., the Stanford Pediatrics Department, DARPA, a Packard Foundation Fellowship, and a Microsoft Faculty Fellowship to G.B.

Author information

Author notes

  1. These authors contributed equally: Karthik A. Jagadeesh, Johannes Birgmeier


  1. Department of Computer Science, Stanford University, Stanford, California, 94305, USA

    • Karthik A. Jagadeesh MSc
    • Johannes Birgmeier MSc
    • Cole A. Deisseroth
    • Gill Bejerano PhD
  2. Department of Pediatrics, Stanford University, Stanford, California, 94305, USA

    • Harendra Guturu PhD
    • Aaron M. Wenger PhD
    • Jonathan A. Bernstein MD, PhD
    • Gill Bejerano PhD
  3. Department of Developmental Biology, Stanford University, Stanford, California, 94305, USA

    • Gill Bejerano PhD




The authors declare no conflicts of interest.

Corresponding author

Correspondence to Gill Bejerano PhD.
