Brief Communication

Phenolyzer: phenotype-based prioritization of candidate genes for human diseases

Abstract

Prior biological knowledge and phenotype information may help to identify disease genes from human whole-genome and whole-exome sequencing studies. We developed Phenolyzer (http://phenolyzer.usc.edu), a tool that uses prior information to implicate genes involved in diseases. Phenolyzer exhibits superior performance over competing methods for prioritizing Mendelian and complex disease genes, based on disease or phenotype terms entered as free text.


Acknowledgements

This work was supported by US National Institutes of Health grant R01-HG006465 to K.W. We thank members of the Wang laboratory for testing the Phenolyzer website.

Author information

Affiliations

  1. Zilkha Neurogenetic Institute, University of Southern California, Los Angeles, California, USA.

    • Hui Yang
    • Kai Wang
  2. Neuroscience Graduate Program, University of Southern California, Los Angeles, California, USA.

    • Hui Yang
  3. Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany.

    • Peter N Robinson
  4. Max Planck Institute for Molecular Genetics, Berlin, Germany.

    • Peter N Robinson
  5. Berlin Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Berlin, Germany.

    • Peter N Robinson
  6. Institute for Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.

    • Peter N Robinson
  7. Department of Psychiatry, University of Southern California, Los Angeles, California, USA.

    • Kai Wang
  8. Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, USA.

    • Kai Wang

Authors

  1. Hui Yang

  2. Peter N Robinson

  3. Kai Wang

Contributions

H.Y. compiled the data, performed the computational experiments, developed software tools and drafted the manuscript. P.N.R. advised on phenotype data analysis and interpretation. K.W. designed the study, supervised its execution and revised the manuscript.

Competing interests

K.W. is a board member and stock holder of Tute Genomics, a bioinformatics software company.

Corresponding author

Correspondence to Kai Wang.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–8, Supplementary Tables 1 and 2, and Supplementary Note 1

Excel files

  1. Supplementary Data 1

    The result dataset of causal genes for the 14 monogenic diseases.

  2. Supplementary Data 2

    The result dataset of the 590 monogenic disease genes.

  3. Supplementary Data 3

    The result dataset of candidate genes for the four complex diseases.

  4. Supplementary Data 4

    The result dataset of novel discovered genes from four high-profile human genetics journals.

  5. Supplementary Data 5

    The original prioritized gene lists for ‘craniopharyngiomas’.

  6. Supplementary Data 6

    The original prioritized gene lists for ‘SHORT syndrome’.

  7. Supplementary Data 7

    The original prioritized gene lists for ‘osteoporosis’.

  8. Supplementary Data 8

    The full disease names and HPO identifiers for the term ‘autism’.

Zip files

  1. Supplementary Software

    Phenolyzer Software