Article

An integrative tissue-network approach to identify and test human disease genes

Nature Biotechnology volume 36, pages 1091–1099 (2018)

Abstract

Effective discovery of causal disease genes must overcome the statistical challenges of quantitative genetics studies and the practical limitations of human biology experiments. Here we developed diseaseQUEST, an integrative approach that combines data from human genome-wide disease studies with in silico network models of tissue- and cell-type-specific function in model organisms to prioritize candidates within functionally conserved processes and pathways. We used diseaseQUEST to predict candidate genes for 25 different diseases and traits, including cancer, longevity, and neurodegenerative diseases. Focusing on Parkinson's disease (PD), a diseaseQUEST-directed Caenorhabditis elegans behavioral screen identified several candidate genes, which we experimentally verified and found to be associated with age-dependent motility defects mirroring PD clinical symptoms. Furthermore, knockdown of the top candidate gene, bcat-1, encoding a branched-chain amino acid transferase, caused spasm-like 'curling' and neurodegeneration in C. elegans, paralleling decreased BCAT1 expression in PD patient brains. diseaseQUEST is modular and generalizable to other model organisms and human diseases of interest.

Acknowledgements

We thank K. Yao, R. Hong, and J. Zhou for assistance with video analysis, G. Laevsky for assistance with confocal microscopy, the CGC for strains, and Z. Gitai and the laboratories of O.G.T. and C.T.M. for valuable discussion. Strain UA44 was generously provided by G. Caldwell (University of Alabama), and strain BY250 was a generous gift from R. Blakely (Vanderbilt University). V.Y. was supported in part by US NIH grant T32 HG003284. O.G.T. is supported as a senior fellow of the Genetic Networks program of the Canadian Institute for Advanced Research (CIFAR). C.T.M. is supported as the Director of the Glenn Center for Aging Research at Princeton and as an HHMI-Simons Faculty Scholar. This work was supported by the NIH (R01 GM071966 to O.G.T. and Cognitive Aging R01 and DP1 Pioneer Award to C.T.M.).

Author information

Author notes

    Victoria Yao & Rachel Kaletsky

    These authors contributed equally to this work.

Affiliations

  1. Department of Computer Science, Princeton University, Princeton, New Jersey, USA.

    Victoria Yao & Olga G Troyanskaya

  2. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, USA.

    Victoria Yao, Rachel Kaletsky, William Keyes, Danielle E Mor, Salman Sohrabi, Coleen T Murphy & Olga G Troyanskaya

  3. Department of Molecular Biology, Princeton University, Princeton, New Jersey, USA.

    Rachel Kaletsky, William Keyes, Danielle E Mor, Salman Sohrabi & Coleen T Murphy

  4. Flatiron Institute, Simons Foundation, New York, New York, USA.

    Aaron K Wong & Olga G Troyanskaya

Authors

  1. Victoria Yao

  2. Rachel Kaletsky

  3. William Keyes

  4. Danielle E Mor

  5. Aaron K Wong

  6. Salman Sohrabi

  7. Coleen T Murphy

  8. Olga G Troyanskaya

Contributions

V.Y. and R.K. are joint first authors. W.K. and D.E.M. are joint second authors. V.Y. and O.G.T. conceived the computational study; V.Y. and O.G.T. developed, implemented, and applied all computational methods; R.K. and C.T.M. developed the phenotypic analysis; R.K. and W.K. performed the PD-candidate screen; R.K., D.E.M., and W.K. carried out thrashing assays; V.Y. extended the CeleST package and developed scripts for data processing; S.S. carried out automated analyses of thrashing; V.Y., with W.K. and undergraduate assistants, manually checked CeleST video annotations; W.K., R.K., and D.E.M. carried out manual thrashing analysis; R.K. and D.E.M. performed microscopy experiments, and R.K. carried out all other experiments; V.Y. and A.K.W. developed the WISP website. V.Y., R.K., D.E.M., C.T.M. and O.G.T. wrote the paper.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Coleen T Murphy or Olga G Troyanskaya.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–8

  2. Life Sciences Reporting Summary

  3. Supplementary Note

    Extending diseaseQUEST to other model organisms and diseases.

Excel files

  1. Supplementary Data 1

    203 tissue- and cell-type specific networks.

  2. Supplementary Data 2

    Evaluation of 25 disease predictions.

  3. Supplementary Data 3

    GWAS genes used as gold standard for predictions.

  4. Supplementary Data 4

    Gene Ontology analysis of top ALS disease candidates.

  5. Supplementary Data 5

    Gene Ontology analysis of top schizophrenia candidates.

  6. Supplementary Data 6

    Gene Ontology analysis of top ovarian carcinoma candidates.

  7. Supplementary Data 7

    Gene Ontology analysis of top pancreatic cancer candidates.

  8. Supplementary Data 8

    Evaluation of tissue-specific lifespan gene predictions using human longevity GWAS input.

  9. Supplementary Data 9

    Parkinson's disease-associated gene predictions.

  10. Supplementary Data 10

    Dopaminergic neuron network clustering of top PD gene predictions and functional enrichment per cluster.

  11. Supplementary Data 11

    KEGG pathway and Gene Ontology enrichment of Parkinson's disease predictions.

  12. Supplementary Data 12

    Prioritized PD candidate genes.

  13. Supplementary Data 13

    CeleST worm movement measures of top candidate genes on days 2, 5, and 8.

  14. Supplementary Data 14

    Top non-PD candidate genes tested for curling defects.

  15. Supplementary Data 15

    Human BCAT1 expression data obtained from the Allen Brain Atlas.

Zip files

  1. Supplementary Software

    Sleipnir Library for Computational Functional Genomics

About this article

DOI

https://doi.org/10.1038/nbt.4246