Effective discovery of causal disease genes must overcome the statistical challenges of quantitative genetics studies and the practical limitations of human biology experiments. Here we developed diseaseQUEST, an integrative approach that combines data from human genome-wide disease studies with in silico network models of tissue- and cell-type-specific function in model organisms to prioritize candidates within functionally conserved processes and pathways. We used diseaseQUEST to predict candidate genes for 25 different diseases and traits, including cancer, longevity, and neurodegenerative diseases. Focusing on Parkinson's disease (PD), a diseaseQUEST-directed Caenorhabditis elegans behavioral screen identified several candidate genes, which we experimentally verified and found to be associated with age-dependent motility defects mirroring PD clinical symptoms. Furthermore, knockdown of the top candidate gene, bcat-1, encoding a branched-chain amino acid transferase, caused spasm-like 'curling' and neurodegeneration in C. elegans, paralleling decreased BCAT1 expression in PD patient brains. diseaseQUEST is modular and generalizable to other model organisms and human diseases of interest.
We thank K. Yao, R. Hong, and J. Zhou for assistance with video analysis, G. Laevsky for assistance with confocal microscopy, the CGC for strains, and Z. Gitai and the laboratories of O.G.T. and C.T.M. for valuable discussion. Strain UA44 was generously provided by G. Caldwell (University of Alabama), and strain BY250 was a generous gift from R. Blakely (Vanderbilt University). V.Y. was supported in part by US NIH grant T32 HG003284. O.G.T. is supported as a senior fellow of the Genetic Networks program of the Canadian Institute for Advanced Research (CIFAR). C.T.M. is supported as the Director of the Glenn Center for Aging Research at Princeton and as an HHMI-Simons Faculty Scholar. This work was supported by the NIH (R01 GM071966 to O.G.T. and Cognitive Aging R01 and DP1 Pioneer Award to C.T.M.).
The authors declare no competing financial interests.
Integrated supplementary information
GO enrichment analysis was performed on PD predictions with a score > 2.0 (n=609 genes). Significant GO terms are shown. Bars represent individual Benjamini p-values derived from GO enrichment analysis.
Neuronal RNAi-sensitive animals (unc-119p::sid-1) were exposed to adult-only RNAi targeting 45 top candidate PD genes and tested for thrashing defects on days 2, 5, and 8 of adulthood. Movement was analyzed using CeleST. CeleST quantification of thrashing on day 8 is shown. Control L4440 RNAi (blue), direct GWAS worm orthologs (red), and candidates independently identified using the 23andMe GWAS study (yellow) are shown. Mean ± SEM, unpaired two-sided t-test, Benjamini-Hochberg multiple hypothesis test correction, n ≥ 50 per gene (exact sample sizes per gene in Supplementary Data 13). *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001.
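The Benjamini-Hochberg correction and significance marks named in this legend can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `bh_adjust` and `stars` and the example p-values are hypothetical, and only the thresholds (0.05, 0.01, 0.001, 0.0001) are taken from the legend.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stars(p):
    """Map a p-value to the significance marks listed in the legend."""
    for threshold, mark in ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*")):
        if p < threshold:
            return mark
    return ""


# Hypothetical raw p-values for four RNAi treatments versus control.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
marks = [stars(p) for p in adj]
```

In a screen of this kind, the correction would be applied once across all per-gene comparisons against the control, so that the marks reflect adjusted rather than raw p-values.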
Supplementary Figure 3 Screen of Parkinson's disease–candidate genes for age-specific motor defects.
Animals were exposed to adult-only RNAi targeting 45 top candidate PD genes, and tested for thrashing defects on days 2, 5, and 8 of adulthood. Movement was analyzed using CeleST. (a) Heatmap of (hierarchically clustered) t-statistics comparing 10 CeleST movement measurements for each of the top 45 PD gene candidates against the control L4440 RNAi on day 8 of adulthood, n ≥ 50 per gene (exact sample sizes per gene in Supplementary Data 13). (b) Pearson's correlation of t-statistics for each of the 10 CeleST movement measurements between all pairs of genes tested on days 2, 5, and 8 of adulthood. (c) Principal components were calculated using all 13,048 worms (across 45 genes and 3 days). PCA plot of RNAi-treated worms and control (aggregated by gene and day, see sample sizes in Supplementary Data 13). Colors indicate age of worm. PC1 (x-axis) and PC2 (y-axis) respectively account for 39.36% and 11.85% of the total variation. (d) Neuronal RNAi-sensitive animals were exposed to adult-only RNAi individually targeting 13 top cancer and metabolic disease predictions, bcat-1 (red) as a positive control, or the L4440 negative control. Curling was examined on day 8 using an automated analysis program (Sohrabi, et al. in preparation). Mean ± SEM. Control n=351, bcat-1 n=420, cyb-2.1 n=287, pxl-1 n=289, frm-2 n=279, mre-11 n=272, sma-4 n=286, snt-4 n=305, cdh-4 n=285, lbp-2 n=320, ani-3 n=300, hcp-1 n=264, BE0003N10.1 n=229, let-363 n=284, hil-3 n=270. n represents the number of animals per condition. One-way ANOVA with Tukey's multiple comparisons test. Control vs bcat-1i p= 4.33e-8. ****p<0.0001.
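Panels (a) and (b) rest on two simple quantities: a t-statistic per movement measurement for each gene versus control, and the Pearson correlation between the resulting per-gene profiles. The Python sketch below illustrates both under stated assumptions. The Welch form of the t-statistic is an assumption (the legend does not specify the variant), the helper names are hypothetical, and the toy numbers and the measure names other than body wave number are placeholders rather than data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample, control):
    """t-statistic for two independent samples (Welch form, an assumption)."""
    se = sqrt(variance(sample) / len(sample) + variance(control) / len(control))
    return (mean(sample) - mean(control)) / se


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def t_profile(gene_measures, control_measures):
    """One t-statistic per movement measure, in a fixed (sorted) measure order."""
    return [welch_t(gene_measures[name], control_measures[name])
            for name in sorted(control_measures)]


# Toy per-worm values; three measures stand in for the ten used in the figure.
control = {"body_wave_number": [1.0, 1.2, 0.9, 1.1],
           "measure_b": [5.0, 5.5, 4.8, 5.2],
           "measure_c": [0.30, 0.35, 0.28, 0.33]}
gene_x = {"body_wave_number": [1.4, 1.6, 1.3, 1.5],
          "measure_b": [4.0, 4.2, 3.9, 4.4],
          "measure_c": [0.31, 0.36, 0.27, 0.34]}
gene_y = {"body_wave_number": [1.3, 1.5, 1.4, 1.2],
          "measure_b": [4.5, 4.1, 4.3, 4.6],
          "measure_c": [0.29, 0.33, 0.30, 0.35]}

profile_x = t_profile(gene_x, control)
profile_y = t_profile(gene_y, control)
similarity = pearson(profile_x, profile_y)
```

Computing `pearson` for every pair of gene profiles would yield the kind of gene-by-gene correlation matrix described in panel (b).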
CeleST was used to analyze control and bcat-1 RNAi-treated worms on day 2, 5, and 8 of adulthood. Mean ± SEM, two-way ANOVA with Sidak's multiple comparisons test, Control: day 2 n=492, day 5 n=345, day 8 n=573. bcat-1 RNAi: day 2 n=675, day 5 n=714, day 8 n=582. Body wave number day 2 control vs bcat-1i: t=3.075, df=3375, 95% CI: (-0.2648, -0.03323), p=0.0064.
Neuron-RNAi-insensitive wild-type (N2) worms treated with control (L4440) or bcat-1 RNAi do not exhibit curling on day 8 of adulthood compared to neuron-RNAi-sensitive animals (unc-119p::sid-1). Mean ± SEM, *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001, two-way repeated measures ANOVA, Tukey's post hoc tests. Worm thrashing videos were hand counted. (a) Control;unc-119p::sid-1 n=28 animals, bcat-1i;unc-119p::sid-1 n=41 animals, control;wild type n=24 animals, bcat-1i;wild type n=30 animals. Multiple comparisons: Control;unc-119p::sid-1 vs. bcat-1i;unc-119p::sid-1 t=3.156, df=119, 95% CI: (-18.7, -1.491), p=0.0121. Control;wild type vs. bcat-1i;wild type t=0.7787, df=119, 95% CI: (-11.98, 6.577), p=0.9684. bcat-1i;unc-119p::sid-1 vs. bcat-1i;wild type t=3.422, df=119, 95% CI: (2.272, 18.55), p=0.0051. (b) Control;unc-119p::sid-1 n=75 animals, bcat-1i;unc-119p::sid-1 n=86 animals, control;wild type n=73 animals, bcat-1i;wild type n=76 animals. Multiple comparisons: Control;unc-119p::sid-1 vs. bcat-1i;unc-119p::sid-1 t=4.305, df=306, 95% CI: (-10.68, -2.546), p=0.000135. Control;wild type vs. bcat-1i;wild type t=0.8621, df=306, 95% CI: (-5.595, 2.847), p=0.948. bcat-1i;unc-119p::sid-1 vs. bcat-1i;wild type t=4.576, df=306, 95% CI: (2.952, 11.06), p=0.00041.
Supplementary Figure 6 BCAT1 expression in selected brain regions in healthy human subjects from the Allen Brain Atlas.
Average BCAT1 expression in selected brain regions of healthy human individuals, obtained from the Allen Brain Atlas. Expression data for each of three BCAT1 probes is shown for several major brain regions, in addition to four regions that degenerate in PD. Probe A, A_23_P87528; Probe B, A_24_P52921; Probe C, A_24_P935986. Mean ± SEM. n=6 human donors for each sample from the Allen Brain Atlas database for gene expression. Box plots show minimum, first quartile, median, third quartile, and maximum values.
Supplementary Figure 7 bcat-1 knockdown does not alter ADE cell-body numbers in the presence of α-synuclein.
ADE cell bodies were counted on Day 8 in neuron-RNAi sensitive worms expressing α-synuclein and GFP in dopaminergic neurons. Mean ± SEM, unpaired two-sided Student's t-test. L4440 n=45 animals, bcat-1i n=61 animals. t=0.4156, df=104, 95% CI: (-0.3112, 0.2033), p=0.6785. The experiment was repeated three times independently with similar results. Box plots show minimum, first quartile, median, third quartile, and maximum values.
Supplementary Figure 8 The Functional Representation module is robust to data compendium size, amount of prior knowledge, and initialization state.
The semi-supervised network construction approach was applied to (a) ten progressively smaller compendia subsampled from the full worm compendium (without replacement) and (b) seven progressively smaller sets of tissue gene annotations subsampled from all previously known tissue genes (without replacement). Each measurement is an average of 10 independent simulations, and the standard error (shaded regions) is shown.
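The subsampling design in this legend (draw progressively smaller subsets without replacement, repeat 10 times, report the mean and standard error) can be sketched in Python as follows. This is an illustrative outline only: `subsample_curve` and its parameters are hypothetical names, and the `evaluate` callback is a toy stand-in for whatever performance measure is computed on each subsampled compendium or annotation set.

```python
import random
from math import sqrt
from statistics import mean, stdev


def subsample_curve(items, sizes, evaluate, n_repeats=10, seed=0):
    """For each subset size, return (size, mean score, standard error)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    curve = []
    for size in sizes:
        # random.sample draws without replacement, as in the legend.
        scores = [evaluate(rng.sample(items, size)) for _ in range(n_repeats)]
        sem = stdev(scores) / sqrt(n_repeats) if n_repeats > 1 else 0.0
        curve.append((size, mean(scores), sem))
    return curve


# Toy stand-in: the "performance" is simply the fraction of items retained.
datasets = [f"dataset_{i}" for i in range(100)]
curve = subsample_curve(
    datasets,
    sizes=[100, 50, 10],
    evaluate=lambda subset: len(subset) / len(datasets),
)
```

Plotting the mean against subset size, with the standard error as a shaded band, gives the kind of robustness curve the legend describes.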
Supplementary Figures 1–8 (PDF 1540 kb)
203 tissue- and cell-type specific networks. (XLSX 19 kb)
Evaluation of 25 disease predictions. (XLSX 10 kb)
GWAS genes used as gold standard for predictions. (XLSX 39 kb)
Gene Ontology analysis of top ALS disease candidates. (XLSX 139 kb)
Gene Ontology analysis of top schizophrenia candidates. (XLSX 148 kb)
Gene Ontology analysis of top ovarian carcinoma candidates. (XLSX 93 kb)
Gene Ontology analysis of top pancreatic cancer candidates. (XLSX 99 kb)
Evaluation of tissue-specific lifespan gene predictions using human longevity GWAS input. (XLSX 5016 kb)
Parkinson's disease-associated gene predictions. (XLSX 563 kb)
Dopaminergic neuron network clustering of top PD gene predictions and functional enrichment per cluster. (XLSX 85 kb)
KEGG pathway and Gene Ontology enrichment of Parkinson's disease predictions. (XLSX 32 kb)
Prioritized PD candidate genes. (XLSX 33 kb)
CeleST worm movement measures of top candidate genes on days 2, 5, and 8. (XLSX 186 kb)
Top non-PD candidate genes tested for curling defects. (XLSX 10 kb)
Human BCAT1 expression data obtained from the Allen Brain Atlas. (XLS 178 kb)
Sleipnir Library for Computational Functional Genomics (ZIP 1518 kb)
Extending diseaseQUEST to other model organisms and diseases. (PDF 166 kb)
About this article
Cite this article
Yao, V., Kaletsky, R., Keyes, W. et al. An integrative tissue-network approach to identify and test human disease genes. Nat Biotechnol 36, 1091–1099 (2018). https://doi.org/10.1038/nbt.4246