Gene prioritization through genomic data fusion

Aerts, Stein; Lambrechts, Diether; Maity, Sunit; Van Loo, Peter; Coessens, Bert; De Smet, Frederik; Tranchevent, Leon-Charles; De Moor, Bart; Marynen, Peter; Hassan, Bassem; Carmeliet, Peter; Moreau, Yves

doi:10.1038/nbt1203

Analysis
Published: 05 May 2006

Stein Aerts¹,⁴, Diether Lambrechts², Sunit Maity², Peter Van Loo³,⁴, Bert Coessens⁴, Frederik De Smet², Leon-Charles Tranchevent⁴, Bart De Moor⁴, Peter Marynen³, Bassem Hassan¹, Peter Carmeliet² & Yves Moreau⁴

Nature Biotechnology volume 24, pages 537–544 (2006)

An Erratum to this article was published on 01 June 2006

Abstract

The identification of genes involved in health and disease remains a challenge. We describe a bioinformatics approach, together with a freely accessible, interactive and flexible software termed Endeavour, to prioritize candidate genes underlying biological processes or diseases, based on their similarity to known genes involved in these phenomena. Unlike previous approaches, ours generates distinct prioritizations for multiple heterogeneous data sources, which are then integrated, or fused, into a global ranking using order statistics. In addition, it offers the flexibility of including additional data sources. Validation of our approach revealed it was able to efficiently prioritize 627 genes in disease data sets and 76 genes in biological pathway sets, identify candidates of 16 mono- or polygenic diseases, and discover regulatory genes of myeloid differentiation. Furthermore, the approach identified a novel gene involved in craniofacial development from a 2-Mb chromosomal region, deleted in some patients with DiGeorge-like birth defects. The approach described here offers an alternative integrative method for gene discovery.
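The fusion step mentioned in the abstract can be illustrated with a small sketch. The abstract only states that per-source prioritizations are combined "using order statistics"; the specific statistic below (the joint cumulative distribution of uniform order statistics, evaluated at the sorted rank ratios) is an assumption made for illustration, as are the function names, the source names and the gene labels. It also assumes that every data source ranks the same set of candidates.

```python
from math import factorial


def q_statistic(rank_ratios):
    """Probability that N uniform order statistics all fall at or below
    the sorted rank ratios (smaller values mean consistently high ranks)."""
    r = sorted(rank_ratios)
    n = len(r)
    v = [1.0]  # v[0] = 1 starts the recursion
    for k in range(1, n + 1):
        rk = r[n - k]  # the k-th largest rank ratio
        total = 0.0
        for i in range(1, k + 1):
            total += (-1) ** (i - 1) * v[k - i] / factorial(i) * rk ** i
        v.append(total)
    return factorial(n) * v[n]


def fuse_rankings(rankings):
    """Fuse per-source rankings (best candidate first) into one global order.

    `rankings` maps a source name to a list of gene identifiers.
    Assumes every source ranks the same candidates.
    """
    genes = next(iter(rankings.values()))
    scores = {}
    for gene in genes:
        # Rank ratio = 1-based position divided by the number of candidates.
        ratios = [(order.index(gene) + 1) / len(order)
                  for order in rankings.values()]
        scores[gene] = q_statistic(ratios)
    # Sort ascending by Q; sorted() is stable, so ties keep their input order.
    return sorted(genes, key=lambda g: scores[g])


# Hypothetical example: three data sources ranking three candidate genes.
example = {
    "expression": ["G1", "G2", "G3"],
    "literature": ["G2", "G1", "G3"],
    "pathways": ["G1", "G3", "G2"],
}
fused = fuse_rankings(example)
print(fused)
```

In this toy input, G1 ranks first in two of the three sources and so receives the smallest Q value. A candidate that is missing from a source would need extra handling that this sketch does not attempt.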

**Figure 1: Concept of prioritization by Endeavour.**

**Figure 4: *In vitro* functional validation of Endeavour.**

**Figure 5: Functional validation of Endeavour in zebrafish.**

References

1. Quackenbush, J. Genomics. Microarrays—guilt by association. Science 302, 240–241 (2003).
2. Kanehisa, M. & Bork, P. Bioinformatics in the post-sequence era. Nat. Genet. 33 (Suppl.), 305–310 (2003).
3. Ball, C.A., Sherlock, G. & Brazma, A. Funding high-throughput data sharing. Nat. Biotechnol. 22, 1179–1183 (2004).
4. Freudenberg, J. & Propping, P. A similarity-based method for genome-wide prediction of disease-relevant human genes. Bioinformatics 18 (Suppl. 2), S110–S115 (2002).
5. Perez-Iratxeta, C., Bork, P. & Andrade, M.A. Association of genes to genetically inherited diseases using data mining. Nat. Genet. 31, 316–319 (2002).
6. Turner, F.S., Clutterbuck, D.R. & Semple, C.A. POCUS: mining genomic sequence annotation to predict disease genes. Genome Biol. 4, R75 (2003).
7. Tiffin, N. et al. Integration of text- and data-mining using ontologies successfully selects disease gene candidates. Nucleic Acids Res. 33, 1544–1552 (2005).
8. Adie, E.A., Adams, R.R., Evans, K.L., Porteous, D.J. & Pickard, B.S. Speeding disease gene discovery by sequence based candidate prioritization. BMC Bioinformatics 6, 55 (2005).
9. Lopez-Bigas, N. & Ouzounis, C.A. Genome-wide identification of genes likely to be involved in human genetic disease. Nucleic Acids Res. 32, 3108–3114 (2004).
10. Kent, W.J. et al. Exploring relationships and mining data with the UCSC Gene Sorter. Genome Res. 15, 737–741 (2005).
11. Altermann, E. & Klaenhammer, T.R. PathwayVoyager: pathway mapping using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. BMC Genomics 6, 60 (2005).
12. Aerts, S. et al. TOUCAN 2: the all-inclusive open source workbench for regulatory sequence analysis. Nucleic Acids Res. 33, W393–W396 (2005).
13. Aerts, S., Van Loo, P., Thijs, G., Moreau, Y. & De Moor, B. Computational detection of cis-regulatory modules. Bioinformatics 19 (Suppl. 2), II5–II14 (2003).
14. Tamayo, P. et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc. Natl. Acad. Sci. USA 96, 2907–2912 (1999).
15. Stegmaier, K. et al. Gene expression-based high-throughput screening (GE-HTS) and application to leukemia differentiation. Nat. Genet. 36, 257–263 (2004).
16. Pixley, F.J. et al. BCL6 suppresses RhoA activity to alter macrophage morphology and motility. J. Cell Sci. 118, 1873–1883 (2005).
17. Galimi, F. et al. Hepatocyte growth factor is a regulator of monocyte-macrophage function. J. Immunol. 166, 1241–1247 (2001).
18. Brown, N.J. et al. Fas death receptor signaling represses monocyte numbers and macrophage activation in vivo. J. Immunol. 173, 7584–7593 (2004).
19. Scambler, P.J. The 22q11 deletion syndromes. Hum. Mol. Genet. 9, 2421–2426 (2000).
20. Baldini, A. Dissecting contiguous gene defects: TBX1. Curr. Opin. Genet. Dev. 15, 279–284 (2005).
21. Jerome, L.A. & Papaioannou, V.E. DiGeorge syndrome phenotype in mice mutant for the T-box gene, Tbx1. Nat. Genet. 27, 286–291 (2001).
22. Merscher, S. et al. TBX1 is responsible for cardiovascular defects in velo-cardio-facial/DiGeorge syndrome. Cell 104, 619–629 (2001).
23. Lindsay, E.A. et al. Tbx1 haploinsufficieny in the DiGeorge syndrome region causes aortic arch defects in mice. Nature 410, 97–101 (2001).
24. Piotrowski, T. et al. The zebrafish van gogh mutation disrupts tbx1, which is involved in the DiGeorge deletion syndrome in humans. Development 130, 5043–5052 (2003).
25. Rauch, A. et al. A novel 22q11.2 microdeletion in DiGeorge syndrome. Am. J. Hum. Genet. 64, 659–666 (1999).
26. Graham, A. The development and evolution of the pharyngeal arches. J. Anat. 199, 133–141 (2001).
27. Stalmans, I. et al. VEGF: a modifier of the del22q11 (DiGeorge) syndrome? Nat. Med. 9, 173–182 (2003).
28. Glenisson, P. et al. TXTGate: profiling gene groups with text-based information. Genome Biol. 5, R43 (2004).
29. Bader, G.D., Betel, D. & Hogue, C.W. BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res. 31, 248–250 (2003).
30. Aerts, S., Van Loo, P., Moreau, Y. & De Moor, B. A genetic algorithm for the detection of new cis-regulatory modules in sets of coregulated genes. Bioinformatics 20, 1974–1976 (2004).
31. Stuart, J.M., Segal, E., Koller, D. & Kim, S.K. A gene-coexpression network for global discovery of conserved genetic modules. Science 302, 249–255 (2003).
32. Westerfield, M. The Zebrafish Book. A Guide for the Laboratory Use of Zebrafish (University of Oregon Press, Eugene, Oregon, 1994).
33. Kimmel, C.B. et al. The shaping of pharyngeal cartilages during early development of the zebrafish. Dev. Biol. 203, 245–263 (1998).
34. Splawski, I. et al. Ca(V)1.2 calcium channel dysfunction causes a multisystem disorder including arrhythmia and autism. Cell 119, 19–31 (2004).
35. Robinson, S.W. et al. Missense mutations in CRELD1 are associated with cardiac atrioventricular septal defects. Am. J. Hum. Genet. 72, 1047–1052 (2003).
36. Hayashi, T. et al. Identification and functional analysis of a caveolin-3 mutation associated with familial hypertrophic cardiomyopathy. Biochem. Biophys. Res. Commun. 313, 178–184 (2004).
37. Zimprich, A. et al. Mutations in LRRK2 cause autosomal-dominant parkinsonism with pleomorphic pathology. Neuron 44, 601–607 (2004).
38. Zuchner, S. et al. Mutations in the pleckstrin homology domain of dynamin 2 cause dominant intermediate Charcot-Marie-Tooth disease. Nat. Genet. 37, 289–294 (2005).
39. Munch, C. et al. Point mutations of the p150 subunit of dynactin (DCTN1) gene in ALS. Neurology 63, 724–726 (2004).
40. Tian, X.L. et al. Identification of an angiogenic factor that when mutated causes susceptibility to Klippel-Trenaunay syndrome. Nature 427, 640–645 (2004).
41. Bienengraeber, M. et al. ABCC9 mutations identified in human dilated cardiomyopathy disrupt catalytic KATP channel gating. Nat. Genet. 36, 382–387 (2004).
42. Windpassinger, C. et al. Heterozygous missense mutations in BSCL2 are associated with distal hereditary motor neuropathy and Silver syndrome. Nat. Genet. 36, 271–276 (2004).
43. Tonkin, E.T., Wang, T.J., Lisgo, S., Bamshad, M.J. & Strachan, T. NIPBL, encoding a homolog of fungal Scc2-type sister chromatid cohesion proteins and fly Nipped-B, is mutated in Cornelia de Lange syndrome. Nat. Genet. 36, 636–641 (2004).
44. Krantz, I.D. et al. Exclusion of linkage to the CDL1 gene region on chromosome 3q26.3 in some familial cases of Cornelia de Lange syndrome. Am. J. Med. Genet. 101, 120–129 (2001).
45. Wang, X. et al. Positional identification of TNFSF4, encoding OX40 ligand, as a gene that influences atherosclerosis susceptibility. Nat. Genet. 37, 365–372 (2005).
46. Peltekova, V.D. et al. Functional variants of OCTN cation transporter genes are associated with Crohn disease. Nat. Genet. 36, 471–475 (2004).
47. Aharon-Peretz, J., Rosenbaum, H. & Gershoni-Baruch, R. Mutations in the glucocerebrosidase gene and Parkinson's disease in Ashkenazi Jews. N. Engl. J. Med. 351, 1972–1977 (2004).
48. Begovich, A.B. et al. A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis. Am. J. Hum. Genet. 75, 330–337 (2004).
49. Helgadottir, A. et al. The gene encoding 5-lipoxygenase activating protein confers risk of myocardial infarction and stroke. Nat. Genet. 36, 233–239 (2004).
50. Bertram, L. et al. Family-based association between Alzheimer's disease and variants in UBQLN1. N. Engl. J. Med. 352, 884–894 (2005).

Acknowledgements

We wish to thank all groups and consortia that made their data freely available: Ensembl, NCBI (EntrezGene and Medline), Gene Ontology, BIND, KEGG, Atlas, InterPro, BioBase, the Disease Probabilities from Lopez-Bigas and Ouzounis⁹ and the Prospectr scores from Euan Adie⁸. We also thank the following people for their help in particular areas: Robert Vlietinck with the manuscript, Patrick Glenisson with text mining, Joke Allemeersch and Gert Thijs with the order statistics and Camilla Esguerra with the zebrafish experiments. S.A., D.L. and P.V.L. are sponsored by the Research Foundation Flanders (FWO). This work is supported by Flanders Institute for Biotechnology (VIB), Instituut voor de aanmoediging van Innovatie door Wetenschap en Technologie in Vlaanderen (IWT) (STWW-00162), Research Council KULeuven (GOA-Ambiorics, IDO genetic networks), FWO (G.0229.03 and G.0413.03), IUAP V-22, K.U.L. Excellentiefinanciering CoE SymBioSys (EF/05/007), EU NoE Biopattern and EU EST BIOPTRAIN to Y.M., and by the FWO (G.0405.06), GOA/2006/11 and GOA/2001/09, Squibb and EULSHB-CT-2004-503573 to P.C.

Author information

Stein Aerts, Diether Lambrechts, Sunit Maity, Peter Van Loo and Bert Coessens: These authors contributed equally to this work.

Authors and Affiliations

Department of Human Genetics, Laboratory of Neurogenetics, Flanders Interuniversity Institute for Biotechnology (VIB), University of Leuven, Herestraat 49, bus 602, Leuven, 3000, Belgium
Stein Aerts & Bassem Hassan
The Center for Transgene Technology and Gene Therapy, Flanders Interuniversity Institute for Biotechnology (VIB), University of Leuven, Herestraat 49, bus 602, Leuven, 3000, Belgium
Diether Lambrechts, Sunit Maity, Frederik De Smet & Peter Carmeliet
Department of Human Genetics, Human Genome Laboratory, Flanders Interuniversity Institute for Biotechnology (VIB), University of Leuven, Herestraat 49, bus 602, Leuven, 3000, Belgium
Peter Van Loo & Peter Marynen
Department of Electrical Engineering (ESAT-SCD), Bioinformatics Group, University of Leuven, Belgium
Stein Aerts, Peter Van Loo, Bert Coessens, Leon-Charles Tranchevent, Bart De Moor & Yves Moreau

Corresponding author

Correspondence to Stein Aerts.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Aerts, S., Lambrechts, D., Maity, S. et al. Gene prioritization through genomic data fusion. Nat Biotechnol 24, 537–544 (2006). https://doi.org/10.1038/nbt1203

Published: 05 May 2006
Issue Date: 01 May 2006
DOI: https://doi.org/10.1038/nbt1203

Supplementary information

Supplementary Fig. 1

Supplementary Fig. 2

Supplementary Fig. 3

Supplementary Table 1

Supplementary Table 2

Supplementary Methods (PDF 119 kb)

Supplementary Notes (PDF 87 kb)
