An efficient and effective method to identify significantly perturbed subnetworks in cancer

Abstract

The identification of key functional biological networks from high-dimensional genomics data is pivotal for cancer research. Here, we introduce FDRnet, a method for the detection of molecular subnetworks in cancer, which addresses several challenges in pathway analysis. FDRnet detects key subnetworks by solving a mixed-integer linear programming problem, using a given upper bound of false discovery rate (FDR) as a budget constraint, and minimizing a conductance score to find dense subgraphs around seed genes. A large-scale benchmark study was performed on both simulation and cancer genomics data. FDRnet outperformed other methods in the ability to detect functionally homogeneous subnetworks in a scale-free biological network, to control FDRs of the genes in detected subnetworks, to improve computational efficiency and to integrate multi-omics data. By overcoming the limitations of existing approaches, FDRnet can facilitate the detection of key functional pathways in cancer and other genetic diseases.
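The two quantities named in the abstract, a conductance score and an FDR budget, can be illustrated on a toy graph. The following is only a minimal sketch and not the FDRnet implementation: it assumes the standard graph-theoretic definition of conductance (cut edges divided by the smaller of the two volumes) and assumes that the FDR of a subnetwork is estimated as the mean of per-gene local FDRs. The function names, the toy network and the local FDR values are hypothetical, and the mixed-integer linear program itself is not shown.

```python
from statistics import mean


def conductance(adj, subset):
    """Conductance of `subset` in an undirected graph {node: set(neighbours)}.

    Assumed definition: edges leaving the subset divided by the smaller of
    the two volumes (sum of degrees inside and outside the subset).
    """
    subset = set(subset)
    cut = sum(1 for u in subset for v in adj[u] if v not in subset)
    vol_in = sum(len(adj[u]) for u in subset)
    vol_out = sum(len(adj[u]) for u in adj if u not in subset)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


def subnetwork_fdr(local_fdr, subset):
    """Assumed estimate: mean local FDR of the genes in the subnetwork."""
    return mean(local_fdr[g] for g in subset)


def within_budget(local_fdr, subset, bound):
    """Check the FDR budget constraint for a candidate subnetwork."""
    return subnetwork_fdr(local_fdr, subset) <= bound


# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
# Hypothetical local FDRs (low values indicate likely perturbed genes).
local_fdr = {"A": 0.01, "B": 0.05, "C": 0.03, "D": 0.60, "E": 0.90}

candidate = {"A", "B", "C"}
print(conductance(adj, candidate))
print(subnetwork_fdr(local_fdr, candidate))
print(within_budget(local_fdr, candidate, bound=0.1))
print(within_budget(local_fdr, candidate | {"D"}, bound=0.1))
```

In this toy case the triangle is both dense (only one edge leaves it) and within a budget of 0.1, whereas adding gene D raises the mean local FDR above the bound.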

Fig. 1: Overview of the proposed method.
Fig. 2: Comparison of six methods in terms of their abilities to detect target genes and modular structures and to control FDRs of identified subnetworks using simulation data.
Fig. 3: Detecting significantly mutated subnetworks in breast cancer using The Cancer Genome Atlas copy number and somatic mutation data.
Fig. 4: Detecting pathways differentially expressed between germinal center B-cell like (GCB) and activated B-cell like (ABC) diffuse large B-cell lymphoma using gene expression data.
Fig. 5: Running time of six methods applied to simulation, breast cancer and lymphoma data.

Data availability

The breast cancer somatic mutation and copy number data (dbGaP study accession no. phs000178) were downloaded from the TCGA Firehose website (https://gdac.broadinstitute.org). The iRefIndex9.0 PPI network, the BioGRID v3.5.187 PPI network and the ReactomeFI v2019 PPI network were downloaded from http://compbio-research.cs.brown.edu/pancancer/hotnet2/, https://thebiogrid.org and https://reactome.org, respectively, without any restriction. For the lymphoma study, the gene expression data and the interactome data (HPRD PPI network) were obtained from the BioNet package (https://www.bioconductor.org/packages/release/bioc/html/BioNet.html) without any restriction. Source data are provided with this paper.

Code availability

The software and user manual are available at https://github.com/yangle293/FDRnet (https://doi.org/10.5281/zenodo.4121885; ref. 61) and www.acsu.buffalo.edu/~yijunsun/lab/FDRnet.html.

References

  1. Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905 (2010).

  2. The Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).

  3. Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385 (2018).

  4. Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).

  5. Dees, N. D. et al. MuSiC: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598 (2012).

  6. Stransky, N. et al. The mutational landscape of head and neck squamous cell carcinoma. Science 333, 1157–1160 (2011).

  7. Chapman, M. A. et al. Initial genome sequencing and analysis of multiple myeloma. Nature 471, 467–472 (2011).

  8. Raphael, B. J., Dobson, J. R., Oesper, L. & Vandin, F. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine. Genome Med. 6, 5 (2014).

  9. Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18, S233–S240 (2002).

  10. Dittrich, M. T., Klau, G. W., Rosenwald, A., Dandekar, T. & Müller, T. Identifying functional modules in protein–protein interaction networks: an integrated exact approach. Bioinformatics 24, 223–231 (2008).

  11. Vandin, F., Upfal, E. & Raphael, B. J. Algorithms for detecting significantly mutated pathways in cancer. J. Comput. Biol. 18, 507–522 (2011).

  12. Ciriello, G., Cerami, E., Sander, C. & Schultz, N. Mutual exclusivity analysis identifies oncogenic network modules. Genome Res. 22, 398–406 (2012).

  13. Iorio, F. et al. Pathway-based dissection of the genomic heterogeneity of cancer hallmarks’ acquisition with SLAPenrich. Sci. Rep. 8, 1–16 (2018).

  14. Sohler, F., Hanisch, D. & Zimmer, R. New methods for joint analysis of biological networks and expression data. Bioinformatics 20, 1517–1521 (2004).

  15. Nacu, Ş., Critchley-Thorne, R., Lee, P. & Holmes, S. Gene expression network analysis and applications to immunology. Bioinformatics 23, 850–858 (2007).

  16. Leiserson, M. D. et al. Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet. 47, 106–114 (2015).

  17. Reyna, M. A., Leiserson, M. D. & Raphael, B. J. Hierarchical HotNet: identifying hierarchies of altered subnetworks. Bioinformatics 34, i972–i980 (2018).

  18. Razick, S., Magklaras, G. & Donaldson, I. M. iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics 9, 405 (2008).

  19. Giurgiu, M. et al. CORUM: the comprehensive resource of mammalian protein complexes—2019. Nucleic Acids Res. 47, D559–D563 (2019).

  20. Beisser, D., Klau, G. W., Dandekar, T., Müller, T. & Dittrich, M. T. BioNet: an R-package for the functional analysis of biological networks. Bioinformatics 26, 1129–1130 (2010).

  21. Qiu, Y.-Q., Zhang, S., Zhang, X.-S. & Chen, L. Detecting disease associated modules and prioritizing active genes based on high throughput data. BMC Bioinformatics 11, 26 (2010).

  22. Gu, J., Chen, Y., Li, S. & Li, Y. Identification of responsive gene modules by network-based gene clustering and extending: application to inflammation and angiogenesis. BMC Syst. Biol. 4, 47 (2010).

  23. Barabasi, A.-L. & Oltvai, Z. N. Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 5, 101–113 (2004).

  24. Oughtred, R. et al. The BioGRID interaction database: 2019 update. Nucleic Acids Res. 47, D529–D541 (2019).

  25. Jassal, B. et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 48, D498–D503 (2020).

  26. Watson, I. R., Takahashi, K., Futreal, P. A. & Chin, L. Emerging patterns of somatic mutations in cancer. Nat. Rev. Genet. 14, 703–718 (2013).

  27. Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).

  28. Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 45, D777–D783 (2016).

  29. Olivier, M., Hollstein, M. & Hainaut, P. TP53 mutations in human cancers: origins, consequences and clinical use. Cold Spring Harb. Perspect. Biol. 2, a001008 (2010).

  30. Khatri, P. & Drăghici, S. Ontological analysis of gene expression data: current tools, limitations and open problems. Bioinformatics 21, 3587–3595 (2005).

  31. Dustin, D., Gu, G. & Fuqua, S. A. W. ESR1 mutations in breast cancer. Cancer 125, 3714–3728 (2019).

  32. Toy, W. et al. ESR1 ligand-binding domain mutations in hormone-resistant breast cancer. Nat. Genet. 45, 1439–1445 (2013).

  33. Martínez-Iglesias, O., Alonso-Merino, E. & Aranda, A. Tumor suppressive actions of the nuclear receptor corepressor 1. Pharmacol. Res. 108, 75–79 (2016).

  34. Soutourina, J. Transcription regulation by the Mediator complex. Nat. Rev. Mol. Cell Biol. 19, 262–274 (2018).

  35. Eyboulet, F. et al. Mediator links transcription and DNA repair by facilitating Rad2/XPG recruitment. Genes Dev. 27, 2549–2562 (2013).

  36. Rosenwald, A. et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. New Engl. J. Med. 346, 1937–1947 (2002).

  37. Chapuy, B. et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat. Med. 24, 679–690 (2018).

  38. Keshava Prasad, T. et al. Human Protein Reference Database—2009 update. Nucleic Acids Res. 37, D767–D772 (2008).

  39. Xu-Monette, Z. Y. et al. Mutational profile and prognostic significance of TP53 in diffuse large B-cell lymphoma patients treated with R-CHOP: report from an international DLBCL Rituximab-CHOP Consortium Program Study. Blood 120, 3986–3996 (2012).

  40. Lenz, G. & Staudt, L. M. Aggressive lymphomas. New Engl. J. Med. 362, 1417–1429 (2010).

  41. Phelan, J. D. et al. A multiprotein supercomplex controlling oncogenic signalling in lymphoma. Nature 560, 387–391 (2018).

  42. Munoz, J., Dhillon, N., Janku, F., Watowich, S. S. & Hong, D. S. STAT3 inhibitors: finding a home in lymphoma and leukemia. Oncologist 19, 536–544 (2014).

  43. Hatzi, K. et al. A hybrid mechanism of action for BCL6 in B cells defined by formation of functionally distinct complexes at enhancers and promoters. Cell Rep. 4, 578–588 (2013).

  44. Benson, A. R., Gleich, D. F. & Leskovec, J. Higher-order organization of complex networks. Science 353, 163–166 (2016).

  45. Yin, H., Benson, A. R., Leskovec, J. & Gleich, D. F. Local higher-order graph clustering. In Proc. 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 555–564 (ACM, 2017); https://doi.org/10.1145/3097983.3098069

  46. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995).

  47. Efron, B., Tibshirani, R., Storey, J. D. & Tusher, V. Empirical Bayes analysis of a microarray experiment. J. Am. Stat. Assoc. 96, 1151–1160 (2001).

  48. Efron, B. & Tibshirani, R. Using specially designed exponential families for density estimation. Ann. Stat. 24, 2431–2461 (1996).

  49. Strimmer, K. fdrtool: a versatile R package for estimating local and tail area-based false discovery rates. Bioinformatics 24, 1461–1462 (2008).

  50. Langaas, M., Lindqvist, B. H. & Ferkingstad, E. Estimating the proportion of true null hypotheses, with application to DNA microarray data. J. R. Stat. Soc. B 67, 555–572 (2005).

  51. Efron, B. Large-scale simultaneous hypothesis testing: the choice of a null hypothesis. J. Am. Stat. Assoc. 99, 96–104 (2004).

  52. Hong, W.-J., Tibshirani, R. & Chu, G. Local false discovery rate facilitates comparison of different microarray experiments. Nucleic Acids Res. 37, 7483–7497 (2009).

  53. Albert, R. Scale-free networks in cell biology. J. Cell Sci. 118, 4947–4957 (2005).

  54. Dao, P. et al. Inferring cancer subnetwork markers using density-constrained biclustering. Bioinformatics 26, i625–i631 (2010).

  55. Colak, R. et al. Dense graphlet statistics of protein interaction and random networks. In Pacific Symposium on Biocomputing 178–189 (World Scientific, 2009); https://doi.org/10.1142/9789812836939_0018

  56. Adams, W. P. & Sherali, H. D. Linearization strategies for a class of zero-one mixed integer programming problems. Oper. Res. 38, 217–226 (1990).

  57. Fan, N. & Pardalos, P. M. Multi-way clustering and biclustering by the ratio cut and normalized cut in graphs. J. Combin. Optim. 23, 224–251 (2012).

  58. Dilkina, B. N. & Gomes, C. P. Solving connected subgraph problems in wildlife conservation. In 7th International Conference on the Integration of Constraint Programming, Artificial Intelligence and Operations Research 102–116 (ACM, 2010); https://doi.org/10.1007/978-3-642-13520-0_14

  59. IBM, Inc. CPLEX Optimizer Studio 12.7 (2016); https://www.ibm.com/analytics/cplex-optimizer

  60. Andersen, R., Chung, F. & Lang, K. Local graph partitioning using PageRank vectors. In 47th Annual IEEE Symposium on Foundations of Computer Science 475–486 (IEEE, 2006); https://doi.org/10.1109/FOCS.2006.44

  61. Yang, L. FDRnet 1.0.0 (version 1.0.0) (2020); https://doi.org/10.5281/zenodo.4121885

  62. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

Acknowledgements

This work is supported in part by NIH R01AI125982 (Y.S.), NIH R01DE024523195 (Y.S.) and NIH R01CA241123 (S.G.).

Author information

Contributions

L.Y., S.G. and Y.S. designed the study. L.Y., R.C. and Y.S. performed the data analysis. S.G. performed the biological discussions. L.Y., S.G. and Y.S. wrote the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Steve Goodison or Yijun Sun.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Computational Science thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Editor recognition statement Fernando Chirigati was the primary editor on this Article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

About this article

Cite this article

Yang, L., Chen, R., Goodison, S. et al. An efficient and effective method to identify significantly perturbed subnetworks in cancer. Nat Comput Sci 1, 79–88 (2021). https://doi.org/10.1038/s43588-020-00009-4
