This article has been updated


Methods that integrate molecular network information and tumor genome data could complement gene-based statistical tests to identify likely new cancer genes, but such approaches are challenging to validate at scale, and their predictive value remains unclear. We developed a robust statistic (NetSig) that integrates protein interaction networks with data from 4,742 tumor exomes. NetSig can accurately classify known driver genes in 60% of tested tumor types and predicts 62 new driver candidates. Using a quantitative experimental framework to determine in vivo tumorigenic potential in mice, we found that NetSig candidates induce tumors at rates that are comparable to those of known oncogenes and are ten-fold higher than those of random genes. By reanalyzing nine tumor-inducing NetSig candidates in 242 patients with oncogene-negative lung adenocarcinomas, we found that two (AKT2 and TFDP2) are significantly amplified. Our study presents a scalable integrated computational and experimental workflow to expand discovery from cancer genomes.

Change history

  • 19 December 2017

    In the version of this article initially published online, the color labels for oncogene-positive and oncogene-negative lung adenocarcinomas were swapped in the Figure 3a legend. The error has been corrected in the print, PDF and HTML versions of this article.



Acknowledgements

J.D.C. is supported by the LUNGevity Career Development Award (CDA). H.H. was supported by a Fund for Medical Discovery Award from the Executive Committee On Research at Massachusetts General Hospital. H.H. and K.L. are supported by the MGH IRG American Cancer Society. K.L. is supported by a grant from the Stanley Center at the Broad Institute, a Broadnext10 grant from the Broad Institute, 1R01MH109903, a Large Thematic Project Grant from the Lundbeck Foundation (R223-2016-721), and a Research Award from the Simons Foundation (SFARI).

Author information

Author notes

    • Heiko Horn & Michael S Lawrence

    These authors contributed equally to this work.

    • Jesse S Boehm, Gad Getz & Kasper Lage

    These authors jointly directed this work.


Affiliations

  1. Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA.

    • Heiko Horn, Jessica Xin Hu, Elizabeth Worstell, Alireza Kashani & Kasper Lage

  2. Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.

    • Heiko Horn, Michael S Lawrence, Candace R Chouinard, Yashaswi Shrestha, Jessica Xin Hu, Elizabeth Worstell, Emily Shea, Nina Ilic, Eejung Kim, Atanas Kamburov, Alireza Kashani, William C Hahn, Joshua D Campbell, Jesse S Boehm, Gad Getz & Kasper Lage

  3. Department of Pathology and MGH Cancer Center, Massachusetts General Hospital, Boston, Massachusetts, USA.

    • Michael S Lawrence, Atanas Kamburov & Gad Getz

  4. Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA.

    • Nina Ilic, Eejung Kim & William C Hahn

  5. Department of Medicine, Boston University School of Medicine, Boston, Massachusetts, USA.

    • Joshua D Campbell

  6. Institute for Biological Psychiatry, Mental Health Center Sct. Hans, University of Copenhagen, Roskilde, Denmark.

    • Kasper Lage



Contributions

H.H. developed, benchmarked, and implemented the NetSig algorithm with input from M.S.L. and supervision from G.G. and K.L. C.R.C., Y.S., E.S., N.I., and E.K. executed the in vivo tumorigenesis experiments with input from H.H. and K.L. and supervision from J.S.B. H.H. developed and implemented the quantitative analytical framework of in vivo tumorigenesis data with input from C.R.C., Y.S., and E.S. as well as supervision from J.S.B. and K.L. J.D.C. reanalyzed lung adenocarcinoma data with input from H.H., J.S.B., G.G., and K.L. All authors analyzed data and discussed the results. H.H., W.C.H., J.D.C., J.S.B., G.G., and K.L. wrote the manuscript with input from all authors. J.S.B., G.G., and K.L. designed and directed the work. K.L. initiated and led the study.

Competing interests

K.L. is the founder of Intomics A/S, sits on its scientific advisory board, and holds equity in the company. InWeb_InBioMap is a product of Intomics A/S that is freely available to academic users.

Corresponding authors

Correspondence to Jesse S Boehm, Gad Getz or Kasper Lage.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–15 and Supplementary Notes 1–10

  2. Life Sciences Reporting Summary

Excel files

  1. Supplementary Table 1: Genes in Tiers 1–5 used for benchmarking

  2. Supplementary Table 2: Gene-specific NetSig scores

  3. Supplementary Table 3: Literature review of NetSig5000 genes

  4. Supplementary Table 4: NetSig candidates tested experimentally

  5. Supplementary Table 5: 80 barcoded cDNA constructs corresponding to 79 activating alleles of 25 known oncogenes

  6. Supplementary Table 6: 80 random genes tested experimentally

  7. Supplementary Table 7: Details on sensitivity and specificity calculations for genes

  8. Supplementary Table 8: Datasets of patients with unknown driver mutations

  9. Supplementary Table 9: Mutation rates and patterns in patients with no known driver mutations

  10. Supplementary Table 10: Candidate genes for pan-cancer analysis from NetSig, HotNet2 and MUFFINN

Zip files

  1. Supplementary Software: Scripts and data to reproduce the tumorigenesis assay
