Comprehensive genomic characterization of prostate cancer has identified recurrent alterations in genes involved in androgen signaling, DNA repair, and PI3K signaling, among others. However, larger and uniform genomic analysis may identify additional recurrently mutated genes at lower frequencies. Here we aggregate and uniformly analyze exome sequencing data from 1,013 prostate cancers. We identify and validate a new class of E26 transformation-specific (ETS)-fusion-negative tumors defined by mutations in epigenetic regulators, as well as alterations in pathways not previously implicated in prostate cancer, such as the spliceosome pathway. We find that the incidence of significantly mutated genes (SMGs) follows a long-tail distribution, with many genes mutated in less than 3% of cases. We identify a total of 97 SMGs, including 70 not previously implicated in prostate cancer, such as the ubiquitin ligase CUL3 and the transcription factor SPEN. Finally, comparing primary and metastatic prostate cancer identifies a set of genomic markers that may inform risk stratification.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

18. A list of members and affiliations appears in the Supplementary Note.




We thank the patients for participating in this study. We also thank the Broad Cancer Genome Analysis and Data Sciences groups for analysis methodology and computational support. This work was supported by the SU2C-PCF Prostate Cancer International Dream Team, Prostate Cancer Foundation Young Investigator Awards (B.S.T., C.P., N.S., and E.M.V.A.), Prostate Cancer Foundation–V Foundation Challenge Award (E.M.V.A., P.S.N., and J.S.d.B.), NIH K08CA188615 (E.M.V.A.), NCI P50-CA097186 and NCI P50-CA92629 SPOREs in Prostate Cancer, the Marie-Josée and Henry R. Kravis Center for Molecular Oncology, a National Cancer Institute Cancer Center Core Grant (P30-CA008748), and the Robertson Foundation (B.S.T. and N.S.).

Author information

Author notes

  1. These authors contributed equally: Joshua Armenia, Stephanie A. M. Wankowicz, David Liu, Nikolaus Schultz and Eliezer M. Van Allen.


  1. Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA

    Joshua Armenia, Jianjiong Gao, Ritika Kundra, Ed Reznik, Walid K. Chatila, Debyani Chakravarty, Craig M. Bielski, Alexander V. Penson, Barry S. Taylor, Charles L. Sawyers & Nikolaus Schultz
  2. Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA

    Joshua Armenia, Jianjiong Gao, Ritika Kundra, Ed Reznik, Walid K. Chatila, Debyani Chakravarty, Craig M. Bielski, Alexander V. Penson, Barry S. Taylor & Nikolaus Schultz
  3. Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA

    Stephanie A. M. Wankowicz, David Liu, G. Celine Han, Franklin W. Huang, Levi A. Garraway, Mary-Ellen Taplin & Eliezer M. Van Allen
  4. Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA

    Stephanie A. M. Wankowicz, David Liu, G. Celine Han, Charlotte Tolonen, Franklin W. Huang, Levi A. Garraway & Eliezer M. Van Allen
  5. Divisions of Human Biology and Clinical Research, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

    Ilsa Coleman & Peter S. Nelson
  6. Department of Medicine, University of Washington, Seattle, WA, USA

    Bruce Montgomery & Peter S. Nelson
  7. Department of Laboratory Medicine, University of Washington, Seattle, WA, USA

    Colin Pritchard
  8. Department of Urology, University of Washington, Seattle, WA, USA

    Colm Morrissey
  9. Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA

    Christopher E. Barbieri, Andrea Sboner & Mark A. Rubin
  10. Department of Medicine, Division of Medical Oncology, Weill Cornell Medicine, New York, NY, USA

    Himisha Beltran
  11. Englander Institute for Precision Medicine, Weill Cornell Medical College–New York Presbyterian Hospital, New York, NY, USA

    Himisha Beltran & Mark A. Rubin
  12. Sandra and Edward Meyer Cancer Center at Weill Cornell Medical College, New York, NY, USA

    Himisha Beltran & Mark A. Rubin
  13. Biomarkers Team, Division of Clinical Studies, The Institute of Cancer Research and Royal Marsden Hospital, London, UK

    Zafeiris Zafeiriou, Susana Miranda & Johann S. de Bono
  14. Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA

    Dan Robinson, Yi Mi Wu, Robert Lonigro & Arul M. Chinnaiyan
  15. Centre for Integrated Biology, University of Trento, Trento, Italy

    Francesca Demichelis
  16. Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA

    Philip W. Kantoff, Wassim Abida & Howard I. Scher
  17. Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA

    Barry S. Taylor & Nikolaus Schultz




  1. PCF/SU2C International Prostate Cancer Dream Team


      J.A., S.A.M.W., N.S., E.M.V.A., D.L., J.G., R.K., E.R., W.K.C., D.C., G.C.H., C.E.B., A.S., C.M.B., A.V.P., C.T., F.D., M.A.R., and B.S.T. contributed to algorithm development and the analysis of genomic data. R.L., L.A.G., I.C., B.M., C.P., C.M., H.B., Z.Z., S.M., F.W.H., D.R., Y.M.W., P.W.K., M.-E.T., W.A., H.I.S., P.S.N., J.S.d.B., M.A.R., C.L.S., and A.M.C. developed the patient cohort, obtained tumor biopsies, performed molecular testing for metastatic cases, and carried out data interpretation of the overall cohort. J.A., S.A.M.W., D.L., N.S., and E.M.V.A. performed final aggregate cohort assembly, mutation review, interpretation, and manuscript preparation.

      Competing interests

      E.M.V.A. is a consultant for Tango Therapeutics and Genome Medical.

      Corresponding authors

      Correspondence to Nikolaus Schultz or Eliezer M. Van Allen.

      Integrated supplementary information

      1. Supplementary Figure 1

        Workflow of sample inclusion and quality control.

      2. Supplementary Figure 2 Correlates of genomic burden.

        Metastatic tumors have increased mutational and copy number burden as compared to primary tumors, adjusted for differences in purity and coverage. In primary tumors, increased age at diagnosis and higher Gleason score are associated with higher mutational and copy number burden, adjusting for purity and coverage. Reported P values for each predictor (metastatic versus primary disease; age; Gleason score) are from the multivariate regression adjusting for purity and coverage. The center values represent the median of each group, and the error bars below and above the median line mark the first and third quartiles, respectively.

      3. Supplementary Figure 3 Schematic workflow of the combination of statistical, quality and biological filters used to identify SMGs.

        Flowchart describing how we applied a combination of statistical, quality, and biological filters to define significantly mutated genes (SMGs).

      4. Supplementary Figure 4 Allele frequencies and mRNA expression of significantly mutated genes.

        (a) Allele frequencies of significantly mutated genes. Boxplot showing the distribution of allele frequencies across SMGs sorted by decreasing median allele frequency. (b) mRNA expression of the SMGs across the TCGA cohort. mRNA expression of the SMGs across the PCF-SU2C cohort. The red center line in the boxplots indicates the median value.

      5. Supplementary Figure 5 Supervised hierarchical clustering on arm-level deletions.

        Samples (rows) are ordered according to six copy number clusters derived from hierarchical clustering on arm-level deletions (2q, 5q, 6q, 8p, 13q, and 16q), with SPOP and CUL3 mutations indicated in green. Chromosomes are shown from left to right, samples from top to bottom. Regions of loss are indicated by shades of blue, and gains are indicated by shades of red.

      6. Supplementary Figure 6 Additional mutation characteristics of individual SMGs.

        (a) CUL3 mutations observed in our cohort. (b) Mutations observed in PIK3R2. Hotspot mutations observed in PIK3R2 are paralogous to the oncogenic D560 mutation in PIK3R1. (c) Mutation distribution of CDK12 variants, showing that the majority of mutations are truncating and that missense variants cluster in the kinase domain.

      7. Supplementary Figure 7 The overlap of the three bait sets used in this analysis (Agilent, Illumina, NimbleSeq).

        (a) Overlap of all bases. (b) Overlap of coding regions.

      8. Supplementary Figure 8 Segmented copy-number profiles of 303 primary tumors from TCGA.

        (a) Whole-exome sequencing (top) versus Affymetrix SNP6 data (bottom). 303 tumors are shown for each data set, sorted in identical order. Regions of gain are shown in shades of red, losses in shades of blue. The center summary plots show the fraction of samples with a log2(cn/2) value >0.1 (red) or <-0.1 (blue) at a given position. (b) Scatter plot comparing the segment means of matched segments >200 kb from the SNP6 and ReCapSeg data, giving a Pearson correlation of 0.92 (C.I. 0.916-0.918). The graph shows the correlation between the inferred copy number of 29,290 microarray segments >200 kb in size and the corresponding inferred copy number from the WES data. Segments >200 kb represent 99.94% of the genome covered by the SNP array.
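
The two quantities in the Supplementary Figure 8 caption, the per-position fraction of samples with log2(cn/2) beyond ±0.1 and the Pearson correlation between matched segment means, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the toy segment values are hypothetical, and only the ±0.1 thresholds are taken from the caption.

```python
import math

GAIN_THRESHOLD = 0.1   # log2(cn/2) above this counts as a gain (from the caption)
LOSS_THRESHOLD = -0.1  # log2(cn/2) below this counts as a loss (from the caption)


def log2_ratio(copy_number):
    """Convert an absolute copy number to log2(cn/2), relative to diploid."""
    return math.log2(copy_number / 2)


def gain_loss_fractions(log2_values):
    """Fraction of values called as gain and as loss at one genomic position."""
    n = len(log2_values)
    gains = sum(1 for v in log2_values if v > GAIN_THRESHOLD)
    losses = sum(1 for v in log2_values if v < LOSS_THRESHOLD)
    return gains / n, losses / n


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: log2 segment means for five matched segments,
# standing in for the SNP array and exome-derived values.
snp6_means = [-0.45, 0.02, 0.30, -0.12, 0.00]
wes_means = [-0.40, 0.05, 0.25, -0.15, 0.01]

gain_frac, loss_frac = gain_loss_fractions(wes_means)
r = pearson(snp6_means, wes_means)
print(f"gain fraction={gain_frac}, loss fraction={loss_frac}, r={r:.3f}")
```

In the toy data the five values are treated as five samples at one position for the gain and loss fractions, and as five matched segments for the correlation; a real comparison would first have to match segments between platforms and apply the >200 kb size filter described in the caption.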

      Supplementary information

      1. Supplementary Text and Figures

        Supplementary Figures 1–8, Supplementary Tables 5 and 7, and Supplementary Note

      2. Life Sciences Reporting Summary

      3. Supplementary Table 1

        Source of cohorts used for this analysis

      4. Supplementary Table 2

        Complete list of all somatic mutations in this cohort

      5. Supplementary Table 3

        Patient-level information

      6. Supplementary Table 4

        Mutational significance analysis

      7. Supplementary Table 6

        Genes in cancer pathways

      8. Supplementary Table 8

        Intersected BED file

      9. Supplementary Table 9

        Unfiltered MAF file

      10. Supplementary Table 10

        Matrix of copy number calls
