Fewer than half of all patients with advanced-stage high-grade serous ovarian cancers (HGSCs) survive more than five years after diagnosis, but those who have an exceptionally long survival could provide insights into tumor biology and therapeutic approaches. We analyzed 60 patients with advanced-stage HGSC who survived more than 10 years after diagnosis using whole-genome sequencing, transcriptome and methylome profiling of their primary tumor samples, and compared these data with those from 66 short- or moderate-term survivors. Tumors of long-term survivors were more likely to have multiple alterations in genes associated with DNA repair and more frequent somatic variants resulting in an increased predicted neoantigen load. Patients clustered into survival groups based on genomic and immune cell signatures, including three subsets of patients with BRCA1 alterations with distinctly different outcomes. Specific combinations of germline and somatic gene alterations, tumor cell phenotypes and differential immune responses appear to contribute to long-term survival in HGSC.
ICGC datasets: Previously published WGS and RNA-seq data generated as part of the ICGC Ovarian Cancer project (ref. 14) are available from the European Genome-phenome Archive (EGA) repository (https://ega-archive.org) as a single BAM file for each sample type (tumor/normal) under the accession code EGAD00001000877. Due to the sensitive nature of these patient data sets, access is subject to approval from the ICGC Data Access Compliance Office (https://docs.icgc.org/download/data-access/), an independent body that authorizes controlled access to ICGC sequencing data. ICGC SNP array and methylation data sets have been deposited into the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) under accession code GSE65821, without access restrictions. ICGC gene count level transcriptomic data have been deposited into the GEO under accession code GSE209964.
MOCOG datasets: WGS, RNA-seq and SNP array data from long-term survivors generated as part of the MOCOG study have been deposited in the EGA repository under accession code EGAS00001005984. WGS and RNA-seq data are available as raw FASTQ files for each sample type (tumor/normal) and SNP array data are available as raw signal intensity files in text format for each sample type (tumor/normal). Access to patient sequence data can be gained for academic use through application to the independent Data Access Committee (email@example.com). Responses to data requests will be provided within two weeks. Information on how to apply for access is available at the EGA under accession code EGAS00001005984. The MOCOG cohort raw methylation data sets have been submitted to the GEO under accession code GSE211687, with no access restrictions.
Uniformly processed somatic variant data from the ICGC and MOCOG cohorts have been deposited in Synapse under accession code syn34616347, and processed expression and methylation data from both cohorts have been submitted into the GEO under accession code GSE211687, without access restrictions.
Population frequencies of genetic variants can be accessed via the Genome Aggregation Database (gnomAD) at https://gnomad.broadinstitute.org/. Supporting evidence for pathogenicity of genomic alterations can be accessed via ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), BRCA Exchange (https://brcaexchange.org/) and the TP53 Database (https://tp53.isb-cgc.org/). The Ensembl ranked order of severity of variant consequences is available at: https://m.ensembl.org/info/genome/variation/prediction/predicted_data.html. Precomputed TCGA ovarian serous cystadenocarcinoma survival analysis data can be downloaded from OncoLnc (http://www.oncolnc.org/). Mutational signature reference databases can be accessed via COSMIC (https://cancer.sanger.ac.uk/signatures/) and Signal (https://signal.mutationalsignatures.com/). The LM22 signature matrix used for immune cell deconvolution can be downloaded at https://cibersortx.stanford.edu/. The COSMIC Cancer Gene Census can be accessed at https://cancer.sanger.ac.uk/census. MSigDB hallmark gene sets can be accessed at https://www.gsea-msigdb.org/gsea/msigdb/. Illumina methylation probes that were filtered out due to poor performance (for example, cross-reactive or nonspecific probes) can be found at https://github.com/sirselim/illumina450k_filtering. Germline polymorphic sites for reference and variant allele read counts used in FACETS analysis can be found at ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/common_all_20180423.vcf.gz. The gene transfer format used for annotation and RNA-seq counts is available at ftp://ftp.ensembl.org/pub/grch37/release-92/. All other data are available within the article and its supplementary information files.
No custom code or software was used in the data analyses. All results can be replicated using publicly available tools and software. The tools and versions used are fully described in the Methods and Supplementary Information.
Millstein, J. et al. Prognostic gene expression signature for high-grade serous ovarian cancer. Ann. Oncol. 31, 1240–1250 (2020).
Hoppenot, C., Eckert, M. A., Tienda, S. M. & Lengyel, E. Who are the long-term survivors of high grade serous ovarian cancer? Gynecol. Oncol. 148, 204–212 (2018).
Fagö-Olsen, C. L. et al. Does neoadjuvant chemotherapy impair long-term survival for ovarian cancer patients? A nationwide Danish study. Gynecol. Oncol. 132, 292–298 (2014).
Chi, D. S. et al. What is the optimal goal of primary cytoreductive surgery for bulky stage IIIC epithelial ovarian carcinoma (EOC)? Gynecol. Oncol. 103, 559–564 (2006).
Horowitz, N. S. et al. Does aggressive surgery improve outcomes? Interaction between preoperative disease burden and complex surgery in patients with advanced-stage ovarian cancer: an analysis of GOG 182. J. Clin. Oncol. 33, 937–943 (2015).
Alsop, K. et al. BRCA mutation frequency and patterns of treatment response in BRCA mutation-positive women with ovarian cancer: A report from the Australian ovarian cancer study group. J. Clin. Oncol. 30, 2654–2663 (2012).
The Cancer Genome Atlas Research Network. Integrated genomic analysis of ovarian cancer. Nature 474, 609–615 (2011).
Walsh, T. et al. Mutations in 12 genes for inherited ovarian, fallopian tube, and peritoneal carcinoma identified by massively parallel sequencing. Proc. Natl Acad. Sci. USA 108, 18032–18037 (2011).
Ciriello, G. et al. Emerging landscape of oncogenic signatures across human cancers. Nat. Genet. 45, 1127–1133 (2013).
Ahmed, A. A. et al. Driver mutations in TP53 are ubiquitous in high grade serous carcinoma of the ovary. J. Pathol. 221, 49–56 (2010).
Tothill, R. W. et al. Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome. Clin. Cancer Res. 14, 5198–5208 (2008).
Etemadmoghadam, D. et al. Integrated genome-wide DNA copy number and expression analysis identifies distinct mechanisms of primary chemoresistance in ovarian carcinomas. Clin. Cancer Res. 15, 1417–1427 (2009).
Hwang, W. T., Adams, S. F., Tahirovic, E., Hagemann, I. S. & Coukos, G. Prognostic significance of tumor-infiltrating T cells in ovarian cancer: A meta-analysis. Gynecol. Oncol. 124, 192–198 (2012).
Patch, A. M. et al. Whole-genome characterization of chemoresistant ovarian cancer. Nature 521, 489–494 (2015).
Wang, Y. K. et al. Genomic consequences of aberrant DNA repair mechanisms stratify ovarian cancer histotypes. Nat. Genet. 49, 856–864 (2017).
Macintyre, G. et al. Copy number signatures and mutational processes in ovarian carcinoma. Nat. Genet. 50, 1262–1270 (2018).
Pennington, K. P. et al. Germline and somatic mutations in homologous recombination genes predict platinum response and survival in ovarian, fallopian tube, and peritoneal carcinomas. Clin. Cancer Res. 20, 764–775 (2014).
Farmer, H. et al. Targeting the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature 434, 917–921 (2005).
Fong, P. C. et al. Poly(ADP)-ribose polymerase inhibition: frequent durable responses in BRCA carrier ovarian cancer correlating with platinum-free interval. J. Clin. Oncol. 28, 2512–2519 (2010).
Swisher, E. M. et al. Rucaparib in relapsed, platinum-sensitive high-grade ovarian carcinoma (ARIEL2 Part 1): an international, multicentre, open-label, phase 2 trial. Lancet Oncol. 18, 75–87 (2017).
Bolton, K. L. et al. Association between BRCA1 and BRCA2 mutations and survival in women with invasive epithelial ovarian cancer. JAMA 307, 382–390 (2012).
Candido-dos-Reis, F. J. et al. Germline mutation in BRCA1 or BRCA2 and ten-year survival for women diagnosed with epithelial ovarian cancer. Clin. Cancer Res. 21, 652–657 (2015).
Garsed, D. W. et al. Homologous recombination DNA repair pathway disruption and retinoblastoma protein loss are associated with exceptional survival in high-grade serous ovarian cancer. Clin. Cancer Res. 24, 569–580 (2018).
Ciriello, G., Cerami, E., Sander, C. & Schultz, N. Mutual exclusivity analysis identifies oncogenic network modules. Genome Res. 22, 398–406 (2012).
Etemadmoghadam, D. et al. Synthetic lethality between CCNE1 amplification and loss of BRCA1. Proc. Natl Acad. Sci. USA 110, 19489–19494 (2013).
Newman, A. M. et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 37, 773–782 (2019).
Miller, R. E. et al. ESMO recommendations on predictive biomarker testing for homologous recombination deficiency and PARP inhibitor benefit in ovarian cancer. Ann. Oncol. 31, 1606–1622 (2020).
Nguyen, L., Martens, J. W. M., Van Hoeck, A. & Cuppen, E. Pan-cancer landscape of homologous recombination deficiency. Nat. Commun. 11, 1–12 (2020).
Joshi, P. M., Sutor, S. L., Huntoon, C. J. & Karnitz, L. M. Ovarian cancer-associated mutations disable catalytic activity of CDK12, a kinase that promotes homologous recombination repair and resistance to cisplatin and poly(ADP-ribose) polymerase inhibitors. J. Biol. Chem. 289, 9247–9253 (2014).
Anaya, J. OncoLnc: Linking TCGA survival data to mRNAs, miRNAs, and lncRNAs. PeerJ Comput. Sci. 2, e67 (2016).
Norquist, B. et al. Secondary somatic mutations restoring BRCA1/2 predict chemotherapy resistance in hereditary ovarian carcinomas. J. Clin. Oncol. 29, 3008–3015 (2011).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Degasperi, A. et al. A practical framework and online tool for mutational signature analyses show inter-tissue variation and driver dependencies. Nat. Cancer 1, 249–263 (2020).
Popova, T. et al. Ovarian cancers harboring inactivating mutations in CDK12 display a distinct genomic instability pattern characterized by large tandem duplications. Cancer Res. 76, 1882–1891 (2016).
Wu, Y. M. et al. Inactivation of CDK12 delineates a distinct immunogenic class of advanced prostate cancer. Cell 173, 1770–1782.e1714 (2018).
Funnell, T. et al. Integrated structural variation and point mutation signatures in cancer genomes using correlated topic models. PLoS Comput. Biol. 15, 1–24 (2019).
Zhang, L. et al. Intratumoral T cells, recurrence, and survival in epithelial ovarian cancer. N. Engl. J. Med. 348, 203–213 (2003).
Ovarian Tumor Tissue Analysis (OTTA) Consortium. Dose-response association of CD8+ tumor-infiltrating lymphocytes and survival time in high-grade serous ovarian cancer. JAMA Oncol. 3, e173290 (2017).
Jiménez-Sánchez, A. et al. Heterogeneous tumor-immune microenvironments among differentially growing metastases in an ovarian cancer patient. Cell 170, 927–938.e920 (2017).
Yang, S. Y. C. et al. Landscape of genomic alterations in high-grade serous ovarian cancer from exceptional long- and short-term survivors. Genome Med 10, 81 (2018).
Korotkevich, G. et al. Fast gene set enrichment analysis. Preprint at bioRxiv https://doi.org/10.1101/060012 (2021).
Liberzon, A. et al. The molecular signatures database hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Saner, F. A. M. et al. Going to extremes: determinants of extraordinary response and survival in patients with cancer. Nat. Rev. Cancer 19, 339–348 (2019).
Wheeler, D. A. et al. Molecular features of cancers exhibiting exceptional responses to treatment. Cancer Cell 39, 38–53.e37 (2021).
Moore, K. et al. Maintenance olaparib in patients with newly diagnosed advanced ovarian cancer. N. Engl. J. Med. 379, 2495–2505 (2018).
Ewing, A. et al. Structural Variants at the BRCA1/2 loci are a common source of homologous repair deficiency in high-grade serous ovarian carcinoma. Clin. Cancer Res. 27, 3201–3214 (2021).
Swisher, E. M. et al. Characterization of patients with long-term responses to rucaparib treatment in recurrent ovarian cancer. Gynecol. Oncol. 163, 490–497 (2021).
Velez-Cruz, R. et al. RB localizes to DNA double-strand breaks and promotes DNA end resection and homologous recombination through the recruitment of BRG1. Genes Dev. 30, 2500–2512 (2016).
Fan, W. et al. MET-independent lung cancer cells evading EGFR kinase inhibitors are therapeutically susceptible to BH3 mimetic agents. Cancer Res. 71, 4494–4505 (2011).
Cole, A. NFATC4 promotes quiescence and chemotherapy resistance in ovarian cancer. JCI Insight 5, e131486 (2020).
Sieh, W. et al. Hormone-receptor expression and ovarian cancer survival: an Ovarian Tumor Tissue Analysis consortium study. Lancet Oncol. 14, 853–862 (2013).
Gersekowski, K. et al. Germline BRCA variants, lifestyle and ovarian cancer survival. Gynecol. Oncol. 165, 437–445 (2022).
Jung, Y. S. et al. Impact of smoking on human natural killer cell activity: A large cohort study. J. Cancer Prev. 25, 13–20 (2020).
Cress, R. D., Chen, Y. S., Morris, C. R., Petersen, M. & Leiserowitz, G. S. Characteristics of long-term survivors of epithelial ovarian cancer. Obstet. Gynecol. 126, 491–497 (2015).
Schröder, J., Corbin, V. & Papenfuss, A. T. HYSYS: Have you swapped your samples? Bioinformatics 33, 596–598 (2017).
Song, S. et al. qpure: A tool to estimate tumor cellularity from genome-wide single-nucleotide polymorphism profiles. PLoS One 7, 5–11 (2012).
Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).
Wingett, S. W. & Andrews, S. FastQ Screen: A tool for multi-genome mapping and quality control. F1000Res. 7, 1338 (2018).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Shen, R. & Seshan, V. E. FACETS: Allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res. 44, 1–9 (2016).
Lai, Z. et al. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research. Nucleic Acids Res. 44, e108 (2016).
Kim, S. et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat. Methods 15, 591–594 (2018).
Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, 1–4 (2021).
Tan, A., Abecasis, G. R. & Kang, H. M. Unified representation of genetic variants. Bioinformatics 31, 2202–2204 (2015).
Shyr, C. et al. FLAGS, frequently mutated genes in public exomes. BMC Med. Genomics 7, 64 (2014).
Chen, X. et al. Manta: Rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 32, 1220–1222 (2016).
Cameron, D. L. et al. GRIDSS: Sensitive and specific genomic rearrangement detection using positional de Bruijn graph assembly. Genome Res. 27, 2050–2060 (2017).
Wala, J. A. et al. SvABA: Genome-wide detection of structural variants and indels by local assembly. Genome Res. 28, 581–591 (2018).
Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: An R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).
Bielski, C. M. et al. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat. Genet. 50, 1189–1195 (2018).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Sztupinszki, Z. et al. Migrating the SNP array-based homologous recombination deficiency measures to next generation sequencing data of breast cancer. NPJ Breast Cancer 4, 16 (2018).
Nariai, N. et al. HLA-VBSeq: Accurate HLA typing at full resolution from whole-genome sequencing data. BMC Genomics 16, 1–6 (2015).
Robinson, J. et al. IPD-IMGT/HLA Database. Nucleic Acids Res. 48, D948–D955 (2020).
Hundal, J. et al. PVACtools: A computational toolkit to identify and visualize cancer neoantigens. Cancer Immunol. Res. 8, 409–420 (2020).
Dobin, A. et al. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Wang, L., Wang, S. & Li, W. RSeQC: Quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).
Anders, S., Pyl, P. T. & Huber, W. HTSeq-A Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2009).
Ritchie, M. E. et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Aryee, M. J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).
Fortin, J.-P. et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol. 15, 503 (2014).
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
We thank P. Webb, K. Byth, R. Lupat, J. Ellul and the Peter MacCallum Cancer Centre Research Computing Facility for their contributions to the study. This work was supported by the U.S. Army Medical Research and Materiel Command Ovarian Cancer Research Program (Award No. W81XWH-16-2-0010 and W81XWH-21-1-0401), the National Health and Medical Research Council of Australia (1092856, 1117044 and 2008781 to D.D.L.B., and 1186505 to D.W.G.), and the U.S. National Cancer Institute (P30CA046592 for C.L.P. and P30CA008748 for M.C.P.). This research was made possible by generous support from the Border Ovarian Cancer Awareness Group, the Garvan Research Foundation, the Graf Family Foundation, Mrs Margaret Rose AM, Arthur Coombs and family, and the Piers K Fowler Fund. The Australian Ovarian Cancer Study (AOCS) gratefully acknowledges the cooperation of participating institutions in Australia and the contribution of study nurses, research assistants and all clinical and scientific collaborators. The complete AOCS Group can be found at www.aocstudy.org. We would like to thank all of the women who participated in the study. AOCS was supported by the U.S. Army Medical Research and Materiel Command (DAMD17-01-1-0729), The Cancer Council Victoria, Queensland Cancer Fund, The Cancer Council New South Wales, The Cancer Council South Australia, The Cancer Council Tasmania, The Cancer Foundation of Western Australia and the National Health and Medical Research Council of Australia (NHMRC; ID199600, ID400413, ID400281). AOCS gratefully acknowledges additional support from Ovarian Cancer Australia and the Peter MacCallum Cancer Foundation. We thank all the women who participated in the GynBiobank and gratefully acknowledge the Departments of Gynaecological Oncology, Medical Oncology and Anatomical Pathology at Westmead Hospital, Sydney. 
The Gynaecological Oncology Biobank at Westmead was funded by the NHMRC (ID310670, ID628903), the Cancer Institute NSW (12/RIG/1-17, 15/RIG/1-16) and the Department of Gynaecological Oncology, Westmead Hospital, and acknowledges financial support from the Sydney West Translational Cancer Research Centre, funded by the Cancer Institute NSW (15/TRC/1-01). E.L.C. was supported by NHMRC grant APP1161198. F.A.M.S. was supported by a Swiss National Foundation EarlyPostdoc Fellowship (P2BEP3-172246), Swiss Cancer Research Foundation grant BIL KFS-3942-08-2016 and a Professor Dr Max Cloëtta and Uniscientia Foundation grant. A.M.P. and J.D.B. were supported by Cancer Research UK (A22905). B.H.N. was supported by the BC Cancer Foundation, Canada’s Networks of Centres of Excellence (BioCanRx), Genome BC and the Canada Foundation for Innovation. D.D.L.B. was supported by the U.S. National Cancer Institute U54 program (U54CA209978).
S.F., K.A., N.T. and A.D. received grant funding from AstraZeneca for unrelated work. G.A.-Y. received grant funding from AstraZeneca and Roche-Genentech for unrelated work. M.F. declares honoraria for advisory boards AstraZeneca, GSK, Incyclix, Lilly, MSD, Novartis and Takeda; consultancy for AstraZeneca, Eisai and Novartis; speaker’s fee and travel from AstraZeneca; speaker’s fee from ACT Genomics; and institutional research funding from AstraZeneca, BeiGene, Novartis; all for unrelated work. J.D.B. received funding from Aprea and Clovis Oncology for unrelated work. D.D.L.B. received funding from AstraZeneca, Genentech-Roche and BeiGene for unrelated work. The remaining authors declare no competing interests.
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
a, Overview of patients (n = 126) and tumor samples analyzed in this study. In addition to paired germline and primary tumor samples in all patients, 5 relapse tumor samples were also analyzed from 4 long-term survivor patients. OS, overall survival. b, Clinical characteristics of patients by survival group. All patients received primary platinum therapy. P values comparing survival groups are reported (ᵃKruskal–Wallis, ᵇchi-square or ᶜlog-rank Mantel–Cox test).
a, Overview of somatic alterations in driver genes detected by GRIN, dNdScv, GISTIC, and/or in cancer-associated genes (COSMIC Cancer Gene Census) that are enriched in a survival group relative to another survival group. From left: two-sided Fisher’s test of the difference in proportions of altered samples between survival groups, triangles and color indicate direction of the log odds ratio (LOR; blue = down, pink = up), asterisks indicate P value < 0.05 (see Supplementary Table 6 for P values), P values were not adjusted for multiple comparisons; role of gene in COSMIC Cancer Gene Census (TSG, tumor suppressor gene); genomic alterations split by survival groups, bars at the top indicate the number of alterations in each listed gene per patient; bar plot of the number of samples with an alteration (alteration type indicated by color); bar plots showing the proportion of alteration types per gene; P values were calculated using the genomic random interval (GRIN) statistical model (one-sided) for recurrent structural variants (SV) (see Supplementary Data 2 for GRIN P values), the dNdScv likelihood-ratio test (two-sided) for recurrent base substitutions and small-scale deletions and insertions (see Supplementary Data 1 for dNdScv P values), and GISTIC2 permutation-of-markers test (one-sided) for recurrent copy-number variants (CNV) with red indicating amplification and blue indicating deletion (see Supplementary Data 3 for GISTIC2 P values), P values were adjusted for multiple comparisons using the Benjamini-Hochberg procedure (dNdScv, GISTIC2) or the robust false discovery rate procedure (GRIN) and are shown as negative log10 P values and capped at 0.001 for display purposes. Each patient (column) is annotated with survival group (LTS, long-term survivor; MTS, moderate-term survivor; STS, short-term survivor). 
Below the alterations are bar plots indicating somatic mutation burden in variants per megabase (Mb); SV count including duplications, deletions, inversions and intrachromosomal rearrangements; and the proportion of the tumor genome that is duplicated (WGD) or lost (WGL). b, Pairwise comparison of the alteration frequencies between survival groups for genes in the COSMIC Cancer Gene Census. The difference in relative alteration frequency is shown on the x-axis and the P value (Fisher’s test, two-sided) is shown on the y-axis. Symbols of genes with P values < 0.05 are displayed. Multiple hypothesis correction was not applied in this analysis as adjusted P values were all greater than 0.1. Alterations in this analysis included non-synonymous mutations, homozygous deletions, amplifications and structural variants in coding genes that are expressed.
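The two-sided Fisher's test and log odds ratio used for the survival-group comparisons can be illustrated with a minimal sketch. It assumes a 2 × 2 table of altered versus unaltered samples in two groups and uses the common convention of summing the probabilities of all tables no more likely than the observed one. The function names, the example counts and the 0.5 pseudocount in the log odds ratio are illustrative assumptions, not values or methods taken from the study.

```python
from math import comb, log


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups being compared; columns are altered / unaltered.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x altered samples in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is as likely as, or less likely than, the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


def log_odds_ratio(a, b, c, d, pseudocount=0.5):
    """Natural-log odds ratio; the pseudocount (an assumption) avoids division by zero."""
    return log(
        ((a + pseudocount) * (d + pseudocount))
        / ((b + pseudocount) * (c + pseudocount))
    )


# Hypothetical counts: a gene altered in 12 of 60 samples in one group
# and in 3 of 34 samples in another.
p = fisher_exact_two_sided(12, 48, 3, 31)
lor = log_odds_ratio(12, 48, 3, 31)
print(f"P = {p:.3f}, LOR = {lor:.2f}")
```

A positive log odds ratio here means the alteration is more frequent in the first group, which corresponds to the "up" direction of the triangles described in the caption.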
Extended Data Fig. 3 Key features of mutational signature clusters and associated survival outcomes.
a, Summary of the key clinical and genomic features of each mutational signature cluster. Clusters are ordered top to bottom by lowest to highest proportion of long-term survivors (LTS) in each cluster. HR, homologous recombination; LOH, loss-of-heterozygosity; SV, structural variant; MTS, moderate-term survivor; STS, short-term survivor; DUP, duplications; DEL, deletions; INV, inversions. b, Kaplan–Meier analysis of progression-free and c, overall survival in patients stratified by signature clusters. P values calculated by Mantel–Cox log-rank test and dotted lines indicate median survival. d, Boxplots summarize the proportion (y-axis) of clustered and nonclustered rearrangements by size (x-axis) and type, for each mutational signature cluster (SIG.1 n = 14, SIG.2 n = 25, SIG.3 n = 13, SIG.4 n = 27, SIG.5 n = 22, SIG.6 n = 9, SIG.7 n = 16); boxes show the interquartile range (25–75th percentiles), central lines indicate the median, whiskers show the smallest/largest values within 1.5 times the interquartile range and values outside it are shown as individual data points. Del, deletions; tds, tandem duplications; inv, inversions, tra, interchromosomal translocations; Kb, kilobase; Mb, megabase.
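For readers unfamiliar with the Kaplan–Meier curves and median survival lines referred to in panels b and c, the estimator can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy follow-up data are hypothetical, and the sketch does not reproduce the software used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up durations (for example, years since diagnosis)
    events: 1 if the event (death or progression) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_leaving
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
print(curve, median_survival(curve))
```

The median returned by `median_survival` corresponds to where a horizontal line at 0.5 meets the step curve, which is what the dotted lines in such plots typically mark.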
a, Proportion of patients affected by gene alterations per mutational signature cluster. Genes are ordered by significance using Fisher’s exact test (two-sided) and clusters are ordered by the proportion of long-term survivors. b, Proportion of patients with categorical features per cluster. Features are ordered by significance using Fisher’s exact test (two-sided) and the clusters are arranged by the proportion of long-term survivors. The Fisher’s test P values displayed in (a) and (b) are Benjamini-Hochberg adjusted P values. Features include homologous recombination (HR) status, homologous recombination deficiency (HRD) type, number of DNA repair pathway alterations, survival group (LTS, long-term survivor; MTS, moderate-term survivor; STS, short-term survivor), status at last follow-up (D, dead; P, progressed and alive; PF, progression-free and alive), self-reported smoking status, DeepCC molecular subtype (C1, mesenchymal; C2, immunoreactive; C4, differentiated; C5, proliferative), and neoadjuvant treatment (Y, yes; N, no).
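The Benjamini-Hochberg adjustment applied to the Fisher's test P values can be sketched in a few lines. This is a generic illustration of the standard step-up procedure; the function name and the example P values are assumptions made for demonstration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical unadjusted P values for four features.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.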
a, Boxplots summarize numerical, clinical and genomic features by mutational signature cluster; points represent each sample, boxes show the interquartile range (25–75th percentiles), central lines indicate the median, whiskers show the smallest/largest values within 1.5 times the interquartile range, red triangles indicate the mean, and dotted lines join the means of each cluster to visualize the trend. The Kruskal–Wallis test P values displayed are Benjamini-Hochberg adjusted P values. Features are ordered by their significance and clusters are ordered by the proportion of long-term survivors. CD8 scores were available for n = 54 primary tumors as previously measured by immunohistochemistry (ref. 23) and scored as density of CD8+ T cells (average cells/mm², y-axis) in the tumor epithelium (TE). HRD, homologous recombination deficiency; DEL, deletions; DUP, duplications; SV, structural variants; Mb, megabase; ITX, intrachromosomal rearrangements; LOH, loss-of-heterozygosity; INV, inversions. b, Bubble plot summary of mutational signature enrichment across signature clusters. The dendrogram is reused from the signature clustering (Fig. 3) to order the mutational signature types (columns). Mutational signature clusters (rows) are sorted by the proportion of long-term survivors in each cluster, indicated in brackets. The color and size of bubbles indicate the z-score scaled values of the mean signature exposure per cluster. Bubbles with a z-score of greater than or equal to 1 have a black border and bubbles with a z-score of greater than 0.5 but less than 1 have a gray border. Bordered bubbles have asterisks filled in to indicate Kruskal–Wallis test P values adjusted for multiple testing using Benjamini-Hochberg correction.
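The z-score scaling and border rules of the bubble plot in panel b can be made concrete with a short sketch. It assumes that the mean exposure of one signature is scaled across clusters using the sample standard deviation; that choice, the function names and the example values are illustrative assumptions rather than details stated in the caption.

```python
from statistics import mean, stdev


def zscores(values):
    """Scale values to zero mean and unit (sample) standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def bubble_border(z):
    """Border rule from the caption: black if z >= 1, gray if 0.5 < z < 1."""
    if z >= 1:
        return "black"
    if z > 0.5:
        return "gray"
    return "none"


# Hypothetical mean exposures of one signature across seven clusters.
mean_exposure = [0.05, 0.10, 0.08, 0.30, 0.12, 0.07, 0.09]
for cluster, z in enumerate(zscores(mean_exposure), start=1):
    print(f"SIG.{cluster}: z = {z:+.2f}, border = {bubble_border(z)}")
```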
a, Heatmap of methylation data following consensus clustering of primary tumors (columns) based on the standardized CpG probe intensities (M-values) of the 1% most variable CpG probes (rows; number of probes = 3,645) across all primary tumor samples (n = 126). The heatmap scale shows the beta values. Five methylation clusters were identified (MET.1–MET.5), and each patient (column) is annotated with survival group (LTS, long-term survivor; MTS, moderate-term survivor; STS, short-term survivor), age at diagnosis (quartiles), and self-reported smoking history. Tumor samples are also classified according to CCNE1 amplification (amp) status, BRCA1 alteration status, CIBERSORTx absolute (abs) immune scores (quartiles), and molecular subtype (ref. 11) (C1, mesenchymal; C2, immunoreactive; C4, differentiated; C5, proliferative). Bars in the bottom panel represent the BRCA1 (orange) and BRCA2 (blue) type homologous recombination deficiency (CHORD; ref. 28) scores of each tumor sample. b, Kaplan–Meier analysis of progression-free (PFS) and overall survival (OS) in patients stratified by methylation clusters. P values calculated by Mantel–Cox log-rank test and dotted lines indicate median survival in years since diagnosis.
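The relationship between the beta values shown on the heatmap scale and the M-values used for clustering, together with the selection of the most variable probes, can be sketched as follows. The conversion M = log2(beta / (1 − beta)) is the standard definition; the clipping constant, the use of population variance for ranking, the function names and the probe identifiers are assumptions made for illustration only.

```python
from math import log2
from statistics import pvariance


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0-1) to an M-value (logit, base 2)."""
    # Clip to avoid infinite M-values at exactly 0 or 1 (eps is an assumption).
    b = min(max(beta, eps), 1 - eps)
    return log2(b / (1 - b))


def top_variable_probes(m_values, fraction=0.01):
    """Return the IDs of the most variable probes.

    m_values: dict mapping probe ID -> list of M-values across samples
    fraction: share of probes to keep (0.01 for the top 1%)
    """
    n_keep = max(1, round(len(m_values) * fraction))
    ranked = sorted(m_values, key=lambda p: pvariance(m_values[p]), reverse=True)
    return ranked[:n_keep]


# Hypothetical beta values for three probes across three samples.
betas = {
    "probe_A": [0.50, 0.52, 0.49],
    "probe_B": [0.10, 0.50, 0.90],
    "probe_C": [0.80, 0.82, 0.85],
}
m = {probe: [beta_to_m(b) for b in vals] for probe, vals in betas.items()}
print(top_variable_probes(m, fraction=0.34))
```

With a fraction of 0.01, keeping 3,645 probes implies roughly 364,500 probes entering the selection after filtering.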
a, Clustered heatmap summarizing gene set enrichment analysis (GSEA) using the hallmark Molecular Signatures Database (MSigDB) gene sets. Direction and color of triangles relate to the normalized enrichment score (NES) as generated by FGSEA. P values (two-sided) were calculated using the FGSEA default Monte Carlo method; the size of the triangles corresponds to the negative log10 Benjamini-Hochberg adjusted P value (Padj). The columns are split by survival groups (STS, short-term survivor; MTS, moderate-term survivor; LTS, long-term survivor), with the direction of enrichment denoted by the group in the heading (numerator) versus the two other groups labeled below. b, Boxplots summarize expression of MKI67 and PCNA proliferation gene markers across the survival groups (left; STS n = 34, MTS n = 32, LTS n = 60); points represent each sample, boxes show the interquartile range (25–75th percentiles), central lines indicate the median, and whiskers show the smallest/largest values within 1.5 times the interquartile range. Differential expression analysis was performed using DESeq2 to determine fold change (right) of gene expression between survival groups (two-tailed Wald test, both unadjusted P values and Benjamini-Hochberg adjusted P values (Padj) are shown). c, Forest plot (left) indicates the hazard ratio (HR, squares) and 95% confidence interval (CI; whiskers) for overall survival calculated using a univariate Cox proportional hazard regression model based on the LM22 immune cell types detected by CIBERSORTx analysis (n = 126 patients). Cell types are arranged by HR. P values < 0.05 are colored red (*P < 0.05, **P < 0.01) and were not adjusted for multiple comparisons. 
Absolute enrichment scores per cell type across the cohort are shown in boxplots (right); boxes show the interquartile range (25–75th percentiles), central lines indicate the median, whiskers show the smallest/largest values within 1.5 times the interquartile range and values outside it are shown as individual data points.
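The boxplot convention used in these legends (box at the 25th and 75th percentiles, whiskers at the most extreme values within 1.5 times the interquartile range, remaining values drawn as individual points) can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, and the quartile interpolation method is an assumption, since plotting packages differ slightly in how they compute percentiles.

```python
from statistics import quantiles


def box_stats(values):
    """Summarize values the way a Tukey-style boxplot draws them."""
    # "inclusive" treats the data as the full population of interest;
    # other interpolation rules give slightly different quartiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at observed values, not at the fences themselves.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }
```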
a, A condensed bubble plot of the various LM22 cell types used for the immune clustering (IMM.1 n = 32, IMM.2 n = 23, IMM.3 n = 22, IMM.4 n = 24, IMM.5 n = 25). The dendrogram is reused from the immune clustering (Fig. 5a) to order the cell types. Immune clusters (rows) are sorted by the proportion of long-term survivors indicated in brackets. The color and size of bubbles indicate z-score scaled values of the mean abundance of cell types per cluster. Bubbles with a z-score of greater than or equal to 1 have a black border, and those with a z-score of greater than 0.5 but less than 1 have a gray border. Asterisks indicate Kruskal–Wallis test P values adjusted for multiple testing using Benjamini-Hochberg correction. Boxplots (right) summarize CIBERSORTx absolute scores of each cluster; points represent each sample, boxes show the interquartile range (25–75th percentiles), central lines indicate the median, and whiskers show the smallest/largest values within 1.5 times the interquartile range. b, Boxplots summarize numerical, clinical and genomic features by immune cluster (IMM.1 n = 32, IMM.2 n = 23, IMM.3 n = 22, IMM.4 n = 24, IMM.5 n = 25); points represent each sample, boxes show the interquartile range (25–75th percentiles), central lines indicate the median, whiskers show the smallest/largest values within 1.5 times the interquartile range, red triangles indicate the mean, and dotted lines join the means of each cluster to visualize the trend. The Kruskal–Wallis test P values displayed are Benjamini-Hochberg adjusted. Features are ordered by their significance and clusters are ordered by the proportion of long-term survivors. CD8 scores were available for n = 54 primary tumors as previously measured by immunohistochemistry23 and scored as density of CD8+ T cells (average cells/mm2, y axis) in the tumor epithelium (TE). 
HRD, homologous recombination deficiency; DEL, deletions; DUP, duplications; SV, structural variants; Mb, megabase; ITX, intrachromosomal rearrangements; LOH, loss-of-heterozygosity; INV, inversions.
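The bubble border rule in a (black for z-scores of at least 1, gray for z-scores above 0.5 and below 1) can be sketched as below. The sketch assumes that the z-score for a cell type is computed across the per-cluster mean abundances using the sample standard deviation; the legend does not state this, and the function name and return format are hypothetical.

```python
from statistics import mean, stdev


def bubble_borders(mean_abundance_by_cluster):
    """Return {cluster: (z_score, border)} for one cell type.

    mean_abundance_by_cluster maps a cluster label (for example "IMM.1")
    to the mean abundance of the cell type in that cluster.
    """
    values = list(mean_abundance_by_cluster.values())
    mu, sd = mean(values), stdev(values)
    result = {}
    for cluster, value in mean_abundance_by_cluster.items():
        z = (value - mu) / sd if sd > 0 else 0.0
        if z >= 1:
            border = "black"
        elif z > 0.5:
            border = "gray"
        else:
            border = None  # no border drawn
        result[cluster] = (z, border)
    return result
```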
a, Proportion of patients with categorical features per cluster. Features are ordered by significance using Fisher’s exact test (two-sided) and the clusters are arranged by the proportion of long-term survivors. Features include homologous recombination (HR) status, homologous recombination deficiency (HRD) type, number of DNA repair pathway alterations, survival group (LTS, long-term survivor; MTS, moderate-term survivor; STS, short-term survivor), status at last follow-up (D, dead; P, progressed and alive; PF, progression-free and alive), self-reported smoking status, DeepCC molecular subtype (C1, mesenchymal; C2, immunoreactive; C4, differentiated; C5, proliferative), and neoadjuvant treatment (Y, yes; N, no). b, Proportion of patients affected by gene alterations per immune cluster. Genes are ordered by significance using Fisher’s exact test (two-sided) and clusters are ordered by the proportion of long-term survivors. The Fisher’s test P values displayed in (a) and (b) are Benjamini-Hochberg adjusted P values.
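To make the testing procedure concrete, the sketch below implements a two-sided Fisher's exact test for a 2 × 2 table followed by Benjamini-Hochberg adjustment, using only the Python standard library. It is a simplification under stated assumptions: the comparisons in the figure span five clusters, whereas this example handles a single 2 × 2 contrast (for instance, one cluster versus the rest), and the function names are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are as likely as, or less likely than, the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = m - offset  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each feature or gene would contribute one unadjusted P value, and the adjustment would then be applied across the whole set of features in a panel.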
Supplementary Note, Figures 1–21 and Tables 18–20.
dNdScv cancer driver gene detection results. The table contains the output from the dNdScv R package that uses maximum-likelihood models (two sided) to detect genes under selection in the study cohort (n = 126). Only high-confidence base substitutions and small-scale deletions and insertions were used for the analysis. Columns include the gene symbol (gene_name) and global adjusted P value (qglobal_cv). More details regarding the interpretation of all the columns are available at https://github.com/im3sanger/dndscv.
Genomic random interval (GRIN) statistical model (one-sided) for recurrent structural variants. The table contains the output from the gene level GRIN analysis on the study cohort (n = 126) for high-confidence structural variants filtered for expressed protein coding genes. Columns include the gene symbol (gene.label), the adjusted P value of the cohort level overlap statistic (q.subjects) for all genes, as well as a re-calculation of the adjusted P value after filtering for the expressed protein coding genes (q.subjects.after_filter). The table has also been annotated for blacklisted genes for SNVs and indels (snv_indel_blacklist), fragile sites (fragile_site), the fragile site database (fragile_site.list), presence in the COSMIC database (cosmic), the tier in the COSMIC database (cosmic.tier), and its role in cancer in the COSMIC database (cosmic.tier). More details regarding the interpretation of the columns output by GRIN are available at https://www.stjude.org/research/departments/biostatistics/software/grin.html.
Regions of recurrent copy-number change detected by GISTIC2 (permutation-of-markers test, one-sided). Sheet 1 contains cytoband level results for amplifications and deletions in the overall cohort of 126 primary tumors. Columns include the cytoband (Descriptor) and the adjusted P value of the region (q values). Sheet 2 contains the cytoband level analysis with adjusted P values (q values) for all 126 primary tumors as well as for tumors grouped by survival category. Sheet 3 contains GISTIC2 results at the gene level, and columns include the gene symbol (genes.in.wide.peak) and adjusted P values (q values) for all 126 primary tumors as well as for tumors grouped by survival category. More details regarding the interpretation of the columns output by GISTIC2 are available at https://www.genepattern.org/modules/docs/GISTIC_2.0.
RNA-seq differential expression (DE) and gene set enrichment analysis (GSEA) across survival groups. Sheet 1 contains GSEA output from the FGSEA tool using a pre-ranked list of differentially expressed genes for each survival group comparison. The analysis was run using the Molecular Signatures Database (MSigDB) hallmark gene sets. P values (two-sided) were calculated using the FGSEA default Monte Carlo method. Columns include the tumor comparison set (COMPARISON), hallmark pathways (pathway), adjusted P values (padj) and Normalized Enrichment Scores (NES). Sheets 2–4 contain the DE analyses performed using the R package DESeq2 (two-tailed Wald test); comparisons include long-term survivors versus short-term survivors (Sheet2_LTS_vs_STS), long-term survivors versus moderate-term survivors (Sheet3_LTS_vs_MTS) and moderate-term survivors versus short-term survivors (Sheet4_MTS_vs_STS). The first survival group in the name of each sheet corresponds to the numerator of the DE analysis. Columns include the gene symbol (SYMBOL), logged and unlogged fold changes (log2FoldChange, foldChange), and adjusted P values (padj) as output by DESeq2. More details regarding the output of FGSEA can be found at https://github.com/ctlab/fgsea, and more details regarding the output of DESeq2 can be found at http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html.
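For readers working with the DE sheets, the relation between the logged and unlogged fold-change columns is foldChange = 2^log2FoldChange. The short sketch below applies it while filtering on the adjusted P value. The 0.05 cutoff, the function name and the placeholder gene symbols are assumptions for illustration and are not taken from the study.

```python
def significant_genes(rows, padj_cutoff=0.05):
    """Keep rows with padj below the cutoff and attach the unlogged fold change.

    rows is a list of dicts with the keys SYMBOL, log2FoldChange and padj,
    mirroring the column names described for Sheets 2-4.
    """
    kept = []
    for row in rows:
        padj = row.get("padj")
        # DESeq2 can report NA for padj; such rows (None here) are skipped.
        if padj is None or padj >= padj_cutoff:
            continue
        kept.append({**row, "foldChange": 2.0 ** row["log2FoldChange"]})
    return kept


# Toy input with placeholder gene symbols (not real results).
example_rows = [
    {"SYMBOL": "GENE_A", "log2FoldChange": 1.0, "padj": 0.01},
    {"SYMBOL": "GENE_B", "log2FoldChange": -1.0, "padj": 0.20},
    {"SYMBOL": "GENE_C", "log2FoldChange": -2.0, "padj": 0.001},
    {"SYMBOL": "GENE_D", "log2FoldChange": 3.0, "padj": None},
]
```

A fold change above 1 indicates higher expression in the numerator group (the first survival group named in the sheet), and a value below 1 indicates lower expression.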
RNA-seq high- and medium-confidence gene fusion results from the Arriba tool. Columns include the fusion gene pair (gene1, gene2), the breakpoints in the gene pair (breakpoint1, breakpoint2), the confidence of the fusion assigned by Arriba (confidence) and coordinates of any breakpoint support detected in the whole-genome sequencing list of high-confidence structural variants (closest_genomic_breakpoint1, closest_genomic_breakpoint2). In-frame fusion gene pairs that were identified in more than one patient are indicated in the column ‘Recurrent in-frame gene fusion’. More details regarding the interpretation of the Arriba output can be found at https://arriba.readthedocs.io/en/latest/output-files/.
Differential methylation (DM) analysis results using the Limma R package in order of long-term survivors versus moderate-term survivors (LTS - MTS), long-term survivors versus short-term survivors (LTS - STS) and moderate-term survivors versus short-term survivors (MTS - STS). The first survival group in the name of each sheet corresponds to the numerator of the DM analysis. Columns include the genomic coordinates of the methylation probe (seqname, start, end), the identifier of the probe (Name), the symbol of the gene target (GeneSymbol), the distance to the transcription start site (TSS) of the gene (distance2TSS), whether the probe is in the promoter region of the gene (PROMOTER), whether the probe intersects the gene body (GENEBODY), the minimum, maximum, mean and median beta value of the probe in the cohort (min_beta, max_beta, mean_beta, median_beta), the minimum, maximum, mean and median RNA expression of the gene in the cohort (min_exp, max_exp, mean_exp, median_exp), the Pearson correlation with gene expression (two-sided test; pearson_cor), the unadjusted Pearson correlation P value (pearson_pval), the Benjamini-Hochberg adjusted Pearson correlation P value (pearson_qval), the direction of Pearson correlation (pearson_cor_dir), the logged fold-change as output by Limma (logFC), the adjusted P value of the DM result as output by Limma (adj.P.Val) and a column to indicate a filter for genes that were deemed to be turned off in the numerator comparison group (EXP_TURNED_OFF_IN_LTS or EXP_TURNED_OFF_IN_MTS). Further details are provided in the Supplementary Note. More details regarding the interpretation of the Limma output can be found at https://bioconductor.org/packages/release/bioc/html/limma.html.
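The probe-to-expression correlation columns can be illustrated with a minimal Pearson correlation in plain Python. This sketch computes only the coefficient and a direction label; the P value and its adjustment are omitted, and the function names and the direction labels ("positive", "negative", "none") are hypothetical rather than the values used in the table.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_direction(r):
    """Label the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"
```

For a promoter probe, a negative coefficient between beta values and expression across tumors would be consistent with methylation-associated silencing of the target gene.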
RNA-seq differential expression (DE) and fast gene set enrichment analysis (FGSEA), comparing transcriptomes of CCNE1 amplified tumors in each survival group (short-term survivors n = 11, moderate-term survivors n = 4, long-term survivors n = 6) with a reference group of tumors with no CCNE1 amplification or loss that had no homologous recombination (HR) alterations and were classified as HR proficient (reference n = 21). Sheet 1 contains the output from the FGSEA tool using a pre-ranked list of differentially expressed genes for each survival group comparison. The analysis was run using the Molecular Signatures Database (MSigDB) hallmark gene sets. P values (two-sided) were calculated using the FGSEA default Monte Carlo method. Columns include the tumor comparison set (COMPARISON), hallmark pathways (pathway), adjusted P values (padj) and Normalized Enrichment Scores (NES). Sheets 2–4 contain the DE analyses performed using the R package DESeq2 (two-tailed Wald test); comparisons include CCNE1 amplified tumors in long-term survivors versus the reference tumors (Sheet2_LTS_vs_reference), CCNE1 amplified tumors in moderate-term survivors versus the reference tumors (Sheet3_MTS_vs_reference) and CCNE1 amplified tumors in short-term survivors versus the reference tumors (Sheet4_STS_vs_reference). The first survival group in the name of each sheet corresponds to the numerator of the DE analysis. Columns include the gene symbol (SYMBOL), logged and unlogged fold changes (log2FoldChange, foldChange), and adjusted P values (padj) as output by DESeq2.
Supplementary Tables 1–17
About this article
Cite this article
Garsed, D.W., Pandey, A., Fereday, S. et al. The genomic and immune landscape of long-term survivors of high-grade serous ovarian cancer. Nat Genet 54, 1853–1864 (2022). https://doi.org/10.1038/s41588-022-01230-9