Whole transcriptome signature for prognostic prediction (WTSPP): application of whole transcriptome signature for prognostic prediction in cancer


Developing prognostic biomarkers for specific cancer types that accurately predict patient survival is increasingly important in clinical research and practice. Despite the enormous potential of prognostic signatures, proposed models have seen limited implementation in routine clinical practice. Herein, we propose a generic, RNA sequencing platform-independent statistical framework, named whole transcriptome signature for prognostic prediction (WTSPP), to generate prognostic gene signatures. Using ovarian cancer and lung adenocarcinoma as examples, we provide evidence that our prognostic signatures outperform previously reported signatures, capture prognostic features not explained by clinical variables, and expose biologically relevant prognostic pathways, including those involved in the immune system and cell cycle. Our approach provides a robust method for developing prognostic gene expression signatures. In conclusion, our statistical framework can be generally applied to all cancer types for prognostic prediction and might be extended to other human diseases. The proposed method is implemented as an R package (PanCancerSig) and is freely available on GitHub (https://github.com/Cheng-Lab-GitHub/PanCancer_Signature).


Fig. 1: The OV signature is predictive of patient survival in six independent ovarian cancer gene expression datasets.
Fig. 2: The OV signature provides additional prognostic value over clinical variables.
Fig. 3: The OV signature can be used to stratify individual clinical variables.
Fig. 4: The OV signature outperforms 14 published ovarian cancer-specific gene signatures.
Fig. 5: The LUAD signature is predictive of patient survival in independent lung adenocarcinoma gene expression datasets.




This work is supported by the Cancer Prevention Research Institute of Texas (CPRIT) (RR180061 to CC) and the National Cancer Institute of the National Institutes of Health (1R21CA227996 to CC). CC is a CPRIT Scholar in Cancer Research.

Author information




ES, YZ, and KZ performed the analysis. ES and YZ produced the figures. YW generated the R package. ES, YZ, and CC conceived the research and designed the method and experiments. YZ and CC curated the data. ES, YZ, FSV, KZ, HY, and CC interpreted the results. ES drafted the manuscript. ES, YZ, YW, FSV, KZ, HY, and CC read and approved the final manuscript. CC directed the project.

Corresponding author

Correspondence to Chao Cheng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Schaafsma, E., Zhao, Y., Wang, Y. et al. Whole transcriptome signature for prognostic prediction (WTSPP): application of whole transcriptome signature for prognostic prediction in cancer. Lab Invest 100, 1356–1366 (2020). https://doi.org/10.1038/s41374-020-0413-8
