A clonal expression biomarker associates with lung cancer mortality

Biswas, Dhruva; Birkbak, Nicolai J.; Rosenthal, Rachel; Hiley, Crispin T.; Lim, Emilia L.; Papp, Krisztian; Boeing, Stefan; Krzystanek, Marcin; Djureinovic, Dijana; La Fleur, Linnea; Greco, Maria; Döme, Balázs; Fillinger, János; Brunnström, Hans; Wu, Yin; Moore, David A.; Skrzypski, Marcin; Abbosh, Christopher; Litchfield, Kevin; Al Bakir, Maise; Watkins, Thomas B. K.; Veeriah, Selvaraju; Wilson, Gareth A.; Jamal-Hanjani, Mariam; Moldvay, Judit; Botling, Johan; Chinnaiyan, Arul M.; Micke, Patrick; Hackshaw, Allan; Bartek, Jiri; Csabai, Istvan; Szallasi, Zoltan; Herrero, Javier; McGranahan, Nicholas; Swanton, Charles

doi:10.1038/s41591-019-0595-z

Letter
Published: 07 October 2019

Dhruva Biswas^1,2,3,
Nicolai J. Birkbak (ORCID: orcid.org/0000-0003-1613-9587)^1,3,4,5,
Rachel Rosenthal^1,2,3,
Crispin T. Hiley^1,3,
Emilia L. Lim^1,3,
Krisztian Papp (ORCID: orcid.org/0000-0003-0619-8233)^6,
Stefan Boeing^7,
Marcin Krzystanek^8,
Dijana Djureinovic^9,
Linnea La Fleur^9,
Maria Greco^10,
Balázs Döme^11,12,13,
János Fillinger^14,15,
Hans Brunnström (ORCID: orcid.org/0000-0001-7402-138X)^16,
Yin Wu^1,
David A. Moore^17,
Marcin Skrzypski^1,18,
Christopher Abbosh^1,
Kevin Litchfield^3,
Maise Al Bakir^3,
Thomas B. K. Watkins^3,
Selvaraju Veeriah^1,
Gareth A. Wilson^1,3,
Mariam Jamal-Hanjani^1,
Judit Moldvay^11,19,
Johan Botling^9,
Arul M. Chinnaiyan^20,21,22,23,24,
Patrick Micke^9,
Allan Hackshaw^25,
Jiri Bartek^8,26,
Istvan Csabai^6,
Zoltan Szallasi^8,19,27,
Javier Herrero (ORCID: orcid.org/0000-0001-7313-717X)^2,
Nicholas McGranahan (ORCID: orcid.org/0000-0001-9537-4045)^1,28,
Charles Swanton (ORCID: orcid.org/0000-0002-4299-3018)^1,3 &
TRACERx Consortium

Nature Medicine volume 25, pages 1540–1548 (2019)

A Publisher Correction to this article was published on 03 June 2020

This article has been updated

Abstract

An aim of molecular biomarkers is to stratify patients with cancer into disease subtypes predictive of outcome, improving diagnostic precision beyond clinical descriptors such as tumor stage^1. Transcriptomic intratumor heterogeneity (RNA-ITH) has been shown to confound existing expression-based biomarkers across multiple cancer types^2,3,4,5,6. Here, we analyze multi-region whole-exome and RNA sequencing data for 156 tumor regions from 48 patients enrolled in the TRACERx study to explore and control for RNA-ITH in non-small cell lung cancer. We find that chromosomal instability is a major driver of RNA-ITH, and existing prognostic gene expression signatures are vulnerable to tumor sampling bias. To address this, we identify genes expressed homogeneously within individual tumors that encode expression modules of cancer cell proliferation and are often driven by DNA copy-number gains selected early in tumor evolution. Clonal transcriptomic biomarkers overcome tumor sampling bias, associate with survival independent of clinicopathological risk factors, and may provide a general strategy to refine biomarker design across cancer types.

**Fig. 1: Tumor sampling bias confounds lung cancer biomarkers.**

**Fig. 2: RNA inter- and intratumor heterogeneity quadrants.**

**Fig. 3: Clonal gene selection improves prognostic accuracy over conventional biomarker design and beyond clinicopathological risk factors.**

**Fig. 4: Pan-cancer prognostic relevance and the genomic underpinning of RNA heterogeneity quadrants.**

Data availability

Sequence data used during the study are available for noncommercial research purposes through the Cancer Research UK & University College London Cancer Trials Centre (ctc.tracerx@ucl.ac.uk). Access will be granted upon review of a project proposal by a TRACERx data access committee and upon entering into an appropriate data access agreement, subject to any applicable ethical approvals.

Code availability

Code is available at https://github.com/dhruvabiswas/tracerx-oracle.

Change history

03 June 2020
An amendment to this paper has been published and can be accessed via a link at the top of the paper.

References

Vargas, A. J. & Harris, C. C. Biomarker development in the precision medicine era: lung cancer as a case study. Nat. Rev. Cancer 16, 525–537 (2016).
Lee, W.-C. et al. Multiregion gene expression profiling reveals heterogeneity in molecular subtypes and immunotherapy response signatures in lung cancer. Mod. Pathol. 31, 947–955 (2018).
Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N. Engl. J. Med. 366, 883–892 (2012).
Gulati, S. et al. Systematic evaluation of the prognostic impact and intratumour heterogeneity of clear cell renal cell carcinoma biomarkers. Eur. Urol. 66, 936–948 (2014).
Gyanchandani, R. et al. Intratumor heterogeneity affects gene expression profile test prognostic risk stratification in early breast cancer. Clin. Cancer Res. 22, 5362–5369 (2016).
Gulati, S., Turajlic, S., Larkin, J., Bates, P. A. & Swanton, C. Relapse models for clear cell renal carcinoma. Lancet Oncol. 16, e376–e378 (2015).
Beer, D. G. et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 8, 816–824 (2002).
Bianchi, F. et al. Survival prediction of stage I lung adenocarcinomas by expression of 10 genes. J. Clin. Invest. 117, 3436–3444 (2007).
Garber, M. E. et al. Diversity of gene expression in adenocarcinoma of the lung. Proc. Natl Acad. Sci. USA 98, 13784–13789 (2001).
Kratz, J. R. et al. A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international validation studies. Lancet 379, 823–832 (2012).
Krzystanek, M., Moldvay, J., Szüts, D., Szallasi, Z. & Eklund, A. C. A robust prognostic gene expression signature for early stage lung adenocarcinoma. Biomark. Res. 4, 4 (2016).
Li, B., Cui, Y., Diehn, M. & Li, R. Development and validation of an individualized immune prognostic signature in early-stage nonsquamous non-small cell lung cancer. JAMA Oncol. 3, 1529–1537 (2017).
Raz, D. J. et al. A multigene assay is prognostic of survival in patients with early-stage lung adenocarcinoma. Clin. Cancer Res. 14, 5565–5570 (2008).
Shukla, S. et al. Development of a RNA-Seq based prognostic signature in lung adenocarcinoma. J. Natl Cancer Inst. 109, djw200 (2017).
Wistuba, I. I. et al. Validation of a proliferation-based expression signature as prognostic marker in early stage lung adenocarcinoma. Clin. Cancer Res. 19, 6261–6271 (2013).
Shedden, K. et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat. Med. 14, 822–827 (2008).
Subramanian, J. & Simon, R. Gene expression-based prognostic signatures in lung cancer: ready for clinical use? J. Natl Cancer Inst. 102, 464–474 (2010).
Burrell, R. A., McGranahan, N., Bartek, J. & Swanton, C. The causes and consequences of genetic heterogeneity in cancer evolution. Nature 501, 338–345 (2013).
Boutros, P. C. The path to routine use of genomic biomarkers in the cancer clinic. Genome Res. 25, 1508–1513 (2015).
Blackhall, F. H. et al. Stability and heterogeneity of expression profiles in lung cancer specimens harvested following surgical resection. Neoplasia 6, 761–767 (2004).
Bachtiary, B. et al. Gene expression profiling in cervical cancer: an exploration of intratumor heterogeneity. Clin. Cancer Res. 12, 5632–5640 (2006).
Barranco, S. C. et al. Intratumor variability in prognostic indicators may be the cause of conflicting estimates of patient survival and response to therapy. Cancer Res. 54, 5351–5356 (1994).
Jamal-Hanjani, M. et al. Tracking the evolution of non-small-cell lung cancer. N. Engl. J. Med. 376, 2109–2121 (2017).
The Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519–525 (2012).
The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
Djureinovic, D. et al. Profiling cancer testis antigens in non-small-cell lung cancer. JCI Insight 1, e86837 (2016).
Goldstraw, P. et al. The IASLC lung cancer staging project: proposals for the revision of the TNM stage groupings in the forthcoming (seventh) edition of the TNM classification of malignant tumours. J. Thorac. Oncol. 2, 706–714 (2007).
Okayama, H. et al. Identification of genes upregulated in ALK-positive and EGFR/KRAS/ALK-negative lung adenocarcinomas. Cancer Res. 72, 100–111 (2012).
Rousseaux, S. et al. Ectopic activation of germline and placental genes identifies aggressive metastasis-prone lung cancers. Sci. Transl. Med. 5, 186ra66 (2013).
Der, S. D. et al. Validation of a histology-independent prognostic gene signature for early-stage, non-small-cell lung cancer including stage IA patients. J. Thorac. Oncol. 9, 59–64 (2014).
Venet, D., Dumont, J. E. & Detours, V. Most random gene expression signatures are significantly associated with breast cancer outcome. PLoS Comput. Biol. 7, e1002240 (2011).
Tang, H. et al. Comprehensive evaluation of published gene expression prognostic signatures for biomarker-based lung cancer clinical studies. Ann. Oncol. 28, 733–740 (2017).
Chen, H.-Y. et al. A five-gene signature and clinical outcome in non-small-cell lung cancer. N. Engl. J. Med. 356, 11–20 (2007).
Reka, A. K. et al. Epithelial–mesenchymal transition-associated secretory phenotype predicts survival in lung cancer patients. Carcinogenesis 35, 1292–1300 (2014).
Strauss, G. M. et al. Adjuvant paclitaxel plus carboplatin compared with observation in stage IB non-small-cell lung cancer: CALGB 9633 with the Cancer and Leukemia Group B, Radiation Therapy Oncology Group, and North Central Cancer Treatment Group study groups. J. Clin. Oncol. 26, 5043–5051 (2008).
Pignon, J.-P. et al. Lung adjuvant cisplatin evaluation: a pooled analysis by the LACE collaborative group. J. Clin. Oncol. 26, 3552–3559 (2008).
Goldstraw, P. et al. The IASLC lung cancer staging project: proposals for revision of the TNM stage groupings in the forthcoming (eighth) edition of the TNM classification for lung cancer. J. Thorac. Oncol. 11, 39–51 (2016).
Robinson, D. R. et al. Integrative clinical genomics of metastatic cancer. Nature 548, 297–303 (2017).
Danaher, P. et al. Gene expression markers of tumor infiltrating leukocytes. J. Immunother. Cancer 5, 18 (2017).
Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).
Lambrechts, D. et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Nat. Med. 24, 1277–1289 (2018).
Gentles, A. J. et al. The prognostic landscape of genes and infiltrating immune cells across human cancers. Nat. Med. 21, 938–945 (2015).
Uhlen, M. et al. A pathology atlas of the human cancer transcriptome. Science 357, eaan2507 (2017).
Rosenthal, R. et al. Neoantigen-directed immune escape in lung cancer evolution. Nature 567, 479–485 (2019).
Mlecnik, B. et al. Comprehensive intrametastatic immune quantification and major impact of immunoscore on survival. J. Natl Cancer Inst. 110, 97–108 (2018).
Yachida, S. et al. Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature 467, 1114–1117 (2010).
Yates, L. R. et al. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat. Med. 21, 751–759 (2015).
Kim, T.-M. et al. Subclonal genomic architectures of primary and metastatic colorectal cancer based on intratumoral genetic heterogeneity. Clin. Cancer Res. 21, 4461–4472 (2015).
Svensson, V., Teichmann, S. A. & Stegle, O. SpatialDE: identification of spatially variable genes. Nat. Methods 15, 343–346 (2018).
Tang, H. et al. A 12-gene set predicts survival benefits from adjuvant chemotherapy in non-small cell lung cancer patients. Clin. Cancer Res. 19, 1577–1586 (2013).
Cleary, B., Cong, L., Cheung, A., Lander, E. S. & Regev, A. Efficient generation of transcriptomic profiles by random composite measurements. Cell 171, 1424–1436 (2017).
Jiang, P. et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat. Med. 24, 1550–1558 (2018).
Hugo, W. et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165, 35–44 (2016).
Dobin, A. et al. STAR: ultrafast universal RNA-Seq aligner. Bioinformatics 29, 15–21 (2013).
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. Genome Biol. 15, 550 (2014).
Wan, Y.-W., Allen, G. I. & Liu, Z. TCGA2STAT: simple TCGA data access for integrated statistical analysis in R. Bioinformatics 32, 952–954 (2016).
Durinck, S., Spellman, P. T., Birney, E. & Huber, W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 4, 1184–1191 (2009).
Li, Q., Birkbak, N. J., Gyorffy, B., Szallasi, Z. & Eklund, A. C. Jetset: selecting the optimal microarray probe set to represent a gene. BMC Bioinformatics 12, 474 (2011).
Yu, G. & He, Q.-Y. ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization. Mol. BioSyst. 12, 477–479 (2016).
Chen, J. J. W. et al. Global analysis of gene expression in invasion by a lung cancer model. Cancer Res. 61, 5223–5230 (2001).
Ishwaran, H., Kogalur, U. B., Blackstone, E. H. & Lauer, M. S. Random survival forests. Ann. Appl. Stat. 2, 841–860 (2008).
Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).
Campbell, J. D. et al. Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas. Nat. Genet. 48, 607–616 (2016).

Acknowledgements

D.B. was the recipient of a Jean Shanks Foundation MBPhD studentship and also receives funding from the MBPhD program at University College London, as well as the NIHR BRC at University College London Hospitals. N.J.B. is a fellow of the Lundbeck Foundation and acknowledges funding from the Aarhus University Research Foundation and the Danish Cancer Society. K.L. is funded by the UK Medical Research Council (MR/P014712/1). J.M., B.D. and J.F. are supported by the Hungarian Science Foundation (OTKA-K129065). I.C. is supported by NVKP_16-1-2016-0004. Z.S. is supported by NAP2-2017-1.2.1-NKP-0002 and the Breast Cancer Research Foundation (BCRF-18-159). N.M. is a Sir Henry Dale Fellow, jointly funded by the Wellcome Trust and the Royal Society (grant number 211179/Z/18/Z), and also receives funding from Cancer Research UK (CRUK), Rosetrees and the NIHR BRC at University College London Hospitals. C.S. is Royal Society Napier Research Professor. This work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001169, FC001202), the UK Medical Research Council (FC001169, FC001202) and the Wellcome Trust (FC001169, FC001202). C.S. is funded by Cancer Research UK (TRACERx and CRUK Cancer Immunotherapy Catalyst Network), the CRUK Lung Cancer Centre of Excellence, Stand Up 2 Cancer (SU2C), the Rosetrees Trust, the Butterfield and Stoneygate Trusts, NovoNordisk Foundation (ID16584), the Prostate Cancer Foundation and the Breast Cancer Research Foundation (BCRF). The research leading to these results has received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Programme (FP7/2007-2013) Consolidator Grant (FP7-THESEUS-617844), European Commission ITN (FP7-PloidyNet 607722), and an ERC Advanced Grant (PROTEUS) from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement 835297). Support was also provided to C.S. by the National Institute for Health Research, the University College London Hospitals Biomedical Research Centre and the Cancer Research UK University College London Experimental Cancer Medicine Centre.

Author information

These authors contributed equally: Dhruva Biswas, Nicolai J. Birkbak.
A list of members and affiliations appears online.

Authors and Affiliations

Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, Paul O’Gorman Building, London, UK
Dhruva Biswas, Nicolai J. Birkbak, Rachel Rosenthal, Crispin T. Hiley, Emilia L. Lim, Yin Wu, Marcin Skrzypski, Christopher Abbosh, Selvaraju Veeriah, Gareth A. Wilson, Mariam Jamal-Hanjani, Nicholas McGranahan & Charles Swanton
Bill Lyons Informatics Centre, University College London Cancer Institute, Paul O’Gorman Building, London, UK
Dhruva Biswas, Rachel Rosenthal & Javier Herrero
Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London, UK
Dhruva Biswas, Nicolai J. Birkbak, Rachel Rosenthal, Crispin T. Hiley, Emilia L. Lim, Kevin Litchfield, Maise Al Bakir, Thomas B. K. Watkins, Gareth A. Wilson & Charles Swanton
Department of Molecular Medicine, Aarhus University, Aarhus, Denmark
Nicolai J. Birkbak
Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
Nicolai J. Birkbak
Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
Krisztian Papp, Istvan Csabai & Miklos Diossy
Bioinformatics and Biostatistics, The Francis Crick Institute, London, UK
Stefan Boeing
Danish Cancer Society Research Center, Copenhagen, Denmark
Marcin Krzystanek, Jiri Bartek & Zoltan Szallasi
Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
Dijana Djureinovic, Linnea La Fleur, Johan Botling & Patrick Micke
Genomics Equipment Park, The Francis Crick Institute, London, UK
Maria Greco
Department of Tumor Biology, National Korányi Institute of Pulmonology, Semmelweis University, Budapest, Hungary
Balázs Döme & Judit Moldvay
Division of Thoracic Surgery, Comprehensive Cancer Center, Medical University of Vienna, Vienna, Austria
Balázs Döme
Department of Thoracic Surgery, National Institute of Oncology, Semmelweis University, Budapest, Hungary
Balázs Döme
Department of Pathology, National Korányi Institute of Pulmonology, Semmelweis University, Budapest, Hungary
János Fillinger
Department of Pathology, National Institute of Oncology, Budapest, Hungary
János Fillinger
Lund University, Laboratory Medicine Region Skåne, Department of Clinical Sciences Lund, Pathology, Lund, Sweden
Hans Brunnström
Department of Pathology, UCL Cancer Institute, London, UK
David A. Moore
Department of Oncology and Radiotherapy, Medical University of Gdansk, Gdansk, Poland
Marcin Skrzypski
SE-NAP Brain Metastasis Research Group, 2nd Department of Pathology, Semmelweis University, Budapest, Hungary
Judit Moldvay & Zoltan Szallasi
Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA
Arul M. Chinnaiyan
Department of Pathology, University of Michigan, Ann Arbor, MI, USA
Arul M. Chinnaiyan
Rogel Cancer Center, University of Michigan, Ann Arbor, MI, USA
Arul M. Chinnaiyan
Department of Urology, University of Michigan, Ann Arbor, MI, USA
Arul M. Chinnaiyan
Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI, USA
Arul M. Chinnaiyan
Cancer Research UK & University College London Cancer Trials Centre, University College London, London, UK
Allan Hackshaw, Yenting Ngai, Abigail Sharp, Cristina Rodrigues, Oliver Pressey, Sean Smith, Nicole Gower & Harjot Dhanda
Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden
Jiri Bartek
Computational Health Informatics Program, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
Zoltan Szallasi
Cancer Genome Evolution Research Group, University College London Cancer Institute, University College London, London, UK
Nicholas McGranahan
The Francis Crick Institute, London, UK
Mickael Escudero, Aengus Stewart, Andrew Rowan, Jacki Goldman, Peter Van Loo, Richard Kevin Stone, Tamara Denner, Emma Nye, Sophia Ward, Jerome Nicod, Clare Puttick, Katey Enfield, Emma Colliver & Brittany Campbell
University College London Cancer Institute, London, UK
Robert E. Hynds, Andrew Georgiou, Mariana Werner Sunderland, James L. Reading, Sergio A. Quezada, Karl S. Peggs, Teresa Marafioti, John A. Hartley, Pat Gorman, Helen L. Lowe, Leah Ensell, Victoria Spanswick, Angeliki Karamani, Maryam Razaq, Stephan Beck, Ariana Huebner, Michelle Dietzen, Cristina Naceur-Lombardelli, Mita Afroza Akther, Haoran Zhai, Nnennaya Kannu, Elizabeth Manzano, Supreet Kaur Bola, Ehsan Ghorani, Marc Robert de Massy, Elena Hoxha, Emine Hatipoglu, Stephanie Ogwuru & Benny Chain
University College London Hospitals, London, UK
David Lawrence, Martin Hayward, Nikolaos Panagiotopoulos, Robert George, Davide Patrini, Mary Falzon, Elaine Borg, Reena Khiroya, Asia Ahmed, Magali Taylor, Junaid Choudhary, Penny Shaw, Sam M. Janes, Martin Forster, Tanya Ahmad, Siow Ming Lee, Dawn Carnell, Ruheena Mendes, Jeremy George, Neal Navani, Dionysis Papadatos-Pastos, Marco Scarci, Elisa Bertoja, Robert C. M. Stephens, Emilie Martinoni Hoogenboom, James W. Holding & Steve Bandula
Aberdeen Royal Infirmary, Aberdeen, UK
Gillian Price, Sylvie Dubois-Marshall, Keith Kerr, Shirley Palmer, Heather Cheyne, Joy Miller, Keith Buchan, Mahendran Chetty & Mohammed Khalil
Ashford and St Peter’s Hospitals NHS Foundation Trust, Chertsey, UK
Veni Ezhil & Vineet Prakash
Barnet Hospital and Chase Farm Hospital, London, UK
Girija Anand & Sajid Khan
Barts Health NHS Trust, London, UK
Kelvin Lau, Michael Sheaff, Peter Schmid, Louise Lim & John Conibear
Berlin Institute for Medical Systems Biology, Max Delbrueck Center for Molecular Medicine, Berlin, Germany
Roland Schwarz
German Cancer Consortium (DKTK), partner site Berlin, Berlin, Germany
Roland Schwarz
German Cancer Research Center (DKFZ), Heidelberg, Germany
Roland Schwarz
Cancer Research UK Manchester Institute, University of Manchester, Manchester, UK
Jonathan Tugwood, Jackie Pierce, Caroline Dive, Ged Brady, Dominic G. Rothwell, Francesca Chemi & Elaine Kilgour
Cancer Research UK Lung Cancer Centre of Excellence, University of Manchester, Manchester, UK
Caroline Dive, Ged Brady, Dominic G. Rothwell, Francesca Chemi, Elaine Kilgour, Fiona Blackhall, Lynsey Priest, Matthew G. Krebs & Philip Crosbie
Christie NHS Foundation Trust, Manchester, UK
Fiona Blackhall, Lynsey Priest, Matthew G. Krebs, Mathew Carter, Colin R. Lindsay & Fabio Gomes
Wythenshawe Hospital, Manchester University NHS Foundation Trust, Manchester, UK
Philip Crosbie, Yvonne Summers, Raffaele Califano, Paul Taylor, Rajesh Shah, Piotr Krysiak, Kendadai Rammohan, Eustace Fontaine, Richard Booton, Matthew Evison, Stuart Moss, Juliette Novasio, Leena Joseph, Paul Bishop, Anshuman Chaturvedi, Helen Doran, Felice Granato, Vijay Joshi, Elaine Smith & Angeles Montero
Division of Infection, Immunity and Respiratory Medicine, University of Manchester, Manchester, UK
Philip Crosbie
Cancer Research Centre, University of Leicester, Leicester, UK
John Le Quesne, Joan Riley, Lindsay Primrose, Luke Martinson, Nicolas Carey, Jacqui A. Shaw & Dean Fennell
Leicester University Hospitals, Leicester, UK
Dean Fennell, Apostolos Nakas, Sridhar Rathinam, Louise Nelson, Kim Ryanna, Mohamad Tuffail, Amrita Bajaj & Jan Brozik
Cardiff & Vale University Health Board, Cardiff, UK
Fiona Morgan, Malgorzata Kornaszewska, Richard Attanoos, Haydn Adams & Helen Davies
Department of Pathology, GZA-ZNA Antwerp, Antwerp, Belgium
Roberto Salgado
Departments of Radiation Oncology and Radiology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Hugo Aerts
Department of Radiology, Netherlands Cancer Institute, Amsterdam, The Netherlands
Hugo Aerts
Golden Jubilee National Hospital, Clydebank, UK
Alan Kirk, Mo Asif, John Butler, Rocco Bilanca & Nikos Kostoulas
Independent Cancer Patients’ Voice, London, UK
Mairead MacKenzie & Maggie Wilcox
University of Leicester, Leicester, UK
Sara Busacca, Alan Dawson & Mark R. Lovett
Liverpool Heart and Chest Hospital NHS Foundation Trust, Liverpool, UK
Michael Shackcloth, Sarah Feeney & Julius Asante-Siaw
Royal Liverpool University Hospital, Liverpool, UK
John Gosney
Manchester Cancer Research Centre Biobank, Manchester, UK
Angela Leek, Nicola Totten, Jack Davies Hodgkinson, Rachael Waddington, Jane Rogan & Katrina Moore
National Institute for Health Research Leicester Respiratory Biomedical Research Unit, Leicester, UK
William Monteiro & Hilary Marshall
NHS Greater Glasgow and Clyde, Glasgow, UK
Kevin G. Blyth, Craig Dick & Andrew Kidd
Royal Brompton and Harefield NHS Foundation Trust, London, UK
Eric Lim, Paulo De Sousa, Simon Jordan, Alexandra Rice, Hilgardt Raubenheimer, Harshil Bhayani, Morag Hamilton, Lyn Ambrose, Anand Devaraj, Hema Chavan, Sofina Begum, Aleksander Mani, Daniel Kaniu, Mpho Malima, Sarah Booth, Andrew G. Nicholson, Nadia Fernandes, Jessica E. Wallen & Pratibha Shah
Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK
Sarah Danson, Jonathan Bury, John Edwards, Jennifer Hill, Sue Matthews, Yota Kitsanta, Jagan Rao, Sara Tenconi, Laura Socci, Kim Suvarna, Faith Kibutu, Patricia Fisher, Robin Young, Joann Barker, Fiona Taylor & Kirsty Lloyd
The Princess Alexandra Hospital NHS Trust, Harlow, UK
Teresa Light, Tracey Horey, Dionysis Papadatos-Pastos & Peter Russell
The Whittington Hospital NHS Trust, London, UK
Sara Lock & Kayleigh Gilbert
University Hospital Birmingham NHS Foundation Trust, Birmingham, UK
Babu Naidu, Gerald Langman, Andrew Robinson, Hollie Bancroft, Amy Kerr, Salma Kadiri, Charlotte Ferris, Gary Middleton, Madava Djearaman & Akshay Patel
University Hospital Southampton NHS Foundation Trust, Southampton, UK
Christian Ottensmeier, Serena Chee, Benjamin Johnson, Aiman Alzetani & Emily Shaw
Velindre Cancer Centre, Cardiff, UK
Jason Lester

Consortia

TRACERx Consortium

Charles Swanton, Mariam Jamal-Hanjani, Christopher Abbosh, Yin Wu, Selvaraju Veeriah, Marcin Skrzypski, Rachel Rosenthal, Dhruva Biswas, Nicholas McGranahan, Gareth A. Wilson, Emilia L. Lim, Crispin T. Hiley, Nicolai J. Birkbak, Maria Greco, David A. Moore, Javier Herrero, Allan Hackshaw, Thomas B. K. Watkins, Maise Al Bakir, Kevin Litchfield, Istvan Csabai, Stefan Boeing, Zoltan Szallasi, Yenting Ngai, Abigail Sharp, Cristina Rodrigues, Oliver Pressey, Sean Smith, Nicole Gower, Harjot Dhanda, Miklos Diossy, Mickael Escudero, Aengus Stewart, Andrew Rowan, Jacki Goldman, Peter Van Loo, Richard Kevin Stone, Tamara Denner, Emma Nye, Sophia Ward, Jerome Nicod, Clare Puttick, Katey Enfield, Emma Colliver, Brittany Campbell, Robert E. Hynds, Andrew Georgiou, Mariana Werner Sunderland, James L. Reading, Sergio A. Quezada, Karl S. Peggs, Teresa Marafioti, John A. Hartley, Pat Gorman, Helen L. Lowe, Leah Ensell, Victoria Spanswick, Angeliki Karamani, Maryam Razaq, Stephan Beck, Ariana Huebner, Michelle Dietzen, Cristina Naceur-Lombardelli, Mita Afroza Akther, Haoran Zhai, Nnennaya Kannu, Elizabeth Manzano, Supreet Kaur Bola, Ehsan Ghorani, Marc Robert de Massy, Elena Hoxha, Emine Hatipoglu, Stephanie Ogwuru, Benny Chain, David Lawrence, Martin Hayward, Nikolaos Panagiotopoulos, Robert George, Davide Patrini, Mary Falzon, Elaine Borg, Reena Khiroya, Asia Ahmed, Magali Taylor, Junaid Choudhary, Penny Shaw, Sam M. Janes, Martin Forster, Tanya Ahmad, Siow Ming Lee, Dawn Carnell, Ruheena Mendes, Jeremy George, Neal Navani, Dionysis Papadatos-Pastos, Marco Scarci, Elisa Bertoja, Robert C. M. Stephens, Emilie Martinoni Hoogenboom, James W. Holding, Steve Bandula, Gillian Price, Sylvie Dubois-Marshall, Keith Kerr, Shirley Palmer, Heather Cheyne, Joy Miller, Keith Buchan, Mahendran Chetty, Mohammed Khalil, Veni Ezhil, Vineet Prakash, Girija Anand, Sajid Khan, Kelvin Lau, Michael Sheaff, Peter Schmid, Louise Lim, John Conibear, Roland Schwarz, Jonathan Tugwood, Jackie Pierce, Caroline Dive, Ged Brady, Dominic G. Rothwell, Francesca Chemi, Elaine Kilgour, Fiona Blackhall, Lynsey Priest, Matthew G. Krebs, Philip Crosbie, John Le Quesne, Joan Riley, Lindsay Primrose, Luke Martinson, Nicolas Carey, Jacqui A. Shaw, Dean Fennell, Apostolos Nakas, Sridhar Rathinam, Louise Nelson, Kim Ryanna, Mohamad Tuffail, Amrita Bajaj, Jan Brozik, Fiona Morgan, Malgorzata Kornaszewska, Richard Attanoos, Haydn Adams, Helen Davies, Mathew Carter, Colin R. Lindsay, Fabio Gomes, Roberto Salgado, Hugo Aerts, Alan Kirk, Mo Asif, John Butler, Rocco Bilanca, Nikos Kostoulas, Mairead MacKenzie, Maggie Wilcox, Sara Busacca, Alan Dawson, Mark R. Lovett, Michael Shackcloth, Sarah Feeney, Julius Asante-Siaw, John Gosney, Angela Leek, Nicola Totten, Jack Davies Hodgkinson, Rachael Waddington, Jane Rogan, Katrina Moore, William Monteiro, Hilary Marshall, Kevin G. Blyth, Craig Dick, Andrew Kidd, Eric Lim, Paulo De Sousa, Simon Jordan, Alexandra Rice, Hilgardt Raubenheimer, Harshil Bhayani, Morag Hamilton, Lyn Ambrose, Anand Devaraj, Hema Chavan, Sofina Begum, Aleksander Mani, Daniel Kaniu, Mpho Malima, Sarah Booth, Andrew G. Nicholson, Nadia Fernandes, Jessica E. Wallen, Pratibha Shah, Sarah Danson, Jonathan Bury, John Edwards, Jennifer Hill, Sue Matthews, Yota Kitsanta, Jagan Rao, Sara Tenconi, Laura Socci, Kim Suvarna, Faith Kibutu, Patricia Fisher, Robin Young, Joann Barker, Fiona Taylor, Kirsty Lloyd, Teresa Light, Tracey Horey, Dionysis Papadatos-Pastos, Peter Russell, Sara Lock, Kayleigh Gilbert, Babu Naidu, Gerald Langman, Andrew Robinson, Hollie Bancroft, Amy Kerr, Salma Kadiri, Charlotte Ferris, Gary Middleton, Madava Djearaman, Akshay Patel, Christian Ottensmeier, Serena Chee, Benjamin Johnson, Aiman Alzetani, Emily Shaw, Jason Lester, Yvonne Summers, Raffaele Califano, Paul Taylor, Rajesh Shah, Piotr Krysiak, Kendadai Rammohan, Eustace Fontaine, Richard Booton, Matthew Evison, Stuart Moss, Juliette Novasio, Leena Joseph, Paul Bishop, Anshuman Chaturvedi, Helen Doran, Felice Granato, Vijay Joshi, Elaine Smith & Angeles Montero

Contributions

D.B. and N.J.B. conceived the project, designed the experiments, performed the bioinformatics analyses and wrote the manuscript. R.R., E.L.L., K.P., S.B., M.K., T.B.K.W. and G.A.W. performed data processing and bioinformatics analyses. C.T.H., Y.W., D.A.M., M.S., C.A. and M.A.B. gave advice on clinical interpretation. D.D., L.L.F., M.G., B.D., J.F., H.B. and J.M. performed the sample collection, curated the clinical data and helped with data interpretation. K.L., I.C., Z.S. and J.H. helped to direct the avenues of bioinformatics analysis. S.V. performed the sample preparation and RNA extraction. M.J.-H. designed the TRACERx study protocols and helped to analyze the clinical characteristics of the patients. J.Botling, A.M.C., P.M. and J.Bartek provided access to additional RNA-Seq datasets and gave feedback on the manuscript. A.H. provided statistical advice. N.M. and C.S. conceived the project, designed the experiments and helped write the manuscript. N.J.B., N.M. and C.S. supervised the study. All authors reviewed and approved the manuscript.

Corresponding authors

Correspondence to Nicolai J. Birkbak, Nicholas McGranahan or Charles Swanton.

Ethics declarations

Competing interests

C.S. receives grant support from Pfizer, AstraZeneca, BMS, and Roche-Ventana. C.S. has consulted for Pfizer, Novartis, GlaxoSmithKline, MSD, BMS, Celgene, AstraZeneca, Illumina, Genentech, Roche-Ventana, GRAIL, Medicxi and the Sarah Cannon Research Institute and is an advisor for Dynamo Therapeutics. C.S. holds shares in Apogen Biotechnologies, Epic Bioscience and GRAIL, and has stock options in, and is co-founder of, Achilles Therapeutics. R.R. has stock options in, and has consulted for, Achilles Therapeutics. C.A. has received speaking honoraria or expenses from Novartis, Roche, AstraZeneca and BMS. M.A.B. has consulted for Achilles Therapeutics. G.A.W. holds shares in Achilles Therapeutics. M.J.-H. has consulted, and is an advisor, for Achilles Therapeutics. D.B., N.J.B., N.M. and C.S. are co-inventors on a UK patent application (1901439.8) filed by Cancer Research Technology relating to methods of predicting survival rates for patients with cancer.

Additional information

Peer review information Joao Monteiro was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Patient cohorts included in the study.

a, CONSORT diagram for patient recruitment (left) and composition by tumour stage (right) of the TRACERx cohort. b, Patient composition of two RNAseq datasets: The Cancer Genome Atlas cohort (left), and the Uppsala cohort (right). c, Patient composition of four microarray datasets: Der et al, GSE50081 (top left); Okayama et al, GSE31210 (top right); Rousseaux et al, GSE30219 (bottom left); Shedden et al, GSE68465 (bottom right). Tumour stage (x-axis) and therapy status (colour) are indicated for all patient composition bar charts. LUAD = lung adenocarcinoma, LUSC = lung squamous cell carcinoma.

Extended Data Fig. 2 Analysis of the most variably expressed genes in TRACERx.

a, The dendrogram and coloured heatmap (top) show the hierarchical clustering of tumour regions (columns) in the TRACERx multi-region RNAseq cohort (156 tumour regions, 48 NSCLC patients, stage I-III) according to the top 500 variably expressed genes (rows). The sparse heatmap (bottom) shows tumour regions (coloured by histology) per patient (rows). b, Kaplan-Meier survival analysis of the largest two patient clusters from the dendrogram in (a). Statistical significance was tested with a two-sided log-rank test. c, The hierarchical clustering approach taken to quantify discordance rates for published signatures is illustrated for a non-RNAseq signature, Kratz et al¹⁰, in TRACERx (n = 28 LUAD patients, stage I-III). As previously described by Gyanchandani et al⁵, this clustering approach provides a metric that is independent of the gene expression profiling platform. For a given number of clusters, clustering concordance was quantified as the percentage of TRACERx patients with all tumour regions in the same cluster. This analysis was run iteratively from 2 to 28 clusters; 28 is the total number of TRACERx LUAD patients, hence clustering concordance of 100% at 28 clusters is the theoretical upper limit using this metric. The dendrogram and coloured heatmap (top) show the clustering of tumour regions (columns) according to the expression pattern of genes comprising the prognostic signature (rows). The grayscale heatmap (bottom left) shows tumour regions per patient (rows). For a range of clusters (2, 3, 14, 28), the coloured bars (middle left) show the assignment of tumour regions to clusters, the grayscale bars (bottom right) show which patients have their tumour regions discordantly assigned (gray) across clusters, and the pie charts (middle right) show the percentage of discordantly classified patients. d, Discordance rates for 9 published LUAD prognostic signatures^{7,8,9,10,11,12,13,14,15} plotted as the percentage of patients with tumour regions clustering together against the number of clusters. Vertical dashed lines mark a range of clusters (2, 3, 14, 28) as highlighted in (c).
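
The clustering concordance metric described in (c) can be illustrated with a minimal Python sketch. It assumes cluster labels have already been assigned to each tumour region for a given number of clusters; the region names, patient identifiers and labels below are hypothetical toy values set by hand, not TRACERx data, and the hierarchical clustering step itself is not reproduced here.

```python
def clustering_concordance(region_patient, region_cluster):
    """Percentage of patients whose tumour regions all fall in one cluster."""
    clusters_by_patient = {}
    for region, patient in region_patient.items():
        # Collect the set of clusters occupied by each patient's regions.
        clusters_by_patient.setdefault(patient, set()).add(region_cluster[region])
    concordant = sum(1 for c in clusters_by_patient.values() if len(c) == 1)
    return 100.0 * concordant / len(clusters_by_patient)


# Hypothetical toy cohort: three patients with two tumour regions each.
region_patient = {
    "P1_R1": "P1", "P1_R2": "P1",
    "P2_R1": "P2", "P2_R2": "P2",
    "P3_R1": "P3", "P3_R2": "P3",
}

# Hypothetical cluster labels for two choices of cluster number (k).
labels_by_k = {
    2: {"P1_R1": 0, "P1_R2": 0, "P2_R1": 1, "P2_R2": 1, "P3_R1": 0, "P3_R2": 1},
    3: {"P1_R1": 0, "P1_R2": 0, "P2_R1": 1, "P2_R2": 1, "P3_R1": 2, "P3_R2": 2},
}

# Concordance as a function of the number of clusters.
concordance_curve = {
    k: clustering_concordance(region_patient, labels)
    for k, labels in labels_by_k.items()
}
```

In this toy example patient P3 is split across clusters at k = 2 but not at k = 3, so concordance rises as the number of clusters approaches the number of patients, in line with the upper limit noted in the caption.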

Extended Data Fig. 3 Intra- and inter-tumour RNA heterogeneity scores.

a, Gene-wise and patient-wise RNA-ITH scores were calculated using multi-region RNAseq data (normalized count values) from TRACERx tumours (n=28 LUAD patients, 89 tumour regions, stage I-III). For a given tumour, the standard deviation of expression values for a particular gene across tumour regions was calculated yielding a gene-specific, patient-specific measure of RNA-ITH (σ_g,p). This was repeated for all genes, then all tumours, generating a matrix of σ_g,p values. Gene-wise RNA-ITH values are summarised as the average (median) value per gene across all tumours in the cohort (σ_g). Conversely, patient-wise RNA-ITH values are summarised as the average (median) value per tumour across all expressed genes (σ_p). Dashed lines indicate mean values. b, The scatter plots show the Spearman correlation between the chosen metric of intra-tumour expression variability (standard deviation) and alternative metrics, median absolute deviation (left) or coefficient of variation (right), as calculated in the TRACERx cohort (n=28 LUAD patients, 89 tumour regions, stage I-III). c, Diagram illustrating the calculation of gene-wise inter-tumour RNA heterogeneity scores through the random sampling of tumour regions from the TRACERx cohort (n=28 LUAD patients, 89 tumour regions, stage I-III; see Methods). d, The scatter plot shows the Spearman correlation between inter-tumour RNA heterogeneity scores calculated in TRACERx (n=28 LUAD patients, 89 tumour regions, stage I-III), randomly sampled to yield a sham single-biopsy cohort, and TCGA (n = 469 LUAD patients, stage I-III), a true single-biopsy cohort.
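
As a rough illustration of the calculation in (a), the following Python sketch computes σ_g,p, σ_g and σ_p from a nested dictionary of expression values. The gene names, patient identifiers and values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the caption does not state which estimator was used.

```python
from statistics import median, stdev


def rna_ith_scores(expr):
    """expr maps patient -> gene -> list of expression values, one per tumour region."""
    # sigma_gp[gene][patient]: SD of a gene's expression across one tumour's regions.
    sigma_gp = {}
    for patient, genes in expr.items():
        for gene, values in genes.items():
            sigma_gp.setdefault(gene, {})[patient] = stdev(values)  # assumed: sample SD

    # Gene-wise score: median across tumours for each gene.
    sigma_g = {gene: median(by_patient.values()) for gene, by_patient in sigma_gp.items()}

    # Patient-wise score: median across genes for each tumour.
    sigma_p = {
        patient: median(sigma_gp[gene][patient] for gene in expr[patient])
        for patient in expr
    }
    return sigma_gp, sigma_g, sigma_p


# Hypothetical toy data: two patients, two genes, two or three regions per tumour.
toy_expr = {
    "P1": {"GENE_A": [1.0, 3.0], "GENE_B": [2.0, 2.0]},
    "P2": {"GENE_A": [0.0, 4.0, 8.0], "GENE_B": [5.0, 5.0, 5.0]},
}

sigma_gp, sigma_g, sigma_p = rna_ith_scores(toy_expr)
```

Here GENE_B is identical across regions within each tumour, so its gene-wise score is zero, whereas GENE_A varies between regions and receives a positive score.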

Extended Data Fig. 4 Clustering concordance and published prognostic signatures.

a, Clustering concordance scores calculated in TRACERx (n=28 LUAD patients, 89 tumour regions, stage I-III) using the same method used to estimate the sampling bias of microarray signatures, as described by Gyanchandani et al⁵ (see Extended Data Fig. 2c,d). For each gene, a curve is calculated for the number of patients with all regions in the same cluster against the number of clusters (2–28 clusters). Curves for five genes (minimum = CKMT2, lower quartile = CYSLTR2, median = MCM2, upper quartile = MFSD1, maximum = HOXC11) are shown (top), in addition to summarised clustering concordance scores for all genes (bottom). b, Gene-wise clustering concordance scores stratified by RNA heterogeneity quadrant, both calculated in TRACERx (n=28 LUAD patients, 89 tumour regions, stage I-III). Boxplots represent the median, 25th and 75th percentiles and the vertical bars span the 5th to the 95th percentiles. Statistical significance was tested with a two-sided Wilcoxon signed rank sum test. “*” indicates a P-value < 0.05, “**” indicates a P-value < 0.01, “***” indicates a P-value < 0.001.

Extended Data Fig. 5 Analysis of published prognostic signatures for LUAD by RNA heterogeneity quadrant.

a, The composition of published prognostic signatures by RNA heterogeneity quadrant, plotted in order of increasing percentage of Q4 genes (low intra- and high inter-tumour heterogeneity). b, Percentage of genes expected (total no. genes, as indicated in Fig. 2a) versus observed (in 9 published LUAD prognostic signatures^{7,8,9,10,11,12,13,14,15}) per RNA heterogeneity quadrant. Statistical significance was tested with a two-sided Fisher’s exact test. The ability of published prognostic genes for LUAD (the combined gene list from nine published signatures, 242 unique genes) to maintain prognostic value across patient cohorts is assessed (using Cox univariate survival analysis) in four microarray datasets: Shedden et al, GSE68465 (c); Okayama et al, GSE31210 (d); Der et al, GSE50081 (e); Rousseaux et al, GSE30219 (f). Boxplots represent the median, 25th and 75th percentiles and the vertical bars span the 5th to the 95th percentiles. Statistical significance was tested with a two-sided Wilcoxon signed rank sum test. “*” indicates a P-value < 0.05, “**” indicates a P-value < 0.01, “***” indicates a P-value < 0.001.

Extended Data Fig. 6 Prognostic signature design.

a, Biomarkers are designed using state-of-the-art signature construction methods, replicated from Shukla et al¹⁴ (signature A and B), Chen et al³³ (signature C), Reka et al³⁴ (signature D) and Kratz et al¹⁰ (signature E). In parallel, the “prognostic significance” filters (present in each signature construction method) were substituted with “clonal expression” filters, generating corresponding clonal signatures (signatures A-clonal, B-clonal, C-clonal, D-clonal, and E-clonal). Published signature construction methods are indicated in orange, novel methods integrating clonal biomarker design are indicated in blue. All signatures are developed in TCGA LUAD patients (n=469, stage I-III) as the training dataset. b, Flow diagram illustrating the gene selection steps for ORACLE. Criteria to identify prognostic and clonally expressed genes, and the number of genes selected at each step are indicated. c, Optimization of the number of genes to select at the clustering concordance step through 10-fold cross-validation in the training cohort (TCGA, n=469 LUAD patients, stage I-III). The optimal number of genes, with the lowest cross-validation error, is shown by the vertical red line. d, The cut-off to dichotomize the ORACLE risk-score into ‘high’ and ‘low’ risk groups is optimized in the training cohort (TCGA, n=469 LUAD patients, stage I-III). The horizontal blue line indicates a log-rank P-value = 0.01 and the optimal cut-off is shown by the vertical red line. Statistical significance was tested with a two-sided log-rank test. e, Tumour sampling bias of the ORACLE signature assessed using multi-region RNAseq data from TRACERx (n=28 LUAD patients, 89 tumour regions, stage I-III). Each point represents a single tumour region, vertical lines display the range for each patient, and patients are ordered by predicted survival risk score. Points are coloured according to the risk classification of tumour regions within a patient: concordant low-risk (blue), concordant high-risk (red), or discordant (gray).

Extended Data Fig. 7 Risk stratification using ORACLE.

a, Kaplan-Meier plot of ORACLE in the RNAseq-based validation cohort (Uppsala, n=103 LUAD patients, stage I-III). Statistical significance was tested with a two-sided log-rank test. The ability of substaging criteria (b) and ORACLE (c) to split patients into prognostically informative groups is tested in stage I patients using the updated TNM version 8 criteria³⁷, shown as Kaplan-Meier plots for the Uppsala RNAseq dataset (n=53 LUAD patients, stage I, TNMv8). Statistical significance was tested with a two-sided log-rank test. d, The distribution of ORACLE risk scores by disease stage, shown for the Uppsala cohort (n=103 LUAD patients, stage I-III) and the MET500 cohort³⁸ (n=8 metastatic samples from patients with LUAD primary tumours). Boxplots represent the median, 25th and 75th percentiles and the vertical bars span the 5th to the 95th percentiles. Statistical significance was tested with a Wilcoxon signed rank sum test. No corrections were made for multiple comparisons. e, The scatter plot shows the Spearman correlation between Ki67 staining % and ORACLE risk-scores in the TRACERx cohort (n=28 LUAD patients, 89 tumour regions, stage I-III).

Extended Data Fig. 8 ORACLE as a cancer cell expression signature.

a, Spearman correlations between the infiltration of immune cell subsets, calculated from RNAseq data using the method described by Danaher et al³⁹, and ORACLE risk-scores in the TCGA dataset (n=469 patients, stage I-III). b, The scatter plot shows the Spearman correlation between ORACLE risk score and tumour purity assessed from whole-exome sequencing data using ASCAT, as described by Van Loo et al⁴⁰, in TRACERx (n=28 LUAD patients, 84 tumour regions, stage I-III). c, Lambrechts et al⁴¹ performed single-cell RNAseq on 52,698 cells sourced from 5 NSCLC patients, then defined 7 clusters of stromal cell genes and provided a per-cluster expression measure for every gene. The relative expression levels (y-axis) for each stromal cluster (coloured by cell-type, see figure legend) are plotted for all 23 genes comprising the ORACLE signature (bottom 3 rows). To aid interpretation, a marker gene for each of the 7 stromal cell clusters is also plotted (top row) for comparison: alveolar (AGER), B cell (MS4A1), epithelial (EPCAM), fibroblast (COL6A2), myeloid (CD68), T cell (CD3D), and vascular (FLT1) cell-types. d, Pearson correlations between the expression of individual ORACLE genes and copy-number state at the corresponding gene locus in the TRACERx cohort (n=28 LUAD patients, 89 tumour regions, stage I-III). Significant correlations (P<0.05) are marked in red, non-significant correlations are marked in blue.

Extended Data Fig. 9 Patient-level estimates of RNA-ITH and association with tumour cellular composition.

a, RNA-ITH scores calculated from each tumour by sampling one to N biopsies (where N is the total number of biopsies yielded by that tumour) in TRACERx (n=48 NSCLC patients, 156 tumour regions, stage I-III). For each patient the RNA-ITH score (y-axis) is plotted for all possible subgroups of tumour regions against the number of biopsies (x-axis). The mean (red line) and standard deviation (blue lines) are shown for each tumour. b, The scatter plots show the Spearman correlation between patient-level RNA-ITH scores and RNAseq-based immune infiltration measures, calculated from RNAseq data using the method described by Danaher et al³⁹ in TRACERx (n=48 NSCLC patients, 156 tumour regions, stage I-III). c, The scatter plot shows the Spearman correlation between patient-level RNA-ITH scores and tumour purity assessed from whole-exome sequencing data using ASCAT, as described by Van Loo et al⁴⁰, in TRACERx (n=48 NSCLC patients, 156 tumour regions, stage I-III).

Extended Data Fig. 10 Pathway analysis by RNA heterogeneity quadrant.

The top 10 Reactome pathways for each RNA heterogeneity quadrant are plotted: low inter- and high intra- (Q1, a), low inter- and low intra- (Q2, b), high inter- and high intra- (Q3, c), high inter- and low intra- (Q4, d).

Supplementary information

Reporting Summary

Supplementary Tables

About this article

Cite this article

Biswas, D., Birkbak, N.J., Rosenthal, R. et al. A clonal expression biomarker associates with lung cancer mortality. Nat Med 25, 1540–1548 (2019). https://doi.org/10.1038/s41591-019-0595-z

Received: 12 November 2018
Accepted: 20 August 2019
Published: 07 October 2019
Issue Date: October 2019
DOI: https://doi.org/10.1038/s41591-019-0595-z
