Lung adenocarcinomas (LUADs) display a broad histological spectrum from low-grade lepidic tumors through to mid-grade acinar and papillary and high-grade solid, cribriform and micropapillary tumors. How morphology reflects tumor evolution and disease progression is poorly understood. Whole-exome sequencing data generated from 805 primary tumor regions and 121 paired metastatic samples across 248 LUADs from the TRACERx 421 cohort, together with RNA-sequencing data from 463 primary tumor regions, were integrated with detailed whole-tumor and regional histopathological analysis. Tumors with predominantly high-grade patterns showed increased chromosomal complexity, with higher burden of loss of heterozygosity and subclonal somatic copy number alterations. Individual regions in predominantly high-grade pattern tumors exhibited higher proliferation and lower clonal diversity, potentially reflecting large recent subclonal expansions. Co-occurrence of truncal loss of chromosomes 3p and 3q was enriched in predominantly low-/mid-grade tumors, while purely undifferentiated solid-pattern tumors had a higher frequency of truncal arm or focal 3q gains and SMARCA4 gene alterations compared with mixed-pattern tumors with a solid component, suggesting distinct evolutionary trajectories. Clonal evolution analysis revealed that tumors tend to evolve toward higher-grade patterns. The presence of micropapillary pattern and ‘tumor spread through air spaces’ were associated with intrathoracic recurrence, in contrast to the presence of solid/cribriform patterns, necrosis and preoperative circulating tumor DNA detection, which were associated with extra-thoracic recurrence. These data provide insights into the relationship between LUAD morphology, the underlying evolutionary genomic landscape, and clinical and anatomical relapse risk.
The WES data and RNA-seq data (in each case from the TRACERx study) used during this study have been deposited at the European Genome–phenome Archive (EGA), which is hosted by the European Bioinformatics Institute (EBI) and the Centre for Genomic Regulation (CRG), under the accession codes EGAS00001006494 (WES) and EGAS00001006517 (RNA-seq); access is controlled by the TRACERx data access committee and details on how to apply for access are available at the linked page. Hallmark gene sets (ref. 58) were downloaded from the Molecular Signatures Database (https://www.gsea-msigdb.org/gsea/msigdb/).
Code used to process data and generate figures is available at https://doi.org/10.5281/zenodo.7683605.
Nicholson, A. G. et al. The 2021 WHO classification of lung tumors: impact of advances since 2015. J. Thorac. Oncol. 17, 362–387 (2022).
Chang, J. C. et al. Comprehensive molecular and clinicopathologic analysis of 200 pulmonary invasive mucinous adenocarcinomas identifies distinct characteristics of molecular subtypes. Clin. Cancer Res. 27, 4066–4076 (2021).
Kadota, K. et al. Tumor spread through air spaces is an important pattern of invasion and impacts the frequency and location of recurrences after limited resection for small stage I lung adenocarcinomas. J. Thorac. Oncol. 10, 806–814 (2015).
Moreira, A. L. et al. A grading system for invasive pulmonary adenocarcinoma: a proposal from the international association for the study of lung cancer pathology committee. J. Thorac. Oncol. 15, 1599–1610 (2020).
Caso, R. et al. The underlying tumor genomics of predominant histologic subtypes in lung adenocarcinoma. J. Thorac. Oncol. 15, 1844–1856 (2020).
Tavernari, D. et al. Nongenetic evolution drives lung adenocarcinoma spatial heterogeneity and progression. Cancer Discov. 11, 1490–1507 (2021).
Jamal-Hanjani, M. et al. Tracking the evolution of non-small-cell lung cancer. N. Engl. J. Med. 376, 2109–2121 (2017).
Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early-stage lung cancer evolution. Nature 545, 446–451 (2017).
Frankell, A. M. et al. The evolution of lung cancer and impact of subclonal selection in TRACERx. Nature https://doi.org/10.1038/s41586-023-05783-5 (2023).
Al Bakir, M. et al. The evolution of non-small cell lung cancer metastases in TRACERx. Nature https://doi.org/10.1038/s41586-023-05729-x (2023).
Martínez-Ruiz, C. et al. Genomic-transcriptomic evolution in lung cancer and metastasis. Nature https://doi.org/10.1038/s41586-023-05706-4 (2023).
Favero, F. et al. Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data. Ann. Oncol. 26, 64–70 (2015).
Warth, A. et al. Tumour cell proliferation (Ki-67) in non-small cell lung cancer: a critical reappraisal of its prognostic role. Br. J. Cancer 111, 1222–1229 (2014).
Dong, Z.-Y. et al. Genetic and immune profiles of solid predominant lung adenocarcinoma reveal potential immunotherapeutic strategies. J. Thorac. Oncol. 13, 85–96 (2018).
Ding, Y. et al. Comparative study on the mutational profile of adenocarcinoma and squamous cell carcinoma predominant histologic subtypes in Chinese non-small cell lung cancer patients. Thorac. Cancer 11, 103–112 (2020).
Rekhtman, N., Ang, D. C., Riely, G. J., Ladanyi, M. & Moreira, A. L. KRAS mutations are associated with solid growth pattern and tumor-infiltrating leukocytes in lung adenocarcinoma. Mod. Pathol. 26, 1307–1319 (2013).
Lastwika, K. J. et al. Control of PD-L1 expression by oncogenic activation of the AKT–mTOR pathway in non-small cell lung cancer. Cancer Res. 76, 227–238 (2016).
Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy number alteration in human cancers. Genome Biol. 12, R41 (2011).
Chang, J. C. et al. Comprehensive next-generation sequencing unambiguously distinguishes separate primary lung carcinomas from intrapulmonary metastases: comparison with standard histopathologic approach. Clin. Cancer Res. 25, 7113–7125 (2019).
Masai, K. et al. Prognostic impact of margin distance and tumor spread through air spaces in limited resection for primary lung cancer. J. Thorac. Oncol. 12, 1788–1797 (2017).
Eguchi, T. et al. Lobectomy is associated with better outcomes than sublobar resection in spread through air spaces (STAS)-positive T1 lung adenocarcinoma: a propensity score-matched analysis. J. Thorac. Oncol. 14, 87–98 (2019).
Warth, A. et al. Prognostic impact of intra-alveolar tumor spread in pulmonary adenocarcinoma. Am. J. Surg. Pathol. 39, 793–801 (2015).
Terada, Y. et al. Spread through air spaces is an independent predictor of recurrence in stage III (N2) lung adenocarcinoma. Interact. Cardiovasc. Thorac. Surg. 29, 442–448 (2019).
Aly, R. G. et al. Spread through air spaces (STAS) is prognostic in atypical carcinoid, large cell neuroendocrine carcinoma, and small cell carcinoma of the lung. J. Thorac. Oncol. 14, 1583–1593 (2019).
Lu, S. et al. Spread through air spaces (STAS) is an independent predictor of recurrence and lung cancer-specific death in squamous cell carcinoma. J. Thorac. Oncol. 12, 223–234 (2017).
Ikeda, T. et al. The epithelial–mesenchymal transition phenotype is associated with the frequency of tumor spread through air spaces (STAS) and a high risk of recurrence after resection of lung carcinoma. Lung Cancer 153, 49–55 (2021).
Chabon, J. J. et al. Integrating genomic features for non-invasive early lung cancer detection. Nature 580, 245–251 (2020).
Abbosh, C. et al. Tracking early lung cancer metastatic dissemination in TRACERx using ctDNA. Nature https://doi.org/10.1038/s41586-023-05776-4 (2023).
Ujiie, H. et al. Solid predominant histologic subtype in resected stage I lung adenocarcinoma is an independent predictor of early, extrathoracic, multisite recurrence and of poor postrecurrence survival. J. Clin. Oncol. 33, 2877–2884 (2015).
Nitadori, J.-I. et al. Impact of micropapillary histologic subtype in selecting limited resection versus lobectomy for lung adenocarcinoma of 2cm or smaller. J. Natl Cancer Inst. 105, 1212–1220 (2013).
Rekhtman, N. et al. SMARCA4-deficient thoracic sarcomatoid tumors represent primarily smoking-related undifferentiated carcinomas rather than primary thoracic sarcomas. J. Thorac. Oncol. 15, 231–247 (2020).
Schoenfeld, A. J. et al. The genomic landscape of SMARCA4 alterations and associations with outcomes in patients with lung cancer. Clin. Cancer Res. 26, 5701–5708 (2020).
Concepcion, C. P. et al. Smarca4 inactivation promotes lineage-specific transformation and early metastatic features in the lung. Cancer Discov. 12, 562–585 (2022).
Qian, J. & Massion, P. P. Role of chromosome 3q amplification in lung cancer. J. Thorac. Oncol. 3, 212–215 (2008).
Tonon, G. et al. High-resolution genomic profiles of human lung cancer. Proc. Natl Acad. Sci. USA 102, 9625–9630 (2005).
Moore, D. A. et al. In situ growth in early lung adenocarcinoma may represent precursor growth or invasive clone outgrowth-a clinically relevant distinction. Mod. Pathol. 32, 1095–1105 (2019).
Messal, H. A. et al. Tissue curvature and apicobasal mechanical tension imbalance instruct cancer morphogenesis. Nature 566, 126–130 (2019).
Yoshizawa, A. et al. Impact of proposed IASLC/ATS/ERS classification of lung adenocarcinoma: prognostic subgroups and implications for further revision of staging based on analysis of 514 stage I cases. Mod. Pathol. 24, 653–664 (2011).
Kadota, K. et al. A grading system combining architectural features and mitotic count predicts recurrence in stage I lung adenocarcinoma. Mod. Pathol. 25, 1117–1127 (2012).
Mäkinen, J. M. et al. Histological features of malignancy correlate with growth patterns and patient outcome in lung adenocarcinoma. Histopathology 71, 425–436 (2017).
von der Thüsen, J. H. et al. Prognostic significance of predominant histologic pattern and nuclear grade in resected adenocarcinoma of the lung: potential parameters for a grading system. J. Thorac. Oncol. 8, 37–44 (2013).
Bains, S. et al. Procedure-specific risk prediction for recurrence in patients undergoing lobectomy or sublobar resection for small (≤2 cm) lung adenocarcinoma: An international cohort analysis. J. Thorac. Oncol. 14, 72–86 (2019).
WHO Classification of Tumours Editorial Board Thoracic Tumours: WHO Classification of Tumours 5th edn (World Health Organization, 2021).
Rosenthal, R. et al. Neoantigen-directed immune escape in lung cancer evolution. Nature 567, 479–485 (2019).
AbdulJabbar, K. et al. Geospatial immune variability illuminates differential evolution of lung adenocarcinoma. Nat. Med. 26, 1054–1062 (2020).
Hendry, S. et al. Assessing tumor-infiltrating lymphocytes in solid tumors: a practical review for pathologists and proposal for a standardized method from the international immuno-oncology biomarkers working group: part 2: TILs in melanoma, gastrointestinal tract carcinomas, non-small cell lung carcinoma and mesothelioma, endometrial and ovarian carcinomas, squamous cell carcinoma of the head and neck, genitourinary carcinomas, and primary brain tumors. Adv. Anat. Pathol. 24, 311–335 (2017).
Denkert, C. et al. Standardized evaluation of tumor-infiltrating lymphocytes in breast cancer: results of the ring studies of the international immuno-oncology biomarker working group. Mod. Pathol. 29, 1155–1164 (2016).
Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).
Watkins, T. B. K. et al. Pervasive chromosomal instability and karyotype order in tumour evolution. Nature 587, 126–132 (2020).
Roth, A. et al. PyClone: statistical inference of clonal population structure in cancer. Nat. Methods 11, 396–398 (2014).
Burrell, R. A. et al. Replication stress links structural and numerical cancer chromosomal instability. Nature 494, 492–496 (2013).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. 57, 289–300 (1995).
Ferrer-Bonsoms, J. A., Jareno, L. & Rubio, A. Rediscover: an R package to identify mutually exclusive mutations. Bioinformatics https://doi.org/10.1093/bioinformatics/btab709 (2021).
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Korotkevich, G. et al. Fast gene-set enrichment analysis. Preprint at bioRxiv https://doi.org/10.1101/060012 (2021).
Liberzon, A. et al. The Molecular Signatures Database hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Subramanian, A. et al. Gene-set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Hänzelmann, S., Castelo, R. & Guinney, J. GSVA: gene-set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 14, 7 (2013).
Kawaguchi, S., Higasa, K., Shimizu, M., Yamada, R. & Matsuda, F. HLA-HD: an accurate HLA typing algorithm for next-generation sequencing data. Hum. Mutat. 38, 788–797 (2017).
Reynisson, B., Alvarez, B., Paul, S., Peters, B. & Nielsen, M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 48, W449–W454 (2020).
North, B. V., Curtis, D. & Sham, P. C. A note on the calculation of empirical P values from Monte-Carlo procedures. Am. J. Hum. Genet. 71, 439–441 (2002).
Sanchez-Vega, F. et al. Oncogenic signaling pathways in the cancer genome atlas. Cell 173, 321–337 (2018).
The TRACERx study (NCT01888601) is sponsored by University College London (UCL/12/0279) and has been approved by an independent Research Ethics Committee (13/LO/1546). TRACERx is funded by Cancer Research UK (CRUK; C11496/A17786) and coordinated through the CRUK & UCL Cancer Trials Centre, which has a core grant from CRUK (C444/A15953). We gratefully acknowledge the participants and relatives who participated in the TRACERx study. We thank all site personnel, investigators, funders and industry partners that supported the generation of the data within this study. In particular, we acknowledge the support of Scientific Computing, the Advanced Sequencing Facility and Experimental Histopathology departments at the Francis Crick Institute. This work was also supported by the CRUK Lung Cancer Centre of Excellence and the CRUK City of London Centre Award (C7893/A26233) and the UCL Experimental Cancer Medicine Centre. BioRender aided in the generation of Figs. 3a and 4c and Extended Data Figs. 1k and 6a. T.K. is supported by the JSPS Overseas Research Fellowships Program (202060447). D.A.M. is supported by the CRUK Lung Cancer Centre of Excellence (C11496/A30025). M.A.B. is supported by CRUK, the Rosetrees Trust and the Francis Crick Institute. T.B.K.W. is supported by the Francis Crick Institute, as well as the Marie Curie ITN Project PLOIDYNET (FP7-PEOPLE-2013, 607722), Breast Cancer Research Foundation, Royal Society Research Professorships Enhancement Award (RP/EA/180007) and the Foulkes Foundation. A. Huebner is supported by CRUK. A.M.F. is supported by Stand Up To Cancer (SU2C-AACR-DT23-17). H.Z. is supported by the China Scholarship Council for a 4-year PhD study and the Breast Cancer Research Foundation. F.G.-V. is supported by Generalitat Valenciana fellowships program (APOSTD/2021/168). M.D. is supported by CRUK and the Lung Cancer Centre of Excellence. C.M.-R. is supported by the Rosetrees Trust (M630) and Wellcome Trust. E.C. is supported by CRUK (TRACERx (C11496/A17786)) and the Francis Crick Institute. K.S.S.E. was supported by the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement no. 838540 and the Royal Society (RF\ERE\210216). S.H. is supported by CRUK and the Rosetrees Trust. C.T.H. has received funding from NIHR University College London Hospitals Biomedical Research Centre. S.Z. is a CRUK Career Development Fellow (award ref. RCCCDF-Nov21\100005) and is supported by Rosetrees Trust (grant ref. M917) and CRUK UCL Centre Non-Clinical Training Award (CANTAC721\100022). K.L. is funded by the UK Medical Research Council (MR/P014712/1 and MR/V033077/1), Rosetrees Trust and Cotswold Trust (A2437) and CRUK (C69256/A30194). N.J.B. is a fellow of the Lundbeck Foundation (R272-2017-4040) and acknowledges funding from Aarhus University Research Foundation (AUFF-E-2018-7-14) and the Novo Nordisk Foundation (NNF21OC0071483). E.L.C., J.D. and P.V.L. are supported by the Francis Crick Institute, which receives its core funding from CRUK (CC2008), the UK Medical Research Council (CC2008) and the Wellcome Trust (CC2008). P.V.L. is a Winton Group Leader in recognition of the Winton Charitable Foundation’s support toward the establishment of the Francis Crick Institute. P.V.L. is a CPRIT Scholar in Cancer Research and acknowledges CPRIT grant support (RR210006). P.S.A.’s laboratory work is supported by grants from the National Institutes of Health (P30 CA008748, R01 CA236615, R01 CA235667 and U01 CA214195), the US Department of Defense (CA180889 and CA200437), the DallePezze Foundation and the Derfner Foundation. P.S.A.’s laboratory receives research support from ATARA Biotherapeutics. D.R.J. is supported by National Institutes of Health grants R01CA217169, R01CA240472 and P30 CA008748 and the Hamilton Family Foundation. T.M. is supported by the UK National Institute of Health Research University College London Hospital Biomedical Research Centre. N.M. is a Sir Henry Dale Fellow, jointly funded by the Wellcome Trust and the Royal Society (grant no. 211179/Z/18/Z) and also receives funding from CRUK, Rosetrees and the National Institute for Health Research (NIHR) Biomedical Research Centre at University College London Hospitals and the CRUK University College London Experimental Cancer Medicine Centre. M.J.-H. is a CRUK Career Establishment Awardee and has received funding from CRUK, IASLC International Lung Cancer Foundation, Lung Cancer Research Foundation, Rosetrees Trust, UKI NETs and NIHR University College London Hospitals Biomedical Research Centre. C.S. is a Royal Society Napier Research Professor (RSRP\R\210001). C.S. is supported by the Francis Crick Institute that receives its core funding from CRUK (CC2041), the UK Medical Research Council (CC2041), and the Wellcome Trust (CC2041). For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission. C.S. is funded by CRUK (TRACERx (C11496/A17786), PEACE (C416/A21999) and CRUK Cancer Immunotherapy Catalyst Network); CRUK Lung Cancer Centre of Excellence (C11496/A30025); the Rosetrees Trust, Butterfield and Stoneygate Trusts; NovoNordisk Foundation (ID16584); Royal Society Professorship Enhancement Award (RP/EA/180007); NIHR University College London Hospitals Biomedical Research Centre; the CRUK University College London Centre; Experimental Cancer Medicine Centre; the Breast Cancer Research Foundation (US); and The Mark Foundation for Cancer Research Aspire Award (grant 21-029-ASP). This work was supported by a Stand Up To Cancer-LUNGevity-American Lung Association Lung Cancer Interception Dream Team Translational Research Grant (grant no. SU2C-AACR-DT23-17 to S. M. Dubinett and A. E. Spira). Stand Up To Cancer is a division of the Entertainment Industry Foundation. Research grants are administered by the American Association for Cancer Research, the Scientific Partner of SU2C. C.S. is in receipt of an ERC Advanced Grant (PROTEUS) from the European Research Council under the European Union’s Horizon 2020 research and innovation program (grant agreement no. 835297).
D.A.M. reports speaker fees from AstraZeneca, Eli Lilly and Takeda, consultancy fees from AstraZeneca, Thermo Fisher, Takeda, Amgen, Janssen, MIM Software, Bristol Myers Squibb (BMS) and Eli Lilly and has received educational support from Takeda and Amgen. S.V. is a co-inventor to a patent for detecting molecules in a sample (US patent no. 10578620). M.A.B. has consulted for Achilles Therapeutics. A.M.F. is co-inventor to a patent application to determine methods and systems for tumor monitoring (PCT/EP2022/077987). C.A. has received speaking honoraria or expenses from Novartis, Roche, AstraZeneca and Bristol Myers Squibb and reports employment at AstraZeneca. C.A. is an inventor on a European patent application relating to assay technology to detect tumor recurrence (PCT/GB2017/053289); the patent has been licensed to commercial entities and under their terms of employment C.A. is due a revenue share of any revenue generated from such license(s). C.A. declares a patent application (PCT/US2017/028013) for methods to detect lung cancer. C.A. is a named inventor on a patent application to determine methods and systems for tumor monitoring (PCT/EP2022/077987) and is a named inventor on a provisional patent protection related to a ctDNA detection algorithm. D.B. reports personal fees from NanoString and AstraZeneca, and has a patent PCT/GB2020/050221 issued on methods for cancer prognostication. K.S.S.E. received research grant funding from BMS. C.T.H. has received speaker fees from AstraZeneca. K.L. has a patent on indel burden and CPI response pending and speaker fees from Roche tissue diagnostics, research funding from CRUK TDL/Ono/LifeArc alliance, Genesis Therapeutics, and consulting roles with Monopteros Therapeutics and Kynos Therapeutics. N.J.B. is a co-inventor to a patent for identification of responders to cancer treatment (PCT/GB2018/051912), has a patent application (PCT/GB2020/050221) on methods for cancer prognostication and a patent on methods for predicting anti-cancer response (US14/466,208). E.L.C. is employed by and holds shares in Achilles Therapeutics. P.S.A. is a Scientific Advisory Board Member and Consultant for ATARA Biotherapeutics, Bayer, Carisma Therapeutics, Imugene, ImmPactBio, Johnston & Johnston, Orion and OutpaceBio; declares patents, royalties and intellectual property on mesothelin-targeted CARs and other T cell therapies, which have been licensed to ATARA Biotherapeutics, issued patent method for detection of cancer cells using virus, and pending patent applications on PD-1 dominant negative receptor, wireless pulse-oximetry device, and on an ex vivo malignant pleural effusion culture system. Memorial Sloan Kettering Cancer Center (MSK) has licensed intellectual property related to mesothelin-targeted CARs and T-cell therapies to ATARA Biotherapeutics, and has associated royalties. D.R.J. has consulted for AstraZeneca, is a member of Clinical Trial Steering Committee for Merck, is a speaker for DAVA Oncology and is a member of the NCI Board of Scientific Advisors. N.R. has served on the Advisory Board for Merck, and is a member of the NCI Thoracic Malignancies Steering Committee. W.D.T. is a non-paid consultant for the LCMC3 and LCMC4 neoadjuvant clinical trials. A. Hackshaw has received fees for being a member of Independent Data Monitoring Committees for Roche-sponsored clinical trials, and academic projects coordinated by Roche. R.S. reports nonfinancial support from Merck and BMS, research support from Merck, Puma Biotechnology and Roche, and personal fees from Roche, BMS and Exact Sciences for advisory boards. A.G.N. reports personal fees from Merck, Boehringer Ingelheim, Novartis, AstraZeneca, BMS, Roche, Abbvie, Oncologica, Uptodate, European Society of Oncology, Takeda UK and Liberium, and personal fees and grants from Pfizer. N.M. has stock options in and has consulted for Achilles Therapeutics and holds European patents relating to targeting neoantigens (PCT/EP2016/059401), identifying patient response to immune checkpoint blockade (PCT/EP2016/071471), determining HLA LOH (PCT/GB2018/052004) and predicting survival rates of patients with cancer (PCT/GB2020/050221). M.J.-H. has consulted for, and is a member of, the Achilles Therapeutics Scientific Advisory Board and Steering Committee, has received speaker honoraria from Pfizer, Astex Pharmaceuticals and Oslo Cancer Cluster, and is co-inventor on a European patent application relating to methods to detect lung cancer (PCT/US2017/028013). This patent has been licensed to commercial entities and under terms of employment M.J.-H. is due a share of any revenue generated from such license(s). C.S. acknowledges grant support from AstraZeneca, Boehringer Ingelheim, BMS, Pfizer, Roche-Ventana, Invitae (previously Archer Dx; collaboration in minimal residual disease sequencing technologies) and Ono Pharmaceutical. C.S. is an AstraZeneca Advisory Board member and Chief Investigator for the AZ MeRmaiD 1 and 2 clinical trials and is also co-chief investigator of the NHS Galleri trial funded by GRAIL and a paid member of GRAIL’s Scientific Advisory Board. C.S. receives consultant fees from Achilles Therapeutics (also SAB member), Bicycle Therapeutics (also a SAB member), Genentech, Medicxi, Roche Innovation Centre – Shanghai, Metabomed (until July 2022) and the Sarah Cannon Research Institute. C.S. had stock options in Apogen Biotechnologies and GRAIL until June 2021, and currently has stock options in Epic Bioscience, Bicycle Therapeutics, and has stock options in and is co-founder of Achilles Therapeutics. C.S. is an inventor on a European patent application relating to assay technology to detect tumor recurrence (PCT/GB2017/053289); the patent has been licensed to commercial entities and under terms of employment C.S. is due a revenue share of any revenue generated from such license(s). C.S. holds patents relating to targeting neoantigens (PCT/EP2016/059401), identifying patient response to immune checkpoint blockade (PCT/EP2016/071471), determining HLA LOH (PCT/GB2018/052004), predicting survival rates of patients with cancer (PCT/GB2020/050221), identifying patients who respond to cancer treatment (PCT/GB2018/051912), US patent relating to detecting tumor mutations (PCT/US2017/28013), methods for lung cancer detection (US20190106751A1) and both a European and US patent related to identifying insertion/deletion mutation targets (PCT/GB2018/051892). C.S. is a co-inventor to a patent application to determine methods and systems for tumor monitoring (PCT/EP2022/077987) and is a named inventor on a provisional patent protection related to a ctDNA detection algorithm. The other authors declare no competing interests.
Peer review information
Nature Medicine thanks Timothy Chan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Joao Monteiro, in collaboration with the Nature Medicine team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended Data Fig. 1 Histopathological assessment of the TRACERx 421 LUAD cohort.
a. Definition and categorisation of LUAD growth patterns. b. Representative haematoxylin and eosin (H&E) image of invasive mucinous adenocarcinoma (IMA). c–h. Representative H&E images of lepidic (c), papillary (d), acinar (e), cribriform (f), micropapillary (g), and solid (h) patterns observed in LUAD. Scale bars, 250 μm. i. Number of tumors by histological entity in the TRACERx 421 cohort and further annotation of the number of LUAD tumors by predominant subtype. j. Number of regions with growth pattern assessment. k. Schematic of histological assessment in the TRACERx study. Proportions of each subtype in the diagnostic slides were reported, and the predominant subtype was used to label each tumor. Multiregional sampling specimens were processed for whole-exome sequencing and a subset was subjected to RNA-sequencing. Each region was annotated with the representative growth pattern. l. Overview of the TRACERx 421 LUAD cohort. Fetal adenocarcinoma, colloid adenocarcinoma, and two tumors from a collision tumor, determined by genomic analysis, were not included in the plot (n = 244 tumors from 240 participants are shown). Each column represents one tumor. The proportion of each growth pattern (subtype) based on diagnostic sectional area, genomic variables, and Ki-67 fraction by immunohistochemical staining are summarised. Missing data are shown in white. WGD, whole genome doubling; TMB, tumor mutational burden; wGII, weighted genome instability index; FLOH, fraction of the genome subject to loss of heterozygosity; SCNA, somatic copy number alteration; ITH, intra-tumoral heterogeneity.
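As a rough illustration of one of the genomic variables abbreviated above, the sketch below computes a fraction of the genome subject to loss of heterozygosity (FLOH) from allele-specific copy number segments. The segment layout `(start, end, major_cn, minor_cn)`, the function name `fraction_loh`, and the rule that LOH means a minor allele copy number of 0 with at least one remaining copy are all assumptions made for this example; the caption does not specify how the TRACERx pipeline computes the value.

```python
def fraction_loh(segments):
    """Fraction of covered genome length in LOH.

    segments: iterable of (start, end, major_cn, minor_cn) tuples
    (hypothetical layout). A segment counts as LOH when the minor allele
    is absent but the major allele is still present (assumed definition).
    """
    total_length = 0
    loh_length = 0
    for start, end, major_cn, minor_cn in segments:
        length = end - start
        total_length += length
        if minor_cn == 0 and major_cn > 0:
            loh_length += length
    # Avoid division by zero when no segments are supplied.
    return loh_length / total_length if total_length else 0.0


# Toy example: half of the covered length has lost one parental allele.
toy_segments = [
    (0, 100, 1, 1),    # heterozygous diploid
    (100, 150, 2, 0),  # copy-neutral LOH
    (150, 200, 1, 0),  # LOH by deletion
]
toy_floh = fraction_loh(toy_segments)
```

Averaging such per-region values across the regions of a tumor would give a per-tumor mean comparable in spirit to the mean FLOH mentioned in the Extended Data captions, again under the assumptions stated here.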
Extended Data Fig. 2 Genomic correlates of LUAD predominant subtypes.
a, b. Comparison of intra-tumoral heterogeneity (ITH) metrics calculated using the TRACERx analytical pipeline vs orthogonal methods. (a) Comparison of the mean fraction of the genome subject to loss of heterozygosity (FLOH), mean weighted genome instability index (wGII), and somatic copy number alteration ITH (SCNA ITH) by SCNA profiles generated by the TRACERx pipeline (based on ASCAT (ref. 43) with an additional multi-sample SCNA estimation approach (refs. 9,44)) against SCNA profiles generated by Sequenza (ref. 12) (n = 224 tumors) and (b) a comparison of truncal tumor mutational burden (TMB) using the TRACERx pipeline (clonality inferred by the modified version of PyClone (ref. 45)) vs ubiquitous TMB (n = 224 tumors). Blue lines and shaded areas represent linear regression lines with 95% confidence intervals (CIs). Pearson’s correlation coefficient and P value are shown. c. Correlation of genomic variables calculated using orthogonal methods and the proportion of high-grade patterns within each tumor. The color scale reflects Spearman’s rank correlation coefficient (rho). Correlation P values were corrected for multiple testing according to the Benjamini-Hochberg (BH) method and asterisks indicate q value ranges * q < 0.05, ** q < 0.01, *** q < 0.001, **** q < 0.0001. d–f. Frequency of truncal driver mutations (d), truncal driver gene amplifications (AMP; e), and truncal chromosomal arm level SCNAs (gain or loss/LOH; f) in LUAD predominant subtypes. Recurrent truncal alterations observed in more than 5% of the tumors in the cohort are shown. Asterisks represent the alterations observed in fewer than 10 tumors in both predominantly high-grade and predominantly low-/mid-grade tumors. The color scale represents the frequency of the alteration observed within each subtype. Tumors with multi-regional SCNA data were included in the analysis (n = 224). g–i. Adjusted odds ratios with 95% CIs of truncal genomic alterations associated with the predominance of high-grade (n = 79 tumors) or low-/mid-grade patterns (n = 116 tumors). Genomic alterations selected by the model simplification are shown when (g) truncal alterations observed in more than 10% of the tumors in the cohort are included in the analysis, or when (h) SCNA profiles generated by Sequenza (ref. 12) are used, or when (i) wGII is added to the model shown in Fig. 2a. Asterisks indicate type II ANOVA P value ranges * P < 0.05, ** P < 0.01, *** P < 0.001. The color represents the type of genomic alteration. Statistically significant alterations (P < 0.05) are indicated in bold. j, k. Mutual exclusivity and co-occurrence of truncal driver gene alterations and chromosome arm SCNAs specific to either predominantly high-grade or low-/mid-grade tumors when (j) truncal alterations observed in more than 10% of the tumors in the cohort are included in the analysis, or when (k) SCNA profiles generated by Sequenza (ref. 12) are used. Truncal alterations with unadjusted P < 0.05 for both predominantly high- and low-/mid-grade tumors are filtered out. Color of the edge represents the relationship (mutual exclusivity vs co-occurrence) and the negative log of the q value (BH method) is represented in blue color scale in predominantly low-/mid-grade tumors and red color scale in predominantly high-grade tumors. Relationships with q < 0.1 are shown and asterisks indicate q value ranges * q < 0.05, ** q < 0.01. Covariates in statistically significant relationships (q < 0.05) are indicated in bold. l. Comparison of mean wGII between tumors with (n = 20) and without (n = 96) co-occurrence of truncal loss/LOH of chromosomes 3p and 3q in predominantly low-/mid-grade tumors. P value was calculated using a two-sided Wilcoxon rank-sum test.
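The Benjamini-Hochberg (BH) correction and the asterisk ranges used in this caption can be sketched in a few lines of standard-library Python. This is a generic implementation of the BH step-up adjustment, not the authors' code; the function names `benjamini_hochberg` and `q_to_asterisks` and the example P values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q values in the same order as the input P values."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def q_to_asterisks(q):
    """Map a q value to the asterisk ranges quoted in the caption."""
    for threshold, label in ((0.0001, "****"), (0.001, "***"),
                             (0.01, "**"), (0.05, "*")):
        if q < threshold:
            return label
    return ""


# Hypothetical P values for four correlation tests.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
example_labels = [q_to_asterisks(q) for q in example_q]
```

Working from the largest P value downward keeps each q value no larger than the one above it in rank, which is the monotonicity step that distinguishes BH-adjusted values from a simple `p * m / rank` scaling.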
Extended Data Fig. 3 SCNA and immune characteristics of LUAD predominant subtypes.
a. Across-genome plots showing the frequency of truncal and subclonal SCNAs of low-/mid-grade predominant tumors (top) and high-grade predominant tumors (bottom). Within each tumor type, the proportion of patients with gains or amplifications (top) and loss/LOH events (bottom) for each chromosome is shown. The black line indicates the total (namely the sum of truncal and subclonal) proportion of tumors with SCNAs; the yellow and grey lines or shades indicate the proportion of tumors with subclonal and truncal gains, respectively. b, c. Comparison of ploidy adjusted mean copy number of chromosomal arm and driver genes between high-grade and low-/mid-grade predominant tumors, using SCNA profiles generated by Sequenza12 (b), and adding wGII to the regression model (c). Fixed effect coefficients of the linear mixed-effects model with tumor as a random effect are displayed on the x-axis, and the negative log of the q value (BH method) is displayed on the y-axis. The color represents the sign or the mean ploidy adjusted copy number, stratified with high-grade and low-/mid-grade predominance. Data points with q value ≥ 0.05 are colored in gray. Horizontal red dashed line represents q = 0.05. d. Number of genes with differential SCNA between high-grade and low-/mid-grade predominant tumors. Significantly increased copy number in high-grade predominant tumors compared with low-/mid-grade predominant tumors: G2M checkpoint genes, 13/186; all genes, 2169/23657 (P = 0.36, chi-square test). e. Comparison of PD-L1 expression on cancer cells measured by immunohistochemistry (left; n = 197 tumors) and stromal tumor infiltrating lymphocyte (TIL) scores (right; n = 228 tumors) across LUAD predominant subtypes. Each predominant subtype was compared against all other subtype tumors. P values were calculated using a two-sided Wilcoxon rank-sum test and corrected for multiple testing according to the BH method. Asterisks indicate q value ranges ** q < 0.01, **** q < 0.0001. 
Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range. f. Adjusted odds ratios with 95% confidence intervals (CIs) for cancer cell PD-L1 positivity (≥ 1%) per tumor estimated by multivariable logistic regression model (PD-L1 positive, n = 75 tumors; PD-L1 negative, n = 102 tumors). Asterisks indicate type II ANOVA P value ranges * P < 0.05, *** P < 0.001. Statistically significant (P < 0.05) covariates are indicated in bold.
Extended Data Fig. 4 Morphological intra-tumoral heterogeneity reflects genomic heterogeneity.
a-c. Genomic distance between regions calculated by (a) LOH detected by Sequenza (n = 53 tumors) and genomic distance calculated by (b) mutation and (c) LOH only including tumor regions with purity ≥ 0.4 (n = 30 tumors). Each point represents a distance between a pair of regions in a tumor. Tumors containing both different subtype pair(s) and same subtype pair(s) are included in the analysis. Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range. P values were calculated using a linear mixed-effects model, with tumor as a random effect. d. Comparison of tumor mutational burden (TMB) between ancestor-like and descendant-like regions (151 regional pairs in 54 tumors). Each line represents an ancestor-descendant-like regional pair. Each point represents one region and the plotted points were duplicated for regions associated with multiple ancestor-descendant-like pairs within a tumor. To assess the mutational burden shared in the majority of the cancer cells in the region, mutations with estimated cancer cell fraction (CCF) ≥ 95% were counted. Enrichment of higher TMB in descendant-like regions compared with the paired ancestor-like regions was evaluated by permutation test (1000 permutations, randomising TMB within each tumor, Monte Carlo procedure). e. Comparison of growth pattern by grades (left) and by the six growth patterns (right) between inferred ancestral-like and descendant-like regions. Tumors with single grades are included in the analysis. The color represents the transition of grade from ancestral-like to descendant-like region. Enrichment of lower-to-higher grade transition (upward transition) was evaluated using a permutation test (1000 permutations, randomising growth patterns within each tumor, Monte Carlo procedure). f. Comparison of regional growth pattern grade in ancestor-descendant-like pairs, inferred by various cutoffs of private LOH branch length proportion. 
All combinations of cutoff for ancestor-like and descendant-like inference shown in the figure yielded empirical P value < 0.05 (1000 permutations, Monte Carlo procedure) when the enrichment of lower-to-higher grade transition (upward transition) was tested. P values were not adjusted for the multiple comparisons shown in this panel. g-h. Comparison of regional pattern grade in ancestor-descendant-like pairs, inferred by (g) LOH profile generated by Sequenza12 and (h) both LOH profile and mutational profile (CCF ≥ 95%). Enrichment of lower-to-higher grade transition (upward transition) was evaluated using a permutation test (1000 permutations, randomising growth patterns within each tumor, Monte Carlo procedure).
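The within-tumor permutation test for upward grade transitions described in these panels can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: grades are encoded as integers (1 = low, 2 = mid, 3 = high), the test statistic is the number of ancestor-to-descendant pairs with a higher grade in the descendant-like region, and the empirical P value uses the common (hits + 1) / (permutations + 1) form, which the legend does not specify. All names and the toy data are hypothetical.

```python
import random


def count_upward(pairs, grade):
    """Count ancestor->descendant pairs whose grade increases."""
    return sum(grade[desc] > grade[anc] for anc, desc in pairs)


def permutation_p(regions_by_tumor, pairs, grade, n_perm=1000, seed=0):
    """Empirical one-sided P value for enrichment of upward transitions.

    Grades are shuffled among the regions of each tumor separately, so the
    per-tumor grade composition is preserved in every permutation.
    """
    rng = random.Random(seed)
    observed = count_upward(pairs, grade)
    hits = 0
    for _ in range(n_perm):
        shuffled = {}
        for regions in regions_by_tumor.values():
            grades = [grade[r] for r in regions]
            rng.shuffle(grades)
            shuffled.update(zip(regions, grades))
        if count_upward(pairs, shuffled) >= observed:
            hits += 1
    # Assumption: add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Toy data (not study data): 8 tumors, each with one low-grade
# ancestor-like region (R1) and one high-grade descendant-like region (R2)
regions_by_tumor = {f"T{i}": [f"T{i}_R1", f"T{i}_R2"] for i in range(8)}
grade = {}
pairs = []
for tumor, (anc, desc) in regions_by_tumor.items():
    grade[anc], grade[desc] = 1, 3
    pairs.append((anc, desc))

print(count_upward(pairs, grade), permutation_p(regions_by_tumor, pairs, grade))
```

Shuffling within each tumor, rather than across the cohort, keeps between-tumor differences in grade composition from inflating the apparent enrichment.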
Extended Data Fig. 5 Characterization of purely (homogeneously) solid tumors.
a. Comparison of G2M checkpoint gene expression in solid-pattern regions within purely solid tumors and mixed pattern tumors as defined by both diagnostic and regional growth pattern assessment. Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range. P value was calculated using a linear mixed-effects model, with tumor as a random effect. b-e. Proportion of tumors which are purely solid, mixed pattern with solid component, and without any solid component, compared between tumors with and without truncal gain of chromosome arm 3q (b), truncal gain of arm or focal 3q (3q21.3-3q29) (c), truncal SMARCA4 mutation and/or LOH (d), and compared across the tumors stratified by truncal SMARCA4 mutation and LOH status (e). P value was calculated using a two-sided Fisher’s exact test. f. Comparison of the frequency of truncal copy number gain of chromosome arm 3q between mixed pattern tumors with solid component and purely solid tumors. A two-sided Fisher’s exact test was used. g-i. Comparison of the frequency of (g) copy number gain of chromosome arm 3q, (h) gain of arm or focal 3q (3q21.3-3q29), and (i) SMARCA4 mutation and/or LOH between mixed pattern tumors with solid component and purely solid tumors using SCNA profiles generated by Sequenza12. A two-sided Fisher’s exact test was used.
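For readers unfamiliar with the two-sided Fisher's exact test used in panels b-i, the following minimal Python sketch computes it for a 2×2 table from the hypergeometric distribution. It uses the common definition that sums the probabilities of all tables no more likely than the observed one; whether the authors' software uses exactly this definition is an assumption. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum tables as or less probable than the observed one; the small
    # relative tolerance guards against floating-point ties
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts: rows = purely solid vs mixed with solid component,
# columns = truncal alteration present vs absent
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```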
Extended Data Fig. 6 Analysis of morphology and genomics in metastasis samples.
a. Schematic of primary and secondary lung tumors in CRUK0296. Phylogenetic analysis confirmed the contralateral lung lesion to be a metastasis from the primary tumor resected three years earlier. STAS was positive in the primary tumor. b. Phylogenetic tree of a case with a lung metastasis of pure lepidic appearance (CRUK0296). Blue nodes are mutation clusters found only in the metastasis and gray nodes represent mutation clusters found in the primary tumor. Regional growth pattern is indicated in brackets; R, region. c. Representative H&E slide of the primary tumor of CRUK0296 showing tumor border (arrowheads) and STAS (arrow). d. Representative H&E slide of the metastatic tumor in the contralateral lung of CRUK0296, which showed a pure lepidic pattern. Scale bars, 250 μm. e. Characteristics of five patients having lung recurrence samples sequenced and one patient having an intrapulmonary metastasis resected and sequenced at the time of primary surgery. All six patients showed positive STAS in the primary tumors and phylogenetic analysis revealed late metastatic divergence. f. Proportion of the timing of seeding clone divergence across predominant subtypes of primary tumors. g. Frequency of late or early divergence of the metastatic clone compared between tumors with and without STAS. A two-sided Fisher’s exact test was used.
Extended Data Fig. 7 Characterization of tumors with STAS and preoperative ctDNA shedding.
a. Overview of the TRACERx 421 LUAD cohort, ordered by the positivity of STAS, pre-operative ctDNA detection, and the site of the relapse (n = 223 patients). Participants with synchronous primary lung cancers were excluded. Colloid and fetal adenocarcinomas are included (predominant subtype = Other). Each column represents a participant. IMA, invasive mucinous adenocarcinoma; LVI, lymphovascular invasion; PL, pleural invasion. Tumors that did not relapse before death or the development of a new primary cancer are treated as no recurrence (No rec). b. Kaplan-Meier (KM) curve of disease-free survival (DFS), comparing STAS present vs absent. Numbers at risk are described at the bottom. Unadjusted and adjusted hazard ratio (HR) (adjusted for age, pathological stage, smoking pack-years, surgery type, and receipt of adjuvant therapy) are shown. c. STAS positivity across predominant subtypes of the primary tumor. d. Histopathological features associated with STAS positivity (left) and pre-operative ctDNA detection (right). Negative log of the q values (BH method) obtained by multiple univariable logistic regression analyses are presented. Vertical dotted lines represent q = 0.05, and variables with q < 0.05 are presented in points with colors which represent the direction of the correlation. e. Frequency of driver mutations in 10 canonical oncogenic signalling pathways63 in STAS present and absent tumors. P values calculated by two-sided Fisher’s exact test were corrected for multiple testing according to the BH method and the asterisk indicates q value range * q < 0.05. f. Comparison of CTNNB1 gene expression (variance stabilisation normalised count) between STAS absent (n = 72 tumors) and present tumors (n = 114 tumors). Each point represents a tumor region (STAS absent, n = 169 regions; present, n = 289 regions). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range. 
P value was calculated using a linear mixed-effects model, with tumor as a random effect. g. Gene set enrichment analysis (GSEA) of Hallmark gene sets between STAS absent (n = 72 tumors) and present tumors (n = 112 tumors). Normalised enrichment score is displayed on the x-axis and indicates the enrichment for a given gene set. Gene sets with q < 0.25 (BH method) are described in gray. h. Pre-operative ctDNA positivity across predominant subtypes of the primary tumor.
Extended Data Fig. 8 Impact of predominant subtype upon site and risk of recurrence.
a. KM curve of DFS, comparing participants with predominantly high-grade tumors vs low-/mid-grade tumors. Numbers at risk are described at the bottom. Unadjusted and adjusted HRs (adjusted for age, stage, pack-years, surgery type, and adjuvant therapy) are shown. b. Frequency of the relapse site (intra- and/or extra-thoracic) across (left) predominant subtypes and (right) grades of the predominant subtype of the primary tumor. Tumors that did not relapse before death or the development of a new primary cancer are treated as no recurrence (No rec). c. Relapse-site specific (sub-distribution) HRs for predominantly high-grade tumors compared with low-/mid-grade tumors, adjusted for age, stage, pack-years, surgery type, and adjuvant therapy, are presented with 95% CIs on a logarithmic scale (n = 185 participants). Participants with synchronous primary lung cancers or uncertain sites of relapse were excluded from the analysis. P < 0.05 are described in red. No corrections were made for multiple comparisons.
Extended Data Fig. 9 Impact of STAS, pre-operative ctDNA positivity, and necrosis upon site and risk of recurrence.
a, b. Frequency of the relapse site (intra- and/or extra-thoracic), stratified by the positivity of STAS and pre-operative ctDNA detection. Pre-operative ctDNA data were based on (a) the assay previously reported by Abbosh et al.8 (TRACERx100 cohort) and (b) the assay reported in our companion manuscript28 (TRACERx421 cohort), including 7 participants who underwent both assays in each cohort. Tumors that did not relapse before death or the development of a new primary cancer are treated as no recurrence (No rec). c. KM curves of DFS, stratified by the positivity of STAS and pre-operative ctDNA detection in participants with (top) pathological stage I disease and (bottom) stage II/III disease. HRs were adjusted for age, pathological stage, smoking pack-years, and receipt of adjuvant therapy. Surgery type was also added as a covariate for stage I disease but not for stage II/III disease, because only 1 participant underwent sublobar resection for stage II/III disease. Numbers at risk are described below the KM curves. d. Positivity of STAS and pre-operative ctDNA detection are incorporated with other tumor and clinical characteristics in a multivariable Cox proportional hazards model (DFS). HRs of each variable with 95% confidence intervals (CIs) are shown on the horizontal axis. e. Positivity of necrosis across predominant subtypes of the primary tumor. f. Frequency of the relapse site (intra- and/or extra-thoracic), stratified by the presence of STAS and necrosis in all LUADs. g. Relapse-site specific (sub-distribution) HRs for positivity of necrosis, adjusted for age, stage, pack-years, surgery type, and adjuvant therapy, are presented with 95% CIs on a logarithmic scale (n = 211 participants). Participants with synchronous primary lung cancers or uncertain site of relapse were excluded from the analysis. P < 0.05 are shown in red. No corrections were made for multiple comparisons. h. 
Positivity of STAS and necrosis are incorporated with other tumor and clinical characteristics in a multivariable Cox proportional hazards model for DFS. HRs of each variable with 95% CIs are shown. i. KM curve of DFS, stratified by the positivity of STAS and the presence of necrosis. HRs were adjusted for age, stage, pack-years, surgery type, and adjuvant therapy. Numbers at risk are described at the bottom.
Extended Data Fig. 10 External validation of the impact of STAS and necrosis on disease-free survival.
a. Summary of patient demographics and clinical characteristics of the Memorial Sloan Kettering Cancer Center cohort (n = 712 patients). b. Positivity of STAS and necrosis are incorporated with other tumor and clinical characteristics in a multivariable Cox proportional hazards model of DFS. c. KM curve of DFS, stratified by the positivity of STAS and the presence of necrosis (n = 712 patients). Hazard ratios were adjusted for age, pathological stage, smoking pack-years, surgery type, and receipt of adjuvant therapy. Numbers at risk are described at the bottom.
Supplementary Tables 1–4.
Cite this article
Karasaki, T., Moore, D.A., Veeriah, S. et al. Evolutionary characterization of lung adenocarcinoma morphology in TRACERx. Nat Med 29, 833–845 (2023). https://doi.org/10.1038/s41591-023-02230-w