The evolutionary history of lethal metastatic prostate cancer

Journal name: Nature
Volume: 520
Pages: 353–357
DOI: 10.1038/nature14347

Cancers emerge from an ongoing Darwinian evolutionary process, often leading to multiple competing subclones within a single primary tumour [1, 2, 3, 4]. This evolutionary process culminates in the formation of metastases, which are the cause of 90% of cancer-related deaths [5]. However, despite its clinical importance, little is known about the principles governing the dissemination of cancer cells to distant organs. Although the hypothesis that each metastasis originates from a single tumour cell is generally supported [6, 7, 8], recent studies using mouse models of cancer demonstrated the existence of polyclonal seeding from, and interclonal cooperation between, multiple subclones [9, 10]. Here we sought definitive evidence for the existence of polyclonal seeding in human malignancy and to establish the clonal relationship among different metastases in the context of androgen-deprived metastatic prostate cancer. Using whole-genome sequencing, we characterized multiple metastases arising from prostate tumours in ten patients. Integrated analyses of subclonal architecture revealed the patterns of metastatic spread in unprecedented detail. Metastasis-to-metastasis spread was found to be common, either through de novo monoclonal seeding of daughter metastases or, in five cases, through the transfer of multiple tumour clones between metastatic sites. Lesions affecting tumour suppressor genes usually occur as single events, whereas mutations in genes involved in androgen receptor signalling commonly involve multiple, convergent events in different metastases. Our results elucidate in detail the complex patterns of metastatic spread and further our understanding of the development of resistance to androgen-deprivation therapy in prostate cancer.

Figures

  1. Figure 1: n-D Dirichlet process clustering reveals widespread polyclonal seeding in A22.

    a, For pairs of metastases, cancer cell fractions (CCF), that is, the fraction of cancer cells within a sample containing a mutation, are plotted for all the substitutions detected in the WGS data. Red density areas off the axes and with CCF > 0 and < 1 reveal the existence of mutation clusters present at subclonal levels in more than one metastatic site. Mutation clusters for each sample are indicated with circles coloured according to the subclone they correspond to (Supplementary Table 3). The centre of each circle is positioned at the CCF values of the subclone in the two samples. The clusters at (1,1) correspond to the mutations present in all the cells in both sites (CCF = 1) while those on axes refer to sample-specific subclones. For example, light blue and dark green clusters absent from sample A are positioned on the y axis when H is compared to A but are moved to (0.60,0.08) and (0.60,0.88) when H is compared to K. b, Each subclone detected in A22 is represented as a set of colour-coded ovals across all organ sites (Supplementary Table 3). Each row represents a sample, with ovals in the far left column nested if required by the pigeonhole principle (see Supplementary Information). The area of the ovals is proportional to the CCF of the corresponding subclone. Subclonal mutation clusters are shown with solid borders. Oval plots are divided into three types: trunk (CCF = 1 in all samples), leaf (specific to a single sample) and branch (present in >1 sample and either not found in all samples or subclonal in at least one). BM, bone marrow; hum., humerus; L., left; LN, lymph node; R., right; Sem., seminal. c, Phylogenetic tree showing the relationships between subclones in A22. Branch lengths are proportional to the number of substitutions in each cluster. Branches are annotated with samples in which they are present and with oncogenic/putative oncogenic alterations assigned to that subclone. 
amp, amplification; LOH, loss of heterozygosity; MRCA, most recent common ancestor. d, Subclone colour key.

  2. Figure 2: Subclonal structure within 10 metastatic lethal prostate cancers.

    All the subclones identified in the whole-genome sequenced samples are shown as phylogenetic trees and oval plots (as described in Fig. 1). Patients with polyclonal seeding (A34, A22, A31, A32 and A24) are on the right (amp: amplification). Abd. para., abdominal paraaortic; e. splice, essential splice; diaph., diaphragm; HD, homozygous deletion; ing., inguinal; subclav., subclavicular; super., superficial.

  3. Figure 3: Metastasis-to-metastasis seeding occurs either by a linear or by a branching pattern of spread.

    a–c, Body maps show the seeding of all tumour sites from A22 (a), A21 (b) and A24 (c). Sites shown include samples subject to targeted sequencing (A22-L, A24-F, A24-G) in addition to WGS samples. Seeding events are represented with arrows colour-coded according to Supplementary Table 3 and with double-heads when seeding could be in either direction. When the sequence of events may be ordered from the acquisition of mutations, arrows are numbered chronologically. Subclones on branching clonal lineages are labelled with the same number but with different letters, for example, 4a and 4b. See Supplementary Information section 4e for a detailed discussion of the body map in these cases. ligam., ligament; GL, Gleason grade; EPE, extraprostatic extension.

  4. Figure 4: Drivers of tumorigenesis are truncal while drivers of castration resistance are convergent.

    a, Proportion of trunk, branch and leaf mutations in each sample. b, Heat map of oncogenic alterations present on the trunk (top) or off the trunk, that is, on branches or leaves (bottom). Alterations in oncogenes and tumour suppressors are shown in red and blue, respectively, with shade indicating the number of events in that patient. Focal deletions and substitutions/indels are shown with crosses and stars, respectively. Double crosses indicate homozygous deletions resulting from deletions of both alleles. c, Continuous selective pressure on AR signalling is observed in the form of multiple rearrangements resulting in multiple copy number increases at the AR locus within the same patient. Chromosomal rearrangements are plotted on top of the genome-wide copy number for each of the 4 WGS samples from A24. Rearrangements are coloured according to the colour code in Supplementary Table 3. Arcs above and below the top vertical line indicate deletion and tandem duplication events, while arcs above and below the second vertical line are head-to-head and tail-to-tail inversions, respectively.

  5. Extended Data Fig. 1: Variants identified in 51 whole-genome sequenced samples from 10 patients.

    a–c, Numbers of insertions/deletions (a), high-confidence substitutions (b) and chromosomal rearrangements (c) are plotted across all the samples from the 10 patients whose whole genomes were sequenced.

  6. Extended Data Fig. 2: Validation of the subclonal hierarchies in A22.

    The primary means of validation was a deep sequencing validation experiment that included selected substitutions and indels from each sample, as described in Extended Data Table 2 and Supplementary Information section 2b. In addition, indels and rearrangements identified in WGS represent data sets orthogonal to the substitution data from which the subclones were identified. The subsets of samples in which validated substitutions, indels and rearrangements are found correlate strongly with the subclonal clusters identified from the clustering of substitutions from WGS, providing support for the existence of these subclones. a, b, For each patient, hierarchical clustering of the variant allele fraction (VAF) was performed separately for substitutions (a) and indels (b). VAFs are represented as a heat map with deeper shades of red indicating a higher proportion of reads reporting the mutant allele. Above each heat map, mutations are colour-coded according to the subclone they were assigned to by Dirichlet process clustering of WGS data in the case of substitutions or by VAF for indels. Indels that could not be assigned to any cluster are annotated with black. For A22, additional samples not subject to WGS were included in the validation experiment. c, For these patients the phylogenetic tree from Fig. 2 was modified to incorporate these additional samples. d–f, Numbers of substitutions assigned to each subclone (d) and numbers of indels (e) and rearrangements (f) present in different subsets of samples are plotted as bar charts. g, VAFs from whole-genome sequencing and validation data, plotted as scatter plots, are very highly correlated. h, Subclone colour key.

  7. Extended Data Fig. 3: Validation of the subclonal hierarchies in A31 and A32.

    Validation strategy as described in Extended Data Fig. 2. For A31 and A32, hierarchical clustering of the VAF was performed separately for substitutions (a) and (j) and indels (b) and (k). Heat maps are annotated as described in Extended Data Fig. 2. Additional samples for A31 and A32 are incorporated into the phylogenetic trees (c) and (l). Subclones for A31 CD and A32 CE are annotated in the corresponding 2d-DP plots (d) and (m). Numbers of substitutions in WGS data assigned to each subclone are plotted in (e) and (n). VAFs from WGS and validation data, plotted as scatter plots (f) and (o), are very highly correlated. Numbers of indels (g) and (p) and rearrangements (h) and (q) present in different subsets of samples are plotted as bar charts. Subclone colour keys for A31 and A32 are shown in (i) and (r), respectively.

  8. Extended Data Fig. 4: Validation of the subclonal hierarchies in A24 and A34.

    Validation strategy as described in Extended Data Fig. 2. For A24 and A34, hierarchical clustering of the VAF was performed separately for substitutions (a) and (i) and indels (b) and (j). Heat maps are annotated as described in Extended Data Fig. 2. Indels that could not be assigned to any cluster (if any) are annotated with black. Additional samples for A24 and A34 are incorporated into the phylogenetic trees (c) and (k). The additional cluster in A24, supported by rearrangements only, is indicated by a light green branch in the tree. Numbers of substitutions in WGS data assigned to each subclone are plotted in (d) and (l). VAFs from WGS and validation data, plotted as scatter plots (e) and (m), are very highly correlated. Numbers of indels (f) and (n) and rearrangements (g) and (o) present in different subsets of samples are plotted as bar charts. Subclone colour keys for A24 and A34 are shown in (h) and (p), respectively.

  9. Extended Data Fig. 5: Validation of the subclonal hierarchies in A10 and A29.

    Validation strategy as described in Extended Data Fig. 2. For A10 and A29, hierarchical clustering of the VAF was performed separately for substitutions (a) and (h) and indels (b) and (i). Heat maps are annotated as described in Extended Data Fig. 2. Indels that could not be assigned to any cluster (if any) are annotated with black. Loci with depth <20X are coloured in light blue. The additional sample (D) for A29 is incorporated into the phylogenetic tree (j). Validation experiment for A10-E, the prostate sample, gave very low coverage (d). Subclones for A29-A and A29-C are annotated in the 2d-DP plot (k). Numbers of substitutions in WGS data assigned to each subclone are plotted in (c) and (l). VAFs from WGS and validation data, plotted as scatter plots (d) and (m), are very highly correlated. Numbers of indels (e) and (n) and rearrangements (f) and (o) present in different subsets of samples are plotted as bar charts. Subclone colour keys for A10 and A29 are shown in (g) and (p), respectively.

  10. Extended Data Fig. 6: Validation of the subclonal hierarchies in A17 and A12.

    Validation strategy as described in Extended Data Fig. 2. For A17 and A12, hierarchical clustering of the VAF was performed separately for substitutions (a) and (i) and indels (b) and (j). Heat maps are annotated as described in Extended Data Fig. 2. Mutations that could not be assigned to any cluster are annotated with black. For A12, the C-specific cluster that is not present in substitutions is shown in very light green. Subclones for A17 AD are annotated in the 2d-DP plot (c). Numbers of substitutions in WGS data assigned to each subclone are plotted in (d) and (l). VAFs from WGS and validation data, plotted as scatter plots (e) and (m), are very highly correlated. Numbers of indels (f) and (n) and rearrangements (g) and (o) present in different subsets of samples are plotted as bar charts. Additional samples for A12 are incorporated into the phylogenetic tree (k). Subclone colour keys for A17 and A12 are shown in (h) and (p), respectively.

  11. Extended Data Fig. 7: Validation of the subclonal hierarchies in A21.

    Validation strategy as described in Extended Data Fig. 2. Hierarchical clustering of the VAF was performed separately for substitutions (a) and indels (b). Heat maps are annotated as described in Extended Data Fig. 2. Loci with depth <20X are coloured in light blue. Additional samples L, N and Q from FFPE material had low coverage. The only loci present in these samples were all truncal. These samples are incorporated into the phylogenetic tree (c). Numbers of substitutions in WGS data assigned to each subclone are plotted in (d). Numbers of indels (e) and rearrangements (f) present in different subsets of samples are plotted as bar charts. VAFs from WGS and validation data, plotted as scatter plots (g), are very highly correlated. Subclone colour key (h).

  12. Extended Data Fig. 8: Convergent evolution at the AR locus.

    Rearrangements and copy number segments in the vicinity of the AR locus are shown for A31, A21, A29 and A10. (a) In A31, there are three different AR amplification events. In orange is a tandem duplication whose existence is supported by tumour reads in ADEF but not C. However, PCR-gel validation confirms its existence in the prostate sample C—the faintness of the band suggesting that this rearrangement is present subclonally in A31-C—as well as the prostate sample I, which was not subject to WGS. One tandem duplication is common to both prostate samples (shown in green) while the other is specific to sample C (dark pink). (b) In A21, there are four different sets of complex rearrangements, one shared by ACDEGH and the remainder specific to F, I and J. (c) Rearrangements in the vicinity of the AR locus and inter-mutation distances for A29 plotted on a log10 scale for lesions specific to the metastasis (left) and specific to the prostate (middle). Each sample has a different set of complex rearrangements, which are associated with distinct kataegis events. (d) In A10, one tandem duplication is shared by CD while four others are each specific to a single sample.
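
Three of the quantities used in the captions above are simple enough to sketch in code: the trunk, branch and leaf categories defined in the Figure 1 caption, the pigeonhole argument used there to nest ovals, and the inter-mutation distances plotted on a log10 scale in Extended Data Fig. 8c. The following Python is only an illustrative sketch, not the authors' pipeline. The function names, the `present_min` and `clonal_min` tolerances and the sample values are all hypothetical, and the nesting check assumes that each mutation arises once and is never lost.

```python
import math
from typing import Dict, List


def classify_subclone(ccf_by_sample: Dict[str, float],
                      present_min: float = 0.02,
                      clonal_min: float = 0.95) -> str:
    """Label a mutation cluster as 'trunk', 'branch' or 'leaf'.

    ccf_by_sample maps a sample name to the cluster's cancer cell
    fraction (CCF) in that sample. present_min and clonal_min are
    hypothetical tolerances, not values taken from the paper.
    """
    present = [s for s, ccf in ccf_by_sample.items() if ccf >= present_min]
    if not present:
        return "absent"
    if len(present) == 1:
        return "leaf"    # specific to a single sample
    if all(ccf >= clonal_min for ccf in ccf_by_sample.values()):
        return "trunk"   # CCF of about 1 in every sample
    return "branch"      # in >1 sample, but missing or subclonal somewhere


def must_be_nested(ccf_a: float, ccf_b: float) -> bool:
    """Pigeonhole check for two clusters within one sample.

    If the two CCFs sum to more than 1, some cells must carry both
    sets of mutations, so one cluster has to sit inside the other
    (assuming mutations arise once and are not lost).
    """
    return ccf_a + ccf_b > 1.0


def log10_intermutation_distances(positions: List[int]) -> List[float]:
    """log10 of the gaps between neighbouring mutations on one chromosome.

    Runs of small values correspond to the tightly clustered
    mutations that characterise kataegis.
    """
    ordered = sorted(positions)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    # Skip zero-length gaps (duplicate positions); log10(0) is undefined.
    return [math.log10(g) for g in gaps if g > 0]
```

With illustrative inputs, `classify_subclone({"A": 0.0, "H": 0.6, "K": 0.08})` returns `"branch"`, and `must_be_nested(0.6, 0.88)` returns `True` because the two fractions cannot both fit into one sample without overlapping.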

Tables

  1. Extended Data Table 1: Validation of mutation calling
  2. Extended Data Table 2: Copy number genes

References

  1. Nowell, P. C. The clonal evolution of tumor cell populations. Science 194, 23–28 (1976)
  2. Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012)
  3. Greaves, M. & Maley, C. C. Clonal evolution in cancer. Nature 481, 306–313 (2012)
  4. Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N. Engl. J. Med. 366, 883–892 (2012)
  5. Gupta, G. P. & Massagué, J. Cancer metastasis: building a framework. Cell 127, 679–695 (2006)
  6. Poste, G. & Fidler, I. J. The pathogenesis of cancer metastasis. Nature 283, 139–146 (1980)
  7. Fidler, I. J. The pathogenesis of cancer metastasis: the ‘seed and soil’ hypothesis revisited. Nature Rev. Cancer 3, 453–458 (2003)
  8. Talmadge, J. E. & Fidler, I. J. AACR centennial series: the biology of cancer metastasis: historical perspective. Cancer Res. 70, 5649–5669 (2010)
  9. McFadden, D. G. et al. Genetic and clonal dissection of murine small cell lung carcinoma progression by genome sequencing. Cell 156, 1298–1311 (2014)
  10. Cleary, A. S., Leonard, T. L., Gestl, S. A. & Gunther, E. J. Tumour cell heterogeneity maintained by cooperating subclones in Wnt-driven mammary cancers. Nature 508, 113–117 (2014)
  11. Bolli, N. et al. Heterogeneity of genomic evolution and mutational profiles in multiple myeloma. Nature Commun. 5, 2997 (2014)
  12. Karantanos, T. & Thompson, T. C. GEMMs shine a light on resistance to androgen deprivation therapy for prostate cancer. Cancer Cell 24, 11–13 (2013)
  13. Harris, W. P., Mostaghel, E. A., Nelson, P. S. & Montgomery, B. Androgen deprivation therapy: progress in understanding mechanisms of resistance and optimizing androgen depletion. Nature Clin. Pract. Urol. 6, 76–85 (2009)
  14. Bernard, D., Pourtier-Manzanedo, A., Gil, J. & Beach, D. H. Myc confers androgen-independent prostate cancer cell growth. J. Clin. Invest. 112, 1724–1731 (2003)
  15. Sharma, N. L. et al. The androgen receptor induces a distinct transcriptional program in castration-resistant prostate cancer in man. Cancer Cell 23, 35–47 (2013)
  16. Francis, J. C., Thomsen, M. K., Taketo, M. M. & Swain, A. β-catenin is required for prostate development and cooperates with Pten loss to drive invasive carcinoma. PLoS Genet. 9, e1003180 (2013)
  17. Marusyk, A. et al. Non-cell-autonomous driving of tumour growth supports sub-clonal heterogeneity. Nature 514, 54–58 (2014)
  18. Sun, Y. et al. Treatment-induced damage to the tumor microenvironment promotes prostate cancer therapy resistance through WNT16B. Nature Med. 18, 1359–1368 (2012)
  19. Campbell, P. J. et al. The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467, 1109–1113 (2010)
  20. Maley, C. C. et al. Selectively advantageous mutations and hitchhikers in neoplasms: p16 lesions are selected in Barrett's esophagus. Cancer Res. 64, 3414–3427 (2004)
  21. Majumder, P. K. et al. A prostatic intraepithelial neoplasia-dependent p27 Kip1 checkpoint induces senescence and inhibits cell proliferation and cancer progression. Cancer Cell 14, 146–155 (2008)
  22. Yachida, S. et al. Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature 467, 1114–1117 (2010)
  23. Visakorpi, T. et al. In vivo amplification of the androgen receptor gene and progression of human prostate cancer. Nature Genet. 9, 401–406 (1995)
  24. Ramaswamy, S., Ross, K. N., Lander, E. S. & Golub, T. R. A molecular signature of metastasis in primary solid tumors. Nature Genet. 33, 49–54 (2002)
  25. Lee, Y. F. et al. A gene expression signature associated with metastatic outcome in human leiomyosarcomas. Cancer Res. 64, 7201–7204 (2004)
  26. Zack, T. I. et al. Pan-cancer patterns of somatic copy number alteration. Nature Genet. 45, 1134–1140 (2013)
  27. Taylor, B. S. et al. Integrative genomic profiling of human prostate cancer. Cancer Cell 13, 11–22 (2010)
  28. Barbieri, C. E. et al. Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nature Genet. 44, 685–689 (2012)
  29. Futreal, P. A. et al. A census of human cancer genes. Nature Rev. Cancer 4, 177–183 (2004)

Author information

  1. Present address: Avoneaux Medical Institute, Oxford, Maryland 21654, USA.

    • Michael R. Emmert-Buck
  2. These authors jointly supervised this work.

    • David E. Neal,
    • Colin S. Cooper,
    • Rosalind A. Eeles,
    • Ultan McDermott &
    • G. Steven Bova
  3. These authors contributed equally to this work.

    • Ultan McDermott,
    • David C. Wedge &
    • G. Steven Bova

Affiliations

  1. Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK

    • Gunes Gundem,
    • Peter Van Loo,
    • Barbara Kremeyer,
    • Ludmil B. Alexandrov,
    • Jose M. C. Tubio,
    • Elli Papaemmanuil,
    • Victoria Goody,
    • Calli Latimer,
    • Sarah O'Meara,
    • Kevin J. Dawson,
    • Peter J. Campbell,
    • Ultan McDermott &
    • David C. Wedge
  2. Department of Human Genetics, KU Leuven, Herestraat 49 Box 602, B-3000 Leuven, Belgium

    • Peter Van Loo
  3. Cancer Research UK London Research Institute, London WC2A 3LY, UK

    • Peter Van Loo
  4. Norwich Medical School and Department of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, UK

    • Daniel S. Brewer &
    • Colin S. Cooper
  5. The Genome Analysis Centre, Norwich NR4 7UH, UK

    • Daniel S. Brewer
  6. Institute of Biosciences and Medical Technology, BioMediTech, University of Tampere and Fimlab Laboratories, Tampere University Hospital, Tampere FI-33520, Finland

    • Heini M. L. Kallio,
    • Gunilla Högnäs,
    • Matti Annala,
    • Kati Kivinummi,
    • Matti Nykter,
    • Tapio Visakorpi &
    • G. Steven Bova
  7. The James Buchanan Brady Urological Institute, Johns Hopkins School of Medicine, Baltimore, Maryland 21287, USA

    • William Isaacs
  8. Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Maryland 20892, USA

    • Michael R. Emmert-Buck
  9. University of Liverpool and HCA Pathology Laboratories, London WC1E 6JA, UK

    • Christopher Foster
  10. Division of Genetics and Epidemiology, The Institute Of Cancer Research, London SW7 3RP, UK

    • Zsofia Kote-Jarai,
    • Colin S. Cooper &
    • Rosalind A. Eeles
  11. Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK

    • Douglas Easton
  12. Uro-oncology Research Group, Cancer Research UK Cambridge Institute, Cambridge CB2 0RE, UK

    • Hayley C. Whitaker &
    • David E. Neal
  13. Department of Surgical Oncology, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK

    • David E. Neal
  14. Royal Marsden NHS Foundation Trust, London SW3 6JJ, UK; and Sutton SM2 5PT, UK

    • Rosalind A. Eeles

Consortia

  1. ICGC Prostate UK Group

    A list of participants and their affiliations appears in the Supplementary Information.

Contributions

D.E.N., C.S.C., R.A.E., U.M. and G.S.B. co-designed and co-directed the project and are Senior Principal Investigators of the Cancer Research UK funded ICGC Prostate Cancer Project. G.G., P.V.L., T.V., D.C.W., U.M. and G.S.B. designed the study and co-wrote the paper. G.G., P.V.L., B.K., L.B.A., J.M.C.T., K.J.D., M.A. and D.C.W. carried out bioinformatic analyses. K.K., V.G., C.L. and S.O.'M. carried out laboratory analysis. E.P., D.S.B., H.C.W., C.S.C., P.J.C. and all authors edited the paper. D.S.B., Z.K.-J., H.C.W., G.G. and D.C.W. coordinated the study. H.M.L.K. and G.H. performed clinical data analysis and curation. W.I. facilitated the initial development of the autopsy study. M.R.E.-B. provided pathology support. M.N. provided bioinformatics support and supported project development. The full ICGC Prostate Group created and maintains overall study direction. For this work the primary affiliation of C.S.C. is The Institute of Cancer Research.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Information (720 KB)

    This file contains Supplementary Text 1-5 and additional references (see page 1 for details).

Excel files

  1. Supplementary Tables (47 KB)

    This file contains Supplementary Tables 1-3.

  2. Supplementary Data (6.3 MB)

    This file contains the Supplementary Variant Lists.