Every cancer originates from a single cell. During expansion of the neoplastic cell population, individual cells acquire genetic and phenotypic differences from each other. Here, to investigate the nature and extent of intra-tumour diversification, we characterized organoids derived from multiple single cells from three colorectal cancers as well as from adjacent normal intestinal crypts. Colorectal cancer cells showed extensive mutational diversification and carried several times more somatic mutations than normal colorectal cells. Most mutations were acquired during the final dominant clonal expansion of the cancer and resulted from mutational processes that are absent from normal colorectal cells. Intra-tumour diversification of DNA methylation and transcriptome states also occurred; these alterations were cell-autonomous, stable, and followed the phylogenetic tree of each cancer. There were marked differences in responses to anticancer drugs between even closely related cells of the same tumour. The results indicate that colorectal cancer cells experience substantial increases in somatic mutation rate compared to normal colorectal cells, and that genetic diversification of each cancer is accompanied by pervasive, stable and inherited differences in the biological states of individual cancer cells.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank I. Martincorena, R. van Boxtel, J. Truszkowski, H. Francies and M. Garnett for discussion of our findings. This work was supported by funding from the Wellcome Trust (098051), Stichting Vrienden van het Hubrecht and KWF (SU2C-AACR-DT1213 and HUBR KWF 2014-6917). Individual authors were supported as follows: S.F.R., Louis-Jeantet Foundation; N.S., JSPS Overseas Research Fellowships; H.L.-S., Wellcome Trust Non-clinical PhD Studentship; S.B., Wellcome Trust Intermediate Clinical Research Fellowship and St. Baldrick’s Foundation Robert J. Arceci Innovation Award; P.J.C., Wellcome Trust Senior Research Fellowship in Clinical Science.
Reviewer information
Nature thanks M. Lawrence and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Extended data figures and tables
Specimens were derived from the ascending colon of a 66-year-old woman (a–f), sigmoid rectum of a 65-year-old woman (g–n) and ascending transverse colon of a 56-year-old man (o–t), respectively. From each tumour, 4–6 segments were resected (sized 5 × 5 × 3–5 mm). All sections except T3 from P2 resulted in viable clonal organoids. b–f, h–n, p–t, Haematoxylin and eosin staining and Ki67 immunohistochemistry show cell morphology for individual tumour sections. Scale bars: 200 µm.
a, Comparison of phylogeny reconstructions from WGS analysis of clonal organoids (left) and subclonal analysis of the original tissue biopsies (right) from individuals P1–P3. The analysis of clonal organoids allows a very detailed phylogeny, exact placement of driver mutations and analysis of cell-to-cell differences. b, Venn diagrams depicting overlap between substitutions identified by the organoid approach and the tissue biopsy approach. c, Venn diagrams depicting overlaps between clones P2.N3 and P2.T6.2 and their respective subclones (see Methods). Only a small proportion of the total mutations is added during culturing in both normal and tumour organoids. d, New signature identified in this study in tumour organoid samples from P3, characterized by T > G, T > A and T > C mutations at NTA and NTT trinucleotides (mutated bases underlined). e, Contribution of each of the identified mutation signatures to individual samples. Top (by_sample), results of signature extraction from all substitutions identified in each sample (Supplementary Notes). Bottom, proportion in each sample derived by adding up proportions in the branches of the phylogenetic tree that make up that sample (identical to Fig. 1). f, Numbers of C > T mutations by CpG context. g, Signature analysis of substitutions identified in the original tissue biopsies.
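The overlap counts behind a Venn diagram such as the one in panel b can be illustrated with a minimal sketch. It assumes that two calls are the same substitution when chromosome, position, reference base and alternate base all match; the `Substitution` type and the `venn_counts` function are hypothetical names introduced here and are not taken from the study's pipeline.

```python
from typing import Dict, Iterable, NamedTuple


class Substitution(NamedTuple):
    """Hypothetical record for one substitution call (not the study's format)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def venn_counts(organoid_calls: Iterable[Substitution],
                biopsy_calls: Iterable[Substitution]) -> Dict[str, int]:
    """Count calls unique to each approach and calls shared by both.

    Assumption: identity is an exact match on (chrom, pos, ref, alt).
    """
    organoid = set(organoid_calls)
    biopsy = set(biopsy_calls)
    return {
        "organoid_only": len(organoid - biopsy),
        "shared": len(organoid & biopsy),
        "biopsy_only": len(biopsy - organoid),
    }


# Toy example with made-up coordinates.
organoid_example = [
    Substitution("1", 100, "C", "T"),
    Substitution("2", 200, "T", "G"),
    Substitution("3", 300, "T", "A"),
]
biopsy_example = [
    Substitution("1", 100, "C", "T"),
    Substitution("4", 400, "C", "A"),
]
example_counts = venn_counts(organoid_example, biopsy_example)
```

Exact matching is the simplest possible choice; a real comparison would also have to decide how to treat positions that lack sufficient coverage in one of the two data sets.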
Phylogenies for three individuals with branch lengths representing indel numbers, further subdivided into insertions and deletions. The boxed area for P1, drawn on a different scale, shows the high number of indels in this patient, who displays microsatellite instability (MSI) in all tumour clones.
Phylogenies for three individuals with branch lengths representing rearrangement numbers, further subdivided into deletions, inversions, tandem duplications and translocations.
Copy number profiles of all clones that have been WGS analysed, displayed as a heatmap (amplification in red, loss in blue). The structures of the phylogenetic trees are displayed on the left; branch lengths are not scaled.
a–c, Methylation pattern of the MLH1 gene for tumour and normal clones for three individuals, showing hypermethylation in proximity to the transcription start site (TSS) for P1 tumour clones compared to normal clones. d, Expression of MLH1 in all clones; MLH1 transcript could not be detected in tumour clones from P1.
a, Clustering of methylation data by PCA showing normal-derived organoids from three individuals (n = 12 biologically independent samples). b, Global methylation change in each tumour clone, expressed as the ratio of hypermethylated probes to hypomethylated probes. Hyper- and hypomethylation are assessed by comparing to the baseline methylation levels in the normal-derived clones (indicated with line at y = 1). c–e, Left, clustering of methylation data by PCA of tumour organoids from each individual, displaying the first two principal components. Clones from different segments are shown in different colours as in Extended Data Fig. 2. Right, phylogenetic trees based on expression data (as in Fig. 3b) with the main branches used for our methylation analysis indicated. c, P1, n = 20 biologically independent samples. d, P2, n = 21 biologically independent samples. e, P3, n = 17 biologically independent samples. f–h, Direction of methylation changes during tumour development. Methylation changes were assigned to either the branch of the tumour or the main subclonal branches (indicated in the phylogenetic trees in e). i–k, Relative proportion of probes in CpG islands, shores, shelves and seas that were differentially methylated in different branches (Supplementary Notes section 6).
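The hyper-to-hypomethylation ratio in panel b can be sketched as follows. This is only an illustration under stated assumptions: each probe has a beta value between 0 and 1, the baseline is a per-probe value from the normal-derived clones, and a probe counts as changed when it differs from the baseline by at least `min_delta`. The 0.2 default and the function name `hyper_hypo_ratio` are hypothetical; the legend does not state the criterion used.

```python
import math
from typing import Dict


def hyper_hypo_ratio(tumour_beta: Dict[str, float],
                     normal_baseline: Dict[str, float],
                     min_delta: float = 0.2) -> float:
    """Ratio of hypermethylated to hypomethylated probes in one tumour clone.

    Assumption: a probe is hyper- or hypomethylated when its beta value
    differs from the normal baseline by at least `min_delta` (hypothetical
    threshold). A ratio of 1 means equal numbers in both directions.
    """
    hyper = 0
    hypo = 0
    for probe, beta in tumour_beta.items():
        baseline = normal_baseline.get(probe)
        if baseline is None:
            continue  # probe absent from the baseline: skip it
        delta = beta - baseline
        if delta >= min_delta:
            hyper += 1
        elif delta <= -min_delta:
            hypo += 1
    if hypo == 0:
        # Ratio is undefined without hypomethylated probes.
        return math.inf if hyper > 0 else math.nan
    return hyper / hypo


# Toy example with made-up probe identifiers and values.
tumour_example = {"cg1": 0.9, "cg2": 0.1, "cg3": 0.5, "cg4": 0.95}
normal_example = {"cg1": 0.2, "cg2": 0.8, "cg3": 0.5, "cg4": 0.3}
example_ratio = hyper_hypo_ratio(tumour_example, normal_example)
```

A value above 1 corresponds to a clone lying above the y = 1 line, that is, a net gain of methylation relative to the normal-derived clones.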
a, PCA based on expression pattern of normal organoids from each individual, displaying the first two principal components (n = 13). A subclone and its ancestral clone are circled. b–d, Left, PCA of tumour clones from each individual. Clones derived from different segments are shown in different colours as in Figs. 2–4. A subclone derived from a tumour clone from P2 and its ancestor clone are circled. Right, phylogenetic trees based on expression data (as in Fig. 4b) with the main branches used for our expression analysis indicated. b, P1, n = 20 biologically independent samples. c, P2, n = 22 biologically independent samples. d, P3, n = 17 biologically independent samples. e–g, Global analysis of expression changes attributed to the trunk of the tree, the main branches or subclonal variation. h, Venn diagram displaying the differentially expressed genes that were attributed to the trunk of each tumour. Differentially expressed genes determined by a likelihood ratio test using a negative binomial generalized linear model fit (FDR < 0.05). i–k, Comparison of differentially expressed genes identified in the organoid clones of each patient versus the original tissue sections. Only genes that were significantly altered in all clones or all biopsies from each individual are considered.
a, Dose–response data for seven drugs, tested on organoid clones from three individuals. Twenty-one concentrations were tested for each drug, ranging from 14.7 nM to 20 μM. Mean survival from two duplicate experiments is displayed in a heatmap. The concentration displayed in Fig. 4 is outlined with a black box in each panel. b, Reproducibility of drug response data. Each measurement was performed twice (technical replicate) and each experiment was performed in duplicate (biological replicate). For each biological or technical replicate the area under the curve (AUC) is shown. c, Dose–response curves after 6 days of treatment with IWP2 (Wnt secretion inhibitor) for clonal tumour organoids derived from P1. RNF43 mutant clones are responsive, whereas RNF43 wild-type (WT) clones are resistant. Data points and error bars represent the mean and s.d. of four independent technical replicates from two independent experiments.
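One common way to reduce a dose–response curve to a single AUC value is trapezoidal integration of survival over log10 concentration, normalised by the width of the tested range. The sketch below follows that convention as an assumption; the legend does not specify how the AUC was calculated, and `dose_response_auc` is a hypothetical name.

```python
import math
from typing import Sequence


def dose_response_auc(concentrations_nM: Sequence[float],
                      survival: Sequence[float]) -> float:
    """Normalised area under a dose-response curve.

    Assumptions: survival is a fraction of untreated control (about 0 to 1),
    concentrations are positive, and integration is done by the trapezoidal
    rule on a log10 concentration axis. Dividing by the log range gives a
    value of 1.0 for a fully resistant clone and near 0 for a fully
    sensitive one.
    """
    if len(concentrations_nM) != len(survival):
        raise ValueError("concentrations and survival must have equal length")
    if len(concentrations_nM) < 2:
        raise ValueError("at least two concentrations are required")
    # Sort by concentration so that the x axis is increasing.
    pairs = sorted(zip(concentrations_nM, survival))
    xs = [math.log10(conc) for conc, _ in pairs]
    ys = [surv for _, surv in pairs]
    if xs[-1] == xs[0]:
        raise ValueError("concentrations must span a non-zero range")
    area = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))
    return area / (xs[-1] - xs[0])


# Toy example: survival falls linearly across two decades of concentration.
example_auc = dose_response_auc([10.0, 100.0, 1000.0], [1.0, 0.5, 0.0])
```

Comparing this value between technical or biological replicates of the same clone and drug gives a simple reproducibility check of the kind summarised in panel b.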
Details of all clones described in this study: culturing time for derivation of each clone, and analyses performed (WGS = whole genome sequencing, TGTS = targeted sequencing). For clones indicated with (*), whole genomes have been previously analysed by Blokzijl et al., 2016 (ref. 14).
Sequencing characteristics of WGS analysed samples. For each clone we report the average sequencing depth; to illustrate coverage breadth we assessed the proportion of the exome covered to a certain depth.
Functional mutations in known cancer genes identified by targeted sequencing and whole genome sequencing (extended driver analysis). Mutations have been assigned to nodes of the phylogenetic trees according to targeted sequencing data (mrca_tgt) and/or whole genome data (mrca_wgs). The nodes of the phylogenetic trees are numbered as in Extended Data Fig. 4.
All substitutions, indels and structural variants called in WGS clones. Total counts of all mutation classes per sample are listed. For all substitutions, VAF and sequencing depth in each clone are reported. Nodes of the phylogenetic trees (indicated in the column “mrca”) correspond to the labels in the trees in Extended Data Fig. 4. For the truncal mutations of P2, substitutions have been assigned as prior to the WGD (“17_prewgd”), posterior to the WGD (“17_postwgd”) or unspecified (“17_not_timed”). For substitutions that could not be attributed to a node of the tree, we inspected clones in which a mutation was expected but not detected for copy number changes. For all indels and rearrangements we report their attribution to nodes of the tree. Indels and rearrangements in P2 have been timed relative to the whole genome duplication as explained in the Methods section.
Substitution analysis of targeted sequencing data. For each mutation that was identified by WGS, VAF and sequencing depth of the targeted sequencing experiment are reported for each clone. Samples were attributed to nodes in the tree according to the procedure outlined in the Methods section and in Supplementary Methods.
Differentially expressed genes (DEGs), determined from expression levels of 16,271 genes in 20, 21 and 17 samples from the three patients, respectively. Reported DEGs have an FDR-corrected p-value of less than 0.05, resulting from a likelihood ratio test using a negative binomial generalised linear model (see Methods). They have been attributed to the tree as indicated in Extended Data Fig. 9 and in Supplementary Analysis.
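The FDR correction step can be illustrated with the Benjamini–Hochberg procedure. This is an assumption made for the sake of the example: the table description states only that p-values were FDR corrected and does not name the method. The function name `benjamini_hochberg` is likewise introduced here.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: this stands in for the unspecified FDR correction.
    Each adjusted value is p * m / rank, made monotone by taking the
    running minimum from the largest p-value downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example with made-up p-values for four genes.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
# Genes passing an FDR threshold of 0.05 would be reported as DEGs.
example_significant = [q < 0.05 for q in example_q]
```

In practice the p-values would come from the likelihood ratio test on the negative binomial model fit, one per gene, before this adjustment is applied.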
Enrichment analysis for hallmarks (n = 4,436 genes), canonical processes (n = 10,233 genes) and GO terms (n = 16,271 genes). Over-represented or under-represented processes were determined using an enrichment test that incorporates the effect of gene length on the power to detect DEGs (R package goseq). Processes with an FDR < 0.05 are reported in S7. They have been attributed to the tree as indicated in Extended Data Fig. 9 and in Supplementary Analysis.
Survival data for all individual clones for treatment with seven different drugs at 21 concentrations.