How the genomic features of a patient’s cancer relate to individual disease kinetics remains poorly understood. Here we used the indolent growth dynamics of chronic lymphocytic leukaemia (CLL) to analyse the growth rates and corresponding genomic patterns of leukaemia cells from 107 patients with CLL, spanning decades-long disease courses. We found that CLL commonly demonstrates not only exponential expansion but also logistic growth, which is sigmoidal and reaches a certain steady-state level. Each growth pattern was associated with marked differences in genetic composition, the pace of disease progression and the extent of clonal evolution. In a subset of patients, whose serial samples underwent next-generation sequencing, we found that dynamic changes in the disease course of CLL were shaped by the genetic events that were already present in the early slow-growing stages. Finally, by analysing the growth rates of subclones compared with their parental clones, we quantified the growth advantage conferred by putative CLL drivers in vivo.
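To make the two growth patterns concrete, the following is a minimal sketch of the textbook exponential and logistic growth curves. The symbols `x0` (initial WBC count), `r` (growth rate) and `K` (carrying capacity) mirror those named in the Extended Data captions. The functional forms and the numeric values are assumptions for illustration and are not necessarily the exact parameterization used in the study.

```python
import math

def exponential_growth(t, x0, r):
    """Unbounded growth: x(t) = x0 * exp(r * t)."""
    return x0 * math.exp(r * t)

def logistic_growth(t, x0, r, K):
    """Sigmoidal growth that plateaus at the carrying capacity K."""
    return K / (1.0 + ((K - x0) / x0) * math.exp(-r * t))

# Made-up values: x0 = 10, r = 0.5 per year, K = 100 (x10^9 cells per litre)
for year in (0, 5, 10, 20):
    exp_count = exponential_growth(year, 10.0, 0.5)
    log_count = logistic_growth(year, 10.0, 0.5, 100.0)
    print(year, round(exp_count, 1), round(log_count, 1))
```

Both curves start at the same value and look similar early on; they diverge only once the logistic curve approaches its steady-state level.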
All relevant data are available from the authors and/or are included with the manuscript. Clinical data about patients and samples analysed in the discovery cohort are listed in Supplementary Table 1a, b; sequencing metrics and somatic mutations are provided in Supplementary Tables 2–4. WES data are in dbGaP under accession code phs001431.v1.p. For the extension cohort, patient and sample characteristics as well as sequencing data are available from a previous publication15, and clinical data are summarized in Extended Data Table and Supplementary Tables 7 and 8. Further data to assess WBC dynamics were collected from these patients for this study and are illustrated in Extended Data Fig. 2, with clinical characteristics of patients with additional relapse samples provided separately in Supplementary Table 1c.
The PhylogicNDT package18 is available at https://github.com/broadinstitute/PhylogicNDT. PhylogicNDT uses Python 2.7.13 and the following Python modules available from pypi.org: bottle 0.12.13, dill 0.2.7.1, et-xmlfile 1.0.1, intervaltree 2.1.0, jsonschema 2.6.0, lxml 3.7.3, more-itertools 2.5.0, mpmath 0.19, networkx 1.11, openpyxl 2.4.1, pdfkit 0.6.1, pydotplus 2.0.2, pymc 2.3.6, pymc3 3.0, python-dateutil 2.6.1, rpy2 2.8.5, seaborn 0.7.1, simplejson 3.10.0, svgwrite 1.1.9, scikit-learn 0.18.1, biopython 1.68. In addition, the pyemd (https://github.com/garydoranjr/pyemd) and sselogsumexp (https://github.com/rmcgibbo/logsumexp) modules were used. Code for the Bayesian modelling of growth patterns is available at https://github.com/ivbozic/Bayesian-Growth-Pattern-Modeling.
Wodarz, D. & Komarova, N. L. Dynamics of Cancer: Mathematical Foundations of Oncology (World Scientific, 2014).
Burger, J. A. et al. Clonal evolution in patients with chronic lymphocytic leukaemia developing resistance to BTK inhibition. Nat. Commun. 7, 11589 (2016).
Diaz, L. A. Jr. et al. The molecular evolution of acquired resistance to targeted EGFR blockade in colorectal cancers. Nature 486, 537–540 (2012).
Norton, L. A Gompertzian model of human breast cancer growth. Cancer Res. 48, 7067–7071 (1988).
Spratt, J. A., von Fournier, D., Spratt, J. S. & Weber, E. E. Decelerating growth and human breast cancer. Cancer 71, 2013–2019 (1993).
Misale, S. et al. Vertical suppression of the EGFR pathway prevents onset of resistance in colorectal cancers. Nat. Commun. 6, 8305 (2015).
Talkington, A. & Durrett, R. Estimating tumor growth rates in vivo. Bull. Math. Biol. 77, 1934–1954 (2015).
Rodriguez-Brenes, I. A., Komarova, N. L. & Wodarz, D. Tumor growth dynamics: insights into evolutionary processes. Trends Ecol. Evol. 28, 597–604 (2013).
Gerlee, P. The model muddle: in search of tumor growth laws. Cancer Res. 73, 2407–2411 (2013).
Hart, D., Shochat, E. & Agur, Z. The growth law of primary breast cancer as inferred from mammography screening trials data. Br. J. Cancer 78, 382–387 (1998).
Hallek, M. et al. iwCLL guidelines for diagnosis, indications for treatment, response assessment, and supportive management of CLL. Blood 131, 2745–2760 (2018).
Puente, X. S. et al. Non-coding recurrent mutations in chronic lymphocytic leukaemia. Nature 526, 519–524 (2015).
Landau, D. A. et al. Mutations driving CLL and their evolution in progression and relapse. Nature 526, 525–530 (2015).
Wang, L. et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N. Engl. J. Med. 365, 2497–2506 (2011).
Landau, D. A. et al. Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell 152, 714–726 (2013).
Döhner, H. et al. Genomic aberrations and survival in chronic lymphocytic leukemia. N. Engl. J. Med. 343, 1910–1916 (2000).
Hamblin, T. J., Davis, Z., Gardiner, A., Oscier, D. G. & Stevenson, F. K. Unmutated Ig VH genes are associated with a more aggressive form of chronic lymphocytic leukemia. Blood 94, 1848–1854 (1999).
Leshchiner, I. et al. Comprehensive analysis of tumour initiation, spatial and temporal progression under multiple lines of treatment. Preprint at https://www.bioRxiv.org/content/10.1101/508127v2 (2018).
Bozic, I., Gerold, J. M. & Nowak, M. A. Quantifying clonal and subclonal passenger mutations in cancer evolution. PLOS Comput. Biol. 12, e1004731 (2016).
Williams, M. J., Werner, B., Barnes, C. P., Graham, T. A. & Sottoriva, A. Identification of neutral tumor evolution across cancer types. Nat. Genet. 48, 238–244 (2016).
Wang, J. et al. Tumor evolutionary directed graphs and the history of chronic lymphocytic leukemia. eLife 3, e02869 (2014).
The International CLL-IPI working group. An international prognostic index for patients with chronic lymphocytic leukaemia (CLL-IPI): a meta-analysis of individual patient data. Lancet Oncol. 17, 779–790 (2016).
Ojha, J. et al. Deep sequencing identifies genetic heterogeneity and recurrent convergent evolution in chronic lymphocytic leukemia. Blood 125, 492–498 (2015).
Rose-Zerilli, M. J. J. et al. Longitudinal copy number, whole exome and targeted deep sequencing of ‘good risk’ IGHV-mutated CLL patients with progressive disease. Leukemia 30, 1301–1310 (2016).
Smith, E. N. et al. Genetic and epigenetic profiling of CLL disease progression reveals limited somatic evolution and suggests a relationship to memory-cell development. Blood Cancer J. 5, e303 (2015).
Martinez, P. et al. Dynamic clonal equilibrium and predetermined cancer risk in Barrett’s oesophagus. Nat. Commun. 7, 12158 (2016).
Ojha, J. et al. Monoclonal B-cell lymphocytosis is characterized by mutations in CLL putative driver genes and clonal heterogeneity many years before disease progression. Leukemia 28, 2395–2398 (2014).
Barrio, S. et al. Genomic characterization of high-count MBL cases indicates that early detection of driver mutations and subclonal expansion are predictors of adverse clinical outcome. Leukemia 31, 170–176 (2017).
Schuh, A. et al. Monitoring chronic lymphocytic leukemia progression by whole genome sequencing reveals heterogeneous clonal evolution patterns. Blood 120, 4191–4196 (2012).
Braggio, E. et al. Longitudinal genome-wide analysis of patients with chronic lymphocytic leukemia reveals complex evolution of clonal architecture at disease progression and at the time of relapse. Leukemia 26, 1698–1701 (2012).
Amin, N. A. et al. A quantitative analysis of subclonal and clonal gene mutations before and after therapy in chronic lymphocytic leukemia. Clin. Cancer Res. 22, 4525–4535 (2016).
Tabak, B. et al. The Tangent copy-number inference pipeline for cancer genome analyses. Preprint at https://www.biorxiv.org/content/10.1101/566505v1 (2019).
We are grateful to P. dal Cin, D.-A. Landau, S. Shukla and U. Jäger for discussions. We also appreciate the efforts of all study nurses and clinical staff who made this study feasible, and the patients who generously provided their samples for this research. This work was supported in part by the NCI (5P01CA081534-14, 1R01CA155010-01A1, P01CA206978, U10CA180861), the CLL Global Research Foundation, and by NHLBI (1RO1HL103532-01). M.G. was supported by a Marie-Curie International Outgoing Fellowship from the European Union (PIOF-2013-624924). G.G. is partially supported by the Paul C. Zamecnik Chair in Oncology at the Massachusetts General Hospital Cancer Center. C.J.W. is a Scholar of the Leukemia and Lymphoma Society.
C.J.W. is founder of Neon Therapeutics and a member of its scientific advisory board. G.G. receives research funds from IBM and Pharmacyclics. G.G. is an inventor of several bioinformatics-related patents, including patents related to MuTect and ABSOLUTE. C.J.W., D.N. and T.J.K. receive research funding from Pharmacyclics. J.S. is a current employee of Moderna Therapeutics. J.G.G. receives grant funding from Janssen, Acerta, Celgene; and received honoraria from Abbvie, AZ, Celgene, Kite, Janssen, Pharmacyclics, Roche and Novartis. K.R.R. is on Medical Advisory Boards of Pharmacyclics, Roche/Genentech and Cellectis. J.R.B. is a consultant for Abbvie, Acerta, Beigene, Genentech/Roche, Gilead, Juno/Celgene, Kite, Loxo, Novartis, Pfizer, Pharmacyclics, Sunesis, TG Therapeutics and Verastem; received honoraria from Janssen and Teva; received research funding from Gilead, Loxo, Sun and Verastem; and served on data safety monitoring committees for Morphosys and Invectys. The other authors declare no potential conflicts of interest.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Time courses of the discovery CLL cohort (Supplementary Table 1). Circles indicate time points of samples analysed by WES. Dotted lines represent course of CLL from diagnosis (left vertical line) until last follow-up (arrows) or death (right vertical line), and solid lines indicate timeframe covered by the analysis of serial samples by WES, coloured by growth pattern. b, Cumulative distribution function (CDF) of posterior probabilities for carrying capacity K obtained from the Bayesian model based on a logistic growth pattern for patients. Categorizations of the growth pattern of the individual patients are marked. c, Classification of patients based on the probability that their carrying capacity K is below 1,000 (×10⁹ cells per litre) (red numbers in top left corner). Also shown are the posterior probability distributions for all model parameters (carrying capacity K, growth rate r, WBC count at diagnosis X₀ and variance of the noise σ²). Far right panels per patient: leukaemia burden information provided by WBC measurements (blue dots), with ten random fits from the Bayesian model. Red numbers in top left corners indicate time (years) from diagnosis to first treatment.
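The classification in panel c rests on the posterior probability that the carrying capacity K lies below 1,000 × 10^9 cells per litre. The sketch below shows one way such a probability could be estimated from posterior samples of K and mapped to the three growth categories. The function names, the cutoffs `lower` and `upper`, and the sample values are hypothetical and chosen only for illustration; the source does not state the cutoffs in this caption.

```python
def prob_k_below(k_samples, threshold=1000.0):
    """Fraction of posterior samples of K that fall below the threshold."""
    if not k_samples:
        raise ValueError("need at least one posterior sample")
    return sum(1 for k in k_samples if k < threshold) / len(k_samples)

def classify_growth(p_bounded, lower=0.25, upper=0.75):
    """Map P(K < threshold) to a growth category.

    lower and upper are hypothetical cutoffs, not values from the study.
    """
    if p_bounded >= upper:
        return "logistic"
    if p_bounded <= lower:
        return "exponential"
    return "indeterminate"

# Made-up posterior samples of K (x10^9 cells per litre)
samples = [100.0, 200.0, 5000.0, 300.0]
p = prob_k_below(samples)
print(p, classify_growth(p))
```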
a–c, Shown are samples displaying: logistic growth (n = 43) (a), indeterminate growth (n = 30) (b) or exponential growth (n = 12) (c). See Supplementary Table 8 for information on growth pattern fitting. Blue dots denote WBC measurements; coloured lines denote ten random growth model fits (see Supplementary Methods). Red numbers indicate years from diagnosis to first treatment for patients who progressed to treatment.
a, The increase in the numbers of total (top), clonal (middle) and subclonal (bottom) drivers is associated with overall leukaemia growth patterns. P values were determined by Kruskal–Wallis test. b, A trend towards increased maximal change in the CCF of a driver event is observed between the first and last pre-treatment samples of a given patient based on growth pattern. P values were determined by Kruskal–Wallis test. c, Top, probability of having a carrying capacity K of WBC of less than 1,000 × 10⁹ cells per litre (blue dots) for patients with logistic, indeterminate or exponential growth patterns. Bottom, growth rates (small circles) together with 70% credible intervals (lines) across the discovery and extension samples, ordered based on the probability of logistic growth with samples classified as displaying logistic, indeterminate or exponential growth.
Changes in the CCF of subclones represented as two-dimensional pair-wise plots of multi-sample clustering results. Samples at a time point (TP) closest to diagnosis (first) versus the last sample before treatment (preTx) are shown in the left column; samples at the last time point before and the first time point after treatment are shown in the right column. a, b, Patients are grouped based on those having: subclones with significant evolution (a) or subclones that maintain interclonal balance (b). Significantly evolving subclones are indicated in orange (Supplementary Methods); expanding CLL driver mutations are coloured magenta. c, Examples of genetic evolution from the first to last pre-treatment time points, and from pre-treatment to relapse samples for patient 6 (with significant evolution) and patient 10 (not evolving). Shown are the two-dimensional distributions that reflect the average of the positional distributions of the cluster centres along the MCMC iterations, rather than the final posterior for the cluster centre, which is determined by the normalized product of the pre-clustered distributions of the mutations that were finally assigned to each cluster. Marginal distributions (on the x and y axes) depict the CCF distributions before clustering for each individual mutation. Final cluster assignment is indicated by the colour.
Extended Data Fig. 5 Detecting subclones and construction of evolutionary phylogenies using simulated data.
a, Bar plots showing the fraction of clustering results on simulated samples that are concordant with the ground truth (or differ by ∆n clusters). Simulations are grouped by low (2) and high (3–8) numbers of samples per case as well as low (2–9) and high (≥10) numbers of mutations per subclone. b, Similar CCF accuracy after clustering between simulated WES and WGS data. c, Simulation of a case with 5 samples and 5 subclones present at different CCF levels per sample (black lines denote ground truth). The predicted CCF distributions for each cluster are plotted as a function of the number of mutations in the subclone (from 2 to 100). When the number of mutations exceeds approximately 15–20, the CCF predictions become stable and accurate (low bias and variance). d, Examples of PhylogicNDT BuildTree algorithm results applied to simulated data. Grey shading highlights the correct tree, with percentage of MCMC iterations supporting the trees indicated. e, Analysis of prior selection for clustering. For a range of priors with varying mean number of clusters, K, the prior for α is computed, and the Dirichlet process posteriors for α and K illustrate how the choice of prior affects the estimation of K. f, Pigeon-hole principle: for two clusters, A and B (top), the convolution (middle) and difference (bottom) are illustrated. The area above 1.0 CCF of the convolution is consistent with the probability that they are parent–child rather than siblings. The area below 0.0 CCF of the difference represents the probability that cluster B is more prevalent than cluster A.
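The pigeon-hole test in panel f can be illustrated with discretized CCF distributions. The sketch below is not PhylogicNDT code; it assumes the two cluster CCFs are independent and represented as probability mass functions over equally spaced bins, where bin i stands for CCF = i / n. The function names and the example distributions are hypothetical.

```python
def prob_sum_exceeds_one(pmf_a, pmf_b):
    """Mass of the convolution above 1.0 CCF: P(CCF_A + CCF_B > 1)."""
    n = len(pmf_a) - 1  # bin i represents CCF = i / n
    return sum(pa * pb
               for i, pa in enumerate(pmf_a)
               for j, pb in enumerate(pmf_b)
               if i + j > n)

def prob_b_exceeds_a(pmf_a, pmf_b):
    """Mass of the difference A - B below 0.0 CCF: P(CCF_B > CCF_A)."""
    return sum(pa * pb
               for i, pa in enumerate(pmf_a)
               for j, pb in enumerate(pmf_b)
               if j > i)

# Made-up example on 11 bins (CCF = 0.0, 0.1, ..., 1.0):
# cluster A is at CCF 0.6 or 0.3 with equal probability, cluster B is at 0.5.
pmf_a = [0.0] * 11
pmf_a[6] = 0.5
pmf_a[3] = 0.5
pmf_b = [0.0] * 11
pmf_b[5] = 1.0

print(prob_sum_exceeds_one(pmf_a, pmf_b))  # evidence for parent-child over siblings
print(prob_b_exceeds_a(pmf_a, pmf_b))      # evidence that B is more prevalent than A
```

Integer bin indices are compared instead of floating-point CCF values so that the boundary case (CCFs summing to exactly 1.0) is handled without rounding error.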
a, Subclonal dynamics for each patient in the discovery cohort in relation to tumour load over time in the observed disease course (represented by WBC, with dots indicating an available WBC measurement). Arrows denote time of sampling with WES. Distinguishable subclones meeting the criteria for confident detection (>10% CCF, in at least one sequenced sample) are coloured. CCFs in time periods between sequenced time points were inferred from the closest sequenced sample. b, Subclonal growth patterns of additional patients analogous to Fig. 4.
For 15 patients with non-bounded growth (EXP and IND) and at least one macroscopic subclone, we show the following: first column: selected complete phylogenetic trees of subclones; yellow boxes indicate branches that were detectable only in relapse samples; second column: cluster CCF dynamics over time with 95% credible intervals based on uncertainty of mutation assignment; third column: pre-treatment growth rates for each generated clone within the most likely phylogeny; fourth column: relative pre-treatment growth rates of subclones compared to their respective parent subclone.
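As a rough illustration of the third and fourth columns, the sketch below estimates a clone's growth rate from two time points and compares it with that of its parent. It assumes that clone abundance can be approximated as CCF × WBC, that growth between the two time points is exponential, and that the relative growth rate is the difference between the subclone and parent rates. These are simplifying assumptions for illustration, not the study's Bayesian procedure, and all names and values are hypothetical.

```python
import math

def clone_growth_rate(ccf_start, wbc_start, ccf_end, wbc_end, years):
    """Exponential growth rate (per year) of a clone between two samples.

    Clone abundance is approximated as CCF * WBC (an assumption).
    """
    n_start = ccf_start * wbc_start
    n_end = ccf_end * wbc_end
    if n_start <= 0 or n_end <= 0 or years <= 0:
        raise ValueError("abundances and elapsed time must be positive")
    return math.log(n_end / n_start) / years

# Made-up values: WBC doubles from 20 to 40 over 2 years;
# the parent stays clonal (CCF 1.0) while the subclone rises from 0.1 to 0.4.
parent_rate = clone_growth_rate(1.0, 20.0, 1.0, 40.0, 2.0)
subclone_rate = clone_growth_rate(0.1, 20.0, 0.4, 40.0, 2.0)
relative_rate = subclone_rate - parent_rate
print(round(parent_rate, 3), round(subclone_rate, 3), round(relative_rate, 3))
```

A positive relative rate in this sketch would indicate that the subclone is expanding faster than its parent over the sampled interval.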
Extended Data Fig. 8 Somatic copy number alteration calling from WES, WGS and SNP array data showing highly concordant results.
a, WES and WGS of CLLs from patients 1 and 4. b, Patient 1 data before and after capture bias correction via tangent normalization32. c, TCGA samples with available paired WES and single nucleotide polymorphism (SNP) array data.
Extended Data Fig. 9 Comparison of PhylogicNDT clustering results between WES and WGS data and growth of selected subclones.
a, In patient 1, paired results of WES and WGS data were available for all four time points and demonstrate matching CCFs throughout. b, c, CCF posterior distributions for the cluster centres (b) and individual mutations (c) for the corresponding subclones found in WES and WGS data of patient 1. d, e, For patients 4 and 6, two-dimensional comparisons are illustrated. f–h, Examples of subclones (magenta boxes) with a significant growth advantage relative to their parent and a known driver (f), one subclone with significantly accelerated growth but no driver (g), and subclones with a driver and no growth acceleration (h).