Exploiting evolutionary steering to induce collateral drug sensitivity in cancer.

Drug resistance mediated by clonal evolution is arguably the biggest problem in cancer therapy today. However, evolving resistance to one drug may come at a cost of decreased fecundity or increased sensitivity to another drug. These evolutionary trade-offs can be exploited using 'evolutionary steering' to control the tumour population and delay resistance. However, recapitulating cancer evolutionary dynamics experimentally remains challenging. Here, we present an approach for evolutionary steering based on a combination of single-cell barcoding, large populations of 10^8-10^9 cells grown without re-plating, longitudinal non-destructive monitoring of cancer clones, and mathematical modelling of tumour evolution. We demonstrate evolutionary steering in a lung cancer model, showing that it shifts the clonal composition of the tumour in our favour, leading to collateral sensitivity and proliferative costs. Genomic profiling revealed some of the mechanisms that drive evolved sensitivity. This approach allows modelling evolutionary steering strategies that can potentially control treatment resistance.

Bar plots represent the copy number as estimated by dividing the target locus concentration by the reference NSUN3 locus concentration and multiplying this ratio by three as NSUN3 is in three copies (triploid genome). Error bars represent the 95% Confidence Interval for the ratio (Total Error Model) as produced by QuantaSoft TM multiplied by 3. In ddPCR it is possible to calculate confidence intervals from the results of a single well by modelling positive and negative droplets as being generated by a Poisson distribution.
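The copy number calculation described in this caption can be sketched as follows. This is a minimal illustration, not the QuantaSoft implementation: the droplet counts are hypothetical, the function names are ours, and the Total Error Model confidence interval is not reproduced. The sketch assumes the standard Poisson relation in which the mean number of copies per droplet is the negative logarithm of the fraction of negative droplets, and that target and reference share the same droplet volume so that the ratio of means equals the ratio of concentrations.

```python
import math


def poisson_copies_per_droplet(n_positive, n_total):
    """Mean copies per droplet, estimated from the fraction of negative droplets."""
    n_negative = n_total - n_positive
    if n_negative <= 0:
        raise ValueError("all droplets are positive; the mean cannot be estimated")
    return -math.log(n_negative / n_total)


def copy_number(target_conc, reference_conc, reference_copies=3):
    """Target copy number, scaled by the copy number of the reference locus.

    reference_copies=3 follows the caption (NSUN3 is present in three copies).
    """
    return reference_copies * target_conc / reference_conc


# Hypothetical droplet counts for one well (not data from the study).
target = poisson_copies_per_droplet(n_positive=4000, n_total=15000)
reference = poisson_copies_per_droplet(n_positive=3000, n_total=15000)
estimated_copy_number = copy_number(target, reference)
```

An interval reported for the ratio would be scaled by the same factor of three, as the caption states for the error bars.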

Supplementary Figure 18. Effects of freezing and thawing on barcode frequencies.
The frequencies of all barcodes identified in DMSO7 and DMSO8 are consistent with the frequencies in replicates expanded from the POT population before it was frozen. Some barcodes are always missed due to sequencing (binomial sampling of alleles).

Statistical Analysis of Lentiviral Barcoding
During the barcoding protocol, cells are randomly infected with barcodes such that one cell may receive multiple barcodes and that one barcode may appear in multiple cells. Following the statistical approach outlined by Lan et al. [3], we estimated the expected proportion of doubly barcoded cells and the expected proportion of barcodes that appear in multiple cells.
The parameter $\nu$, which under a Poisson model of infection is the mean number of barcodes per cell, can be estimated from the barcoding efficiency, $\eta$, defined as the proportion of cells that are successfully labelled with at least one barcode. Specifically, $\nu = -\log(1 - \eta)$.
We estimate $\eta = 0.1$ as the proportion of cells which survive following selection with puromycin.
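As a worked illustration of this estimate, the sketch below computes the Poisson mean from a barcoding efficiency of 0.1 and, under the same Poisson assumption, the fraction of labelled cells that carry two or more barcodes. The function names are ours, and conditioning on labelled cells is one possible reading of "doubly barcoded"; the definition used by Lan et al. [3] may differ.

```python
import math


def poisson_mean_from_efficiency(efficiency):
    """Poisson mean implied by the probability of receiving at least one barcode."""
    return -math.log(1.0 - efficiency)


def fraction_multiply_barcoded(mean_barcodes):
    """Among labelled cells, the fraction carrying two or more barcodes (assumed definition)."""
    p_zero = math.exp(-mean_barcodes)
    p_one = mean_barcodes * math.exp(-mean_barcodes)
    return (1.0 - p_zero - p_one) / (1.0 - p_zero)


mean_barcodes = poisson_mean_from_efficiency(0.1)  # approximately 0.105
multiply_barcoded = fraction_multiply_barcoded(mean_barcodes)  # approximately 0.052
```

With an efficiency of 0.1, roughly one labelled cell in twenty would be expected to carry more than one barcode under these assumptions.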
Hyo-eun et al. [2] report a total barcode library complexity $N_b = 7.2 \times 10^7$, obtained by fitting a polynomial equation. The total number of cells prepared for barcoding was $N_c = 10^7$. Thus,
Finally, the proportion of uniquely labelled cells (those that receive a unique combination of one or more barcodes) is given by

Barcodes from Harvested Cells
Errors introduced during PCR or sequencing of the molecular barcodes can result in spurious barcodes being identified, or in the underestimation of the prevalence of a specific barcode. We implemented a novel, bias-free error correction protocol as follows.
All reads containing the 12 bp of the forward primer, followed by a 30 bp sequence, followed by the 12 bp of the reverse primer were considered. Accepting any 30 bp sequence between the primers permits us to identify barcodes that deviate from the weak/strong base pair pattern as a result of errors. Reads were filtered for a base quality score >20 at all positions. All detected barcodes were merged into a single file to ensure that the same corrections were applied across samples.
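The extraction and quality filter described above might be sketched as follows. The two primer flanks are hypothetical placeholders (the actual primer sequences are not given here), the function name is ours, and applying the quality threshold to the matched 54 bp window rather than the entire read is an assumption. Quality strings are assumed to use the common Phred+33 encoding.

```python
import re

# Hypothetical 12 bp flanks; placeholders only, not the primers used in the study.
FORWARD_FLANK = "ACTGACTGCAGT"
REVERSE_FLANK = "TGGTCTAGCATC"
BARCODE_LENGTH = 30
MIN_QUALITY = 20  # every base must score strictly above this value

# Any 30 bp sequence is accepted between the flanks, so barcodes that deviate
# from the weak/strong design pattern are retained for later correction.
PATTERN = re.compile(
    re.escape(FORWARD_FLANK)
    + "([ACGT]{%d})" % BARCODE_LENGTH
    + re.escape(REVERSE_FLANK)
)


def extract_barcode(sequence, quality_string, phred_offset=33):
    """Return the 30 bp barcode, or None if the read fails either filter."""
    match = PATTERN.search(sequence)
    if match is None:
        return None
    start, end = match.span(0)  # flanks plus barcode (assumed filter window)
    scores = [ord(ch) - phred_offset for ch in quality_string[start:end]]
    if any(score <= MIN_QUALITY for score in scores):
        return None
    return match.group(1)
```

Counting the returned barcodes across all samples into one shared table would then give the merged set to which a single correction is applied.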
We Hamming distance. Where multiple representatives had the same Hamming distance, the count for the barcode was evenly split between the representatives to avoid bias.
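The even splitting of counts between equidistant representatives can be illustrated with a short sketch. How the representative barcodes were chosen is not recoverable from the text above, so here they are simply passed in as a given list; the function names and the toy 4 bp barcodes are ours, and no maximum distance threshold is applied.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_counts(counts, representatives):
    """Reassign each barcode's count to its nearest representative(s).

    When several representatives are equally close, the count is split
    evenly between them so that no representative is favoured.
    """
    corrected = defaultdict(float)
    for barcode, count in counts.items():
        distances = {rep: hamming(barcode, rep) for rep in representatives}
        best = min(distances.values())
        nearest = [rep for rep, d in distances.items() if d == best]
        for rep in nearest:
            corrected[rep] += count / len(nearest)
    return dict(corrected)


# Toy example: "AATT" is equidistant from both representatives.
example = correct_counts(
    {"AAAA": 10, "AAAT": 2, "AATT": 4},
    representatives=["AAAA", "TTTT"],
)
# example == {"AAAA": 14.0, "TTTT": 2.0}
```

The total count is conserved by construction, which is the property that keeps the correction from biasing barcode prevalence.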

Barcodes from Supernatant Cells
For the barcodes extracted weekly from the supernatant cells, the extraction and filtering were performed as above. The correction mapping derived from harvested cells was used to correct the barcodes. No barcodes were identified in the supernatant cells that were not detected in at least one of the harvested samples.
To demonstrate that each of the 8 HYPERflask replicates harbours a suitably similar barcode distribution following preparation, we performed a stochastic population simulation of the POT outgrowth and splitting steps to estimate the likelihood that a barcode in the initial population is present in N of the 8 replicate populations. The simulation comprised two parts:
1. Stochastic simulation of the POT outgrowth from an initial population of uniquely barcoded cells.
2. Stochastic simulation of splitting the POT population into 8 replicate populations.
To achieve (1) we assumed that each cell in the initial population was uniquely barcoded, and that each uniquely barcoded population was governed by stochastic exponential growth with birth rate b and death rate d. We implemented a Gillespie algorithm to simulate the exponential growth (Supplementary Figure 2); see Erban et al. [1] for implementation details. Approximate birth and death rates for the HCC827 cell line were previously derived by Mumenthaler et al. [4].
For the oxygen concentration of 20% and media glucose concentration of 2 g/L that correspond to our experimental design, the appropriate values are approximately b = 0.032 and d = 0.002, which we used to parameterise the model. Figure 2(B) shows a histogram of population sizes from 10,000 realisations of the simulation, each started from a single cell, with instances of extinction (population size equals zero) omitted.
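A Gillespie simulation of a single barcoded lineage under this birth-death model might look like the sketch below. It is an illustration under stated assumptions rather than the code used for the figures: the function name is ours, and the end time of 100 is hypothetical because the time unit of the rates and the duration of the outgrowth are not stated here.

```python
import random


def simulate_birth_death(birth_rate, death_rate, t_end, n0=1, rng=None):
    """Gillespie simulation of a linear birth-death process.

    Returns the population size at time t_end (0 if the lineage went extinct).
    """
    rng = rng or random.Random()
    t, n = 0.0, n0
    while n > 0:
        total_rate = n * (birth_rate + death_rate)
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            break
        # Choose the event type in proportion to its rate.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1
        else:
            n -= 1
    return n


rng = random.Random(0)
sizes = [
    simulate_birth_death(0.032, 0.002, t_end=100.0, rng=rng)  # t_end is hypothetical
    for _ in range(1000)
]
surviving = [size for size in sizes if size > 0]  # extinct lineages omitted
```

A histogram of `surviving` corresponds to the kind of distribution described above, with extinct realisations left out.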
Under this model of stochastic exponential growth, differently barcoded populations do not interact, so the POT barcode frequency distribution was computed by combining 10,000 independent realisations of the stochastic process. The barcode frequency distribution that arises is shown in Figure 2(C).
Finally, to simulate step (2) we performed a random, equal-size, 8-way split of the full population of barcodes generated by the stochastic simulation. To determine the likelihood that a barcode appears in precisely N of the 8 replicates, we simulated the stochastic outgrowth of the POT 10 times, each with 20 associated stochastic simulations of the split, and averaged the results.
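The splitting step can be sketched as follows, assuming the outgrowth has already produced a size for each barcoded lineage. The function name and the toy input are ours, and two details are assumptions: cells left over when the population does not divide evenly by 8 are discarded, and barcodes absent from every replicate (including extinct lineages) are counted as N = 0.

```python
import random
from collections import Counter


def split_and_count(barcode_sizes, n_replicates=8, rng=None):
    """Randomly split cells into equal replicates and tabulate barcode presence.

    Returns a dict mapping N (0..n_replicates) to the fraction of barcodes
    found in exactly N replicates.
    """
    rng = rng or random.Random()
    # One list entry per cell; suitable for small illustrative populations only.
    cells = [bc for bc, size in barcode_sizes.items() for _ in range(size)]
    rng.shuffle(cells)
    per_replicate = len(cells) // n_replicates  # leftover cells are discarded
    presence = Counter()
    for i in range(n_replicates):
        chunk = cells[i * per_replicate:(i + 1) * per_replicate]
        for bc in set(chunk):
            presence[bc] += 1
    histogram = Counter(presence.get(bc, 0) for bc in barcode_sizes)
    return {n: histogram.get(n, 0) / len(barcode_sizes) for n in range(n_replicates + 1)}


# Toy population: a few large lineages and several small ones (hypothetical sizes).
toy_sizes = {"bc%d" % i: size for i, size in enumerate([500, 200, 40, 8, 3, 1, 1])}
distribution = split_and_count(toy_sizes, rng=random.Random(1))
```

Averaging such distributions over repeated outgrowth and split simulations gives an estimate of the likelihood for each N.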