Role of network-mediated stochasticity in mammalian drug resistance

A major challenge in biology is that genetically identical cells in the same environment can display gene expression stochasticity (noise), which contributes to bet-hedging, drug tolerance, and cell-fate switching. The magnitude and timescales of stochastic fluctuations can depend on the gene regulatory network. Currently, it is unclear how gene expression noise of specific networks impacts the evolution of drug resistance in mammalian cells. Answering this question requires adjusting network noise independently from mean expression. Here, we develop positive and negative feedback-based synthetic gene circuits to decouple noise from the mean for Puromycin resistance gene expression in Chinese Hamster Ovary cells. In low Puromycin concentrations, the high-noise, positive-feedback network delays long-term adaptation, whereas it facilitates adaptation under high Puromycin concentration. Accordingly, the low-noise, negative-feedback circuit can maintain resistance by acquiring mutations while the positive-feedback circuit remains mutation-free and regains drug sensitivity. These findings may have profound implications for chemotherapeutic inefficiency and cancer relapse.


Linearity assessment with L1-norm and curve fitting.
To assess the degree and range of linearity, we calculated the L1-norm metric 1,2 for the mean expression levels from the mNF-PuroR and mNF-GFP dose-responses. The metric can be thought of as the distance between a mean expression dose-response and an ideal linear function, with 0 representing perfect linearity and 0.5 being the least linear 1,2 . First, the doses and means (i.e., coordinates) were rescaled so that the first point in the new coordinate system was (0,0) and the last dose and mean was (1,1). From these discrete values, we finely interpolated the means between doses with the interp1.m MATLAB function. Then, the area between the ideal linear function and the rescaled interpolated doseresponse was calculated with the trapz.m MATLAB function. This integrated area represents the L1-norm metric up to each dose. We included 4 or more doses in each L1-norm metric calculation. The plots in Supplementary Figure 5a,c display L1-norms that are generally minimal over low to intermediate Doxycycline concentrations. The L1-norm abruptly increases once doses induce saturating means. To calculate the coefficients of determination or R-squared values of linearity for mean expression doseresponses, the mean expression data was also fit to a linear function.

Estimating switching rates after flow-sorting of mNF and mPF high-and low-expressing subpopulations.
We estimated the rates of gene expression fluctuations (switching rates) by flow-sorting of highexpressing and low-expressing mPF and mNF cells induced to the decoupled noise point (DNP) and then monitoring these sorted cells as they returned to their original expression distributions. We sorted the cells with the lowest and highest 15% of expression into separate, terminal wells corresponding to each subsequent time point of expression measurement. Using cells from these wells, we measured expression over time with a BD FACSCalibur Cell Analyzer flow cytometer at the Stony Brook School of Medicine Research Flow Cytometry Core facility. To threshold high-and low-expressing subpopulation fractions over time post-sorting, we split the expression values from each sorted sample by the median expression level of a corresponding unsorted control sample.
The mNF high-and low-sort fractions drifted over time to a value higher than the unsorted high fraction (0.5). Since the original distribution did not shift accordingly, the trend did not represent a biological phenomenon; it possibly arose from technical noise after fractioning very similar distributions. To detrend the mNF high-and low-sort fractional data, we individually fit the linear equation mx + b, with x equaling the time point (days), to three local time points in each curve with a clearly visible linear trend. The resulting time-dependent linear equation was then subtracted from each mNF high fraction value at each corresponding time point from the curve and then shifted upward by 0.5.
We fit the high subpopulation fractions of both high-and low-sorted cells over time using the following exponential curve, a general solution for two-state phenotype switching models 3,4 : where H(t) is the high subpopulation fraction over time, r is the switching rate from low to high expression (rise rate), f is the switching rate from high to low expression (fall rate), A and C are constants. Fitting lowsorted cells resulted in consistent switching rate values.
Since the individual median threshold at every time point partitions the monitored unsorted, visibly unimodal distributions into two cellular states with equal cell numbers, and cell growth rates do not differ noticeably, the calculated fits for r and f from the sorted cells reflect symmetrical cell switching rates across a consistent midway boundary. Therefore, at each time point, the rise and fall rates are equivalent: ( The corresponding rates are listed in Supplementary Table 1 while plots of high fractions over time are displayed in Supplementary Figure 7.

Image segmentation and assessment of cell survival under drug treatment.
We used the motorized Nikon Ti-E microscope at a magnfication of 10x to capture phase contrast and fluorescent images every 24 hours after initial Puromycin treatment. Each individual sample consists of 2x3 image fields tiled together with 15% overlap but without active stitching (NIS Elements AR). To determine the number of cells in each field, distinct nuclear stained cells and green fluorescent cells were counted with a bright spot detection algorithm provided by the Nikon Elements AR software. A typical diameter between 9 to 12 μm and an average contrast of 22.7 was applied in each run of the algorithm. The resulting binary objects were merged with an OR gate and then the objects were morphologically separated with a structual element repeated throughout the image. Each object was then counted per image to construct the cell growth curves.

Gene expression noise and fitness model.
The phenomenological model presented in this section facilitates a conceptual understanding of the relationship between gene expression noise and fitness. Importantly, it permits investigating how the steepness of the fitness function affects the relative advantage and disadvantage of gene expression noise in different levels of stress.
We modeled PuroR protein concentration in mNF and mPF cell populations using the equation: where µNF,PF and σNF,PF are the mean (set to unity) and standard deviation of PuroR, respectively, and ξ is the normal distribution. The growth rate (fitness) of each cell as a function of PuroR concentration is modeled using a sigmoidal function: The size of mNF and mPF cell populations is described as an exponential growth process: ( We investigated the effect of the steepness of the fitness function H and the noise ratio σPF/σNF on the advantage of mPF cells versus mNF cells under stress. Note that µ PF = µ NF and σPF  σNF. The model predicts that the high-noise mPF cells have an advantage in high stress for all fitness functions, while low-noise mNF cells are at an advantage only for steep fitness functions in low stress and are at a disadvantage otherwise (Supplementary Figure 1). This result generalizes previous work that established that low and high gene expression noise is advantageous in low and high stress conditions, respectively 6,7 .

Detailed description of plasmid construction.
The oligonucleotides used in the construction of the plasmids can be found in Supplementary  Table 3. The first intermediate plasmid pDN-D2irTNG5kwh was created by cutting pcDNA5/FRT and pDN-D2irTNG4kwh 1 with the SpeI-HF and SphI-HF restriction enzymes. The 4000 bp fragment from the first (Flp-In expression backbone) and 3400 bp fragment from the second (hTetR::NLS::EGFP construct with enhancements) plasmids were ligated together to make pDN-D2irTNG5kwh. The P2A sequence was introduced into the circuit by sequential extension PCR with primers CMV-PacI-f, SV40-AscI-BbvCI-r (flanking) and TN-P2A-r, P2A-TN-f, P2A-i2-f, P2A-i1-EGFP-f, EGFP-P2A-f (internal) and then inserting the product fragment into pDN-D2irTNG5kwh (cutting both with SpeI-HF, NotI-HF and ligating) resulting in pDN-D2irTN2AG5kwh (mNF-GFP).

Density-based gating of flow cytometry data.
Flow cytometry data was exported as individual FCS files, which were then analyzed by custom MATLAB scripts. Raw forward and side scatter values were log-transformed, and then plotted as a 2dimensional histogram. The number of bins was usually 60 unless adjusted whenever debris was unintentionally gated. The 2-dimensional histogram counts were then plotted as a contour, which further subdivided the plot with a density gate (Supplementary Figure 36). We chose the second from the widest contour level on average, which increased in density at the SSC-FSC coordinates harboring cellular events.

Determine D, P, and N via Equations (3) -(6) in Methods
call ODE solver for Equation (7) The number of cells that survive initial Puromycin treatment is given by: Ns = P + N. D is the number of cells that die during initial Puromycin treatment, P is the number of nongrowing persister cells, N is the number of growing nongenetically drug-resistant cells, and G is the number of growing genetically drug-resistant cells in the population. randn1 and randn2 are normally distributed random numbers, η1 and η2 the noise strength, and Nmax the carrying capacity of the population. Noise is added to Ns and Nmax to account for the experimental variability in the size of the initial surviving cell fraction and the size of the cell population on the day that the laboratory experiments were terminated. Cells switch between P and N phenotypes, and depending on the parameters, P cells can convert to G cells and N cells can mutate to become G cells. All parameters can be found in Supplementary Table 2.
Evolutionary dynamics model. The following pseudocode describes the stochastic population dynamics algorithm: Kill curves to determine optimal Puromycin dosage. CHO Flp-in cells (parental) were passaged at 2.5 x 10 5 cells per well in a 6-well plate to establish toxicity over increasing Puromycin concentrations. Stock Puromycin solution was diluted to 10 µg/mL in media and further diluted to conduct a kill curve with Puromycin concentrations of 1, 3, 5, 7, and 10 µg/mL added a day after cell seeding. Cells were monitored every day and media was replenished every 3-4 days up to 2 weeks. We found that 10 µg/mL Puromycin was not toxic for both uninduced circuits.

Expanded details for circuit sequencing.
During the development of the humanized TetR (hTetR), an extra glycine amino acid was inserted directly after the start codon to complete a consensus Kozak sequence and thereby enhance translational efficiency 1 . Thus, the amino acid coordinates are shifted by +1 compared to the original TetR class B protein sequence (UniProtKB: P04483; PDB: 4AC0). The CRISP-ID algorithm inferred potential mutant alleles from mixed peaks in sequencing traces (ab1 files) 9 . Background cutoff percentages (ranging from 10-25%) were scaled to the quality of nucleotide calls. Each read was evaluated starting at 85 bp and up to 600 bp depending on the location of poor read quality. In doing so, we can distinguish genetic heterogeneity of potential subpopulations in the sequence reads from mixed peaks and minimize false positives derived from poor read quality. Primers used for sequencing are listed in Supplementary  Table 4   Two overarching categories of mechanism can drive drug resistance evolution in the mNF and mPF cell populations: heritable alteration (purple) and stochastic gene expression (light gray). Heritable alterations can occur either inside (tan, red arrows) or outside (light blue) the gene circuit. Extra-circuit alterations could elevate PuroR expression (i.e., be PuroR-dependent, blue arrows) or could elicit PuroR-independent, native resistance mechanisms. In these experiments, adaptation was always PuroR-dependent. PuroR-dependent adaptation could further be classified as inducer-dependent or independent (orange). Gene expression stochasticity as a resistance mechanism should be inducer-dependent, based on the noise amplitude and noise memory at the given inducer level. Inducer-dependence should be indicated by expression reversion to pre-treatment levels in uninduced populations after drug removal.  Stochasticity cannot be ruled out for cells with the low-noise mNF-PuroR circuit because, after drug removal, expression of both induced and uninduced cells dropped to pre-treatment and pre-induction levels, respectively. On the other hand, the induced mPF-PuroR populations maintained their expression above pre-treatment levels, which is consistent with extra-circuit heritable mechanisms. Accordingly, sequencing results for mNF-PuroR and mPF-PuroR replicate 4 revealed no intra-circuit mutations.