Introduction

Plasmids first became known for their role in spreading antibiotic resistance in bacteria as extrachromosomal genetic elements capable of interspecies transfer1. Today, plasmids are an invaluable tool in biotechnology and genetic engineering due to their small size and the ease with which they can be engineered for biotechnological applications. The plasmid copy number is regulated by the replication origin, and number of plasmid per cell is influenced by several factors including growth conditions2,3,4, the host strain5,6, temperature and other extracellular stresses7.

The introduction of engineered genetic systems inside bacteria, however, often results in negative metabolic effects and added fitness costs. Indeed, recent studies have probed the effect of plasmid copy number on host cell growth8,9 and the role of plasmids on central metabolic processes10. Recombinant gene expression have also been shown to add significant metabolic burdens on bacterial cells11,12,13, which may interfere with normal cellular processes and even lead to cessation of growth. The control of plasmid copy number can also drastically alter the behavior of simple repression systems14,15 which may significantly impact engineered gene circuits and should be important considerations for plasmid-based biotechnology and bioengineering applications.

Studies have also demonstrated the utility of modulating the plasmid copy number for biotechnological applications and to enhance the behavior of engineered genetic systems. For example, runaway replication has been used as a method for increasing cellular protein production16,17. Recent works have also leveraged plasmid copy numbers to amplify a signal through a large increase in copy number18,19 and to increase the cooperativity and robustness of synthetic gene circuits20. Single cell measurements have also been used to accurately quantify the prevalence of plasmid loss within a population21.

Controlling plasmid copy number should allow for the reliable scaling of biosynthesis or biological function while limiting metabolic effects associated with engineered genetic systems. While the first inducible plasmid copy number system was developed in 198422 in order to show that the priming RNA of the ColE1 origin of replication controlled plasmid copy number, this system had uncontrolled replication due to a removal of the inhibitory RNA. More recently, inducible plasmid copy number have been developed and are commercially available23,24, though these aren’t well-characterized and seem to act more as on-off switches than truly tunable copy number systems.

Further, apart from a few well-documented point mutations and deletions that drastically increase or decrease the copy numbers of specific replication origins25, the current toolkit for biologists to exert control over plasmid copy numbers is limited. Additionally, we still lack a quantitative description of the relationship between plasmid copy number and host metabolic burden.

In this work, we use two approaches to actively control the copy number of pUC19 and other ColE1-based plasmids to study the impact of copy number on cellular growth and metabolism. Specifically, we first demonstrate a strategy for finely tuning plasmid copy number in a way that is dependent on the concentration of an exogenous inducer molecule. Then, we develop a method that uses a parallelized assay for rapid screening of the dependence of a system on plasmid copy number.

We further characterize our systems at both the macroscopic and single cell level to uncover a relationship between gene copy number, protein production, and cellular growth rates. Our approach also leads to insights into the variation in plasmid copy number and the phenomena of plasmid loss and runaway replication at the single-cell level. We finally demonstrate that manipulations of plasmid copy number can be used to tune the behavior of synthetic gene circuits such as a simple CRISPRi genetic inverter, and control the expression of complex biosynthetic pathways to optimize the production of the pigment Violacein. Our work thus provides a straightforward method for generating copy number variants from a specific plasmid backbone, which we anticipate will expand the synthetic biology toolkit and constitute the basis for more nuanced attention to plasmid copy number in synthetic gene circuits, protein biosynthesis, and other biotechnological applications.

Results

Model of ColE1 plasmid replication and copy number

The molecular mechanisms that regulate plasmid copy number control have been well characterized26. In the ColE1 origin of replication, used extensively in this work, two RNAs control replication (Fig. 1A): the priming RNA (RNA-p), which acts in cis as a primer near the origin of replication, and the inhibitory RNA (RNA-i) which is transcribed antisense to RNA-p and acts in trans to inhibit replication through binding to the priming RNA prior to hybridization near the origin (Fig. 1B)27.

Fig. 1: Replication of ColE1 origin plasmids.
figure 1

A Map of the pUC19 plasmid. The origin of replication consists of a 555 bp region containing the priming RNA (RNA-p), the inhibitory RNA (RNA-i), the inhibition window, and the site of replication initiation (ORI). B Diagram of the steps leading to ColE1 replication. In the ColE1 Origin, reversible binding of the inhibitory RNA-i to RNA-p exercises control over the copy number by inhibiting replication (left branch). Production of RNA-p transcript triggers plasmid replication through the hybridization of RNA-p with the ORI (right branch). C Predicted plasmid copy number for hyperbolic (upper) and exponential (lower) initiation models28 plotted as a function of RNA-p transcription initiation rate. Inset shows the predicted dependence of plasmid copy number on host cell division time. Model parameters: ϵ = 0.545 min, ρ = 0.65, r = 0.01 min, ki = 0.95 min. D Plasmid copy number measurements made by digital droplet PCR of the aTc-inducible pUC-pTet plasmid at increasing aTc concentrations; n = 3, error bars derived from Poisson distribution of ddPCR counts. E Growth rates of cells hosting the pUC-pTet plasmid at increasing plasmid copy numbers. We see a marginal effect of plasmid copy number on cellular growth rates, inset shows individual growth curves colored by aTc concentration. Data points are means across n = 6 biological replicates, error bars = std. dev. F Plasmid copy number measurements of a plasmid that contains an additional copy of the inhibitory RNA under the control of an IPTG-inducible promoter; n = 3, error bars derived from Poisson distribution of ddPCR counts. G Growth rate vs plasmid copy number of the IPTG-inducible plasmid shows a strong metabolic burden at high levels of RNA-i expression. Data points are means across n = 6 biological replicates, error bars = std. dev.

A simple macroscopic-scale model developed by Paulsson et al. recapitulates these observations and predicts the functional dependence of plasmid copy number on the molecular details of this process28 (Fig. 1C). This process is classified as inhibitor-dilution copy number control29, owing to the fact that the copy number is controlled through a combination of replication inhibition and dilution of plasmids and regulatory components due to cell division.

Specifically, as the ratio of RNA-p transcription rate kp to RNA-i transcription rate ki is increased, plasmid copy number Np will also increase according to

$${N}_{p}=\frac{{n}\,(\epsilon +r)}{{k}_{i}-{k}_{p}}{\left[{\left(\frac{\rho {k}_{p}}{r}\right)}^{\frac{1}{n}}-1\right]}$$
(1)

where ϵ is the RNA degradation rate, ρ is the RNA-p priming probability, r is the cellular growth rate, and n is the number of rate-limiting steps in the inhibition window28. The n → 1 and n →  limits represent hyperbolic and exponential inhibition, respectively (Fig. 1C).

As the priming-to-inhibitory RNA production ratio kp/ki approaches 1, the number of inhibitory RNA transcripts will not be present in high enough number to sequester priming RNA transcripts, thereby failing to limit plasmids from duplicating in a process called runaway replication (Fig. 1C). In addition, increasing the plasmid copy number may increase the metabolic burden associated with plasmid expression and reduce the growth rate r, which in turn increases the plasmid copy number further as r → 0.

On the other hand, as the RNA production ratio kp/ki decreases, the number of plasmids inside each cell will approach 1 and small variations in the number of plasmids apportioned to each daughter cell during cell divisions may lead to the catastrophic consequence of plasmid loss. These two stresses, plasmid loss and metabolic burden, necessitate the evolution of copy number control systems to ensure that neither stress results in the loss of the plasmid30. The experimentally measured RNA production ratio kp/ki is closer to 0.3331, suggesting that the ColE1 ORI is naturally in a state that balances plasmid loss and metabolic burdens.

Tuning priming and inhibitory RNA transcription to control ColE1 copy number

The Paulsson model of ColE1-type replication predicts that the RNA-p transcription initiation rate should determine the plasmid copy number without drastically altering the sensitivity of copy number control29. Thus, in order to only control the activity of the promoter upstream of the priming RNA, we first replaced the native priming promoter in the pUC19 plasmid with an anhydrotetracycline (aTc) inducible promoter (Fig. 1D). Liquid cultures of TetR- and LacI-overexpressing E. coli carrying this plasmid were grown in aTc concentrations spanning 4 orders of magnitude and the plasmid copy number was determined via digital droplet PCR (ddPCR) from total DNA isolates by measuring the ratio of the bla gene on the plasmid to the genomic dxs gene32.

This plasmid showed a robust control of copy number with aTc induction (Fig. 1D). Copy numbers ranged from 1.4 copies per cell in the complete absence of aTc to roughly 50 copies per cell at full induction (100 ng mL1 aTc).

We subsequently measured the growth rate of cells harboring the pUC-pTet plasmid as a function of the aTc concentration in a microplate reader (Fig. 1E). Though the absolute difference in growth rates is small, a cell with a few copies of the plasmid grows roughly 10% faster than a cell with 50 plasmids, we can clearly see that increasing the plasmid copy number has a weak, though consistent, effect on the host cell growth rate.

To complement the work above, we sought to demonstrate that the transcription level of inhibitory RNA can be used to control ColE1 copy number as well. Seeking to do this with minimal perturbation to the native pUC19 origin of replication, we inserted a second copy of the inhibitory RNA downstream of the IPTG-inducible Lac promoter on the pUC19 plasmid. This gives us control over the inhibition of replication through the addition of exogenous IPTG (Fig. 1F).

When no IPTG was added to the cells we measured ~ 270 plasmids/genome, a number that agrees with what we have measured for the standard pUC19 plasmid, indicating that our system efficiently represses the additional inhibitory RNA transcript and does not affect the copy number in the absence of induction. Induction of the inhibitory RNA greatly reduced plasmid copy number to approximately 30 plasmids/genome at full RNA-i induction, demonstrating that tuning RNA-i production rate can be used to control the plasmid copy number.

It is interesting to note that we observed paradoxical effects on the growth rate from cells harboring this plasmid - at high plasmid copy numbers we see a faster growth rate and at lower copy numbers we see a markedly slower growth rate (Fig. 1G). These observations are markedly different from what we see in the pUC-pTet plasmids, where increased plasmid copy number has a relatively modest effect on the growth rate of host cells. Instead, the reduction in the maximum cell density reached by the IPTG-induced populations (Fig. 1G, inset) suggests that overproduction of the inhibitory RNA may completely abolish plasmid replication in some cells, preventing them from dividing further.

Gene expression scales with plasmid copy number

Understanding the interplay between plasmid copy number, host cell growth rate, and gene expression is key to optimizing protein production in bacterial cell cultures. If protein production is too low, one may have to process massive amounts of cells to obtain a sufficient amount of their target protein. If too much protein is produced, on the other hand, cells may stop growing or be out-competed by cells harboring plasmids that lack the correct insert. Understanding the interplay between plasmid regulation and protein expression is key to solve protein production maximization problems.

To directly measure the effects of plasmid and gene copy number on gene expression and host cell growth rate, we first inserted sfGFP downstream of a constitutive promoter on our pUC-pTet plasmid, whose copy number is controlled through exogenous aTc induction. This gives us a straightforward way of monitoring the effects of plasmid copy number on gene expression. Cell cultures with this tunable copy number plasmid were grown in a microplate reader where simultaneous measurements of fluorescence and growth rate could be made.

We first observe clear differences in growth rates as a function of both plasmid copy number and aTc concentration (Fig. 2B) in H media. We see a roughly 15% decrease in host growth rate at maximal induction of copy number (~50 copies per cell). This metabolic burden is significantly higher than the pUC-pTet plasmid (Fig. 1E), indicating that the majority of the reduction in growth rate comes from the overexpression of sfGFP and not from the increased plasmid copy number. In this sense, addition of sfGFP to the plasmid provides a stronger coupling between the host growth rate and plasmid copy number by taking up additional cellular resources.

Fig. 2: Control of gene expression through induction of plasmid copy number.
figure 2

A Schematic of the effects of adding aTc to the pUC-pTet-sfGFP plasmid. As the concentration is increased the number of plasmids and sfGFP genes increases as well as the metabolic cost of the plasmid. B Growth rates of cells hosting the pUC-pTet-sfGFP plasmid as a function of aTc concentration. C sfGFP Production of the pUC-pTet-sfGFP plasmid grown at different aTc concentrations. D Estimated plasmid copy numbers from sfGFP measurements of our plasmids. We see that the copy number induction curve shifts towards lower aTc concentration as with decreasing media richness. For all panels, Data points are means over biological replicates, error bars = std. dev. (n = 6).

Despite the fitness costs associated with increasing the plasmid copy number, we nevertheless observe an almost perfect sfGFP induction curve when compared to the aTc concentration (Fig. 2C). We performed identical measurements in 3 different media conditions: H Media, M9 minimal media + 0.2% glycerol  + 0.2% casamino acids, and M9 minimal media + 0.2% glucose. In less optimal media we saw that the sfGFP induction curve was shifted towards lower aTc values (Fig. 2C). From our fluorescence data and ddPCR measurements at the endpoints (0 and 100 ng/μL) we predicted the PCN of our pUC-pTet-sfGFP plasmid in different media compositions (Fig. 2D). From this data we observe that as the richness of the media is decreased we see a net increase in the plasmid copy number at every aTc concentration. While this results in a shift in the induction curve we nonetheless see a broad range of induction across different media conditions. We further observed that there was no consistent effect of increased plasmid copy number on host cell growth rate in the two tested minimal media compositions (Sup.), perhaps indicating that other cellular processes are the dominant limiting factor on host cell growth in these conditions.

Our pUC-pTet plasmid provides a simple way to control gene expression without the need to insert an inducible promoter upstream of the gene of interest.

Metabolic costs of the pUC-pTet plasmid at the single-cell level

While macroscopic measurements show a clear interdependence between gene expression and copy number, single cell measurements can provide a more powerful path towards understanding the relationship between plasmid copy number and gene expression. In particular, single-cell studies allow us to determine the distributions of gene expression of single cells at varying plasmid copy numbers and directly observe the phenomena of plasmid loss and runaway replication.

To first corroborate protein production measurements with plasmid copy number, we observed the pUC-pTet-sfGFP system for up to 150 generations in a microfluidic device33,34.

In agreement with population-wide measurements (Fig. 2C), individual cells show an increase in the mean fluorescence with plasmid copy number (Fig. 3A). We also observe a marked increase in the fluorescence of some cells at high aTc concentrations. Interestingly, cells with an increased fluorescence level are found to be elongated (Fig. 3A and Supplementary Fig. 1). The intensity distribution (Fig. 3B) undergoes a profound change as the plasmid copy number increases: the distribution is peaked at a very low intensity at low plasmid copy number but, as the copy number increases, the distribution widens and a high-intensity tail emerges.

Fig. 3: Microscopic Analysis of sfGFP Expressed by the pUC-pTet.
figure 3

A Representative snapshots of microfluidic chambers at various aTc concentrations. Slow-growing cells at high copy number are indicated by orange arrows and cells at 0.3 ng/mL of aTc in an “activated” state are indicated by purple arrows. Experiments in 100 ng/mL and 0.1 ng/mL conditions were repeated twice with similar results, all other experiments were repeated once. B Distribution of cellular fluorescence intensities at increasing aTc concentrations. C Normalized mean fluorescence as a function aTc concentration, log scale. Inset: Normalized mean fluorescence vs aTc concentration on a linear scale. N = number of cells listed in corresponding condition in Panel B. Data points are averages over all cells, error bars = std. dev. of distributions in panel B. D Growth rate distributions of cells grown at 0.1, 0.3, and 1 ng/mL aTc. E Fraction of slow-growing, elongated cells over time at several aTc concentrations. This measurement is indicative of the extent to which cells are burdened by the high plasmid copy numbers. F Proportion of viable cells as a function of time at various aTc concentrations. The sharp decrease in the number of viable cells at low aTc concentrations provides insight into the extent to which plasmid loss occurs at low copy numbers.

The most drastic shift in the distribution occurs between 0.3 ng mL−1 and 1 ng mL−1, where the average cell intensity (Fig. 3C) is in good agreement with population wide measurements (Fig. 2C), but each population sees a steady increase in the number of individual cells with increased sfGFP levels (Fig. 3A, red arrows). As the plasmid copy number increases further, a growing number of cells reach an “activated” state where they show fluorescence levels as high as those full pTet induction. Eventually, all cells for aTc concentrations above 3 ng mL−1 reach this activated, fully-induced state (Fig. 3C).

As the plasmid copy number sharply increases between 0.1 ng mL−1 and 1 ng mL−1 of aTc, we observe a clear drop in average growth rate (Fig. 3D and Supplementary Fig. 2) similar to bulk measurements (Fig. 2B). The distribution of cellular growth rates (Fig. 3D) show that this drop is largely due to a fraction of cells that suffer from reduced growth rate. This is likely due to the metabolic burden of plasmid maintenance and sfGFP expression severely limiting the growth of a subset of the cellular population. Above 1 ng mL−1 aTc, the growth rate of healthy cells seems to stabilize (Supplementary Fig. 2). However, we notice an increasing fraction of elongated cells (Fig. 3E), many of which have lost the ability to grow and turned extremely bright (Supplementary Movie 1). The inability of a fraction of cells to grow may explain why growth rate continually decreases even though plasmid copy number saturates at high aTc levels in bulk measurements (Fig. 2B).

At low plasmid copy numbers, we see a decreased number of cells in each microfluidic chamber. In fact, nearly every cell undergoes lysis within the first 1500 min of tracking for aTc concentrations lower than 1 ng mL−1, suggesting that cells lost resistance to carbenicillin due to plasmid loss (Fig. 3F).

These observations highlight the delicate balance between plasmid loss at low copy number and heightened metabolic burdens at high copy numbers. Our results demonstrate that both phenomena occur stochastically and on a per-cell basis.

Massively parallelized assay determines copy numbers of ColE1-derived plasmids

Though we were able to obtain robust control over plasmid copy number with our inducible plasmid, we next sought to exert control over plasmid copy number without the need for external inducer molecules. In order to survey the landscape of possible plasmid copy numbers that arise from changes of the transcription rates of the RNAs controlling ColE1 replication, we simultaneously constructed 1024 different variants of the pUC19 plasmid through site directed mutagenesis of the −35 and −10 boxes of the promoter controlling the priming RNA and at the +1 site of the priming RNA transcript (Fig. 4A, Pp library). The same procedure was performed on the promoter controlling the inhibitory RNA (Fig. 1A, Pi library), giving us 2 separate libraries of plasmids: one with a variable priming RNA transcription rate kp and another with a variable inhibitory RNA transcription rate ki.

Fig. 4: Massively parallelized copy number assay.
figure 4

A Procedure for massively parallelized copy number assay. 1) The promoter upstream of the priming RNA was randomized with site directed mutagenesis. 2) Plasmids were ligated and cells were transformed with this library of plasmids with high efficiency and grown until OD600 of 1.0. 3) At which point a fraction of the culture was diluted and regrown and the remaining fraction was saved for plasmid extraction. 4) Sequencing libraries were generated from isolated plasmids and sequenced using next generation sequencing. Growth rates and Plasmid Copy Numbers (PCN) were then calculated from the frequencies of sequencing counts present at each time point. B Ranked list of the measured fold-change for plasmids generated through mutations to the promoter driving the priming RNA in this work (n = 830). Promoters with a fold-change below zero will result in plasmid loss (PCN < 1). (lower) The distribution of copy numbers arising from mutations to the priming RNA promoter. C Ranked list of the measured fold-change for plasmids generated through mutations to the promoter driving the inhibitory RNA in this work (n = 365). D Agreement between plasmid copy number as measured by sequencing counts and digital droplet PCR. Data points are averages over n = 3 biological replicates, error bars = std. dev. E The relationship between plasmid copy number and growth rate as measured across 830 variants of the pUC19 plasmid. The size of each data point is inversely proportional to the error in the measurement, with the top, middle two and lowest quartiles representing an error of less than 5%, between 5% and 25%, and more than 25%, respectively. Data from Fig. 1E is plotted (red to blue gradient points) A linear metabolic burden seems to account for the observed relation reduction in growth rate with increasing copy number. Data points represent average over n = 3 measurements.

These libraries were constructed using multiplexed PCR and KLD assembly35 and electroporated into competent E. Coli cells (Fig. 4A, step 2). Cell cultures were grown in selection conditions until they reached an OD600 of 1.0 and subsequently diluted by a factor of 4 and grown again to an OD600 of 1.0 (Fig. 4A, step 3). Excess culture from each step of dilution was set aside for sequencing (Fig. 4A, step 4).

For each recovered mutant, the abundance of each construct at each time point was used to infer the growth rate of each individual construct. By coupling this information with the fold change in sequencing counts from the library prior to transformation, we determined the relative copy numbers of every plasmid in this library (Fig. 4B, C) and their growth rates (Fig. 4D). Our method was able to generate 1,194 plasmids, each with a distinct plasmid copy number ranging from less than 1 copy per cell to close to 800 copies per cell (Supplementary Data 1).

To make sure our sequencing-based method yields accurate plasmid copy number measurements, we next sought out to calibrate our sequencing-based numbers to absolute copy numbers made using digital droplet PCR measurements. Individual constructs of 9 promoters were selected (light blue datapoints in Fig. 4B) and their absolute copy number were determined via digital PCR by the ratio of dxs to bla genes32. There is a good agreement between fold change in sequencing abundance and the plasmid copy numbers as measured by digital droplet PCR (Fig. 4D, RMS error 0.994), indicating that the fold change in abundance of sequencing counts with a modest correction applied to correct for differences in growth rates is a sound measure of plasmid copy number.

The simultaneous measurement of plasmid copy numbers and the growth rates of their host cells provides us with a powerful measurement of the interdependence of copy number and metabolic burden. We were not able to determine any key sequence motifs that could be used to predict plasmid copy number (Supplementary Fig. 3). While a high plasmid copy number should reduce host cell growth, to what extent that occurs is not known. Our measurements of plasmid copy numbers and growth rates for the 830 plasmids in the priming promoter library, all with nearly identical origins of replications, provide a powerful way to deduce this relationship.

In Fig. 4E, we observe that, as the plasmid copy number increases, the growth rate of host cells decreases. A simple relationship in which the metabolic burden of plasmid expression is linear with the copy number largely agrees with these findings –i.e., cost = abx + b, where x is the plasmid copy number, with a = 0.065% ± 0.006% and b =  0.019 min1 ± 0.003 min1. Where errors were calculated from the variance estimates of the coefficients. This relationship can arise from a simple resource allocation model in which each plasmid takes up a small fraction of the resources necessary for cellular growth proportional to the plasmid size (pUC19 is 2686 bp in size, which is approximately equal to 0.058% the length of the E. coli genome).

To determine the extent to which antibiotic supplementation affected the growth rates and plasmid copy numbers36 of cells hosting these plasmids we inserted sfGFP into our library and measured the growth rates and fluorescence of eight constructs on a microplate reader. We observed no noticeable change in the mid-exponential or endpoint fluorescence between samples treated with and without antibiotics. (Supplementary Fig. 4) This indicates that there was no measurable change in the mean copy number due to antibiotic exposure. We observed a constant decrease in the growth rate when samples were treated with antibiotics that was independent of the plasmid copy number. Together these indicate that our supplementation with antibiotics did not have an observable effect on plasmid copy numbers or measured growth rates in our experiments.

This results opens up the possibility for improved models of plasmid evolution and replication, as they indicate that plasmids impose a metabolic burden that is linear with their copy number on their hosts. Knowledge of this relationship between host growth and plasmid copy number may shed light on the evolution of cis and trans acting regulatory elements as well as the stringency of plasmid copy number control mechanisms.

Plasmid copy number controls violacein biosynthesis

We sought to demonstrate the potential for using plasmid copy number to control complex biosynthetic processes comprising several proteins and enzymes. We chose to focus on Violacein (Fig. 5A), a purple pigment formed by the condensation of two tryptophan molecules37 which is commonly used to demonstrate biosynthesis optimization38. The violacein biosynthetic pathway contains five enzymes, VioABCDE, each under the control of a different promoter and spanning approximately 8kb of DNA (Fig. 5B). Hence, we reasoned that tuning the plasmid copy number of the whole Violacein production pathway may serve as an effective method to scale up the expression of VioABCDE while keeping the production of each enzyme under the same stoichiometric ratio.

Fig. 5: Optimization of violacein production by tuning plasmid copy number.
figure 5

A The action of 5 enzymes on two tryptophan molecules yields violacein (B) The complete VioABCDE pathway was inserted in the pUC-pTet plasmid. C Violacein production scales with aTc concentration when the pathway is inserted into the pUC-pTet plasmid. D A library of variable copy number plasmids containing the VioABCDE pathway was created by performing mutagenesis on the priming promoter of a ColE1-based plasmid. E Violacein production is can be scaled up or down by plasmid copy number. At very high copy number production is low due to increased metabolic burden. Inset: Lag time is drastically increased in constructs with high violacein production. Error bars = std. dev. (n = 4). F A two plasmid system was created in which the VioABC enzymes were placed on a single plasmid of fixed copy number and the VioDE were placed in our library of variable copy number plasmids. G The use of a two plasmid system outperforms a the single plasmid system in raw violacein yields. H Growth rate is dramatically improved with a two plasmid system. Our randomized approach leads to the creation of constructs with fast growth rates and high amounts of violacein production.

To achieve this, we constructed a pTet inducible plasmid harboring the Violacein biosynthesis pathway using Gibson assembly39 (Fig. 5B). We then grew cells with the pUC-pTet-Vio plasmid at increasing aTc concentrations and observed a clear increase in the violacein production (as measured by the absorbance at 650 nm of methanol extractions from cells) with increasing aTc concentrations (Fig. 5C). This indicates that increasing plasmid copy number is a viable way to scale up the product of a biosynthetic reaction.

Additionally, we inserted the VioABCDE pathway into a pUC plasmid under the control of a moderate-strength promoter, and we used site directed mutagenesis of the origin of replication in the manner described above to generate a plasmid copy number library (Fig. 5D). We plated cells hosting this library and picked individual colonies to measure their vioalcein production. We sequenced the promoter controlling the priming RNA of these plasmids to infer their copy number from the results of our parallelized assay and clearly saw that up to a certain point, increases in plasmid copy number lead to an increased vioalcein yield (Fig. 5E). This further demonstrates that plasmid copy numbers are a reliable way to scale the production of a biosynthetic compound.

We then reasoned that we could tune the stoichiometric ratio of the enzymes in this pathway to optimize violacein biosynthesis by placing some enzymes in our tunable plasmid copy number system and the others in a plasmid of fixed copy number. Specifically, we placed VioABC on a plasmid with the p15A origin of replication (~20–40 copies/cell) and while VioD and VioE were placed on plasmids with priming promoter library (Fig. 5F). The violacein production level of 15 colonies hosting both plasmids was measured. We found that splitting the pathway on to two plasmids improved vioalcein yields (Fig. 5G) and increased the growth rates of host cells (Fig. 5H) relative to cells hosting the single plasmid system. While the improvements in violacein were substantial, more impressive was the several fold increase in the growth rate of cells hosting this pathway even at very high levels of violacein production. Interestingly, there was no clear relationship between violacein production and host cell growth rates, as our highest producing constructs had growth rates in the upper half of those measured and some of the slowest growth rates were observed in constructs that had among the lowest levels of violacein production (Fig. 5H).

Together this indicates the tuning plasmid copy numbers is a viable way to optimize biosynthesis both through increased raw product yields and substantially increased growth rates of host cells. The value of our randomized approach is accentuated in our observation that plasmid copy number and growth rate were not necessarily correlated with final vioalcein yields (Fig. 5H) in our two plasmid system.

Control of a CRISPRi system through sponge binding sites

Controlling the plasmid copy number may also be useful in synthetic biology and in the creation of robust gene circuits18,20,40,41. To test this, we used a simple CRISPR-dCas12a inverter to demonstrate the effect of adding “sponge” binding sites to a simple genetic circuit through manipulation of plasmid copy numbers. Sponge sites are additional binding sites used to titrate out (“soak up”) transcription factors, and they have been used to redirect flux in metabolic pathways to optimize arginine production42, to activate silent biosynthetic gene clusters43, to tune gene expression timing44, and mitigate protein toxicity45. In these studies, variable numbers of sponge sites are placed on high or low copy number plasmids.

Here, we added a sponge site to a few isolates from our tunable copy number library to alter the behavior of a simple genetic circuit. Specifically, we use a two plasmid system in which a low-copy plasmid (pSC101, 2–5 copies per cell) contains an aTc inducible CRISPR-Cas12a guide RNA which can bind to an active site Sa and turn OFF a promoter driving sfGFP, while the second plasmid contains a decoy binding site Sd site that sequesters CRISPR-Cas12a molecules away from the active site Sa. The sponge plasmids were picked out from our library of variable copy number plasmids generated in Fig. 4B and span copy numbers from 30 to 270.

Time-series measurements of the fully-induced CRISPRi inverter show that the OFF state of the inverter has a higher fluorescence level as the number of sponge sites is increased (Fig. 6A). This suggests that the addition of sponge sites reduces the occupancy of the active site Sa and decreases the repression efficiency of the CRISPRi inverter.

Fig. 6: Control of a CRISPRi inverter with competitor binding sites.
figure 6

A Top: Diagram showing how production of a dCas12a CRISPR RNA (crRNA) is controlled by an aTc-inducible promoter, and the dCas12a+crRNA duplex can either bind to the active site Sa, which turns off production of sfGFP, or to a decoy site Sd residing on a “sponge” plasmid. Bottom: Fluorescence intensity vs time for CRISPR-dCas12a systems with variable numbers of sponge sites (Nc). B Induction curves for a CRISPRi system with sponge sites on plasmid with varying copy numbers, n = 3 biological replicates for each data point. C ON-OFF levels for variable numbers of sponge sites. The ON and OFF fluorescence level of the inverter both increase as the number of sponge sites varies from 0 to 270. The dynamic range decreases with increasing sponge copy number. D Active site occupancy vs number of sponge sites, as the number of sponge sites increases the active site becomes less occupied. The gray area represents the numerical solution of a fold-change model15,35 for dCas12a where the binding energy equal to − 13.25 kBT (top) and − 12.25 kBT (bottom).

We also observe a reduction in the dynamic range of circuits as the number of sponge sites is increased (Fig. 6B), where cells with more than 150 sponge sites have at least a 40% reduction in dynamic range compared to those with no sponge sites. In contrast, the addition of 30 to 50 sponge sites only reduces the dynamic range by approximately 6–8% (Fig. 6C).

While a large dynamic range is often desirable in genetic devices, a reduction in dynamic range could be used to divert the flux in a metabolic pathway, ensure that the concentration of some protein stays within a given range, or to reduce the toxicity of CRISPR-Cas proteins46 due to spurious off-target binding. Using a simple model of protein-DNA binding15, we can discover that the occupancy of the active binding site Sa (Fig. 6D) in the OFF state is reduced from nearly 100% with zero sponge sites to 50% when 270 sponge sites are present. Overall, we demonstrate that plasmid copy number can be used to accurately tune the level of transcription factor titration in a simple genetic inverter.

Discussion

This work details the development of a system of tunable copy number plasmids and demonstrates their utility in protein expression, biosynthetic pathway optimization, and synthetic biology applications. Using a combination of high-throughput measurements and single-cell experiments, we deduce relations between plasmid copy number and host cell’s growth rate, protein expression, biosynthesis, and the performance of simple regulatory elements. This system could help resolve known issues currently plaguing the design and scaling up of synthetic gene circuits such as leaky expression, narrow ranges of activation and repression control, and a large metabolic burden due to competition over a common pool of resources47,48,49.

Using a massively parallel directed mutagenesis approach, we find that the most straightforward and reliable way to change the plasmid copy number in the ColE1 Origin is through mutations to the promoter controlling the priming RNA. Changes to the machinery controlling inhibitory RNA levels are much less likely to give stable constructs as mutations to the promoter controlling inhibition change the sequence of the priming transcript. Changes to the sequence of the priming transcript can have drastic effects on plasmid copy numbers through secondary structure changes50. Overall, these results suggest that the ColE1 replication machinery is more robust to changes in the priming RNA transcription initiation rate and much less robust to changes in the parameters detailing inhibition.

Our multiplexed method of generating a plasmid copy number library allows one to rapidly screen the effects of plasmid copy number on a given system and could be tailored to select for genetic devices that perform the best under a given set of conditions. A key finding of this facet of the work is that ColE1-type plasmids impose a linear metabolic burden on their hosts proportional to the relative size of the plasmid, which may shed light on the evolution of various regulatory elements in inhibitor-dilution copy number control mechanisms.

Our single cell measurements also show that plasmid loss and runaway replication are quite common phenomena. These findings are in line with recent single cell measurements of plasmid copy numbers21 and illustrate that very careful attention should be given to the plasmids used to house genetic circuits in order to ensure reliability. Since we observed an increasingly large fraction of non-growing cells as we increased the plasmid copy number, our results further suggest that lower growth rate for populations carrying a high-copy number plasmid may be explained by non-growing cells as opposed to a universal reduction in cellular growth rates. While this work is restricted to the ColE1 origin of replication, similar principles can be used to tune the copy numbers of other inhibitor-dilution controlled plasmids. Specifically, one could mutate the promoters controlling the priming and inhibitory RNAs in any other antisense-RNA controlled origin of replication to generate a library of plasmid copy numbers from a template plasmid.

The implications of plasmid copy number go beyond its effects on gene expression and host cell fitness that we’ve presented thus far. As vehicles of horizontal gene transfer, the prevalence of which increases with plasmid copy number, plasmids play a key role in bacterial ecology and evolution51. On the other hand, a consequence of a high copy number is a reduced retention and fixation of novel plasmid variants52. Together, these two issues create a complicated image of the role of plasmid copy number in bacterial diversity and evolution, it’s possible that the methods for controlling plasmid copy number presented here can be used to help shed light on the role of plasmid replication in bacterial evolution.

The evolution of the fitness cost associated with plasmid-borne antimicrobial resistance genes is of central importance in preventing the spread of drug-resistant bacteria53. Our work presented here could provide a clear path for investigating the role of plasmid copy number in the evolution of the cost of antimicrobial resistance. Further, the fitness cost of plasmid expression is highly species-dependent and dependent on the diversity of a bacterial community51 and recent findings have shown that plasmids can quickly achieve fixation in a population under non-selective conditions54, further understanding of plasmid replication mechanisms and copy number effects might allow for a more nuanced understanding of these varied properties of plasmids.

Together, these findings demonstrate that plasmid copy number can have drastic effects on host cellular processes and provide straightforward ways of investigating these effects in other systems. Our results underscore the importance of tuning plasmid copy number for the optimization of biosynthetic pathways, gene regulatory circuits, and other synthetic biological systems.

Methods

Next-generation sequencing library preparation

The multiplexed libraries of plasmids was generated by site-directed mutagenesis using the NEB Q5 High-fidelity polymerase. Primers (pBR322-promoter1-1024-rev 5′-GAKKKMAAGA AGaTcCTTTG aTcTTTTCTA CGGGGTCTG-3′,pBR322-promoter1-1024-for 5′-CTTTTTTTCT GCGCGTWWRM TGCTGCTNGC AAACAAAAAA ACCACCG-3, pBR322-promoter1-1024-rev 5′-GTTAGGCCAC CAKKKMAAGA ACTCTGTAGC ACCGCCTACA TACC-3′, pBR322-promoter1-1024-for 5′-TACGGCTWWR MTAGAAGNAC AGTATTTGGT aTcTGCGCTC TGCTG-3′) containing degenerate nucleotides in the −35, −10, and +1 sites of the promoters controlling transcription of both RNAs were used to amplify the standard pUC19 plasmid (NEB). After PCR amplification samples were digested with DpnI (NEB) restriction enzyme to remove excess pUC19 template.

The PCR product was then circularized with electroligase (NEB) and transformed with high efficiency (≥1.5 million CFU/mL) into Endura Electrocompetent cells. The transformation efficiency was determined by plating of serial dilutions of recovered cells. After transformation liquid cultures were grown in Terrific Broth (TB, VWR) supplemented with 0.4% glycerol until reaching an OD600 of 1.0, at which point cultures were diluted 1:4 and regrown to OD600 of 1.0, a procedure that was repeated 4 times. Excess culture from each time point was set aside for plasmid isolation. Plasmids were isolated by miniprep (Zymo) and a 243 bp region of the ori was amplified (pBR322-5prime-promoter1-rev 5′-TCTACACTCT TTCCCTACAC GACGCTCTTC CGaTcTCAGA CCCCGTAGAA AAGaTcAAAG GaTcTTC-3′, pBR322-3prime-for 5′-GACTGGAGTT CAGACGTGTG CTCTTCCGAT CTCGAGGTAT GTAGGCGGTG CTACA-3′) via PCR, at which point universal primers and indices were attached to the segment using an additional PCR step. Prepared libraries were then diluted to 50 pM and sequenced on the Illumina iSeq using overlapping 2 × 150 bp overlapping reads.

Only sequences with zero mismatches in the ori region were used for further analysis. The distribution of the number of counts per construct can be seen in Supplementary Fig. 5 and a full list of the promoter sequences and their respective number of counts at each time points is provided in Supplementary Data 1, 2, and 3.

Construction of inducible plasmids

The pUC-pTet plasmid, whose copy number is inducible in the presence of aTc was constructed using a PCR-based insertion as follows. Primers (pTet-tn10-for 5′-AAATAACTCT aTcAATGATA GAGTGTCAAG AAGaTcCTTT GaTcTTTTCT ACGGGGTCTG A-3′, pTet-tn10-rev 5′-TACCACTCCC TaTcAGTGAT AGAGATaTcT GCAAACAAAA AAACCACCGC TACCAGC-3′) overlapping with the pUC19 origin of replication and containing a pTet-inducible promoter were used to linearize and amplify the pUC19 backbone. The linearized plasmid was then circularized using a KLD reaction (NEB) and transformed into competent cells via electroporation. During the recovery period 1x aTc was added to the recovery media to facilitate replication of the plasmid.

The pUC-pLac-RNAi plasmid, whose copy number can be reduced through the addition of IPTG, was constructed as follows. The inhibitory RNA (commonly called RNAI) from the pUC19 plasmid was isolated via PCR. In the process 15 bp overhangs were added to facilitate Gibson assembly. Another pUC19 plasmid was the linearized downstream of the pLac promoter (primers) and the linearized plasmid together with the inhibitory RNA insert were assembled via Gibson assembly at 50 °C for 1 h and transformed into competent cells via electroporation.

Copy number and growth rate measurement via next-generation sequencing reads

To determine the relative plasmid copy number from the abundance of next generation sequencing reads we compare the fraction of the sequencing reads at several time points with that in the initial library. The ratio of the two, with a small correction applied to account for differences in the growth rates of cells harboring different plasmids, is what we report as a relative plasmid copy number. Unless otherwise noted, all experiments and cell cultures were performed at 37 °C.

We measure the initial fraction of the reads in the ligation product (fi) and the final fractions of the reads of the plasmids harvested at different time points (f1,f2,f3,f4). Differences in the doubling time can be measured from differences in the fractions of reads from points separated in time. These relationships can be described in the equations below. First, to determine the doubling time from adjacent time points:

$${f}_{n}={f}_{n-1}{2}^{\delta t(\frac{1}{\tau }-\frac{1}{{\tau }_{av}})}$$

The above line states that the fraction at a given time point is equal to the fraction at the previous time point multiplied by the difference in the number of doublings from the bulk culture. Below we have rearranged the equation to solve for the growth rate from the variables we can either observe or control.

$$\frac{Lo{g}_{2}\big(\frac{{f}_{n}}{{f}_{n-1}}\big)}{\delta t}+\frac{1}{{\tau }_{av}}=\frac{1}{\tau }$$

Secondly, to determine the copy number from the growth rate and the difference in sequencing counts we express the fraction at the first time point as the product of the fraction in the ligation, the relative copy number, and the number of excess doubling times:

$${f}_{f}={f}_{i}C{2}^{t\big(\frac{1}{\tau }-\frac{1}{{\tau }_{av}}\big)}$$
$$\frac{{f}_{n}}{{f}_{i}}{2}^{-t\big(\frac{1}{\tau }-\frac{1}{{\tau }_{av}}\big)}=C$$

By inserting the above expression for the doubling time, we can have the relative copy number solely in terms of the fractions of each sequencing reads and the ratio of the time between time points and the time between transformation and the first time point.

$$\frac{{f}_{f}}{{f}_{i}}\frac{{f}_{n-1}}{{f}_{n}}{2}^{-\frac{t}{\delta t}}=C$$

This analysis hinges on the assumption that upon transformation each cell receives a single plasmid, that cells do not transfer plasmids between each other, and that the number of transformants of a given plasmid is proportional to the amount of that construct present in the initial electroligation product. Further, this analysis assumes that cells harboring different constructs have identical lag times and that cells remain in the exponential growth phase until an an OD600 of 1.0 (Supplementary Fig. 6 provides evidence of this).

Absolute copy number determination via digital droplet PCR (ddPCR)

Digital Droplet PCR (ddPCR) was used to corroborate our sequencing measurements of plasmid copy number. Colonies hosting the appropriate constructs were picked off of plates after transformation and grown in 3 mL of TB or H media (10 g L−1 tryptone, 8 g L−1 NaCl) to an OD600 of 1.0. Cells were then serially diluted 8000-fold in nuclease free ddH2O. Cells were lysed by heating at 95 °C for 20 min. After lysis, 1 μL of the lysate was used in each digital PCR reaction. Two digital PCR reactions were performed per sample - one targeting the genomic dxs gene and another targeting the beta lactamase bla gene on the plasmid (primers). This defines the plasmid copy number as the number of plasmids per genome. After generating droplets on the QX200 droplet generator (Biorad), reactions were thermal cycled at 95 °C for 10 min, followed by 35 cycles of 98 °C for 30 s, 58 °C for 30 s, and 72 °C for 1 min. Samples were held at 4 °C after the reaction was completed and before droplets were read. Droplets were read with the QX-200 Droplet Reader (Biorad) and concentrations were determined using the Quantasoft analysis pro software (Biorad).

PCN sponge effects on CRISPRi inverter

CRISPR system with nuclease activity deactivated Cas protein(dCas) can be used to interfere with transcription regulation (CRISPRi) by design crRNA targeting at the promoter for the gene of interest which acts a repressor of gene. However, dCas/crRNA complex can also bind to other sites that share the same DNA sequence as the target but does not directly regulate the gene (these are called competitor sites). When the second scenario happens, it will have an impact on the repression level of the gene at a given concentration of dCas/crRNA, i.e., a sponge effect for the gene of interest by consuming dCas/crRNA from the same pool. To test the sponge effect at different numbers of competitor sites, a dual-plasmid system was used. The main plasmid(Kanamycin resistant) is a low copy(pSC101) pTet-inducible CRISPRi-dFnCas12a inverter, where the crRNA transcribes by pTet target at a promoter that drivers a green fluorescence protein(sfGFP). The second plasmid is edited from the above-mentioned pUC19-PCN library by inserting a single competitor via standard Site-Directed Mutagenesis(NEB, E0554). Two plasmids were co-transformed into the GL002 strain that has dFnCas12a and TetR integrated in its genome with Mix & Go! E.coli Transformation Kit(T3001). Five different PCN-plasmids were tested with the same main inverter plasmid. The expression of sfGFP from the dual-plasmid system with different PCN was individually measured at a gradient of 20 aTc concentrations on the BioTek (Synergy H1) plate reader in H medium. Three biological replicates were used for each system, i.e., three different colonies were picked up and grown in 2 mL H media without aTc for approximately 18 h. Then, 100 μL of overnight cell culture was used and diluted into 2 mL fresh H media before transferring to a 96 well plate where each of the middle of 10 x 6 wells contains 1 μL of diluted cell culture and 200 μL fresh H media with corresponding aTc concentration.

Growth rate measurements

Growth rates of individual constructs were measured on the BioTek Synergy H1 plate reader. All plate reader experiments were performed in VWR 96 well plates with a plastic lid, wells on the outer edge of the plate were not used to avoid evaporation effects. Overnight cultures of cells grown to saturation were serially diluted by a factor of 100,000 into well-mixed 200 μL aliquots of media containing the appropriate antibiotics and inducer concentrations. Cells were grown at 37 °C for 36 h while to OD600 and fluorescence (if appropriate) was recorded every 3 min. Cells were continuously shaken orbitally at 240 rpm. Growth rates were calculated from the resultant time series data by averaging a window of ±6 min around the maximal growth rate, calculated as a discrete difference of OD600 readings between time points.

Gene expression measurements

The expression of green fluorescent protein (sfGFP) was used as a measure of gene expression in this work. Fluorescence measurements were taken on cells prepared as described above as they grew. The mean value of the fluorescence at the point of maximal growth rate divided by the OD at which the maximal growth rate was achieved was the reported fluorescence. Six replicates of each inducer concentration were taken and the average and standard deviation of this measure of gene expression were reported. Plasmid copy numbers were estimated from this data in the mid-exponential phase using the relation \(P=\frac{G\mu }{k}\), where μ is the growth rate, G is the sfGFP signal, and k is a constant estimated by fitting the data to digital PCR measurements at the endpoints.

Violacein production measurements

Cultures of single colonies hosting plasmids containing the violacein biosynthesis pathway were grown overnight in 3 mL Terrific Broth with appropriate antibiotics. Overnight cultures were then shaken for 24 h at 300 rpm at room temperature in 15 mL Falcon tubes. First, 200 μL of liquid culture was centrifuged and cells were resuspended in 50 μL of water. Then, 450 μL of methanol was added to the resuspended cells and they were shaken for 1 h at room temperature. Cell extracts were spun down and 200 μL of the supernatant was transferred to a 96-well plate where the absorbance at 650 nm was used to quantify violacein production.

Single cell microscopy

The single cell data was collected within a month over 8 subsequent experiments with varying aTc concentrations. Each experiment was run for 1300–5000 min, corresponding to at least 20–80 generations. All experiments were started from the same stock bacterial colony, initially grown to OD600 0.5 at 10 ng mL−1 aTc and kept refrigerated. In order to start the experiment, 3 μL of the stock colony was dispersed in 3 mL of H media with plasmid selecting antibiotic carbenicillin and aTc. The cells were grown to OD600 0.5 ± 0.1, concentrated by centrifugation and inserted in plasma cleaned microfluidic chips.

After entering the chip bacteria populated and grew in 60 μm long, 7.5 μm wide and 1.05 μm thick chambers connected to a supply channel flowing fresh media pressurized to 4 psi34. On top of H media, carbenicillin and aTc, 0.1 g L−1 of bovine serum albumin was added to the media to improve evacuation of cells from the supply channel55.

Single cells were resolved by epi-fluorescence microscopy of sfGFP with a 100 x, 1.4 NA apochromat Leica objective. Single cell detection was performed using customized Python scripts and OpenCV software (https://opencv.org). Tracking enabled measurement of cell dimensions, intensity and growth rate overtime. Cells can be classified in three categories: (1) non dividing dark cells (loss of antibiotic resistance due to plasmid loss), (2) healthy cells and (3) non growing bright cells (likely due to plasmid overload). Whereas in bulk pathological cells of type (1) or (3) would quickly be overgrown by healthy cells, in chambers, we find that they can get stuck at the back or even populate the whole chamber, rendering the chamber unusable when the cells eventually die. This is particularly true for non dividing dark cells at low aTc concentration. By using 7.5 μm wide chambers, we allow many cells to grow side by side, which increases the chances of healthy cells to repopulate the chambers. However, this is not enough to maintain healthy chambers for more than a few hours at low aTc concentration. For aTc concentrations of 1 ng mL1 and above a majority of chambers last for the whole experiment time of 5000 min. We base intensity measurements on growing cells only (growth rate larger than 1/3 of mean growth rate) to prevent over representation of non growing cells that can reach up to 40% of the detected cells (Fig. 3F).

Due to the rapid loss of chambers we are unable to measure a stabilized fluorescence level for aTc concentration below 1 ng mL−1. However, we can still observe that the aTc concentration promotes plasmid replication. Indeed, the chamber survives longer at higher aTc concentration (plasmid loss is less frequent) and the average fluorescence increases with aTc. For aTc concentration of 1 ng mL−1 to 100 ng mL−1 time dependent fluorescence and growth rate curves (Supplementary Fig. 5) stabilize after about 600 min (i.e., 12 generations) and are expected to accurately represent PCN in exponential growth condition.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.