Modeling the ecology of parasitic plasmids

Plasmids are autonomous genetic elements that can be exchanged between microorganisms via horizontal gene transfer (HGT). Despite the central role they play in antibiotic resistance and modern biotechnology, our understanding of plasmids’ natural ecology is limited. Recent experiments have shown that plasmids can spread even when they are a burden to the cell, suggesting that natural plasmids may exist as parasites. Here, we use mathematical modeling to explore the ecology of such parasitic plasmids. We first develop models of single plasmids and find that a plasmid’s population dynamics and optimal infection strategy are strongly determined by the plasmid’s HGT mechanism. We then analyze models of co-infecting plasmids and show that parasitic plasmids are prone to a “tragedy of the commons” in which runaway plasmid invasion severely reduces host fitness. We propose that this tragedy of the commons is averted by selection between competing populations and demonstrate this effect in a metapopulation model. We derive predicted distributions of unique plasmid types in genomes—comparison to the distribution of plasmids in a collection of 17,725 genomes supports a model of parasitic plasmids with positive plasmid–plasmid interactions that ameliorate plasmid fitness costs or promote the invasion of new plasmids.

The equations governing the dynamics of conjugative plasmids are: where α is the growth rate of plasmid-free cells, ρ is the number density of plasmid-free cells, C is the nutrient concentration, γ c is the conjugation rate, ρ p is the number density of plasmid-containing cells, δ is the death rate of cells, p is the plasmid loss coefficient, ∆ is the plasmid fitness cost, and S is the nutrient supply rate.

Condition for plasmid invasion
It can be seen that there is a plasmid-free equilibrium, y * f , at If this equilibrium is locally unstable, plasmids can invade. The plasmid-free equilibrium will be unstable when the real part of the eigenvalues of the Jacobian evaluated at this equilibrium are greater than zero. In these eigenvalue calculations, we will take advantage of the fact that the eigenvalues of block upper or lower triangular matrices are the eigenvalues of the diagonal block matrices. To obtain a block lower triangular Jacobian, J f , we reorder the equations to be [ρ p , C, ρ], yielding The submatrix in the lower right corner will always have negative eigenvalues since its T r = − Sα δ < 0 and its Det = Sα > 0. Therefore plasmids will invade if We can rearrange this as This inequality has an intuitive physical interpretation: plasmid invasion requires the rate of conjugation to overcome the losses due to lower fitness and random plasmid loss events during division.

Conjugative plasmid phase diagram
To obtain the phase diagram of possible ecologically stable outcomes, we assume that the rate of plasmid loss is negligible such that p = 0. This system contains three equilibria: a no-plasmid equilibrium, a plasmid-only equilibrium, and an equilibrium where the two cell types coexist. We construct the phase diagram by assessing the conditions for local stability of all equilibria. The stability conditions of the no-plasmid and plasmidonly equilibria are explicitly derived here, while the stability of the coexistence equilibrium is determined by numerical analysis of the Jacobian in MATLAB. We have the plasmid-free equilibrium from Eq. S4, and with p = 0 there is now a plasmid-only equilibrium located at We already have the condition for the stability of the plasmid-free equilibrium (Eq. S7) which is γ c ρ * < δ∆ when p = 0. The Jacobian at the plasmid-only equilibrium is Note that this matrix is computed using the same order of the variables as in Eq. S5. The submatrix in the upper left will always have negative eigenvalues since it has T r = − S(1−∆)α δ < 0 and Det = S(1 − ∆)α > 0.
Thus, the plasmid-only equilibrium is stable if We can rearrange this to This has a similar interpretation to Eq. S7. In order to resist invasion by plasmid-free cells, the plasmidcontaining cells must conjugate faster than 1 1−∆ − 1 δ, a rate that increases monotonically with fitness burden. When the fitness burden is small, this function is the same as in the Eq. S7: )δ, meaning that the two stability conditions converge as ∆ → 0. Now that we have determined the conditions for stability of the pure strain equilibria, we can search for regions of coexistence or bistability within the phase diagram. Coexistence between plasmid-containing and plasmid-free cells will occur if both of the single-strain equilibria are unstable. This implies a region in the phase diagram where the conjugation rate is high enough for plasmids to invade the plasmid-free equilibrium, but not high enough for the plasmidcontaining cells to resist invasion by the plasmid-free cells. Formally, this means that the critical conjugation rate for stability of the plasmid-free equilibrium is lower than the critical conjugation rate for stability of the plasmid-only equilibrium: For parasitic plasmids, there will always be a region of the phase diagram where coexistence occurs. The condition for bistability between the plasmid-free and plasmid-only equilibria is the opposite of the coexistence condition, as it requires the plasmid-free critical conjugation rate be above the plasmid-only critical conjugation rate. Thus, for parasitic plasmids this system cannot support bistability between the plasmid-free and plasmidonly equilibria when p = 0. Note that even if ∆ < 0, the system cannot support bistability since Eq. S12 is true for all ∆ = 0.

Transformative plasmids 1.2.1 Model equations
The equations governing the dynamics of transformative plasmids are: where γ t is the transformation rate, P is the number density of free plasmids, n eff is the average number of plasmids released upon the death of a plasmid-containing cell, and δ p is the decay rate of free plasmids.

Condition for plasmid invasion
We once again have a plasmid-free equilibrium, y * f , this time at Plasmids will invade if this equilibrium is unstable. We reorder the variables to be [ρ, C, ρ p , P ] to obtain J f , a block triangular Jacobian at the plasmid-free equilibrium: The submatrix in the upper left will always have negative eigenvalues since it has T r = − Sα δ < 0 and Det = Sα > 0. Thus stability depends on the eigenvalues of the lower right submatrix. When the plasmid is costly, i.e. (1 − ∆) < 1, this matrix always has at least one negative eigenvalue because it has T r = For a high-copy-number plasmid that releases many plasmids upon death (n eff 1), the plasmid invasion condition reduces to This has a similar interpretation to the earlier invasion condition. For plasmid invasion, transformation rates must overcome losses due to fitness cost and random losses during cell division.

Transformative plasmid phase diagram
Similar to our analysis of the conjugation model, we build a phase diagram by analyzing the stability of the equilibria. We again assume that the plasmid loss rate is negligible (p = 0). This system also has plasmid-only, plasmid-free, and coexistence equilibria. We will explicitly derive the stability condition for the two pure strain equilibria here and assess the stability of the coexistence equilibrium by numerically analyzing the Jacobian in MATLAB. The stability condition for the plasmid-free equilibrium with p = 0 is Also note that if the denominator is negative (n eff < ∆), the inequality flips, and because γ t > 0, the plasmidfree equilibrium will always be stable. Next, we find the plasmid-only equilibrium to be: The Jacobian for this equilibrium (using the original order of the variables) is J p is block lower triangular, and the middle submatrix is the same as the matrix in the upper left of Eq. S9 and has only negative eigenvalues. Thus, the plasmid-only equilibrium is stable if This can be rearranged to Note that the LHS is equal to γ t P * . This is similar in form to Eq. S11. For a population of plasmid-only cells to resist invasion by plasmid-free cells, the rate of transformation must overcome a rate that is monotonically increasing with plasmid cost. Using these stability conditions, we can explore whether this system can support coexistence or bistability. We begin with the condition for coexistence: the critical conjugation rate for stability of the plasmid-only equilibrium must be greater than that of the plasmid-free equilibrium, such that When ∆ > 0 and n eff > ∆, this reduces to Thus, when these conditions are met and n eff > 1, coexistence is possible. Like in the conjugation case, the condition for bistability between the plasmid-free and plasmid-only equilibria is the opposite of the condition for coexistence. Thus, when n eff < 1, the phase diagram contains a region of bistability. Even if n eff < ∆ (only possible when n eff < 1), the system will not support coexistence because the inequality in Eq. S22 will flip and the plasmid-free equilibrium will always be stable. Fig. S1: Ratio of optimal segregation loss p to optimal fitness cost ∆ for different plasmid copy costs ∆ p .
Using the same parameters as in Fig. 1C, we compute the n p that minimizes the right-hand side of Eq. 4 and find the corresponding ratio of the optimal p to ∆.
1.3 Relating ∆, n eff , and p , to plasmid copy number Many of the parameters in the ODE models (see Eqs. S1-S3 and S14-S17) are strongly influenced by a single property: the copy number of the plasmid, n p . Plasmid copy number is property that plasmids can themselves control by modulating their replication rate and is a major factor in plasmid evolution. How might plasmid cost ∆ depend on n p ? If the cost of a plasmid primarily arises from its replication, it is reasonable to assume that the cost will scale with copy number, such that ∆ = ∆ p n p where ∆ p is the cost per individual plasmid copy.
The number of plasmids released upon cell death is also likely to scale with n p , such that n eff = n p where is the fraction of plasmids that remain viable after cell death. How might the plasmid loss coefficient p relate to copy number? If we assume that daughter cells receive plasmids from the parent cell by random segregation, it follows that p = 2 1−np , i.e. the probability that one daughter cell receives zero plasmids.
As we discuss in the main text, the relationships between the model parameters and plasmid copy number allow us to compute optimal copy numbers that maximize the invasion potential of the plasmids. For a conjugative plasmid, this generally results in an intermediate optimal copy number. In Fig. S1 we characterize the relationship between the fitness cost ∆ and segregation loss rate p of optimal plasmids for different values of plasmid copy cost ∆ p . As can be seen, optimal conjugative plasmids have p < ∆.

Conjugation and transformation phase diagrams with finite loss rate p
In earlier sections we analytically characterized the stable ecological states of the single-plasmid models assuming negligible segregation loss rates (p = 0). To confirm that allowing p > 0 does not significantly alter the results, we numerically characterized the stable ecological states with finite segregation loss rates. In the case of conjugation, there is no longer a definite transition between plasmid-only and coexistence states. Since plasmid loss is finite, all equilibria contain at least a small fraction of plasmid-free cells. As such, instead of a phase diagram, we show a heatmap of the steady-state fraction of plasmid-free cells in Fig. S2A. As can be seen, there is still a transition between plasmid-containing and coexistence equilibria, but no true plasmid-only state. However, the behavior is qualitatively similar to that of the p = 0 as the plasmid-free cells become a negligible fraction of the population as plasmid cost decreases or conjugation rate increases. For transformation with finite loss rates, we show a phase diagram in Fig. S2B. As can be seen, bistability still occurs but is now between a plasmid-free state and a coexistence state.

Plasmid-driven extinction
Can a sufficiently virulent and costly plasmid drive its population to extinction? In order to answer this question we must modify the model to include a nutrient depletion term, as this will allow the existence of a equilibrium where no cells persist. We show the analysis for the conjugation model in detail here, but these results hold for the transformation model as well. First, we must modify Eq. S3 to be: The extinction equilibrium of the system with therefore be: The corresponding Jacobian is: This matrix is block triangular, and thus the eigenvalues are the eigenvalues of the diagonal blocks. By inspection, it can be seen that the largest eigenvalue is −δ + αS δca . This eigenvalue is positive when plasmid-free cells are able to persist in the absence of plasmids. In the case of transformation, the eigenvalues of the extinction Jacobian are identical except for an additional negative eigenvalue corresponding to the decay rate of free plasmids. Thus, in biologically-relevant scenarios where cells persist in the absence of the plasmid, the extinction equilibrium is unstable. This means that plasmids in our model cannot drive the population to extinction.
However, this result depends on populations being measured as continuous variables. With finite populations it is possible that a highly infectious plasmid could reduce the plasmid-free population to zero cells.

Multiplasmid, single population ODE models
For multiple plasmids in a single population, the models consist of 2 m + 1 equations governing the dynamics of all possible types of plasmid-containing cell. For large m, these equations are difficult to represent and we instead developed a script to automatically generate and simulate these equations. As an example, we show the equations for m = 2: dρ 10 dt = (1 − ∆)αCρ 10 + γ c ρ(ρ 11 + ρ 10 ) − γ c ρ 10 (ρ 01 + ρ 11 ) − (1 − ∆)αCp ρ 10 + (1 − ∆) 2 αCp ρ 11 − δρ 10 , dρ 11 dt = (1 − ∆) 2 αCρ 11 + γ c (ρ 11 ρ 01 + ρ 11 ρ 10 + 2ρ 10 ρ 01 ) − 2(1 − ∆) 2 αCp ρ 11 − δρ 11 , The subscripts of ρ indicate plasmid content (e.g. ρ 10 contains plasmid one but not plasmid two). If all plasmids are identical and begin at identical abundances, these equations can be condensed into a set of m + 2 equations: Here, the subscript of ρ indicates number of unique plasmids (e.g. ρ m has m unique plasmids). As ∆ → 0, the steady-state distribution of plasmids among cells will approach a binomial (in the case of identical plasmids) or a Poisson binomial (in the case of non-identical plasmids). To demonstrate this, we show in Fig. S3 the difference between the binomial approximation and true solution for varying values of ∆ in a system of three Fig. S3: Deviation from binomial approximation as a function of plasmid cost ∆ for a system of three identical plasmids with the same parameters as in Fig. 2. For the binomial approximation, we analytically compute the steady-state fraction of plasmid-containing cells in a system containing a single plasmid and use this as the p in the binomial distribution. For each value of ∆, the conjugation rate is adjusted such that γ c = 3γ * c , where γ * c is the critical conjugation rate needed for plasmid invasion. The solution to the full set of equations is found by numerical solution as in Fig. 2. The reported deviation is the 2-norm of the difference between the two plasmid distributions.
identical plasmids. As can be seen, the error approaches zero as ∆ decreases. Fig. S4: Distributions of unique plasmid types per genome in non-clinically relevant genera, along with corresponding model fits. We excluded 91 clinically relevant genera from our initial set of 17 725 genomes, including all major pathogens and human commensal genera. The final dataset was composed of 4845 genomes. Distributions truncated and fit using the same method as in Fig. 4B S5: Plasmid distribution analysis with putative engineered strains removed. The NCBI complete genomes dataset may contain genomes that have been artificially engineered, potentially skewing the observed plasmid distribution. To determine the impact of such genomes on our analysis, we filtered genomes based on the description of their associated Bioproject. We removed BioProjects containing the following strings in their description: 'engineer', 'clone', 'cloni', 'artificial', 'laboratory', 'recomb', 'industr', 'chimera', and 'Evolution of bacterial fitness with an expanded genetic code'. This filtering removed 173 genomes, and we then repeated the analyses found in Fig. 3. As can be seen, the results have not substantially changed. The parameters estimated from this filtered dataset are very similar to those estimated from the original dataset. Thus, we conclude that the presence of potential artificially engineered genomes has not substantially impacted our analysis of natural genomes. Fig. S6: Distributions of unique plasmid types per genome in common genera, along with corresponding model fits. We show the nine genera with the largest number of genomes, excluding Klebsiella and Escherichia (shown in Fig. 4B and C). We also exclude Bordatella, which only has one point above the observation count cutoff. Distributions truncated and fit using the same method as in Fig. 4B and C, with fitting parameters shown in each panel. plasmid types, every new epoch can be described as N draws from a multinomial distribution: Where p i is the probability of drawing a population with i plasmids, n i is the number of populations with i plasmids. For brevity we define T ≡ ∞ j=0 n j w j . In the models we analysis, w 0 = 1. The stationary distribution will occur when the expectation of the multinomial draws is equal to the current population such that E[n i (t + 1)] = N p i (t) = n i (t).
For i = 0, applying the stationary condition yields Thus for i > 0 the stationary condition is For brevity we define κ ≡ q/(1 − q) and divide both sides by N to obtain an expression for i > 0 in terms of the population fractions, f i : This is an infinite set of equations, but we can infer the relationship between the f 0 and all later entries by examining the first few equations. We first write out the equations for k = 1, 2, 3: Solving these equations gives a recursion relationship for i > 0: Therefore, the expression in terms of f 0 is For the fractions to be proper, they must all sum to 1: From this, we can solve for the value of f 0 : Thus for i > 0, the population fractions are

Case of positive epistasis
In this version of the Wright-Fisher model, we take the plasmid-free populations to have a fitness of unity, while all plasmid-containing populations have a fitness of 1 − ∆. This is an extreme form of positive epistasis among plasmid types (i.e. the presence of one plasmid reduces the fitness burden of another plasmid). Every new "epoch" can again be described as N draws from a multinomial distribution: where p i is probability of drawing a species with i plasmids, n i is the population of species with i plasmids, and 1 is the indicator function. For brevity we define T ≡ ∞ j=0 n j (1 − ∆) 1 j =0 . The stationary distribution will occur when the expectation of the multinomial draws is equal to the current population: For i = 0, applying the stationary condition yields Thus for i > 0 the stationary condition is For brevity we define κ ≡ q/(1 − q) and divide both sides by N to obtain an expression for i > 0 in terms of the population fractions, f i : Once again we will examine the first few equations to extract a recursion relationship between population fractions. Writing out the equations for k = 1, 2, 3 yields: Solving each of these gives: We can therefore extract a recurrence equation relating f i to f 0 for i > 0: Since these proportions must sum to unity, we can find f 0 from the infinite sum of the coefficients: For the stationary distribution to exist this series must converge, requiring that This makes intuitive sense: in order for the number of plasmid types per cell not to increase without bound, the reduction in reproduction probability due to the presence of plasmids must be greater than the invasion probability. The entire distribution is therefore Note that if ∆ 1 and q 1, this distribution can be reduced to a single parameter distribution:

Incompatibility interactions
How would our model behave if we included incompatibility interactions between plasmids? First, let us consider cells sampling from a uniform distribution of plasmids. Suppose there are N distinct, equally abundant, incompatibility groups that can coexist over long timescales within a single cell. Whenever a plasmid invades a cell, its incompatibility group is randomly selected from these incompatibility groups, and the invasion is only successful if the incoming plasmid belongs to a new incompatibility group. In this case, we can represent this with an invasion function q = q 0 (N − i)/N where i is the number of plasmids in the cell and q 0 is the invasion probability for a cell containing no plasmids. The outcome will depend on the number of incompatibility groups. If the number of plasmids contained within the average cell is small relative to the number of incompatibility groups, the outcome will be similar to that of a constant invasion probability. This is because the invasion probability does not change substantially if i N . However, if there are a small number of incompatibility groups, this would have an effect similar to negative fitness epistasis and shorten the tail of the plasmid distribution. Any deviation from a uniform distribution of plasmids would actually shorten the tail of the plasmid distribution. This is because it would be more likely for plasmids that already reside within the cell to attempt and fail to invade, effectively reducing the invasion rate and thus favoring lower plasmid numbers.