Mobile genetic elements (MGEs) allow bacteria to rapidly adapt to changing environments through the acquisition of novel traits that range from single genes to entire catabolic pathways (Top and Springael, 2003). Of particular concern is the ability of MGEs to capture and spread antibiotic resistance determinants by horizontal gene transfer (HGT).

The presence of antibiotic resistance determinants on MGEs makes it increasingly difficult to curb the emergence and spread of drug resistance because these elements have an evolutionary life of their own (Bergstrom et al., 2000; Touchon et al., 2014). Consequently, if frequencies of antibiotic resistance are reduced following, for example, reduced selective pressures, then MGEs harboring resistance determinants can continuously move on to other recipients of the same or different species and promote their persistence (Bergstrom et al., 2000; Baquero et al., 2013). To gain more knowledge on the basic processes controlling the frequencies of antibiotic resistance in bacterial populations, it is essential that we better understand the population dynamics and evolution of bacteria with acquired MGEs encoding antibiotic resistance in both the presence and absence of antibiotic selective pressures (Johnsen et al., 2009). With the exception of work focusing on plasmids (Bouma and Lenski, 1988; Dahlberg and Chao, 2003; San Millan et al., 2014; Gullberg et al., 2014), surprisingly few studies have addressed the impact of MGE acquisitions on bacterial population dynamics (but see Starikova et al., 2012; Starikova et al., 2013). One class of MGEs with a prominent role in the emergence and spread of antibiotic resistance determinants, and for which studies on the population dynamics following acquisition are particularly limited, is the mobile resistance integrons, especially the class 1 integrons.

Integrons are genetic elements with the ability to capture, express and shuffle one of the smallest MGEs known, the gene cassettes (Stokes and Hall, 1989). Structurally, integrons consist of an integrase (intI) gene, an attachment site for acquisition of gene cassettes (attI), and an array of gene cassettes that is highly variable in number and composition (Figure 1a). More than 100 integrase genes have been described (Boucher et al., 2007). An integron integrase is capable of site-specific recombination between attI and the gene-cassette-borne attachment site (attC), or between two attC sites (Collis and Hall, 1992; Collis et al., 2001). One promoter located in this region, Pint, facilitates expression of intI and another, PC, enables transcription of the gene cassettes. Integrase expression is often controlled by the bacterial SOS response, requiring lexA derepression for expression (Guerin et al., 2009; Cambray et al., 2011). Presence of binding sites for repressors, transcription factors or other DNA-binding proteins can further modulate expression of the integrase (Cagle et al., 2011).

Figure 1

Illustration of the main model features. Panel (a) gives a schematic representation of a simple integron consisting of an integrase gene (IntI, orange) and three cassette genes (C1–C3, green). The integrase is expressed from the promoter Pint, whereas the cassette array is expressed from the promoter PC. Gene expression of the cassettes is assumed to decline with increasing distance from the PC promoter. Attachment sites for integrase-mediated gene shuffling are shown in blue (attI) and purple (attC). Panel (b) shows how integrase-mediated cassette shuffling can lead to new genotypes. Cassettes are excised from any position at rate ρ and are then, with probability θ, re-inserted in the first position of the array. Starting from the genotype C1C2C3, six genotypes can be produced depending on which cassette is excised and whether or not it is re-inserted. Panel (c) shows how different stress environments impact the stress-induced death rate hg. The top chart shows all eight combinations of presence (black) and absence (white) of three stressors. The chart below shows the stress-induced death rate in these environments for six example array genotypes consisting of different cassettes. These genotypes are shown schematically (right) and mathematically as vectors (left). The stress-induced death rate ranges from zero (white) to 3ηS=0.9 (blue).
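The shuffling rule in panel (b) can be enumerated directly. The following is a minimal sketch, assuming that an array is represented as a tuple of cassette labels ordered by distance from PC; the function name `shuffle_outcomes` is hypothetical and not taken from the original model code. It lists the two possible fates (lost, or re-inserted at the first position) for each excised cassette.

```python
def shuffle_outcomes(array):
    """Return every (genotype, reinserted) outcome of a single excision event.

    `array` is a tuple of cassette labels ordered by distance from PC
    (an illustrative representation, not the authors' data structure).
    """
    outcomes = []
    for i, cassette in enumerate(array):
        remainder = array[:i] + array[i + 1:]
        # Excised cassette is lost (happens with probability 1 - theta).
        outcomes.append((remainder, False))
        # Excised cassette is re-inserted at the first position (probability theta).
        outcomes.append(((cassette,) + remainder, True))
    return outcomes


start = (1, 2, 3)
outcomes = shuffle_outcomes(start)
genotypes = {genotype for genotype, _ in outcomes}
for genotype, reinserted in outcomes:
    print(genotype, "re-inserted" if reinserted else "lost")
```

Under this reading, the six genotypes include the starting array itself, because excising the first cassette and re-inserting it at the first position leaves the order unchanged.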

In their simplest form, gene cassettes contain a promoterless open reading frame followed by an attC site. The integron integrase can excise gene cassettes, generating transiently free circular DNA molecules, and insert free gene cassettes preferentially at attI (reviewed in Cambray et al., 2010; Escudero et al., 2015). There is a negative correlation between the distance of the gene cassettes from PC and their expression levels, except for rare occasions where internal promoters are present (for example, Weldhagen, 2004). Genes encoded by gene cassettes can be antibiotic resistance determinants or other genes that confer selective advantage, but also addiction modules or genes of unknown functions (‘ORFans’). The region downstream of the gene cassettes is variable, and in the clinically relevant class 1 integrons this region often contains a truncated qacEΔ1 and a sul1; both genes are thought to have been important in the early evolution of these elements (Gillings, 2014; Gillings et al., 2014). Integrons have been traditionally grouped into those associated with MGEs (mobile integrons) and the large chromosomal superintegrons, but there is no structural difference that distinguishes superintegrons from MGE-borne integrons (Boucher et al., 2007; Cambray et al., 2011). However, the MGE-associated integrons are often involved in the acquisition and spread of antibiotic resistance determinants, and a number of different classes have been described (reviewed in Gillings, 2014). Class 1 integrons have been extensively studied mechanistically with respect to gene-cassette acquisition. They are widely dispersed in Proteobacteria (including most clinical Gram-negatives) and occasionally found in Gram-positives (Fluit and Schmitz, 1999; Partridge et al., 2009).

The potential adaptive effects of gene-cassette acquisition as well as the clinical relevance of antibiotic resistance-encoding gene cassettes have received much attention since integrons were discovered in the late 1980s (Stokes and Hall, 1989). Moreover, the main mechanistic aspects of integration, excision and expression of gene cassettes have been elucidated in detail (Dubois et al., 2007; Dubois et al., 2009; Loot et al., 2012). Yet, the selective forces responsible for the evolution and maintenance of integrons are poorly understood. The recent documentation of substantial fitness costs imposed by newly acquired integrase genes (Starikova et al., 2012) highlights the need to identify evolutionary benefits of integrases and integrons that can overcome such fitness costs. Experimental evidence indicates that through reshuffling of cassettes within the integron, the bacteria gain access to genes that were previously not expressed because they were too distant from the promoter (Collis and Hall, 1995; reviewed in Cambray et al., 2010), but this idea has never been formally scrutinized.

Here, we construct and analyze mathematical models to gain a better understanding of integron evolutionary dynamics. Specifically, we consider a population of bacteria that is subject to a number of stochastically changing stressors (for example, antibiotics). The bacteria may carry integrons consisting of an integrase gene as well as a number of cassette genes providing resistance to these stressors. We focus on short and relatively simple integrons, as best exemplified by clinical class 1–3 integrons (see discussion section). We are interested in the selective forces that maintain functional integrases in the face of fitness costs and mutations rendering them inactive. Three versions of the model are considered: a basic model where the integrase reshuffles cassette genes at a constant rate, a model extension where reshuffling only occurs under stressful conditions and a second extension where new cassettes can be incorporated through HGT.


We constructed a mathematical model of a bacterial population in which bacteria with a functional and with an inactivated integrase compete with each other (see Figure 1 for an illustration). For a full description of this model, see the Supplementary Information. In brief, in bacteria carrying the functional integrase, the gene cassettes are reshuffled at a certain rate. This affects their patterns of gene expression and, as a result, the level of resistance to a number of stressors that the bacteria experience and that fluctuate stochastically through time. Our model is described by the following system of ordinary differential equations:

Here, Xg and Yg are the abundances of bacteria with and without a functional integrase, respectively, for different genotypes g. The four terms in the equations for dXg/dt correspond to bacterial growth, death, mutational inactivation of the integrase and integrase-mediated cassette shuffling. The model equations were solved numerically using the software package Mathematica v10.0 (Wolfram Research, Inc., Champaign, IL, USA). Unless stated otherwise, all simulations were run for 10 000 time units and in 100 replicates; simulations were generally started from a homogeneous initial population of 10⁶ bacteria carrying a functional integrase. All parameters of the model with their standard values are listed in Table 1. Parameter values chosen are based on experimental results where available (Collis and Hall, 1995; Starikova et al., 2012; Discussion section). We added two extensions to the model in which we included stress-dependent integrase activity, and HGT of gene cassettes.
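For orientation, the four terms named above can be arranged schematically as follows. This is only a sketch of the structure described in the text, not the exact equations of the model (for those, see the Supplementary Information). The logistic growth form, the symbols r (growth rate), η0 (baseline death rate) and N (total population size), and the shuffling weights S are assumptions introduced here for illustration; K, μ, ρ, ηI and hg are the symbols used elsewhere in the text.

```latex
\begin{align}
\frac{dX_g}{dt} &= \underbrace{r\left(1-\frac{N}{K}\right)X_g}_{\text{growth}}
  - \underbrace{\left(\eta_0+\eta_I+h_g(t)\right)X_g}_{\text{death}}
  - \underbrace{\mu X_g}_{\text{inactivation}}
  + \underbrace{\rho\sum_{g'}\left(S_{g'g}X_{g'}-S_{gg'}X_g\right)}_{\text{shuffling}},\\
\frac{dY_g}{dt} &= r\left(1-\frac{N}{K}\right)Y_g
  - \left(\eta_0+h_g(t)\right)Y_g
  + \mu X_g,
\qquad N=\sum_{g}\left(X_g+Y_g\right).
\end{align}
```

In this sketch, S_{g'g} weights the shuffling transitions from genotype g' to genotype g (including the factors θ and 1−θ for re-insertion and loss), and bacteria without a functional integrase pay no cost ηI and undergo no shuffling.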

Table 1 Parameters of the model with their assumed standard values


Basic model

Figure 2 shows an example for the evolutionary dynamics emerging from our model. It can be seen that owing to the fluctuating selection pressure on the population caused by the stochastically changing environment (that is, absence or presence of three stressors), genotype frequencies in the population change rapidly. Specifically, whenever a single stressor is present, genotypes that carry the corresponding resistance cassette gene at the first position within the integron are selectively favored. Similarly, when two stressors are present, genotypes carrying the two corresponding resistance cassette genes in the first two positions are favored.

Figure 2

Example evolutionary dynamics of an integron. Throughout, solid lines denote bacteria with an intact integrase, whereas dashed lines denote bacteria without a functional integrase. The bar above each plot shows the presence (black bars) or absence (white) of each of three stressors to which the cassette genes 1, 2 and 3 provide resistance, with stressor 1 shown on top and stressor 3 at the bottom. Panel (a) shows the frequencies of the different genotypes in the population. Colors denote integron genotypes with different cassettes in the first position of the integron, as indicated in the legend; genotypes with the same leading cassette are shown in the same color. The bold black line gives the total frequency of genotypes carrying a functional integrase. Panels (b and c) give the total frequencies of genotypes carrying a particular cassette in the first position among all bacteria with or without a functional integrase, respectively. (d) gives the genetic diversity (computed as the probability that two randomly drawn cassette array genotypes are different) and (e) the mean number of cassettes in integrons within the two subpopulations. All parameters take the standard values given in Table 1.
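The diversity measure in panel (d) can be computed from genotype abundances. The sketch below assumes sampling with replacement, so that the probability of drawing two different genotypes is one minus the sum of squared genotype frequencies; for the large populations considered here, the difference from sampling without replacement is negligible. The function name `genotype_diversity` is hypothetical.

```python
def genotype_diversity(abundances):
    """Probability that two randomly drawn array genotypes differ.

    `abundances` holds one non-negative abundance per cassette array genotype.
    Sampling with replacement is assumed (an approximation for large populations).
    """
    total = sum(abundances)
    if total == 0:
        return 0.0
    return 1.0 - sum((a / total) ** 2 for a in abundances)


# A monomorphic subpopulation has zero diversity; an even mix of four
# genotypes gives 0.75.
print(genotype_diversity([10]))
print(genotype_diversity([1, 1, 1, 1]))
```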

It can also be seen that throughout the simulation, the functional integrase is maintained at a high frequency in the population (black line in Figure 2a), with one drop to about 0.5 but subsequent recovery. This is despite a number of fitness costs imposed on bacteria carrying a functional integrase, including the assumed constitutive increase in death rate ηI, but also the fact that often the ‘wrong’ cassette gene will be incorporated in the first integron position and that often cassette genes are lost following excision (leading to integrons that are shorter on average; Figure 2e). The reason that the functional integrase is maintained in spite of these fitness costs is that by shuffling of cassette genes, the integrase increases genetic diversity. This can be seen by comparing genotype frequencies among bacteria with and without the functional integrase (Figures 2b vs c), and also by comparing the levels of genotype diversity in these two subpopulations (Figure 2d). As a result of this increased genetic diversity and the ensuing higher variance in fitness, the bacterial subpopulation with the functional integrase can respond more rapidly to changing selection pressure.

Given both the natural variation observed in some of the parameters of our model and the dearth of experimental estimates of others (Discussion section), it is important to ascertain how varying the different parameters over a wide range of plausible values impacts the resulting dynamics. Figure 3 shows how varying the main parameters of the model while keeping the other parameters fixed affects the final frequency of the functional integrase in the population. At very low rates ρ of integrase-mediated gene shuffling, the selective advantage of integrase-carrying bacteria will be very small and the functional integrase will be gradually lost from the population (Figure 3a). On the other hand, at very high rates of gene shuffling, the integrase becomes strongly deleterious and is also lost. This is because favorable genotypes will rapidly be eliminated through integrase action in this case, including through excision events that are not followed by reintegration of cassettes. The highest integrase frequency, close to one, is therefore seen with intermediate rates of gene shuffling.

Figure 3

Impact of different parameters on the final integrase frequency. The plots show how (a) the rate ρ of integrase-mediated cassette shuffling, (b) the cassette reintegration probability θ, (c) the fitness cost ηI of the integrase, (d) the stress-induced death rate ηS, (e) the gene expression parameter β and (f) the stress resistance parameter γ influence the final frequency of the functional integrase. For each plot, one parameter was varied, whereas all other parameters take standard values. The blue line gives the median frequency, the dashed red line the mean frequency and the shaded areas the interquartile and 90% interquantile range of final integrase gene frequencies in the population. Gray vertical lines indicate the standard value of each parameter.

Figure 3b shows the impact of the parameter θ, the fraction of cassette genes that are re-inserted following excision. This plot shows that the integrase is selected for even if a relatively high proportion (up to 80%, corresponding to θ=0.2) of cassettes are not re-inserted. In Figures 3c and d the impact of the fitness cost ηI of the functional integrase as well as the mortality ηS imposed by the stressors is explored. As expected, the integrase frequency decreases from one to zero with increasing ηI, but increases from zero to one with increasing ηS. This means that an integrase that is very costly can only be maintained at low frequencies or is not maintained at all in the population. On the other hand, the higher the stress-induced mortality rate ηS, the stronger selection for the integrase becomes. Finally, Figures 3e and f show the impact of the parameters β and γ that determine how gene order affects fitness in the presence of stressors. The integrase frequency increases monotonically with increasing β, the parameter that determines how strongly gene expression declines within the integron. When β is too low, gene expression of a cassette will only weakly be affected by its position in the integron, so that gene shuffling becomes irrelevant and there is no selection for a functional integrase. The parameter γ, determining how gene expression affects resistance levels to a given stressor, needs to take intermediate values to lead to selection for the integrase: when γ is too low, even strongly expressed genes provide little resistance and when γ is too high even weakly expressed genes provide strong resistance; in both cases gene order within the integron becomes irrelevant for fitness (Supplementary Figure S1B). Example dynamics where the parameters ρ, θ, ηI and ηS are individually varied relative to their standard values are shown in Supplementary Figure S2.

We next investigated how environmental change affects selection for the functional integrase. To this end, we screened both the average stress level, σmean and the velocity σvel with which the stressors appear and disappear (Figure 4). We observed that the integrase frequency is maximized for intermediate values of both of these parameters. This result can be understood intuitively by considering extreme values of both parameters. When σmean is very low, there will usually be no stressor present, so that on average there will be very weak selection acting on cassette genes and hence for the integrase. When σmean is very high, all stressors will be present at once most of the time, so that gene order does not matter much and selection for the integrase will also be weak. (Note, however, that even though gene-cassette order does not matter much in this case, the presence of the cassettes is of course important. This means there will be selection against integrases that excise but do not reinsert cassettes.) When the velocity at which the environment changes is very low, time periods in which reshuffling of gene cassettes is beneficial will be rare, so that again there will be only weak selection for the integrase. Finally, when the velocity of environmental change is very high, the population will essentially experience only a homogeneous average stress level in which all three stressors are present; gene order also does not matter in this case.

Figure 4

Mean frequency of the functional integrase gene for different mean stress levels σmean and rates σvel at which stress conditions change. The integrase is assumed to impose no fitness cost in a and b (ηI=0), but is costly in panels c and d (ηI=0.001). The activity of the integrase is moderate in panels a and c (ρ=0.001) and relatively high in panels b and d (ρ=0.01). Other parameters take standard values as indicated in Table 1.

Integron length varies widely across different bacteria (Cambray et al., 2010). As such, it is important to ascertain the impact of cassette numbers on selection for the integrase. Supplementary Figure S4 shows that with increasing number n of cassette genes, the integrase is maintained at a higher frequency in the population. Not surprisingly, the integrase goes extinct rapidly for n=0 (empty integron) and n=1 (no cassette shuffling possible). For n=2 and n=3, the integrase clearly provides a fitness benefit but observed frequencies vary widely between zero and one. For n=4, the integrase is strongly selected for and always maintained at a high frequency. We expect that this trend of increasing strength of selection for the functional integrase with increasing integron length will continue with n>4, but due to computational restrictions we were unable to demonstrate this. (With n=5, there are already 7812 possible genotypes in the model.) The reason that selection becomes stronger with increasing n is that for any given value of β, gene expression at the end of the cassette array will become increasingly weak. This means that those cassettes can contribute little to stress resistance and selection for the functional integrase that can move these cassettes to the first position becomes stronger.
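The genotype count quoted above for n=5 can be reproduced with a simple counting argument. This is a sketch under stated assumptions, not a description of the authors' implementation: it assumes that an array can have any length from 0 to n, that each position can hold any of the n cassette types (duplicates allowed), and that every array exists in two versions, with and without a functional integrase. The function name `count_genotypes` is hypothetical.

```python
def count_genotypes(n):
    """Count model genotypes for n cassette types under the assumptions above.

    Arrays of length k (0 <= k <= n) over n cassette types number n**k;
    the factor 2 accounts for functional versus inactivated integrase.
    """
    arrays = sum(n ** k for k in range(n + 1))
    return 2 * arrays


for n in range(6):
    print(n, count_genotypes(n))
```

Under these assumptions the count grows roughly as 2nⁿ, which illustrates why simulations with n>4 quickly become computationally demanding.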

Model extension 1: stress-dependent integrase activity

Integrase expression is often not constitutive but controlled by LexA and thus triggered by the SOS response (Guerin et al., 2009; Cambray et al., 2011). Therefore, we next investigated an extension of our model where the integrase is only active (and costly) when the stress-induced mortality exceeds a certain threshold ϕ (Supplementary Information). Figure 5a shows the impact of this threshold on the evolutionary maintenance of an integrase that is very costly (ηI=0.01) and as a result not maintained when the other parameters take standard values (Figure 3c). When ϕ is very low, the integrase is active even at very low levels of stress so that the model becomes similar (or, with ϕ=0, identical) to the basic model and the functional integrase is lost. At the other extreme, when ϕ is very high the integrase is never active and thus not under selection. Rather, the functional integrase slowly declines in frequency due to mutational decay. Therefore, it is at intermediate values of ϕ that the integrase reaches the highest frequency in the population. Here, the integrase is only activated ‘when needed’ so that the benefit of cassette reshuffling within the integron at times of stress is combined with an absence of fitness costs in times of little stress. This effect is further illustrated in Supplementary Figure S5 showing the evolutionary dynamics with the same stress environment as in Figure 2 and Supplementary Figure S2, but with increasing values of ϕ.
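The threshold rule of this extension can be sketched as a simple switch. The exact functional form is given in the Supplementary Information; the hard step used here, the function name `integrase_state` and the choice of a non-strict comparison are assumptions made for illustration. The non-strict comparison is chosen so that ϕ=0 reproduces the basic model, as stated above.

```python
def integrase_state(stress_death_rate, phi, rho, eta_I):
    """Return the effective (shuffling rate, fitness cost) of the integrase.

    The integrase is treated as fully active, and fully costly, when the
    current stress-induced death rate reaches the threshold phi, and as
    silent and cost-free otherwise (an illustrative hard-step assumption).
    """
    active = stress_death_rate >= phi
    return (rho, eta_I) if active else (0.0, 0.0)


# With phi = 0 the integrase is always on, as in the basic model.
print(integrase_state(0.0, 0.0, rho=0.001, eta_I=0.01))
# Below the threshold the integrase neither shuffles nor imposes a cost.
print(integrase_state(0.3, 0.5, rho=0.001, eta_I=0.01))
```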

Figure 5

Final integrase frequency in model extensions 1 and 2. The blue line gives the median frequency, the dashed red line the mean frequency and the shaded areas the interquartile and 90% interquantile range of final integrase gene frequencies in the population. Panel (a) shows the impact of stress-threshold ϕ (above which the integrase is expressed) on the final frequency of the functional integrase (model extension 1). Panel (b) shows the impact of the rate τ of HGT mediated by the integrase on its final frequency (model extension 2). All parameters take standard values as indicated in Table 1 except (a) ηI=0.01 and (b) ρ=0.

The advantage of stress-induced compared with constitutive integrase activity is also apparent when the impact of other parameters on the frequency of the integrase is investigated. As Supplementary Figure S6 shows (compared with Figure 3), these parameters have little impact on the frequency of the integrase. In particular, the integrase can be stably maintained at a high frequency unless almost no excised cassettes are re-inserted (θ close to zero) or the integrase is very costly (large ηI). This is because for intermediate stress thresholds ϕ, the integrase is essentially neutral in the worst case (for example, with low values of ρ, ηS or ϕ), but can still be highly advantageous in other cases (for example, when ηS is large).

Model extension 2: HGT

So far, integrase action was limited to reshuffling genes within the integron. We next considered a model extension where cassettes from other bacteria in the population may also be incorporated into the integron by means of HGT (Supplementary Information). Specifically, we assume here that cassette genes occurring within the population (in living bacteria or as free DNA) may be transferred to recipient bacteria where the integrase incorporates these cassettes into the first position of the integron. In our model, HGT thus represents an alternative pathway of how cassette arrays can be modified that may operate independently from the array rearrangements triggered by cassette excision considered previously.

To compare these two pathways, simulations were run first in the absence of gene reshuffling (ρ=0; see Supplementary Figure S8 for example dynamics). Figure 5b shows that like reshuffling, HGT can also lead to the stable maintenance of the functional integrase in the population in spite of its fitness costs. Moreover, the final frequency of the functional integrase also increases with increasing rate τ of HGT, but decreases at very high rates τ, as was the case for very high values of ρ. To compare the effect of the two rates, τ needs to be adjusted by the number of individuals in the population because HGT depends on both the number of donor and recipient individuals in the population. When multiplying τ by a factor of K=10⁹ (a rough estimate for population size) and comparing the corresponding values, it can be seen that for low to intermediate rates of τ, the effect of HGT is very similar to that of within-genome shuffling (Supplementary Figure S7).

At very high rates of ρ and τ, the integrase reaches a slightly higher frequency with HGT than with within-genome shuffling when θ=0.5 (cassette loss in 50% of cases), but a lower frequency when θ=1 (all excised cassettes are re-inserted). This can be explained by the fact that in contrast to reshuffling, HGT can also produce integron arrays containing cassette duplications. Although this may sometimes confer higher stress resistance, it means that arrays produced by HGT are generally less variable than arrays containing only different cassettes. When HGT occurs in addition to reshuffling (both ρ > 0 and τ > 0, see purple line in Supplementary Figure S7), it appears that low to intermediate rates of HGT have little impact on the frequency of the functional integrase, indicating that the two pathways interact non-linearly in producing selection for integrase action.

Finally, we considered a scenario of de novo integron evolution. This scenario is highly simplistic but may still capture important aspects of real-life integron evolution that we have not addressed so far, including the emergence of integrons carrying antibiotic resistance genes in clinical settings (Discussion section). We assume that initially, the population consists only of bacteria that carry each of the resistance cassette genes individually, but no (or no functional) integrase. This will result in fluctuating frequencies of these genotypes in the population but no generation of genotypes carrying >1 cassette. We then assume that bacteria carrying only a functional integrase but no cassette genes enter the population at a low frequency. Specifically, we assume that the initial numbers of bacteria in the population are X000(0)=10⁶ and Y100(0)=Y200(0)=Y300(0)=10⁸, with all other bacterial numbers being zero initially. By means of HGT, the bacteria carrying the functional integrase can then capture the different cassette genes, combining them into a single genotype and thus forming an integron of increasing length (gray line in Figure 6). Those bacteria that have formed the integron have achieved a fitness advantage because carrying several cassette genes provides better protection from stressors than a single cassette gene, even if not all of these cassettes are fully expressed. Thus, the bacteria carrying the fully formed integron can now spread to fixation in the population (bold black line in Figure 6).

Figure 6

Example dynamics showing how HGT can favor the invasion of bacteria with a functional integrase. Here, the population consists initially of predominantly integrase-free bacteria carrying individual cassette genes, with only a few bacteria that carry a functional integrase but no cassette genes. The three colored lines show the frequencies of bacteria without a functional integrase that carry one of three cassette genes, providing resistance to each of the three stressors (shown on top as in Figure 2). The bold black line gives the total frequency of bacteria carrying a functional integrase. Finally, the bold gray line gives the average number of cassette genes among all bacteria carrying a functional integrase (scale shown on the right-hand y axis). All parameters take the standard values shown in Table 1, with the exceptions of σvel=0.05, μ=0 and ηS=0.2.


We have studied a mathematical model exploring the selective maintenance of integron integrases within a population of bacteria exposed to changing stressful conditions. Our results indicate that reshuffling of cassettes within an integron can confer a selective advantage to a functional integrase that comes about through changes in gene expression that are often strong enough to outweigh substantial fitness costs of the integrase. Our model, albeit focusing on a single population, reproduces several empirical observations seen across bacterial strains and species, including a relatively high prevalence of mutationally inactivated integrases in both clinical isolates and experimentally evolved strains (Gillings et al., 2005; Nemergut et al., 2008; Starikova et al., 2012) and cassette arrays that are highly variable in terms of cassette composition and order (reviewed in Gillings, 2014). Taken together, our results support the view that integron integrases are selectively maintained because they enable their hosts to efficiently use cassette arrays as a ‘reservoir of standing genetic variability’ (Cambray et al., 2010).

Integron diversity encompasses attributes that our model does not account for. Our model is based on characterization of class 1 integrons and other mobile resistance integrons found in clinical settings (reviewed in Gillings, 2014). These integrons are generally short (8 cassettes), have cassettes that carry single genes that almost always confer antibiotic resistance, and generally carry a single PC promoter from which the cassettes are expressed. An important feature that is not included in our model, however, is the mobility of these integrons. Interestingly, integrases in clinical class 2 integrons are almost always inactivated by the same mutation producing an internal stop codon (Hansson et al., 2002). Viewed in the light of our model, this suggests that following their emergence as resistance integrons in pathogens, the ancestors of the bacteria carrying these integrons may have experienced prolonged stable environmental conditions so that selection to maintain a functional integrase was relaxed.

Many integrons are more complex than the clinical class 1–3 integrons. For example, the chromosomal integrons in different Vibrio species are much larger (some carry >200 cassettes) and more diverse. Many carry toxin–antitoxin systems (for example, Rowe-Magnus et al., 2003; Szekeres et al., 2007; reviewed in Cambray et al., 2010), some carry their own promoters or have >1 open reading frame, and ORFans are prevalent. Interestingly, however, in cases where the function can be inferred it appears that the gene products produced from the cassettes are often involved in interactions with the environment that may be subject to rapid change (for example, functions associated with host interactions and virulence, interaction with phages and biofilm formation). This suggests that our results are relevant beyond shorter mobile integrons of classes 1–3, and that the extraordinary gene-cassette diversity in chromosomal integrons in environmental bacteria (for example, Gillings et al., 2009; Boucher et al., 2011; Koenig et al., 2011) can be explained by positive selection for an active integron integrase.

Our model rests on the assumption that expression levels of cassette genes depend on their position within the integron: with increasing distance from PC, expression levels decline. This assumption is well supported. With only few exceptions (Bissonnette et al., 1991; Stokes and Hall, 1991; Naas et al., 2001), the gene cassettes themselves are promoterless, so that their expression relies on PC (Cambray et al., 2010). Mechanistically, transcription initiated by PC or by other internal promoters can span several cassette genes, and generally expression of cassette genes decreases with distance from the promoter (Collis and Hall, 1995; Coyne et al., 2010). Consequently, the integron gene-cassette dynamics modeled here are also relevant for internal gene-cassette promoters other than PC. The decreased promoter activity can be attributed to the presence of stem-loop secondary structures at the attC sites that modulate ribosome binding and in turn affect translation (Jacquier et al., 2009), or act as weak transcriptional terminators. For example, Collis and Hall (1995) observed roughly equal amounts of RNA transcripts containing only the first and the first two cassettes. This corresponds to a parameter of β=ln 2≈0.69 in our model, which is close to our standard value of β=0.5. However, we still lack a general and accurate description of cassette expression levels within integrons that would be essential for more quantitative predictions of integron evolution.
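The step from the transcript data to β=ln 2 can be made explicit. The sketch below assumes that expression declines exponentially with position, so that the cassette at position k is expressed at exp(−β(k−1)) relative to the first cassette; this functional form and the function name `relative_expression` are assumptions made for illustration (the model's exact expression function is described in the Supplementary Information). If transcripts carrying only the first cassette and transcripts carrying the first two cassettes are equally abundant, the second cassette is present on half of the transcripts that carry the first, so exp(−β)=1/2.

```python
import math


def relative_expression(position, beta):
    """Expression at a 1-based array position, relative to position 1.

    Assumes exponential decline with distance from PC (illustrative form).
    """
    return math.exp(-beta * (position - 1))


# Equal amounts of 'first only' and 'first two' transcripts imply that
# cassette 2 appears on half as many transcripts as cassette 1.
beta = -math.log(1 / 2)
print(beta)
print(relative_expression(2, beta))
# For comparison, the standard value used in the simulations:
print(relative_expression(2, 0.5))
```

With the standard value β=0.5, the second cassette would be expressed at about 61% of the first under this assumed form, compared with 50% for β=ln 2.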

Another knowledge gap becomes evident when modeling gene-cassette shuffling. Reordering of gene cassettes has been observed experimentally (Collis and Hall, 1992) and is believed to occur by excision of a gene cassette followed by recapture of that cassette (Hall, 2012). Studies performed in vitro show that integrase-mediated cassette excision preferentially occurs between two attC sites (releasing cassettes at position 2 or downstream), and that free gene cassettes are typically inserted at attI (in the first position, close to PC) (Collis et al., 1993; Collis et al., 2002). However, to the best of our knowledge, no experimental data are available on the frequency with which excised gene cassettes are actually reinserted. Although we assume a standard value of θ=0.5 for the probability of gene-cassette recapture, our model demonstrates that for an integrase to be stably maintained, reinsertion efficiencies down to about θ=0.2 may be sufficient in changing environments (Figure 2b).
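The excision and recapture process described above can be sketched as a single shuffling event. This is a minimal illustration, assuming that the excised cassette is drawn uniformly from position 2 or downstream and that a recaptured cassette is always inserted at the first position; the function name, the uniform choice and the cassette labels are hypothetical and are not taken from the model specification.

```python
import random


def shuffle_event(cassettes, theta=0.5, rng=random):
    """Apply one excision event to a cassette array and return the new array.

    cassettes: list ordered from the position closest to PC (index 0) outward.
    theta: probability that the excised cassette is recaptured at attI.
    rng: object providing randrange() and random(), e.g. random.Random(seed).
    """
    array = list(cassettes)
    if len(array) < 2:
        # Excision between two attC sites needs a cassette at position 2 or later.
        return array
    # Assumption: every cassette from position 2 onward is equally likely to be excised.
    index = rng.randrange(1, len(array))
    excised = array.pop(index)
    if rng.random() < theta:
        # Recapture at attI places the cassette in the first position.
        array.insert(0, excised)
    # Otherwise the excised cassette is lost from the integron.
    return array


rng = random.Random(1)
original = ["A", "B", "C"]  # hypothetical cassette labels
always_recaptured = shuffle_event(original, theta=1.0, rng=rng)
never_recaptured = shuffle_event(original, theta=0.0, rng=rng)
```

With θ = 1 the array is only reordered, whereas with θ = 0 every excision shortens the array by one cassette, which is the loss that makes very frequent shuffling costly.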

The gene-shuffling rate ρ is a function of the integrase activity, and we expect that integrase expression is an evolutionarily adjusted process that depends on Pint, the intI1 sequence, and the host genotype. Indeed, our simulations support the view that natural selection should optimize integrase expression because the integrase is most strongly favored at intermediate shuffling rates. Conversely, very high integrase activity can be deleterious due to occasional loss of excised cassettes as well as through the creation of unfavorable cassette orders within the integron. This shows that overexpressed integrases can be costly not only through pleiotropic effects in the host bacterium (for example, caused by ectopic recombination activities; Recchia et al., 1994; Harms et al., 2013), but also for the integron itself. Both of these costs can be strongly reduced when integrase expression is stress-dependent, as has been suggested by recent studies reporting SOS-induced expression of intI (Guerin et al., 2009; Cambray et al., 2011). Our model shows that indeed, for suitable thresholds of stress experienced by the bacteria (model parameter ϕ), such stress-induced integrase expression can strongly promote the maintenance of functional integrases within bacterial populations.
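Stress-dependent integrase expression can be illustrated as a threshold rule on the shuffling rate. The sketch below assumes that stress is summarized as a single number between 0 and 1 and that the integrase is active at rate ρ only when stress reaches the threshold ϕ; the scaling of stress, the default values and the function name are assumptions made for illustration only.

```python
def effective_shuffling_rate(stress, rho=0.01, phi=0.5):
    """Shuffling rate under stress-induced integrase expression.

    stress: assumed stress level of the cell, scaled to the range 0..1.
    rho: shuffling rate when the integrase is expressed (placeholder value).
    phi: stress threshold at or above which the integrase is expressed.
    """
    return rho if stress >= phi else 0.0


# A well-adapted cell does not shuffle; a stressed cell shuffles at rate rho.
rate_adapted = effective_shuffling_rate(stress=0.1)
rate_stressed = effective_shuffling_rate(stress=0.9)
```

Under such a rule, cassette loss and unfavorable reordering are avoided while the population is well adapted, and shuffling is confined to conditions in which a new cassette order is more likely to be beneficial.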

Our study also highlights a close relationship between integron and mutation rate evolution in bacteria, including mutator strains. Mutators are characterized by an elevated mutation rate, usually caused by mutations in genes responsible for DNA mismatch repair (Denamur and Matic, 2006; Raynes and Sniegowski, 2014). In a well-adapted population, mutators are disfavored because of their increased genetic load. However, in an adapting population, mutations that result in an increased mutation rate may hitchhike to high frequencies along with the beneficial mutations that they help produce. Similarly, functional integrases can be expected to decline in frequency when a bacterial population is well adapted to a stable environment. At best, integrases will decay neutrally through mutational pressure, but they may also be selected against due to fitness costs. On the other hand, in populations under selective pressure, active integrases can be indirectly selected for: even though most of the integron rearrangements they cause may be detrimental, the integrases can still hitchhike along with those rearrangements that provide a decisive fitness advantage.

HGT allows for gene flow within a population and can lead to acquisition of genes from unrelated taxa. Consequently, HGT is a major factor in the spread of antibiotic resistance determinants (Davies and Davies, 2010; Abel zur Wiesch et al., 2011; Levin et al., 2014). In the extension of our model where we incorporated HGT, the free flow of gene cassettes in the population substantially contributes to the stable maintenance of functional integrases in fluctuating environments. Here, the advantage is achieved both by moving gene cassettes into the first position within the integron, and by capturing new or previously lost gene cassettes (see also Starikova et al. (2012) for an earlier and simpler model of integrase-mediated HGT). This model extension can also help explain repeats of identical gene cassettes at different positions in the same integron (for example, Elsaied et al., 2011). Notably, in populations with static integrons that only carry defective integrase genes, for example, as a result of long-term stable environmental conditions, HGT is strongly beneficial in our model for genotypes that carry functional integrases but no suitable gene cassettes. Here, HGT provides the gene cassettes required for integron platform assembly and shuffling. As a result, the model predicts that carriers of functional integrases can rise to high frequencies over time when the environment has changed from static to fluctuating.

We have presented here the first population biological model for the maintenance of integrase-mediated gene shuffling. Our principal aim was not to present a model capable of making quantitative predictions regarding integron evolution (which seems impossible given the currently available data), but to formally scrutinize hypotheses on integron evolution and to identify parameters that are predicted to be important (see also Servedio et al. (2014) for a discussion of ‘proof-of-concept’ models). Clearly, there is much scope for future theoretical work exploring integron evolutionary dynamics. For example, whereas our model involved competitions between bacteria with and without a functional integrase, it would be interesting to investigate the evolution of integrase activity, taking into account potential constraints of this parameter with respect to PC promoter strength (Jove et al., 2010; Guerin et al., 2011). Even more importantly, our study highlights the need for experimental studies estimating crucial parameters influencing integron evolution, including fitness effects, levels of gene expression across cassette arrays and excision/reintegration rates of cassettes.