Cyclic di-GMP (c-di-GMP) is a second messenger used by most bacteria to control shifts between phenotypes. The best-studied examples include the transition between sessile (biofilm) and planktonic (including motile) cell states [1] and the expression of virulence factors [2]. Higher levels of c-di-GMP normally promote sessile, low-virulence phenotypes, whereas lower levels promote planktonic, high-virulence phenotypes [3]. c-di-GMP is synthesized by diguanylate cyclases (DGCs) with GGDEF domains and degraded by phosphodiesterases (PDEs) with either EAL or HD-GYP domains [3]. DGCs and PDEs adjust c-di-GMP levels in response to environmental cues [3].

Many bacteria carry a notably large number of c-di-GMP-associated genes on their chromosome: Pseudomonas aeruginosa, for example, encodes 38 GGDEF and/or EAL proteins [4]. Yet, bacterial genomes typically consist of both chromosomal and extra-chromosomal elements. Plasmids are extra-chromosomal elements and important facilitators of horizontal gene transfer (HGT), which is a main driver of bacterial evolution. Horizontal plasmid transfer typically occurs via conjugation. Conjugative plasmids encode a complete set of transfer genes, while mobilizable plasmids do not and therefore hitchhike on conjugative plasmids. Conjugative and mobilizable plasmids are collectively called transmissible plasmids [5].

Here we analyzed available sequences in GenBank and compared plasmid- and chromosome-encoded GGDEF and EAL coding DNA sequences (CDSs). Our analyses included plasmids that are 10–250 kb long. We used this strict definition to avoid misannotating cloning vectors, chromids, and small chromosomes as plasmids. Additionally, metadata available in GenBank were used to classify replicons as plasmids or chromosomes.
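The replicon selection described above can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the study: the `Replicon` record, its field names, the toy entries, and the choice to treat the 10–250 kb bounds as inclusive are all assumptions made for the example.

```python
from dataclasses import dataclass

# Size window stated in the text (10-250 kb), in base pairs.
# Treating both bounds as inclusive is an assumption.
MIN_PLASMID_BP = 10_000
MAX_PLASMID_BP = 250_000


@dataclass
class Replicon:
    accession: str     # hypothetical identifier
    length_bp: int
    annotated_as: str  # hypothetical metadata field: "plasmid" or "chromosome"


def is_analysed_plasmid(r: Replicon) -> bool:
    """Keep a replicon only if metadata call it a plasmid and it is 10-250 kb."""
    return (r.annotated_as == "plasmid"
            and MIN_PLASMID_BP <= r.length_bp <= MAX_PLASMID_BP)


# Toy records for illustration only (not GenBank entries).
replicons = [
    Replicon("toy_vector", 4_500, "plasmid"),       # too small, e.g. a cloning vector
    Replicon("toy_plasmid", 120_000, "plasmid"),    # inside the window
    Replicon("toy_chromid", 900_000, "plasmid"),    # too large, e.g. a chromid
    Replicon("toy_chromosome", 5_000_000, "chromosome"),
]
kept = [r.accession for r in replicons if is_analysed_plasmid(r)]
print(kept)
```

Only the 120 kb toy record passes both the metadata check and the size window, which mirrors how the size limits exclude likely cloning vectors at one end and chromids or small chromosomes at the other.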

We found that both GGDEF and EAL CDSs are common on plasmids, as 6.8% of plasmids carried a CDS with a GGDEF or EAL domain (Tables S1 and S2). Next, GGDEF and EAL CDSs were categorized as catalytically active or inactive based on the presence or absence of conserved amino acids in the active sites of the two domains (Fig. S1). Overall, the density of GGDEF and EAL CDSs was higher on plasmids than on chromosomes (Fig. 1a). The density of catalytically active GGDEF and EAL CDSs was higher on plasmids than on chromosomes, as was the density of inactive EAL CDSs. The relative density of inactive GGDEF CDSs was approximately the same on chromosomes and plasmids. The GGDEF and EAL domains of plasmids were more diverse (dissimilar) than the chromosomally encoded ones (Fig. S2), supporting the view of plasmids as platforms where genetic rearrangement and innovation are more likely to occur.
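The active/inactive call described above amounts to checking whether conserved residues are present at the active-site positions of an aligned domain. The sketch below illustrates that idea only. The residue set actually used is defined in Fig. S1 and is not reproduced here, so the column positions, the `GGDEF_SITE` layout (based on the well-known GG(D/E)EF signature), and the toy sequences are assumptions.

```python
from typing import Dict


def predict_active(aligned_domain: str, required: Dict[int, str]) -> bool:
    """Return True if every active-site column holds one of its allowed residues."""
    return all(
        pos < len(aligned_domain) and aligned_domain[pos] in allowed
        for pos, allowed in required.items()
    )


# Hypothetical layout: 0-based columns 2-6 of the aligned domain carry the
# GG(D/E)EF signature. Real analyses would take columns from a domain alignment.
GGDEF_SITE = {2: "G", 3: "G", 4: "DE", 5: "E", 6: "F"}

# Toy peptide fragments for illustration only.
canonical = predict_active("RYGGDEFAV", GGDEF_SITE)   # GGDEF signature intact
variant = predict_active("RYGGEEFAV", GGDEF_SITE)     # GGEEF variant, also allowed
degenerate = predict_active("RYSGDQFAV", GGDEF_SITE)  # two signature residues lost
print(canonical, variant, degenerate)
```

Checking aligned columns rather than scanning for a substring avoids chance matches elsewhere in the protein, which matters most for short signatures such as EAL.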

Fig. 1

GGDEFs and EALs are common on transmissible plasmids. a Density of EAL- or GGDEF-domain-containing CDSs on plasmids and chromosomes. Only replicons with either EAL or GGDEF domains were included in this analysis. b Maximum likelihood phylogram of active only-GGDEF proteins identified on plasmids and the most similar GGDEFs encoded on chromosomes. Bacterial names refer to the host of the GGDEF CDSs. The phylogram is based on amino acid sequences. Chromosomes were included in the analysis if reported as complete or chromosome according to NCBI assembly level. Bootstrap support (1000 replicates) above 90% is shown on branches in gray circles. c Percentage of predicted active and inactive GGDEFs and EALs encoded on transmissible or non-transmissible plasmids. The analysis is based on plasmids in GenBank with GGDEF or EAL CDSs associated with Enterobacteriaceae strains. Cluster labels in b: (A) Enterobacteriaceae cluster; (B) Streptomyces cluster; (C) Rhizobium/Agrobacterium cluster; (D) Deinococcus cluster; (E) Pantoea cluster; (F) Alpha/Gammaproteobacteria cluster; (G) Legionella cluster.

GGDEF- and EAL-domain-containing proteins often hold a diverse array of additional domains important for the biochemical function of the specific protein. We found a wide variety of domain architectures (unique combinations of types, number, and order of protein domains) among plasmid CDSs with active GGDEF and/or EAL domains, illustrating that plasmid accessory genes with GGDEF and EAL domains have very different functions. No strong tendencies were found when assessing whether some of these domain architectures were unique to plasmids. However, two domain architectures stood out, as they are associated with transposition and thus HGT. (i) The domain architecture EAL(cd01948)-DUF3330(pfam11809) is highly conserved and commonly found on plasmids from a very broad range of Bacteria (see, e.g., NCBI Identical Protein Report WP_003132006). These proteins are known as Urf2 and are typically part of transposons (e.g., Tn21, Tn501, and Tn5481). Urf2 may be involved in modulating the transposition of transposons [6, 7], but it is unclear how c-di-GMP PDE activity is connected to transposition. (ii) CDSs coding for putative proteins with the HTH_21(pfam13276)-RVE(pfam00665)-GGDEF(cd01949) domain architecture were found on Salmonella enterica plasmids (NCBI Identical Protein Report WP_012002053). This domain architecture suggests that the putative protein is a DNA integrase (HTH_21 and RVE) that either responds to c-di-GMP or synthesizes it.

Next, we looked at the CDSs of plasmids that only coded for active GGDEFs or EALs (i.e., did not contain any other known domains), as they are likely to function as unregulated c-di-GMP DGCs and PDEs, respectively, with basal enzymatic activity. These one-domain CDSs were found on plasmids in a diverse range of Bacteria. Figure 1b shows a phylogram based on unique sequences of only-GGDEFs of plasmids and the most similar GGDEFs encoded on chromosomes (Dataset S1 and S2). Figure 1b illustrates the diverse range of bacterial hosts that harbor a plasmid with GGDEFs, which included different Proteobacteria, Deinococcus-Thermus, Actinobacteria, Aquificae, Bacilli, and Cyanobacteria. Interestingly, all amino acid sequences of plasmids were different from the ones on chromosomes. Also, the hosts of the plasmid GGDEFs were typically phylogenetically distinct from the bacteria with the most similar GGDEF on the chromosomes. CDSs with only the EAL domain were also found on plasmids harbored by a phylogenetically diverse range of bacterial hosts. However, only 12 unique active only-EAL CDSs were identified (Dataset S3).

As the above indicated that active only-GGDEF and only-EAL CDSs were common on plasmid replicons, we hypothesized that DGCs and PDEs may be positively selected for on transmissible plasmids because these enzymes may induce biofilm and motile phenotypes, respectively, both of which have been shown to enhance rates of conjugative plasmid transfer [8]. We therefore characterized the plasmids as transmissible or non-transmissible based on the presence or absence of relaxase CDSs, respectively [9]. Examining the occurrence of catalytically active and inactive GGDEF and EAL CDSs on transmissible versus non-transmissible plasmids, we found that all GGDEF and EAL CDSs, regardless of predicted catalytic activity, were more common on transmissible plasmids (Fig. S3). We also performed the same analysis focusing only on Enterobacteriaceae and found similar trends (Fig. 1c). This is notable, as the ratio of transmissible to non-transmissible plasmids (10–250 kb) in GenBank has been shown to be roughly 1:1 [9].
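The comparison above rests on two steps: labeling each plasmid by the presence of a relaxase CDS, then asking what share of GGDEF/EAL-carrying plasmids falls in each class. The following sketch shows that bookkeeping under stated assumptions; the tuple layout and all records are hypothetical toy data, not results from the study.

```python
from collections import Counter

# Hypothetical toy records: (plasmid id, has relaxase CDS, carries GGDEF/EAL CDS).
plasmids = [
    ("p1", True, True),
    ("p2", True, False),
    ("p3", False, True),
    ("p4", True, True),
    ("p5", False, False),
]


def mobility_class(has_relaxase: bool) -> str:
    """Label a plasmid by the presence or absence of a relaxase CDS."""
    return "transmissible" if has_relaxase else "non-transmissible"


# Count only the plasmids that carry a GGDEF or EAL CDS.
carriers = Counter(
    mobility_class(relaxase) for _, relaxase, has_cdg in plasmids if has_cdg
)
total = sum(carriers.values())
share = {label: count / total for label, count in carriers.items()}
print(share)
```

Against a background ratio of roughly 1:1, a transmissible share clearly above 0.5 among carriers is what would indicate enrichment on transmissible plasmids.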

The evolutionary success criteria of chromosomes and plasmids are not necessarily the same [10], as explained by selfish gene and genomic conflict theories [11, 12]. We therefore speculated that some plasmids may promote specific host behaviors to enhance their own transfer frequency. To this end, the biofilm phenotype was of specific interest as it can promote conjugative plasmid transfer [8, 13]. To test this, we performed the following proof-of-concept experiments: we obtained a conjugative wild-type plasmid, pUUH239.2 [14], harbored by Klebsiella pneumoniae, which encodes a DGC. This DGC gene was cloned onto the expression vector pRham. This construct and an empty control vector were introduced into five different Enterobacteriaceae strains (Tables S3, S4, and S5) and tested for biofilm formation using the crystal violet assay, which measures attachment to plastic surfaces [15]. With the exception of S. enterica, all strains expressing the DGC produced more biofilm than the controls in this assay (Fig. 2a). S. enterica attached very poorly to the plastic surfaces, and we therefore used an alternative, the Congo red (CR) assay. CR binds to polysaccharides and beta-amyloid proteins that are produced as part of the extracellular polymeric substance (EPS) matrix during biofilm formation by S. enterica and Escherichia coli [16, 17]. Both S. enterica and E. coli formed more biofilm EPS when expressing the DGC in the CR assay (Fig. 2b). DGCs that increase c-di-GMP levels are known to reduce motility. This was observed in the tested strains capable of swimming when they expressed the DGC (Fig. 2c). Note that Klebsiella spp. are unable to swim. Lastly, we tested whether DGC expression, which facilitated the biofilm phenotype, also enhanced the rates of conjugation of plasmid pKJK5. This was the case for all strains tested (Fig. 2d, e). Klebsiella oxytoca was not included in these experiments as the strain used was not compatible with pKJK5.
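The conjugation readout is expressed as a transfer frequency (transconjugants per donor), and the effect of the DGC as a fold difference relative to the control vector. The arithmetic can be sketched as follows; the function name and the colony counts are hypothetical and serve only to illustrate the calculation, they are not data from the study.

```python
def transfer_frequency(transconjugants_cfu: float, donors_cfu: float) -> float:
    """Transfer frequency expressed as transconjugants per donor."""
    if donors_cfu <= 0:
        raise ValueError("donor count must be positive")
    return transconjugants_cfu / donors_cfu


# Hypothetical CFU/mL counts for illustration only.
dgc_freq = transfer_frequency(transconjugants_cfu=4.0e5, donors_cfu=2.0e8)
control_freq = transfer_frequency(transconjugants_cfu=1.0e5, donors_cfu=2.0e8)

# Fold difference of the DGC-expressing donor relative to the control donor.
fold_difference = dgc_freq / control_freq
print(f"{dgc_freq:.1e} vs {control_freq:.1e}, fold difference {fold_difference:.1f}")
```

Normalizing to donor counts before taking the ratio keeps the fold difference from being driven by differences in donor density between the DGC-expressing and control cultures.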

Fig. 2

Laboratory experiments link the expression of DGCs, increased biofilm formation, lowered swimming motility, and elevated transfer rates of a conjugative plasmid. A DGC gene originating from a conjugative wild-type plasmid was cloned onto an expression vector (pRham) and transformed into different species of Enterobacteriaceae: Escherichia coli (Ec), Klebsiella oxytoca (Ko), Klebsiella pneumoniae (Kp), Salmonella enterica (Se), and Serratia liquefaciens (Sl). Biofilm phenotypes, swimming motility, and conjugation of pKJK5 were then tested. a Biofilm formation by DGC-expressing strains relative to a control strain (with an empty pRham vector), tested in crystal violet (CV) assays (N = 3). b Biofilm matrix (EPS, extracellular polymeric substance) production measured by Congo red (CR) binding assays (N = 4). Below the bar chart, the increased redness and less smooth colonies of DGC-expressing strains grown on CR agar indicate elevated biofilm matrix formation. c Swimming motility of DGC-expressing strains relative to control strains (N = 3). d Fold difference of conjugal plasmid (pKJK5) transfer frequencies (transconjugants per donor) from DGC-expressing donors compared to donors with the control vector to a recipient of the same species without any vectors (N = 4). e pKJK5 transfer frequencies (transconjugants per donor) from DGC-expressing donors (black markers) and donors with the control vector (white markers) to a recipient of the same species without any vectors (N = 4). All error bars represent standard deviations. ns indicates not significant (p-values equal to or higher than 0.05; ANOVA with Tukey's post hoc test).

Although they demonstrate an association between HGT and c-di-GMP signaling, the data presented here are preliminary, as the interconnection between HGT, plasmids, and the c-di-GMP signaling system is likely much more complex than this study encompasses. We base our analyses on sequence data from GenBank, which may not reflect the actual distribution of EAL and GGDEF CDSs among plasmids. Also, specific c-di-GMP levels may induce other phenotypes not considered in our laboratory experiments, where we focus only on the biofilm phenotype. These experiments should therefore be considered a proof of concept.

We found a large proportion of inactive EAL CDSs on plasmids, suggesting that these non-catalytic protein domains have a function. This is indeed plausible because degenerate EAL domains are known to function as c-di-GMP receptor domains [18].

Collectively, our data suggest that HGT is an important factor in the evolution and ubiquity of the c-di-GMP signaling system. This indicates an overarching relationship between HGT and the c-di-GMP system, which, to our knowledge, has not previously been described despite the ubiquity of the c-di-GMP system and the great importance of HGT for the evolution of bacteria. In support of this, Bordeleau et al. [19] showed that some integrating conjugative elements of Vibrio cholerae encode active DGCs affecting biofilm formation as well as motility, and noted that GGDEFs could be identified in some conjugative plasmids and a bacteriophage. Richter et al. [20] found that the human pathogen E. coli O104:H4 expressed DgcX (a c-di-GMP DGC) at high levels and that this facilitated a unique biofilm phenotype related to the strain’s high virulence. Interestingly, the dgcX gene is encoded at an attB phage integration site and flanked by prophage elements, suggesting that the gene was acquired via HGT. Kulasekara et al. [4] found that several c-di-GMP DGC and PDE genes of P. aeruginosa are located on presumptive horizontally acquired genomic islands.

Materials and methods

Plasmid pUUH239.2 was kindly provided by Dr. Linus Sandegren.

See supplemental material file S1 for details about bioinformatics and associated statistical analyses. See supplemental material file S2 for details about the in vitro experimental materials and methods used.