Influence of mobile genetic elements and insertion sequences in long- and short-term adaptive processes of Acidithiobacillus ferrooxidans strains

The recent revision of the Acidithiobacillia class using genomic taxonomy methods has shown that, in addition to the existence of previously unrecognized genera and species, some species of the class harbor levels of divergence that are congruent with ongoing differentiation processes. In this study, we have performed a subspecies-level analysis of sequenced strains of Acidithiobacillus ferrooxidans to prove the existence of distinct sublineages and identify the discriminant genomic/genetic characteristics linked to these sublineages, and to shed light on the processes driving such differentiation. Differences in the genomic relatedness metrics, levels of synteny, gene content, and both integrated and episomal mobile genetic elements (MGE) repertoires support the existence of two subspecies-level taxa within A. ferrooxidans. While sublineage 2A harbors a small plasmid related to pTF5, this episomal MGE is absent in sublineage 2B strains. Likewise, clear differences in the occurrence, coverage and conservation of integrated MGEs are apparent between sublineages. Differential MGE-associated gene cargo pertained to the functional categories of energy metabolism, ion transport, cell surface modification, and defense mechanisms. Inferred functional differences have the potential to impact long-term adaptive processes and may underpin the basis of the subspecies-level differentiation uncovered within A. ferrooxidans. Genome resequencing of iron- and sulfur-adapted cultures of a selected 2A sublineage strain (CCM 4253) showed that both episomal and large integrated MGEs are conserved over twenty generations in either growth condition. In turn, active insertion sequences profoundly impact short-term adaptive processes. The ISAfe1 element was found to be highly active in sublineage 2A strain CCM 4253. Phenotypic mutations caused by the transposition of ISAfe1 into the pstC2 encoding phosphate-transport system permease protein were detected in sulfur-adapted cultures and shown to impair growth on ferrous iron upon the switch of electron donor. The phenotypic manifestation of the △pstC2 mutation, such as a loss of the ability to oxidize ferrous iron, is likely related to the inability of the mutant to secure the phosphorous availability for electron transport-linked phosphorylation coupled to iron oxidation. Depletion of the transpositional △pstC2 mutation occurred concomitantly with a shortening of the iron-oxidation lag phase at later transfers on a ferrous iron-containing medium. Therefore, the pstII operon appears to play an essential role in A. ferrooxidans when cells oxidize ferrous iron. Results highlight the influence of insertion sequences and both integrated and episomal mobile genetic elements in the short- and long-term adaptive processes of A. ferrooxidans strains under changing growth conditions.


Results and discussion
Genomic properties of A. ferrooxidans strains support the existence of two subspecies. The sequenced A. ferrooxidans strains used in this study have comparable genomic features in size, GC content, and global coding potential (Supplementary Table 1). Although all strains in this clade possess two rRNA operons with identical gene contexts 2 , they differ in the number of tRNAs (Supplementary Table 1). In the case of the ATCC 23270 T strain, this is due to the presence of an additional tRNA set in a foreign MGE known as ICEAfe1 23,27 . However, the origin of the additional tRNAs in other strains is currently undetermined (e.g. DSM 16786). All these strains have nearly identical (> 99.2%) 16S rRNA genes, are highly conserved genome-wise, with an average nucleotide identity (ANIb) ranging between 95.9 and 100%, and in silico DDH values averaging 88.2% (Supplementary Table 2). Hence, they can be unambiguously assigned to a single species of A. ferrooxidans. Despite this fact, two well-defined subgroups of strains (sublineages 2A and 2B) are evident from both genomic indexes calculated (Fig. 1A), with reciprocal ANI and DDH average values much closer to the species delimitation thresholds than the group-specific averages, suggesting the existence of two subspecies (Fig. 1B).
Since gene order conservation (synteny) is lost faster than sequence similarity 39 , we compared the synteny levels among A. ferrooxidans strains belonging to both 2A and 2B sublineages to assess the existence of subspecies. Using distinct iron-oxidizing Acidithiobacillus species as controls (47-92% synteny coverage), intermediate synteny coverage levels were observed within A. ferrooxidans strains when sublineages were disregarded (Fig. 1C, green; 2A vs. 2B and 2B vs. 2A strains). On the contrary, cross-comparison of strains of the same sublineage Pangenome analysis reveals substantial differences in the coding potential of the subspecies. We next analyzed the coding potential of 15 publicly available complete and draft genomes of A. ferrooxidans and derived the core, flexible, and exclusive gene complements for this set of strains. The core genome of the species, composed of protein-coding gene sequences common to all 15 strains, consisted of 1,300 protein families ( Fig. 2A Supplementary Table 3A). This set represents 42% of the CDSs encoded by A. ferrooxidans ATCC 23270 T and 9% of the total pangenomic gene complement of the species. Comparative analysis of this set of core proteins between strains assigned to each of the sublineages (Fig. 2B) revealed higher percentages of conserved proteins and higher average protein identity/similarity levels at the sublineage level ( Fig. 2C; Supplementary Table 3B), confirming the results obtained for the group at the nucleotide level. While 64 protein families present in the majority (> 90%) of the sublineage 2A strains are exclusive to this group, only 30 protein families are exclusive to the sublineage 2B strains (Supplementary Table 3C). This contrasts the 427 protein families exclusive to 'Acidithiobacillus ferruginosus' CF3 1 compared to A. ferrooxidans strains, which underly species differentiation in terms of gene content. Functional assignments of the flexible and exclusive gene complements are listed in Supplementary Table 3D and are further analyzed below. Apart from hypothetical or unknown function proteins, which comprised 79% of the flexible and exclusive gene complement of all A. ferrooxidans strains compared, the most frequent functional categories in the exclusive gene complement of sublineages 2A and 2B were addiction module proteins and pili-related functions. To learn more about the nature and organization of the dispensable/flexible genome and its contribution to subspecies differentiation, we predicted and analyzed the repertoire of MGEs in A. ferrooxidans strains.  76 vs. in silico digital DNA-DNA hybridization index (dDDH) assessed using the Genome-to-Genome Distance Calculator with recommended formula 2 77 and species cut-off limits defined by Meier-Kolthoff et al. 78 showing a clear-cut distinction between A. ferrooxidans clade 2A and 2B strains (excepting strain F221 with comparisons crossing this threshold). (B) Basic statistics for intra-and interclade genomic relatedness indexes distributions. Thresholds used for species delimitation are the following: digital DNA:DNA hybridization dDDH > 70% (same genomic species 77,78 ); Average Nucleotide Identity ANI > 96% (same genomic species 73,76 ). (C) Synteny coverage fraction (using 9 anchors, 2 to 10 calculated as in Drillon et al. 59 Fig. 1). The strategy used aims to bypass the fragmentation of most drafts (steps 1-3), the poor annotation of MGE-related contigs (steps 4-6), and the difficulties of fully reconstructing the elements present in the dataset (step 7). For this purpose, we used information available on several integrated MGEs (iMGEs) that have been identified and described to a different extent in A. ferrooxidans 16,20,22,23,27,33 . Moreover, candidate MGEs derived from applying existing MGE prediction resources to two available complete A. ferrooxidans genome sequences (NC_011761, NC_011206) were used 33,40 . A total of nine iMGEs spanning experimentally validated and bioinformatically predicted GIs, ICEs, or Tn7 transposons were present in the query genomes pertaining to A. ferrooxidans strains ATCC 23270 T and ATCC 53993 (Table 1). These were numbered correlatively (iMGE1-9) according to their position along the genome alignment of both strains (Fig. 3A). Protein-coding genes of all iMGEs were used as queries to assess the occurrence and percentage coverage of each iMGE in other A. ferrooxidans strains (Fig. 3B, Supplementary Table 4A). The orthology of the queried iMGE-associated proteins and candidate target proteins in the draft genomes of the other strains was confirmed by reciprocal best hits with BLAST (data not shown). This strategy resulted in a total of 10,604 target protein hits above a set cut-off (e-value < 0.001), which were distributed among 15 strains, including two query genomes for which iMGEs have been detected but not yet characterized 16,41 . The heatmap in Fig. 3B shows that sublineage 2A iMGEs are poorly conserved in A. ferrooxidans sublineage 2B strains, except for iMGE6 and iMGE2, which are present and conserved in most strains of this species, suggesting early fixation in the evolutionary history of the clade. Identified target proteins were analyzed for their likelihood of iMGE affiliation by analyzing contextual information (Supplementary Fig. 1; step 5), including G + C content skew, depth   Table 4A). For illustrative purposes, we chose strain CCM 4253 as a test case based on the quality and contiguity of its draft genome (Fig. 3C). The main characteristics of the novel candidate iMGEs identified in strain CCM 4253 are summarized in Table 2 (see Supplementary Table 4C for further details). These include two www.nature.com/scientificreports/ novel ICEs (iMGE10 and iMGE13), two novel GIs (iMGE11 and iMGE12), and two variant versions of the ATCC 23270 T iMGE1 and iMGE4. The occurrence of novel iMGEs at the predicted integration sites in the genome of strain CCM 4253 was experimentally validated for all six of these iMGEs, while mitomycin C-inducible excision could be demonstrated in five of them (Fig. 3D, the primer sequences are listed in Supplementary Table 4D).  Table 1). (B) Occurrence and coverage of validated and candidate iMGEs of A. ferrooxidans ATCC 23270 T and ATCC 53993 in an extended set of draft genomes of the species. Strains were clustered based on the coverage patterns of the MGEassociated gene products (cut-off: e-value -10; detailed in Supplementary Table 4). (C) Identification of novel iMGEs and iMGE-fragments in strain CCM 4253. Colored elements in the outer layer are known and candidate iMGEs in ATCC 23270 T and ATCC 53993 genomes. Some iMGEs are strain-exclusive, such as iMGE1 20 and iMGE3 23 . Others are conserved in two reference strains used as queries. Dark blue elements correspond to CCM 4253 strain genomic assembly contigs (GCA_003233765). Additional information layers, from the center of the figure outwards, correspond to (a) the TBlastN hits found in the CCM 4253 genome using either query strains MGEs; (b) the G + C content skew of the position; (c) the deep local coverage at the position; (d-f) the exclusive, flexible, or core pangenome compartment to which a given CCM 4253 gene pertains and (g) the prediction of MGE features using a combination of tools as previously described in Gonzalez et al. 40 and Moya-Beltrán et al. 33 .
(D) PCR validation of novel and partially conserved candidate iMGEs identified in the genome of strain CCM 4253. PCR products correspond to specific attL, attR, attP, and attB sites (the scheme of experimental design is shown in Supplementary Table 4D and primer sequences are listed in Supplementary Table 4E) of the following iMGEs: iMGE1 (partial ICE integrated into the rimO gene); iMGE10 (integrated at tRNA Thr-TGT); iMGE11 (integrated at tRNA Arg-CCT); iMGE12 (integrated at tRNA Arg-TCT); iMGE13 (integrated at tRNA Ala-GGC); iMGE4 (partial ICE integrated at tRNA Ala-CGC). PCR of attP and attB sites were evaluated on DNA recovered after incubation of A. ferrooxidans CCM 4253 cells with mitomycin C. Lane M represents the 100 bp DNA ladder; lanes L, R, P, and B represent the attL, attR, attP, and attB specific sites in the individual iMGEs, respectively. The gel image has been cropped for display. The original gel image is shown in Supplementary  Table 4F. tion pipeline used in this work produced several hits in stand-alone contigs that failed to incorporate into the chromosome scaffolds or the circular assemblies of the draft genomes. Some exhibited genomic signatures of plasmids and were highly conserved exclusively among 2A strains (Fig. 4). BLASTn analysis of these contig segments against the non-redundant NCBI database revealed a high level of sequence identity (> 88%) and synteny coverage against a member of the pTFI91-like plasmid family, specifically A. ferridurans ATCC 33020 T plasmid pTF5 42 . This plasmid family has initially been described in A. ferrooxidans strains 43 and found in other Acidithiobacillus spp. 42 . They share a conserved 2.2-kb SacI restriction endonuclease region containing the replication origin (oriV). The oriV region of A. ferrooxidans 2A candidate plasmids identified downstream of the rep was found to be part of a gene cluster encoding invertase, integrase, partition genes parA and parG, and plasmid replicase, flanked by two ISAfd1-like transposases, likely forming a distinct insertion sequence (Fig. 4A). The identified oriV is highly conserved compared to that described for pTF5. It exhibits binding sites for DnaA and IHF family proteins (Fig. 4B), which supports the assignment of these contigs as episomal MGEs (eMGEs). The existence of a pTF5-like plasmid was confirmed in silico in A. ferrooxidans CCM 4253 by recirculation of the contig 15 (QKQP01000015) (Fig. 4C) and then experimentally by isolation and cleavage of the replicon using the www.nature.com/scientificreports/ restriction endonuclease BamHI, resulting in three predicted products of 3,077, 4,155, and 10,594 bp (Fig. 4D).
In addition to the replication module, the pTF5-like plasmids identified in A. ferrooxidans 2A strains encoded (i) a single addiction or vapBC-type module, (ii) an adaptation module comprising genes encoding redox-active proteins predicted to function in electron transport systems (ntcA / fnr, hcp, hcr, nnrS), and (iii) an ISAfe25 transposon consisting of tnpARX genes. The ISAfe25 transposon is replaced in two strains by a cluster of three genes encoding retron-type RNA-directed DNA polymerase, StbC protein, and putative NERD domain protein.
The gene cluster encoding the redox-active proteins forms the essential part of the sequence of the pTF5-like plasmids, which distinguishes these larger plasmids from the 9.8-kb pTFI91 plasmid 43 . The adaptation genes ntcA, hcp, and hcr (with 99% identity) were also part of the flexible gene complement of plasmid-free A. ferrooxidans strains, including the ATCC 23270 T (ICEAfe2) and sublineage 2B strains (near MGE-signature genes such as trb-type T4SS genes, data not shown). Complete annotation of the pTF5-like plasmid from strain CCM 4253 is provided in Supplementary  www.nature.com/scientificreports/ the MGEs and the chromosome were inferred by assigning the individual protein products to the core, flexible, or exclusive pangenome gene complements. Only genes pertaining to the flexible or exclusive gene pool were retained for further analysis (Supplementary Table 6). This strategy produced 7596 protein families associated with known and novel MGEs of the species and its sublineages (Fig. 5A). As a result, 51.8% of the A. ferrooxidans pangenome is tentatively associated with the mobilome. Functional assignment of these genes, and functional classifications achieved using CDD, COGs, and KEGG revealed that 5.7% of MGE-associated protein families encode functions related to MGE biology (replication, recombination/integration, mobilization/conjugation, maintenance/stability, partition), 5.9% encode accessory functions with potential adaptive value (defense systems, transporters, enzymes), and 16.6% lack yet functional assignment (Supplementary Table 6). Representative results of the protein family (PF) distributions per functional class obtained using COGs are shown in Fig. 5B. COG enrichment analysis showed that a subset of 137 protein families (comprising 5063 individual proteins) were exclusively assigned to single strains of one sublineage (Fig. 5B, exclusive PFs in orange) or to several strains from the same sublineage (Fig. 5B, intra-lineage PFs in green). As expected from the number of strains in each dataset (11 strains in clade 2A vs 4 strains in clade 2B), the number of protein families per COG category indicated that the flexible genome of the sublineage 2A is larger than that of sublineage 2B (Fig. 5C) and supports the view that the pangenome of both sublineages is open. Integrated functional classification of the subset of MGE-associated protein families using COGs, CDD and KEGG revealed that both sublineages have a distinct set of glycosyltransferases and restriction-modification systems. In contrast, strains of both sublineages differ in the number and types of transporters, transposases, and transcriptional regulators, which are invariantly more abundant and diverse in the non-type strains (Supplementary Table 6).   [36][37][38] . Thus, we performed genome resequencing of long-term iron-and sulfuradapted cultures (Experiment 1, 20 generations). We also monitored genotype and phenotype over time upon switching culture media in focused short-term adaptations (Experiments 2 and 3, 6 generations; Supplementary  Fig. 2). Genome resequencing of iron-and sulfur-adapted cultures of strain CCM 4253 (Experiment 1) did not show any significant changes in the location of iMGEs with respect to genome reference (QKQP01), which occurred at the exact inferred locations in both derived cultures (Fig. 6A). Plasmid pTF5-like (QKQP01000015) was stably retained by cells from both adapted cultures and occurred at comparable fold proportions (sixfold: Iron; fivefold: Sulfur; Supplementary Table 5C). Instead, the reconstructed chromosomes of the iron-and sulfuradapted cell populations differed by a single replicative transposition of the ISAfe1 in the sulfur-adapted culture (Fig. 6A,B). The strain CCM 4253 carries 12 distinct ISs and 28 copies of the ISAfe1 (ISL3) spread along its genome (Supplementary Table 7). In the sulfur-adapted culture, an additional copy of the ISAfe1 interrupted the pstC2 gene (DN052_16065) encoding an identical phosphate permease (AFE_1940, 100% identity) to that found in the A. ferrooxidans ATCC 23270 T genome 17 . The A. ferrooxidans type strain's genome contains two similar pstC genes, but their amino acid sequence shares only 29% identity. All genomes analyzed in this work encoded two pst operons that contribute to Pi uptake: (i) pstI operon with sensor and regulatory functions and the exopolyphosphatase encoding gene (phoBR-pstS1C1A1B1-phoU-ppx), and (ii) stand-alone partially incomplete transporter pstII operon (pstS2C2A2) (Fig. 6B). Both permeases form part of the binding-proteindependent transport systems for inorganic phosphate (Pi), the Pst systems found in many bacterial species. The phosphate transporter PstSCAB activates the histidine kinase PhoR under Pi-limiting conditions, which subsequently phosphorylates the transcription factor PhoB and thus activates the pho regulon, allowing Pi uptake. In contrast, PhoB is deactivated by PhoR under sufficient Pi conditions, consequently inhibiting the expression of genes involved in response to Pi starvation 44 . The expression of both pstI and pstII operons is induced in A. ferrooxidans ATCC 23270T under phosphate limitation in modified 9 K-Fe media at pH 1.5 (personal communication, Mario Vera). Also, the expression of the pstII operon is higher (sixfold) in 9 K phosphate rich media when grown in iron at pH 1.8 compared to elemental sulfur at pH 3.5, while the pstI operon is equally expressed in both conditions (personal communication, David S. Holmes). This evidence suggests that the pstII operon is preferentially used for phosphate uptake in low pH iron media (where iron oxidation may cause phosphate depletion via chelation or precipitation), and that the inactivation of this transporter may occur and be selected for when cells are grown in the absence of iron and/or at higher pH. Although the ISAfe1 element silenced the pstC2 gene related to phosphate metabolism, the same element has previously been shown to silence the resB gene with a completely different function. The resB gene encodes the maturation protein required to form the bc 1 complex involved in reverse electron transfer during iron oxidation in A. ferrooxidans. An interruption of the resB gene in strain ATCC 19859 resulted in a mutant that lost the capacity to oxidize iron but retained the ability to oxidize sulfur 37 . Furthermore, the stress induced by elevated sodium chloride concentration caused ISAfd1 to be inserted downstream of the two promoters PI and PII of the rus operon (which encodes the iron oxidation pathway), thereby preventing its transcription. The ability to oxidize iron was restored after prolonged cultivation in the absence of sodium chloride, and two revertant strains were obtained 38 . Given the scarcity of genetic mutants of the taxon, we analyzed this mutation in further detail.

Mutational inactivation of ptsC2 in sulfur-grown cells impairs early iron oxidation. In contrast
to previously described transposition mutations in iron-oxidizing acidithiobacilli, the sulfur-adapted A. ferrooxidans culture with △pstC2 mutation retained the ability to oxidize iron but only after a lag phase lasting several days. A similarly long lag phase has been observed in A. ferrooxidans cells after switching edonor from sulfur to iron and not vice versa 45 . This lag period is considered to reflect the time required to synthesize regulatory factors to induce genes involved in iron oxidation 46,47 . Although cell adaptation during the sulfur-to-iron transition has been comprehensively described at the mRNA and protein levels, regulatory factors responsible for this lag phase remain unidentified 48 . Further investigation, using PCR-based screening of pstC2 alleles (see Methods and Supplementary Table 4D) revealed that the long-term sulfur-adapted culture contained not only the △pstC2 transposition mutant (product size of 2264 bp) but also the pstC2 wild-type allele (product size of 954 bp) (Fig. 6C). Thus, we sought to analyze the consequences of this mutation.
We monitored the culture transfer in which the transposition event occurred or disappeared. When transferring a long-term iron-adapted culture (pH 1.7) that is wild type for both pstC1 and pstC2 alleles to sulfur medium (initial pH 3.5), and then repeatedly transferring a 10% inoculum to fresh sulfur medium (Experiment 2), a PCR product of 2264 bp indicating the emergence of the △pstC2 allele was observed from the sixth generation onwards. Although the same mutation was repeatedly observed, a PCR product of 954 bp corresponding to the pstC2 allele was still detected in the culture. In addition, the sulfur-adapted populations enriched in the △pstC2 mutant were repeatedly observed to stop growing on elemental sulfur when the pH reached pH 1.8 (instead of pH < 1.3 as was the case of wild type cultures), implying the mutants are probably sensitive to pH. This implies that the populations enriched in cells bearing the mutant allele are probably exposed to severe acidification leading to culture collapse. In turn, the reciprocal experiment (Experiment 3), entailing the repeated transfer of a long-term sulfur-adapted culture in a fresh iron-containing medium stably maintained at pH 1.7, revealed the coexistence of both the pstC2 and △pstC2 alleles until the fourth generation, after which the transposition event stopped being detected (Fig. 6D). In parallel the iron oxidation lag phase shortened gradually with each transfer www.nature.com/scientificreports/ to fresh iron media until it disappeared entirely after the fourth transfer (Fig. 6E), a point at which cultures were well adapted to growth in iron.
These results indicate that sulfur-adapted cultures growing under higher pH endure or select the emergence of the △pstC2 allele. Phosphoric acid has three pK a values, the lowest of which is 2.1. Only anionic dihydrogen phosphate and undissociated phosphoric acid are relevant in extremely acidic environments. The latter becomes increasingly dominant with decreasing pH (as occurs with sulfur oxidation, not iron oxidation). Thus, an enhanced influx of dihydrogen phosphate and phosphoric acid into the cells expressing both phosphate  Supplementary Fig. 2 (Experiment 1). Lane M represents a 1-kb DNA ladder; lane Fe 2+ represents the PCR product of the pstC2 locus from the iron-adapted culture; lane S 0 represents the PCR product of the pstC2 locus from the sulfur-adapted culture. The gel image has been cropped for display. The original gel image is shown in Supplementary Fig. 2D. (D) PCR evaluation of reversal of the transpositional mutation during subsequent culture transfers, as shown in Supplementary Fig. 2 (Experiment 3). Lane M represents a 1-kb DNA ladder; lanes 1-6 represent the PCR product of the pstC2 locus from the first to sixth individual iron passages in Fig. 6E. The gel image has been cropped for display. The original gel image is shown in Supplementary Fig. 2E. (E) The long-term sulfur-adapted A. ferrooxidans CCM 4253 mutant strain ΔpstC2 passaged on ferrous iron repeatedly, as shown in Supplementary Fig. 2 (Experiment 3). The first (purple circle), second (dark blue triangle), third (light blue square), fourth (green diamond), fifth (light green inverted triangle), and sixth (yellow circle) iron passage. www.nature.com/scientificreports/ transporters may lead to cell death by acidification of the cytoplasm and ultimately to culture collapse. Also, our results suggest that iron-adapted cultures growing at lower pH require the pstII-encoded transporter for sufficient phosphate uptake, possibly to secure phosphorous availability for oxidative phosphorylation coupled to aerobic respiratory electron transfer during iron oxidation, and/or to deal with poor solubility of phosphate arising from ferric iron sequestration at increasing pH. The ISAfe1 transposable insertion into the pstII operon affected the ability to oxidize iron, similarly to previously observed insertions in the res and rus operons in Acidithiobacillus spp. These mutations, and others yet to be described, seem to emerge frequently under permissive conditions (e.g. during growth on sulfur) only to reveal themselves upon the change in the growth mode.

Conclusions
Combined nucleotide sequence, synteny, and gene complement comparative analyses of A. ferrooxidans strains proved to be a successful strategy to resolve subspecies-level taxa within the species. Discriminant genomic/ genetic characteristics between sublineages included distinct flexible gene complements, MGEs repertoire, and differentiated MGE-associated gene cargo. Part of the differential gene complement may have become fixed in the respective sublineages for adaptive reasons. Adaptive genes linked to cell-environment interactions (e.g. glycosyltransferases, transporters) or to host cell-MGEs interactions (e.g. restriction-modification systems) were found in both sublineages 2A and 2B but differed in quality and/or quantity. How these gene functions relate to sublineages divergence requires further exploration. Using A. ferrooxidans CCM 4253 as a test case, stability of both episomal and integrated MGEs under adaptive growth in both ferrous iron-and sulfur-containing media was observed, supporting their role in long-term adaptive processes. In turn, active replicative transposition of ISs (ISAfe1) after repeated culture transfers resulted in mutational inactivation of the ptsC2 gene and impaired iron oxidation upon transfer to ferrous iron-containing media. These results support the previously observed phenomenon in stressed acidophiles and the role of ISs in short-term diversification under permissive conditions. Impairment of growth of sulfur-adapted CCM 4253 cells in phosphate-rich media amended with iron as an energy source upon transfer (as reflected by the iron oxidation lag) revealed a role for the phosphate permease in the passive cytoplasmic acidification caused by the influx of dihydrogen phosphate and undissociated phosphoric acid in low pH medium.

Methods
Bacterial strains, growth conditions, treatments, and determinations. A. ferrooxidans was isolated from mine waters at Zlaté Hory in the Czech Republic and is deposited in the Czech Collection of Microorganisms (CCM) under number 4253. Bacterial strain CCM 4253 was cultivated onto overlay plates containing ferrous iron plus tetrathionate or tetrathionate only as electron donors 49 at 30 °C. A single A. ferrooxidans CCM 4253 colony from the ferrous iron plus tetrathionate overlay plate was picked and multiplied in a basal salts medium containing ferrous iron at 30 °C on a rotary shaker. After sufficient cell numbers were achieved (1 × 10 8 cells mL -1 ), the strain was cultured in basal salts media containing ferrous iron (9 g L -1 equivalent to 161 mM) or elemental sulfur (10 g L -1 ) as electron donors at 30 °C on a rotary shaker as described previously 48 . The longterm iron-and sulfur-adapted A. ferrooxidans CCM 4253 cultures were obtained after twenty transfers on the respective substrate (20 generations), as shown in Supplementary Fig. 2 (Experiment 1). Cells were harvested, and genomic DNA was obtained for resequencing.
The long-term ferrous iron-adapted culture (genotype pstC2) was transferred and passaged in a basal salts medium containing elemental sulfur. Cells were washed in a basal salts medium before being transferred to another substrate. The sulfur-oxidizing culture was cultivated at 30 ºC on a rotary shaker until the pH dropped from an initial 3.5 to about 1.0 (10-14 days). Then 1/10 of the sulfur-grown culture was transferred to a fresh basal salts medium containing elemental sulfur and cultured under the same conditions. A total of six generations were prepared by this passaging, as shown in Supplementary Fig. 2 (Experiment 2). In addition, the long-term sulfuradapted culture (genotype △pstC2) was transferred and passaged in a basal salts medium containing ferrous iron. Cells were washed in basal salts medium before being transferred to another substrate. The iron-oxidizing culture was cultivated at 30 °C on a rotary shaker until the complete ferrous iron was consumed (2-10 days). Then 1/10 of the iron-grown culture was transferred to a fresh basal salts medium containing ferrous iron and cultured under the same conditions. A total of six generations were prepared by this passaging, as shown in Supplementary Fig. 2 (Experiment 3). Aliquots of each short-term iron-and sulfur-adapted culture (1-6 generations) were harvested, genotyped, and phenotyped (ferric iron concentration and pH) as in Experiment 1. Ferric iron concentration was determined spectrophotometrically at 300 nm 50 . The pH values were measured using a Radiometer electrode and a laboratory pH meter PHM220 (MeterLab).
DNA isolation and sequencing library preparation. Genomic DNA was isolated by phenol-chloroform extraction for next-generation sequencing techniques, as described earlier 51  www.nature.com/scientificreports/ v2 for 300 cycles was used for sequencing, and the entire 151 bp paired-end sequencing system was prepared as described in the MiSeq System Guide (15027617v04). Genomic DNA obtained from A. ferrooxidans CCM 4253 was also commercially sequenced using Illumina technology (Macrogen, South Korea).
Mitomycin C treatment and polymerase chain reactions. PCR validation of novel and partially conserved candidate iMGEs identified in the genome of strain CCM 4253 after treatment of A. ferrooxidans cells with mitomycin C, as previously described 23 . Amplification of target sequences was performed using GoTaq® DNA Polymerase (Promega) according to the protocol provided by the manufacturer. Oligonucleotides used in this study for PCR are listed in Supplementary Table 4E. The cycling conditions were as follows: initial denaturation for 15 min at 95 °C; 35 cycles consisting of denaturation for 30 s at 95 °C, primer annealing for 30 s at 60 °C, and extension for 40 s (attLRPB) or 2 min 20 s (pstC2) at 72 °C; followed by a final extension step for 7 min at 72 °C. PCR products were visualized on 1% agarose gels stained with GelRed® (Biotium).
Genome sequencing and resequencing analysis. After quality control using FastQC and filtering using Trimmomatic 52 , sequence reads were assembled de novo using SPAdes genome assembler 53 . All contigs from mate-pair sequencing were aligned and ordered against the ATCC 23270 T genome using MAUVE 54 . The contigs positions were further confirmed by BLAST search against the assembly graph produced by SPAdes genome assembler from paired-end sequencing data. Contigs in the correct position and orientation were then manually linked. A total of 15 contigs were obtained. Non-overlapping gaps were filled by contigs produced by the SPAdes genome assembler. The complete chromosome and plasmid sequences were subsequently annotated using NCBI Prokaryotic Genome Annotation Pipeline (PGAP) 55  Genomes recovery from databases, gene-calling, and annotation. Genome drafts were obtained from NCBI (https:// www. ncbi. nlm. nih. gov/ assem bly/) as of March 2020. We checked contamination and completeness as in Raes et al. 56 and Manni et al. 57 . The resulting genome statistics are summarized in Supplementary  Table 1B, along with sequence deposit information. Gene-calling and annotation were performed using the PGAP 55 . A genome sequence of low quality (AFE-DLC5, JNNH01) was annotated through the RAST pipeline (Rapid Annotation using Subsystem Technology) 58 . Recovered annotations were analyzed versus KEGG 82 and COG 81 databases as of July 2020 using SqueezeMeta 39 .
Comparative genomics and synteny analysis. Overall genome relatedness indexes (OGRIs) and core, flexible, and exclusive genes were derived from Moya-Beltrán and colleagues 1 . All possible pairwise genome comparisons using the average nucleotide identities based on Blast (ANIb) or the in silico digital DNA-DNA hybridization index (dDDH) are summarized in Supplementary Prediction and analysis of mobile genetic elements. The complete and high-quality draft genomes of A. ferrooxidans ATCC 23270 T , ATCC 53993, and CCM 4253 (CP001219.1, NC_015850, CP001132.1) were used for the identification of putative MGEs using several programs such as ISfinder database 60 and searched against the local database under conditions defined by 90% minimum similarity, 100% IS element coverage. ISEScan 61 software was used for the de novo search of IS elements in chromosomal and TnpPred 62 for predicting prokaryotic transposases. Manual curations of ISs and Tnp were performed to filter out false-positive results and incomplete IS elements. IslandPath 63 , AlienHunter 64 , and PAI-DA 65 were used for genomic island identification, and Phage Finder 66 and PhiSpy 67 for prophage identification. CONJscan 68 for Type IV secretion systems prediction and T346hunter software 69 was used to identify conjugation genes, the Atlas T4SS 70 database was searched, and the protein domains were identified using the CD search program. tRNAscan 71 and Aragorn 72 were used for tRNA and tmRNA searches. All predictions were analyzed and curated manually, as in Moya-Beltrán et al. 2019. Direct repeats were identified using the Needle program (EMBOSS) at the sequence termini. The BPROM 71 program predicted bacterial promoters 73 . DnaA and IHF binding motifs were annotated manually, as in Chakravarty et al. 1995 43

Data availability
The datasets generated and/or analyzed during the current study are available from the National Center for Biotechnology  www.nature.com/scientificreports/