The complete methylome of an entomopathogenic bacterium reveals the existence of loci with unmethylated Adenines

DNA methylation can serve to control diverse phenomena in eukaryotes and prokaryotes, including gene regulation leading to cell differentiation. In bacteria, DNA methylomes (i.e., methylation state of each base of the whole genome) have been described for several species, but methylome profile variation during the lifecycle has rarely been studied, and only in a few model organisms. Moreover, major phenotypic changes have been reported in several bacterial strains with a deregulated methyltransferase, but the corresponding methylome has rarely been described. Here we report the first methylome description of an entomopathogenic bacterium, Photorhabdus luminescens. Eight motifs displaying a high rate of methylation (>94%) were identified. The methylome was strikingly stable over course of growth, but also in a subpopulation responsible for a critical step in the bacterium’s lifecycle: successful survival and proliferation in insects. The rare unmethylated GATC motifs were preferentially located in putative promoter regions, and most of them were methylated after Dam methyltransferase overexpression, suggesting that DNA methylation is involved in gene regulation. Our findings bring key insight into bacterial methylomes and encourage further research to decipher the role of loci protected from DNA methylation in gene regulation.


Results
Predicted MTases in P. luminescens TT01. The P. luminescens TT01 genome harbors 47 genes which are annotated as methyltransferase or methylase, most of them encoding putative RNA MTases or protein MTases. Only 8 genes (plu0087, plu0338, plu0600, plu2710, plu2942, plu3417, plu3449 and plu3462) are annotated as DNA methyltransferase or DNA methylase 45 . Analysis of the P. luminescens TT01 genome using REBASE revealed 12 putative DNA MTase-encoding genes, i.e. the 8 genes cited above, plus plu0233, plu1935 and plu4197 which are annotated as encoding CHP (conserved hypothetical proteins) or unknown proteins, and plu4319 which is annotated as encoding a "Type I site-specific deoxyribonuclease HsdM". Prediction of the protein domains revealed an S-adenosyl-L-methionine-dependent methyltransferase domain (Interpro domain IPR029063) in all of the 12 MTases.
The 12 putative DNA MTase-encoding-genes found in P. luminescens TT01 are listed in Table 1. While 4 MTases are associated with REases, 7 are solitary MTases, and one is a hybrid MTase. The 12 loci were located all over the chromosome (Fig. 1). One predicted MTase (plu4319) corresponded to a Type I 9 while the remaining MTases were presumed to classify as Type II. For 8 of them, REBASE analysis proposed a recognition sequence (Table 1) 9 .
Distribution of predicted MTases in Gram-negative bacteria. The genomic context and taxonomic distribution of putative P. luminescens TT01 MTase-encoding genes are presented in Fig. S1. Most (n = 10) of the MTases encoding loci were located in the accessory genome and are associated with phagic regions (n = 4), genomic islands (n = 3), or regions of genomic plasticity (n = 3) ( Table 1). The dam gene is located in the core genome and an ortholog was present in each of the Photorhabdus and Xenorhabdus (a closely related genus, with similar lifestyle 46 ) complete available genomes, with a conserved synteny (Fig. S1). dam was also broadly distributed among many (>100) organisms of the Enterobacteriaceae family (Fig. S1), suggesting that this gene was ancestral in the genome. In contrast, plu0599, encoding M.PluTI 47 , is also located in the core genome (Table 1) but only conserved in two other strains of Xenorhabdus and Photorhabdus (Fig. S1). Other P. luminescens TT01 MTase-encoding genes were variably distributed among Enterobacteriaceae. For example, plu4197 was only found SCIeNTIFIC RePORTs | (2018) 8:12091 | DOI: 10.1038/s41598-018-30620-5 in P. luminescens TT01 genome whereas putative orthologs of several MTases (dcm, plu2942, plu3417, plu3449, plu3462, plu4319) were found in more than 50 organisms of the Enterobacteriaceae family (Fig. S1). The variable taxonomic distributions of MTases encoding genes and their high prevalence in the accessory genome, including an association with phage or DNA recombination genes, suggest that most of these genes have been acquired by horizontal gene transfer.

Identification of Methylation Motifs in P. luminescens Genome.
In order to have a complete overview of all the existing methylation sites over the P. luminescens genome, two complementary sequencing technologies were used: (i) SMRT sequencing by PacBio which can identify m6A and (to a lesser extent) m4C, and (ii) WGBS by Illumina (MiSeq) which can identify m5C.
The methylome analysis was first performed on DNA samples extracted from WT P. luminescens cells harvested during the mid-exponential-growth phase (OD = 0.3-0.4, hereafter referred to as EP for exponential phase). The SMRT sequencing yielded an average coverage of 182X allowing the identification of a high number (n = 41143) of statistically significant (QV score ≥30, see the Methods section for details) DNA modification marks and the identification of 6 conserved motifs: GATC; TGGCCA; GGANNNNNNRTGA; TCAYNNNNNNTCC; AGGCCT; CTCGAG ( Table 2). Five of them display m6A modifications and one displays an m4C modification. The WGBS approach allowed the discovery of additional m5C modification marks (n = 10898), grouped in two motifs: CCWGG and GGCGCC (Table 3). Altogether, we found 52041 methylated nucleotides distributed all over the P. luminescens genome (Fig. 1). For convenience, the 8 identified motifs were numbered according to sequencing technology used and their occurrence in the P. luminescens TT01 genome, and are hereafter referred to as motifs I to V for SMRT, with partner motifs GGANNNNNNRTGA/ TCAYNNNNNNTCC numbered IIIa and IIIb, respectively, and motifs VI and VII for WGBS (Tables 2 and 3). The number of methylated motifs found in the genome ranged from 28 (motif V) to 37465 (motif I). The proportion of a given motif found methylated in the P. luminescens TT01 genome ranged from 94.0% (motif VII) to 100% (motif V).
Methylation motifs are conserved during various growth conditions. In order to determine whether the methylation pattern can evolve over the course of growth, methylome analysis after SMRT sequencing was also performed on bacterial cells harvested during other growth conditions: WT cells harvested during the late exponential-growth phase (OD = 0.9, hereafter referred as LE), stationary phase (overnight growth reaching OD = 1.5, hereafter referred as SP), and late-stationary phase (i.e., after 24 h of growth, with OD > 3, hereafter referred as LS). WGBS was also performed on WT cells harvested during SP (OD = 1.5). Moreover, the SMRT methylome of a previously-identified sub-population that is resistant to a cationic AMPs (i.e., polymyxin B 44 ) was also investigated. The 8 identified motifs were found in all the methylomes analyzed. For a given motif, a high (>94%) and similar proportion (coefficient of variation <0.01) of methylated sites was observed (Tables 2 and 3). For a given motif identified by SMRT sequencing, the IPD (interpulse duration) ratio did not drastically change over the course of growth (Table 2). Taken together, these results indicate that in all the tested conditions, a high rate of methylation was reached for each of the MTases responsible for the methylation of the 8 identified motifs. This suggests that the corresponding MTase-encoding genes were expressed in all these growth conditions.
Expression of predicted MTases. The expression level of the 12 MTase-encoding genes was analyzed by qRT-PCR on mRNA extracted from cells grown in LB and harvested during exponential phase (EP and LE) and stationary phase (SP) relative to gyrB, a housekeeping gene. Figure 2 shows that the 12 MTases could be split into two groups: MTases (n = 7) encoded by a gene with a low or very low level of expression compared to gyrB (less than 10-fold, or 100-fold, respectively) and MTase genes (n = 5) that were expressed at a higher level (>0.1-fold the gyrB level of expression). These two groups were identical for the 3 growth conditions tested, revealing that a given MTase gene was expressed at similar levels in the various tested conditions. This is in agreement with the finding that the methylome of TT01 was very similar for all the conditions tested. Most (5/7) solitary MTase-encoding genes were affiliated to the group with a low level of expression (compared to gyrB). Only dam and dcm, the two broadly conserved solitary MTase-encoding genes, could be assigned to the group with the higher expression (Fig. 2). In contrast, all MTases associated with a RM system except hsdM (plu4319), were affiliated to the group with a high level of expression (compared to gyrB).
Comparative analysis revealed that none of the 12 MTase-encoding genes was significantly differentially expressed between the LE and EP (Fig. S2A). In contrast, during SP compared to EP, the expression of two and five MTases was significantly downregulated or upregulated, respectively. Among the MTase genes affiliated to the group with a high level of expression, only plu0233 and plu0600 were significantly more expressed during SP compared to EP (Fig. S2B). However, methylome analysis of cells harvested during these two different growth conditions revealed no difference in motif detection, suggesting that a limited increase or decrease in the level of expression of these MTases did not significantly contribute to the genome methylation pattern in P. luminescens TT01.  Table 1 for details. Note that the P. luminescens TT01 strain harbors no plasmid. Finally, given their very low expression levels (<0.001-fold the gyrB gene), five solitary MTases (Plu2942, Plu3417, Plu3449, Plu3462 and Plu4197) could be assumed to be inactive and therefore may not significantly contribute to the genome methylation pattern in P. luminescens TT01.
Motif-MTase assignment. Based on in silico analysis, 7 of the 8 identified motifs could be assigned to 6 of the MTases found in the genome (Table 1), as follows. The m6A modifications observed in motifs I, III and V could be targeted by the MTases Dam (plu0087, solitary MTase), HsdM (plu4319, RM system) and plu2709 (RM system), respectively; the m4C modifications observed in motif IV could be targeted by plu1935 (RM system); the m5C modifications observed in motifs VI and VII could be targeted by Dcm (plu0338, solitary MTase) and PluTI (plu0599, RM system), respectively (Table 1). Thus, only the m6A modifications observed in motif II could not be associated to an MTase based on in silico analysis. Considering the affiliation of plu0233 to the group with the high level of expression (Fig. 2), motif II could be targeted by plu0233 (a hybrid REase-MTase).

Genome-Wide Analysis of Modification Profiles.
All of the 52041 methylated nucleotides found in these 8 methylated motifs were distributed across the genome (Fig. 1). A dedicated webpage with a genome browser displaying the precise position of the methylated nucleotides has been generated (access details can be found in the "Data Availability" section below).
The distribution over the TT01 chromosome of the most prevalent methylation motif (GATC) was analyzed using the DistAMO tool and revealed a heterogenic GATC motif distribution (Fig. S3). As regions with a high GATC density are also regions displaying a high DNA methylation rate, their genomic localization was identified, and their distribution in the core and accessory genome of P. luminescens was determined. The 38 high-GATC-density regions identified are listed in Table S1. Eighteen were distributed in the core genome, and 20 in the accessory genome as follows: genomic islands (GI, n = 14), regions of genomic plasticity (RGP, n = 4), or phagic regions (n = 2). The major functions associated with these regions were "metabolism" (n = 22), "virulence" (n = 5) and "antibiotic synthesis" (n = 5) 46 .
Methylation pattern of a clonal subpopulation. We previously identified a small fraction of the P. luminescens TT01 population that is resistant to cationic AMPs and revealed that this subpopulation is the one responsible for successful infection in insects. Using SMRT sequencing, we showed that this subpopulation was genetically identical to a bacterial population composed of >99% of cells with an AMP-sensitive phenotype 44 .  Here, we analyzed the SMRT data on the methylation pattern of the AMP-resistant subpopulation in order to identify the differences in m4C or m6A modification marks compared to the control sample (EP) corresponding to cells with an AMP-sensitive phenotype. The number of modification marks identified was 41114 in the AMP-resistant subpopulation ( Table 2) and 41143 in the control sample (EP). For each of the motifs analyzed, a similar proportion (ranging from 96.2% to 100% depending on motif) of methylated sites was observed (coefficient of variation <0.01) between the AMP-resistant subpopulation and the control sample (Table 2). Altogether, only 37 modification marks differed between the two samples, as follows: 26 in motif I, 3 in motif II, 4 in motif III, and 4 in motif IV (detailed data can be found in the genome browser). None of these differential modification marks was located in the vicinity of the genomic regions harboring the pbgP operon (encoding enzymes involved in LPS modification and required for the AMP resistance) or the phoP/phoQ genes (encoding a two-component system required for the activation of pbgP expression). Thus, the global methylome of the AMP-resistant subpopulation analyzed after SMRT sequencing was highly similar to that of the WT grown during the control condition.
Analysis of the location of unmethylated motifs. For each of the 8 motifs identified, and for the four growth conditions tested, the number of unmethylated motifs are rare in P. luminescens (Fig. 3), in agreement with the high fraction of modifications marks identified (Tables 2 and 3). The location of the motif-associated methylation marks was determined relative to the position of neighboring ORFs: either in a putative promoter region (i.e. <200 bp upstream from a start codon), intragenic (inside an ORF), or in other intergenic regions (i.e. >200 bp from a start codon, or downstream of an ORF). For each motif, the fraction of motifs with modification marks (detected in at least one growth condition) mapping to a putative promoter region, as well as the fraction of motif without modification marks (detected in the four growth conditions tested) was calculated. These fractions were compared to the fraction of the corresponding motif mapping to putative promoter regions found in the genome (Fig. 3). For motif II to motif VII, the fraction of unmethylated motifs located in putative promoter regions was not significantly different compared to that observed elsewhere in the genome (i.e. inside an ORF or >200 bp from a start codon, or downstream of an ORF). In contrast, the fraction of unmethylated motif I (in at least one of the four growth conditions tested) located in putative promoter regions was significantly higher than that observed elsewhere in the genome (p < 0.001, Fisher's exact test) (Fig. 3). We therefore focused on the precise location of the conserved unmethylated motifs I (GATC). Unmethylated GATC motifs are rare in P. luminescens: only 56 sites out of 37500 were unmethylated in at least one of the 4 growth conditions tested (Fig. 3). Only a limited number (n = 16) of these unmethylated GATC motifs were located inside an ORF while the majority of them (n = 41) were located either in putative promoter regions, or in other intergenic regions (Fig. 3). As described above, the fraction of these unmethylated GATC motifs located in putative promoter regions was significantly higher than that observed elsewhere in the genome. Among the 56 GATC motifs unmethylated in at least one growth condition, 35 GATC motifs were unmethylated in three or four of the growth conditions tested (detailed data are available in the genome browser). Strikingly, 22 GATC motifs were unmethylated in all four growth conditions tested. Only two were located in a gene body, while the remaining 20 were located in putative promoter regions, with 4 of them mapping to the same promoter region (i.e. upstream plu1732) (Fig. 4). These 22 motifs were distributed all over the chromosome (Fig. 4). Interestingly, all these 22 motifs were still unmethylated in the AMP-resistant subpopulation described above (Fig. 4). Taken together, these results suggest the existence of factors (e.g., DNA binding proteins) hindering DNA methylation at these particular sites. Such factors are presumably always present in the various growth conditions tested, including in the AMP-R subpopulation. No conserved motif, recognizable by transcription factors was observed in these 22 loci based on MEME analysis (data not shown).

Methylome modifications by Dam-MTase overexpression.
We then investigated whether the high proportion (99.8%, Table 2) of the methylated GATC motifs identified in a reference condition could be increased by Dam MTase overexpression. The methylome of a strain overexpressing the Dam MTase was therefore analyzed and compared to its control strain (harboring an empty plasmid).
The number of modification marks identified in the strain overexpressing the Dam MTase and in the control sample (harboring an empty plasmid) was 41175 and 41119, respectively ( Table 2). The 68 modification marks differing between the two samples were distributed as follows: 57 in motif I (GATC) which is recognized by Dam MTase, 2 in motif II, 2 in motif IIIa, 4 in motif IIIb and 3 in motif IV (detailed data can be found in the genome browser). As described above, 22 GATC motifs were unmethylated in all four growth-curve time-points tested and were also unmethylated in the control strain harboring an empty plasmid (Fig. 4). In contrast, most of these motifs (n = 18) were found methylated in the strain overexpressing the Dam MTase (Fig. 4). Thus, with only 4 out of 37500 GATC sites found unmethylated in the strain overexpressing the Dam MTase, the proportion of methylated motif I was therefore higher than in the control (99.99% vs 99.83%, respectively). For the other motifs analyzed, a similar proportion of methylated motifs was observed in the strain overexpressing the Dam MTase (coefficient of variation <0.01). Thus, besides an increase in GATC methylation, the global methylome of the strain overexpressing the Dam MTase displayed only a limited number of differences compared to the methylome of the control. These results reveal that Dam overexpression caused a modification of the DNA methylation pattern, which was focused on the unmethylated GATC sites found in the control condition as well as in other various growth conditions tested.

Discussion
There has been a recent surge in bacterial methylomic data released 32,33,35,[37][38][39]48 . However, in these studies, SMRT sequencing only allowed the analysis of m6A and m4C modification marks, as the thorough identification of the third known DNA methylation mark (i.e. m5C) requires other investigations such as WGBS. Furthermore, most of these studies were performed in only one growth condition. While the genome of P. luminescens TT01 was  45 , our study has only now revealed its complete epigenome (m6A, m4C and m5C) and found no major difference between the various growth conditions tested. Up to now, observation of a relatively stable methylome had only been described for E. coli 49,50 . m4C methylation is restricted to archaea and prokaryotes, but in many bacterial species such as E. coli, no m4C-MTase was identified 32,36,51,52 . In contrast, P. luminescens displays a clear motif with m4C modification marks (motif IV), with 184 occurrences in the genome at a high rate of methylation (>96% of the motifs are detected as methylated). Based on our results, this motif is presumably methylated by an MTase for which orthologs are mostly found in the Photorhabdus genus (plu1935).
Several bacterial variants displaying genetic differences with their ancestor have been analyzed for their DNA methylation pattern (Haemophilus, Neisseria, Helicobacter, Campylobacter) [53][54][55][56] . In P. luminescens, we previously described the existence of an AMP-resistant (AMP-R) subpopulation which displays no difference in genome sequence compared to the control population (for which more than 99% of the cells are AMP-sensitive) 44 . The bacterial DNA methylation pattern in a bacterial population grown in presence of antibacterial agents has only been described in E. coli 49 . Here, we analyzed the methylome of the P. luminescens AMP-R subpopulation and found that it was highly similar to that of the control population. We thus provide evidence that the AMP used here (i.e. polymyxin B) does not drastically modify the DNA methylation pattern in P. luminescens. The precise mechanism allowing the AMP-R subpopulation to arise remains unknown, but it requires activation of the expression of the pbgPE locus in a PhoPQ-dependent manner 44,57 . In accordance with the high similarity observed between the methylomes of the AMP-R subpopulation and the wild-type control population, no particular modification mark (focused here on m6A and m4C modification marks) mapping to the loci responsible for the AMP resistance could be identified in the AMP-R subpopulation. In addition, Bisulfite sequencing of the promoter region of pbgPE genes revealed no m5C difference between both subpopulations (Mouammine & Brillard, unpublished data), suggesting that DNA methylation is not a mechanism triggering the occurrence of the AMP-R subpopulation.
The most prevalent methylation motifs throughout the P. luminescens genome are GATC (motif I) methylated by Dam, followed by CCwGG (motif VI), methylated by Dcm. Out of the 12 MTases found in P. luminescens TT01, these two MTases are the only ones for which orthologs are found in numerous bacterial genera other than Photorhabdus or the closely-related genus Xenorhabdus. As both MTases are widely distributed in Enterobacteria, they have been extensively studied 24 . The P. luminescens methylome analysis also confirmed the presence of a m5C modification mark in motif VII (GGCGCC), a motif previously described as recognized by an RM system (PluTI) 47 . Five additional motifs were also identified in this study, including two not found in REBASE (motifs IIIa and IIIb) 9 .
Here we also show that the Dam methyltransferase is very efficient in P. luminescens, causing the DNA methylation of more than 99% of the GATC sites in the genome, similarly as what was described in E. coli or Salmonella 5 . Despite this high level of methylation, we identified several loci that were always unmethylated in the tested conditions, suggesting the existence of factors, such as DNA-binding proteins, that hinder these particular GATC sites, as described elsewhere 58 . Strikingly, a significant enrichment of putative promoter regions was observed for these unmethylated motifs but not for the other 6 motifs identified. Dam has been described to act as a regulator of gene expression in E. coli and Salmonella strains, and is therefore considered as involved in epigenetic mechanisms, but such a role in other bacteria remains to be investigated 5 . Our results suggest that in P. luminescens, Dam may be responsible for a similar mechanism. Furthermore, we provide evidence that Dam overexpression can allow the methylation of these usually-unmethylated sites, suggesting a competition between as-yet-unidentified DNA-binding proteins and the Dam MTase.
MTase overexpression is reported to be related to strong phenotype modifications in several bacterial species, but their methylome has never been investigated 13,15,17,18,59 . In E. coli, SMRT sequencing of strains overexpressing 3 MTases, for which no particular phenotype was described, revealed signatures in the kinetic variations of the DNA polymerase that were not detected in the parental strain (i.e. methylation of adenines which were not located in a particular motif), suggesting nonspecific activity of these overexpressed MTases 36 . In P. luminescens, the Dam overexpression was not associated with particular signatures in the kinetic variations of the DNA polymerase compared to the control strain. In contrast, it was associated with an increase in GATC methylation frequency. Further research is required to find out whether this change in the DNA-methylation pattern may be related to the modification of several phenotypes observed in the Dam-overexpressing strain, including impaired motility, impaired virulence in insects, and increased biofilm-forming ability 18 .

Conclusion
This study brings the first description of the methylome of an entomopathogenic bacterium, with the identification of eight motifs displaying a high rate of methylation (Fig. 5). The methylome was stable over growth curve, as well as in an antimicrobial peptide-resistant subpopulation responsible for virulence in insects. The rare unmethylated GATC motifs were located preferentially in putative promoter regions, suggesting that DNA methylation is involved in gene regulation. Overexpression of the Dam MTase can lead to a slight modification of the DNA-methylation pattern, including the methylation of 18/22 sites which are usually protected from methylation by a presumed DNA-binding protein. Given the major modification of phenotypes associated with MTase-overexpression in several bacterial species, the strategy employed here can prove a powerful tool to open a new field of investigation to determine the role of loci protected from DNA methylation in gene regulation.

Methods
Strains and growth conditions. The P. luminescens TT01 bacterial strains were routinely grown in Luria broth (LB) medium with a 180 rpm agitation at 28 °C. As required, antibiotic concentrations used for bacterial selection were gentamycin at 15 μg mL −1 ; polymyxin B (polyB), 100 μg mL −1 . The AMP-R subpopulation was isolated as previously described 44 . Briefly, addition of polymyxin B to WT cells grown in LB at an OD 540 of 0.3-0.4 (EP) led to a decrease in OD due to the death of the AMP-sensitive population (about 0.5% of the WT population was found to be AMP-R). The AMP-R cells were then collected when the OD 540 had returned to 0.3 in the presence of polymyxin B. Genomic DNA was extracted on these cells for methylome analysis. Genomic analysis of MTases encoding genes. The REBASE database 9 was used to identify the putative DNA MTases in the P. luminescens TT01 genome 45 . The MaGe tool (available at https://www.genoscope.cns.fr/ agc/microscope/mage/) was used to analyse the genomic context of their encoding genes. The genomic locations were defined in the core or accessory genome (Regions of genomic plasticity (RGP), Genomic Island (GI) and Prophage regions (P)) as previously described (Ogier et al. 46 ). The MaGe tool was also used to analyse gene distribution and their synteny among a panel of Xenorhabdus and Photorhabdus genomes (27 and 16 strains, respectively, for which a complete genome was available). Distribution of the MTases in other Enterobacterial organisms was performed using a BlastP search on the NCBI non-redundant protein database with default parameters and a maximal target sequences set at 1000.
Single-molecule real-time (SMRT) DNA sequencing. Genomic DNA was extracted from bacteria grown in LB and harvested at an OD 540 of 0.3-0.4 (EP), OD = 0.9 (LE), OD = 1.5 (SP), and after 24 h of growth (OD > 3, LS) as follows. Bacterial cells corresponding to 2 ml of culture were washed in PBS and pellets were stored at −80 °C. To perform lysis, cells were resuspended in 200 µl of TSE-lysozyme for 15 minutes at 37 °C, followed by addition of 640 µl EDTA pH8 0.5 M and 160 µl SDS 10% and incubated 15 minutes at 60 °C. Lysates were incubated for 1 hour at 56 °C after addition of 20 µl proteinase K (20 mg·ml −1 ), cooled on ice and incubated 5 minutes at room temperature with 30 µl of RNAse A (20 mg·ml −1 ). Precipitation of contaminants was performed by addition of chilled 350 µl potassium acetate 5 M and a centrifugation step (10,000 × g for 10 min at 4 °C). The genomic DNA was purified with magnetic beads (Sera-Mag Speed beads, Thermo-Scientific) as previously described 60 . The DNA libraries were prepared according to PacBio guidelines: 20-kb Template Preparation Using BluePippin Size-Selection System (15-kb size cutoff); shearing at 40 kb was performed using Megaruptor system (Diagenode); sizing at 17 kb was performed using BluePippin system (Sage). Libraries were sequenced on one or two PacBio SMRT cells at 0.25 nM with the Protocol OneCellPerWell (OCPW), P6C4 chemistry and 360 minutes movies on a Pacific Biosciences RSII instrument (GeT-PlaGe, Toulouse, France). The raw data were processed with the PacBio SMRT Analysis Suite (version v2.3.0 p4). For samples EP, SP and AMP-R, the reads were assembled de novo, with the high-quality Hierarchical Genome Assembly Process HGAP.3 algorithm and no rearrangement was observed with progressiveMauve 2.1.0.a1 61 . Conserved Synteny LinePlot revealed 100% conservation of synteny groups between the TT01 genomes studied, with a synton size ≥3 and the P. luminescens TT01 NC_005126 genome as the reference (data not shown).
DNA methylation detection and motifs identification after SMRT sequencing. DNA methylation was determined using the RS_Modification_and_Motif_Analysis protocol within SMRT Portal 2.3.0p4 which uses an in silico kinetic reference and a Welch's t-test based kinetic score detection of modified base positions 62 with parameters set as follow: subread/polymerase read length > = 500, polymerase read quality > = 80 and modification QV > = 30. A score of 30 for the "Modification QV" is the default threshold for calling a position as modified, and corresponds to a p-value of 0.001. Homemade script was used to keep methylated bases for adenine or cytosine with score > = 30 and known motifs.
Whole Genome Bisulfite DNA sequencing (WBGS) and DNA methylation detection. Genomic DNA from bacteria grown in EP and SP was extracted as described above and was sequenced using Illumina MiSeq technology as previously described 63 .
Fastq files were trimmed for adapters and low quality bases with Trim Galore! (v0.4.4) 64 then mapped to the public reference genome (NC005126) with Bismark (v0.17.0) 65 . Picard tools were used to remove duplicated reads. Then methylation calling was performed with Bismark_methylation_extractor (v0.17.0) for every single cytosine 65 . Positions at which a sequencing coverage reached 25X or more, and where the proportion of instances that was detected as modified (i.e. number of reads detecting a modification/total number of reads at a given position) reached at least 90% were considered as a methylated base. The surrounding sequences (+/−20 nt) of each methylated cytosine were extracted and analyzed for a motif search using MEME-ChIP (v4.12) 66 . For the 2 motifs identified by WGBS, the mean proportion of reads that was detected as modified at a given position was high (97.3 for motif VI and 96.5 for motif VII). The fraction of motifs methylated was calculated as the number of motifs with a methylated based out of the total number of motifs found in the genome.
Determination of GATC-rich and GATC-poor regions. DistAMO analysis, which calculate a z-score (with value of 2/-2 considered as a significant value) 67 revealed heterogenic GATC motif distribution over the TT01 chromosome (Supp. Data Fig. S3). However, the large window sizes used for the calculation of the z-scores of the GATC motif distribution range from 500 kb (at the inner ring) to 50 kb (on the outer ring increasing in 50 kb steps, Fig. S3). In order to have a clearer view of the GATC motif distribution over the TT01 chromosome, we used overlapping windows of a size of 1 kb, sliding every 100 bp. The mean proportion of GATC occurrence found per kb of the whole genome was calculated, and regions of 1 kb displaying at least +/−2 SEM were considered as GATC-enriched or GATC-depleted regions. Such regions were then compared to the position of regions of genomic plasticity previously described 46 . RT-qPCR analysis. Total RNA extraction was performed on cells harvested during exponential phase (EP and LE) and stationary phase (SP), from three independent cultures for each strain, using RNeasy miniprep Kit (Qiagen), according to the manufacturer's instructions. An additional incubation step with DNase I (Qiagen) was performed. The quantity and quality of RNA were assessed with an Agilent 2100 Bioanalyzer with the RNA 6000 Nano LabChip kit. Lack of DNA contamination was controlled by carrying out a PCR on each RNA preparation.
Quantitative reverse transcription-PCR (RT-qPCR) were carried out as previously described 18 . Briefly, RNA samples from 3 biological replicates for each strain were used for cDNA synthesis. The SuperScript II reverse transcriptase (Invitrogen) was used on 1 µg of total RNA with random hexamers (100 ng·µl −1 ; Roche Diagnostics). qPCR analyses were performed using SYBR green Master kit (Roche Diagnostics) with 1 µl of cDNA and specific gene primers at 1 µM (Table S2). The reactions were performed in duplicate at 95 °C for 10 min, followed by 45 cycles at 95 °C for 5 s, 61 °C for 10 s, and 72 °C for 15 s and monitored in the LightCycler 480 system (Roche). Melting curves were analyzed and always contained a single peak. The data analyzed with the REST software 2009 68 using the pairwise fixed randomization test with 2,000 permutations are presented as a ratio with respect to the reference housekeeping gene gyrB, as previously described 18 .