Deciphering bacterial epigenomes using modern sequencing technologies

Abstract

Prokaryotic DNA contains three types of methylation: N6-methyladenine, N4-methylcytosine and 5-methylcytosine. The lack of tools to analyse the frequency and distribution of methylated residues in bacterial genomes has prevented a full understanding of their functions. Now, advances in DNA sequencing technology, including single-molecule, real-time sequencing and nanopore-based sequencing, have provided new opportunities for systematic detection of all three forms of methylated DNA at a genome-wide scale and offer unprecedented opportunities for achieving a more complete understanding of bacterial epigenomes. Indeed, as the number of mapped bacterial methylomes approaches 2,000, increasing evidence supports roles for methylation in regulation of gene expression, virulence and pathogen–host interactions.

Introduction

DNA methylation was discovered in bacteria more than a half century ago1. It is now known that modification of the four canonical DNA bases by methylation can act as an epigenetic regulator — that is, it can impart distinct and reversible regulatory states to identical genetic sequences. In eukaryotes, epigenetic regulation can occur at multiple levels: DNA methylation, nucleosome positioning, histone variants and histone modifications. By contrast, bacteria lack histones and nucleosomes; therefore, DNA methylation is their primary means of epigenetic gene regulation.

Three different forms of DNA methylation exist in bacterial genomes: N6-methyladenine (6mA), N4-methylcytosine (4mC) and 5-methylcytosine (5mC). Although 5mC is the dominant form in eukaryotes, 6mA is the most prevalent form in prokaryotes. DNA is methylated by methyltransferase (MTase) enzymes, which transfer a methyl group from S-adenosyl-L-methionine (SAM) to the appropriate position on target bases (Fig. 1). Importantly, only a select few sequence motifs in each bacterial genome are targeted by MTases; for example, in Escherichia coli, 5ʹ-GATC-3ʹ is targeted by DNA adenine methylase (Dam) and 5ʹ-CCWGG-3ʹ by DNA-cytosine methyltransferase (Dcm). Within the targeted motifs, however, nearly every occurrence is methylated2. The MTase specificity domain that determines the target motif varies widely across species, resulting in a large diversity of methylated motifs across the bacterial kingdom.
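To make the idea of motif-driven methylation concrete, the following minimal Python sketch locates every occurrence of a degenerate motif (written with IUPAC codes, such as the W in CCWGG) in a sequence. The function name `find_motif_sites` and the toy sequence are illustrative assumptions, not part of any published pipeline. Because both example motifs are palindromic, scanning one strand is sufficient here.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_motif_sites(sequence, motif):
    """Return 0-based start positions of every motif match (overlaps allowed)."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead reports overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]

# Hypothetical toy sequence containing two Dam sites and two Dcm sites.
toy_genome = "TTGATCAACCAGGTTGATCCCTGGA"
dam_sites = find_motif_sites(toy_genome, "GATC")   # Dam target motif
dcm_sites = find_motif_sites(toy_genome, "CCWGG")  # Dcm target motif
print(dam_sites, dcm_sites)
```

For non-palindromic motifs, the reverse complement of the motif would also need to be searched to cover the opposite strand.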

Fig. 1: Primary types of DNA methylation in bacteria.

Chemical structures are shown for the most common forms of DNA methylation in bacteria, including N4-methylcytosine (4mC), 5-methylcytosine (5mC) and N6-methyladenine (6mA). In each instance, a methyltransferase (MTase) transfers a methyl group (CH3) from S-adenosyl-L-methionine to the unmodified nucleotide, producing a methylated nucleotide and S-adenosyl-homocysteine.

MTases function either alongside a cognate restriction enzyme as part of a restriction–modification (RM) system or as ‘orphans’ that lack a cognate restriction enzyme. DNA methylation mediated by both types of MTases has been found to play important regulatory roles in bacteria2,3,4,5,6,7,8,9,10,11,12. RM systems protect cells from invading DNA by methylating endogenous DNA and cleaving non-methylated foreign DNA2,4. RM systems are divided into three main categories based on the subunits involved and the precise site of DNA restriction13,14,15,16 (Fig. 2). Orphan MTases, such as Dam in Gammaproteobacteria, are thought to regulate DNA replication and gene expression, among other functions2. There is also emerging evidence that heterogeneity in methylation patterns within bacterial populations (often caused by phase variation of MTases) can promote adaptive selection by generating heterogeneity in gene expression and cellular phenotypes beyond those provided by genetic variation alone17,18. The vast majority of the >6,000 bacterial genomes sequenced to date have been found to encode MTases and are, therefore, likely to be subject to DNA methylation19,20. Nonetheless, the precise sequence targets and biological roles of most MTases remain unknown21, largely owing to the historical lack of high-throughput tools for detecting 6mA and 4mC. Indeed, while method development for detecting eukaryotic 5mC has flourished over the past few decades22,23,24,25, only modest methodological advances for detecting the principal forms of DNA methylation in bacterial genomes have been made over the same period.

Fig. 2: Types of restriction–modification systems.

Restriction–modification (RM) systems are divided into three categories based on the subunits involved and the precise site of DNA restriction. Type I systems are composed of a single enzyme containing restriction, modification and specificity subunits, and multisubunit complexes are necessary for both modification and restriction. They target bipartite motifs, that is, two short sub-motifs of a specific sequence that are separated by a fixed number of nonspecific nucleotides. Cleavage can occur up to several kilobases away from the non-methylated motif site14. Type II RM systems are composed of distinct methyltransferase (MTase) and restriction enzymes that target short palindromic motifs. The restriction enzyme cleaves DNA close to or within the non-methylated motif sites15. Type III RM systems consist of complexes of multiple modification and restriction subunits, with the specificity element contained in the MTase. Short, non-palindromic sequences are targeted for methylation, and non-methylated motif sites are targeted by a separate restriction enzyme, which must bind with the MTase to achieve sequence specificity and cut at a location roughly 25 bp from the motif16. In each panel, the nucleotide targeted for methylation within a motif is indicated in bold text.

The recent introduction of new sequencing technologies is beginning to address this problem (Table 1). In particular, single-molecule, real-time (SMRT) sequencing has enabled all three major forms of bacterial DNA methylation to be detected simultaneously for the first time, and this technology has been used to generate most of the >2,000 mapped bacterial and archaeal methylomes3,20,26,27,28,29,30,31 that are currently available in the centralized REBASE database19. These methylomes represent a wide variety of isolates from more than 750 distinct species, including common human pathogens such as Salmonella enterica (n = 150), E. coli (n = 123), Klebsiella pneumoniae (n = 93) and Staphylococcus aureus (n = 47).

Table 1 Methods currently used to detect methylated DNA in prokaryotes at single-nucleotide resolution

Here, we review the currently available methods for mapping bacterial methylomes, with a focus on cutting-edge technologies such as SMRT sequencing and Oxford Nanopore sequencing32. We discuss their potential to provide us with fully characterized methylomes that contain not only the full set of methylated positions and targeted motifs but also a complete map of the MTases and RM systems responsible for each methylated motif. We provide an overview of the insights into bacterial epigenomes that these new technologies have afforded and discuss how they might be used in the future to obtain a more complete understanding of bacterial epigenomes and the complex roles they play in defining interactions between bacteria and their host organisms. We do not attempt to review the rich history and foundations of bacterial epigenetics, which have been thoroughly reviewed elsewhere2,4,21,33,34.

Early methods for mapping methylomes

The bulk of methodological development for DNA methylation detection has historically been devoted to characterizing 5mC in higher eukaryotes, largely because the biological importance of 5mC in mammalian cells has been recognized for more than half a century35,36,37. However, characterization of bacterial methylomes requires alternative methods that can detect the more prevalent 6mA and 4mC in addition to 5mC. A number of different approaches have traditionally been used to characterize DNA methylation in bacterial genomes (Table 1).

Restriction enzyme-based mapping

Prokaryotic MTases are known to primarily target specific sequence motifs for methylation. The genome-wide methylation status of these motifs can often be deduced by digesting genomic DNA with one or more methyl-sensitive restriction enzymes of known specificities and analysing the pattern of cut and uncut restriction sites by next-generation sequencing (NGS)38,39. This approach is robust, reliable and accurate but is limited to the study of methylation motifs that perfectly or partially match the known specificities of available restriction enzymes. Thus, although still useful for assessing methylation events within known sequence motifs, it is generally not suitable for discovering new motifs.

Sanger sequencing-based mapping

Theoretically, the most common forms of bacterial DNA methylation can be detected as a by-product of Sanger sequencing because the presence of 4mC, 5mC and 6mA in the DNA template affects the amplitude of peaks in the sequencing trace. Although several studies have used this method to investigate the methylomes of pathogenic bacteria40,41,42,43,44, technical limitations, including subtle peak signatures and the low throughput of Sanger sequencing, have prevented it from achieving wider usage32.

Bisulfite sequencing-based mapping

Despite its reputation as the gold standard for characterizing 5mC in eukaryotic genomes (owing to its high sensitivity, accuracy and compatibility with NGS technologies), bisulfite sequencing has only quite recently been applied to the study of 5mC in bacteria45,46. More recently, it has been shown that treating genomic DNA with ten–eleven translocation (TET) enzymes before bisulfite treatment makes it possible to characterize both 5mC and 4mC bacterial methylomes47. Before bisulfite treatment, TET enzymes oxidize 5mC to 5-carboxylcytosine (5caC), which is subsequently read as a thymine in bisulfite sequencing data. By using a combination of standard bisulfite sequencing and 4mC-TET-assisted bisulfite sequencing (4mC-TAB-seq), it becomes possible to distinguish some fraction of 4mC positions from the 5mC positions; ideally, a sufficient number are detected to permit identification of the 4mC motif47. However, 6mA remains undetectable with this approach.
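The logic of combining the two bisulfite-based libraries can be summarized as a small decision table. The Python sketch below is an idealized illustration only: it assumes complete conversion of unmethylated cytosine, complete TET oxidation of 5mC and full bisulfite resistance of 4mC, none of which holds perfectly in practice (which is why only a fraction of 4mC positions is recovered). The function name `classify_cytosine` is hypothetical.

```python
def classify_cytosine(read_standard, read_tab):
    """Classify one reference cytosine from idealized base calls in two libraries.

    read_standard: base called after standard bisulfite treatment ("C" or "T")
    read_tab:      base called after TET oxidation followed by bisulfite
                   treatment, as in 4mC-TAB-seq ("C" or "T")
    """
    if read_standard == "T" and read_tab == "T":
        # Converted in both libraries: no protective methylation.
        return "unmethylated C"
    if read_standard == "C" and read_tab == "T":
        # Protected normally, but converted once TET has oxidized it to 5caC.
        return "5mC"
    if read_standard == "C" and read_tab == "C":
        # Protected in both libraries (assumption: TET leaves 4mC unchanged).
        return "4mC"
    # Read as T without TET but C with TET is not expected under this model.
    return "inconsistent"

for calls in [("T", "T"), ("C", "T"), ("C", "C")]:
    print(calls, "->", classify_cytosine(*calls))
```

In real data these calls would be made from read counts at each position rather than from a single base, and 6mA would remain invisible to both libraries.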

Newer methods for mapping methylomes

Recent innovations in so-called third-generation sequencing technologies, including SMRT sequencing and nanopore sequencing (Fig. 3), make it possible to directly interrogate native DNA molecules without PCR amplification. Importantly, these approaches retain chemical modifications in the DNA and enable many of them to be detected for the first time at a genome-wide scale.

Fig. 3: Technologies for detecting DNA methylation through direct sequencing of native DNA molecules.

a | Libraries for single-molecule, real-time (SMRT) sequencing from Pacific Biosciences consist of double-stranded DNA fragments flanked by hairpin SMRTbell adaptors that permit the polymerase to process both strands of the template. The libraries can be configured to accommodate the requirements of the specific application: short-insert libraries generate multiple subreads from both strands of the template molecule, which is useful for generating higher accuracy consensus subreads, and long-insert libraries are used to generate the longest possible subreads, which is critical for de novo assembly and detection of structural variants. b | SMRT sequencing relies on a sequencing-by-synthesis approach. A DNA polymerase is bound within a zeptolitre-scale observation chamber, called a zero-mode waveguide (ZMW), and uses a strand from the native sequencing library as a template for the read, incorporating fluorescently labelled deoxyribonucleoside triphosphates (dNTPs) as they diffuse into the ZMW. Each incorporated dNTP is briefly immobilized at the polymerase active site, emitting a fluorescent pulse in the corresponding colour channel. c | When observing the fluorescent traces produced by each ZMW, which are highly multiplexed on a chip, the order of pulses provides the read sequence, and pauses between pulses indicate the presence of a covalent modification in the template DNA. d | The 1D library preparation from Oxford Nanopore Technologies (ONT) uses a lead adaptor (loaded with a motor protein) and a tethering adaptor, which helps co-locate the molecule near the nanopore, to enable the sequencing of a single DNA strand from the molecule58. e | ONT sequencing instruments rely on engineered biological nanopores embedded in a lipid membrane to sequence single-stranded DNA (ssDNA). A voltage potential is applied across the membrane, and ssDNA is ratcheted through the nanopore by a motor protein bound to the DNA library molecule. 
f | The ionic current flowing through the nanopore depends on the precise set of nucleotides occupying the constriction point. Methylated nucleotides in the ssDNA introduce distinct current patterns, making it possible to detect the existence of modified bases relative to non-methylated DNA or precomputed models. For clarity, only two changes in current levels are shown in each box.

Direct detection of DNA methylation with single-molecule, real-time sequencing

SMRT sequencing, which is available in the commercialized RS II and Sequel instruments manufactured by Pacific Biosciences, is the first third-generation sequencing technology with a record of successfully characterizing bacterial methylomes. SMRT sequencing can simultaneously report nucleotide sequence and all three major types of DNA methylation in bacteria, albeit at different sensitivities (high for 6mA, moderate for 4mC and low for 5mC) owing to the signal-to-noise ratios specific to each modification type3,18,20,27,28,48.

In SMRT sequencing, each template molecule consists of a double-stranded native DNA fragment that has been circularized by ligating hairpin adaptors to each end26 (Fig. 3a). A DNA polymerase enzyme is bound to the hairpin-adapted template molecule, and a specially designed surface chemistry immobilizes the polymerase–template complex at the base of a zeptolitre-scale observation chamber, called a zero-mode waveguide (ZMW), which limits background fluorescence originating outside this small observation chamber26,49 (Fig. 3b). Sequencing by synthesis is initiated, and the DNA polymerase proceeds around the circularized DNA template multiple times, generating multiple subreads. During each base incorporation event, a fluorescently labelled deoxyribonucleoside triphosphate (dNTP) complementary to the template base is briefly immobilized in the ZMW observation window by the polymerase. A camera captures the resulting fluorescent pulse, and because each of the four canonical bases has a different fluorescent label, the series of pulses observed as the complementary DNA strand is synthesized can be used to construct the sequence read. Although base calling is accomplished by monitoring the order of dNTP incorporation events, DNA modifications are detected by identifying changes in the kinetics of the polymerase as it translocates along the DNA template (Fig. 3c). Specifically, the polymerase kinetics is described by the inter-pulse duration (IPD), which is the time interval between pulses that indicate nucleotide incorporation events. It has been shown that the IPD can be perturbed by the primary and secondary structures of the DNA template molecules26 and by covalent DNA modifications, including 4mC, 6mA, 5mC, 5-hydroxymethylcytosine (5hmC) and other types of DNA methylation and damage27,48,50,51. The signal-to-noise ratio for 4mC and 6mA is sufficiently high that they can be directly detected in native DNA3,18,48. 
However, detection of 5mC and 5hmC requires either high sequencing depth or additional steps to convert them to 5-formylcytosine and 5caC, both of which have higher signal-to-noise ratios48,51.

In order to detect modified nucleotides, IPD values from native DNA sequencing data can be compared with control IPD values from either methylation-free whole-genome amplified (WGA) DNA or precomputed in silico IPD models. The in silico model is trained using large amounts of sequencing data from unmodified DNA and consists of predicted IPD values for a given local sequence context28. The local sequence context surrounding the site of nucleotide incorporation strongly affects the processivity of the polymerase, and resulting fluctuations in IPD values must, therefore, be accounted for when looking for IPD deviations caused by a base modification event27,48. Owing to the extent of contact between the polymerase and the template DNA molecule, a modified base can affect the IPD values both upstream and downstream of the modified position. The resulting IPD signatures differ between 6mA, 4mC and 5mC, so it is usually possible to assign a methylation type to an observed methylation motif20,27. The vast majority of SMRT methylome studies have used a consensus approach to assess IPD values in aligned native reads from data of high sequencing depth. By comparing the native IPD values at a genomic position with either the predicted unmodified IPD value (using the in silico model) or a distribution of control IPD values obtained from WGA DNA, a simple statistical test can be used to identify methylated positions where the native and control IPD values diverge3,27,52. Alternative methods and statistical models have been proposed for methylation detection in conditions where the sequencing depth is expected to be low. For example, in metagenomic sequencing, low-abundance species would not be expected to be well covered, and for bacterial populations exhibiting heterogeneous methylation, one would not expect to find uniform methylation patterns in a given sample18,53.
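The consensus comparison of native and control IPD values at a single genomic position can be sketched as follows. This is a minimal illustration, assuming Welch's t-statistic as the "simple statistical test"; the source does not specify which test a given pipeline uses, and the function name `compare_ipds` and the IPD values are invented for the example.

```python
from math import sqrt
from statistics import mean, variance

def compare_ipds(native_ipds, control_ipds):
    """Return (IPD ratio, Welch t-statistic) for one genomic position.

    native_ipds:  IPD values from reads of native DNA covering the position
    control_ipds: IPD values from methylation-free WGA DNA at the same position
    """
    native_mean = mean(native_ipds)
    control_mean = mean(control_ipds)
    ipd_ratio = native_mean / control_mean
    # Welch's t-statistic does not assume equal variances in the two samples.
    standard_error = sqrt(
        variance(native_ipds) / len(native_ipds)
        + variance(control_ipds) / len(control_ipds)
    )
    t_stat = (native_mean - control_mean) / standard_error
    return ipd_ratio, t_stat

# Hypothetical IPD values (arbitrary units) at a putatively methylated adenine.
native = [1.8, 2.2, 2.0, 2.4, 1.6]
control = [0.5, 0.4, 0.6, 0.5, 0.5]
ratio, t_stat = compare_ipds(native, control)
print(f"IPD ratio = {ratio:.2f}, t = {t_stat:.2f}")
```

When an in silico model is used instead of WGA DNA, the control distribution is replaced by a single predicted IPD value for the local sequence context, and a one-sample test would take the place of the two-sample comparison.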

Direct detection of DNA methylation with nanopore sequencing

Nanopore sequencing has been under active development for decades, but recent progress has led to the release in 2014 of MinION, the first commercially available sequencing platform by Oxford Nanopore Technologies54,55,56,57,58,59,60,61 and, more recently, the release of the GridION and PromethION. In this technology, genetically engineered protein nanopores are placed in a lipid membrane, across which a voltage is applied to drive the negatively charged single-stranded DNA (ssDNA) through the nanopores for sequencing. Multiple protocols are available for constructing libraries for nanopore sequencing, which all involve the ligation of adaptor sequences coupled with a motor protein to double-stranded DNA (dsDNA) fragments (Fig. 3d). The tethering adaptor sequences help to concentrate the DNA fragments near the nanopore-containing lipid membrane, while the motor protein facilitates the processive ratcheting of ssDNA through the protein nanopore at a fixed rate during sequencing. Sensors monitor the current through each nanopore during this process and detect variations caused by the translocation of the polynucleotide strand obstructing the channel (Fig. 3e). These current fluctuations are a function of the roughly six specific nucleotides occupying the constricted part of the nanopore channel at a given moment (Fig. 3f) and are processed by a proprietary recurrent neural network to construct the sequence of nucleotides in the read61.

Although the vast majority of research applications to date have focused on using MinION to call the four canonical bases, current signals have been shown to differ between canonical bases and covalently modified nucleotides, which provides the possibility of detecting chemical DNA modifications62. However, the presence of modified bases can potentially complicate the base calling process, which relies on characteristic current levels produced as each k-mer (a combination of nucleotides of length k) passes through the nanopore. The presence of multiple types of base modifications greatly expands the set of possible k-mers beyond those constructed exclusively from the four canonical bases, which introduces substantial computational challenges.
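The scale of this expansion is easy to quantify. The short Python sketch below assumes k = 6 (matching the roughly six nucleotides said to occupy the pore constriction) and treats 4mC, 5mC and 6mA as three additional symbols; both choices are illustrative assumptions rather than properties of any particular base caller.

```python
def kmer_model_size(alphabet_size, k):
    """Number of distinct k-mers that a current-level model must describe."""
    return alphabet_size ** k

K = 6  # assumed k-mer length
canonical = kmer_model_size(4, K)  # A, C, G, T only
expanded = kmer_model_size(7, K)   # plus 4mC, 5mC and 6mA as distinct symbols

print(canonical, expanded, round(expanded / canonical, 1))
```

Under these assumptions, the model grows from 4,096 to 117,649 k-mers, a nearly 29-fold increase, each of which needs enough training data to estimate its characteristic current level.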

Early attempts to detect methylation during nanopore sequencing, using a variety of protein nanopore configurations and experimental conditions, focused on eukaryotic applications and, therefore, were limited to 5mC and 5hmC detection54,55,61,63. However, the introduction of the MinION device has recently broadened the development focus towards characterizing prokaryotic methylomes64,65,66,67. For instance, a variable order hidden Markov model (HMM) was trained to identify methylation events in bacterial genomes64. When paired with a hierarchical Dirichlet process (HDP) to learn current distributions from the MinION device, it can detect both 5mC and 6mA at the specific motifs included in the training data. This HMM–HDP approach can also detect these modifications in individual reads, albeit at substantially lower sensitivities than when it is applied to consensus current signals from multiple aligned reads. However, the model is constrained by the contents of its training data, which limits its ability to identify novel modification types or novel methylated motifs. Although encouraging, such model-based approaches remain limited in their ability to identify DNA methylation at various sequence contexts. A preprint article has described an alternative method for nanopore-based methylation detection that uses a statistical comparison of current signals from native and methylation-free WGA DNA66 and that builds upon the design first proposed for detecting base modifications in SMRT sequencing27. This approach is not limited to the detection of DNA methylation at specific sequence motifs and has detected several expected 4mC, 5mC and 6mA motifs in bacterial genomes carrying MTases of known specificity, although the detection accuracy fluctuates with different methylation types and motif specificities66. 
These results are promising, but detection is not yet possible at the level of single molecules, and methods such as this one that do not require a priori knowledge may not be able to distinguish between diverse forms of DNA modification events, especially in eukaryotic genomes68.

These studies have demonstrated the feasibility of nanopore-based methylation detection; however, some challenges remain. For instance, none of these nanopore-sequencing-based methods has been applied for the biological characterization of an unknown bacterial methylome. Nonetheless, the rapid pace of method development in this field and the ongoing technological advancements in the underlying sequencing technology make nanopore sequencing a promising field to watch.

Methylation motifs and methyltransferases

Comprehensive mapping of a bacterial methylome requires more than just the detection of methylated nucleotides; it also requires identification of methylation motifs and the MTases responsible for the observed methylation patterns (Fig. 4).

Fig. 4: Steps for comprehensive characterization of a bacterial methylome.

Methylome characterization is increasingly becoming a standard component of bacterial genomic research. The detection of methylated positions can lead to the identification of precise methylated sequence motifs. A methylated motif can then be assigned to the responsible methyltransferase (MTase) either by querying a database of MTases with known target motifs or by experimental means that involve comparing wild-type strains with strains in which the MTase is inactivated. Multiple lines of functional investigation can lead from this basic characterization of the primary features of a bacterial methylome. SNPs, single-nucleotide polymorphisms.

Identifying methylation motifs

DNA methylation events in prokaryotic genomes are highly motif-driven for all three of the primary methylation types. If a methylation motif is targeted by an MTase, typically >95% of motif occurrences are methylated2,18,20,53. Historically, methylation motifs for novel type II RM systems have been identified through restriction digest approaches, as in these systems, restriction occurs precisely within the specific sequence motifs. However, the restriction site cannot serve as a proxy for the methylation motif in type I and type III RM systems, in which restriction occurs at a variable distance from the site of methylation69 (Fig. 2). As a result, there was until recently a notable paucity of known type I and type III RM systems contained in REBASE. However, the introduction of SMRT sequencing and the resulting accumulation of methylome surveys have contributed to a rapidly growing catalogue of known bacterial RM systems19,29,30,70,71,72 (Fig. 5). The initial output of a methylation survey is a list of genomic positions that are likely modified. In order to infer the methylation motif from this list, tools such as MEME73 can be used to build a consensus motif from the sequence context immediately surrounding the modified position.
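The two steps described above, gathering sequence contexts around modified positions for motif discovery and then checking what fraction of a candidate motif's occurrences are modified, can be sketched in Python as follows. The helper names, the toy genome and the detected positions are hypothetical, and only one strand is considered for brevity; the extracted windows are the kind of input a motif finder such as MEME would take.

```python
import re

def extract_contexts(genome, positions, flank):
    """Return the sequence window (position +/- flank) around each modified base."""
    contexts = []
    for pos in sorted(positions):
        # Skip positions too close to either end to yield a full window.
        if pos - flank >= 0 and pos + flank < len(genome):
            contexts.append(genome[pos - flank:pos + flank + 1])
    return contexts

def motif_methylation_fraction(genome, motif_regex, offset, modified_positions):
    """Return (fraction of motif sites modified, number of motif sites).

    offset is the 0-based index of the methylated base within the motif.
    """
    target_sites = [
        m.start() + offset for m in re.finditer(f"(?={motif_regex})", genome)
    ]
    if not target_sites:
        return 0.0, 0
    methylated = sum(1 for site in target_sites if site in modified_positions)
    return methylated / len(target_sites), len(target_sites)

toy_genome = "AAGATCTTGATCGGGATCAA"  # three GATC sites
detected = {3, 15}                   # hypothetical modified adenines
contexts = extract_contexts(toy_genome, detected, flank=2)
fraction, n_sites = motif_methylation_fraction(toy_genome, "GATC", 1, detected)
print(contexts, fraction, n_sites)
```

In a real methylome, a fraction well below the >95% typical of an active MTase would suggest an incorrect motif definition, insufficient coverage or genuinely heterogeneous methylation.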

Fig. 5: The accelerating pace of methyltransferase discovery.

Historically, type II methyltransferases (MTases) have been the most amenable to discovery, primarily through restriction enzyme digest and fragment analysis. By contrast, restriction enzyme digest is not well suited to de novo discovery of methylated motifs in type I and type III restriction–modification systems because the cut sites of cognate restriction enzymes are typically located at a variable distance from the methylated motif site. The introduction of methylation detection using single-molecule, real-time (SMRT) sequencing in 2010–2012 resulted in a surge of newly discovered MTases belonging to these systems.

Identifying methyltransferases responsible for motif methylation

Gene prediction tools and homology search tools, such as SEQWARE74, are often used to identify genes likely to encode components of an RM system, including subunits responsible for restriction, specificity and MTase activity. These components are typically encoded by genes proximal to each other in the genome and can be classified by RM system type (Fig. 2) on the basis of type-specific functional domains. Once classified by type, the characteristic methylation properties of the different RM system types can be leveraged to narrow down the list of putative MTases responsible for an observed methylation motif. For instance, type I MTases target complementary bipartite motifs on both strands, while type III MTases target contiguous, non-palindromic motifs on a single strand. After narrowing the set of candidate MTase genes, the sequences of these candidates can be queried against MTase sequences with known motif specificities in REBASE19, and a high-quality sequence match is often sufficient for a confident mapping20,28. In the absence of a high-quality MTase match in REBASE, two experimental approaches can be used to identify the MTase responsible for an observed methylated motif. The first relies on heterologous expression of the putative MTase gene in an otherwise non-methylated host, such as E. coli ER2796 (refs3,28,52,75,76). Alternatively, the putative MTase can be subjected to an inactivating mutation, where the mutation is either introduced experimentally29,77 or occurs naturally in a related strain78,79. If heterologous expression of the MTase results in methylation of the motif in question or if inactivation of the MTase abolishes methylation at that motif, the causal role of that MTase is confirmed.
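The motif properties used to narrow down candidate MTases can be expressed as a rough heuristic. The Python sketch below is an assumption-laden simplification: it labels a motif as bipartite if two specific halves are separated by at least three N positions (an arbitrary threshold), as palindromic if it equals its own reverse complement, and otherwise as short and non-palindromic. Real RM systems include exceptions to all three rules, so the output is a starting hypothesis rather than a classification.

```python
import re

# Complements of IUPAC nucleotide codes.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}

def reverse_complement(motif):
    """Reverse complement of a motif written with IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(motif.upper()))

def guess_rm_type(motif):
    """Suggest which RM system type a methylated motif most resembles."""
    motif = motif.upper()
    # Two specific sub-motifs separated by a run of nonspecific positions.
    if re.fullmatch(r"[^N]+N{3,}[^N]+", motif):
        return "type I-like (bipartite)"
    if motif == reverse_complement(motif):
        return "type II-like (palindromic)"
    return "type III-like (non-palindromic)"

for example in ["GATC", "CCWGG", "AACNNNNNNGTGC", "CAGCAG"]:
    print(example, "->", guess_rm_type(example))
```

A candidate list narrowed this way would still need to be confirmed against REBASE or by the experimental approaches described above.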

Insights into restriction–modification systems from methylome analysis

RM systems often represent an important obstacle to genetic manipulation of an organism by leading to low transformation efficiencies. The design of effective shuttle vectors must, therefore, either include compatible methylation patterns to provide protective methylation or limit the number of motif sites in the vector that are subject to restriction by the host RM system80,81. Both of these approaches require a thorough understanding of the host RM repertoire and benefit from a comprehensive catalogue of known RM systems and specificities.

Perhaps the most surprising observation to come from the multitude of prokaryotic methylome studies is the remarkable diversity of MTase genes and target specificities. A recent survey of 230 diverse bacterial and archaeal epigenomes, which was enabled by SMRT sequencing, found DNA methylation in 93% of genomes across a wide diversity of methylated motifs (834 distinct motifs; averaging three motifs per organism)20. The primary driver behind this diversity is the spread of MTase-containing mobile genetic elements through horizontal gene transfer (HGT)20,82,83. Mutation events can also occur in the target recognition domain of MTase genes and thereby modify the sequence motif targeted for methylation, providing a route to further methylome diversification30. As a consequence of such diversification, researchers commonly find substantially divergent methylomes not only among species but also among different strains of the same species29,72,84,85,86,87,88,89.

Insights into methylation types

The recent surge of studies devoted to the characterization and functional examination of bacterial methylomes has built upon decades of previous work, most of which has relied on experimental approaches focused on a handful of loci in a relatively small number of well-characterized genomes4,10,38,90,91,92. The hard-won insights from these foundational studies have long hinted at an unappreciated level of complexity and regulatory potential present in modifications to the four canonical bases. Genome-wide mapping by modern methylation detection technologies is shedding new light on the distribution and roles of the three primary forms of DNA methylation in bacteria.

N4-methylcytosine

The extent of 4mC in bacterial genomes is not well understood, and its function largely remains a mystery, although it is known to be involved in multiple RM systems. 4mC occurs less frequently than 6mA in all bacteria but has been observed more often in thermophilic bacteria than in non-thermophilic bacteria, possibly because it is substantially more resistant to heat-induced deamination than 5mC, the other form of cytosine methylation found in bacteria93,94. RM-based analysis and modified bisulfite methods have been used to map 4mC in a number of bacterial genomes, including Caldicellulosiruptor kristjanssonii and Enterococcus faecalis47,81,95,96. However, SMRT sequencing is currently the most broadly applied method for 4mC detection. A variety of 4mC motif specificities have been identified in a range of species, including Bacillus cereus28, Helicobacter pylori29,30,70, Campylobacter jejuni71 and S. enterica72. Despite this progress in mapping the distribution of 4mC, its biological functions remain largely unclear. Only a single published study on the gastric carcinogenic bacterium H. pylori provides new insight into its potential functions. Deletion of the 4mC MTase M2.HpyAII altered the expression of 102 genes, resulting in decreased adherence to a human gastric adenocarcinoma cell line, reduced potential to induce inflammation and a diminished capacity for natural transformation97.

5-methylcytosine

Dcm, the orphan cytosine MTase of E. coli, has been the subject of study for several decades, and its target specificity of 5ʹ-CCWGG-3ʹ has long been known98. Dcm is known to protect bacteria against parasitism by the EcoRII RM system99, and methylation by Dcm has been associated with Tn3 transposition100, lambda phage recombination101 and the expression of ribosomal proteins during stationary phase102. However, more general insights into the biological role of 5mC in bacteria have remained somewhat elusive.

Two recent studies have taken advantage of the genome-wide and single-nucleotide resolution of bisulfite sequencing to thoroughly investigate 5mC function in Gammaproteobacteria. In the first study, deletion of Dcm and the resulting suppression of methylation at 5ʹ-CCWGG-3ʹ in E. coli resulted in increased expression of the RNA polymerase sigma factor rpoS gene and many of its target genes during stationary phase45. The second study revealed that methylation of 5ʹ-RCCGGY-3ʹ by the cytosine MTase VchM is required for optimal growth in Vibrio cholerae and affects the cell envelope stress response, potentially by downregulating genes required for modifying the lipopolysaccharide inner core of the cell envelope46. While it is tempting to conclude from these studies that 5mC in bacteria is a suppressor of gene expression, more work will be needed to confirm this role — particularly as neither study demonstrated direct regulation of transcription by 5mC methylation.

SMRT sequencing has revealed the 5mC motif specificities of active cytosine MTases in a variety of bacterial species and strains29,80,84,88,103. Identification of 5mC methylated positions in isolates of Neisseria meningitidis showed them to be mutational hot spots, indicating that 5mC methylation may play a role in genome plasticity and evolution84. An improved understanding of 5mC motif specificities has also facilitated the design of plasmids capable of overcoming barriers to transformation in an important strain of Bifidobacterium animalis80, thereby enabling the molecular mechanisms underlying the observed correlations between bifidobacteria and gut health to be studied104.

N6-methyladenine

Although traditional RM digestion-based approaches were used in a recent genome-wide mapping study of 6mA105, the majority of bacterial studies have adopted SMRT sequencing. The abundance of 6mA MTases in the bacterial world and the robust IPD signature generated by 6mA during SMRT sequencing have led to the discovery of a vast diversity of 6mA MTases and methylated motifs in bacteria, which include many previously unknown orphan MTases and a multitude of previously uncharacterized type I, II and III RM systems19,20 (Fig. 5). The elucidated 6mA methylated motifs span a wide variety of organisms across multiple phyla, including Bacteroidetes87, Firmicutes81,89,106,107, Actinobacteria31,80,85 and Proteobacteria28,29,30,70,71,72,76,77,78,84,86,88,103,108,109. Functional knockout studies in many of these organisms highlight the ability of certain 6mA MTases to induce widespread transcriptional changes3,18,31,106,110,111, while other work has revealed differentially methylated 6mA positions in response to varied growth stages and environmental conditions31,77,103.

Researchers have also taken advantage of SMRT sequencing to explore mechanisms of bacteriophage invasion and host defence. For instance, 936-type bacteriophages, which commonly infect Lactococcus lactis starters used in cheese production, have been shown to encode multiple 6mA MTases75, which likely provide the bacteriophages with protective methylation that allows them to circumvent host RM systems. Conversely, the bacteriophage exclusion system is a gene cassette that confers bacteriophage resistance in a wide range of host bacteria. Interestingly, although activity of a 6mA MTase in the cassette is required for successful host defence, phage DNA does not appear to be targeted for restriction, which suggests a novel mechanism of methylation-based host defence112.

Insights into epigenetic regulation

In addition to providing a better understanding of the modifications and modifiers involved in DNA methylation, advances in methylation detection methods are also starting to reveal the biological functions of these modifications and, in some cases, the mechanisms by which they exert their biological effects.

Methylation as a cellular regulatory signal

Several MTases have been shown to be capable of inducing consequential shifts in gene expression3,45,46,110,113. For instance, in the local competition model, competitive binding between an MTase and other DNA-binding proteins (such as transcription factors) at specific motif sites affects transcription of a nearby gene, leading to phenotypic variation within the bacterial population6,114,115,116 (Fig. 6a). In effect, the MTase methylates specific targets in some fraction of the population, thereby inducing or repressing local transcription in this fraction of the population. Canonical examples of this model in E. coli include the transcriptional regulation of the pyelonephritis-associated pili (pap) operon and the outer membrane protein-encoding gene agn43 by Dam methylation at nearby 5ʹ-GATC-3ʹ sites114,115,117. In the case of pap, which is required for adherence of uropathogenic E. coli to the host urinary tract, the local processivity of Dam is hindered by the sequence context of the 5ʹ-GATC-3ʹ in the pap promoter, which provides more time for the DNA-binding proteins to access their target sites and thereby skews the competition in their favour118. In both examples, methylation provides a means for modulating the antigenic profile of the population, thereby playing a role in immune evasion of host-adapted pathogenic bacteria.

Fig. 6: Epigenetic mechanisms of gene regulation and their consequences.

a | The methylation status at motif sites within the upstream regulatory sequence of a gene can affect its expression. The presence of methylated bases in this region can interfere with the binding of regulatory proteins, leading to either upregulation or downregulation of the gene. For instance, methylation can prevent a transcription factor (TF) from binding to its TF-binding site (TFBS), thereby preventing transcription of the downstream gene. b | Phase-variable methyltransferases (MTases) are capable of inducing genome-wide changes in methylation status and gene expression. Spontaneous and reversible frameshift mutations, often caused by slipped-strand mispairing in tandem repeat sequences during replication, induce inactivating premature stop codons in the gene encoding the modification (M) subunit. Cells containing the inactivated form of the M subunit lack methylation at the target motif, thereby providing a clonally expanded bacterial population with divergent methylation activity and distinct gene expression regimes69. In type I and type III restriction–modification (RM) systems where ON/OFF phase variation is most common, restriction activity requires both restriction (R) subunits and full-length M subunits. Therefore, both the methylation and restriction functions of these RM systems are toggled ON and OFF by these frameshift mutations. c | Genetic rearrangements, such as inversion events, within the gene encoding the specificity (S) subunit can result in the expression of multiple specificity alleles. When paired with an active M subunit, these diversified S subunits target multiple motifs for methylation106. d | If a gene affected by methylation status encodes a TF or another protein with promiscuous DNA-binding specificity, the local methylation status can potentially trigger a cascade of downstream changes on gene expression. e | DNA methylation is likely to be involved in alternative mechanisms of gene regulation. For example, methylation is known to affect the curvature of DNA molecules, which could potentially control which regions of a chromosome are exposed to the transcriptional machinery of the cell, thereby affecting gene expression.

Traditional methylation detection approaches have identified examples of antigenic variation in other Gammaproteobacteria that are generated by competition between Dam and DNA-binding proteins, including the leucine-responsive regulatory protein (Lrp) and the oxidative stress response protein (OxyR)2,4,119,120,121,122. However, the prevalence of this type of epigenetic regulation was revealed only when it became possible to systematically map the frequency and distribution of non-methylated sites with SMRT sequencing. Studies have reported several hundred non-methylated motif sites across various bacteria31,85,103,110,123; these sites are enriched at gene regulatory regions2,4,20, which suggests they are involved in competitive regulation of gene expression. Although detailed mechanisms remain to be identified in most cases, it has recently been shown through SMRT sequencing that variable site-specific Dam methylation at a 5ʹ-GATC-3ʹ motif in the regulatory region of the opvAB operon of S. enterica is responsible for determining the O-antigen chain length, which is a major determinant of phage resistance in S. enterica124,125.

DNA methylation also exerts critical regulatory signals during DNA replication. For example, specific motif sites in the replication origins of E. coli and Caulobacter crescentus (5ʹ-GATC-3ʹ and 5ʹ-GANTC-3ʹ, respectively) must be methylated for replication to occur6. Furthermore, Dam methylation of a hemimethylated 5ʹ-GATC-3ʹ site in the promoter of the transposase gene of IS10 represses its transcription during replication, presumably to ensure that potential transposition occurs only when a cell contains more than one copy of the chromosome10. Clues to the biological function of an MTase can occasionally be found by identifying genomic regions that are enriched for the methylation target motifs. For instance, enrichment of the Dam 5ʹ-GATC-3ʹ target motif near the origin of replication in E. coli and other Gammaproteobacteria has been well documented and is linked to its roles in the initiation of replication77,92. SMRT sequencing has revealed further examples of enrichment of 6mA motif sites near origins of replication in Arthrobacter and Nocardia, which indicates that this phenomenon may not be limited to Gammaproteobacteria20. Examples of over-enriched and under-enriched motif sites at other regions of the genome have been identified by SMRT in a wide variety of bacteria3,20,31,46,103, which could provide important clues about the biological purpose of the MTases responsible for their methylation.
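The idea of looking for genomic regions enriched in a target motif can be sketched in a few lines of Python. The example below is a minimal illustration, not the method used in the cited studies: it converts a motif written with IUPAC codes (such as 5ʹ-GANTC-3ʹ) into a regular expression and counts matches in consecutive windows. The function names, the window size and the toy sequence are all assumptions made for illustration, and only one strand is scanned, which is sufficient for palindromic motifs such as GATC and GANTC but not for non-palindromic ones.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def motif_positions(genome, motif):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    pattern = "".join(IUPAC[base] for base in motif)
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", genome)]

def window_counts(genome, motif, window):
    """Count motif start positions in consecutive non-overlapping windows."""
    n_windows = (len(genome) + window - 1) // window
    counts = [0] * n_windows
    for pos in motif_positions(genome, motif):
        counts[pos // window] += 1
    return counts

# Hypothetical toy sequence: a GATC-rich first window followed by a GATC-free one.
toy = "GATCAGATCTTGATCAGATC" + "ACGTTACGCA" * 2
counts = window_counts(toy, "GATC", window=20)
print(counts)
```

On a real genome, the per-window counts would be compared against an expectation (for example, the genome-wide mean density) to flag over-enriched or under-enriched regions such as the origin of replication.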

Phase variation and epigenetic heterogeneity

Phase variation of bacterial surface proteins, caused by reversible mutations at a hypervariable locus126,127,128, has long been recognized as a mediator of antigenic variation and immune evasion. However, the importance and extent of phase-variable MTases have only more recently become apparent. Hypervariable mutations in the regions regulating and encoding these MTases can result in cell-to-cell differences in their expression (which affects the methylation status of their targets; Fig. 6b) or in their target specificity (which results in methylation of a different set of targets; Fig. 6c). Consequently, heterogeneous methylation patterns can develop within a clonally expanded population, which often has dramatic and genome-wide regulatory consequences69,129,130,131,132. The set of genes affected by a particular phase-variable MTase is called a phase-variable regulon or a phasevarion69,130. Note that this phenomenon is different from the example of phase variation of surface proteins described in the previous section, in which population-level variation in methylation at specific motif sites is caused by competition between MTases and DNA-binding proteins and not by phase-variable MTases.

Phase-variable MTases were first observed almost two decades ago as hypervariable inversion events in the hsd genes of Mycoplasma pulmonis133 and Streptococcus pneumoniae134, which encode the type I RM system of these bacteria. Further examples were subsequently uncovered in Pasteurella haemolytica135, Moraxella catarrhalis136, Haemophilus influenzae130,137,138,139, H. pylori140,141,142 and N. meningitidis79,139,143,144. Their biological importance was quickly appreciated owing to their effects on multiple genes throughout the genome, but in the absence of techniques to determine the underlying motif-specific methylation events, their phase-variable behaviour could be inferred only indirectly, and the precise mechanisms by which they affect gene expression remained unknown. SMRT sequencing has since been used to characterize the target motifs and methylation sites for a number of previously identified phase-variable MTases from a range of bacteria, including ModA and ModD in N. meningitidis109,144, ModM2 in the human respiratory pathogen M. catarrhalis78 and ModA in H. influenzae76,130,137,138,139. It has also provided insights into the mechanisms that give rise to phase-variable MTases and how they regulate phasevarions. Studies aiming to characterize how phase-variable MTases in H. pylori contribute to its highly complex methylome identified multiple phase-variable MTases generated by slipped-strand mispairing in homopolymer tracts as well as an unusual type I MTase that targets multiple bipartite motifs by interacting with several target recognition domain elements; this process can generate methylome diversification through recombination within the specificity subunit (S subunit)29,30. Although it had been previously shown that phase-variable MTases in H. pylori are capable of regulating phasevarions142, SMRT sequencing demonstrated the importance of the ModH5 allele of the phase-variable ModH MTase in regulating virulence genes in H. pylori145. A study in S. pneumoniae found that a previously observed phase variation of the type I hsd system134 is capable of inducing dramatic changes in the bacterial methylome. This example is one of the most complex phasevarions characterized to date: reconfiguration of five target recognition domains in the S subunit leads to six possible MTase alleles, each with its own target specificity106. Taken together, these findings have deepened our understanding of previously identified phase-variable MTases.
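The ON/OFF switching produced by slipped-strand mispairing in a homopolymer tract can be illustrated with a toy model. The sketch below assumes a hypothetical open reading frame made of a start codon, a poly-G tract of variable length and an invented downstream sequence; none of these sequences come from a real MTase gene. A gain or loss of a single G shifts the reading frame, so that translation meets a premature stop codon (OFF), whereas tract lengths that preserve the frame reach the intended stop codon (ON).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical downstream coding sequence ending in the intended stop codon (TAA).
DOWNSTREAM = "ACTGACCTAACGCAATAA"

def build_orf(tract_len):
    """Toy ORF: start codon, poly-G tract of the given length, downstream sequence."""
    return "ATG" + "G" * tract_len + DOWNSTREAM

def first_stop(orf):
    """Return the index of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i
    return None

def is_on(tract_len):
    """ON if translation reaches the intended stop codon at the very end of the ORF."""
    orf = build_orf(tract_len)
    return first_stop(orf) == len(orf) - 3

# Expression state for a range of tract lengths produced by repeated slippage.
states = {n: ("ON" if is_on(n) else "OFF") for n in range(8, 13)}
print(states)
```

In this toy construct, tract lengths of 9 and 12 keep the gene in frame, while lengths of 8, 10 and 11 introduce a premature stop, mirroring the reversible frameshift mechanism described for the M subunit.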

Other studies have taken advantage of the hypothesis-free nature of analysing methylomes with SMRT sequencing to uncover novel phase-variable MTases in other pathogenic bacteria. For example, SMRT sequencing led to the recent discovery of MTase phase variation in the human gastric pathogen C. jejuni108 and the bovine respiratory pathogen Bibersteinia trehalosi88. The phase variation in C. jejuni was shown to affect cell adherence, invasion and biofilm formation, but additional study is required to determine the functional consequences of MTase phase variation in B. trehalosi. Use of a software package named SMALR, which was developed to extract single-molecule-level methylation status from SMRT sequencing data, revealed a new type of epigenetic heterogeneity in the marine bacterium Chromohalobacter salexigens18, in which methylation is dispersed across some, but not all, instances of a target motif. The biological reason for this observed pattern of incomplete methylation is unknown.

There is now a wealth of evidence indicating that MTase phase variation is a crucial survival mechanism for host-adapted bacteria. Variability in methylation patterns has been observed to affect gene expression and phenotypes, but future work will be required to clarify the precise mechanisms through which methylation regulates gene expression.

Epigenetic regulation of clinically important phenotypes

Of the many molecular and cellular phenotypes regulated by DNA methylation, clinically important phenotypes are of particular interest. Previous studies using traditional methods hinted at the clinical relevance of bacterial methylation. For instance, methylation of 5ʹ-GATC-3ʹ by Dam in Salmonella enterica subsp. enterica serovar Typhimurium was shown to be essential for virulence146,147. More recent studies have linked additional clinically important phenotypes to bacterial DNA methylation, and many have used SMRT sequencing to associate specific methylation motifs targeted by phase-variable MTases with particular phenotypes.

For instance, the ModA11 and ModA12 alleles of the phase-variable ModA MTase in N. meningitidis have been linked to sensitivity to several antibiotics that are typically prescribed for meningococcal disease. The phase-variable ModD MTase has also been linked to hypervirulent strains of the same pathogen79,144. The phase-variable MTase ModM in M. catarrhalis has potential roles in colonization, infection and immune evasion in humans. Specifically, a recent study observed a significant enrichment of the ModM3 allele over the more common ModM2 allele in middle ear isolates from individuals with otitis media, which suggests that genes regulated by ModM3 methylation play a part in colonization and infection78. Specific ModA alleles of H. influenzae were selected for in vivo during progression of otitis media in chinchillas, suggesting a role for DNA methylation in H. influenzae colonization and infection76. Additionally, experiments using locked variants of these phase-variable ModA alleles demonstrated regulation of a variety of clinically important pathways, such as immune evasion, biofilm formation, antibiotic susceptibility, virulence and niche adaptation76,148. These results corroborate orthogonal studies that implicate ModA phase variation as an important regulator of virulence and immune evasion149,150. In S. pneumoniae, the MTase of the SpnD39III RM system possesses six specificity alleles that are generated through genetic rearrangement and that target different motifs for methylation. These alleles have different virulence phenotypes and are selected at various stages of colonization and infection106.

Collectively, these studies imply that many bacterial pathogens exploit epigenetic switches as a flexible mechanism to regulate gene expression during host colonization and infection. Some of these mechanisms may serve as targets of potential therapeutic intervention strategies.

Towards deeper mechanistic insights

The first step in studying the functional impacts of bacterial DNA methylation is to compare global gene expression between wild-type strains and MTase mutant strains. A number of studies that used RNA sequencing for such comparisons have shown that perturbation of a single DNA MTase often results in differential expression of tens or hundreds of genes, reaching as many as a thousand in some cases3,18,31,106,110,111. These data highlight that the effects of DNA methylation in the regulation of bacterial gene expression have been underestimated but also reveal some unexpected findings. In some cases, the regulatory effects of MTases can be conclusively traced to methylation at the promoters of target genes. For instance, the ModH5 MTase in H. pylori has been shown to regulate the activity of the gene flagellin A (flaA) via methylation in the flaA promoter145. However, generally, only a small proportion (<10%) of the differentially expressed genes have methylated sites in their promoter regions3,45,46,110, which implies that the local competition model, in which a DNA MTase and other DNA-binding proteins compete for binding at the promoter of a gene, does not apply to most differentially expressed genes (Fig. 6a). Another possibility is that the methylation status at individual motif sites might regulate the expression of a transcription factor, causing a broad downstream shift in the expression of its target genes (Fig. 6d). In order to determine which mechanisms are at work, specific methylation sites must be individually mutated using genetic tools such as site-directed mutagenesis114,115,116. Multiple studies have observed a positive correlation between the number of methylation sites in a gene and the fold change of expression between wild type and MTase mutants3,46, suggesting that epigenetic regulation of expression may in fact be driven by multiple methylation sites in both the promoter region and the gene body. Another intriguing hypothesis relates to the effect of DNA methylation on the chromosome topology151,152,153, whereby methylation induces structural changes that alter the repertoire of genes exposed to the cellular transcriptional machinery (Fig. 6e).
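The proportion of differentially expressed genes that carry a methylated site in their promoter can be computed with a simple interval lookup. The following Python sketch is illustrative only: the promoter definition (200 bp upstream of the gene start), the gene coordinates and the site positions are hypothetical, and the cited studies may have defined promoter regions differently.

```python
from bisect import bisect_left, bisect_right

def promoter_interval(start, end, strand, upstream=200):
    """Return the inclusive (lo, hi) interval upstream of a gene, respecting strand."""
    if strand == "+":
        return max(0, start - upstream), start - 1
    return end + 1, end + upstream

def fraction_with_promoter_site(genes, methylated_sites, de_genes, upstream=200):
    """Fraction of differentially expressed genes with >=1 methylated promoter site."""
    sites = sorted(methylated_sites)
    hits = 0
    for name in de_genes:
        start, end, strand = genes[name]
        lo, hi = promoter_interval(start, end, strand, upstream)
        # Number of sorted site positions falling inside [lo, hi].
        if bisect_right(sites, hi) - bisect_left(sites, lo) > 0:
            hits += 1
    return hits / len(de_genes) if de_genes else 0.0

# Hypothetical annotation: gene name -> (start, end, strand), 0-based coordinates.
toy_genes = {
    "geneA": (1000, 1900, "+"),
    "geneB": (3000, 3600, "-"),
    "geneC": (5000, 5900, "+"),
    "geneD": (7000, 7500, "+"),
}
toy_sites = [950, 1500, 3300, 6000]   # hypothetical methylated positions
toy_de = ["geneA", "geneB", "geneC", "geneD"]
fraction = fraction_with_promoter_site(toy_genes, toy_sites, toy_de)
print(fraction)
```

In the toy data only geneA has a site in its promoter window; the sites at positions 1500 and 3300 fall within gene bodies and are deliberately not counted, which is the distinction the promoter-based proportion relies on.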

Comparisons with eukaryotic methylomes

Analyses of DNA methylation in eukaryotic genomes have focused on 5mC. However, even with the advent of second-generation and third-generation sequencing technologies, functional studies of 5mC in the bacterial kingdom have been rare because it is less prevalent than 6mA. Thus, comparisons between bacterial and eukaryotic methylomes have not been feasible until the recent discovery of 6mA in a number of eukaryotes154, including algae155, fungi156, worms157, insects158 and mammals159,160. These studies have revealed diverse functions for eukaryotic 6mA, including the regulation of gene expression157,158,160, regulation of transposon mobility158,160 and crosstalk with histone variants and histone modifications157,160.

The genomic distribution of 6mA differs between prokaryotes and eukaryotes. The frequency of 6mA (as a fraction of the total number of adenine residues in the genome) is orders of magnitude lower in most eukaryotes than in prokaryotes68,154. Furthermore, eukaryotic 6mA events are much less motif-driven than those in prokaryotes. For example, very few occurrences (often <3%) of 6mA motifs identified in the genomes of Chlamydomonas reinhardtii, Caenorhabditis elegans, Plasmodium falciparum and mouse embryonic stem cells (mESCs) are methylated68,155,157,160,161. One likely explanation for these observations is that modified 6mA sites are not targeted by cognate restriction enzymes in eukaryotes and, therefore, do not need to be located at specific sequence motifs. Another possible reason is that MTases have limited access to eukaryotic DNA because it is packaged in nucleosomes, and thus only exposed motifs will be methylated.
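The two quantities compared in this paragraph, the fraction of motif occurrences that are methylated and the fraction of all adenines that are methylated, can be made concrete with a short sketch. The function below is a simplified illustration under stated assumptions: it handles an exact (non-degenerate) motif on a single strand, and the toy genome and the set of 6mA calls are invented for the example.

```python
def methylation_summary(genome, motif, adenine_offset, methylated):
    """Return (fraction of motif sites methylated, fraction of adenines methylated).

    genome         -- single-strand sequence (uppercase A/C/G/T)
    motif          -- exact target motif, e.g. "GATC"
    adenine_offset -- 0-based index of the modified adenine within the motif
    methylated     -- set of 0-based genome positions called as 6mA
    """
    k = len(motif)
    # Genome position of the target adenine for every motif occurrence.
    site_adenines = [i + adenine_offset
                     for i in range(len(genome) - k + 1)
                     if genome[i:i + k] == motif]
    methylated_sites = sum(1 for pos in site_adenines if pos in methylated)
    methylated_adenines = sum(1 for pos in methylated
                              if 0 <= pos < len(genome) and genome[pos] == "A")
    total_adenines = genome.count("A")
    motif_fraction = methylated_sites / len(site_adenines) if site_adenines else 0.0
    adenine_fraction = methylated_adenines / total_adenines if total_adenines else 0.0
    return motif_fraction, adenine_fraction

toy_genome = "GATCTTGATCAAGATCCGATC"   # hypothetical sequence with four GATC sites
toy_calls = {1, 7, 13}                 # hypothetical 6mA calls
summary = methylation_summary(toy_genome, "GATC", 1, toy_calls)
print(summary)
```

A bacterial Dam-like pattern would give a motif fraction close to 1, whereas the eukaryotic examples cited above correspond to motif fractions that are often below 0.03.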

Despite these important differences, some commonalities do exist. For example, 6mA events are known to inhibit the transcription of a form of transposon called insertion elements in bacteria10,162, which is analogous to the observed enrichment of 6mA events at transposons in both C. elegans and mESCs157,160. More fundamentally, the intrinsic properties of 6mA and its effect on DNA conformation are expected to be consistent between bacteria and eukaryotes151, although different organisms may exploit these properties in different molecular and cellular contexts. Complete high-resolution maps will be the foundation for future comparisons of bacterial and eukaryotic 6mA methylomes. Although SMRT sequencing and Oxford Nanopore sequencing hold great promise for mapping DNA methylation in bacteria, their successful application to eukaryotic genomes faces critical challenges stemming from the scarcity of modified sites and the lack of clear target motifs. As recent work has suggested, 6mA detection in eukaryotes requires cross-validation by integrating complementary sequencing technologies with molecular technologies based on restriction enzymes and 6mA antibodies68.

Future perspectives

Integration with orthogonal assays for mechanistic insights

Technological breakthroughs have made it easier than ever to map bacterial methylomes. However, comprehensive studies will be necessary to fully characterize the precise mechanisms by which DNA methylation modulates gene expression and alters bacterial phenotypes. Such studies would benefit from a richer collection of functional genomics data (such as transcription factor binding assays) from many bacterial species, across different genetic backgrounds (wild type and MTase mutants) and in conditions of growth or stress. These experiments must be followed by genetic experiments that mutate and characterize specific methylation sites. In addition, future studies could test the hypothesis that the thermodynamic effect of DNA methylation induces conformational changes to a bacterial chromosome, rendering previously inaccessible genes accessible to the transcriptional machinery151,163 (Fig. 6e). Chromatin conformation capture analyses, such as Hi-C, can be used to elucidate the effects of bacterial DNA methylation on DNA conformation and, consequently, on gene transcription1,54.

Phasevarions in vaccine development

The ability of phase-variable MTases to generate antigenic diversity in host-adapted pathogens (Fig. 7) makes them highly relevant for vaccine development. Highly diverse and variable antigens do not make good vaccine candidates and are typically avoided. However, genes for outer membrane proteins or other antigens that lack simple tandem repeats (which are common indicators of phase variation) might still be subject to variable expression if they are part of a phasevarion164. Indeed, it has been shown that multiple vaccine candidates are likely subject to this epigenetic means of antigenic variation. Thus, identifying phase-variable MTases and their phasevarions in host-adapted pathogens76 is likely to facilitate more effective vaccine development.

Fig. 7: Phenotypic consequences of epigenetic heterogeneity.

a | The presence of phase-variable methyltransferases (MTases) can introduce heterogeneous methylation patterns in a clonally expanded bacterial population, leading to subpopulations with distinct gene expression regimes and phenotypes. b | Phenotypically distinct subpopulations can emerge within a colony as a result not only of genetic variation but also of epigenetic variation, that is, variation in DNA methylation status. These subpopulations serve as units of adaptive selection and provide a means of population-level flexibility in response to rapidly changing environments.

Methylation detection using nanopore sequencing

SMRT sequencing has been instrumental in enabling the study of bacterial methylomes, but other sequencing technologies, such as those commercialized by Oxford Nanopore Technologies, have the potential to make important contributions to the field of bacterial epigenetics in the near future. Assuming continued maturation of the technology and improvements in the modification detection algorithms, the very long read lengths offered by nanopore sequencing devices may provide single-molecule phased detection of bacterial DNA methylation in samples from a variety of environments. This ability will be helpful for the epigenetic study of heterogeneous bacterial samples, including metagenomic populations, where the study of methylation has so far been limited18,87. The recent use of methylation signatures as discriminative features for metagenomic binning suggests that the applications of methylation detection in long reads extend beyond identifying methylated motifs in bacteria53.

These advances come at a time when the presence and importance of DNA methylation types that have traditionally been recognized only in prokaryotes, such as 6mA, are being investigated in eukaryotes156,157,160. As these epigenetic marks become better understood, it will be interesting to see whether eukaryotic modifications share any functional traits with those found in their prokaryotic ancestors.

Conclusions

The study of bacterial methylomes has been revolutionized by the introduction of technologies capable of detecting 4mC, 5mC and 6mA at a genome-wide scale and single-nucleotide resolution. Application of these new technologies has led to a greater appreciation for the sheer quantity and diversity of methylation systems and their target specificities in bacteria. Deposition of newly discovered MTase genes and their target motifs to community databases such as REBASE19 has created a powerful resource for researchers, providing a catalogue of the RM systems that can act as barriers to efficient transformation. Technological advances have also highlighted hypervariable MTases and their consequences on genome-wide methylation, gene expression and phenotypic plasticity. Given the modern sequencing-based tools at their disposal, researchers are better equipped than ever before to probe the previously hidden epigenetic mechanisms of the bacterial realm.

References

  1. 1.

    Boyer, H. Genetic control of restriction and modification in Escherichi coli. J. Bacteriol. 88, 1652–1660 (1964).

    CAS  PubMed  PubMed Central  Google Scholar 

  2. 2.

    Casadesús, J. & Low, D. Epigenetic gene regulation in the bacterial world. Microbiol. Mol. Biol. Rev. 70, 830–856 (2006). This review discusses how bacterial DNA methylation acts as a regulatory signal in various bacteria.

    PubMed  PubMed Central  Google Scholar 

  3. 3.

    Fang, G. et al. Genome-wide mapping of methylated adenine residues in pathogenic Escherichia coli using single-molecule real-time sequencing. Nat. Biotechnol. 30, 1232–1239 (2012). This paper describes the first application of SMRT sequencing to detect bacterial 6mA events at single-base resolution and genome-wide scale.

    CAS  PubMed  Google Scholar 

  4. 4.

    Wion, D. & Casadesús, J. N6-methyl-adenine: an epigenetic signal for DNA-protein interactions. Nat. Rev. Microbiol. 4, 183–192 (2006). This review discusses the use of 6mA as a regulatory signal in various bacteria.

    CAS  PubMed  PubMed Central  Google Scholar 

  5. 5.

    Løbner-Olesen, A., Marinus, M. G. & Hansen, F. G. Role of SeqA and Dam in Escherichia coli gene expression: a global/microarray analysis. Proc. Natl Acad. Sci. USA 100, 4672–4677 (2003).

    PubMed  Google Scholar 

  6. 6.

    Low, D. a & Casadesús, J. Clocks and switches: bacterial gene regulation by DNA adenine methylation. Curr. Opin. Microbiol. 11, 106–112 (2008).

    CAS  PubMed  Google Scholar 

  7. 7.

    Boye, E., Løbner-Olesen, A. & Skarstad, K. Limiting DNA replication to once and only once. EMBO Rep. 1, 479–483 (2000).

    CAS  PubMed  PubMed Central  Google Scholar 

  8. 8.

    Boye, E., Stokke, T., Kleckner, N. & Skarstad, K. Coordinating DNA replication initiation with cell growth: differential roles for DnaA and SeqA proteins. Proc. Natl Acad. Sci. USA 93, 12206–12211 (1996).

    CAS  PubMed  Google Scholar 

  9. 9.

    Hsieh, P. Molecular mechanisms of DNA mismatch repair. Mutat. Res. 486, 71–87 (2001).

    CAS  PubMed  Google Scholar 

  10. 10.

    Roberts, D., Hoopes, B. C., McClure, W. R. & Kleckner, N. IS10 transposition is regulated by DNA adenine methylation. Cell 43, 117–130 (1985).

    CAS  PubMed  Google Scholar 

  11. 11.

    Hernday, A., Krabbe, M., Braaten, B. & Low, D. Self-perpetuating epigenetic pili switches in bacteria. Proc. Natl Acad. Sci. USA 99, (Suppl. 4), 16470–16476 (2002).

    CAS  PubMed  Google Scholar 

  12. 12.

    Waldron, D. E., Owen, P. & Dorman, C. J. Competitive interaction of the OxyR DNA-binding protein and the Dam methylase at the antigen 43 gene regulatory region in Escherichia coli. Mol. Microbiol. 44, 509–520 (2002).

    CAS  PubMed  Google Scholar 

  13. 13.

    Bickle, T. A. & Kruger, D. H. Biology of DNA restriction. Microbiol. Rev. 57, 434–450 (1993).

    CAS  PubMed  PubMed Central  Google Scholar 

  14. 14.

    Loenen, W. A. M., Dryden, D. T. F., Raleigh, E. A. & Wilson, G. G. Type i restriction enzymes and their relatives. Nucleic Acids Res. 42, 20–44 (2014).

    CAS  PubMed  Google Scholar 

  15. 15.

    Pingoud, A., Wilson, G. G. & Wende, W. Type II restriction endonucleases—a historical perspective and more. Nucleic Acids Res. 42, 7489–7527 (2014).

    CAS  PubMed  PubMed Central  Google Scholar 

  16. 16.

    Rao, D. N., Dryden, D. T. F. & Bheemanaik, S. Type III restriction-modification enzymes: a historical perspective. Nucleic Acids Res. 42, 45–55 (2014).

    CAS  PubMed  Google Scholar 

  17. 17.

    Furuta, Y. & Kobayashi, I. Mobility of DNA sequence recognition domains in DNA methyltransferases suggests epigenetics-driven adaptive evolution. Mob. Genet. Elements 2, 292–296 (2012).

    PubMed  PubMed Central  Google Scholar 

  18. 18.

    Beaulaurier, J. et al. Single molecule-level detection and long read-based phasing of epigenetic variations in bacterial methylomes. Nat. Commun. 6, 7438 (2015).

    CAS  PubMed  PubMed Central  Google Scholar 

  19. 19.

    Roberts, R. J., Vincze, T., Posfai, J. & Macelis, D. REBASE-a database for DNA restriction and modification: enzymes, genes and genomes. Nucleic Acids Res. 43, D298–D299 (2015). This paper describes the REBASE database, which has become a central repository for bacterial methylome information.

    CAS  PubMed  Google Scholar 

  20. 20.

    Blow, M. J. et al. The epigenomic landscape of prokaryotes. PLOS Genet. 12, e1005854 (2016). This study describes a comprehensive survey of the methylomes of 230 bacteria, describing the diversity of MTases and specificities.

    PubMed  PubMed Central  Google Scholar 

  21. 21.

    Davis, B. M., Chao, M. C. & Waldor, M. K. Entering the era of bacterial epigenomics with single molecule real time DNA sequencing. Curr. Opin. Microbiol. 16, 192–198 (2013).

    CAS  PubMed  PubMed Central  Google Scholar 

  22. 22.

    Plongthongkum, N., Diep, D. H. & Zhang, K. Advances in the profiling of DNA modifications: cytosine methylation and beyond. Nat. Rev. Genet. 15, 647–661 (2014).

  23.

    Bock, C. Analysing and interpreting DNA methylation data. Nat. Rev. Genet. 13, 705–719 (2012).

  24.

    Laird, P. W. Principles and challenges of genomewide DNA methylation analysis. Nat. Rev. Genet. 11, 191–203 (2010).

  25.

    Hirst, M. & Marra, M. A. Next generation sequencing based approaches to epigenomics. Brief. Funct. Genom. 9, 455–465 (2010).

  26.

    Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138 (2009). This paper is a good introduction to the concepts and technology underpinning SMRT sequencing.

  27.

    Flusberg, B. A. et al. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat. Methods 7, 461–465 (2010). This paper provides an early description of 5mC, 5hmC and 6mA detection using SMRT sequencing.

  28.

    Murray, I. A. et al. The methylomes of six bacteria. Nucleic Acids Res. 40, 11450–11462 (2012).

  29.

    Krebes, J. et al. The complex methylome of the human gastric pathogen Helicobacter pylori. Nucleic Acids Res. 42, 2415–2432 (2014). This paper describes how the application of SMRT sequencing to multiple H. pylori strains revealed unexpectedly complex methylomes and many novel methylation motifs.

  30.

    Furuta, Y. et al. Methylome diversification through changes in DNA methyltransferase sequence specificity. PLOS Genet. 10, e1004272 (2014).

  31.

    Lluch-Senar, M. et al. Comprehensive methylome characterization of Mycoplasma genitalium and Mycoplasma pneumoniae at single-base resolution. PLOS Genet. 9, e1003191 (2013).

  32.

    Korlach, J. & Turner, S. W. Going beyond five bases in DNA sequencing. Curr. Opin. Struct. Biol. 22, 251–261 (2012).

  33.

    Sánchez-Romero, M. A., Cota, I. & Casadesús, J. DNA methylation in bacteria: from the methyl group to the methylome. Curr. Opin. Microbiol. 25, 9–16 (2015).

  34.

    Casadesús, J. in DNA Methyltransferases - Role and Function (eds Jeltsch, A. & Jurkowska, R. Z.) 35–61 (Springer International Publishing, 2016).

  35.

    Razin, A. & Riggs, A. D. DNA methylation and gene function. Science 210, 604–610 (1980).

  36.

    Robertson, K. D. DNA methylation and human disease. Nat. Rev. Genet. 6, 597–610 (2005).

  37.

    Ito, S. et al. Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature 466, 1129–1133 (2010).

  38.

    Zweiger, G., Marczynski, G. & Shapiro, L. A Caulobacter DNA methyltransferase that functions only in the predivisional cell. J. Mol. Biol. 235, 472–485 (1994).

  39.

    Nelson, M., Raschke, E. & McClelland, M. Effect of site-specific methylation on restriction endonucleases and DNA modification methyltransferases. Nucleic Acids Res. 21, 3139–3154 (1993).

  40.

    Rao, B. S. & Buckler-White, A. Direct visualization of site-specific and strand-specific DNA methylation patterns in automated DNA sequencing data. Nucleic Acids Res. 26, 2505–2507 (1998).

  41.

    Bart, A., van Passel, M. W., van Amsterdam, K. & van der Ende, A. Direct detection of methylation in genomic DNA. Nucleic Acids Res. 33, e124 (2005).

  42.

    Broadbent, S. E., Balbontin, R., Casadesus, J., Marinus, M. G. & van der Woude, M. YhdJ, a nonessential CcrM-like DNA methyltransferase of Escherichia coli and Salmonella enterica. J. Bacteriol. 189, 4325–4327 (2007).

  43.

    Shell, S. S. et al. DNA methylation impacts gene expression and ensures hypoxic survival of Mycobacterium tuberculosis. PLOS Pathog. 9, e1003419 (2013).

  44.

    Bart, A., Pannekoek, Y., Dankert, J. & van der Ende, A. NmeSI restriction-modification system identified by representational difference analysis of a hypervirulent Neisseria meningitidis strain. Infect. Immun. 69, 1816–1820 (2001).

  45.

    Kahramanoglou, C. et al. Genomics of DNA cytosine methylation in Escherichia coli reveals its role in stationary phase transcription. Nat. Commun. 3, 886 (2012). This paper describes one of the first applications of bisulfite sequencing to characterize 5mC in bacteria.

  46.

    Chao, M. C. et al. A cytosine methyltransferase modulates the cell envelope stress response in the cholera pathogen. PLOS Genet. 11, 1–24 (2015).

  47.

    Yu, M. et al. Base-resolution detection of N4-methylcytosine in genomic DNA using 4mC-Tet-assisted-bisulfite sequencing. Nucleic Acids Res. 43, 1–10 (2015).

  48.

    Schadt, E. E. et al. Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases. Genome Res. 23, 129–141 (2013).

  49.

    Levene, M. J. et al. Zero-mode waveguides for single-molecule analysis at high concentrations. Science 299, 682–686 (2003).

  50.

    Clark, T. A., Spittle, K. E., Turner, S. W. & Korlach, J. Direct detection and sequencing of damaged DNA bases. Genome Integr. 2, 10 (2011).

  51.

    Clark, T. A. et al. Enhanced 5-methylcytosine detection in single-molecule, real-time sequencing via Tet1 oxidation. BMC Biol. 11, 4 (2013).

  52.

    Clark, T. A. et al. Characterization of DNA methyltransferase specificities using single-molecule, real-time DNA sequencing. Nucleic Acids Res. 40, e29 (2012).

  53.

    Beaulaurier, J. et al. Metagenomic binning and association of plasmids with bacterial host genomes using DNA methylation. Nat. Biotechnol. 36, 61–69 (2018).

  54.

    Clarke, J. et al. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol. 4, 265–270 (2009). This paper provides an early description of a protein nanopore with a covalently attached adapter that continuously differentiates between the four canonical bases and 5mC.

  55.

    Manrao, E. A. et al. Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase. Nat. Biotechnol. 30, 349–353 (2012).

  56.

    Laszlo, A. H. et al. Decoding long nanopore sequencing reads of natural DNA. Nat. Biotechnol. 32, 829–834 (2014).

  57.

    Manrao, E. A., Derrington, I. M., Pavlenok, M., Niederweis, M. & Gundlach, J. H. Nucleotide discrimination with DNA immobilized in the MspA nanopore. PLOS ONE 6, e25723 (2011).

  58.

    Jain, M., Olsen, H. E., Paten, B. & Akeson, M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 239 (2016).

  59.

    Deamer, D., Akeson, M. & Branton, D. Three decades of nanopore sequencing. Nat. Biotechnol. 34, 518–524 (2016).

  60.

    Ip, C. L. C. et al. MinION Analysis and Reference Consortium: phase 1 data release and analysis. F1000Research 4, 1075 (2015).

  61.

    de Lannoy, C., de Ridder, D. & Risse, J. A sequencer coming of age: de novo genome assembly using MinION reads. F1000Research 6, 1083 (2017).

  62.

    Laszlo, A. H. et al. Detection and mapping of 5-methylcytosine and 5-hydroxymethylcytosine with nanopore MspA. Proc. Natl Acad. Sci. USA 110, 18904–18909 (2013). This paper uses a phi29 polymerase to ratchet ssDNA through a protein nanopore and identifies the presence of 5mC and 5hmC in single DNA molecules.

  63.

    Wescoe, Z. L., Schreiber, J. & Akeson, M. Nanopores discriminate among five C5-cytosine variants in DNA. J. Am. Chem. Soc. 136, 16582–16587 (2014).

  64.

    Rand, A. C. et al. Mapping DNA methylation with high-throughput nanopore sequencing. Nat. Methods 14, 411–413 (2017). This paper describes the SignalAlign tool, which can detect multiple cytosine modifications and 6mA from nanopore sequencing data.

  65.

    McIntyre, A. B. R. et al. Nanopore detection of bacterial DNA base modifications. Preprint at bioRxiv https://doi.org/10.1101/127100 (2017).

  66.

    Stoiber, M. H. et al. De novo identification of DNA modifications enabled by genome-guided nanopore signal processing. Preprint at bioRxiv https://doi.org/10.1101/094672 (2016).

  67.

    Simpson, J. T. et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods 14, 407–410 (2017).

  68.

    Zhu, S. et al. Mapping and characterizing N6-methyladenine in eukaryotic genomes using single-molecule real-time sequencing. Genome Res. 28, 1067–1078 (2018). This article emphasizes the challenges and caveats in the use of SMRT sequencing for 6mA detection in eukaryotes. Similar challenges apply to nanopore sequencing.

  69.

    Srikhanta, Y. N., Fox, K. L. & Jennings, M. P. The phasevarion: phase variation of type III DNA methyltransferases controls coordinated switching in multiple genes. Nat. Rev. Microbiol. 8, 196–206 (2010).

  70.

    Lee, W. C. et al. The complete methylome of Helicobacter pylori UM032. BMC Genomics 16, 424 (2015).

  71.

    O’Loughlin, J. L. et al. Analysis of the Campylobacter jejuni genome by SMRT DNA sequencing identifies restriction-modification motifs. PLOS ONE 10, e0118533 (2015).

  72.

    Pirone-Davies, C. et al. Genome-wide methylation patterns in Salmonella enterica subsp. enterica Serovars. PLOS ONE 10, e0123639 (2015).

  73.

    Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37 (Suppl. 2), W202–W208 (2009).

  74.

    O’Connor, B. D., Merriman, B. & Nelson, S. F. SeqWare query engine: storing and searching sequence data in the cloud. BMC Bioinformatics 11, S2 (2010).

  75.

    Murphy, J. et al. Methyltransferases acquired by lactococcal 936-type phage provide protection against restriction endonuclease activity. BMC Genomics 15, 831 (2014).

  76.

    Atack, J. M. et al. A biphasic epigenetic switch controls immunoevasion, virulence and niche adaptation in non-typeable Haemophilus influenzae. Nat. Commun. 6, 7828 (2015). In this paper, SMRT sequencing is used to characterize ON/OFF switching of a type III RM system and its effect on immunoevasion and niche adaptation in an animal model.

  77.

    Bendall, M. L. et al. Exploring the roles of DNA methylation in the metal-reducing bacterium Shewanella oneidensis MR-1. J. Bacteriol. 195, 4966–4974 (2013).

  78.

    Blakeway, L. V. et al. ModM DNA methyltransferase methylome analysis reveals a potential role for Moraxella catarrhalis phasevarions in otitis media. FASEB J. 28, 5197–5207 (2014).

  79.

    Seib, K. L. et al. A novel epigenetic regulator associated with the hypervirulent Neisseria meningitidis clonal complex 41/44. FASEB J. 25, 3622–3633 (2011).

  80.

    O’Connell Motherway, M. et al. Identification of restriction-modification systems of Bifidobacterium animalis subsp. lactis CNCM I-2494 by SMRT sequencing and associated methylome analysis. PLOS ONE 9, e94875 (2014).

  81.

    Huo, W., Adams, H. M., Zhang, M. Q. & Palmer, K. L. Genome modification in Enterococcus faecalis OG1RF assessed by bisulfite sequencing and single-molecule real-time sequencing. J. Bacteriol. 197, 1939–1951 (2015).

  82.

    Kobayashi, I., Nobusato, A., Kobayashi-Takahashi, N. & Uchiyama, I. Shaping the genome — restriction-modification systems as mobile genetic elements. Curr. Opin. Genet. Dev. 9, 649–656 (1999).

  83.

    Conlan, S. et al. Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae. Sci. Transl Med. 6, 254ra126 (2014).

  84.

    Sater, M. R. A. et al. DNA methylation assessed by SMRT sequencing is linked to mutations in Neisseria meningitidis isolates. PLOS ONE 10, e0144612 (2015).

  85.

    Zhu, L. et al. Precision methylome characterization of Mycobacterium tuberculosis complex (MTBC) using PacBio single-molecule real-time (SMRT) technology. Nucleic Acids Res. 44, 730–743 (2016).

  86.

    Mou, K. T. et al. A comparative analysis of methylome profiles of Campylobacter jejuni sheep abortion isolate and gastroenteric strains using PacBio data. Front. Microbiol. 5, 782 (2014).

  87.

    Leonard, M. T. et al. The methylome of the gut microbiome: disparate Dam methylation patterns in intestinal Bacteroides dorei. Front. Microbiol. 5, 361 (2014).

  88.

    Anton, B. P., Harhay, G. P., Smith, T. P. L., Blom, J. & Roberts, R. J. Comparative methylome analysis of the occasional ruminant respiratory pathogen Bibersteinia trehalosi. PLOS ONE 11, e0161499 (2016).

  89.

    Chen, P. et al. Comparative genomics reveals the diversity of restriction-modification systems and DNA methylation sites in Listeria monocytogenes. Appl. Environ. Microbiol. 83, e02091-16 (2017).

  90.

    Blyn, L. B., Braaten, B. A. & Low, D. A. Regulation of pap pilin phase variation by a mechanism involving differential dam methylation states. EMBO J. 9, 4045–4054 (1990).

  91.

    Boyer, H. W. DNA restriction and modification mechanisms in bacteria. Annu. Rev. Microbiol. 25, 153–176 (1971).

  92.

    Løbner-Olesen, A., Skovgaard, O. & Marinus, M. G. Dam methylation: coordinating cellular processes. Curr. Opin. Microbiol. 8, 154–160 (2005).

  93.

    Ehrlich, M. et al. DNA methylation in thermophilic bacteria: N4-methylcytosine, 5-methylcytosine, and N6-methyladenine. Nucleic Acids Res. 13, 1399–1412 (1985).

  94.

    Ehrlich, M., Wilson, G. G., Kuo, K. C. & Gehrke, C. W. N4-methylcytosine as a minor base in bacterial DNA. J. Bacteriol. 169, 939–943 (1987).

  95.

    Chung, D., Farkas, J., Huddleston, J. R., Olivar, E. & Westpheling, J. Methylation by a unique α-class N4-cytosine methyltransferase is required for DNA transformation of Caldicellulosiruptor bescii DSM6725. PLOS ONE 7, e43844 (2012).

  96.

    Vilkaitis, G. & Klimasauskas, S. Bisulfite sequencing protocol displays both 5-methylcytosine and N4-methylcytosine. Anal. Biochem. 271, 116–119 (1999).

  97.

    Kumar, S. et al. N4-cytosine DNA methylation regulates transcription and pathogenesis in Helicobacter pylori. Nucleic Acids Res. 46, 3429–3445 (2018).

  98.

    Boyer, H. W., Chow, L. T., Dugaiczyk, A., Hedgpeth, J. & Goodman, H. M. DNA substrate site for the EcoRII restriction endonuclease and modification methylase. Nat. New Biol. 244, 40–43 (1973).

  99.

    Takahashi, N., Naito, Y., Handa, N. & Kobayashi, I. A DNA methyltransferase can protect the genome from postdisturbance attack by a restriction-modification gene complex. J. Bacteriol. 184, 6100–6108 (2002).

  100.

    Yang, M. K., Ser, S. C. & Lee, C. H. Involvement of E. coli dcm methylase in Tn3 transposition. Proc. Natl Sci. Counc. Repub. China. B. 13, 276–283 (1989).

  101.

    Korba, B. E. & Hays, J. B. Partially deficient methylation of cytosine in DNA at CCATGG sites stimulates genetic recombination of bacteriophage lambda. Cell 28, 531–541 (1982).

  102.

    Militello, K. T. et al. Conservation of Dcm-mediated cytosine DNA methylation in Escherichia coli. FEMS Microbiol. Lett. 328, 78–85 (2012).

  103.

    Kozdon, J. B. et al. Global methylation state at base-pair resolution of the Caulobacter genome throughout the cell cycle. Proc. Natl Acad. Sci. USA 110, E4658–E4667 (2013).

  104.

    O’Callaghan, A. & van Sinderen, D. Bifidobacteria and their role as members of the human gut microbiota. Front. Microbiol. 7, 925 (2016).

  105.

    Dalia, A. B., Lazinski, D. W. & Camilli, A. Characterization of undermethylated sites in Vibrio cholerae. J. Bacteriol. 195, 2389–2399 (2013).

  106.

    Manso, A. S. et al. A random six-phase switch regulates pneumococcal virulence via global epigenetic changes. Nat. Commun. 5, 5055 (2014). In this article, SMRT sequencing is used to characterize a specificity switching RM system involved in colonization and virulence of S. pneumoniae.

  107.

    Li, J. et al. Epigenetic switch driven by DNA inversions dictates phase variation in Streptococcus pneumoniae. PLOS Pathog. 12, e1005762 (2016).

  108.

    Anjum, A. et al. Phase variation of a Type IIG restriction-modification enzyme alters site-specific methylation patterns and gene expression in Campylobacter jejuni strain NCTC11168. Nucleic Acids Res. 44, 4581–4594 (2016).

  109.

    Seib, K. L. et al. Specificity of the ModA11, ModA12 and ModD1 epigenetic regulator N6-adenine DNA methyltransferases of Neisseria meningitidis. Nucleic Acids Res. 43, 4150–4162 (2015).

  110.

    Gonzalez, D., Kozdon, J. B., McAdams, H. H., Shapiro, L. & Collier, J. The functions of DNA methylation by CcrM in Caulobacter crescentus: a global approach. Nucleic Acids Res. 42, 3720–3735 (2014).

  111.

    Zhou, B. et al. The global regulatory architecture of transcription during the Caulobacter cell cycle. PLOS Genet. 11, e1004831 (2015).

  112.

    Goldfarb, T. et al. BREX is a novel phage resistance system widespread in microbial genomes. EMBO J. 34, 169–183 (2014).

  113.

    Balbontin, R. et al. DNA adenine methylation regulates virulence gene expression in Salmonella enterica serovar Typhimurium. J. Bacteriol. 188, 8160–8168 (2006).

  114.

    van der Woude, M. W., Braaten, B. & Low, D. Epigenetic phase variation of the pap operon in Escherichia coli. Trends Microbiol. 4, 5–9 (1996).

  115.

    Wallecha, A., Munster, V., Correnti, J., Chan, T. & van der Woude, M. Dam- and OxyR-dependent phase variation of agn43: essential elements and evidence for a new role of DNA methylation. J. Bacteriol. 184, 3338–3347 (2002).

  116.

    Lim, H. N. & Van Oudenaarden, A. A multistep epigenetic switch enables the stable inheritance of DNA methylation states. Nat. Genet. 39, 269–275 (2007).

  117.

    Casadesús, J. & Low, D. A. Programmed heterogeneity: epigenetic mechanisms in bacteria. J. Biol. Chem. 288, 13929–13935 (2013).

  118.

    Peterson, S. N. & Reich, N. O. GATC flanking sequences regulate Dam activity: evidence for how Dam specificity may influence pap expression. J. Mol. Biol. 355, 459–472 (2006).

  119.

    Davies, M. R., Broadbent, S. E., Harris, S. R., Thomson, N. R. & van der Woude, M. W. Horizontally acquired glycosyltransferase operons drive salmonellae lipopolysaccharide diversity. PLOS Genet. 9, e1003568 (2013).

  120.

    Broadbent, S. E., Davies, M. R. & van der Woude, M. W. Phase variation controls expression of Salmonella lipopolysaccharide modification genes by a DNA methylation-dependent mechanism. Mol. Microbiol. 77, 337–353 (2010).

  121.

    Cota, I., Blanc-Potard, A. B. & Casadesús, J. STM2209-STM2208 (opvAB): a phase variation locus of Salmonella enterica involved in control of O-antigen chain length. PLOS ONE 7, e36863 (2012).

  122.

    Camacho, E. M. & Casadesus, J. Regulation of traJ transcription in the Salmonella virulence plasmid by strand-specific DNA adenine hemimethylation. Mol. Microbiol. 57, 1700–1718 (2005).

  123.

    Cohen, N. R. et al. A role for the bacterial GATC methylome in antibiotic stress survival. Nat. Genet. 48, 581–586 (2016).

  124.

    Cota, I. et al. OxyR-dependent formation of DNA methylation patterns in OpvAB OFF and OpvAB ON cell lineages of Salmonella enterica. Nucleic Acids Res. 44, 3595–3609 (2016).

  125.

    Cota, I. et al. Epigenetic control of Salmonella enterica O-antigen chain length: a tradeoff between virulence and bacteriophage resistance. PLOS Genet. 11, e1005667 (2015).

  126.

    Jennings, M. P., Hood, D. W., Peak, I. R., Virji, M. & Moxon, E. R. Molecular analysis of a locus for the biosynthesis and phase-variable expression of the lacto-N-neotetraose terminal lipopolysaccharide structure in Neisseria meningitidis. Mol. Microbiol. 18, 729–740 (1995).

  127.

    van der Ende, A. et al. Variable expression of class 1 outer membrane protein in Neisseria meningitidis is caused by variation in the spacing between the -10 and -35 regions of the promoter. J. Bacteriol. 177, 2475–2480 (1995).

  128.

    Cerdeño-Tárraga, A. & Patrick, S. Extensive DNA inversions in the B. fragilis genome control variable gene expression. Science 307, 1463–1466 (2005).

  129.

    Henderson, I. R., Owen, P. & Nataro, J. P. Molecular switches - the ON and OFF of bacterial phase variation. Mol. Microbiol. 33, 919–932 (1999).

  130.

    Srikhanta, Y. N., Maguire, T. L., Stacey, K. J., Grimmond, S. M. & Jennings, M. P. The phasevarion: a genetic system controlling coordinated, random switching of expression of multiple genes. Proc. Natl Acad. Sci. USA 102, 5547–5551 (2005). This paper introduces the concept of a phase-variable regulon (phasevarion).

  131.

    Atack, J. M., Tan, A., Bakaletz, L. O., Jennings, M. P. & Seib, K. L. Phasevarions of bacterial pathogens: methylomics sheds new light on old enemies. Trends Microbiol. 26, 715–726 (2018).

  132.

    Atack, J. M., Yang, Y., Seib, K. L., Zhou, Y. & Jennings, M. P. A survey of type III restriction-modification systems reveals numerous, novel epigenetic regulators controlling phase-variable regulons; phasevarions. Nucleic Acids Res. 46, 3532–3542 (2018).

  133.

    Dybvig, K., Sitaraman, R. & French, C. T. A family of phase-variable restriction enzymes with differing specificities generated by high-frequency gene rearrangements. Proc. Natl Acad. Sci. USA 95, 13923–13928 (1998).

  134.

    Tettelin, H. et al. Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293, 498–506 (2001).

  135.

    Ryan, K. A. & Lo, R. Y. Characterization of a CACAG pentanucleotide repeat in Pasteurella haemolytica and its possible role in modulation of a novel type III restriction-modification system. Nucleic Acids Res. 27, 1505–1511 (1999).

  136.

    Seib, K. L., Peak, I. R. A. & Jennings, M. P. Phase variable restriction-modification systems in Moraxella catarrhalis. FEMS Immunol. Med. Microbiol. 32, 159–165 (2002).

  137.

    Zaleski, P., Wojciechowski, M. & Piekarowicz, A. The role of Dam methylation in phase variation of Haemophilus influenzae genes involved in defence against phage infection. Microbiology 151, 3361–3369 (2005).

  138.

    Fox, K. L. et al. Haemophilus influenzae phasevarions have evolved from type III DNA restriction systems into epigenetic regulators of gene expression. Nucleic Acids Res. 35, 5242–5252 (2007).

  139.

    Gawthorne, J. A., Beatson, S. A., Srikhanta, Y. N., Fox, K. L. & Jennings, M. P. Origin of the diversity in DNA recognition domains in phasevarion associated modA genes of pathogenic Neisseria and Haemophilus influenzae. PLOS ONE 7, e32337 (2012).

  140.

    De Vries, N. et al. Transcriptional phase variation of a type III restriction-modification system in Helicobacter pylori. J. Bacteriol. 184, 6615–6623 (2002).

  141.

    Skoglund, A. et al. Functional analysis of the M.HpyAIV DNA methyltransferase of Helicobacter pylori. J. Bacteriol. 189, 8914–8921 (2007).

  142.

    Srikhanta, Y. N. et al. Phasevarion mediated epigenetic gene regulation in Helicobacter pylori. PLOS ONE 6, e27569 (2011).

  143.

    Srikhanta, Y. N. et al. Phasevarions mediate random switching of gene expression in pathogenic Neisseria. PLOS Pathog. 5, e1000400 (2009).

  144.

    Jen, F. E. C., Seib, K. L. & Jennings, M. P. Phasevarions mediate epigenetic regulation of antimicrobial susceptibility in Neisseria meningitidis. Antimicrob. Agents Chemother. 58, 4219–4221 (2014).

  145.

    Srikhanta, Y. N. et al. Methylomic and phenotypic analysis of the ModH5 phasevarion of Helicobacter pylori. Sci. Rep. 7, 16140 (2017).

  146.

    Heithoff, D. M., Sinsheimer, R. L., Low, D. A. & Mahan, M. J. An essential role for DNA adenine methylation in bacterial virulence. Science 284, 967–970 (1999).

  147.

    Garcia-Del Portillo, F., Pucciarelli, M. G. & Casadesus, J. DNA adenine methylase mutants of Salmonella typhimurium show defects in protein secretion, cell invasion, and M cell cytotoxicity. Proc. Natl Acad. Sci. USA 96, 11578–11583 (1999).

  148.

    Brockman, K. L. et al. The ModA2 phasevarion of nontypeable Haemophilus influenzae regulates resistance to oxidative stress and killing by human neutrophils. Sci. Rep. 7, 3161 (2017).

  149.

    Brockman, K. L. et al. ModA2 phasevarion switching in nontypeable Haemophilus influenzae increases the severity of experimental otitis media. J. Infect. Dis. 214, 817–824 (2016).

  150.

    VanWagoner, T. M. et al. The modA10 phasevarion of nontypeable Haemophilus influenzae R2866 regulates multiple virulence-associated traits. Microb. Pathog. 92, 60–67 (2016).

  151.

    Polaczek, P., Kwan, K. & Campbell, J. L. GATC motifs may alter the conformation of DNA depending on sequence context and N6-adenine methylation status: possible implications for DNA-protein recognition. Mol. Gen. Genet. 258, 488–493 (1998).

  152.

    Le, T. B., Imakaev, M. V., Mirny, L. A. & Laub, M. T. High-resolution mapping of the spatial organization of a bacterial chromosome. Science 342, 731–734 (2013).

  153.

    Diekmann, S. DNA methylation can enhance or induce DNA curvature. EMBO J. 6, 4213–4217 (1987).

  154.

    Luo, G.-Z. & He, C. DNA N6-methyladenine in metazoans: functional epigenetic mark or bystander? Nat. Struct. Mol. Biol. 24, 503–506 (2017).

  155.

    Fu, Y. et al. N6-methyldeoxyadenosine marks active transcription start sites in Chlamydomonas. Cell 161, 879–892 (2015). This article describes one of the first studies to map 6mA events at high resolution and on the genome scale in a eukaryotic genome.

  156.

    Mondo, S. J. et al. Widespread adenine N6-methylation of active genes in fungi. Nat. Genet. 49, 964–968 (2017).

  157.

    Greer, E. L. et al. DNA methylation on N6-adenine in C. elegans. Cell 161, 868–878 (2015).

  158.

    Zhang, G. et al. N6-methyladenine DNA modification in Drosophila. Cell 161, 893–906 (2015).

  159.

    Koziol, M. J. et al. Identification of methylated deoxyadenosines in vertebrates reveals diversity in DNA modifications. Nat. Struct. Mol. Biol. 23, 24–30 (2016).

  160.

    Wu, T. P. et al. DNA methylation on N6-adenine in mammalian embryonic stem cells. Nature 532, 329–333 (2016).

  161.

    Luo, G.-Z. et al. Characterization of eukaryotic DNA N6-methyladenine by a highly sensitive restriction enzyme-assisted sequencing. Nat. Commun. 7, 11301 (2016).

  162.

    Yin, J. C., Krebs, M. P. & Reznikoff, W. S. Effect of dam methylation on Tn5 transposition. J. Mol. Biol. 199, 35–45 (1988).

  163.

    Ngo, T. T. M. et al. Effects of cytosine modifications on DNA flexibility and nucleosome mechanical stability. Nat. Commun. 7, 10813 (2016).

  164.

    Tan, A., Atack, J. M., Jennings, M. P. & Seib, K. L. The capricious nature of bacterial pathogens: phasevarions and vaccine development. Front. Immunol. 7, 586 (2016).

  165.

    Banerjee, S. & Chowdhury, R. An orphan DNA (cytosine-5-)-methyltransferase in Vibrio cholerae. Microbiology 152, 1055–1062 (2006).

Acknowledgements

The authors thank A. Tourancheau and other members of the Fang laboratory for their comments. The work was funded by R01 GM114472 (G.F.) and R01 GM128955 (G.F.) from the National Institutes of Health. G.F. is an Irma T. Hirschl/Monique Weill-Caulier Trust Research Scholar and a Nash Family Research Scholar.

Reviewer information

Nature Reviews Genetics thanks J. Casadesus, M. Oggioni and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Contributions

All authors researched data for the article, made substantial contributions to discussions of the content and reviewed and/or edited the manuscript before submission. J.B. and G.F. wrote the article.

Corresponding author

Correspondence to Gang Fang.

Ethics declarations

Competing interests

J.B. is currently employed by Oxford Nanopore Technologies, Ltd. G.F. and E.S. declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

REBASE database: http://rebase.neb.com/cgi-bin/pacbiolist?0

Glossary

Phase variation

A means by which reversible variation of protein expression is achieved in bacteria, often in an ON/OFF manner. The process creates phenotypic diversity in a clonally expanded population and allows the colony to survive in rapidly changing environments.

Adaptive selection

An evolutionary process through which surviving organisms accumulate genetic changes that lead to a fitness advantage over their progenitors.

Methylomes

The entirety of DNA methylation marks across genomes.

Bisulfite sequencing

The treatment of DNA with bisulfite chemically converts unmethylated cytosines to uracils. As methylated cytosines are unaffected, the location of methylation can be identified by sequencing the bisulfite-treated DNA.

Hidden Markov model

(HMM). A mathematical concept that describes a finite set of ‘states’ and a probabilistic model for transitioning from one state to another. The probability associated with each transition can be derived from training sets. HMMs are valuable because they enable a search or alignment algorithm to be built on firm probabilistic bases.

Hierarchical Dirichlet process

(HDP). A non-parametric Bayesian approach for modelling a collection of mixture distributions that share mixture components.

Chromatin conformation capture

A technique used to assess the spatial organization of chromosomes within a cell. Briefly, DNA is first chemically crosslinked and fragmented. The crosslinked fragments are then ligated. When sequenced, the ligated fragments produce concatemers that help reveal which regions of sequence co-locate within the cell.

Cite this article

Beaulaurier, J., Schadt, E.E. & Fang, G. Deciphering bacterial epigenomes using modern sequencing technologies. Nat Rev Genet 20, 157–172 (2019). https://doi.org/10.1038/s41576-018-0081-3