In May 2011, we were reminded that bacterial outbreaks are sudden and can be fatal. Almost 4,000 people, mostly in Germany, were infected by an Escherichia coli strain that caused serious illness; 53 died. To rapidly sequence the E. coli outbreak strain, Matthew Waldor, from Harvard Medical School and Brigham and Women's Hospital, teamed up with scientists from Pacific Biosciences.

The company's single-molecule real-time (SMRT) sequencing platform is fast and provides long reads, which are invaluable in putting a bacterial genome together. Waldor and colleagues identified the outbreak strain as a Shiga toxin–producing E. coli O104:H4, as reported in the New England Journal of Medicine in August 2011.

In the course of this sequencing, the scientists at Pacific Biosciences, led by Eric Schadt, noticed that the DNA methylation pattern in the outbreak strain differed from those of other E. coli O104 strains.

SMRT sequencing not only reports which nucleotides the polymerase incorporates but also records the polymerase kinetics. As the company's researchers showed previously, the rate at which the polymerase incorporates each nucleotide depends on the modifications present on each base. Unmodified DNA is read quickly, but if a methyl group is attached to a base, the polymerase slows down, and this kinetic variation (KV) is recorded. In the team's previous efforts, these KV differences had been detected only on short synthetic templates. As Schadt notes, “It is one thing to see the signal in a completely artificial context, and then quite another to see it 'in the wild'.” The researchers developed a statistical model that allowed them to detect base modifications genome wide (Schadt et al., 2012).
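The idea of reading methylation from polymerase kinetics can be illustrated with a toy comparison. The sketch below is not the statistical model the researchers published; it simply assumes that each genome position has a handful of per-read incorporation times from the native sample and from an unmodified control, and flags positions where the native times are clearly longer using Welch's t statistic. The function names, the threshold and the numbers are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(native, control):
    """Welch's t statistic for two samples of per-read incorporation times."""
    se = sqrt(variance(native) / len(native) + variance(control) / len(control))
    diff = mean(native) - mean(control)
    if se == 0:
        # Degenerate case: no spread in either sample.
        return float("inf") if diff > 0 else 0.0
    return diff / se


def flag_slow_positions(native, control, t_threshold=4.0):
    """Return (position, ratio, t) where the polymerase is slower on native DNA.

    native, control: dicts mapping genome position -> list of incorporation
    times. The threshold is an arbitrary illustrative choice.
    """
    flagged = []
    for pos in sorted(native):
        a, b = native[pos], control.get(pos, [])
        if len(a) < 2 or len(b) < 2:
            continue  # not enough reads to estimate a variance
        t = welch_t(a, b)
        if t > t_threshold:
            flagged.append((pos, mean(a) / mean(b), t))
    return flagged


# Made-up incorporation times (arbitrary units) for three positions.
native = {
    10: [0.9, 1.1, 1.0, 1.2],
    11: [4.8, 5.3, 5.1, 4.9],  # polymerase pauses here in the native sample
    12: [1.0, 0.8, 1.1, 0.9],
}
control = {
    10: [1.0, 1.1, 0.9, 1.0],
    11: [1.0, 0.9, 1.1, 1.0],
    12: [1.0, 1.0, 0.9, 1.1],
}

hits = flag_slow_positions(native, control)
for pos, ratio, t in hits:
    print(f"position {pos}: {ratio:.1f}x slower, t = {t:.1f}")
```

In this toy data only position 11 stands out. Real data would need a more careful model than a per-position t statistic, which is what made genome-wide detection a challenge.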

Because methylation of adenosine has a bigger effect on the KV than methylation of cytosine, the scientists profiled methylated adenosine with high sensitivity across the genome of the E. coli O104 outbreak strain, determined signature sequences around the methylated sites and matched these signatures to the DNA methyltransferases (MTases) responsible for them. That was when things got interesting for Waldor and his colleagues.
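Deriving a signature sequence from a set of methylated sites can be pictured as lining up the bases that flank each site and keeping the ones that never vary. The following is a minimal sketch under that assumption, not the researchers' pipeline; the genome string, the positions and the function name are invented for illustration, and the example is built around GATC, the site recognized by the well-known E. coli Dam methyltransferase.

```python
from collections import Counter


def consensus_around(genome, positions, flank=2):
    """Consensus of the windows centered on each modified position.

    A column is reported as a base only if every window agrees on it;
    otherwise it is reported as 'N'.
    """
    windows = [
        genome[p - flank : p + flank + 1]
        for p in positions
        if p - flank >= 0 and p + flank < len(genome)
    ]
    consensus = []
    for column in zip(*windows):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count == len(windows) else "N")
    return "".join(consensus)


# Toy sequence with three GATC sites; the listed positions are the
# 0-based indices of the adenine in each site.
genome = "TTGATCAAGGATCCTCGATCGA"
methylated = [3, 10, 17]

print(consensus_around(genome, methylated))
```

The invariant bases around the modified adenine form the signature, which can then be compared with the known recognition sequences of candidate MTases.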

Most bacteria have more than one MTase; together with a restriction endonuclease, they form the restriction modification (RM) system. “RM systems are thought to have an immune function,” Waldor explains. “The restriction enzyme cleaves foreign (unmethylated) DNA, and methylation protects the genome from being cut with the enzyme.” But functions outside of immunity are largely unexplored. Waldor was intrigued that the outbreak strain showed methylation sites that were not present in other E. coli O104 strains. The cause was the same bacteriophage that made this strain so pathogenic: the phage's genome encoded Shiga toxin as well as a new RM system. “When we knocked out the MTase and its associated nuclease,” explains Waldor, “we saw that the DNA of three other phages [integrated in the E. coli genome] was amplified.” Even more unexpectedly, the researchers found that eliminating the RM system led to changes in expression of 40% of all E. coli genes (Fang et al., 2012). For Waldor, the conclusion was that RM systems, present in the vast majority of bacteria, may be involved not just in immunity but also in the control of processes such as DNA replication and transcription. He adds, “That was cool, but we don't yet know the mechanisms of these processes.”

Much remains to be done. Researchers at Pacific Biosciences are working on increasing the sensitivity with which methylated cytosine can be detected so that its genome-wide occurrence can also be profiled. Most likely this will yield a new set of MTases with more functions to explore.