In 1943, Luria and Delbrück used a phage-resistance assay to establish spontaneous mutation as a driving force of microbial diversity¹. Mutation rates are still studied using such assays, but these can only be used to examine the small minority of mutations conferring survival in a particular condition. Newer approaches, such as long-term evolution followed by whole-genome sequencing²,³, may be skewed by mutational ‘hot’ or ‘cold’ spots³,⁴. Both approaches are affected by numerous caveats⁵,⁶,⁷. Here we devise a method, maximum-depth sequencing (MDS), to detect extremely rare variants in a population of cells through error-corrected, high-throughput sequencing. We directly measure locus-specific mutation rates in Escherichia coli and show that they vary across the genome by at least an order of magnitude. Our data suggest that certain types of nucleotide misincorporation occur 10⁴-fold more frequently than the basal rate of mutations, but are repaired in vivo. Our data also suggest specific mechanisms of antibiotic-induced mutagenesis, including downregulation of mismatch repair via oxidative stress, transcription–replication conflicts, and, in the case of fluoroquinolones, direct damage to DNA.
Luria, S. E. & Delbrück, M. Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28, 491–511 (1943)
Wielgoss, S. et al. Mutation rate inferred from synonymous substitutions in a long-term evolution experiment with Escherichia coli. G3 (Bethesda) 1, 183–186 (2011)
Lee, H., Popodi, E., Tang, H. & Foster, P. L. Rate and molecular spectrum of spontaneous mutations in the bacterium Escherichia coli as determined by whole-genome sequencing. Proc. Natl Acad. Sci. USA 109, E2774–E2783 (2012)
Martincorena, I., Seshasayee, A. S. N. & Luscombe, N. M. Evidence of non-random mutation rates suggests an evolutionary risk management strategy. Nature 485, 95–98 (2012)
Lenski, R. E., Slatkin, M. & Ayala, F. J. Mutation and selection in bacterial populations: alternatives to the hypothesis of directed mutation. Proc. Natl Acad. Sci. USA 86, 2775–2778 (1989)
Wielgoss, S. et al. Mutation rate dynamics in a bacterial population reflect tension between adaptation and genetic load. Proc. Natl Acad. Sci. USA 110, 222–227 (2013)
Drake, J. W. Contrasting mutation rates from specific-locus and long-term mutation-accumulation procedures. G3 (Bethesda) 2, 483–485 (2012)
Kinde, I., Wu, J., Papadopoulos, N., Kinzler, K. W. & Vogelstein, B. Detection and quantification of rare mutations with massively parallel sequencing. Proc. Natl Acad. Sci. USA 108, 9530–9535 (2011)
Minoche, A. E., Dohm, J. C. & Himmelbauer, H. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems. Genome Biol. 12, R112 (2011)
Lou, D. I. et al. High-throughput DNA sequencing errors are reduced by orders of magnitude using circle sequencing. Proc. Natl Acad. Sci. USA 110, 19872–19877 (2013)
Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012)
Schmitt, M. W. et al. Sequencing small genomic targets with high efficiency and extreme accuracy. Nat. Methods 12, 423–425 (2015)
Acevedo, A., Brodsky, L. & Andino, R. Mutational and fitness landscapes of an RNA virus revealed through population sequencing. Nature 505, 686–690 (2014)
Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 2006.0008 (2006)
Taniguchi, Y. et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538 (2010)
Garibyan, L. et al. Use of the rpoB gene to determine the specificity of base substitution mutations on the Escherichia coli chromosome. DNA Repair (Amst.) 2, 593–608 (2003)
Drake, J. W. A constant rate of spontaneous mutation in DNA-based microbes. Proc. Natl Acad. Sci. USA 88, 7160–7164 (1991)
Chen, X. & Zhang, J. No gene-specific optimization of mutation rate in Escherichia coli. Mol. Biol. Evol. 30, 1559–1562 (2013)
McDonald, M. J., Wang, W.-C., Huang, H.-D. & Leu, J.-Y. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences. PLoS Biol. 9, e1000622 (2011)
Zhu, L., Wang, Q., Tang, P., Araki, H. & Tian, D. Genomewide association between insertions/deletions and the nucleotide diversity in bacteria. Mol. Biol. Evol. 26, 2353–2361 (2009)
Kohanski, M. A., DePristo, M. A. & Collins, J. J. Sublethal antibiotic treatment leads to multidrug resistance via radical-induced mutagenesis. Mol. Cell 37, 311–320 (2010)
Dwyer, D. J. et al. Antibiotics induce redox-related physiological alterations as part of their lethality. Proc. Natl Acad. Sci. USA 111, E2100–E2109 (2014)
Gutierrez, A. et al. β-Lactam antibiotics promote bacterial mutagenesis via an RpoS-mediated reduction in replication fidelity. Nat. Commun. 4, 1610 (2013)
Liu, Y. & Imlay, J. A. Cell death from antibiotics without the involvement of reactive oxygen species. Science 339, 1210–1213 (2013)
Lang, G. I. & Murray, A. W. Estimating the per-base-pair mutation rate in the yeast Saccharomyces cerevisiae. Genetics 178, 67–82 (2008)
Pohlhaus, J. R. & Kreuzer, K. N. Norfloxacin-induced DNA gyrase cleavage complexes block Escherichia coli replication forks, causing double-stranded breaks in vivo. Mol. Microbiol. 56, 1416–1429 (2005)
Merrikh, H., Zhang, Y., Grossman, A. D. & Wang, J. D. Replication-transcription conflicts in bacteria. Nat. Rev. Microbiol. 10, 449–458 (2012)
Sangurdekar, D. P., Srienc, F. & Khodursky, A. B. A classification based framework for quantitative description of large-scale microarray data. Genome Biol. 7, R32 (2006)
Dutta, D., Shatalin, K., Epshtein, V., Gottesman, M. E. & Nudler, E. Linking RNA polymerase backtracking to genome instability in E. coli. Cell 146, 533–543 (2011)
Rosenberg, S. M. Evolving responsively: adaptive mutation. Nat. Rev. Genet. 2, 504–515 (2001)
Berend, D. & Tassa, T. Improved bounds on Bell numbers and on moments of sums of random variables. Probab. Math. Stat. 30, 185–205 (2010)
Woods, R. J. et al. Second-order selection for evolvability in a large Escherichia coli population. Science 331, 1433–1436 (2011)
Datsenko, K. A. & Wanner, B. L. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl Acad. Sci. USA 97, 6640–6645 (2000)
Sarkar, S., Ma, W. T. & Sandri, G. H. On fluctuation analysis: a new, simple and efficient method for computing the expected number of mutants. Genetica 85, 173–179 (1992)
We thank A. Heguy and the NYU Genome Technology Center, which is partially supported by the Cancer Center Support Grant, P30CA016087, at the Laura and Isaac Perlmutter Cancer Center. This work used computing resources at the High Performance Computing Facility of the Center for Health Informatics and Bioinformatics at the NYU Langone Medical Center. We thank D. Dwyer and K. Shankarling for materials, and T. Artemyev for his contribution. This work was supported by NIH grant R01GM107329 and HHMI (E.N.) and NCI PSOC grant U54 CA193313 (B.M.). J.J. was supported by the NYU Medical Scientist Training Program and a National Defense Science and Engineering Graduate Fellowship.
The authors declare no competing financial interests.
Reviewer Information Nature thanks N. Luscombe, I. Martincorena, J. Wang and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Extended data figures and tables
a, Mock culture composed of rpoB point mutants of known concentration was sequenced using MDS. Output concentrations of each point mutant recovered from R = 2 analysis are plotted against its input concentration (see Supplementary Information Table 2 for details). b, c, Distribution of the sizes of barcode families in four trials, shown as log₁₀(number of barcode families) per trial versus size of barcode family in reads (R). b, Trials used for the calibration run shown in a (~100 M reads total, divided into four trials). c, Representative quadruplicate trials (from rpoB of wild-type bacteria grown in LB broth with no antibiotics) taking up a total of one quarter of the output of a HiSeq rapid run, a total of ~60 M reads.
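The barcode-family bookkeeping behind panels b and c can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the toy reads, and the rule that a family is kept only when at least `min_reads` reads agree unanimously are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def group_by_barcode(reads):
    """Group (barcode, sequence) pairs into barcode families."""
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)
    return dict(families)


def family_size_histogram(families):
    """Map family size in reads -> log10(number of families of that size)."""
    counts = Counter(len(seqs) for seqs in families.values())
    return {size: math.log10(n) for size, n in sorted(counts.items())}


def unanimous_consensus(families, min_reads):
    """Keep families with >= min_reads reads that all agree (assumed rule)."""
    calls = {}
    for barcode, seqs in families.items():
        if len(seqs) >= min_reads and len(set(seqs)) == 1:
            calls[barcode] = seqs[0]
    return calls


# Hypothetical toy reads: (barcode, read sequence).
reads = [
    ("AAAC", "ACGT"), ("AAAC", "ACGT"), ("AAAC", "ACGT"),
    ("GGTA", "ACGT"), ("GGTA", "ACTT"),
    ("TTCG", "ACGA"),
]
families = group_by_barcode(reads)
hist = family_size_histogram(families)
calls = unanimous_consensus(families, min_reads=2)
```

In this toy input only the three-read family survives an R = 2 cutoff: the two-read family disagrees internally and the singleton is too small.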
a, Barcodes are attached to original DNA molecules as per MDS protocol. After linear amplification, a second barcode is attached to the opposite end of each read (see Supplementary Information: testing sample preparation and PCR efficiency). Exponential PCR is then performed. In the analysis phase, reads can be grouped both by primary barcode (that is, a classic MDS barcode family) and by a second barcode corresponding to a ‘subfamily’ of reads with the same parent from a particular linear amplification step before exponential amplification. b, The probability that for a given family only reads of one subfamily are recovered (a ‘homogeneous’ barcode) decreases exponentially with R. For example, for R = 3, the probability that all 3 reads are of the same subfamily is 0.02. c, We show the number of reads in each subfamily, sorted within each column by subfamily size, for the 1,500 largest primary barcode families in the experiment. For families of such size, it is unlikely that a single subfamily will account for more than 25% of the total number of reads recovered from that family.
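The exponential decay in panel b can be illustrated with a simple sampling model. The sketch below assumes that each of the R reads is drawn independently from the subfamilies in proportion to their share of the family, so the chance that all R reads share one subfamily is the sum of each share raised to the power R. The function names and the choice of 12 equally sized subfamilies are illustrative assumptions. The output is not expected to reproduce the 0.02 quoted in the caption, which depends on the real subfamily size distribution.

```python
def prob_homogeneous(subfamily_sizes, r):
    """P(all r reads come from one subfamily), reads drawn independently
    in proportion to subfamily size (simplifying assumption)."""
    total = sum(subfamily_sizes)
    return sum((size / total) ** r for size in subfamily_sizes)


def prob_homogeneous_uniform(n_subfamilies, r):
    """Special case of equally sized subfamilies: n * (1/n)**r = n**(1 - r)."""
    return n_subfamilies ** (1 - r)


# Illustrative assumption: 12 equally represented subfamilies.
for r in (1, 2, 3, 4):
    print(r, prob_homogeneous_uniform(12, r))
```

Under the equal-size assumption every extra read multiplies the probability by 1/n, which is the exponential decrease with R that the caption describes. Uneven subfamily sizes raise the probability relative to the uniform case.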
a, Empirically, average substitution frequency (with high frequency substitutions such as C→A excluded) stabilizes as R increases. Note, substitution frequencies are not normalized by number of generations. b, Empirical sequencing C→A error rate at C→A mutational hotspots with neighbouring Cs (same as those in Fig. 3c) versus all other positions. c, C→A substitution frequencies when 10% 8-oxoG is synthetically added to in vitro DNA and in FPG-treated samples. Frequencies are reported from ROI positions with potential 8-oxoG incorporations as described in template ‘rpoB_reverse_complement_8-oxo-Dg’. Frequencies are reported at R = 2 level. For R > 2, no C→A substitutions were found in 72,646 in vitro template sites. Data represent biological triplicates. Error bars are standard deviation.
Positive frequencies denote synonymous substitutions. Negative frequencies denote nonsynonymous substitutions. a, c, Values are averaged across quadruplicate trials. b, d, In vitro synthesized DNA has undergone 20-cycle PCR amplification using Q5 polymerase.
a, Substitution frequencies of all ROIs after ~120 generations of growth. Note that values are not normalized for the number of generations and are thus true frequencies, rather than mutation rates. b, Mutation frequencies are shown in context of their 5′ (A, C, G, or T on the x axis) and 3′ (A, C, G, or T on the y axis) neighbours. c, The relative relationship between in vivo substitution frequencies and expected errors due to sequencing and PCR (from in vitro DNA assays) is poorly described by a linear approximation (R² = 0.27). Furthermore, the recovered frequency from in vivo substitutions (R = 3) is higher than the rate of error (equivalent frequencies would be represented by the dotted line), even with the relatively relaxed read-cutoff threshold of R = 2 (the sequencing + PCR error with an R = 3 cutoff is approximately an order of magnitude lower). Templates are rpoB CDS and mrcA ROIs.
Extended Data Figure 6 Comparing substitution rate and indel rate across 5 ROIs reveals a positive correlation.
Pearson correlation coefficient = 0.76.
a–d, As calculated in fluctuation assays in wild-type cells grown in exponential phase only (a), wild-type cells grown to saturation (b), katG overexpression mutant grown to saturation (c) and inactive katG (H106Y point mutation) overexpression mutant grown to saturation (d). Growth in LB broth was supplemented with possible subinhibitory doses of ampicillin (amp), norfloxacin (nor), or gentamycin (gen). Rates are mean. Error bars are 95% CI. N = 25 (see Methods: fluctuation assays).
As calculated in fluctuation assays in anaerobic conditions (a) and in a mutS knockout (b). Note that because the transition (Ts) rate was high in MutS strains, transversion mutations could not be detected. Rates are mean. Error bars are 95% CI. N = 25 (see Methods: fluctuation assays).
Extended Data Figure 9 Rates of rpoB and mrcA substitutions in the presence of antibiotics as calculated by MDS.
Asterisks indicate cultures grown separately and prepared with Phusion rather than Q5. Although not shown, we note that only in-frame (3×) indels were observed in rpoB in fluctuation assays, as expected since frameshift indels would be deleterious. These increased in frequency by a factor of 10 on addition of norfloxacin.
Extended Data Figure 10 Schematic depicting the mathematical derivation of the false positive rate of MDS due to polymerase error.
a, The origin of various terms used in equations (2)–(7). b, Illustration of an example calculation of false positive rate given more ‘intuitive’ values of N, R and P. The false positive rate is calculated in a way that accounts for the possibility that an error in one or more ‘linear’ cycles propagates to a whole family of reads. The number of reads with an error (k) is Poisson distributed according to equation (2). The probability of a false positive is the sum of the probabilities that all R reads come from one of k families, for all possible k, according to equation (3). Note that in practice, P < 10⁻⁶, and in our study N = 12, R > 2, making the false positive rate much lower (see Fig. 1).
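The verbal description in panel b can be turned into a rough numerical sketch. Equations (2)–(7) are not reproduced in this text, so the code below is a reconstruction under stated assumptions rather than the authors' formula: N is read as the number of linear amplification cycles, P as the per-site polymerase error probability per cycle, k as the number of error-carrying linear copies (Poisson with mean N·P), and the R reads are taken to be drawn independently and uniformly from the N copies. The function name is hypothetical.

```python
import math


def false_positive_rate(n_cycles, r_reads, p_error):
    """Sum over k of Poisson(k; N*P) * (k/N)**R (assumed reconstruction).

    k counts linear-cycle copies carrying a polymerase error at the site;
    a false positive needs all R reads to descend from those k copies.
    """
    lam = n_cycles * p_error
    total = 0.0
    for k in range(1, n_cycles + 1):  # k = 0 cannot yield a false positive
        poisson_k = math.exp(-lam) * lam ** k / math.factorial(k)
        total += poisson_k * (k / n_cycles) ** r_reads
    return total


# Values quoted in the caption: N = 12, P < 1e-6, R > 2.
print(false_positive_rate(12, 3, 1e-6))
```

When N·P is small the k = 1 term dominates, so under these assumptions the rate is close to P / N^(R − 1), and each additional required read lowers it by roughly a factor of N.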
About this article
Cite this article
Jee, J., Rasouly, A., Shamovsky, I. et al. Rates and mechanisms of bacterial mutagenesis from maximum-depth sequencing. Nature 534, 693–696 (2016). https://doi.org/10.1038/nature18313