Abstract
Improvements in DNA synthesis and sequencing have underpinned comprehensive assessment of gene function in bacteria and eukaryotes. Genome-wide analyses require high-throughput methods to generate mutations and analyze their phenotypes, but approaches to date have been unable to efficiently link the effects of mutations in coding regions or promoter elements in a highly parallel fashion. We report that CRISPR–Cas9 gene editing in combination with massively parallel oligomer synthesis can enable trackable editing on a genome-wide scale. Our method, CRISPR-enabled trackable genome engineering (CREATE), links each guide RNA to homologous repair cassettes that both edit loci and function as barcodes to track genotype–phenotype relationships. We apply CREATE to site saturation mutagenesis for protein engineering, reconstruction of adaptive laboratory evolution experiments, and identification of stress tolerance and antibiotic resistance genes in bacteria. We provide preliminary evidence that CREATE will work in yeast. We also provide a webtool to design multiplex CREATE libraries.
References
Findlay, G.M., Boyle, E.A., Hause, R.J., Klein, J.C. & Shendure, J. Saturation editing of genomic regions by multiplex homology-directed repair. Nature 513, 120–123 (2014).
Shendure, J. Life after genetics. Genome Med. 6, 86 (2014).
Smanski, M.J. et al. Functional optimization of gene clusters by combinatorial design and assembly. Nat. Biotechnol. 32, 1241–1249 (2014).
Isaacs, F.J. et al. Precise manipulation of chromosomes in vivo enables genome-wide codon replacement. Science 333, 348–353 (2011).
Wang, H.H. et al. Programming cells by multiplex genome engineering and accelerated evolution. Nature 460, 894–898 (2009).
Sandoval, N.R. et al. Strategy for directing combinatorial genome engineering in Escherichia coli. Proc. Natl. Acad. Sci. USA 109, 10540–10545 (2012).
Wang, H.H. et al. Multiplexed in vivo His-tagging of enzyme pathways for in vitro single-pot multienzyme catalysis. ACS Synth. Biol. 1, 43–52 (2012).
Raman, S., Rogers, J.K., Taylor, N.D. & Church, G.M. Evolution-guided optimization of biosynthetic pathways. Proc. Natl. Acad. Sci. USA 111, 17803–17808 (2014).
Ho, J.M. et al. Efficient reassignment of a frequent serine codon in wild-type Escherichia coli. ACS Synth. Biol. 5, 163–171 (2016).
Warner, J.R., Reeder, P.J., Karimpour-Fard, A., Woodruff, L.B.A. & Gill, R.T. Rapid profiling of a microbial genome using mixtures of barcoded oligonucleotides. Nat. Biotechnol. 28, 856–862 (2010).
Wetmore, K.M. et al. Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons. MBio 6, e00306-15 (2015).
Zeitoun, R.I. et al. Multiplexed tracking of combinatorial genomic mutations in engineered cell populations. Nat. Biotechnol. 33, 631–637 (2015).
Kim, H. & Kim, J.-S. A guide to genome engineering with programmable nucleases. Nat. Rev. Genet. 15, 321–334 (2014).
Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
Jiang, Y. et al. Multigene editing in the Escherichia coli genome via the CRISPR-Cas9 system. Appl. Environ. Microbiol. 81, 2506–2514 (2015).
Li, Y. et al. Metabolic engineering of Escherichia coli using CRISPR-Cas9 meditated genome editing. Metab. Eng. 31, 13–21 (2015).
Shalem, O. et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science 343, 84–87 (2014).
Wang, T., Wei, J.J., Sabatini, D.M. & Lander, E.S. Genetic screens in human cells using the CRISPR-Cas9 system. Science 343, 80–84 (2014).
Gilbert, L.A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).
Peters, J.M. et al. A comprehensive, CRISPR-based functional analysis of essential genes in bacteria. Cell 165, 1493–1506 (2016).
Li, K., Wang, G., Andersen, T., Zhou, P. & Pu, W.T. Optimization of genome engineering approaches with the CRISPR/Cas9 system. PLoS One 9, e105779 (2014).
Zhou, Y. et al. High-throughput screening of a CRISPR/Cas9 library for functional genomics in human cells. Nature 509, 487–491 (2014).
Chen, S. et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160, 1246–1260 (2015).
Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
Koike-Yusa, H., Li, Y., Tan, E.-P., Velasco-Herrera, Mdel.C. & Yusa, K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat. Biotechnol. 32, 267–273 (2014).
Pines, G. et al. Codon compression algorithms for saturation mutagenesis. ACS Synth. Biol. 4, 604–614 (2015).
Sawitzke, J.A. et al. Probing cellular processes with oligo-mediated recombination and using the knowledge gained to optimize recombineering. J. Mol. Biol. 407, 45–59 (2011).
Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L.A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233–239 (2013).
Oh, J.-H. & van Pijkeren, J.-P. CRISPR-Cas9-assisted recombineering in Lactobacillus reuteri. Nucleic Acids Res. 42, e131 (2014).
Watson, M., Liu, J.-W. & Ollis, D. Directed evolution of trimethoprim resistance in Escherichia coli. FEBS J. 274, 2661–2671 (2007).
Toprak, E. et al. Evolutionary paths to antibiotic resistance under dynamically sustained drug selection. Nat. Genet. 44, 101–105 (2011).
Iwakura, M. et al. Evolutional design of a hyperactive cysteine- and methionine-free mutant of Escherichia coli dihydrofolate reductase. J. Biol. Chem. 281, 13234–13246 (2006).
Boehr, D.D., McElheny, D., Dyson, H.J. & Wright, P.E. The dynamic energy landscape of dihydrofolate reductase catalysis. Science 313, 1638–1642 (2006).
Bhabha, G. et al. Divergent evolution of protein conformational dynamics in dihydrofolate reductase. Nat. Struct. Mol. Biol. 20, 1243–1249 (2013).
Fisher, M.A. et al. Enhancing tolerance to short-chain alcohols by engineering the Escherichia coli AcrB efflux pump to secrete the non-native substrate n-butanol. ACS Synth. Biol. 3, 30–40 (2014).
Foo, J.L. & Leong, S.S.J. Directed evolution of an E. coli inner membrane transporter for improved efflux of biofuel molecules. Biotechnol. Biofuels 6, 81 (2013).
Tenaillon, O. et al. The molecular diversity of adaptive convergence. Science 335, 457–461 (2012).
Chang, R.L. et al. Structural systems biology evaluation of metabolic thermotolerance in Escherichia coli. Science 340, 1220–1223 (2013).
Basak, S. & Jiang, R. Enhancing E. coli tolerance towards oxidative stress via engineering its global regulator cAMP receptor protein (CRP). PLoS One 7, e51179 (2012).
Rodríguez-Verdugo, A., Gaut, B.S. & Tenaillon, O. Evolution of Escherichia coli rifampicin resistance in an antibiotic-free environment during thermal stress. BMC Evol. Biol. 13, 50 (2013).
Campbell, E.A. et al. Structural mechanism for rifampicin inhibition of bacterial RNA polymerase. Cell 104, 901–912 (2001).
White, D.G., Goldman, J.D., Demple, B. & Levy, S.B. Role of the acrAB locus in organic solvent tolerance mediated by expression of marA, soxS, or robA in Escherichia coli. J. Bacteriol. 179, 6122–6126 (1997).
Nakashima, R., Sakurai, K., Yamasaki, S., Nishino, K. & Yamaguchi, A. Structures of the multidrug exporter AcrB reveal a proximal multisite drug-binding pocket. Nature 480, 565–569 (2011).
Kohanski, M.A., Dwyer, D.J., Hayete, B., Lawrence, C.A. & Collins, J.J. A common mechanism of cellular death induced by bactericidal antibiotics. Cell 130, 797–810 (2007).
Dwyer, D.J., Kohanski, M.A. & Collins, J.J. Role of reactive oxygen species in antibiotic action and resistance. Curr. Opin. Microbiol. 12, 482–489 (2009).
Mills, T.Y., Sandoval, N.R. & Gill, R.T. Cellulosic hydrolysate toxicity and tolerance mechanisms in Escherichia coli. Biotechnol. Biofuels 2, 26 (2009).
Glebes, T.Y., Sandoval, N.R., Gillis, J.H. & Gill, R.T. Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli. Biotechnol. Bioeng. 112, 129–140 (2015).
Browning, D.F. et al. Modulation of CRP-dependent transcription at the Escherichia coli acsP2 promoter by nucleoprotein complexes: anti-activation by the nucleoid proteins FIS and IHF. Mol. Microbiol. 51, 241–254 (2004).
Sandoval, N.R., Mills, T.Y., Zhang, M. & Gill, R.T. Elucidating acetate tolerance in E. coli using a genome-wide approach. Metab. Eng. 13, 214–224 (2011).
Wolfe, A.J. The acetate switch. Microbiol. Mol. Biol. Rev. 69, 12–50 (2005).
Chiang, S.M. & Schellhorn, H.E. Regulators of oxidative stress response genes in Escherichia coli and their functional conservation in bacteria. Arch. Biochem. Biophys. 525, 161–169 (2012).
Wang, X. et al. Engineering furfural tolerance in Escherichia coli improves the fermentation of lignocellulosic sugars into renewable chemicals. Proc. Natl. Acad. Sci. USA 110, 4021–4026 (2013).
Stoebel, D.M., Hokamp, K., Last, M.S. & Dorman, C.J. Compensatory evolution of gene regulation in response to stress by Escherichia coli lacking RpoS. PLoS Genet. 5, e1000671 (2009).
Smith, A.M. et al. Highly-multiplexed barcode sequencing: an efficient method for parallel analysis of pooled samples. Nucleic Acids Res. 38, e142 (2010).
van Opijnen, T., Bodi, K.L. & Camilli, A. Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms. Nat. Methods 6, 767–772 (2009).
Lajoie, M.J. et al. Genomically recoded organisms expand biological functions. Science 342, 357–360 (2013).
Ronda, C., Pedersen, L.E., Sommer, M.O.A. & Nielsen, A.T. CRMAGE: CRISPR optimized MAGE recombineering. Sci. Rep. 6, 19452 (2016).
Maruyama, T. et al. Increasing the efficiency of precise genome editing with CRISPR-Cas9 by inhibition of nonhomologous end joining. Nat. Biotechnol. 33, 538–542 (2015).
Reisch, C.R. & Prather, K.L.J. The no-SCAR (Scarless Cas9 Assisted Recombineering) system for genome editing in Escherichia coli. Sci. Rep. 5, 15096 (2015).
Bao, Z. et al. A homology integrated CRISPR-Cas (HI-CRISPR) system for one-step multi-gene disruptions in Saccharomyces cerevisiae. ACS Synth. Biol. 4, 585–594 (2015).
Wong, A.S.L. et al. Multiplexed barcoded CRISPR-Cas9 screening enabled by CombiGEM. Proc. Natl. Acad. Sci. USA 113, 2544–2549 (2016).
Li, X.-T. et al. Identification of factors influencing strand bias in oligonucleotide-mediated recombination in Escherichia coli. Nucleic Acids Res. 31, 6674–6687 (2003).
Costantino, N. & Court, D.L. Enhanced levels of λ Red-mediated recombinants in mismatch repair mutants. Proc. Natl. Acad. Sci. USA 100, 15748–15753 (2003).
Wang, H.H., Xu, G., Vonner, A.J. & Church, G. Modified bases enable high-efficiency oligonucleotide-mediated allelic replacement via mismatch repair evasion. Nucleic Acids Res. 39, 7336–7347 (2011).
Mosberg, J.A., Gregg, C.J., Lajoie, M.J., Wang, H.H. & Church, G.M. Improving lambda red genome engineering in Escherichia coli via rational removal of endogenous nucleases. PLoS One 7, e44638 (2012).
Nyerges, Á. et al. A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species. Proc. Natl. Acad. Sci. USA 113, 2502–2507 (2016).
Alper, H., Moxley, J., Nevoigt, E., Fink, G.R. & Stephanopoulos, G. Engineering yeast transcription machinery for improved ethanol tolerance and production. Science 314, 1565–1568 (2006).
Alper, H. & Stephanopoulos, G. Global transcription machinery engineering: a new approach for improving cellular phenotype. Metab. Eng. 9, 258–267 (2007).
Gutiérrez-Ríos, R.M. et al. Regulatory network of Escherichia coli: consistency between literature knowledge and microarray profiles. Genome Res. 13, 2435–2443 (2003).
Ross, W. et al. A third recognition element in bacterial promoters: DNA binding by the alpha subunit of RNA polymerase. Science 262, 1407–1413 (1993).
Ebright, R.H., Ebright, Y.W. & Gunasekera, A. Consensus DNA site for the Escherichia coli catabolite gene activator protein (CAP): CAP exhibits a 450-fold higher affinity for the consensus DNA site than for the E. coli lac DNA site. Nucleic Acids Res. 17, 10295–10305 (1989).
Kosuri, S. et al. Scalable gene synthesis by selective amplification of DNA pools from high-fidelity microchips. Nat. Biotechnol. 28, 1295–1299 (2010).
Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
Qi, L.S. et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183 (2013).
Firth, A.E. & Patrick, W.M. GLUE-IT and PEDEL-AA: new programmes for analyzing protein diversity in randomized libraries. Nucleic Acids Res. 36, W281–5 (2008).
Datta, S., Costantino, N. & Court, D.L. A set of recombineering plasmids for gram-negative bacteria. Gene 379, 109–115 (2006).
Prior, J.E., Lynch, M.D. & Gill, R.T. Broad-host-range vectors for protein expression across gram negative hosts. Biotechnol. Bioeng. 106, 326–332 (2010).
Hamady, M., Walker, J.J., Harris, J.K., Gold, N.J. & Knight, R. Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex. Nat. Methods 5, 235–237 (2008).
Edgar, R.C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
Farasat, I. et al. Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria. Mol. Syst. Biol. 10, 731 (2014).
Bakan, A., Meireles, L.M. & Bahar, I. ProDy: protein dynamics inferred from theory and experiments. Bioinformatics 27, 1575–1577 (2011).
Bateman, A. et al. The Pfam protein families database. Nucleic Acids Res. 32, D138–D141 (2004).
Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 1.3r1 (2010).
Nakashima, R. et al. Structural basis for the inhibition of bacterial multidrug exporters. Nature 500, 102–106 (2013).
Hung, L.-W. et al. Crystal structure of AcrB complexed with linezolid at 3.5 Å resolution. J. Struct. Funct. Genomics 14, 71–75 (2013).
Rice, P.A., Yang, S., Mizuuchi, K. & Nash, H.A. Crystal structure of an IHF-DNA complex: a protein-induced DNA U-turn. Cell 87, 1295–1306 (1996).
Molodtsov, V. et al. X-ray crystal structures of the Escherichia coli RNA polymerase in complex with benzoxazinorifamycins. J. Med. Chem. 56, 4758–4763 (2013).
Murakami, K.S., Masuda, S. & Darst, S.A. Structural basis of transcription initiation: RNA polymerase holoenzyme at 4 Å resolution. Science 296, 1280–1284 (2002).
Rhee, S., Martin, R.G., Rosner, J.L. & Davies, D.R. A novel DNA-binding motif in MarA: the first structure for an AraC family transcriptional activator. Proc. Natl. Acad. Sci. USA 95, 10413–10418 (1998).
Kwon, H.J., Bennik, M.H., Demple, B. & Ellenberger, T. Crystal structure of the Escherichia coli Rob transcription factor in complex with DNA. Nat. Struct. Biol. 7, 424–430 (2000).
Acknowledgements
This work would not have been possible without the insights and efforts of a number of talented individuals. We would like to thank T. Mansell, N. Boyle, K. Fujimori, and H. Chilton for their input, feedback, guidance and contributions to this work. This work was supported by the US Department of Energy (Grant DE-SC0008812) and CAPES foundation (grant #0315133).
Author information
Contributions
R.T.G., M.C.B., G.P., S.A.L. and R.Z. all contributed to A.D.G.'s development of the concept. A.D.G., M.C.B., G.P., S.A.L. and R.T.G. all aided in the design of experiments. Scripts to automate CREATE cassette design were written by A.D.G. Library construction and recombineering were done by A.D.G. Selections, sample preparation, sequencing, clonal reconstructions and growth validations of selected variants were done by A.D.G., M.C.B., G.P., R.L. and Z.W. Sequencing data analysis was done by A.D.G. with contributions to the statistical analysis provided by R.Z. and R.T.G. The http://www.thebioverse.org web interface was developed by A.L.H.-E. Yeast validation of CREATE methodology was performed by L.L., R.L. and W.G.A. The manuscript was written by A.D.G. and R.T.G.
Ethics declarations
Competing interests
R.T.G. and A.D.G. have a patent application pending (WO/2015/123339) whose value may be affected by the publication of this paper. R.T.G., A.D.G. and A.L.H.-E. have financial interests in Muse Biotechnology Inc., which is commercializing the CREATE technology.
Integrated supplementary information
Supplementary Figure 1 Enabling flexible design strategies.
Illustration of designs compatible with the CREATE strategy. a) For protein engineering applications, a silent codon approach is taken (top, see also Fig. 1). This mutation strategy allows targeted mutagenesis of key protein regions to alter features such as DNA binding, protein-protein interactions, catalysis, or allosteric regulation. Above, a DNA-binding saturation mutagenesis library designed in this study for the global transcription factor Fis is illustrated. b) For promoter mutations, PAM sites in proximity to a specified transcription start site (TSS) can be disrupted through nucleotide replacement or integration cassettes. To simplify the design procedure used in this study, consensus CAP or UP elements were designed for integration at a fixed location relative to the TSS, without taking into account possible effects these mutations may have on proximal genes. c) An example of cassette design for mutagenizing a ribosome binding site (RBS). Although easily accommodated by the design pipeline, RBS mutagenesis was not performed for this study. d) Example of a simple deletion design. Points a and b are included to illustrate the distance between two sites at the gene deletion locus. In all cases, cassette designs disrupt a targeted PAM to allow selective enrichment of the designed mutant.
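The requirement that every cassette disrupts a targeted PAM can be illustrated with a short sketch. The Python below scans both strands of a sequence for NGG PAMs (the motif recognized by the commonly used S. pyogenes Cas9) within a window around an edit position. The function name, the 60 bp default window and the toy sequence are illustrative assumptions and do not reproduce the authors' design scripts.

```python
def find_pams(seq, edit_pos, window=60):
    """List NGG PAM sites near an edit position (hypothetical helper).

    Returns tuples (pam_start, strand, distance), nearest first.
    pam_start is the 0-based forward-strand index of the 3-nt site;
    a reverse-strand PAM shows up as CCN on the forward strand.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2):
        distance = abs(i - edit_pos)
        if distance > window:
            continue
        triplet = seq[i:i + 3]
        if triplet[1:] == "GG":      # NGG on the forward strand
            hits.append((i, "+", distance))
        if triplet[:2] == "CC":      # CCN = NGG on the reverse strand
            hits.append((i, "-", distance))
    return sorted(hits, key=lambda hit: hit[2])


# Toy sequence, not taken from the study.
print(find_pams("ATGCCATTTAGGAAA", edit_pos=8))
```

A real design step would additionally check that a synonymous substitution exists that removes the chosen PAM, which this sketch does not attempt.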
Supplementary Figure 2 Cas9 editing efficiency controls using galactokinase red/white screening.
a) The CREATE galK_120/17 off cassette (different cassettes tested shown below in b) was transformed into different backgrounds to assess the efficiency of homologous recombination between the CREATE plasmid and the target genome. Red colonies represent unedited (wt) genomic variants and white colonies represent edited variants. Transformation into cells containing only pSIM5 or pSIM5/X2 and dCas9 plasmids exhibited no detectable recombination, as indicated by the lack of white colonies. In the presence of active Cas9 (X2-Cas9, far right) we observe high-efficiency editing (>80%), indicating that dsDNA cleavage is required to achieve high-efficiency editing and library coverage. b) Different cassettes designed to test the requirements of efficient editing using the CREATE strategy. The stop codon edit is shown in blue for each cassette, and the synonymous PAM mutation (red) is positioned at increasing distances from the edit site (17, 44 and 59 bp).
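As a minimal sketch of how a red/white screen translates into an editing efficiency, the Python below computes the fraction of white (edited) colonies together with a Wilson score interval. The interval is an addition for illustration; the source does not state how uncertainty on colony counts was handled, and the counts used here are hypothetical.

```python
import math


def editing_efficiency(white, red, z=1.96):
    """Fraction of edited (white) colonies with a Wilson score interval.

    z=1.96 corresponds to an approximate 95% interval (assumed choice).
    Returns (efficiency, lower_bound, upper_bound).
    """
    n = white + red
    if n == 0:
        raise ValueError("no colonies counted")
    p = white / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, center - half, center + half


# Hypothetical plate: 85 white and 15 red colonies.
efficiency, low, high = editing_efficiency(white=85, red=15)
print(f"{efficiency:.2f} ({low:.2f} to {high:.2f})")
```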
Supplementary Figure 3 Toxicity of gRNA dsDNA cleavage in E. coli.
a) The toxicity of a single gRNA cut in E. coli as observed in control experiments with a gRNA targeting galK (spacer sequence TTAACTTTGCGTAACAACGC) or folA (spacer sequence GTAATTTTGTATAGAATTTA). In the absence of a repair template we observe strong killing from the gRNA. Rescue efficiencies of 10³–10⁴ are observed upon co-transformation of a single-stranded donor oligo, indicating the need for a homologous repair template to alleviate this toxicity. b) Toxicity of multiple CREATE edits. The targeted sites are illustrated graphically on the left and at the bottom of the bar graph. A non-targeting gRNA control was used to estimate transformation efficiency based on no edits (far left, no target sites). CREATE cassettes targeted either folA (red), galK (green), or a combination of the two. Note the multiplicative toxicity in E. coli of having additional gRNAs expressed from the same plasmid. In this scenario a homologous repair template is present for each site, suggesting that off-target gRNA cleavage would be highly lethal. These data suggest that cassettes causing off-target cleavage would be selectively removed from the population early in the library construction phase.
Supplementary Figure 4 Test of CREATE strategy for gene deletions.
a) Cassette design for deleting 100 bp from the galK ORF. The HA is designed to recombine with regions of homology with the designated spacing, with each 50 bp side of the CREATE HA designed to recombine at the designated site (blue). The PAM/spacer location (red) is proximal to one of the homology arms and is deleted during recombination, allowing selectable enrichment of the deleted segment. b) Electrophoresis of chromosomal PCR amplicons from clones recombineered with this cassette. c) Design for 700 bp deletion as in a). d) Colony PCR of 700 bp deletion cassettes as in b). The asterisks in b) and d) indicate colonies that appear to have the designed deletion. Note that some clones appear to have bands pertaining to both wt and deletion sizes, indicating that chromosome segregation in some of the colonies is incomplete when plated 3 hours post recombineering (ref. 28).
Supplementary Figure 5 Editing efficiency controls by co-transformation of gRNA and linear dsDNA cassettes.
Effect of PAM distance on editing efficiency using linear dsDNA PCR amplicons and co-transformation with a gRNA. On the left is an illustration of the experiments: PCR amplicons were designed to contain a dual (TAATAA) stop codon on one side (asterisk) and a PAM mutation just downstream of the galK gene (gray box) on the other end. These PCR amplicons were co-transformed with a gRNA targeting the downstream galK PAM site. The primers were designed such that the mutations were 40 nt from the end of the amplicon to ensure enough homology for recombination. Data were obtained from these experiments by red/white colony screening. A linear fit to the data is shown at the bottom. Cassettes in which only the PAM mutation is present were included as assay controls and were observed to have very low rates of GalK inactivation. These experiments were performed in a BW25113 strain of E. coli in which the mutS gene was knocked out to allow high-efficiency editing with double-stranded DNA templates. This approach in MG1655 did not achieve high-efficiency editing due to the active mutS allele.
Supplementary Figure 6 Library cloning analysis and statistics.
a) Reads from the plasmid library following cloning are shown according to the number of total mismatches between the read and the target design sequence. The majority of plasmids are matches to the correct design. However, a large number of 4 base pair indel/mismatch mutants were observed in this cloned population. b) Plot of the mutation profile for the plasmid pool as a function of cassette position. An increase in the mutation frequency is observed near the center of the homology arm (HA), indicating a small error bias in the sequencing or synthesis of this region. We suspect that this is due to the presence of sequences complementary to the spacer element in the gRNA. c) Histogram of the distances between the PAM and codon for the CREATE cassettes designed in this study. The large majority (>95%) were within the design constraints tested in Fig. 2. The small fraction beyond 60 bp were made in cases where there was no synonymous PAM mutation within closer proximity. d) Library coverage from multiplexed cloning of CREATE plasmids. Deep sequencing counts of each variant are shown with respect to their position on the genome. The inset shows a histogram of the number of variants having the indicated plasmid counts in the cloned libraries.
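The mismatch tally in panel a) can be sketched as a histogram of Hamming distances between each read and its design sequence. The Python below is a simplified illustration that handles substitutions only; counting indels, as the caption does, would need an alignment step that is omitted here. Function names and the toy reads are assumptions.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_histogram(reads, design):
    """Count reads by number of mismatches to the design sequence.

    Reads whose length differs from the design are skipped, since
    this sketch does not align indel-containing reads.
    """
    return Counter(hamming(r, design) for r in reads if len(r) == len(design))


# Toy data, not taken from the study.
design = "ACGTACGT"
reads = ["ACGTACGT", "ACGTACGA", "TCGTACGA", "ACGT"]
print(mismatch_histogram(reads, design))
```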
Supplementary Figure 7 Precision of CREATE cassette tracking of recombineered populations.
a) Correlation plot of CREATE cassette read frequencies in the plasmid population prior to Cas9 exposure (x-axis) and 3 hours after transformation into a Cas9 background (y-axis). b) Correlation between replicate recombineering reactions following overnight recovery. The gray lines indicate the line of perfect correlation for reference. R² and p values were calculated from a linear fit to the data using the Python SciPy statistics package. A counting threshold of 5 for each replicate experiment was applied to the data to filter out noise from each data set.
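The filtering and correlation steps described above might look like the following sketch. It uses only the Python standard library and reports R² without a p value, whereas the caption states that SciPy was used. Treating the threshold as "at least 5 reads in both samples" and normalizing frequencies over the retained cassettes are assumptions, as are the function names and toy counts.

```python
def shared_frequencies(counts_a, counts_b, threshold=5):
    """Keep cassettes with >= threshold reads in both samples and
    convert their counts to frequencies (assumed interpretation)."""
    keep = sorted(k for k in counts_a
                  if counts_a[k] >= threshold and counts_b.get(k, 0) >= threshold)
    if len(keep) < 2:
        raise ValueError("need at least two cassettes above threshold")
    total_a = sum(counts_a[k] for k in keep)
    total_b = sum(counts_b[k] for k in keep)
    return ([counts_a[k] / total_a for k in keep],
            [counts_b[k] / total_b for k in keep])


def r_squared(x, y):
    """Squared Pearson correlation, equal to R² of a simple linear fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical read counts for two replicates; c4 falls below threshold.
rep1 = {"c1": 10, "c2": 20, "c3": 30, "c4": 2}
rep2 = {"c1": 20, "c2": 40, "c3": 60, "c4": 100}
x, y = shared_frequencies(rep1, rep2)
print(round(r_squared(x, y), 3))
```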
Supplementary Figure 8 Growth characteristics of folA mutations in M9 minimal media.
While F153R appears to maintain normal growth characteristics, the growth rate of the F153W mutation is significantly slower under these conditions, suggesting that these two amino acid substitutions at the same site have very different effects on organismal fitness presumably due to different changes invoked in the stability/dynamics of this protein.
Supplementary Figure 9 Enrichment profiles for folA CREATE cassettes in minimal media.
Cassettes that encode synonymous HA are shown in black and non-synonymous cassettes in gray. The dashed lines indicate enrichment scores with p<0.05 significance compared to the synonymous population mean, as estimated from a bootstrap analysis (Supplementary Methods). The enrichment score observed for each mutant cassette at each position in the protein sequence is shown on the left, and a histogram of these enrichment scores as a fraction of the total variants on the right. The two populations appear to be largely similar. Conserved residues that are highly deleterious are shown in blue for reference.
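The exact scoring and bootstrap procedure are given in the Supplementary Methods, which are not reproduced here. As a hedged sketch of the general idea, the Python below assumes that an enrichment score is the log2 ratio of cassette frequency after and before selection, and that significance bounds come from bootstrap resampling of the mean score of synonymous cassettes. The pseudocount, resample count, function names and toy scores are all assumptions.

```python
import math
import random
import statistics


def enrichment_score(pre_count, post_count, pre_total, post_total, pseudocount=1):
    """log2 fold change in cassette frequency (assumed definition)."""
    pre = (pre_count + pseudocount) / pre_total
    post = (post_count + pseudocount) / post_total
    return math.log2(post / pre)


def bootstrap_mean_bounds(scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bounds on the bootstrap distribution of the mean score."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = sorted(
        statistics.fmean(rng.choices(scores, k=len(scores)))
        for _ in range(n_boot)
    )
    low = means[int((alpha / 2) * n_boot)]
    high = means[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Hypothetical synonymous-cassette scores.
synonymous = [-0.2, -0.1, 0.0, 0.1, 0.2]
low, high = bootstrap_mean_bounds(synonymous)
score = enrichment_score(pre_count=9, post_count=39, pre_total=1000, post_total=1000)
print(score, score > high)
```

A non-synonymous cassette whose score falls outside the bounds would be flagged under this simplified scheme.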
Supplementary Figure 10 Validation of newly identified acrB mutations for improved solvent and antibiotic tolerance.
a) On the left is a global overview of the AcrB efflux pump. Substrates enter the pump through the openings in the periplasmic space and are extruded via the AcrB/AcrA/TolC complex across the outer membrane and into the extracellular space. Library-targeted residues are highlighted by blue spheres for reference, and the red dot indicates the region where many of the enriched variants clustered. On the right is an enlarged view of the loop-helix motif abutting the central funnel where enriched mutations in isobutanol were identified (red and teal spheres), presumably affecting solute transport from the periplasmic space. Mutants targeting the T60 position (teal spheres) were also enriched in the presence of erythromycin. b) Confirmation of N70D and D73L mutations for tolerance to isobutanol. The N70D mutation in particular appears to improve the final OD to a significant degree. Reconstructed strains were measured for final OD in capped 1.5 mL Eppendorf tubes following 48 hours of incubation. Error bars are derived from N=3 trials and p-values from a one-tailed t-test. c) Improved growth of the AcrB T60N mutant was observed in inhibitory concentrations of erythromycin (200 μg/mL) and isobutanol (1.2%) in shaking 96-well plates, indicating that this mutation may enhance the efflux activity of this pump towards many compounds. For these experiments, CREATE cassette designs were individually synthesized, cloned and sequence verified before recombineering into E. coli MG1655 to reconstruct the mutations, and the genomic modifications were sequence verified by colony PCR to confirm the genotype-phenotype association.
Supplementary Figure 11 Benefits of rational mutagenesis for sampling novel adaptive genotypes.
a) 500 μg/mL rifampicin b) 500 μg/mL erythromycin c) 10 g/L acetate and d) 2 g/L furfural. While naturally evolving systems or error-prone PCR are highly biased towards sampling single nucleotide polymorphisms (i.e. 1 nt mutations, red) these histograms illustrate the potential advantages for rational design approaches that can identify rare or inaccessible mutations (2 and 3 nt, green and blue respectively). For example, the highest fitness solutions appear to be biased toward these rare mutations in rifampicin, erythromycin and furfural selections to varying degrees. These results indicate that procedures such as CREATE should allow more rapid and thorough analysis of fitness improving mutations, in much the same way that computational approaches are being used to improve directed evolution for protein engineering.
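The 1, 2 and 3 nt classes in these histograms correspond to the number of nucleotide changes needed to convert a wild-type codon into a codon for the target amino acid. The sketch below computes the minimum number of changes using the standard genetic code; the function names and the example codons are illustrative assumptions rather than values from the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nt_changes(codon_a, codon_b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def min_nt_changes(wt_codon, target_aa):
    """Fewest nucleotide changes turning wt_codon into any codon for target_aa."""
    return min(nt_changes(wt_codon, codon)
               for codon, aa in CODON_TABLE.items() if aa == target_aa)


# Toy examples: Phe (TTT) to Leu needs 1 change, Ala (GCT) to Trp needs 3.
print(min_nt_changes("TTT", "L"), min_nt_changes("GCT", "W"))
```

Substitutions that need two or three changes are the ones that single-nucleotide mutational processes rarely sample, which is the class the caption highlights.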
Supplementary Figure 12 Reconstruction of mutations identified by erythromycin selection.
Reconstructed strains were grown in 0.5 mL volumes in capped 1.5 mL Eppendorf tubes in the presence of 200 μg/mL erythromycin, and final OD was measured after 48 hours of incubation. Error bars are derived from N=3 trials. A one-tailed t-test was performed on each set of measurements to determine the p-values indicated for the significance of the growth benefit.
Supplementary Figure 13 Validation of Crp S28P mutation for furfural or thermal tolerance.
a) Crystal structure of the Crp regulatory protein with variants identified by furfural selection highlighted in red (PDB ID 3N4M). A number of the CREATE designs targeting residues near the cyclic-AMP binding site (aa. 28-30, 65) of this regulator were highly enriched in minimal media selections for furfural or thermal tolerance, suggesting that these mutations may enhance E. coli growth in minimal media under a variety of stress conditions. b) Validation of the Crp S28P mutant identified in 2 g/L furfural selections in M9 media. This mutant was reconstructed as described for AcrB T60S in Fig. S8.
Supplementary Figure 14 Precision editing with CREATE in laboratory and wild strains of S. cerevisiae.
A CREATE cassette that edits tandem stop codons into ADE2 was designed and inserted into a modified pCRCT vector (ref. 29). a) The laboratory strain BY4709 was transformed, and 95% of the colonies were phenotypically confirmed as red after 3 days of liquid culture and plating, indicating successful deactivation of ADE2. Genotypic confirmation of 20 strains was performed by sequencing the ADE2 genomic locus and revealed that the disruption was due to the designed edit, and not a consequence of NHEJ-mediated indel formation, in 100% of the colonies. b) RM11-1a, a haploid derivative of a vineyard strain (ref. 30), was transformed using the same plasmid and CREATE cassette designed using the S288c reference sequence for ADE2. Despite three polymorphisms occurring in the region of interest in RM11-1 ADE2, 98% of colonies possessed the red ade2 knockout phenotype. We sequenced the ADE2 locus in 20 of these red colonies to confirm that 70% were mutated as intended at the target site (we note that in RM11-1a we saw many colonies where the sequencing traces contained mixtures of the wt and designed dual stop codon edit, suggesting incomplete chromosomal segregation).
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–14 (PDF 1788 kb)
Supplementary Table 1
Protein Engineering Libraries (XLSX 42 kb)
Supplementary Table 2
Cloning primers (XLSX 26 kb)
Supplementary Table 3
MiSeqPrimers (XLSX 32 kb)
Supplementary Table 4
Supplementary Table 4 (XLSX 83 kb)
Supplementary Table 5
Publication (XLSX 24 kb)
Supplementary Table 6
Supplementary Table 6 (TXT 8711 kb)
Cite this article
Garst, A., Bassalo, M., Pines, G. et al. Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering. Nat Biotechnol 35, 48–55 (2017). https://doi.org/10.1038/nbt.3718
DOI: https://doi.org/10.1038/nbt.3718