Defining synonymous codon compression schemes by genome recoding


Synthetic recoding of genomes, to remove targeted sense codons, may facilitate the encoded cellular synthesis of unnatural polymers by orthogonal translation systems. However, our limited understanding of allowed synonymous codon substitutions, and the absence of methods that enable the stepwise replacement of the Escherichia coli genome with long synthetic DNA and provide feedback on allowed and disallowed design features in synthetic genomes, have restricted progress towards this goal. Here we endow E. coli with a system for efficient, programmable replacement of genomic DNA with long (>100-kb) synthetic DNA, through the in vivo excision of double-stranded DNA from an episomal replicon by CRISPR/Cas9, coupled to lambda-red-mediated recombination and simultaneous positive and negative selection. We iterate the approach, providing a basis for stepwise whole-genome replacement. We attempt systematic recoding in an essential operon using eight synonymous recoding schemes. Each scheme systematically replaces target codons with defined synonyms and is compatible with codon reassignment. Our results define allowed and disallowed synonymous recoding schemes, and enable the identification and repair of recoding at idiosyncratic positions in the genome.
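As a rough illustration of what "systematically replaces target codons with defined synonyms" means in practice, the sketch below applies a codon-to-codon mapping to an in-frame coding sequence. The mapping shown (TCG to AGT, TCA to AGC) is a hypothetical example, not necessarily one of the eight schemes tested, and the function names are assumptions made for illustration.

```python
# Minimal sketch of a defined synonymous recoding scheme (illustrative only).
# The mapping below is a hypothetical example, not a scheme taken from the paper.
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}

EXAMPLE_SCHEME = {"TCG": "AGT", "TCA": "AGC"}  # hypothetical serine mapping


def check_serine_scheme(scheme):
    """Raise if any source or replacement codon is not a serine codon."""
    for source, target in scheme.items():
        if source not in SERINE_CODONS or target not in SERINE_CODONS:
            raise ValueError(f"{source}->{target} is not a serine synonym swap")


def recode(cds, scheme):
    """Replace every in-frame target codon of `cds` with its defined synonym."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(scheme.get(codon, codon) for codon in codons)


check_serine_scheme(EXAMPLE_SCHEME)
recoded = recode("ATGTCGTCATAA", EXAMPLE_SCHEME)
```

Only codons read in frame are touched, so a TCG that appears across a codon boundary (for example in ATC GAA) is left unchanged.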


Figure 1: Efficient, programmable insertion of very long synthetic DNA (s. DNA) into the genome of E. coli.
Figure 2: Iterating REXER for GENESIS.
Figure 3: Systematic and defined synonymous recoding in an E. coli operon rich in essential genes.
Figure 4: Compiled recoding landscapes of targeted codons reveal allowed and disallowed synonymous recoding schemes and enable the identification and repair of idiosyncratic positions in the genome.


  1. Cello, J., Paul, A. V. & Wimmer, E. Chemical synthesis of poliovirus cDNA: generation of infectious virus in the absence of natural template. Science 297, 1016–1018 (2002)
  2. Chan, L. Y., Kosuri, S. & Endy, D. Refactoring bacteriophage T7. Mol. Syst. Biol. 1, 2005.0018 (2005)
  3. Itaya, M., Tsuge, K., Koizumi, M. & Fujita, K. Combining two genomes in one cell: stable cloning of the Synechocystis PCC6803 genome in the Bacillus subtilis 168 genome. Proc. Natl Acad. Sci. USA 102, 15971–15976 (2005)
  4. Gibson, D. G. et al. Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome. Science 319, 1215–1220 (2008)
  5. Gibson, D. G. et al. Creation of a bacterial cell controlled by a chemically synthesized genome. Science 329, 52–56 (2010)
  6. Annaluru, N. et al. Total synthesis of a functional designer eukaryotic chromosome. Science 344, 55–58 (2014)
  7. Kudla, G., Murray, A. W., Tollervey, D. & Plotkin, J. B. Coding-sequence determinants of gene expression in Escherichia coli. Science 324, 255–258 (2009)
  8. Ro, D.-K. et al. Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature 440, 940–943 (2006)
  9. Chin, J. W. Reprogramming the genetic code. Science 336, 428–429 (2012)
  10. Mukai, T. et al. Reassignment of a rare sense codon to a non-canonical amino acid in Escherichia coli. Nucleic Acids Res. 43, 8111–8122 (2015)
  11. Itaya, M., Fujita, K., Ikeuchi, M., Koizumi, M. & Tsuge, K. Stable positional cloning of long continuous DNA in the Bacillus subtilis genome vector. J. Biochem. 134, 513–519 (2003)
  12. Krishnakumar, R. et al. Simultaneous non-contiguous deletions using large synthetic DNA and site-specific recombinases. Nucleic Acids Res. 42, e111 (2014)
  13. Datsenko, K. A. & Wanner, B. L. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl Acad. Sci. USA 97, 6640–6645 (2000)
  14. Wang, K. et al. Optimized orthogonal translation of unnatural amino acids enables spontaneous protein double-labelling and FRET. Nat. Chem. 6, 393–403 (2014)
  15. Neumann, H., Wang, K., Davis, L., Garcia-Alai, M. & Chin, J. W. Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome. Nature 464, 441–444 (2010)
  16. Cho, B.-K. et al. The transcription unit architecture of the Escherichia coli genome. Nat. Biotechnol. 27, 1043–1049 (2009)
  17. Li, G.-W., Oh, E. & Weissman, J. S. The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature 484, 538–541 (2012)
  18. Sørensen, M. A. & Pedersen, S. Absolute in vivo translation rates of individual codons in Escherichia coli. The two glutamic acid codons GAA and GAG are translated with a threefold difference in rate. J. Mol. Biol. 222, 265–280 (1991)
  19. Curran, J. F. & Yarus, M. Rates of aminoacyl-tRNA selection at 29 sense codons in vivo. J. Mol. Biol. 209, 65–77 (1989)
  20. Kimchi-Sarfaty, C. et al. A “silent” polymorphism in the MDR1 gene changes substrate specificity. Science 315, 525–528 (2007)
  21. Zhang, G., Hubalewska, M. & Ignatova, Z. Transient ribosomal attenuation coordinates protein synthesis and co-translational folding. Nat. Struct. Mol. Biol. 16, 274–280 (2009)
  22. Quax, T. E. F. et al. Differential translation tunes uneven production of operon-encoded proteins. Cell Reports 4, 938–944 (2013)
  23. Quax, T. E. F., Claassens, N. J., Söll, D. & van der Oost, J. Codon bias as a means to fine-tune gene expression. Mol. Cell 59, 149–161 (2015)
  24. Li, G.-W., Burkhardt, D., Gross, C. & Weissman, J. S. Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources. Cell 157, 624–635 (2014)
  25. Pósfai, G. et al. Emergent properties of reduced-genome Escherichia coli. Science 312, 1044–1046 (2006)
  26. Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L. A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 31, 233–239 (2013)
  27. Bryksin, A. V. & Matsumura, I. Rational design of a plasmid origin that replicates efficiently in both gram-positive and gram-negative bacteria. PLoS One 5, e13244 (2010)
  28. Kouprina, N., Noskov, V. N. & Larionov, V. Selective isolation of large chromosomal regions by transformation-associated recombination cloning for structural and functional analysis of mammalian genomes. Methods Mol. Biol. 349, 85–101 (2006)
  29. Giegé, R., Sissler, M. & Florentz, C. Universal rules and idiosyncratic features in tRNA identity. Nucleic Acids Res. 26, 5017–5035 (1998)
  30. Sharp, P. M. & Li, W. H. The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15, 1281–1295 (1987)
  31. dos Reis, M., Savva, R. & Wernisch, L. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 32, 5036–5044 (2004)
  32. Tuller, T., Waldman, Y. Y., Kupiec, M. & Ruppin, E. Translation efficiency is determined by both codon bias and folding energy. Proc. Natl Acad. Sci. USA 107, 3645–3650 (2010)
  33. Gerdes, S. Y. et al. Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. J. Bacteriol. 185, 5673–5684 (2003)
  34. Keseler, I. M. et al. EcoCyc: fusing model organism databases with systems biology. Nucleic Acids Res. 41, D605–D612 (2013)
  35. Dai, K. & Lutkenhaus, J. The proper ratio of FtsZ to FtsA is required for cell division to occur in Escherichia coli. J. Bacteriol. 174, 6145–6151 (1992)
  36. Dewar, S. J., Begg, K. J. & Donachie, W. D. Inhibition of cell division initiation by an imbalance in the ratio of FtsA to FtsZ. J. Bacteriol. 174, 6314–6316 (1992)
  37. Lajoie, M. J. et al. Probing the limits of genetic recoding in essential genes. Science 342, 361–363 (2013)
  38. Lajoie, M. J. et al. Genomically recoded organisms expand biological functions. Science 342, 357–360 (2013)
  39. Ostrov, N. et al. Design, synthesis, and testing toward a 57-codon genome. Science 353, 819–822 (2016)
  40. Napolitano, M. G. et al. Emergent rules for codon choice elucidated by editing rare arginine codons in Escherichia coli. Proc. Natl Acad. Sci. USA 113, E5588–E5597 (2016)
  41. Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 2006.0008 (2006)
  42. Grosjean, H. J., de Henau, S. & Crothers, D. M. On the physical basis for ambiguity in genetic coding interactions. Proc. Natl Acad. Sci. USA 75, 610–614 (1978)
  43. Curran, J. F. Decoding with the A:I wobble pair is inefficient. Nucleic Acids Res. 23, 683–688 (1995)
  44. Dong, H., Nilsson, L. & Kurland, C. G. Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates. J. Mol. Biol. 260, 649–663 (1996)
  45. Ishii, N. et al. Multiple high-throughput analyses monitor the response of E. coli to perturbations. Science 316, 593–597 (2007)
  46. Gallagher, R. R., Li, Z., Lewis, A. O. & Isaacs, F. J. Rapid editing and evolution of bacterial genomes using libraries of synthetic DNA. Nat. Protocols 9, 2301–2316 (2014)
  47. Newton, C. R. et al. Analysis of any point mutation in DNA. The amplification refractory mutation system (ARMS). Nucleic Acids Res. 17, 2503–2516 (1989)



Work was supported by the Medical Research Council, UK (MC_U105181009 and MC_UP_A024_1008, to J.W.C.), the Danish Council for Independent Research (DFF – 4090-00289, to J.F.), a Boehringer Ingelheim Fonds PhD fellowship (to S.F.B.), the Gates-Cambridge Scholarship (to S.H.K.), and an ERC Advanced Grant (SGCR, to J.W.C.). We thank Neil Grant (MRC-LMB Visual Aids) for photography.

Author information




J.W.C. defined the direction of research. K.W. designed and constructed the REXER and GENESIS systems. J.F. implemented DNA assembly methods in S. cerevisiae. K.W. and S.F.B. identified target codons, developed tE and designed recoding schemes. K.W., J.F., S.F.B., S.H.K., and T.C. performed experiments. All authors analysed the data. K.W. and J.W.C. wrote the paper with input from all authors.

Corresponding authors

Correspondence to Kaihang Wang or Jason W. Chin.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Simultaneous double selection and recombination enhances integration at a target locus.

a, Classical recombination and double selection recombination. In classical recombination, a linear dsDNA with a synthetic DNA (s. DNA) sequence and a positive selection marker (+, CmR) flanked by homologous region 1 (HR1) and homologous region 2 (HR2) is transformed into the cell. Recombinants are selected by expression of the positive selection marker. In simultaneous double selection recombination, s. DNA containing the double selection marker −2/+2 (sacB–CmR) is integrated in place of the double selection marker −1/+1 (rpsL–KanR) on the genome. Double selection for the gain of +2 and loss of −1 selects for simultaneous gain of s. DNA and loss of genomic sequence, and improves recombination at the target genomic locus. b, Colony PCR of clones from classical recombination and simultaneous double selection and recombination. c, All of the clones isolated by simultaneous double selection and recombination have s. DNA integrated at the target locus. The data show the mean percentage ± s.d. at the correct locus (n = 6, three biological replicates each performed in two technical replicates; for each technical replicate eight clones were phenotyped). P < 10^−4, two-tailed t-test for the null hypothesis that classical recombination is as efficient as double selection recombination, REXER 2 or REXER 4. d, The data for simultaneous double selection recombination, REXER 2 and REXER 4 show the mean percentage ± s.d. at the correct locus (n = 6, three biological replicates each performed in two technical replicates; for each technical replicate eight independent clones were phenotyped). The data for B. subtilis and S. cerevisiae are from previous publications. A previously reported method for integrating foreign DNA into the B. subtilis genome using only negative selection gave 3% (9 out of 271) of selected clones with the right combination of markers (refs 3, 11). A previously reported method replaced S. cerevisiae chromosome III in 11 steps using only positive selection. The efficiency, as judged by clones with the correct combination of markers, was reported for ten of these steps; the mean percentage of clones with the right combination of markers is plotted (13%). The error bar represents the maximum and minimum integration efficiency as judged by clones with the correct combination of markers. The minimum efficiency was 0.5% (replacement of 55 kb) and the maximum efficiency was 59% (replacement of 9 kb) (ref. 6). For gel source images, see Supplementary Fig. 1.
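The summary statistics in this legend can be reproduced along the following lines. This is a sketch only: the per-replicate counts are hypothetical placeholders (the raw counts are not given here), and the use of the sample standard deviation is an assumption, since the legend does not say which estimator was used.

```python
from statistics import mean, stdev


def percent_correct(correct_counts, clones_per_replicate=8):
    """Convert per-replicate counts of correct clones into percentages."""
    return [100.0 * c / clones_per_replicate for c in correct_counts]


def summarise(correct_counts, clones_per_replicate=8):
    """Return (mean %, sample s.d. %) across replicates."""
    pct = percent_correct(correct_counts, clones_per_replicate)
    return mean(pct), stdev(pct)


# Hypothetical counts for n = 6 replicates of eight phenotyped clones each.
hypothetical_counts = [8, 7, 8, 8, 6, 8]
m, s = summarise(hypothetical_counts)
print(f"{m:.1f} ± {s:.1f} % at the correct locus (hypothetical data)")

# The literature value quoted above: 9 correct clones out of 271.
b_subtilis_percent = 100.0 * 9 / 271
print(f"{b_subtilis_percent:.1f} %")
```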

Extended Data Figure 2 REXER enables site-specific integration of large DNA fragments into the genome.

a, The use of two distinct double selection cassettes (−1/+1 (rpsL–KanR) and −2/+2 (sacB–CmR)) allows simultaneous selection for the loss of the negative selection marker on the genome and the gain of the positive selection marker from the BAC upon integration of synthetic DNA. b, Efficient replacement of genomic rpsL–KanR with BAC-bound sacB–CmR using REXER 2 and REXER 4. All colonies tested (n = 22) contained the correct combination of selection markers after REXER 2 or REXER 4 as analysed by phenotyping, colony PCR, and DNA sequencing (not shown). c, Efficient insertion of 9-kb synthetic DNA. Genomic rpsL–KanR was replaced with a synthetic lux operon coupled to sacB–CmR using REXER 2 and REXER 4. All colonies on the tenfold dilution double selection plates for REXER 2 and the 10^4-fold plates for REXER 4 show bioluminescence. Eleven colonies each from REXER 2 and REXER 4 showed correct integration by phenotyping, colony PCR, and DNA sequencing (not shown). d, Efficient insertion of 90-kb synthetic DNA. The 90-kb DNA consisted of the lux operon in the middle of 80-kb DNA (previously deleted from the MDS42 genome), followed by sacB–CmR, carried on a BAC. For gel source images, see Supplementary Fig. 1.

Extended Data Figure 3 Replacement of 100 kb of genomic DNA via REXER.

a, The synthetic DNA contains the 100-kb wild-type DNA (open reading frames in grey) with five genes of the lux operon (blue) and sacB–CmR. Complete replacement leads to integration of all five lux genes (luxA, B, C, D and E) resulting in bioluminescent cells, while partial replacement confers loss of one or more lux genes and loss of bioluminescence. b, After REXER 2, 80% of 2 × 10^2 colonies examined were bioluminescent; after REXER 4, 50% of 2 × 10^2 colonies examined were bioluminescent. c, Eleven bioluminescent colonies from REXER 2 and eleven bioluminescent colonies from REXER 4 were analysed. All colonies analysed had all five lux genes correctly integrated, indicating complete replacement of the 100-kb genomic region. All clones analysed contained the right combination of selection markers. d, Eleven bioluminescent colonies from REXER 2 and eleven non-bioluminescent colonies from REXER 2 were analysed. While bioluminescent colonies contained all five lux watermarks, all the non-bioluminescent colonies analysed were lacking one or more lux genes, indicating partial replacement of the genomic region. All clones analysed contained the right combination of selection markers. For gel source images, see Supplementary Fig. 1.

Extended Data Figure 4 Iterative REXER.

a, The product of REXER shown in Extended Data Fig. 2a was used as a template for the next round of REXER. b, The phenotypes of clones from the first round of REXER. c, The phenotypes of clones from the second round of REXER. For gel source images, see Supplementary Fig. 1.

Extended Data Figure 5 Synonymous codon compression strategies.

a, Codon and anticodon interactions in the E. coli genome. Twenty-eight sense codons are highlighted in grey, along with the amber stop codon. The genome-wide removal of these sense codons, but not other sense codons, would enable all their cognate tRNAs to be deleted without removing the ability to decode one or more sense codons remaining in the genome. This is necessary but not sufficient for the reassignment of sense codons to unnatural monomers. Serine, leucine and alanine codon boxes are highlighted because the endogenous aminoacyl-tRNA synthetases for these amino acids do not recognize the anticodons of their cognate tRNAs. This may facilitate the assignment of codons within these boxes to new amino acids through the introduction of tRNAs bearing cognate anticodons that do not direct mis-aminoacylation by endogenous synthetases. The total codon counts for all 64 triplet codons in the MDS42 genome (GenBank accession number AP012306), all known codon–anticodon interactions through both Watson–Crick base-pairing and wobbling, base modifications on tRNA anticodons, tRNA genes, and measured in vivo tRNA relative abundance are reported. This analysis identifies 10 codons from the serine, leucine, and alanine groups (serine codons TCG, TCA, AGT, AGC; leucine codons CTG, CTA, TTG, TTA; and alanine codons GCG, GCA) that satisfy both the codon–anticodon interaction and aminoacyl-tRNA synthetase recognition criteria for codon reassignment. b–d, Serine, leucine and alanine codon removal and tRNA deletion strategies compatible with codon reassignment to unnatural amino acids (u.a.a.).
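The codon counts referred to in this legend are a simple in-frame tally. The sketch below shows one way to compute such a tally for the ten candidate codons named above; the input sequences are toy examples, not data from the MDS42 genome, and the function names are assumptions made for illustration.

```python
from collections import Counter

# The ten candidate codons listed in the legend, grouped by amino acid.
CANDIDATE_CODONS = {
    "Ser": ["TCG", "TCA", "AGT", "AGC"],
    "Leu": ["CTG", "CTA", "TTG", "TTA"],
    "Ala": ["GCG", "GCA"],
}


def count_codons(coding_sequences):
    """Tally in-frame codons over a list of coding sequences (DNA strings)."""
    counts = Counter()
    for cds in coding_sequences:
        usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
        counts.update(cds[i:i + 3] for i in range(0, usable, 3))
    return counts


def candidate_counts(coding_sequences):
    """Report counts for the candidate codons only, grouped by amino acid."""
    counts = count_codons(coding_sequences)
    return {
        aa: {codon: counts[codon] for codon in codons}
        for aa, codons in CANDIDATE_CODONS.items()
    }


# Toy coding sequences for illustration only.
toy_genes = ["ATGTCGGCGTAA", "ATGTTATCGTGA"]
report = candidate_counts(toy_genes)
```

Applied to annotated open reading frames from a real genome, the same tally would give the per-codon totals that any recoding scheme has to account for.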

Extended Data Figure 6 Recoding landscapes for compression of serine codons by REXER.

a, The sequences for the systematically recoded mraZ to ftsZ region were designed de novo, synthesized, assembled into a BAC and used for REXER. b–d, The recoding landscapes for serine recoding schemes (r.s.) 1–3, and the resulting compiled recoding landscape.

Extended Data Figure 7 Recoding landscapes.

a–e, Recoding schemes 4–8. f, Recoding scheme 1 with ftsA codon 407 changed from AGT to AGC (highlighted in orange).

Extended Data Figure 8 Identifying and fixing a deleterious sequence in defined and systematic synonymous recoding.

a, Recoding codon 407 in ftsA in the wild-type genomic background. The wild-type codon at ftsA codon position 407 is the serine codon TCG. We sequenced 16 post-REXER clones for TCG to AGT and 20 post-REXER clones for TCG to TCT. b, Changing ftsA 407 AGT to AGC in the serine r.s.1 background. We sequenced 16 AGT clones and 16 AGT to AGC clones. c, Changing ftsA 407 AGT to AGC in the serine r.s.1 background greatly improved the fraction of fully recoded clones across the entire 20-kb region from 0% to 94% (16 clones sequenced). d, The fixed serine r.s.1 with ftsA 407 AGC yielded clones with no measurable growth defect. The doubling times of fully recoded clones from serine r.s.1 with ftsA 407 AGC, serine r.s.2, serine r.s.3, and alanine r.s.7 were measured and showed no measurable growth defects when compared to the wild-type MDS42 E. coli control with the second double selection cassette integrated at the same genomic locus. (The P values for the null hypothesis that the doubling time of each recoded clone does not differ from that of the wild-type control were calculated by two-tailed t-tests. Serine r.s.1 ftsA 407 AGC versus wild type, P = 0.54; serine r.s.2 versus wild type, P = 0.62; serine r.s.3 versus wild type, P = 0.39; alanine r.s.7 versus wild type, P = 0.47.) n = 12 biological replicates and error bars show s.d. e, Combining single-strand DNA recombineering with REXER to fix a short deleterious stretch within the synthetic sequence of r.s.1. A 90-nt single-stranded oligonucleotide was designed to change the deleterious sequence of AGT in ftsA codon position 407 in r.s.1 to a tolerated sequence, AGC. The oligonucleotide sequence was designed based on the reverse strand of the synthetic sequence to bind the forward strand, with the single nucleotide change positioned in the middle (45 nt from the 5′ end). The oligonucleotide was co-transformed into E. coli during a REXER experiment that introduced r.s.1 into the genome. f, Fixing a short deleterious sequence on synthetic DNA with REXER + ssDNA recombineering. Sixteen clones from REXER double selection (described in e) were randomly picked and subjected to single nucleotide polymorphism (SNP) genotyping using primers specific for either the wild-type sequence in ftsA codon position 407 (TCG) or the fixed sequence (AGC). MDS42rpsLK43R/rK was used as the wild-type control and a fully recoded clone from serine r.s.3 with verified ftsA 407 AGC as the positive control. SNP genotyping at ftsA codon position 407 identified one clone (clone 12, highlighted in orange) out of a total of 16 clones tested with fixed sequence AGC, which was then fully sequenced across the entire 20-kb recoding region and confirmed as fully recoded at all 83 targeted codon positions. For gel source images, see Supplementary Fig. 1.
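The oligonucleotide design described in e can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' design script: the function names are hypothetical, "45 nt from the 5′ end" is read as the 45th nucleotide of the oligonucleotide, and the template below is a toy sequence rather than the ftsA locus.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_repair_oligo(forward, pos, new_base, length=90, position_from_5p=45):
    """Design a reverse-strand oligo carrying a single base change.

    forward: forward-strand template (5'->3'); pos: 0-based index of the base
    to change on the forward strand; new_base: desired forward-strand base.
    The change lands at nucleotide `position_from_5p` (1-based) of the oligo.
    """
    oligo_index = position_from_5p - 1           # 0-based index within the oligo
    start = pos - (length - 1 - oligo_index)     # forward-strand window start
    end = start + length
    if start < 0 or end > len(forward):
        raise ValueError("template too short for the requested window")
    edited = forward[start:pos] + new_base + forward[pos + 1:end]
    return reverse_complement(edited)            # oligo anneals to forward strand


# Toy template: position 100 is 'A' and is changed to 'C' on the forward strand.
toy_forward = "ACGT" * 50
oligo = design_repair_oligo(toy_forward, 100, "C")
```

Because the oligonucleotide is the reverse complement of the edited forward window, the edited base appears in the oligonucleotide as its complement (G in this toy case).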

Extended Data Table 1 Defining recoding rules by codon adaptation index (cAi), tRNA adaptation index (tAi), and translation efficiency (tE)
Extended Data Table 2 Properties of genes targeted for recoding

Supplementary information

Supplementary Figure 1

This file contains original DNA gel images. (PDF 5777 kb)

Supplementary Data 1

Genomic locus for selection marker -1/+1. (TXT 5 kb)

Supplementary Data 2

Plasmid containing lambda red, Cas9, and tracrRNA. (TXT 16 kb)

Supplementary Data 3

BAC containing lux operon and -2/+2 for integration. (TXT 21 kb)

Supplementary Data 4

Plasmid containing spacers for REXER 2. (TXT 5 kb)

Supplementary Data 5

Plasmid containing spacers for REXER 4. (TXT 5 kb)


Cite this article

Wang, K., Fredens, J., Brunner, S. et al. Defining synonymous codon compression schemes by genome recoding. Nature 539, 59–64 (2016).
