Main

The standard genetic code contains 64 codons that degenerately encode 20 natural amino acids and translation termination. Organismal variations in codon recognition and inclusion of natural non-standard amino acids (nsAAs) have given rise to alternative genetic codes1,2,3. Cognate and near-cognate codon recognition by tRNAs can vary widely across domains and between species as tRNAs are gained, lost, mutated or reassigned to recognize different codons—for example, via variations in anticodon wobble and base modifications (such as inosine)1. Deviations in code may render codons unassigned6, encoding alternative canonical amino acids2,3 or nsAAs7,8, or serving as noncanonical stop codons (such as UUA and UCA)7.

Alternative codon configurations and codon redundancy highlight the plasticity of the genetic code, motivating genome-scale efforts to construct genomically recoded organisms (GROs), in which redundant codons are reassigned to new functions. GROs offer several advantages, including enhanced phage resistance8,9,10, biocontainment11,12 and efficient site-specific incorporation of nsAAs to imbue novel protein properties13,14. The first constructed GRO was an E. coli strain in which all instances of TAG stop codons were replaced with synonymous TAA codons, followed by deletion of release factor 1 (RF1)4,15. The resulting strain, C321.ΔA, released TAG as an unassigned codon amenable to reassignment. Recent efforts have begun to reassign sense codons in E. coli5,16 and other organisms17,18,19.

Whole-genome recoding inspires unresolved questions regarding the malleability of the genetic code and whether a minimal non-degenerate genome is feasible. By assigning unused codons to new functions, such genomes could expand in vivo protein design with unnatural biochemistries, conferring valuable properties for biomaterials and drugs (for example, immunity to proteases). Achieving such goals requires engineering essential translation factors (tRNAs and release factors) for defined single-codon specificities (codon exclusivity), decoupling codons from translational crosstalk (for example, wobble effect or near-cognate suppression) and eliminating functional redundancy among synonymous codons. Single-codon specificities found in nature demonstrate the feasibility of engineering exclusivity—for example, AUG specificity for tRNAMet enabled by discriminatory post-transcriptional modifications of AUG-specific tRNAMet(CAU) and AUA-specific tRNAIle(CAU)20. Eliminating similar modifications has mitigated anticodon wobble in native tRNAs, rendering them codon exclusive21,22. Engineering essential translation factors presents unique challenges related to their cryptic alternative functions that could be revealed by wide-scale genomic recoding efforts.

Here we describe Ochre, a GRO that fully compresses a redundant codon functionality into a single codon, liberating two essential stop codons for reassignment. Specifically, synonymous replacement of 1,195 instances of the stop codon TGA alongside ∆TAG4, combined with engineering of essential translation factors (RF2 and tRNATrp) to attenuate native UGA recognition, enabled multi-site incorporation of two distinct nsAAs at UAG and UGA with greater than 99% accuracy within single proteins. Although past recoding efforts have repurposed stop and sense codons via deletion of associated translation machineries (RF14 and tRNAs5, respectively), a genome reduced to a single non-degenerate stop codon had not been achieved owing to challenges in engineering translation factors to precisely tune codon specificities. Our efforts disentangled translational crosstalk within the stop codon block, rendering four codons non-degenerate, with each serving a unique function: UAA as the sole stop codon, UGG encoding Trp and UGA and UAG liberated to encode two unique nsAAs (Fig. 1a). Distinct from prior efforts4,5, we demonstrate the importance of compressing redundant codons and disentangling translational crosstalk to fully liberate codons for precise nsAA reassignments, establishing a benchmark for future recoding. Ochre provides a novel context to interrogate translation and cryptic codon functions and to evolve new translation machinery with applications in biocontainment11,12, genetic isolation8,9,10 and biomanufacturing of proteins with synthetic chemistries13,14,23.

Fig. 1: Design and construction of genomically recoded E. coli with a single stop codon.

a, Cellular schematic summary of translational decoding of the three canonical stop codons—UAG, UGA and UAA—and the tryptophan-encoding UGG codon alongside their native cognate translation factors across three strains of E. coli after iterative rounds of recoding: MG1655 (wild type), rEc∆1.∆A (∆UAG, ∆RF1) and Ochre (∆UAG, ∆RF1, RF2*, tW*). Asterisks denote translation factors engineered to discriminate against UGA. Codons with multiple iterations of genomic recoding from wild-type E. coli to rEc∆2.∆A.B3.tW*, which is a ∆TAG-∆TGA recoded strain containing a mutant RF2 and a mutant tRNATrp both engineered to discriminate against UGA. OTS components (o-aaRS and o-tRNA) in recoded strains enable reassignment of UAG and UGA to sense codons incorporating nsAAs. RF, release factor. b, Diagram of ∆TAG-∆TGA recoded genome (rEc∆2.∆A) depicting relative locations of important designed mutagenic events in the genome-wide removal of TGA. Innermost sections of circles depict genomic subdivisions for targeted recoding of rEc∆1.∆A to acquire rEc∆2E.∆A (A′ and B′) and subsequently to acquire fully recoded rEc∆2.∆A.B1 (A–H). Sec, selenocysteine. c, Top, representations of common terminal TGA codons (red) overlapping genes (blue). Bottom, common mutational means of resolving codon conversions with minimal genetic deviation. d, Predicted translation initiation rates (TIRs) for genes overlapped by genes terminating with TGA shown in b recoded from TGA to TAA plotted against the predicted translation initiation rates for wild-type overlapped genes. The diagonal corresponds to recoding events that do not alter the expected translation rate. R2 is the Pearson correlation coefficient. e, Schematic overview depicting iterative steps of MAGE-mediated genomic recoding and CAGE-mediated hierarchical assembly of recoded subdomains into a final fully recoded strain (rEc∆2) lacking terminal TGA codons.


Constructing a ΔTAG/ΔTGA recoded strain

The stop codon block consists of four translationally linked codons: TAG, TGA, TAA and TGG. In C321.∆A4, TAG is freed as an open coding channel, leaving TAA (stop), TGG (Trp) and TGA (stop and near-cognate Trp). With only 71 essential genes among 1,216 total open reading frames (ORFs) containing TGA in the MG1655 genome (Extended Data Table 1), TGA is the second rarest codon. This scarcity and translational redundancy makes TGA an attractive target alongside TAG. Additionally, dual orthogonal translation systems (OTSs) expressing orthogonal aminoacyl-tRNA synthetases (o-aaRSs) and tRNAs (o-tRNAs) targeting UAG and UGA have been deployed in unrecoded E. coli to incorporate two distinct nsAAs24, but their efficacy was limited by competition with native translation factors.

There are four primary open questions for our recoding effort: (1) whether TGA codons are essential; (2) whether redundant codon functions can be compressed into one codon; (3) whether release factors can attain single-codon specificity; and (4) whether codon suppression can be mitigated without deletion of translation factors. Release factors7,25 and tRNAs26,27 with altered specificities suggest it is feasible to engineer attenuated UGA recognition, mitigating translation factor competition to free TGA alongside TAG for reassignment. Addressing these questions would enable construction of a strain lacking both TAG and TGA codons that, when complemented by a noncompetitive mutant release factor and tRNATrp, could be reassigned as sense codons. Although previous recoding efforts enabled nsAA incorporation at two open serine codons9, unresolved translational crosstalk (for example, o-tRNA wobble28) is likely to result in misincorporations. We hypothesize that resolving crosstalk at UGA within a ∆TGA recoded strain will enable precise, site-specific dual incorporation of nsAAs at both UAG and UGA and establish functional codon exclusivity (non-degeneracy) in the stop codon block.

To construct a ΔTAG, ΔTGA recoded strain (rEcΔ2.ΔA), we used the ΔTAG progenitor C321.ΔA4, which we refer to as rEcΔ1.ΔA. All TGA codons, with few exceptions, occur at the end of a gene to terminate translation. Out of the 1,216 annotated ORFs that contain TGA, 1,171 are annotated genes and 45 are listed as pseudogenes. To reduce recoding efforts, 76 non-essential genes and 3 pseudogenes containing TGA were removed with 16 targeted genomic deletions29 (Fig. 1b). The remaining 1,134 terminal TGA codons (1,092 genes and 42 pseudogenes) were targeted for conversion to TAA via multiplex automated genomic engineering (MAGE)30 (Supplementary Table 1). Three formate dehydrogenase genes (fdhF, fdoG and fdnG) containing internal TGA codons were not targeted for recoding owing to their role in encoding selenocysteine31. Four distinct oligonucleotide designs were used to convert codons. One strategy introduced single-nucleotide substitutions to 833 non-overlapping ORFs and 3 refactoring strategies targeted 380 overlapping ORFs in which such substitutions might affect neighbouring gene expression (Fig. 1c and Supplementary Methods). These latter strategies resulted in changes to more than 300 overlapping coding sequences (Supplementary Table 2) and few predicted disruptions to translation initiation rates (Fig. 1d and Supplementary Table 3).
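As an illustration of the simplest conversion described above (a single-nucleotide substitution at a non-overlapping ORF), the following Python sketch swaps a terminal TGA for TAA and lists any in-frame internal TGA codons, such as those left untouched in the selenocysteine-encoding genes. The function names and toy sequences are hypothetical and are not taken from the study; the actual MAGE oligonucleotide designs, including the three refactoring strategies for overlapping ORFs, are those described in the Supplementary Methods.

```python
def split_codons(orf: str) -> list[str]:
    """Split an in-frame ORF (DNA, 5'->3') into codons."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def recode_terminal_stop(orf: str, old: str = "TGA", new: str = "TAA") -> str:
    """Replace the terminal stop codon `old` with `new`; leave other ORFs unchanged.

    TGA -> TAA is a single G-to-A substitution at the second codon position.
    """
    codons = split_codons(orf)
    if codons and codons[-1] == old:
        codons[-1] = new
    return "".join(codons)


def internal_codon_indices(orf: str, codon: str = "TGA") -> list[int]:
    """Return 0-based codon indices of in-frame `codon` occurrences, excluding the last codon."""
    codons = split_codons(orf)
    return [i for i, c in enumerate(codons[:-1]) if c == codon]


# Hypothetical toy ORFs, for illustration only.
toy_terminal = "ATGGCTTGGTGA"   # ends in TGA
toy_internal = "ATGTGAGCTTAA"   # internal in-frame TGA, ends in TAA

recoded = recode_terminal_stop(toy_terminal)
internal_sites = internal_codon_indices(toy_internal)
```

The sketch deliberately ignores overlapping genes: where a terminal TGA overlaps a neighbouring coding sequence or ribosome binding site, a simple substitution may alter the neighbour, which is why the refactoring strategies were needed.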

Construction of ΔTGA rEcΔ2.ΔA constituted two major phases (Fig. 1e), each utilizing iterative cycles of MAGE30 concurrently targeting distinct genomic subdomains within clonal progenitor strains, followed by conjugative assembly genome engineering (CAGE)15 to hierarchically assemble recoded subdomains into a final strain. TGA-to-TAA conversions were confirmed via whole-genome sequencing (WGS) after each assembly. In phase 1, rEcΔ2E.ΔA was constructed from rEcΔ1.ΔA by converting 71 essential genes terminating with TGA split between 2 distinct genomic subdomains (A′ and B′) among two clones (Supplementary Table 4). An additional 44 proximal genes terminating with TGA were converted and 3 genes terminating with TGA were replaced with selectable markers. Recoded subdomains A′ and B′ were assembled via CAGE into rEcΔ2E.ΔA (Supplementary Table 5).

Phase 2 constructed the final ΔTGA recoded rEcΔ2.ΔA. Another 1,012 ORFs terminating with TGA (980 genes and 35 pseudogenes) were converted via MAGE (Supplementary Tables 1 and 6) and 229 non-essential ORFs (72 genes terminating with TGA and 3 pseudogenes terminating with TGA)29 were deleted via marker placements (Extended Data Fig. 1 and Supplementary Table 7). TGA sites were divided among eight rEcΔ2E.ΔA clones and targeted concurrently across distinct genomic subdomains (A–H) split among clones (Fig. 1b and Supplementary Table 8). Non-essential genomic regions were identified29 and, while preserving chemotaxis and motility, those with a high density of genes terminating with TGA were deleted, yielding 16 genomic deletions. Unwanted markers were deleted via tolC displacements15. The non-essential glycerate kinase I gene garK was recalcitrant to conversion and was inactivated by a frameshift nonsense mutation (Supplementary Table 9). Recoded subdomains were hierarchically assembled via CAGE into the final rEcΔ2.ΔA (Extended Data Fig. 2 and Supplementary Table 10). Out of 1,216 predicted TGA codons in E. coli, 1,195 were abolished, leaving 10 predicted pseudogenes, 8 insertion sequences and 3 internal selenocysteine codons unrecoded (Extended Data Table 1). WGS data and breseq analysis32 were used to assess background mutations33. More than 700 unintended background mutations accumulated in our final recoded strain (Supplementary Tables 11–13), although this was likely to be an overestimate (Supplementary Methods). To mitigate detrimental phenotypes arising during construction, cell colonies were routinely measured for maximum OD600 (MaxOD) and doubling time in lysogeny broth (LB) and minimal (M9) media. The fastest growing strains with the highest MaxOD were selected to progress for further recoding. When auxotrophies arose, conjugal backcrossing to parental strains (Supplementary Methods and Supplementary Fig. 2) and targeted reversions of implicated mutations (Supplementary Methods and Supplementary Tables 14a and 15) were used to restore prototrophy.
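The codon accounting in this section can be checked with a short tally. The Python sketch below uses only counts stated in the text; the variable names are illustrative, and it assumes that the three selenocysteine-containing formate dehydrogenase genes are counted among the 1,171 annotated genes.

```python
# Counts of TGA-containing ORFs as reported in the text.
annotated_genes = 1171
pseudogenes = 45
total_tga_orfs = annotated_genes + pseudogenes

# ORFs removed by the 16 targeted genomic deletions.
deleted_genes = 76
deleted_pseudogenes = 3

# Genes with internal TGA (selenocysteine) that were not targeted.
# Assumption: these three are included in `annotated_genes`.
selenocysteine_genes = 3

targeted_genes = annotated_genes - deleted_genes - selenocysteine_genes
targeted_pseudogenes = pseudogenes - deleted_pseudogenes
targeted_total = targeted_genes + targeted_pseudogenes

# TGA codons left unrecoded in the final strain.
unrecoded = {
    "predicted pseudogenes": 10,
    "insertion sequences": 8,
    "internal selenocysteine codons": 3,
}
abolished = total_tga_orfs - sum(unrecoded.values())

print(total_tga_orfs, targeted_genes, targeted_pseudogenes, targeted_total, abolished)
```

Under that assumption, the tally reproduces the 1,134 targeted terminal TGA codons (1,092 genes and 42 pseudogenes) and the 1,195 abolished TGA codons quoted above.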

Engineering mutant RF2

Engineering UAA-specific translation termination is a first step in decoupling UGA translational crosstalk and liberating UAG and UGA for efficient reassignment. RF1, encoded by prfA, terminates translation at UAG and UAA, whereas RF2, encoded by prfB, terminates translation at UGA and UAA. RF1 was deleted within ancestral rEcΔ1.∆A4, confirming that RF1 is non-essential without TAG codons and that RF2 can serve as the sole release factor. After ∆TGA recoding, we tested whether RF1 could independently support translation termination. Unlike RF1, RF2 serves additional roles beyond termination, including post-peptidyl transfer quality control with release factor 3 (RF3)34 and ArfA-cooperative ribosomal rescue from ‘non-stop’ mRNAs35. Our inability to delete RF2 in the presence of RF1 suggests that RF1 cannot functionally compensate for loss of RF2. The capacity of RF2 to act as a sole release factor made it our target for engineering specificity for UAA. Thus, we sought to identify mutations that attenuate RF2-mediated termination at UGA while retaining essential functions. Using MAGE, five primary mutations were introduced to the prfB gene: (1) T246A, to restore wild-type release factor methylation36,37; (2) terminal TGA-to-TAA conversion; (3) internal TGA-to-TAA for autoregulatory maintenance38,39; (4) S205P for UAA-specific termination activity40; and (5) E170K charge flip41,42 to enable strain viability (Fig. 2a,b and Supplementary Table 14b). Two in vivo assays measured the effect of these mutations on the ability of RF2 to: (1) outcompete cognate/near-cognate suppression in fluorescent reporter translation; and (2) inhibit phage infection. Derived from K-12 strains, rEc∆1.∆A contains an A246T mutation in RF2 that hinders termination efficiency compared with wild-type E. coli42. We introduced T246A to restore RF2 methylation by PrmC, generating variant RF2.B0 and improving strain fitness42. 
To make RF2.B0 translation compatible with attenuated UGA termination, we converted both the internal autoregulatory TGA38,39 and terminal TGA to TAA, generating variant RF2.B1 (Supplementary Table 5). The prfB mRNA, encoding RF2, contains an internal ribosome binding site (RBS) upstream of a conditionally frameshifting CUU-UGA sequence that regulates RF2 translation based on cellular concentrations38,39. Although the CUU-UGA sequence is highly conserved in prfB mRNA, exceptions such as Chlorobium tepidum instead have CUU-UAA38, suggesting permissibility. TGA-to-TAA conversion introduced a D26N mutation with no significant growth defects in rEc∆2.∆A.B1 compared with B0 (Extended Data Fig. 3a). Downstream, the prfB terminal TGA overlaps the RBS for lysS, encoding lysine-tRNA synthetase. To preserve lysS translation, we inserted TAA directly upstream of the terminal TGA (Fig. 2a). In prokaryotic translation, release factor codon specificity is predominantly conferred by helix α5, specific to U at position 1 of the stop codon, and a 10 or more amino acid codon recognition loop whose structure and sequence discriminates between the second and third codon bases43,44,45. Alternative release factor codon specificity is well precedented in prokaryotic systems. Single point mutations in RF2 alter codon specificity (for example, E167K or F207T), expanding recognition to all three stop codons, creating ‘omnipotent’ release factors41. In addition, some mitochondrial release factors terminate at canonical sense codons (for example, AGA or AGG)7. Various works have interrogated codon specificity via tripeptide ‘anticodon’ motifs located within recognition loops: PXT in RF1 (where X is A or V) or SPF in RF243,46,47. Crystallography implicates hydrogen bonding by the threonine in PXT and the serine in SPF in discrimination at the second base of the codon44, although molecular dynamics simulations dispute this45.

Fig. 2: Engineering RF2 to attenuate UGA termination.

a, Left, comparative mRNA transcript and protein diagrams displaying key nucleotides and amino acids in wild-type RF2.B (cyan) and mutant RF2.B3 (purple). Right, RF2 variants used in this study. b, Top, AlphaFold48 structural prediction of RF2.B3 with residue deviations from wild-type RF2 marked in purple. Bottom, magnified views of wild-type RF2 SPF (left) and mutant RF2 PPF (right) codon recognition loops, highlighting interactions with residue 205. AlphaFold48 structural prediction suggests that wild-type S205 presents up to four hydrogen bonds within the RF2 codon recognition loop, whereas P205 presents one. c, Schematic representation of mCherry–YFP fluorescent reporter used to assess release activity at three target codons (X is UAG, UGA or UAA) relative to a sense codon control (GCG) in a peptide linker between mCherry and YFP. Release activity at codon X prevents downstream YFP expression and fluorescence without compromising mCherry, whereas stalling and degradation lead to loss of mCherry and YFP signals. High YFP fluorescence suggests readthrough. d, Schematic representation of expected fluorescent outcomes resulting from the mCherry–YFP readthrough assay. e,f, mCherry (top) and YFP fluorescence (bottom) expressed as a function of translation termination or readthrough at target codon X during reporter translation in variant strains with release factors. Fluorescence is normalized to GCG codon construct fluorescence in the absence (e) or presence (f) of UAG- or UGA-suppressing supD tRNAs. Colour scheme in d–f as indicated in c. a.u., arbitrary units. Data are mean ± s.e.m. P values for comparison with rEc∆1.∆A by unpaired t-tests (n = 3 biological replicates). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.


We hypothesized that one characterized E. coli RF2 mutation (S205P) in the tripeptide SPF might endow UAA specificity within rEcΔ2.∆A46. Structural prediction by AlphaFold48 (Fig. 2b) suggests that wild-type S205 presents up to 4 hydrogen bonds within the codon recognition loop of RF2, whereas P205 presents 1, probably altering recognition loop stability. Previous characterizations of S205P evaluated release factor codon recognition in vivo and in vitro41,46. Using a chimeric RF1–RF2 backbone, Ito et al. recapitulated wild-type RF2 termination with SPF46. Introducing S205P (forming chimeric PPF) attenuated recognition of UGA in vitro without compromising UAA termination. Nonetheless, expression in E. coli did not complement native RF2 knockout46. A second study introduced S205P into wild-type RF241. Unsurprisingly, supplemental expression in E. coli led to growth interference and cell nonviability. Given these results, we hypothesized that unlike wild-type E. coli, which is dependent on native UGA termination, our ∆TGA rEc∆2.∆A.B1 strain could uniquely tolerate RF2 S205P. Using rEcΔ2.∆A.B1, we attempted to introduce S205P into genomic RF2.B1 to generate variant RF2.B2, but did not recover viable clones without prfA, compensatory mutations in translation regulatory protein RF3 (prfC) (Supplementary Table 14b), or episomal RF2.B2 expression with growth defects (Extended Data Fig. 4 and Supplementary Discussion 1). To enable S205P within a genomic ∆RF1/RF2.B1 background and maintain cellular fitness, we used MAGE oligonucleotides to introduce mutations across RF2.B1 concurrent with S205P (Supplementary Table 16). Targeted mutations were identified through E. coli RF1 and RF2 sequence alignments with RF2s from low-TGA-frequency bacterial genomes49 that are hypothesized to encode RF2 variants with poorer UGA recognition. Residues were cross-referenced with the literature to avoid conferring recognition of UAG41,42,46.
Viable S205P strains arose only in co-occurrence with E170K (Extended Data Fig. 3b). E170K is a previously characterized charge flip mutation that compensates growth interference from S205P41 and independently arose from rEcΔ1.ΔA adaptive evolution42. Release factor complementation assays revealed that residue charge flips close to E170—such as E167K—altered codon recognition, unlike E170K41. Introduction of S205P and E170K into RF2.B1 generated RF2.B3 (Fig. 2a,b and Supplementary Tables 14, 16 and 17). To examine the essentiality of E170K in our rEcΔ2.∆A.B3 strain, we used MAGE to establish a K170E reversion but did not yield viable clones in absence of wild-type S205 or RF1, confirming the essentiality of E170K to support viability of the RF2 S205P mutation.

Characterizing mutant RF2 termination

To assess whether RF2.B3 abolishes recognition of UGA, we conducted an in vivo assay measuring release activity of RF2 variants at all three stop codons. Using validated methods50, this assay evaluates the ability of a release factor to terminate translation of an mCherry gene ending in a target codon upstream of a YFP gene (Fig. 2c, Extended Data Fig. 5b–d and Supplementary Table 18). If no termination occurs at the target codon, translation reads through mCherry to YFP, expressing both proteins. If termination occurs, only mCherry is expressed, yielding low YFP fluorescence. Figure 2d–f presents fluorescence of mCherry and YFP expression as a function of termination at a target codon, normalized to fluorescent expression from GCG (positive control). The results revealed low YFP expression at all stop codons, regardless of release factor status (Fig. 2e). Previous work showed that rEcΔ1.ΔA genes terminating with TAG are degraded via transfer messenger RNA (tmRNA) tagging, reducing UAG codon readthrough and GFP expression51. This is consistent with low expression levels observed for the mCherry gene terminating with UAG and downstream YFP. For more robust readouts, we used modified UAG- and UGA-suppressor serine tRNAs (supD)52 that precisely base pair with their respective codons to examine direct codon competition with release factor variants, effective for assessing codon reassignment (Fig. 2f). Results with mutant rEc∆2.∆A.B3 suggest termination at UAA, approximately 42-fold enhanced readthrough at UAG and 45-fold enhanced readthrough at UGA relative to no suppressors, leading to 11-fold and 8-fold increases in UGA YFP fluorescence compared with wild-type RF2 strains rEc∆1.∆A.B0 and rEc∆2.∆A.B1, respectively. These results demonstrate that UGA recognition by RF2.B3 is significantly attenuated compared with RF2.B0, as RF2.B3 is outcompeted by UGA-specific suppressor tRNAs. 
To assess whether S205P (not E170K) causes attenuation, we reverted RF2.B3 P205 back to serine, generating variant RF2.B4. We repeated the in vivo mCherry–YFP readthrough assays without suppressors to assess rEc∆2.A.B2 (RF2(S205P)/RF1+) alongside isogenic rEc∆2.∆A.B4 (RF2(E170K)/∆RF1) and two wild-type RF2/∆RF1 controls: rEc∆2.∆A.B0 and rEc∆1.∆A.B0 (Extended Data Fig. 5e). The high mCherry and low YFP fluorescence for UAG from rEc∆2.A.B2 is indicative of RF1 terminating at UAG. For UGA, results reveal low YFP fluorescence from both RF2.B0 and RF2.B4 variants, suggesting little effect from E170K and D26N on UGA termination. However, S205P in the absence of E170K (rEc∆2.A.B2) results in high YFP fluorescence, suggesting significant UGA readthrough from native suppression and possibly abolition of RF2 termination. This result stands in stark contrast to RF2.B3 (S205P with E170K) without supD suppressors (Fig. 2e), revealing S205P to be uniquely responsible for attenuation of UGA termination. These data suggest that E170K restores S205P termination at UAA sufficient for cell viability and partially restores termination at UGA, although significantly attenuated, to enable robust supD suppression (Fig. 2f).
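The readout logic of the mCherry–YFP assay can be sketched in a few lines of Python. This is a minimal illustration, assuming that each construct's fluorescence is divided by the mean of the GCG sense-codon control and that replicates are summarized as mean ± s.e.m., as in Fig. 2e,f. The function names, the 0.5 cut-off used to call a signal "high" and any example values are hypothetical; they are not parameters reported in the study.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_control(replicates, control_replicates):
    """Express each replicate as a fraction of the mean GCG-control fluorescence."""
    control_mean = mean(control_replicates)
    return [value / control_mean for value in replicates]


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


def classify_outcome(mcherry, yfp, threshold=0.5):
    """Map normalized mCherry/YFP signals to the outcomes sketched in Fig. 2c,d.

    `threshold` is an arbitrary illustrative cut-off, not a value from the study.
    """
    if mcherry < threshold and yfp < threshold:
        return "stalling/degradation"   # both signals lost
    if yfp >= threshold:
        return "readthrough"            # translation continued into YFP
    return "termination"                # mCherry made, YFP not


def fold_change(with_suppressor, without_suppressor):
    """Ratio of normalized YFP signal with versus without a suppressor tRNA."""
    return with_suppressor / without_suppressor
```

With this convention, a strain in which a suppressor tRNA outcompetes the release factor at codon X shows a `fold_change` well above 1 for YFP, which is the pattern reported for RF2.B3 at UGA.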

The effects of individual RF2 mutations on translation are further evidenced by phage assays assessing genetic orthogonality8,9,10 (Supplementary Discussion 2). We challenged strains with infection from two bacteriophages: Lambda (λ) (containing TAG, TGA and TAA) and Mu (µ) (containing TGA and TAA only) (Fig. 3 and Extended Data Fig. 6). Whereas all strains lacking RF1 retained resistance to λ, as previously described51 (Fig. 3a), only rEc∆2.A.B2 (RF2(S205P)/RF1+) displayed resistance to µ (Fig. 3b). These results suggest that: (1) UGA termination may be abolished in rEc∆2.A.B2 and thus is not necessary for ∆TGA strain viability; and (2) E170K in rEc∆2.∆A.B3 restores partial UGA termination function, in alignment with the readthrough data.

Fig. 3: Characterization of release factor variants using phage assays.

a, RF1 and RF2 variant strains were challenged with λ phage to assess phage infectivity with genes terminating with TAG, TAA and TGA. PFU, plaque-forming units. b, Release factor variant strains were challenged with µ phage to assess phage infectivity with genes terminating with TAA and TGA only. Data are mean ± 95% confidence interval for n = 3 biological replicates. Unpaired t-test for comparison with E. coli MG1655. NS, not significant; ND, not detected (lower than the measurable detection limit).


Eliminating UGA suppression by tRNATrp

We established proteomics and reporter protein assays to examine how UGA is decoded in rEc∆2.∆A.B3. Mass spectrometry revealed marked Trp suppression comprising around 80% of residues at UGA, indicating significant suppression by tRNATrp (Fig. 4 and Supplementary Table 19) as the E. coli tRNATrp (Ec-tRNATrp) anticodon loop wobble pairs with UGA53. To reduce native Trp suppression and disentangle translational crosstalk between TGA and TGG, we modified trpT, which encodes tRNATrp(CCA), to abolish UGA recognition. Modifications at positions 34 and 37 of tRNA anticodon loops have been demonstrated to dictate codon recognition fidelity across kingdoms22,26,54. Mutations that knock out A37-modifying enzymes have been shown to abolish UGA suppression from Saccharomyces cerevisiae tRNACys(GCA) (Sc-tRNACys)22 and strongly attenuate suppression (by 98.5% compared to the wild type) by Ec-tRNATrp (ref. 21). We compared the sequence surrounding the anticodon region of Ec-tRNATrp to that of Sc-tRNACys and identified a conserved A36-A37-A38 sequence in the anticodon loop of both tRNAs that is required for modification of A37 to N6-(isopentenyl) adenosine (i6A37) in both prokaryotes and eukaryotes26. On the basis of these observations, we hypothesized that the UGA suppression activity of Ec-tRNATrp is linked to A37 and that an A37G mutation, which disables ms2i6A37 modification, could abolish UGA suppression in E. coli. To test this hypothesis, we mutated A37 in native Ec-tRNATrp to G37 (tW*) to disrupt the A36-A37-A38 substrate motif and prevent base modification (Fig. 4a), generating rEc∆2.∆A.B3.tW* (Ochre). We used our mCherry–YFP fluorescent assays with an isopropyl β-d-1-thiogalactopyranoside (IPTG)-inducible Methanocaldococcus jannaschii tyrosyl-tRNA synthetase and proK constitutively expressed tRNATyr(UCA) pair (Mj-TyrOTS) designed to encode Tyr at UGA to evaluate OTS activity and native tRNATrp suppression at UGA (Fig. 4b). 
We observed a significant reduction in YFP/mCherry expression in rEc∆2.∆A.B3.tW* compared with rEc∆2.∆A.B3 (wild-type tRNATrp), suggesting a reduction of native tRNATrp suppression (Fig. 4c). To confirm decreased Trp suppression in rEc∆2.∆A.B3.tW*, we used mass spectrometry reporter for exact amino acid decoding (MS-READ) to assay amino acid incorporation at UGA in wild-type and A37G tRNATrp strains containing Mj-TyrOTS13,55. Mass spectrometry analysis confirmed Mj-TyrOTS-mediated Tyr incorporation at UGA codons, and unexpectedly, a small amount (less than 1%) of Cys incorporation, possibly due to a parallel third base wobble to Sc-tRNACys22. Whereas Trp was the most abundant amino acid at UGA in rEc∆2.∆A.B3 cells with wild-type tRNATrp, it was completely absent in rEc∆2.∆A.B3.tW* (Fig. 4d and Supplementary Table 19). These results confirm that a single base substitution in Ec-tRNATrp preserves native Trp decoding while eliminating UGA suppression, removing the wobble effect and establishing an open UGA codon for reassignment. With the combination of RF1 deletion and attenuated UGA recognition of RF2.B3 and tW*, four codons that naturally display translational crosstalk (UAG-open, UGA-open, UAA-stop and UGG-Trp) can be rendered translationally isolated and functionally exclusive (Fig. 1a).
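The notion of codon exclusivity in the stop codon block can be written as a small lookup table. The Python sketch below encodes the codon functions described in the text for three contexts and tests whether every codon carries exactly one function that no other codon shares. The dictionary keys, the labels "nsAA-1" and "nsAA-2" and the function name are illustrative assumptions, and attenuated residual activities (for example, partial UGA termination by RF2.B3) are deliberately ignored in this simplification.

```python
# Functions decoded at each codon of the stop codon block (mRNA notation).
CODON_FUNCTIONS = {
    # Wild type: RF1 reads UAG/UAA, RF2 reads UGA/UAA, tRNATrp reads UGG
    # and is near-cognate for UGA.
    "wild_type": {
        "UAG": {"stop"},
        "UGA": {"stop", "Trp"},
        "UAA": {"stop"},
        "UGG": {"Trp"},
    },
    # Delta-TAG progenitor: UAG is unassigned, UGA crosstalk remains.
    "delta_TAG": {
        "UAG": set(),
        "UGA": {"stop", "Trp"},
        "UAA": {"stop"},
        "UGG": {"Trp"},
    },
    # Ochre with a dual OTS: each codon has a single, distinct function.
    "ochre_dual_ots": {
        "UAG": {"nsAA-1"},
        "UGA": {"nsAA-2"},
        "UAA": {"stop"},
        "UGG": {"Trp"},
    },
}


def is_codon_exclusive(table):
    """True if each codon has exactly one function and no function is shared."""
    one_function_each = all(len(funcs) == 1 for funcs in table.values())
    all_functions = [f for funcs in table.values() for f in funcs]
    return one_function_each and len(all_functions) == len(set(all_functions))


exclusivity = {name: is_codon_exclusive(t) for name, t in CODON_FUNCTIONS.items()}
```

Only the last table satisfies the check, which mirrors the statement that RF1 deletion together with RF2.B3 and tW* renders the four codons functionally exclusive.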

Fig. 4: Engineering tRNATrp to mitigate UGA suppression.

a, Schematics of wild-type and mutant (A37G) tRNACys(GCA) anticodon loops in S. cerevisiae and wild-type and mutant (A37G) tRNATrp(CCA) anticodon loops in E. coli, depicting loss of anticodon modification and codon recognition in mutant forms. b, Schematic representation of Mj-TyrOTS expression plasmid: IPTG-inducible M. jannaschii tyrosyl-tRNA synthetase (Mj-TyrRS) and accompanying constitutively expressed tRNATyr(UCA). c, Mj-TyrOTS-enabled suppression of UGA within a mCherry–YFP fluorescent reporter expressed in rEc∆2.B2 strains with wild-type or mutant tRNATrp (tW*). Data are mean ± s.e.m. P values for comparison with no Mj-TyrOTS and no Ara (inducer) conditions for each strain by unpaired t-tests (n = 3 biological replicates). d, Stacked mass spectrometry data depicting identities of incorporated residues at UGA from MS-READ analysis of rEc∆2.B2 strains with wild-type or mutant tRNATrp.


Characterizing growth of recoded strains

The cumulative effects of whole-genome TGA replacement and RF2–tRNATrp translation engineering can be seen when comparing our rEc∆2.∆A.B3.tW* with ancestral rEcΔ1.∆A in fitness (Extended Data Fig. 7) and morphology (Extended Data Fig. 8). To assess the effects on cellular fitness, we measured doubling time and MaxOD of each strain in LB and M9 media (Fig. 5). Doubling times for rEc∆2.∆A.B3.tW* increased by approximately 50% in M9 and 20% in LB, whereas MaxOD increased by around 16% in M9 and decreased by around 18% in LB. Further analysis dissecting the individual effects of ∆TGA recoding and translation factor mutations on growth revealed that RF2 S205P was a primary contributing factor to fitness impairments (Extended Data Fig. 3a and Supplementary Discussion 3). These data reveal general growth impairment relative to ancestral rEcΔ1.∆A, but with minor improvement in M9 MaxOD. However, the similarity in growth of rEcΔ2.∆A.B1 with commercial strains TOP10 and DH10B in LB (Fig. 5) incentivizes further tuning of translation factors to improve growth and fitness.
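For readers who wish to reproduce comparisons of this kind, the Python sketch below shows one common way to derive doubling time and MaxOD from an OD600 growth curve and to express the difference between two strains as a percentage. The text does not specify how doubling times were calculated, so the log-linear fit over exponential-phase points is an assumption, and all function names are hypothetical.

```python
import math


def growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD600) versus time (per hour).

    Assumes the supplied points all lie within exponential phase.
    """
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    t_bar = sum(times_h) / n
    y_bar = sum(logs) / n
    numerator = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_h, logs))
    denominator = sum((t - t_bar) ** 2 for t in times_h)
    return numerator / denominator


def doubling_time(times_h, od_values):
    """Doubling time in hours: ln(2) divided by the exponential growth rate."""
    return math.log(2) / growth_rate(times_h, od_values)


def max_od(od_values):
    """Maximum OD600 reached over the whole curve (MaxOD)."""
    return max(od_values)


def percent_change(new, reference):
    """Change of `new` relative to `reference`, in percent."""
    return 100.0 * (new - reference) / reference
```

On this scale, a strain whose doubling time is 1.5 times that of its ancestor shows a 50% increase, the magnitude reported for Ochre in M9.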

Fig. 5: Growth of recoded strains.

MaxOD and doubling time of recoded and laboratory E. coli strains grown in nutrient-rich LB (top) and nutrient-poor M9 (bottom) liquid media. Data are mean ± 95% confidence interval for n = 11 (LB) or 7 (M9) biological replicates. P values for comparison with rEc∆2.∆A.B1 by Mann–Whitney U-test.


nsAA dual incorporation into proteins

To assess the practical implications of disentangling translational crosstalk to liberate codons, we assessed the ability of rEc∆2.∆A.B3 and rEc∆2.∆A.B3.tW* (Ochre) to incorporate two distinct nsAAs at UAG and UGA in reporter proteins. Previous recoding of TAG and RF1 deletion in rEcΔ1.ΔA demonstrated multi-site incorporation of nsAAs in elastin-like polypeptides (ELPs) with more than 95% accuracy13. To reassign both UAG and UGA codons, we constructed a dual-OTS fluorescent reporter plasmid for incorporating para-acetyl-l-phenylalanine (pAcF) at UGA and Nε-Boc-l-lysine (BocK) at UAG (Fig. 6 and Extended Data Fig. 9a). This plasmid contains a bicistronic cassette with l-arabinose-inducible (pBAD) o-aaRSs paired with constitutively active (proK) o-tRNAs assigned to each codon: pAzFRS.2.t113 paired with anticodon-modified M. jannaschii tRNA(UCA) (Mj-tRNA(UCA)) targeting UGA, and chPylRS56 paired with optimized PylT targeting UAG. o-tRNAs were expressed with a valX linker for dual expression. For a reporter, we expressed a Tet repressor (TetR)–anhydrotetracycline (aTc)-inducible ELP–GFP13 with the ELP sequence modified for MS-READ55. Seven ELP–GFP constructs were tested to assess nsAA encoding at UAG, UGA, or both within peptide linkers: (1) 1×TAG; (2) 3×TAG; (3) 1×TGA; (4) 3×TGA; (5) 1×(TGA-TAG); (6) 3×(TGA-TAG); and (7) 3×TAC as a control construct for native incorporation (Fig. 6a, Extended Data Fig. 9b and Supplementary Tables 18 and 20). We tested four strains to compare single and dual incorporation: rEcΔ1.ΔA.B0, rEcΔ2.ΔA.B1, rEc∆2.∆A.B3 and rEc∆2.∆A.B3.tW* (Fig. 6b–d).
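The seven reporter constructs and the intended codon assignments can be summarized as a simple mapping. The Python sketch below lists the target codons in each ELP–GFP linker and derives which residues are expected, and which nsAAs must be supplied, under the assignment described above (BocK at UAG, pAcF at UGA, with TAC as the native tyrosine control). The construct keys and function names are illustrative, and the sketch models only the order and number of target codons: the interleaved order assumed for the 3×(TGA-TAG) construct and the spacing of codons within the linkers are assumptions, not details given in the text.

```python
# Intended decoding of each target codon (DNA notation, as in the construct names).
ASSIGNMENT = {"TAG": "BocK", "TGA": "pAcF", "TAC": "Tyr"}

# Target codons per construct; repeats of (TGA-TAG) are assumed to be interleaved.
CONSTRUCTS = {
    "1xTAG": ["TAG"],
    "3xTAG": ["TAG"] * 3,
    "1xTGA": ["TGA"],
    "3xTGA": ["TGA"] * 3,
    "1x(TGA-TAG)": ["TGA", "TAG"],
    "3x(TGA-TAG)": ["TGA", "TAG"] * 3,
    "3xTAC": ["TAC"] * 3,  # control for native incorporation
}


def expected_residues(construct):
    """Residues expected at the target codons when reassignment is fully accurate."""
    return [ASSIGNMENT[codon] for codon in CONSTRUCTS[construct]]


def required_nsaas(construct):
    """nsAAs that must be added to the medium for full-length reporter expression."""
    return sorted({res for res in expected_residues(construct) if res != "Tyr"})


for name in CONSTRUCTS:
    print(name, expected_residues(name), required_nsaas(name))
```

A mapping of this kind makes explicit that only the (TGA-TAG) constructs require both nsAAs, which is why they serve as the test of dual incorporation.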

We first analysed single nsAA incorporation at one and three codons. Without RF1 present, we expected minimal differences between strains incorporating BocK at UAG. Indeed, induction of OTS with expression of 3×TAG or 1×TAG and BocK demonstrated little appreciable difference in normalized fluorescence between strains (Fig. 6b and Extended Data Fig. 9c). As expected, induction of 3×TAG and OTS with BocK presented around fourfold (rEc∆1.∆A), sevenfold (rEc∆2.∆A.B1), ninefold (rEc∆2.∆A.B3) and ninefold (rEc∆2.∆A.B3.tW*) increased fluorescence relative to no nsAAs (Fig. 6b). Expression of 1×TAG demonstrated a similar trend with higher normalized fluorescence and approximately twofold change (Extended Data Fig. 9c). These results reveal no negative effect on nsAA incorporation efficiencies at UAG resulting from ∆TGA recoding with or without translation engineering.

With translation engineering, we hypothesized that pAcF incorporation at UGA would significantly increase over wild-type RF2 variants. As expected, induction of 3×TGA and OTS with pAcF presented approximately threefold (rEc∆1.∆A), threefold (rEc∆2.∆A.B1), eightfold (rEc∆2.∆A.B3) and sevenfold (rEc∆2.∆A.B3.tW*) increased fluorescence relative to no nsAAs (Fig. 6c). Meanwhile, 1×TGA fluorescence fold changes were similar across all strains, resulting from increased fluorescence in the absence of nsAAs (Extended Data Fig. 9d). This effect probably resulted from previously observed mischarging of Mj-tRNA by pAzFRS.2.t113. Overall, these data reveal that TGA reassignment was not enabled by ∆TGA recoding alone, highlighting the role of RF2 and tRNATrp engineering in mitigating codon recognition to liberate and reassign UGA.

Finally, we assessed dual incorporation of BocK and pAcF at UAG and UGA, respectively. Induction of 3×(TGA-TAG) and OTS with BocK and pAcF presented approximately 2-fold (rEc∆1.∆A), 7-fold (rEc∆2.∆A.B1), 20-fold (rEc∆2.∆A.B3) and 17-fold (rEc∆2.∆A.B3.tW*) increased fluorescence relative to no nsAAs (Fig. 6d). Expression of 1×(TGA-TAG) resulted in similar relative fluorescence fold changes, but with rEc∆1.∆A matching rEc∆2.∆A (Extended Data Fig. 9e). Demonstrably, the attenuation of UGA competition from native translation factors enabled UAG and UGA stop codon reassignment for multi-site encoding of two distinct nsAAs.

To assess translational accuracy, we used mass spectrometry to assay BocK and pAcF incorporation at six codons (three TGA and three TAG) in a modified MS-READ reporter peptide designed for unambiguous identification of two MS-READ reporter segments (Fig. 6e and Supplementary Table 21). Tandem mass spectra for each reporter segment confirmed incorporation of 3×BocK and 3×pAcF at all intended positions. Product ion spectra were more clearly differentiated by a strong 100-Da neutral loss signature at BocK sites (Fig. 6f). In rEc∆2.∆A.B3, peptide intensities for reporter segments revealed Trp incorporation at UGA, competing with pAcF, and Lys, Gln and Tyr incorporation at UAG, consistent with previous observations when reassigning UAG13 (Fig. 6g,h), providing approximately 93% on-target incorporations. By contrast, results with rEc∆2.∆A.B3.tW* were completely free of Trp contamination and accompanied by a marked reduction in Lys, Gln and Tyr incorporation at UAG, yielding more than 99% on-target incorporation at all sites (Fig. 6g,h). This result demonstrates the effect of the tRNATrp A37G mutation in enhancing incorporation efficiency and mitigating competition from native UGA suppressors. These conclusions are validated by western blot assays dissecting translation at individual codons (Extended Data Fig. 9f and Supplementary Discussion 4). To evaluate the applicability of our expression chassis, we measured purified 3×(TGA-TAG) ELP–GFP containing BocK and pAcF dual incorporations, obtaining yields of approximately 2 mg l−1 from Ochre, around 21-fold higher than yields from unrecoded BL21 (Extended Data Fig. 9g,h and Supplementary Discussion 5). Together, these results emphasize the utility of our work and the importance of mitigating native codon competition to facilitate nsAA incorporation into proteins with high fidelity.

Fig. 6: Single and dual incorporation of nsAAs at UAG and UGA in ELP–GFP proteins.

a, Left, schematic depicting the dual-OTS and fluorescent reporter expression system. l-arabinose-inducible o-aaRSs and accompanying constitutively expressed UAG- and UGA-suppressing o-tRNAs coordinate charging and incorporation of the nsAAs BocK and pAcF at UAG and UGA, respectively, in one of three aTc-inducible fluorescent ELP–GFP reporters. Each reporter contains three instances of TAG, TGA or TAG and TGA within a central ELP–GFP linker to study single or dual nsAA encoding. Right, key representing nsAA incorporation at UAG, UGA or both within four variant recoded strains. b–d, Fluorescent expression of ELP–GFP reporter containing 3 TAG codons (b), 3 TGA codons (c) or 3 TAG and 3 TGA codons (d) in variant strains with or without 10 mM BocK and/or 1 mM pAcF added to the growth medium, along with 0.05% (w/v) l-arabinose and 100 ng ml−1 aTc for induction. Colour scheme as shown in the key in a. Fluorescence is normalized to optical density, then to 3×TAC control construct fluorescence/optical density. Data are mean ± 95% confidence interval for n = 3 biological replicates. P values for comparison with rEc∆1.∆A by unpaired t-tests (n = 3). WT, wild type. e, Diagrammatic depiction of dual reporter ELP–3×(TGA-TAG)–GFP with TAG (red) and TGA (blue) stop codon positions and associated nsAAs. f, Mass spectrometry data showing incorporation of BocK and pAcF in ELP–3×(TGA-TAG)–GFP reporter proteins expressed in rEc∆2.∆A.B3. ^ indicates BocK sites. g, MS-READ peptide intensity for expected BocK incorporation at 3×UAG and pAcF at 3×UGA, and misincorporations identified in all examined digested peptides. h, MS-READ peptide intensity of misincorporations from g.


Discussion

Here we describe an integrated genomic, biomolecular and protein engineering effort to disentangle translational crosstalk within the stop codon block, rendering four codons (UAG, UAA, UGA and UGG) functionally non-degenerate, thereby compressing the stop function to UAA while opening UAG and UGA for reassignment. This was achieved by replacing all genomic stop codons with synonymous TAA, coupled with engineering of tRNATrp and RF2 to mitigate UGA recognition. Deploying compatible OTSs, the resulting strain, Ochre, enabled UAG and UGA reassignment for multi-site incorporation of two distinct nsAAs within proteins with more than 99% accuracy. This result highlights the efficacy of integrating translation factor engineering with genomic recoding to more comprehensively compress, release and accurately reassign codons. As recoding efforts delete cognate translation factors to free codons4,9, underlying webs of natively competing translation factors may be revealed, serving as a major barrier to comprehensive recoding. Our work overcomes this by engineering translation factors to disentangle crosstalk, which is essential to completely liberate codons, offering a new lens for recoding efforts across domains of life.

Codon recognition was a central focus of the study. Our efforts affirm that translation factor decoding is a delicate balance, involving highly tuned structural interactions, concentration kinetics and competition for codons (Supplementary Discussion 1), with RF2 decoding influenced by residues beyond the recognition loop (for example, E170K). Future recoding efforts will require engineering codon recognition. As in tRNATrp, natural anticodon modifications offer promise for future codon isolation. Modifications may be added or removed to widen or narrow codon specificity. Alternatively, ribosomal components57,58 or RF3 may provide other levers for modulating decoding. Given the limited sampling of translation factor mutations in this study (Supplementary Table 14b), there exists broader opportunity to engineer factor functionalities in ways that are uniquely enabled by GROs—for example: (1) directed evolution of translation machineries may foster kinetics that are impermissible in other contexts; (2) additional release factor engineering to abolish UGA decoding may enable Ochre to further obstruct horizontal transfer of genetic elements8,9,10 or improve protein yields; or (3) engineered noncanonical stop codons may enable bidirectional genetic isolation7.

This study provides novel insights into fundamental biology. Spontaneous mutations that arise in RF3 (Supplementary Table 14b) may help to elucidate regulatory mechanisms of translation. The replacement of TGA, decoupling it from stop function, provides insight into codon redundancy and cryptic codon functions removed during codon compression. Differential suppression of UGA codons by tRNAs in times of stress can alter cellular transcription patterns50,59, demonstrating nonsynonymous stop codon functions. Other recoding efforts in E. coli have uncovered expression-regulating functions that are unique to AGG/AGA codons60. Future recoding efforts should take note of alternative functions of translation factors or codons, especially as these efforts reveal them. The effects of ∆TGA recoding on cellular responses in Ochre have yet to be investigated; this approach may have potential for improving fitness and investigating cryptic codon functions.

Our study provides insights into the science of genetic codes and translation by utilizing a synthetic biology approach that recodes genomes and repurposes the function of translation components in an integrated framework. More broadly, this study helps to establish design rules to probe and engineer genomes alongside translation factors to achieve a deeper and predictive understanding of the universal translation machinery and the canonical genetic code, setting the stage for engineering genomes with non-degenerate genetic codes. This research also offers valuable applications in safeguarding genetically modified organisms with enhanced genetic isolation8,10,51, biocontainment11,12 and expanded capacity for producing entirely new classes of synthetic proteins, biomaterials or therapeutic agents with diverse chemistries13,14,23.

Methods

DNA

Oligonucleotides were purchased from Integrated DNA Technologies with standard purification and desalting. See Supplementary Data for all oligonucleotides and DNA constructs.

Media

Unless otherwise stated, all cultures were grown in LB-Lennox medium (10 g l−1 bacto tryptone, 5 g l−1 sodium chloride, 5 g l−1 yeast extract). LB agar plates were composed of LB plus 15 g l−1 bacto agar. M9 minimal medium (12.8 g l−1 Na2HPO4, 3 g l−1 KH2PO4, 1 g l−1 NH4Cl, 0.5 g l−1 NaCl, 3 mg l−1 CaCl2) was adjusted to pH 7.5 with 10 M NaOH. For phage experiments, Tryptone-KCl (TK) liquid medium contained 10 g l−1 Tryptone, 5 g l−1 KCl and 0.5 ml l−1 of 1 M CaCl2. TK sloppy agar contained 10 g l−1 Tryptone, 5 g l−1 NaCl and 7 g l−1 agar. TK bottom agar contained 10 g l−1 Tryptone, 2.5 g l−1 NaCl, 2.5 g l−1 KCl, 0.5 ml l−1 of 1 M CaCl2 and 10 g l−1 agar. Super optimal broth with catabolite repression (SOC) liquid medium contained 20 g l−1 tryptone, 5 g l−1 yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4 and 20 mM glucose. For protein yield quantification, terrific broth medium (TB) contained 11.8 g l−1 tryptone, 23.6 g l−1 yeast extract, 9.4 g l−1 K2HPO4, 2.2 g l−1 KH2PO4 and 0.8% v/v glycerol.

Selective agents

ColE1 was expressed in strain JC411 and purified as previously described61. All other selective agents were purchased commercially. Selective agents were used at the following concentrations: carbenicillin (50 µg ml−1), chloramphenicol (40 µg ml−1), gentamycin (10 µg ml−1), kanamycin (30 µg ml−1), sodium dodecyl sulfate (SDS) (0.005% w/v), spectinomycin (190 µg ml−1), tetracycline (15 µg ml−1), hygromycin (150 µg ml−1), zeocin (10 µg ml−1) and colicin E1 (ColE1; ~10 µg ml−1).

Promoter inducers

Plasmid-induced pORTMAGE62 and Recombineering63: l-arabinose (0.2% w/v, pBAD promoter), rhamnose (0.15% w/v, rhaB promoter). Plasmid maintenance was carried out with tetracycline (5–10 ng ml−1, tetA selectable marker).

OTS expression for nsAA incorporation assays: l-arabinose (0.05% w/v, pBAD promoter) and anhydrotetracycline (100 ng ml−1, pL tetO promoter).

Non-standard amino acids

BocK was purchased from Chem-Impex (00363) and dissolved in LB or TB to a final concentration of 10 mM. pAcF was purchased from Chem-Impex (24756), dissolved in sterile water to a concentration of 50 mM and used at a final concentration of 1 mM.

Strains

All strains were built on rEc∆1.∆A-C321.ΔA4 (MG1655 dnaG.Q576A.exoX.K28TAA.ΔtolC.ΔprfA.ΔmutS::zeo.Δ (ybhB-bioAB)::[λcI857 N(cro-ea59)::tetR-bla].12B.tolQRA), in which all instances of TAG codons are converted to TAA and 1,195 naturally occurring TGA codons remain.

Selectable marker preparation

Selectable markers were amplified by PCR (40 µl per reaction) performed using Kapa HiFi HotStart ReadyMix according to the manufacturer’s protocols with annealing at 61 °C. Primers were designed via Benchling primer creation software and confirmed with Kun’s oligonucleotide Tm calculator (http://arep.med.harvard.edu/kzhang/cgi-bin/myOligoTm.cgi). PCR products were purified using a Qiagen PCR purification kit, eluted in 30 µl dH2O and quantified using either an Eppendorf BioPhotometer plus or a NanoDrop ND1000 spectrophotometer. For analysis, amplicons were run on a 1.5% agarose gel stained with ethidium bromide to confirm the expected band sizes.

TGA-to-TAA recoding

The starting strain was a standard rEc∆1.∆A (C321.∆A) with a zeocin marker cassette replacing a chloramphenicol marker at the mutS locus (Supplementary Table 10). The ybgC-tolQRA locus was duplicated between the ycgV and ychF genes as previously described64 to increase the fidelity of ColE1 selections. The previously reported mutations to dnaG and exoX to improve MAGE efficiency were also present in the starting strain65. All genes containing TGA codons were identified from WGS (Supplementary Table 1; see Methods section ‘WGS and analysis’ below). Eliminating TGA codons required a combined strategy of MAGE-mediated conversions of TGA to TAA and non-essential gene deletions (Supplementary Table 7). The oligonucleotides were designed as previously described30, ordered from IDT and grouped into pools of up to 11 oligonucleotides per pool at 10–12 µM total DNA per pool, regardless of the number of oligonucleotides (N) (Supplementary Table 1). These pools were used for N + 1 rounds of MAGE before cultures were plated. MAGE was performed as previously described30. Forty-seven colonies were picked and screened using multiplex allele-specific colony PCR (MASC-PCR), as previously described15. Simultaneously, picked colonies were grown to confluence, diluted 1:15 into 150 µl of M9 minimal medium, then inoculated 1:100 into 150 µl LB and M9 to assess growth phenotypes (MaxOD, doubling time and lag time) (see ‘Fitness analysis’). The colony containing the largest number of lowest frequency conversions and minimal deviations in growth fitness was chosen for subsequent rounds of MAGE.
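The pooling arithmetic described above can be sketched in Python as follows. This is a minimal illustration that assumes oligonucleotides are simply chunked in list order; the function names and placeholder oligonucleotide labels are hypothetical, and the grouping actually used in the study is the one listed in Supplementary Table 1.

```python
def make_pools(oligos, max_per_pool=11):
    """Split a list of oligonucleotide names into pools of at most max_per_pool."""
    if max_per_pool < 1:
        raise ValueError("max_per_pool must be at least 1")
    return [oligos[i:i + max_per_pool] for i in range(0, len(oligos), max_per_pool)]


def mage_rounds(pool):
    """MAGE cycles run before plating: N + 1 for a pool of N oligonucleotides."""
    return len(pool) + 1


# Hypothetical placeholder names, not oligonucleotides from the study.
oligos = [f"oligo_{i:02d}" for i in range(1, 26)]
for index, pool in enumerate(make_pools(oligos), start=1):
    print(f"pool {index}: {len(pool)} oligos, {mage_rounds(pool)} MAGE rounds")
```

Chunking in list order is only one possible grouping; pools could equally be built by genomic position or predicted conversion frequency.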

Native RBS predictions

Predicted RBS translation rates of genes overlapped by TGA stop codons before and after recoding to TAA were calculated in alignment with previous protocols36.

MAGE and λ-Red-mediated recombination

MAGE, pORTMAGE and λ-Red-mediated recombination were performed as previously described15,62. Cells were transferred to 0.1 cm cuvettes, electroporated (BioRad GenePulser, 1.78 kV, 200 Ω, 25 µF) and immediately resuspended into 3 ml LB (MAGE) or 1 ml SOC medium (dsDNA), grown at 34 °C, 225 rpm. For tolC and galK negative selections, cultures were recovered for at least 7 h to allow complete protein turnover before exposure to ColE1 and 2-deoxygalactose, respectively. Once all strains were conjugated into a final strain (see ‘CAGE assembly’), genomically integrated λ-Red was displaced by tolC negative selection. To convert background mutations and remaining TGA codons we employed a plasmid-based single-stranded DNA recombineering approach. l-arabinose-inducible λ-Red-derived recombineering machinery was expressed on a temperature-curable plasmid. For double-stranded DNA (dsDNA) recombination, rhamnose-inducible episomal λ-Red-derived recombineering machinery was also expressed on a temperature-curable plasmid containing l-arabinose-inducible ccdA antitoxin for negative selections as previously described63 for both marker removal and release factor placement/displacement. Both λ-Red plasmids contained tetA selectable markers. To cure plasmids, strains were incubated with λ-Red induction at 37 °C overnight in 3 ml LB, plated onto solid medium, incubated at 42 °C for 2–4 h, then incubated at 37 °C until colonies were visible. Colonies were picked into 150 µl LB and incubated for 3 h, after which 10 µl was transferred into LB with tetracycline to identify colonies lacking plasmids.

Genotyping

MASC-PCR was used to simultaneously detect up to 11 TGA-to-TAA conversions or background mutation reversions, as previously described with TAG conversions15. All primers were designed with a 61 °C annealing temperature, according to Kun’s oligonucleotide Tm calculator, while avoiding 3′ end binding in secondary structure, according to Benchling’s primer secondary structure prediction function. Each reaction consisted of KAPA 2G Fast Multiplex ReadyMix (Kapa Biosystems, KK5802), 2 µl of template DNA and 0.2 µM of each primer for each 10 µl reaction. MASC-PCR results were run on 2.2% agarose gels with ethidium bromide staining. After λ-Red-mediated recombination or conjugation, colony PCR was used to confirm the presence and absence of selectable markers at desired positions. Colony PCR (10 µl per reaction; annealing at 61 °C (Kun’s oligonucleotide Tm calculator)) was performed using Kapa 2G Fast HotStart ReadyMix following the manufacturer’s protocols. Results were analysed on a 1.5% agarose gel stained with ethidium bromide. Sanger sequencing was performed by Genewiz.

CAGE assembly

Conjugative assembly genome engineering (CAGE) was performed using the protocol as previously described15. Deviating from previous protocols, the donor strain had two positive markers (a spectinomycin marker and a gentamycin marker, each flanking the recoded region) in addition to the kanamycin resistance origin of transfer (kanR-oriT) cassette set ~3–5 kb upstream (Extended Data Fig. 2 and Supplementary Table 10). The donor strain also contained a modified RK24 plasmid in which all genes that end in a TAG codon were recoded to TAA10. Cell spots were rinsed twice with 500 µl LB and collected to give a 1 ml \(10^{-2}\) dilution of conjugated cells, followed by an additional 10-fold dilution in a new tube. Fifty microlitres each of the \(10^{-2}\) and \(10^{-3}\) dilutions were plated onto LB agar plates with appropriate antibiotics to select for both the recipient background selectable markers and the donor region selectable markers. Forty-seven candidate colonies were grown in a 96-well format and screened for desired genotypes via PCR (to confirm presence and absence of selectable markers) and MASC-PCR (to confirm the presence of interspersed desired codon replacements). For large genomic transfers (>1 Mb), final strains were sequence verified through WGS (see ‘WGS and analysis’). The resulting strain contained the three selectable markers, all of which were subsequently deleted via dsDNA λ-Red tolC-mediated selection–counterselection (via SDS or ColE1 selection, respectively), or maintained for the next conjugation64.

Genomic deletions

To reduce the overall number of genes requiring ∆TGA recoding via MAGE, 16 multigenic regions containing a total of 229 genes (3 later categorized as pseudogenes) were identified for deletion based on previous work by the Blattner laboratory29 (Extended Data Fig. 1). Each targeted deletion site (up to ~34 kb in size) was displaced via tolC selectable marker displacement and counterselection for subsequent marker removal in accordance with previous protocols64. ColE1 selections on solid medium were performed as previously described15,64. For pre-selection, 5 µl of recovered cultures were inoculated into 150 µl LB with either carbenicillin (control) or ColE1 with vancomycin (64 µg ml−1), in triplicate, in a 96-well plate and incubated at 34 °C for 16 h to monitor growth alongside the progenitor strain with tolC. Strains exhibiting growth in both carbenicillin and ColE1 with vancomycin were considered positive for tolC deletion and subsequently plated onto LB agar plates with ColE1 for monoclonal colony selection and PCR screened to confirm the loss of tolC. Subsequent MASC analyses allowed for selection of strains with both deletion events and TGA conversions to reduce the total number of MAGE cycles. After every tolC placement and gene displacement, strain growth curves were analysed to assess gene deletion impact on strain phenotype (see ‘Fitness analysis’). Major growth impairments were assessed. Additional major deletions (>50 bp) resulted either from mutagenesis of highly repetitive noncoding genes, active transposable elements or scarring from conjugations.

WGS and analysis

Whole genomes were isolated using the Qiagen DNeasy Blood and Tissue isolation kit. For high-fidelity genome analysis performed after conjugations, short-read data (150-bp paired-end reads, 50× coverage) were collected on an Illumina HiSeq 4000 with libraries prepared by the Yale Center for Genome Analysis. Long-read (<25 kb, 50× coverage) data were prepared by Pacific Biosciences Single Molecule Real-Time (SMRT) Analysis. For rapid whole-genome analysis, raw reads were acquired via Plasmidsaurus Oxford Nanopore standard bacterial WGS (30× coverage) from bacterial pellets suspended in DNA/RNA Shield. To analyse raw reads, the breseq computational pipeline (version 0.38.1) was used to align sequence reads to the rEc∆1.∆A (C321.∆A) reference genome, run using Ubuntu LTS on Windows Subsystem for Linux, in accordance with https://barricklab.org (ref. 37). Summary .html outputs provided comprehensive lists of mismatches, indels and missing or novel junctions to monitor TGA conversion progress, deletions and background mutagenesis. Mutations were then organized by type in Excel (see Supplementary Tables).

AlphaFold structure of RF2.B3

The 3D structure of RF2.B3 in Fig. 2b was acquired through Benchling AlphaFold48 online prediction from amino acid sequence submission (Supplementary Table 17) and spatially oriented for view of primary reaction sites.

Liquid selection complementation

In triplicate, strains harbouring prfB variant expression plasmids (as described in Extended Data Fig. 4 and Supplementary Discussion 1) were electroporated with a kanamycin resistance (kanR) dsDNA cassette for displacement of genomic prfB and recovered in varying concentrations of vanillic acid inducer for 3 h at 37 °C to allow episomal expression of prfB variants. Strains were then diluted 1:50 into selective LB medium containing 25 µg ml−1 kanamycin and incubated with shaking at 37 °C, monitoring absorbance at 600 nm in a BioTek Synergy H1 plate reader (Agilent). After 36 h of growth, OD600 for each strain was reported as viability after complementation. Individual colonies were isolated by plating knockout strains on selective medium after recovery and PCR screened to confirm kanR displacements of genomic prfB. Colonies that tested positive by PCR screen were confirmed by WGS (Plasmidsaurus) to not harbour genomic prfB or secondary mutations, and their prfB-containing plasmids were verified by whole-plasmid sequencing (Plasmidsaurus) to confirm the variant.

Readthrough fluorescence assay

We used a dual fluorescence mCherry–YFP reporter to test stop codon readthrough as previously reported66. This reporter was constitutively expressed from a low copy backbone (p15A) to maximize dynamic range for detecting post-transcriptional fluctuations. Each evaluated strain was transformed with four variants of the reporter (three stop codons and a sense codon positive control GCG) and plated on selective medium for overnight growth. Colonies were picked in triplicate into 150 µl cultures of LB with appropriate selection in 96-well plates and grown with shaking at 225 rpm at 37 °C for 18 h, after which measurements were taken for mCherry (excitation: 585 nm, emission: 635 nm, gain: 100), YFP (excitation: 500 nm, emission: 541 nm, gain: 80) and OD600 (absorbance: 600 nm) in a BioTek Synergy H1 plate reader (Agilent). To assess cognate amber or opal suppression, a plasmid bearing supD52 with CUA or UCA anticodon was co-transformed with the dual fluorescence reporter plasmid. Fluorescence values of all strain and plasmid conditions were normalized to OD600 before dividing by the positive control described above to arrive at the fractional mCherry and YFP signals reported.

Bacteriophage assays

For all phage experiments, growth was carried out in TK at 37 °C; infection and propagation occurred in TK sloppy agar poured onto solid TK bottom agar incubated at 37 °C. TK sloppy agar mixes were maintained at 45 °C before use.

Phage propagation

For propagation, E. coli MG1655 was grown to mid-log phase in 3 ml of LB. Two-hundred microlitres of bacteria was added to 3 ml TK sloppy agar. Immediately following, 50 µl of phage (10 to 100-fold dilutions) was added directly from refrigerated stock into the sloppy-bacterial culture. Three millilitres of sloppy bacteria phage culture was poured onto solid TK bottom agar plates, dispersed evenly, then left at room temperature to solidify. Plates were then incubated at 37 °C for ~16 h to permit lysis to proceed to completion. The entire sloppy agar was collected and centrifuged (12,000g, 2 min) and 3 ml of supernatant was filtered with 0.22-µm filter column to remove bacteria.

Phage titration

Bacterial strains were grown overnight at 37 °C until OD600 reached 2–3. Twenty microlitres of bacteria was added to 3 ml sloppy agar. Three millilitres of sloppy bacteria culture was poured onto solid TK bottom agar plates. After incubation at room temperature for 15 min, 3 µl of the phage dilutions (\(10^{1}\)- to \(10^{8}\)-fold dilutions) were dropped on the surface of the solidified sloppy agar. Once the drops dried, plates were incubated overnight at 37 °C. Visible plaques were counted for the individual drops. Titres (in PFU per ml) were calculated according to \(\mathrm{PFU\ per\ ml}=N\times \frac{1}{\mathrm{DF}}\times \frac{1}{V}\), where N is the number of plaques, DF is the phage dilution factor and V is the volume of phage dilution pipetted on the plate.
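As a worked illustration of the titre formula, the following Python sketch computes PFU per ml from a plaque count. It assumes that DF is expressed as a fraction (for example, 10⁻⁶ for a 10⁶-fold dilution), so that 1/DF equals the fold dilution; the function name and the example plaque count are hypothetical and are not data from this study.

```python
def phage_titre(plaques, fold_dilution, volume_ml):
    """Return PFU per ml = N * (1 / DF) * (1 / V).

    Assumption: DF is the fractional dilution, so 1 / DF is the fold
    dilution of the spotted sample (e.g. 1e6 for a 10^6-fold dilution).
    """
    if fold_dilution <= 0 or volume_ml <= 0:
        raise ValueError("fold_dilution and volume_ml must be positive")
    return plaques * fold_dilution / volume_ml


# Hypothetical example: 12 plaques in a 3 µl (0.003 ml) drop of a 10^6-fold dilution.
titre = phage_titre(plaques=12, fold_dilution=1e6, volume_ml=0.003)
print(f"{titre:.2e} PFU per ml")
```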

Fitness analysis

Kinetic growth curves

Kinetic growth (OD600) curves were obtained by monitoring strain growth within a BioTek Synergy HT plate reader. Each strain was grown in triplicate in 96-well flat-bottom plates, in 150 µl LB and in 150 µl M9, incubated at 34 °C for intermediate λ-Red+ strains and 37 °C for final strains, with shaking at 225 rpm, for 16–36 h, and absorbance at 600 nm was read at 10-min intervals. In preparation, strains were grown to confluence in LB, diluted 1:15 into 150 µl of M9 minimal medium and inoculated 1:100 into 150 µl LB and M9. Auxotrophic strains revealed no growth in M9.

OD600 calibration

The absorbance obtained by the BioTek Synergy HT plate reader was recalibrated to OD600 (absorbance at 600 nm through a 1 cm path length) using a standard curve \(y=2.746x+1.878{x}^{2}\). The OD600 of an overnight LB culture of MG1655 was measured with a Biochrom Libra S4 Spectrophotometer at 600 nm wavelength in a semi-micro cuvette (1 cm path length) after 1:10 dilution of culture into 1 ml LB. To generate the calibration curve, a series of cultures with OD600 ranging from 0 to 6 were prepared by diluting in LB medium. These cultures were then measured by the BioTek Synergy HT plate reader, with the same settings as growth cultures. The average values were then fitted to a polynomial standard curve for recalibration. The effects of medium evaporation in plate wells were not considered.
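The recalibration can be sketched in Python as below. The sketch reads x as the raw plate-reader absorbance and y as the recalibrated OD600, and it assumes the polynomial was fitted without a constant term because the reported curve has none. Both function names are hypothetical, and the least-squares routine is an illustrative reconstruction rather than the fitting procedure used in the study.

```python
def plate_reader_to_od600(absorbance):
    """Apply the reported standard curve y = 2.746x + 1.878x^2."""
    return 2.746 * absorbance + 1.878 * absorbance ** 2


def fit_quadratic_through_origin(xs, ys):
    """Least-squares fit of y = a*x + b*x^2 (no intercept); returns (a, b)."""
    sxx = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    det = sxx * sx4 - sx3 ** 2
    if det == 0:
        raise ValueError("need at least two distinct non-zero x values")
    a = (sxy * sx4 - sx3 * sx2y) / det
    b = (sxx * sx2y - sx3 * sxy) / det
    return a, b


# Synthetic check: points generated from the reported curve recover its coefficients.
readings = [0.1, 0.3, 0.5, 0.8, 1.0]
references = [plate_reader_to_od600(x) for x in readings]
print(fit_quadratic_through_origin(readings, references))
```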

Doubling time and MaxOD calculation

The recalibrated growth curve was used to calculate the doubling time and MaxOD. Linear fitting of log2 OD600 was performed using a sliding-window method67 in the early log phase, with a window size of 50 min for rich medium and 100 min for M9 minimal medium. The doubling time was calculated as the reciprocal of the slope. MaxOD was obtained within the 36 h growth period. At least four individual biological samples were analysed per condition (n ≥ 4).
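A minimal Python sketch of this calculation follows. It assumes evenly spaced reads and takes the steepest window as the early-log-phase slope; that selection rule, the function names and the synthetic growth curve are assumptions made for illustration and are not taken from the cited sliding-window method.

```python
import math


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def doubling_time(times_min, od600, window_min):
    """Doubling time (min) as the reciprocal of the steepest log2(OD600) slope.

    Assumes evenly spaced time points and strictly positive OD600 values.
    """
    interval = times_min[1] - times_min[0]
    points = int(round(window_min / interval)) + 1
    if points < 2 or len(times_min) < points:
        raise ValueError("not enough time points for the requested window")
    log2_od = [math.log2(value) for value in od600]
    slopes = [
        linear_slope(times_min[i:i + points], log2_od[i:i + points])
        for i in range(len(times_min) - points + 1)
    ]
    best = max(slopes)
    if best <= 0:
        raise ValueError("no growth detected")
    return 1.0 / best


def max_od(od600):
    """Highest recalibrated OD600 observed in the growth period."""
    return max(od600)


# Synthetic curve read every 10 min with a true doubling time of 30 min.
times = list(range(0, 210, 10))
ods = [0.01 * 2 ** (t / 30) for t in times]
print(doubling_time(times, ods, window_min=50), max_od(ods))
```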

nsAA incorporation assays

Plasmid construction

Gene fragments for aaRS and tRNA were synthesized by Twist Bioscience and cloned into expression vectors by Golden Gate assembly. Plasmids were sequence verified by whole-plasmid sequencing (Plasmidsaurus or Quintarabio). All cloning was performed in Mach1 cells (Thermo Fisher, C862003).

Incorporation of BocK and pAcF into proteins

Recoded strains were transformed with OTS-reporter plasmids by standard electroporation protocols. Electroporated strains were recovered in 2 ml LB or SOC for at least 2 h before plating onto LB agar plates with kanamycin (50 µg ml−1) and incubated at 37 °C overnight. Three single colonies from each plate were picked and grown in 800 µl LB supplemented with kanamycin (50 µg ml−1) in a 96 deep-well plate sealed with a Breathe-Easy film (Sigma-Aldrich) and incubated at 37 °C with shaking at 220 rpm for 20–24 h. For nsAA incorporation at UAG and/or UGA, after overnight growth the cultures were back-diluted 1:50 into a clear-bottom black 96-well plate (Costar) in a total of 150 µl of LB supplemented with kanamycin (50 µg ml−1), aTc (100 ng ml−1), l-arabinose (0.05% w/v), 1 mM pAcF and/or 10 mM BocK. Cell growth (absorbance at 600 nm) and GFP fluorescence (excitation 485 nm, emission 525 nm, gain 70, bottom measurement) were measured in a BioTek Synergy H1 plate reader (Agilent) for 24 h at 10 min intervals with linear shaking. Data were analysed with a custom Python script. All reported GFP fluorescence was normalized to ELP–3×TAC–GFP control fluorescence within each corresponding condition.
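The custom analysis script is not reproduced here, but the normalization it performs can be sketched as follows. The sketch divides fluorescence by OD600, divides that by the ELP–3×TAC–GFP control under the same condition, and then expresses the result as a fold change relative to the culture grown without nsAAs. All function names and numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def normalized_fluorescence(gfp, od600, control_gfp, control_od600):
    """Fluorescence/OD600 divided by the 3xTAC control fluorescence/OD600."""
    if od600 <= 0 or control_od600 <= 0 or control_gfp <= 0:
        raise ValueError("OD600 and control fluorescence must be positive")
    return (gfp / od600) / (control_gfp / control_od600)


def fold_change(with_nsaa, without_nsaa):
    """Mean normalized fluorescence with nsAAs relative to no nsAAs."""
    return mean(with_nsaa) / mean(without_nsaa)


# Hypothetical triplicates: (GFP, OD600) for the reporter and for the 3xTAC control.
plus_nsaa = [
    normalized_fluorescence(5000, 1.0, 20000, 2.0),
    normalized_fluorescence(5200, 1.0, 20000, 2.0),
    normalized_fluorescence(4800, 1.0, 20000, 2.0),
]
minus_nsaa = [
    normalized_fluorescence(500, 1.0, 20000, 2.0),
    normalized_fluorescence(520, 1.0, 20000, 2.0),
    normalized_fluorescence(480, 1.0, 20000, 2.0),
]
print(f"fold change: {fold_change(plus_nsaa, minus_nsaa):.1f}")
```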

Expression of ELP–3×(TGA-TAG)–GFP with nsAAs for mass spectrometry

Single colonies of rEc∆2.∆A.B3 and rEc∆2.∆A.B3.tW* (Ochre) containing the dual-OTS dual reporter plasmid were inoculated in 2 ml LB with 50 µg ml−1 kanamycin in a 14 ml Falcon tube and grown overnight at 37 °C with shaking at 220 rpm. After overnight growth the cultures were diluted 1:100 in 25 ml LB-kanamycin and grown to an OD600 of 0.6, at which point the cultures were supplemented with 1 mM pAcF and 10 mM BocK and induced with 100 ng ml−1 aTc and 0.05% w/v l-arabinose. The cultures were grown at 37 °C overnight, after which the cells were collected by centrifugation at 3,200g for 20 min in a 50 ml centrifuge tube and stored at −80 °C until protein purification.

Protein mass spectrometry

ELP–GFP reporter protein purification

Frozen E. coli cell pellets were thawed on ice and lysed by sonication in lysis buffer consisting of 50 mM Tris-HCl (pH 7.4, 23 °C), 500 mM NaCl, 0.5 mM EGTA, 1 mM DTT, 10% glycerol, 50 mM NaF and 1 mM Na3VO4. The extract was clarified with two rounds of centrifugation performed for 20 min at 4 °C and 14,000g. Cell-free extracts were applied to Ni-NTA metal affinity resin and purified according to the manufacturer’s instructions. Wash buffers contained 50 mM Tris pH 7.5, 500 mM NaCl, 0.5 mM EGTA, 1 mM DTT, 50 mM NaF, 1 mM Na3VO4 and 10 mM imidazole. Proteins were eluted with a wash buffer containing 250 mM imidazole. Eluted protein was subjected to 4 rounds of buffer exchange (20 mM Tris pH 8.0 and 100 mM NaCl) and concentrated using a 30 kDa molecular weight cut-off spin filter (Amicon).

Protein digestion and mass spectrometry

Affinity purified, buffer exchanged ELP–GFP reporter protein, or whole cell lysates, were digested and analysed by mass spectrometry as described previously with some modifications55,68,69. ELP–GFP reporter protein (5–10 μg) was diluted with water and 20% SDS for a final volume of 115 μl and final concentration of 1% SDS. Samples were denatured for 15 min at 55 °C in a heat block. Reduction and alkylation of cysteines were performed with TCEP and 2-chloroacetamide (CAM) using final TCEP and CAM concentrations of 10 mM and 44 mM, respectively. The reduction-alkylation reaction proceeded for 20 min at 55 °C. 6 μl of 50 mg ml−1 SP3 beads (Speed Bead, Cytiva) pre-washed and resuspended with water were added to samples for a final working volume of 134 μl. Binding of protein to the beads was induced by adding 150 μl of 100% ethanol. The binding mixture was incubated in an Eppendorf ThermoMixer at 24 °C for 10 min at 1,400 rpm. After binding, the beads were magnetized on a magnetic rack and supernatants were removed. This was followed by three rounds of bead washes with 500 μl of 80% ethanol per wash. All traces of 80% ethanol were removed after the last wash. Beads in each sample were resuspended with 50 μl of digestion solution containing 0.4 μg of sequencing grade trypsin (Promega) in 50 mM triethylammonium bicarbonate (TEAB) buffer (Sigma). Digests were incubated for 16 h at 37 °C in a ThermoMixer at 1,400 rpm. Beads were magnetized and 50 μl supernatants were moved to fresh tubes. Beads were resuspended in 50 μl of 50 mM TEAB buffer and incubated for 5 min at 37 °C in a ThermoMixer at 1,400 rpm to maximize peptide recovery. Beads were magnetized again and the two 50 μl supernatants were combined. Peptides were dried in a vacuum centrifuge at room temperature. Dried peptides were reconstituted in 2/98 acetonitrile/water with 0.1% formic acid and analysed by LC–MS/MS.
LC–MS/MS was performed using a Vanquish Neo UHPLC system (Thermo) and an Orbitrap Eclipse Tribrid Mass Spectrometer (Thermo). The analytical column was a 75 μm inner diameter fused silica capillary tube (Molex) packed in-house to a length of 15 cm with 1.9 μm ReproSil-Pur 120 Å C18-AQ (Dr. Maisch) using methanol as the packing solvent. The column was attached to a PepSep Spray Adapter with a fused silica emitter (Bruker). Peptide separation was achieved using mixtures of 0.1% formic acid in water (solvent A) and 0.1% formic acid in acetonitrile (solvent B) with a 41-min gradient: 0/5, 30/30, 39/45, 40/55, 41/100 (time (min)/B (%), linear ramping between steps). The gradient was run at a flow rate of 300 nl min−1. At least one blank injection (5 μl 2% B) was performed between samples to eliminate peptide carryover on the analytical column. One-hundred nanomoles of trypsin-digested BSA and 100 ng of trypsin-digested HeLa protein standard were run periodically between samples as quality control standards. The mass spectrometer was operated with the following parameters: (MS1) 60,000 orbitrap resolution, 250% normalized AGC target, 50 ms maximum injection time, 300–1,400 m/z scan range; (data-dependent MS2) ion trap detector, 200% normalized AGC target, 13 ms maximum injection time, top 10 mode, 1.2 m/z isolation window, 30% normalized HCD collision energy, 40 s dynamic exclusion. Data were searched using MaxQuant version 1.6.10.43 with deamidation (NQ), oxidation (M) and phospho (STY) as variable modifications and carbamidomethyl (C) as a fixed modification, with up to 3 missed cleavages, a minimum peptide length of 5 amino acids and a 1% false discovery rate, against a modified UniProt E. coli database containing custom MS-READ reporter proteins. MS-READ search results were analysed using MaxQuant and Perseus version 1.6.2.2.
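The gradient programme above is a piecewise-linear function of time, so the solvent B fraction at any point in the run can be recovered by interpolating between the listed breakpoints. The following is a minimal sketch of that calculation; the function name percent_b is hypothetical and is not part of any instrument software.

```python
from bisect import bisect_right

# Gradient breakpoints as listed in the text: (time in min, solvent B in %).
GRADIENT = [(0, 5), (30, 30), (39, 45), (40, 55), (41, 100)]


def percent_b(t_min, steps=GRADIENT):
    """Return %B at time t_min, ramping linearly between breakpoints."""
    times = [t for t, _ in steps]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the programmed gradient")
    # Index of the last breakpoint at or before t_min.
    i = bisect_right(times, t_min) - 1
    if i == len(steps) - 1:  # exactly at the final breakpoint
        return float(steps[-1][1])
    (t0, b0), (t1, b1) = steps[i], steps[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, percent_b(15) falls halfway along the first ramp (5% to 30% B over 30 min) and evaluates to 17.5% B.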

Protein yield quantification

Protein expression

Precultures were inoculated from fresh colonies picked from selective LB agar plates (LB plates + antibiotics at 37 °C for 24 h) into 2 ml TB with 50 µg ml−1 kanamycin and grown overnight at 37 °C with shaking at 220 rpm. Overnight precultures were diluted to OD600 of 0.1 in 20 ml TB with 50 µg ml−1 kanamycin and grown until OD600 = 0.6–0.8. pAcF was added to a final concentration of 1 mM. Reporter and OTS expression was induced with 100 ng ml−1 aTc and 0.05% arabinose. Cells were collected 8 h post-induction by centrifugation at 4,000g and the supernatant was removed. Cell pellets were resuspended in 1 ml wash buffer (50 mM Tris-HCl, 150 mM NaCl, pH 7.5), transferred to 1.7 ml tubes and centrifuged at 17,000g. The wash buffer was then removed and the cell pellet stored at −80 °C until further use.

Lysis and extraction

Pellets were resuspended in a sufficient volume of wash buffer to normalize OD600 to 10. Cells (1 ml) were lysed using 1× BugBuster (Sigma-Aldrich; diluted from 10× stock) and centrifuged at 16,000g to isolate the soluble fraction.
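The normalization step is a conservation-of-cells calculation (OD × volume stays constant). The sketch below illustrates it under two assumptions that the text does not state: that OD600 is measured on the culture before harvest and scales linearly with cell density, and that the 10× reagent is added directly to the cell suspension. Both function names are hypothetical.

```python
def resuspension_volume_ml(culture_od600, culture_volume_ml, target_od600=10.0):
    """Volume of buffer that brings the harvested cells to target_od600.

    Assumes OD600 x volume is conserved and pellet volume is negligible.
    """
    return culture_od600 * culture_volume_ml / target_od600


def stock_volume_ml(sample_volume_ml, stock_fold=10.0, final_fold=1.0):
    """Volume of concentrated stock to add so the mixture is at final_fold.

    Solves stock_fold * v = final_fold * (sample_volume_ml + v) for v.
    """
    return sample_volume_ml * final_fold / (stock_fold - final_fold)
```

Under these assumptions, a 20 ml culture at OD600 = 4.0 would be resuspended in 8 ml, and 1 ml of cells would need roughly 0.11 ml of 10× stock to reach 1×.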

Quantifying GFP concentration

GFP quantification methods were adapted from previous work70. An ELP–sfGFP standard with an N-terminal 6×His tag (containing no in-frame stop codon and cloned in a pET28a vector) was expressed in BL21. The ELP–sfGFP standard was purified using a Ni-NTA column and the protein was quantified using a Pierce BCA (bicinchoninic acid) Protein Assay kit (Thermo Fisher) using established methods68. A GFP standard curve was produced by spiking the purified GFP standard into lysate prepared from Ochre that contained no GFP. GFP was spiked in at 28, 18.6, 12.5, 8.3, 5.5, 3.7, 2.5, 1.6, 1.1, 0.7, 0.5 and 0.3 µg ml−1 to generate triplicate, linear standard GFP curves. Lysates of different GFP-expressing strains were prepared in a similar fashion and dilutions were fit to the linear range of the standard curve. Fluorescence of clarified lysates was measured in an H1M BioTek plate reader (excitation: 485 nm, emission: 510 nm, gain: 60) alongside the GFP standard. Fluorescence values were converted to µg ml−1 and used to calculate total yields of expressed protein.
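The conversion from fluorescence to concentration can be illustrated with an ordinary least-squares line through the standards. This is a minimal sketch, not the authors' analysis pipeline: the fluorescence readings are synthetic values generated from an arbitrary slope and blank, and the function names and the dilution_factor parameter are hypothetical.

```python
# Standard concentrations from the text, in ug/ml.
STANDARDS_UG_PER_ML = [28, 18.6, 12.5, 8.3, 5.5, 3.7, 2.5, 1.6, 1.1, 0.7, 0.5, 0.3]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluorescence_to_conc(reading, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for any lysate dilution."""
    return (reading - intercept) / slope * dilution_factor


# Synthetic readings (NOT measured data): an ideal line with an arbitrary
# slope of 1500 RFU per ug/ml and an arbitrary blank of 200 RFU.
readings = [1500.0 * c + 200.0 for c in STANDARDS_UG_PER_ML]
slope, intercept = fit_line(STANDARDS_UG_PER_ML, readings)
```

With these synthetic values, a lysate diluted fivefold that reads 15,200 RFU would correspond to about 50 µg ml−1 in the undiluted sample. Real readings should only be converted when they fall within the linear range spanned by the standards.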

Statistical analysis

Statistical significance was typically assessed using two-tailed unpaired t-tests. P values for mCherry–YFP readthrough assays (Fig. 2e,f and Extended Data Fig. 5e) were calculated using Welch’s unpaired t-test by comparison to rEc∆1.∆A within each paired codon and suppressor condition. Error bars display s.e.m., n = 3. P values for phage infection assays (Fig. 3) and tRNATrp UGA suppression assays (Fig. 4c,d) were derived from unpaired t-tests by comparison to E. coli MG1655 within each graph. Error bars display 95% confidence interval, n = 3. P values for tRNATrp UGA suppression assays (Fig. 4c) were derived from unpaired t-tests in relation to no Mj-TyrOTS plasmid within each strain. Error bars display 95% confidence interval, n = 3. P values for strain doubling times and maximum OD600 were calculated using Mann–Whitney U-tests comparing strain values with those of rEc∆1.∆A within each graph. Error bars display 95% confidence interval, n = 7 or 11 replicates (Fig. 5), n = 8 (Extended Data Fig. 3, LB) or n = 4 (Extended Data Fig. 3, 2×YT and TB) replicates. P values in ELP–GFP fluorescence assays assessing nsAA incorporation efficiencies (Fig. 6b–d and Extended Data Fig. 9c–e) were derived from unpaired t-tests comparing data to rEc∆1.∆A within each nsAA condition. Error bars display 95% confidence interval, n = 3. P values for RF2 complementation fluorescent protein production (Extended Data Fig. 4f and Supplementary Fig. 3a) compare fluorescence of GFP expressed from RF2 B2(P205) complemented strains to fluorescence of B1(wild type (S205)) strains with a similar genotype and vanillic acid concentration using unpaired t-tests (n = 3). Error bars display s.e.m. P values comparing RF2.B1 and B2 TGA readthrough (Supplementary Fig. 3b) for various vanillic acid concentrations compared to 0.1 µM vanillic acid within each strain release factor variant were calculated by Welch’s t-tests (n = 3). Error bars display s.e.m.
Calculations and graphs were generated using GraphPad Prism (v.10.1.0). Figures were prepared using Adobe Illustrator 2023 (v.28.2). All measurements were taken from distinct samples except when measured repeatedly over time to collect time courses.
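For readers unfamiliar with the quantities reported above, the following minimal sketch shows how the s.e.m. and the Welch t statistic with its Welch–Satterthwaite degrees of freedom are computed from two small groups. It is an illustration only, not the analysis performed in Prism: the function names are hypothetical, and the final step of converting t and the degrees of freedom into a two-tailed P value (which requires the t distribution) is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def sem(values):
    """Standard error of the mean, using the sample (n - 1) variance."""
    return sqrt(variance(values) / len(values))


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to share
    a common variance.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

With n = 3 per group, as in most assays here, the degrees of freedom fall between 2 and 4 depending on how unequal the two variances are.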

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.