Even our most audacious feats of genetic engineering, such as reprogramming entirely new biosynthetic pathways into a cell, are generally constrained by strict adherence to the linguistic laws of the genome. Indeed, the rules regarding which three-nucleotide codon within a gene designates a given amino acid in the resulting protein are near universal.

Two new studies from George Church's lab at Harvard Medical School demonstrate important strides in taking genome modification to another level by rewriting the genetic dictionary. “We want to know whether we can fundamentally change the properties of an organism,” says Marc Lajoie, a graduate student in Church's group. He cites a number of potential benefits. For example, cells might be modified to reinterpret certain codons as a sign to incorporate unnatural amino acids that introduce new properties to the recipient protein. Furthermore, recombinant organisms that speak a different codon 'language' would be invulnerable to pathogens seeking to hijack their protein synthesis machinery and unable to cause havoc in the environment by exchanging genetic material with natural species.

In the first study, Church's group partnered with Farren Isaacs at Yale University to conduct genome-wide substitution in the bacterium Escherichia coli (Lajoie et al., 2013a), using targeted gene-editing techniques to eliminate every instance of one of the three 'stop' codons (UAG) by replacing it with a different stop codon (UAA). This allowed them to delete the gene encoding the release factor protein that specifically terminates translation at UAG codons, and the resulting bacteria were not only viable but also less vulnerable to bacteriophages that still rely on this stop codon.
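
The substitution itself is simple to state in code. The following Python sketch is purely illustrative and is not the authors' method, which edited the genome of living cells; the gene names and sequences are invented, and the function `swap_amber_stop` is a hypothetical helper. It works on DNA, where the UAG and UAA codons of the messenger RNA appear as TAG and TAA.

```python
# Illustrative only: in-silico swap of the UAG stop codon (TAG in DNA)
# for the UAA stop codon (TAA in DNA). Gene names and sequences are invented.

def swap_amber_stop(cds: str) -> str:
    """Return the coding sequence with a terminal in-frame TAG replaced by TAA."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return cds[:-3] + "TAA" if cds.endswith("TAG") else cds

toy_genes = {
    "geneA": "ATGGCTTAG",     # ends in TAG: will be recoded
    "geneB": "ATGAAATGA",     # ends in TGA: left alone
    "geneC": "ATGCTAGGCTAA",  # TAG appears only out of frame: left alone
}

recoded = {name: swap_amber_stop(seq) for name, seq in toy_genes.items()}
remaining = [name for name, seq in recoded.items() if seq.endswith("TAG")]

for name, seq in recoded.items():
    print(name, toy_genes[name], "->", seq)
print("genes still ending in TAG:", remaining)
```

Only the in-frame stop codon is touched, so the encoded proteins are unchanged; once no gene ends in TAG, the cell no longer needs a factor that reads that codon.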

Church's group then explored just how much manipulation E. coli could withstand by generating a series of strains in which they completely eliminated 13 of the 64 possible codons from 42 different genes essential to survival (Lajoie et al., 2013b). Most amino acids can be designated by more than one codon (some by as many as six), and the researchers removed the 13 codons by replacing them with alternative, synonymous codons. In addition to eliminating these 'forbidden' codons, Lajoie and colleagues also made random synonymous substitutions at all of the remaining codons in these genes, wherever this could be achieved without altering the encoded amino acid sequence.
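
A minimal Python sketch of synonymous recoding follows, under stated assumptions: the forbidden set here is hypothetical (it is not the 13 codons chosen in the study), the toy sequence is invented, and the replacement rule simply picks the first permitted synonym rather than a random one. The codon table is the standard genetic code.

```python
# Illustrative only: replace "forbidden" codons with permitted synonyms.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))  # standard genetic code, '*' = stop

def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def remove_forbidden(cds: str, forbidden: set) -> str:
    """Swap each forbidden codon for the first permitted synonymous codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon in forbidden:
            options = [c for c in CODONS
                       if CODON_TABLE[c] == CODON_TABLE[codon] and c not in forbidden]
            if not options:
                raise ValueError(f"no permitted synonym for {codon}")
            codon = options[0]
        out.append(codon)
    return "".join(out)

forbidden = {"AGA", "AGG", "TTA"}   # hypothetical choice for this example
wild_type = "ATGAGATTAAGGTAA"       # invented gene: Met-Arg-Leu-Arg-stop
recoded = remove_forbidden(wild_type, forbidden)

print(wild_type, "->", recoded)
print("same protein:", translate(wild_type) == translate(recoded))
```

The check on the last line captures the constraint the researchers worked under: the DNA changes, but the protein it encodes does not.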

“Our first attempt was designed to fail,” says Lajoie. “We changed just about every third base, which means that on average these proteins had 65% identity with the wild-type DNA sequence but 100% identity in amino acid sequence.” In fact, most of the resulting bacterial strains proved robust against this broad genomic overhaul. For 26 of the 42 fully recoded genes, the bacteria remained viable, albeit with a lower growth rate, and nine other genes could be recoded throughout most but not all of their sequence. Only seven could not be recoded, possibly because of unforeseen effects on overlapping regulatory sequences or genes. However, all 42 genes proved amenable to the targeted elimination of just the 13 forbidden codons. “We didn't find any situation in which we couldn't change a forbidden codon to at least one permitted codon,” says Lajoie. In comparison to the heavy gene editing from the initial experiments, removal of just the forbidden codons had no effect on bacterial growth.
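
The arithmetic behind that figure can be checked on a toy example. The Python sketch below uses invented sequences and a hypothetical `identity` helper; it compares a short gene with a version in which the third base of every codon is changed wherever a synonym exists. Changing every third base across a whole gene would leave about two-thirds (roughly 67%) of the bases intact, close to the 65% Lajoie quotes; in the toy case the start codon ATG has no synonym, so only three of twelve bases change.

```python
# Illustrative only: nucleotide vs. amino-acid identity after synonymous recoding.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                       AMINO_ACIDS))  # standard genetic code

def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def identity(a: str, b: str) -> float:
    """Fraction of aligned positions that match in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

wild_type = "ATGGCTAAAGGT"  # invented: Met-Ala-Lys-Gly
recoded = "ATGGCCAAGGGC"    # third base changed in every codon except ATG

dna_identity = identity(wild_type, recoded)
protein_identity = identity(translate(wild_type), translate(recoded))

print(f"DNA identity: {dna_identity:.0%}")
print(f"Protein identity: {protein_identity:.0%}")
```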

These results suggest clear potential for the genome-wide repurposing of at least a subset of these codons to encode different amino acids in E. coli, and having 13 candidates to choose from should provide good insurance for future success as Church's team continues toward this goal. “Until we have a deterministic understanding of genomes, diversity and troubleshooting are going to be really important,” says Lajoie. “Expecting your first million designs to fail is a healthy plan, and designing your methods to accommodate that and turn failures into something that yields better designs in the future will be important.”