Synthetic cross-phyla gene replacement and evolutionary assimilation of major enzymes


The ability of DNA to produce a functional protein even after transfer to a foreign host is of fundamental importance in both evolutionary biology and biotechnology, enabling horizontal gene transfer in the wild and heterologous expression in the lab. However, the influence of genetic particulars on DNA functionality in a new host is poorly understood, as are the evolutionary mechanisms of assimilation and refinement. Here, we describe an automation-enabled large-scale experiment wherein Escherichia coli strains were evolved in parallel after replacement of the genes pgi or tpiA with orthologous DNA from donor species spanning all domains of life, from humans to hyperthermophilic archaea. Via analysis of hundreds of clones evolved for 50,000+ cumulative generations across dozens of independent lineages, we show that orthogene-upregulating mutations can completely mitigate fitness defects that result from initial non-functionality, with coding sequence changes unnecessary. Gene target, donor species and genomic location of the swap all influenced outcomes—both the nature of adaptive mutations (often synonymous) and the frequency with which strains successfully evolved to assimilate the foreign DNA. Additionally, time series DNA sequencing and replay evolution experiments revealed transient copy number expansions, the contingency of lineage outcome on first-step mutations and the ability for strains to escape from suboptimal local fitness maxima. Overall, this study establishes the influence of various DNA and protein features on cross-species genetic interchangeability and evolutionary outcomes, with implications for both horizontal gene transfer and rational strain design.

Fig. 1: Gene swap strain construction and laboratory evolution.
Fig. 2: Evolutionary outcomes.
Fig. 3: Evolved strain mutations.
Fig. 4: Mutational mechanisms of orthogene assimilation.
Fig. 5: Adaptive dynamics.

Data availability

The genome sequence data that support the findings of this study are available from ALEdb under project name ‘SvNS’.

Code availability

AfterQC, the software used to trim and filter DNA-seq reads, is available at Breseq, the software used to identify mutations, is available at Co, the software used to edit genome references (, is available at




Acknowledgements

This work was supported by the Novo Nordisk Foundation (grant no. NNF10CC1016517) and in part by the National Institutes of Health (grant no. R01GM057089). T.E.S. was supported in part by the National Science Foundation Graduate Research Fellowship grant no. DGE-1144086. We thank E. Brunk, E. Catoiu, M. Omar Din, C. Olson, M. Wu and Y. Hutchison for useful advice and discussions. We thank A. Feist for making automated evolution machines available for the experiments performed in this study.

Author information

Contributions

T.E.S. and B.O.P. conceived the project and wrote the manuscript. T.E.S. and R.S. designed and constructed the strains. P.V.P. assisted with genome sequencing. T.E.S., R.S., P.V.P. and B.O.P. aided in data analysis.

Corresponding author

Correspondence to Bernhard O. Palsson.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Orthogene properties.

a, GC content of native and donor gene sequences (GC% total in parentheses). b, Histogram of the change in codon usage resulting from replacement of native E. coli sequences with foreign versions. c, Change in protein’s amino acid usage resulting from replacement of native sequences with foreign versions.
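
As a rough illustration of the quantities named in panels a and b, the sketch below computes GC content and the per-codon change in usage frequency between a native and a donor coding sequence. It is not the authors' analysis code: the function names and the two toy sequences are hypothetical, and the exact codon usage metric used in the study is not stated here, so a simple difference in relative codon frequency is assumed.

```python
from collections import Counter
from typing import Dict


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_frequencies(cds: str) -> Dict[str, float]:
    """Return the relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def codon_usage_change(native: str, donor: str) -> Dict[str, float]:
    """Return donor minus native codon frequency for every codon seen in either sequence."""
    native_freq = codon_frequencies(native)
    donor_freq = codon_frequencies(donor)
    all_codons = sorted(set(native_freq) | set(donor_freq))
    return {c: donor_freq.get(c, 0.0) - native_freq.get(c, 0.0) for c in all_codons}


# Hypothetical toy sequences (not real pgi or tpiA sequences).
native_cds = "ATGAAAGCGTAA"
donor_cds = "ATGAAGGCCTAA"

print(f"native GC: {gc_content(native_cds):.3f}")
print(f"donor GC:  {gc_content(donor_cds):.3f}")
for codon, delta in codon_usage_change(native_cds, donor_cds).items():
    print(f"{codon}: {delta:+.2f}")
```

Plotting the values returned by `codon_usage_change` as a histogram would give a figure in the spirit of panel b, under the assumptions stated above.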

Extended Data Fig. 2 Evolutionary trajectories.

Fitness improvements over the course of evolution for knockout controls and gene-swapped strains, with failure lineages indicated by dotted lines. ΔtpiA controls were increased from four to ten to provide more comparison lineages for the single HsaTpi failure.

Extended Data Fig. 3 Orthogene impact on strain fitness.

a, Growth rates of various pgi- and tpiA-swap evolved strains before and after knockout of the orthogene. b, Enzyme activity levels of various strains, determined by colorimetric assay. No bar indicates an activity level below detection of the assay. c, Growth rates at various AHL concentrations of pgi or tpiA knockout strains containing plasmids with AHL-inducible expression of the H. sapiens pgi or tpiA. Growth rates higher than knockout levels even with no AHL induction may be due to leaky plasmid expression. In all panels strain names correspond with those given in the Supplementary Dataset containing DNA sequencing data, and error bars represent standard deviation from quadruplicate measurements. Orthogene-assimilation failures: PaePgi 26.93, HsaPgi 17.83, HsaTpi 8.34. Orthogene-assimilation successes: VchPgi 29.76, BmePgi 13.86, HsaPgi 20.71, VchTpi 25.65, BmeTpi 13.105, PaeTpi 23.107, HsaTpi 3.96.

Extended Data Fig. 4 Recurring mutations highlight regions under selection.

a, Mutations to crp and the C-terminus of rpoA were highly characteristic of pgi failures and knockout controls. Mapping to the cryoEM structure of the transcription activation complex (PDB ID: 6B6H) reveals that these characteristic mutations cluster in the same spatial region. b, Minimum free energy (MFE) structure at 37 °C of tpiA transcript for HsaTpi swap, with observed ALE endpoint mutations. The coding sequence changes destabilize G-C rungs of the strongest stem-loop, while the 5’-UTR +A insertion destabilizes this same stem-loop via increased stabilization of a stem-loop-adjacent unstructured region.

Extended Data Fig. 5 Various pgi transcripts and mutations.

a, SNP accumulation in pgi across 924 Escherichia coli strain variants with whole genome sequences available. b, The only pgi differences in E. coli and various E. albertii strains within the first 120 bp of transcript are minor 5’-UTR changes and a number of synonymous mutations.

Extended Data Fig. 6 Orthogene assimilation in the presence of knockout-characteristic mutations.

Even in the presence of one or more Δpgi-characteristic mutations a lineage could still successfully assimilate the orthogene, with added evolutionary time facilitating but not guaranteeing this outcome.

Extended Data Fig. 7 Continuation ALEs.

a, All four HsaPgi failure lineages were given additional evolutionary time, and most reached success. b, Additional evolutionary time did not enable success for any of four continued PaePgi failure lineages. c, A single BmePgi midpoint clone was isolated with two knockout-characteristic mutations (to rpoA and nusA) and no orthogene changes. Independent lineages founded from this clone all eventually acquired orthogene mutations and higher growth rates.

Extended Data Fig. 8 Orthogene copy number expansions.

a, All orthogene copy number expansions found in pgi swap ALE strains. Homology between flanking genes or repetitive extragenic palindromic (REP) sequences facilitated the expansions. b, Relative growth rates on various carbon sources for HsaTpi strains genetically identical except for the size of tpiA duplication. Inclusion of the rrnB-rrnE region hindered growth on glucose, while the rrnC-rrnA expansion hindered glycerol growth. Error bars represent standard deviation from quadruplicate measurements.

Extended Data Fig. 9 Replay ALEs.

a, Two distinct HsaPgi strains with the same gnd-containing genome duplication but different orthogene mutations were split into lineages and evolved, which could lead to collapse of the gnd duplication. b, An HsaPgi strain that had acquired a genomic copy number increase of the orthogene was used to found four lineages. Evolution resulted in copy number remaining stable or increasing. c, A BmePgi strain with a synonymous orthogene SNP (A5A) was used to found three lineages. Evolution enabled further orthogene-upregulating mutations to increase growth rate, and one lineage acquired a second synonymous SNP, leaving two synonymous SNPs as the only coding sequence changes to the foreign DNA.

Extended Data Fig. 10 Method of scarless strain construction.

The pgi swap strains were constructed as shown. The tpiA swap strains used a construct lacking the foreign gene start homology, I-SceI site, and antibiotic cassette; double stranded DNA breaks were induced by CRISPR-targeting to native gene sequence, obviating the need for antibiotics.

Supplementary information

Supplementary Information

Supplementary Tables 1 and 2.

Reporting Summary

Supplementary Data

Evolved strain mutations

About this article

Cite this article

Sandberg, T.E., Szubin, R., Phaneuf, P.V. et al. Synthetic cross-phyla gene replacement and evolutionary assimilation of major enzymes. Nat Ecol Evol 4, 1402–1409 (2020).
