The generation of a chemical system capable of replication and evolution is a key objective of synthetic biology. This could be achieved by in vitro reconstitution of a minimal self-sustaining central dogma consisting of DNA replication, transcription and translation. Here, we present an in vitro translation system, which enables self-encoded replication and expression of large DNA genomes under well-defined, cell-free conditions. In particular, we demonstrate self-replication of a multipartite genome of more than 116 kb encompassing the full set of Escherichia coli translation factors, all three ribosomal RNAs, an energy regeneration system, as well as RNA and DNA polymerases. Parallel to DNA replication, our system enables synthesis of at least 30 encoded translation factors, half of which are expressed in amounts equal to or greater than their respective input levels. Our optimized cell-free expression platform could provide a chassis for the generation of a partially self-replicating in vitro translation system.
Self-encoded reproduction is a major requirement for the creation of artificial life1. In systems inspired by existing biochemistry, such as minimal protein-based cells (MPCs), self-replication demands a complete cell-free reconstitution of the central dogma of molecular biology, including DNA replication, transcription and translation2,3,4,5,6. In vitro protein synthesis from DNA can be achieved in well-defined recombinant systems based on phage RNA polymerases, core parts of the Escherichia coli translation machinery and a minimal energy regeneration system (PURE: Protein synthesis Using Recombinant Elements)7. In contrast, transcription–translation-coupled DNA replication (TTcDR) of a genome encoding all macromolecular components of the PURE system by a self-encoded replisome remains difficult8. DNA polymerases (DNAPs) from phages such as Phi29 are promising candidates for self-encoded TTcDR of minimal genomes9,10. For example, partially self-encoded TTcDR inside liposomes was accomplished using small linear Phi29 genomes encoding a minimal two-gene replicon on three kilobases (kb)11. TTcDR of the Phi29 full-length genome (∼19 kb) in a PURE-based system was also achieved, but only if sufficient amounts of replication factors were either supplied externally or co-expressed from an excess of non-replicative DNA templates11. TTcDR of small circular DNAs (2 kb) encoding only the Phi29-DNAP was recently realised by coupling the reaction to Cre-Lox recombination12. Despite these advances, a concurrent, self-encoded replication and expression of the up to 150 genes (113 kb) proposed for MPC self-replication3 is currently out of reach. Here, we describe a modified PURE reaction that enables direct co-expression and Phi29-DNAP-dependent TTcDR of large multicistronic DNA elements that reach the predicted genome size required to encode a minimal cell.
In particular, we demonstrate self-replication of a multipartite genome larger than 116 kb encompassing the full set of Escherichia coli translation factors, all three ribosomal RNAs, an energy regeneration system, as well as RNA and DNA polymerases. Parallel to DNA replication, our system enables synthesis of at least 30 encoded translation factors, half of which are expressed in amounts equal to or greater than their respective input levels.
PURErep enables self-encoded DNA replication
Initially, we tested self-encoded Phi29-DNAP-dependent TTcDR using the standard protocol of the commercially available PURExpress system. The Phi29-DNAP coding region flanked by a T7 promoter was first cloned into a pCR-Blunt TOPO vector (pREP, Fig. 1a). In principle, this construct should enable spontaneous RNA-primed rolling-circle replication13 by the self-encoded DNAP without additional replication proteins or externally supplied DNA primers as reported previously10. However, using the standard PURExpress reaction supplied with dNTPs and 4 nM pREP, we were unable to detect de novo synthesis of DNA by either agarose gel electrophoresis or qPCR (Fig. 1b, c). This finding is in agreement with previous studies which reported that the high tRNA and rNTP concentrations in standard PURE systems impair DNA-polymerase (DNAP) activity and that optimised custom systems are required to achieve efficient TTcDR10,11. In order to improve DNA replication without access to tailor-made PURE systems, we set out to optimise the PURExpress standard reaction protocol. To this end, we increased the relative amount of translation factors, ribosomes and reducing agent while decreasing tRNA and rNTP levels (Fig. 1d; Supplementary Table 1). Using this optimised PURE formulation (PURErep), we achieved, depending on the pREP input concentration, ∼5–12-fold replication of pREP monomer units in overnight TTcDR reactions (Fig. 1b, e). Full-length de novo synthesis of pREP was confirmed by MluI digestion of the replication product (Fig. 1c). Taking superfolder green fluorescent protein (sfGFP)14 expression as an overall measure for in vitro translation (IVT) activity, we found that the altered PURE formulation resulted in a batch-dependent reduction of protein synthesis yields of 20–40% compared with the TTcDR-incompetent PURExpress system (Supplementary Fig. 1A, B). 
Thus, the improved compatibility of the PURErep system with DNA replication is achieved at the expense of only a modest reduction in overall protein expression strength.
TTcDR products can be transformed and propagated in E. coli
A qPCR-based analysis of DNA replication revealed a robust doubling time of 1–2 h for different initial template concentrations with DNA replication proceeding even after 24 h at 30 °C (Fig. 1e). TTcDR of pREP was also sustainable for more than five successive generations of serial dilution when 4% of an overnight PURErep/pREP reaction was directly transferred into a fresh PURErep mix (Fig. 1f). This result implies that TTcDR products can serve as templates for self-coded DNA replication over several generations. As expected from the rolling-circle-type replication, we observed a considerable amount of product with low electrophoretic mobility, likely representing large molecular weight concatemers and/or DNA-MgPPi clusters as reported previously for similar reactions (Supplementary Fig. 1C)15. Unexpectedly, we also observed formation of ~5 kb products in unprocessed samples, suggesting that TTcDR reactions may produce considerable amounts of monomeric pREP copies (Supplementary Fig. 1C). We were also able to transform de novo synthesised products into E. coli after removal of parental plasmids (Supplementary Fig. 2A). Purified in vivo amplified products were identical in size to monomeric pREP (Supplementary Fig. 2B).
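The serial-dilution experiment can be put in numbers with a short calculation. The following Python sketch assumes that each transfer carries over exactly 4% of the previous reaction and that parental DNA is neither degraded nor replicated; the function name is illustrative and not part of the original protocol.

```python
def parental_carry_over(transfer_fraction: float, generations: int) -> float:
    """Fraction of the original template pool left after serial transfers.

    Assumes an ideal dilution series: every generation passes on the same
    volume fraction and the parental DNA itself is not amplified.
    """
    if not 0 < transfer_fraction <= 1:
        raise ValueError("transfer_fraction must be in (0, 1]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return transfer_fraction ** generations


# 4% transfer per generation, as in the serial-dilution experiment
for generation in range(6):
    print(generation, parental_carry_over(0.04, generation))
```

After five transfers the parental pool is diluted to 0.04^5, or about one part in ten million, so DNA detected at that stage would have to stem almost entirely from de novo synthesis.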
PURErep enables TTcDR of large multipartite genomes
Encouraged by the efficient TTcDR in PURErep, we set out to co-replicate a collection of genes coding for crucial components of the PURE reaction such as the 31 essential E. coli translation factors (TFs). To this end, we probed co-TTcDR of pREP (4.6 kb) together with each one of the three large plasmids pLD1 (30 kb, 13 TFs), pLD2 (20 kb, 8 TFs), or pLD3 (23 kb, 9 TFs), which were recently cloned to enable recombinant expression of 30 of the 31 TFs16. Indeed, the TTcDR products of all four plasmids (including pREP) showed MluI restriction patterns identical to those of clonal plasmids conventionally propagated in E. coli (Fig. 2a). Moreover, the pLD TTcDR products could be directly transformed into E. coli, where they were maintained as monomeric plasmids (demonstrated for pLD3, Supplementary Fig. 2C, D). The optimised PURErep mix even enabled the complete replication of all three pLD plasmids together with pREP in a one-pot reaction (Fig. 2b; Supplementary Fig. 3A, B).
Next, we sought to further expand the genetic load of the TTcDR-system by co-replicating plasmids encoding additional components of the PURE system such as EF-Tu (pEFTu), which is missing in the pLD system, and also the ribosomal RNA operon rrnB (prRNA), which encodes 23S rRNA, 16S rRNA, 5S rRNA and tRNA(Glu2)17 (Fig. 2c). qPCR experiments targeting plasmid-specific amplicons confirmed that monomer units of all six plasmids (total DNA length 93 kb) were replicated about 2–8-fold relative to their respective input levels in the presence of pREP and dNTPs after overnight incubation (Fig. 2c). In support of complete co-replication of all plasmids, transformations of DpnI-treated PURErep reaction products into E. coli resulted in colonies resistant to either zeocin (pREP), kanamycin (pLD plasmids and prRNA) or carbenicillin (pEFTu) (Fig. 2d). DNA preparations of 26 randomly picked clonal colonies followed by restriction pattern analysis indeed confirmed successful TTcDR of all six plasmids (Fig. 2e; Supplementary Fig. 3C–E). In contrast, almost no background colonies were obtained when samples from dNTP-free PURErep experiments were transformed into E. coli (Fig. 2d). Using the same approach (Fig. 3; Supplementary Fig. 4), we were able to demonstrate co-replication of five additional plasmids encoding all but one of the missing proteins of the PURE enzyme mix (Supplementary Table 2, except peptidylprolyl isomerase). The additional plasmids include the genes for a minimal nucleoside triphosphate regeneration system based on creatine kinase (pCKM), adenylate kinase (pAK1) and nucleoside diphosphate kinase (pNDK), as well as T7-RNA polymerase (T7RNAP) and pyrophosphatase (pIPP), which is added to more recent versions of the PURE system18. With a total size of 116.3 kb, this set of 11 plasmids reaches >100% of the predicted genome length proposed for a near-minimal, self-replicating system dependent only on small-molecule nutrients (Fig. 3a)3.
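As a quick plausibility check on the size comparison, the arithmetic can be written out. The sketch below uses only lengths quoted in the text (116.3 kb for the 11-plasmid set, 113 kb for the proposed minimal genome, and the rounded sizes of pREP and the pLD plasmids); the variable and function names are illustrative.

```python
# Rounded plasmid sizes in kb, as quoted in the text
PLASMID_KB = {"pREP": 4.6, "pLD1": 30.0, "pLD2": 20.0, "pLD3": 23.0}

TOTAL_SET_KB = 116.3        # full set of 11 plasmids
PROPOSED_GENOME_KB = 113.0  # proposed minimal genome length


def percent_of_target(total_kb: float, target_kb: float) -> float:
    """Express a total DNA length as a percentage of a target genome length."""
    return 100.0 * total_kb / target_kb


pld_plus_prep_kb = sum(PLASMID_KB.values())
print(round(pld_plus_prep_kb, 1))                                     # 77.6
print(round(percent_of_target(TOTAL_SET_KB, PROPOSED_GENOME_KB), 1))  # 102.9
```

The 11-plasmid set thus corresponds to roughly 103% of the proposed 113 kb genome, and pREP plus the three pLD plasmids add up to about 78 kb.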
PURErep enables synthesis of 30 TFs during TTcDR
Having shown combined TTcDR of the multicistronic plasmids that encode almost all proteins of the PURE enzyme mix, we explored whether the PURErep mix could also enable parallel expression of these genes during replication. A (partially) self-replicating system based on the central dogma needs to be able to regenerate at least some of its different protein components. As a first step in this direction, we focused on the multicistronic expression of the TFs encoded on the three pLD plasmids pLD1, pLD2 and pLD3 (not including pEFTu). To explore whether PURErep is generally capable of supporting multicistronic expression from these plasmids, we performed cell-free expression from each individual plasmid in the presence of BODIPY-Lys-tRNALys, which enables the fluorescent labelling of translation products at lysine residue sites. Using the reported expression patterns for affinity-purified TF ensembles from pLD overexpression experiments16, we could assign the majority of the de novo synthesised protein subunits to the respective TFs (Supplementary Fig. 5). To improve detection sensitivity and enable quantification of newly synthesised proteins, we also performed a mass spectrometry-based quantitative protein expression analysis using stable-isotope labelling19. For this purpose, we carried out PURErep in vitro experiments with each pLD plasmid with ¹⁵N₂¹³C₆-lysine as the sole source of lysine and ¹⁵N₄¹³C₆-arginine as the sole source of arginine. Using the unlabelled PURE-supplemented TFs as internal standards to determine the heavy-to-light (H/L) ratio of isotope-labelled peptides, we found strong evidence for the de novo synthesis of all pLD-encoded TF protein subunits in overnight reactions (Fig. 4a). In particular, we obtained H/L ratios close to or larger than one for 12 of the 13 TFs encoded on pLD1, implying full regeneration of most of the encoded proteins during IVT.
Partial or even full regeneration was also observed for the proteins encoded on both pLD2 and pLD3 (Fig. 4a).
Next, we probed multicistronic expression of all three pLD plasmids during parallel TTcDR induced by the addition of pREP. Despite the considerably increased synthetic burden (replication of a 78 kb genome and transcription/translation of 33 protein chains), we detected H/L ratios > 0.73 for 16 of the 32 encoded protein subunits. The H/L ratios of the remaining TF subunits indicated regeneration levels of 10–70% (N = 10) and 4–9% (N = 6) (Fig. 3b). Thus, even under non-optimised batch conditions, PURErep in combination with pREP enables both the complete replication of 32 pLD-encoded TF cistrons and the expression of about half of the encoded TF peptide chains in yields comparable to or exceeding their initial PURErep input concentrations.
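The grouping of subunits by H/L ratio described above can be sketched in a few lines of Python. This is only an illustration: the cut-offs (0.73 and 0.10) are taken from the ranges quoted in the text, the class labels are our own, and the example H/L values are hypothetical placeholders rather than measured data.

```python
from collections import Counter

HIGH_CUTOFF = 0.73          # "H/L ratios > 0.73" in the text
INTERMEDIATE_CUTOFF = 0.10  # lower edge of the 10-70% group (assumed boundary)


def regeneration_class(hl_ratio: float) -> str:
    """Assign an H/L ratio to one of three illustrative regeneration classes."""
    if hl_ratio < 0:
        raise ValueError("H/L ratios cannot be negative")
    if hl_ratio > HIGH_CUTOFF:
        return "high"
    if hl_ratio >= INTERMEDIATE_CUTOFF:
        return "intermediate"
    return "low"


def new_fraction_of_pool(hl_ratio: float) -> float:
    """Share of the total (heavy + light) protein pool that is newly made."""
    return hl_ratio / (1.0 + hl_ratio)


def summarise(hl_by_subunit: dict) -> Counter:
    """Count subunits per regeneration class."""
    return Counter(regeneration_class(v) for v in hl_by_subunit.values())


# Hypothetical H/L values for three unnamed subunits (not from the paper)
example = {"subunit_A": 1.2, "subunit_B": 0.35, "subunit_C": 0.05}
print(summarise(example))
```

Note that an H/L ratio of 1 means the newly synthesised protein equals the unlabelled input, i.e. half of the total pool is new.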
We demonstrated that under optimised TTcDR conditions, synthetic multipartite DNA genomes approaching the size of a postulated MPC genome can self-replicate and express proteins under cell-free conditions. Surprisingly, primer-free TTcDR by Phi29-DNAP alone is already sufficient to generate a significant amount of monomeric replication products from circular plasmids without any enzymatic post-processing, suggesting that partially recursive genome replication in cell-free systems can be achieved with only a single DNA polymerase. Furthermore, both the monomeric and concatemer TTcDR products can be directly transformed into E. coli where they propagate as authentic copies of their parental plasmid presumably after re-circularisation by intramolecular homologous recombination20.
PURErep enables the self-encoded replication of plasmid ensembles with a total DNA length that exceeds both the size of a proposed minimal genome for a self-replicating translatome3 and that of the smallest known bacterial genome (Nasuia deltocephalinicola, 112 kb)21. Currently, most of the space in our multipartite model genome is taken up by the plasmid backbones, which maintain compatibility with in vivo propagation. In future genome designs, these sections could be replaced by the ~110 genes that are currently missing for the encoding of a complete minimal replicator dependent only on small-molecule nutrients3.
An additional core requirement for a future minimal replicator such as an MPC is the ability to regenerate its individual protein components. In proof-of-concept batch PURErep reactions, we found that de novo synthesis of 30 of the 31 essential E. coli TFs can be detected after TTcDR of their encoding ~78 kb plasmid ensemble. The relative (apparent) regeneration of the TFs encoded on pLD1 was generally efficient and reached H/L ratios ≥0.8 for 10/13 TFs during co-TTcDR of all pLD plasmids and >0.9 for 12/13 TFs when pLD1 alone was added to PURErep. In comparison, much lower regeneration levels were achieved for the TFs encoded on pLD2 and pLD3. These results correlate well with the concentrations of the individual TFs in the PURErep starting solution (Supplementary Table 2): the proteins encoded on pLD1 are the least concentrated TFs in PURE (approximate concentrations of 15–480 nM) and therefore readily compatible with the protein expression yields that can be achieved with current recombinant IVT systems. In contrast, the initial concentrations of pLD2-encoded TFs, which performed worst in our co-expression experiments, are much higher (approximate concentration of 0.75–3.2 µM) and therefore cannot be efficiently regenerated in the current PURErep system.
While quantification using stable-isotope labelling is considered a reliable and robust methodology to determine relative expression levels (in particular in cell-free environments)22,23, several factors such as incomplete trypsin digestion, translational arrest, incomplete peptide labelling or low peptide counts in e.g., Arg/Lys-rich proteins may affect quantification and therefore obscure the achieved regeneration levels22,23,24,25. Furthermore, the current MS-based approach provides no information on the correct folding of the synthesised polypeptide chain and, thus, the actual amount of functional protein obtained during expression. Therefore, a direct functional feedback of the synthesised TFs back into IVT will be required in future experiments to assess or improve the amount of active protein that can be generated during TTcDR. Fortunately, E. coli tRNA synthetases, which are among the major TFs in IVT systems, can be very well expressed in their active soluble form in PURE26. Thus, it seems conceivable that the current IVT activity of PURErep is sufficient to generate systems capable of regenerating self-encoded proteins of which only low concentrations are required.
In addition to addressing the functional state of the in vitro expressed proteins, self-encoded regeneration of TFs will require further optimisation of expression stoichiometries and yields. Balanced stoichiometries could be achieved through optimised ribosome-binding sites, cistron positioning, promoter optimisations or feedback regulation27,28. Enhanced protein expression will most likely require continuous mode cell-free protein synthesis setups, e.g., based on miniaturised fluid array devices, which greatly increase protein yields29. Using this approach, the regeneration of the other pLD-TFs, EF-Tu and the other proteins of the PURE enzyme mix could be achieved. The construction of a self-regenerating IVT-system that is completely independent of the external supply of macromolecules will also require integrated ribosome synthesis, assembly and translation (iSAT)30,31,32. Recent non-commercial protocols for in-house PURE-production16,18,33,34 provide attractive starting points for the generation of improved PURErep formulations that may be compatible with these key activities.
In its current form, PURErep can achieve modular in vitro replication of large genome-sized plasmid ensembles that retain their compatibility with bacterial in vivo propagation. This direct transferability could improve design, evolution and prototyping of MPC modules, orthogonal central dogmas35 or synthetic gene circuits36, which were not previously amenable to TTcDR-based in vitro replication.
All primers used for cloning, mutagenesis and/or qPCR are listed in Supplementary Table 3, and were ordered from either IDT or Eurofins. All plasmids used in this study are listed in Supplementary Table 4. The open-reading frame of the Phi29 DNA-Polymerase (Gene ID: 6446511) was ordered as a synthetic gene (gblock, IDT) and cloned into a pCR-Blunt vector using the ZeroBlunt cloning kit (Thermo Fisher) according to the manufacturer’s instructions. The resulting construct was further optimised for in vitro translation by adding a T7 promoter with T7 gene 10 translation-enhancer sequence37 and a downstream bidirectional transcription terminator using Q5 site-directed mutagenesis (NEB) according to the supplier’s instructions. The identity of the final construct pREP was verified by sequencing. All cloning procedures were performed with chemically competent E. coli DH5alpha. The plasmids pLD1, pLD2 and pLD3 were a generous gift from A. Forster (Uppsala University) and are described in more detail elsewhere16. The plasmid pEFTu, which for historic reasons also encodes a gene copy of IF-1, was constructed from the respective linearised genes and a pIVEX 2.3d backbone using the HiFi assembly kit (NEB). First, an intermediate version was assembled from linear overhang PCR products using the primer pairs 152, 152 (IF-1 fragment) and 153, 154 (pIVEX backbone). Subsequently, three linear overhang PCR products created using the primer pairs 161, 162 (EF-Tu fragment), 163, 164 (gene spacer fragment) and 158, 159 (intermediate backbone containing IF-1) were assembled into the final construct. For the generation of prRNA, the E. coli ribosomal operon rrnB was directly amplified from Top10 E. coli by colony PCR using Q5-DNA polymerase with the primer pair 85, 86, and cloned into a pCR-Blunt vector using the ZeroBlunt cloning kit (Thermo Fisher) according to the manufacturer’s instructions.
Plasmids encoding for nucleotide-diphosphate kinase (pNDK, ID:124136)7, T7-RNA polymerase (pT7RNAP, ID:124138)7, creatine kinase m-type (pCKM, ID:124134)7, inorganic pyrophosphatase (pIPP, ID:118978)18 and adenylate kinase 1 (pAK1, ID:118977)18 were obtained from Addgene. Ampicillin-resistance genes were deleted by PCR in pT7RNAP, pCKM and pNDK using primers 200 and 201 (Supplementary Table 3). dsDNA concentrations were measured using a NanoDrop One-c (Thermo Scientific) following the manufacturer’s instructions. All constructs were verified by sequencing (Eurofins Genomics).
The difference in protein synthesis yields between PURExpress and PURErep was estimated using fluorescence of de novo synthesised sfGFP. To this end, 25 µL PURExpress reactions were set up according to the manufacturer’s instructions using 150 ng of pIVEX-sfGFP plasmid. In total, 25 µL PURErep reactions consisted of 2.5 µL 10× energy mix (EM, Supplementary Table 1), 1 µL solution A (PURExpress, NEB), 15 µL solution B (PURExpress, NEB), 0.6 µL 25 mM dNTPs (equimolar), 0.5 µL rNTP mix (18.75 mM ATP, 12.5 mM GTP, 6.25 mM UTP/CTP) and 150 ng pIVEX-sfGFP plasmid DNA. After 2 h of incubation at 37 °C, 2× SDS loading buffer was added to the respective mixtures, and they were incubated for 5 min at 55 °C to preserve sfGFP fluorescence. In all, 10 µL of each sample was subsequently loaded on a 12% polyacrylamide SDS-Gel. Fluorescent bands were directly visualised using a Typhoon FLA 7000, and analysed via ImageQuant, GE Healthcare Life Sciences. To assess pLD-plasmid encoded gene expression, pLD plasmids (final concentration 4 nM) were added to PURErep reactions containing 1 µL of FluoroTect GreenLys (Promega). Prior to the addition of 2× SDS loading buffer, the samples were incubated for 30 min at 37 °C with 1 µL of RNase Cocktail (Thermo Fisher Scientific). After denaturing PAGE, de novo synthesised proteins were visualised using a Typhoon FLA 7000 scanner.
Transcription–translation-coupled DNA replication
The reaction composition for a typical 25 µL TTcDR reaction was as follows: 2.5 µL 10× EM, 1 µL solution A (PURExpress, NEB), 15 µL solution B (PURExpress, NEB), 0.6 µL 25 mM dNTPs (equimolar), 0.5 µL rNTP mix (18.75 mM ATP, 12.5 mM GTP, 6.25 mM UTP/CTP) and plasmid DNA (as specified in the main text). If necessary, the reaction volumes were adjusted to 25 µL with ddH2O. TTcDR reactions in the conventional PURExpress system were assembled according to the standard protocol for a 25 µL reaction: 10 µL solution A, 7.5 µL solution B, 0.6 µL 25 mM dNTPs, plasmid DNA as specified in the main text. The final reaction volume was adjusted to 25 µL with ddH2O. PURExpress and PURErep samples were incubated at 30 °C for up to 16 h in a ProFlex thermocycler (Applied Biosystems) for the time indicated. Time point zero samples were aliquoted directly after mixing, flash-frozen in liquid nitrogen and stored at −80 °C until further use.
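The pipetting scheme above lends itself to a small bookkeeping sketch. The Python snippet below computes the water needed to reach 25 µL, the final concentration contributed by a stock, and the plasmid mass corresponding to a target molarity. The helper names are hypothetical, and the conversion assumes an average mass of 660 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the protocol.

```python
# Fixed components of a 25 µL PURErep TTcDR reaction (volumes in µL)
PUREREP_COMPONENTS_UL = {
    "10x energy mix": 2.5,
    "solution A": 1.0,
    "solution B": 15.0,
    "25 mM dNTPs": 0.6,
    "rNTP mix": 0.5,
}
REACTION_UL = 25.0
BP_MASS_G_PER_MOL = 660.0  # assumption: average mass of one dsDNA base pair


def water_volume_ul(dna_ul, components=PUREREP_COMPONENTS_UL, total_ul=REACTION_UL):
    """Volume of ddH2O needed to bring the reaction to its final volume."""
    remaining = total_ul - sum(components.values()) - dna_ul
    if remaining < -1e-9:
        raise ValueError("components exceed the reaction volume")
    return max(remaining, 0.0)


def final_conc_mM(stock_mM, added_ul, total_ul=REACTION_UL):
    """Final concentration contributed by one stock solution (same unit as stock)."""
    return stock_mM * added_ul / total_ul


def plasmid_mass_ng(conc_nM, length_bp, total_ul=REACTION_UL):
    """Mass of plasmid DNA required for a target molar concentration."""
    # nM * µL = fmol; fmol * g/mol = fg; fg * 1e-6 = ng
    return conc_nM * total_ul * length_bp * BP_MASS_G_PER_MOL * 1e-6


print(water_volume_ul(dna_ul=2.0))
print(final_conc_mM(25.0, 0.6))
print(plasmid_mass_ng(4.0, 4600))
```

With the listed volumes, 19.6 µL of the reaction are fixed, leaving 5.4 µL for plasmid DNA and water. Under the 660 g/mol assumption, 4 nM of a 4.6 kb plasmid such as pREP in 25 µL corresponds to roughly 300 ng. The concentration helper only reports the contribution of the named stock; any nucleotides supplied by the energy mix are not included.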
Gel analysis of TTcDR products
Untreated TTcDR samples were directly analysed by neutral agarose gel electrophoresis in 1× TAE (Tris-Acetate-EDTA, gels pre-stained with SYBR-safe). Due to the size of some rolling-circle concatemers and/or due to the possible formation of MgPPi-DNA nanoparticles15, a fraction of the total product remained in the gel pockets. When defined product bands were desired, such as in Figs. 1c and 2a, samples were treated with FastDigest MluI (Thermo Fisher Scientific, simply referred to as MluI throughout the rest of the paper) in 1 × FastDigest buffer according to the manufacturer’s instructions. Gels were imaged using a Typhoon FLA 7000, GE Healthcare Life Sciences using the Typhoon Scanner Control 5.0 software package and analysed using ImageQuant TL 8.1 and/or ImageJ 1.51i. To confirm the identity of the replication products pLD1-3 by restriction pattern analysis, TTcDR samples were processed by adding 1 µL RNAse Cocktail (Thermo Fisher) and 1 mg/ml Proteinase K. After 16 h of incubation at 37 °C, the samples were loaded on a neutral 0.8% agarose gel. DNA products migrating at a size of ~20–30 kb were gel-extracted and purified using the Zymoclean Large DNA Fragment Extraction Kit (Zymo Research). Purified DNA was cut with MluI and restriction patterns visualised by neutral gel electrophoresis as described above. The reference lanes consisted of purified plasmids (or mixtures thereof), digested with MluI.
In vivo propagation of TTcDR products
For transformation experiments, TTcDR samples were digested with DpnI (NEB) (1.5 h at 37 °C) to remove parental plasmid DNA before transforming 2 µL of the mixture into 50 µL electrocompetent 10-beta E. coli cells (NEB). Transformants were selected on LB-agar plates supplemented with either zeocin (pREP), carbenicillin (pEFTu, pIPP, pAK1), chloramphenicol (pNDK, pT7RNAP, pCKM) or kanamycin (pLD plasmids, prRNA). Fingerprint restriction digests of plasmids isolated from cells grown in the presence of zeocin and kanamycin were performed with MluI as described in the previous section. Plasmids isolated from chloramphenicol plates were digested with XbaI (NEB). Plasmids isolated from carbenicillin plates were digested with either EcoRV (NEB) or XbaI as indicated in the respective figure legends. In vivo propagated TTcDR products and plasmids were prepared from overnight E. coli cultures using the NucleoSpin Plasmid kit (Macherey Nagel) following the manufacturer’s instructions.
Relative DNA quantification by qPCR
Fold changes of DNA copy-number relative to input levels (t = 0) were measured by qPCR (Luna Universal Mix, NEB) in a StepOne Real-Time PCR System (Thermo Fisher Scientific, StepOne/StepOnePlus Software v2.3). For each time point, three individual samples were taken and diluted 4000-fold in ddH2O, which were further diluted 1:20 in the final qPCR reaction (final dilution 1:80,000). The specific primers for each target amplicon are listed in Supplementary Table 3. The fold change f at time point t was calculated using the equation:
$$f(t) = E^{\Delta Cq(t)}$$
where f(t) is the fold change of the sample at time point t, E the PCR efficiency and ΔCq(t) the average difference between the qPCR cycle thresholds Cq at time zero and time t, i.e. ΔCq(t) = Cq(0) − Cq(t). E, Cq(0) and Cq(t) were determined using LinRegPCR38 (version 2018.0). Different TTcDR time points of the same experiment were quantified in the same qPCR experiment using a common primer/enzyme mastermix for each target plasmid. Asymmetric upper and lower confidence limits (68%) for f(t) were approximated by calculating f(t) for ΔCq(t) + s.d. and ΔCq(t) – s.d., respectively, where s.d. is the standard deviation for ΔCq(t) values from replicates as stated in the respective figure legends (typically n = 3). All data sets were visualised using Graphpad Prism 7.05.
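The fold-change calculation and its asymmetric confidence limits can be illustrated with a minimal Python sketch. It assumes the standard efficiency-corrected form f(t) = E^ΔCq(t) with ΔCq(t) = Cq(0) − Cq(t), takes E as the amplification factor per cycle (2 for perfect doubling), and computes each replicate ΔCq against the mean Cq at time zero. The pairing of replicates and the function name are our assumptions, not details given in the text.

```python
from statistics import mean, stdev


def fold_change(cq_zero, cq_t, efficiency=2.0):
    """Return (f, lower, upper): fold change with approximate 68% limits.

    cq_zero, cq_t: replicate Cq values at time zero and at time t.
    efficiency:    amplification factor per cycle (assumed 1 < E <= 2).
    """
    cq0 = mean(cq_zero)
    # One delta-Cq per replicate at time t, relative to the mean Cq at t = 0
    deltas = [cq0 - cq for cq in cq_t]
    delta = mean(deltas)
    sd = stdev(deltas)
    f = efficiency ** delta
    # Shifting the exponent by +/- s.d. gives limits that are asymmetric in f
    lower = efficiency ** (delta - sd)
    upper = efficiency ** (delta + sd)
    return f, lower, upper


# Hypothetical Cq values (not from the paper)
f, lower, upper = fold_change([20.0, 20.0, 20.0], [16.5, 17.0, 17.5])
print(f, lower, upper)
```

Because the standard deviation is applied in the exponent, the upper limit lies further from f(t) than the lower limit, which is why the reported confidence intervals are asymmetric.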
Stable-isotope labelling of co-expression products
For stable-isotope labelling of de novo synthesised proteins, TTcDR samples were mixed with an energy mix containing ¹⁵N₂¹³C₆-lysine and ¹⁵N₄¹³C₆-arginine instead of the corresponding unlabelled amino acids. After incubating for 2 h at 37 °C, the samples were analysed via mass spectrometry. First, the reaction mixture was diluted with an equal volume of buffer containing 1% sodium deoxycholate, 10 mM TCEP and 40 mM chloroacetamide in 25 mM Tris•HCl at pH 8.5 and incubated at 37 °C for 20 min. The reaction mixture was further diluted and incubated overnight with roughly 1 µg of trypsin. Digested peptides were acidified and purified through SCX (strong cation exchange) StageTips (Thermo Scientific). Liquid chromatography–mass spectrometry (LC-MS) analysis was performed on a Q-Exactive-HF mass spectrometer (Thermo Scientific) operated in a data-dependent fashion. The raw data were processed using the MaxQuant39 computational platform (version 188.8.131.52), and all peptide and protein identifications were filtered at 1% false discovery rate. The derived peak list was searched using the Andromeda search engine integrated in MaxQuant against the E. coli K12 proteome (Proteome ID: UP000000625/Genome accession: U00096) obtained from UniProt (4391 protein entries; last modified May 14 2019). The obtained H/L values for each pLD-encoded protein were analysed and plotted using Graphpad Prism 7.05.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
The source data underlying Figs. 1b, 1c, 1e, 1f, 2a, 2b, 2c, 2e, 3b, 4a, 4b, Supplementary Figs. S1B, S1C and 5 are provided in a Source Data file. MaxQuant outputs are provided in two separate source files. Plasmids encoding for nucleotide-diphosphate kinase (ID:124136), T7-RNA polymerase (ID:124138), creatine kinase m-type (ID:124134), inorganic pyrophosphatase (ID:118978) and adenylate kinase 1 (ID:118977) were obtained from Addgene. Further data supporting the findings of this paper are available from the corresponding author upon reasonable request.
von Neumann, J. The general and logical theory of automata. in (ed. Taub, A. H.) Cerebral mechanisms in behavior; the Hixon Symposium. 1–41 (Wiley, 1951).
Szostak, J. W., Bartel, D. P. & Luisi, P. L. Synthesizing life. Nature 409, 387–390 (2001).
Forster, A. C. & Church, G. M. Towards synthesis of a minimal cell. Mol. Syst. Biol. 2, 45 (2006).
Noireaux, V., Maeda, Y. T. & Libchaber, A. Development of an artificial cell, from self-organization to computation and self-reproduction. Proc. Natl Acad. Sci. USA 108, 3473–3480 (2011).
Schwille, P. et al. MaxSynBio: avenues towards creating cells from the bottom up. Angew. Chem. Int. Ed. 57, 13382–13392 (2018).
Le Vay, K., Weise, L. I., Libicher, K., Mascarenhas, J. & Mutschler, H. Templated self‐replication in biomimetic systems. Adv. Biosyst. 3, 1800313 (2019).
Shimizu, Y. et al. Cell-free translation reconstituted with purified components. Nat. Biotechnol. 19, 751–755 (2001).
Fujiwara, K., Katayama, T. & Nomura, S.-I. M. Cooperative working of bacterial chromosome replication proteins generated by a reconstituted protein expression system. Nucleic Acids Res. 41, 7176–7183 (2013).
Garamella, J., Marshall, R., Rustad, M. & Noireaux, V. The all E. coli TX-TL toolbox 2.0: a platform for cell-free synthetic biology. ACS Synth. Biol. 5, 344–355 (2016).
Sakatani, Y., Ichihashi, N., Kazuta, Y. & Yomo, T. A transcription and translation-coupled DNA replication system using rolling-circle replication. Sci. Rep. 5, 10404 (2015).
van Nies, P. et al. Self-replication of DNA by its encoded proteins in liposome-based synthetic cells. Nat. Commun. 9, 1583 (2018).
Sakatani, Y., Yomo, T. & Ichihashi, N. Self-replication of circular DNA by a self-encoded DNA polymerase through rolling-circle replication and recombination. Sci. Rep. 8, 13089 (2018).
Lagunavicius, A. et al. Novel application of Phi29 DNA polymerase: RNA detection and analysis in vitro and in situ by target RNA-primed RCA. RNA 15, 765–771 (2009).
Pédelacq, J. D., Cabantous, S., Tran, T., Terwilliger, T. C. & Waldo, G. S. Engineering and characterization of a superfolder green fluorescent protein. Nat. Biotechnol. 24, 79–88 (2006).
Galinis, R. et al. DNA nanoparticles for improved protein synthesis in vitro. Angew. Chem. 128, 3172–3175 (2016).
Shepherd, T. R. et al. De novo design and synthesis of a 30-cistron translation-factor module. Nucleic Acids Res. 45, 10895–10905 (2017).
Harvey, S. & Hill, C. W. Exchange of spacer regions between rRNA operons in Escherichia coli. Genetics 125, 683–690 (1990).
Lavickova, B. & Maerkl, S. J. A simple, robust, and low-cost method to produce the PURE cell-free system. ACS Synth. Biol. 8, 455–462 (2019).
Mann, M. Functional and quantitative proteomics using SILAC. Nat. Rev. Mol. Cell Biol. 7, 952–958 (2006).
Fujii, R., Kitaoka, M. & Hayashi, K. Error-prone rolling circle amplification: the simplest random mutagenesis protocol. Nat. Protoc. 1, 2493–2497 (2006).
Bennett, G. M. & Moran, N. A. Small, smaller, smallest: the origins and evolution of ancient dual symbioses in a phloem-feeding insect. Genome Biol. Evol. 5, 1675–1688 (2013).
Xian, F. et al. Peptide biosynthesis with stable isotope labeling from a cell-free expression system for targeted proteomics with absolute quantification. Mol. Cell. Proteom. 15, 2819–2828 (2016).
Hanke, S., Besir, H., Oesterhelt, D. & Mann, M. Absolute SILAC for accurate quantitation of proteins in complex mixtures down to the attomole level. J. Proteome Res. 7, 1118–1130 (2008).
Matic, I. et al. Absolute SILAC-compatible expression strain allows SUMO-2 copy number determination in clinical samples. J. Proteome Res. 10, 4869–4875 (2011).
Narumi, R. et al. Cell-free synthesis of stable isotope-labeled internal standards for targeted quantitative proteomics. Synth. Syst. Biotechnol. 3, 97–104 (2018).
Awai, T., Ichihashi, N. & Yomo, T. Activities of 20 aminoacyl-tRNA synthetases expressed in a reconstituted translation system in Escherichia coli. Biochem. Biophys. Rep. 3, 140–143 (2015).
Karig, D. K., Iyer, S., Simpson, M. L. & Doktycz, M. J. Expression optimization and synthetic gene networks in cell-free systems. Nucleic Acids Res. 40, 3763–3774 (2012).
Chizzolini, F., Forlin, M., Cecchi, D. & Mansy, S. S. Gene position more strongly influences cell-free protein expression from operons than T7 transcriptional promoter strength. ACS Synth. Biol. 3, 363–371 (2014).
Li, J. et al. Cogenerating synthetic parts toward a self-replicating system. ACS Synth. Biol. 6, 1327–1336 (2017).
Jewett, M. C., Fritz, B. R., Timmerman, L. E. & Church, G. M. In vitro integration of ribosomal RNA synthesis, ribosome assembly, and translation. Mol. Syst. Biol. 9, 678 (2013).
Liu, Y., Fritz, B. R., Anderson, M. J., Schoborg, J. A. & Jewett, M. C. Characterizing and alleviating substrate limitations for improved in vitro ribosome construction. ACS Synth. Biol. 4, 454–462 (2015).
Caschera, F. et al. High-throughput optimization cycle of a cell-free ribosome assembly and protein synthesis system. ACS Synth. Biol. 7, 2841–2853 (2018).
Villarreal, F. et al. Synthetic microbial consortia enable rapid assembly of pure translation machinery. Nat. Chem. Biol. 14, 29–35 (2018).
Wang, H. H. et al. Multiplexed in vivo His-tagging of enzyme pathways for in vitro single-pot multienzyme catalysis. ACS Synth. Biol. 1, 43–52 (2012).
Liu, C. C., Jewett, M. C., Chin, J. W. & Voigt, C. A. Toward an orthogonal central dogma. Nat. Chem. Biol. 14, 103–106 (2018).
Adamala, K. P., Martin-Alarcon, D. A., Guthrie-Honea, K. R. & Boyden, E. S. Engineering genetic circuit interactions within and between synthetic minimal cells. Nat. Chem. 9, 431–439 (2017).
Olins, P. O. & Rangwala, S. H. A novel sequence element derived from bacteriophage T7 mRNA acts as an enhancer of translation of the lacZ gene in Escherichia coli. J. Biol. Chem. 264, 16973–16976 (1989).
Ruijter, J. M. et al. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res. 37, e45 (2009).
Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372 (2008).
Kuruma, Y. & Ueda, T. The PURE system for the cell-free synthesis of membrane proteins. Nat. Protoc. 10, 1328–1344 (2015).
The authors especially thank A. Forster (Uppsala University) for providing the plasmids pLD1, pLD2 and pLD3 to the community. We thank N. Nagaraj and the MPIB mass spectrometry core facility for help with the mass spectrometry data acquisition and analysis. We thank P. Schwille and L. Kei for constructive discussions and comments. Special thanks to K. Le Vay for providing critical comments and for reviewing the paper. Funding was provided by the MaxSynBio consortium, which is jointly funded by the Federal Ministry of Education and Research of Germany and the Max Planck Society. M. Heymann gratefully acknowledges support from the Joachim Herz Foundation.
The authors declare no competing interests.
Peer review information Nature Communications thanks Allen Liu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Libicher, K., Hornberger, R., Heymann, M. et al. In vitro self-replication and multicistronic expression of large synthetic genomes. Nat Commun 11, 904 (2020). https://doi.org/10.1038/s41467-020-14694-2