Systematic molecular evolution enables robust biomolecule discovery

DeBenedictis, Erika A.; Chory, Emma J.; Gretton, Dana W.; Wang, Brian; Golas, Stefan; Esvelt, Kevin M.

doi:10.1038/s41592-021-01348-4

Article
Published: 30 December 2021

Nature Methods volume 19, pages 55–64 (2022)

Abstract

Evolution occurs when selective pressures from the environment shape inherited variation over time. Within the laboratory, evolution is commonly used to engineer proteins and RNA, but experimental constraints have limited the ability to reproducibly and reliably explore factors such as population diversity, the timing of environmental changes and chance on outcomes. We developed a robotic system termed phage- and robotics-assisted near-continuous evolution (PRANCE) to comprehensively explore biomolecular evolution by performing phage-assisted continuous evolution in high-throughput. PRANCE implements an automated feedback control system that adjusts the stringency of selection in response to real-time measurements of each molecular activity. In evolving three distinct types of biomolecule, we find that evolution is reproducibly altered by both random chance and the historical pattern of environmental changes. This work improves the reliability of protein engineering and enables the systematic analysis of the historical, environmental and random factors governing biomolecular evolution.

Main

Biologists have long sought to understand the effects of history and the environment on evolutionary outcomes. The long-term evolution experiment set the gold standard for characterizing genomic evolution at scale: tracking 12 separate Escherichia coli populations over 65,000 generations^1,2. However, it has proven difficult to gather datasets of similar scope for the evolution of individual biomolecules. Laboratory protein evolution traditionally uses discrete rounds of targeted in vitro mutagenesis and selection³, sacrificing throughput, replicates and trajectory length in favor of restricting mutations to a single gene or pathway within an otherwise fixed environment. High-throughput molecular evolution methods are needed to quantify the effects of history and environmental selection pressures on single-gene evolution⁴ and enable more robust biomolecular engineering.

Continuous directed evolution methods accelerate the process of diversification and selection by coupling gene function to the fitness of a rapidly replicating organism, such as a bacteriophage⁵, mammalian virus^6,7 or a yeast plasmid^8,9. The power of these techniques comes from the ability to rapidly cycle through dozens of generations, and they have been used to quantify mutation rate¹⁰, track evolution pathways^9,11 and rapidly engineer diverse proteins including highly specific nanobodies⁷ and improved CRISPR effectors¹². While valuable additions to the biomolecule evolution toolkit^13,14,15, these experiments are laborious, limited by low throughput and difficult to execute with precise environmental control. Extracting fundamental insights about biomolecular evolution and systematically producing evolved biomolecules with desired activities requires comprehensively sampling the parameters known to influence the results of evolution—including the initial genotype^16,17,18 and the evolution environment¹⁹—in high replicate²⁰.

We present a new approach that combines the speed of continuous evolution, precise control of the environment and the throughput required to evolve many independent populations in parallel. Whereas classic evolution ‘replay’ experiments²¹ and existing continuous evolution methods⁵ evolve a single population contained within a continuous-flow bioreactor (Fig. 1a), we developed a high-throughput plate-based method and applied it to a widely used evolution technique, phage-assisted continuous evolution (PACE)⁵. In our method, evolving bacteriophage populations are housed within 96-well plates on a high-throughput liquid-handling system, which is controlled using Python²² to achieve the same host bacteria flow-through rates as continuous-flow bioreactors. This system enables continuous evolution in 96-plex using commonly available robotic equipment and open-source software²². It offers several distinct advantages over existing methods^13,14,15 including scalability to hundreds of parallel populations, historical sampling in 96-well plate format, real-time monitoring of any molecular fitness output rather than just growth effects^15,23 and precise temporal and chemical control of the environment. We use this platform to identify improbable mutations in a single molecular evolution by performing the experiment in 96-plex, and next demonstrate that the inclusion of controls, replicates and many diverse initial genotypes increases the likelihood of successful evolution and improves interpretation of results. Further, miniaturization enables us to precisely fine-tune the chemical environment of each experiment using reagent-limiting compounds²³. Using real-time activity monitoring, we implement a feedback control system in which the stringency of the bacterial host strain is autonomously adjusted in response to measurements of biomolecule activity, robustly producing evolved variants by eliminating failures associated with improper selection strength. 
When combined with high-throughput sequencing, we are also able to distinguish both contingent and deterministic evolution outcomes. Finally, we present an optimized method using limited consumables, capable of week-long multi-path experiments with only once-per-day researcher intervention. Thus, the high-throughput continuous evolution platform we describe enables systematic exploration of the factors responsible for evolutionary outcomes.

**Fig. 1: Design and validation of high-throughput evolution.**

Results

Development of a systematic evolution platform

We began by developing a 96-well plate-based method, wherein 500-µl cultures of evolving M13 bacteriophage (Fig. 1a) are serially diluted with fresh host bacteria twice per hour using an automated liquid handler (Fig. 1b). To enable the pipetting speeds required to approximate continuous flow, we developed a robotic Python interface²² that precisely times the distribution of bacteria, the addition of chemical stimuli, the sampling of populations for real-time monitoring and historical sample preservation (Extended Data Fig. 1). Integrated real-time measurement of luminescence, fluorescence and turbidity enables activity-dependent fitness tracking, which we show is more precise than monitoring turbidity alone¹⁵ (Extended Data Fig. 2). Thus, a fluorescent or luminescent reporter can be coupled to either the presence of phage or to the direct activity of the evolving biomolecule itself. We refer to this platform as PRANCE.
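The relationship between twice-hourly dilution and continuous flow can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that each cycle removes a fixed volume and refills it with fresh host culture, and the 250-µl exchange volume is a hypothetical value, not one reported in this section. The function name `equivalent_dilution_rate` is likewise introduced here for illustration.

```python
import math


def equivalent_dilution_rate(culture_ul, exchanged_ul, cycles_per_hour):
    """Return the continuous-flow dilution rate (culture volumes per hour)
    that washes out a non-replicating tracer as fast as the discrete scheme."""
    if not 0 < exchanged_ul < culture_ul:
        raise ValueError("exchanged volume must lie between 0 and the culture volume")
    # Fraction of the old culture left after one remove-and-refill cycle.
    retained = (culture_ul - exchanged_ul) / culture_ul
    # Continuous flow retains exp(-D * t) after time t; match this over one hour.
    return -cycles_per_hour * math.log(retained)


# 500-µl cultures diluted twice per hour (from the text).
# The 250-µl exchange volume is an assumed, illustrative value.
rate = equivalent_dilution_rate(culture_ul=500, exchanged_ul=250, cycles_per_hour=2)
print(f"equivalent dilution rate: {rate:.2f} culture volumes per hour")
```

Under these assumptions, halving the culture twice per hour corresponds to a continuous dilution rate of 2 ln 2 culture volumes per hour.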

Each evolution round is initiated by sterilizing a bacterial culture reservoir (Extended Data Fig. 3) and adding culture to each population (Fig. 1b and Extended Data Fig. 1b). Bacterial cultures can be sourced from an active turbidostat or chemostat, enabling high-volume experiments, or from preprepared bacterial stock stored at 4 °C, enabling experiments that use many bacterial cultures (Fig. 1b). Accessory molecules (for example, chemical mutagens, stimuli, small molecules) are then pinned to each population, which are monitored in real time by an integrated, automated plate reader that measures not only the population density, but also the fluorescence and luminescence of each population at discrete 30-min intervals. Samples are preserved in 96-well format and retained for downstream analyses such as sequencing of accumulated changes, or in vitro and in vivo activity measurements (Fig. 1b). The system also incorporates error handling, failure-mode prevention and wireless experimenter communication (Extended Data Fig. 4), and is optimized for minimal human intervention (Extended Data Fig. 1c,d). The fast iteration time of PRANCE enables dozens of rounds of evolution per day, comparable to traditional PACE, with the ease and throughput of a plate-based format.

We first demonstrated the real-time activity-monitoring capability by propagating M13 bacteriophage encoding T7 RNA polymerase (RNAP) in place of the pIII phage coat protein using host bacteria expressing pIII and a luminescence reporter (luxAB) under the control of a T7 promoter in 48 independent populations (Fig. 1c). We observed T7 RNAP-dependent luminescence in all samples in less than 4 hours from both 37 and 4 °C culture (Fig. 1d), showing that PRANCE can reproducibly monitor real-time reporters of fitness.

Multiplexing identifies previously inaccessible genotypes

While current directed evolution methods suffice when the primary goal is to engineer a single functional protein, they are limited in their ability to probe the randomness and reproducibility of any given biomolecule evolution and characterize the ensemble of possible outcomes. We wondered if a well-studied evolution^5,10,11 could provide new outcomes if sufficiently sampled. First, to measure the stochasticity of evolution, we evolved the T7 RNAP to initiate on the T3 promoter and performed the evolution in 90-plex. In this experiment, host bacteria contain an inducible mutagenesis plasmid²⁴ and an accessory plasmid containing pIII and luxAB under the control of the T3 promoter (Fig. 2a). With 500-µl populations typically harboring 10⁸ infected cells per ml experiencing high mutagenesis, each population should traverse single-mutation fitness valleys to explore a large fraction of double mutants each day²⁴. We inoculated 96 total populations with or without T7 RNAP-expressing ΔpIII phage (with six no-phage controls) and tracked their progress in real time with luminescence (Fig. 2b). We found that bacteria sourced from 4 °C and mutagenized on-deck perform similarly (Extended Data Fig. 5a) and tracked absorbance depression (Extended Data Fig. 5b). This high-throughput exploration of the evolution of the T7 RNAP, with >5× more parallel populations than the largest previously reported experiment¹⁰, allowed us to measure the frequency and reproducibility of the emergence of different genotypes. Both novel and previously reported mutations were observed. In addition, we quantified the elapsed evolution times and found them to be logistically distributed (goodness of fit, CvM stat 0.017, Kolmogorov–Smirnov test 0.046) (Fig. 2c and Extended Data Fig. 5c,d), consistent with only a single mutation (N748D/S or M219R) being required for improved activity (Fig. 2d). The single M219R mutation, which exhibits substantially delayed emergence relative to N748D (chi-square P = 0.0037) (Fig. 2e), has not been previously reported despite the many previous iterations of the T7 RNAP evolution. This may be partly due to N748D/S resulting from a transition mutation (A → G), whereas M219R results from a transversion mutation (T → G), which occurs less frequently²⁴. Thus, systematic high-replicate evolution allows for the extensive profiling of evolutionary reproducibility²⁰ and enables deeper sampling of less accessible genotypes that cannot be readily identified from a single population.
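A goodness-of-fit check of this kind can be sketched with the Python standard library. The example below is not the analysis used in the paper: the emergence times are synthetic (evenly spaced quantiles of a logistic distribution with made-up parameters), the method-of-moments fit is an assumption, and the helper names are hypothetical. It only shows how a Kolmogorov–Smirnov statistic against a logistic distribution is computed.

```python
import math
import statistics


def logistic_cdf(x, loc, scale):
    """Cumulative distribution function of the logistic distribution."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))


def fit_logistic_moments(samples):
    """Method-of-moments fit; a logistic has variance (scale * pi)**2 / 3."""
    loc = statistics.mean(samples)
    scale = statistics.stdev(samples) * math.sqrt(3) / math.pi
    return loc, scale


def ks_statistic(samples, loc, scale):
    """Largest gap between the empirical CDF and the logistic CDF."""
    xs = sorted(samples)
    n = len(xs)
    gaps = []
    for i, x in enumerate(xs, start=1):
        f = logistic_cdf(x, loc, scale)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        gaps.append(max(i / n - f, f - (i - 1) / n))
    return max(gaps)


# Synthetic emergence times (hours): quantiles of a logistic distribution
# with assumed parameters. These are NOT data from the experiment.
n = 90
true_loc, true_scale = 10.0, 1.5
times_h = [
    true_loc + true_scale * math.log(p / (1 - p))
    for p in ((i - 0.5) / n for i in range(1, n + 1))
]

loc, scale = fit_logistic_moments(times_h)
d_fit = ks_statistic(times_h, loc, scale)
d_true = ks_statistic(times_h, true_loc, true_scale)
print(f"fitted loc={loc:.2f} h, scale={scale:.2f} h, KS statistic={d_fit:.4f}")
```

A small statistic indicates that the empirical distribution of emergence times stays close to the fitted logistic curve across all populations.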

**Fig. 2: Quantifying the stochasticity of biomolecular evolution.**

Miniaturization enables reagent-limited evolution

As traditional PACE uses continuous flow to constantly refresh host bacteria and consumes large quantities of media, it has previously been infeasible to evolve biomolecules in environments that require small molecules that are difficult to synthesize or are new²⁵. PRANCE reduces each bioreactor volume by 100-fold, thus making small molecule-dependent environments more feasible and controllable. To demonstrate these capabilities, we modified an established evolution of the pyrrolysine aminoacyl-synthetase (PylRS) to incorporate noncanonical amino acids (ncAAs)²⁶, using substantially lower quantities of ncAAs than previously reported. To enable multiplexing of diverse transfer RNA (tRNA)–PylRS pairs, we encoded a PylRS variant and a UAG-containing tRNA^Pyl within the M13 phage genome and inserted a UAG amber stop codon within the pIII phage coat protein along with a luciferase reporter expressed from host bacteria (Fig. 3a). Thus, phage proliferation and luminescence are both directly coupled to suppression of the UAG amber codon via ncAA incorporation (Fig. 3b).

**Fig. 3: Controlling the chemical environment in high-throughput evolution.**

We used three pyrrolysine synthetases: chPylRS, an evolution intermediate (chPylRS-IP)²⁶ and an evolved variant (chPylRS-IPYE)²⁶. We quantified their ability to incorporate Boc-lysine as measured by ncAA-dependent kinetic luminescence activity (Fig. 3c) and amber codon-dependent phage enrichment (Fig. 3d). We then used PRANCE to evolve the PylRSs to incorporate Boc-lysine by inoculating eight populations with phage encoding either chPylRS or chPylRS-IP in quadruplicate. Aminoacyl-synthetase (AARS) evolution is prone to the emergence of ‘cheaters’, that is, variants that promiscuously charge canonical amino acids²⁷. To monitor the emergence of nonspecific AARSs, we included eight populations with phage encoding each variant but in the absence of ncAAs; phage propagation under these conditions would indicate the evolution of nonspecific charging of canonical amino acids. Together, we monitored luminescence and absorbance across a total of 24 populations over 36 h of evolution. We observed propagation of both chPylRS- and chPylRS-IP-encoding phage in the presence of Boc-lysine (Fig. 3e), and identified novel genotypes with previously unidentified mutations (Fig. 3f,g). The lack of luminescence in the absence of ncAA indicates that cheater AARSs are unlikely to emerge under the evolution conditions used (Fig. 3e). Thus, the inclusion of control populations, typically neglected in directed evolution experiments due to throughput limitations, enabled the extraction of additional information previously unobtainable. This capability can be used to determine whether negative selection against promiscuous activity²⁶ is necessary. Additionally, the automated addition of Boc-lysine to 12 evolving populations over 36 h of PRANCE consumed less than 100 mg of total compound, nearly ten times less than what would have been required for a single population within a bioreactor. Given this substantial reduction in reagents, PRANCE enables multiplexed and well-controlled evolution experiments with fine control over the chemical environment using molecules that are too expensive (for example, 4-azido-Phe, $2,500 per gram²⁸) or rare (for example, pyrrolysine²⁹) to be used with traditional continuous-flow bioreactors.

Simultaneous evolution of dozens of biomolecules

Previously, we used PACE to evolve tRNAs³⁰ capable of decoding quadruplet codons^31,32 toward the goal of engineering a four-base codon translation system^33,34. To evolve new quadruplet tRNAs (qtRNAs), we inserted a quadruplet codon (AGGG) into pIII and generated a variety of qtRNA-encoding, pIII-deficient phage (Fig. 4a). In the absence of a functional qtRNA, the quadruplet codon generates a frameshift, truncating pIII and precluding phage propagation (Fig. 4b). The success of qtRNA evolution can depend on which tRNA paralog is used to initiate evolution, highlighting the importance of studying a variety of starting genotypes. Here, we used PRANCE to simultaneously identify many functional qtRNAs by subjecting a full set of 20 different paralogs to evolution within a single experiment. We first replicated the evolution of six TAGA-decoding qtRNAs, and observed similar genotypes as previously described³⁰ (Extended Data Fig. 6). Next, we initiated PRANCE by seeding 48 populations in an optimized configuration (Extended Data Fig. 7) with phage encoding 20 different qtRNA paralogs corresponding to every canonical amino acid containing a library of randomized anticodons (NNNN) (20 populations); phage encoding eight different qtRNA paralogs (Ala, Glu, Gly, His, Arg, Ser, Thr and Trp) each with directed AGGG frameshift anticodons (24 populations) or no-phage controls (four populations).

**Fig. 4: Feedback-controlled evolution of diverse starting genotypes.**

We then tracked luminescence of all 48 populations over 36 h and found that phage encoding Gly, His, Ser, Arg, Thr and Trp qtRNA paralogs successfully decoded quadruplet codons (Extended Data Fig. 6e). Indeed, when subcloned, several of the isolated qtRNAs exhibited improved activity (Fig. 4c,d), although further characterization would be required to determine the amino acid identity of the evolved qtRNAs³⁴. These results are consistent with the observation that only some tRNAs are capable of improvements to this biomolecular activity, highlighting the importance of genotype diversification in the success of evolution. Notably, a single PRANCE experiment evolved multiple AGGG-decoding qtRNAs, which would have previously required dozens of individual PACE experiments.

Feedback evolution of activity-diverse biomolecules

Although we successfully identified AGGG-decoding qtRNAs with improved activity, we also observed a high failure rate. Many of the qtRNAs with low initial activity (Ala, Cys, Asp, Phe and so on) never evolved, but rather experienced an experimental failure mode referred to as ‘washout’ in which the effective population size decreases to zero (Extended Data Fig. 6e). Additionally, we observed that qtRNAs with high initial activity (Arg, Trp) maintained population size and triggered luminescence but did not acquire mutations. These results indicate that selection was too stringent in some cases and too lenient in others, both common directed evolution failure modes. We hypothesized that the ability to dynamically tune selection stringency in accordance with population fitness would improve the likelihood of successfully evolving biomolecules from diverse starting points, by both reducing the possibility of phage washout and maintaining selection pressure on high-activity variants. To improve the likelihood of evolution success, we developed a feedback control system^35,36 that adjusts the stringency of selection by modifying the host bacterial strain in response to a real-time analysis of molecular activity-dependent luminescence. As a model system, we selected four qtRNA paralogs (Phe, His, Ser and Arg) that decode the TAGA quadruplet codon, are known to have improved variants and differ greatly in initial activity. We sought to use feedback control to evolve all four qtRNAs in a single experiment.

First, we characterized three bacterial APs that confer different levels of selection pressure. The most ‘lenient’ of these APs encodes T7 RNAP containing two quadruplet codons, with the T7 promoter driving production of pIII and luxAB (Fig. 4e). We also characterized more stringent APs containing either one (moderate) or two (stringent) quadruplet codons directly in pIII (Fig. 4e) and found that these APs adequately cover phage enrichment space (Fig. 4f). We next evolved these qtRNAs on each of the three individual bacterial sources separately, under static selection. The variant with the highest initial activity, qtRNA^Arg_TAGA, only evolves under high selection pressure. Conversely, under lenient pressure all phage propagate, but only the qtRNAs with the lowest initial activity experience selection and acquire mutations (Fig. 4g). None of the three stringency levels could evolve all four qtRNAs. To customize selection pressure, we implemented automated feedback control in which the bacterial source of each population is adjusted in response to real-time measurements of fitness as measured by luminescence (Fig. 4h and Methods). This strategy successfully propagated phage populations encoding all four qtRNAs to the end of the 36-hour experiment (Fig. 4i). Additionally, by measuring clonal variant activity from each of the eight feedback-controlled populations using a luciferase reporter, we found that all eight populations had evolved qtRNA variants with improved translation efficiency (Fig. 5a). Thus, feedback control avoided phage washout in all cases while simultaneously exposing high-activity qtRNAs to more challenging evolution environments, in a short time window, without researcher intervention. These results demonstrate that feedback control is more robust and less failure prone, enabling the evolution of biomolecules with diverse activities.
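The idea of adjusting the bacterial source from a luminescence reading can be illustrated with a simple threshold rule. This is a minimal sketch, not the controller described in the Methods: the two thresholds, the luminescence values and the function name `next_stringency` are hypothetical, and only the three stringency levels come from the text.

```python
# Ordered from least to most stringent, as named in the text.
STRINGENCY_LEVELS = ["lenient", "moderate", "stringent"]


def next_stringency(current, luminescence, raise_above, lower_below):
    """Choose the bacterial source for the next cycle from one reading.

    raise_above and lower_below are hypothetical thresholds (arbitrary units),
    not values taken from the paper.
    """
    if lower_below >= raise_above:
        raise ValueError("lower_below must be smaller than raise_above")
    idx = STRINGENCY_LEVELS.index(current)
    if luminescence >= raise_above and idx < len(STRINGENCY_LEVELS) - 1:
        idx += 1  # population is fit: apply more selection pressure
    elif luminescence <= lower_below and idx > 0:
        idx -= 1  # population is at risk of washout: relax selection
    return STRINGENCY_LEVELS[idx]


# Hypothetical luminescence trace for one population.
readings = [200, 900, 5000, 7000, 300, 6500]
level = "lenient"
history = []
for value in readings:
    level = next_stringency(level, value, raise_above=4000, lower_below=500)
    history.append(level)
print(history)
```

Keeping a gap between the two thresholds is a design choice that prevents a population from flipping between sources on every reading; whether the published controller does the same is not stated in this section.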

**Fig. 5: Varying the timing of environmental changes yields diverse evolution trajectories.**

Evolution outcomes are determined by temporal dynamics

It is well established that selection stringency can affect the trajectory of evolution^10,19, but the importance of the timing of those changes has remained largely unexplored. In nature, changes in the physical environment are complex, and can be random or even exhibit mutual dependence with population fitness (for example, in predator–prey dynamics³⁷). Thus, we wondered how static versus dynamic selective environments would affect the trajectory of single-biomolecule evolution within the laboratory. Due to the small size of qtRNAs, we used next-generation sequencing to characterize the evolutionary history of 32 populations as they were subjected to different temporal perturbations to their selective environment. We tracked the genotypic abundance relative to population size of four qtRNA paralogs over 36 hours, under four selection schedules in duplicate as they underwent evolution under static (lenient or stringent), dynamic yet unresponsive (discrete) or feedback-controlled (responsive) stringency modulation.
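The 32-population design follows from crossing four paralogs, four selection schedules and two replicates. The enumeration below is a sketch of that bookkeeping; the population numbering and dictionary layout are hypothetical and do not reflect the plate map used in the experiment.

```python
from itertools import product

paralogs = ["Phe", "His", "Ser", "Arg"]  # qtRNA paralogs named in the text
schedules = ["lenient", "stringent", "discrete", "responsive"]
replicates = [1, 2]  # each schedule was run in duplicate

# Population indices are assigned here for illustration only.
design = [
    {"population": index, "paralog": paralog, "schedule": schedule, "replicate": rep}
    for index, (paralog, schedule, rep) in enumerate(
        product(paralogs, schedules, replicates), start=1
    )
]

print(len(design), "populations")
print(design[0])
```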

We found that the selective environment affects whether and how quickly evolved variants reach fixation in a population. For example, the variant qtRNA^Arg_TAGAU43 can only be evolved using stringent selection because the tRNA genotype initiating that evolution has high initial fitness (Fig. 5a,b). Accordingly, all qtRNA^Arg_TAGA evolution experiments arrive at a convergent solution, but the speed of evolution depends on how quickly the stringent selective environment is introduced (Fig. 5b). Further, qtRNA^Phe_TAGA, qtRNA^His_TAGA, and qtRNA^Ser_TAGA are each vulnerable to washout at high stringency due to lower overall fitness (Fig. 5c–e); thus, evolution only occurs for these qtRNAs in environments that are initially lenient. During qtRNA^Phe_TAGA evolution, an accessible and convergent single-point mutant (32A) arises that is highly fit in lenient and moderate selective environments, resulting in purification and population size growth in those environments (Fig. 5d). The qtRNA^His_TAGA evolution is more complex, containing several variants with elevated fitness composed of mutations to base 32 together with modifications in the variable loop (Fig. 5e,g). Finally, the synergistic epistasis between mutations in qtRNA^Ser_TAGA (Fig. 5a) makes purification of the highly improved qtRNA^Ser_TAGAA32-C38 mutant³¹ less robust: only one responsive environment completely purified this variant (Fig. 5c). Together, these data show how the unique fitness landscape of each biomolecule determines the dynamics of its evolution in different selective environments.

We also observed that the dynamics of environmental changes can affect the phylogenetics of evolution. Unlike Arg and Phe-qtRNA evolution, which each appear to deterministically converge on particular high-activity variants irrespective of changes in stringency timing (Fig. 5b,d), we found that the genotypes resulting from qtRNA^His_TAGA evolution are particularly sensitive to historical changes in the environment. The discrete selection schedule resulted in wide genotypic variety, with seven unique genotypes each reaching >10% of phage population share at some point during evolution (Fig. 5f). In this schedule, the arbitrary introduction of moderate stringency (t = 12 h) reproducibly enriches intermediately active variants (C32 or G32) and their phylogenetic descendants with variable loop mutations (G32-Δ48, C32-A48, C32-Δ48, C32-Δ47) (Fig. 5g), before converging on a globally optimal variant. Conversely, we see that during responsive evolution, where moderate stringency is delayed until the population is sufficiently fit (t = 18 h), a single active point mutant (Δ45) emerges as the predominant variant without widely exploring other genotypes at high population abundance (Fig. 5e,f). These data show that seemingly small perturbations to the historical selective environment, whether arbitrary or in response to a changing ecosystem, can drive purification of distinct genetic variants that are either moderately or highly fit³⁸. Collectively, these results demonstrate that although single-biomolecule evolution may appear deterministic on simple fitness landscapes with a sharp peak, more complex landscapes may produce outcomes contingent on seemingly inconsequential events²¹.

Long-running evolution with PRANCE

The longevity of most PACE experiments (>100 hours) requires the platform to be capable of performing long-running, multi-trajectory evolution experiments. Due to the large quantities of consumables used by liquid-handling robots, the need for frequent researcher intervention (for tip-replenishing) and the ongoing tip shortage resulting from the COVID-19 pandemic³⁹, we developed an optimized method capable of sterilizing and reusing tips on the robot deck (Supplemental Video 1). This optimized method uses approximately five boxes of tips per day, and can be run for over a week at a time with user intervention only once every 24 hours. During method optimization, we observed that different robotic configurations introduce varying amounts of cross-contamination when propagating highly active phage (Extended Data Fig. 8a,b). To demonstrate the capabilities of this method, we first validated that tip reuse and sterilization introduced no quantifiable cross-contamination within 12 hours (Extended Data Fig. 8c), indicating that tips could be replenished once or twice per day.

We then used this technique to enable a 10-day evolution in which we evolved T7 RNAP to bind eight new promoters (Fig. 6a,b). During this experiment, we tested three techniques to maintain large population sizes during long-running experiments: allowing the evolution to proceed without intervention (no pulse); spiking the population with bacteria expressing pIII under the phage shock promoter (psp) that enable activity-independent phage propagation periodically every 12 h (12 h pulse) or spiking only before transitions to new evolution stringency (pretransition pulse). We evolved 32 populations for 240 hours (10 days) in a single, uninterrupted experiment (Fig. 6c). During this time, no cross-contamination in the eight no-phage control wells was detected. We found that the phage titer maintenance schemes affected the genotypes that evolved (Fig. 6d). To quantify activity, individual variants containing all of the dominant mutations from each replicate (Fig. 6d and Supplementary Table 3) were subcloned into plasmid reporter constructs in which LuxAB was driven by the SP6, −3 variant or −5 variant promoter. The activities of 24 total subclones were then quantified by luminescence and compared to wild-type (WT) T7 RNAP on each respective promoter (Fig. 6e and Extended Data Fig. 9). Most variants obtained in the T7 → T3 → −3/−5 trajectories were found to exhibit between 10 and 20-fold higher activities than WT T7 RNAP on the same promoter. In addition, we found that the stringency conditions interacted with the evolution goals; for example, the reduced population size of ‘No pulse’ in the SP6 trajectory was the only condition that converged on mutations to E222, a residue associated with nonspecific promoter binding¹¹ (Fig. 6e). The highest activity variant obtained from −3 variant evolution (prepulse, population no. 2, Fig. 6e) was also the only population to reach saturation with a novel E218A mutation even though many populations obtained the M219R/K solutions described above (Fig. 6f). Thus, PRANCE enables the seamless exploration of complex, multi-mutational pathways that evolve over the course of many days and require several intermediate evolution goals.
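The three titer-maintenance schemes differ only in when the activity-independent host bacteria are added, which can be written as a small scheduling function. This is a sketch under stated assumptions: the transition times (80 h and 160 h) and the 1-h lead before each transition are hypothetical, the choice to start the periodic pulses at 12 h rather than 0 h is assumed, and `pulse_times` is a name introduced here for illustration. Only the 240-h duration and the 12-h period come from the text.

```python
def pulse_times(scheme, duration_h, transitions_h=(), period_h=12, lead_h=1):
    """Return the hours at which permissive host bacteria would be spiked in.

    transitions_h and lead_h are hypothetical inputs; the text does not state
    when stringency transitions occurred or how long before them pulses ran.
    """
    if scheme == "no pulse":
        return []
    if scheme == "12 h pulse":
        # Periodic pulses, assumed to start one period into the experiment.
        return list(range(period_h, duration_h, period_h))
    if scheme == "pretransition pulse":
        return [t - lead_h for t in transitions_h if 0 <= t - lead_h < duration_h]
    raise ValueError(f"unknown scheme: {scheme!r}")


DURATION_H = 240  # 10-day experiment, from the text
ASSUMED_TRANSITIONS_H = (80, 160)  # illustrative values only

for scheme in ("no pulse", "12 h pulse", "pretransition pulse"):
    times = pulse_times(scheme, DURATION_H, transitions_h=ASSUMED_TRANSITIONS_H)
    print(f"{scheme}: {len(times)} pulses")
```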

Discussion

PRANCE provides a number of advances over existing methods. The platform has higher throughput, supporting hundreds of diverse evolution conditions simultaneously. The 100× volume reduction and robotic compound pinning capabilities support evolution in chemical environments that were not previously practical due to reagent cost²⁸ or synthesis limitations²⁹, e.g., those involving the use of small-molecule fluorescent reporters of metabolism⁴⁰, pH⁴¹ or CO₂⁴², or supplementation with complex intermediates⁴³. Further, the system can be used with either continuously grown or chilled host cells to enable evolution at diverse temperatures. The universal 96-well format enables sample preservation for immediate or downstream analysis of historical genotypic and phenotypic changes. Additionally, the system is capable of performing feedback-controlled evolution: we demonstrated feedback control triggered by luminescence, and the system is extensible to many activity-dependent fluorescent reporters (transcription, quorum sensing, solubility, protease activity, splicing and so on) as well as any measurement that can be integrated robotically, such as PCR, binding measurements or orthogonal in vitro assays. Similarly, the system is amenable to experimentation with other feedback control algorithms³⁵ or machine learning-guided adaptive control. With the improved availability and decreasing cost of high-throughput equipment, this approach is an increasingly accessible option for many laboratories. Finally, although we demonstrate the use of this platform with bacteriophage, we believe that it is practicable to accommodate continuous evolution platforms in eukaryotes such as yeast^9,15 or possibly mammalian–virus systems^6,7.

From a strictly engineering perspective, our results highlight how multiplexed evolution enables engineering that is comprehensive, robust and systematic. With PRANCE, we sampled all parameters known to affect evolution outcomes: the initial genotype, the evolution environment and events that happen by chance. Evolution in high throughput thus improves our ability to sample the ensemble of possible outcomes. For example, even though the evolution of T7 RNAP has been extensively reported^5,10,11, we continued to identify novel genotypes with systematic parallel sampling. We observed that initiating experiments with a diverse set of tRNA paralogs was a robust strategy for evolving highly active variants and further identified that feedback control enables side-by-side evolution of biomolecules with diverse initial activity, in the absence of preexisting knowledge. Finally, we found that populations evolving with differentially timed stringency perturbations can yield distinct solutions, highlighting the use of examining many evolution schedules. By improving the reliability of laboratory evolution, PRANCE enables the reinvention of directed evolution as a more robust and reproducible engineering discipline.

Systematic evolution could be of great use to disciplines ranging from natural evolution to epidemiology. Indeed, this platform is capable of collecting datasets about molecular evolution that approach the size and scope of the long-term evolution experiment dataset of whole-genome evolution^1,2 by examining numerous individual populations and environments over many generations. By performing a single evolution with many replicates, we can characterize the reproducibility and stochasticity of single events, a longstanding challenge for evolutionary biologists. As selection schedules reconstitute the natural process of evolving related biomolecules in different selective environments, PRANCE also offers the opportunity to better understand the relationship between time, stringency, the environment and the traversal of a fitness landscape. More generally, the ability to systematically explore evolutionary outcomes across many populations may help resolve controversial questions in evolutionary biology, such as the friction between determinism and contingency. For example, we find that the timing as well as the nature of environmental changes can reproducibly affect the genotypic outcomes of individual biomolecules (that is, contingency), but outcomes vary according to the particular selective landscape (that is, determinism). Thus, PRANCE provides the opportunity to answer fundamental questions in evolutionary biology, recapitulate naturally occurring environmental changes and simulate perturbations to these environments—all within the laboratory.

Methods

See Supplementary Table 4 for part numbers and all plasmids used in these experiments.

Plaque assays

Manual plaque assays

S2060 cells were transformed with the accessory plasmid of interest. Overnight cultures of single colonies grown in 2XYT media supplemented with maintenance antibiotics were diluted 1,000-fold into fresh 2XYT media with maintenance antibiotics and grown at 37 °C with shaking at 230 r.p.m. to an optical density (OD₆₀₀) of roughly 0.6–0.8 before use. Bacteriophage were serially diluted 100-fold (four dilutions total) in H₂O. Then, 100 μl of cells were added to 100 μl of each phage dilution, and to this 0.85 ml of liquid (70 °C) top agar (2XYT media + 0.6% agar) supplemented with 2% Bluo-Gal was added and mixed by pipetting up and down once. This mixture was then immediately pipetted onto one quadrant of a quartered Petri dish already containing 2 ml of solidified bottom agar (2XYT media + 1.5% agar, no antibiotics). After solidification of the top agar, plates were incubated at 37 °C for 16–18 h. See Supplementary Table 4 for catalog numbers.
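Plaque counts from this assay convert to a titer through the standard relation titer = plaques × dilution factor / volume plated. The following is a minimal sketch of that arithmetic and is not taken from the authors' code: the function name, the convention that dilution index 1 is the first 100-fold dilution, and the example plaque count are all assumptions made for illustration.

```python
def titer_pfu_per_ml(plaques, dilution_index, fold=100, plated_ul=100):
    """Estimate the titer (pfu/ml) of the undiluted phage stock.

    plaques        -- plaques counted in one quadrant
    dilution_index -- number of serial dilution steps applied (assumed convention:
                      0 = undiluted, 1 = first 100-fold dilution, and so on)
    fold           -- dilution per step (100-fold in the manual protocol)
    plated_ul      -- volume of diluted phage plated (100 ul in the manual protocol)
    """
    if plaques < 0 or dilution_index < 0 or plated_ul <= 0:
        raise ValueError("counts, dilution index and volume must be non-negative")
    dilution_factor = fold ** dilution_index
    # Multiply by 1000 to convert the plated volume from microlitres to millilitres.
    return plaques * dilution_factor * 1000 / plated_ul


# Hypothetical example: 30 plaques counted at the second 100-fold dilution.
example_titer = titer_pfu_per_ml(plaques=30, dilution_index=2)
print(f"{example_titer:.1e} pfu/ml")
```

For the robotics-accelerated variant, the same function would apply with the plated phage volume used in that protocol.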

Robotics-accelerated plaque assays

The same procedure was followed as above, except that plating of the plaque assays was done by a liquid-handling robot (Hamilton Robotics) by plating 20 μl of bacterial culture and 100 μl of phage dilution with 200 μl of soft agar onto a well of a 24-well plate already containing 235 μl of hard agar per well. To prevent premature cooling of soft agar, the soft agar was placed on the deck in a 70 °C heat block. Source code from our implementation can be found at https://github.com/dgretton/roboplaque.

Phage enrichment assays

S2060 cells were transformed with the accessory plasmid of interest as described above. Overnight cultures of single colonies grown in 2XYT media supplemented with maintenance antibiotics were diluted 1,000-fold into Davis Rich Media (DRM) with maintenance antibiotics and grown at 37 °C with shaking at 230 r.p.m. to an OD₆₀₀ of roughly 0.4–0.6. Cells were then infected with bacteriophage at a starting titer of 10⁵ pfu ml⁻¹ and incubated for another 16–18 h at 37 °C with shaking at 230 r.p.m. Supernatant was filtered (as described) and stored at 4 °C. The phage titer of these samples was measured in an activity-independent manner using a plaque assay containing E. coli bearing pJC175e (as described). A complete list of catalog numbers can be found in Supplementary Table 4.

Measurement of biomolecule activity using luciferase reporter

S2060 cells were transformed with the luciferase-based activity reporter and biomolecule expression plasmids. Overnight cultures of single colonies grown in DRM media supplemented with maintenance antibiotics were diluted 500-fold into DRM media with maintenance antibiotics in a 96-well 2-ml deep well plate, with or without isopropyl-β-d-thiogalactoside (IPTG) inducer. The plate was sealed with a porous sealing film and grown at 37 °C with shaking at 230 r.p.m. for 1 h. Next, 175 μl of cells were transferred to a 96-well black-walled clear-bottom plate, and then 600 nm absorbance and luminescence were read using a CLARIOstar plate reader (BMG Labtech) over the course of 8 h, during which the cultures were incubated at 37 °C.

Equipment set-up for PRANCE

Media

DRM²⁶ is used for PRANCE and all experiments involving plate reader measurements due to its low fluorescence and luminescence background. 2XYT media, a medium optimized for phage growth, is used for all other purposes, including phage-based selection assays and general cloning. See Supplementary Table 4 for media catalog numbers. Antibiotics (Gold Biotechnology) were used at the following working concentrations: carbenicillin, 50 μg ml⁻¹; spectinomycin, 100 μg ml⁻¹; chloramphenicol, 40 μg ml⁻¹; kanamycin, 30 μg ml⁻¹; tetracycline, 10 μg ml⁻¹ and streptomycin, 50 μg ml⁻¹.

General robotic equipment and configuration

A Hamilton Microlab STARlet eight-channel base model was augmented with a Hamilton CO-RE 96 Probe Head, a Hamilton iSWAP Robotic Transport Arm and a Dual Chamber Wash Station. Air filtration was provided by an overhead high-efficiency particulate air filter fan module integrated into the robot enclosure. A BMG CLARIOstar luminescence multi-mode microplate reader was positioned inside the enclosure, within reach of the transport arm.

Mini peristaltic pump array

Up to seven miniature 12-V, 60 ml min⁻¹ peristaltic pumps (fish tank pumps) were actuated by custom motor drivers. A Raspberry Pi single-board computer received instructions over a local internet protocol address and commanded the motor drivers via I²C. A complete list of catalog numbers can be found in Supplementary Table 4.

Bacterial reservoir

We implemented a self-cleaning bacterial reservoir consisting of the pump array and a custom 3D-printed corrugated reservoir (Extended Data Fig. 1): see Supplementary Table 4 for the .stl file. Following each filling of the bacterial reservoir with fresh bacteria and liquid exchange (below), remaining culture was drained from the reservoir and the reservoir was automatically rinsed with 5% bleach once and water four times. The reservoir is seated on a Big Bear Automation Microplate Orbital Shaker, which shakes the reservoir throughout all self-cleaning steps to achieve uniform diffusion.

Software

A general-purpose driver method was created using MicroLab STAR VENUS ONE software and compiled to Hamilton Scripting Language (hsl) format. Instantiation of this method and management of its local network connection were handled in Python. The PyHamilton Python package²² provided an overlying control layer. Interfaces to the CLARIOstar plate reader, pump array and orbital shaker were encapsulated in supporting Python packages. We used Git to develop and version control the packages and the specific Python methods used for each experiment; our software implementation can be found on GitHub at https://github.com/dgretton/std-96-pace.

PRANCE experimental set-up

Mutagenesis functionality plating test

To avoid premature induction of the mutagenesis plasmid (which can mutagenize the mutagenesis plasmid itself and lead to loss of inducible mutagenesis), all bacteria containing mutagenesis plasmids were repressed with 20 mM glucose and were not grown to a density higher than OD₆₀₀ = 1.0. To confirm mutagenesis plasmid functionality, a bacterial culture sample or four single-colony transformants were picked, resuspended in DRM, serially diluted and plated on 2XYT media + 1.5% agar with appropriate antibiotics (at half of the usual concentrations) with either 25 mM arabinose or 20 mM glucose. Bacteria containing functional mutagenesis plasmid 6 (MP6) exhibit smaller colonies on the arabinose plate than on the glucose plate.

PRANCE host bacterial strain preparation

To prepare host bacterial strains, the accessory plasmid of interest and mutagenesis plasmid (MP6)²⁴ were transformed into S2060 cells as described. During MP6 transformation, cells were recovered with DRM to ensure MP6 repression and were streaked on 2XYT media + 1.5% agar with appropriate antibiotics (at half of the usual concentrations) and 20 mM glucose. Transformants were picked for the mutagenesis functionality plating test (above), and those same transformants were grown at 37 °C with shaking at 230 r.p.m. to OD₆₀₀ 0.4. This culture was used to seed a PRANCE culture in a manner dependent on the specific PRANCE experiment:

To initiate a turbidostat, prepared host bacteria were used to seed DRM supplemented with 25 μg ml⁻¹ carbenicillin and 20 μg ml⁻¹ chloramphenicol in a turbidostat (implemented by a standard turbidity probe and glassware vessel). The bacteria were grown with stirring at 37 °C to OD₆₀₀ = 0.8.
To initiate chemostats, prepared host bacteria were used to seed 150 ml of DRM supplemented with appropriate antibiotics (at half of the usual concentrations) in three separate chemostat vessels. The chemostat cultures were kept at roughly 100–150 ml and grown with stirring at 37 °C. The flow-through and volume of the chemostats were manually adjusted as needed to maintain the target OD₆₀₀ = 0.8 in all vessels.
To prepare chilled bacterial culture for use in PRANCE, an overnight culture of prepared host bacteria was diluted 1:1,000 into 2 × 600 ml of DRM in two separate 2 l baffled flasks with appropriate antibiotics (at half of the usual concentrations). The bacteria were grown at 37 °C with shaking at 230 r.p.m. to OD₆₀₀ of roughly 0.3–0.4. The two 600-ml cultures were combined and stored at 4 °C for up to 4 days before being situated in a 4 °C refrigerator adjacent to the robot.

MP6 functionality is also confirmed at the end of the experiment with the arabinose/glucose plating test (above).

Anticipated mutagenesis rates

Use of the MP6 mutagenesis plasmid increases the mutation rate to two mutations per kilobasepair²⁴. For example, when evolving tRNAs (roughly 75 basepairs), a tRNA will experience an average of 0.15 mutations within the tRNA itself during a single round of phage genome replication in a PRANCE experiment. This corresponds to a rate of roughly 0.023 double mutants per replication (0.15²), providing an opportunity to traverse single-mutant fitness valleys. As an approximation, the average phage replicates roughly 24 times per day at steady state⁵ and consequently generates roughly 0.55 double mutants per phage per day within the tRNA. PRANCE experiments typically harbor around 10⁸ infected cells per ml. Thus, for 0.5-ml populations, we expect each experiment to visit roughly 2.5 × 10⁷ double mutants per day, which should reliably cover the 24,975 unique possible double mutants:

$$\binom{75}{2} \times 3^{2} = 2{,}775 \times 9 = 24{,}975\ \text{unique double mutants}$$
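The arithmetic in this section can be reproduced in a few lines. The sketch below uses only the figures quoted in the text; the variable names are illustrative, and treating the double-mutant rate as the square of the single-mutation rate is the approximation that the quoted figures imply. The unrounded product comes out slightly above the rounded 2.5 × 10⁷ in the text, but within the same order of magnitude.

```python
from math import comb

# Figures quoted in the text.
MUTATIONS_PER_BP = 2 / 1000        # MP6: two mutations per kilobasepair
TRNA_LENGTH_BP = 75                # approximate tRNA length
REPLICATIONS_PER_DAY = 24          # phage replications per day at steady state
INFECTED_CELLS_PER_ML = 1e8        # typical PRANCE population density
POPULATION_VOLUME_ML = 0.5         # volume of one evolving population

# Expected mutations in the tRNA per replication (about 0.15).
mutations_per_replication = MUTATIONS_PER_BP * TRNA_LENGTH_BP

# Approximate double-mutant rate per replication (about 0.0225).
double_mutant_rate = mutations_per_replication ** 2

# Double mutants generated per phage per day (about 0.54).
doubles_per_phage_per_day = double_mutant_rate * REPLICATIONS_PER_DAY

# Double mutants sampled per population per day (about 2.7e7).
population_size = INFECTED_CELLS_PER_ML * POPULATION_VOLUME_ML
doubles_per_day = doubles_per_phage_per_day * population_size

# Unique double mutants: choose 2 of 75 positions, 3 alternative bases at each.
unique_double_mutants = comb(TRNA_LENGTH_BP, 2) * 3 ** 2

print(f"double mutants sampled per day: {doubles_per_day:.2e}")
print(f"unique double mutants: {unique_double_mutants}")
print(f"fold coverage per day: {doubles_per_day / unique_double_mutants:.0f}")
```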

General PRANCE implementation

Host bacteria were pumped from their source (turbidostat, chemostat or refrigerated chilled culture) into the on-deck bacterial reservoir. Arabinose was added to the reservoir to a final concentration of 10 mM to induce mutagenesis. Then, 250 μl of bacteria were robotically transferred to each replicate (500 μl per replicate) in deep 96-well plates situated on the deck of the liquid-handling robot, which was kept at 37 °C. To avoid edge effects, replicates are placed in every other column (Extended Data Fig. 7). This liquid exchange occurs every 30 min continuously throughout the duration of the experiment, generating a flow-through rate of 1 vol h⁻¹ in each replicate. After allowing mutagenesis induction for 1 h, each replicate was inoculated with selection phage to a starting titer between 10⁵ and 10⁸ pfu ml⁻¹, with appropriate replicates set aside as no-phage controls. Sampling and plate reading of replicate waste liquid was automatically carried out every 1–4 h as described. After completion of PRANCE, the phage supernatant from select replicates was filtered and plaqued; clonal plaques were expanded overnight, filtered and Sanger sequenced.

Semicontinuous flow using pipetting

To perform liquid exchange, every 30 min, automated pipetting was used to exchange liquid in on-deck replicates. The procedure consisted of tip pickup, aspiration of fresh bacterial culture, dispense of new culture into replicate vessels, mixing of liquid, aspiration of waste, dispense of waste into the waste reservoir and tip disposal.
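The relationship between exchange volume, exchange interval and flow-through rate can be made explicit. The sketch below is illustrative rather than part of the published software: the function names are hypothetical, and the second function assumes ideal mixing and describes only material that does not replicate.

```python
def flow_through_rate(exchange_ul, replicate_ul, interval_min):
    """Effective flow-through rate in replicate volumes per hour."""
    bottleneck_fraction = exchange_ul / replicate_ul   # fraction replaced per exchange
    exchanges_per_hour = 60 / interval_min
    return bottleneck_fraction * exchanges_per_hour


def fraction_remaining(bottleneck_fraction, n_exchanges):
    """Fraction of non-replicating material left after n exchanges (ideal mixing)."""
    return (1 - bottleneck_fraction) ** n_exchanges


# Values from the text: 250 ul exchanged into a 500 ul replicate every 30 min.
rate = flow_through_rate(exchange_ul=250, replicate_ul=500, interval_min=30)
print(f"flow-through rate: {rate} vol/h")
print(f"non-replicating material left after 1 h: {fraction_remaining(0.5, 2)}")
```

Under these assumptions, a phage that cannot replicate is diluted fourfold every hour, which is what makes the exchange schedule act as a selection.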

Automated plate reading

At specified times, 175 μl of the waste samples are deposited into plate reader plates. To read the plates, the plate gripper transports the plate to the integrated plate reader for measurement of absorbance at 600 nm (OD₆₀₀) and of fluorescence or luminescence as required.

Choosing number of replicates

Consult Extended Data Fig. 1 to select a configuration (number of populations, bottlenecking fraction and hands-off time) that meets your experimental requirements. There are two primary limiting factors: the amount of time the robot requires to perform each necessary step within a given time frame, and the number of consumable reagents (tips, plates and so on) that can be physically staged on the deck of the robot. For every individual PRANCE run, we select and tune the method configurations depending on the requirements of each individual experiment (Supplementary Table 1). All of our experiments are run at an effective flow-through rate of 1 vol h⁻¹, with a bottlenecking fraction of 0.5. A configuration we frequently preferred consisted of 48 evolving populations, requiring experimenter intervention once every 13 h. Alternatively, when operating at maximum capacity (1 vol h⁻¹, 96 populations) PRANCE requires intervention once every 7 h. The tradeoff between populations and hands-off time will depend on the capacity of your robot deck.
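The two configurations quoted above correspond to similar numbers of population exchanges between interventions (48 × 2 × 13 = 1,248 and 96 × 2 × 7 = 1,344), which is consistent with a roughly fixed consumable budget per deck loading. The sketch below turns that observation into a rough planning estimate. Treating the budget as a single number, and the function and parameter names, are assumptions of this sketch and not the model behind Extended Data Fig. 1.

```python
def hands_off_hours(consumable_budget, populations, ops_per_hour=2,
                    units_per_population_per_op=1):
    """Rough hours of unattended operation for a given deck loading.

    consumable_budget -- hypothetical count of consumable units (for example,
                         tips) staged on the deck per intervention
    populations       -- number of evolving populations
    ops_per_hour      -- liquid exchanges per hour (2 for a 30 min cycle)
    """
    if populations <= 0 or ops_per_hour <= 0 or units_per_population_per_op <= 0:
        raise ValueError("populations, ops_per_hour and units must be positive")
    used_per_hour = populations * ops_per_hour * units_per_population_per_op
    return consumable_budget / used_per_hour


# Budgets back-calculated from the two configurations quoted in the text.
print(hands_off_hours(1248, populations=48))
print(hands_off_hours(1344, populations=96))
```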

Automated tip sterilization and reuse

During the COVID-19 pandemic, supplies of consumable labware involved in viral diagnostic tests became constrained³⁹, including supplies of pipette tips for Hamilton robots. Because a typical evolution involves many liquid-handling steps, it is preferable to reuse the same tips over the course of an evolution rather than replace each tip after every step. Reuse avoids adding to the extensive backlog of constrained supplies, saves the time spent manually reloading new tips onto the deck and frees deck space for increasingly advanced evolution methods. To implement tip reuse, tips must be cleaned and sterilized with bleach after every liquid-handling step to avoid cross-contamination between replicates. Water rinses are performed after sterilization to prevent bleach carryover into replicates. The equipment required to implement this protocol is the Hamilton washer module, for pumping fresh water and 10% bleach into basins, and four 500-ml reservoirs for holding 10% bleach (first) or tap water (remaining three).

Tip washing and sterilization are performed at two distinct points during each robotic cycle. The first point is after bacteria are moved from holding wells to replicates containing M13 phage using the 96-channel head. The 96-channel head then dispenses residual liquid from the previous step into the bleach basin on the Hamilton washer module, then submerges the tips in a bleach reservoir on the robot deck. The 96-channel head next aspirates and dispenses freshly pumped water from the water basin on the Hamilton washer module as a rinsing step. Finally, the 96-channel head performs consecutive rinses in each on-deck water reservoir to remove any residual bleach. The second point is after culture is moved from viral replicates to reader plates with the eight-channel head: immediately after each dispense to a reader plate, the eight-channel head performs the same series of actions described for the 96-channel head. To allow adequate drying time after each wash, the robot script rotates through five 96-tip racks, ensuring that washed tips are not reused for at least one whole robotic cycle.

Luminescence assays were performed to validate that this protocol prevents cross-contamination and bleach carryover between wells. Before inoculation, fresh tips were used and sterilized. To assess the effectiveness of tip sterilization, tips were submerged directly in phage carrying T7 RNAP, sterilized and then used to maintain psp-pIII bacterial cultures using a high-throughput turbidostat method as previously described²². These cultures were compared to no-phage negative controls and to positive controls inoculated with the same phage, in 24 replicates. This protocol may be used to implement pipette tip sterilization and reuse for general experimental protocols run on Hamilton robots, including COVID-19 quantitative PCR sample preparation.

Next-generation sequencing

qtRNAs were amplified from phage populations (four qtRNA evolution experiments, two replicates, ten time points, four evolution conditions) with primers carrying barcodes for qtRNA and time point. Barcoded samples were pooled and sequenced as Illumina paired-end reads.

Data analysis

Evolution times for each data point were calculated from the inflection point of the data fit to a weighted n-parameter logistic regression with the nplr package in R. The distribution of total evolution times was calculated by fitting univariate distributions to the data by maximum likelihood estimation, and the Kolmogorov–Smirnov and Cramér–von Mises goodness-of-fit statistics were calculated with the fitdistrplus package in R⁴⁷.

Feedback controller calculation

For each individual replicate, luminescence is first normalized to absorbance. Each replicate reading is then smoothed over an interval of n = 3 and stringency adjustment is triggered after three sequential events in which the reading is 3σ over background. Background is quantified as the average of no-phage controls (luminescence/absorbance). The second stringency adjustment (moderate → stringent) is triggered similarly, following a 12 h delay to allow for luminescence equilibration.
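The trigger logic described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the smoothing is assumed to be a trailing moving average over three readings, and σ is assumed to be the sample standard deviation of the normalized no-phage control readings.

```python
from statistics import mean, stdev


def normalize(luminescence, absorbance):
    """Normalize each luminescence reading to its paired absorbance reading."""
    return [lum / od for lum, od in zip(luminescence, absorbance)]


def smooth(values, window=3):
    """Trailing moving average; early points average whatever is available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def first_trigger_index(readings, background, n_sigma=3.0, consecutive=3):
    """Index of the reading that completes `consecutive` sequential smoothed
    readings above mean(background) + n_sigma * stdev(background), else None.
    `background` needs at least two normalized no-phage control readings."""
    threshold = mean(background) + n_sigma * stdev(background)
    run = 0
    for i, value in enumerate(smooth(readings)):
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return i
    return None


# Hypothetical normalized readings for one replicate and for no-phage controls.
controls = [1.0, 1.1, 0.9, 1.0]
replicate = [1.0, 1.0, 1.1, 2.0, 4.0, 8.0, 9.0]
print(first_trigger_index(replicate, controls))
```

Under the same assumptions, the second adjustment could reuse `first_trigger_index` on the readings collected after the 12 h delay.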

Phylogenetic trees

Phylogenetic lineages were identified with R packages ade4⁴⁸, ape⁴⁹, adegenet⁵⁰ and phyloseq⁵¹ and visualized with ggtree⁵².

Muller plots

Parent lineages were identified from phylogenetic trees and then used to generate Muller plots with the R package ggmuller⁵³. For visualization purposes, abundance was adjusted for population size by scaling the fraction by the total amount of luminescence at each given time point. The actual sequencing abundance fractions of each replicate per population are shown in Fig. 5f.
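The population-size adjustment amounts to multiplying each lineage's sequencing fraction by the total luminescence measured at the same time point. A minimal sketch is given below, with a hypothetical function name and made-up example values; the published analysis was performed in R.

```python
def scale_by_population_size(fractions_by_time, luminescence_by_time):
    """Scale lineage fractions by total luminescence at each time point.

    fractions_by_time    -- list of {lineage: fraction} dicts, one per time point
    luminescence_by_time -- list of total luminescence values, same order
    """
    if len(fractions_by_time) != len(luminescence_by_time):
        raise ValueError("one luminescence value is needed per time point")
    scaled = []
    for fractions, luminescence in zip(fractions_by_time, luminescence_by_time):
        scaled.append({lineage: fraction * luminescence
                       for lineage, fraction in fractions.items()})
    return scaled


# Hypothetical example: two lineages tracked over two time points.
fractions = [{"A": 0.25, "B": 0.75}, {"A": 0.5, "B": 0.5}]
luminescence = [100.0, 400.0]
print(scale_by_population_size(fractions, luminescence))
```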

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All data are publicly available at https://doi.org/10.17632/h9z94f9y6p.1 (https://data.mendeley.com/datasets/h9z94f9y6p/1). Source data are provided with this paper.

Code availability

All code will be made available via GitHub; see https://github.com/dgretton/std-96-pace and https://github.com/dgretton/roboplaque.

References

Good, B. H., McDonald, M. J., Barrick, J. E., Lenski, R. E. & Desai, M. M. The dynamics of molecular evolution over 60,000 generations. Nature 551, 45–50 (2017).
Barrick, J. E. et al. Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461, 1243–1247 (2009).
Zhao, H., Giver, L., Shao, Z., Affholter, J. A. & Arnold, F. H. Molecular evolution by staggered extension process (StEP) in vitro recombination. Nat. Biotechnol. 16, 258–261 (1998).
Romero, P. A. & Arnold, F. H. Exploring protein fitness landscapes by directed evolution. Nat. Rev. Mol. Cell Biol. 10, 866–876 (2009).
Esvelt, K. M., Carlson, J. C. & Liu, D. R. A system for the continuous directed evolution of biomolecules. Nature 472, 499–503 (2011).
Berman, C. M. et al. An adaptable platform for directed evolution in human cells. J. Am. Chem. Soc. 140, 18093–18103 (2018).
English, J. G. et al. VEGAS as a platform for facile directed evolution in mammalian cells. Cell https://doi.org/10.1016/j.cell.2019.05.051 (2019).
Crook, N. et al. In vivo continuous evolution of genes and pathways in yeast. Nat. Commun. 7, 13051 (2016).
Ravikumar, A., Arzumanyan, G. A., Obadi, M. K. A., Javanpour, A. A. & Liu, C. C. Scalable, continuous evolution of genes at mutation rates above genomic error thresholds. Cell 175, 1946–1957.e13 (2018).
Leconte, A. M. et al. A population-based experimental model for protein evolution: effects of mutation rate and selection stringency on evolutionary outcomes. Biochemistry 52, 1490–1499 (2013).
Dickinson, B. C., Leconte, A. M., Allen, B., Esvelt, K. M. & Liu, D. R. Experimental interrogation of the path dependence and stochasticity of protein evolution using phage-assisted continuous evolution. Proc. Natl Acad. Sci. USA 110, 9007–9012 (2013).
Hu, J. H. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature https://doi.org/10.1038/nature26155 (2018).
Wong, B. G., Mancuso, C. P., Kiriakov, S., Bashor, C. J. & Khalil, A. S. Precise, automated control of conditions for high-throughput growth of yeast and bacteria with eVOLVER. Nat. Biotechnol. 36, 614–623 (2018).
Horinouchi, T., Minamoto, T., Suzuki, S., Shimizu, H. & Furusawa, C. Development of an automated culture system for laboratory evolution. J. Lab. Autom. 19, 478–482 (2014).
Zhong, Z. et al. Automated continuous evolution of proteins in vivo. ACS Synth. Biol. 9, 1270–1276 (2020).
Meyer, M. M. et al. Library analysis of SCHEMA-guided protein recombination. Protein Sci. 12, 1686–1693 (2003).
Crameri, A., Raillard, S. A., Bermudez, E. & Stemmer, W. P. DNA shuffling of a family of genes from diverse species accelerates directed evolution. Nature 391, 288–291 (1998).
Karanicolas, J. et al. A de novo protein binding pair by computational design and directed evolution. Mol. Cell 42, 250–260 (2011).
Amini, Z. N. & Müller, U. F. Low selection pressure aids the evolution of cooperative ribozyme mutations in cells. J. Biol. Chem. 288, 33096–33106 (2013).
Zinkus-Boltz, J., DeValk, C. & Dickinson, B. C. A phage-assisted continuous selection approach for deep mutational scanning of protein–protein interactions. ACS Chem. Biol. 14, 2757–2767 (2019).
Blount, Z. D., Lenski, R. E. & Losos, J. B. Contingency and determinism in evolution: replaying life’s tape. Science 362, eaam5979 (2018).
Chory, E. J., Gretton, D. W., DeBenedictis, E. A. & Esvelt, K. M. Enabling high-throughput biology with flexible open-source automation. Mol. Syst. Biol. 17, e9942 (2021).
Balagaddé, F. K., You, L., Hansen, C. L., Arnold, F. H. & Quake, S. R. Long-term monitoring of bacteria undergoing programmed population control in a microchemostat. Science 309, 137–140 (2005).
Badran, A. H. & Liu, D. R. Development of potent in vivo mutagenesis plasmids with broad mutational spectra. Nat. Commun. 6, 8425 (2015).
Polycarpo, C. et al. An aminoacyl-tRNA synthetase that specifically activates pyrrolysine. Proc. Natl Acad. Sci. USA 101, 12450–12454 (2004).
Bryson, D. I. et al. Continuous directed evolution of aminoacyl-tRNA synthetases. Nat. Chem. Biol. https://doi.org/10.1038/nchembio.2474 (2017).
Umehara, T. et al. N-acetyl lysyl-tRNA synthetases evolved by a CcdB-based selection possess N-acetyl lysine specificity in vitro and in vivo. FEBS Lett. 586, 729–733 (2012).
Chin, J. W. et al. Addition of p-azido-l-phenylalanine to the genetic code of Escherichia coli. J. Am. Chem. Soc. 124, 9026–9027 (2002).
Srinivasan, G., James, C. M. & Krzycki, J. A. Pyrrolysine encoded by UAG in Archaea: charging of a UAG-decoding specialized tRNA. Science 296, 1459–1462 (2002).
DeBenedictis, E. A., Carver, G. D., Chung, C. Z., Söll, D. & Badran, A. H. Multiplex suppression of four quadruplet codons via tRNA directed evolution. Nat. Commun. 12, 5706 (2021).
Magliery, T. J., Anderson, J. C. & Schultz, P. G. Expanding the genetic code: selection of efficient suppressors of four-base codons and identification of ‘shifty’ four-base codons with a library approach in Escherichia coli. J. Mol. Biol. 307, 755–769 (2001).
Anderson, J. C., Magliery, T. J. & Schultz, P. G. Exploring the limits of codon and anticodon size. Chem. Biol. 9, 237–244 (2002).
Wang, K., Neumann, H., Peak-Chew, S. Y. & Chin, J. W. Evolved orthogonal ribosomes enhance the efficiency of synthetic genetic code expansion. Nat. Biotechnol. 25, 770–777 (2007).
DeBenedictis, E., Söll, D. & Esvelt, K. Measuring the tolerance of the genetic code to altered codon size. Preprint at bioRxiv https://doi.org/10.1101/2021.04.26.441066 (2021).
Nourmohammad, A. & Eksin, C. Optimal evolutionary control for artificial selection on molecular phenotypes. Phys. Rev. X 11, 011044 (2021).
Simutis, R. & Lübbert, A. Bioreactor control improves bioprocess performance. Biotechnol. J. 10, 1115–1130 (2015).
Hartl, R. F., Mehlmann, A. & Novak, A. Cycles of fear: periodic bloodsucking rates for vampires. J. Optim. Theory Appl. 75, 559–568 (1992).
Baym, M. et al. Spatiotemporal microbial evolution on antibiotic landscapes. Science 353, 1147–1151 (2016).
Wadman, M. United States rushes to fill void in viral sequencing. Science 371, 657–658 (2021).
Zhao, Y. & Yang, Y. Profiling metabolic states with genetically encoded fluorescent biosensors for NADH. Curr. Opin. Biotechnol. 31, 86–92 (2015).
Zhang, L. et al. Ratiometric fluorescent pH-sensitive polymers for high-throughput monitoring of extracellular pH. RSC Adv. 6, 46134–46142 (2016).
Zhujun, Z. & Seitz, W. R. A carbon dioxide sensor based on fluorescence. Anal. Chim. Acta 160, 305–309 (1984).
Cho, I., Jia, Z.-J. & Arnold, F. H. Site-selective enzymatic C‒H amidation for synthesis of diverse lactams. Science 364, 575–578 (2019).
Jiang, R. & Krzycki, J. A. PylSn and the homologous N-terminal domain of pyrrolysyl-tRNA synthetase bind the tRNA that is essential for the genetic encoding of pyrrolysine. J. Biol. Chem. 287, 32738–32746 (2012).
Suzuki, T. et al. Crystal structures reveal an elusive functional domain of pyrrolysyl-tRNA synthetase. Nat. Chem. Biol. 13, 1261–1266 (2017).
Cheetham, G. M. T., Jeruzalmi, D. & Steitz, T. A. Structural basis for initiation of transcription from an RNA polymerase–promoter complex. Nature 399, 80–83 (1999).
Delignette-Muller, M. L. et al. fitdistrplus: an R package for fitting distributions. J. Stat. Softw. 64, 1–34 (2015).
Dray, S. et al. The ade4 package: implementing the duality diagram for ecologists. J. Stat. Softw. 22, 1–20 (2007).
Paradis, E. & Schliep, K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35, 526–528 (2019).
Jombart, T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24, 1403–1405 (2008).
McMurdie, P. J. & Holmes, S. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8, e61217 (2013).
Yu, G. Using ggtree to visualize data on tree-like structures. Curr. Protoc. Bioinforma. 69, e96 (2020).
Noble, R. ggmuller: create muller plots of evolutionary dynamics (GitHub, 2019).

Acknowledgements

We acknowledge W. Consigli, W. Fu, A. Cuevas and others at Hamilton Robotics for their guidance and assistance. We thank K. Prather’s laboratory for equipment use and assistance. We thank E. Alley, S. Von Stetina and B. Thuronyi for their thoughtful comments on the paper. This work was supported by the MIT Media Laboratory, an Alfred P. Sloan Research Fellowship (to K.M.E.), gifts from the Open Philanthropy Project and the Reid Hoffman Foundation (to K.M.E.), and the National Institute of Diabetes and Digestive and Kidney Diseases (grant no. R00 DK102669-01 to K.M.E.). E.A.D. was supported by the National Institute of Allergy and Infectious Diseases (grant no. F31 AI145181-01). E.J.C. was supported by a Ruth L. Kirschstein NRSA fellowship from the National Cancer Institute (grant no. F32 CA247274-01).

Author information

These authors contributed equally: Erika A. DeBenedictis, Emma J. Chory.

Authors and Affiliations

Media Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
Erika A. DeBenedictis, Emma J. Chory, Dana W. Gretton, Brian Wang, Stefan Golas & Kevin M. Esvelt
Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
Erika A. DeBenedictis & Emma J. Chory
Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Emma J. Chory
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Emma J. Chory

Authors

Erika A. DeBenedictis
Emma J. Chory
Dana W. Gretton
Brian Wang
Stefan Golas
Kevin M. Esvelt

Contributions

E.A.D. and K.M.E. conceived the study. E.A.D. and D.W.G. developed the platform with support from E.J.C. and advice from K.M.E. D.W.G. and S.G. developed the software with advice from E.A.D., E.J.C. and K.M.E. E.A.D., E.J.C. and D.W.G. designed the experiments with advice from K.M.E. E.A.D., E.J.C., D.W.G. and B.W. performed the experiments. E.J.C. analyzed and visualized data. E.J.C., E.A.D., B.W. and K.M.E. wrote the paper with input from all authors.

Corresponding authors

Correspondence to Erika A. DeBenedictis or Kevin M. Esvelt.

Ethics declarations

Competing interests

E.A.D. and K.M.E. have filed US Patent 16405380 on this work.

Peer review information

Nature Methods thanks Arjun Ravikumar and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Rita Strack was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 PRANCE optimization.

(a) Robotic manipulations operate in a loop, which repeats every 30 minutes. (b) Culture source fluidics (media, turbidostat/static culture, waste) are peripherally separated from the robot. The maximum flow-through rate is determined by the frequency with which the robot exchanges liquid (operations per hour), as well as the fraction of the standing volume of the population that is exchanged during each operation (the bottlenecking fraction). There is a trade-off between the maximum flow rate and the extent of bottlenecking. (c) The number of populations that can be serviced assuming 2 robot operations per hour (ops/hr) impacts the experimenter-free/hands-off operation time of the robot. (d) Larger robot decks can fit more tip carriers, more tips, and therefore require less frequent servicing.

Extended Data Fig. 2 Relationship between real-time monitoring data and phage titer.

(a) Correlation between absorbance depression and luminescence for each evolving replicate. Kernel density estimates of the absorbance and luminescence data for the population are plotted on x and y axis, respectively. Luminescence data from Fig. 2b. (b) Comparison of real-time luminescence tracking (top) and corresponding phage titer as measured by traditional plaque assays (bottom). See Supplementary Table 1 for evolution construct details.

Source data

Extended Data Fig. 3 Reservoir diagrams.

Schematics of the 8-channel and 96-channel media reservoirs. These were printed on a Form 3 resin 3D printer. See Supplementary Table 4 for the .stl files for each.

Extended Data Fig. 4 Failure mode analysis.

Analysis of failure modes used to improve reliability and error handling.

Extended Data Fig. 5 Stochasticity of T7 RNAP evolution.

(a) Validating T7 mutagenesis with cool PRANCE. MP-containing bacteria were either (1) induced prior to cooling to 4 °C, (2) given no inducer or (3) induced on the robotic deck at 37 °C, and their luminescence was tracked for 30 hours to validate that mutagenesis behaved similarly to induction of cultures directly from a turbidostat (see Fig. 1d). (b) Real-time absorbance depression monitoring of 90 simultaneous directed evolution experiments with 6 no-phage controls, fit with a binomial regression of the total data. (c) Logistic regression of each luminescence trace during T7 RNAP evolution to bind the T3 promoter, used to calculate the average evolution times (Supplemental Methods). (d) Goodness-of-fit estimates of a logistic distribution of the total T7 evolution time data.
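As a rough illustration of how a logistic fit to a luminescence trace can yield an evolution time, the sketch below estimates the midpoint of a logistic curve by linear regression on logit-transformed data. This is a simplified stand-in, not the procedure from the Supplemental Methods: the plateau value `l_max`, the 5–95% fitting window and all function names are assumptions.

```python
import math


def logistic(t, l_max, k, t0):
    """Logistic curve with plateau l_max, rate k and midpoint (inflection) t0."""
    return l_max / (1.0 + math.exp(-k * (t - t0)))


def fit_inflection_time(times, lum, l_max=None):
    """Return (k, t0) from a least-squares line through logit(lum / l_max)."""
    if l_max is None:
        l_max = max(lum) * 1.05  # assumed plateau slightly above the observed maximum
    xs, ys = [], []
    for t, y in zip(times, lum):
        frac = y / l_max
        if 0.05 < frac < 0.95:  # keep points where the logit is well conditioned
            xs.append(t)
            ys.append(math.log(frac / (1.0 - frac)))
    if len(xs) < 2:
        raise ValueError("need at least two points in the rising phase")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    intercept = mean_y - k * mean_x
    return k, -intercept / k  # logit = k * (t - t0), so t0 = -intercept / k


if __name__ == "__main__":
    # Synthetic trace sampled every 30 minutes for 24 hours (hypothetical values).
    times = [0.5 * i for i in range(49)]
    trace = [logistic(t, 1000.0, 0.8, 12.0) for t in times]
    k, t0 = fit_inflection_time(times, trace, l_max=1000.0)
    print(f"rate={k:.3f} per hour, inflection time={t0:.2f} h")
```

For noisy experimental traces, a nonlinear least-squares fit that also estimates the plateau would likely be more robust than this linearization.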

Source data

Extended Data Fig. 6 TAGA-qtRNA and AGGG-qtRNA PRANCE.

(a) Constructs for evolving TAGA-decoding qtRNAs. (b) Representative results for evolving TAGA-decoding qtRNAs. (c) Evolved qtRNAs exhibit increased ability to decode a TAGA quadruplet codon. Units are the percent luminescence when translating luxAB-357-TAGA in the presence of the qtRNA relative to expression of all-triplet-luxAB. (d) Evolved genotypes. (e) Real-time absorbance and luminescence monitoring of qtRNA-encoding phage where either randomized or directed anticodons were used to evolve AGGG-containing codons.

Source data

Extended Data Fig. 7 Edge effects.

Comparison between 96 replicates implemented in a densely packed 96-well plate (left) and 96 replicates split over two plates to reduce edge effects (right). The plots below show the minimum time to phage detection via luminescence monitoring. Both plates above are normalized to their internal maximum value.

Source data

Extended Data Fig. 8 Tip contamination, sterilization and reuse.

To assess the maximum amount of possible cross-contamination, T7 RNAP-containing phage were inoculated into cultures containing pT7-psp-LuxAB bacteria in a grid-like pattern with 48 phage-containing wells and 48 no-phage-containing wells. PRANCE steps were performed with either (a) the 96-channel head or (b) the 8-channel pipettor. Use of the 96-channel head gave fewer cross-contamination events, which we attributed to fewer fly-over events. (c) To assess the impact of tip sterilization and reuse in the optimized robotic method and configuration (Supplementary Video 1), robotic tips were submerged in either water or T7 RNAP-containing phage and then sterilized prior to being used to maintain high-throughput bacterial cultures containing pT7-psp-LuxAB²². Sterilized tips were also used to propagate bacteria inoculated with T7 RNAP phage as a positive control to ensure that bleach carryover did not affect phage propagation. No cross-contamination was observed in the sterilized tip condition over 12 hours, indicating that tips could be reused for a minimum of 12 hours without being replaced.

Source data

Extended Data Fig. 9 Kinetic Luminescence Activity Assays of −3, −5, and TP6 Promoter Variants.

To quantify the relative activities of each variant (independent of possible phage backbone mutations), phage from each replicate at time point t = 10 days were isolated and each T7 RNAP variant was cloned into an IPTG-inducible reporter construct. Subcloned variants were transformed into S2060 cells containing LuxAB driven by either the (a) −3 variant, (b) −5 variant or (c) TP6 promoter and grown overnight to an OD > 1. Bacteria were then diluted 1:100 and grown to an OD of exactly 1.2 in DRM using a high-throughput robotic turbidostat method as described previously²². Once the bacteria reached an OD of 1.2 (approximately 2 hours), cells were induced with 1 mM IPTG or left uninduced as [-] IPTG controls (n = 3 for each condition). Bacteria were autonomously maintained at an OD of 1.2 for the duration of the experiment and luminescence readings were taken once every 45 minutes for 10 hours. Fold change in luminescence (as shown in Fig. 6e) was calculated by averaging the luminescence in each turbidostat once the luminescence reached equilibrium (t > 8 hours) and then normalizing to the average luminescence of the [-] IPTG controls within the same time window. WT T7 RNAP was used as a control for each respective promoter reporter construct (n = 6 for each WT control).
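The fold-change calculation described above can be sketched as follows. This is a minimal illustration using the 8-hour equilibrium cutoff and 45-minute sampling from the caption; the function names and the example values are hypothetical and are not drawn from the source data.

```python
from statistics import mean


def plateau_mean(times_h, lum, t_min=8.0):
    """Average luminescence over readings taken after t_min hours."""
    values = [y for t, y in zip(times_h, lum) if t > t_min]
    if not values:
        raise ValueError("no readings after the equilibrium cutoff")
    return mean(values)


def fold_change(induced_traces, control_traces, times_h, t_min=8.0):
    """Plateau luminescence of each induced culture relative to the mean
    plateau luminescence of the uninduced ([-] IPTG) controls."""
    control = mean([plateau_mean(times_h, trace, t_min) for trace in control_traces])
    return [plateau_mean(times_h, trace, t_min) / control for trace in induced_traces]


if __name__ == "__main__":
    # One reading every 45 minutes over roughly 10 hours (hypothetical data).
    times_h = [0.75 * i for i in range(14)]
    induced = [[200.0] * 14, [400.0] * 14, [300.0] * 14]
    controls = [[10.0] * 14, [10.0] * 14, [10.0] * 14]
    print(fold_change(induced, controls, times_h))
```

Restricting both the induced cultures and the controls to the same time window keeps the ratio from being skewed by the pre-equilibrium rise in luminescence.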

Source data

Supplementary information

Supplementary Information

Supplementary Table 1, legends for Supplementary Tables 2–4.

Reporting Summary

Supplementary Video 1

Tip sterilization robotic method.

Source data

Source Data Fig. 1

Annotated luminescence and absorbance data.

Source Data Fig. 2

Annotated luminescence and absorbance data, inflection point analysis of evolution time, T7 RNAP sequence mutations.

Source Data Fig. 3

Annotated luminescence and absorbance data, PFU values for phage enrichment.

Source Data Fig. 4

Luminescence data as a percentage of WT, annotated luminescence and absorbance data.

Source Data Fig. 5

Luminescence data as a percentage of WT, next-generation sequencing data with annotation files.

Source Data Fig. 6

Annotated luminescence and absorbance data.

Source Data Extended Data Fig. 2

Annotated luminescence and absorbance data.

Source Data Extended Data Fig. 5

Annotated luminescence and absorbance data.

Source Data Extended Data Fig. 6

Annotated luminescence and absorbance data.

Source Data Extended Data Fig. 7

Annotated luminescence and absorbance data.

Source Data Extended Data Fig. 8

Annotated luminescence and absorbance data.

Source Data Extended Data Fig. 9

Annotated luminescence and absorbance data.

About this article

Cite this article

DeBenedictis, E.A., Chory, E.J., Gretton, D.W. et al. Systematic molecular evolution enables robust biomolecule discovery. Nat Methods 19, 55–64 (2022). https://doi.org/10.1038/s41592-021-01348-4

Received: 04 November 2020
Accepted: 09 November 2021
Published: 30 December 2021
Issue Date: January 2022
DOI: https://doi.org/10.1038/s41592-021-01348-4

This article is cited by

Continuous directed evolution of a compact CjCas9 variant with broad PAM compatibility
- Lukas Schmidheini
- Nicolas Mathis
- Gerald Schwank
Nature Chemical Biology (2024)
Engineering is evolution: a perspective on design processes to engineer biology
- Simeon D. Castle
- Michiel Stock
- Thomas E. Gorochowski
Nature Communications (2024)
High-throughput continuous evolution of compact Cas9 variants targeting single-nucleotide-pyrimidine PAMs
- Tony P. Huang
- Zachary J. Heins
- David R. Liu
Nature Biotechnology (2023)
A high-throughput platform for efficient exploration of functional polypeptide chemical space
- Guangqi Wu
- Haisen Zhou
- Hua Lu
Nature Synthesis (2023)
In vivo hypermutation and continuous evolution
- Rosana S. Molina
- Gordon Rix
- Chang C. Liu
Nature Reviews Methods Primers (2022)
