Introduction

The development of synthetic biology shows the feasibility to implement computing devices with DNA genetic circuits in living cells. Synthetic cellular designs often intended to implement certain functions that make cells respond to specific environmental stimuli or even change their growth and cellular development. For instance, synthetic toggle switches1 and genetic oscillators2,3,4,5 can be used to control cell metabolism, synthetic counters6 can be potentially applied to the regulation of telomere length and cell aggregation, and genetic logic gates7,8,9,10 can achieve digital computation in response to stimulus input signals. In addition to these transcription-based DNA circuits, new emerging translational mRNA circuits11 are likely to have impact on mammalian regenerative medicine and gene therapy. Through the genetic engineering, synthetic cellular circuits are potentially useful to perform therapeutic and diagnostic functions.

For some situations where noxious chemical stimuli exist for many cell generations, the computational results from the synthetic circuits in parent cells are required to be propagated to their daughter cells so that the daughter cells can save time to respond to the environmental stimuli. To achieve this transgenerational memory, one possible method is to store the computational results in separate synthetic memory devices which can be duplicated in cell divisions. In the recent work of Siuti et al.12, a more efficient scheme for constructing synthetic cellular circuits with integrated logic and memory was proposed, where the computational result was automatically stored in the computing circuit configuration and the changes of configuration can be propagated to its descendant cells. The so-implemented circuits were built based on recombinases and tested in Escherichia coli cells and they showed a long-term memory for at least 90 cell generations. More recently, recombinase-based logic circuits has been applied in clinical uses. For instance, in recent work13 the authors demonstrate that biosensor made of recombinase-based logic gates can be used to detect pathological glycosuria in urine from diabetic patients. The ability to build complex recombinase-based logic circuits is an important step to enable widespread biomedical applications.

Specifically, the synthetic cellular circuits proposed by Siuti et al.12 used serine recombinases Bxb1 and phiC31 to implement various two-input logic gates. A serine recombinase targeting a pair of non-identical recognition sites known as attB (attachment site bacteria) and attP (attachment site phage) is able to induce irreversible DNA inversion. As illustrated in Fig. 1(a), since the inversion makes the recognition sites become hybrid sites called attR and attL which cannot be targeted by the recombinase, no further inversion is allowed afterwards.

Figure 1
figure 1

Recombinase-mediated DNA inversion and its application to the implementation of a logic gate. (a) Schematic illustration of the irreversible inversion of DNA sequences using serine recombinases. (b) Implementation of an AND gate using recombinases. The right-turn arrow represents a promoter; the red and blue triangles are the targeting sites of recombinases Bxb1 and phiC31, respectively; the letter T’s flanked by the targeting sites are transcription terminators; the green box represents the gene encoding the green fluorescent protein.

We illustrate how recombinases take part in the implementation of two-input logic gates with the two-input AND gate example shown in Fig. 1(b). (As a convention, in this paper we read a DNA sequence from left to right assuming the 5′-to-3′ direction of the coding strand). Let molecules AHL and aTc be the stimulus inputs to a cell and act as inducers activating the expressions of recombinases Bxb1 and phiC31, respectively. These recombinases when activated will irreversibly invert (flip) the DNA sequences flanked by their recognition sites (denoted by the colored triangle pairs). The DNA sequences being flanked can be a promoter, a transcription terminator, or a reporter, e.g., a green fluorescent protein (GFP). Inverting these DNA sequences will alter the output gene expression. In Fig. 1(b), two terminators were flanked by the recognition sites of recombinases Bxb1 and phiC31, and the output green fluorescent reporter is highly expressed only when both inducers AHL and aTc are in high concentration to activate BxB1 and phiC31 which together further flip and disable both terminators (denoted by letter “T”). Therefore, the circuit of Fig. 1(b) effectively implements a two-input AND gate. Note that such DNA sequence changes will survive through cell divisions and can be inherited to descendant cells in different generations. Hence the so-implemented logic function can achieve a long-term transgeneration memory.

Motivated by the viability and applicability of recombinase-based circuits, in this paper we formalize the construction of a general multi-input logic gate with its DNA sequence composed of series of promoters and transcription terminators targeted by multiple recombinases. We further characterize the set of Boolean functions realizable under such logic gates. In addition, we show a design flow for arbitrary Boolean function construction with cascaded recombinase-based logic gates. This automated design methodology is demonstrated by leveraging synthesis tool ABC14, an electronic design automation (EDA) tool developed at UC Berkeley, to synthesize cascaded multi-level recombinase-based circuits.

Methods and Results

To formalize the general multi-input gate construction, we use the three-input logic gates in Fig. 2(a–h) as examples to illustrate. Figure 2(a) shows a realization of a 3-input AND gate using three recombinases R 1, R 2, and R 3, where molecule I i is a stimulus input that activates the expression of recombinase R i , for i = 1, 2, 3. Then R i ’s induce the inversions of their corresponding DNA sequence fragments. In order to express GFP in this gate, first we require R 1 to invert the inverted promoter so that the RNA polymerase can bind to it and begin the transcription of the downstream DNA sequence in which the GFP gene resides. Second, R 2 is needed to flip the terminator to avoid the termination of transcription before reaching the GFP gene. Third, R 3 is demanded to upright the GFP gene for the RNA polymerase to initiate GFP production. Collectively, to have GFP highly expressed all R i ’s must exist, and thus this circuit implements a 3-input AND gate. Note that this 3-input AND gate, where the promoter and the reporter gene GFP can be flipped by recombinases, is designed in a different fashion from the 2-input AND gate in Fig. 1(b), where only transcription terminators are inverted by recombinases. The additional choice of flipping the DNA fragments of promoter and GFP gives more flexibility for logic gate construction.

Figure 2
figure 2

Examples of generalized multi-input recombinase-based logic gates. (ah) Implementation of basic 3-input logic gates using recombinases. The inputs of each gate from top to down are recombinases R 1, R 2, and R 3, respectively; inducer I i monitored by the cell activates the expression of R i ; the red, blue, and orange triangles denote the targeting sites of R i , i = 1, 2, 3, respectively. (i) Schematic illustration of a 4-bit non-basic logic function \(O=({R}_{1}+\overline{{R}_{2}}\oplus {R}_{3})\overline{{R}_{4}}\) and corresponding implementation using recombinases.

In Fig. 2(b–h) we present seven other basic 3-input gates implemented with recombinases. Special implementations with nested targeting sites are applied on the XOR gate in (g) and the XNOR gate in (h). In the XOR gate in (g), the existence of one or three recombinases results in one or three times of GFP gene flipping and thus making the upside-down gene become upright, while the existence of two recombinases makes the GFP gene flip twice and remain upside down. Similar situations happen in the XNOR gate in (h).

Since the implementations of multi-input gates are possible, we are not constrained to using only 3-input gates and basic gate types, such as AND, OR, NAND, NOR, XOR, and XNOR gates. Rather, we can construct complex logic gates with more inputs. Figure 2(i) shows an example of a 4-input logic circuit

$$O=({R}_{1}+\overline{{R}_{2}}\oplus {R}_{3})\overline{{R}_{4}},$$

which can be directly realized by a single 4-input complex logic gate, instead of cascading multiple two-input gates.

Formalism of Recombinase-Based Logic Gates

Syntax of well-formed sequences

We define the following syntax to formalize the DNA sequences of logic gates constructed with recombinases. Here the basic elements composing a legal DNA sequence of a recombinase-based logic gate are “atomic terms”, including (inverted/non-inverted) transcription factors, (inverted/non-inverted) promoters, (inverted/non-inverted) genes, and targeting sites of recombinases. The syntax of DNA sequence forming a legal recombinase-based logic gate can be defined as follows.

Definition 1 An atomic term in a DNA sequence is a transcription terminator T, a promoter P, a gene G, an inverted transcription terminator , an inverted promoter , or an inverted gene . The syntax of an atomic term can be expressed in Backus-Naur Form as

(1)

Let the targeting sites attP and attB of recombinase r in a DNA sequence be denoted as “{ r ” and “} r ”, respectively. In the sequel, the subscripts of { r and } r may be omitted for brevity when they are clear from the context or immaterial to the discussion. Note that targeting sites “{” and “}” of a recombinase must appear in a pair.

Definition 2 The syntax of a well-formed sequence (wfs) is recursively defined as follows.

$$\begin{array}{l}\langle wfs\rangle \,::=\,\langle atomic\,term\rangle \,|\,{\{\langle wfs\rangle \}}_{{r}_{i}}\,|\,\langle wfs\rangle \langle wfs\rangle .\end{array}$$
(2)

In this paper we concentrate on the special case of one-gene wfs (1g-wfs), where only one gene G, which is neither inverted nor sandwiched by targeting sites, appears at the end of the wfs and serves as the output. For example, , and are 1g-wfs’s. Notice that under the 1g-wfs setting, the logic gate has a single output and the gene can only be transcribed in one direction from left to right.

A pair of targeting sites of a recombinase is called basic if it only flanks an atomic term. Otherwise, it is called non-basic. We call a 1g-wfs basic if it contains only basic pairs of targeting sites, and non-basic if it contains some non-basic pair of targeting sites. For example, is a basic 1g-wfs. In contrast, and are non-basic 1g-wfs’s.

Furthermore, a non-basic pair of targeting sites can be nested. That is, a non-basic pair of targeting sites can be flanked by another pair of targeting sites. For instance, has nested two pairs of targeting sites targeted by the recombinases r 3 and r 4.

We discuss the logic functions induced by basic and non-basic 1g-wfs’s in the following.

Semantics of well-formed sequences – Basic well-formed sequences

We first study some reduction rules of basic 1g-wfs’s. Let σ be the DNA sequence of a basic 1g-wfs excluding the output gene, that is, σ is a basic wfs without any gene. We denote a wfs without any gene as 0g-wfs. Because σ is made of components \({\{T\}}_{{r}_{i}}\), and for any component C in σ the sequence σ can be decomposed into

$$\sigma ={\sigma }_{1}C{\sigma }_{2},$$

where σ 1 and σ 2 are two 0g-wfs’s, if non-empty. We show that the logic gate induced by the 1g-wfs σG can be further reduced to an equivalent form according to the type of the component C.

When C is a transcription terminator T, then σ equals

$${\sigma }_{1}T{\sigma }_{2}G\equiv {\sigma }_{2}G.$$
(3)

This equivalence holds because any transcription that starts from σ 1 to gene G is always blocked by the transcription terminator T in the middle, making σ 1 T a don’t-care and thus removable.

When C is an inverted terminator , then σ equals

(4)

This equivalence holds because the inverted terminator never blocks the transcription and is thus removable.

When C is a promoter P, then σ equals

$${\sigma }_{1}P{\sigma }_{2}G\equiv P{\sigma }_{2}G.$$
(5)

This equivalence holds because no matter whether there is a transcription that starts from σ 1 to G or not, a transcription can always start from the promoter P. Therefore, σ 1 is a don’t-care and thus removable.

When C is an inverted promoter, then σ equals

(6)

This equivalence holds because the transcription that begins at proceeds across σ 1 in the direction from right to left, it does not pass through G. As a result, the expression of G can not be initiated by and thus can be removed from the sequence.

When C is since an atomic term A is equivalent to {A} r for recombinase r being in low concentration (denoted R = 0 by treating r as a Boolean variable R of value 0) or for recombinase r being in high concentration (denoted R = 1 by treating r as a Boolean variable R of value 1), the reduction rules for C can be easily extended from the previous rules as summarized below.

$${\sigma }_{1}{\{T\}}_{r}{\sigma }_{2}G\equiv \{\begin{array}{ll}{\sigma }_{2}G, & {\rm{for}}\,R=0\\ {\sigma }_{1}{\sigma }_{2}G, & {\rm{for}}\,R=1\end{array}$$
(7)
(8)
$${\sigma }_{1}{\{P\}}_{r}{\sigma }_{2}G\equiv \{\begin{array}{cc}P{\sigma }_{2}G, & {\rm{for}}\,R=0\\ {\sigma }_{1}{\sigma }_{2}G, & {\rm{for}}\,R=1\end{array}$$
(9)
(10)

with the above analysis, we can derive the corresponding Boolean function of a given 1g-wfs. Consider the 1g-wfs σG with the sequence σ targeted by recombinases r i , \(i=1,...,n\). Activating the expression of gene G requires the recombinases r i ’s have adequate (high or low) concentrations so that the 1g-wfs σG effectively reduces to PG. The Boolean function induced by σG is determined through a series of decisions made by r i ’s. In essence, it corresponds to a decision list15. To illustrate, consider the example The decision list induced by the 1g-wfs σG is shown in Fig. 3. Note that given a sequence without non-basic targeting sites, the decisions always start from the rightmost to the leftmost components because a component closer to the gene may overwrite the effects imposed by the components on its left and thus it is of higher priority. Therefore, the Boolean function of σG is determined starting from R 1 to R 5. In order to reduce σ to P to express gene G, first we must require R 1 to be 1. Otherwise if R 1 = 0, σ becomes equivalent to a null sequence no matter what other R i ’s are. Next, if we let R 2 be 1, we can have an equivalent sequence equal to P as wished. Otherwise we can let R 2 be 0 and look for other possibilities for the reduction to P. If R 2 = 0, we can easily tell that the only possibility occurs when R 3 and R 4 are both 0 and that the logic of R 5 never affects the reduction. Collectively, the logic function of the gate σG is derived as \({R}_{1}\cdot ({R}_{2}+\overline{{R}_{3}}\cdot \overline{{R}_{4}})\), where symbol “+” denotes Boolean disjunction, symbol “·” denotes Boolean conjunction, and symbol “−” or “!” denotes Boolean negation. In the sequel, we sometimes omit the conjunction symbol “·” in a Boolean expression.

Figure 3
figure 3

Decision list corresponding to 1g-wfs Node labelled R i is the decision for the logic value of R i . Nodes labelled 0 (resp. 1) stand for gene G cannot (resp. can) be expressed. The sequences beside nodes are the equivalent sequences after the corresponding (partial) decisions.

In general, we can systematically convert any basic 1g-wfs to its corresponding logic function. To achieve this conversion, the operator Ω over a 1g-wfs is defined in Table 1. For an empty sequence , we define Ω[] = 0. For example, the Boolean function of the 1g-wfs is derived by

Table 1 Operators for parsing basic 1g-wfs σCG, with (non-empty) 0g-wfs σ, component C, and gene G, to logic function.

Semantics of well-formed sequences – Non-basic well-formed sequences

We extend the above derivation of Boolean function to non-basic 1g-wfs’s by having the operator Ω over a 0g-wfs {σ} r (which can be basic or non-basic) defined as

(11)

where is the inverted sequence of σ. To understand equation (11), consider a 1g-wfs σG with only one pair of non-basic targeting sites. Suppose σ = {σ 1} r , where σ 1 is a basic 0g-wfs. Then σ is equal to σ 1 when R = 0 and to , the inverted sequence of σ 1, when R = 1. For example, the logic function for can be obtained by

For a 1g-wfs with multiple (possibly nested) non-basic pairs of targeting sites, its logic function can also be directly derived by the Ω operator. For example, the logic function for can be obtained by

Non-basic pairs of targeting sites can be exploited to efficiently construct special Boolean functions. One of such special functions is the parity function. An n-input odd parity function can be realized by the 1g-wfs

When there is an odd number of R i ’s equal to 1, the 1g-wfs reduces to sequence PG and gene G can be expressed. Otherwise it reduces to sequence G and gene G cannot be expressed. On the other hand, the n-input even parity function can be realized by the 1g-wfs

$$\mathop{\underbrace{\{\cdots \{}}\limits_{n}P{\}}_{{r}_{1}}\cdots {\}}_{{r}_{n}}G.$$

Construction of Multi-level Recombinase-Based Logic Circuits

With the recombinase-based logic gates built from 1g-wfs’s, we can cascade them to implement arbitrary complex multi-level circuits. For example, the logic function Z = (A + B)(AB) can be implemented with the two-level circuit shown in Fig. 4(a), which is composed of an OR-gate, an XOR-gate, and an AND-gate. One possible DNA implementation of Z with cascade can be derived by converting each gate to their 1g-wfs realizations as shown in Fig. 4(b). The 1g-wfs’s that encode the genes R 1, R 2, and Z correspond to the OR, XOR and AND gates, respectively. The recombinases r 1 and r 2 as the inputs to the AND gate are the intermediate signals.

Figure 4
figure 4

Example of a cascaded recombinase-based logic circuit. (a) Logic circuit of Boolean function Z = (A + B)(AB). (b) The corresponding DNA implementation of the circuit in (a) with gate cascade. A and B denote the recombinase inputs of the overall circuit. The genes R1 and R2 encode the recombinases r 1 and r 2, respectively, which are the inputs to the downstream AND gate. The protein encoded by the gene Z is the output of the circuit.

Because the basic 1g-wfs gates can implement decision list functions, they form a functionally complete set of primitive logic gates that can be composed to implement any Boolean function. Therefore the 1g-wfs gates can be collected as a library for the synthesis of complex logic circuits. By leveraging conventional logic synthesis tools in electronic design automation (EDA), recombinase-based logic circuits can be synthesized with the flow shown in Fig. 5(a). Given a Boolean function or circuit netlist as the input, it is first optimized by technology-independent techniques for circuit simplification. The simplified circuit is further optimized by technology-dependent techniques for technology mapping using the primitive gates in the given standard cell library. To achieve recombinase-based logic circuit synthesis, the main task is to provide the library while all other optimization tasks can be done using existing logic synthesis tools.

Figure 5
figure 5

Illustration of the synthesis flow with an input circuit and a library of primitive gates. (a) Logic synthesis flow for the implementation of recombinase-based logic circuit. (b) Circuit diagram of an input circuit netlist example, ISCAS benchmark c17. Circuit c17 consists of six NAND gates with five inputs {A, B, C, D, E} and two outputs {Y, Z}. (c) Example of a library of DNA gates with area cost specified. The library contains 44 different cells and each cell corresponds to a DNA logic gate defined by a 1g-wfs with up to three inputs. The variables a, b, and c in a function specification represents the recombinase inputs to a gate, and the variable O denotes the gate output.

In this work, we adopt ABC14, an industrial-strength logic synthesis tool developed at UC Berkeley, for circuit synthesis and optimization. Given a circuit netlist, we first apply ABC to perform technology-independent optimization on the netlist, e.g., Boolean minimization to minimize the number of product terms and literals. We then use ABC to perform technology mapping to implement the area or performance optimized netlist using the 1g-wfs gates in the library.

To illustrate the synthesis flow, we consider implementing ISCAS benchmark circuit c17 shown in Fig. 5(b) with recombinase-based genetic circuit realization. The circuit consists of five inputs A, B, C, D, and E, and two outputs Y and Z with functions

$$\{\begin{array}{l}Y=AB+\overline{(BC)}D\\ Z=\overline{(BC)}D+\overline{(BC)}E.\end{array}$$
(12)

For area-driven synthesis of benchmark c17, there are 44 DNA gates defined by their 1g-wfs’s with up to three recombinase inputs. They are collected as the library as shown in Fig. 5(c). According to the experiment in the previous work12, where the promoters and transcription terminators used are roughly of the same length, we treat the area cost of both promoter and transcription terminator as unity. Therefore, the area cost of a DNA gate is defined as the number of atomic terms, excluding the output gene, that appear in the 1g-wfs of the gate. For example, the gate c3_1 corresponding to a 3-input OR gate has three inverted promoters as shown in Fig. 2(d). Hence, the area cost of c3_1 is counted as 3 units. By providing the c17 netlist and the library to ABC, the tool can perform optimization and technology mapping to find an area-optimized circuit composed of DNA gates of the library. Note that area minimization of a recombinase-based circuit effectively reduces the number of used promoters and terminators on the DNA strand implementation. Therefore, less effort is required to synthesize the intended DNA strand via DNA assembly methods, e.g., Gibson assembly16. More importantly, a shorter DNA sequence is more likely to succeed in vector insertion to deploy the genetic circuit into the host cell to conduct the intended computation.

Figure 6(a) shows the result described in Verilog language of the synthesized c17 recombinase-based circuit using library gates listed in Fig. 5(c). The synthesized circuit comprises gates c2_4, c2_5, c3_14, and c3_25, and the total area cost is 10 units. Note that the naive DNA circuit implementation of c17 circuit by converting the digital logic gates in Fig. 5(b) to the corresponding DNA gates results in a total area cost of 12 units. Compared to the naive implementation, the area cost of the circuit synthesized by ABC technology mapping decreases. The logic functions of Y and Z in the synthesized circuit can be easily verified to be consistent with equation (12), implying the correctness of the synthesis result. The DNA circuit of module c17 in Fig. 6(a) is plotted in Fig. 6(c), where the symbols A, B, C, D, E, n7, and n8 represent some serine recombinases. In practice, to have recombinases achieve site-specific recombination in a synthetic genetic circuit, recombinases that have been reported to function outside their native hosts may be used. For example, well-reported recombinases17,18,19,20,21,22,23,24,25,26,27,28,29, such as ϕC31, ϕBT1, R4, BxB1, TP901-1, RV, SPBc, TG1, ϕFC1, MR11, ϕ370, ϕK38, A118, W β, and BL3 integrase, can be plausible molecular parts for realization of the recombinase signals in Fig. 6(c).

Figure 6
figure 6

Synthesis results of circuit c17 in Verilog descriptions and in DNA circuit implementations. (a) Tool ABC synthesized c17 circuit in Verilog description. (b) Manually designed c17 circuit in Verilog description. (c) DNA circuit implementation of the ABC synthesized circuit in (a). (d) DNA circuit implementation of the manually designed circuit in (b). In both (c) and (d), symbols A, B, C, D, and E indicate the recombinase inputs, the proteins encoded by the genes Y and Z are the outputs of the circuit, and the DNA gates encoding recombinases n 7 and n 8 and proteins Y and Z are the gates g0, g1, g2, and g3, respectively, in the modules c17 and c17_1.

Note that there can be more than one area-optimized circuit of a logic function. For comparison, in Fig. 6(b) we show another manually designed DNA implementation of c17 circuit whose area cost is 10 units as well. The corresponding DNA circuit is plotted in Fig. 6(d). Notice that the two circuits in Fig. 6 differ not only in their constituent logic gates, but also in their logic depths. The circuit of Fig. 6(c) is of two logic levels, whereas that of Fig. 6(d) is of three logic levels. There are six longest paths in the former circuit:

$$\{\begin{array}{l}A\to n7\to Y,\\ B\to n7\to Y,\\ B\to n8\to Y,\\ B\to n8\to Z,\\ C\to n8\to Y,\\ C\to n8\to Z.\end{array}$$

They involve a cascade of two logic gates. On the other hand, there are two longest paths in the latter circuit:

$$\{\begin{array}{l}B\to n7\to n8\to Y,\\ C\to n7\to n8\to Y.\end{array}$$

They involve a cascade of three logic gates. In digital electronic circuits, a longer circuit path often corresponds to a longer propagation delay between circuit input and output signals. Similarly in biological circuits, a longer circuit path involves more transcription and translation cascades, resulting in a longer response time of output gene expression to input stimuli. Here, the former and latter circuits involve two (n7 and Y) and three (n7, n8, and Y) gene expression cascades, respectively. Therefore, although these two circuits have the same area cost, the circuit of Fig. 6(c) is preferred due to its better performance, i.e., shorter input-to-output response time. In addition, we will detail in Section Discussion that the delay optimization may present fewer foreign genes and thus impose less metabolic burden on the host cell. In the in silico experiments, we will synthesize circuits with area or performance optimized.

To demonstrate the feasibility of the proposed synthesis flow, we conduct in silico experiment on other 67 ISCAS benchmark circuits using recombinase-based DNA gates. We expanded the library such that it includes all 684 DNA gates with decision list functions up to five inputs. In the library, the area cost of a gate is determined by the number of atomic terms, excluding the output gene, appearing in its corresponding 1g-wfs. To reduce the number of gene expression cascades, we simply assume each logic gate is of the same unit delay. By specifying a unit delay for each gate in the library, the delay of a synthesized circuit equals the logic level, which equals the number of gene expression cascades in the longest path in the circuit. Consequently, under the unit delay model the performance-driven logic synthesis minimizes the delay time between input stimuli and output response in the synthesized recombinase-based circuit. Note that this simple unit delay model is not meant to reflect the timing behavior of actual biological systems, but to facilitate the logic synthesis algorithm to perform circuit logic level minimization.

The experimental results of 54 (out of the 67) circuits are shown in Table 2. The numbers of primary inputs/outputs, the number of inverters, and the number of logic gates (with the number of included buffers, if non-zero, reported in parentheses) are listed Columns 2, 3, and 4, respectively. The circuits were synthesized under two optimization settings: one for area optimization and the other for delay optimization. The results of area optimization are reported in Columns 5–7 and those of delay optimization are reported in Columns 8–10. For each synthesized circuit, its number of DNA gates, total area, and gate level are shown. In the naive implementations of benchmark circuits by simply converting the digital logic gates to the corresponding DNA gates, the total area of a DNA circuit can be roughly calculated as “#inverter” + 2 × “#gate”. Compared to the naive implementation, the circuits synthesized by ABC have much less area cost. Taking circuit b18 for example, we observe that the total area of the naive implementation is about 202110 which is much larger compared to the area 101870 of the area-optimized implementation and 105328 of the delay-optimized implementation. On the other hand, comparing area and delay optimized b18 circuits, delay optimization reduces the number of gate levels from 137 to 51 at cost of increasing area by 3500 units.

Table 2 Results of technology mapping of ISCAS benchmark circuits.

Discussion

Area vs. Delay Optimization

To pursue area or delay optimization in genetic circuit synthesis is a matter of tradeoff, and may depend on the intended application and/or biological feasibility. Nevertheless, Table 2 reveals that when the library of recombinase-based logic gates is used in ABC for logic synthesis, delay optimization often achieves effective reduction (62% on average) in logic level, or circuit depth, with a slight increase (7% on average) in circuit area compared to area optimization. Taking the largest circuit b18 benchmark for example, from area to delay optimization, the area cost increases by 3.39% while the logic level decreases by 62.77%. Particularly, in practice since we are limited by the biotechnology and the metabolic burden, circuits to be synthesized cannot be as large as b18 benchmark, which only serves as a proof of concept. Instead, small circuits, such as b06, are more likely to be implemented. For benchmark b06, the area cost increases by 10.71% (56 to 62) and the logic level decreases by 50% (6 to 3) from area to delay optimization. Moreover, the delay optimization helps reduce metabolic burden (to be discussed below). These facts imply that delay-driven optimization may often be a proper objective for logic synthesis of recombinase-based genetic circuits.

Metabolic Burden

One of the advantages of recombinase-based genetic circuits is its low metabolic burden imposed on the host cell30. Unlike a classic genetic circuit requiring continuous production of and action by activators or repressors to maintain the output gene expression, the output gene expression in a recombinase-based genetic circuit is determined by its DNA configuration, which is changed by DNA inversion or excision by recombinases; no further continuous recombinase supply and action is needed afterwards. This permanent configuration change is understood as a long-term (nonvolatile) memory, leading to the advantage of a lower metabolic burden on the host cell. This advantage may allow more complex genetic circuit implementation using recombinases. For example, recombinase-based finite state machines have been implemented in E. coli cells31. Moreover, a 6-input AND gate, a 2-data-input 4-select-input Boolean logic look-up table, a full adder, a full subtractor, and a half adder-subtractor were implemented in human embryonic kidney and Jurkat T cells32. Furthermore, we have shown recombinase-based logic gates can be adopted in the conventional logic synthesis flow for efficient circuit optimization. Because an efficient design can reduce metabolic burden and outperform an inferior counterpart even with the same functionality33, complex circuit implementation may benefit from the automation and optimization method proposed in this report.

Even with recombinase based construction, implementing a large circuit in a living cell may still be challenging due to the increase of metabolic burden34 caused by two major effects. First, a larger synthetic circuit requires more cellular energy to maintain its presence in the host cell35. Second, a large number of introduced genes will compete for the transcriptional and translational resources, resulting in resource redistribution36 and unexpected coupling among seemingly unconnected modules37, and thus leading to cell growth defects and poorly predictable circuit behavior. One approach to address these issues is to separate the target circuit into sub-circuits and implement the circuit across a consortium of host cells7,38,39,40. In particular, the consortium is divided into colonies of the same number of the sub-circuits. Each colony is composed of a strain implementing one of the sub-circuits. The sub-circuits are connected through cell-cell communication by wiring molecules (for example, quorum-sensing molecules and yeast pheromones) or metabolites like benzoic acid. Collectively, the whole cell population implements the target circuit. This distributed strategy may also apply to a large recombinase-based circuit implementation. For instance, the c17 circuit in Fig. 6(a) may be implemented by distributing the gates g0, g1, g2, and g3 into four strains of cells.

We note from Table 2 that when using recombinase-based logic gates as the library for a target circuit synthesis, the option of delay optimization introduces fewer DNA gates, each of which contains a gene, than the option of area optimization. Hence, delay optimization is preferred over area optimization due to a lower metabolic burden imposed by fewer foreign genes in the delay-optimized circuit.

Experimental Steps for Circuit Realization

Given a target Boolean function to be implemented as a genetic circuit, our method can be applied as the first step to build the blueprint for the wet-lab construction by using the logic synthesis tool ABC to derive the area or delay-optimized circuit. The next task is to associate the abstract signals of the synthesized netlist with concrete biochemical parts, including promoters, recombinases, and genes, for wet-lab implementation. After this association step, the DNA molecule of the genetic circuit is readily to be constructed by Gibson assembly16, Unique Nucleotide Sequence (UNS) Guided assembly41, or other assembly methods. Note that the promoters used here should have the ability to strongly promote transcription. After the assembly, the DNA constructs are transformed/transfected into cells using a standard protocol, such as the polyethylenimine (PEI) protocol. The cells should be kept and maintained in custom or standard media, such as Luria-Bertani (LB) medium and Dulbecco’s Modified Eagle’s medium (DMEM), and grown for one to two days in a stimuli-free medium. To test the synthetic circuit, cells have to be exposed to stimuli and grown for several hours, and then the fluorescence response from cells is measured by a flow cytometer. For each sample of the measurement, the same number of cells should be used for consistency. After creating a gate using forward scatter (FSC) and side scatter (SSC) and applying a proper fluorescence threshold on each fluorescent protein channel, the percentage of cells in an ON state is determined by flow cytometry analysis.

Alternative Genetic Circuit Construction with CRISPR/Cas9 Systems

Cas9 nucleases42 may possibly be exploited to achieve gene expression effects equivalent to what recombinases can achieve. For example, Cas9 nucleases are able to induce DNA deletion43,44, defective Cas9 nucleases (dCas9) can repress transcription by blocking transcriptional initiation or elongation45, and dCas9 fused with a transcriptional activator is capable of activating gene expression46. Specifically, DNA deletion of P and T may achieve an effect equivalent to inverting P and T, respectivley; transcription repression may achieve an effect equivalent to inverting P and ; transcription activation may achieve an effect equivalent to inverting and T. These mechanisms allow CRISPR/Cas9 systems to be utilized as recombinase replacements for the implementation of decision list logic functions.

Conclusion

In this paper, we generalized the two-input recombinase-based DNA logic gates to multi-input cases. We formalized the syntax of recombinase-based logic gate construction, and obtained the Boolean function semantics of well-defined DNA sequences of recombinase-based logic gates. We also showed how to synthesize multi-level recombinase-based logic circuits using existing logic synthesis tools. In silico experimental results demonstrate the feasibility and efficiency of our proposed methods as a tool for recombinase-based genetic circuit minimization. As recombinase-based logic circuits have been used in clinical biomarker detection and tested in human cells, our tool can be useful to automate complex recombinase-based circuit construction for biologists to implement advanced biomedical applications.