Contextual dependencies expand the re-usability of genetic inverters

The implementation of Boolean logic circuits in cells has become a very active field within synthetic biology. Although these efforts mostly focus on the genetic components alone, the context in which the circuit performs is crucial for its outcome. We characterise 20 genetic NOT logic gates in up to 7 bacterial-based contexts each, to generate 135 different functions. The contexts we focus on are combinations of four plasmid backbones and three hosts: two Escherichia coli strains and one Pseudomonas putida strain. Each gate shows up to seven different dynamic behaviours, depending on the context. That is, gates can be fine-tuned by changing only contextual parameters, thus improving the compatibility between gates. Finally, we analyse portability by measuring, scoring, and comparing gate performance across contexts. Rather than being a limitation, we argue that the effect of the genetic background on synthetic constructs expands functionality, and advocate for considering context as a fundamental design parameter.


List of Figures
S28 Numerical similarity scores for AMTR A1 across all contexts. This data is plotted in Figure S4.
S29 Numerical similarity scores for BETI E1 across all contexts. This data is plotted in Figure S4.
S30 Numerical similarity scores for BM3R1 B1 across all contexts. This data is plotted in Figure S4.
S31 Numerical similarity scores for BM3R1 B2 across all contexts. This data is plotted in Figure S4.
S32 Numerical similarity scores for BM3R1 B3 across all contexts. This data is plotted in Figure S4.
S33 Numerical similarity scores for HIYIIR H1 across all contexts. This data is plotted in Figure S4.
S34 Numerical similarity scores for LCARA I1 across all contexts. This data is plotted in Figure S4.
S35 Numerical similarity scores for LITR L1 across all contexts. This data is plotted in Figure S4.
S36 Numerical similarity scores for LMRA N1 across all contexts. This data is plotted in Figure S4.
S37 Numerical similarity scores for PHIF P1 across all contexts. This data is plotted in Figure S4.
S38 Numerical similarity scores for PHIF P2 across all contexts. This data is plotted in Figure S4.
S39 Numerical similarity scores for PHIF P3 across all contexts. This data is plotted in Figure S4.
S40 Numerical similarity scores for PSRA R1 across all contexts. This data is plotted in Figure S4.
S41 Numerical similarity scores for QACR Q1 across all contexts. This data is plotted in Figure S4.
S42 Numerical similarity scores for QACR Q2 across all contexts. This data is plotted in Figure S4.
S43 Numerical similarity scores for SRPR S1 across all contexts. This data is plotted in Figure S4.
S44 Numerical similarity scores for SRPR S2 across all contexts. This data is plotted in Figure S4.
S45 Numerical similarity scores for SRPR S3 across all contexts. This data is plotted in Figure S4.
S46 Numerical similarity scores for SRPR S4 across all contexts. This data is plotted in Figure S4.

Decomposing fluorescence and scattering
In our experiments we aim to sample cells in log phase, which keeps the cell-size distribution constant [1]. We do, however, see some experiment-to-experiment variation in the cell-size distribution. To remove the influence of this cell-size variation from the fluorescence measurements, we decompose the fluorescence values into two components: a part that is experiment dependent and a part that is scattering dependent. We evaluate the scattering-dependent part on the context average. The raw flow cytometry files, along with the decomposed values, are available at DOI:10.25405/data.ncl.12073479.

Compatibility tables for individual strains
In order that the genetic circuits may be treated as logic NOT gates, the continuous variable (experimentally obtained standardised fluorescence) must be interpreted as a discrete variable (representing logic 1 or 0). Thresholds are therefore required to partition the input and output fluorescence values into groups that are either interpreted as a logic value or rejected as ambiguous. Compatibility between two gates is a qualitative measure of the agreement of these thresholds. In particular, the output thresholds of the 'input gate' must not lie in the group of inputs that would be rejected as ambiguous by the 'output gate'. Smaller ambiguous regions increase the number of compatible pairs in silico, but circuits built from such pairs may behave unpredictably in the presence of the noise or measurement error of a real system. Larger ambiguous regions guard against the effect of noise, at the cost of flexibility in design. In this study we use a thresholding scheme that has been used previously for the same library [2]. Further, we consider that two gates may only be connected if they are compatible, and that a library with many compatible gates is desirable.
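The compatibility rule above can be illustrated with a minimal Python sketch. It assumes that each gate is summarised by two input thresholds and two output thresholds: inputs below `in_low` are read as logic 0, inputs above `in_high` as logic 1, and anything in between is rejected as ambiguous. The class `Gate`, its field names and the numbers used are hypothetical and are not taken from the study's code or data; the thresholding scheme actually used is the one from [2]. The additional requirement that the two gates use different repressors follows the caption of Figure S2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """Hypothetical summary of one NOT gate in one context."""
    name: str
    repressor: str
    in_low: float    # inputs at or below this are read as logic 0
    in_high: float   # inputs at or above this are read as logic 1
    out_low: float   # output level produced for logic 0
    out_high: float  # output level produced for logic 1


def is_compatible(first: Gate, second: Gate) -> bool:
    """True if the output of `first` can drive the input of `second`.

    Both output thresholds of `first` must fall outside the ambiguous
    input region (in_low, in_high) of `second`, and the two gates must
    not share a repressor.
    """
    if first.repressor == second.repressor:
        return False
    return first.out_low <= second.in_low and first.out_high >= second.in_high


# Made-up threshold values, for illustration only.
gate_a = Gate("A", "RepA", in_low=0.5, in_high=2.0, out_low=0.1, out_high=5.0)
gate_b = Gate("B", "RepB", in_low=0.3, in_high=3.0, out_low=0.2, out_high=4.0)
gate_c = Gate("C", "RepC", in_low=0.05, in_high=6.0, out_low=0.4, out_high=4.5)

print(is_compatible(gate_a, gate_b))  # A's outputs clear B's ambiguous region
print(is_compatible(gate_a, gate_c))  # C's wide ambiguous region rejects A
```

Widening the ambiguous region of the second gate (as for the hypothetical gate C) removes pairs from the compatible set, which is the trade-off between noise tolerance and design flexibility described above.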
Here we show the compatibility between pairs of gates in individual strains, illustrating what can be achieved by incorporating backbone as a design parameter, without changing host. The superior performance found in the DH5α and CC118λpir E. coli strains, in comparison to the P. putida strain KT2440, may reflect the fact that the library components were initially selected for use in an E. coli host, and that the choice of host significantly impacts the behaviour of genetic parts.
Figure S2: Compatibility tables for the library in different hosts. The 'Input gate' on the x-axis is the first gate, whose output provides the input for the 'Output gate' on the y-axis. Two gates are compatible if their thresholds agree and they do not use the same repressor molecule.
Figure S3: Compatibility tables for the library in each of the 7 contexts. Each table shows the compatible pairings for the library when placed in a given host and on a given backbone.

List of constructs
Table S1 lists 23 plasmids in total, which were acquired from the Cello study. 20 of these plasmids are NOT gates that share the same functional segments in all constructs: the LacI/P tac expression system, inducible with IPTG, and the yfp gene used for output readout. The gate-specific units, which are not shared, are the RBS sites and the gate proteins. The plasmids carry a kanamycin resistance gene and the p15A origin of replication.
The autofluorescence plasmid is named pAN::1201. It is used to measure the background noise created by the backbone. The plasmid is composed of a constitutive P lac promoter expressing the lacZα gene. In addition, two sensor proteins, TetR and LacI, are present on the same operon under the P lacI promoter. The RPU standard plasmid is pAN::1717. It shares the sensor proteins of the autofluorescence plasmid, with the same P lacI /LacI/TetR expression system, but differs in that a yfp gene is expressed under the constitutive promoter J23101. The L3S2P21 terminator is placed downstream of yfp to insulate against transcriptional read-through, and the native AraC terminator is used to insulate the tetR gene. pAN::1818 is the promoter activity plasmid. The constitutive P lacI promoter controls the expression of the lacI and tetR genes. The yfp gene is expressed downstream of the P tac promoter, which is inducible with IPTG.
Table S2 shows the complete list of strains used in this study. We constructed 7 different libraries that vary in plasmid backbone, copy number and host. The 7 libraries span 2 Escherichia coli strains (DH5α and CC118λpir) and 1 Pseudomonas putida strain (KT2440). DH5α carries two different backbones, pAN and pSEVA. CC118λpir, in contrast, carries only pSEVA backbones, in two different copy numbers (low and medium). Finally, KT2440 carries pSEVA backbones in low, medium and high copy numbers. All 7 libraries contain each of the 20 NOT gates and 3 additional plasmids: the autofluorescence plasmid, the RPU standardisation plasmid and the promoter activity plasmid. Only in the KT2440 pSEVA251 (high copy) library did we encounter cloning problems, in 5 of the gates, which may have resulted from a toxicity effect caused by these gates at higher copy numbers. The failed gate clones in the pSEVA251 backbone are BM3R1-B1, PhIF-P3, QacR-Q2, SrpR-S2 and SrpR-S3.
Hence, 135 strains with NOT gates are generated in this study, together with 21 strains that help to standardise outcomes and measure promoter activity. In total, the 156 strains listed in Table S2 are used in this work.
Table S3 gives detailed information about the primers used in this study. There are four groups of primers. The first group is the Gate-Unvrsl-Number primers, which target various conserved regions in the expression system. There are 6 of these primers, and they are used to verify clones. The second group is named Gate-Ctrl-2-Names. There are 12 of these primers, each named after the gate it targets. PCR verification of a cloned gate can be done using its corresponding group 2 primer. The third group follows the nomenclature PS-Number and contains 6 primers. These are general SEVA control primers that target different regions of SEVA plasmids and help to confirm successful clones. Finally, the fourth group is a primer set used to move gates from the pAN backbone to the pSEVA backbone. These primers consist of a gate-homologous region and a flanking part that carries the PacI and SpeI restriction sites used for cloning into pSEVAs.

Context Similarity
This study found that the same genetic logic gate can exhibit differing behaviour depending upon the context in which it is situated. The characterisation of the genetic logic gate may be both quantitatively and qualitatively different. Qualitative changes, such as changes to the shape of the response curve, are particularly interesting when considering the effects of context-circuit interplay, because they suggest that these interactions are nonlinear phenomena.
We attempted to quantify changes in curve shape using a similarity measure as described in Methods of the manuscript. Log transformation of the curve ensures that deviations in the upper regions of input and output are not disproportionately penalised. The min-max normalisation in both input and output dimensions captures the shape information of the curve. Methods based on comparison of the gradient of the curves were also considered, and produce similar results.
(d) Similarity scores heatmap for Bm3r1 B1 in 7 contexts.
(s) Similarity scores heatmap for Srpr S3 in 7 contexts.
(t) Similarity scores heatmap for Srpr S4 in 7 contexts.
Figure S4: Similarity scores shown for all gates in all 7 contexts. A high similarity score (darker squares) is best. The minimum and maximum scores are 0 and 1, respectively.
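The two preprocessing steps named above, log transformation followed by min-max normalisation in both dimensions, can be sketched in Python as follows. The final comparison step is an assumption made only for illustration: here the score is one minus the mean point-to-point distance between the normalised curves, scaled so that it lies between 0 and 1. The measure actually used is the one defined in Methods of the manuscript and may differ. The function names, the requirement that both curves are sampled at the same induction levels, and the data values are all hypothetical.

```python
import math


def log_minmax(values):
    """Log-transform positive values, then rescale them to the range [0, 1]."""
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        raise ValueError("a flat curve has no shape to normalise")
    return [(v - lo) / (hi - lo) for v in logs]


def similarity(curve_a, curve_b):
    """Shape similarity in [0, 1] of two response curves (1 = same shape).

    Each curve is a list of (input, output) pairs with positive values.
    ASSUMPTION: both curves are sampled at matching induction levels, and
    the distance used is an illustrative choice, not the study's measure.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same number of points")
    xa = log_minmax([x for x, _ in curve_a])
    ya = log_minmax([y for _, y in curve_a])
    xb = log_minmax([x for x, _ in curve_b])
    yb = log_minmax([y for _, y in curve_b])
    distances = [
        math.hypot(x1 - x2, y1 - y2)
        for x1, y1, x2, y2 in zip(xa, ya, xb, yb)
    ]
    # sqrt(2) is the largest possible distance inside the unit square.
    return 1.0 - (sum(distances) / len(distances)) / math.sqrt(2.0)


# Made-up response curves, for illustration only.
curve_1 = [(0.01, 5.0), (0.1, 4.0), (1.0, 0.5), (10.0, 0.1)]
curve_2 = [(x, 10.0 * y) for x, y in curve_1]  # same shape, 10x brighter
curve_3 = [(0.01, 5.0), (0.1, 0.2), (1.0, 0.12), (10.0, 0.1)]  # earlier switch

print(similarity(curve_1, curve_2))
print(similarity(curve_1, curve_3))
```

Because of the normalisation, a curve that is only scaled in output (such as `curve_2`) scores as essentially identical in shape, whereas a curve that switches at a different input level (such as `curve_3`) scores lower.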

Compatibility Scoring
Whilst compatibility tables indicate which pairs of gates may be connected, they offer no indication as to which pairs are most or least compatible. It may be desirable, from an optimisation perspective, to select pairs of gates for which the first's output thresholds lie as far away from the second's ambiguous region as possible. Doing so will improve performance in spite of noise and provide a greater margin for error.
Conversely, we may wish to optimise the library for a specific context by redesign of the parts. In this case, we would like to know which pairs would become compatible with only small changes to their existing behaviour, such that the reward for our optimisation efforts is maximised.
We computed a compatibility score to measure these characteristics, as defined in Methods. A positive score for a pair of gates indicates that the pair is compatible, and a negative score that it is incompatible. Further, a positive score represents the minimum of the maximum perturbations to the thresholds that could be tolerated whilst retaining the compatibility of the pair. A negative score represents the maximum of the minimum perturbations to the thresholds that would be required to make the pair compatible. Thus, pairs of gates can be ranked in terms of compatibility.
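One way to obtain a score with the properties described above is sketched below in Python. This is an illustration under stated assumptions, not the definition given in Methods: the margins are measured directly in fluorescence units (the study may use a different scale or normalisation), and the function and parameter names are hypothetical. The score is the smaller of the two margins between the first gate's output thresholds and the edges of the second gate's ambiguous input region.

```python
def compatibility_score(out_low, out_high, in_low, in_high):
    """Signed margin between one gate's outputs and the next gate's inputs.

    out_low, out_high: output thresholds of the first gate.
    in_low, in_high:   edges of the ambiguous input region of the second gate.

    ASSUMPTION: margins are taken as plain differences of threshold values.
    If both margins are positive, the smaller one is the largest threshold
    perturbation that can be tolerated. If any margin is negative, the most
    negative one is the size of the repair needed to make the pair compatible.
    """
    low_margin = in_low - out_low      # room below the ambiguous region
    high_margin = out_high - in_high   # room above the ambiguous region
    return min(low_margin, high_margin)


# Made-up threshold values, for illustration only.
pairs = {
    ("A", "B"): compatibility_score(0.1, 5.0, 0.3, 3.0),
    ("A", "C"): compatibility_score(0.1, 5.0, 0.05, 6.0),
    ("B", "C"): compatibility_score(0.2, 4.0, 0.25, 3.9),
}

# Rank pairs from most to least compatible.
ranked = sorted(pairs, key=pairs.get, reverse=True)
print(ranked)
```

Taking the minimum of the two margins means that a single violated threshold is enough to make the score negative, which matches the sign convention described above.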
(a) Compatibility score heatmap for gates in the CC118λpir host with the pSeva221 backbone.
(b) Compatibility score heatmap for gates in the CC118λpir host with the pSeva231 backbone.
(c) Compatibility score heatmap for gates in the DH5α host with the pAN backbone.
(d) Compatibility score heatmap for gates in the DH5α host with the pSeva221 backbone.
(e) Compatibility score heatmap for gates in the KT2440 host with the pSeva221 backbone.
(f) Compatibility score heatmap for gates in the KT2440 host with the pSeva231 backbone.
(g) Compatibility score heatmap for gates in the KT2440 host with the pSeva251 backbone.
Figure S5: Compatibility scores between pairs of gates for all contexts. Higher scores are better and indicate more compatible pairs. Negative scores indicate incompatible pairs.

This section contains tables of numerical data that support the plots presented in the manuscript and in the rest of this supplementary material.

Gate Characterisation
The parameters of a Hill equation are estimated from data for each gate. These parameters are then used to calculate the input and output thresholds as described in Methods. The following tables give the values of these parameters and thresholds, and state whether the thresholds define the gate as a functional NOT gate. Values are omitted where the formula for a threshold produces a complex number, i.e. where there is no real solution.
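The following Python sketch illustrates how thresholds can be derived from fitted Hill parameters, and why some of them have no real solution. It assumes a repressing Hill function with parameters `ymin`, `ymax`, `k` and `n`, and output thresholds of twice the minimum output and half the maximum output. Both the parameterisation and the threshold rule are assumptions made for illustration; the definitions actually used are those given in Methods. All function names and numbers are hypothetical.

```python
from typing import Optional


def hill_not(x, ymin, ymax, k, n):
    """Output of a repressing (NOT gate) Hill function at input x."""
    return ymin + (ymax - ymin) * k**n / (k**n + x**n)


def inverse_hill_not(y, ymin, ymax, k, n) -> Optional[float]:
    """Input that gives output y, or None if no real solution exists.

    Solving the Hill function for x gives
        x = k * ((ymax - y) / (y - ymin)) ** (1 / n),
    which is real only when ymin < y < ymax.
    """
    if not (ymin < y < ymax):
        return None
    return k * ((ymax - y) / (y - ymin)) ** (1.0 / n)


def thresholds(ymin, ymax, k, n):
    """Input and output thresholds under the ASSUMED rule described above."""
    out_low = 2.0 * ymin    # assumed output threshold for logic 0
    out_high = ymax / 2.0   # assumed output threshold for logic 1
    return {
        "out_low": out_low,
        "out_high": out_high,
        # Largest input still giving an output of at least out_high.
        "in_low": inverse_hill_not(out_high, ymin, ymax, k, n),
        # Smallest input already giving an output of at most out_low.
        "in_high": inverse_hill_not(out_low, ymin, ymax, k, n),
    }


# Made-up parameters, for illustration only.
good = thresholds(ymin=0.1, ymax=4.0, k=1.0, n=2.0)
flat = thresholds(ymin=1.0, ymax=1.5, k=1.0, n=2.0)

print(good)
print(flat)  # input thresholds are None: dynamic range is too small
```

With this threshold rule, a gate whose maximum output is less than twice its minimum output has no real input thresholds, which corresponds to the omitted values in the tables.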

Compatibility pairings
We calculated the number of compatible pairings for each strain and plasmid backbone combination, as well as aggregates of these combinations (e.g. for the strain DH5alpha, and all backbones which were characterised in this host, an aggregation of contexts DH5alpha pAN and DH5alpha pSeva221). Of interest for each context is:
• the number of gates which are valid, i.e. functional as NOT gates
• the number of compatible pairings as a percentage of all possible pairings
• the number of pairings which are 'inter-context', i.e. connections between gates which are on different backbones, or in different hosts
The maximum number of pairings possible depends on the number of gates available. We calculate this using the binomial coefficient as $2 \times \binom{n}{2} = n(n-1)$, where $n$ is the number of gates available.
Table S28: Numerical similarity scores for AMTR A1 across all contexts. This data is plotted in Figure S4.
Table S29: Numerical similarity scores for BETI E1 across all contexts. This data is plotted in Figure S4.
Table S30: Numerical similarity scores for BM3R1 B1 across all contexts. This data is plotted in Figure S4.
Table S31: Numerical similarity scores for BM3R1 B2 across all contexts. This data is plotted in Figure S4.
Table S32: Numerical similarity scores for BM3R1 B3 across all contexts. This data is plotted in Figure S4.
Table S33: Numerical similarity scores for HIYIIR H1 across all contexts. This data is plotted in Figure S4.
Table S34: Numerical similarity scores for LCARA I1 across all contexts. This data is plotted in Figure S4.
Table S35: Numerical similarity scores for LITR L1 across all contexts. This data is plotted in Figure S4.
Table S36: Numerical similarity scores for LMRA N1 across all contexts. This data is plotted in Figure S4.
Table S37: Numerical similarity scores for PHIF P1 across all contexts. This data is plotted in Figure S4.
Table S38: Numerical similarity scores for PHIF P2 across all contexts. This data is plotted in Figure S4.
Table S39: Numerical similarity scores for PHIF P3 across all contexts. This data is plotted in Figure S4.
Table S40: Numerical similarity scores for PSRA R1 across all contexts. This data is plotted in Figure S4.
Table S41: Numerical similarity scores for QACR Q1 across all contexts. This data is plotted in Figure S4.
Table S42: Numerical similarity scores for QACR Q2 across all contexts. This data is plotted in Figure S4.
Table S43: Numerical similarity scores for SRPR S1 across all contexts. This data is plotted in Figure S4.
Table S44: Numerical similarity scores for SRPR S2 across all contexts. This data is plotted in Figure S4.
Table S45: Numerical similarity scores for SRPR S3 across all contexts. This data is plotted in Figure S4.
Table S46: Numerical similarity scores for SRPR S4 across all contexts. This data is plotted in Figure S4.

Using the provided codes and data
The code that implements the in silico methods from this study can be found and downloaded at https://github.com/lgrozinger/pyolin, along with instructions detailing its use.
In order to obtain the results from this particular study, only Python 3.6 or greater is needed. However, generating the figures used in the manuscript further requires gnuplot and LaTeX. Once these dependencies are installed, running the command
    python3 produce_figures.py full-update
from the project root directory will perform the analysis and place the figures in the appropriate subdirectories. Docker users may find it more convenient to use the Dockerfile associated with the project.
It should also be possible to run the same analysis on a different dataset, if the data is provided as a csv file with the expected fields and format. The processed data from this study that is provided with the code can act as a guideline.
The package is also capable of generating UCF files that can be used with the 'Cello' automated design framework [2].
The data used for this study can be obtained at https://figshare.com/s/18e6a10d708d15839837. A description of the files found there is given in the accompanying README file.