
# Phosphatase activity tunes two-component system sensor detection threshold

## Abstract

Two-component systems (TCSs) are the largest family of multi-step signal transduction pathways in biology, and a major source of sensors for biotechnology. However, the input concentrations to which biosensors respond are often mismatched with application requirements. Here, we utilize a mathematical model to show that TCS detection thresholds increase with the phosphatase activity of the sensor histidine kinase. We experimentally validate this result in engineered Bacillus subtilis nitrate and E. coli aspartate TCS sensors by tuning their detection threshold up to two orders of magnitude. We go on to apply our TCS tuning method to recently described tetrathionate and thiosulfate sensors by mutating a widely conserved residue previously shown to impact phosphatase activity. Finally, we apply TCS tuning to engineer B. subtilis to sense and report a wide range of fertilizer concentrations in soil. This work will enable the engineering of tailor-made biosensors for diverse synthetic biology applications.

## Introduction

A central goal of synthetic biology is to program cells to sense and respond to chemical or physical inputs in desired ways1. To this end, researchers develop genetically encoded sensors, often based upon multi-step signal transduction pathways or one-component transcription factors2 that convert inputs of interest into biological signals such as gene expression. However, all biosensors respond to their cognate inputs over finite concentration ranges that are often mismatched with application demands3.

Despite this challenge, there has been little focus on developing technologies for tuning biosensor detection windows. In two recent studies, the input concentrations required to activate Escherichia coli nitrate and hydrogen peroxide sensors by 50% (i.e., the detection thresholds, quantified by the parameter K1/2) were decreased 412- and 15-fold by linking the respective sensors to the expression of a phage recombinase that inverts a segment of DNA into an orientation appropriate for transcription of an output gene4,5. Though this approach is simple and modular, the recombination step is irreversible and delays sensor response by up to 15 h, making it incompatible with applications requiring dynamic or rapid responses. In a separate pair of yeast studies, RNA secondary structure design was used to lower the detection threshold of an engineered theophylline-responsive antiswitch from 10 mM to 1 mM6, and protein expression level optimization was used to reduce the estradiol detection threshold of the mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase pathway from 32 µM to 6.6 µM7. However, antiswitches currently sense a limited number of inputs, and both of these approaches yield modest changes in detection threshold, limiting the utility of these strategies. Finally, computational design8 and directed evolution9 of ligand-binding transcription factors show promise for tuning sensor detection thresholds. However, these methods are time and labor intensive and require extensive domain-specific expertise, limiting their widespread use.

Two-component systems (TCSs) are an important source of sensors for synthetic biology. Tens of thousands of TCSs have been identified in bacterial genome sequences. Individual members of this family sense inputs as diverse as metal ions of particular oxidation states10, respiratory electron acceptors11, gases12, inorganic phosphate13, heme14, quorum sensing autoinducers15, antimicrobial peptides16, simple sugars17, gut polysaccharides derived from the diet18 or host19, human20 and plant21 hormones, oxidative stress22, physical contact23, and specific wavelengths of light24. Synthetic biologists have begun to repurpose light-sensing TCSs to function as sensors for optogenetics25,26,27,28 and chemical-sensing TCSs to engineer diagnostic gut bacteria29,30,31, among other applications.

The prototypical TCS comprises two proteins: a sensor histidine kinase (SK) and a response regulator (RR) (Fig. 1a). The SK contains a (typically extracellular) N-terminal sensor domain that switches from an inactive to active conformation in the presence of the input32. This conformational change is transmitted to a C-terminal cytoplasmic signaling region comprised of catalytic and adenosine triphosphate (ATP) binding (CA) and dimerization and histidine phosphotransfer (DHp) domains. The CA domain catalyzes the transfer of the gamma phosphoryl group from ATP to a conserved histidine residue within the DHp domain. The phosphorylated SK (SK~P) binds the RR via a DHp interaction interface, and transfers the phosphoryl group to a conserved RR aspartate. Phosphorylation activates the RR, driving it to modulate transcription from one or more output promoters. Many SKs are also bi-functional and dephosphorylate the phosphorylated RR (RR~P) (Fig. 1a)33. The presence of input increases the rate at which the RR is phosphorylated, decreases the rate at which the RR~P is dephosphorylated, or both33. Many SK mutations, in both the DHp and CA domains, have been identified that decrease this phosphatase activity, resulting in increased RR~P levels34,35,36,37,38,39. When this increase is substantial, it results in leaky transcriptional output, i.e., output in the absence of input35,36,39. However, the impact of these phosphatase-altering mutations on TCS detection thresholds has not been considered.

Here, we combine mathematical modeling with an experimental synthetic biology approach to show that mutations that alter SK phosphatase or kinase activity can be used to rationally tune TCS detection thresholds. We demonstrate that our method functions in Gram-negative and Gram-positive bacteria and in diverse chemical-sensing TCSs. We go on to demonstrate that a widely conserved residue can be mutated to tune the detection thresholds of two recently described TCSs for which signaling mutations have not yet been identified. Finally, we utilize Bacillus subtilis expressing wild-type and sensitivity-enhanced nitrate sensors to quantify a wide range of fertilizer levels in soil. These sensors could be used to control the expression of engineered nitrogen fixation pathways in order to achieve synthetic nitrate homeostasis in soil.

## Results

### Mathematical model of TCS detection threshold

We hypothesized that TCS detection thresholds could be tuned by introducing mutations that alter SK kinase or phosphatase activity without compromising the overall response (i.e., dynamic range, or ratio of output in saturating versus zero input) of the system. Specifically, we considered that the detection threshold of a TCS occurs at the particular RR~P concentration that elicits a half-maximal output promoter response (i.e., RR~P1/2). For any input concentration, the corresponding RR~P concentration is set by the ratio of SK kinase to phosphatase activity40. Thus, we reasoned that mutations that enhance kinase or reduce phosphatase activity should result in RR~P1/2 being reached at a lower input concentration, thereby reducing TCS detection threshold. The opposite should also be true: TCS detection thresholds should increase with kinase-reducing or phosphatase-enhancing mutations. Furthermore, if a mutation is sufficiently weak that the window of altered RR~P concentrations still traverses the range to which the output promoter is sensitive, there should be little effect on TCS dynamic range.

To examine this hypothesis, we first utilized a previous mathematical model of TCS signaling41. We parameterized the model with the best available in vivo experimental values of TCS reaction rates as determined for the well-studied inorganic phosphate-sensing TCS PhoRB42. Then, we set the phosphatase activity parameter to different values between 1% and 10,000% that of wild type. Finally, we evaluated the resulting detection thresholds by simulating the relationship between input concentration and gene expression output (i.e., the transfer function) in each case (Supplementary Note 1)43. In agreement with our hypothesis, the model predicts that TCS detection threshold can be tuned by altering SK phosphatase activity (Fig. 1b). Moreover, intermediate changes in phosphatase activity alter detection threshold without impacting dynamic range (Supplementary Fig. 1). As expected, large decreases or increases in phosphatase activity result in high basal or low maximal expression, respectively, and thereby reduce dynamic range.

We also found that modulating kinase activity had the reciprocal effect to that of modulating phosphatase activity, with increasing kinase activity decreasing the detection threshold and decreasing kinase activity increasing the detection threshold (Supplementary Fig. 1). However, our primary goal is to decrease TCS detection thresholds, and it is easier to identify mutations that decrease rather than increase enzymatic activity. Thus, we chose to focus on decreasing phosphatase activity as opposed to increasing kinase activity. This decision is supported by mutational screens of SK activity that have found that decreases in phosphatase activity are much more common than increases in kinase activity38.
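The qualitative trend described above can be illustrated with a deliberately minimal steady-state sketch. This is not the published model of ref. 41, nor its PhoRB parameterization: the function names and every parameter (`kd`, `k_kin`, `k_phos`, `k_prom`) are hypothetical choices made for illustration. The sketch assumes that ligand occupancy of the SK sets the kinase flux, that the phosphatase flux is independent of input, and that the output promoter responds hyperbolically to the RR~P fraction.

```python
import math


def rr_p_fraction(x, kd, k_kin, k_phos):
    """Steady-state fraction of phosphorylated RR at input concentration x.

    Assumption: kinase flux scales with ligand occupancy of the SK,
    while phosphatase flux is constant.
    """
    occupancy = x / (kd + x)
    return k_kin * occupancy / (k_kin * occupancy + k_phos)


def output(x, kd, k_kin, k_phos, k_prom):
    """Promoter output (arbitrary units) as a hyperbolic function of RR~P."""
    f = rr_p_fraction(x, kd, k_kin, k_phos)
    return f / (k_prom + f)


def detection_threshold(kd, k_kin, k_phos, k_prom):
    """Input giving half-maximal output (K1/2), found by log-space bisection."""
    out_min = output(0.0, kd, k_kin, k_phos, k_prom)
    f_max = k_kin / (k_kin + k_phos)          # RR~P fraction at saturating input
    out_max = f_max / (k_prom + f_max)
    target = 0.5 * (out_min + out_max)
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo * hi)              # geometric midpoint
        if output(mid, kd, k_kin, k_phos, k_prom) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    # Hypothetical values: sweep phosphatase activity relative to a reference of 1.0.
    for k_phos in (0.01, 0.1, 1.0, 10.0, 100.0):
        k_half = detection_threshold(kd=1000.0, k_kin=1.0, k_phos=k_phos, k_prom=0.5)
        print(f"relative phosphatase activity {k_phos:>6}: K1/2 = {k_half:.1f}")
```

In this toy setting, K1/2 rises monotonically with `k_phos` and falls with `k_kin`, mirroring the reciprocal relationship described in the text. It does not capture the loss of dynamic range at extreme activities, which depends on details of the full model.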

### Tuning the detection threshold of a nitrate sensor

To examine our modeling results experimentally, we selected two point mutations, C415R and D558V, that decrease the phosphatase activity of the E. coli nitrate-activated SK NarX via different mechanisms and to different extents. C415R targets the DHp interaction interface, weakens the interaction between NarX and its cognate RR NarL, and causes a moderate reduction in phosphatase activity39. On the other hand, D558V targets the CA domain and is thought to decrease phosphatase activity more strongly than C415R. However, because its impact has been measured only with gene expression assays, it is also possible that D558V may increase kinase activity39. We measured the nitrate detection thresholds of a wild-type NarXL that we engineered to function in Bacillus subtilis, and its corresponding C415R and D558V variants (Fig. 2a; Supplementary Fig. 2). The wild-type system exhibits a relatively high K1/2 of 762 μM (95% confidence interval (CI) 629–963 μM) (Fig. 2b). On the other hand, the medium strength C415R mutation decreases the value substantially (K1/2 = 22 μM, 95% CI 16–33 μM), and the strong D558V mutation reduces it even further (K1/2 = 6 μM, 95% CI 0–23 μM) (Fig. 2b).

Dynamic range is commonly reported as the primary performance metric for biosensors. The C415R and D558V versions of our nitrate sensor exhibit decreased dynamic range due to increased minimum output levels (Fig. 2b). Thus, we individually optimized SK and RR expression levels in these mutated sensors in an effort to maximize the dynamic range for each (Supplementary Fig. 3). Consistent with our modeling results, maximal dynamic range decreases from 1909-fold (wild type) to 78-fold (C415R) and 2-fold (D558V) (Supplementary Figs. 1, 3). On the other hand, the amplitude range, or difference between maximum and minimum output, may be a more useful performance metric for many applications. While the amplitude range of our wild-type nitrate sensor is 24,652 molecules of equivalent fluorescein (MEFL) (12 MEFL to 24,664 MEFL), it increases to 65,402 MEFL (2,508 MEFL to 67,910 MEFL) for C415R and 31,294 MEFL (34,758 MEFL to 66,052 MEFL) for D558V (Fig. 2b). These results provide compelling initial support for our approach.

To more rigorously validate TCS tuning, we next developed a strategy to continuously vary phosphatase activity in live cells (Fig. 3a). Specifically, we expressed wild-type NarX and NarX (C415R) under two different chemically inducible promoters and utilized green fluorescent protein (GFP) fusions and quantitative flow cytometry to map the relationship between inducer and SK levels (Fig. 3b; Supplementary Figs. 4–6). Then, we used different inducer combinations to achieve NarX/NarX (C415R) expression ratios between 0% and 100% at a constant total SK expression level (NarX + NarX (C415R)) (Fig. 3b). Assuming NarX and NarX (C415R) function identically outside of their different phosphatase activities, tuning their expression ratio in this way enables us to continuously vary phosphatase activity between mutant and wild-type levels. In strong agreement with our modeling results (Fig. 1b), the nitrate detection threshold decreases continuously from K1/2 = 1138 μM to K1/2 = 12 μM as the percentage of wild-type NarX decreases from 100 to 0% (Fig. 3c, d). The amplitude range increases from 46,372 MEFL (20 MEFL to 46,392 MEFL) to 87,324 MEFL (1,205 MEFL to 88,529 MEFL) as the percentage of mutant SK increases from 0 to 90%. Upon continued increase to 100% NarX (C415R), the amplitude range decreases slightly to 79,628 MEFL (11,118 MEFL to 90,746 MEFL) (Fig. 3c). We also observe that an eightfold decrease in detection threshold can be achieved with only a twofold decrease in the dynamic range, which follows model predictions that moderate changes to the detection threshold have minor effects on dynamic range (Supplementary Note 1; Supplementary Figs. 1, 5). However, the large 100-fold decrease in detection threshold between 100% and 0% wild-type expression also decreases the dynamic range from 2,334-fold to 8-fold. This experiment clearly shows that TCS detection threshold can be tuned by varying SK phosphatase activity. Furthermore, this iso-SK technique provides a synthetic biology method for tuning the detection threshold of a TCS to intermediate values not achievable using a mutation alone.

### Detection threshold tuning of an E. coli aspartate sensor

To evaluate the extensibility of our technology to other sensors and organisms, we next examined the engineered E. coli aspartate-activated TCS Taz-OmpR (Fig. 4a). Here, the SK Taz phosphorylates and dephosphorylates the transcription-regulating RR OmpR. Taz phosphatase activity is high in the absence of aspartate, and low in its presence44. A previous study identified numerous phosphatase-altering mutations of different strengths at Taz T436. In particular, substituting S, V, E, D, and K at this site decreases phosphatase activity by 10, 60, 91, 91, and 98%, respectively, and introducing A increases phosphatase activity by 25%36. Consistent with our NarX results (Fig. 2), T436S and T436V reduce the Taz-OmpR aspartate detection threshold in proportion to their strength (wild type: 19 μM (95% CI 15–25 μM); T436S: 12 μM (95% CI 8–20 μM); T436V: 4 μM (95% CI 0–13 μM)) (Fig. 4b; Supplementary Fig. 7). Furthermore, T436A increases the detection threshold to 67 μM (95% CI 45–120 μM) (Fig. 4b; Supplementary Fig. 7). This T436A result indicates that the SK phosphatase activity alone, as opposed to an alternate effect of phosphatase-reducing mutations, is responsible for tuning TCS detection threshold. Furthermore, these data agree with previous results that show drastic decreases in phosphatase activity result in lowered dynamic ranges, while smaller changes, such as with T436A, have little effect on dynamic range (Supplementary Note 1; Supplementary Fig. 7). We conclude that TCS tuning can be used to both reduce and increase detection threshold, and can be applied to diverse TCS sensors and host bacteria.

Interestingly, the strong T436E, D, and K mutations abolish the Taz-OmpR aspartate response altogether (Supplementary Fig. 7). Simultaneous introduction of C415R and D558V into NarX destroys signaling to NarL as well (Supplementary Fig. 8). These results demonstrate that if phosphatase mutations are too strong, the SK will fail to signal to the RR, thereby imposing limits on the magnitude of sensitivity enhancement.

### Bioinformatic identification of a TCS hot spot residue

Unlike the initial model systems that we examined, most TCSs lack known phosphatase mutations. Therefore, we next aimed to develop a general method to apply TCS tuning to a wide range of systems. Taz T436 resides in the second (variable) position of the well-studied CA domain GXGXG motif, which is involved in binding an adenosine diphosphate co-factor that regulates SK phosphatase activity, as well as binding the phosphodonor ATP32. We performed a bioinformatic analysis that revealed that GXGXG is present in 64% of all bacterial SKs (Fig. 5a; Supplementary Fig. 9). Therefore, we hypothesized that the second GXGXG position might serve as a general hot spot residue that can be mutated to alter the detection thresholds of many TCSs.

To validate this strategy, we examined the tetrathionate sensor TtrSR and the thiosulfate sensor ThsSR (Fig. 5b, c; Supplementary Figs. 10, 11), two TCSs that we recently discovered in the genomes of marine Shewanella and ported into E. coli30. Like most SKs, the corresponding SKs TtrS and ThsS both contain the GXGXG motif and lack known phosphatase mutations. Therefore, we performed saturation mutagenesis on the second GXGXG residue in each (TtrS L627, ThsS L547) (Fig. 5a), and measured the response of both the wild-type and all 38 mutant TCSs to their cognate ligands (Supplementary Figs. 10, 11). Remarkably, we observed that 14 and 9 amino acids result in functional TtrSR and ThsSR sensors, respectively (Supplementary Figs. 10, 11). Most of the functional residues have high hydropathy scores, suggesting this site best tolerates hydrophobic amino acids (Supplementary Figs. 10, 11). Then, we characterized the transfer functions of the ten TtrSR and ThsSR variants exhibiting the largest fold activation (Supplementary Figs. 10, 11). All of the mutations that we tested lower the detection threshold (Fig. 5d–g). In the case of TtrSR, K1/2 varies between 35.6 μM (95% CI 27–48 μM) for wild-type and 1.5 μM (95% CI 1.3–1.9 μM) for the strongest mutant (Fig. 5d). For ThsSR, K1/2 varies between 192 μM (95% CI 138–305 μM) and 35 μM (95% CI 24–60 μM) (Fig. 5e). Because the CA domain is involved in the kinase and phosphatase reactions, further characterization is needed to determine which enzymatic activity, or activities, have been changed by these GXGXG mutations. Interestingly, we found that the TtrS(L627A) mutant not only decreased the detection threshold from 35.6 μM to 2.4 μM, but it also increased the dynamic range from 15- to 21-fold and the amplitude range from 1,377 MEFL (100 MEFL to 1,477 MEFL) to 2,095 MEFL (105 MEFL to 2,200 MEFL) (Fig. 5f). Conversely, decreasing the detection threshold of the thiosulfate sensor twofold with L547T resulted in a decrease in dynamic range from 34- to 13-fold and an increase in amplitude range from 19,390 MEFL (596 MEFL to 19,986 MEFL) to 23,482 MEFL (1,905 MEFL to 25,387 MEFL) (Fig. 5g). We conclude that mutating the second GXGXG residue is a simple strategy for tuning the detection thresholds of diverse TCSs.
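The two performance metrics used throughout this section can diverge, as the TtrS and ThsS examples show. The short sketch below recomputes both from the minimum and maximum outputs quoted in the preceding paragraph. The function names are hypothetical; the definitions follow the text (dynamic range as the ratio, and amplitude range as the difference, of maximum and minimum output).

```python
def dynamic_range(min_out, max_out):
    """Fold change: output at saturating input divided by output at zero input."""
    return max_out / min_out


def amplitude_range(min_out, max_out):
    """Absolute span of the response, in the same units as the outputs (MEFL)."""
    return max_out - min_out


if __name__ == "__main__":
    # (minimum MEFL, maximum MEFL) as quoted in the text for Fig. 5f, g.
    sensors = {
        "TtrS wild type": (100, 1477),
        "TtrS L627A": (105, 2200),
        "ThsS wild type": (596, 19986),
        "ThsS L547T": (1905, 25387),
    }
    for name, (lo, hi) in sensors.items():
        print(f"{name}: {dynamic_range(lo, hi):.1f}-fold, "
              f"{amplitude_range(lo, hi)} MEFL amplitude")
```

For ThsS L547T the fold change drops because the minimum output roughly triples, even though the absolute span grows. This is why amplitude range may be the more informative metric when a downstream circuit responds to absolute expression levels.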

### Application of TCS tuning to fertilizer biosensing

Finally, we set out to demonstrate a proof-of-principle application for TCS tuning. Nitrate is the primary source of nitrogen used by crops, and a major component of fertilizer. However, over-application of fertilizer causes billions of dollars in damage per year to human health and the environment45. Recently, synthetic biologists have expressed bacterial nitrogen fixation pathways, which ultimately convert atmospheric N2 into nitrate, in non-native host bacteria46. However, heterologous production of nitrogen fixation pathways in soil bacteria could also lead to nitrate overproduction. To prevent this outcome, genetic feedback control systems wherein bacteria sense a wide range of soil nitrate levels and induce nitrogen fixation pathways only to the extent that they are needed are highly desirable.

To demonstrate such a sensing capability, we incubated B. subtilis engineered to express our wild-type and C415R NarXL systems in soil spiked with various amounts of a nitrate standard, and measured the resulting superfolder GFP (sfGFP) fluorescence values via flow cytometry (Methods; Fig. 6a; Supplementary Fig. 12). Then, we used the resulting data to generate a standard curve relating sfGFP fluorescence to soil nitrate concentration (Supplementary Fig. 12; Supplementary Note 2). Then, we added different amounts of commercial fertilizer, rather than nitrate, to the soil (Methods; Supplementary Fig. 12). Using the standard curves, we compared the amount of nitrate reported by each of our sensor systems to the amount specified by the manufacturer (Supplementary Note 2). Indeed, the wild-type NarXL system enables estimation of fertilizer levels within twofold of the manufacturer value between the tested values of 31.6 μM and 562 μM nitrate, while the C415R system allows accurate detection between 5.62 μM and 562 μM (Fig. 6b). This experiment demonstrates that we can use TCS tuning to engineer bacteria to sense a large range of nitrate concentrations in a complex soil environment. Such broad-range sensing could be coupled with nitrogen fixation pathways to maintain soil nitrate at ideal levels in different agricultural contexts.
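The standard-curve step can be sketched as follows. The actual procedure is given in Supplementary Note 2 and is not reproduced here; this example simply assumes that the standard curve is a Hill function of nitrate concentration and inverts it to estimate nitrate from a measured fluorescence. All function names and parameter values are hypothetical.

```python
def hill(x, y_min, y_max, k_half, n):
    """Assumed standard curve: fluorescence as a Hill function of nitrate (x)."""
    return y_min + (y_max - y_min) * x**n / (k_half**n + x**n)


def invert_hill(y, y_min, y_max, k_half, n):
    """Estimate nitrate concentration from fluorescence y using the curve above."""
    if not (y_min < y < y_max):
        raise ValueError("fluorescence lies outside the invertible range of the curve")
    return k_half * ((y - y_min) / (y_max - y)) ** (1.0 / n)


def within_twofold(estimate, reference):
    """True if the estimate is within a factor of two of the reference value."""
    return 0.5 <= estimate / reference <= 2.0


if __name__ == "__main__":
    # Hypothetical curve parameters, not fitted values from this work.
    params = dict(y_min=20.0, y_max=46000.0, k_half=1000.0, n=1.2)
    measured = hill(100.0, **params) * 1.3   # pretend measurement with 30% error
    estimate = invert_hill(measured, **params)
    print(f"estimated nitrate: {estimate:.1f} uM; "
          f"within twofold of 100 uM: {within_twofold(estimate, 100.0)}")
```

The inversion is only defined strictly between the minimum and maximum of the curve, which is one way to see why each sensor reports reliably over a limited window and why combining sensors with different K1/2 values widens the measurable range.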

## Discussion

This work extends a growing suite of techniques for engineering TCSs to function as sensors for synthetic biology. First, literature searches27,47,48 or bioinformatics30 can be used to identify TCSs that sense inputs of interest. If a candidate TCS has a known output promoter, and functions in the desired host and environmental conditions, it can be used as an off-the-shelf sensor without further modifications47,49. Otherwise, the sensor domain can potentially be swapped onto the SK of a second TCS that contains a reliable output promoter, resulting in the design of a chimeric sensor25,50,51. Like all gene regulatory systems, TCSs can exhibit substantial ‘leakiness’ in the off state, or modest dynamic range. These performance features can be improved by redesigning the sequence of the output promoter and optimizing the expression levels of the SK and RR27,30,48,52.

However, this workflow may produce sensors that do not respond appropriately to application-relevant input concentrations. For example, tetrathionate was previously shown to be elevated in the mouse colon during Salmonella typhimurium-induced inflammation53. Following this report, Silver and colleagues31 used S. typhimurium TtrSR to activate a transcriptional memory circuit in order to engineer a gut bacterium that senses and remembers tetrathionate exposure in order to diagnose colon inflammation. However, despite 100% tetrathionate activation in vitro, most bacteria expressing this sensor device are not activated by inflammatory conditions in vivo31. One possible reason for this discrepancy is that in vivo tetrathionate concentrations do not reach the S. typhimurium TtrSR detection threshold. Thus, by using TCS tuning to lower the detection threshold of TtrSR (Fig. 5f), it is possible that the performance of this diagnostic gut bacterium could be improved.

It is possible that nature uses phosphatase activity as a knob to tune TCS detection threshold as well. First, there is a wide range of SK residues that can be mutated to specifically alter phosphatase activity39. This fact suggests that evolution can tune TCS detection thresholds, which could enable organisms to adapt to new niches with different input concentrations. Interestingly, few mutations have been discovered that increase phosphatase activity or decrease kinase activity. Currently, this fact restricts our TCS tuning method to applications where lower detection thresholds (i.e., increases in sensitivity) are needed. However, sensitivity decreases are also desirable in many synthetic biology applications, which motivates future work to identify appropriate mutations.

Additionally, some SKs interact with phosphatase-modulating auxiliary proteins54. It is possible that these auxiliary proteins can tune the detection thresholds of the corresponding TCSs. Unlike SK mutations, they could also be dynamically induced or repressed in response to changing environmental or physiological conditions to temporarily adjust detection thresholds. This phenomenon is analogous to our use of chemically inducible promoters to adjust the NarXL nitrate detection threshold in our iso-SK experiment (Fig. 3). These intriguing possibilities remain to be explored.

Finally, our approach may be extensible to other kinase pathways. For example, eukaryotes use MAPK cascades to sense and respond to important extracellular signals such as growth factors and immunomodulators55. Threonine and tyrosine phosphatases modulate signaling through these pathways by dephosphorylating MAP kinases56. Researchers have expressed variants of these phosphatases under synthetic feedback control to re-program pathway response dynamics57,58. Alternatively, by constitutively expressing such phosphatases to different extents, or expressing phosphatases of different strengths, the detection thresholds of MAPK cascades could potentially be tuned.

In conclusion, we have demonstrated a simple, general strategy for tuning the detection threshold of TCSs—one of the largest and most diverse families of sensors in biology. Due to its effectiveness and ease of use, our method should have widespread applications in synthetic biology.

## Methods

### DNA and bacterial strain construction

Details of synthetic DNAs used in this work are given in Supplementary Data 1-4. All E. coli systems are expressed on extrachromosomal plasmids. All plasmids were assembled via Golden Gate cloning59. Assembled plasmids were transformed into E. coli NEB 10-β (New England Biolabs, cat no. C3019H). Ribosome binding site (RBS) strengths were calculated using the RBS calculator60.

All B. subtilis systems are constructed as linear double-stranded DNA Integration Modules (IMs) and integrated into the chromosome. All IMs were assembled with Golden Gate cloning59. Assembled DNA was amplified with PCR, transformed into B. subtilis 168 (BGSCID 1A1) and recombined into the chromosome using the two-step transformation protocol61. B. subtilis genomic DNA was then purified (Promega, A1120) and used for subsequent transformations.

E. coli NEB 10-β and B. subtilis 168 were grown in LB Miller broth shaking at 250 rpm at 37 °C. Then, 50 μg mL−1 ampicillin, 35 μg mL−1 chloramphenicol, and 100 μg mL−1 spectinomycin for E. coli and 100 μg mL−1 spectinomycin, 0.5 μg mL−1 erythromycin, 5 μg mL−1 chloramphenicol, and 5 μg mL−1 kanamycin for B. subtilis were added where appropriate. Transformed strains were stored in 15% glycerol stocks at −80 °C.

E. coli plasmids are available from Addgene using accession numbers listed in Supplementary Data 3. B. subtilis constructs are available from the Bacillus Genetic Stock Center using BGSC numbers listed in Supplementary Data 4.

### In vitro nitrate experiments

In vitro nitrate induction experiments were conducted with B. subtilis 168 ΔydfHI::camR (iND46; Supplementary Fig. 2). C minimal media with sodium succinate and potassium glutamate (CSE media) containing 30 mM KH2PO4 (Fisher BioReagents, BP362-1), 70 mM K2HPO4 (Fisher BioReagents, BP363-1), 25 mM (NH4)2SO4 (Sigma, A4418-100G), 10 mM MnSO4 (Sigma-Aldrich, M7634-100G), 500 µM MgSO4 (VWR, BDH9246-500G), 12.5 µM ZnCl2 (Sigma, Z0152-50G), 245 µM L-tryptophan (Sigma-Aldrich, T0254-25G), 22 mg L−1 ammonium iron(III) citrate (Sigma-Aldrich, F5879-100G), 43.2 mM potassium glutamate (Alfa Aesar, A17232), 22.2 mM sodium succinate (Alfa Aesar, 33386), and 43.4 mM glycerol (Fisher BioReagents, BP229-1) were used without antibiotics. Induction conditions were 25 mM NaNO3 (Sigma-Aldrich, S5506), 10 µM isopropyl β-d-1-thiogalactopyranoside (IPTG) (IBI Scientific, IB02125), and 1% xylose (Alfa Aesar, A10643) unless otherwise noted. IPTG and xylose levels were chosen for optimal fold change of the NarX(D558V) TCS (Supplementary Fig. 3). An overnight culture was inoculated from a 15% glycerol freezer stock and grown in 3 mL of media for 13–15 h. Cells were diluted to OD600 = 3 × 10−4 with relevant inducers in a 500 µL volume in 24-well plates sealed with a tin foil adhesive (VWR, F96VWR100). Cells were grown to an OD600 = 0.3 (approximately 6 h) and placed on ice prior to measuring via flow cytometry with an FL1 gain of 600. All growth was conducted shaking at 250 rpm at 37 °C.

### Aspartate experiments

Aspartate induction experiments were conducted in E. coli BW29655 (BW28357 Δ(envZ-ompR)520(::FRT); CGSC #7934; Yale University). M9 media containing 1× M9 salts (42 mM Na2HPO4, 24 mM KH2PO4, 8.9 mM NaCl, 19 mM NH4Cl; Teknova, M1902), 2 mM MgSO4 (VWR, BDH9246-500G), and 0.1 mM CaCl2 (Alfa Aesar, L13191) were used with 22.2 mM glucose (Avantor, 4908-06) as a carbon source and supplemented with 2 g L−1 casamino acids, 50 μg mL−1 ampicillin, 35 μg mL−1 chloramphenicol, 100 μg mL−1 spectinomycin, 10 μM IPTG, and 50 ng mL−1 anhydrotetracycline (aTc; Takara Bio USA, 631310). Then, 3 mL of this medium in a 14 mL culture tube was inoculated to OD600 = 5 × 10−3 from a single-use 15% glycerol stock stored at −80 °C containing cells frozen during exponential phase. Bacteria were grown for 2 h shaking at 250 rpm at 37 °C. Amino acids were then removed by centrifuging at 3220 × g for 5 min, resuspending in 5 mL of media without casamino acids, centrifuging at 3220 × g for 5 min, and resuspending in 5 mL of media without casamino acids. Aspartate was added to the culture and bacteria were grown for 2 h shaking at 250 rpm at 37 °C, placed on ice, and then measured via flow cytometry with an FL1 gain of 750.

### Computational analysis of the phosphatase hot spot residue

To estimate the fraction of known SKs that contain the phosphatase hot spot residue, we first assembled a library of non-redundant SK sequences from 4861 NCBI (National Center for Biotechnology Information) RefSeq bacterial genomes using HMMER362. We used hmmsearch to identify all proteins that had a C-terminal kinase core composed of a single kinase domain (Pfam: HisKA, HisKA_2, HisKA_3, His_kinase, H-kinase_dim) followed by an HATPase_c domain (reporting threshold set to 12.0 for each). We eliminated SKs with non-canonical signaling architectures by requiring that each had at least a minimal sensing region (>10 a.a. N terminal of the kinase core) and contained neither a Receiver domain (Response_reg) nor a histidine phosphotransfer domain (Hpt). This constraint resulted in 105,144 SK proteins. To eliminate redundant sequences from this pool, we used usearch63 to cluster the sequences according to a 60% sequence similarity threshold (using ‘-cluster_fast’ and ‘-sort length’ parameters). The centroids of each cluster were then used as representatives of non-redundant SKs, resulting in 56,855 proteins. We next created a hidden Markov model (HMM) representing the G2 box motif (Supplementary Fig. 9) by aligning 12 representative G2 box sequences64 and using hmmbuild to create a model. This model was then used with hmmsearch (default parameters) to identify matching SKs in the non-redundant set, yielding 38,966 SKs with putative G2 box motifs. Two additional criteria were used to eliminate false positives: (1) the putative G2 box must align to the correct region of the protein (C terminal to the HisKA domain), and (2) the G2 box must have G3 and G5 present when aligned to the HMM. Applying these constraints left 36,508 SKs, constituting 64.21% of the full non-redundant SK data set. Finally, the distribution of residues in the second position of the GXGXG motif was tabulated from these SKs.
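The final tabulation step can be illustrated with a simplified stand-in. The analysis above relies on profile HMMs (hmmbuild and hmmsearch) and a positional constraint relative to the HisKA domain; the sketch below replaces both with a plain regular expression for GXGXG and ignores motif position, so it would overcount on real proteomes. The example sequences are invented strings, not real SK sequences, and the function name is hypothetical.

```python
import re
from collections import Counter

# G1-X2-G3-X4-G5; the capture group is the second (variable) position.
G2_BOX = re.compile(r"G(.)G.G")


def second_position_residues(sequences):
    """Count sequences with a GXGXG match and tabulate the residue at position 2.

    Only the first match in each sequence is considered.
    """
    counts = Counter()
    n_with_motif = 0
    for seq in sequences:
        match = G2_BOX.search(seq.upper())
        if match:
            n_with_motif += 1
            counts[match.group(1)] += 1
    return n_with_motif, counts


if __name__ == "__main__":
    toy = ["MKTGLGLGLAI", "AAGTGLGRR", "MMMKKKLLL", "PPGLGAGSS"]  # invented
    n_hits, residues = second_position_residues(toy)
    print(f"{n_hits}/{len(toy)} sequences contain GXGXG; position 2: {dict(residues)}")

    # Fraction reported in the text: 36,508 of 56,855 non-redundant SKs.
    print(f"{100 * 36508 / 56855:.2f}% of non-redundant SKs")
```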

### Tetrathionate and thiosulfate experiments

Tetrathionate and thiosulfate induction experiments were conducted with E. coli BW28357 (CGSC#: 7991, Yale University). M9 media were used with 1 × M9 salts, with 43.4 mM glycerol (Fisher BioReagents, BP229-1) as a carbon source, 2 g L−1 casamino acids (EMD Millipore, 2240-500GM), 35 μg mL−1 chloramphenicol, and 100 μg mL−1 spectinomycin. For thiosulfate experiments, 200 μM IPTG and 20 ng mL−1 aTc were used, and leaky expression of the TtrSR TCS without inducers was found to be sufficient. Ligand induction was achieved with K2S4O6 (Sigma-Aldrich, P2926-25G) or Na2S2O3 (Sigma-Aldrich, 217247-25G). The experiment was started by inoculating 3 mL of media in a 14 mL culture tube to OD600 = 1 × 10−4 from a single-use 15% glycerol stock frozen during exponential phase and stored at −80 °C. Bacteria were grown at 37 °C shaking at 250 rpm for 4 h, placed on ice, and then measured via flow cytometry with a FL1 gain of 600.

### Soil nitrate experiments

Soil experiments were conducted with B. subtilis 168 ΔydfHI::camR,mCherry (iND77; Supplementary Fig. 12). CSE media with 0.3% xylose and 3 µM IPTG were used without antibiotics in all experiments. IPTG and xylose levels were selected to achieve a large fold change of both wild-type and C415R NarXL TCSs (Supplementary Fig. 12). Soil (Miracle Gro, All-Purpose Garden Soil) was prepared by removing large particles with a 1.75 mm strainer, transferring 0.1 g to a 14 mL culture tube, and adding NaNO3 or fertilizer (Vigoro, All-Purpose Plant Food) which contains 1.03 M NO3−. The experiment was started with an overnight culture inoculated from a 15% glycerol freezer stock and grown in 3 mL of media for 18 to 22 h of shaking at 250 rpm. Cells were diluted to OD600 = 3 × 10−4 and grown to OD600 = 0.075–0.125 shaking at 250 rpm. Then, 250 µL of cells were added to the soil, vortexed for 5 s to mix, and centrifuged for 20 s to settle, resulting in 1 mL of damp soil. Cells were incubated in soil with no shaking for 2 h and then placed on ice. Next, 5 mL of cold phosphate-buffered saline was added and the samples were vortexed for 10 s to resuspend the cells. Particulates were allowed to settle for 2 min and then the supernatant was passed through Whatman #1 filter paper (Sigma, WHA10016508) to further remove particulates. Samples were then measured on the flow cytometer with a FL1 gain of 700 and FL3 gain of 850 thresholded at 45% FL3. All experiments were conducted at 37 °C.

### Flow cytometry and sfGFP fluorescence calculation

Flow cytometry was conducted with a BD FACScan flow cytometer. The instrument employed blue (488 nm, 30 mW) and yellow (561 nm, 50 mW) solid-state lasers (Cytex) and a 510/21 nm filter (FL1) to measure GFP and a 650 nm long pass filter (FL3) to measure mCherry. For each sample, 10,000–20,000 events were collected at 500–2000 events per second within a forward scatter (FSC), side scatter (SSC) gate. Rainbow calibration beads from Spherotech, Inc. (cat. no. RCP-30-20A) were also collected each day at identical detector gain settings. Flow cytometry data were processed with FlowCal65. Events were selected by discarding the first 250 and last 100 time-ordered events; a density gate was then applied to select the densest 10% of events (~1000–2000 events) in FSC/SSC space to specifically select bacterial cells (Supplementary Fig. 13). FL1 fluorescence was transformed into MEFL units using a standard curve created from the calibration beads measured on that day. The geometric mean of the population was used to calculate the fluorescence of each sample.
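Two of the processing steps above, trimming time-ordered events and summarizing a sample by its geometric mean, can be sketched in plain Python. This is an illustration only and does not call FlowCal: the function names `trim_events` and `sample_fluorescence` are hypothetical, and the density gate and the MEFL conversion are deliberately omitted.

```python
from statistics import geometric_mean

def trim_events(events, num_start=250, num_end=100):
    """Drop the first num_start and last num_end time-ordered events."""
    if len(events) <= num_start + num_end:
        return []  # too few events to keep anything after trimming
    return events[num_start:len(events) - num_end]

def sample_fluorescence(mefl_values):
    """Geometric mean of per-event fluorescence values (MEFL) for one sample."""
    return geometric_mean(mefl_values)

# Toy data: 1000 events labeled by acquisition order (hypothetical values).
events = list(range(1, 1001))
trimmed = trim_events(events)          # keeps events 251 through 900
example_gmean = sample_fluorescence([10.0, 1000.0])
```

The geometric mean is the natural summary for fluorescence values that span orders of magnitude: for the two toy values 10 and 1000 it is approximately 100, whereas the arithmetic mean would be 505.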

To calculate sfGFP fluorescence, measured bacterial autofluorescence (119 MEFL for E. coli and 150 MEFL for B. subtilis) was subtracted from total cellular fluorescence. Some samples were not significantly different from cellular autofluorescence, resulting in exaggerated fold change calculations ($$\frac{{\mathrm{induced}\,\mathrm{sample} - \mathrm{autofluorescence}}}{{\mathrm{uninduced}\,\mathrm{sample} - \mathrm{autofluorescence}}}$$). Therefore, when calculating the fold change, if the sfGFP expression fell below the limit of detection ($$\mathrm{LOD} = 3 \ast \sigma _{\mathrm{autofluorescence}}$$; 16.6 MEFL for E. coli and 36.4 MEFL for B. subtilis) the LOD was used in place of the measured sfGFP value to calculate a lower bound of the fold change.
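The fold change calculation with the LOD substitution can be written as a short Python sketch. The function names are hypothetical, and applying the substitution to both the induced and the uninduced value is an assumption; the text specifies only that the LOD replaces any sfGFP value that falls below it. The autofluorescence and LOD values in the example are the E. coli values given above, while the two sample readings are invented.

```python
def sfgfp(total_mefl, autofluorescence):
    """sfGFP fluorescence: total cellular fluorescence minus autofluorescence (MEFL)."""
    return total_mefl - autofluorescence

def fold_change_lower_bound(induced_total, uninduced_total, autofluorescence, lod):
    """Fold change in which any sfGFP value below the LOD is replaced by the LOD."""
    # Assumption: the substitution is applied to numerator and denominator alike.
    induced = max(sfgfp(induced_total, autofluorescence), lod)
    uninduced = max(sfgfp(uninduced_total, autofluorescence), lod)
    return induced / uninduced

# E. coli constants from the text; the sample readings are hypothetical.
AUTOFLUORESCENCE = 119.0  # MEFL
LOD = 16.6                # MEFL

fold = fold_change_lower_bound(
    induced_total=1119.0,    # sfGFP = 1000 MEFL
    uninduced_total=124.0,   # sfGFP = 5 MEFL, below the LOD
    autofluorescence=AUTOFLUORESCENCE,
    lod=LOD,
)
```

In this example the raw ratio would be 1000/5 = 200, but because the uninduced value is indistinguishable from autofluorescence, the reported lower bound is 1000/16.6, roughly 60.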

### Transfer function modeling and parameter estimation

All transfer function data were fit to an activating Hill equation $$\left( {y = \mathrm{low} + \left( {\mathrm{high} - \mathrm{low}} \right)\frac{{x^n}}{{K_{\frac{1}{2}}^n + x^n}}} \right)$$ using the LmFit Python package66. Here, y is the sfGFP fluorescence (MEFL), x is the concentration of inducer (µM), low is the sfGFP fluorescence at 0 µM inducer (MEFL), high is the maximum sfGFP fluorescence (MEFL), K1/2 is the concentration of inducer that gives rise to half-maximal sensor activation (µM), and n is the Hill coefficient. All transfer functions were experimentally measured on three separate days. Replicate data points were combined into a single data set, which was then fit to the Hill equation. To fit both low and high sfGFP values well, the fit residuals at each data point were weighted by multiplying the residual by the inverse of the mean at that data point. The 95% confidence intervals of fit parameter values were calculated using the conf_interval function in LmFit, which executes the F-test. Fit parameters for all experiments in this study are shown in Supplementary Data 5.
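The Hill equation and the residual weighting described above can be sketched in plain Python. This is an illustrative sketch, not the analysis code from this study: the function names and the example numbers are hypothetical, and the minimizer itself (LmFit in the text) is left out so that the block has no third-party dependencies.

```python
def hill(x, low, high, k_half, n):
    """Activating Hill equation: x and k_half in µM, low and high in MEFL."""
    return low + (high - low) * x**n / (k_half**n + x**n)

def weighted_residuals(params, x_values, y_values, y_means):
    """Residuals (model minus data), each multiplied by 1 / mean at that data point.

    y_means holds, for each data point, the mean of the replicate
    measurements taken at the same inducer concentration.
    """
    low, high, k_half, n = params
    return [
        (hill(x, low, high, k_half, n) - y) / mean
        for x, y, mean in zip(x_values, y_values, y_means)
    ]

# Hypothetical parameters: low = 10 MEFL, high = 1000 MEFL, K1/2 = 50 µM, n = 2.
params = (10.0, 1000.0, 50.0, 2.0)
at_zero = hill(0.0, *params)    # equals low
at_k_half = hill(50.0, *params) # midway between low and high
residuals = weighted_residuals(params, [0.0, 50.0], [12.0, 606.0], [12.0, 606.0])
```

Dividing each residual by the local mean turns absolute errors into relative ones, so a 2 MEFL miss near the low plateau counts about as much as a 100 MEFL miss near the high plateau. A function of this form would be handed to a least-squares minimizer to obtain the fitted parameters.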

### Code availability

The code used to generate a model of a TCS is included as a supplementary file to this article.

### Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. DNA sequences are available from GenBank and accession numbers can be found in Supplementary Data 3, 4.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. Brophy, J. A. N. & Voigt, C. A. Principles of genetic circuit design. Nat. Methods 11, 508–520 (2014).
2. Wang, Y.-H., Wei, K. Y. & Smolke, C. D. Synthetic biology: advancing the design of diverse genetic systems. Annu. Rev. Chem. Biomol. Eng. 4, 69–102 (2013).
3. Ang, J., Harris, E., Hussey, B. J., Kil, R. & McMillen, D. R. Tuning response curves for synthetic biology. ACS Synth. Biol. 2, 547–567 (2013).
4. Courbet, A., Endy, D., Renard, E., Molina, F. & Bonnet, J. Detection of pathological biomarkers in human clinical samples via amplifying genetic switches and logic gates. Sci. Transl. Med. 7, 289ra83 (2015).
5. Rubens, J. R., Selvaggio, G. & Lu, T. K. Synthetic mixed-signal computation in living cells. Nat. Commun. 7, 11658 (2016).
6. Bayer, T. S. & Smolke, C. D. Programmable ligand-controlled riboregulators of eukaryotic gene expression. Nat. Biotechnol. 23, 337–343 (2005).
7. O’Shaughnessy, E. C., Palani, S., Collins, J. J. & Sarkar, C. A. Tunable signal processing in synthetic MAP kinase cascades. Cell 144, 119–131 (2011).
8. Taylor, N. D. et al. Engineering an allosteric transcription factor to respond to new ligands. Nat. Methods 13, 177–183 (2016).
9. Hawkins, A. C., Arnold, F. H., Stuermer, R., Hauer, B. & Leadbetter, J. R. Directed evolution of Vibrio fischeri LuxR for improved response to butanoyl-homoserine lactone. Appl. Environ. Microbiol. 73, 5775–5781 (2007).
10. Mascher, T., Helmann, J. D. & Unden, G. Stimulus perception in bacterial signal-transducing histidine kinases. Microbiol. Mol. Biol. Rev. 70, 910–938 (2006).
11. Goh, E.-B. et al. Hierarchical control of anaerobic gene expression in Escherichia coli K-12: the nitrate-responsive NarX-NarL regulatory system represses synthesis of the fumarate-responsive DcuS-DcuR regulatory system. J. Bacteriol. 187, 4890–4899 (2005).
12. Gilles-Gonzalez, M.-A. & Gonzalez, G. Heme-based sensors: defining characteristics, recent developments, and regulatory hypotheses. J. Inorg. Biochem. 99, 1–22 (2005).
13. Hsieh, Y.-J. & Wanner, B. L. Global regulation by the seven-component Pi signaling system. Curr. Opin. Microbiol. 13, 198–203 (2010).
14. Torres, V. J. et al. A Staphylococcus aureus regulatory system that responds to host heme and modulates virulence. Cell Host Microbe 1, 109–119 (2007).
15. Papenfort, K. & Bassler, B. L. Quorum sensing signal–response systems in Gram-negative bacteria. Nat. Rev. Microbiol. 14, 576–588 (2016).
16. Hutchings, M. I., Hong, H.-J. & Buttner, M. J. The vancomycin resistance VanRS two-component signal transduction system of Streptomyces coelicolor. Mol. Microbiol. 59, 923–935 (2006).
17. Verhamme, D. T., Arents, J. C., Postma, P. W., Crielaard, W. & Hellingwerf, K. J. Glucose-6-phosphate-dependent phosphoryl flow through the Uhp two-component regulatory system. Microbiology 147, 3345–3352 (2001).
18. Sonnenburg, E. D. et al. A hybrid two-component system protein of a prominent human gut symbiont couples glycan sensing in vivo to carbohydrate metabolism. Proc. Natl. Acad. Sci. USA 103, 8834–8839 (2006).
19. Martens, E. C., Chiang, H. C. & Gordon, J. I. Mucosal glycan foraging enhances fitness and transmission of a saccharolytic human gut bacterial symbiont. Cell Host Microbe 4, 447–457 (2008).
20. Karavolos, M. H., Winzer, K., Williams, P. & Khan, C. M. A. Pathogen espionage: multiple bacterial adrenergic sensors eavesdrop on host communication systems. Mol. Microbiol. 87, 455–465 (2013).
21. Donoso, R. et al. Biochemical and genetic basis of indole-3-acetic acid (auxin phytohormone) degradation by the plant growth promoting rhizobacterium Paraburkholderia phytofirmans PsJN. Appl. Environ. Microbiol. 83, e01991-16 (2016).
22. Singh, K. K. The Saccharomyces cerevisiae sln1p-ssk1p two-component system mediates response to oxidative stress and in an oxidant-specific fashion. Free Radic. Biol. Med. 29, 1043–1050 (2000).
23. O’Connor, J. R., Kuwada, N. J., Huangyutitham, V., Wiggins, P. A. & Harwood, C. S. Surface sensing and lateral subcellular localization of WspA, the receptor in a chemosensory-like system leading to c-di-GMP production. Mol. Microbiol. 86, 720–729 (2012).
24. Duanmu, D. et al. Marine algae and land plants share conserved phytochrome signaling systems. Proc. Natl. Acad. Sci. USA 111, 15827–15832 (2014).
25. Levskaya, A. et al. Synthetic biology: engineering Escherichia coli to see light. Nature 438, 441–442 (2005).
26. Ohlendorf, R., Vidavski, R. R., Eldar, A., Moffat, K. & Möglich, A. From dusk till dawn: one-plasmid systems for light-regulated gene expression. J. Mol. Biol. 416, 534–542 (2012).
27. Ramakrishnan, P. & Tabor, J. J. Repurposing Synechocystis PCC6803 UirS–UirR as a UV-violet/green photoreversible transcriptional regulatory tool in E. coli. ACS Synth. Biol. 5, 733–740 (2016).
28. Ong, N. T. X. & Tabor, J. J. A miniaturized E. coli green light sensor with high dynamic range. Chembiochem Eur. J. Chem. Biol. (2018). https://doi.org/10.1002/cbic.201800007
29. Landry, B. P. & Tabor, J. J. Engineering diagnostic and therapeutic gut bacteria. Microbiol. Spectr. 5, (2017).
30. Daeffler, K. N.-M. et al. Engineering bacterial thiosulfate and tetrathionate sensors for detecting gut inflammation. Mol. Syst. Biol. 13, 923 (2017).
31. Riglar, D. T. et al. Engineered bacteria can function in the mammalian gut long-term as live diagnostics of inflammation. Nat. Biotechnol. 35, 653–658 (2017).
32. Bhate, M. P., Molnar, K. S., Goulian, M. & DeGrado, W. F. Signal transduction in histidine kinases: insights from new structures. Structure 23, 981–994 (2015).
33. Huynh, T. N. & Stewart, V. Negative control in two-component signal transduction by transmitter phosphatase activity. Mol. Microbiol. 82, 275–286 (2011).
34. Atkinson, M. R. & Ninfa, A. J. Mutational analysis of the bacterial signal-transducing protein kinase/phosphatase nitrogen regulator II (NRII or NtrB). J. Bacteriol. 175, 7016–7023 (1993).
35. Wanner, B. L. in Escherichia coli and Salmonella: Cellular and Molecular Biology (eds Neidhardt, F. C. et al.) 2, 1357–1381 (ASM Press, Washington, 1996).
36. Zhu, Y. & Inouye, M. The role of the G2 box, a conserved motif in the histidine kinase superfamily, in modulating the function of EnvZ. Mol. Microbiol. 45, 653–663 (2002).
37. Qin, L., Cai, S., Zhu, Y. & Inouye, M. Cysteine-scanning analysis of the dimerization domain of EnvZ, an osmosensing histidine kinase. J. Bacteriol. 185, 3429–3435 (2003).
38. Capra, E. J. et al. Systematic dissection and trajectory-scanning mutagenesis of the molecular interface that ensures specificity of two-component signaling pathways. PLoS Genet. 6, e1001220 (2010).
39. Huynh, T. N., Noriega, C. E. & Stewart, V. Missense substitutions reflecting regulatory control of transmitter phosphatase activity in two-component signalling. Mol. Microbiol. 88, 459–472 (2013).
40. Russo, F. D. & Silhavy, T. J. EnvZ controls the concentration of phosphorylated OmpR to mediate osmoregulation of the porin genes. J. Mol. Biol. 222, 567–580 (1991).
41. Batchelor, E. & Goulian, M. Robustness and the cycle of phosphorylation and dephosphorylation in a two-component regulatory system. Proc. Natl. Acad. Sci. USA 100, 691–696 (2003).
42. Gao, R. & Stock, A. M. Probing kinase and phosphatase activities of two-component systems in vivo with concentration-dependent phosphorylation profiling. Proc. Natl. Acad. Sci. USA 110, 672–677 (2013).
43. Igoshin, O. A., Alves, R. & Savageau, M. A. Hysteretic and graded responses in bacterial two-component signal transduction. Mol. Microbiol. 68, 1196–1215 (2008).
44. Jin, T. & Inouye, M. Ligand binding to the receptor domain regulates the ratio of kinase to phosphatase activities of the signaling domain of the hybrid Escherichia coli transmembrane receptor, Taz1. J. Mol. Biol. 232, 484–492 (1993).
45. Sutton, M. A. et al. The European Nitrogen Assessment: Sources, Effects and Policy Perspectives (Cambridge University Press, Cambridge, 2011).
46. Smanski, M. J. et al. Functional optimization of gene clusters by combinatorial design and assembly. Nat. Biotechnol. 32, 1241–1249 (2014).
47. Tabor, J. J., Levskaya, A. & Voigt, C. A. Multichromatic control of gene expression in Escherichia coli. J. Mol. Biol. 405, 315–324 (2011).
48. Ong, N. T. X., Olson, E. J. & Tabor, J. J. Engineering an E. coli near-infrared light sensor. ACS Synth. Biol. 7, 240–248 (2018).
49. Olson, E. J., Hartsough, L. A., Landry, B. P., Shroff, R. & Tabor, J. J. Characterizing bacterial gene circuit dynamics with optically programmed gene expression signals. Nat. Methods 11, 449–455 (2014).
50. Baumgartner, J. W. et al. Transmembrane signalling by a hybrid protein: communication from the domain of chemoreceptor Trg that recognizes sugar-binding proteins to the kinase/phosphatase domain of osmosensor EnvZ. J. Bacteriol. 176, 1157–1163 (1994).
51. Utsumi, R. et al. Activation of bacterial porin gene expression by a chimeric signal transducer in response to aspartate. Science 245, 1246 (1989).
52. Schmidl, S. R., Sheth, R. U., Wu, A. & Tabor, J. J. Refactoring and optimization of light-switchable Escherichia coli two-component systems. ACS Synth. Biol. 3, 820–831 (2014).
53. Winter, S. E. et al. Gut inflammation provides a respiratory electron acceptor for Salmonella. Nature 467, 426–429 (2010).
54. Jeong, D.-W. et al. The auxiliary protein complex SaePQ activates the phosphatase activity of sensor kinase SaeS in the SaeRS two-component system of Staphylococcus aureus. Mol. Microbiol. 86, 331–348 (2012).
55. Kim, E. K. & Choi, E.-J. Pathological roles of MAPK signaling pathways in human diseases. Biochim. Biophys. Acta 1802, 396–405 (2010).
56. Caunt, C. J. & Keyse, S. M. Dual-specificity MAP kinase phosphatases (MKPs). FEBS J. 280, 489–504 (2013).
57. Wei, P. et al. Bacterial virulence proteins as tools to rewire kinase pathways in yeast and immune cells. Nature 488, 384–388 (2012).
58. Bashor, C. J., Helman, N. C., Yan, S. & Lim, W. A. Using engineered scaffold interactions to reshape MAP kinase pathway signaling dynamics. Science 319, 1539–1543 (2008).
59. Engler, C., Kandzia, R. & Marillonnet, S. A one pot, one step, precision cloning method with high throughput capability. PLoS One 3, e3647 (2008).
60. Espah Borujeni, A., Channarasappa, A. S. & Salis, H. M. Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream standby sites. Nucleic Acids Res. 42, 2646–2659 (2014).
61. Harwood, C. R. & Cutting, S. M. Molecular Biological Methods for Bacillus (Wiley, Chichester, 1990).
62. Finn, R. D., Clements, J. & Eddy, S. R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 39, W29–W37 (2011).
63. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
64. Wolanin, P. M., Thomason, P. A. & Stock, J. B. Histidine protein kinases: key signal transducers outside the animal kingdom. Genome Biol. 3, REVIEWS3013 (2002).
65. Castillo-Hair, S. M. et al. FlowCal: a user-friendly, open source software tool for automatically converting flow cytometry data from arbitrary to calibrated units. ACS Synth. Biol. 5, 774–780 (2016).
66. Newville, M. et al. Lmfit: non-linear least-square minimization and curve-fitting for python. Astrophys. Source Code Libr. ascl:1606.014 (2016).

## Acknowledgements

This work was supported by the ONR Young Investigator Award N00014-14-1-0487 and NSF CAREER 1553317. B.P.L. was supported by the DoD, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a. We thank the Joel Moake lab for generous sharing of their flow cytometer.

## Author information

### Affiliations

#### Department of Bioengineering, Rice University, 6100 Main St., Houston, 77005, TX, USA

Brian P. Landry, Rohan Palanki, Nikola Dyulgyarov, Lucas A. Hartsough & Jeffrey J. Tabor

#### Department of Biosciences, Rice University, 6100 Main St., Houston, 77005, TX, USA

Jeffrey J. Tabor

### Contributions

B.P.L. conceived the project. J.J.T. supervised the project. B.P.L., N.D., and R.P. performed preliminary work and built DNA constructs. B.P.L. collected nitrate and aspartate data. R.P. collected tetrathionate and thiosulfate data. B.P.L. performed all data analyses. L.A.H. performed the bioinformatic analysis. B.P.L., L.A.H., and J.J.T. wrote the manuscript.

### Competing interests

The authors declare no competing interests.

### Corresponding author

Correspondence to Jeffrey J. Tabor.

### DOI

https://doi.org/10.1038/s41467-018-03929-y