Harnessing Nature’s Diversity: Discovering organophosphate bioscavenger characteristics among low molecular weight proteins

Organophosphate poisoning can occur from exposure to agricultural pesticides or chemical weapons. This exposure inhibits acetylcholinesterase resulting in increased acetylcholine levels within the synaptic cleft causing loss of muscle control, seizures, and death. Mitigating the effects of organophosphates in our bodies is critical and yet an unsolved challenge. Here, we present a computational strategy that integrates structure mining and modeling approaches, using which we identify novel candidates capable of interacting with a serine hydrolase probe (with equilibrium binding constants ranging from 4 to 120 μM). One candidate Smu. 1393c catalyzes the hydrolysis of the organophosphate omethoate (kcat/Km of (2.0 ± 1.3) × 10−1 M−1s−1) and paraoxon (kcat/Km of (4.6 ± 0.8) × 103 M−1s−1), V- and G-agent analogs respectively. In addition, Smu. 1393c protects acetylcholinesterase activity from being inhibited by two organophosphate simulants. We demonstrate that the utilized approach is an efficient and highly-extendable framework for the development of prophylactic therapeutics against organophosphate poisoning and other important targets. Our findings further suggest currently unknown molecular evolutionary rules governing natural diversity of the protein universe, which make it capable of recognizing previously unseen ligands.

First, we apply a rigid substructure search algorithm Erebus 15 to detect candidates from the protein databank mimicking the active site of AChE. Second, to predict the binding of organophosphates to the identified scaffold, we employ a molecular docking algorithm MedusaDock 16 in which we account for both ligand and receptor flexibility. We choose VX as a molecular probe for the in silico studies due to its exceptional lethality.
Our top three results are phosphoribosyl isomerase from Mycobacterium tuberculosis (PDB ID 2Y85), antigen 85-A from Mycobacterium tuberculosis (PDB ID 1SFR), and Smu. 1393c from Streptococcus mutans (PDB ID 4L9A). We validate the presence of the predicted binding interactions by testing with serine hydrolase probes. The results from a fluorescence polarization study indicate that all three candidates present OP binding interactions (with equilibrium binding constants ranging from 4 to 120 μ M). Of these candidates, we show that Smu. 1393c binds the probes through covalent modification of a serine mimicking AChE inhibition. We further demonstrate that Smu. 1393c can catalyze the hydrolysis of omethoate (k cat /K m of (2.0 ± 1.3) × 10 −1 M −1 s −1 ) and paraoxon (k cat /K m of (4.6 ± 0.8) × 10 3 M −1 s −1 ), V-and G-agent analogs respectively. Additionally, Smu. 1393c protects AChE activity when exposed to two OP simulants and represents an OP bioscavenger candidate successfully identified using our computational workflow to mine structural databases for a novel target for a specific small molecule.

Results
Identification of putative candidate scaffolds. We first identify proteins with similar binding sites to AChE and BChE, using rigid substructure search algorithm, Erebus 15 (Fig. 1a). Erebus algorithm rapidly extracts selected geometric features, such as inter-atomic pairwise distances, of a submitted query scaffold and then searches the PDB structures for matching features. The returned matches represent structures with a root mean square deviation (RMSD) where the query atoms and the matching result atoms fall within the desired cutoff parameter; in our case we choose a cutoff of less than five angstroms. Our query scaffold is composed of atoms selected from the active site of AChE (Fig. 1b). The number of returned results from this query is substantial (greater than 1,000) and require additional refinement, in which we apply the following three criteria: (i) molecular weight below 40 kDa to ensure a low molecular weight; (ii) monomeric form of the retrieved solutions as detailed in the PDB file; and (iii) solvent accessibility of amino acids in the retrieved hits, which is a direct assessment of whether the scaffold can be accessed by the OPs. We derive these criteria to overcome the limitations of BChE as a stoichiometric bioscavenger, namely the tetrameric structure and large molecular weight. After this refinement there are 21 protein candidates remaining (Supplementary Table 1).
Identification of top-ranked candidates. The classic binding mode of OPs involves covalent binding of the phosphate group to an activated serine. Prior to forming covalent interactions, non-covalent interactions assist in orienting the OP for a successful serine nucleophilic attack. We screen the 21 protein candidates using  16,17 to model non-covalent protein-ligand interactions with VX as our ligand of interest ( Fig. 1a and 1c). MedusaDock comprises a docking step intertwined with a scoring step, where the docking step creates a ligand rotamer library and searches the best fit within the flexible side chain rotamer library, using the MedusaScore force field 16,17 . We select the final candidates through ranking the 21 candidates by best-predicted binding energy (Supplementary Table 1). We find this method of ranking reveals a natural division between the candidates specifically: (i) an Erebus RMSD less than 4 Å; (ii) an average ligand RMSD less than 8 Å; and (iii) a MedusaDock predicted binding energy less than − 28 kcal/mol. The top resulting candidates are antigen 85-A (PDB ID 1SFR) and phosphoribosyl isomerase (PDB ID 2Y85) from Myobacterium tuberculosis and Smu. 1393c from Streptococcus mutans (PDB ID 4L9A).
To evaluate predicted binding modes of identified candidates, we compare predicted binding energy distributions of organophosphate complexes with identified candidates to that of AChE and BChE. The greater the overlap between these distributions means the greater the probability of mimicking AChE OP interactions ( Fig. 2 and Supplementary Fig. 1). From docking simulations of all three candidates we find that the distance between the VX phosphate and predicted serine are less than 4 Å. This distance places the VX phosphate in a position for a serine nucleophilic attack. Additionally, the lowest predicted binding energy for each of our candidates' is within a 6 kcal/mol difference to the value of − 35.3 kcal/mol calculated for AChE (Supplementary Table 1). This difference indicates that our three candidates show predicted characteristics similar to AChE. OP binding to top candidates. We validate predicted candidate similarity to AChE by using serine hydrolase probes, specifically ActiveX Tamra fluorophosphonate serine hydrolase probe (TFP-probe) and ActiveX Desthiobiotin fluorophosphonate serine hydrolase probe (DFP-probe) ( Supplementary Fig. 2). These probes covalently phosphorylate, thereby labeling an active serine through a similar mechanism as organophosphates ( Supplementary Fig. 3). The labeled serine is detectable via fluorescent gel (TFP-probe) or western blot (DFP-probe). A clearly defined band was present for Smu. 1393c, indicating positive phosphorylation ( Fig. 3a and Supplementary Fig. 4). We did not detect a band for phosphoribosyl isomerase or antigen 85-A, suggesting no covalent binding though non-covalent interactions may still be present.
To quantify OP binding and check for non-covalent interactions, we use fluorescent polarization experiments utilizing the TFP-probe as substrate ( Fig. 3b and Supplementarty Fig. 5). We find that Smu. 1393c has low micromolar binding (K d = 4.17 ± 0.08 μ M). Though candidates' antigen 85-A and phosphoribosyl isomerase show no covalent binding, there are non-covalent binding interactions with K d 's of 20.9 ± 1.6 μ M and 120 ± 10 μ M, respectively. All three candidates display binding interactions that validate the predictions from our computational workflow.
OP covalent binding to Smu. 1393c active site. To confirm that the location of the phosphorylation is as predicted from our computational workflow, we substitute serine (Ser99) to alanine in the Smu. 1393c's predicted active site and perform labeling with both TFP-probe and DFP-probe and fluorescent polarization experiments with TFP-probe. A visible band indicating phosphorylation was not present in the Ser99 mutation, thereby covalent binding was indeed eliminated from both the fluorescent gel and the western blot analysis (Fig. 3a). In addition, the removal of Ser99 drastically increases the K d to greater than 300 μ M (Fig. 3b), suggesting that Ser99 plays an essential role in OP binding.
Smu. 1393c catalytic activity. The literature indicates that Smu. 1393c has catalytic activity against p-nitrophenyl acetate (PNP-ac) 18 . We determine the activity rate of Smu. 1393c aided hydrolysis of PNP-ac to be k cat /K m of 6.3 ± 1.3 M −1 s −1 (Fig. 3c, Table 1). Exposure to V-agent analog demeton-S-methyl (DSM) inhibits the PNP-ac hydrolysis activity with a K i of 10.64 ± 1.9 μ M. Using a double reciprocal plot we find that DSM inhibits Smu. 1393c in a non-competitive manner ( Supplementary Fig. 6), suggesting a potential second site that when bound has direct ramifications on activity.
Smu. 1393c protection of AChE. The inhibition of Smu. 1393c by organophosphate DSM indicates a protein/ligand interaction. We explore the ability of Smu. 1393c to protect AChE from inhibition through a quantitative set of activity assays, modified from Cherny et al., designed to gauge the amount of AChE activity maintained after exposure to two OP simulates 19 . Specifically, we use demeton-S-methyl (DSM) as a V-agent analog and diisopropylfluorophosphate (DFP) as a G-agent analog (Fig. 3f, Supplementary Fig. 2, and Supplementary Table 2).  we first use DSM as a V-agent analog. We find that Smu. 1393c shows unanticipated catalytic activity against DSM but are unable to complete the kinetic assays as the concentrations required to reach kinetic saturation exceed our supply. Additionally, to fully test the hydrolysis of the G-agent analog DFP requires an experimental setup unsupported in our laboratory. These reasons necessitate our change from DFP and DSM to paraoxon and omethoate as G-and V-agent analogs respectively ( Supplementary Fig. 2). After a series of kinetic assays we find that Smu. 1393c catalyzes the hydrolysis of omethoate with a k cat /K m of (2.0 ± 1.3) × 10 −1 M −1 s −1 (Fig. 3d, Table 1). When we test paraoxon hydrolysis the activity level is below the sensitivity of our instruments. Intriguingly, when we express Smu. 1393c with zinc and then test activity we see hydrolysis of paraoxon with a k cat /K m of (4.6 ± 0.8) × 10 3 M −1 s −1 (Fig. 3e, Table 1) and a loss of detectable omethoate activity. These findings suggest two different catalytic mechanisms. Interestingly, when we perform the same tests with Smu. 1393c S99A mutant we find no activity for either omethoate or paraoxon, suggesting that Ser99 is crucial in both mechanisms.

Discussion
OP poisoning is a worldwide problem with an estimated 3,000,000 people exposed yearly with up to 200,000 fatalities 20,21 . The availability of cheap and effective therapeutics becomes a necessity, as developing countries use OP pesticides in agriculture 22 . Additional concern lies in exposure to the more toxic variety of OPs, namely OP chemical nerve agents. These chemical weapons are cheap, deadly and present a viable threat in warfare or terrorist attack scenarios. The current preventative measures employed by military personnel include protective equipment and rigorous decontamination procedures 23 . These measures are only effective if a priori intelligence of an exposure situation allows proper implementation. What is needed is a molecular therapeutic designed to prevent AChE inhibition such as a stoichiometric bioscavenger, to competitively bind OPs, or a catalytic bioscavenger, to degrade OP chemicals. BChE, the current stoichiometric bioscavenger candidate, has a high molecular weight and high cost of production 10 , both of which limit its effectiveness.
Here we present three low molecular weight, producible bioscavenger candidates, antigen 85-A and phosphoribosyl isomerase from Mycobacterium tuberculosis and Smu. 1393c from Streptococcus mutans. Analysis of the structures reveals both antigen 85-A and Smu. 1393c have a core structure of an α /β hydrolase, and fall within the same fold class as AChE and BChE, but have less than five percent identity with either AChE or BChE 18 . Phosphoribosyl isomerase is unique among the three candidates with a core structure of a (β /α ) 8 -barrel scaffold 24 . These candidates are predicted to form non-covalent interactions with OPs and are experimentally verified using serine hydrolase probes with equilibrium binding constants between 4 and 120 μ M.
We establish that Smu. 1393c has the potential to protect AChE from inhibition. To be considered an OP bioscavenger there remains further investigation to determine the immunological effects from introducing such a foreign protein into the immune system. Additionally we show that Smu. 1393c has activity against OP omethoate (k cat /K m of (2.0 ± 1.3) × 10 −1 M −1 s −1 ) in the absence of zinc but is active against paraoxon (k cat /K m of (4.6 ± 0.8) × 10 3 M −1 s −1 ) when zinc is present. Examining the crystal structure of Smu. 1393c, we find a preponderance of histidine residues in the same cleft where the active serine (Ser99) is located that we suspect may aid in metal coordination (Supplementary Fig. 7). This hypothesis will be the subject of further investigations to understand the exact mechanism behind Smu. 1393c's hydrolysis activity. The level of activity that Smu. 1393c possesses falls far below that of metal-binding proteins, such as human paraoxonase-1 or organophosphate hydrolase. For Smu. 1393c to be a competitive organophosphate will require redesign of the hydrolysis mechanism.
By identifying these bioscavenger candidates we demonstrate our computational workflow ( Supplementary Fig. 8) which combines two successful algorithms; (i) Erebus 15 , a structural database search algorithm to identify molecules with substructures matching the query template; and (ii) MedusaDock 16 , a small molecule docking algorithm to predict the probability of interaction (see Supplementary text). We believe that with further development and automation, this workflow can be extended into applications benefitting both the pharmaceutical industry and the biotech community. Specifically, by aiding in drug adverse side-effect studies, molecular therapeutics, epitope vaccine design, drug repurposing, and protein sensor design.

Materials and Methods
Search for Novel Scaffolds. To effectively search the PDB database (www.rcsb.com), currently containing over 100,000 structures (December 2014), we select all proteins with less than 350 residues and less than four subunits. We search this subset using the rigid substructure search algorithm Erebus 15 by submitting a target scaffold defined by a group of atoms with known coordinates. The active site target scaffold we use is constructed using the hydroxyl oxygen from Ser203, the carboxylic oxygen from Glu334, the tele nitrogen from His447, all from the catalytic triad as oriented in the crystal structure of AChE (PDB ID 1F8U). We also include the indolic nitrogen from Trp86 to describe the binding pocket diameter. We use the default settings and set the RMSD cutoff at five angstroms. The final input scaffold consists of atoms with associated Cartesian coordinates formatted in the PDB style. Our starting query is below: Additionally, the computational cost of Erebus is minimal, taking a few minutes to hours, depending on the number of atoms in the search query, to search the entire structural database. For our query, the structural database was searched in ~1.5 hours.
Docking Simulations. We perform redocking of VX with BChE (PDB ID 2XQF) (Fig. 1c) as a control to benchmark the docking method and to obtain the reference distribution of docking scores for the native complex as well as with AChE (PDB ID 1F8U). The input files for MedusaDock 25 consist of (1) the target protein in PDB format, (2) the ligand of interest in a Mol2 format, and (3) the center for the search sphere marked by a small molecule or atom in the Mol2 format. To screen the 21 protein candidates we dock each candidate 100 times, each docking attempt is independent and includes ligand conformational sampling and simulated annealing. We select the top three candidates and dock them 1,000 times to extract a predicted binding energy distribution to compare to the reference distributions, where the frequency histogram plotted is the number of docked poses that fall within that bin. The computational cost of one MedusaDock calculation is on average between 4-6 minutes. To achieve results in a timely manner we distributed the 1,000 docking experiments over the killdevil computer cluster at the University of North Carolina Chapel Hill.
Protein Expression and Purification. We optimize the candidate protein gene sequences for expression in E. coli with an added poly-his tag and TEV protease site at the N-termini and clone the genes into the pET14B vector for expression in the E. coli strain BL21 (DE3) pLysS. We grow the cells in both standard LB (Smu. 1393c and phosphoribosyl isomerase) and super broth (antigen 85-A). For the case of the catalytically active Smu. 1393c, we grow the cells in standard LB with 0.5 mM ZnCl 2 . In all cases, we induce expression with 1 mM Isopropyl β -D-1-thiogalactopyranoside (IPTG) at 25 °C overnight for phosphoribosyl isomerase and Smu. 1393c, and 72 hours at 18 °C for antigen 85-A. The collected cells are stored at − 80 °C until use.
To purify we defrost the cell pellets on ice and resuspend in buffer A (50 mM phosphate pH 7.4, 500 mM NaCl, 40 mM imidazole, 5 mM β ME, and 0.1 μ M Pepstatin A). We use sonication to lyse the cells and centrifuge them in a Beckman J-20 rotor at 15,000 g for 2 hours. We collect and filter the supernatant through a 0.22 μ m filter before loading on a 5 mL HisTrap FF column from GE. We elute the protein with a gradient to buffer B (50 mM phosphate pH 7.4, 150 mM NaCl, 1 M imidazole). We then dialyze the proteins against 20 mM Tris pH 7.4 and elute from a HiTrap Q FF column from GE. We perform all column runs using a BioRad BioLogic low-pressure chromatography machine.
Fluorophosphonate labeling and fluorescence polarization. We label each protein with both ActiveX Tamra fluorophosphonate serine hydrolase probe (TFP-probe) for fluorescent gel scanning, and ActiveX Desthiobiotin fluorophosphonate serine hydrolase probe (DFP-probe) to show binding via western blot. Both probes are purchased from Thermo Scientific. We follow the provided protocol with minor modifications. We dilute the probe into a 2 μ M stock solution and calculate the necessary ratio of 1:30 to be 167 nM of probe to 5 μ M protein in a 20 μ L reaction. We incubate each protein sample with the probes for 15-30 minutes at room temperature. The reaction is quenched by adding 5 μ L of SDS-PAGE loading dye (200 mM Tris pH 6.8, 10 mM β ME 10% SNS, 30% glycerol and bromophenol blue) and boiled for 5 minutes. We analyze each sample by fluorescent gel scanning or western blot. As a negative control we include for each protein a sample with no probe.
We perform fluorescence polarization using the TFP-probe as substrate. We dilute the TFP-probe to a 200 nM concentration and titrate the given protein measuring the change in polarization until saturation. We fit the obtained data first to a one site-specific binding model, = * + , and then to a one site-specific binding model with Hill slope, = * There is no substantial difference between the fits for phosphoribosyl isomerase, antigen 85-A, and Smu. 1393c S99A. We find that the one site-specific binding model with Hill coefficient produces a better fit for Smu. 1393c (Hill coefficient = 2.0 ± 0.06).

AChE protection assays.
We purchase lyophilized Electrophorus electricus AChE, demeton-S-methyl, and diisopropylfluorophosphate from Sigma Aldrich. We run a modified protocol from Cherny et al. 19 , where standard Ellman assays recover kinetic parameters 26 and determine the ability of Smu. 1393c to protect AChE from the effects of OP simulants. We incubate equal amounts of Smu. 1393c and OP simulant (final concentration 500 nM) at 20 °C for 30 minutes. We add AChE to each sample and incubate at 20 °C for 60 minutes. As a detectable marker we add 500 μ M of the Ellman reagent 5,5-dithio-bis-(2-nitrobenzoic acid) (DTNB) to the samples. We vary the amount of acetylthiolcholine (0-20 mM) and bring each sample to a volume of 50 μ L using buffer (200 mM NaCl, 20 mM TRIS at pH 7.4). Each assay is performed in triplicate. We plot the kinetic parameters, obtained using SpectraMax M3 plate reader, and fit them to the Michaelis-Menton kinetic equation.
Smu. 1393c catalytic assays. We verify the catalytic activity of Smu. 1393c by observing the absorbance of the p-nitrophenyl group detectable at 405 nm marking the hydrolysis of p-nitrophenyl acetate (PNP-ac). We also assay the inhibition potential of demeton-S-methyl. We plot and fit the kinetic parameters to the Michaelis-Menton kinetic equation and calculate the inhibition constant by . We determine the activity of Smu. 1393c against both paraoxon and omethoate using absorbance. For paraoxon we detect the p-nitrophenyl group at 405 nm. To monitor the omethoate hydrolysis we use an Ellman assay 26 . We perform each assay in triplicate investigating paraoxon concentrations between 0-300 μ M and omethoate concentrations between 0-18 mM. Each assay is brought to 1000 μ L with 50 mM bis-tris-propane pH 7.4. We initiate the catalytic reaction by adding Smu. 1393c for a final concentration of 72 nM for paraoxon, and 5 μ M for omethoate. Additionally, we perform negative controls without Smu. 1393c. We analyze the data by subtracting the negative controls from the calculated initial velocities. We plot and fit the kinetic parameters to the Michaelis-Menton kinetic equation.