Chikungunya nsP2 protease is not a papain-like cysteine protease and the catalytic dyad cysteine is interchangeable with a proximal serine

Chikungunya virus is the pathogenic alphavirus that causes chikungunya fever in humans. In the last decade millions of cases have been reported around the world from Africa to Asia to the Americas. The alphavirus nsP2 protein is multifunctional and is considered to be pivotal to viral replication, as the nsP2 protease activity is critical for proteolytic processing of the viral polyprotein during replication. Classically the alphavirus nsP2 protease is thought to be papain-like with the enzyme reaction proceeding through a cysteine/histidine catalytic dyad. We performed structure-function studies on the chikungunya nsP2 protease and show that the enzyme is not papain-like. Characterization of the catalytic dyad cysteine residue enabled us to identify a nearby serine that is catalytically interchangeable with the dyad cysteine residue. The enzyme retains activity upon alanine replacement of either residue but a replacement of both cysteine and serine residues results in no detectable activity. Protein dynamics appears to allow the use of either the cysteine or the serine residue in catalysis. This switchable dyad residue has not been previously reported for alphavirus nsP2 proteases and would have a major impact on the nsP2 protease as an anti-viral target.

and is proteolytically cleaved into 4 individual mature proteins by the virus-encoded nsP2 protease in a specific manner while the structural proteins are subsequently translated from a subgenomic mRNA 7 .
The alphavirus nsP2 protein is a multifunctional enzyme with the N-terminus of the protein comprising RNA helicase, nucleoside triphosphatase (NTPase) and RNA-dependent 5′-triphosphatase activities [8][9][10] , whereas the C-terminus of nsP2 contains the protease domain 11,12 . nsP2 is considered to be an essential protein as it is responsible for viral replication and propagation with its proteolytic and other activities 13 . In addition, nsP2 is translocated to the nucleus where it shuts down antiviral gene expression 14,15 . The proteolytic activity of nsP2 has been characterized in other alphaviruses, such as Sindbis virus (SINV), Semliki forest virus (SFV) and Venezuelan Equine Encephalitis virus (VEEV), as a papain-like cysteine protease with a cysteine-histidine catalytic dyad in the active site 11,12,[16][17][18][19] . This led to the suggestion that the cysteine-histidine catalytic dyad possesses a nucleophilic cysteine residue that catalyzes the peptide bond cleavage with a histidine residue serving as a general base in the reaction. Cysteine proteases, or thiol proteases, comprise different families, which cluster into clans based on sequence identities, similarities and 3D-structure 20 . The alphavirus nsP2 protease is classified into the Togavirus cysteine endopeptidase family (C9) which belongs to clan CN 21 .
To date there is no vaccine against CHIKV and, as the multifunctional nsP2 protein is critical for viral replication, it makes an attractive potential anti-CHIKV drug target. To this end we have begun characterization of the protease active site with the initial focus on the catalytic dyad. Although the dyad residues have been identified previously in other alphavirus nsP2 proteins (in SINV, SFV and VEEV) 11,12,[16][17][18] , the CHIKV nsP2 protease active site has not been experimentally characterized. In performing this study a structural comparison showed that CHIKV nsP2 protease is not papain-like, and we have found what appears to be a unique feature of CHIKV nsP2 protease, in which the cysteine dyad residue can be catalytically replaced by a vicinal serine.

Results and Discussion
Previously the alphavirus nsP2 protease enzyme has been defined as a papain-like cysteine protease 16,22-25 . As it appeared that the nsP2 was a cysteine-histidine dyad protease similar to papain, papain, for which a structure has been available for several decades, has been used as a model. Currently, there are 3 available alphavirus nsP2 protease structures, for Venezuelan equine encephalitis virus (VEEV; PDB ID: 2HWK), Sindbis virus (SINV; PDB ID: 4GUA) and Chikungunya virus (CHIKV; PDB ID: 3TRK). Structural superposition of papain with these nsP2 structures cannot be extended beyond 4 atom pairs, which supports the conclusion that these alphavirus protein motifs are not related to papain (Fig. 1). In contrast, structural superposition of the CHIKV nsP2 protease domain to the VEEV protease (245 atom pairs) gives an RMSD of 1.122 Å and to the SINV protease (250 atom pairs) an RMSD of 1.024 Å. The superposition of the three alphavirus nsP2 proteases demonstrates a highly conserved tertiary structure for the nsP2 proteases despite the amino acid sequence variations; for example, CHIKV nsP2 pro aligned with SINV and VEEV nsP2 pro shows 44% and 42% amino acid identity, respectively. Of interest in this report is the sequence difference, as this is what makes the CHIKV nsP2 different from the other studied alphavirus nsP2 proteases.
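To make the RMSD values quoted above concrete, the following minimal Python sketch shows how an RMSD over matched atom pairs is computed once two structures have already been superposed. It is an illustration only and not the procedure used in this study: the function name `rmsd` and the toy coordinates are assumptions, and the optimal superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. Å) over matched atom pairs.

    Assumes the two coordinate lists are already superposed and are listed
    in the same pairwise order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy coordinates (hypothetical, not taken from 3TRK, 2HWK or 4GUA):
# every atom in the second set is displaced by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
toy_rmsd = rmsd(set_a, set_b)
print(f"RMSD over {len(set_a)} atom pairs: {toy_rmsd:.3f} Å")
```

Because the value is an average over the matched pairs only, an RMSD should always be read together with the number of pairs, as is done in the text (245 and 250 pairs).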
Several studies have been reported characterizing the SFV, SINV, VEEV and CHIKV truncated nsP2 protease domain alone 8,24,[26][27][28] . Moreover, the SFV full length nsP2 and protease domain have been stated to be similar 29 . It was reported for SFV nsP2 that "The high solubility and specific activity of Pro39 suggest that this fragment represents a structurally compact protease domain of nsP2" 26 . The requirement for the N-terminus of nsP2 has only been demonstrated for SFV and in the same study nsP2 protease from SINV processed the SFV polyprotein but the SFV nsP2 could not process the SINV polyprotein 30 . These reports as well as the amino acid identity differences mentioned above demonstrate fundamental differences in the alphavirus nsP2 proteases. We have previously demonstrated that the full length CHIKV nsP2 and the truncated protease domain are not significantly different for protease activity 31 . In our previous report we demonstrated that the CHIKV nsP2 protease (truncated and full length) could recognize small peptide substrates, which has not been reported for Sindbis virus (SINV) or Semliki forest virus (SFV) nsP2 protease. Moreover, the CHIKV nsP2 protease recognition of small substrates also has been shown previously by another group 8 . Therefore an additional difference noted for CHIKV nsP2 is the length of the substrates that can be cleaved by the protease activity 31 . The substrates employed in the present study are the biological cleavage site sequences of the CHIKV protease with FRET tags, unlike previous reports for SFV and SINV nsP2 that used large GFP and thioredoxin fusion tagged sequences 29,30,32,33 . The SFV nsP2 showed no cleavage of the 2/3 sequence unless 170 amino acids of the N-terminus of nsP3 were part of the substrate 33 . 
However, even the EGFP-170 nsP3 substrate showed incomplete cleavage, suggesting that the lack of activity for 2/3 cleavage could be due to steric hindrance by the presence of the thioredoxin fusion protein and, in any case, demonstrating specificity/activity differences compared to the other two SFV cleavage sites. In addition, CHIKV nsP2-Pro compared with the same region of SFV and SINV shows only 65% and 44% amino acid identity, respectively 31 . The amino acid identity differences in the nsP2 proteins would by themselves logically suggest that the nsP2 proteases will be different. Moreover, the SFV cleavage sequences are actually different from the CHIKV sequences; for example, the important SFV P5 residue of the 1/2 site is Tyr, but in CHIKV it is Asp. In fact, in the literature no data are presented concerning the testing of small peptides as substrates, as all activity shown is with large fusion constructs. Verification of our in vitro findings in an in vivo context is the logical next study. The current study clearly demonstrates properties that the protease domain possesses, which implies that these properties may also be expressed in vivo, and this is the obvious justification for all recombinant protein characterization studies.
Scientific Reports | 5:17125 | DOI: 10.1038/srep17125
Several initial studies have shown cysteine and histidine residue involvement with catalysis, which gave nsP2 the papain-like designation 16,22 . As papain has been extensively studied (for example [34][35][36] ) and the alphavirus nsP2 has been characterized as papain-like there appear to have been few studies to characterize the kinetic mechanism for the nsP2 protease. This is unfortunate as now available structures and the data in this report demonstrate that alphavirus nsP2 is not papain-like. For example, a tryptophan residue in papain has been shown to make an important contribution to the catalytic mechanism 36,37 . In the early literature for alphavirus, a tryptophan residue was also found to cause nsP2 enzyme activity loss, similar to what had been shown for the papain catalytic mechanism 11 . Later, the tryptophan residue was suggested to be involved with nsP2 protease substrate recognition of the glycine residue in the P2 position 24 . Now, the availability of the nsP2 protease structures shows that the tryptophan residue is actually in the wrong orientation to participate in catalysis. The tryptophan residue appears to be in a pocket surrounded by hydrophobic residues and van der Waals contacts (3.3 to 4.1 Å in the CHIKV nsP2, PDB ID: 3TRK); therefore, it is most likely involved in structural stability of the loop that contains the catalytic histidine residue (Figs 1 and 2). To confirm this role we replaced the tryptophan residue with alanine and phenylalanine and performed characterization studies with the two proteins. The kinetic parameters obtained for the 2 tryptophan position mutants show differences for the two engineered proteins for the 3 substrates (Table 1). In contrast to the previous reports for the other alphavirus nsP2 enzymes neither mutant showed a total loss of activity for any substrate. Perhaps surprisingly, the Trp549Ala enzyme actually showed a 3-fold increase in activity for the AGC substrate ( Table 1). 
The alanine or phenylalanine residues affected the kinetic parameters to a different extent, and these effects also appeared to be substrate dependent. For example, both mutants showed decreased activity for AGA but also an increase in affinity as shown by a decreased K m value such that the catalytic efficiency ratio (k cat /K m ) for both enzymes increased compared to the wild type enzyme (Table 1). But both enzymes showed quite different effects on kinetic parameters with the AGC substrate; the Trp549Ala enzyme had increased activity, the Trp549Phe had decreased activity and both had similar K m 's compared to wild type enzyme parameters (Table 1). Both tryptophan position mutants, for the AGG substrate, showed similar activity as the wild type but displayed affinity (K m ) differences with Trp549Phe similar to wild type but Trp549Ala showing decreased affinity with an increased K m (Table 1).
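The observation that a mutant can lose turnover yet gain catalytic efficiency follows directly from the ratio k cat /K m. The short Python sketch below works through that arithmetic. All numbers are hypothetical placeholders chosen for illustration and are not the values in Table 1; the function names are likewise assumptions.

```python
def catalytic_efficiency(kcat, km):
    """Return kcat/Km; units follow from the inputs (e.g. s^-1 and µM)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def fold_change(mutant_value, wild_type_value):
    """Ratio of a mutant parameter to the wild type parameter."""
    return mutant_value / wild_type_value


# Hypothetical parameters (NOT from Table 1): the mutant has a 3-fold lower
# kcat but a 6-fold lower Km than the wild type.
wt_kcat, wt_km = 9.0, 60.0
mut_kcat, mut_km = 3.0, 10.0

wt_eff = catalytic_efficiency(wt_kcat, wt_km)
mut_eff = catalytic_efficiency(mut_kcat, mut_km)
eff_change = fold_change(mut_eff, wt_eff)
print(f"kcat/Km changes {eff_change:.1f}-fold relative to wild type")
```

Whenever K m falls by a larger factor than k cat, the ratio rises, which is the pattern described for both tryptophan position mutants with the AGA substrate.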
We used several classical protease inhibitors as well as metal salts to characterize the nsP2 enzymes. Although the alphavirus nsP2 has been classified as a cysteine protease there are many studies that show bacterial and viral cysteine proteases behave differently to papain-like cysteine proteases, such that the enzymes show little to no inhibition by E-64, chymostatin, PMSF or leupeptin [38][39][40][41][42][43][44][45] . Therefore, to further characterize the tryptophan mutants we used several protease inhibitors and metal ions to observe the effects on the activity with the 3 substrates (Tables 2 and 3). Both tryptophan mutants appeared to behave similarly to the wild type enzyme in the presence of the protease inhibitors (Table 2). The metal ion effects on the activity of the 2 tryptophan mutants for the 3 substrates suggested more complex conditions existed (Tables 2 and 3). Perhaps cobalt and zinc effects illustrate this complexity best. Cobalt showed different effects depending on the substrate employed, with enhanced activity for AGA, inhibition for AGC and moderately increased activity for AGG (Table 3). The tryptophan residue mutants responded differently to each other as well as compared to the wild type, suggesting topological changes in the active site upon protein interaction with the metal. With zinc these interactions allowed the tryptophan residue mutants to retain greater activity for all 3 substrates (Table 3). The modest effects of the tryptophan residue mutations on the kinetic parameters support the idea that the residue position does not play a direct role in catalysis. 
We suggest that the tryptophan residue position is involved in protein dynamics through anchoring of the loop that contains the catalytic dyad histidine residue. The residue changes in this position affected the conformational ensembles of the proteins, which indirectly affected the enzyme catalysis and affinity of interaction with the substrates.
Figure caption (fragment): Distances are in Ångströms in green. Alpha helices are shown in red and beta strands in purple. Molecular graphics and analyses were performed with the UCSF Chimera package. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (supported by NIGMS P41-GM103311) 49 .
Initial experiments to confirm the catalytic dyad cysteine residue were performed with the residue replaced with alanine. Surprisingly the engineered enzyme did not lose all activity as had been reported for SINV and SFV alphavirus nsP2 proteases 11,12 . Analysis of the available CHIKV nsP2 protease structure (PDB ID: 3TRK) suggested that a serine residue one helical turn away from the cysteine residue may be substituting in the catalytic role (Fig. 2). In the CHIKV structure this serine is 6.6 Å from the dyad histidine residue (Ser OG to His NE2), which seems too far for catalysis. However, examination of the VEEV (PDB ID: 2HWK) and SINV (PDB ID: 4GUA) nsP2 protease structures shows that the distances between the atoms of the dyad cysteine (SG) and histidine (NE2) residues range from 5.6 to 8.1 Å. This suggests that dynamic movement away from the static crystal structure conformations must occur to position the catalytic residues correctly, and these types of movements could also account for placing the CHIKV serine residue an equivalent distance into a catalytic position. To test the hypothesis of serine catalysis we also generated the Ser482Ala mutant enzyme as well as the double mutant Cys478Ala/Ser482Ala. Although the double mutant protein had no detectable activity with any substrate, both single mutation enzymes, Cys478Ala and Ser482Ala, had activity for all three substrates (Table 1). Both enzymes, Cys478Ala and Ser482Ala, had a similar k cat to the wild type enzyme for the AGC and AGG substrates (Table 1); however, both enzymes had significantly less activity than wild type for the AGA substrate (Table 1). The binding affinity (K m ) for AGC and AGG was similar to wild type for both enzymes but was significantly increased (lower K m value) for the AGA substrate. Use of the AGA substrate appears to allow discrimination of the kinetic parameters for the 3 enzymes, Cys478Ala, Ser482Ala and wild type. 
This suggests that the conformational dynamics of the 3 proteins are different with the changes in k cat of 3- and 9-fold as well as the changes in K m of 6- and 20-fold also demonstrating that other residues (in addition to the 'dyad' residues) are significantly involved in the binding and catalysis. Previously a similar nsP2 protease from the alphavirus SFV was characterized and shown to be completely resistant to E-64, PMSF and leupeptin as well as exhibiting metal ion effects 24,26 . In our study the protease inhibitors leupeptin and E-64 showed little to no effect on wild type, Cys478Ala and Ser482Ala enzyme activities for all 3 substrates (Table 2). Chymostatin showed mild inhibition of the wild type for all 3 substrates; but for Cys478Ala and Ser482Ala enzymes the inhibition appeared to be substrate dependent with even a slight enhancement for Ser482Ala activity with AGA (Table 2). PMSF showed the greatest effect on Cys478Ala activity for AGG but little effect on AGA activity of wild type, Cys478Ala and Ser482Ala enzymes (Table 2). The five metal salts studied gave varying effects depending on the metal, the substrate and the enzyme being considered (Table 3). For example, cobalt significantly enhanced AGA activity for wild type and Cys478Ala enzymes but not Ser482Ala, but inhibited AGC activity of all 3 enzymes to a similar extent and gave a mild increase in AGG activity for all 3 enzymes (Table 3). Using substrate equivalent to our AGG (the cleavage site of nsP3/nsP4) the SFV nsP2 showed approximately 80% inhibition with cobalt, whereas for CHIKV we observed a slight enhancement in activity. Previously the SFV nsP2 was reported to be completely inhibited by copper and zinc 26 and this was also reported for a CHIKV nsP2 protease 8   Ser482Ala enzyme using the substrate AGA in the presence of zinc (Table 3). 
As previously observed, we suggest that the differences observed for the CHIKV nsP2 proteases originate with the different substrates used, as the different amino acid lengths around the scissile site as well as total substrate length appear to impact on the properties of the enzyme 31 .
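The interatomic distances used in the structural argument above (for example Ser OG to His NE2) can be measured directly from the ATOM records of a PDB file. The Python sketch below parses the fixed-column coordinate fields and computes one such distance. It is a sketch under stated assumptions: the two records are synthetic and placed 6.6 Å apart only to mirror the value quoted in the text, the helper `atom_line` is hypothetical, and the histidine residue number 548 is an assumption that is not stated in this section.

```python
import math


def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a simplified PDB ATOM record (hypothetical helper for the toy data)."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_text):
    """Map (chain, residue number, atom name) to (x, y, z) for ATOM/HETATM records."""
    atoms = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()      # columns 13-16: atom name
            chain = line[21]                # column 22: chain identifier
            resseq = int(line[22:26])       # columns 23-26: residue number
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms[(chain, resseq, name)] = xyz
    return atoms


# Synthetic records (NOT coordinates from 3TRK); His 548 is an assumed number.
toy_pdb = "\n".join([
    atom_line(1, "OG", "SER", "A", 482, 0.0, 0.0, 0.0),
    atom_line(2, "NE2", "HIS", "A", 548, 6.6, 0.0, 0.0),
])

atoms = parse_atoms(toy_pdb)
ser_his = math.dist(atoms[("A", 482, "OG")], atoms[("A", 548, "NE2")])
print(f"Ser OG to His NE2: {ser_his:.1f} Å")
```

With a real structure file, the same `parse_atoms` call could be applied to the file contents and the Cys SG to His NE2 distance obtained by changing the lookup keys.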
A thermal stability study was performed to determine if the residue changes introduced structural properties that could be detected by variations in protein stability. The stability data showed the wild type enzyme and all the engineered mutants in this study behaved in a similar fashion ( Table 4). The similarities of the physical properties of the proteins were also corroborated by similar behaviors during expression and purification of the recombinant proteins. The enzyme differences observed in this report would therefore appear to be due to conformational dynamics and the varying ensembles available to each enzyme.
Of specific interest in this report is that the CHIKV nsP2 protease appears able to interchangeably employ either the canonical dyad cysteine residue or a proximal serine residue for catalysis (Fig. 2). The enzyme retained activity with either a Cys478Ala or a Ser482Ala mutation but lost all detectable activity with the double mutation Cys478Ala/Ser482Ala. The data suggests that rather than an induced fit mechanism the substrate interaction employs conformational selection as either the cysteine or serine residue can be utilized for cleavage of all 3 substrates. Some conformations appear to yield greater activity and thereby display a slight residue preference for activity, such as cysteine residue for the AGA substrate cleavage. The cysteine residue appears to have a structural influence as the Cys478Ala mutant shows large experimental variation in the data. Actually, both cysteine and serine residue positions seem to impact on structure to the extent of influencing changes in binding affinity such as for the AGA substrate. The conformational dynamics also were impacted by the metal ions as demonstrated by specificity changes for wild type versus the cysteine and serine residue mutants for the 3 substrates. Several reports show viral protease enzyme activity is affected by metal ions 8,19,26,41,44,46 . The mechanism is unclear at this time but may be due to the metal interaction with the protein's amino acid residues (histidine, cysteine, aspartate, glutamate, tyrosine, tryptophan, methionine, serine, threonine, asparagine, glutamine or the main-chain amino and carbonyl groups 47 ) impacting on the enzyme's conformational dynamics which then could yield an increase or decrease in enzyme activity. In our study, for example, cobalt significantly increases activity for AGA substrate for the wild type and Cys478Ala but not Ser482Ala, while inhibiting all 3 protease activities for AGC substrate. 
These cobalt effects for the AGA substrate also support the concept that the wild type enzyme uses serine or cysteine residues interchangeably for catalysis. However, some of the inhibitor data, such as those for chymostatin, where the wild type activity is greater than that of either the cysteine or serine mutant, suggest that the catalytic mechanism is much more complex. Indeed, taken together the data suggest that the available conformation ensembles are impacted by small molecules such as metal ions or inhibitors, as well as the residues present in the active site. These different ensembles present varying structures for the conformational selection process of the substrate binding, which then yield the alterations in activity observed. The alphavirus nsP2 protease kinetic mechanism would seem to be more complex than the early literature has suggested. The CHIKV enzyme appears to be unique with the interchangeable cysteine/serine dyad residue; however, characterization of other alphavirus nsP2's with a similar serine-cysteine arrangement (for example SFV nsP2) needs to be performed. In addition, in vivo macromolecular assembly has also been suggested to play an important role in the functionality of nsP2 during viral replication 30 . Further characterization studies will be necessary if the nsP2 protease is to be an anti-viral target.

Methods
Recombinant nsP2 protease characterization. The recombinant nsP2 proteases were constructed, expressed and purified as previously described 31 . The engineered mutants were expressed and purified by the same protocol as the wild type protein. Enzyme characterization was performed as previously described 31 . Briefly, the characterization was performed with 3 synthetic fluorescent substrates corresponding to the 3 cleavage sites of the chikungunya viral non-structural polyprotein. These substrates were designated nsP1/nsP2 (AGA), nsP2/nsP3 (AGC) and nsP3/nsP4 (AGG) with the amino acids
Metal and protease inhibitor characterization. The effects on nsP2 protease activity of metal ions and protease inhibitors were characterized as previously described 31 . Briefly, a single concentration of 2 mM metal ion was employed to test cobalt (Co2+), magnesium (Mg2+), zinc (Zn2+), nickel (Ni2+) and copper (Cu2+) effects on protease activity. Different final concentrations of four protease inhibitors were used: 50 μM chymostatin (inhibits chymotrypsin-like proteases), 10 μM E-64 (a selective cysteine protease inhibitor), 100 μM leupeptin (inhibits serine and thiol proteases) and 1 mM phenylmethanesulfonyl fluoride (PMSF; a commonly used serine protease inhibitor). The enzyme activity in the presence of the metal ions or inhibitors was measured with the 3 fluorescent substrates. The initial reaction rate was analyzed using GraphPad Prism® software, version 5.01. Percent remaining activity was calculated from the enzyme activity obtained in the presence of inhibitor versus the control activity in the absence of inhibitor.
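The percent remaining activity calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis performed in GraphPad Prism: the initial rate is approximated here as the least-squares slope of fluorescence against time over a window assumed to be linear, and the function names and progress-curve values are hypothetical.

```python
def initial_rate(times, signal):
    """Least-squares slope of signal versus time over the supplied window."""
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("need at least two matched time/signal points")
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    return sxy / sxx


def percent_remaining_activity(rate_treated, rate_control):
    """Activity with inhibitor or metal ion as a percentage of the untreated control."""
    if rate_control == 0:
        raise ValueError("control rate must be non-zero")
    return 100.0 * rate_treated / rate_control


# Hypothetical progress curves (arbitrary fluorescence units versus minutes).
minutes = [0.0, 1.0, 2.0, 3.0]
control = [0.0, 10.0, 20.0, 30.0]
treated = [0.0, 4.0, 8.0, 12.0]

control_rate = initial_rate(minutes, control)
treated_rate = initial_rate(minutes, treated)
remaining = percent_remaining_activity(treated_rate, control_rate)
print(f"Remaining activity: {remaining:.0f}%")
```

A value above 100% in this scheme corresponds to enhancement, as reported for some metal ion and substrate combinations in Table 3.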
Stability test. Thermal stability of the nsP2 protease was determined as previously described 31 . Briefly, freshly purified enzyme was incubated at 37 and 42 °C for 10 min then activity was immediately measured using the AGG substrate at 4.5 μM final concentration. The initial reaction rate was evaluated using GraphPad Prism® software, version 5.01. Percent remaining activity was assessed by comparison with the control reaction at 37 °C.
Table 5. Substrate sequences of chikungunya virus nsP2 protease used in this study. The substrate sequences used in the present study have been previously reported 31 , and this table is adapted from the previous report. Briefly, the three fluorescent substrates designated as AGA, AGC and AGG (as underlined) were synthesized corresponding to the scissile site sequences (shown in upper case bold text) of chikungunya virus non-structural polyprotein (nsP1/2, nsP2/3 and nsP3/4), respectively. A 2-(N-methylamino)benzoyl (Nma) fluorophore group was attached at the amino terminus and a 2,4-dinitrophenyl (Dnp) group attached to the carboxyl terminus of an additional lysine (K) residue.