Structural determinants of specificity and regulation of activity in the allosteric loop network of human KLK8/neuropsin

Human KLK8/neuropsin, a kallikrein-related serine peptidase, is mostly expressed in skin and the hippocampus regions of the brain, where it regulates memory formation by synaptic remodeling. Substrate profiles of recombinant KLK8 were analyzed with positional scanning using fluorogenic tetrapeptides and the proteomic PICS approach, which revealed the prime side specificity. Enzyme kinetics with optimized substrates showed stimulation by Ca2+ and inhibition by Zn2+, which are physiological regulators. Crystal structures of KLK8 with a ligand-free active site and with the inhibitor leupeptin explain the subsite specificity and display Ca2+ bound to the 75-loop. The variants D70K and H99A confirmed the antagonistic role of the cation binding sites. Molecular docking and dynamics calculations provided insights into substrate binding and the dual regulation of activity by Ca2+ and Zn2+, which are important in neuron and skin physiology. Both cations participate in the allosteric surface loop network present in related serine proteases. A comparison of the positional scanning data with substrates from brain suggests an adaptive recognition by KLK8, based on the tertiary structures of its targets. These combined findings provide a comprehensive picture of the molecular mechanisms underlying the enzyme activity of KLK8.

to as KLK8-Ca. Upon soaking the crystals with ZnCl2, one cell constant approximately doubled in length, accompanied by a transition to space group P2₁, with four molecules per asymmetric unit. The corresponding 2.1 Å structure of KLK8, in which Zn2+ was not clearly identified, is referred to as KLK8-leup and largely resembles the KLK8-Ca molecule.
As in most trypsin-like serine proteinases the KLK8 chain consists of two six-stranded β-barrels that can be considered as half-domains ( Fig. 2A). At their interface the catalytic triad of Ser195, His57 and Asp102 is situated above the specificity subsites S4 to S4′, aligned from left to right in the standard orientation. Compared to the coordinates of mKlk8 (PDB code: 1NPM) human KLK8 exhibits a relatively low root mean square deviation (RMSD) of 0.63 Å for 222 equivalent residues atoms, agreeing well with respect to 73% identical residues 7 . Similar to other mature KLKs, the N-terminal α-ammonium group of Val16 forms the crucial salt bridge to the side chain carboxylate of Asp194, which stabilizes the oxyanion hole and a properly shaped S1 pocket ( Fig. 2A). Also, KLK8 contains the KLK-specific cis-Pro219 and six disulfide bridges.
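As a brief illustration of the RMSD value quoted above for the comparison with mKlk8, the following minimal Python sketch shows how such a deviation is computed for two sets of equivalent atoms. It assumes that both structures have already been superposed and that atoms are listed in matching order; the function name `rmsd` and the toy coordinates are illustrative only and are not taken from the KLK8 or 1NPM coordinates.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points.

    Assumes the two structures are already superposed and that the points are
    listed in matching order (one point per equivalent atom).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates in Å (not from any PDB entry):
# every atom of the second set is shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
mobile = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(f"RMSD = {rmsd(reference, mobile):.2f} Å")
```

In practice the superposition itself (for example by a least-squares fit) precedes this step and is usually handled by the structure analysis software.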
The active site cleft from the non-prime specificity pockets S4 to S1 and S1′ to S4′ is predominantly negatively charged (Fig. 2B). Several basic residues cluster around the 60-loop in a positively charged surface patch, extending above the prime side region of the active-site cleft in standard orientation. Together with the positively charged patch around Arg109 this area resembles the more extended basic regions of KLK5 and KLK7 and the functionally important anion binding exosite I of thrombin 29 .
Figure 1. (A) Peptide substrate specificity of KLK8 measured by positional scanning for peptide positions P4 to P1. The height of the columns represents substrate cleavage rates by released ACC, while the x-axis indicates the amino acids in one-letter code, whereby n represents norleucine, which substitutes Cys. (B) PICS heat maps with amino acids listed in alphabetical order along the y-axis and peptide positions run from P6 to P6′ on the x-axis. The left panel displays cleavages in 73 substrates and the right panel contains corrected values for the natural abundance of each residue. Both PICS and PSSCL agree in a moderate preference for Thr and Trp in P4 and for basic residues in P3, such as Lys or Arg. Aliphatic residues Leu, Val, and Ile are found by both methods in P2, whereby Val dominates in the corrected PICS, while PSSCL ranks the polar Asn, Ser, and Thr relatively high. The major primary specificity for Arg over Lys in P1 is found by both methods, albeit stronger in PICS. The corrected PICS data show a preference for Ser and Met in P1′, for Ile and Trp in P2′, and no distinct specificity for P3′ and P4′. By contrast, S5′ and S6′ subsites are moderately specific for His and Tyr.
Active-site cleft and specificity pockets. Both KLK8 structures exhibit a similar active site architecture, whereby in the KLK8-Ca structure the substrate binding cleft is occupied by several water molecules, located mostly at polar or charged positions of the specificity pockets.
By contrast, the P2 1 form of KLK8-leup contains four copies of the aldehyde inhibitor leupeptin, with the sequence acetyl-Leu1i-Leu2i-Argininal3i. For this reversible inhibitor an IC 50 of 66 µM was reported for KLK8 from insect cell expression, while it seems less than 10 µM for mKlk8 7,28 . Leupeptin binds covalently to Ser195 as analogue of an acyl intermediate in the canonical conformation (Fig. 3A). The P1-Arg3i aldehyde group in KLK8-leup is covalently linked to Ser195 Oγ as hemiacetal, with the oxygen occupying the oxyanion hole in the four KLK8 copies. This orientation of the hemiacetal oxygen corresponds to one of the two alternative conformations observed in a trypsin leupeptin complex (PDB code 1JRT) 30 . It forms hydrogen bonds to the backbone amide N-H atoms of Gly193 and Ser195 or a nearby water at average distances of 3.3 Å, while the other conformation with a hydrogen bond to the Nε2 atom of His57 is not observed. Similar to a substrate, leupeptin forms a short antiparallel β-sheet with two backbone hydrogen bonds between P3-Leu1i and Gly216, and the amide of P1-Arg3i and the carbonyl oxygen of Ser214 (Fig. 3A). The Arg3i side chain is buried in the S1 specificity pocket, which is rather hydrophobic in the upper region, whereas at the bottom the side-chains of Thr190, Ser217, Tyr228, and a water molecule bound to the Ser217 carbonyl O contribute to a polar environment. The negatively charged carboxylate of Asp189 forms a tight salt bridge to the positively charged Arg3i guanidyl group. Both positional scanning and PICS analysis corroborated that Arg is strongly favored over Lys as P1 residue (Fig. 1). This finding can be explained by the presence of Thr190, which enhances the specificity for P1-Arg similar to Ser190 of many trypsin-like proteinases, whereas an Ala190 would shift the P1 preference towards Lys 31 .
The funnel-shaped S2 subsite of KLK8 is bordered by the hydrophobic side-chains of His57, His99, and partially Trp215, which explains very well the accommodation of the Leu2i side-chain of the inhibitor and the preference for the aliphatic Val and Leu residues in the substrate profiling (Fig. 1). Also, the hydroxyl group of Tyr94 may interact with polar residues, such as Asn or Ser, which are accepted according to the PSSCL. Overall, this subsite appears to be quite flexible and adopts different side-chain conformations in all five KLK8 molecules, in particular for His99, while water molecules sit at varying positions.
Although most P3 side chains of serine protease substrates extend to the bulk solvent, two major alternative conformations are possible for Leu1i and, consequently, for the acetyl group. In three KLK8-leup copies, the Leu side chain prefers the canonical conformation in the bulk solvent, whereby its backbone NH can form a hydrogen bond to the carbonyl O of Gly216 as in KLK5 32 , whereas in one copy it occupies the S4 pocket as in trypsin 30 . As Leu is not a preferred P3 residue according to PSSCL and PICS, it is difficult to relate structural and functional information for the unusual specificity of the S3 subsite for basic residues (Fig. 1). The variable arrangement of the predominantly hydrophobic His99, Trp215, Tyr172 and the more polar Gln175 and Glu97 may suffice to explain the relatively low specificity for P4 residues, including the uncommon preference for Thr and Trp, and the rejection of negatively charged P4 residues.
Figure 2. (A) The side chains of the catalytic triad, the activating salt bridge Val16-Asp194, the major specificity-determining Asp189 and Thr190, as well as the six disulfide bridges and Cys93 are shown as sticks (carbon teal, nitrogen blue, oxygen red, and sulfur yellow). Also, the three major ligands of Ca2+ (green sphere) are depicted, namely Asp70, Asp77 and Glu80. Functionally important loops are displayed in different colors with labels. (B) Surface model of KLK8 with inhibitor and Ca2+. The left panel shows KLK8 in standard orientation with a stick model of leupeptin, occupying the subsites S4 to S1. The electrostatic surface potential is depicted red for negative and blue for positive potential in the range −10 e/kBT to +10 e/kBT. The right panel shows the KLK8 backside, rotated by 180°. The 60-loop and the area around Arg109 correspond to the thrombin anion binding exosite I, while the positive patch at the back around Lys166 and Lys186A has no counterpart.
Scientific Reports | (2018) 8:10705 | DOI:10.1038/s41598-018-29058-6
A molecular docking calculation for an ideal peptide substrate of KLK8, derived from the PICS data, resulted in proper binding of P4 to P4′ (Fig. 3B). Besides the expected accommodation in the S2 to S1 pockets, the interaction of the Arg guanidyl group with the carboxylate groups of Asp218 and Glu149 explains the specificity profiles for the preferred basic P3 residues. Also, a polar interaction of the P4-Thr with Glu97 may be the basis for the unusual S4 specificity. The S1′ subsite is a shallow pocket that is mostly shaped by two main chain stretches with polar character, while the hydrophobic bottom consists of the disulfide Cys42-Cys58, which agrees with a moderate preference for P1′-Ser and Met. Finally, the accommodation of the aliphatic P2′-Ile, P3′-Leu and P4′-Met depends on the hydrophobic environment, such as Leu40, Leu41, Lys60 and Phe151 bordering the S2′ to S4′ subsites, respectively. The adjacent S5′ and S6′ subsites possess a mixed character, which tolerates aromatic, but also charged or polar residues, such as His and Tyr, while further, still putative, prime side subsites may extend even beyond the 75-loop.
The Ca2+ binding 75-loop. Among the loops shaping the active-site cleft of KLK8, the 75- or "calcium binding" loop is the most distant from the catalytic triad. This loop is named after the central residue, but it is also known as the 70-80-loop. The 75-loop of KLK8-Ca exhibits an ideal coordination geometry compared to the slightly deviating KLK8-leup loops and holds the Ca2+ ion with six ligands at an average distance of 2.3 Å, which is slightly below standard values (Fig. 4A,B) 33 . Among them are the side chain carboxylate groups of Asp70, Asp77, and Glu80, which provide a negative charge excess with respect to the divalent cation. Since the Oδ1 and Oδ2 of Asp77 bind the backbone N-H group of Ser72 at a distance of roughly 3.1 Å, some portion of the third negative charge is compensated. Similarly, the carboxylate of Glu80 binds the N-H groups of Asp77 and Gly78 with its Oε2 and Oε1 atoms, respectively. Also, the positively charged side chain of Arg66 appears to compensate some portion of the negative excess charge by a contact of the Nη1 to the Oδ2 of Asp70 at around 3.3 Å, albeit not with an ideal geometry. Additionally, the Ca2+ ion is bound with an average distance of 2.3 Å by the carbonyl O atoms of Ser72 and Asn75, as well as by one water molecule. The Ca2+ coordination sphere is octahedral, as it is known for coagulation factors or trypsin where the 75-loop binds Ca2+ via three glutamates, including a mediating water 34 . By introducing the substitution Asp70Lys in the 75-loop, the stimulatory effect of Ca2+ was essentially abolished in enzymatic assays with Boc-Val-Pro-Arg-AMC or Bz-Pro-Phe-Arg-pNA compared with normal activity (Fig. 4C). The function of Ca2+ bound to the 75-loop can be dual as in trypsin, where it serves both to stabilize the enzyme and enhance the activity 35 .
Zn2+ binding in the 99-loop.
The 99-loop of KLK8 exhibits an intermediate length between those of KLKs 4, 5, 6 and 7 and the long, exposed "kallikrein-loops" of KLKs 1 to 3 with an insertion of 11 residues 23 . Guided by the crystal structures of murine neuropsin, KLK5 and other KLKs with confirmed Zn2+ inhibition, we generated mutants in order to investigate the functional role of critical residues located in the 75- and 99-loops 23,24 . The substitution of His99Ala resulted in an active KLK8 variant that exhibited a 13-fold higher IC50 (46.7 ± 3.2 µM) for Zn2+ inhibition with respect to the wild type (Fig. 4D). Lack of the hydroxyl group in the variant Tyr94Phe resulted in a doubled IC50 (6.9 µM), which hints at a role as secondary or water-bridged ligand of Zn2+ (Fig. 4E). Since cysteines are favored Zn2+ ligands, the residual inhibition of the His99Ala mutant could depend on Cys93 (Fig. 4E), which was mutated to a serine. However, the Cys93Ser variant was as active as the wild type and displayed similar inhibition by Zn2+, which excludes Cys93 as ligand, since Ser, like Tyr, is not an ideal ligand for Zn2+ 36 . This finding suggests that substrates still can bind to the active site in the inhibited state, while the inactivation is based on another process, such as the disruption of the catalytic triad, which was found in the crystal structure of rat tonin (klk2) in complex with Zn2+ (PDB code 1TON) 37 . Similarly, His57 in rat trypsin relocated from the triad to bind Cu2+ together with an engineered His96 (1AND) 38
Figure 4. (E) In the 99-loop, Cys93 can be excluded as potential Zn2+ ligand, since Zn2+ inhibition of the Cys93Ser variant was unchanged compared to the wild-type. A water molecule (grey) between His99 and Tyr94 in KLK8-leup is in a suitable position to bind Zn2+. The variant Tyr94Phe has a 2-fold higher IC50
possible ligands are His57 or perhaps Asp102 (Fig. 4F).
At least, we can narrow down the most likely Zn 2+ binding residues to the region around His99, Tyr94 and His57 in the S2 subsite (Fig. 4F). There, the Tyr94 OH and the carbonyl O atoms of Val96 and Asp98 could serve as additional ligands or secondary ones by bridging water molecules in the coordination sphere of the metal ion.
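To make the reported IC50 shifts more tangible, the following Python sketch estimates the residual activity at a given Zn2+ concentration. It is only an illustration under explicit assumptions: a simple one-site inhibition curve with a Hill coefficient of 1 is assumed, which the text does not state, and the wild-type IC50 is back-calculated from the reported 13-fold ratio rather than taken as a measured value. The function and constant names are hypothetical.

```python
def residual_activity(inhibitor_uM, ic50_uM, hill=1.0):
    """Fraction of uninhibited activity remaining at a given inhibitor concentration.

    Assumes a simple dose-response curve, activity = 1 / (1 + ([I]/IC50)^h).
    The Hill coefficient h = 1 is an assumption, not a value from the text.
    """
    if inhibitor_uM < 0 or ic50_uM <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return 1.0 / (1.0 + (inhibitor_uM / ic50_uM) ** hill)


# IC50 values for Zn2+ inhibition as given in the text (µM).
IC50_H99A = 46.7
IC50_Y94F = 6.9
# Back-calculated estimate: the H99A value is described as 13-fold higher
# than wild type, so the wild-type IC50 is roughly 46.7 / 13 µM.
IC50_WT_ESTIMATE = IC50_H99A / 13.0

for zn_uM in (1.0, 10.0, 100.0):
    wt = residual_activity(zn_uM, IC50_WT_ESTIMATE)
    y94f = residual_activity(zn_uM, IC50_Y94F)
    h99a = residual_activity(zn_uM, IC50_H99A)
    print(f"{zn_uM:6.1f} µM Zn2+: wt {wt:.2f}, Y94F {y94f:.2f}, H99A {h99a:.2f}")
```

Under these assumptions, a variant with a higher IC50 retains more activity at any given Zn2+ concentration, which is the qualitative behavior described for the His99Ala and Tyr94Phe variants.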

Discussion
KLK8/neuropsin is among the major suspects that make a substantial difference in human neuronal processes with respect to the corresponding activity in other primate brains. A structure based sequence alignment demonstrates that only three insignificant amino acid exchanges are present in the active KLK8 protease of our second closest relative in evolution, the gorilla, whereas chimpanzees have exactly the same sequence as humans (Fig. 5). Apparently, the major difference in all non-human primates is the lack of KLK8 isoform 2, with a 45 residue longer signal/prepeptide, which could be responsible for altered protein trafficking, as observed for long signal peptides of other protein isoforms 14,39 . In case the insertion is part of an elongated propeptide of pro-KLK8, it might not influence the distorted loop network at all, whereas KLK8 activation might be altered, favoring a more efficient auto-activation as observed for KLKs 2 and 5 40 . Since isoform 2 is mostly expressed in the hippocampus, the special role of KLK8 in learning and memory formation may be highly important for human consciousness.
Figure 5. Other species are gorilla (Gorilla gorilla), mouse (Mus musculus) and rat (Rattus norvegicus) (mKlk8 and rKLK8), including bovine trypsin (bTRY) and chymotrypsinogen A (bCTRA) as numbering standard. Signal peptides of isoforms precede the propeptide, whereby isoform 2 with a 45 residue insertion is supposed to be unique for humans. Human and chimpanzee KLK8 are identical, whereas the gorilla has three exchanges. Mice share 73% identical residues in KLK8 with humans, similar to rats with about 71%. β-sheets, α-helices, and the short 3₁₀
Regarding natural KLK8 substrates, pro-KLKs 1, 2, 3, 5, 6, 9, 11, and 12 appear to be physiologically relevant in KLK activation cascades 40 . The growth hormone somatotropin is another interesting natural substrate, as well as single-chain t-PA, which after activation by KLK8 activates plasmin, facilitating cancer cell invasion 22,41 . Among neuronal mouse substrates is the signal protein neuregulin 1 (NRG1), which is cleaved by Klk8 at three sites with P1-Arg 42 . Although KLK8 expression correlates with cleavage of the neural cell adhesion molecule L1-CAM, the specific sites have not been determined yet 43 . However, murine Klk8 cleaves the receptor tyrosine kinase ephrin type-B receptor 2 (EphB2) after Arg518, resulting in NMDA receptor dependent synaptic plasticity events in stress response 44 . Recently, KLK8 inhibition was proposed as therapeutic target in Alzheimer's disease, which is accompanied by elevated KLK8 protein levels, while in a murine Alzheimer's model Klk8 inhibition by antibodies significantly attenuated the pathology 45 .
Both the synthetic library and the proteomic specificity profiling agree very well regarding the non-prime side, while there is basically no discrepancy with the MEROPS specificity matrix (Fig. 1). Also, the specific interactions of the P4 to P1 residues can be well explained on a structural level by the complex of KLK8 with the leupeptin inhibitor. In the prime side region, the MEROPS matrix is biased by the IVGG motif that is often present at the N-terminus of serine proteases, which are substrates in KLK activation cascades 40 . Otherwise, this part of the matrix, summarized as Ia/vi/gn/G with small letters for lower preference, resembles more the corresponding raw profile of PICS (as/avi/-/v) than the corrected one (ms/iw/-m) (Fig. 1B). Comprising 25 cleavages, the MEROPS matrix is just below the theoretical limit of 30 substrates for a reliable specificity determination within a 95% confidence interval, using the information entropy as measure of specificity 46 . By contrast, the PICS measurement comprised 73 substrate cleavages, which increases the reliability substantially, as evidenced by the concordance with the virtually unbiased positional scanning library results (Fig. 1A).
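The information entropy mentioned above as a measure of specificity can be illustrated with a short Python sketch. It computes the Shannon entropy of the residue distribution at a single substrate position, where a low value indicates a strong preference. This is a generic formulation and an assumption on our part; the cited analysis may use a normalized or otherwise modified definition. The residue strings are hypothetical examples, not the PICS or MEROPS data.

```python
import math
from collections import Counter


def positional_entropy(residues):
    """Shannon entropy in bits of the residue distribution at one substrate position.

    `residues` is a string (or list) holding the amino acid observed at that
    position in each cleaved substrate, in one-letter code.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one residue is required")
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical observations from eight cleavages (not the published data):
# a highly specific position dominated by Arg, and an unspecific position.
p1_like = "RRRRRRKR"
unspecific = "ASLGTEKV"

print(f"specific position:   {positional_entropy(p1_like):.2f} bits")
print(f"unspecific position: {positional_entropy(unspecific):.2f} bits")
```

With few observed cleavages the estimated distribution is noisy, which is the reason why a minimum number of substrates is required before such entropy values become reliable.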
A structure based comparison of the specificity matrix and the profiling results with physiological substrates from murine and human neuronal tissue, i.e. NRG1, EphB2, and neuroserpin (SERPINI1), reveals that several cleavage sites are located in disordered regions as it is usually expected, but some comprise secondary structural elements as it was established for other proteases 47 . Some cleavage sites resemble the extended conformation found in surface loops and strands, which also coincide mainly with the P1 to P3 sequence that is determined mainly by electrostatic and to some extent by hydrophobic and polar interactions. The cleavage sites KKE 18 R-GSGK of NRG1 and NRL 38 R-ATGE of neuroserpin belong to this category, probably requiring the α-helix of the non-prime side (PDB code 3FGQ) to unwind to some extent (Fig. 6A) 42,48 . Also, the reactive center loop of neuroserpin, which has been suggested as physiological inhibitor of KLK8 in the brain, with the sequence ISM 362 R-MAVL and an α-helical turn could fit in here 49 . However, the cleavage sites ELN 79 R-KNKP of NRG1 and GYG 518 R-YSGK of EphB2 hardly agree with the established substrate specificity of KLK8, except for the presence of P1-Arg 42,44 . Intriguingly, the three-dimensional models of both sites exhibit accessible surface stretches with basic residues in neighbor strands, such as Arg67 and Lys69 in NRG1 or Arg509 and Arg511 in EphB2 (Figs 6B,C). They are in reach of Asp218 and Glu149, which appear to be the determinants of the specificity for basic P3 residues. Thus, only by considering the three-dimensional structure of substrates with large deviations from the ideal recognition sequence, we can explain the preference of KLK8 for its known physiological targets. Since the phenomenon of significantly differing synthetic and natural substrate preferences is well known for other proteases, systematic analyses of similar cases may indeed reveal this type of alternative presentation of preferred cleavage sites as a more general pattern.
Figure 6. Models of natural protein substrates bound to the active site of KLK8. The molecular surface of KLK8 is shown with red patches of acidic residues and green for hydrophobic ones. (A) Human neuroserpin, a serine protease inhibitor from the hippocampus, is cleaved at Arg38 as P1 residue, which is located in the last turn of an α-helix, as observed in the crystal structure (PDB 3FGQ). Apparently, Arg36, Leu37 and Ala39 are suitable P3, P2 and P1′ residues in accordance with our profiling results. Proper binding and turnover by KLK8 requires a transition of the helical turn into a more linear, extended conformation. (B) The synaptic regulator neuregulin 1 (NRG1) possesses only P1-Arg79 and P2-Asn78 as KLK8 specific substrate residues at its cleavage site. According to a structure-based homology model, created by SWISS-MODEL 79 , they belong to a β-strand, while the Leu77 side-chain might occupy the S4 instead of the S3 subsite. However, an adjacent β-strand contains Lys69, which is located close to Glu149 and Asp218 and might substitute for the preferred basic P3 side-chain. (C) Similarly, a model from SWISS-MODEL of the neuronal KLK8 substrate ephrin type-B receptor 2 (EphB2) exhibits an arrangement of two strands, with P1-Arg518 as only specific residue for cleavage by KLK8. Interestingly, Arg509 and Arg511 might be ideally positioned for electrostatic interaction with Glu149 and Asp218.
Concerning the kinetic parameters, the glycan-free E. coli expressed KLK8 can be compared with an identical, refolded KLK8 construct, which cleaved Boc-VPR-AMC with a catalytic efficiency of about 23 300 M −1 s −1 , which is nearly identical to our result (Table 1) 22 . Similarly, KLK8 from Sf9 insect cells and yeast expression (Pichia pastoris) had a catalytic efficiency of 21 000 M −1 s −1 for Boc-VPR-AMC and 20 000 M −1 s −1 for the substrate H-D-IPR-pNA, respectively 7,50 . Since the KLK8 variants from eukaryotic expression are most likely glycosylated at Asn95 in the 99-loop, it seems that this modification does not influence the basic enzymatic activity, in contrast to the corresponding glycan of the KLK2 99-loop 51 . Nevertheless, the glycan carrying 99-loop, as it was observed in the mouse Klk8 structure (PDB code 1NPM) may contribute to resistance against proteolysis, protection of Cys93 against oxidation, or conformational stability. In this latter context, it may favor a distinct conformation as proposed by the conformational selection model for KLKs. In particular, the glycan at Asn95 might regulate the concerted conformational rearrangements of loops surrounding the active site and concomitantly enhance the turnover of larger polypeptides 51,52 .
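The catalytic efficiencies compared above are ratios of kcat to KM. As a hedged illustration of how such a ratio can be extracted from initial rates, the sketch below fits noise-free synthetic data with a Hanes-Woolf linearization ([S]/v plotted against [S]). The enzyme concentration of 175 nM and the substrate range of 50 to 1000 µM follow the pNA assay described in the Methods, whereas the kcat and KM values are hypothetical, and the linearization is chosen here only to keep the example free of external dependencies; it is not necessarily the fitting procedure behind Table 1.

```python
def michaelis_menten_rate(substrate_M, enzyme_M, kcat_per_s, km_M):
    """Initial rate v = kcat [E] [S] / (KM + [S]) in M/s."""
    return kcat_per_s * enzyme_M * substrate_M / (km_M + substrate_M)


def fit_hanes_woolf(substrates_M, rates_M_per_s):
    """Least-squares line through ([S], [S]/v); returns (Vmax, KM).

    Hanes-Woolf form: [S]/v = [S]/Vmax + KM/Vmax,
    so slope = 1/Vmax and intercept = KM/Vmax.
    """
    ys = [s / v for s, v in zip(substrates_M, rates_M_per_s)]
    n = len(substrates_M)
    mean_x = sum(substrates_M) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrates_M)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrates_M, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    return vmax, intercept * vmax


ENZYME_M = 175e-9                                      # wild-type concentration from the Methods
SUBSTRATES_M = [50e-6, 100e-6, 250e-6, 500e-6, 1000e-6]
TRUE_KCAT = 1.0                                        # s^-1, hypothetical
TRUE_KM = 200e-6                                       # M, hypothetical

rates = [michaelis_menten_rate(s, ENZYME_M, TRUE_KCAT, TRUE_KM) for s in SUBSTRATES_M]
vmax, km = fit_hanes_woolf(SUBSTRATES_M, rates)
kcat = vmax / ENZYME_M
print(f"kcat = {kcat:.3f} s^-1, KM = {km * 1e6:.1f} µM, kcat/KM = {kcat / km:.0f} M^-1 s^-1")
```

With real, noisy data a direct nonlinear fit of the Michaelis-Menten equation is generally preferable, since linearizations distort the error structure.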
In the synaptic cleft, the Ca2+ concentration is about 0.7 to 2 mM, which overlaps with the reported stimulatory range for KLK8 of 100 µM up to 10 mM 53 . Similar to the prototypic serine protease trypsin, the activity increases up to 4-fold for small synthetic substrates by Ca2+ stimulation 7 . However, some coagulation factors, such as FXa, exhibit more than 30-fold Ca2+ stimulation of their enzymatic activity against chromogenic substrates 54 . Regarding the mechanism of the stimulation, molecular dynamics (MD) simulations showed that some parts of the KLK8 molecule tend to rearrange in the presence and absence of Ca2+ at the 75-loop (Fig. 7A). Ca2+ binding in the 75-loop results in significant conformational changes in the 37-, 61- and mainly the 99-loop (represented as root mean square deviations, RMSD), whereas smaller rearrangements occur in the Ca2+-free state. Intriguingly, MD calculations with the core glycan GlcNAc2Man3 linked to Asn95 resulted in a wide open S2 pocket, a conformation that seems to be stabilized by the glycan (Fig. 7B). This observation might explain the stronger stimulatory effect of Ca2+ on the activity of recombinant KLK8 from insect cells, which produce glycosylated proteins 7 . Apparently, the allosteric effect is transmitted upon Ca2+ binding from the 75-loop via the neighboring 37- and 61-loops to the 99-loop. A similar allostery was seen in the comparison of a Ca2+-bound and -free factor IXa triple mutant, with a long-ranging conformational rearrangement from the 75-loop to the center of the active site, which was termed "communication line", involving the 148-loop, the activating N-terminal salt bridge and the S1 pocket 55 . Without Ca2+ the 75-loop reorients itself and covers parts of the prime side region from S6′ on, which would interfere with the binding of polypeptide substrates (Fig. 7C).
Concomitantly, the 99-loop closes the upper part of the S2 subsite with Val96 and His99 in a way that would hamper substrate binding on the non-prime side, which further corroborates the importance of this alternative communication line in KLK8. Such a blocked S2 pocket, involving Lys98 and Tyr99, has been observed for factor IXa in complex with the inhibitor benzamidine in the S1 pocket, albeit with Ca2+ bound in the 75-loop 56 . The closed or "locked" conformation of the fIX 99-loop is based on the interaction of Tyr177 with asparagines 97 and 100, which is physiologically released by the major stimulatory factor VIIIa 57 . In contrast to factor IXa, KLK8 lacks the specific residues that lock the closed conformation and seems to shift more easily between the open and closed conformations. By contrast, mKlk8 adopts a relatively open conformation even in the absence of inhibitors or Ca2+. Furthermore, it has been demonstrated that Ca2+-binding to the 75-loop of factor IXa is allosterically linked to the Na+-binding/225-loop 58 . The corresponding 220-loop of most KLKs cannot bind Na+, due to the presence of cis-Pro219. A molecular dynamics study based on free and inhibitor bound KLK4 crystal structures described an allosteric interplay of loops surrounding the active site, in particular the 37-, 75-, and 220-loops, although it involves an inhibitory cation site at Glu77 and His25 59 . However, a comparative analysis of many trypsin-like serine proteases, including all known KLKs, suggested that the 99-, 148-, and 220-loop, surrounding the catalytic triad, open and close in a concerted manner, according to the conformational selection mechanism 52 . Thus, it would not be surprising if all loops around the substrate binding cleft were connected in an allosteric network, with different characteristics in individual serine proteases.
In line with this general model, Zn 2+ was identified as second cationic modulator of KLK8 activity, which may fix or "lock" the closed 99-loop conformation (Fig. 7D). While its physiological role at synapses is less clear than the one of Ca 2+ , its extracellular synaptic concentration is around 1 µM, but may reach much higher concentrations during zincergic signaling in the hippocampus 60 . Moreover, Zn 2+ is a regulator of KLK5, 7, and 14 activity in skin, besides the LEKTI or SPINK inhibitors, which do not target KLK8 23,61 . Thus, Ca 2+ and Zn 2+ appear to be antagonistic regulators of KLK8 activity in brain and skin, especially during wound healing that is usually accompanied by an increase of the Ca 2+ concentration 62 . By contrast, in a mouse model of spinal cord injury, mKlk8 exhibited increased expression, which was confirmed in corresponding human cases 63 . In a multiple sclerosis mouse model upregulation of mKlk8 and mKlk6 was observed in brain encephalitis and in the spinal cord during disease development to axon degeneration. Both conditions are serious medical problems, which await proper treatment, perhaps by employing KLK8 as target for pharmaceuticals 64 .

Materials and Methods
Expression and purification of KLK8. KLK8 cDNA was obtained from ovarian tumor tissue mRNA and cloned into the pQE30 vector at BamHI/HindIII restriction sites (Qiagen). The KLK8 constructs are coding for the mature protease domain with an N-terminal Met-Arg-Gly-Ser-His6-tag-Gly-Ser sequence followed by an enterokinase (EK) cleavage site (Asp4-Lys↓Ile-Ile). Site-directed PCR mutagenesis was performed with Pfu Turbo (Stratagene) and template digestion with DpnI (New England Biolabs). Wild-type KLK8 and the variants D70K, Y94F, H99A, and C93S were obtained from E. coli M15[pREP4] cells (Qiagen) as inclusion bodies. After treatment with denaturing lysis buffer (6 M guanidinium-HCl, 100 mM NaH2PO4, 10 mM Tris-HCl, pH 8.0) and removal of cell debris, KLK8 was purified with nickel-nitrilotriacetic acid Sepharose chromatography (Qiagen). Column washing was done by stepwise reduction of the pH from 8.0 to 5.0, followed by KLK8 elution with 8 M urea, 100 mM NaH2PO4, 10 mM Tris-HCl, pH 4.0. The solution was titrated with NaOH to pH 8.0 and incubated with 10 mM DTT overnight at 25 °C and dialyzed against a 100-fold volume of 4 M urea, 50 mM Tris-HCl, pH 8.0, 100 mM NaCl, and 0.005% Tween-20 for 12 h at 4 °C. Refolding of KLK8 was done by dropwise dilution in 2 M urea, 50 mM Tris-HCl, pH 8.0, 100 mM NaCl, 5 mM reduced glutathione, 0.5 mM oxidized glutathione, 0.002% NaN3, 0.005% Tween-20 in the 100-fold sample volume at 4 °C for 30 h. Afterwards, the KLK8 solution was concentrated with 10 kDa cutoff double cassette membranes. The latter procedure was repeated with 1 M urea. Final concentration in 150 mM NaCl, 50 mM Tris-HCl, pH 8.0, 0.005% Tween-20 (storage buffer) was done with microconcentrators (Vivaspin, 10 kDa cutoff, Sartorius). The N-terminal His6-tag was cleaved from KLK8 with EK (Sigma), of which 1 U yielded 25 μg/ml of KLK8 to 98% in 12 h at 4 °C in storage buffer, followed by treatment with the EK antibody capture kit (Sigma) for 60 min.
Active KLK8 was purified with benzamidine-Sepharose affinity chromatography (Amersham Pharmacia), washed with storage buffer, containing 300 mM NaCl, and eluted with this buffer containing 20 mM p-amino-benzamidine. Eventually, size exclusion chromatography on Superdex 2000 (16/26) was performed with storage buffer. The identity of mature KLK8 was confirmed by SDS-PAGE, mass spectrometry and N-terminal sequencing.
Positional scanning with a synthetic combinatorial peptide library. The positional scanning procedure has been previously described in detail 52 .
Proteomic identification of cleavage sites. The general procedure has been described previously for KLK2, while a more detailed description on enhanced PICS was recently published 26,51 . The peptide library was generated by proteolysis of E. coli proteins by GluC. KLK8 samples were incubated with the library with a 1:300 ratio in 50 mM Tris (pH 7.5), 100 mM NaCl at 37 °C for 3 h. After the reaction, isotope labeling of protease-treated and control samples was performed, followed by liquid chromatography-tandem mass spectrometry (Q-Exactive plus MS with an Easy nanoLC 1000, Thermo Scientific). The spectrum to sequence assignment was done with X! Tandem (Version 2013.09.01) with E. coli strain K12 as reference proteome. Semi-specific peptides with an increase >8-fold in the KLK8 samples were accepted as cleavage products. Prime side and non-prime side sequences were assigned according to the reference database and the corresponding protease specificity presented as heat-maps employing Web-PICS 65 .
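The acceptance criterion for cleavage products, an increase of more than 8-fold in the KLK8-treated sample, can be sketched in a few lines of Python. The sketch is illustrative only: the tuple layout, the function name, the peptide sequences and the intensities are hypothetical, and the handling of peptides that are absent from the control is an assumption rather than a documented step of the PICS workflow.

```python
FOLD_CHANGE_CUTOFF = 8.0  # from the text: increase >8-fold in the KLK8 samples


def accepted_cleavage_products(peptides, cutoff=FOLD_CHANGE_CUTOFF):
    """Return the sequences whose treated/control intensity ratio exceeds the cutoff.

    `peptides` is an iterable of (sequence, treated_intensity, control_intensity).
    """
    accepted = []
    for sequence, treated, control in peptides:
        if control <= 0:
            # Assumption: a peptide absent from the control counts as
            # infinitely enriched if it is detected in the treated sample.
            ratio = float("inf") if treated > 0 else 0.0
        else:
            ratio = treated / control
        if ratio > cutoff:
            accepted.append(sequence)
    return accepted


# Hypothetical peptides and intensities (not measured data).
peptides = [
    ("SGIWTE", 9.0e6, 1.0e6),   # 9-fold: accepted
    ("MAVLKE", 4.0e6, 1.0e6),   # 4-fold: rejected
    ("AYHDLE", 2.0e5, 0.0),     # absent from control: accepted by assumption
    ("TKLVDE", 8.0e6, 1.0e6),   # exactly 8-fold: rejected, cutoff is strict
]

print(accepted_cleavage_products(peptides))
```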
Enzyme kinetic measurements. Prior to measurements with fluorogenic substrates, active site titration of mature KLK8 was done with p-NPGB (BACHEM). The final enzyme concentration was 120 nM and that of p-NPGB was 100 µM in 150 mM NaCl, 50 mM Tris-HCl, pH 8.0, 0.005% Tween-20, 1% DMSO at 25 °C. The release of p-nitrophenol was monitored at 410 nm, which allowed calculation of the molarity of active KLK8 66 . Enzymatic activity of KLK8 using AMC substrates (BACHEM) was measured in 150 mM NaCl, 50 mM Tris-HCl, pH 7.5, 0.005% Tween-20, 1% DMSO at 25 °C on a Perkin-Elmer LS50B spectrofluorimeter at excitation and emission wavelengths of 380 nm and 460 nm, respectively. According to the results of the specificity profiling an ideal substrate was synthesized with the formula Ac-Thr-Lys-Leu-Arg-ACC 27 . Substrate final concentrations were 40 µM and the KLK8 concentration employed was 60 nM. Ca2+ and Zn2+ were added in the range of 1 to 1000 µM and 0.5 to 500 µM, respectively.
In measurements with pNA substrates, the following KLK8 final concentrations were employed: 175 nM wt, 250 nM D70K, and 320 nM H99A in the assay buffer with 50 mM Tris-HCl pH 7.5, 100 mM NaCl, 0.005% (v/v) Tween-20, 0.02% NaN3, and 5% (v/v) DMSO. For determination of kcat and KM the concentration of Bz-Pro-Phe-Arg-pNA (BACHEM) ranged from 50 to 1000 µM, while 250 µM was used in the Zn2+ inhibition and Ca2+ activation experiments at 37 °C. The release of pNA was monitored by the absorption at 405 nm every 20 seconds for 5 minutes. The fraction of active KLK8 was determined by active site titration with BPTI (82% active) that had been titrated with trypsin, which was 48% active according to the p-NPGB burst titration 67 .
Crystallization, data collection and processing. Using the sitting-drop vapor diffusion method, KLK8 crystals were grown at 18 °C from drops containing 1 μl protein solution (8 mg/ml), 10 mM leupeptin and 1 μl precipitant (100 mM tri-Na citrate, pH 5.6, 35% (v/v) tert-Bu-OH), equilibrated against 500 μl of precipitant solution. KLK8-Ca crystals grew in the orthorhombic space group P212121 with one molecule per asymmetric unit (asu), whereas KLK8-leup crystals were obtained by 2 min soaking with 100 µM ZnCl2 and belonged to the monoclinic space group P21 with four molecules per asu (Table 2). Data for KLK8-Ca were collected at a wavelength of 0.97469 Å at the EMBL beamline X12 (DESY, Hamburg) and for KLK8-leup at a wavelength of 1.00000 Å at the beamline PX II (SLS, Switzerland). Data were indexed and integrated in XDS and scaled with SCALA and AIMLESS 68,69 . Two KLK8-Ca data sets were merged in order to compensate for the strong anisotropy of the 2.0 Å resolution data set. A data cutoff at 2.3 Å was suggested by the CC(1/2) values from AIMLESS. Analysis of KLK8-leup data in CTRUNCATE and XTRIAGE showed a pseudomerohedral twin fraction of 0.24, which was taken into account in the subsequent steps of refinement 70,71 .
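CC(1/2), which guided the resolution cutoff, is the Pearson correlation between the mean intensities of two random half data sets, evaluated per resolution shell. The following minimal sketch shows only the correlation itself for one shell; it is not the AIMLESS implementation, and the function name and input layout are assumptions.

```python
import math


def pearson_cc(half1, half2):
    """Pearson correlation between paired half-data-set intensities.

    half1, half2 : equally long sequences holding, for each unique
                   reflection in one resolution shell, the mean intensity
                   from each random half of the observations.
    """
    n = len(half1)
    if n != len(half2) or n < 2:
        raise ValueError("need two equally long lists with >= 2 entries")
    mean1 = sum(half1) / n
    mean2 = sum(half2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(half1, half2))
    var1 = sum((a - mean1) ** 2 for a in half1)
    var2 = sum((b - mean2) ** 2 for b in half2)
    if var1 == 0 or var2 == 0:
        raise ValueError("intensities in a half set are all identical")
    return cov / math.sqrt(var1 * var2)
```

Applied shell by shell, the values typically fall from near 1 at low resolution towards 0 where the data contain mostly noise, which is the basis for choosing a cutoff.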
Molecular replacement, model building, and refinement. For the KLK8-Ca data in space group P212121, a molecular replacement search was performed with PHASER in the automated search mode using the mouse Klk8 model with 73% identical residues (PDB code 1NPM) 72 . The best solution with one molecule per asu exhibited Z values of 17.0 for the rotation function (RFZ) and 29.3 for the translation function (TFZ), with a log-likelihood gain (LLG) of +970 and an R-factor after rigid body refinement of 52.2%. After inspection of proper molecular packing, the solution was evaluated with a total omit map from SFCHECK, which had a figure-of-merit (FOM) of 80.3% 73 . In the case of the KLK8 data in space group P21, the refined KLK8-Ca polypeptide model was employed in PHASER auto mode, resulting in an LLG of +6792 with Z values of RFZ = 18.5 and TFZ = 21.8, and an R-factor of 44.0%. Model building for KLK8-Ca was done iteratively with COOT and refinement with PHENIX, resulting in final Rcryst and Rfree values of 22.4% and 26.6%, respectively, for a maximum resolution of 2.30 Å (Table 2) 71 . The Ca2+ site was fully occupied, while no significant electron density was observed for leupeptin. Due to the high anisotropy, real space refinement was required to reduce RSRZ outliers to an acceptable 8.8%. Asn and Gln side chain orientations were corrected according to NQ-Flipper V2.7 74 . The model of KLK8-leup was refined similarly, requiring the pseudomerohedral twin operator (h, -k, -l) and resulting in Rcryst and Rfree values of 21.2% and 25.9%, respectively, at a resolution of 2.1 Å (Table 2). An anomalous Fourier map showed no distinct peaks corresponding to potential Zn2+ sites, and density correlation and real space R-factors from OVERLAPMAP in CCP4i likewise did not confirm transition metal ions. Overall, the main chain of all polypeptides is well defined, except for the loop around Arg148 and two C-terminal residues.
Also, the 75-loops of KLK8 copies C and D are not well defined, while leupeptin shows some variations in the individual Leu side chain positions and the acetyl group. The Ramachandran plots show 97.3% (KLK8-Ca) and 94.9% (KLK8-leup) of residues in the most favored regions and 2.7% and 5.1% in additionally allowed regions, while no outliers were observed. All figures were created with PyMOL v1.7rc1, and electrostatic potential phi maps were calculated with the PDB2PQR Server and APBS tool 75 . Root mean square deviations (RMSD) were calculated with the program SUPERPOSE 76 .
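For orientation, the RMSD over equivalent atoms reduces to the square root of the mean squared distance between paired positions once the two structures have been superposed. The sketch below computes only this final quantity for already superposed coordinates; it does not perform the optimal superposition that SUPERPOSE carries out, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same unit as the input, e.g. Å) between paired atoms.

    coords_a, coords_b : equally long sequences of (x, y, z) tuples for
                         equivalent atoms, already superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally long coordinate lists")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

The pairing of equivalent atoms has to be decided beforehand, for instance from a sequence alignment; that step is outside this sketch.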

Molecular dynamics in solution and docking calculations.
To perform the molecular dynamics calculations in an aqueous solution environment, model coordinates were first titrated at pH 7.0 using the Protonate3D function of MOE 2015.10 (www.chemcomp.com) 77 . The resulting structure was solvated in a cubic box of water molecules with specified edges (83.6 Å × 83.6 Å × 83.6 Å) centered on KLK8 using periodic boundary conditions. Cl− counter ions were added to maintain overall neutrality. Afterwards, a series of equilibration steps was performed by molecular dynamics annealing runs, without using a barostat, for 100 ps at temperatures of 50 K, 150 K, 200 K, 250 K and 298.15 K. The production run was performed for 500 ps at 298.15 K. The molecular dynamics calculations were carried out using the AMBER99 force field as implemented in NWChem 6.6 78 . During warm-up, 10000 replicas were collected at each temperature, and 5 × 100000 replicas during each production cycle. In each case the time step was 0.001 ps. After each warm-up cycle and each production cycle a PDB structure was generated and used for further analysis. The single replicas were not compared with each other. Unbound molecular docking to KLK8 was similarly performed, using a model based on the PAR-1 fragment in complex with thrombin (PDB code 1LU9) and the PICS results. The coordinates of the best hit with acceptable binding of P1-Arg to the S1 pocket were further optimized with MOE.
Table 2. Data collection and refinement statistics. * Highest resolution shell in parentheses.
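As a consistency check on the molecular dynamics settings given above, the production length and the volume of the solvent box can be recomputed from the stated numbers. The sketch assumes that each "replica" in the production run corresponds to one integration step of 0.001 ps, which is an interpretation and not stated explicitly; all names are hypothetical.

```python
TIME_STEP_PS = 0.001                   # time step stated in the text
PRODUCTION_CYCLES = 5                  # "5 × 100000" in the text
STEPS_PER_PRODUCTION_CYCLE = 100_000   # assumed: one replica = one step
BOX_EDGE_ANGSTROM = 83.6               # cubic box edge from the text


def simulated_time_ps(n_steps, time_step_ps=TIME_STEP_PS):
    """Simulated time covered by n_steps integration steps."""
    return n_steps * time_step_ps


def production_time_ps():
    """Total production time over all cycles, in ps."""
    per_cycle = simulated_time_ps(STEPS_PER_PRODUCTION_CYCLE)
    return PRODUCTION_CYCLES * per_cycle


def box_volume_cubic_angstrom(edge=BOX_EDGE_ANGSTROM):
    """Volume of the cubic periodic box in cubic ångström."""
    return edge ** 3
```

Under this assumption the five production cycles add up to the 500 ps quoted for the production run.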