The sequence at Spike S1/S2 site enables cleavage by furin and phospho-regulation in SARS-CoV2 but not in SARS-CoV1 or MERS-CoV

The Spike protein of the novel coronavirus SARS-CoV2 contains an insertion 680SPRRAR↓SV687 forming a cleavage motif RxxR for furin-like enzymes at the boundary of the S1/S2 subunits. Cleavage at S1/S2 is important for efficient viral entry into target cells. The insertion is absent in other CoVs of the same clade, including SARS-CoV1 that caused the 2003 outbreak. However, an analogous cleavage motif is present at S1/S2 of the Spike protein of the more distant Middle East Respiratory Syndrome coronavirus MERS-CoV. We show that a crucial third arginine at the left middle position, forming the motif RRxR, is required for furin recognition in vitro, while the general motif RxxR shared with MERS-CoV is not sufficient for cleavage. Further, we describe a surprising finding that the two serines at the edges of the insert SPRRAR↓SV can be efficiently phosphorylated by proline-directed and basophilic protein kinases. Both phosphorylations switch off furin's ability to cleave the site. Although phospho-regulation of secreted proteins is still poorly understood, further studies, supported by a recent report of ten in vivo phosphorylated sites in the Spike protein of SARS-CoV2, could potentially uncover important novel regulatory mechanisms for SARS-CoV2.


Scientific Reports | (2020) 10:16944 | https://doi.org/10.1038/s41598-020-74101-0 | www.nature.com/scientificreports/

Figure 1. (a) … 15 . Spike protein S1 (residue 1-685) is colored blue, Spike protein S2 (residue 686-1273) is colored brown and the intrinsically disordered S1/S2 proteolytic cleavage site is shown in red. The structure lacks the C-terminal residues 1148-1273. (b) Scheme of the constructs used to examine furin specificity. A 20-residue segment around the S1/S2 site was fused with a linker ELQGGGGG to the Streptococcal protein G B1 domain (GB1) and a C-terminal 6xHis tag. (c) Sequence alignment of the S1/S2 region in SARS-CoV, MERS-CoV, SARS-CoV2, and bat virus SARSr-CoV RaTG13, which is closely related to SARS-CoV2 16 . (d) Coomassie-stained SDS-PAGE gels showing the proteolytic cleavage of the GB1-fused reporter constructs. Furin activity towards the S1/S2 region of SARS-CoV2, MERS-CoV, and SARS-CoV1 was measured in vitro using the GB1 reporter constructs shown in 'b'. The MW of SARS-CoV2 GB1 reporter protein is 10.9 kDa, which is cleaved by furin to 9.2 kDa and 1.7 kDa fragments. Uncropped images are shown in Supplementary Fig. S1. (e) Quantified data from the furin cleavage assay. The plot shows the relative amount of cleaved product compared to amount of uncleaved substrate at t = 0 min. Error bars show standard deviation. (f) Comparison of initial velocities of furin cleavage of SARS-CoV2 and MERS-CoV S1/S2 reporter constructs. To determine the initial velocity of SARS-CoV2 S1/S2 cleavage, the reaction was stopped at early time points (40 s, 1 min 20 s and 2 min 40 s) and these were used for the linear regression.

SARS-CoV2 Spike at S1/S2 is required for viral entry into human lung cells 18 . However, the SARS-CoV2 spread also depends on cellular protease TMPRSS2 8,18 and the direct role of cellular furin has remained undefined. It is also not yet known if the novel sequence with the R-x-x-R motif inserted at the S1/S2 site enables furin specificity and efficient furin-dependent cleavage. Furthermore, it has been reported that the MERS-CoV Spike, although also harboring an RxxR motif, is not activated directly by cellular furin during viral entry 19 . The question of Spike activation is extremely important to solve, since the initial mechanistic trigger of the current SARS-CoV2 pandemic could depend on the Spike cleavage site sequence as its host protease specificity can determine the zoonotic potential and "host-jump" of coronaviruses 13,17 . Recent Cryo-EM structures of the SARS-CoV2 Spike protein revealed that the insertion with the furin-like cleavage site emerges as a disordered loop on the side of the protein 9,20 . Such intrinsically disordered regions (IDRs) are easily accessible for enzymes and other protein signaling modules, and often encode short linear motifs (SLiMs) that act as recognition sequences for such binding partners. Intriguingly, besides the RxxR motif, the sequence of the loop SPRRAR↓SV contains two serines that match the consensus of proline-directed kinases (SP), and basophilic protein kinases (RxxS), the two largest subfamilies of mammalian kinases 21,22 . However, the Spike protein, except for its short C-terminal tail, is considered to be facing the endoplasmic reticulum (ER) or the Golgi lumen during the viral replication cycle 23 and is not present in cytoplasm or the nucleus, where most of the kinase signaling is taking place. Nevertheless, despite still being a poorly studied field, it is well known that protein phosphorylation does not only occur on cytoplasmic and nuclear proteins, but also takes place on secreted proteins in ER and Golgi lumen, and also in the extracellular space [24][25][26][27] . Strikingly, supporting evidence for the idea of Spike phospho-regulation arises from a recent report that confirmed more than ten in vivo phosphorylated sites in the Spike protein of SARS-CoV2 28 . Furthermore, phosphorylation has been shown to regulate fibroblast growth factor 23 targeting by furin 29 , indicating that kinases can regulate the secreted proteins as well as furin targeting.
In this study, we analyzed the furin cleavage site specificity of the SARS-CoV2 Spike using biochemical assays with substrate constructs based on the S1/S2 cleavage site sequence. We discovered that a motif RRxR, with a crucial P3 arginine (positions counted towards N terminus from the cleavage site, P4-P3-P2-P1), was required for cleavage of the S1/S2 site substrate constructs in vitro, while the motif in common with MERS-CoV (RxxR) was not sufficient for furin cleavage. Further, we describe a surprising finding that the two serines flanking the insert SPRRAR↓SV can be phosphorylated in vitro by different protein kinases and both phosphorylations affect the ability of furin to cleave the site. Finally, we discuss the possible novel regulatory mechanisms that such interplay of three different and mutually inter-dependent enzymatic modifications within the insert may present for SARS-CoV2 function.
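The position numbering used above (P4-P3-P2-P1, counted towards the N terminus from the cleavage site) can be made concrete with a small sketch. The helper name `label_positions` is hypothetical, and the primed labels (P1', P2') for residues C-terminal of the scissile bond follow the common protease convention, which is an assumption beyond the text's own notation.

```python
def label_positions(site, arrow="↓"):
    """Label residues around a scissile bond (hypothetical helper).

    `site` is a one-letter sequence with a single arrow marking the
    cleavage position, e.g. "SPRRAR↓SV" as written in the text.
    """
    n_side, c_side = site.split(arrow)
    labels = {}
    # N-terminal side: the residue next to the bond is P1, counting outward.
    for offset, residue in enumerate(reversed(n_side), start=1):
        labels["P%d" % offset] = residue
    # C-terminal side: the residue next to the bond is P1' (assumed notation).
    for offset, residue in enumerate(c_side, start=1):
        labels["P%d'" % offset] = residue
    return labels


positions = label_positions("SPRRAR↓SV")
for name in ("P4", "P3", "P2", "P1", "P1'"):
    print(name, positions[name])
```

With this labeling, the arginine discussed in this study sits at P3, and the two flanking serines sit at P6 and P1'.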

Results
The furin cleavage consensus site is present in the S1/S2 site of Spike protein of SARS-CoV2 but not of MERS-CoV. As the furin cleavage site is in a disordered flexible loop at the side of the Spike protein 9,20 (Fig. 1a), we set out to analyze the amino acid sequence specificity of furin cleavage using purified protein fragments corresponding to the disordered region and containing the S1/S2 cleavage site. The constructs were designed based on amino acids 672 ASYQTQTNSPRRAR↓SVASQSI 692 of Spike, followed by a linker and a GB1-6xHis tag for purification (Fig. 1b). First, we created a set of substrate constructs based on the S1/S2 sequences of SARS-CoV2, MERS-CoV, and SARS-CoV1 (Fig. 1c). We followed cleavage of the constructs by a purified furin preparation using SDS-PAGE. Strikingly, we found that although the R-x-x-R motif at S1/S2 in MERS-CoV has been considered a furin cleavage site 13,14,30 , it was cleaved by furin at very low efficiency (Fig. 1d,e). In contrast, the novel coronavirus SARS-CoV2 site was cleaved very efficiently (Fig. 1d-f). As expected, the SARS-CoV1 motif lacking the furin site insertion showed no cleavage (Fig. 1d,e). As it is hard to determine the precise cleavage site using SDS-PAGE, we additionally performed a mass-spectrometric analysis of the cleaved fragments, which confirmed that the most abundant cleavage site was PRRAR↓SV in the SARS-CoV2 construct and PRSVR↓SV in the MERS-CoV construct (Supplementary Table S1). N-terminal peptides arising from cleavage at PRR↓ARSV and PR↓RARSV in SARS-CoV2, and at PR↓SVRSV in MERS-CoV, were detected with lower intensity, while the only forms of the C-terminal part of the cleaved fragment started with the ↓SV motif, as predicted (Supplementary Table S1).
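The initial-velocity comparison in Fig. 1f rests on a linear regression over early time points (40 s, 1 min 20 s and 2 min 40 s). A minimal sketch of such a fit is given below. The cleaved fractions are hypothetical placeholder values chosen only to make the example run; they are not data from this study, and the function name `initial_velocity` is likewise an assumption.

```python
def initial_velocity(times_s, fraction_cleaved):
    """Ordinary least-squares line through early time points.

    Returns (slope, intercept); the slope is the initial velocity in
    fraction of substrate cleaved per second.
    """
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(fraction_cleaved) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_s, fraction_cleaved))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


# Early time points named in the Fig. 1 legend, in seconds.
times = [40, 80, 160]
# HYPOTHETICAL cleaved fractions, for illustration only.
fractions = [0.10, 0.20, 0.40]

slope, intercept = initial_velocity(times, fractions)
print("initial velocity (fraction per second):", slope)
```

Restricting the fit to early points keeps the estimate within the range where product formation is still approximately linear in time.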
P3 arginine in R-R-x-R motif is necessary for efficient cleavage of S1/S2 site substrate constructs by furin. Next, we analyzed the furin cleavage of selected S1/S2 site mutants to better understand the specificity determinants of these sites. We used a recently described S1/S2 site mutant that lacks the four-residue insertion (fur/mut 9 ) (Fig. 2a) as a control to confirm the cleavage specificity in the in vitro furin assay. Deletion of the furin site in the fur/mut-GB1 reporter protein abolished its proteolytic cleavage (Fig. 2b). A patient-derived mutation R682Q in Spike (ZJU-1 31 ), which changes the furin core motif RxxR↓x to QxxR↓x, also completely inhibited furin activity towards the reporter protein (Fig. 2b). One difference between the SARS-CoV2 and MERS-CoV S1/S2 sites is the presence of an arginine 3 residues upstream of the cleavage site in the former (Fig. 2a). Interestingly, a mutation of this P3 residue to alanine (RRAR to RAAR) greatly decreased the furin cleavage rate compared to the wild-type SARS-CoV2 sequence (Fig. 2b,c). Introduction of arginine to the -3 position of MERS-CoV S1/S2 (RSVR to RRVR) considerably enhanced the cleavage rate by furin, resulting in only slightly lower cleavage efficiency compared to the wild-type SARS-CoV2 S1/S2 site (Fig. 2b,c). These findings highlight the presence of additional specificity determinants in the proteolytic cleavage sites on top of the commonly suggested furin cleavage consensus motif RxxR. Further, this result shows how a single amino acid substitution can change protease specificity, which could have impacts on tissue tropism and host range 32 . However, the overall structure of the region in and around the S1/S2 site, which is predicted to form a disordered loop in the case of SARS-CoV2, could also depend on the context of the native, full-length Spike protein. This would affect accessibility of the sites and would impact how the Spike protein is recognized and processed by enzymes.
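The distinction between the minimal RxxR motif and the RRxR motif can be sketched as a simple pattern check on the P4-P1 residues of the variants named above. This is an illustrative string test only, not a predictor of furin activity; the function name `classify_site` and the variant labels are assumptions introduced for the example.

```python
import re

# P4-P3-P2-P1 patterns; "." stands for any residue (the "x" in the text).
RRXR = re.compile(r"RR.R")
RXXR = re.compile(r"R..R")


def classify_site(p4_to_p1):
    """Classify a four-residue P4-P1 stretch by the motifs in the text."""
    if RRXR.fullmatch(p4_to_p1):
        return "RRxR"
    if RXXR.fullmatch(p4_to_p1):
        return "RxxR only"
    return "no RxxR"


# P4-P1 residues of the constructs described in this section.
variants = {
    "SARS-CoV2 wild type": "RRAR",
    "SARS-CoV2 R682Q": "QRAR",
    "SARS-CoV2 P3 R to A": "RAAR",
    "MERS-CoV wild type": "RSVR",
    "MERS-CoV with P3 R": "RRVR",
}

for name, site in variants.items():
    print(name, site, classify_site(site))
```

In this section, the two constructs classified as RRxR are the ones reported as efficiently cleaved, while the RxxR-only and QxxR constructs showed strongly reduced or no cleavage.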
The SARS-CoV2 S1/S2 cleavage motif could be phosphorylated by proline-directed and basophilic kinases. In addition to the furin cleavage site, the four amino acid insertion to the S1/S2 site of the SARS-CoV2 Spike protein also introduces potential phosphorylation sites that flank the core furin motif (Fig. 3a). Spike S680 matches the consensus of proline-directed kinases (SP) and S686 forms a consensus for basophilic kinases (RxxS), two large subfamilies of mammalian kinases 21,22 . Interestingly, the presence of potential phosphorylation sites can also be seen in the polybasic proteolytic cleavage sites of several other viral envelope proteins, including the ones of H5N1 and H5N8 influenza viruses (Fig. 3b). Importantly, while the phosphorylation of Spike proteins has not been studied thoroughly, several phosphorylated residues including both SP and RxxS sites have been identified in the SARS-CoV2 Spike protein by mass spectrometry 28 . This prompted us to test whether the S1/S2 reporter proteins could be phosphorylated in vitro. For this, we used the cyclin B-Cdk1 complex (CDK) as a representative for proline-directed kinases and the protein kinase A (PKA) catalytic subunit as a representative for basophilic kinases. To examine the phosphorylation of different S1/S2-GB1 reporter proteins, we stopped the in vitro phosphorylation reactions at specific time points and analyzed the phosphorylation efficiency using Phos-tag SDS-PAGE to separate the phosphorylated protein from the non-phosphorylated substrate. The SARS-CoV2 S1/S2-GB1 reporter was fully phosphorylated by both CDK and PKA by the 60-min time point (Fig. 3c, Supplementary Fig. S2). Mutation of the predicted phosphorylation sites, S680 for CDK and S686 for PKA, abolished or greatly reduced the phosphorylation (Fig. 3c). As expected, the S1/S2 segments of SARS-CoV1 and MERS-CoV were not phosphorylated by PKA (Fig. 3c). The SARS-CoV1 S1/S2 segment does not contain a consensus site for proline-directed kinases, and while the MERS-CoV segment contains two potential proline-directed TP sites, these sites lack a basic residue in the +3 position, an important specificity determinant for CDK 33 . Introduction of the +3R (denoted as P3 arginine for the furin site) that greatly increases the furin cleavage efficiency of MERS-CoV S1/S2 also enhances its phosphorylation by CDK (Fig. 3c). Thus, the MERS-CoV S1/S2 segment could still be phosphorylated by other proline-directed kinases. Mutations in the +2 and +3 basic residues of SARS-CoV2 S680, which were found to affect furin cleavage (Fig. 2b), also decrease the phosphorylation rate by CDK (Fig. 3c), whereas with PKA, mutation of the -3R from S686 abolishes the phosphorylation, while mutation of -4R to Q has little effect (Fig. 3c).

Phosphorylation inhibits furin cleavage of the S1/S2 site reporter proteins in vitro. Next, we analyzed how phosphorylation at these sites affects furin cleavage. For this, the S1/S2-GB1 reporter proteins were first phosphorylated by either CDK or PKA for 60 min, resulting in the phosphorylation of the majority of the substrate, followed by the addition of furin. Phosphorylation of either site adjacent to the core furin motif, S680 or S686, significantly inhibited the furin cleavage (Fig. 4a,b). To confirm that the effect of phosphorylation is connected to the specific site, we analyzed the cleavage of alanine mutants of the phosphorylation sites. Interestingly, alanine mutations of these sites decreased the cleavage rate, although not to the same extent as phosphorylation, and the addition of kinase to the S680A and S686A substrates did not affect their cleavage further (Fig. 4a,c). Next, we tested the effect of mutation of the phosphorylation sites to aspartic acid, often used to mimic phosphorylation. The S680D mutation decreased furin cleavage efficiency to the same extent as phosphorylation. However, S686D had a smaller effect on furin activity, similar to the S686A mutation (Fig. 4a). Thus, the P+1 site in the context of the SARS-CoV2 Spike protein requires the serine residue, and not only phosphorylation but also substitution with amino acids whose side chains differ from serine can affect the cleavage, revealing a particular feature of the SLiM R-x-x-R in the context of the SARS-CoV2 Spike protein. Importantly, in a wider perspective, these results show that mutations and post-translational modifications outside the core RxxR furin motif significantly affect the cleavage.
A similar phospho-inhibitory effect was seen with MERS-CoV(RRVR)-GB1 (Fig. 4d,e), although the effect was less prominent, presumably due to less efficient phosphorylation of this protein (Fig. 3c), resulting in incomplete phosphorylation prior to furin addition. Taken together, these data reveal that phosphorylation adjacent to the furin motif can switch off the cleavage site.
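The kinase consensus matches discussed in the Results (SP for proline-directed kinases, RxxS for basophilic kinases) can be located in the SARS-CoV2 reporter segment with a short scan. The segment and its numbering (starting at residue 672) are taken from the construct description; the helper name `find_sites` is hypothetical, and a consensus match is only a sequence-level hint, not evidence of phosphorylation.

```python
import re

# S1/S2 segment used in the SARS-CoV2 reporter construct (residues 672-692).
SEGMENT = "ASYQTQTNSPRRARSVASQSI"
FIRST_RESIDUE = 672


def find_sites(pattern, serine_offset):
    """Return Spike residue numbers of serines matching a consensus.

    `pattern` is a regular expression for the consensus and
    `serine_offset` is the index of the serine within the match.
    A lookahead is used so that overlapping matches are not skipped.
    """
    lookahead = re.compile("(?=%s)" % pattern)
    return [FIRST_RESIDUE + m.start() + serine_offset
            for m in lookahead.finditer(SEGMENT)]


sp_sites = find_sites("SP", 0)      # proline-directed consensus (SP)
rxxs_sites = find_sites("R..S", 3)  # basophilic consensus (RxxS)

print("SP serines:", sp_sites)
print("RxxS serines:", rxxs_sites)
```

On this segment the scan returns only S680 for SP and only S686 for RxxS, the two serines flanking the core furin motif.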

Discussion
Our results confirm that a four amino acid insertion in the S1/S2 site of the SARS-CoV2 Spike protein introduces a furin cleavage site. However, the P3 arginine, which is not present in the analogous site in MERS-CoV, is crucial for furin-dependent in vitro cleavage of a substrate construct carrying the sequence of the S1/S2 site. While some previous reports have noted furin-mediated activation of MERS-CoV Spike 14,18 , others have argued against this, citing an off-target effect of the furin inhibitor dec-RVKR-CMK 19 . Our finding suggests that SARS-CoV2 may have acquired a true furin cleavage specificity, in contrast to the coronaviruses causing the two previous outbreaks. In addition, the discovered possibility of tight phospho-regulation at two serines creates an interesting complexity where a disordered loop carrying a SLiM facilitates specific regulatory inputs for three different modifying enzymes. The in vivo functionality of this SLiM could emerge as one of the key functional elements in the SARS-CoV2 Spike protein, given that proteolytic processing of viral glycoproteins has been found to be a key virulence factor. Indeed, previous studies have shown that highly pathogenic influenza virus forms harbor optimal polybasic furin processing sites, whereas forms with low pathogenicity have monobasic cleavage sites [34][35][36] . The glycoprotein cleavage specificity also determines tissue tropism, as furin is ubiquitously expressed, whereas the proteases that process monobasic cleavage sites, like TMPRSS2, are expressed mainly in the aerodigestive tract 37,38 .
The motif RxxR is often referred to as the minimal furin cleavage site, whereas RxK/RR forms an optimal motif that is cleaved efficiently 39,40 . More recent reports, however, have defined a core motif of 8 amino acids (6 residues N-terminal and 2 residues C-terminal of the cleavage site) 41 . The work presented here also suggests that the furin motif is more defined and longer, as we find that RRxR is necessary for efficient cleavage of the tested S1/S2 sites by furin and that mutations flanking this core also affect the cleavage efficiency. Interestingly, the glycoproteins of highly pathogenic Ebolaviruses have been found to contain an optimal furin cleavage site, while the glycoprotein of the closely related Reston virus has a non-optimal cleavage site 42,43 . However, the requirement for P3 arginine for furin specificity discovered in this study suggests that different cellular proteases may prefer different variations of cleavage site sequences. One hypothesis would be that furin may present an activity and specificity common to different hosts and thus a window for zoonotic transfer.
Importantly, although not sufficiently studied, it is known that besides the cyto- and nucleoplasm, protein phosphorylation also occurs in the extracellular space and in the lumens of the ER and Golgi 24,25,44 . For example, one of the first discovered phosphoproteins was casein, a true secreted protein 45 . Recently, an analysis of the saliva phospho-proteome discovered close to a hundred phospho-proteins 46 . Thus, despite being a secreted protein, the Spike has the potential to be regulated by protein kinases. Moreover, a recent report presented in vivo evidence of more than ten phosphorylated sites in the Spike protein of SARS-CoV2 28 . The sites S680 and S686 were not detected in this analysis, presumably because of a lack of coverage of this region in the mass-spectrometry analysis. Further studies are required to understand if phosphorylation of the Spike protein has a physiological role and if specific kinases are involved.
In conclusion, the described short linear motif acting as a triple regulatory module is quite unique, also in a general signaling context. Further studies are required to establish its exact role in SARS-CoV2 and also as a modular regulatory element in vivo.

Methods
Protein purification. Constructs containing a 20 amino acid region from the S1/S2 linker were fused via a linker with the sequence ELQGGGGG to the GB1 domain (immunoglobulin-binding domain of streptococcal protein G) and a C-terminal 6xHis tag. The pET28a vectors were transformed to E. coli BL21 cells and protein expression was induced at 37 °C by addition of 1 mM IPTG. The His-tagged proteins were purified by cobalt affinity chromatography using Chelating Sepharose (GE Healthcare). The proteins were eluted in buffer containing 25 mM Hepes-KOH (pH 7.4), 300 mM NaCl, 10% glycerol, 200 mM imidazole.

Phosphorylation assay. The phosphorylation of the S1/S2 linker constructs was examined in vitro using purified cyclin B-Cdk1 (Millipore 14-450) and PKA (murine cAMP-dependent protein kinase), purified as described in 47 . The phosphorylation reactions were carried out in buffer consisting of 50 mM Hepes-KOH (pH 7.4), 150 mM NaCl, 5 mM MgCl2, 50 mM imidazole, 2.5% glycerol, 0.2 mg/ml bovine serum albumin, 0.15 mM EGTA, 1 mM β-mercaptoethanol, and 500 μM ATP. The kinase reactions were performed in 30 µl containing 2.5 µg of S1/S2-GB1-6xHis substrate (7.6 µM). The concentration of PKA was 375 nM and cyclin B-Cdk1 5 nM. The reactions were carried out at room temperature and were stopped at 10 s (0 min), 5 min, 20 min, and 60 min by pipetting 5.5 µl of the reaction mixture to 8 µl of 2× Laemmli SDS-PAGE sample buffer containing 1 mM MnCl2. The stopped samples were incubated at 72 °C for 5 min and were loaded on Phos-tag SDS-PAGE gels containing 12.5% acrylamide, 100 µM Phos-tag, 200 µM MnCl2. The electrophoresis was carried out at 15 mA until the bromophenol blue dye front reached the bottom of the gel. Following electrophoresis, the gels were soaked in fixation solution (10% acetic acid, 30% ethanol aqueous solution) with gentle agitation for 15 min. The proteins were stained with colloidal Coomassie Blue G-250 48 .
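The stated substrate concentration can be checked with a short unit conversion. The sketch below assumes the 10.9 kDa molecular weight given for the SARS-CoV2 GB1 reporter in the Fig. 1 legend; the function name `micromolar` is hypothetical.

```python
def micromolar(mass_ug, volume_ul, mw_kda):
    """Convert a protein mass in a given volume to micromolar."""
    moles = (mass_ug * 1e-6) / (mw_kda * 1e3)  # g divided by g/mol
    litres = volume_ul * 1e-6
    return moles / litres * 1e6                # mol/l converted to µM


# 2.5 µg of substrate in a 30 µl reaction; 10.9 kDa is the MW quoted
# for the SARS-CoV2 reporter (assumed to apply to this calculation).
substrate_um = micromolar(2.5, 30, 10.9)
print("substrate concentration (µM):", round(substrate_um, 1))

# Approximate molar excess of substrate over each kinase.
print("substrate per PKA (375 nM):", substrate_um * 1000 / 375)
print("substrate per cyclin B-Cdk1 (5 nM):", substrate_um * 1000 / 5)
```

The result rounds to 7.6 µM, in agreement with the value given in the assay description.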
The stopped reactions were heated at 72 °C for 5 min and loaded to 15% acrylamide SDS-PAGE. Following electrophoresis, the gels were immersed in fixation solution for 15 min and stained with colloidal Coomassie Blue G-250.
Mass-spectrometric analysis of cleavage sites. Furin cleavage reactions were set up as described above, except that BSA was not added to the reactions. The reaction mixtures were incubated for 60 min at room temperature and were stopped by adding EGTA to 10 mM concentration to inhibit furin. Reactions where 10 mM EGTA was present at the time of furin addition were carried out for controls. The peptides arising from furin cleavage were purified using C18 StageTips. The peptides were separated using Agilent 1200 series nanoflow system (Agilent Technologies) and sprayed into an LTQ Orbitrap mass spectrometer (Thermo Electron) with a nanoelectrospray ion source (Proxeon). The data was analyzed using MaxQuant 49 .

Data availability
All data generated or analyzed during this study are included in this published article (and its Supplementary Information files).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.