A cysteine selenosulfide redox switch for protein chemical synthesis

The control of cysteine reactivity is of paramount importance for the synthesis of proteins using the native chemical ligation (NCL) reaction. We report that this goal can be achieved in a traceless manner during ligation by appending a simple N-selenoethyl group to cysteine. While the cleavage of carbon-nitrogen bonds is notoriously difficult in synthetic organic chemistry, we show that N-selenoethyl cysteine (SetCys) loses its selenoethyl arm in water under mild conditions upon reduction of its selenosulfide bond. Detailed mechanistic investigations show that the cleavage of the selenoethyl arm proceeds through an anionic mechanism with assistance from the cysteine thiol group. The implementation of the SetCys unit in a process enabling the modular and straightforward assembly of linear or backbone-cyclized polypeptides is illustrated by the synthesis of biologically active cyclic hepatocyte growth factor variants.


Introduction
In recent years, the study of protein function has made tremendous advances thanks to the development of chemical synthetic tools and strategies for producing peptides and proteins. The vast majority of proteins obtained this way are assembled using native chemical ligation (NCL 1, Figure 1a) or derived methods. 2,3,4,5 NCL involves the reaction of a peptide thioester with a Cys peptide to produce a native peptide bond to Cys. The synthesis of complex protein scaffolds requires controlling the reactivity of Cys at some point in order to direct the order in which the peptide bonds connecting the various peptide segments are formed (Figure 1a). 6 Therefore, designing new strategies for modulating Cys reactivity is a contemporary concern and stimulates the creativity of protein and organic chemists worldwide. 7,8,9,10,11 One hallmark of the Cys residue is its involvement in the formation of disulfide or selenosulfide bonds (Figure 1b), 12 which often play a critical role in protein folding. Nature also exploits the redox properties of Cys thiols to control the activity of some enzymes featuring a Cys residue at their catalytic site. 13 Indeed, the conversion of a catalytic Cys thiol into a disulfide is a powerful means for shutting down enzymatic activity because disulfides are poor nucleophiles compared to thiolates. Thioredoxin reductase and glutathione reductase are typical examples where the enzymes become active upon reduction of a disulfide bond. 13 In synthetic organic chemistry, the redox properties of the thiol group also offer a simple means for controlling its reactivity. 14 Unfortunately, acyclic dichalcogenide derivatives of Cys are labile or in fast exchange under the reducing conditions used for performing NCL.
Consequently, such a bioinspired control of NCL by using the Cys thiol as a redox switch has not so far proved achievable. In practice, Cys reactivity is instead masked during protein assembly by introducing classical alkyl- or acyl-based protecting groups on the α-amino group, on the side-chain thiol or both (for a recent review see reference 2).
To circumvent the high lability of Cys acyclic disulfides during NCL and to use the Cys thiol as a redox switch for controlling protein assembly, we sought to embed the Cys thiol in a cyclic dichalcogenide, as such species are known to be significantly more oxidizing than their linear counterparts. 15 In this work, we explored the properties of SetCys, the cyclic selenosulfide obtained by introducing a selenoethyl appendage on the α-amino group of Cys (Figure 1c). We discovered that the products of NCL with SetCys peptides vary with the strength of the reducing agent. Importantly, SetCys spontaneously loses its selenoethyl arm in water at neutral pH in the presence of popular disulfide bond reductants such as dithiothreitol (DTT) or tris(2-carboxyethyl)phosphine (TCEP). This chemical behavior contrasts with the known difficulty in breaking carbon-nitrogen bonds, a process that usually requires harsh conditions, 16, 17 metal catalysis 18 or radical reactions. 19,20 In contrast, the detailed mechanistic investigations reported here point toward an anionic mechanism that depends on the ionization state of SetCys in its ring-opened and reduced form. In this respect, SetCys uncovers a novel mode of reactivity for Cys and provides a useful means for accessing complex protein scaffolds, as illustrated by the total one-pot synthesis of biologically active backbone-cyclized variants of the hepatocyte growth factor (HGF) kringle 1 (K1) domain.

SetCys peptides display an array of reactivities depending on the reducing environment
The NCL reaction is classically performed in the presence of aryl thiol catalysts, 21 of which 4-mercaptophenylacetic acid (MPAA) is considered the gold standard. 22 In addition to its catalytic abilities, the latter also contributes to the maintenance of the reactants in a weakly reducing environment. Interestingly, this experiment resulted in the exclusive formation of the ligation product with the Cys peptide. We further showed that the SetCys peptide does not interfere with NCL even when the thioester component features a sterically demanding amino acid at its C-terminus, typically a valine residue (see Supporting Information). Thus, the background NCL observed for a SetCys peptide in the presence of MPAA is unable to perturb a regular NCL involving a Cys peptide.
The most striking property of SetCys was observed when the SetCys peptide was subjected to the strong reducing conditions imposed by DTT or TCEP (Figure 2a, reaction 4). In this case, the reaction cleanly furnished the Cys peptide. We further documented that the reaction of a SetCys peptide with a peptide thioester in the presence of TCEP furnished a ligation product featuring a native Cys residue at the ligation junction (Figure 2a, reaction 5). In contrast, the loss of the N-alkyl substituent was not observed when the sulfur analog of SetCys, featuring a 2-mercaptoethyl group on the α-nitrogen, was treated similarly, even after extended reaction times (Figure 2b). 23,24 Therefore, the reactivity observed for SetCys depends specifically on the presence of selenium in its structure. 25

Insights into the conversion of SetCys to a Cys residue
From a mechanistic point of view, the loss of the selenoethyl group from the SetCys residue seems unlikely to involve radical intermediates since the reaction proceeds well in the presence of a large excess of sodium ascorbate and MPAA, 26,27 two reagents known to be powerful quenchers of alkylselenyl or alkylthiyl radicals. Furthermore, the loss of the selenoethyl limb is also observed when dithiothreitol is used as a reducing agent, definitely ruling out the possibility that the reaction might involve a classical TCEP-induced dechalcogenation process. 28,29 Further insights into the species involved in the reaction come from the data shown in Figure 3b, which presents the effect of pH on the rate of selenoethyl limb removal from a model SetCys peptide 2. The pH-rate profile of the conversion of SetCys peptide 2 into cysteinyl peptide 3 shows a maximum at pH 6.0 ± 0.04 and two inflexion points at pH 4.8 and 7.3, which likely correspond to the pKa values of the SetCys selenol and ammonium groups respectively. These values are in agreement with the pKa values reported for simple 2-selenoethylamines 30 and Cys derivatives 31 or estimated by calculation (Figure 3c). The fact that the pH-rate profile of the reaction corresponds to the predominance zone for the selenolate/ammonium zwitterionic intermediate 2⁺⁻ led us to propose that the decomposition of SetCys proceeds through the intramolecular substitution of the ammonium group by the selenide ion.
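The link between the two inflexion points and the position of the rate maximum can be illustrated with a short calculation. The sketch below is not taken from the study: it assumes a simple two-step acid-base model in which only the selenol (pKa 4.8) and the ammonium (pKa 7.3) ionize, ignores the Cys thiol and any microscopic tautomers, and uses hypothetical function names. Under these assumptions, the mole fraction of the selenolate/ammonium zwitterion peaks at the mean of the two pKa values, which can be compared with the observed maximum at pH 6.0.

```python
def zwitterion_fraction(ph, pka_selenol=4.8, pka_ammonium=7.3):
    """Mole fraction of the selenolate/ammonium form at a given pH.

    Assumed model: two successive ionizations (selenol, then ammonium).
    """
    # Population of the fully protonated form relative to the zwitterion.
    protonated = 10 ** (pka_selenol - ph)
    # Population of the fully deprotonated form relative to the zwitterion.
    deprotonated = 10 ** (ph - pka_ammonium)
    return 1.0 / (1.0 + protonated + deprotonated)


def ph_of_maximum(pka_selenol=4.8, pka_ammonium=7.3):
    """pH at which the zwitterion fraction is largest in this model."""
    return 0.5 * (pka_selenol + pka_ammonium)


if __name__ == "__main__":
    for ph in (4.0, 5.0, 6.0, 7.0, 8.0):
        print(f"pH {ph:.1f}: zwitterion fraction = {zwitterion_fraction(ph):.2f}")
    print(f"Predicted optimum: pH {ph_of_maximum():.2f}")
```

If the rate is proportional to the zwitterion population, as the proposed mechanism implies, this simple model reproduces a bell-shaped profile whose half-height points fall close to the two pKa values.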
This mechanism results in the formation of an episelenide, which is known to be extremely unstable at room temperature and spontaneously decomposes into ethylene and selenium (Figure 3a). 17 While selenium can be captured by TCEP in the form of the corresponding selenophosphine, whose formation was indeed observed in these reactions, detection of ethylene gas was made difficult by the small scale of synthesis.
The proposed mechanism is reminiscent of the cleavage of alkylamines by phenylselenol, although such reactions usually require elevated temperatures and/or assistance by metals. 16,17,32 Intrigued by the ease of the SetCys to Cys conversion, we sought to determine whether the SetCys thiol participates in the departure of the 2-selenoethyl limb. To this end, an N-(2-selenoethyl)-alanyl (SetAla) peptide analogue was prepared and treated with MPAA/TCEP/ascorbate at the optimal pH for the SetCys to Cys conversion, i.e., pH 6.0 (Figure 3d). LC-MS analysis of the mixture showed the conversion of the SetAla residue into Ala, but at a rate considerably lower (~8.5-fold) than that measured for the SetCys to Cys conversion. This experiment shows that the departure of the 2-selenoethyl limb is greatly facilitated by the nearby SetCys thiol, perhaps by allowing an intramolecular proton transfer as depicted in Figure 3a.

Insights into the mechanism of SetCys-mediated ligation
Having scrutinized the mechanism of SetCys conversion to a Cys residue under strong reductive conditions, we next examined the species involved during ligation with a peptide alkyl thioester under the same redox conditions. The monitoring of the reaction between SetCys peptide 1 and peptide thioester 4 indicated that a first ligation product 5, containing an internal SetCys residue, accumulated within the first minutes and then slowly disappeared over time in favor of peptide 6 featuring a native peptide bond to Cys (Figure 4a,b).
Regarding the mechanism of SetCys-mediated ligation under strong reducing conditions, we hypothesized that the early formation of intermediate 5 is due to the interception of the reduced SetCys unit 2 by the thioester component. The latter is likely to be present in the form of the aryl thioester 7, produced in situ from peptide alkyl thioester 4 by thiol-thioester exchange with the MPAA catalyst (Figure 4c). The formation of tertiary amides of type 5 is known to be reversible under the conditions used for the ligation through their capacity to undergo an intramolecular nitrogen to selenium or sulfur acyl group migration. 23,24,33 Therefore, SetCys peptide 2 is constantly present in solution and escapes the SetCys/SetCys amide equilibrium by irreversibly losing its N-selenoethyl limb as discussed above. The Cys peptide 3 produced this way is expected to undergo a classical NCL reaction with aryl thioester 7 to yield ligated Cys peptide 6. Although the proposed mechanism arises from the properties of the SetCys unit described in Figure 2, we sought to test it against kinetic data for validation. The model also tests the possibility of a direct conversion of SetCys amide 5 into final product 6, bearing in mind that the cleavage of the selenoethyl appendage from compound 5 through an ionic mechanism is unlikely due to the poor nucleofugality of imido nitrogens.
We first determined the rate constants associated with the thiol-thioester exchange process involving peptide thioester 4 and MPAA (k₊₂, k₋₂), with the conversion of SetCys peptide 2 into Cys peptide 3 (k₁) and with the reaction of peptide aryl thioester 7 with Cys peptide 3 (k₄), from model reactions run separately (see Supplementary Information). These rate constants were used to determine the remaining kinetic parameters k₊₃, k₋₃ and k₅ by fitting the experimental data of the SetCys-mediated ligation (circles and triangles in Figure 4b) to the mechanistic model. The quality of the fit (dashed lines in Figure 4b
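The reaction network described above can be written as a small set of rate equations and integrated numerically. The following is a minimal sketch only and is not the authors' model, which is given in the Supplementary Information. It assumes that MPAA is in large excess, so that the forward exchange (4 to 7) and the reversal of amide 5 are treated as first-order steps. All rate constants and concentrations are arbitrary placeholders in arbitrary units rather than fitted values, and the function and variable names are hypothetical.

```python
def rates(c, k):
    """Time derivatives of all species for the assumed mechanism."""
    exch = k["k2f"] * c["4"] - k["k2r"] * c["7"] * c["RSH"]   # 4 <-> 7 + alkyl thiol
    acyl = k["k3f"] * c["2"] * c["7"] - k["k3r"] * c["5"]     # 2 + 7 <-> 5
    loss = k["k1"] * c["2"]                                   # 2 -> 3 (arm loss)
    ncl = k["k4"] * c["7"] * c["3"]                           # 7 + 3 -> 6 (NCL)
    direct = k["k5"] * c["5"]                                 # 5 -> 6 (direct route)
    return {
        "4": -exch,
        "RSH": exch,
        "7": exch - acyl - ncl,
        "2": -acyl - loss,
        "5": acyl - direct,
        "3": loss - ncl,
        "6": ncl + direct,
    }


def rk4_step(c, k, dt):
    """Advance the concentrations by one classical Runge-Kutta step."""
    def shifted(d, h):
        return {s: c[s] + h * d[s] for s in c}

    d1 = rates(c, k)
    d2 = rates(shifted(d1, dt / 2), k)
    d3 = rates(shifted(d2, dt / 2), k)
    d4 = rates(shifted(d3, dt), k)
    return {s: c[s] + dt / 6 * (d1[s] + 2 * d2[s] + 2 * d3[s] + d4[s]) for s in c}


def simulate(c0, k, t_end, dt):
    """Return a list of (time, concentrations) pairs from 0 to t_end."""
    c = dict(c0)
    trajectory = [(0.0, dict(c))]
    for i in range(1, int(round(t_end / dt)) + 1):
        c = rk4_step(c, k, dt)
        trajectory.append((i * dt, dict(c)))
    return trajectory


# Placeholder values chosen for illustration only (NOT the fitted constants).
PLACEHOLDER_K = {"k2f": 0.5, "k2r": 0.1, "k3f": 2.0, "k3r": 0.2,
                 "k1": 0.2, "k4": 5.0, "k5": 0.0}
C0 = {"4": 1.0, "RSH": 0.0, "7": 0.0, "2": 1.0, "5": 0.0, "3": 0.0, "6": 0.0}

if __name__ == "__main__":
    for t, c in simulate(C0, PLACEHOLDER_K, 200.0, 0.02)[::1000]:
        print(f"t = {t:6.1f}  [5] = {c['5']:.3f}  [6] = {c['6']:.3f}")
```

Setting k5 to zero, as done here, corresponds to the hypothesis that product 6 arises only through Cys peptide 3. A non-zero k5 would open the direct route from amide 5, which is the alternative that a fit to the experimental data is meant to discriminate.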

SetCys redox switch enables the straightforward synthesis of cyclic proteins in one pot
Having characterized the differential reactivity of the SetCys unit under mild and strong reducing conditions, we further sought to develop a simple process enabling the synthesis of cyclic proteins using SetCys as a redox switch (Figure 5). The motivation for this particular application comes from the observation that although a few studies pointed out the potential of protein cyclization for improving protein thermal stability, 34,35 resistance to proteolytic degradation 35 and potency, 36 this modification has not so far been widely used for the design of protein therapeutics. This situation contrasts with the success of small cyclic protein scaffolds such as cyclotides used as platforms for drug design, 37 and the frequent use of macrocyclization in the development of small peptidic drugs. 38 The fact that protein cyclization is seldom used for protein optimization cannot be ascribed to inappropriate N-C distances, because half of the protein domains found in the Protein Data Bank (PDB) have their C and N extremities joinable by linkers made of a few amino acids. 39 Rather, this situation reflects the paucity of tools for building cyclic proteins in a modular approach that facilitates the optimization of the linker joining the N- and C-termini. 40 Classical methods leading to cyclic proteins involve the macrocyclization of a bifunctional and linear precursor, 41 primarily by using the native chemical ligation reaction (NCL 1) between a C-terminal thioester group and an N-terminal cysteine (Cys) residue (Figure 5a). 2,42 Following this strategy, the optimization of the linker requires the production of a library of extended precursors of varying length and composition, an approach that inevitably makes the production of cyclic analogs cumbersome.
In this work, we sought to develop a modular one-pot method enabling the grafting of the linker to a unique linear protein precursor (Figure 5b). This can be achieved by exploiting the silent properties of the SetCys residue under mild reducing conditions for performing the first NCL (see property 3, Figure 2a , cK1-1 and cK1-2, from a unique 78 AA linear precursor. These cyclic polypeptides differ by the length of the linker joining the N- and C-termini of the K1 protein (10 and 14 residues respectively).

Folding and biological activity of biotinylated K1 cyclic analogs
The signaling of the HGF/SF/MET system plays a crucial role in the regeneration of several tissues such as the liver or the skin, while its deregulation is often observed in cancer. The HGF/SF K1 domain contains the main HGF/SF binding site for the MET tyrosine kinase receptor and thus constitutes an interesting platform for designing future drugs based on this protein pair. 46 In this study, we sought to produce cyclic analogues of the K1 domain to investigate the tolerance of the K1/MET signaling system to this modification. The X-ray crystal structure of the K1 protein shows that its tertiary structure is made up of a series of loops stabilized by three disulfide bonds (Figure 6a). 47 The N- and C-terminal cysteine residues are on the opposite side of the MET binding loop and are linked by a disulfide bond.
The N- and C-termini are thus close in space and can be joined by a peptide linker made of a few amino acid residues, which include a biotinylated lysine residue. The latter is used to multimerize the ligand using streptavidin (S), because multivalent presentation of the K1 domain is important for achieving high binding and agonistic activities. 46 The successful synthesis of the cyclic K1 polypeptides cK1-1 and cK1-2 set the stage for the folding step. cK1-1 and cK1-2 were folded into cK1-1f and cK1-2f respectively using the glutathione/glutathione disulfide redox system (Figure 6b,c). 45 The folding mixtures converged to a major form after 24 h and were purified by dialysis (Figure 6c, see also Supplementary Information).
Extensive proteomic analysis of the folded proteins cK1-1f and cK1-2f showed the exclusive formation of the native pattern of disulfide bonds between Cys128-206, Cys149-189 and Cys177-201 as shown in cK1-1f and cK1-2f proteins were first analyzed for their capacity to bind to the recombinant MET extracellular domain. The competitive AlphaScreen assay with recombinant NK1 protein showed that the backbone-cyclized proteins cK1-1f and cK1-2f were ~10-fold less potent in binding the MET receptor than the biotinylated analog K1B (Figure 6d). This result was unexpected because the cyclization site is diametrically opposite the MET binding site. The capacity of the cyclic K1 proteins to activate the MET receptor was further examined using human HeLa cells (Figure 6e). MET phosphorylation induced by the cyclic K1 proteins was found to be lower than that observed with the reference K1B analog. However, the tested K1 analogs triggered downstream signaling pathways, i.e., phosphorylation of AKT and ERK, with almost equal potency. Because previous studies showed marked differences between MET phosphorylation levels and the strength of MET-specific phenotypes induced by HGF or HGF mimics, 48 we further analyzed the capacity of the different K1 proteins to trigger the scattering of human cells in vitro (Figure 6f). In this assay using Capan cells, the cyclic proteins behaved similarly to the reference protein K1B in the concentration range tested (10 pM-100 nM) in their capacity to induce a mesenchymal-like phenotype and cell scattering.

Corresponding authors
Correspondence to Oleg Melnyk or Vangelis Agouridas.

Competing interests
The authors declare no competing interests.

Supplementary information
Experimental details, materials and methods, kinetic model for SetCys-mediated ligation, LC-MS data and NMR spectra.