N-Terminal selective modification of peptides and proteins using 2-ethynylbenzaldehydes

Selective modification of the N-terminus of peptides and proteins is a promising strategy for single site modification methods. Here we report N-terminal selective modification of peptides and proteins by using 2-ethynylbenzaldehydes (2-EBA) for the production of well-defined bioconjugates. After reaction screening with a series of 2-EBA, excellent N-terminal selectivity is achieved by the reaction in slightly acidic phosphate-buffered saline using 2-EBA with electron-donating substituents. Selective modification of a library of peptides XSKFR (X = either one of 20 natural amino acids) by 2-ethynyl-4-hydroxy-5-methoxybenzaldehyde (2d) results in good-to-excellent N-terminal selectivity in peptides (up to >99:1). Lysozyme, ribonuclease A and a therapeutic recombinant Bacillus caldovelox arginase mutant (BCArg mutant) are N-terminally modified using alkyne- and fluorescein-linked 2-EBA. Alkyne-linked BCArg mutant is further modified by rhodamine azide via copper(I)-catalyzed [3 + 2] cycloaddition indicating that the reaction has high functional group compatibility. Moreover, the BCArg mutant modified by 2-ethynyl-5-methoxybenzaldehyde (2b) exhibits comparable activity in enzymatic and cytotoxic assays with the unmodified one.

Site-selective chemical modification of peptides and proteins has become an emerging research field in chemical biology, as it allows the production of well-defined bioconjugates for biological studies and drug development [1][2][3][4]. Although a number of bioconjugation reactions for modification of specific amino acids have been developed in the past decade, the prevalence of multiple targeted residues on the protein surface means that only a few of them give single-site modification 5,6. To achieve site-selective modification, current methods mainly focus on labeling the low-abundance free cysteine residue or noncanonical amino acids, which always requires sophisticated sequence engineering [7][8][9][10][11][12]. In addition, a few examples targeting the C-terminus or a specific lysine ε-amino group have been reported [13][14][15][16]. Despite these advances, there is ongoing interest in developing new methods for site-selective protein functionalization.
Targeting the N-terminus of peptides and proteins is a promising strategy to achieve single-site modification, as a single-chain protein contains only one N-terminal residue in its sequence and this residue is mostly solvent exposed for functionalization 17,18. In addition, recent studies suggest that a change of the charge in the N-terminal region of the signal peptide would disrupt the translocation of small secretory preproteins 19. Thus, development of an efficient N-terminal modification method will not only be an important direction for site-selective protein bioconjugation but will also provide a chemical biology approach to study the biological functions of the protein N-terminus.
One important strategy for site-selective modification of the N-terminus is pH control. As the N-terminal α-amino group has a lower basicity (pKa ≈ 6-8) than the lysine ε-amino group (pKa ≈ 10), a reaction medium with a well-controlled pH value can favor modification of the N-terminal α-amino group 20. Based on this mechanism, N-terminal azidation, acylation, oxidation, and reductive alkylation have been reported by our group and others [21][22][23][24][25]. Another strategy is to exploit specific residues at the N-terminus. Following this strategy, Francis and co-workers have reported pyridoxal-5-phosphate (PLP)- or Rapoport's salt (RS)-mediated transamination, which involves tautomerization triggered by the lower pKa of the α-proton on the N-terminal amino acid [26][27][28], as well as 2-pyridinecarboxyaldehyde (2-PCA)-mediated imidazolidinone formation via cyclization of the imine intermediate with the nearby amide group at the N-terminus 29. Among those reactions, the one-step modifications using ketenes 22,23 or 2-PCA 29, which need no added oxidizing or reducing reagent, are promising approaches to regioselectively modified bioconjugates. However, preparation of their derivatives requires multi-step synthesis, which has hampered further studies on their structure-reactivity relationship, the stability of the conjugates, and their applications. Thus, it is still important to investigate new, efficient, and convenient approaches for site-specific labeling of the N-terminus.
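The pH-control argument can be illustrated with the Henderson-Hasselbalch relation, which gives the fraction of an amine present in its unprotonated (nucleophilic) form at a given pH. The following is a minimal sketch, not part of the reported method: the function name is hypothetical, pKa = 7 for the α-amino group is an assumed midpoint of the quoted 6-8 range, and pKa = 10 for the lysine ε-amino group is the value quoted in the text. The free-amine fraction is only one contribution to the observed selectivity.

```python
def free_amine_fraction(ph: float, pka: float) -> float:
    """Fraction of an amine in the unprotonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumed illustrative values: pKa 7 is the midpoint of the quoted 6-8 range
# for the N-terminal alpha-amine; pKa 10 is the quoted lysine epsilon-amine value.
PKA_ALPHA = 7.0
PKA_EPSILON = 10.0

alpha_fraction = free_amine_fraction(6.5, PKA_ALPHA)
epsilon_fraction = free_amine_fraction(6.5, PKA_EPSILON)
ratio = alpha_fraction / epsilon_fraction

print(f"alpha-amine free fraction at pH 6.5:   {alpha_fraction:.3f}")
print(f"epsilon-amine free fraction at pH 6.5: {epsilon_fraction:.5f}")
print(f"ratio alpha/epsilon: {ratio:.0f}")
```

Under these assumed pKa values, the unprotonated fraction of the α-amine at pH 6.5 is several hundred times that of the ε-amine, and the gap narrows as the pH is raised.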
2-Alkynylarylaldehydes are versatile building blocks in organic synthesis [30][31][32][33] . Under transition metal catalysis, the in situ generated 2-alkynylarylaldimines between 2-alkynylarylaldehydes and primary amines could undergo 6-endo-dig cyclizations to give the corresponding isoquinoliniums, which have been demonstrated as key intermediates for syntheses of complex heterocycles 31,32 . Despite the fact that this transformation has been extensively explored in organic synthesis, studies on its applicability on protein modification remain largely elusive. We hypothesize that the efficient imine formation and tandem cyclization would render 2-alkynylarylaldehydes amenable to selectively modify the N-terminal α-amino group (Fig. 1).
In this work, we first report a metal-free one-step N-terminal modification of peptides and proteins using 2-ethynylbenzaldehydes (2-EBA) under mild reaction conditions. The isoquinoliniums formed have been isolated and characterized by a model reaction. After a comprehensive study on the reaction conditions and the structure-reactivity relationship of the reagent, we demonstrate that, apart from the pH control, electronic effects also play important roles in controlling the N-terminal selectivity of the modification. We have also extended this reaction to protein modification, including labeling a therapeutic Bacillus caldovelox arginase mutant (BCArg mutant). The enzymatic and anti-cancer activities of the modified BCArg mutant have also been studied.
Results and discussion
2-Ethynylbenzaldehydes as N-terminal selective reagents. To begin our study, peptide YTSSSKNVVR 1a (molecular mass of 1140 Da, 0.1 mM) was treated with 20 equivalents of 2-ethynylbenzaldehyde 2a (2-EBA, 130 Da) in 50 mM phosphate-buffered saline (PBS)/DMSO (9:1) at pH 6.5 for 16 h (Fig. 2a). After the reaction, we found that peptide 1a was modified to give mono-modified peptides (N-terminally modified peptide 3a and lysine-modified peptide 3a′) in 64% conversion and di-modified peptide 3a″ in 8% conversion (Fig. 2b). An increase of the molecular mass by 112 Da indicated that 2-EBA 2a was incorporated on peptide 1a with loss of an H2O molecule, which was presumably ascribed to the formation of a quinolinium conjugate after the modification. N-terminal selectivity [23][24][25] is calculated as the ratio of mono-modified peptide at the N-terminal α-amino group to that at the lysine ε-amino group, as determined from the extracted ion chromatogram (EIC) in LC-MS analysis. On this basis, we achieved an N-terminal selectivity (3a:3a′) of 96:4 in the mono-modified peptides, and the corresponding MS/MS spectrum showed N-terminally modified peptide 3a as the major product (Fig. 2c). Therefore, the conversions of N-terminally modified peptide 3a and lysine-modified peptide 3a′ were calculated as 61 and 3%, respectively. To show the proportion of N-terminally modified peptide among all modified products, we included a second measure, the efficiency of N-terminal modification, defined as the ratio of the conversion of N-terminally modified peptide to the conversion of all modified peptides (i.e. 3a/(3a + 3a′ + 3a″)). Thus, the efficiency of N-terminal modification of YTSSSKNVVR 1a with 2-EBA 2a was 0.85.
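The arithmetic behind the figures quoted above (61% and 3% conversions, an efficiency of 0.85, and a 112 Da mass shift) can be reproduced with a short sketch. The function names are hypothetical and nominal masses are assumed (18 Da for water); only the input numbers come from the text.

```python
def split_mono_conversion(mono_conversion, n_term_share, lys_share):
    """Split the mono-modified conversion (%) by the N-terminus:lysine EIC ratio."""
    total_share = n_term_share + lys_share
    n_term = mono_conversion * n_term_share / total_share
    lys = mono_conversion * lys_share / total_share
    return n_term, lys


def n_terminal_efficiency(n_term, lys, di):
    """Efficiency = 3a / (3a + 3a' + 3a''), all given as conversions in %."""
    return n_term / (n_term + lys + di)


def expected_mass_shift(reagent_mass_da, water_mass_da=18.0):
    """Mass added on condensation with loss of one water (nominal masses assumed)."""
    return reagent_mass_da - water_mass_da


# Values quoted in the text: 64% mono-modified, 96:4 selectivity, 8% di-modified.
conv_3a, conv_3a_prime = split_mono_conversion(64.0, 96, 4)
efficiency = n_terminal_efficiency(conv_3a, conv_3a_prime, 8.0)
shift = expected_mass_shift(130.0)

print(f"3a: {conv_3a:.0f}%, 3a': {conv_3a_prime:.0f}%")
print(f"efficiency: {efficiency:.2f}")
print(f"mass shift: {shift:.0f} Da")
```

The same helper functions apply to any other entry for which the mono-modified conversion, the selectivity ratio, and the di-modified conversion are known.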
To investigate the structure of N-terminally modified peptide 3a, we conducted a model study by treating L-alanine β-benzylamide 1b with 2-EBA 2a (1.1 equiv.) in H2O/CH3CN (1:3) at 60°C overnight, followed by addition of formic acid (2 equivalents) for 15 min (Fig. 2d). After the reaction, the corresponding isoquinolinium 3b was isolated in 30% yield. The formation of 3b suggested that the imine was first generated by reaction of 1b and 2a to give 2-ethynylbenzaldimine as a key intermediate. The 2-ethynylbenzaldimine intermediate then underwent intramolecular 6-endo-dig cyclization to give isoquinolinium 3b as the product. Remarkably, the formation of isoquinolinium salts by reaction of 2-EBA and primary amines under metal-free conditions in aqueous media has not been reported previously [30][31][32][33].
With these promising findings, we next screened the peptide modification using a series of 2-EBA to improve the N-terminal selectivity of the modification and to study the structure-reactivity relationship of the 2-EBA (Table 1 and Fig. 3). 2-EBA 2a was commercially available, and the others (2b-r) were readily prepared by Sonogashira coupling of commercially available aromatic halides with trimethylsilylacetylene, followed by desilylation 34. Screening reactions in 50 mM PBS at pH 6.5 indicated that the reaction could be conducted with good to high conversions (up to 86%) and excellent N-terminal selectivity (up to >99:1). 2-EBA bearing electron-donating groups at the 5- or 4-position (2b-g) gave the highest conversions (up to 86%) with excellent N-terminal selectivity (up to >99:1) (Table 1, entries 2-7). Comparable conversions (up to 78%) and high N-terminal selectivity (up to >99:1) were obtained when using 2-EBA with weakly electron-withdrawing groups (fluoro or chloro) at the 5- or 4-position (2h-k, entries 8-11). Use of 2-EBA with strongly electron-withdrawing groups (trifluoromethyl) at the 5- or 4-position (2l and 2m) led to moderate conversions (65 and 72%) and lower N-terminal selectivity (95:5 and 93:7). Incorporation of an alkyne moiety on the 2-EBA (2n and 2o) also gave 41-64% conversions with N-terminal selectivity of up to 96:4, indicating that the present reaction has high compatibility with unsaturated C-C bonds (entries 14-15). Modification using 1-ethynyl-2-naphthaldehyde (2p) gave poor conversion (5%), which was probably due to the poor solubility of the compound (entry 16). Incorporation of a fluoro group at the 6-position (2q) resulted in high N-terminal selectivity (>99:1), but the high proportion of di-modified products (74%) hampered its application (entry 17). Moreover, introduction of a fluoro group at the 3-position (2r) led to a lower N-terminal selectivity (93:7) (entry 18).
These findings revealed that 2-EBA were promising reagents for selective modification of the peptide N-terminus and that incorporation of electron-donating groups or weakly electron-withdrawing groups would give high N-terminal selectivity of the modification.
We have also conducted some control experiments (Table 1 and Fig. 3). No reaction was observed when 2-EBA 2s bearing an internal alkyne was used, which was ascribed to the lower reactivity of the internal alkyne in the intramolecular cyclization (Table 1, entry 19). Using 4-ethynylbenzaldehyde (4a) and benzaldehyde (4b) gave no peptide conversion, suggesting that a terminal ethynyl group located at the ortho-position of benzaldehyde played a key role in isoquinolinium formation to give the resulting conjugate (entries 20-21). We have also compared our reagents with N-(benzoyloxy)succinimide (4c), which is widely used for amine modification (entry 22). N-(Benzoyloxy)succinimide (4c) was found to be highly reactive towards the amine groups on the peptide, and poor N-terminal selectivity (73:28) was achieved even though the modification was conducted under slightly acidic conditions (pH 6.5), indicating that 2-EBA displayed unique properties towards N-terminal modification.
With the aforementioned findings, we designed and synthesized 2-EBA 2t for N-terminal modification of peptides and proteins. The propargyl ether structure would provide an electron-donating effect on the 2-EBA core structure to improve the N-terminal selectivity, while the free alkyne moiety allows attachment of versatile functional tags after the modification. Treatment of peptide 1a with 2t gave high conversion (80%) with good N-terminal selectivity (98:2), suggesting that it would be a promising reagent for N-terminal modification of peptides and proteins.
Optimization of reaction conditions. We next sought to examine the effects of the reagent amount, temperature, and pH of the medium on the N-terminal selectivity of the modification. With the highest N-terminal selectivity (Table 1, Table 2). Good conversion (86%) was observed using 20 equivalents of 2d with excellent N-terminal selectivity (>99:1). When the reaction temperature was reduced to 25°C or 4°C, lower conversions were observed. Notably, poor N-terminal selectivity (60:40) was found at 4°C, suggesting that the low temperature would favor the reaction of the 2-EBA with the less hindered lysine ε-amino group.
We conducted time course experiments to test the effect of pH on the modification. Bioconjugation reactions of peptide YTSSSKNVVR 1a with 2d (20 equiv.) in 50 mM PBS/DMSO (9:1) at different pH values were studied (Fig. 4a). At pH 6.5, excellent N-terminal selectivity (>99:1) was observed. As the pH increased from 7.4 to 9, increasing amounts of the internal lysine-modified peptide and the di-modified peptide were found, together with lower N-terminal selectivity, indicating that the N-terminal selectivity of the modification was strongly influenced by pH. In addition, the time course experiments showed that the reaction reached the highest conversion at 16 h.
To provide more insight into the electronic effects of the substituents on the phenyl moiety of the 2-EBA on the N-terminal selectivity of the modification, we performed the modification at different pH values using 2-EBA with a strongly electron-donating methoxy group (2b and 2c), a weakly electron-withdrawing fluoro group (2h and 2i), or a strongly electron-withdrawing trifluoromethyl group (2l and 2m). As shown in Fig. 4b, the N-terminal selectivity of the modification decreased as the pH of the medium increased. Surprisingly, 2-EBA bearing electron-donating substituents still gave moderately N-terminal selective modification (2b) or weakly lysine selective modification (2c) in pH 9.0 PBS medium, while 2-EBA bearing electron-withdrawing substituents (2h, 2i, 2l, and 2m) switched to lysine selective modification. For example, modification of peptide 1a with 2-EBA 2h in pH 9.0 PBS/DMSO medium afforded mono-modified peptide with an N-terminal selectivity of 9:81, indicating that the labeling was highly selective for the lysine ε-amino group. These findings implied that the site-selectivity of the modification was controlled not only by pH but also by the electronic effects of the substituents on 2-EBA.
After screening these effects on the N-terminal modification, we studied the stability of the bioconjugates by incubating the 2d-modified YTSSSKNVVR with an excess of reducing or oxidizing reagents (glutathione (GSH), homocysteine, L-cysteine, DL-dithiothreitol (DTT), 2-mercaptoethanol, tris(2-carboxyethyl)phosphine (TCEP), ascorbic acid, and hydrogen peroxide) in 50 mM PBS (pH 6.5)/DMSO (9:1) at 37°C for 2 h (Supplementary Fig. 49). LC-MS/MS analysis revealed that the 2d-modified YTSSSKNVVR was stable towards the additives, with no significant decomposition or scrambling product.
Screening of a peptide library. We next studied the applicability of this reaction to the modification of a library of 20 unprotected peptides, XSKFR (X = either one of 20 natural amino acids). Peptide sequences with nucleophilic Ser and Lys were chosen to examine the N-terminal selectivity of this reaction. As shown in Table 2, peptides with N-terminal Ala, Cys, Asp, Glu, Gly, His, Lys, Asn, Gln, Ser, or Tyr gave excellent N-terminal selectivity (>99:1) (Table 2, entries 1-11). Moderate-to-high N-terminal selectivities (86:14 to 98:2) were obtained for the N-terminal Ile, Leu, Trp, Phe, Val, Met, Thr, and Arg peptides (entries 12-19). However, a low N-terminal selectivity of 46:54 was observed for PSKFR, which has an N-terminal proline residue (entry 20), presumably because the iminium intermediate formed between proline and 2-EBA cannot undergo subsequent cyclization with the proximal alkyne group.
To further study the selectivity of the present bioconjugation reaction, we used peptides with cysteine at different positions (ASCGTN, AYEMWCFHQR, and KSTFC). Exclusive N-terminal modification with 70, 27, and 15% conversions, respectively, was found, with the cysteine residue remaining intact (Supplementary Figs. 77-79). In addition, the N-terminally acetylated peptide Ac-YTSSSKNVVR was modified solely at the internal lysine with 19% conversion, indicating that the present bioconjugation reaction is highly chemoselective for the amino groups of peptides, as only the amino group of lysine was modified when the N-terminus was acetylated (Supplementary Fig. 80). For a peptide containing a proline residue at the second position (YPSSSKNVVR), which has no reactivity towards 2-PCA 29, the bioconjugation proceeded smoothly with 2d to afford 54% conversion with excellent N-terminal selectivity (>99:1) (Supplementary Fig. 81).
Protein modification using 2-ethynylbenzaldehydes. After studying the N-terminal peptide modification, we further explored the present reaction for protein modification (Fig. 5) employing alkyne-linked and fluorescein-linked 2-EBA (2t and 2u, respectively). The ether linkage was included to improve the N-terminal selectivity of the reagent, as suggested by the aforementioned findings. Lysozyme (0.1 mM; PDB ID: 1DPX) was treated with 2t (0.5 mM, 5 equivalents) in 50 mM PBS (pH 6.5) at 37°C for 16 h, giving the 2t-modified lysozyme with 52% conversion (Supplementary Fig. 82a). LC-MS analysis of the reaction mixtures of lysozyme showed peaks at 14470 Da and 14489 Da, which were assigned to the mono-modified lysozyme. Upon trypsin digestion, the modification by 2t was found to selectively occur at the N-   partially (20-30%) decomposition at 37°C after 12 h, we also tested the stability of the 2t-modified RNase A by treatment of the modified RNase A in PBS with different pH values (pH 3-11) at 37°C (Supplementary Fig. 88). Notably, no decomposition was found by LC-MS analysis, indicating that the quinolinium conjugate formed was highly stable and that this modification would be amenable to the preparation of bioconjugates for drug development.
Human arginase, which is a manganese-dependent enzyme that degrades arginine into urea, has been reported to treat advanced hepatocellular carcinoma and metastatic melanoma where prior immunotherapy failed in early-phase clinical trials 35. PEGylated human arginase I was developed as the first generation of therapeutic proteins with long half-life, which is now undergoing phase II clinical trials 36. However, current PEGylation usually involves non-specific lysine/cysteine modification via NHS/maleimide chemistry.
The second-generation therapeutic protein, Bacillus caldovelox arginase mutant (BCArg mutant), has been reported to induce a sustained complete remission in a patient with immunotherapy-resistant cancer 37,38. In addition to lysozyme and RNase A, we also extended this newly developed N-terminal modification to the BCArg mutant. BCArg mutant (0.1 mM) was treated with alkyne- and fluorescein-linked 2-EBA (2t and 2u, 10 equivalents) in PBS (pH 7.4)/DMSO (9:1) at 37°C for 16 h to give the corresponding 2t- and 2u-modified BCArg mutants in 40 and 19% conversions, respectively (Supplementary Fig. 89). High N-terminal selectivity, as revealed by the tryptic peptide fragments MKPI-SIIGVPMDLGQTR of the 2t- and 2u-modified BCArg mutants, was observed by LC-MS/MS analysis (Supplementary Fig. 90).
As depicted in Fig. 6, the 2t-modified BCArg mutant containing an alkyne handle (32615 Da) could be smoothly modified with a rhodamine azide via copper(I)-catalyzed [3 + 2] cycloaddition to give the rhodamine-labeled BCArg mutant (33268 Da and 33296 Da) in >99% conversion (Supplementary Fig. 91). SDS-PAGE analysis revealed that the rhodamine-labeled BCArg mutant gave a strong green fluorescent signal, while the 2t-modified BCArg mutant had no fluorescent signal under UV at 365 nm (Supplementary Fig. 92). Coomassie blue staining of the same gel gave deep blue signals for the unmodified, alkyne-linked, and rhodamine-labeled proteins, indicating that the fluorescent tag had been successfully attached to the protein using the N-terminal selective alkyne-linked 2-EBA 2t and a sequential azide-alkyne click reaction. These results indicated that the present reaction has high compatibility with click chemistry.
Biological studies of N-terminally modified BCArg mutants.
To study the influence of the quinolinium conjugates on the biological properties of the therapeutic protein, we compared the enzyme activities and anti-cancer properties of the modified BCArg mutant with those of the unmodified analogue (Table 3). We prepared 2b-modified BCArg mutant (32%) by bioconjugation with 2-EBA 2b in 50 mM PBS (pH 7.4) at 37°C for 16 h (Supplementary Figs. 93-94). The enzymatic activity of the 2b-modified BCArg mutant was slightly lower than that of the unmodified BCArg mutant. The anti-cancer properties of the unmodified and 2b-modified BCArg mutants were then examined using breast cancer cell lines MDA-MB-231 and MDA-MB-468. Experimental IC50 values indicated that the antitumor efficacy of the 2b-modified BCArg mutant was comparable to that of the unmodified one. These findings indicated that the 2b-modified BCArg mutant retained its biological activities after the bioconjugation.
In summary, we have discovered that 2-ethynylbenzaldehydes (2-EBA) are useful reagents for N-terminal modification of peptides and proteins via isoquinolinium formation with the N-terminal α-amino group. After a comprehensive screening of the reaction conditions and the structure-reactivity relationship of the 2-EBA, we have found that, apart from the pH control, the electronic properties of the substituents on the 2-EBA can also strongly affect the N-terminal selectivity of the modification. Under slightly acidic conditions (pH 6.5) and with 2-EBA bearing electron-donating or weakly electron-withdrawing groups, the modification achieves excellent N-terminal selectivity. Conducting the reaction in basic medium (pH 9) and using 2-EBA with electron-withdrawing groups can switch the modification to become lysine selective. To help other researchers who are interested in using this bioconjugation reaction, the reaction conditions, substituent effects, and functional group tolerance for various N-terminal residues are summarized in Table 4.
General procedure for modification of peptides using 2-ethynylbenzaldehydes. To an Eppendorf tube (1.5 mL) containing 80 µL of 50 mM PBS buffer (pH 6.5) was added 10 µL of YTSSSKNVVR (1a, 1 mM in Milli-Q® water), followed by 10 µL of 2-ethynylbenzaldehyde (2a-2t, 20 mM in DMSO). The reaction mixture was kept in a 37°C water bath for 16 h. A 10 µL aliquot of the mixture was withdrawn, diluted with 10 µL of Milli-Q® water, and subjected to LC-MS/MS analysis. Unless otherwise specified, all peptides were treated according to the above procedure.
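As a consistency check on the general procedure, the final concentrations in the tube can be worked out from the stock concentrations and volumes given above. This is a minimal sketch with hypothetical names; it assumes the volumes are simply additive (80 + 10 + 10 µL = 100 µL).

```python
def final_concentration_mM(stock_mM, stock_uL, total_uL):
    """Concentration after dilution, assuming additive volumes."""
    return stock_mM * stock_uL / total_uL


# Volumes and stock concentrations taken from the general procedure.
buffer_uL = 80.0    # 50 mM PBS, pH 6.5
peptide_uL = 10.0   # peptide stock, 1 mM in water
reagent_uL = 10.0   # 2-EBA stock, 20 mM in DMSO
total_uL = buffer_uL + peptide_uL + reagent_uL

peptide_mM = final_concentration_mM(1.0, peptide_uL, total_uL)
reagent_mM = final_concentration_mM(20.0, reagent_uL, total_uL)
equivalents = reagent_mM / peptide_mM
dmso_fraction = reagent_uL / total_uL  # the reagent stock is the only DMSO source

print(f"peptide: {peptide_mM:.2f} mM")
print(f"2-EBA: {reagent_mM:.1f} mM ({equivalents:.0f} equiv.)")
print(f"DMSO: {dmso_fraction:.0%} v/v")
```

The result (0.1 mM peptide, 20 equivalents of reagent, 10% DMSO) matches the conditions stated for the screening reactions, i.e. 0.1 mM peptide with 20 equivalents of 2-EBA in PBS/DMSO (9:1).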
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All principal data, with detailed experimental procedures and characterization, are included in this article and its Supplementary Information or are available from the corresponding author upon reasonable request.