Introduction

Over the course of the last 50 years, the efficiency in linear peptide assembly has advanced through a series of innovations, most notably stepwise solid-phase synthesis and fragment ligation1,2,3,4. Still, precise control of higher order structure through directed disulfide bond formation remains challenging5,6,7,8. This is particularly the case with peptides such as insulin and relaxin given the additional complexity of their heterodimeric structures9,10. Biomimetic linkers have been introduced to address these issues. They share the native linear order of proinsulin where the N-terminus of the A-chain is indirectly connected to the C-terminus of the B-chain. Conversion to the two-chain form initially employed enzymatic conversion and was restricted by the requirement for a unique proteolytic site11,12,13,14,15. The more recent reports employing chemically labile linkers represent a leap forward by eliminating the need for a proteolytic site, and the enzyme itself. The Kent group described an insulin-specific linkage of GluA4-ThrB30, which was saponified following oxidative folding16. Subsequently, a sequence-agnostic approach to synthesis in the insulin-like peptide family was reported, which employed a reversible tethering of the A-chain N-terminus through a labile amide to the B-chain C-terminal ester17,18. In each instance, the strategies mimicked the linear order of the A- and B-chains found in proinsulin. While native, this linear orientation increases synthetic complexity in requiring either non-standard chemistry for linker installation, or a two-step linker excision. A straightforward, general synthesis would ideally align with conventional solid-phase methods and employ a single chemical step for removal.

We report the synthesis of insulin and a set of related peptides by a synthetic protocol that employs a reversible crosslink of the two N-termini through parallel extension of the respective A- and B-chains made by conventional solid-phase peptide synthesis (SPPS). The N–N heterodimers of these insulin-related peptides fold efficiently under standard redox conditions, and subsequently convert to the native hormone under mildly alkaline conditions by virtue of two simultaneous diketopiperazine cyclizations. The efficiency and versatility of the method are demonstrated in the synthetic yield of insulin and the direct translation to the synthesis of relaxin. The non-native N–N linkage enables the synthesis of two site-specific penicillamine-substituted analogs, 4 and 5, which fail when using a high-efficiency N–C folding intermediate named des-DI insulin15. This synthetic approach is based upon an orthogonal, non-native N–N linkage of individual peptide chains that is synthetically straightforward and highly efficient in the synthesis of insulin-like peptides. The approach holds promise for translation within the broader class of disulfide-rich heterodimeric peptides.

Results

Insulin synthesis

We explored the chemical synthesis of insulin, relaxin-2, and four insulin analogs (Fig. 1) through reversible crosslink of the two N-termini by parallel extension of insulin A- and B-chains made by conventional SPPS. A Lys-(iBu)Gly dipeptide extension at the N-termini was envisioned to provide a side-chain anchor for an oligoethylene glycol (OEG), oxime-based crosslink (Fig. 2). Whether a suitably sequence-extended N–N heterodimer of insulin A- and B-chains might fold under standard redox conditions was a central uncertainty to be investigated.

Fig. 1

Amino acid sequences of insulin family peptides. 1 Sequence of human insulin. 2 Sequence of human relaxin-2 (Z: pyroglutamate). 3 Sequence of four-disulfide human insulin. 4 Sequence of A7(Pen) insulin analog. 5 Sequence of B19(Pen) insulin analog. 6 Sequence of Lys-Pro insulin

Fig. 2

Synthetic route to insulin-like peptides. i Synthesis of folded single-chain insulin-like peptides enabled by an intermediary N–N chemical ligation followed by intramolecular disulfide bond formation. ii Synthesis of two-chain insulin-like peptides by DKP-mediated linker cleavage of the folded single-chain insulin-like peptides. iii A one-pot relaxin synthesis through sequential ligation, folding, DKP cleavage, and pGlu formation with a single purification

The synthesis of the insulin A-chain began with the coupling of Fmoc-Asp-OtBu to ChemMatrix Rink amide resin to introduce the C-terminal Asn, and the remaining residues were added by a conventional automated Fmoc-based SPPS protocol (Fig. 3). The isoacyl Thr–Ser dipeptide at A8–A9 was incorporated as a means to enhance peptide assembly, solubility, and handling19. The Boc-Lys(iBu)Gly-OH dipeptide was installed through sequential α-bromoacetylation and isobutylamine treatment20 of resin-bound A-chain 7, followed by 3-(diethoxyphosphoryloxy)-1,2,3-benzotriazin-4(3H)-one (DEPBT)-mediated coupling of Boc-Lys(Fmoc)-OH. Successive side-chain coupling of PEG8 and bis(Boc)amino-oxyacetic acid provided resin-bound A-chain 8, which was deprotected and cleaved from the resin under standard conditions. The crude peptide (Supplementary Figure 13) was purified by C8 reverse-phase HPLC (RP-HPLC) to provide A-chain 9 in 25% yield (Fig. 4 and Supplementary Figure 14).

Fig. 3

Synthesis of human insulin. Synthesis of human insulin from the N–N chemical ligation of purified A- and B-chains, followed by intramolecular disulfide bond formation and, subsequently, linker cleavage by DKP formation (the isoacyl dipeptide is indicated by a blue box). a 20 equiv. BrCH2COOH, 10 equiv. DIC, 1 h. b Isobutylamine, 6 equiv. in DMSO, overnight. c Boc-Lys(Fmoc)-OH, DEPBT, DIEA, and DMAP (5%), 4 h. d Fmoc-N-amido-PEG8-acid, DIC, and 6-Cl-HOBt, 4 h. e Bis-Boc-amino-oxyacetic acid, DIC, and 6-Cl-HOBt, 1 h. f TFA, TIS, DODT, and H2O, 2 h. g Terephthalaldehyde 10 equiv., 70% aqueous ACN. h Ligation in 0.1% TFA containing 50% aqueous ACN, 2 h. i Folding, Cys-Cystine, pH 9, 4 °C, 12 h. j 0.5 M sodium phosphate buffer at 56 °C

Fig. 4

Analytical HPLC profiles in human insulin synthesis. HPLC profile of purified insulin A-chain 9 and B-chain 12, ligation intermediate 13, purified single-chain insulin 14, DKP cleavage reaction mixture 1+15, and purified insulin 1

The B-chain synthesis (Fig. 3) was initiated by loading Fmoc-Thr(tBu)-OH onto ChemMatrix HMPB resin under Mitsunobu conditions to minimize racemization. The remaining residues (B1–B29), including the isoacyl Tyr–Thr dipeptide at B26–B27, were incorporated by a conventional Fmoc-based SPPS protocol. The Boc-Lys(iBu)Gly-OH dipeptide, PEG8 (polyethylene glycol), and bis(Boc)amino-oxyacetic acid were sequentially coupled to resin 10, followed by cleavage and deprotection to afford 11. The free aminooxy-derivatized B-chain 11 was treated with terephthalaldehyde (10 equiv.) in 0.1% TFA/70% aqueous acetonitrile (ACN) to provide the crude imino-benzaldehyde B-chain derivative 12 (Supplementary Figure 15), which was recovered in 15% total synthetic yield following RP-HPLC purification (Fig. 4 and Supplementary Figure 16). The oxime ligation of the A-chain and B-chain was accomplished by combining 9 and 12 in 0.1% TFA containing 50% aqueous ACN for 2 h (Figs. 3, 4 and Supplementary Figure 17). The PEG8 spacer was experimentally determined to provide the minimal distance required between the A- and B-chains to subsequently produce the properly disulfide-paired hormone.

The folding of the ligated A–B intermediate 13 was performed at pH 9 in an aqueous buffer with 2 mM cysteine and 0.5 mM cystine at 4 °C, to produce a single major product 14 (Figs. 3, 4 and Supplementary Figure 18). This properly folded, single-chain insulin was obtained in a 45% combined yield for ligation, disulfide-formation, and RP-HPLC purification (Supplementary Figure 19).

Single-chain insulin 14 was efficiently converted to two-chain insulin 1 using 0.5 M phosphate buffer (pH 7.0) at 56 °C (Fig. 3). The two simultaneous diketopiperazine (DKP) cleavage reactions were complete after 5 h to provide insulin 1 in 65% yield following HPLC purification (Fig. 4, Supplementary Figures 20 and 21). DKP formation can be further accelerated by selection of dipeptides that favor the cis-configuration, which can be achieved by alkylation at the alpha carbon of the first amino acid and more judicious N-alkylation at the second. When compared to our previous report employing an N–C insulin order, the yield was enhanced in a relative sense by 20%17. This improvement predominantly results from eliminating the more alkaline pH needed to cleave the ester bond. Overall, the synthetic yield of insulin was 30%, starting from purified A-chain.
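The overall figure follows from multiplying the yields of the sequential steps. The short Python sketch below is purely illustrative and not part of the reported methods; the function name `overall_yield` and the dictionary labels are hypothetical, and only the two step yields (45% and 65%) are taken from the text. The product comes to roughly 29%, consistent with the reported value of about 30% given rounding of the individual step yields.

```python
from math import prod


def overall_yield(step_yields):
    """Return the cumulative fractional yield of a sequence of steps.

    Each entry is a fractional yield between 0 and 1; the overall yield
    of sequential steps is the product of the individual yields.
    """
    yields = list(step_yields)
    for y in yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError(f"yield must be a fraction between 0 and 1, got {y}")
    return prod(yields)


# Step yields quoted in the text for native insulin (from purified A-chain).
# The labels are descriptive only and are not the authors' terminology.
insulin_steps = {
    "ligation, folding and purification (13 to 14)": 0.45,
    "DKP cleavage and purification (14 to 1)": 0.65,
}

insulin_total = overall_yield(insulin_steps.values())
print(f"Overall insulin yield: about {insulin_total * 100:.0f}%")
```

The same function can be applied to any of the other step-yield sequences quoted in this section by supplying their fractional yields.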

Synthesis of relaxin

The synthesis of relaxin (Supplementary Figures 1 and 2) began with the A- and B-chains, respectively, utilizing Fmoc-Cys(Trt)-OH/NovoSyn® TGA resin and Fmoc-Ser(tBu)-OH/ChemMatrix HMPB esterified resin. The remaining amino acids were added by a conventional Fmoc protocol, with isoacyl dipeptides employed as Asp–Ser at B1–B2 and Ser–Thr at B26–B27. In addition, the N-terminal residue of the A-chain was introduced as Gln, which was subsequently cyclized to pGlu. The Boc-Lys(iBu)Gly-OH dipeptide, PEG8, and bis-Boc-amino-oxyacetic acid were introduced as reported in the insulin synthesis (Supplementary Figures 22–25). The oxime ligation 21 and peptide folding 22 (Supplementary Figure 3) were also conducted as previously communicated, with a combined yield of 46% (Supplementary Figures 26–28). The DKP cyclization and the subsequent pGlu formation were completed in 7 h using 0.5 M phosphate buffer (pH 7.0, 56 °C) in a combined 65% yield (Supplementary Figures 29 and 30), and the overall synthetic yield of human relaxin-2, 2, was 30%, starting from A-chain. To minimize intermediate handling loss, we assessed the ligation, folding, and linker excision steps starting with pure A- and B-chains and chromatographically purifying only at the end (Fig. 5 and Supplementary Figure 4). This simplified protocol improved the total synthetic yield from 30 to 38%, and represents one of the most efficient chemical syntheses yet reported for human relaxin. The bioactivity of the synthetic relaxin-2 proved indistinguishable from an external native control hormone (Supplementary Figure 12).
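The gain from omitting the intermediate purification can be expressed in both absolute and relative terms. The following Python sketch is an illustration only; the function name `relative_change` and the variable names are hypothetical, and the two yields (30% and 38%) are the values quoted above. It gives an absolute gain of 8 percentage points, which corresponds to a relative improvement of roughly 27%.

```python
def relative_change(old, new):
    """Return the fractional change from old to new, relative to old."""
    if old <= 0:
        raise ValueError("reference value must be positive")
    return (new - old) / old


# Overall relaxin yields quoted in the text (fractions of purified A-chain).
stepwise_yield = 0.30  # with purification after folding
one_pot_yield = 0.38   # single purification at the end

absolute_gain = one_pot_yield - stepwise_yield
relative_gain = relative_change(stepwise_yield, one_pot_yield)

print(f"Absolute gain: {absolute_gain * 100:.0f} percentage points")
print(f"Relative gain: about {relative_gain * 100:.0f}%")
```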

Fig. 5

Analytical HPLC analysis in the synthesis of human relaxin. HPLC profiles for the synthesis of human relaxin from purified A- and B-chains through ligation 21, folding 22, DKP formation, and pGlu formation without intermediate purification to yield pure relaxin 2

Synthesis of a four-disulfide insulin analog

To further explore the potential of the new methodology, we applied it to an insulin analog with an additional, fourth disulfide linking CysA10 and CysB4 (Supplementary Figure 5). This analog, as prepared by biosynthesis, is reported to possess a reduced propensity for fibrillation and full in vivo activity21. The first chemical synthesis of this four-disulfide (4-DS) insulin analog was achieved through sequential disulfide bond formation that included an iodine oxidation step. An iodine-free synthesis of this challenging target suggests that the methodology may prove useful in the synthesis of other peptides with multiple disulfides, especially those containing methionine and tryptophan.

The syntheses of the insulin-extended A-chain S1 and B-chain S2 incorporated Fmoc-Cys(Trt)-OH at A10 and B4 but were otherwise identical to the previously presented insulin protocol, and the chains were obtained in respective yields of 24% and 15% (Supplementary Figures 35 and 36). The ligated linear precursor S3 was folded without modification of the insulin protocol, and the single-chain 4-DS analog S4 was obtained in 40% yield (Supplementary Figures 34 and 37). The excision step was achieved in 5 h to yield the pure 4-DS insulin analog 3 in 64% yield (Supplementary Figure 38). This peptide, as assessed by LC-MS and in vitro potency, was indistinguishable from the same insulin analog as prepared by orthogonal disulfide bond formation22, and only slightly less potent than native insulin (Supplementary Figure 10). The single-chain form of the 4-DS analog S4 was sizably less potent than the two-chain form, demonstrating the deleterious impact of an N-terminal constraint on bioactivity, but not on the ability to form native disulfides with a linker of appropriate length. Insulin with a comparable crosslink at the N-termini of A- and B-chains was suppressed in bio-potency to nearly the same extent as observed in the 4-DS analog 14 (Supplementary Figure 10).

Synthesis of penicillamine-containing insulin analogs

The synthesis of the A7(Pen) A-chain S5 and B19(Pen) B-chain S8 (Supplementary Figures 6 and 7) followed the same protocol as used for insulin, except for the use of Fmoc-Pen(Trt)-OH at A7 or B19. The chain assembly yields following purification were 22% for the A-chain analog S5 and 14% for the B-chain S8 (Supplementary Figures 40 and 45). The oxime ligation of S5 to 12 and of S8 to 9 was conducted as previously achieved for native insulin, and the ligated, purified synthetic intermediates S6 and S9 were obtained in respective yields of 55% and 50% (Supplementary Figures 41 and 46). The subsequent folding of S6 and S9 was performed without protocol modification as reported for the native sequence and was complete with comparable efficiency in 12 h to provide S7 in 20% yield and S10 in 19% yield (Supplementary Figures 39, 44, 42, and 47). The cleavage of the DKP-peg-bis linker was achieved in 9 h (pH 7.0, 56 °C) to provide analogs 4 and 5 in total yields of 30% and 28% (Supplementary Figures 43 and 48). The native disulfide pattern was implied by single LC peaks in the Glu-C peptide mapping (Supplementary Figure 10, Supplementary Table 1), and was definitively confirmed for the Pen-A7 analog by comparison to disulfide isomers prepared by orthogonal synthesis (Supplementary Figures 60 and 61). The in vitro bioactivity of these novel insulin analogs was assessed and observed to be reduced to varying degrees relative to the native hormone (Table 1, Supplementary Figure 11). The other four single-site penicillamine insulin analogs (A6, A11, A20, and B7) were chemically synthesized using a linear desDI single-chain precursor without issue (Supplementary Figures 55–59)15. The A7 and B19 analogs proved to be synthetically accessible only by the N-termini ligation approach we describe in this manuscript. The bioactivity of the penicillamine analog at A20 was least affected in a relative sense, especially when compared to the analogs at the other inter-chain cysteines (A7, B7, and B19). Interestingly, the placement of the gem-dimethyl substituent at A11 was approximately 100-fold more disabling than at A6, the other partnering residue in the single intra-chain disulfide.

Table 1 In vitro bioactivity of insulin penicillamine analogs

Synthesis of Lys–Pro insulin

Lys–Pro insulin represents the first hormone analog produced by rDNA technology to be approved for human use23. The inversion of the natural dipeptide to Lys–Pro eliminates trypsin-like proteolysis. Consequently, this analog should be equally accessible by an enzyme-based approach as by a synthesis that is DKP mediated. To prove this point and assess the relative efficiency in the removal of the auxiliary N,N-crosslink, insulin 6 was synthesized (Supplementary Figure 8) as described for the native sequence, but with replacement of the DKP-susceptible dipeptide with a Gly–Lys dipeptide. Peptide chain synthesis, oxime ligation, and disulfide formation in insulin 6 were achieved as with the native hormone in 46% yield (Supplementary Figures 49–52). The single-chain S14 was converted to the two-chain form 6 by Lys-C digestion (Supplementary Figure 54) in Tris buffer at pH 8 for 1 h. The Lys–Pro insulin was obtained after purification by RP-HPLC in 66% yield (Supplementary Figure 53). The overall yield of 6 as produced by enzyme cleavage was 30% from purified A-chain, which is identical to the yield of 1 obtained by DKP-mediated chemical cleavage.

Discussion

We report a general synthetic route to insulin-related peptides with likely application to the broader family of disulfide-rich, two-chain peptides. This straightforward method demonstrates the use of a non-native N–N linkage that is compatible with automated SPPS. The use of identical N-terminal A- and B-chain extensions and conventional ligation streamlines the assembly of the heterodimer, which is followed by single-step excision of the auxiliary tether. Insulin and relaxin, which have historically constituted difficult synthetic targets, were produced by this procedure within a few days, in high yield. Notably, an initial attempt to synthesize insulin through an N–N linkage without an N-terminal extension was reported to be unsuccessful16. In our experience, the folding efficiency was dramatically enhanced by incorporation of the OEG-based N-terminal extensions. The central, enabling element of this approach is the reversible N-terminal crosslinking of the A- and B-chains to enable intramolecular native disulfide bond formation. The efficiency is highlighted in the synthesis of relaxin from A- and B-chains in a 38% yield, employing only a final chromatographic purification step (Table 2).

Table 2 Synthetic yields at various stages of insulin-like peptides

The use of OEG-extended linkers was found to improve handling of the individual peptide chains and the ligated intermediate, and to enhance the subsequent formation of native disulfides. These conditions were applicable to the native hormones and translated to a synthetic target that had previously required orthogonal stepwise synthesis, a four-disulfide-containing insulin analog22. The successful syntheses of two individual penicillamine-substituted insulin analogs, which we could not prepare by native folding using a biomimetically linked insulin precursor15, demonstrate a unique virtue of this synthetic approach. The analogs complete an otherwise full set of selective penicillamine substitutions for each of the native cysteines (Table 1). There was no direct relationship between the difficulty of synthesis and bioactivity, as the B19 analog was of intermediate potency within the full set while A7 was least potent.

We envision the application of this approach beyond the insulin/relaxin superfamily. The methodology is compatible with peptides produced by any method where the linker can be semi-synthetically conjugated to a selective amine, preferably the N-terminus24. The linker can be further optimized to enhance the biophysical properties of synthetic intermediates. The synthetic approach is not limited to oxime linkage and could conceivably utilize other linkage chemistries. A sagacious aspect of the reported syntheses is the use of DKP formation, an adverse reaction in peptide synthesis25, as a controlling element in the removal of the auxiliary crosslink17,18. Further refinements in the propensity to cyclize will broaden the ability to accelerate or delay reversal of the crosslink. As exemplified in the synthesis of Lys–Pro insulin, the synthetic strategy is compatible with selective proteolysis. Notably, the synthetic yields obtained with chemical and enzymatic cleavage were comparable, attesting to the productivity of the former.

Methods

Synthetic procedures

See Supplementary Methods and Supplementary Figs. 1–8.

Characterization of native disulfide bonding

See Supplementary Figure 9, Supplementary Figures 60 and 61, and Supplementary Table 1.

In vitro bioanalysis

See Supplementary Figs. 10–12.

Characterization

See Supplementary Figs. 13–61 for LC-MS chromatograms of synthetic intermediates and final products.

Data availability

All data generated during the current study are available from the corresponding author on reasonable request.