Synergistic sequence contributions bias glycation outcomes

The methylglyoxal-derived hydroimidazolone isomer, MGH-1, is an abundant advanced glycation end-product (AGE) associated with disease and age-related disorders. As AGE formation occurs spontaneously and without an enzyme, it remains unknown why certain sites on distinct proteins become modified with specific AGEs. Here, we use a combinatorial peptide library to determine the chemical features that favor MGH-1. When properly positioned, tyrosine is found to play an active mechanistic role that facilitates MGH-1 formation. This work offers mechanistic insight connecting multiple AGEs, including MGH-1 and carboxyethylarginine (CEA), and reconciles the role of negative charge in influencing glycation outcomes. Further, this study provides clear evidence that glycation outcomes can be influenced through long- or medium-range cooperative interactions. This work demonstrates that these chemical features also predictably template selective glycation on full-length protein targets expressed in mammalian cells. This information is vital for developing methods that control glycation in living cells and will enable the study of glycation as a functional post-translational modification.


LC-MS Analysis and Quantification.
Analysis of mass spectrometry data was carried out using Agilent MassHunter BioConfirm Qualitative Analysis software and PEAKS Studio (v. 7.5) software. MS data were quantified using the MassHunter Molecular Feature Extractor, which reports cumulative ion counts (MS) as 'volumes' observed for any and all charge states associated with a particular ion. Because peptides and their modified counterparts can ionize differently (e.g., different charge states or different salt adducts), this method provides a more robust measure than comparing only a single charge state. For each peptide sample, compound lists were generated for each replicate of MGO treatments. Quantification was carried out by dividing the AGE adduct(s) volume by the total volume of both modified and unmodified peptide (Equation 1). This quantification approach allows for a robust comparison of glycation extents across different AGEs on different peptide substrates, even though each may exhibit some variation in ionization efficiency [1]. Additionally, absolute counts of peptides that were not treated with MGO were first evaluated to confirm that roughly similar levels of ionization were observed at the same known concentration. Retention time (RT) was used to identify discrete isomers with degenerate masses. For tandem mass spectrometry, the collected scans were combined and reported for each identified precursor ion for each compound that triggered MS/MS acquisition. ProteinProspector (UCSF), an online proteomics tool, was used to generate a list of expected b and y ions for modified and unmodified peptides, which was used to assign the site of modification (see Supplementary Figs. 2, 5, & 18). PEAKS Studio (v. 7.5) software was used for the de novo sequencing of the raw MS/MS data generated from peptides that were cleaved from single beads selected during library screening.
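The quantification described above can be illustrated with a short calculation. This is a minimal sketch, assuming Equation 1 has the form (adduct volume) / (sum of all adduct volumes + unmodified peptide volume); the function name and the example volumes are hypothetical placeholders and are not values from this study.

```python
def percent_glycation(adduct_volumes, unmodified_volume):
    """Express each AGE adduct as a percentage of the total peptide signal.

    adduct_volumes: dict mapping an adduct label (e.g. "[M+54]") to its
        summed ion volume across all charge states.
    unmodified_volume: summed ion volume of the unmodified peptide.
    """
    modified_total = sum(adduct_volumes.values())
    total = modified_total + unmodified_volume
    if total <= 0:
        raise ValueError("total ion volume must be positive")
    result = {name: 100.0 * vol / total for name, vol in adduct_volumes.items()}
    result["total glycation"] = 100.0 * modified_total / total
    return result


# Placeholder volumes in arbitrary units (assumed for illustration only).
example = percent_glycation(
    {"[M+54]": 2.0e6, "[M+72]": 1.0e6},
    unmodified_volume=7.0e6,
)
```

Because every adduct is normalized to the same denominator, the per-adduct percentages sum to the total glycation value, which is what allows different AGEs on the same peptide to be compared directly.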

AGE adducts for AcLESRHYA + MGO using standard glycating conditions and LC-MS acquisition
General Protocol for Glycation of Peptides. In general, glycation reactions for peptides in solution were performed at a 50 µL scale in 200 µL Eppendorf tubes. Commercially-available peptides (see Supplementary Fig. 1) were purchased as lyophilized powders and were reconstituted in 1:1 DMF:water to final stock peptide concentrations of 10 mM. Synthetic peptides purified in our laboratory were prepared as stock solutions that were 20 mM in DMF or 10 mM in 50% DMF/water, depending on solubility. MGO stocks were prepared by dilution of 15.3 µL of 40% w/v solution into 10 mL of ultrapure water (10 mM) and were stored at 4 °C for up to two weeks. To perform in vitro glycation, to 30 µL of ultrapure water was added 5 µL of a 10 mM stock peptide solution in 50% DMF/water, 10 µL of 100 mM phosphate buffered saline (PBS) at pH 7.3, and, lastly, 5 µL of a 10 mM MGO stock in water. The final concentrations were 1 mM peptide and 1 mM MGO in 20 mM PBS with 5% DMF co-solvent. MGO was always added in the final step to prevent any high concentration MGO exposure, and the glycation of each peptide was assessed individually. Tubes were capped, briefly spun in a benchtop microcentrifuge and incubated in a 37 °C water bath, typically for 3 h and in some cases up to 4 weeks. After incubation, peptide samples were diluted (1:100) into 5 mM Tris Buffer pH 7.3 to quench the reaction, unless otherwise noted, and subsequently subjected to LC-MS analysis.
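The stated final concentrations follow directly from the volumes and stock concentrations listed in the protocol. The sketch below simply re-derives them; the data layout and function name are illustrative assumptions, while the volumes and stock concentrations are taken from the text.

```python
def final_concentrations(additions):
    """Return (total volume, final concentration of each solute).

    additions: list of dicts with a "volume_uL" entry and a "solutes"
        dict mapping solute name -> concentration in that stock.
    """
    total_volume = sum(a["volume_uL"] for a in additions)
    final = {}
    for a in additions:
        for solute, stock_conc in a["solutes"].items():
            contribution = stock_conc * a["volume_uL"] / total_volume
            final[solute] = final.get(solute, 0.0) + contribution
    return total_volume, final


# Additions in the order given in the protocol (MGO last).
standard_reaction = [
    {"volume_uL": 30.0, "solutes": {}},  # ultrapure water
    {"volume_uL": 5.0, "solutes": {"peptide_mM": 10.0, "DMF_percent": 50.0}},
    {"volume_uL": 10.0, "solutes": {"PBS_mM": 100.0}},
    {"volume_uL": 5.0, "solutes": {"MGO_mM": 10.0}},
]

total_uL, conc = final_concentrations(standard_reaction)
```

With these inputs the total volume is 50 µL and the result reproduces the 1 mM peptide, 1 mM MGO, 20 mM PBS, and 5% DMF conditions given above.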
Library Design and Synthesis. A one-bead one-compound peptide library was designed based on a sequence derived from human serum albumin (-LLVRYTK-) that we [2] and others [3] have found to be glycated by MGO in vitro. Our previous work revealed this site to be the most reactive in HSA, producing only [M+54] and [M+72] adducts. Although our past work has indicated that Lys is far less reactive towards MGO than Arg [2], and we have found only nominal evidence of N-terminal glycation, we chose to replace Lys with Ala and to cap all N-termini to avoid potential competition from other nucleophiles. The library scaffold consisted of a C-terminal tetraalanine spacer, a central scaffold sequence of LXXRXXA, an amino-hexanoic acid (Ahx) linker, and, finally, an N-terminal cap (either biotin or acetyl). The four variable positions were randomized with 14 different amino acids: Ala, Asn, Asp, Gln, Glu, Gly, His, Leu, Phe, Ser, Thr, Trp, Tyr, and Val. All of the canonical amino acids were introduced at these positions, except for those with degenerate masses (Ile), prone to oxidation (Cys & Met), susceptible to competition with the central arginine (Arg & Lys), or that exhibit potentially confounding structural effects (Pro). The resulting combinatorial library of peptides thus contained 38,416 members, with the full-length sequence being (Ac/Bio)-Ahx-LXXRXXA-AbAbAbAb conjugated C-terminally to the resin via a p-hydroxymethylbenzoic acid (HMBA) linker. The HMBA linker allows for side chain deprotection using standard trifluoroacetic acid conditions, aqueous treatments of resin-bound peptides, and hydroxide base-mediated cleavage prior to analysis.
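The library size follows from four variable positions with 14 residues each (14^4 = 38,416). As a minimal sketch, the LXXRXXA core sequences can be enumerated as shown below; the variable names are illustrative, and the one-letter codes are the standard abbreviations for the 14 residues listed above.

```python
from itertools import product

# One-letter codes for Ala, Asn, Asp, Gln, Glu, Gly, His, Leu, Phe,
# Ser, Thr, Trp, Tyr, and Val (the 14 residues used for randomization).
VARIABLE_RESIDUES = "ANDQEGHLFSTWYV"


def core_sequences():
    """Yield every LXXRXXA core sequence encoded by the library design."""
    for x1, x2, x3, x4 in product(VARIABLE_RESIDUES, repeat=4):
        yield f"L{x1}{x2}R{x3}{x4}A"


library = set(core_sequences())
library_size = len(library)
```

A set is used so that membership checks such as `"LESRHYA" in library` are cheap; the same check shows that the parent HSA-derived sequence with Lys replaced by Ala (LLVRYTA) is also a member of the library.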
Standard solid phase peptide chemistry was used to achieve the library synthesis on the aqueous compatible ChemMatrix® resin (100-200 mesh, 0.50 mmol/g loading, PCAS BioMatrix, Inc., 1040378). To begin, the resin (~500,000 beads, 250 µmol) was swelled in DMF (10 mL) and placed on a shaker for approximately 45 min in a 20 mL polypropylene fritted syringe. To couple the first amino acid, 5 equiv. (1.25 mmol) of Fmoc-β-Ala-OH was dissolved in DMF and was pre-activated with 5 equiv. (1.25 mmol) of diisopropylcarbodiimide (DIC). Subsequently, 0.1 equiv. (2.5 µmol) of 4-dimethylaminopyridine (DMAP) was added to the pre-activated amino acid, and the resulting solution was incubated with the swelled resin overnight. Excess reagent was removed by washing with approximately 10 mL DMF for a total of 5 cycles of brief shaking and draining. This coupling step was performed twice to ensure complete loading of the first amino acid to the HMBA linker. Subsequent additions of amino acids were achieved using iterative cycles of Fmoc-deprotection and coupling, as described above, though coupling steps were extended to 1.5 hours during randomization steps. To introduce randomized amino acids at the four variable positions, standard split and pool methods were used. For these steps, the resin was split into 15 scintillation vials by capping a 20 mL syringe, adding 15 mL of DMF and placing 1 mL of slurry into each vial (estimated to be roughly 0.1 mmol of peptide). Following completion of the peptide sequence, an amino-hexanoic acid spacer was coupled using standard coupling protocols, as described above. The final synthesis step was the addition of either an acetyl or a biotin group to cap the N-terminus. N-terminal acetylation was achieved by incubation with a solution of acetic anhydride (1.5 equiv.) with DIEA (3 equiv.) for 2 hours in 10 mL DMF. For N-terminal biotinylation, beads were incubated with 5 equiv. (1.25 mmol) of D(+)-Biotin (MilliporeSigma, 8.51209), 5 equiv.
Library Screening with α-MGH-1 Antibodies. As the library size is approximately 40,000 unique sequences, the following screening protocol was designed to oversample the library 3X within each replicate, and was completed in triplicate to ensure proper library sampling [4]. To begin, 120 mg of N-terminally acetylated peptide library (~120,000 beads) was side chain deprotected using a solution of TFA:TIPS:H2O (95:2.5:2.5) in a total volume of 20 mL, shaking in a polypropylene fritted syringe for 5 hours. Beads were then washed 5X with water and allowed to equilibrate in DI water overnight. The next day, beads were washed 5X with 20 mM PBS at pH 7.3 and then exposed to 0.5 mM MGO for 3 h at 37 °C. Following incubation, excess MGO was removed via washing (3X) with 20 mM Tris buffered with 150 mM NaCl and containing 0.1% Tween (TBST). The resulting MGO-modified beads were blocked using a 1% solution of bovine serum albumin (BSA) in TBST for 2 h at room temperature. Next, the blocked beads were incubated with a 1:1000 dilution of α-MGH-1 antibody (Cell BioLabs, Inc., STA-011) for 18-20 h. This antibody was developed using MGH-modified ovalbumin [5]. It has been reported to be specific for MGH-1, based on a competition assay using several AGE-modified antigens, and has been validated for the detection of MGH-1 using synthetic MGH-modified immunogens [5,6]. After incubation with primary antibody, the beads were washed 5X with TBST and then exposed to a 1:1000 dilution of α-mouse secondary antibody (Abcam, AB7069) conjugated to alkaline phosphatase for 4 h. After five washes with TBST to remove excess secondary antibody, beads were equilibrated in alkaline phosphatase buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM MgCl2 at pH 9.0) for 1 h.
Following equilibration, the beads were exposed to color developing reagents 5-bromo-4-chloro-3-indolyl phosphate and nitro blue tetrazolium (BCIP/NBT, Promega, S3771) and were transferred to a petri dish and imaged under a microscope equipped with a camera (Leica DMi8). Dark purple beads were manually selected and cleaved using 50 µL of 100 mM sodium hydroxide per bead. The resulting single-bead cleavage mixtures were diluted 50% in 100 mM hydrochloric acid to neutralize the solution, filtered using fritted microcentrifuge spin columns and subjected directly to sequencing by LC-MS/MS. As a result, a total of 75 beads were selected out of a cumulative 360,000 possible beads.
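The rationale for 3X oversampling in triplicate can be made concrete with a simple sampling estimate. This is a rough sketch under a stated assumption that is not tested in this work: each bead is treated as carrying a sequence drawn uniformly and independently from the 38,416 possible sequences. The function and variable names are illustrative.

```python
LIBRARY_SIZE = 14 ** 4          # 38,416 unique sequences
BEADS_PER_REPLICATE = 120_000   # ~120 mg of resin per screen
REPLICATES = 3


def expected_coverage(n_beads, library_size):
    """Expected fraction of sequences present on at least one bead,
    assuming uniform, independent sampling of sequences by beads."""
    p_absent = (1.0 - 1.0 / library_size) ** n_beads
    return 1.0 - p_absent


oversampling = BEADS_PER_REPLICATE / LIBRARY_SIZE
coverage_single = expected_coverage(BEADS_PER_REPLICATE, LIBRARY_SIZE)
coverage_combined = expected_coverage(
    BEADS_PER_REPLICATE * REPLICATES, LIBRARY_SIZE
)
```

Under this idealized model, a single replicate at roughly threefold oversampling is expected to miss a small percentage of sequences, whereas the combined 360,000 beads leave very few sequences unsampled. Real split-and-pool libraries deviate from uniform sampling, so these values should be read as estimates only.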
Library Screening using Impaired Proteolysis by Trypsin. As the library size is approximately 40,000 unique sequences, the following screening protocol was designed to oversample the library 3X within each replicate, and was completed in triplicate to ensure proper library sampling [4]. To begin, 120 mg of N-terminally biotinylated peptide library (~120,000 beads) was side chain deprotected using a solution of TFA:TIPS:H2O (95:2.5:2.5) in a total volume of 20 mL, shaking in a polypropylene fritted syringe for 5 hours. Beads were then washed 5X with water and allowed to equilibrate in DI water overnight. The next day, beads were washed 5X with 20 mM PBS at pH 7.3 and then exposed to 0.5 mM MGO for 3 h at 37 °C. Following incubation, excess MGO was removed via washing (3X) with 20 mM Tris buffered with 150 mM NaCl and containing 0.1% Tween (TBST). The resulting MGO-modified beads were washed 3X in PBS and exposed to sequencing grade modified trypsin (Promega, V5111) for 48 h. As glycated arginine residues are resistant to trypsin activity, this treatment resulted in truncation only for unmodified beads, removing the N-terminal biotin tag from any sequences that did not react with MGO. After trypsin exposure, beads were washed 3X in TBST and blocked for 2 h using a solution of 0.1% BSA in TBST. Next, beads were incubated with a 1:1000 dilution of streptavidin alkaline phosphatase (Promega, V5591) in 0.1% BSA in TBST. The beads were then equilibrated in alkaline phosphatase buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM MgCl2 at pH 9.0) for 1 h. Following equilibration, the beads were exposed to color developing reagents BCIP and NBT (Promega, S3771), transferred to a petri dish and imaged under a microscope equipped with a camera (Leica DMi8). Dark purple beads were manually selected and cleaved using 50 µL of 100 mM sodium hydroxide.
The resulting single-bead cleavage mixtures were diluted 50% in 100 mM hydrochloric acid to neutralize the solution, filtered using fritted microcentrifuge spin columns and subjected directly to sequencing by LC-MS/MS. As a result, a total of 30 (20 positive, purple; 10 negative, colorless) beads were selected out of a cumulative 360,000 possible beads.

MGO Treatments for ANP-Linked Resin-Bound Peptides.
To compare the performance of the top hit selected from the library (LESRHYA, peptide 1) and a control sequence derived from human serum albumin screening (LLVRYTA, peptide 4) while still attached to resin, it was necessary to find an alternative to the strongly basic cleavage conditions used to liberate peptides from the HMBA-linked resin. To do so, sequences matching peptides 1 and 4 were synthesized on ChemMatrix resin in a format similar to that described in the library design section. However, for these studies, a photocleavable amino-nitrophenyl-propionic acid linker (ANP) was incorporated in place of the third β-alanine spacer, and two 2-PEG spacers were used after ANP, resulting in the following sequence: Ac-LXXRXXA-(2-PEG)2-ANP-(Ab)2. To evaluate differences in glycation between these two sequences, resin charged with either peptide 1 or peptide 4 (~5 mg) was side chain deprotected and equilibrated in a 20 mM PBS solution at pH 7.3. Following equilibration, the beads were treated with 0.5 mM MGO for 3 h at 37 °C in 20 mM PBS at pH 7.3. After MGO treatment, beads were washed 5X with 50 mM Tris buffer at pH 7.3 and allowed to incubate in this buffer for 24 h at room temperature to mimic the conditions used during library screening. Finally, the beads were transferred to a solution of 10% methanol in water and subjected to UV light (λ = 360 nm) for 45 seconds. The resulting mixture was filtered and subjected to LC-MS/MS analysis.

Preparation of MGH-1-Modified Peptide 1 (peptide 1 MGH-1).
For NMR studies, roughly 10 mg of MGH-1 modified peptide 1 was prepared by incubation of peptide 1 (200 µL of 20 mM stock in DMF) with MGO (400 µL of 10 mM stock in water) in phosphate buffered saline (400 µL, pH 12, 100 mM) at a final concentration of 4 mM peptide 1, 4 mM MGO, 40 mM PBS at pH 12, and 20% DMF in 1 mL total volume. This solution was incubated at 37 °C in a water bath for 3 h. The high pH and higher phosphate concentration were used to promote the formation of MGH-1. The resulting mixture of AGEs, in which MGH-1-modified peptide 1 (peptide 1 MGH-1) was the predominant product (30-50% yield by LC-MS), was purified by semi-preparative HPLC using a gradient of 10-30% acetonitrile in water over 20 min at 4.0 mL/min. Collected fractions were characterized by MALDI-TOF, pooled and lyophilized. This process was repeated until sufficient quantities of peptide 1 MGH-1 could be obtained.

The resulting mixture was agitated at room temperature for up to 24 hours. To evaluate amide formation at available carboxylate functional groups, 3 µL of each reaction mixture was diluted into 300 µL of pure deionized water (1:100) and subjected to analysis by LC-MS (as previously described) without further purification. The LC-MS mobile phase gradient elution was changed to 5-80% B during a 1.75-16.00 min range to ensure that any hydrophobic reaction products were observed within the chromatographic window. Rather than automated precursor ion selection, targeted ion identification for MS/MS acquisition was used to identify the triply modified derivatization product. After selection in MS, fragmentation was induced using subsequent scans of static collision energies (20, 25, 30, 35 V) in each cycle. This strategy allowed for the sampling of multiple fragmentation events that are not captured when using automated MS/MS acquisition.

Plasmids for Protein Expression.
Plasmids encoding green fluorescent protein (GFP) variants C-terminally fused to the peptide 1 sequence (-LESRHYA, GFP-1), the peptide 3 sequence (-LDDREDA, GFP-3), or a glycation-inert control sequence (-LESAHYA, GFP-1 Ala ) were designed and purchased from GenScript, in a pcDNA_3.1(+) vector, which was selected due to its propensity for uninduced, high expression in mammalian cell lines. These C-terminal peptide sequences were connected to GFP through a linker sequence containing a tobacco etch virus (TEV) protease cleavage site. The expression sequences used to encode the GFP-fusion proteins are shown below. For experiments to evaluate glycation of each GFP variant, roughly 1 million HEK-293T cells were seeded in a 10 cm sterile tissue culture dish and grown to 50-60% confluency (~18-24 hours). The next day, cultures were transfected with 10 µg of the desired plasmid using TransIT-LT1 Transfection Reagent (Mirus Bio, 2 μL/μg plasmid). After transfection, cells were cultured for an additional 24 hours (to >90% confluency) before MGO treatments and subsequent harvesting. In general, cells were treated with MGO diluted into DMEM supplemented with 10% FBS and 1% Pen-Strep, in stocks that were prepared the same day as MGO treatment. MGO stock solutions (100 mM) were prepared in sterile water by the addition of 154 µL of commercially available MGO (40% w/v) to a total volume of 10 mL. The 100 mM MGO stock was further diluted into DMEM until a desired concentration (2.5 or 5 mM final concentration) was reached. Adherent cells on 10 cm sterile culture dishes were treated with 10 mL of this MGO-supplemented DMEM and incubated at 37 °C at 5% CO2 for up to 3 hours. After MGO treatment, cells were harvested with TrypLE Express (Gibco). Harvested cells were transferred to a 15 mL conical tube, and pelleted at 200 x g. The resulting pellet was washed with 3-5 mL of 20 mM PBS, pH 7.3 and stored on ice until lysis. 
Cells were lysed on ice in 400-600 μL Tris-Cl buffer (50 mM Tris, 150 mM NaCl, 1 mM EDTA, 1 mM NaF, 1% Triton X-100) at pH 7.5 with a Pierce Protease and Phosphatase Inhibitor tablet (1 tablet/10 mL buffer). Lysates were clarified by centrifugation at 4255 x g for 30 minutes, and total protein quantified by BCA Protein Assay (Pierce) on a Tecan Spark 10M plate reader and analyzed by western blot (10 μg/sample) or used for immunoprecipitation.
Immunoprecipitation Protocol. Cellular lysates (~200-250 µg protein) were diluted to a total volume of 500 µL in Eppendorf tubes containing 50 mM Tris, 150 mM NaCl, 1 mM EDTA, 1 mM NaF, pH 7.5, supplemented with protease and phosphatase inhibitor tablets (Pierce). To this, 25 µL of GFP-Trap® agarose slurry was added and allowed to incubate for 1-2 hours at 4 °C with end-over-end rotation. Following the incubation period, the agarose resin was pelleted, washed 3X with the same buffer per the manufacturer's recommendation, and transferred to a fritted spin column. For elution using TEV proteolysis, the resin was suspended in 100 µL of 1X TEV Reaction Buffer (New England Biolabs) in a plugged spin column. To this, 100 units of TEV protease (New England Biolabs) were added and incubated at 30 °C with shaking. Following incubation, the supernatant was eluted into a clean Eppendorf tube by brief (30-60 s) centrifugation in a bench-top microcentrifuge. The resulting eluates were collected and subjected to LC-MS analysis without any further purification or dilution. To elute the full-length GFP, rather than only the C-terminal peptide fragment, the protein-bound GFP-Trap® agarose was resuspended in 50 µL of glycine elution buffer provided in the GFP-Trap® enrichment kit (Chromotek). After a 15 min incubation, supernatants were collected by eluting into a clean Eppendorf tube by brief (30-60 s) centrifugation in a bench-top microcentrifuge. Resulting eluates were analyzed by LC-MS (for C-terminal peptide fragments) or by western blot (for full-length protein).
Western Blotting. Western blot analysis was used for cellular lysates and for GFP-Trap® eluates of full-length GFP protein.
Lysates containing 10 µg of total protein, or GFP-Trap® eluates, were diluted into 6X SDS loading buffer and boiled for 5 minutes. SDS-PAGE was run on pre-cast protein gels (8-16%, mini-PROTEAN® TGX™) in standard Tris/glycine/SDS running buffer at 200 V for 35 min to resolve protein bands. Following separation by SDS-PAGE, proteins were transferred to PVDF membrane using the iBlot 2 (Invitrogen) dry blotting system. Membranes were subsequently blocked in a buffer containing 20 mM Tris, 150 mM NaCl and 0.1% Tween (1X TBST) supplemented with 5% (w/v) skim milk powder. After blocking, primary antibody incubation was achieved overnight at 4 °C per manufacturer's recommendation. Membranes were washed 3X for 5 min each with 5% milk-TBST. Detection was performed by incubating membranes with HRP-linked secondary antibodies (Cell Signaling Technology) in 5% milk-TBST for one hour at room temperature. After secondary antibody incubation, membranes were washed again 3X for 5 min with 1X TBST. Chemiluminescent signal was developed with Clarity Western ECL Substrate (Bio-Rad) and imaged on a Bio-Rad ChemiDoc XRS+.

This set of peptides exhibited substantial differences in total glycation, as well as in the diversity of the individual AGE products that formed. Despite the presence of unprotected N-termini, the vast majority of modifications were found at Arg. The major exception was the formation of an [M+12] adduct, which remains structurally undefined, that formed exclusively for peptides with N-terminal Pro residues (see Supplementary Fig. 2). These studies support the premise that primary sequence alone, in an otherwise unstructured short peptide, is enough to influence glycation outcomes.

The commercially available polyclonal α-MGH-1 antibody (Cell BioLabs, STA-011) was raised against heterogeneously glycated ovalbumin [5]. To confirm that our library selection strategy did not introduce a preference for these potential epitopes, we evaluated all arginine-containing sequences from ovalbumin (a) as a frequency logo or (b) individually. We found that none of these sequences, or even similar sequences, were selected in the peptide library (see Supplementary Table 1), confirming that α-MGH-1 antibodies are indeed specific for MGH-1 modified Arg side chains.

To confirm that we selected sequences that promote MGH-1 formation when using α-MGH-1 primary antibodies to screen our library, we also developed an alternative approach that could distinguish hit beads based on impaired proteolysis by trypsin. This alternative screening strategy selects sequences that promote high levels of glycation overall, independent of the formation of specific AGEs such as MGH-1. (a) To prepare this library, randomized sequences were capped with an N-terminal biotin in place of an acetyl group (see Supplementary Methods). After the same MGO treatment steps (0.5 mM MGO), beads were subjected to proteolytic cleavage by trypsin. Trypsin cleavage is impaired when Arg becomes glycated. Thus, the N-terminal biotin was removed only for unmodified sequences. The remaining glycated sequences were selected following incubation with a streptavidin-alkaline phosphatase conjugate and visualization with colorimetric BCIP/NBT reagents. (b) Representative image demonstrating that after this treatment, many beads display peptides that remained unmodified and were therefore colorless after exposure to trypsin and colorimetric reagents. However, a few beads were highly glycated, as indicated by the dark purple color (center). These dark purple beads were manually selected, cleaved, and sequenced using LC-MS/MS and PEAKS de novo sequencing software. (c) Using this approach, just 20 individual "hit" beads were identified.
The consensus motif that was obtained using this strategy (LVHRGQA) is distinct from the one derived from the α-MGH-1 screening approach (LESRYYA). Sequences that lead to the lowest levels of overall glycation were identified by choosing beads that were virtually colorless (see (b)). The consensus motif obtained for these 10 "non-hits" was significantly different from those obtained for "hit" beads from either screening approach, and indicated that an abundance of negative charge is detrimental to glycation (see also Supplementary Table 2), corroborating our previous findings [2]. We note that the consensus motif that was obtained using impaired proteolysis by trypsin is different from the one obtained when using α-MGH-1 antibodies. These results are consistent with a model in which the "trypsin" approach selects for any glycation adduct, which could include more than ten known chemical structures that can form just from the reaction between MGO and Arg [7][8][9]. It is most likely that the trypsin approach screens for the first steps of the glycation reaction that occur rapidly, such as the formation of MGH-DH (see Main Text Fig. 4), whereas the α-MGH-1 screening approach provides information about features that promote the (presumably rate-determining) elimination step that converts MGH-DH to MGH-1 (see also Main Text Figs. 4 & 6).

Fig. 7). Library beads that remain unmodified upon treatment with MGO are able to be cleaved by trypsin to remove the N-terminally linked biotin motif. Sequences were identified by selecting beads with no color development upon treatment with alkaline phosphatase-conjugated streptavidin and BCIP/NBT reagents. Because the trypsin cleavage was not 100% efficient, it was possible to sequence peptides selected from colorless beads based on the low levels (~10%) of full-length peptide that remained following trypsin exposure.
Ten beads were selected, cleaved and sequenced using LC-MS/MS and PEAKS de novo sequencing software. In all cases, there is a substantial overrepresentation of the acidic residues Asp and Glu. Entry 1 (-LDDREDA-) is a hit that matches well with the consensus motif (see Supplementary Fig. 7) and was therefore chosen as a representative non-glycated control sequence for subsequent peptide studies.

(5)) Data were derived from independent experiments: peptide 1 (n=8), peptide 2 (n=5), peptide 3 (n=3), peptide 4 (n=5), peptide 5 (n=3). Importantly, the hit sequence obtained when screening with α-MGH-1 antibodies (peptide 1) yields significantly more [M+54] than both a control sequence from HSA (peptide 4) and the hit sequence obtained when screening using resistance to trypsin proteolysis (peptide 5), as early as 3 hours into MGO treatment. The "non-hit" sequence (peptide 3) yields significantly less glycation and less [M+54] than any other peptide tested. (c) Distribution of AGE adducts observed for peptides 1-5 (Ac-LESRHYA (1), Ac-LESRYYA (2), Ac-LDDREDA (3), Ac-LLVRYTA (4), and Ac-LVHRGQA (5)) after 24 hours of MGO treatment. Data were derived from independent experiments: peptide 1 (n=3), peptide 2 (n=5), peptide 3 (n=3), peptide 4 (n=4), peptide 5 (n=3). There are still significant differences in [M+54] and total glycation levels for peptides 1, 2, 4 & 5 compared to the negative control sequence, peptide 3. However, differences in the amount of [M+54] are overshadowed by large increases in the double addition adduct, [M+144], which likely corresponds to isomers of tetrahydropyridine (see Fig. 1), or other structurally uncharacterized MGO double addition adducts. A non-directional (two-tailed), one-way ANOVA using Tukey's multiple comparison test was used to compare the mean levels of [M+54] adducts (blue) or total glycation (black) for all peptides.
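For readers less familiar with the statistical comparison named above, the omnibus part of a one-way ANOVA can be sketched in plain Python. This is an illustration only: it computes the F statistic and degrees of freedom, does not perform the Tukey post hoc comparisons (which would normally be done in a statistics package), and uses made-up placeholder numbers rather than any data from this study. The function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicate values")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder replicate values (assumed, not measured).
placeholder_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_value, df_b, df_w = one_way_anova_f(placeholder_groups)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs; pairwise tests such as Tukey's then identify which pairs differ.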
Replacing Glu in the -2 position with Gln (1a) led to a subtle decrease in total glycation that is not statistically significant. There were also no statistically significant changes in the [M+54] adduct. This can be attributed to a conservative substitution that retains critical polar contacts. In contrast, replacement with Asp (1b) led to a substantial decrease in both MGH-1 and total glycation levels, suggesting that the specific placement of the Glu side chain is critical. The substitution of Ser in the -1 position with Ala (1c) also led to a moderate decrease in total glycation levels and a small, but in this case statistically significant, decrease in

and in some cases total glycation, which can be attributed to the loss of negative charge that has been shown to be detrimental to glycation (see also

To do so, the most abundant hit sequence (LESRHYA, peptide B1), and a control sequence derived from HSA (LLVRYTA, peptide B4), were both synthesized on ChemMatrix resin using a photocleavable linker, ANP (see Methods). After MGO exposure and wash steps that mimicked the conditions used during library screening, glycated sequences were cleaved from the bead using high intensity UV light. Due to the nature of ANP linker cleavage, three ions corresponding to the acid, amide or methyl ester C-terminus were observed for each modified sequence. (b) Quantification of the extent and distribution of glycation for both peptides provided a clear rationale for why LESRHYA was selected from the library, but LLVRYTA was not: although levels of total glycation were somewhat greater for B4 than for B1, the latter showed more [M+54] adduct, which is detected by the α-MGH-1 antibodies. Data shown in (b) are derived from independent experiments (n=2). (c) Additionally, extracted compound chromatograms (ECCs) for the observed [M+54] adducts show that peptide B4 led to multiple [M+54] isomers whereas peptide B1 yielded just one (see also Main Text Fig. 3).
The representative ECCs shown are for the [M+54] adduct to amidated peptide obtained after photocleavage. Similar results were observed for methyl ester and acid products.

Additionally, these peptides all produced more overall glycation (>50%) than any other peptide series evaluated in this study, including peptide 1. The interplay between total glycation and formation of specific adducts is quite complicated. The overall increase in glycation in these peptides can mostly be attributed to the increase in additions that yield [M+72] and [M+144] adducts. We have observed that residue identity surrounding a central arginine is critical for influencing the mechanistic steps that result in AGE formation. For instance, the dense negative charge in peptide 3 likely disrupts initial MGO addition, as evidenced by the drastic decrease in [M+72] adduct at all timepoints (see Fig. 2 & Supplementary Fig. 8). Furthermore, the position of tyrosine, relative to arginine and other amino acids in the sequence, is critical in favoring [M+54] adducts. This suggests that the additional functionalities in the peptide 1 scaffold serve to favor MGH-1 formation, rather than promote overall glycation (see also Supplementary Fig. 13).

A non-directional (two-sided) ordinary one-way ANOVA using Tukey's multiple comparison test was used to compare the mean yields of [M+54] (blue) or total glycation (black) for all peptides.

We found that the overall level of glycation as well as the distribution of AGEs could be influenced by scrambled peptide 1 variants. Importantly, we found that several scrambled variants (peptide 1 SP4, 1 SP5, and 1 SP6) produced multiple [M+54] isomers, whereas peptide 1 produces just one. Interestingly, all peptides that yielded multiple isomers of [M+54] contain a tyrosine directly adjacent to arginine in sequence. Though we have confirmed that tyrosine's activity as a base is critical, the relative position of other residues likely contributes to formation of the necessary intermediates in the overall mechanism for MGH-1 formation (see Main Text Fig. 6). This result strongly suggests that Tyr alone is not sufficient to produce the specific glycation outcome, and that Tyr instead works cooperatively with other residues in the peptide 1 scaffold to produce a single MGH isomer. A non-directional (two-sided) ordinary one-way ANOVA using Dunnett's multiple comparison test, comparing each mean to the mean for peptide 1, was used to determine whether variants yield significantly different amounts of [M+54] (blue) or total glycation (black).

Adducts. MGH-DH is a single addition of MGO to arginine. The structure of MGH-DH includes a vicinal diol that comes from the formation of two hemiaminals between the aldehyde and ketone of MGO and the guanidino group on Arg. We hypothesized that this bis-hemiaminal could be derivatized using boronic acids, which have been reported to derivatize vicinal diols on sugars [11]. (a) Scheme for the chemical derivatization of the peptide 1 [M+72] A adduct (RT = 9.35 min) using boronic acids. We found that the resulting boronic esters were not stable to LC-MS conditions, and thus these reactions were evaluated using MALDI.
Treatment of this first [M+72] A adduct, which dominates early timepoints, with a brominated boronic acid resulted in formation of a new peak corresponding to the boronic ester, which also possessed the characteristic isotope pattern expected for bromine. (b) The second [M+72] B adduct (RT = 9.86 min), which becomes the major product only after 48 h of dilution (see Main Text Fig. 4), could not be derivatized by boronic acids using the identical protocol, consistent with formation of carboxyethylarginine (CEA) (see also Supplementary Figs. 17 & 18).

We found that the putative MGH-3 isomer undergoes rapid hydrolysis to form CEA (~70% at all concentrations) and also undergoes AGE removal to regenerate unmodified peptide 1 (~25%). After 48 h of incubation, the putative MGH-3 adduct was nearly quantitatively consumed. Compared to MGH-1 hydrolysis (see Supplementary Fig. 19), there appears to be less concentration dependence in these studies, although there are small differences in the rate of appearance of CEA at early timepoints. However, all concentrations tested led to similar product distributions (predominantly CEA and unmodified peptide 1) by the end of the 48 h incubation period. Because MGH-3 is known to be less stable than MGH-1 and more prone to hydrolysis, these observations are consistent with formation of a peptide 1-MGH-3 adduct. We note that the observed profile of CEA formation for peptide 1 upon dilution (see Main Text Figs. 4 & 5) tracks closely with that observed for peptide 1 MGH-1, not with that of the putative MGH-3 adduct. If MGH-3 were to form, we would expect to see greater levels of CEA formation along with greater levels of unmodified peptide 1 from which AGEs were removed. Levels of glycation for peptide 1 drop only very slightly upon dilution (see Main Text Figs. 4 & 5 and Supplementary Figs. 22-23). Thus, we conclude that the relevant intermediate in our study is indeed MGH-1.
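The need for retention time and chemical derivatization in the assignments above can be illustrated with a short sketch of the expected monoisotopic mass shifts. It assumes that [M+54] corresponds to one MGO (C3H4O2) minus one water, [M+72] to one MGO, and [M+144] to two MGO equivalents. The candidate lists contain only the adducts named in this document, and the function and variable names are hypothetical.

```python
# Monoisotopic masses (Da) of the elements needed here.
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a formula given as {element: count}."""
    return sum(ELEMENT_MASS[element] * count for element, count in formula.items())


MGO = monoisotopic_mass({"C": 3, "H": 4, "O": 2})  # methylglyoxal, ~72.021 Da
WATER = monoisotopic_mass({"H": 2, "O": 1})        # ~18.011 Da

# Expected mass shifts under the assumptions stated above.
EXPECTED_SHIFT = {
    "[M+54]": MGO - WATER,  # one MGO with loss of water (hydroimidazolone)
    "[M+72]": MGO,          # one MGO, no loss of water
    "[M+144]": 2 * MGO,     # two MGO equivalents
}

# Adducts named in this document that share each nominal shift.
CANDIDATES = {
    "[M+54]": ["MGH-1", "MGH-3"],
    "[M+72]": ["MGH-DH", "CEA"],
    "[M+144]": [],  # not assigned to a named adduct in this section
}


def assign_shift(observed_delta, tolerance=0.01):
    """Return the shift labels whose expected value lies within `tolerance` Da."""
    return [
        label
        for label, expected in EXPECTED_SHIFT.items()
        if abs(expected - observed_delta) <= tolerance
    ]


labels = assign_shift(72.021)
print(labels, CANDIDATES[labels[0]])
```

A mass match alone returns both MGH-DH and CEA for [M+72], and both MGH-1 and MGH-3 for [M+54], which is why the isomers are distinguished here by retention time, boronic acid derivatization, and hydrolysis behavior rather than by mass.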
Together, our results are most consistent with a model in which MGH-1 is a direct precursor to CEA (see also Main Text Fig. 4, Main Text Fig. 5 and Supplementary Fig. 19).

These experiments were completed using our standard MGO reaction conditions, with the exception of the phosphate buffer pH, which was adjusted to either pH 6.0 or pH 12.0. At pH 6.0, all phenolic groups are expected to be predominantly protonated and uncharged. At pH 12.0, all phenolic groups are expected to be deprotonated and negatively charged. However, at pH 7.3, Tyr would be expected to be protonated and uncharged for peptides 1 and 1 Cl (>99% and ~90%, respectively), but for peptide 1 2Cl it would be predominantly deprotonated and negatively charged (only ~10% protonated).

(pH 7.3): peptide 1 (n=8); 1 Cl (n=3); 1 2Cl (n=3). Right panel (pH 12.0): peptide 1 (n=2); 1 Cl (n=2); 1 2Cl (n=2).

At pH 6.0 and 12.0, there are no statistically significant differences in total glycation or MGH-1 formation. However, at pH 7.3, there are significant differences in the total levels of glycation and the resulting product distributions. We attribute this to two distinct phenomena. First, at neutral or acidic pH, MGH-1 formation requires the Tyr phenoxide to act as a base; thus, a greater proportion of MGH-1 is observed for peptides 1 Cl and 1 2Cl than for peptide 1. At pH 12.0, this trend disappears because the solution is basic enough for deprotonation to occur without any assistance from Tyr. Second, our published work and the work herein have demonstrated that clustered negative charge impedes glycation. This explains why peptide 1 2Cl, which has one extra negative charge compared to peptides 1 or 1 Cl at pH 7.3, exhibits lower levels of glycation overall. Additionally, this experiment supports our hypothesis that tyrosine is acting as a base, as MGH-1 levels increased with increasing pH.
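The protonation estimates quoted above follow from the Henderson-Hasselbalch relationship. The sketch below is illustrative only: the phenol pKa values for the chlorinated variants are back-calculated from the ~90% and ~10% protonated fractions stated above at pH 7.3 rather than taken from measurements, and the value of 10.1 for unmodified Tyr is an assumed textbook figure. Function names are hypothetical.

```python
import math


def fraction_protonated(pH, pKa):
    """Fraction of a monoprotic acid (here, the phenol) that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def implied_pka(pH, protonated_fraction):
    """pKa consistent with a given protonated fraction at a given pH."""
    return pH + math.log10(protonated_fraction / (1.0 - protonated_fraction))


# Back-calculated from the fractions stated in the text at pH 7.3.
pka_1cl = implied_pka(7.3, 0.90)   # roughly 8.25
pka_12cl = implied_pka(7.3, 0.10)  # roughly 6.35
pka_tyr = 10.1                     # assumed textbook value for the Tyr phenol

for name, pka in (("1", pka_tyr), ("1 Cl", pka_1cl), ("1 2Cl", pka_12cl)):
    row = ", ".join(
        f"pH {pH}: {fraction_protonated(pH, pka):.1%}" for pH in (6.0, 7.3, 12.0)
    )
    print(f"peptide {name} (pKa {pka:.2f}) protonated -> {row}")
```

Under these assumed pKa values, all three phenols are more than 98% deprotonated at pH 12.0, consistent with the loss of any difference between the peptides at high pH.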
An ordinary one-way ANOVA with Dunnett's multiple comparison test, comparing each mean to the mean for peptide 1, was used to determine whether variants yield significantly different amounts of [M+54] (blue) or total glycation (black). p<0.05 (*), p<0.01 (**), p<0.001 (***), p<0.0001 (****). Additional statistical analysis is available in Supplementary Data 1.

There is even greater similarity in the AGE product distributions for these three peptides. We attribute this finding to the observation that, upon dilution, reactivity is restricted solely to intramolecular reactions and/or rearrangements, rather than the addition of new equivalents of MGO. As a result, by 48 hpd, peptides 1, 1 Cl, and 1 2Cl all have similar product distributions. However, major differences are observed in the rate of AGE interconversion for these peptides (see Main Text Fig. 6). This finding suggests that the chlorinated peptide 1 variants, which have greater levels of phenoxide, accelerate the mechanistic steps that form MGH-1 but do not significantly alter the final product distributions.
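As a rough illustration of the statistical reporting used in these captions, the sketch below computes the omnibus F statistic of an ordinary one-way ANOVA and maps a p-value to the star notation given above. It does not reproduce the Dunnett or Tukey multiple comparison procedures, which would normally be run in a statistics package. The function names and replicate values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for an ordinary one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    means = [sum(group) / len(group) for group in groups]
    grand_mean = sum(sum(group) for group in groups) / n_total
    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2 for group, mean in zip(groups, means)
    )
    ss_within = sum(
        (x - mean) ** 2 for group, mean in zip(groups, means) for x in group
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def significance_stars(p_value):
    """Map a p-value to the star notation used in the captions."""
    for threshold, stars in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p_value < threshold:
            return stars
    return ""  # not significant at p < 0.05


# Hypothetical replicate yields for three peptides, for illustration only.
replicates = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(replicates)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic only indicates whether any group mean differs; the pairwise adjusted p-values that receive stars in the figures come from the multiple comparison step.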