Identification of Isopeptides Between Human Tissue Transglutaminase and Wheat, Rye, and Barley Gluten Peptides

Celiac disease (CD) is a chronic immune-mediated enteropathy of the small intestine, which is triggered by the ingestion of storage proteins (gluten) from wheat, rye, and barley in genetically predisposed individuals. Human tissue transglutaminase (TG2) plays a central role in the pathogenesis of CD, because it is responsible for specific gluten peptide deamidation and covalent crosslinking, resulting in the formation of Nε-(γ-glutamyl)-lysine isopeptide bonds. The resulting TG2-gluten peptide complexes are assumed to cause the secretion of anti-TG2 autoantibodies, but the underlying mechanisms are only partly known. To gain more insight into the structures of these complexes, the aim of our study was to identify TG2-gluten isopeptides. With the use of discovery-driven as well as targeted nanoscale liquid chromatography tandem mass spectrometry, we detected 29 TG2-gluten isopeptides in total, involving seven selected TG2 lysine residues (K205, K265, K429, K468, K590, K600, K677). Several gluten peptides carried known B-cell epitopes and/or T-cell epitopes, either intact 9-mer core regions or partial sequences, as well as sequences bearing striking similarities to already known epitopes. These novel insights into the molecular structures of TG2-gluten peptide complexes may help clarify their physiological relevance in the initiation of CD autoimmunity and the role of anti-TG2 autoantibodies.

Celiac disease (CD) is defined as a chronic immune-mediated inflammatory disorder of the small intestine initiated by the storage proteins (gluten) of wheat, rye and barley in genetically predisposed subjects 1 . The ingestion of gluten causes villous atrophy, lymphocyte infiltration and the stimulation of CD4+ T cells against gluten epitopes in CD patients. These epitopes are presented by the human leukocyte antigen (HLA) class II alleles HLA-DQ2.5, HLA-DQ2.2 and HLA-DQ8 of the major histocompatibility complex (MHC) expressed on B cells and antigen-presenting cells. The presentation of gluten peptides leads to the activation of CD4+ T cells, which are the main effector cells for immunologic processes 2,3 .
Human tissue transglutaminase (TG2), a Ca2+-dependent protein-glutamine γ-glutamyltransferase (EC 2.3.2.13), is ubiquitously expressed and catalyses the deamidation of glutamine residues or the crosslinking reaction (transamidation) between a glutamine and a lysine residue to form a covalent Nε-(γ-glutamyl)-lysine isopeptide bond 4 . The TG2-mediated deamidation converts certain glutamine residues to glutamic acid residues by releasing ammonia and incorporating water. This introduces negative charges into gluten peptides following a distinct pattern: the glutamine residues in the sequences QXP, QXXF(Y/W/M/L/I/V) or QXPF(Y/W/M/L/I/V), where X designates any amino acid except P, are preferentially targeted 5 . These negatively charged residues increase the binding affinity of gluten peptides to the HLA molecules and enhance their antigenicity in CD patients 6 . During transamidation, the γ-carboxamide group of a protein-bound glutamine serves as an acyl donor that is transferred to an acyl acceptor, such as a small biogenic amine or the ε-amino group of a protein-bound lysine, to form a crosslink 7,8 . The modification of gluten peptides by TG2 is known as a critical event in the pathomechanism of CD, particularly as TG2-gluten peptide complexes are formed 8 . Patients with active CD have specific anti-TG2 IgA (and IgG or IgM) antibodies 9

Results
Experimental approach to identify TG2-gluten isopeptides. To reduce complexity compared to a total gluten hydrolysate, our experimental approach to identify TG2-gluten isopeptides started with the preparation of the following GPTs: α-gliadins, γ-gliadins, ω5-gliadins, ω1,2-gliadins, high- (HMW-GS) and low-molecular-weight glutenin subunits (LMW-GS) of wheat, ω-secalins, HMW-secalins, γ-75k-secalins and γ-40k-secalins of rye and C-hordeins, γ-hordeins, B-hordeins and D-hordeins of barley (Fig. 1a) 17,18 . The individual GPTs were hydrolysed using a combination of pepsin and chymotrypsin/trypsin to mimic the main enzymatic processes during gastrointestinal digestion 16,19 . Then, the resulting GPT hydrolysates were incubated with TG2, leading to the formation of TG2-gluten peptide complexes. These complexes were hydrolysed with trypsin followed by solid phase extraction (SPE) for clean-up of the isopeptide/peptide mixture and subsequent discovery-driven nanoscale liquid chromatography tandem mass spectrometry (nLC-MS/MS) analysis (Fig. 1b) 15 . The GPT blank controls without addition of TG2 were used to create customized protein databases (Table S1) for each GPT that were applied in the proteomics software MaxQuant (MQ) 20 . In order to identify TG2-gluten isopeptides, MQ searches for gluten peptides (α-side of the isopeptide) were performed against the appropriate GPT database with each of seven selected TG2 peptides (β-side of the isopeptide) as modifications. These seven peptides (FLKNAGR, WKNHGCQR, ISTKSVGR, LAEKEETGMAMR, DLYLENPEIKIR, QKR, AVKGFR; the single lysine residue in each peptide is the one involved in crosslink formation) containing the lysine residues K205, K265, K429, K468, K590, K600 and K677 from the TG2 amino acid sequence were selected as possible crosslinking sites. The lysine residues K590, K600 and K677 had previously been identified by Fleckenstein et al. 8 and the lysine residues K205, K265, K429 and K468 additionally by Lexhaller et al. 15 .
K590, K600 and K677 were known as preferred TG2 crosslinking sites also for TG2 self-multimerization 13 , while K205, K265, K429 and K468 were involved in the formation of isopeptides with high identification scores 15 . The tryptic TG2 peptides were chosen to contain only one lysine residue to reduce potential variability on the TG2-side. The identities of the isopeptides were confirmed by annotating the b- and y-fragments as well as internal fragment ions (double fragmentation on both crosslinked peptide sequences) calculated with the MS-Product feature of ProteinProspector 21 . The identities of the isopeptides as well as the crosslinking site localisations were verified by re-analysing all samples using targeted parallel reaction monitoring (PRM) nLC-MS/MS. Data analysis was performed with Skyline 22 and additional manual curation. PRM analysis yields higher ion intensities, because it focuses on monitoring the predefined transitions from precursor to fragment ions. This higher overall intensity provided more fragments, especially around the crosslinking sites.
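The single-lysine criterion for the TG2 peptides can be checked with a few lines of Python. This is only an illustrative sketch: the peptide sequences are those listed in the text, but the pairing of each peptide with a lysine position assumes that both lists are given in the same order, and the helper name `lysine_position` is hypothetical.

```python
# Tryptic TG2 peptides from the text. The mapping to K205...K677 is an
# ASSUMPTION based on the order in which peptides and residues are listed.
TG2_PEPTIDES = {
    "K205": "FLKNAGR",
    "K265": "WKNHGCQR",
    "K429": "ISTKSVGR",
    "K468": "LAEKEETGMAMR",
    "K590": "DLYLENPEIKIR",
    "K600": "QKR",
    "K677": "AVKGFR",
}


def lysine_position(peptide: str) -> int:
    """Return the 1-based position of the only lysine in the peptide.

    Raises ValueError if the peptide does not contain exactly one K,
    because the crosslinking site would then be ambiguous.
    """
    if peptide.count("K") != 1:
        raise ValueError(f"{peptide} does not contain exactly one lysine")
    return peptide.index("K") + 1


positions = {site: lysine_position(seq) for site, seq in TG2_PEPTIDES.items()}
for site, seq in TG2_PEPTIDES.items():
    print(f"{site}: {seq} (lysine at peptide position {positions[site]})")
```

Because every peptide carries exactly one internal lysine, a crosslink assigned to a given TG2 peptide is automatically assigned to a single TG2 residue.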
Identification of isopeptides in wheat GPTs. Altogether, 13 isopeptides were identified in the wheat GPTs. Table 1 shows the identified isopeptides (sorted by TG2-modification site) in each GPT, the gluten protein corresponding to the identified gluten peptide with UniProtKB accession number, name and organism, the MQ identification score, as well as the numbers of characteristic fragments identified in discovery-driven nLC-MS/MS experiments and of those that were confirmed using PRM. The γ-gliadin-GPT hydrolysate contained five isopeptides (W2, W3, W6, W7, W9) with four different TG2-crosslinking sites. Four isopeptides with three different TG2 peptides were identified in the α-gliadin-GPT hydrolysate (W1, W8, W11, W12), two isopeptides with two different TG2 peptides in the LMW-GS-GPT hydrolysate (W4, W10) and one isopeptide each in the HMW-GS-GPT hydrolysate (W13) and the ω1,2-gliadin-GPT hydrolysate (W5). No isopeptides were identified in the hydrolysate of ω5-gliadin-GPT. The structures of the isopeptides as well as the localization probabilities for the crosslinks and the deamidation are shown in Fig. 2.
As an example, a very high MQ score (91.31) was obtained for the isopeptide VQGQGIIQPQQPAQL/FLKNAGR (W3; the isopeptide bond links Q10 of the gluten peptide to the lysine of FLKNAGR, and Q4 is deamidated) based on the identification of 24 b- and y-fragments of the α-side (some fragments were identified with or without water and ammonia loss). First, the MQ search result of VQGQGIIQPQQPAQL carrying the TG2 isopeptide modification "fl" (= FLKNAGR) at Q10 and a deamidation "de" at Q4 was loaded into MQ Viewer to have all b- and y-fragments annotated. These fragments were by default decharged by MQ Viewer to show them as singly charged fragments (Fig. 3a) 23 . Additionally, in Fig. 3b, the annotation was done manually in the MS/MS spectrum by combining the information from the spectral annotation of MQ Viewer and 35 internal fragments calculated by ProteinProspector for confident localization of the deamidation and crosslinking sites in the isopeptides. The correct detection of isopeptide W3 was confirmed by targeted MS analysis using PRM 24,25 . The PRM data revealed high-quality chromatographic peaks for 15 characteristic fragment ions, including b6α+ to b8α+ as a consecutive series 26 within the α-side, and seven fragments for the β-side modified at K with the deamidated VQGQGIIQPQQPAQL peptide. Q10 was identified as the crosslinking site with a localization probability of 94.4% and the deamidation at Q4 was detected with a probability of 99.9%.

Figure 1.
Workflow to identify isopeptides between gluten protein types of wheat, rye and barley and human TG2. (a) Extraction and separation procedure to obtain gluten protein types from wheat, rye and barley flours, respectively, (b) Proteomics workflow combining a reciprocal search strategy to identify isopeptides using discovery-driven mass spectrometry, MaxQuant, Skyline and parallel reaction monitoring (PRM). SPE: solid phase extraction; TG2: recombinant human tissue transglutaminase.
The isopeptides W1-W7 and W11-W13 were already identified unambiguously by discovery-driven nLC-MS/MS and application of the confirmation parameters (at least seven identified b- or y-fragments, at least three fragments in a consecutive series and a crosslink localization probability ≥75% 15 ). The additional PRM analysis confirmed these 10 identified isopeptides and their crosslinking and deamidation sites. However, the PRM data were essential to unambiguously localize the crosslinking site or some deamidation sites for the three isopeptides W8-W10. For this purpose, specific transitions around these sites were used to confirm the localization of the crosslinking or deamidation sites as shown in Fig. 2.
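The three confirmation parameters named above (at least seven b- or y-fragments, at least three fragments in a consecutive series, crosslink localization probability ≥75%) can be expressed as a simple filter. The following Python sketch is not the authors' implementation; the function names and the fragment lists in the usage example are hypothetical and only illustrate the decision rule.

```python
import re


def longest_consecutive_run(ions):
    """Length of the longest run of consecutive fragment numbers
    (e.g. b6, b7, b8) within a single ion series (b or y)."""
    by_series = {}
    for ion in ions:
        match = re.fullmatch(r"([by])(\d+)", ion)
        if match:
            by_series.setdefault(match.group(1), set()).add(int(match.group(2)))
    best = 0
    for numbers in by_series.values():
        for n in numbers:
            if n - 1 not in numbers:  # n starts a new run
                length = 1
                while n + length in numbers:
                    length += 1
                best = max(best, length)
    return best


def passes_confirmation(ions, localization_probability,
                        min_fragments=7, min_run=3, min_probability=0.75):
    """Apply the three confirmation parameters stated in the text."""
    unique = {ion for ion in ions if re.fullmatch(r"[by]\d+", ion)}
    return (len(unique) >= min_fragments
            and longest_consecutive_run(unique) >= min_run
            and localization_probability >= min_probability)


# Hypothetical fragment list, for illustration only.
example_ions = ["b2", "b3", "b6", "b7", "b8", "y3", "y5"]
print(passes_confirmation(example_ions, 0.944))  # all three criteria met
print(passes_confirmation(example_ions, 0.493))  # localization too uncertain
```

Isopeptides that fail only the localization criterion are the ones for which the targeted PRM transitions around the candidate sites become decisive.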
The isopeptides R5 and R6 were already identified unambiguously by discovery-driven nLC-MS/MS, because they fulfilled the confirmation parameters and the crosslinking sites were identified with localization probabilities of 99.3% and 100%, respectively. The PRM data from these isopeptides were used as confirmation. To identify the crosslinking sites in the other rye isopeptides (R1-R4), the identification and confirmation of specific fragments by PRM analysis was needed. Figure 4 shows the structures of these isopeptides as well as the MQ localization probabilities and the specific fragments used to confirm the crosslinking site.

Identification of isopeptides in barley GPTs.
In total, ten isopeptides were identified in the GPTs of barley (C-hordeins, γ-hordeins, D-hordeins and B-hordeins) (Table 3). Five isopeptides (B1, B3, B5, B6, B10) with four different TG2 peptides were identified in the D-hordein-GPT hydrolysate and four isopeptides (B4, B7-B9) in the γ-hordein-GPT hydrolysate. The B-hordein-GPT hydrolysate contained one isopeptide (B2) with a gluten peptide derived from wheat LMW-GS, most likely again due to high sequence homologies between B-hordeins from barley and LMW-GS from wheat (Fig. 5). No isopeptides were detected in the hydrolysate of the C-hordein-GPT itself. However, one isopeptide identified within the ω-secalin-GPT was assigned to a C-hordein.
The isopeptides B2, B3 and B5 were already detected unambiguously by discovery-driven nLC-MS/MS experiments and the PRM analyses were only used for confirmation. The localization probabilities for the crosslinking sites were between 87.4% and 95.2% (Fig. 5). In comparison, PRM analyses were necessary to detect the specific fragments around the crosslinking sites in B1, B7, B9 and B10 and confirm the localization of the crosslinks (Fig. 5).
Regarding the isopeptide B4, the localization probability was 49.3% each for the crosslink at Q4 or Q5. The PRM data also did not reveal the exact position of the crosslink, because the specific transitions for these two sites were not detectable. The isopeptide B6 was identified with two deamidation sites, one of which was detected clearly with a localization probability of 77.4% at Q8. The positions of the second deamidation and the crosslinking site were ambiguous with localization probabilities of 51.0% at Q10 or 40.2% at Q11 for the deamidation and 46.8% at Q10 or 39.8% at Q11 for the crosslink. Even the PRM experiments did not provide any further information, so that the deamidation and crosslinking sites could not be assigned unequivocally within B6.
In the isopeptide B8, discovery-driven nLC-MS/MS assigned the crosslinking site to several positions, each with a low localization probability: Q8 with 27.3%, Q9 with 36.4% and Q12, Q13 and Q14 with 12.0% each. The positions Q2 (localization probability: 96.9%) and Q3 (localization probability: 46.7%) of the two deamidated glutamine residues in the N-terminal part of the sequence were verified due to the specific transitions b2α+ to b4α+, and the position of the crosslinking site could be confirmed at Q9 based on the detection of the characteristic b6α+ to b9α+ fragments after PRM analysis. Q12 had a deamidation probability of 96.9%, so that only the exact position of the fourth deamidation in the rear part (Q13 or Q14) could not be assigned unambiguously due to missing specific fragments.

Discussion
In this study, we applied a reciprocal proteomics strategy, including discovery-driven 15 as well as targeted MS measurements, to complex gluten hydrolysates and identified isopeptides between TG2 and gluten peptides. To get well-defined gluten raw materials, GPTs were isolated by modified Osborne fractionation following preparative RP-HPLC and characterized as described before 17,18 . In total, 13 isopeptides of wheat GPTs, six of rye GPTs and ten of barley GPTs were detected crosslinked to peptides containing any of the seven selected TG2-lysine residues (K205, K265, K429, K468, K590, K600, K677). The crosslinking sites were unambiguously identified by discovery-driven nLC-MS/MS with localization probabilities of >75% in 18 out of 29 isopeptides. The additional PRM analyses on the ambiguously identified crosslinks in 11 isopeptides were used to clearly assign the crosslinking site. This method enabled the identification of the exact crosslinking and deamidation sites in 8 of the remaining 11 isopeptides due to the detection of the characteristic fragments around the modified sites. Only one deamidation site (B8), one crosslinking site (B4) as well as one deamidation and one crosslinking site (B6) could not be assigned unambiguously. However, we were able to identify the subpart of the amino acid sequence where the modified glutamines are most likely located.

Figure 3.
[...] The fragments are marked in different colours as follows: y-fragments in red, b-fragments in blue, a- and c-fragments in turquoise, fragments with losses of NH3 or CO marked in orange. (b) Spectrum of the isopeptide annotated manually with fragments of both sides of the isopeptides, calculated with ProteinProspector. The insert amplifies the range between m/z 100 to 400. The fragments are marked in different colours as follows: y-fragments of the γ-gliadin peptide in pink, b-fragments of the γ-gliadin peptide in blue, y-fragments of TG2 peptide in violet, a- and internal fragments in turquoise, fragments with losses of NH3 or CO marked in orange.
No isopeptides were detected in the hydrolysates of ω5-gliadins, HMW-secalins and C-hordeins. This may have several causes, including poor digestibility of the proteins, especially for ω5-gliadins 27 , comparatively low percentages of the respective GPT within the isolate, especially for C-hordeins 17 , isopeptide concentrations that were below the limit of detection or even no formation of isopeptides. Due to the multitude of potential pairings considering that the TG2 sequence contains 32 lysine residues in total, we decided to focus our data evaluation on the seven selected TG2-lysine residues within the specific peptides that had been reported as reactive sites in previous investigations 8,15 . With the gluten peptide side also unknown prior to our investigation, including all 32 lysine residues would have dramatically increased the search space at the cost of decreasing confident isopeptide identification. However, our intentional limitation to these seven lysine residues also implies that we may have missed isopeptides, if they contained any other TG2-derived lysine peptide.
The gluten peptides involved in isopeptide formation were not always matched to the corresponding proteins that would be primarily expected in the respective GPT. Each isopeptide dataset was searched against the GPT-specific database that was generated during the discovery-driven experiment with the GPT blank controls. Nevertheless, these GPT-specific databases partly contained proteins from other closely related plant species due to incomplete or unannotated protein entries in the UniProtKB database 28 . In some cases, the gluten peptides were derived from a different Triticum species like T. timopheevii (W8) or from different Secale species including Psathyrostachys juncea (Russian wild rye) (R1-R4, R6). One peptide present in the rye ω-secalin hydrolysate was matched to a protein sequence from H. vulgare (R5) and, vice versa, two peptides from the barley γ- and B-hordein hydrolysates corresponded to protein sequences from T. aestivum (B2, B8). This can be explained by the close phylogenetic relationship of wheat, rye and barley that causes extensive amino acid sequence homologies, especially in the repetitive domains 29,30 . Several gluten peptides also contained missed peptic/tryptic/chymotryptic cleavage sites, as is known to occur frequently during gluten digestion 31,32 . To enhance the quality of correct protein identifications, it might be useful to search in other, curated databases, which include more complete gluten entries 33 .
The approach with TG2 and GPTs described here has to be seen as a two-component model system with simulated gastrointestinal digestion. The crosslinking reactions were performed using isolated fractions of wheat, rye and barley proteins and this is rather far away from the real conditions, where gluten proteins are part of a complex food matrix. The simulated digestion model is based on physiological conditions including the three gastrointestinal enzymes trypsin, pepsin and chymotrypsin, but without the action of other enzymes, e.g., brush-border enzymes. This design was chosen deliberately, because the additional action of several enzymes with different cleavage specificities would have made the MS data evaluation much more complicated and increased the peptide search space by several orders of magnitude. These limitations of the current study have to be considered carefully, because more gastrointestinal enzymes would produce more or maybe divergent peptides from a more complex matrix.
TG2 is known for its high reactivity with gluten peptides 34 , especially those harboring T-cell epitopes 16 . Depending on the neighboring C-terminal amino acids, TG2 specifically deamidates glutamine residues in the QXP-, QXXF(Y/W/M/L/I/V)- or QXPF(Y/W/M/L/I/V)-motifs (where X designates any amino acid except P) resulting in increased binding affinity of the gluten peptides to the CD-associated HLA molecules 35 . In contrast, the QXXP- or QP-motifs have been described as poor or no targets for TG2-mediated deamidation 5 . Thirteen of the 29 isopeptides carried a gluten peptide with at least one additional deamidation, of which five displayed the preferred QXP-motif (W9, R1, B2, B5, B9) and five the QXXF(Y/L/I)-motif (W3, R3, B1, B6, B8).
Two gluten peptides were deamidated at the poor QXXP- (W10) and QP-motifs (R3), while the remaining deamidation sites were located in sequences with unknown effect on TG2-specificity. Non-enzymatic deamidation cannot be excluded in our experiments due to the slightly alkaline pH conditions during incubation with TG2 and tryptic digestion 36 , but our intent was to focus on the identification of crosslinking sites, rather than deamidation sites.
Among the 29 isopeptides, 12 crosslinking sites to TG2 were located in the preferred QXP-motif (W1-W3, W7, W8, W12, W13, R5, R6, B3, B5, B7) and three in the QXX(Y/I)-motif (R1, B9, B10). Five isopeptides had the crosslink within the QP-motif (W4-W6, W10, W11) and one within the QXXP-motif (W9) that are either no or poor targets for TG2. The other crosslinks were either not localized unambiguously (B4, B6) or involved QXXX-motifs (R2-R4, B1, B2, B8) that may or may not have an effect on TG2 specificity. In case of W4 no preferred target was available, but these results point to the fact that TG2 might not necessarily follow the known specificity when it comes to crosslinking TG2 molecules to gluten peptides instead of deamidation. However, further experiments would be necessary to study the mechanisms of crosslinking versus deamidation in more detail.
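The motif assignments discussed above depend only on the one to three residues that follow a glutamine on its C-terminal side, so they can be reproduced with a small classifier. The Python sketch below is an illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, and the order in which overlapping motifs are tested (QP first, then QXP/QXPF, QXXF, QXXP) is an assumption made here so that each glutamine receives exactly one label.

```python
HYDROPHOBIC = set("FYWMLIV")  # residues allowed in the F(Y/W/M/L/I/V) position


def classify_glutamine(seq: str, index: int) -> str:
    """Label the glutamine at 0-based `index` by its C-terminal neighbours."""
    if seq[index] != "Q":
        raise ValueError("position does not hold a glutamine")
    following = seq[index + 1:index + 4].ljust(3, "-")  # pad short C-termini
    p1, p2, p3 = following
    if p1 == "P":
        return "QP"  # described as a poor target or no target
    if p2 == "P":
        return "QXPF(Y/W/M/L/I/V)" if p3 in HYDROPHOBIC else "QXP"
    if p3 in HYDROPHOBIC:
        return "QXXF(Y/W/M/L/I/V)"
    if p3 == "P":
        return "QXXP"  # described as a poor target
    return "other"


def classify_all(seq: str) -> dict:
    """Map the 1-based position of every glutamine to its motif label."""
    return {i + 1: classify_glutamine(seq, i)
            for i, residue in enumerate(seq) if residue == "Q"}


# Gluten peptide of isopeptide W3 as given in the text.
for position, motif in classify_all("VQGQGIIQPQQPAQL").items():
    print(f"Q{position}: {motif}")
```

For the W3 peptide this labels Q10 (the crosslinking site) as QXP and Q4 (the deamidation site) as a QXXF-type motif with isoleucine in the last position, in line with the assignments reported for W3.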
A further limitation of the current study is that it does not allow a differentiation if one TG2 molecule carries several gluten peptides crosslinked to different lysine residues or if there are several TG2 molecules that carry one gluten peptide each. In view of the relative distance between the active site of TG2 and the crosslinked lysine residues, it appears most likely that TG2 crosslinked the gluten peptides to other independent neighbouring TG2 molecules. In our well-defined model system, there were no other acyl acceptor substrates present except for TG2. However, this situation is uncommon under physiological conditions, where other extracellular matrix proteins such as collagen or fibronectin and free amines are always present 37,38 . To address this major limitation of the current study, further experiments would be necessary in the presence of other proteins or free amines as potential substrates for TG2.
Of the 26 different gluten peptides (the isopeptides W3/W7, R2/R4 and B6/B10 involve different TG2 lysine residues, but the same gluten peptide, respectively) identified as part of isopeptides, three contained three different complete 9-mer core regions of known T-cell epitopes 39 : IQPQQPAQL (DQ2.5-glia-γ2 40 ) in W3 and W7, FRPQQPYPQ (DQ2.5-glia-α3 40 ) in W11 (F at the N-terminal end missing due to chymotryptic cleavage) and QQPFPQQPQ (DQ2.5-glia-γ5 41 ) in R3. The crosslinked glutamine residue in W3, W7 and W11 was located within the core region, whereas R3 had the crosslink after the core region, but within the truncated motif of the epitope DQ2.5-glia-γ1 (PQQSFPQQQ 42 ) that contains a chymotryptic cleavage site. The DQ2.5-glia-α3 and DQ2.5-glia-γ2 epitopes had also been identified as preferred TG2 substrates by Dorum et al. 16 . Several of the other gluten peptides crosslinked to TG2 also show striking similarities with known T-cell epitopes. For example, B7 is identical to LQPQQPFPQ (DQ2.5-glia-γ4e 43 ) except for the C-terminal W, while also being identical to PQPQQPFPW (DQ2.5-glia-ω2 44 ) except for the N-terminal L. W9 and B4 contain seven and eight amino acids of QQPQQPFPQ (DQ2.5-glia-γ4c 41 ), respectively. Multiple sequence alignment of all identified gluten peptides that were bound to TG2 revealed that the PQQP-motif was the most common feature in many gluten peptides. However, there were also variations such as PQQL and PQQS, while some peptides had a different sequence altogether (e.g., within B9 or B1) (Fig. S1). The alignment of the gluten peptides considering the deamidation sites essentially showed a similar picture (Fig. S2).
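The comparison of gluten peptides with epitope core regions described above amounts to finding complete or partial sequence overlaps. A minimal Python sketch of such a comparison is given below. It uses only the 9-mer cores quoted in the text; the function names and the minimum overlap of seven residues are illustrative assumptions, and a longest-common-substring search is a simplification of the multiple sequence alignment used in the study.

```python
# 9-mer core regions of T-cell epitopes as quoted in the text.
EPITOPE_CORES = {
    "DQ2.5-glia-γ2": "IQPQQPAQL",
    "DQ2.5-glia-α3": "FRPQQPYPQ",
    "DQ2.5-glia-γ5": "QQPFPQQPQ",
    "DQ2.5-glia-γ1": "PQQSFPQQQ",
    "DQ2.5-glia-γ4e": "LQPQQPFPQ",
    "DQ2.5-glia-ω2": "PQPQQPFPW",
    "DQ2.5-glia-γ4c": "QQPQQPFPQ",
}


def longest_common_substring(a: str, b: str) -> str:
    """Longest contiguous stretch shared by a and b (dynamic programming)."""
    best = ""
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > len(best):
                    best = a[i - current[j]:i]
        previous = current
    return best


def epitope_overlaps(peptide: str, min_length: int = 7) -> dict:
    """Return epitopes sharing at least `min_length` contiguous residues
    with the peptide (min_length=7 is an arbitrary illustrative cut-off)."""
    hits = {}
    for name, core in EPITOPE_CORES.items():
        shared = longest_common_substring(peptide, core)
        if len(shared) >= min_length:
            hits[name] = shared
    return hits


# Gluten peptide of W3/W7 as given in the text.
print(epitope_overlaps("VQGQGIIQPQQPAQL"))
```

A shared stretch of nine residues corresponds to an intact core region, whereas seven or eight shared residues correspond to the partial matches described for peptides such as W9 and B4.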
As the formation of stable gluten peptide-HLA complexes is the prerequisite for activating the gluten-reactive T-cell response 35 , these TG2-bound gluten peptides carrying known T-cell epitopes may contribute to enhanced T-cell reactivity. In turn, gluten-reactive T cells provide help to gluten-specific B cells with both receptor repertoires sharing a preference for deamidated gluten peptides with overlapping or adjacent recognition sequences 45,46 . Although eight of the gluten peptides we identified within the isopeptides were too short (only eight amino acids in five cases, or seven amino acids in three cases) to elicit binding to HLA-DQ2.5, -DQ2.2 or -DQ8 molecules, one (W1) did carry a sequence recognized by gluten-specific B cells (IPEQ, WQIPEQ) 46 . Furthermore, the peptides W9, R3, R6, B4 and B7 contained the QPQQPF-motif 46 and W11 the PXPQP-motif 45 , which are reported as important sequences for B-cell receptor recognition. Regarding TG2-specific B cells, the most likely route is that TG2-gluten peptide complexes are taken up through the B-cell receptor 12 , but the cooperation of gluten-reactive T cells and TG2-specific B cells in B-cell activation warrants further investigation 11 . Our findings on isopeptide formation between TG2 and gluten peptides from a complex gluten hydrolysate may help shed some more light on the complex interactions between HLA-DQ2/8 molecules, gluten-reactive T cells, gluten-specific B cells and TG2-specific B cells. The workflow combining discovery-driven and PRM nLC-MS/MS could also be adapted to other related questions, because TG2 is known to interact not only with gluten, but also with extracellular matrix proteins, such as fibronectin 37,38 .

Conclusion
We identified 29 isopeptides of TG2 with peptides from gluten hydrolysates from wheat, rye and barley in vitro using a reciprocal proteomics strategy. The model system does not rely on model peptides, but uses gluten proteins extracted from the flours and hydrolysed by three different gastrointestinal enzymes to mimic physiological conditions in a simplified form. In addition to discovery-driven mass spectrometry, all isopeptides were verified by targeted proteomics (PRM) that allowed the localization of the respective crosslinking site. These results provide novel insights into preferred TG2 substrates and the molecular structures of TG2-gluten peptide complexes. Several gluten peptides carried known B-cell and T-cell epitopes, either intact 9-mer core regions or partial sequences, as well as sequences bearing striking similarities to already known epitopes. Further research combining in vitro and in vivo experiments on the extent and the activation of B cells is needed to gain more insight into the immunological and physiological relevance of these complexes. With the proteomics strategy in place, it would be interesting to gradually move away from the well-defined model system to studying TG2-mediated crosslinking under physiologically relevant conditions, e.g., with additional action of brush-border enzymes and in the presence of other acyl acceptor substrates such as other extracellular matrix proteins or free amines.

Preparation of GPTs.
The GPTs α-gliadins, γ-gliadins, ω5-gliadins, ω1,2-gliadins, HMW-GS and LMW-GS of wheat, ω-secalins, HMW-secalins, γ-75k-secalins and γ-40k-secalins of rye, and C-hordeins, γ-hordeins, D-hordeins and B-hordeins of barley were isolated as reported in detail by Schalk et al. 18 and Lexhaller et al. 17 . Briefly, the protein fractions were isolated stepwise by modified Osborne fractionation from wheat, rye and barley flours using salt solution (0.4 mol/l NaCl with 0.067 mol/l Na2HPO4/KH2PO4, pH 7.6) to obtain the albumins/globulins, ethanol/water (60/40, v/v) to obtain the prolamins and glutelin extraction solution (2-propanol/water (50/50, v/v)/0.1 mol/l Tris-HCl, pH 7.5, containing 2 mol/l (w/v) urea and 0.06 mol/l (w/v) dithiothreitol (DTT)) at 60 °C under nitrogen to obtain the glutelins. The supernatants of each prolamin and glutelin fraction were combined, concentrated, lyophilized and re-dissolved for preparative RP-HPLC. After filtration of the prolamin and glutelin solutions (0.45 μm), the GPTs were separated on a Jasco HPLC (Jasco, Gross-Umstadt, Germany) according to their retention times, collected from several runs, pooled, lyophilized and stored at −20 °C until use. Then, the GPTs were characterized by RP-HPLC, SDS-PAGE and discovery-driven mass spectrometry to verify their identities and purities as already reported in detail 17,18 .
Enzymatic digestion of GPTs. Each GPT was suspended in 0.02 mol/l HCl (pH 2) and hydrolyzed with pepsin at an enzyme:substrate ratio of 1:20 (w/w) for 60 min at 37 °C. After adjusting the pH to 6.5 with sodium phosphate buffer (50 mmol/l), trypsin and chymotrypsin were each added at an enzyme:substrate ratio of 1:40 (w/w) and the mixture was hydrolyzed for 120 min at 37 °C 16,19 . The samples were heated for 10 min at 95 °C to stop proteolysis, centrifuged and filtered.
For the following crosslinking reaction with TG2, the samples were dried using a vacuum centrifuge (37 °C, 4 h, 800 Pa) and reconstituted in TRIS/HCl buffer (0.1 mol/l, pH 7.4, 10 mmol/l CaCl2). The resulting peptide concentrations were estimated at 205 nm with a NanoDrop Micro-UV/VIS spectrophotometer and the protein A205 application (NanoDrop One, Thermo Scientific, Madison, USA), which can be used to determine peptide concentrations based on the absorption of the peptide bonds.
Crosslinking reaction of TG2 and GPT hydrolysates. The reaction of TG2 (0.16 nmol/l) with each GPT hydrolysate was performed in TRIS/HCl buffer (0.1 mol/l, pH 7.4, 10 mmol/l CaCl2) at a molar ratio of TG2:GPT hydrolysate of 1:150 at 37 °C for 120 min 15 . To inactivate TG2, all samples were heated at 95 °C for 10 min. The negative controls were prepared by adding the GPT hydrolysates after inactivation of TG2. Additional GPT blank controls contained only GPT in TRIS/HCl buffer and were treated as described above, but without TG2. The samples and the negative controls were prepared in triplicate; the GPT blank controls were also prepared in triplicate, but pooled prior to tryptic hydrolysis.
Tryptic digestion and isopeptide clean-up. Enzymatic hydrolysis and peptide purification were carried out as described in detail by Lexhaller et al. 15 . Briefly, all samples, negative controls and GPT blank controls were hydrolyzed with trypsin at an enzyme:substrate ratio of 1:100 (w/w) at 37 °C for 24 h and the digestion was stopped with formic acid (FA, pH <2). Purification was done by solid phase extraction (SPE) using 50 mg Sep-Pak tC18
Discovery-driven mass spectrometry. nLC-MS/MS analysis was performed on an Ultimate 3000 nanoHPLC system (Dionex, Idstein, Germany) coupled to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific, Dreieich, Germany). The nanoscale LC system consisted of a trap column (75 µm × 2 cm, self-packed with Reprosil-Pur, C18, ODS-3, 5 µm resin, Dr. Maisch, Ammerbuch, Germany) and an analytical column (75 µm × 40 cm, self-packed with Reprosil-Gold, C18, 3 µm resin, Dr. Maisch). The injection volume was 2 µL (estimated peptide concentration: 0.16 µg/µL). The peptides were delivered to the trap column using solvent A0 (0.1% FA in water) at a flow rate of 5 µL/min and separated on the analytical column using a 60 min linear gradient from 4% to 32% solvent B at a flow rate of 300 nL/min (solvent A1, 5% DMSO, 0.1% FA in water; solvent B, 5% DMSO, 0.1% FA in acetonitrile) 49 . The MS was operated in data-dependent acquisition mode, automatically switching between MS1 and MS2 spectra to acquire full scans. The mass-to-charge (m/z) range for the acquisition of MS1 spectra was 360-1,300 m/z at an Orbitrap full MS scan (resolution: 60,000, automatic gain control (AGC) target value: 3e6, maximum injection time: 50 ms). In the MS2, the Top18 peptide precursors were automatically selected for fragmentation by higher energy collision-induced dissociation (isolation width: 1.7 Th, maximum injection time: 25 ms, AGC value: 1e5). Analysis was performed using 25% normalized collision energy at a resolution of 15,000.
Preparation of GPT databases. Each GPT blank control was searched individually against a protein database containing all gliadin entries (January 2019; 5,958 entries), glutenin entries (January 2019; 4,488 entries), secalin entries (January 2019; 219 entries) and hordein entries (January 2019; 158 entries) of the UniProtKB database using MQ (software version 1.6.0.1) 20 . The parameters were set as follows: digestion mode: specific, enzyme: trypsin, pepsin, chymotrypsin, maximum missed cleavage sites: 2, variable modifications: deamidation (NQ), oxidation (M), main search peptide tolerance: 4.5 ppm, mass tolerance for fragment ions: 0.5 Da. All other parameters were used as default settings. All identified proteins in the proteinGroups.txt file were used to create an appropriate database for each GPT.
Identification of TG2-gluten isopeptides. The Thermo Xcalibur full-scan .raw files of each GPT (three samples and three negative controls) were directly used as input in MQ 20 and searched against the appropriate GPT database. Seven peptides containing lysine residues (K205, K265, K429, K468, K590, K600, K677) from the TG2 sequence were selected as possible crosslinking sites in the isopeptides. The elemental compositions of these tryptic TG2 peptides were calculated in silico to configure the TG2 sides of the isopeptides as modifications in MQ. A formal subtraction of NH3 was necessary to use these peptides as modifications (TG2-modifications) in an isopeptide bond 15 . The parameters were set as follows for the individual search runs: digestion mode: specific, enzyme: trypsin, pepsin, chymotrypsin, maximum missed cleavage sites: 2, variable modifications: each TG2-modification in one single search run, deamidation (NQ), TG2-modifications: FLKNAGR, C 36
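The in silico calculation described above (elemental composition of a tryptic TG2 peptide, followed by the formal subtraction of NH3) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function names are hypothetical, the residue table covers only the amino acids occurring in FLKNAGR, and standard monoisotopic atomic masses are assumed.

```python
from collections import Counter

# Elemental compositions of amino acid residues (free amino acid minus H2O).
# Only the residues needed for the example peptide are listed (assumption:
# extend this table for other TG2 peptides).
RESIDUE_COMPOSITION = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "F": {"C": 9, "H": 9, "N": 1, "O": 1},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "N": {"C": 4, "H": 6, "N": 2, "O": 2},
    "R": {"C": 6, "H": 12, "N": 4, "O": 1},
}

# Standard monoisotopic atomic masses in Da.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def peptide_composition(sequence):
    """Elemental composition of the free peptide (residues plus one H2O)."""
    total = Counter({"H": 2, "O": 1})
    for residue in sequence:
        total.update(RESIDUE_COMPOSITION[residue])
    return total


def crosslink_modification(sequence):
    """Composition of the peptide when used as a modification: peptide minus NH3."""
    composition = peptide_composition(sequence)
    composition.subtract({"N": 1, "H": 3})
    return dict(composition)


def monoisotopic_mass(composition):
    """Monoisotopic mass (Da) of an elemental composition."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in composition.items())


if __name__ == "__main__":
    modification = crosslink_modification("FLKNAGR")
    print(modification, round(monoisotopic_mass(modification), 4))
```

The NH3 is subtracted because formation of the isopeptide bond between a glutamine side chain and a lysine ε-amino group releases ammonia, so the crosslinked species is lighter than the sum of the two free peptides by that amount.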

Annotation of MS/MS fragments of the isopeptides. To confirm the identification and the respective crosslinking sites of the isopeptides, the b- and y-fragments of both sides were calculated with the MS-Product feature of the ProteinProspector webpage (v.5.22.1, University of California, San Francisco, CA, USA) 21 . The sequences of the gluten peptides and the TG2-modifications were entered and the binding Q or K was replaced by "u", the user-specified amino acid carrying the elemental composition of the other side of the isopeptide. ProteinProspector parameters were then set to calculate b-, y- and internal fragments and the associated fragments due to water and ammonia loss. The charge states were calculated up to 5+ for the precursors and up to 3+ for the fragments.
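The fragment calculation can be illustrated with a small sketch that computes b- and y-ion m/z values up to charge 3+. It is not a reimplementation of MS-Product: internal fragments and water/ammonia losses are omitted, the function name and the `user_residue_mass` parameter (standing in for the "u" residue) are hypothetical, and standard monoisotopic residue masses are assumed.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_mz(sequence, max_charge=3, user_residue_mass=None):
    """Return {(ion_label, charge): m/z} for all b- and y-ions of a peptide.

    A lower-case "u" in the sequence denotes a user-specified residue whose
    mass (the crosslinked residue plus the attached peptide) must be passed
    in as `user_residue_mass`.
    """
    masses = dict(RESIDUE_MASS)
    if user_residue_mass is not None:
        masses["u"] = user_residue_mass
    residues = [masses[aa] for aa in sequence]

    ions = {}
    for i in range(1, len(sequence)):
        b_neutral = sum(residues[:i])           # N-terminal fragment
        y_neutral = sum(residues[-i:]) + WATER  # C-terminal fragment
        for z in range(1, max_charge + 1):
            ions[(f"b{i}", z)] = (b_neutral + z * PROTON) / z
            ions[(f"y{i}", z)] = (y_neutral + z * PROTON) / z
    return ions


if __name__ == "__main__":
    for (label, charge), mz in sorted(fragment_mz("FLKNAGR", max_charge=1).items()):
        print(f"{label} ({charge}+): {mz:.4f}")
```

Fragments that contain the "u" residue carry the mass of the crosslinked partner, which is why annotating the spectrum from both sides helps pin down the crosslinking site.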
Isopeptide confirmation and creation of PRM methods. Skyline-daily (version 19.0.9.149) 22 was used to confirm the identities of all detected isopeptides, to compare negative controls and samples, and to create isolation lists for the PRM methods. To confirm the identified isopeptides and reject false positives, the sequences of the GPT peptides were loaded into Skyline as targets and modified with the appropriate TG2-modifications, a deamidation (−17 Da) or both, according to the MQ output. To identify the isopeptides from both sides, the reverse isopeptide sequence, i.e., the sequence of the TG2 peptide, was also loaded into Skyline and modified with the previously identified GPT peptide via a crosslink. Skyline then generated the appropriate precursors of all sequences. Every isopeptide was checked manually against the following criteria: (a) the retention time had to match the retention time identified in the MQ search (ID), (b) the retention times and isotopic dot product scores (idotp: generated from comparing the expected precursor isotopic distribution to the observed distribution; scored from 0-1 with 1 being the highest) had to be consistent among the triplicates, as assessed with the graphical tools 22 , (c) the isopeptide had to be detected reproducibly in the three replicates and be absent from the negative controls; false positive matches in the negative controls were rejected, (d) the idotp had to be >0.9, (e) the localization probability (MQ search) had to be >75% for unambiguous localization. MS/MS libraries were built to generate the isolation lists for the isopeptides of each GPT. For this purpose, the MQ output tables "msms.txt" from the searches of every modification were imported into Skyline. All identified isopeptides of one GPT and their reversed isopeptides with the appropriate GPT-modifications were combined in one PRM method, which was exported as an isolation list for use in the nLC-MS/MS system. A single isolation list and a single PRM method were thus created for each GPT.
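The numerical parts of the acceptance criteria (a) and (c) to (e) can be expressed as a simple filter. The sketch below is illustrative only: the checks were performed manually in Skyline, the `Candidate` record and its field names are hypothetical, and applying the idotp threshold to every replicate is an assumption, since the text does not state how the score was aggregated. Criterion (b), the graphical comparison, is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical summary of one isopeptide candidate from the full-scan data."""
    name: str
    rt_matches_mq_id: bool                    # criterion (a)
    idotp_per_replicate: List[Optional[float]]  # None where not detected
    detected_in_negative_controls: bool       # criterion (c)
    localization_probability: float           # criterion (e), as a fraction 0-1


def passes_full_scan_criteria(candidate, n_replicates=3,
                              min_idotp=0.9, min_localization=0.75):
    """Apply criteria (a), (c), (d) and (e) to a single candidate."""
    detected = [v for v in candidate.idotp_per_replicate if v is not None]
    return (
        candidate.rt_matches_mq_id
        and len(detected) == n_replicates              # found in all replicates
        and not candidate.detected_in_negative_controls
        and all(v > min_idotp for v in detected)       # assumption: per replicate
        and candidate.localization_probability > min_localization
    )


if __name__ == "__main__":
    example = Candidate("example isopeptide", True, [0.95, 0.97, 0.93], False, 0.88)
    print(passes_full_scan_criteria(example))
```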
Targeted mass spectrometry. All PRM measurements were carried out using the same instrument and LC conditions as for the discovery-driven setup (see above). The MS was operated in unscheduled PRM mode with the following settings: MS1 resolution: 60,000, MS1 automatic gain control (AGC) target value: 3e6, MS1 maximum injection time: 100 ms, MS1 scan range: 360-1,300 m/z, quadrupole isolation window width: 1.7 Th, MS2 maximum injection time: 22 ms, MS2 AGC value: 1e6. Higher-energy collisional dissociation was performed using a normalized collision energy of 27%.
PRM data analysis. The Xcalibur .raw files of the PRM data were imported into Skyline separately for each GPT. The transitions of each target were checked manually and compared to the negative controls. To confirm the identified isopeptides and reject false positives, the following criteria were checked: (a) the retention time in the PRM data had to match the retention time identified in the MQ search (ID) and in the full scan data, (b) the retention times and idotp values of the precursors had to be consistent among the triplicates, as assessed with the graphical tools, and no signals were allowed in the negative controls, (c) according to Chen et al. 26 , at least seven identified b- or y-fragments had to match theoretical peptide fragments, (d) at least three fragments had to be consecutive in the peptide sequence. Every identified isopeptide was double-checked against the MQ search result in the MQ Viewer.
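Criteria (c) and (d) lend themselves to a short check on the lists of matched fragment indices. The following sketch is a hedged illustration with hypothetical function names; it assumes that "consecutive" refers to consecutive fragment indices within the same ion series (b or y), which the text does not specify.

```python
def longest_consecutive_run(indices):
    """Length of the longest run of consecutive integers in `indices`."""
    best = current = 0
    previous = None
    for i in sorted(set(indices)):
        current = current + 1 if previous is not None and i == previous + 1 else 1
        best = max(best, current)
        previous = i
    return best


def passes_fragment_criteria(matched_b, matched_y, min_matched=7, min_consecutive=3):
    """Check criteria (c) and (d) for one isopeptide.

    matched_b / matched_y: indices of b- and y-fragments that matched
    theoretical fragments (e.g. [1, 2, 3] for b1, b2, b3).
    """
    total_matched = len(set(matched_b)) + len(set(matched_y))
    # Assumption: a consecutive series must lie within one ion series.
    longest = max(longest_consecutive_run(matched_b),
                  longest_consecutive_run(matched_y))
    return total_matched >= min_matched and longest >= min_consecutive


if __name__ == "__main__":
    print(passes_fragment_criteria([1, 2, 3, 5], [2, 4, 6]))
```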
Multiple sequence alignment of gluten peptides. All gluten peptides identified as part of the isopeptides were compiled into a peptide fasta file, either without or with deamidation at the sites we had detected. The multiple sequence alignment was done using MAFFT online version 7.452 on January 16, 2020 (Computational