Destabilizing single chain major histocompatibility complex class I protein for repurposed enterokinase proteolysis

The lack of a high-throughput assay for screening stabilizing peptides prior to building a library of peptide-major histocompatibility complex class I (pMHC-I) molecules has motivated the continued use of in silico tools without biophysical characterization. Here, based on de novo protein fragmentation, the EASY MHC-I (EZ MHC-I) assay reduces peptide antigen screening to an unprecedented hands-on time of seconds per peptide by exploiting the instability of the empty single chain MHC-I protein. Unlike tedious traditional labeling- and antibody-based MHC-I assays, repurposed enterokinase directly fragments the unstable single chain MHC-I protein unless it is rescued by a stabilizing peptide under luminal conditions. The principle behind the EZ MHC-I assay highlights not only the overlooked stability, a known better indicator of immunogenicity than classical affinity, but also the novel use of enterokinase from the duodenum to target destabilized MHC-I protein not bearing the standard Asp-Asp-Asp-Asp-Lys motif, which may extend to other protein instability-based assays.

Scientific Reports | (2020) 10:14897 | https://doi.org/10.1038/s41598-020-71785-2 | www.nature.com/scientificreports/

(Fig. 1b) 11. Additionally, enterokinase was found to cleave the heavy chain of the unstable single chain MHC-I protein, but not the stable one, at 37 °C. The newfound preference of enterokinase for a more accessible, unstable conformational state may also extend to unbound peptide at 37 °C. Therefore, the de novo protein fragmentation driven by enterokinase becomes the core of the EZ MHC-I assay, which is completed in three simple steps: mix, incubate and analyze (Fig. 1c). The resulting protein fragments are readily visualized and are likely to be observed across the highly polymorphic MHC-I family, as represented by the frequent HLA-A + alleles, A*11:01, A*02:01 and A*02:07, as single chain proteins but not as the unchained one (Fig. 1d,e). In this work, the well-characterized HLA-A*11:01 is used to represent the polymorphic MHC-I protein family, first to compare the EZ MHC-I assay against existing methods and later to explore understudied alleles such as HLA-A*02:07, which cannot be accurately predicted.
As a proof-of-principle study, the likelihood of enterokinase-induced protein fragments was first explored with two forms of conditional HLA-A*11:01 pMHC-I protein, unchained and single chain. The unchained pMHC-I protein consists of three molecules: an original peptide, the highly conserved β2m and the polymorphic α polypeptide chain, whereas the same three molecules, when separated by a spacer, make up the single chain protein 11,12. For the unchained HLA-A*11:01 protein bearing a photo-labile peptide, only 30 to 50% of the whole α chain fragmented under different temperature, UV and enterokinase conditions (Fig. 1d; Supplementary Fig. S1) 12. This suggests that a number of unchained MHC-I proteins remain resistant to enterokinase even at 37 °C. In contrast, the single chain pMHC-I protein bearing an enterokinase-cleavable peptide showed an improved 70 to 95% of the protein fragmented at 37 °C unless rescued by a suitable peptide. These protein fragments originate from the α polypeptide chains, which do not bear the DDDD|K motif in the HLA-A*02:01, HLA-A*02:07 and HLA-A*11:01 proteins (Fig. 1e; Supplementary Fig. S1). Also, the absence of such protein fragments at 25 °C suggests that the enterokinase specificity was repurposed only at the higher temperature (Fig. 1e,f). Thus, secreted enzyme-exchangeable single chain proteins bearing cleavable bulging peptides were purified from insect cells and the EZ MHC-I assay was performed at 37 °C.
Next, to establish that immunogenic peptides are also stabilizing, or even resistant to enterokinase cleavage, when bound to MHC-I protein, 35 known immunogenic peptides were retrieved from the Immune Epitope Database (IEDB, https://www.iedb.org/) (Table 1). Three methods were compared: the EZ MHC-I assay using the HLA-A*11:01 single chain protein, the stability-based NetMHCstab-1.0 algorithm and the widely used affinity-based NetMHCpan-4.0 algorithm (Table 1). Unlike NetMHCpan, the NetMHCstab algorithm could better distinguish stable T-cell epitopes from affinity-matched non-epitopes, but it has not been updated due to the lack of experimental stability data 13. Here, an inverse relationship is shown whereby increasing stabilities correspond to decreasing NetMHCpan ranks, that is, to higher peptide affinities. The EZ MHC-I assay defines poor HLA-A*11:01 stability as a score of zero, set by an HLA-A*02:01 peptide (CLGGLLTMV) from the Epstein-Barr virus (EBV) proteome. Excellent HLA-A*11:01 peptide stability is defined as a score of one, set by a known antigenic HLA-A*11:01 peptide (IVTDFSVIK from EBV). To better understand this relationship using the EZ MHC-I assay, the 35 peptides from the IEDB resource were reduced to 27 by applying a stricter NetMHCpan cutoff of less than 1% to exclude poor affinity binders; the assay is also better performed at pH 6.2 (Fig. 2a). These 27 peptide epitopes generally have good EZ 50kDa stability scores between 0.3 and 1.0, unlike the eight excluded peptides, which also had low NetMHCstab stability scores of less than 2 h that would classify them as non-HLA-A*11:01 peptides (Fig. 2a; Table 1).
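As a rough illustration of how a control-normalized stability score of this kind could be computed, the sketch below rescales a measured intact-protein signal so that the negative control (EBV0201) maps to 0 and the positive control (EBV1101) maps to 1. The linear rescaling, the function names and the example band intensities are assumptions made for illustration only; the score used in this work is defined by its Eq. (4), which is not reproduced here and may differ.

```python
def ez_score(signal: float, neg_ctrl: float, pos_ctrl: float) -> float:
    """Rescale a signal so that neg_ctrl maps to 0 and pos_ctrl maps to 1.

    Assumed linear form for illustration; not necessarily Eq. (4) of the paper.
    """
    if pos_ctrl == neg_ctrl:
        raise ValueError("positive and negative controls must differ")
    return (signal - neg_ctrl) / (pos_ctrl - neg_ctrl)


def is_stabilizing(score: float, cutoff: float = 0.3) -> bool:
    """Apply the EZ 50kDa cutoff of 0.3 quoted in the text."""
    return score > cutoff


# Hypothetical intact-band intensities in arbitrary units (not measured data).
NEG_CTRL, POS_CTRL = 100.0, 900.0
HYPOTHETICAL_BANDS = {"peptide_A": 740.0, "peptide_B": 180.0, "peptide_C": 980.0}

for name, band in HYPOTHETICAL_BANDS.items():
    score = ez_score(band, NEG_CTRL, POS_CTRL)
    print(f"{name}: score={score:.2f}, stabilizing={is_stabilizing(score)}")
```

Under this assumed form, a peptide can score above 1 (more stable than the positive control) or below 0 (less stable than the negative control), which matches the range described for the EZ 50kDa score.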
Based on the IEDB resource, the reported MHC alleles for the eight unstable peptides are GPISGHVLK (HLA-A11, HLA-A3), TMVMELIRMIK (HLA-A*11:01), LVSFLLLAGR (HLA-A*11:01, HLA-A*03:01, HLA-A*31:01, HLA-A*33:01, HLA-A*68:01, HLA-A*30:01), LALEVARQKR (HLA-A*11:01), LVTFLLLCGR (HLA-A*11:01), LYASPQLEGF (HLA-A*11:01, HLA-A*24:02), AYQKRMGVQM (HLA-A*11:01, HLA-A*24:02) and the negative control CLGGLLTMV (HLA-A*02:01), with variable response frequencies, including negative outcomes reported in the IEDB resource, and thus a possible lack of immunodominance. Next, the EZ MHC-I assay was compared with the traditional enzyme-linked immunosorbent assay (ELISA) using unchained protein. The single chain protein cannot be used for ELISA because the readout depends on separation of the unchained α protein while the β2m protein remains attached to the ELISA plate. Unlike in the enterokinase-based EZ MHC-I assay, the unchained protein bears a photo-labile peptide whose complete cleavage depends on efficient UV irradiation, and minimal protein photo-damage is assumed. Despite these technical differences, there is still good agreement between the ELISA and EZ MHC-I assays, whereby the good affinity binders predicted by NetMHCpan are stabilizing but the non-binders are not (Fig. 2b). Moreover, this pattern of good affinity binders that are also stabilizing is conserved across all four methods (Fig. 2c; Table 2). However, deeper analysis with the EZ MHC-I assay also highlighted a number of additional immunogenic peptides that were missed by the NetMHCpan and NetMHCstab algorithms. These moderate peptides would be excluded because both predictive algorithms rely heavily on experimental data that are often biased towards strong HLA affinity, but they may be identified experimentally using a stability-based assay (Fig. 2d,e). Moreover, more suboptimal peptides are found to be stabilizing at pH 6.2 than at pH 8 in the EZ MHC-I assay (Fig. 2d,e).
Hence, the EZ MHC-I assay can detect stabilizing peptides. Despite the possibility that enterokinase may target non-DDDD|K peptides, this did not lessen the number of immunogenic peptides identified; at pH 6.2 the assay perhaps identified more than the predictive algorithms. This is probably due to conformational protection, whereby these labile peptides become inaccessible to enzymatic cleavage when bound to the MHC-I protein.
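The comparison above amounts to applying three thresholds to each peptide and looking for disagreements. A minimal sketch follows, using the cutoffs quoted in this work (NetMHCpan-4.0 rank below 1%, NetMHCstab-1.0 half-life above 2 h, EZ 50kDa above 0.3). The record layout, the peptide names and the numeric values are hypothetical placeholders, not data from Table 1.

```python
from dataclasses import dataclass

# Cutoffs quoted in the text; everything else here is an illustrative assumption.
NETMHCPAN_RANK_CUTOFF_PCT = 1.0      # rank below this: predicted binder
NETMHCSTAB_HALF_LIFE_CUTOFF_H = 2.0  # half-life above this: predicted stable
EZ_50KDA_CUTOFF = 0.3                # score above this: measured stabilizing


@dataclass
class PeptideResult:
    name: str
    netmhcpan_rank_pct: float
    netmhcstab_half_life_h: float
    ez_50kda: float

    @property
    def predicted_binder(self) -> bool:
        return self.netmhcpan_rank_pct < NETMHCPAN_RANK_CUTOFF_PCT

    @property
    def predicted_stable(self) -> bool:
        return self.netmhcstab_half_life_h > NETMHCSTAB_HALF_LIFE_CUTOFF_H

    @property
    def measured_stabilizing(self) -> bool:
        return self.ez_50kda > EZ_50KDA_CUTOFF


def missed_by_predictors(results):
    """Names of peptides that stabilize in the assay but fail both algorithm cutoffs."""
    return [
        r.name
        for r in results
        if r.measured_stabilizing and not r.predicted_binder and not r.predicted_stable
    ]


# Hypothetical values for illustration only.
example = [
    PeptideResult("pep_strong", 0.2, 5.0, 0.9),
    PeptideResult("pep_moderate", 3.5, 0.8, 0.6),
    PeptideResult("pep_poor", 4.0, 0.5, 0.05),
]
print(missed_by_predictors(example))
```

Peptides returned by `missed_by_predictors` correspond to the moderate peptides discussed above: stabilizing by measurement, yet excluded by both prediction cutoffs.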
Table 1. HLA-A*11:01 characterization of peptide epitopes retrieved from the Immune Epitope Database (IEDB) at two different pHs with the EZ MHC-I assay and the NetMHCpan-4.0 and NetMHCstab-1.0 algorithms. For the EZ MHC-I assay, the EZ 50kDa score for each peptide is calculated using the positive Epstein-Barr virus peptide control (EBV1101) and the negative Epstein-Barr virus peptide control (EBV0201); it ranges from above 1 (more stable than control) through 1 (equally stable) to below 0 (very unstable), as determined with Eq. (4). The most optimal peptides score close to 0 in NetMHCpan rank, greater than 2 h in NetMHCstab-1.0 half-life and greater than 0.3 in EZ 50kDa. The seven peptides with EZ 50kDa below EBV0201 (negative control), as shown in Fig. 2a, are marked with * in Table 1. EBV Epstein-Barr virus, DENV Dengue virus, LCM lymphocytic choriomeningitis virus, CMV cytomegalovirus, IGRP Islet-specific glucose-6-phosphatase catalytic subunit-related protein.

Figure 2. (a) ... NetMHCpan-4.0 cutoff is set as greater than 0% for all binders or less than 1% for potential binders (see Table 1). At pH 6.2, a tighter group of peptides is observed than at pH 8.0. (b) Comparison of traditional ELISA and EZ MHC-I assay with a collection of both IEDB peptide epitopes and peptides predicted using the NetMHCpan-4 algorithm. Predicted non-binders tend not to be stabilizing, whereas some predicted binders are found not to be stabilizing in both ELISA and EZ MHC-I assays (see Table 2). (c) A 4-dimensional comparison showing good agreement among the four methods. The bubble size is based on the ELISA assay, whereby a bigger size correlates with higher stability. The more stable and better binding peptides are clustered away from the less stable and poor binding peptides. Moderately predicted peptides may be confirmed using stability-based assays (see Table 2). (d) Additional peptides from the IEDB resource are found to be stabilizing despite their higher NetMHCpan-4.0 ranks above 0.5%. (e) Additional peptides from the IEDB resource are observed to be stabilizing despite falling below the recommended NetMHCstab-1.0 cutoff of 2 h. (f) A comprehensive EGFR peptide library (n = 177, duplicated) from key driver mutations was compared using the EZ 50kDa, NetMHCpan and NetMHCstab tools with the HLA-A*11:01 single chain protein. The positive EBV1101 and negative EBV0201 control peptides are labeled and colored blue. Two more EGFR mutation-derived peptides (CS1012 and CS1017) are found to be strongly stabilizing despite their predicted weak affinities using NetMHCpan. For understudied HLA-A*02:07, there is a lack of correlation between the NetMHCpan algorithm and measured EZ 50kDa stability (n = 11, duplicated). HepB0207 (FLPSDYFPSV) is a known positive control. The cutoff for EZ 50kDa at 0.3 is based on IEDB peptides with NetMHCpan-4.0 rank less than 1% and NetMHCstab-1.0 half-life greater than 2 h (see Supplementary Fig. S3). Refer to Supplementary Fig. S2 for full gels of insert in (h).

Next, to extend the experience with the EZ MHC-I assay using the HLA-A*11:01 single chain protein, a total of 180 overlapping 8- to 12-mer peptides from the epidermal growth factor receptor (EGFR) driver mutations were designed. However, three peptides, QLMPFGSLL, PFGSLLDYV and GICLTSTVQLIM, were unstable and thus excluded. The remaining 177 peptides are derived from the exon 19 deletion (E746-A750 at the Leu-Arg-Glu-Ala sequence, EX), the exon 21 mutation (L858R, LR) and the exon 20 mutations (T790M, TM and C797S, CS), which are associated with a number of cancers including lung cancer 14 (Table 3; Supplementary Table S2). Here, the same pattern of EZ 50kDa trending downward with increasing NetMHCpan ranks is observed (Fig. 2f; Supplementary Data 1). More importantly, with a broader peptide pool, two peptides, CS1017 and CS1032, had measured stability similar to that of the antigenic Epstein-Barr virus peptide EBV1101 positive control, which surprisingly was not reflected by the NetMHCpan-4.0 and NetMHCstab-1.0 algorithms (Fig. 2f,g). Other possible exceptions include peptides EX1018 and LR1044, which have very poor NetMHCpan ranks but are marginally stabilizing with an EZ 50kDa score of 0.2 (Table 3). Although the EZ 50kDa cutoff is currently set at 0.3 (Supplementary Fig. S3), the exact minimum EZ 50kDa score required for peptide antigenicity remains undetermined, but these peptides are likely to escape thymic negative selection and be presented as pMHC-I complexes with varying density on non-professional antigen presenting cells.

Last, based on the good agreement observed between the EZ MHC-I assay and existing methods using the well-studied HLA-A*11:01 allele, understudied HLA alleles may finally be explored. Presently, understudied HLA alleles cannot be accurately predicted due to insufficient public data to develop suitable in silico algorithms and due to unfavorable low-throughput methods. Here, an understudied but frequent Asian allele, HLA-A*02:07, was evaluated using the rapid EZ MHC-I assay. The top eight predicted HLA-A*02:07 peptides from the same library of 177 EGFR peptides were selected using the NetMHCpan-4.0 algorithm. Also, EBV1101 and the Hepatitis B peptide HepB0207 were used as control peptides; EBV1101 is a predicted non-binder and HepB0207 is a known binder 15. Using the HLA-A*02:07 single chain protein, out of the eight EGFR-derived peptides, only CS1044 and EBV0201 peptides were found to be stabilizing (Fig. 2h). CS1044 has a poor NetMHCpan rank of 2.238%, and EBV0201 is an original HLA-A*02:01 binder and thus may be a cytotoxic T lymphocyte epitope shared with HLA-A*02:07. Also, EBV0201 (CLGGLLTMV) and CS1044 (QLMPFGSLLDYV) are unlike classical HLA-A*02:07 peptides, which tend to have D/P in position 3 15. These findings suggest that more remains to be learned about the highly polymorphic HLA alleles. Hence, understudied but important HLA alleles remain inaccurately predicted in silico but may be evaluated with the EZ MHC-I assay without sacrificing precious cells.
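The overlapping 8- to 12-mer peptide design used for the EGFR library can be illustrated with a simple sliding window that keeps only the peptides spanning a chosen position. This is a sketch under stated assumptions: the function name, the toy sequence and the position index are invented for illustration and are neither the EGFR sequences nor the library-design procedure used in this work.

```python
def overlapping_peptides(seq: str, position: int, min_len: int = 8, max_len: int = 12):
    """Return every peptide of length min_len..max_len that contains seq[position].

    position is a 0-based index into seq (hypothetical mutation site).
    """
    if not 0 <= position < len(seq):
        raise ValueError("position must index into seq")
    peptides = []
    for length in range(min_len, max_len + 1):
        # Earliest and latest window starts that still cover the position.
        first = max(0, position - length + 1)
        last = min(position, len(seq) - length)
        for start in range(first, last + 1):
            peptides.append(seq[start:start + length])
    return peptides


# Toy 24-residue sequence with a single 'N' at index 11 (not an EGFR sequence).
TOY_SEQ = "ACDEFGHIKLMNPQRSTVWYACDE"
toy_peptides = overlapping_peptides(TOY_SEQ, position=11)
print(len(toy_peptides), toy_peptides[:3])
```

When the flanking sequence is long enough on both sides, a window of length L yields L peptides covering the position, so the count grows with the range of lengths screened.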

Discussion
The EZ MHC-I assay framework largely depends on targeted proteolysis of destabilized protein. Proteolysis can occur non-enzymatically or enzymatically. However, non-enzymatic thermal proteolysis is inefficient, with prominent α and β2m chains of unchained MHC-I protein still present even at the higher temperature of 37 °C. Although most single chain pMHC-I proteins remain thermally stable, some, such as the HLA-A*02:07 protein described in this work, can be thermally proteolyzed but are still usable, as rescued pMHC-I proteins are thermostable at 37 °C 10. Based on an enzyme-exchangeable single chain pMHC-I molecular design, a cleavable bulging peptide allows peptide exchange. More importantly, enterokinase was repurposed to target accessible sites, such as intrinsically disordered regions in a destabilized α chain, and also free peptides 16. In this work, the non-specificity of enterokinase at 37 °C is shown towards destabilized proteins but not towards stable, folded single chain pMHC-I protein.
Enterokinase is naturally secreted by the Brunner's glands for digestion in the duodenum and plays no known biological role in antigen presentation by the MHC-I proteins. However, in this technology, the non-specificity of enterokinase was successfully harnessed to cleave residues in the destabilized α chain that do not bear the canonical DDDD|K motif but are exposed when the attached β2m chain leaves. The enchained β2m is needed to further destabilize the α chain, as most of the empty α chain remains undigested for unchained pMHC-I protein even at 37 °C. The broad pH range of enterokinase activity, between 6.0 and 8.5, also favors screening at pH 6.2, close to the luminal pH, and peptide exchange above room temperature 17,18. Moreover, in this work, obvious protein fragments from non-rescued single chain MHC-I protein treated with enterokinase are seen at 37 °C. Taken together, the EZ MHC-I assay for the highly polymorphic single chain pMHC-I proteins may be performed under mildly acidic luminal conditions similar to those of the endosomal compartments 17.
The pMHC-I protein stability is well regarded as a better predictor than affinity, as immunogenic peptides that bind more stably also better define CD8+ T cells 7. Also, Parker et al. have shown that the loss of 125I-labeled β2m strongly reflects peptide dissociation in the α chain 9. However, the lack of simple stability-based assays has prolonged the worldwide use of inaccurate affinity-based methods. Here, the use of a single chain construct promotes instability of the whole (empty) MHC-I protein and its fragmentation by enterokinase. This rapid technology has shown that strong peptide binders predicted by NetMHCpan-4.0 (< 0.5%) and stable peptides predicted by NetMHCstab-1.0 (> 2 h) generally also form stable pMHC-I proteins. However, weaker and non-canonical peptides remain challenging for current in silico algorithms. In particular, lengthy peptides, i.e. 15- and 16-mers, remain unpredicted and have been isolated from HLA class I molecules with potential roles in T cell immunity 19-21. Although lengthy peptides or even mix-and-match combinations remain unexplored here, the more versatile EZ MHC-I assay is feasible for studying lengthy peptides, post-translationally modified peptides and dual peptide occupancy.

Table 3. Comparison of HLA-A*11:01 peptides derived from EGFR mutations. The top 14 peptides from EGFR mutations, out of 177 peptides, ranked using the EZ MHC-I assay and the HLA-A*11:01 single chain protein. Five poorly scored peptides with EZ 50kDa less than 0.1 and NetMHCpan rank less than 5% are also included for comparison. EBV Epstein-Barr virus, EX exon 19 deletion, LR L858R, CS C797S, TM T790M. The EZ 50kDa score shows that some peptides poorly predicted by NetMHCpan-4.0 (> 2%) could be stabilizing epitopes. The scores from the NetMHCstab-1.0 algorithm are also shown for comparison.

Table 3 column headings: Index, Peptide ID, Sequence, EZ 50kDa, EZ 50kDa rank out of 177.

Here, the EZ MHC-I assay for HLA-A*11:01 has identified a set of 14 stabilizing peptides with varying NetMHCpan ranks derived from key EGFR mutations associated with non-small cell lung cancer (NSCLC). Approximately 80% of lung cancers are NSCLC, which can be driven by molecular EGFR mutations and ALK receptor tyrosine kinase translocations 22,23. Among somatic EGFR mutations, the classical exon 19 deletion and L858R represent the majority of EGFR mutations in NSCLC and are positive prognostic markers for treatment with the specific EGFR tyrosine kinase inhibitors (TKIs) gefitinib and erlotinib 24,25. However, mutants such as T790M and C797S are commonly associated with rapidly acquired resistance during TKI treatment, and thus there is a need for new strategies that specifically target such driver mutations 26. Hence, an alternative to targeting the classical ATP site of the kinase is to exploit T cell biology. Unlike MHC-II, which is not expressed by most epithelial cells, MHC-I is expressed on most nucleated cells. More importantly, EGFR inhibitors such as erlotinib, cetuximab and nimotuzumab can enrich peptide-MHC density on the skin (and likely other organs) and recruit T cell-driven processes 27. Thus, identifying relevant antitumor CD8+ T cells is a means to increase clinical efficacy.
In summary, the EZ MHC-I assay highlights the repurposing of intestinal enterokinase towards destabilized protein not bearing the standard Asp-Asp-Asp-Asp-Lys motif to significantly improve MHC-I peptide selection for pMHC-I multimer technology. Considering enterokinase in this broader context may reshape other protein instability-based assays. At present, in silico predictions are still limited to well-studied HLA alleles, canonical peptides and affinity-based binding data. The results from the stability-based EZ MHC-I assay could help advance the development of in silico tools and the discovery of more peptide epitopes across health and disease.

Methods

Production of single chain pMHC-I protein. A melittin-leader modified MultiBac acceptor pACEBac1 vector is used to make the secreted single chain pMHC-I protein. In the single chain trimer design, the unique BamHI and XbaI restriction sites introduce a peptide center within the peptide-β2m-α single chain module separated by two GS-rich spacers, and a peptide bearing the DDDD|K sequence to aid peptide dissociation and exchange.

ELISA assay with photolabile unchained pMHC-I protein and UV irradiation. The ELISA is performed using a photolabile pMHC-I protein mixture containing both biotinylated and non-biotinylated proteins. This minimizes false positives due to aggregated empty MHC-I protein: an excess of non-biotinylated protein, at a 97.5:2.5 molar ratio to biotinylated protein, minimizes the formation of non-specific biotinylated protein aggregates. Non-biotinylated photolabile pMHC-I proteins refolded from the E. coli expression system were also purified with Dynabeads-streptavidin to remove endogenous biotinylated pMHC-I protein. For the ELISA setup, 400 ng of anti-human β2m (BioLegend, cat no. 316302) in ELISA coating buffer (BioLegend, cat no. 421701) was coated onto Nunc Maxisorp ELISA plates (BioLegend, cat no. 423501) at 4 °C overnight. The next day, unbound anti-human β2m was washed out with 1× ELISA wash buffer (BioLegend, cat no. 421601) diluted in water, and the plate was blocked with 1× ELISA diluent B (BioLegend, cat no. 421205) diluted in phosphate-buffered saline (PBS) for 1 h at 25 °C. The blocking buffer was washed out with 1× ELISA wash buffer and the ELISA plate was tapped dry. Next, in a 96-well plate, a reactant mixture of 62.5 nM photolabile pMHC-I protein mixture, with or without 5 μM peptide, in PBS was added and irradiated on ice with UV for 2 × 5 min at 365 nm using the UV crosslinker chamber (UVP, cat no. CL-1000L). The reactant mixture was transferred to the anti-human β2m-coated ELISA plate and left to incubate for 2 h at room temperature. After incubation, the unbound reactant was washed out with 1× ELISA wash buffer and the plate was blocked with 1× ELISA diluent B for 1 h at room temperature.
The blocking buffer was washed out with 1× ELISA wash buffer and the ELISA plate was tapped dry. For detection of bound biotinylated pMHC-I protein, streptavidin-horseradish peroxidase was added and allowed to incubate for 30 min at room temperature before it was flicked off, the plate was washed with 1× ELISA wash buffer and tapped dry. Finally, ultra 3,3′,5,5′-tetramethylbenzidine was added for 10 min at room temperature before quenching with an equal volume of 2 M sulfuric acid. The absorbance at 450 nm was measured with a plate reference at 570 nm using the microplate reader and corrected using Eq. (1).

EZ MHC-I assay with single chain pMHC-I protein and enterokinase. The EZ MHC-I assay is performed in an Eppendorf tube containing 3 U of enterokinase (NEB P8070) for every 1 μg of single chain pMHC-I protein, with or without a peptide, diluted in 20 mM sodium cacodylate pH 6.2 and 150 mM NaCl buffer and left at 37 °C for 14 h (overnight). To stop peptide exchange and proteolysis, and to settle any condensate, the tube is spun at 14,000×g for 3 min at 4 °C prior to Laemmli SDS-PAGE without boiling.
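The enzyme dosing in the EZ MHC-I protocol above is a simple proportion (3 U of enterokinase per 1 μg of single chain pMHC-I protein). The helper below sketches that arithmetic; the function name and the example protein amount are illustrative assumptions, and no stock concentrations or reaction volumes are assumed because the text does not give them.

```python
# Values taken from the protocol text.
ENTEROKINASE_U_PER_UG = 3.0   # 3 U enterokinase per 1 ug single chain pMHC-I
INCUBATION_TEMP_C = 37
INCUBATION_HOURS = 14


def enterokinase_units(protein_ug: float) -> float:
    """Units of enterokinase needed for a given mass of single chain pMHC-I (ug)."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return ENTEROKINASE_U_PER_UG * protein_ug


# Hypothetical reaction with 2.5 ug of protein (amount chosen for illustration).
units = enterokinase_units(2.5)
print(f"{units} U enterokinase, {INCUBATION_TEMP_C} C for {INCUBATION_HOURS} h")
```

Keeping the ratio in one named constant makes it easy to scale a plate of reactions consistently when the protein amount per tube changes.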

Data availability
All data collected for the EZ MHC-I assay are included in the paper and/or the Supplementary Material. Additional data related to the paper may be requested from J.L.