a Vennbars (simplification of Venn diagram as depicted in the inset) showing the number of peptides lost (red), shared (blue) and gained (green) from Prosit-based rescoring of MaxQuant results of data published by Sarkizova and Klaeger et al.16 (using Spectrum Mill HLA v2) for each of the 92 monoallelic cell lines investigated in this study. b Peptide motif plot of 1846 9-mers confidently identified in the cell line expressing allele C*12:03 from the published results (top panel) and 1369 9-mers added by the Prosit-based rescoring of MaxQuant results (bottom panel). Amino acids are colored according to their physicochemical properties (black denotes hydrophobic, red acidic, blue basic, purple neutral, and green polar amino acids). The difference between the motifs was estimated by Jensen-Shannon divergence (indicated in the bottom motif) comparing the positional weight matrices of both motifs. c Vennbars showing the number of peptides lost (red), shared (blue), and gained (green) when comparing results obtained from different workflows to results published by Sarkizova and Klaeger et al.16 (left of solid vertical line). The bar to the right of the vertical line shows the number of lost, shared, and gained peptides when comparing the Prosit-based rescored results of Spectrum Mill HLA v3 to the Prosit-based rescored results of MaxQuant. d Boxplots of the average emission probabilities (probability that a peptide is derived from a certain motif, see “Methods”) per allele of peptides shared (blue), gained (green) and lost (red) when comparing results obtained from the workflows indicated at the bottom of the plot. The number of alleles (n) for which an average emission probability was calculated is depicted at the bottom.
Peptides lost (not confidently identified) by the rescored SM HLA v3 workflow in comparison to the SM HLA v3 workflow are shown separately depending on whether the fragment intensities of the peptide could be predicted by Prosit (rescorable) or not (non-rescorable). The box indicates the interquartile range (IQR). The black line marks the median; notches extend 1.58 * IQR/sqrt(n) from the median and whiskers extend to 1.5 * IQR from the hinge. Data outside the whiskers are plotted individually as black dots. Raw and analysis data are available from the PRIDE repository with identifier PXD021398 and the MassIVE repository with identifiers MSV000084172 and MSV000080527.
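The boxplot conventions used in panel d can be reproduced with a few lines of code. The following is a minimal sketch, not the plotting code used for the figure: the function name `boxplot_stats` and the example values are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption that may differ slightly from the hinges computed by the plotting software.

```python
import math
import statistics


def boxplot_stats(values):
    """Summary statistics for one notched box, following the caption's conventions."""
    xs = sorted(values)
    n = len(xs)
    # Assumption: "inclusive" quartiles; plotting software may use Tukey hinges instead.
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    # Notch half-width: 1.58 * IQR / sqrt(n) on either side of the median.
    half_notch = 1.58 * iqr / math.sqrt(n)
    # Whiskers reach the most extreme data point within 1.5 * IQR of each hinge.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "median": median,
        "iqr": iqr,
        "notch": (median - half_notch, median + half_notch),
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


# Hypothetical input: ten made-up values, one of them far outside the box.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats["whiskers"], stats["outliers"])  # (1, 9) [30]
```

Because the notch width scales with 1/sqrt(n), boxes based on fewer alleles have wider notches, which is worth keeping in mind when comparing groups with different n.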