a Spectral angle distributions comparing HCD Prosit 2020 predicted spectra against experimentally acquired spectra of non-spliced (gray) and proposed spliced (orange) peptides extracted from Liepe et al. (ref. 21). b Identification results by Mascot from the original study were retrieved and rescored with Prosit. The raw MS data were also retrieved, re-searched with MaxQuant and MSFragger, and rescored using Prosit. A single Percolator model was trained for confidence estimation on the results of the MaxQuant analysis. This model was then applied to the rescored Mascot and rescored MSFragger results. c Two mirror plots of one experimental spectrum (top spectrum in both) identified either as the proposed spliced peptide FAGDLVR|GVA (top mirror plot; the pipe symbol indicates the splicing position) or as the non-spliced alternative FAGDLVRNL (bottom mirror plot), each plotted against the corresponding HCD Prosit 2020 predicted spectrum (bottom spectrum in each). The spliced and non-spliced peptide sequences differ in 3 amino acids (Levenshtein distance 3). The spectral angle (SA) compares the predicted b- and y-ion intensities to the matching peaks in the experimental spectrum; peaks that were observed but not matched are excluded. Matching fragments are highlighted in black, whereas peaks without a match are shown in gray. Fragment ions that are present only in the predicted spectrum are marked with an asterisk (*). The blue and red portions of the matched b- and y-ions, respectively, indicate the normalized intensity difference of these fragments. The confidence score (Score) of each proposed match was estimated by the shared Percolator model and is indicated at the top. d Bar plots of proposed spliced peptides retained (blue) or rejected (red) at each of the consecutive quality-control filtering steps.
Raw and analysis data are available from the PRIDE repository under identifiers PXD021398 and PXD000394, and from Mendeley under identifier y2cvb5nvgn.1.
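The two quantities named in panel c can be illustrated with a minimal Python sketch. It assumes the commonly used normalized spectral angle, SA = 1 - 2·arccos(cosine similarity)/π, computed over intensity vectors that are already aligned on the predicted b- and y-ions, with unmatched positions set to 0. The function names and the toy intensities are hypothetical and are not taken from the Prosit code base. The Levenshtein helper is a standard dynamic-programming implementation, applied here to the two sequences from the caption.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle of two aligned intensity vectors (assumed definition)."""
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0  # no signal on one side: treat as no similarity
    cosine = sum(p * o for p, o in zip(predicted, observed)) / (norm_p * norm_o)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding outside [-1, 1]
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Toy intensities (hypothetical values, for illustration only).
toy_predicted = [0.9, 0.1, 0.4, 0.0]
toy_observed = [0.8, 0.0, 0.5, 0.0]
sa = spectral_angle(toy_predicted, toy_observed)

# Sequences from panel c, written without the splice marker.
distance = levenshtein("FAGDLVRGVA", "FAGDLVRNL")
```

Under this definition, identical intensity patterns give an SA of 1 and orthogonal patterns give 0. The edit distance computed for the two caption sequences is 3, which agrees with the value stated in panel c.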