Fig. 2: Deep learning framework Prosit for tryptic and non-tryptic peptide fragment intensity prediction. | Nature Communications

From: Deep learning boosts sensitivity of mass spectrometry-based immunopeptidomics

a The deep learning framework Prosit was trained on data available prior to this study (tryptic peptides, top panel; tryptic extension, middle panel) and data on non-tryptic peptides (bottom panel) generated in this study. b Beanplots comparing the prediction accuracy of the HCD Prosit 2020 model (red, this study) against the prediction accuracy of the previously published HCD Prosit 2019 model (gray, tryptic only, Gessulat and Schmidt et al., ref. 7) for the four introduced peptide sets (HLA class I, HLA class II, LysN, and AspN) and the previously published tryptic peptides. The number of underlying spectra (n) is indicated at the bottom. The black solid line and corresponding numbers indicate the median spectral angle for each distribution. c Beanplot comparing the prediction accuracy of the HCD Prosit 2020 model (red, this study) against the HCD Prosit 2019 model (gray, Gessulat and Schmidt et al., ref. 7) for singly charged peptides. The number of underlying spectra is indicated at the bottom. The black solid line and corresponding numbers indicate the median spectral angle for each model. d Mirror spectrum of the singly charged non-tryptic peptide YPYPVSNSV comparing the experimentally acquired HCD ProteomeTools spectrum (top panel, top spectrum) to the spectrum predicted by the HCD Prosit 2020 model (top panel, bottom spectrum) and the experimentally acquired CID ProteomeTools spectrum (bottom panel, top spectrum) to the spectrum predicted by the CID Prosit 2020 model (bottom panel, bottom spectrum). Fragment ions are labeled in blue and red for b- and y-ions, respectively. Matching peaks (present in both spectra) are shown in black, whereas peaks present only in the top (experimental) spectra are colored gray. Red and blue fractions of matching peaks indicate the normalized difference in intensity between the experimental and predicted spectra. e Beanplots comparing the prediction accuracy of the CID Prosit 2020 model between the training (light blue) and holdout (dark blue) sets for the four introduced peptide sets (HLA class I, HLA class II, LysN, and AspN) and the previously published tryptic peptides. The number of underlying spectra (n) is indicated at the bottom. The black solid line and corresponding numbers indicate the median spectral angle for each distribution. Raw and analysis data are available from the PRIDE repository with identifiers PXD004732, PXD010595, and PXD021013.
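
The spectral angle used as the accuracy measure in panels b, c, and e can be illustrated with a short sketch. It assumes the normalized spectral angle definition commonly used for comparing predicted and experimental fragment intensities, SA = 1 - 2·arccos(cosine similarity)/π, computed on intensity vectors aligned by fragment ion. The caption itself does not spell out the formula, so treat this as an assumption; the function name `spectral_angle` and the example intensities are hypothetical and not taken from the study.

```python
import math


def spectral_angle(observed, predicted):
    """Normalized spectral angle between two aligned fragment intensity vectors.

    Assumed definition: SA = 1 - 2 * arccos(cosine similarity) / pi.
    Both inputs must list intensities for the same fragment ions in the same order.
    """
    if len(observed) != len(predicted):
        raise ValueError("intensity vectors must have the same length")

    norm_obs = math.sqrt(sum(x * x for x in observed))
    norm_pred = math.sqrt(sum(x * x for x in predicted))
    if norm_obs == 0.0 or norm_pred == 0.0:
        # No signal in one spectrum: treat as no similarity (a choice made for this sketch).
        return 0.0

    cosine = sum(o * p for o, p in zip(observed, predicted)) / (norm_obs * norm_pred)
    # Guard against floating-point drift outside the domain of acos.
    cosine = max(-1.0, min(1.0, cosine))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Hypothetical intensities for four matched fragment ions (e.g. y1, y2, b2, b3).
experimental = [0.10, 1.00, 0.45, 0.05]
predicted = [0.12, 0.95, 0.50, 0.00]
score = spectral_angle(experimental, predicted)
```

Under this definition, a value close to 1 means the predicted and experimental intensity patterns are nearly identical, and a value of 0 means the two non-negative intensity vectors share no signal. The medians shown in the beanplots would then be the median of such per-spectrum scores.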
