SignalP 5.0 improves signal peptide predictions using deep neural networks


Signal peptides (SPs) are short amino acid sequences in the amino terminus of many newly synthesized proteins that target proteins into, or across, membranes. Bioinformatic tools can predict SPs from amino acid sequences, but most cannot distinguish between various types of signal peptides. We present a deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs.

Code availability

SignalP 5.0 is available at The web version of SignalP 5.0 is free for all users, while the standalone package is free for academic users (and can be provided upon request) but is licensed for a fee to commercial users.

Data availability

The data sets used for training and testing SignalP 5.0 can be downloaded from


J.J.A.A. designed the model architecture and trained the SignalP5 method with help from C.K.S. K.D.T. collected the training and test data sets, performed the benchmarks and analyzed the results. C.K.S., T.N.P., O.W., S.B. and G.v.H. provided suggestions during the design of SignalP5. K.D.T. and H.N. wrote the paper with input from J.J.A.A., C.K.S. and O.W. H.N. supervised and guided the project. All authors edited and approved the manuscript.

The downloadable version of SignalP 5.0 has been commercialized by the Technical University of Denmark (it is licensed for a fee to commercial users). The revenue from these commercial sales is divided between the program developers (J.J.A.A., K.D.T., C.K.S., T.N.P., O.W., S.B., G.v.H. and H.N.) and the Technical University of Denmark.

Supplementary Figure 1

Box plot of the probability of the predicted class for correct and incorrect predictions.

A probability close to 1 indicates a highly reliable prediction. For Archaea, Gram-positive and Gram-negative bacteria, the probability threshold is 0.25, as there are four possible classes (Sec/SPI, Tat/SPI, Sec/SPII and Other). For Eukarya, this threshold is 0.5, as there are only two classes (Sec/SPI and Other). A probability close to this threshold indicates a very unreliable prediction. All classes (Sec/SPI, Tat/SPI, Sec/SPII and Other) are combined in this plot.
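The thresholds in this caption follow from simple arithmetic: if the class probabilities sum to 1 and the predicted class is the one with the highest probability, that probability can never fall below 1 divided by the number of classes. The following is a minimal Python sketch of that reasoning, not code from SignalP 5.0; the function names and the example probabilities are hypothetical, and the assumption that the predicted class is the arg-max of a normalized distribution is ours.

```python
# Illustrative sketch only; names and example values are hypothetical.
PROKARYOTIC_CLASSES = ("Sec/SPI", "Tat/SPI", "Sec/SPII", "Other")
EUKARYOTIC_CLASSES = ("Sec/SPI", "Other")


def probability_floor(classes):
    """Lowest value the predicted-class probability can take."""
    return 1.0 / len(classes)


def predicted_class(probs):
    """Return (label, probability) of the most probable class.

    Assumes `probs` maps class labels to probabilities that sum to 1.
    """
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("class probabilities must sum to 1")
    label = max(probs, key=probs.get)
    return label, probs[label]


# A maximally uncertain four-class prediction sits exactly on the floor.
uncertain = dict.fromkeys(PROKARYOTIC_CLASSES, 0.25)
# A confident prediction sits close to 1 (made-up numbers).
confident = {"Sec/SPI": 0.97, "Tat/SPI": 0.01, "Sec/SPII": 0.01, "Other": 0.01}

print(probability_floor(PROKARYOTIC_CLASSES), probability_floor(EUKARYOTIC_CLASSES))
print(predicted_class(uncertain), predicted_class(confident))
```

Under these assumptions, a predicted-class probability near 0.25 (four classes) or 0.5 (two classes) means the model barely prefers one class over the others.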

Supplementary Figure 2

Performance of SignalP 5.0 on cleavage site detection when considering a window of 0, 1, 2 and 3 amino acids centered on the real cleavage site.
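One way to read the window sizes in this caption is as a tolerance: a predicted cleavage site counts as correct if it lies within a given number of residues of the annotated site. The Python sketch below illustrates that reading under stated assumptions. The function name, the use of `None` for "no site predicted", the plain fraction-correct score and the example positions are all hypothetical, and the original evaluation may define the window or the metric differently.

```python
# Illustrative sketch only; the scoring rule and data are assumptions.
def cleavage_site_hit_rate(true_sites, predicted_sites, tolerance):
    """Fraction of proteins whose predicted cleavage site lies within
    `tolerance` residues of the annotated site.

    `predicted_sites` may contain None where no site was predicted.
    """
    if len(true_sites) != len(predicted_sites):
        raise ValueError("inputs must have the same length")
    hits = sum(
        1
        for true, pred in zip(true_sites, predicted_sites)
        if pred is not None and abs(pred - true) <= tolerance
    )
    return hits / len(true_sites)


# Made-up positions for four proteins.
annotated = [22, 18, 30, 25]
predicted = [22, 19, 33, None]

for tolerance in (0, 1, 2, 3):
    print(tolerance, cleavage_site_hit_rate(annotated, predicted, tolerance))
```

With a tolerance of 0, only exact matches count; widening the tolerance credits near misses, which is why performance can only stay the same or increase as the window grows.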

Supplementary Figure 3

The SignalP 5.0 neural network architecture.

