SignalP 5.0 improves signal peptide predictions using deep neural networks

Abstract

Signal peptides (SPs) are short amino acid sequences in the amino terminus of many newly synthesized proteins that target proteins into, or across, membranes. Bioinformatic tools can predict SPs from amino acid sequences, but most cannot distinguish between various types of signal peptides. We present a deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs.

Code availability

SignalP 5.0 is available at http://www.cbs.dtu.dk/services/SignalP/. The web version of SignalP 5.0 is free for all users, while the standalone package is free for academic users (and can be provided upon request) but is licensed for a fee to commercial users.

Data availability

The data sets used for training and testing SignalP 5.0 can be downloaded from http://www.cbs.dtu.dk/services/SignalP/data.php.

Acknowledgements

S.B. acknowledges support from the Novo Nordisk Foundation (grant NNF14CC0001).

Author information

Contributions

J.J.A.A. designed the model architecture and trained the SignalP 5.0 method with help from C.K.S. K.D.T. collected the training and test data sets, performed the benchmarks and analyzed the results. C.K.S., T.N.P., O.W., S.B. and G.v.H. provided suggestions during the design of SignalP 5.0. K.D.T. and H.N. wrote the paper with input from J.J.A.A., C.K.S. and O.W. H.N. supervised and guided the project. All authors edited and approved the manuscript.

Corresponding author

Correspondence to Henrik Nielsen.

Ethics declarations

Competing interests

The downloadable version of SignalP 5.0 has been commercialized by the Technical University of Denmark (it is licensed for a fee to commercial users). The revenue from these commercial sales is divided between the program developers (J.J.A.A., K.D.T., C.K.S., T.N.P., O.W., S.B., G.v.H. and H.N.) and the Technical University of Denmark.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Box plot of the probability of the predicted class for correct and incorrect predictions.

A probability close to 1 means a highly reliable prediction. For Archaea, Gram-positive and Gram-negative bacteria the probability threshold is 0.25, the lowest value the most probable class can take when there are four possible classes (Sec/SPI, Tat/SPI, Sec/SPII and Other). For Eukarya this threshold is 0.5, as there are only two classes (Sec/SPI and Other). A probability close to the threshold means a very unreliable prediction. All classes (Sec/SPI, Tat/SPI, Sec/SPII and Other) are combined in this plot.
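The thresholds quoted above follow from the predicted class being the most probable one: if the class probabilities sum to 1, the largest of n values cannot fall below 1/n. The sketch below illustrates this in Python. It assumes a softmax over per-class scores, which this caption does not state, and the helper names (`softmax`, `predicted_class`, `reliability_floor`) are hypothetical and not part of SignalP 5.0.

```python
import math

# Class labels as listed in the caption.
PROKARYOTE_LABELS = ["Sec/SPI", "Tat/SPI", "Sec/SPII", "Other"]
EUKARYA_LABELS = ["Sec/SPI", "Other"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (assumed normalization)."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_class(probabilities, labels):
    """Return the most probable label together with its probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]


def reliability_floor(labels):
    """Lowest probability the predicted class can have: 1 / number of classes."""
    return 1.0 / len(labels)


# A maximally uncertain four-class prediction sits exactly on the floor.
uniform = softmax([0.0, 0.0, 0.0, 0.0])
label, probability = predicted_class(uniform, PROKARYOTE_LABELS)
print(label, probability, reliability_floor(PROKARYOTE_LABELS))
```

A predicted-class probability near `reliability_floor(labels)` therefore means the model barely prefers one class over the others, which is why the caption treats such predictions as very unreliable.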

Supplementary Figure 2 Performance of SignalP 5.0 on cleavage site detection when considering a window of 0, 1, 2 and 3 amino acids centered on the real cleavage site.
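One way to read the windowed evaluation is as a tolerance: a predicted cleavage site counts as correct if it lies within w positions of the annotated site, for w = 0, 1, 2 and 3. The Python sketch below follows that reading, which is an assumption rather than the authors' stated definition. The function names are hypothetical, and the positions are invented toy values for illustration, not results from the paper.

```python
def cleavage_site_hit(predicted, actual, window):
    """True if the predicted site is within `window` residues of the actual site."""
    if predicted is None or actual is None:
        return False  # no signal peptide predicted (or annotated): no hit
    return abs(predicted - actual) <= window


def hit_rate(pairs, window):
    """Fraction of (predicted, actual) pairs counted as hits at this window."""
    if not pairs:
        return 0.0
    hits = sum(cleavage_site_hit(p, a, window) for p, a in pairs)
    return hits / len(pairs)


# Toy (predicted, actual) cleavage positions; None marks a missed signal peptide.
toy_pairs = [(22, 22), (19, 20), (25, 23), (30, 26), (None, 18)]

for w in (0, 1, 2, 3):
    print(f"window +/-{w}: {hit_rate(toy_pairs, w):.2f}")
```

With w = 0 only exact matches count, and widening the window can only keep or raise the hit rate. This sketch divides by the number of annotated sites; the denominator used in the paper is not stated in this caption.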

Supplementary Figure 3 The SignalP 5.0 neural network architecture.

Supplementary information

Supplementary Information

Supplementary Figures 1–3, Supplementary Tables 1–12 and Supplementary Notes 1–3

Reporting Summary

About this article

Cite this article

Almagro Armenteros, J.J., Tsirigos, K.D., Sønderby, C.K. et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol 37, 420–423 (2019). https://doi.org/10.1038/s41587-019-0036-z

  • DOI: https://doi.org/10.1038/s41587-019-0036-z
