DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput

Abstract

We present an easy-to-use integrated software suite, DIA-NN, that exploits deep neural networks and new quantification and signal correction strategies for the processing of data-independent acquisition (DIA) proteomics experiments. DIA-NN improves the identification and quantification performance in conventional DIA proteomic applications, and is particularly beneficial for high-throughput applications, as it is fast and enables deep and confident proteome coverage when used in combination with fast chromatographic methods.

Fig. 1: DIA-NN workflow and its performance on conventional and short chromatographic gradients.
Fig. 2: LFQbench test performance of DIA-NN.

Data availability

The newly generated mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (ref. 32) partner repository with the dataset identifier PXD014690; previously published data were also used to benchmark the software (repositories with identifiers PXD005573, PXD002952, PXD010529 and PXD006722). All the precursor and protein identification and quantification information has been uploaded to the OSF repository (https://doi.org/10.17605/OSF.IO/6G3UX).

Code availability

DIA-NN (1.6.0) is open-source and is freely available at https://github.com/vdemichev/diann under a permissive licence.

References

  1. Yates, J. R., Ruse, C. I. & Nakorchevsky, A. Proteomics by mass spectrometry: approaches, advances, and applications. Annu. Rev. Biomed. Eng. 11, 49–79 (2009).

  2. Aebersold, R. & Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 537, 347–355 (2016).

  3. Geyer, P. E., Holdt, L. M., Teupser, D. & Mann, M. Revisiting biomarker discovery by plasma proteomics. Mol. Syst. Biol. 13, 942 (2017).

  4. Zelezniak, A. et al. Machine learning predicts the yeast metabolome from the quantitative proteome of kinase knockouts. Cell Syst. 7, 269–283 (2018).

  5. Bruderer, R. et al. Extending the limits of quantitative proteome profiling with data-independent acquisition and application to acetaminophen-treated three-dimensional liver microtissues. Mol. Cell. Proteom. 14, 1400–1410 (2015).

  6. Meier, F., Geyer, P. E., Virreira Winter, S., Cox, J. & Mann, M. BoxCar acquisition method enables single-shot proteomics at a depth of 10,000 proteins in 100 minutes. Nat. Methods 15, 440–448 (2018).

  7. Venable, J. D., Dong, M.-Q., Wohlschlegel, J., Dillin, A. & Yates, J. R. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat. Methods 1, 39–45 (2004).

  8. Gillet, L. C. et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteom. 11, O111.016717 (2012).

  9. Ludwig, C. et al. Data-independent acquisition-based SWATH-MS for qualitative and quantitative proteomics: a tutorial. Mol. Syst. Biol. 14, e8126 (2018).

  10. Collins, B. C. et al. Multi-laboratory assessment of reproducibility, qualitative and quantitative performances of SWATH-mass spectrometry. Nat. Commun. 8, 291 (2017).

  11. Vowinckel, J. et al. Cost-effective generation of precise label-free quantitative proteomes in high-throughput by microLC and data-independent acquisition. Sci. Rep. 8, 4346 (2018).

  12. Bruderer, R. et al. Optimization of experimental parameters in data-independent mass spectrometry significantly increases depth and reproducibility of results. Mol. Cell. Proteom. 16, 2296–2309 (2017).

  13. Reiter, L. et al. mProphet: automated data processing and statistical validation for large-scale SRM experiments. Nat. Methods 8, 430–435 (2011).

  14. Elias, J. E. & Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4, 207–214 (2007).

  15. Ting, Y. S. et al. PECAN: library-free peptide detection for data-independent acquisition tandem mass spectrometry data. Nat. Methods 14, 903–908 (2017).

  16. Wang, J. et al. MSPLIT-DIA: sensitive peptide identification for data-independent acquisition. Nat. Methods 12, 1106–1108 (2015).

  17. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  18. Röst, H. L. et al. OpenSWATH enables automated, targeted analysis of data-independent acquisition. Nat. Biotechnol. 32, 219–223 (2014).

  19. MacLean, B. et al. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26, 966–968 (2010).

  20. Peckner, R. et al. Specter: linear deconvolution for targeted analysis of data-independent acquisition mass spectrometry proteomics. Nat. Methods 15, 371–378 (2018).

  21. Navarro, P. et al. A multicenter study benchmarks software tools for label-free proteome quantification. Nat. Biotechnol. 34, 1130–1136 (2016).

  22. Sun, S. et al. MS-Simulator: predicting y-ion intensities for peptides with two charges based on the intensity ratio of neighboring ions. J. Proteome Res. 11, 4509–4516 (2012).

  23. Kingma, D. P. & Ba, J. Adam: A Method for Stochastic Optimization. Preprint at arXiv http://arxiv.org/abs/1412.6980 (2014).

  24. Storey, J. D. A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B Stat. Methodol. 64, 479–498 (2002).

  25. Röst, H. L. et al. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat. Methods 13, 741–748 (2016).

  26. Chambers, M. C. et al. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918–920 (2012).

  27. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169 (2017).

  28. Parker, S. J. et al. Identification of a set of conserved eukaryotic internal retention time standards for data-independent acquisition mass spectrometry. Mol. Cell. Proteom. 14, 2800–2813 (2015).

  29. Deutsch, E. W., Lam, H. & Aebersold, R. PeptideAtlas: a resource for target selection for emerging targeting proteomics workflows. EMBO Rep. 9, 429–434 (2008).

  30. Teleman, J. et al. DIANA—algorithmic improvements for analysis of data-independent acquisition MS data. Bioinformatics 31, 555–562 (2015).

  31. Mülleder, M., Campbell, K., Matsarskaia, O., Eckerstorfer, F. & Ralser, M. Saccharomyces cerevisiae single-copy plasmids for auxotrophy compensation, multiple marker selection, and for designing metabolically cooperating communities. F1000Res 5, 2351 (2016).

  32. Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019).

Acknowledgements

We thank R. Bruderer (Biognosys) for providing the spectral libraries. This work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001134), the UK Medical Research Council (FC001134), and the Wellcome Trust (FC001134), and received specific funding from the BBSRC (BB/N015215/1 and BB/N015282/1) and the Wellcome Trust (200829/Z/16/Z) as well as a Crick Idea to Innovation (i2i) initiative (grant number 10658).

Author information

V.D., M.R. and K.S.L. designed the study, V.D. and M.R. wrote the first manuscript draft, V.D. designed and implemented the algorithms, C.B.M., V.D. and S.I.V. performed the experiments, and all authors discussed the results and commented on the manuscript.

Correspondence to Markus Ralser.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Allison Doerr was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Demichev, V., Messner, C.B., Vernardis, S.I. et al. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat Methods (2019) doi:10.1038/s41592-019-0638-x
