DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput

Demichev, Vadim; Messner, Christoph B.; Vernardis, Spyros I.; Lilley, Kathryn S.; Ralser, Markus

doi:10.1038/s41592-019-0638-x

Brief Communication
Published: 25 November 2019

Nature Methods volume 17, pages 41–44 (2020)

Abstract

We present an easy-to-use integrated software suite, DIA-NN, that exploits deep neural networks and new quantification and signal correction strategies for the processing of data-independent acquisition (DIA) proteomics experiments. DIA-NN improves the identification and quantification performance in conventional DIA proteomic applications, and is particularly beneficial for high-throughput applications, as it is fast and enables deep and confident proteome coverage when used in combination with fast chromatographic methods.

**Fig. 1: DIA-NN workflow and its performance on conventional and short chromatographic gradients.**

**Fig. 2: LFQbench test performance of DIA-NN.**

Data availability

The newly generated mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE³² partner repository with the dataset identifier PXD014690; previously published data were also used to benchmark the software (repositories with identifiers PXD005573, PXD002952, PXD010529 and PXD006722). All the precursor and protein identification and quantification information has been uploaded to the OSF repository (https://doi.org/10.17605/OSF.IO/6G3UX).

Code availability

DIA-NN (1.6.0) is open-source and is freely available at https://github.com/vdemichev/diann under a permissive licence.

References

1. Yates, J. R., Ruse, C. I. & Nakorchevsky, A. Proteomics by mass spectrometry: approaches, advances, and applications. Annu. Rev. Biomed. Eng. 11, 49–79 (2009).
2. Aebersold, R. & Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 537, 347–355 (2016).
3. Geyer, P. E., Holdt, L. M., Teupser, D. & Mann, M. Revisiting biomarker discovery by plasma proteomics. Mol. Syst. Biol. 13, 942 (2017).
4. Zelezniak, A. et al. Machine learning predicts the yeast metabolome from the quantitative proteome of kinase knockouts. Cell Syst. 7, 269–283 (2018).
5. Bruderer, R. et al. Extending the limits of quantitative proteome profiling with data-independent acquisition and application to acetaminophen-treated three-dimensional liver microtissues. Mol. Cell. Proteom. 14, 1400–1410 (2015).
6. Meier, F., Geyer, P. E., Virreira Winter, S., Cox, J. & Mann, M. BoxCar acquisition method enables single-shot proteomics at a depth of 10,000 proteins in 100 minutes. Nat. Methods 15, 440–448 (2018).
7. Venable, J. D., Dong, M.-Q., Wohlschlegel, J., Dillin, A. & Yates, J. R. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat. Methods 1, 39–45 (2004).
8. Gillet, L. C. et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteom. 11, O111.016717 (2012).
9. Ludwig, C. et al. Data-independent acquisition-based SWATH-MS for qualitative and quantitative proteomics: a tutorial. Mol. Syst. Biol. 14, e8126 (2018).
10. Collins, B. C. et al. Multi-laboratory assessment of reproducibility, qualitative and quantitative performances of SWATH-mass spectrometry. Nat. Commun. 8, 291 (2017).
11. Vowinckel, J. et al. Cost-effective generation of precise label-free quantitative proteomes in high-throughput by microLC and data-independent acquisition. Sci. Rep. 8, 4346 (2018).
12. Bruderer, R. et al. Optimization of experimental parameters in data-independent mass spectrometry significantly increases depth and reproducibility of results. Mol. Cell. Proteom. 16, 2296–2309 (2017).
13. Reiter, L. et al. mProphet: automated data processing and statistical validation for large-scale SRM experiments. Nat. Methods 8, 430–435 (2011).
14. Elias, J. E. & Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4, 207–214 (2007).
15. Ting, Y. S. et al. PECAN: library-free peptide detection for data-independent acquisition tandem mass spectrometry data. Nat. Methods 14, 903–908 (2017).
16. Wang, J. et al. MSPLIT-DIA: sensitive peptide identification for data-independent acquisition. Nat. Methods 12, 1106–1108 (2015).
17. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
18. Röst, H. L. et al. OpenSWATH enables automated, targeted analysis of data-independent acquisition. Nat. Biotechnol. 32, 219–223 (2014).
19. MacLean, B. et al. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26, 966–968 (2010).
20. Peckner, R. et al. Specter: linear deconvolution for targeted analysis of data-independent acquisition mass spectrometry proteomics. Nat. Methods 15, 371–378 (2018).
21. Navarro, P. et al. A multicenter study benchmarks software tools for label-free proteome quantification. Nat. Biotechnol. 34, 1130–1136 (2016).
22. Sun, S. et al. MS-Simulator: predicting y-ion intensities for peptides with two charges based on the intensity ratio of neighboring ions. J. Proteome Res. 11, 4509–4516 (2012).
23. Kingma, D. P. & Ba, J. Adam: A Method for Stochastic Optimization. Preprint at arXiv http://arxiv.org/abs/1412.6980 (2014).
24. Storey, J. D. A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B Stat. Methodol. 64, 479–498 (2002).
25. Röst, H. L. et al. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat. Methods 13, 741–748 (2016).
26. Chambers, M. C. et al. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 30, 918–920 (2012).
27. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169 (2017).
28. Parker, S. J. et al. Identification of a set of conserved eukaryotic internal retention time standards for data-independent acquisition mass spectrometry. Mol. Cell. Proteom. 14, 2800–2813 (2015).
29. Deutsch, E. W., Lam, H. & Aebersold, R. PeptideAtlas: a resource for target selection for emerging targeting proteomics workflows. EMBO Rep. 9, 429–434 (2008).
30. Teleman, J. et al. DIANA—algorithmic improvements for analysis of data-independent acquisition MS data. Bioinformatics 31, 555–562 (2015).
31. Mülleder, M., Campbell, K., Matsarskaia, O., Eckerstorfer, F. & Ralser, M. Saccharomyces cerevisiae single-copy plasmids for auxotrophy compensation, multiple marker selection, and for designing metabolically cooperating communities. F1000Res 5, 2351 (2016).
32. Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019).

Acknowledgements

We thank R. Bruderer (Biognosys) for providing the spectral libraries. This work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001134), the UK Medical Research Council (FC001134), and the Wellcome Trust (FC001134), and received specific funding from the BBSRC (BB/N015215/1 and BB/N015282/1) and the Wellcome Trust (200829/Z/16/Z) as well as a Crick Idea to Innovation (i2i) initiative (grant number 10658).

Author information

Authors and Affiliations

Department of Biochemistry and The Milner Therapeutics Institute, University of Cambridge, Cambridge, UK
Vadim Demichev & Kathryn S. Lilley
The Francis Crick Institute, Molecular Biology of Metabolism laboratory, London, UK
Vadim Demichev, Christoph B. Messner, Spyros I. Vernardis & Markus Ralser
Department of Biochemistry, Charité Universitätsmedizin Berlin, Berlin, Germany
Markus Ralser

Contributions

V.D., M.R. and K.S.L. designed the study, V.D. and M.R. wrote the first manuscript draft, V.D. designed and implemented the algorithms, C.B.M., V.D. and S.I.V. performed the experiments, and all authors discussed the results and commented on the manuscript.

Corresponding author

Correspondence to Markus Ralser.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Allison Doerr was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1–10

Reporting Summary

About this article

Cite this article

Demichev, V., Messner, C.B., Vernardis, S.I. et al. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat Methods 17, 41–44 (2020). https://doi.org/10.1038/s41592-019-0638-x

Received: 15 August 2018
Accepted: 09 October 2019
Published: 25 November 2019
Issue Date: January 2020
DOI: https://doi.org/10.1038/s41592-019-0638-x
