Chimeric MS/MS spectra contain fragments from multiple precursor ions and therefore hinder compound identification in metabolomics. Historically, deconvolution of these chimeric spectra has been challenging and relied on specific experimental methods that introduce variation in the ratios of precursor ions between multiple tandem mass spectrometry (MS/MS) scans. DecoID provides a complementary, method-independent approach where database spectra are computationally mixed to match an experimentally acquired spectrum by using LASSO regression. We validated that DecoID increases the number of identified metabolites in MS/MS datasets from both data-independent and data-dependent acquisition without increasing the false discovery rate. We applied DecoID to publicly available data from the MetaboLights repository and to data from human plasma, where DecoID increased the number of identified metabolites from data-dependent acquisition data by over 30% compared to direct spectral matching. DecoID is compatible with any user-defined MS/MS database and provides automated searching for some of the largest MS/MS databases currently available.
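The core idea, computationally mixing database spectra to match an acquired spectrum with LASSO regression, can be illustrated with a minimal sketch. It is not the DecoID implementation. It assumes that spectra have already been binned onto a shared m/z grid, that mixing weights are constrained to be non-negative, and that the penalty `lam` is hand-picked; the function name `nonneg_lasso` and the three toy reference spectra are hypothetical.

```python
def nonneg_lasso(columns, y, lam=0.01, n_iter=200):
    """Minimize 0.5*||y - A x||^2 + lam*||x||_1 subject to x >= 0
    by cyclic coordinate descent. `columns` holds the columns of A,
    one binned reference spectrum per column (assumed layout)."""
    x = [0.0] * len(columns)
    residual = list(y)  # equals y - A x while x is all zeros
    col_sq = [sum(v * v for v in col) for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            if col_sq[j] == 0.0:
                continue  # an empty reference spectrum cannot contribute
            # Correlation of column j with the residual after adding
            # back column j's own current contribution.
            rho = sum(c * r for c, r in zip(col, residual)) + col_sq[j] * x[j]
            # Soft-threshold by lam, then clip at zero (non-negativity).
            new_xj = max(0.0, (rho - lam) / col_sq[j])
            delta = new_xj - x[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                x[j] = new_xj
    return x


# Hypothetical reference spectra on a shared six-bin m/z grid.
ref_a = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
ref_b = [0.0, 0.0, 1.0, 0.8, 0.0, 0.0]
ref_c = [0.0, 0.5, 0.0, 0.0, 1.0, 0.2]

# Simulated chimeric spectrum: 70% compound A plus 30% compound B.
chimeric = [0.7 * a + 0.3 * b for a, b in zip(ref_a, ref_b)]

weights = nonneg_lasso([ref_a, ref_b, ref_c], chimeric)
print(weights)
```

In this toy case the fit assigns weights close to 0.7 and 0.3 to the two contributing references and zero to the third, even though the third shares a fragment bin with the first. The L1 penalty is what keeps weakly supported candidates out of the mixture, at the cost of slightly shrinking the weights that remain.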
All MS/MS data used in the evaluation of DecoID have been uploaded to the MetaboLights repository as study MTBLS2207 and are also available on the DecoID GitHub release (https://github.com/e-stan/DecoID/releases/). The publicly available dataset analyzed is available on MetaboLights as study MTBLS1066 (all reversed-phase, negative-mode data files were used). The MS/MS databases applied can be obtained at the curators’ websites (https://mona.fiehnlab.ucdavis.edu/, https://www.mzcloud.org/ and https://hmdb.ca/). The in-house IROA metabolite database is available within the DecoID release on GitHub (https://github.com/e-stan/DecoID/releases/), and the reference spectra have been uploaded to MoNA (submitter: E.S.; origin file: IROA_DB_for_mona_filtered_exported_addedInfo.msp).
Source code is available on Zenodo (ref. 40) and GitHub (https://github.com/e-stan/DecoID/). Included is an example dataset along with documentation for both the DecoID Python package and user interface. A standalone executable built for Windows can alternatively be downloaded from the Patti Lab website (http://pattilab.wustl.edu/software/DecoID/).
Blaženović, I., Kind, T., Ji, J. & Fiehn, O. Software tools and approaches for compound identification of LC–MS/MS data in metabolomics. Metabolites 8, 31 (2018).
Baker, E. S. & Patti, G. J. Perspectives on data analysis in metabolomics: points of agreement and disagreement from the 2018 ASMS fall workshop. J. Am. Soc. Mass Spectrom. https://doi.org/10.1007/s13361-019-02295-3 (2019).
Nikolskiy, I., Mahieu, N. G., Chen, Y.-J., Tautenhahn, R. & Patti, G. J. An untargeted metabolomic workflow to improve structural characterization of metabolites. Anal. Chem. 85, 7713–7719 (2013).
Nash, W. J. & Dunn, W. B. From mass to metabolite in human untargeted metabolomics: recent advances in annotation of metabolites applying liquid chromatography–mass spectrometry data. Trends Analyt. Chem. 120, 115324 (2019).
Tsugawa, H. et al. MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nat. Methods 12, 523–526 (2015).
Samanipour, S., Reid, M. J., Bæk, K. & Thomas, K. V. Combining a deconvolution and a universal library search algorithm for the nontarget analysis of data-independent acquisition mode liquid chromatography−high-resolution mass spectrometry results. Environ. Sci. Technol. 52, 4694–4701 (2018).
Li, H., Cai, Y., Guo, Y., Chen, F. & Zhu, Z.-J. MetDIA: targeted metabolite extraction of multiplexed MS/MS spectra generated by data-independent acquisition. Anal. Chem. 88, 8757–8764 (2016).
Yin, Y., Wang, R., Cai, Y., Wang, Z. & Zhu, Z.-J. DecoMetDIA: deconvolution of multiplexed MS/MS spectra for metabolite identification in SWATH-MS-based untargeted metabolomics. Anal. Chem. 91, 11897–11904 (2019).
Ting, Y. S. et al. PECAN: library-free peptide detection for data-independent acquisition tandem mass spectrometry data. Nat. Methods 14, 903–908 (2017).
Tsou, C.-C. et al. DIA-Umpire: comprehensive computational framework for data-independent acquisition proteomics. Nat. Methods 12, 258–264 (2015).
Wang, J. et al. MSPLIT-DIA: sensitive peptide identification for data-independent acquisition. Nat. Methods 12, 1106–1108 (2015).
Zhang, B., Pirmoradian, M., Chernobrovkin, A. & Zubarev, R. A. DeMix workflow for efficient identification of cofragmented peptides in high-resolution data-dependent tandem mass spectrometry. Mol. Cell. Proteomics 13, 3211–3223 (2014).
Dorfer, V., Maltsev, S., Winkler, S. & Mechtler, K. CharmeRT: boosting peptide identifications by chimeric spectra identification and retention time prediction. J. Proteome Res. 17, 2581–2589 (2018).
Houel, S. et al. Quantifying the impact of chimera MS/MS spectra on peptide identification in large-scale proteomics studies. J. Proteome Res. 9, 4152–4160 (2010).
Haug, K. et al. MetaboLights: a resource evolving in response to the needs of its scientific community. Nucleic Acids Res. 48, D440–D444 (2020).
Sud, M. et al. Metabolomics Workbench: an international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training and analysis tools. Nucleic Acids Res. 44, D463–D470 (2016).
Kind, T. et al. Identification of small molecules using accurate mass MS/MS search. Mass Spectrom. Rev. 37, 513–532 (2017).
Zhu, X., Chen, Y. & Subramanian, R. Comparison of information-dependent acquisition, SWATH and MSAll techniques in metabolite identification study employing ultrahigh-performance liquid chromatography–quadrupole time-of-flight mass spectrometry. Anal. Chem. 86, 1202–1209 (2014).
Lawson, T. N. et al. msPurity: automated evaluation of precursor ion purity for mass spectrometry-based fragmentation in metabolomics. Anal. Chem. 89, 2432–2439 (2017).
Peckner, R. et al. Specter: linear deconvolution for targeted analysis of data-independent acquisition mass spectrometry proteomics. Nat. Methods 15, 371–378 (2018).
Kessner, D., Chambers, M., Burke, R., Agus, D. & Mallick, P. ProteoWizard: open-source software for rapid proteomics tools development. Bioinformatics 24, 2534–2536 (2008).
Wishart, D. S. et al. HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Res. 46, D608–D617 (2018).
Horai, H. et al. MassBank: a public repository for sharing mass spectral data for life sciences. J. Mass Spectrom. 45, 703–714 (2010).
Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 58, 267–288 (1996).
Vinaixa, M. et al. Mass spectral databases for LC/MS- and GC/MS-based metabolomics: state of the field and future prospects. Trends Analyt. Chem. 78, 23–35 (2016).
Cho, K. et al. isoMETLIN: a database for isotope-based metabolomics. Anal. Chem. 86, 9358–9361 (2014).
Bonner, R. & Hopfgartner, G. SWATH data independent acquisition mass spectrometry for metabolomics. Trends Analyt. Chem. https://doi.org/10.1016/j.trac.2018.10.014 (2018).
Telu, K. H., Yan, X., Wallace, W. E., Stein, S. E. & Simón‐Manso, Y. Analysis of human plasma metabolites across different liquid chromatography/mass spectrometry platforms: cross-platform transferable chemical signatures. Rapid Commun. Mass Spectrom. 30, 581–593 (2016).
Schymanski, E. L. et al. Identifying small molecules via high-resolution mass spectrometry: communicating confidence. Environ. Sci. Technol. 48, 2097–2098 (2014).
Fiehn, O. et al. The metabolomics standards initiative (MSI). Metabolomics 3, 175–178 (2007).
Licha, D. et al. Untargeted metabolomics reveals molecular effects of ketogenic diet on healthy and tumor xenograft mouse models. Int. J. Mol. Sci. 20, 3873 (2019).
Spalding, J. L., Naser, F. J., Mahieu, N. G., Johnson, S. L. & Patti, G. J. Trace phosphate improves ZIC-pHILIC peak shape, sensitivity and coverage for untargeted metabolomics. J. Proteome Res. 17, 3537–3546 (2018).
Heller, S., McNaught, A., Stein, S., Tchekhovskoi, D. & Pletnev, I. InChI—the worldwide chemical structure identifier standard. J. Cheminform. 5, 7 (2013).
Smith, C. A., Want, E. J., O'Maille, G., Abagyan, R. & Siuzdak, G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching and identification. Anal. Chem. https://pubs.acs.org/doi/abs/10.1021/ac051437y (2006).
Stein, S. E. & Scott, D. R. Optimization and testing of mass spectral library search algorithms for compound identification. J. Am. Soc. Mass Spectrom. 5, 859–866 (1994).
Chen, Y. & Wang, M. Hardness of approximation for sparse optimization with L0 norm. Technical Report (2016).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Cho, K. et al. Targeting unique biological signals on the fly to improve MS/MS coverage and identification efficiency in metabolomics. Anal. Chim. Acta 1149, 338210 (2021).
Tautenhahn, R., Böttcher, C. & Neumann, S. Highly sensitive feature detection for high-resolution LC/MS. BMC Bioinformatics 9, 504 (2008).
Stancliffe, E. e-stan/DecoID: DecoID. https://doi.org/10.5281/zenodo.4783380 (Zenodo, 2021).
This work was supported by National Institutes of Health grants U01 CA235482 (to G.J.P.), R35 ES028365 (to G.J.P.) and R24 OD024624 (to G.J.P.).
G.J.P. is a scientific advisory board member for Cambridge Isotope Laboratories and has a research collaboration agreement with Thermo Fisher Scientific.
Peer review information: Nature Methods thanks Mingxun Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Stancliffe, E., Schwaiger-Haber, M., Sindelar, M. et al. DecoID improves identification rates in metabolomics through database-assisted MS/MS deconvolution. Nat Methods 18, 779–787 (2021). https://doi.org/10.1038/s41592-021-01195-3