Mass spectrometry is a predominant experimental technique in metabolomics and related fields, but metabolite structural elucidation remains highly challenging. We report SIRIUS 4 (https://bio.informatik.uni-jena.de/sirius/), which provides a fast computational approach for molecular structure identification. SIRIUS 4 integrates CSI:FingerID for searching in molecular structure databases. Using SIRIUS 4, we achieved identification rates of more than 70% on challenging metabolomics datasets.
SIRIUS 4 is written in Java; is open source under the GNU General Public License (version 3); and works on Windows, macOS X, and Linux. In addition to the graphical front end, a comprehensive command-line version allows batch processing and integration into workflows; integration into GNPS [1], OpenMS [2], and MZmine [4] is ongoing. We also provide source code, executable binaries, documentation, support, non-commercial training data, example files, and additional information on the SIRIUS website (https://bio.informatik.uni-jena.de/sirius/); a source copy is hosted on GitHub (https://github.com/boecker-lab/sirius). You can retrieve the InChIs of all compounds used to train CSI:FingerID from the web service (https://www.csi-fingerid.uni-jena.de/webapi/trainingstructures.csv?predictor=pos and https://www.csi-fingerid.uni-jena.de/webapi/trainingstructures.csv?predictor=neg).
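As an illustration of how the training-structure lists might be retrieved programmatically, the following Python sketch builds the two web-service URLs given above and collects every field that looks like an InChI. The column layout of the returned file is not described here, so the tab delimiter and the "first field starting with `InChI=`" rule are assumptions, and the function names are hypothetical helpers rather than part of SIRIUS or CSI:FingerID.

```python
from __future__ import annotations

import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint quoted in the availability statement above.
BASE_URL = "https://www.csi-fingerid.uni-jena.de/webapi/trainingstructures.csv"


def training_structures_url(predictor: str) -> str:
    """Return the download URL for the positive ('pos') or negative ('neg') predictor."""
    if predictor not in ("pos", "neg"):
        raise ValueError("predictor must be 'pos' or 'neg'")
    return f"{BASE_URL}?{urlencode({'predictor': predictor})}"


def extract_inchis(text: str, delimiter: str = "\t") -> list[str]:
    """Collect one InChI per row from delimited text.

    Assumption: the file layout is unknown, so each row is scanned for the
    first field that starts with 'InChI=' instead of relying on a column index.
    """
    inchis = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        for field in row:
            field = field.strip()
            if field.startswith("InChI="):
                inchis.append(field)
                break
    return inchis


def fetch_training_inchis(predictor: str, timeout: float = 30.0) -> list[str]:
    """Download the training structures for one predictor (requires network access)."""
    with urlopen(training_structures_url(predictor), timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return extract_inchis(text)
```

Calling `fetch_training_inchis("pos")` would return the structures for the positive ion mode predictor, provided the service is reachable. If the file turns out to be comma-separated, pass `delimiter=","` to `extract_inchis`; InChI strings themselves contain commas, so this works only if those fields are quoted.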
Data for the CASMI 2016 re-evaluation are available from https://bio.informatik.uni-jena.de/data under a Creative Commons CC-BY license. Cross-validation data for the GNPS search re-evaluation are available from https://bio.informatik.uni-jena.de/data/ (Creative Commons CC0 1.0 Universal license). Data for the American Gut project are available in the MassIVE database (MSV000080186 and MSV000080187; Creative Commons CC0 1.0 Universal license). The analysis can be accessed via the GNPS website (http://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=9bd16822c8d448f59a03e6cc8f017f43 and http://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=d26ae082b1154f73ac050796fcaa6bda). Data for the study of clothing with antibacterial properties are available at MassIVE (MSV000081379; Creative Commons CC0 1.0 Universal license). Analysis is available at the GNPS website (https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=a5e8ca1b7a9c42cfb45fbb2855e36721). Source data for Supplementary Figs. 6–8 are available online.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We gratefully acknowledge financial support from the Deutsche Forschungsgemeinschaft (BO 1910/20) to S.B. and the Academy of Finland (310107/MACOME) to J.R. We thank the GNPS community, S. Stein, F. Kuhlmann, and Agilent Technologies Inc. (Santa Clara, CA, USA) for providing data that were used to estimate the hyperparameters of SIRIUS 4 and to train CSI:FingerID. We also thank F. Kuhlmann and Agilent Technologies for data used to evaluate the isotope scoring.