Innovation in mass spectrometry (MS) and the rapidly increasing throughput and sensitivity of MS instrumentation require adaptations and innovations in data processing tools. Here, we introduce MZmine 3, a scalable MS data analysis platform that supports hybrid datasets from various instrumental setups, including liquid and gas chromatography (LC and GC)–MS, ion mobility spectrometry (IMS)–MS and MS imaging. In particular, the integration of IMS–MS imaging and LC–IMS–MS datasets provides opportunities for spatial metabolomics analyses with increased annotation confidence.

Over the past decade, the MZmine project has evolved into a community-driven, collaborative effort. As an open-source ecosystem for MS data processing, MZmine is cross-platform software (Supplementary Note 1) that can be tuned for robust, scalable and reproducible data analysis on personal computers as well as high-performance supercomputers. The project has seen continuous development since its inception in 2004 (refs. 1,2). Community additions (Fig. 1a) introduced various functions, such as performant feature detection workflows3,4, modules for lipid annotation5 and strong ties to other community projects (Fig. 1b). Data exchange formats and direct interfaces (listed under ‘Tool integration’ in the documentation) enable downstream analysis in external tools, such as compound annotation in SIRIUS6 and statistical analysis in MetaboAnalyst7, and directly integrate MZmine results into the molecular networking ecosystem of the Global Natural Products Social Molecular Networking (GNPS) web platform (Supplementary Note 2)8,9,10.

Fig. 1: MZmine, an open-source community project for integrative LC–IMS–MS and IMS–MS data processing.

a, Overview of active developments and key additions to MZmine since the first publication, which led to over 180 modules that now drive interactive, reproducible and efficient data processing and visualization in MZmine 3. b, Data exchange formats and direct interfaces enable downstream analysis with strong ties to projects like GNPS, SIRIUS and MetaboAnalyst. c, The integrative LC–MS and IMS–MS imaging workflow applies feature detection in the retention time (RT), ion mobility and m/z dimensions to MS data stored in open or vendor formats. Comprehensive processing and annotation results are merged into an aligned feature list. d, An aligned feature list with one ion feature detected in LC–IMS–MS samples and aligned to one MALDI–IMS–MS ion feature image. The table shows annotation results (‘Lipid annotation’ column) and interactive charts in the columns ‘Shapes’ (extracted ion chromatograms), ‘Mobilograms’ (extracted ion mobilograms) and ‘Images’ (extracted ion images).

Recent advances in MS instrumentation push sensitivity, resolving power and data acquisition speed, resulting in increased data volume and complexity. Notably, IMS is gaining traction in the field because it adds a further separation dimension to LC–MS or imaging-based techniques like matrix-assisted laser desorption/ionization (MALDI)–MS. These advances introduce new acquisition modes (for example, parallel accumulation–serial fragmentation (PASEF))11 or enable the combination of IMS and imaging, which was shown to improve annotation quality in MS imaging12. Furthermore, the number of large-scale cohort and multifactorial studies in clinical, environmental and other fields is growing, as registered in the three main metabolomics data repositories: MassIVE/GNPS8, MetaboLights and Metabolomics Workbench13. The need for scalable, reproducible and flexible data analysis workflows that can combine MS data from various sources remains unaddressed by existing tools. For example, to combine LC–(IMS–)MS and MS imaging results from the same sample, users are forced to master multiple software tools12 that divide the workflow and are specialized for either chromatography–MS (for example, MS-DIAL, XCMS, OpenMS)14,15,16 or MS imaging (for example, METASPACE, rMSI, Cardinal MSI, SpectralAnalysis)17.

The integrative spatial metabolomics workflow in MZmine 3 (Fig. 1c) imports LC–IMS–MS and IMS–MS imaging datasets stored in either open or vendor-specific formats and processes them by non-targeted feature detection. This entails resolving peak shapes for ion features in both the retention time (RT) and ion mobility dimensions in LC–IMS–MS and extracting mobility-resolved ion image features with spatial distributions in IMS–MS imaging (Supplementary Figs. 1–3). Individual features from both methodologies are subsequently represented and aligned by their RT (LC only), m/z and ion mobility values. The resulting aligned feature list combines the strengths of the individual analytical methods by integrating the compound annotation capabilities of modern chromatography-based MS with spatial metabolite distributions that can be mapped to histological data, addressing the issue of missing MS2 data in most imaging studies. For data evaluation, MZmine organizes annotations in a feature table with interactive charts, exemplified in Fig. 1d for one ion feature detected in LC–IMS–MS samples and aligned to an ion image from one MALDI–IMS–MS imaging dataset. An exemplary spatial metabolomics workflow leading to LC–IMS–MS-resolved molecular networks, enriched with spatial ion feature information, is described in Supplementary Note 2 and Supplementary Fig. 4. Additional visualization modules (Supplementary Fig. 5) connect all available data dimensions; a fast memory-mapped data back end enables interactive exploration.
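To illustrate the alignment idea, the following minimal Python sketch pairs LC–IMS–MS features with imaging features whose m/z and ion mobility agree within user-set tolerances; RT is carried only on the LC side and therefore is not used for cross-method matching. This is not MZmine's implementation: the `Feature` class, the `matches` and `align` functions, the tolerance defaults and the numeric values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Feature:
    """One detected ion feature (hypothetical structure, not MZmine's data model)."""
    mz: float
    mobility: float
    rt: Optional[float] = None  # retention time; None for imaging features


def matches(lc: Feature, img: Feature, mz_tol_ppm: float, mobility_tol: float) -> bool:
    """Return True if both features agree in m/z (ppm) and ion mobility (absolute)."""
    ppm_error = abs(lc.mz - img.mz) / lc.mz * 1e6
    return ppm_error <= mz_tol_ppm and abs(lc.mobility - img.mobility) <= mobility_tol


def align(
    lc_features: List[Feature],
    image_features: List[Feature],
    mz_tol_ppm: float = 10.0,     # assumed tolerance, illustration only
    mobility_tol: float = 0.02,   # assumed tolerance, illustration only
) -> List[Tuple[Feature, Optional[Feature]]]:
    """Pair each LC feature with the in-tolerance imaging feature closest in m/z."""
    rows = []
    for lc in lc_features:
        candidates = [f for f in image_features
                      if matches(lc, f, mz_tol_ppm, mobility_tol)]
        # None marks an LC feature without an aligned ion image
        best = min(candidates, key=lambda f: abs(f.mz - lc.mz), default=None)
        rows.append((lc, best))
    return rows


# Illustrative values only
lc_list = [Feature(mz=760.5851, mobility=1.40, rt=12.3),
           Feature(mz=496.3398, mobility=1.05, rt=6.1)]
image_list = [Feature(mz=760.5860, mobility=1.41),
              Feature(mz=810.6010, mobility=1.45)]

for lc_feature, image_feature in align(lc_list, image_list):
    print(lc_feature.mz, "->", image_feature.mz if image_feature else None)
```

Each row of the result corresponds to one row of an aligned feature list: the LC feature supplies RT and any annotation derived from it, while the paired imaging feature, where one exists, supplies the spatial distribution.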

In MZmine 3, special attention was directed toward scalability due to the ever-increasing study sizes that lead to large volumes of raw data, particularly in the case of LC–IMS–MS datasets. Efficient memory management and parallelization removed bottlenecks, resulting in an 89% reduction in processing time for 250 dissolved organic matter samples when compared to MZmine 2. A stress test demonstrated high sample throughput, where the mean processing times amounted to 0.1% to 0.3% of the total data acquisition time for six different LC–MS datasets (Supplementary Note 3 and Supplementary Fig. 6). Further, MZmine 3 was benchmarked using 8,273 fecal LC–MS2 samples, requiring just 47 min of processing time (see hardware specifications in Supplementary Note 3).
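For orientation, the benchmark figures above can be converted into per-sample throughput with simple arithmetic. The short Python sketch below does this for the 8,273 samples processed in 47 min and shows how a processing-to-acquisition ratio is computed. The function names are hypothetical, and the 20 min run and 0.04 min processing time in the second call are made-up inputs rather than measurements from the study.

```python
def seconds_per_sample(total_minutes: float, n_samples: int) -> float:
    """Average wall-clock processing time per sample, in seconds."""
    return total_minutes * 60.0 / n_samples


def processing_fraction_percent(processing_minutes: float,
                                acquisition_minutes: float) -> float:
    """Processing time expressed as a percentage of acquisition time."""
    return 100.0 * processing_minutes / acquisition_minutes


# Figures reported in the text: 8,273 samples in 47 min
print(f"{seconds_per_sample(47, 8273):.2f} s per sample")

# Hypothetical inputs: a 20 min LC-MS run processed in 0.04 min
print(f"{processing_fraction_percent(0.04, 20):.1f}% of acquisition time")
```

Because processing is parallelized, the per-sample value is an average throughput on the benchmark hardware, not the latency of processing a single file.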

The improved performance of MZmine 3 over previous MZmine versions now allows processing of large datasets, including large-volume LC–IMS–MS data. For new users, the MZmine website contains detailed manuals and video tutorials, and the new processing wizard in MZmine provides starting points for various standard workflows and mass spectrometer types. In addition, a development tutorial is available for potential new contributors, and the modular design of MZmine enables testing and implementing new ideas within the MZmine framework.