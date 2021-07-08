MaxDIA data analysis workflow

MaxDIA is embedded into the MaxQuant software environment (Fig. 1) and shares with it the graphical user interface, computational infrastructure and many algorithmic workflow components applicable to both. It is vendor neutral, with direct support for the most common native vendor file formats for reading mass spectra, as well as the open mzML file format30. Generic DIA acquisition modes are supported, including overlapping windows, variable window sizes, pooled multiple windows and variable m/z–ion mobility regions for timsTOF instruments. MaxDIA can be operated in a classical library-based approach or in discovery DIA mode. In the former, DIA datasets are interrogated within MaxQuant by spectral libraries generated with MaxQuant, whereas the latter does not require acquisition of a spectral library. In discovery DIA mode, spectral libraries are generated by DeepMass:Prism15, a bidirectional recurrent neural network (BRNN) that enables precise prediction of spectral intensities from peptide sequences. Decoy spectra are generated by reversing library sequences under the constraint of preserving the cleavage characteristics of the protease that was used in the experiment and ensuring that the decoy peptide masses, retention times and ion mobility values follow the same multivariate distribution as the target peptides. DIA samples and libraries are then analyzed in an end-to-end workflow for peptide and protein identification and quantification. MaxQuant’s 3D or 4D feature detection3,23 (Fig. 2) and de-isotoping are performed on the precursor data and on all liquid chromatography with tandem mass spectrometry (LC–MS/MS) or LC–ion mobility spectrometry (IMS)–MS/MS fragment data domains corresponding to precursor selection windows. Defining MS/MS features in a multi-dimensional way is particularly important for fragment data, because it avoids over-interpretation of identification results. It also makes it possible to require that every MS/MS feature is used at most once in peptide identification. Without this precaution, features can be double-counted for the identification of peptides that are similar to each other, owing to sequence homology or to the presence or absence of a modification, even when there is insufficient evidence for the existence of both peptide forms.
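
As a minimal illustration of the decoy construction described above, the following Python sketch reverses a peptide sequence while keeping the C-terminal residue in place, so that a tryptic peptide ending in K or R yields a decoy with the same terminus and the same amino acid composition (and therefore the same mass). This is one common convention and an assumption here; the function name is hypothetical, and the additional constraint on retention time and ion mobility distributions is not modeled.

```python
def reverse_decoy(peptide: str) -> str:
    """Return a decoy by reversing all residues except the C-terminal one.

    Keeping the last residue fixed preserves the cleavage signature of a
    C-terminally cleaving protease such as trypsin (assumption of this sketch).
    """
    if len(peptide) < 2:
        return peptide
    # peptide[-2::-1] walks backwards from the second-to-last residue to the start.
    return peptide[-2::-1] + peptide[-1]


if __name__ == "__main__":
    target = "PEPTIDEK"
    decoy = reverse_decoy(target)
    print(target, "->", decoy)
```

Because the residues are only permuted, the decoy precursor mass equals the target precursor mass, whereas the fragment ion series differ.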

Fig. 1: Overview of the MaxDIA workflow. MaxDIA can be operated in library and discovery mode. Many concepts and algorithms (for instance, for protein quantification) are re-used from the conventional MaxQuant workflow for DDA data and have been further developed for DIA. This results in an end-to-end DIA software that contains many established MaxQuant concepts, such as label-free quantification with MaxLFQ or iBAQ quantification. RT, retention time.

Fig. 2: 3D/4D feature detection of precursors and fragments. a, Visualization of precursors and fragments of a peptide measured on an Orbitrap. The raw data can be visualized together with the peak detection results as heat maps and 3D models for precursor and fragment data in the graphical user interface of MaxQuant. b, Two peptides with nearly equal mass, both with charge 2 and having very similar retention times, are resolved by ion mobility on a timsTOF Pro mass spectrometer. A heat map visualizes intensities as a function of retention time and collision cross-section for the precursor isotope patterns. The two respective MS/MS spectra of fragments assigned to the precursors are shown. RT, retention time.

Bootstrap DIA

Central to the workflow is bootstrap DIA, which consists of multiple steps of matching the library spectra to DIA samples (Supplementary Fig. 1). These steps aim to bootstrap the DIA identification process based on the least possible prior knowledge. Bootstrap DIA replaces and substantially extends the concept of the ‘first search–main search’ strategy31 as well as the ‘retention time alignment’ and ‘match between runs’ used in DDA MaxQuant. More information is gained in each round, and this information is used in subsequent rounds. For instance, in the first round of matching, no retention time constraint is used. Based on these matches, a linear model is fit between the library and sample retention times, which is used to align runs to one another, even when gradient lengths differ substantially. This linear correction can be applied to the data, and, in the second round of matching, retention times can be filtered based on a time window that is automatically adapted to the distribution of all retention time differences after linear alignment. This filtering removes enough false-positive matches that, from the third round of matching, a non-linear retention time recalibration function can be determined. Applying the non-linear recalibration function subsequently allows more stringent filtering. Similar multi-step recalibration and filtering steps are applied to precursor and fragment masses as well as to collision cross-sections, if applicable. Supplementary Fig. 2 shows how target decoy distributions are affected after each matching step with increasingly stringent filters. The resulting non-linear precursor and fragment m/z recalibrations depending on m/z and retention time are shown in Supplementary Figs. 3 and 4.
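
The first two rounds of retention time handling can be sketched as follows. The example fits a least-squares line between library and sample retention times and then derives a filtering window from the spread of the residuals. How MaxDIA adapts the window is not specified in the text, so the robust rule used here (a multiple `k` of the scaled median absolute deviation) is an assumption, as are all function and parameter names.

```python
from statistics import median


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def align_and_filter(library_rt, sample_rt, k=3.0):
    """Round 1: linear alignment. Round 2: keep matches inside an adaptive window.

    The window is k times the scaled median absolute deviation (MAD) of the
    residuals around their median; this rule is an assumption of the sketch.
    """
    slope, intercept = fit_line(library_rt, sample_rt)
    residuals = [s - (slope * l + intercept) for l, s in zip(library_rt, sample_rt)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    window = k * 1.4826 * mad  # 1.4826 scales the MAD to a standard deviation
    keep = [abs(r - center) <= window for r in residuals]
    return slope, intercept, window, keep


if __name__ == "__main__":
    library = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
    sample = [2 * t + 5 for t in library]
    sample[5] = 200  # one false-positive match far from the trend
    slope, intercept, window, keep = align_and_filter(library, sample)
    print(round(slope, 3), round(window, 2), keep)
```

The matches that survive the window would then serve as input for a non-linear recalibration in a later round, which is not shown here.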

A consequence of the bootstrap DIA process is that precursor and fragment masses, retention times and ion mobility values are non-linearly aligned between each DIA sample and library without the need for spike-in standards. A prerequisite for this is that the DDA runs in the datasets used for the library are well aligned to each other, because the precision of alignment between library and DIA samples is otherwise limited by the variability of retention times and collision cross-sections within the library. Therefore, when processing libraries in MaxQuant, retention time and ion mobility alignments should be activated. A challenging attribute that can be learned from the data is non-linear retention time mappings between library and samples. This means that gradients between library and DIA runs do not need to be the same, and label-free quantification is possible even between DIA measurements with different gradient lengths. To evaluate the matching of different DIA gradient durations to a library, we generated a DDA library consisting of 16 high-pH reversed-phase fractions of a HeLa cell lysate measured with 25-min gradients and measured the same sample unfractionated with DIA using 30-, 60-, 90- and 120-min gradients. Supplementary Fig. 5 shows retention time alignments between the library and DIA samples, and precise quantification among samples with different gradient lengths is shown in Supplementary Fig. 6. These capabilities greatly enhance the flexibility of MaxDIA, making the software applicable to analyzing a broader range of samples.

Scoring of library-to-sample matches by machine learning

To quantify the quality of match between a library spectrum and a DIA sample at a given retention time and collisional cross-section (CCS) value, if applicable, we first find a precursor feature and all fragment features that match to the library spectrum with tolerances for m/z, retention time and CCS, dependent on the matching step in the bootstrap DIA workflow. To measure the match quality, we then calculate a score, which is the sum over all matching features of numbers between 0 and 1, each quantifying how far away from the apex the respective peak was hit (Supplementary Fig. 7). For a given library spectrum, this score is maximized over retention time and ion mobility. It is then ensured, through a second round of scoring, that every feature in a DIA sample is used, at most, for one library spectrum match.
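
The raw score and the at-most-once rule can be illustrated with a small sketch. Each matched feature contributes a number between 0 and 1, and the score is their sum. The second round is approximated here by a greedy pass: library spectra are visited in order of decreasing first-round score, and each one is rescored using only the features that have not yet been claimed. This greedy scheme and the data layout are assumptions for illustration; the text does not describe how the second round resolves conflicts.

```python
def raw_score(contributions):
    """Sum of per-feature contributions, each clipped to the interval [0, 1]."""
    return sum(min(1.0, max(0.0, c)) for c in contributions.values())


def assign_features_once(candidates):
    """Rescore matches so that every sample feature supports at most one spectrum.

    candidates maps a library spectrum id to {feature id: contribution}.
    Greedy resolution by first-round score is an assumption of this sketch.
    """
    used = set()
    final_scores = {}
    order = sorted(candidates, key=lambda s: raw_score(candidates[s]), reverse=True)
    for spectrum in order:
        free = {f: c for f, c in candidates[spectrum].items() if f not in used}
        final_scores[spectrum] = raw_score(free)
        used.update(free)
    return final_scores


if __name__ == "__main__":
    candidates = {
        "peptide_A": {"f1": 1.0, "f2": 0.8, "f3": 0.5},
        "peptide_B": {"f2": 0.9, "f4": 0.4},  # shares feature f2 with peptide_A
    }
    print(assign_features_once(candidates))
```

In this toy case the shared feature is credited to the higher-scoring spectrum only, so the second spectrum is left with the evidence that is unique to it.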

This score is then enhanced through machine learning. To this end, we construct a feature space that, in addition to the score, contains various properties of the match (Supplementary Fig. 8), such as mass errors (in p.p.m.) for precursor and fragment ions as deviations from the theoretical masses calculated from elemental compositions. The errors of retention times and ion mobilities are also included in the feature space. An interesting feature is the apex fraction, which is the ratio of the intensity at the current retention time to the maximum peak intensity. We employ a classification algorithm to separate ‘target’ from ‘decoy’ hits based on this feature space. We define the machine learning-based match score as the assignment probability to the ‘target’ class of the machine learning algorithm. This is a number expressing the affinity to the ‘target’ spectra as opposed to the ‘decoy’ spectra. To guard against overfitting, we determine these machine learning scores in five-fold cross-validation, such that a match for which the machine learning score is calculated has not been used for training the model that is used for its prediction.
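
The cross-validation scheme can be sketched independently of the classifier. In the example below, every match is scored by a model trained on the other folds only. To keep the sketch dependency-free, a toy nearest-centroid scorer stands in for the actual classifier; it is not the algorithm used in MaxDIA, and all names are hypothetical.

```python
import random


def cross_validated_scores(rows, labels, train, n_folds=5, seed=0):
    """Score each row with a model that never saw that row during training."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = [None] * len(rows)
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        model = train([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in held_out:
            scores[i] = model(rows[i])
    return scores


def train_centroid(rows, labels):
    """Toy stand-in classifier (label 1 = target, 0 = decoy).

    Returns a function giving a pseudo-probability of the target class from
    the relative distance to the target and decoy centroids.
    """
    def centroid(cls):
        members = [r for r, l in zip(rows, labels) if l == cls]
        return [sum(col) / len(members) for col in zip(*members)]

    target, decoy = centroid(1), centroid(0)

    def predict(row):
        d_target = sum((a - b) ** 2 for a, b in zip(row, target)) ** 0.5
        d_decoy = sum((a - b) ** 2 for a, b in zip(row, decoy)) ** 0.5
        total = d_target + d_decoy
        return d_decoy / total if total > 0 else 0.5

    return predict


if __name__ == "__main__":
    # Two illustrative features per match: raw score and apex fraction.
    targets = [[5.0 + 0.1 * i, 1.0] for i in range(10)]
    decoys = [[1.0 + 0.1 * i, 0.0] for i in range(10)]
    scores = cross_validated_scores(targets + decoys, [1] * 10 + [0] * 10, train_centroid)
    print([round(s, 2) for s in scores])
```

Any classifier that exposes a class probability could be passed as `train` in place of the toy model.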

We used several different classification algorithms and monitored their effect on the identification performance of MaxDIA. We compared the performances of XGBoost22, fully connected multi-hidden layer neural networks, random forests32 and AdaBoost (Supplementary Fig. 9), scanning, for each algorithm, suitable ranges of meta-parameters. We found that XGBoost performs best among the tested algorithms, in contrast to Demichev et al.10, who found neural networks to perform favorably. This choice is also different from DDA where, for similar purposes, support vector machine-based methods are used33. XGBoost provides information on the importance of features for classification (Supplementary Fig. 8). We found that, in the library-based approach, the feature defining whether the precursor has an isotope pattern assigned or was seen only as a single peak is of greater importance than the raw score itself. Furthermore, retention time, precursor mass errors, number of modifications and missed cleavages were among the top ten highest ranked features. Also among the top ten is the ‘sample fragment overlap’, which quantifies if and to what extent the N- and C-terminal ion series are overlapping in the DIA sample, thereby placing restrictions on the precursor mass.

Identification performance and quantification precision

To evaluate the performance of MaxDIA, we ran it, as well as Spectronaut 13 and Spectronaut 14, on a dataset comprising 27 technical replicate injections of peptides derived from the human HepG2 cell line measured in DIA as well as a DDA library created from 12 high-pH reversed-phase fractions (Methods). Using default parameters in all software packages, including a 1% FDR on precursor and protein levels, we obtained 6,238 protein groups mapped to Entrez Gene identifiers with MaxDIA compared to 6,015 with Spectronaut 13 and 6,304 with Spectronaut 14, with an overlap of 5,542 among all software platforms (Fig. 3a). MaxDIA found 7.4% more peptides than Spectronaut 13 and 5.8% more than Spectronaut 14 at 1% library-to-DIA-matches FDR. We found several peptide properties to be similarly distributed among the identification results of the two software platforms (Supplementary Fig. 10), including retention time, precursor charge and mass-to-charge ratio and precursor mass error. In addition, the length distribution of identified peptides was very similar between the two analysis software packages (Fig. 3b). Peptides that were uniquely found by MaxQuant were biased toward low signal intensity (Supplementary Fig. 10a).

Fig. 3: Performance evaluation. Twenty-seven technical replicates of HepG2 cell lysate were analyzed on an Orbitrap mass spectrometer (Methods). a, Number of identified protein groups with 1% FDR on protein and peptide level and number of peptides at 1% library-to-DIA-sample FDR obtained with MaxDIA, Spectronaut 13 and Spectronaut 14. b, Histograms of peptide lengths identified with MaxDIA (blue) and Spectronaut 13 (red). c, Number of proteins with, at most, x out of 27 valid values for Spectronaut 13 (red), Spectronaut 14 (magenta) and MaxDIA with MaxLFQ minimum ratio count = 1 (blue, dashed) and = 2 (blue, solid). The multiple curves within each of the two MaxQuant series correspond to seven different choices for the transfer q value (0.01, 0.03, 0.05, 0.1, 0.3, 0.5 and 1). d, Histograms of coefficients of variation for analyses with default settings in MaxDIA (solid blue) and in Spectronaut 13 and Spectronaut 14 (open). e, log–log scatter plot of LFQ intensities between two representative replicates obtained with MaxQuant. The two replicates were chosen to have the median Pearson correlation of all pairwise replicate comparisons. f, Same as in e for Spectronaut intensities. Similarly, the two replicates were chosen to represent the median Pearson correlation coefficient of all pairwise comparisons. g, Heat map with all pairwise Pearson correlations among the 27 replicates for MaxDIA (upper triangle) and Spectronaut (lower triangle). The two values corresponding to the comparisons in e and f are marked with red squares. h, log–log scatter plot of iBAQ protein intensities from MaxDIA against Spectronaut protein intensities. i, log–log scatter plot of MaxDIA iBAQ values averaged over the replicates against RPKM values from RNA-seq data. j, Same as i with protein intensities from Spectronaut.

Although DIA is thought to be better in terms of data completeness34,35 compared to DDA, we observe that this depends on the algorithmic details, and that there is a tradeoff between data completeness and confidence of protein identification within a specific sample, as opposed to the whole dataset. After identifying peptides and proteins for the whole dataset, we apply a ‘transfer q-value’ cutoff to the identifications of matches in each sample. Setting it to 1 implies that no sample-specific restrictions are applied and that the peptide is quantified whenever any evidence is found for its existence. A transfer q value of 0.01 (equal to the global q value of library-to-sample matches) results in stringent identification in every sample and, hence, certainty about the actual sample-specific presence of peptides and proteins. We scanned through seven values of the transfer q value between 0.01 and 1 and monitored the number of proteins that have a certain number or fewer valid values in terms of label-free quantification (LFQ) intensities (Fig. 3c). As expected, for larger transfer q values, the curves are flatter and higher in terms of total protein numbers. When using 1 for the ‘minimum ratio count’ parameter of the LFQ algorithm, most parts of all curves are above the line for the Spectronaut 13 software and slightly below for the Spectronaut 14 software. For ‘minimum ratio count’ = 2, which ensures higher accuracy of quantification, the array of curves intersects the Spectronaut curves. The ‘minimum ratio count’ parameter requires at least that many peptide features to be shared for a protein in a specific comparison between two samples11. After evaluating the accuracy of benchmark quantification results on several mass spectrometry platforms (see, for instance, Supplementary Fig. 15 for timsTOF data), we decided to select 0.3 as the default value for the transfer q value. Study-specific objectives (completeness of quantification versus certainty of identification in individual samples) might suggest deviations from this default value.
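
The effect of the transfer q value can be illustrated with a short sketch. A peptide must first be accepted for the whole dataset at the global q value; its matches in individual samples are then kept for quantification only if their q value does not exceed the transfer q value. The record layout and the way dataset-level acceptance is derived here (best match q value per peptide) are simplifying assumptions, not a description of the MaxDIA internals.

```python
def quantifiable_matches(matches, global_q=0.01, transfer_q=0.3):
    """Keep sample-level matches of globally accepted peptides.

    matches is a list of dicts with the hypothetical keys
    "peptide", "sample" and "q" (q value of the match in that sample).
    """
    # A peptide is accepted for the dataset if any of its matches passes global_q.
    accepted = {m["peptide"] for m in matches if m["q"] <= global_q}
    # Within each sample, the looser transfer_q decides whether it is quantified.
    return [m for m in matches if m["peptide"] in accepted and m["q"] <= transfer_q]


if __name__ == "__main__":
    matches = [
        {"peptide": "A", "sample": "s1", "q": 0.005},
        {"peptide": "A", "sample": "s2", "q": 0.2},
        {"peptide": "A", "sample": "s3", "q": 0.8},
        {"peptide": "B", "sample": "s1", "q": 0.05},  # never accepted globally
    ]
    for tq in (0.01, 0.3, 1.0):
        kept = quantifiable_matches(matches, transfer_q=tq)
        print(tq, [(m["peptide"], m["sample"]) for m in kept])
```

Raising the transfer q value adds valid values for peptide A in further samples, which mirrors the tradeoff between completeness and sample-specific certainty described above.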

The distribution of coefficients of variation (CVs) (Fig. 3d) indicates substantially higher quantification precision obtained with MaxLFQ (described below) in MaxDIA compared to both Spectronaut versions, with median CVs of 0.072, 0.109 and 0.114, respectively. Figure 3e,f shows typical log–log scatter plots of protein intensities between replicates displaying fewer outliers and higher Pearson correlation for MaxDIA. All pairwise replicate Pearson correlations of logarithmic intensities are represented as a heat map in Fig. 3g for both programs, showing consistently higher correlations for MaxDIA (median 0.993) compared to Spectronaut (median 0.977). We found a good overall agreement between averaged Spectronaut intensities and MaxDIA intensity-based absolute quantification (iBAQ) values (Fig. 3h) with a Pearson correlation of 0.87. We performed mRNA versus protein copy number comparisons based on reads per kilobase per million mapped reads (RPKM)36 and iBAQ37 values, respectively, using MaxDIA and Spectronaut (Fig. 3i,j). Both comparisons showed similar correlations between mRNA and protein levels, which are also compatible with correlations typically found in such studies38.
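
For reference, the coefficient of variation of a protein across replicates is the standard deviation of its intensities divided by their mean. The sketch below computes it per protein and reports the median over proteins. Whether the sample or the population standard deviation was used, and how missing values were treated, is not stated in the text; the choices below are assumptions.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(intensities):
    """CV = sample standard deviation / mean over valid (non-missing) values."""
    valid = [v for v in intensities if v is not None and v > 0]
    if len(valid) < 2:
        return None  # CV is undefined with fewer than two valid values
    return stdev(valid) / mean(valid)


def median_cv(protein_table):
    """Median CV over all proteins with a defined CV.

    protein_table maps a protein id to its list of replicate intensities.
    """
    cvs = [coefficient_of_variation(v) for v in protein_table.values()]
    return median(c for c in cvs if c is not None)


if __name__ == "__main__":
    table = {
        "P1": [100.0, 100.0, 100.0],
        "P2": [90.0, 100.0, 110.0],
        "P3": [80.0, 100.0, 120.0],
    }
    print(median_cv(table))
```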

Accuracy of FDR estimates and discovery DIA

To evaluate the reliability of FDR estimates using MaxDIA’s target-decoy strategy, we used a pooled DDA library generated from mixed human and maize samples, with corresponding DIA runs comprising only human samples34. Hence, every match identified as being derived from the maize proteome is a known false-positive identification (having discarded peptides that are shared among proteins of the two species). This enables calculation of an ‘external’ FDR, which is calculated independently of the ‘internal’ FDR estimated by the decoy approach in MaxDIA. Figure 4a compares internal and external FDRs on match, peptide and protein group levels. The curves for internal and external FDR are in very good agreement on all three levels. When comparing the numbers of identified matches, peptides and protein groups at 1% FDR, which is often taken as a default value in shotgun proteomics, the numbers differed by only 3.0%, 3.4% and 5.0%, respectively, between internally and externally controlled FDRs. Hence, our decoy-based FDR estimates are in good agreement with external FDR calculations.
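
The two FDR estimates can be contrasted in a small sketch. At a given score threshold, the internal estimate uses the number of passing decoys, whereas the external estimate uses the number of passing target matches assigned to maize, which are known to be false. Both are computed here as simple ratios to the number of passing targets; any correction factors (for example, for the relative sizes of the human and maize parts of the library) are omitted, and the record layout is hypothetical.

```python
def fdr_estimates(matches, threshold):
    """Return (internal_fdr, external_fdr) at a score threshold.

    matches is a list of dicts with the hypothetical keys "score",
    "decoy" (bool) and "species" ("human" or "maize").
    """
    passing = [m for m in matches if m["score"] >= threshold]
    targets = [m for m in passing if not m["decoy"]]
    n_targets = len(targets)
    if n_targets == 0:
        return 0.0, 0.0
    n_decoys = len(passing) - n_targets
    n_maize = sum(1 for m in targets if m["species"] == "maize")
    # Internal: decoy-based estimate. External: known false positives (maize).
    return n_decoys / n_targets, n_maize / n_targets


if __name__ == "__main__":
    matches = (
        [{"score": 0.9, "decoy": False, "species": "human"}] * 98
        + [{"score": 0.9, "decoy": False, "species": "maize"}] * 2
        + [{"score": 0.9, "decoy": True, "species": "human"}] * 1
        + [{"score": 0.1, "decoy": True, "species": "maize"}] * 50
    )
    print(fdr_estimates(matches, threshold=0.5))
```

Sweeping the threshold and plotting the number of passing targets against each estimate would give curves of the kind compared in the text.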

Fig. 4: Internal and external FDR. a, Number of identifications (blue: matches; green: peptides; red: protein groups) as a function of estimated FDR. The FDR is estimated once with the ‘internal’ target-decoy method implemented in MaxQuant (solid lines) and once with the ‘external’ method, which uses mixed maize and human samples for generating the library and only human samples in the DIA runs (dashed lines). b, Same as in a but using in silico predicted libraries generated using DeepMass:Prism15. c, Same as a but using the raw score instead of the machine learning–derived score. d, Same as b but using the raw score instead of the machine learning–derived score.

Given these results, we investigated how accurate the FDR estimates are for cases in which the library is dissimilar to the DIA sample. To this end, we assembled a library of in silico predicted spectra based on DeepMass:Prism15 consisting of all tryptic peptides digested from all human UniProt39 sequences (Release 2019_05 containing 20,959 proteins) without missed cleavages. We additionally generated predicted retention times for each in silico spectrum based on a BRNN used previously for the same purpose15. Using this library with the same DIA dataset as in Fig. 4a, we generated the same curves for internal and external FDRs as before (Fig. 4b). Here as well, we observed good agreement between internal and external FDRs. In particular, at an FDR of 1%, the number of identified protein groups differed by only 1.5%. We did, however, identify 39% more protein groups with the in silico library compared to the measured library. This highlights that MaxDIA does not require that spectral libraries are generated from matching samples in a project-specific manner, and yet FDRs are still reliably controlled. This enables the use of MaxDIA in a ‘discovery’ mode (discovery DIA), which is not biased by a library and completely hypothesis free in terms of which proteins can be found, by using in silico predicted libraries for all protein sequences. We repeated all analyses while replacing the DeepMass:Prism algorithm with two other spectral prediction methods (wiNNer15 and PROSIT16), finding no substantial differences resulting from different choices among these prediction algorithms (Supplementary Fig. 11).

We additionally repeated these analyses using the raw matching score instead of the machine learning-improved score (Fig. 4c,d). This revealed that the agreement of internal and external FDR does not depend on whether the XGBoost-based machine learning was used to adjust the scoring. However, the use of machine learning did substantially increase peptide (83% and 58% for library DIA and discovery DIA, respectively) and protein group (28% and 18%, respectively) identifications.

MaxLFQ adaptation for DIA

A prime example of the re-use and continued development of algorithms from DDA MaxQuant to MaxDIA is the label-free quantification algorithm MaxLFQ11. Here, quantification is based on first calculating all pairwise peptide ratios between samples, which are then summarized by the intensity profile that best fits all the pairwise ratios. This procedure can be generalized to DIA by replacing a single ratio per peptide with multiple ratios derived from precursor intensities and from the most intense fragment peaks (Supplementary Fig. 12). This approach naturally implements hybrid quantification of precursor and fragment intensities.
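
The generalization can be sketched as follows. For each pair of samples, the log ratio is taken as the median over all shared signals of a protein, where a signal may be a precursor or an individual fragment. The intensity profile is then the least-squares solution that best reproduces the pairwise log ratios. When every pair of samples has a valid ratio, this solution (centered at zero) is simply the row mean of the antisymmetric log-ratio matrix, which is what the sketch uses. Handling of missing pairs, the minimum ratio count and the rescaling to absolute intensities are omitted, and all names are hypothetical.

```python
from math import log2
from statistics import median


def pairwise_log_ratio(signals_a, signals_b):
    """Median log2 ratio over the signals shared by two samples, or None."""
    shared = [k for k in signals_a if k in signals_b]
    if not shared:
        return None
    return median(log2(signals_a[k] / signals_b[k]) for k in shared)


def lfq_profile(samples):
    """Relative log2 intensity profile of one protein across samples.

    samples is a list of {signal id: intensity}. Assumes every pair of
    samples shares at least one signal (complete pairwise ratio matrix).
    """
    n = len(samples)
    profile = []
    for i in range(n):
        ratios = [pairwise_log_ratio(samples[i], samples[j]) for j in range(n) if j != i]
        if any(r is None for r in ratios):
            raise ValueError("sketch requires a ratio for every pair of samples")
        # Least-squares solution for a complete ratio matrix: the row mean.
        profile.append(sum(ratios) / n)
    return profile


if __name__ == "__main__":
    s1 = {("PEPTIDEK", "precursor"): 100.0, ("PEPTIDEK", "y5"): 50.0, ("PEPTIDEK", "y6"): 10.0}
    s2 = {k: 2 * v for k, v in s1.items()}
    s3 = {k: 4 * v for k, v in s1.items()}
    s3[("PEPTIDEK", "y6")] = 400.0  # fragment distorted by an interference
    print(lfq_profile([s1, s2, s3]))
```

Because the median is taken over precursor and fragment signals together, the single distorted fragment in the third sample does not shift the resulting profile.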

To benchmark quantification accuracy, we downloaded a four-species dataset with well-defined small ratios between replicate groups34. Ratios are expected to be 0%, 10%, 20% or 30%, depending on the species (Homo sapiens, Caenorhabditis elegans, Saccharomyces cerevisiae and Escherichia coli). We tested several combinations of precursor, fragment or mixed quantification, with fragment intensities either summed up or kept separate. We measured the variability as the interquartile range of ratios within each species and summed these over the four species (Fig. 5a). We found that hybrid quantification between precursors and fragments with fragment intensities kept separate for individual ion types in LFQ resulted in the smallest quantification errors measured as the sum of the interquartile ranges of ratio distributions over the four species. The accuracy observed exceeded both MS1- and MS2-level quantification reported by Bruderer et al.34. A further question is whether filtering fragments by their intensity improves quantification accuracy. To this end, we used only the top N most intense peaks for quantification while varying N (Supplementary Fig. 13a). We found that accuracy increases with the number of fragments used, indicating that no filtering of fragments by intensity is required. Similarly, we investigated whether filtering to the top N most intense peptides per protein is beneficial (Supplementary Fig. 13b), finding that it is best to use all available peptides.
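
The variability measure used in this benchmark can be written down compactly. The sketch below computes the interquartile range of the protein ratios for each species and sums the values over species. The quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, as the text does not specify one, and the input layout is hypothetical.

```python
from statistics import quantiles


def interquartile_range(values):
    """Difference between the third and first quartile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def summed_iqr(log_ratios_by_species):
    """Sum of per-species interquartile ranges; smaller means tighter ratios."""
    return sum(interquartile_range(v) for v in log_ratios_by_species.values())


if __name__ == "__main__":
    ratios = {
        "H. sapiens": [-0.02, -0.01, 0.0, 0.01, 0.02],
        "E. coli": [0.30, 0.34, 0.38, 0.42, 0.46],
    }
    print(summed_iqr(ratios))
```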

Fig. 5: MaxLFQ for DIA. a, Stacked interquartile ranges of protein ratio distributions in the small-ratio four-species dataset from Bruderer et al.34 using different versions of MaxLFQ for DIA and compared to the results from this publication. MaxDIA is capable of MS1 and MS2 level as well as hybrid quantification modes. b, Quantification of a three-species benchmark mixture measured on a SCIEX TripleTOF 6600 instrument mixing proteomes from three species in defined ratios2 with MaxLFQ for DIA. The accompanying DDA library was used. The box plots here and in the subsequent panels are based on the numbers of data points given in the tables below the respective plot (valid LFQ ratios). All box plots indicate the median and the first and third quartiles as box ends. Whiskers are positioned 1.5 box lengths away from the box ends. c, Same as b but analyzed with MaxDIA in discovery mode. d, Quantification of a three-species benchmark mixture measured on a Bruker timsTOF Pro instrument mixing proteomes from three species in defined ratios using a DDA library. e, Same as d but analyzed in discovery mode.

In recent years, several researchers have worked on approaches to remove interferences and improve the selection of transitions in DIA analysis40,41,42,43. Although this approach to improving quantification has its merits, in this study we followed a different strategy with MaxLFQ to obtain high accuracy on the level of protein groups. Single-fragment features whose intensities are distorted by overlapping features have little effect on protein quantification in MaxDIA, because the protein-level quantification relies solely on the medians of peptide signal ratios (Supplementary Fig. 12c). Hence, even if a fraction of signals is affected by interferences, they are expected to drop out in the calculation of the median over multiple fragments and peptides. We compared MaxLFQ in MaxDIA to Avant-garde curated Skyline quantification on a multi-species benchmark dataset simulating realistic biological data41. We found that the transition-filtered quantification provided by Avant-garde is not systematically better than the MaxLFQ quantification in MaxDIA (Supplementary Fig. 14).

Next, we analyzed a quantitative benchmark dataset obtained on a SCIEX TripleTOF 6600 instrument, mixing proteomes from three species in defined ratios among replicate groups2 (Fig. 5b). Using the original library analyzed with MaxQuant and using default values for all parameters, we identified 4,627 protein groups and achieved linear quantification for all three species over the whole dynamic range. In discovery mode with a predicted library allowing for one missed tryptic cleavage, the number of identified protein groups rose by 48% to 6,858 (Fig. 5c), with, on average, improved quantification accuracy for the species with ratios as measured by interquartile ranges of species-specific ratio distributions. H. sapiens, which expresses a much larger number of proteins, received the largest increase, identifying almost two-fold more protein groups (4,012 versus 2,127), whereas C. elegans and E. coli received proportionally fewer additional proteins.

We next acquired a quantitative three-species benchmark dataset using ion mobility on a Bruker timsTOF Pro instrument. Using the DDA library acquired on the same instrument type, we identified 10,352 protein groups. We again used MaxLFQ for DIA with hybrid quantification with separate intensities for each fragment ion (Fig. 5d), seeing excellent quantification over the whole dynamic range without non-linearities. In discovery mode (Fig. 5e), the number of identified protein groups increases to 10,466 with higher quantification accuracy, again judged by the interquartile ranges of ratio distributions. Scanning through the transfer q value, we found that quantification accuracy was best with a value near 0.3 (Supplementary Fig. 15).

BoxCar and fractionated DIA

We recently implemented, in the DDA context, the analysis of data acquired using the BoxCar acquisition method in MaxQuant24; the primary goal of BoxCar is to achieve a higher dynamic range for the precursor intensities. Because this should be beneficial for DIA as well, we implemented its generalization to combining high-dynamic-range precursor measurements with DIA acquisition for the fragments. Furthermore, it is possible with MaxDIA to analyze and quantify DIA samples that have been pre-fractionated on peptide or protein levels. This feature can be applied to all supported instruments and DIA acquisition methods. To highlight these features, we acquired both DDA libraries and DIA measurements from HEK cell lysate as single shots and as high-pH reversed-phase peptide fractionated samples, which were pooled into eight fractions for MS analysis (Methods). We analyzed all combinations of libraries and samples, and, in addition, we analyzed the DIA samples in discovery DIA mode allowing for one missed trypsin cleavage (Fig. 6a). For the fractionated DIA samples, we observed an increase in the number of identified protein groups concomitant with the size of the library, with the most identifications in discovery mode. With single-shot samples, the number of identified proteins saturates with library size, having slightly more identifications with the fractionated library. However, comparing identifications for the single-shot DIA samples between fractionated library and discovery mode, we found that the results were very similar, with 89% overlap of Entrez Gene identifier mapped protein groups (Supplementary Fig. 16). For a comparison of protein identifications for different fractionation depths of the DIA samples, see Supplementary Fig. 17. These results indicate that, for both types of DIA samples, it is not compulsory to produce a deep, fractionated library, but that similar, or even better, results can be achieved in discovery DIA mode. Quantification with MaxLFQ among three replicates of fractionated DIA samples showed very good correlation, with a median Pearson correlation of 0.993 (Fig. 6b).

Fig. 6: BoxCar and fractionated DIA. a, Schedule of libraries and DIA samples. Three different library approaches (single-shot, deep-fractionated and discovery mode) were compared to single-shot, deep-fractionated DIA samples. b, MaxLFQ quantification among three replicates of fractionated BoxCar DIA samples analyzed in discovery DIA mode. All pairwise Pearson correlations are above 0.99. c, Venn diagram-like comparison represented as bar plot between RNA-seq data of HEK cells and three different library methods applied to the fractionated DIA samples. All data have been mapped to gene identifiers. d, Histogram of protein identifications mapped to gene identifiers sorted into bins according to log2 RPKM values of the RNA-seq data.

We then compared the results obtained with the three different library creation approaches to RNA sequencing (RNA-seq) data of HEK cells (Methods). Figure 6c compares the four sets of identifications based on gene identifiers. Of the 9,503 genes covered by proteomics methods, 65% were found with all three library methods. An additional 25% were found with both discovery mode and fractionated library but not with the single-shot library. In total, 608 proteins were uniquely found with the discovery approach, compared to 251 with the deep-fractionated library, suggesting preference for the discovery mode from the perspective of results, in addition to its economic advantages. In Fig. 6d, the results from Fig. 6c are displayed according to RPKM intervals of the RNA-seq data. The RNA-seq data show a bimodal left shoulder that is typical of expression noise44, genes for which there is only limited proteomic evidence of translation. As expected, highly abundant proteins are recovered with all methods, whereas, at low abundance, both the deep-fractionated library and discovery DIA approach add identifications.