Selected reaction monitoring (SRM) is a targeted mass spectrometric method that is increasingly used in proteomics for the detection and quantification of sets of preselected proteins at high sensitivity, reproducibility and accuracy. Currently, data from SRM measurements are mostly evaluated subjectively by manual inspection on the basis of ad hoc criteria, precluding the consistent analysis of different data sets and an objective assessment of their error rates. Here we present mProphet, a fully automated system that computes accurate error rates for the identification of targeted peptides in SRM data sets and maximizes specificity and sensitivity by combining relevant features in the data into a statistical model.
We thank J. Malmström and M. Jovanovic for providing the samples that were used as background matrix in the gold-standard data set, M. Jovanovic for careful reading of the manuscript, A. Srebniak for help in generating a software package, and H. Wenschuh. We acknowledge M. Claassen for discussions on machine learning. This work was supported by grants from the Forschungskredit of the University of Zurich, University of Zurich Research Priority Program in Systems Biology and Functional Genomics, GEBERT-RÜF Stiftung and Swiss National Science Foundation (grant 31000-10767), with funds from the US National Heart, Lung, and Blood Institute and the US National Institutes of Health (contract N01-HV-28179), and by SystemsX.ch, the Swiss initiative for systems biology.
Table of transitions, table of peak groups, table with identification statistics and classifier of the gold-standard data set analysis. The transitions sheet contains the precursor m/z (Q1), fragment ion m/z (Q3), an id that groups the transitions according to precursor (transition group id), an id for the transition (transition id), a string describing the isotopic labeling of the peptide (isotype), the collision energy used (CE), the expected retention time used for scheduled SRM (tR), the expected relative intensity of the fragment ions (relative intensity %), a string indicating whether the transition is a decoy or target (decoy) and an id to group corresponding target and decoy transition groups (target decoy transition group id). The mProphet peak groups sheet contains a row for each peak group. The most important columns are an id for a transition group measurement (transition_group_record), the features used for scoring (all columns starting with main_var or var_), a column indicating the dilution of the synthetic peptides in the specific matrix (dilution), the species used for the background matrix (background), the class of the peak group in terms of identity as determined by the dilution alignment (real_class), a boolean indicating whether the peak group was derived from decoy or target transitions (real_decoy), a boolean indicating whether the peak group was treated as decoy or target in the mProphet analysis (decoy) and the mProphet discrimination score (d_score). The mProphet all peak groups sheet contains all peak groups of the analysis, not only the ones that rank highest in one transition group record (peak_group_rank). The mProphet stat sheet relates the mProphet discrimination score (cutoff) to the false discovery rate (FDR) and the sensitivity (sens). The mProphet classifier weight sheet contains the weights that were determined using the semi-supervised learning approach.
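To illustrate how a discrimination score cutoff can be related to a false discovery rate, the following minimal sketch estimates the FDR at a given cutoff as the ratio of decoy to target peak groups scoring at or above it. This is a simplified target-decoy estimate and not necessarily the statistic mProphet computes. It assumes equal numbers of target and decoy transition groups and treats every decoy hit as a stand-in for one false target hit. The function name `decoy_fdr` and the example values are hypothetical; only the column names `d_score` and `decoy` are taken from the peak groups sheet. Sensitivity is left out because it additionally requires an estimate of the fraction of true targets.

```python
def decoy_fdr(peak_groups, cutoff):
    """Estimate the FDR at a d_score cutoff as (#decoys >= cutoff) / (#targets >= cutoff).

    Simplified target-decoy estimate (an assumption of this sketch, not
    necessarily mProphet's own statistic). Each peak group is a dict with
    the keys 'd_score' (float) and 'decoy' (bool).
    """
    n_target = sum(1 for pg in peak_groups
                   if not pg["decoy"] and pg["d_score"] >= cutoff)
    n_decoy = sum(1 for pg in peak_groups
                  if pg["decoy"] and pg["d_score"] >= cutoff)
    if n_target == 0:
        # No target identifications pass the cutoff, so nothing can be false.
        return 0.0
    # Cap at 1.0: decoys can outnumber targets at lenient cutoffs.
    return min(1.0, n_decoy / n_target)


# Made-up example values, for illustration only.
peak_groups = [
    {"d_score": 4.2, "decoy": False},
    {"d_score": 3.1, "decoy": False},
    {"d_score": 2.5, "decoy": False},
    {"d_score": 1.0, "decoy": False},
    {"d_score": 2.8, "decoy": True},
    {"d_score": 0.5, "decoy": True},
    {"d_score": -0.3, "decoy": True},
    {"d_score": -1.2, "decoy": True},
]

for cutoff in (3.0, 2.0, 0.0):
    print(cutoff, decoy_fdr(peak_groups, cutoff))
```

Lowering the cutoff admits more target peak groups but also more decoys, so the estimated FDR rises. A table of such cutoff and FDR pairs has the same shape as the cutoff and FDR columns of the stat sheet.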
Table of transitions, table of peak groups, table with identification statistics and classifier of the human U2OS cell line analysis. For a detailed description of the sheets see Supplementary Data 1 legend.
Table of transitions, table of peak groups, table with identification statistics and classifier of the human plasma analysis. For a detailed description of the sheets see Supplementary Data 1 legend.
Table of transitions and peak groups for the measurement of yeast target and decoy transitions in human plasma. The transitions sheet contains target transitions of yeast peptides and corresponding decoy transitions generated by two different decoy transition generation algorithms (ADD_RANDOM and REVERSE_PEP_AND_INCREASE_Q1). The mQuest peak groups sheet contains the data processed with mQuest. The mProphet analysis does not produce meaningful results because the data contain no positive target measurements. For a detailed description of the sheets see Supplementary Data 1 legend.