Noble and Keich reply:

We find much to agree with in Sticker et al.1. Overall, it is clear that we are engaged in the same general project: to first ensure the validity of our statistical confidence estimates and thereafter to maximize our statistical power in MS-based proteomics experiments. We also agree that controlling the false discovery rate (FDR) among matches to a large peptide database and then reporting results relative to a selected subset of peptides does not correctly control the FDR. Indeed, this point has been made previously on multiple occasions2,3 and is well established in the statistical literature4. We also agree that the 'sub-sub' strategy—searching a subset database and evaluating the FDR within that subset—necessarily forces some matches between peptides in the subset and spectra that were generated by peptides outside of the database.

This leads to our two points of contention. First, Sticker et al.1 claim that their proposed 'all-sub' strategy leads to improved statistical power relative to the sub-sub strategy. In support of this claim, they report empirical results on two data sets. We contend that all-sub is not always better than sub-sub. Accordingly, we constructed a different setup that allowed us to more accurately characterize false positive spectrum identifications. Specifically, we ran a concatenated set of spectra—from 18 purified proteins (ISB18)5 and from the plant Arabidopsis thaliana6—against a corresponding concatenated database. Contrary to what Sticker et al.1 found, in this setting the relative performance of the two methods is reversed: at a 1% FDR threshold, sub-sub accepts 11,416 peptide–spectrum matches (PSMs), whereas all-sub accepts only 10,307. We conclude that all-sub's loss of statistical power is due to the large size of the Arabidopsis database (Supplementary Note).

Second, in addition to claiming superior statistical power of the all-sub procedure, Sticker et al.1 imply that the sub-sub strategy leads to invalid FDR control. As evidence, they point to the number of subset PSMs that matched a different peptide sequence in the complete search (all-all) and the subset search (sub-sub). However, their analysis does not account for the possibility that some of these PSMs may be incorrect in the all-all search and correct in the sub-sub search. Indeed, as the size of the competing, complement database increases, the probability that a correct match to the subset database will receive a lower score than an incorrect match in the complement database increases. This is precisely the effect that sub-sub aims to avoid. In the context of this simulation, Sticker et al.1 are concerned that by forcing Arabidopsis spectra to match against the ISB18 database, we will create many false positive PSMs. Fortunately, in our experimental setup, we can directly observe this rate of false matching: among the 11,416 PSMs accepted by sub-sub, only 41 (0.36%) involve an Arabidopsis spectrum. This is well below the 1% FDR threshold. Furthermore, we note that in the subset database search, 1,127 of the accepted PSMs involving ISB18 spectra actually switch to matching Arabidopsis peptides when we search against the combined database. According to the arguments laid out by Sticker et al.1, this rate of switching implies that the actual sub-sub FDR is approximately 10%. However, in our setup, we know that those ISB18 spectra are definitely not correct when matched to Arabidopsis peptides.
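The two rates quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this reply; the variable names are illustrative assumptions and do not correspond to any software used in the analysis.

```python
# Counts taken from the text above (sub-sub search at a 1% FDR threshold).
accepted_sub_sub = 11_416   # PSMs accepted by sub-sub
arabidopsis_psms = 41       # accepted PSMs that involve an Arabidopsis spectrum
switched_psms = 1_127       # accepted ISB18-spectrum PSMs that switch to an
                            # Arabidopsis peptide in the combined search

# Directly observed rate of false matching among accepted PSMs.
observed_false_rate = arabidopsis_psms / accepted_sub_sub

# Rate that the switching argument would attribute to sub-sub.
switching_rate = switched_psms / accepted_sub_sub

print(f"observed false matching: {observed_false_rate:.2%}")  # 0.36%
print(f"switching-based rate:    {switching_rate:.1%}")       # 9.9%
```

The gap between the two figures is the point of the argument: the switching count overstates the error rate because, in this setup, the switched matches are known to be the incorrect ones.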

Thus, though all-sub may provide superior statistical power in some settings, this is not always the case. Precisely characterizing the situations in which a given analysis strategy is optimal will require further research.

Data availability statement. All data used in this work are publicly available via the URLs listed in the Supplementary Note.