Arabidopsis proteome and the mass spectral assay library

Arabidopsis is an important model organism and the first plant with its genome completely sequenced. Knowledge from studying this species has either direct or indirect applications for agriculture and human health. Quantitative proteomics by data-independent acquisition mass spectrometry (SWATH/DIA-MS) was recently developed and is considered as a high-throughput, massively parallel targeted approach for accurate proteome quantification. In this approach, a high-quality and comprehensive spectral library is a prerequisite. Here, we generated an expression atlas of 10 organs of Arabidopsis and created a library consisting of 15,514 protein groups, 187,265 unique peptide sequences, and 278,278 precursors. The identified protein groups correspond to ~56.5% of the predicted proteome. Further proteogenomics analysis identified 28 novel proteins. We applied DIA-MS using this library to quantify the effect of abscisic acid on Arabidopsis. We were able to recover 8,793 protein groups of which 1,787 were differentially expressed. MS data are available via ProteomeXchange with identifier PXD012708 and PXD012710 for data-dependent acquisition and PXD014032 for DIA analyses.

www.nature.com/scientificdata www.nature.com/scientificdata/ sequentially and cyclically isolated using a wide mass window (typically 25 Da) and subjected to fragmentation. Thus, the spectra of all ions including low abundance ions from a sample are acquired in an unbiased fashion. Subsequently targeted extraction of fragmented spectra can be performed by comparing the acquired spectral data with a pre-constructed ion libraries consisting of pairs of ion spectra and their accurate retention time to identify and quantify proteins. The advantages of this approach include its ability for proteome-wide quantitation with high consistency and accuracy, acquiring data via a hypothesis-free approach 16 , and its particular suitability for studying a large number of samples in a reproducible way 17 .
The sensitivity in SWATH/DIA-MS relies heavily on a high-quality and comprehensive assay library. In this study, we used two common mass spectrometry platforms (Orbitrap and TripleTOF) to analyze 10 different organs of Arabidopsis after extensive offline sample fractionation. We acquired a total of 835 raw files including 474 from Orbitrap Fusion and 361 from TripleTOF5600 plus. As a result, we constructed a spectral library containing more than 19,000 proteins (>15,000 protein groups), accounting for approximately 56.5% of the predicted Arabidopsis proteome. The usefulness of this library has been clearly illustrated in the subsequent DIA-MS analysis of Arabidopsis leaves treated with the plant hormone abscisic acid (ABA).

Methods
Plant materials and growth conditions. Arabidopsis thaliana ecotype Col-0 was used for this study.
Seeds were obtained from the Arabidopsis Biological Resource Center (ABRC) and the European Arabidopsis Stock Centre. They were surface sterilized with 75% ethanol and 0.1% Triton X-100 for 10 min, washed with 95% ethanol twice (2 min each), and then planted on growth media containing half strength Murashige and Skoog (½ × MS) salt and 1% (w/v) sucrose, and solidified with 0.8% agar. The plates were kept at 4 °C in the dark for 2 d and then moved to a growth chamber (CU36-L5, Percival Scientific) at 21 °C under a photoperiod of 16 h light and 8 h darkness for germination and growth. Cotyledons and roots were collected following growth for 6 days and 20 days, respectively. For preparation of leaves (rosette leaves and cauline leaves), stems, flowers (buds and open flowers) and siliques (green siliques), seedlings grown for one week on the plate were transferred to soil in a greenhouse room under the same growth conditions as in the growth chamber for an additional 5-week period, except rosette leaves which were grown for 3 weeks. Seeds were harvested when siliques turned yellow or brown. For the study of abscisic acid effect on Arabidopsis, rosette leaves were sprayed with or without 100 µM of abscisic acid and harvested at 2 h, 24 h and 72 h post treatment. In each condition, three leaves per plant from three independent plants were harvested and combined into a single sample per condition. All collected materials were immediately frozen in liquid nitrogen and stored at −80 °C until use.
Photosynthetic light of the growth chamber was set for 16 h light/8 h dark cycles at 21 °C. The cells were harvested by draining off the media using Stericup ® filter unit (Millipore, Billerica, MA), immediately flash frozen in liquid nitrogen and stored at −80 °C until use.
Protein extraction and digestion. All plant materials were ground in liquid nitrogen with a prechilled mortar and a pestle. The fine powder was resuspended with the extraction buffer (50 mM Tris, pH 8, 8 M urea, and 0.5% SDS) supplemented with protease inhibitor (Roche Diagnostics GmbH, Mannheim, Germany), and homogenized with a Dounce homogenizer. In order to extract more proteins, the crude homogenate was further subjected to 30 cyclic high/low pressurization (50 s of 35, 000 PSI and 10 s of ambient pressure) using a pressure cycling technology (Barocycle, PressureBioSciences, MA). The extracts were then centrifuged at 10,000 g for 5 min at 4 °C. The proteins in the supernatant were purified using methanol/chloroform precipitation and dried under vacuum. The dried pellets were resuspended into the extraction buffer (50 mM Tris, pH 8, 8 M urea, and 0.5% SDS) and sonicated. The protein content was determined using a microBCA kit (Thermo Scientific). For library generation, approximately 200 µg of proteins were reduced, alkylated and digested with trypsin as described 19 . The digests were desalted with microcolumns packed with C18 and Poros oligo R3 materials prior to a shallow-gradient Strong Cation Exchange (SCX) fractionation. For protein quantitation of ABA-treated sample, approximately 10 µg of proteins were digested using FASP method 20 prior to DIA-MS analysis.
ScX peptide fractionation. The peptides were reconstituted in 90 μL SCX buffer A (10 mM KH 2 PO 4 , 25% acetonitrile (ACN), pH 3.0) and loaded into the polySULFOETHYL A column (200 × 4.6 mm, 5 μm, 200 Å) (PolyLC, Columbia, MD) for SCX fractionation on Accela HPLC (Thermo Scientific). An increasing gradient of buffer B (10 mM KH 2 PO 4 , 500 mM KCl and 25% ACN, pH 3.0) was applied in a shallow-gradient elution protocol of total 100 min. The gradient consists of 100% buffer A for the initial 5 min, 0%-15% buffer B for 80 min, 30%-100% buffer B for 5 min, 100% buffer B for 5 min and 100% A for 5 min at a flow rate of 1 mL/min. The chromatography was monitored at 214 nm using diode array detector. After the pooling of some fractions based on the absorption intensity, a total of 30 fractions were obtained, desalted as described above and dried in the SpeedVac (Thermo Scientific).
MS analysis using TripleToF 5600+. The dried peptide mixture was redissolved into 0.1% formic acid (FA) and 3% ACN in water supplemented with indexed retention time (iRT) peptide standards according to the manual (Biognosys, Switzerland). They were then analyzed using a TripleTOF 5600 Plus MS (Sciex, USA) coupled with an UltiMate ™ 3000 UHPLC (Thermo Scientific). Briefly, approximately 1.5 µg of peptide mixture was injected into a precolumn (Acclaim PepMap100, C18, 300 µm × 5 mm, 5 µm particle size) and desalted for 15 min with 3% ACN and 0.1% FA in water at a flow rate of 5 µl/min. The peptides were eluted into an Acclaim PepMap100 C18 capillary column (75 µm I.D. × 25 cm, 3 µm particle sizes, 100 Å pore sizes) and separated with a 135-min gradient at constant 300 nL/min, at 40 °C. The gradient was established using mobile phase A (0.1% FA in H 2 O) and mobile phase B (0.1% FA, 95% ACN in H 2 O): 2.1-5.3% B for 5 min, 5.3-10.5% for 15 min, 10.5-21.1% for 70 min, 21.1-31.6% B for 18 min, ramping from 31.6% to 94.7% B in 2 min, maintaining at 94.7% for 5 min, and 4.7% B for 15-min column conditioning. The sample was introduced into the TripleTOF MS through a Nanospray III source (Sciex, USA) with an electrospray potential of 2.2 kV. The ion source was set with an interface heater temperature of 150 °C, a curtain gas of 25 PSI, and a nebulizer gas of 6 PSI. The mass spectrometry was performed with information dependent acquisition (IDA). The mass range of survey scans was set to 350-1250 Da. The top 30 ions of high intensity higher than 1,000 counts per second and a charge state of 2 + to 4 + were selected for collision-induced dissociation. A rolling collision energy option was applied. The maximum cycle time was fixed to 2 s and a maximum accumulation time for individual ions was set for 250 ms. Dynamic exclusion was set to 15 s with a 50 mDa mass tolerance.

MS analysis using orbitrap Fusion Lumos.
In both data-dependent acquisition (DDA) and DIA analysis, an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific) was coupled with an UltiMate TM 3000 UHPLC (Thermo Scientific). The peptide injection and elution gradient were essentially the same as described above. The peptides were separated and introduced into the Orbitrap MS through an integrated Easy-Spray LC column (50 cm × 75 µm ID, PepMap C18, 2 µm particles, 100 Å pore size, Thermo Scientific) with an electrospray potential of 1.9 kV. The ion transfer tube temperature was set at 270 °C. The MS parameters included application mode as standard for peptide, default charge state of 3 and the use of EASY-IC as internal mass calibration in both precursor ions (MS1) and fragment ions (MS2).
In DDA mode, a full MS scan (375-1400 m/z range) was acquired in the Orbitrap at a resolution of 120,000 (at 200 m/z) in a profile mode. The cycle time was set to 3 s between master scans, whereas the RF lens was set to 30%. A maximum ion accumulation time was 100 milliseconds and a target value was 4e5. MIPS (monoisotopic peak determination of peptide) was activated. The isolation window for ions was 1.6 m/z. The ions above an intensity threshold of 5 e4 and carrying charges from 2 + to 5 + were selected for fragmentation using higher energy collision dissociation (HCD) at 30% energy. They were dynamically excluded after 1 event for 10 s with a mass tolerance of 10 ppm. The option of inject ions for all available parallelizable time were selected to increase the MS/MS spectral quality. All MS/MS spectra were acquired in Orbitrap in a centroid mode, with the first fixed mass of 100 (m/z) and a resolution of 30,000 (at 200 m/z). A maximum injection time of 100 ms and a target value of 5 e4 were used.
For the DIA-MS analysis, quadrupole isolation window of 6 m/z was selected for the HCD fragmentation. The sample was gas-fractionated into precursor mass ranges among 400-550; 550-700 and 700-850 m/z respectively in each injection. The mass defect was 0.9995. The HCD collision energy was set at 30%. All MS/MS spectra were acquired in a centroid mode at a resolution of 30,000. The scan range of MS/MS was set between 350 and 1500 m/z. A maximum ion accumulation time was set as 100 milliseconds and a target value of was at 1e6.

Protein identification.
Mass spectrometry data (.raw file from Orbitrap Fusion and .wiff file from TripleTOF 5600+) were processed using the Maxquant software (version 1.5.3.30) 21 . The TAIR10 proteome database (35,386) and 262 common contaminant sequences were combined and used for the database search. The "revert" protein sequence was chosen in the decoy mode option for all Maxquant searches. Carbamidomethylation at cysteine residues was set as a fixed modification. The variable modifications included oxidation at methionine residues and N-terminal protein acetylation. The enzyme limits were set at full trypsin cleavage with a maximum of two missed cleavages allowed. A positive peptide identification was required to contain a minimum of seven amino acids and a maximum of five modifications. The mass tolerances of the precursor ion of Orbitrap Fusion data were set to 20 and 4.5 ppm for the first and main searches, respectively, whereas they were 0.07 and 0.006 Da for the TripleTOF 5600 data. The mass tolerances of the fragment ions were set 20 ppm and 40 ppm for Orbitrap Fusion and TripleTOF 5600 data, respectively. The false discovery rates (FDRs) of peptide-spectral match (PSM), protein identification and site decoy fraction were all set to 0.01.
To identify novel proteins, the Maxquant results were loaded into Scaffold software (version 4.4, Proteome software Inc., Portland, OR). The unmatched spectra from Maxquant searches were exported. They were then searched against a proteome database constructed using six-frame translation of the TAIR9 genome. The FDRs for PSM and protein were 0.01 to filter all identification. To minimize potentially random matches due to the use of only unmatched spectra data and the newly constructed large proteome database from TAIR9 genome, we extracted the novel identified sequences and merged them with the TAIR10 proteome database. This combined proteome database was considered as "clean database". All MS raw files were searched against this "clean proteome database" using Maxquant. The search results were imported into Scaffold software for validation and verification. Proteins and peptides were filtered based on the possibilities of 99% and 95% respectively. For novel identification, two minimum spectra were required to match exclusively to the novel proteins. The spectral matches were required to have a minimum 5 continuous y or b ion series, containing parent ion and match of major (high intensity) fragment ions to sequence. Each protein was matched with the Araprot11 transcript (DNA) using blast program (tblastn) to extract detailed annotation information in TAIR webpage (https://www.arabidopsis.org) such as genomics locus and gene model type.
MS spectral library generation and DIA data analysis. All the DDA MS data files were loaded into Spectronaut Pulsar X (version 12, Biognosys, Switzerland) for the library generation. The protein database was the combination of TAIR10 proteome sequence and the aforementioned novel identifications. The default settings for database match include: full trypsin cleavage, peptide length of between 7 and 52 amino acids and maximum missed cleavage of 2. Besides, lysine and arginine (KR) were used as special amino acids for decoy generation, and N-terminal methionine was removed during pre-processing of the protein database. Fixed modification was www.nature.com/scientificdata www.nature.com/scientificdata/ carbamidomethylation at cysteine and variable modifications were acetylation at protein N-terminal and oxidation at methionine. All FDRs were set as 0.01 for the peptide-spectrum match (PSM), peptide and protein. To be included into the final ion library, fragment ions were required to contain a minimum of three amino acids in length, with a mass range between 300 and 1800 m/z, and a minimum relative intensity of 5%. The best 3-6 fragments per peptide were included in the library. The iRT calibration was required with minimum R-Square of 0.8.
DIA data were analyzed using Spectronaut software against the spectral libraries to identify and quantify peptides and proteins. The Biognosys default settings were applied for identification: excluding duplicate assay; generation decoy based on mutated method at 10% of library size; and estimation of FDRs using Q value as 0.01 for both precursors and proteins. The p-value was calculated by kernel-density estimator. Interference correction was activated and a minimum of 3 fragment ions and 2 precursor ions were kept for the quantitation. The area of extracted ion chromatogram (XIC) at MS/MS level were used for quantitation. Peptide (stripped sequence) quantity was measured by the mean of 1-3 best precursors, and protein quantity was calculated accordingly by the mean of 1-3 best peptides. Local normalization strategy and q-value sparse selection were used for cross run normalization. As the control and treated plants were grown in a same condition, and leaves were collected from three individual plants and pooled in each condition, the initial abundances of proteins (if without treatment) for all conditions at the same growth stage were considered as identical. Thus, a paired Student's t-test (one sample, null hypothesis, no change, mean μ = 0) was performed to uncover differential expression between control and ABA treated sample for the same growth stage of plants. The t-test was performed based on the log2 ratios of the peptide intensities of the individual peptides of a protein. The resulting p values were corrected for multiple testing using the q-value approach to control the overall FDR 22 . Proteins with a fold-change of higher than 1.5 and a q-value of less than 0.01 were considered as differentially expressed proteins. The candidate proteins were submitted to the web-based platform of the Database for Annotation, Visualization and Integrated Discovery (DAVID; http://david.abcc.ncifcrf.gov) for Gene Ontology (GO) enrichment and pathway analysis 23 .

Data Records
The mass spectrometry DDA proteomics data acquired using Orbitrap Fusion Lumos and TripleTof 5600 plus have been deposited to the ProteomeXchange Consortium via the PRIDE 24 partner repository with the dataset identifier PXD012708 25 and PXD012710 26 respectively, whereas the mass spectrometry DIA proteomics data acquired using Orbitrap Fusion Lumos have been deposited at the same server with the dataset identifier PXD014032 27 . The MS spectral libraries in tab delimited text file format can be accessed at Figshare 28 . technical Validation experimental design. In SWATH/DIA-MS, the high quality and coverage of an assay library is the key for accurate quantitation of high number of proteins. Arabidopsis is a complex organism in that they have specialized organs at different developmental stages, and its proteome undergoes dynamic changes in different tissues during development.
To build a comprehensive assay library, we collected 10 samples from four different organs of Arabidopsis (Fig. 1a) including leaf, stem, flower and root. We improved the protein and peptide preparation protocols (Fig. 1b): use of cryogenic grinding of tissues, Dounce homogenization and pressure cyclic treatment (PCT technology) sequentially to get better yields in protein amount and species; purifying proteins sample by methanol/ chloroform which helped to remove majority of non-proteins (such as lipids and pigments), and desalting peptides using a cartridge containing both C18 and R3 material to minimize loss of hydrophilic peptides. Besides we fractionated sample extensively using an optimal 100-min shallow-gradient SCX and combined into final 30 fractions for LC-MS analysis by two MS platforms (Fig. 1c).
Protein identification. Using the optimal workflow, we identified a high number of proteins from each organ ( Table 1). The newer mass spectrometry platform Orbitrap Fusion generally identified ~30% higher number of proteins compared to the TripleTOF 5600 plus. In total, we identified more than 180,000 unique peptides from ~15,400 distinct protein groups, which were approximately 30% higher compared with two earlier genome-wide proteome analyses of Arabidoposis 29,30 , and accounted for ~56.5% of the total predicted proteome of Arabidopsis.
Next, we constructed a new proteome database using six-frame translation of TAIR9 genome. The unmatched spectra from earlier searches against TAIR10 proteome database were searched again with the new database to identify potential novel genes or gene models as described 30 . As a result, we were able to identify a total of 42 previously unannotated proteins. Most novel findings were previously annotated as "transposable element gene", "novel transcribed region" and "long noncoding RNA" as well as novel alternative translation and splicing. Of these, 28 proteins were not documented even in the latest TAIR11 proteome database ( Table 2). The sequences of novel identifications and their matched peptides as well as the matched spectra if the spectral count is smaller than 4 are available on Figshare 28 . Interestingly, half of these novel proteins (n = 14) were annotated as transposable element genes in TAIR webpage. There were 2 novel proteins annotated as novel transcribed region, 2 as long noncoding RNA, and 1 antisense long RNA. Four proteins ( Table 2, No. 20-24) were probably alternatively transcribed and translated proteins since there were other proteins from the same genomic locus. There were also 5 proteins ( Table 2, No. 24-28) with slightly different amino acids from the proteins in the predicted TAIR proteome.
14 out of the 42 novel identifications were found in the latest TAIR11 proteome database, confirming the validity of our proteogenomics approach ( Table 3). The sequences of these novel identifications and their matched peptides as well as the matched spectra if the spectral count is smaller than 4 are available on Figshare 28 . These include 2 new isoforms/entries ( Table 3, No. 1-2) that were added in TAIR11 in that they share most part of sequences with other isoforms/entries in TAIR10 but at least 1 unique peptide sequence were identified in this study. Six www.nature.com/scientificdata www.nature.com/scientificdata/ proteins ( Table 3, No. [3][4][5][6][7][8] were only identified in this study and in TAIR11 but not in TAIR10 proteome database; and that included two earlier annotated as "transposable element gene". One protein (AT5G13590.1) was identified with difference in two amino acids (in TAIR10: TRGAFLNSNR, and DEEPTELNLSLSK; this study, they were: TSGAFLNSNR and NEEPTELNLSLSK). There were also 5 proteins (Table 3, No. [10][11][12][13][14] containing additional sequences that are not present in TAIR10 protein database. Taken together, these data provide direct experimental evidence confirming the new revision of protein sequences. In the subsequent DIA-MS analysis of ABA-treated leave sample, seven novel proteins ( 3) with their relatively high spectral counts were also identified and quantified. Of these, the protein AtChr1@30100671@30103941 was observed to be down-regulated (fold change, 0.57, q < 0.0027) at 2 h post ABA-treatment, suggesting that this "transposable element gene" protein is not only expressed at relatively high abundance but may be also functionally active.  www.nature.com/scientificdata www.nature.com/scientificdata/ Arabidopsis spectral libraries. In order to quantify proteome dynamics in Arabidopsis by DIA-MS, we constructed a combined spectral library as well as platform-specific libraries from individual LC-MS platform (Table 4) using Spectronaut Pulsar. The library from Orbitrap Fusion analysis was comprised of 15,514 protein groups, 187,265 unique peptide sequences, and 278,278 precursors, while the library from the TripleTOF analysis contains 10,915 protein groups, 80,492 peptides, and 118,675 precursors. The combined library was comprised of a similar number of proteins (15,485) and slightly higher number of 284,418 precursors, suggesting that Orbitrap Fusion apparently recovered nearly all proteins and peptides that from TripleTOF 5600 mass spectrometry in this study.
The peptide distribution of the combined library was shown in Fig. 2a. Most peptides ranged from 8 to 15 amino acids in length, consistent with the properties of full tryptic peptides. Approximately 79% of precursors fall in a mass range between 400-850 m/z (Fig. 2b). In SWATH/DIA-MS analysis, most studies acquired over a mass window from 400 to 1200 Da (m/z) to obtain full coverage of peptides. The large scan window compromises the resolution and increases the complexity of mass spectra, leading to fewer peptide identifications. Our data could provide a valuable reference for designing DIA-MS methods with optimal mass windows. Approximately 50% of precursors have a charge of 2 (Fig. 2c), and 15% are cysteine modified peptides (Fig. 2d). The large number of cysteine-containing peptides may serve as a useful resource for redox proteomics quantitation.
Not surprisingly, the "response to abscisic acid" was enriched in all conditions. The other highly enriched BP terms were "oxidative-reduction process", "hydrogen peroxide catabolic process" and "response to oxidative stress". These oxidative-stress related processes were enriched among the up-and down-regulated proteins at all measured time points, indicating ABA treatment induced active reactive oxygen species (ROS) production. Indeed, it has been well documented that ABA can cause oxidative stress in Arabidopsis [31][32][33] . Several metabolic processes including carbohydrate metabolic process, sucrose biosynthetic/metabolic process, chitin catabolic process and macromolecule catabolic process were downregulated at the 2 h post-treatment (Fig. 4a), whereas they were enriched from the upregulation group of proteins (Fig. 4f) at 72 h post-treatment, indicating the ABA treatment initially reduces metabolism followed by gradually increasing the metabolism to the highest level at 72 h post-treatment. The other highly significant enrichments at 2 h included RNA secondary unwinding, translation, rRNA processing and ribosome biogenesis (Fig. 4d), all of which are related to gene transcription and translation, suggesting that protein synthesis was immediately activated in response to ABA stimuli.
Pathway alteration in response to ABA treatment. Based on KEGG database, pathway ath03010: Ribosome was the most significant enrichment (with Benjamini Hochberg adjusted p-value of 8.41E-14) from the upregulated proteins at 2 h post-treatment. This significance attenuates at both 24 h (a p-value of 0.022) and at 72 h post-treatment (p-value > 0.05). Other significant enrichments were related to biosynthesis and metabolic pathways. Most of these pathways were down-regulated at 2 h (ath00940: Phenylpropanoid biosynthesis; ath01100: Metabolic pathways; ath01110: Biosynthesis of secondary metabolites) and 24 h (ath00940: Phenylpropanoid biosynthesis; Biosynthesis of secondary metabolites). In contrast, they were enriched from the upregulated proteins at 72 h post-treatment (ath00940: Phenylpropanoid biosynthesis; ath01110: Biosynthesis of secondary metabolites; ath01100: Metabolic pathways). These results are consistent with the GO BP enrichment analysis, suggesting that both transcription and translation were more active upon ABA treatment and they gradually returned to normal after days, whereas the metabolic pathways were suppressed at earlier stages and subsequently became more active.

Usage Notes
In this study, we generated the largest set of tissue-specific DDA data using two high resolution of mass spectrometry platforms. This in-depth proteomics spectral information enabled us to identify and validate novel proteins using proteogenomics analysis by a 6-frame translation approach. These data are made available as a resource to the research community. Researchers can use these data to further annotate Arabidopsis genome data and correlate with RNAseq data.
This study provided the first comprehensive Arabidopsis spectral libraries, either instrument-specific or a combined library. In addition to the submission to PRIDE server, all these libraries in tab delimited text file format were shared on Figshare 28 which are ready for input for Spectronaut, Skyline and Peakview software. We used these libraries in DIA-MS and quantified a higher number of proteins in study of Arabidopsis proteome dynamics upon abscisic acid treatment, suggesting that rich information can be obtained in a high-throughput approach. It is worth to note that we have tested DIA data acquired from Orbitrap Fusion against the platform-specific and the combined libraries. All the analyses returned similar quantitation and low CVs, but use of Fusion library quantified highest number of proteins; it is a 8% higher than that from combined library analysis and a 13% higher than that of 5600 library analysis. This suggests that Fusion library should be applied for DIA analysis conducted using Orbitrap instrument. Although the optimal choice of libraries for SWATH data from TripleTOF or other MS platforms remains to be determined, we anticipate use of either of these libraries would be able to achieve in-depth proteome quantitation. Since this library contained approximately 30,000 cysteine-modified peptides, it can be also used for redox study of reversible cysteine modification. In addition, targeted proteomics using SRM and PRM can be designed based on this reference spectral library.