Comprehensive spectral libraries for various rabbit eye tissue proteomes

Rabbits have been widely used for studying ocular physiology and pathology because of their relatively large eye size and the structural similarity of their eyes to human eyes. Various rabbit ocular disease models, such as dry eye, age-related macular degeneration, and glaucoma, have been established. Despite the growing application of proteomics in vision research using rabbit ocular models, there is no publicly available spectral assay library for the rabbit eye proteome. Here, we generated spectral assay libraries for rabbit eye compartments, including conjunctiva, cornea, iris, retina, sclera, vitreous humor, and tears, using fractionated samples and ion mobility separation to enable deep proteome coverage. The rabbit eye spectral assay library includes 9,830 protein groups and 113,593 peptides. We present the data as a freely available community resource for proteomic studies in the vision field. Instrument data and spectral libraries are available via ProteomeXchange with identifier PXD031194.
Measurement(s): database type spectral library; Technology Type(s): ion mobility spectrometry-mass spectrometry; Sample Characteristic - Organism: Oryctolagus cuniculus; Sample Characteristic - Environment: eye; Sample Characteristic - Location: United States of America

Peptide sample preparation. Equal volumes of 100 mM ammonium bicarbonate were added to the tear samples, and the samples were heated at 95 °C for 5 min. The reduction and alkylation reactions were carried out by adding 5 mM dithiothreitol at 37 °C for 1 h and 10 mM iodoacetamide for 30 min in the dark, respectively. Protein concentration was determined using Bradford reagent, and 50 μg of protein was used for tryptic digestion. Trypsin (1.25 μg, 1/40, w/w) was then added, and the samples were incubated at 37 °C overnight. The reaction was stopped by adding trifluoroacetic acid (TFA). The digested peptides were cleaned up using C18 ZipTips and vacuum dried using a CentriVap (Labconco Corporation, Kansas City, MO).
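The enzyme amount follows directly from the stated 1/40 (w/w) enzyme-to-protein ratio. The short Python sketch below reproduces that arithmetic; the function and parameter names are illustrative and not part of the original protocol.

```python
def trypsin_mass_ug(protein_ug: float, protein_parts_per_enzyme: float = 40.0) -> float:
    """Mass of trypsin (in micrograms) for a 1/N (w/w) enzyme-to-protein ratio."""
    return protein_ug / protein_parts_per_enzyme


# 50 ug of protein at a 1/40 (w/w) ratio, as stated in the text.
print(trypsin_mass_ug(50.0))  # 1.25
```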
Vitreous humor samples were centrifuged at 15,000 × g at 4 °C for 15 min. The supernatant was then transferred to a clean tube, mixed with an equal volume of 100 mM ammonium bicarbonate, and heated at 95 °C for 5 min. The reduction, alkylation, digestion, and clean-up procedures were carried out as above.
Each of the other tissue samples (cornea, conjunctiva, iris, retina, and sclera) was used in its entirety to ensure deep proteome coverage for each tissue 34. Each tissue sample was minced into fine pieces in a biosafety level II cabinet and collected into a clean 1.5 mL tube. To each tube, 100 μL of 100 mM ammonium bicarbonate (containing 30% ACN, 8 M urea, and 20 mM dithiothreitol) was added 35. After incubation at 37 °C for 30 min under mild shaking, the samples were centrifuged at 15,000 × g for 15 min. The supernatant was transferred to a clean tube, and 100 mM iodoacetamide was added. After incubation in the dark for 30 min at room temperature, 1 mL of 100 mM ammonium bicarbonate was added to each sample to reduce the urea concentration. The digestion and clean-up procedures were carried out as above. Other methods, such as homogenization, sonication, and cryogenic grinding, may be used for sample processing before protein extraction to improve protein coverage.
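The effect of the 1 mL dilution on the urea concentration can be estimated with a simple C1V1 = C2V2 calculation. The sketch below assumes that the full 100 μL of extraction buffer is recovered as supernatant and that the volume of added iodoacetamide is negligible; neither volume is stated in the text, so the result is only an approximation.

```python
def diluted_concentration(c_initial: float, v_initial_ul: float, v_added_ul: float) -> float:
    """Concentration after adding diluent, from C1 * V1 = C2 * (V1 + V_added)."""
    return c_initial * v_initial_ul / (v_initial_ul + v_added_ul)


# Assumed: 100 uL of 8 M urea buffer recovered, diluted with 1 mL (1000 uL) of buffer.
urea_after_dilution_m = diluted_concentration(8.0, 100.0, 1000.0)
print(round(urea_after_dilution_m, 2))  # 0.73
```

Under these assumptions the urea concentration falls below 1 M before digestion.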
High pH reverse-phase fractionation. Dried peptide samples were resuspended in 10 mM ammonium formate (AF) pH 10. Stage-tips containing 6 C18 membranes were used to fractionate peptides. The stage-tips were pre-treated with isopropanol and 60% ACN in 10 mM AF pH 10, and finally re-equilibrated using 10 mM AF pH 10. Samples were then loaded onto the stage-tips, washed twice using 10 mM AF pH 10, and eluted into 12 or 16 fractions using an escalating concentration of ACN (2-60%) in 10 mM AF pH 10. Fractions were dried before reconstitution in 2% ACN with 0.1% FA for analysis and spiked with iRT peptides according to the manufacturer's instructions 36.
nanoLC-MS/MS. The liquid chromatography-mass spectrometry procedure has been published elsewhere 37.
Specifically, a NanoElute LC system coupled to a timsTOF Pro (Bruker Daltonics, Germany) via a CaptiveSpray source was used. Samples were loaded onto an in-house packed column (75 μm × 25 cm, 1.9 μm ReproSil-Pur C18 particles (Dr. Maisch GmbH, Germany), column temperature 40 °C) with buffer A (0.1% FA in water) and buffer B (0.1% FA in ACN) as mobile phases. The 120-min gradient ran from 2% B to 17% B at 60 min, to 25% B at 90 min, to 37% B at 100 min, and to 80% B at 110 min, and was then held at 80% B for another 10 min. The parallel accumulation-serial fragmentation (PASEF) mode with 10 PASEF scans per cycle was used. The electrospray voltage was 1.4 kV, and the ion transfer tube temperature was 180 °C. Full MS scans were acquired over the mass-to-charge (m/z) range of 150-1700. The target intensity value was 2.0 × 10^5 with a threshold of 2500. The cycle time was fixed at 1.2 s, and the dynamic exclusion duration was 0.4 min with a ± 0.015 amu tolerance. Only peaks with charge state ≥ 2 were selected for fragmentation.
Data processing. Spectronaut 38 v15 (Biognosys, Switzerland) was used with default settings to generate spectral libraries. The UniProt SwissProt and TrEMBL combined database 39 (Oryctolagus cuniculus (Taxon ID 9986), downloaded on 01/10/2022, 43,526 entries containing both reviewed (894) and unreviewed (42,632) entries without isoforms) was used to build the spectral library. Cysteine carbamidomethylation was used as a fixed modification, and methionine oxidation and acetylation as variable modifications. The false discovery rate (FDR) was controlled at < 1% at the peptide spectrum match, peptide, and protein levels. Spectral assay libraries were generated for each rabbit eye tissue type (Table 1). In total, 108 data-dependent acquisition (DDA) raw mass spectrometry data files were used to generate one combined spectral assay library.
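The gradient program can be written as a table of breakpoints and evaluated at any time point. The Python sketch below assumes linear ramps between the stated breakpoints and a start at 2% B at 0 min; the interpolation scheme and all names are illustrative assumptions rather than instrument settings taken from the method file.

```python
from bisect import bisect_right

# (time in min, % buffer B) breakpoints taken from the gradient described in the text.
GRADIENT = [(0.0, 2.0), (60.0, 17.0), (90.0, 25.0), (100.0, 37.0), (110.0, 80.0), (120.0, 80.0)]


def percent_b(t_min: float) -> float:
    """Percent buffer B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    i = bisect_right(times, t_min)  # index of the first breakpoint after t_min
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


print(percent_b(30.0))   # 9.5
print(percent_b(95.0))   # 31.0
print(percent_b(115.0))  # 80.0
```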
Spectral assay library quality control using DIALib-QC. All generated spectral assay libraries were evaluated using DIALib-QC 30 (http://www.swathatlas.org/DIALibQC.php), a freely available software tool to evaluate a library's characteristics, completeness, and correctness across 62 parameters of compliance. The DIALib-QC assessment reports for all spectral libraries have been deposited to the ProteomeXchange Consortium 31 (http://proteomecentral.proteomexchange.org) via the PRIDE 32 partner repository with the dataset identifier PXD031194 33.

Data Records
The raw mass spectrometry data (.d), generated spectral assay library files (.xls, .kit), and DIALib-QC assessment reports for all spectral libraries have been deposited to the ProteomeXchange Consortium via PRIDE 32 with the dataset identifier PXD031194 33. The data will be shared under the terms of the Creative Commons Attribution (CC BY) license as per PRIDE's standard terms. The raw mass spectrometry data files were labelled as "Rabbit(name of the eye tissue)_(other information such as fraction number, etc.).d.zip". The spectral libraries were labelled as "Rabbit(name of the eye tissue)Library.kit" and "Rabbit(name of the eye tissue)Library.xls". The .kit files can be imported into Spectronaut software and used for DIA-MS data analysis. The spectral libraries in generic format (.xls) can be used by third-party software such as Skyline. The DIALib-QC assessment reports for all spectral libraries were compressed into one zip file, which contains a quality assessment report for each individual spectral library.
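The naming scheme lends itself to simple pattern matching when sorting downloaded files by tissue. The Python sketch below assumes that the parenthesised parts of the labels are placeholders and that the tissue name contains no underscore; the example file names are hypothetical and were not checked against the deposited dataset.

```python
import re

# Patterns derived from the labelling scheme described in the text (assumed literal layout).
RAW_PATTERN = re.compile(r"^Rabbit(?P<tissue>[^_]+)_(?P<info>.+)\.d\.zip$")
LIBRARY_PATTERN = re.compile(r"^Rabbit(?P<tissue>.+?)Library\.(?P<ext>kit|xls)$")


def parse_raw_name(file_name: str):
    """Return (tissue, other_info) for a raw data file name, or None if it does not match."""
    match = RAW_PATTERN.match(file_name)
    return (match["tissue"], match["info"]) if match else None


def parse_library_name(file_name: str):
    """Return (tissue, extension) for a spectral library file name, or None if it does not match."""
    match = LIBRARY_PATTERN.match(file_name)
    return (match["tissue"], match["ext"]) if match else None


# Hypothetical file names, for illustration only.
print(parse_raw_name("RabbitCornea_Fraction01.d.zip"))
print(parse_library_name("RabbitCorneaLibrary.kit"))
```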

Technical Validation
High-quality assay libraries are required for accurate identification and precise quantification of peptides and proteins. To generate a spectral library for each rabbit ocular compartment, the DDA-MS datasets described above (see Methods and Data records, Table 1) were analyzed using Biognosys' proprietary search engine, Pulsar. False identifications were controlled by FDR estimation at three levels: peptide-spectrum match (PSM), peptide, and protein group. To generate the spectral library for the entire eye, the search results (PSMs) from each compartment analysis were combined before any FDR filter was applied, and library-wide FDR control was then applied at the PSM, peptide, and protein group levels. In all libraries (Table 1), the FDR was controlled at 1% at all three levels. The rabbit tear spectral library, the smallest among all compartments, contained 12,468 peptides representing 2,398 protein groups (Fig. 1a,b). The rabbit iris spectral library, the largest among all compartments, contained 75,384 peptides representing 7,927 protein groups (Fig. 1a,b). Overall, the rabbit eye spectral library included 2,214,258 transitions identifying 149,074 peptide precursors representing 113,593 stripped peptides and 9,830 protein groups (dark green bar in Fig. 1a,b). Among the spectral libraries for each rabbit eye compartment, 750 protein groups were common to all eye compartments, and 643 protein groups were common to all eye compartments except tears (Fig. 1c); together, these accounted for 14% of the proteins identified in the rabbit eye. In addition, 1,581, 1,569, 678, 504, 481, 424, and 395 protein groups were uniquely identified in rabbit iris, retina, vitreous humor, conjunctiva, cornea, tear, and sclera, respectively (Fig. 1c). These high numbers of proteins unique to each eye tissue demonstrate the differences between compartments.
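The summary figures above can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the counts reported in the text. Reading the 14% figure as the sum of the two shared sets (750 and 643) is an interpretation, and the variable names are illustrative.

```python
# Counts reported for the combined rabbit eye spectral library.
transitions = 2_214_258
precursors = 149_074
stripped_peptides = 113_593
protein_groups = 9_830

# Protein groups shared by all compartments, and by all compartments except tears.
shared_all = 750
shared_all_but_tears = 643

shared_fraction = (shared_all + shared_all_but_tears) / protein_groups
print(f"{shared_fraction:.1%}")                      # 14.2%
print(round(transitions / precursors, 2))            # 14.85 transitions per precursor
print(round(precursors / stripped_peptides, 2))      # 1.31 precursors per stripped peptide
print(round(stripped_peptides / protein_groups, 1))  # 11.6 peptides per protein group
```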
To ensure correct identification of peptides and proteins during DIA-MS data analysis, DIALib-QC was used to evaluate the quality and characteristics of the library. As shown in Fig. 2a, the high RT similarity between the +2 and +3 charge states of the same peptide, indicated by an RT correlation r² value of 1, demonstrates the high quality of chromatography and of retention time normalization based on reference peptides in the library. Fig. 2b shows a higher number of y than b fragment ions (63% vs. 37%), which is expected with collision-induced dissociation (CID) fragmentation. Moreover, more than 99% of fragment ions have +1 or +2 charge states (Fig. 2c). Peptides with more than 6 fragment ions per precursor constitute more than 90% of the library, ensuring an adequate number of ions to estimate peptide quantities (Fig. 2d). Precursor m/z values in the library range from 150 to 1,700, with 60% of the precursors between 400 and 1,200 (Fig. 2e). Precursor charge states range from +1 to +6, of which 99% are between +2 and +4 (Fig. 2f). Peptide length ranges from 7 to 46 amino acids, with > 98% of peptides shorter than 30 amino acids (Fig. 2g). Proteins with more than 5 peptides per protein constitute 62% of the proteins in the library, and proteins with 2 or more peptides per protein make up 85% of the total (Fig. 2h). The high number of peptides per protein ensures confident identification of such proteins in DIA-MS data analysis. The ion mobility values in the library were converted into ion-neutral collisional cross section (CCS) values using the Mason-Schamp equation 40. Within each charge state, CCS values are correlated with m/z values (Fig. 2i). The ion mobility values in the library improve the identification confidence of peptides, and thus of proteins, in the data analysis.
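For readers who want to reproduce the conversion, the Mason-Schamp relation can be evaluated directly from the inverse reduced mobility (1/K0). The Python sketch below assumes nitrogen as the drift gas and a temperature of 305 K; neither value is stated in the text, so both are exposed as parameters. It illustrates the equation and is not the exact routine used by the acquisition or analysis software.

```python
import math

# Physical constants (SI units).
ELEMENTARY_CHARGE = 1.602176634e-19  # C
BOLTZMANN = 1.380649e-23             # J/K
DALTON = 1.66053906660e-27           # kg
LOSCHMIDT = 2.686780111e25           # m^-3, gas number density at 273.15 K and 101.325 kPa


def ccs_from_inverse_k0(inv_k0_vs_cm2: float, mz: float, charge: int,
                        temperature_k: float = 305.0, gas_mass_da: float = 28.0134) -> float:
    """Collisional cross section in square angstroms from 1/K0 in V*s/cm^2.

    Mason-Schamp: CCS = (3/16) * sqrt(2*pi / (mu * kB * T)) * z*e / (N0 * K0).
    The temperature and drift gas mass (N2) defaults are assumptions.
    """
    ion_mass_da = mz * charge
    reduced_mass_kg = ion_mass_da * gas_mass_da / (ion_mass_da + gas_mass_da) * DALTON
    k0_m2_per_vs = (1.0 / inv_k0_vs_cm2) * 1e-4  # cm^2/(V*s) -> m^2/(V*s)
    ccs_m2 = (3.0 / 16.0) * math.sqrt(2.0 * math.pi / (reduced_mass_kg * BOLTZMANN * temperature_k)) \
        * charge * ELEMENTARY_CHARGE / (LOSCHMIDT * k0_m2_per_vs)
    return ccs_m2 * 1e20  # m^2 -> square angstroms


# Illustrative input values, not taken from the library.
print(round(ccs_from_inverse_k0(1.0, 600.0, 2), 1))
```

For a given charge state and temperature, CCS scales linearly with 1/K0 and depends only weakly on ion mass through the reduced mass, which is consistent with the charge-state-specific trends between CCS and m/z described above.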

Usage Notes
Library search of MS/MS data. Spectrum annotation via library search is both faster and more sensitive than database search algorithms 41. Due to a lack of data, library search has not been practical except for the most common species (e.g., human and bacteria). As the assay libraries presented here contain data for several major tissues in the rabbit eye, peptide and protein identification via spectral library matching in vision research using rabbit models becomes an attractive alternative to database searching. Moreover, the use of the CCS values has the potential to increase the confidence in the identification of peptides.
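The idea behind spectral library matching can be illustrated with a normalized dot product (cosine) score between a measured fragment spectrum and a library spectrum. The Python sketch below is a generic, simplified example with an assumed fragment tolerance of 0.02 m/z and made-up peak lists; it is not the scoring scheme used by Spectronaut or any other specific tool.

```python
import math


def cosine_score(query, library, tolerance=0.02):
    """Cosine similarity between two spectra given as lists of (m/z, intensity).

    Each library peak is greedily paired with the closest unused query peak
    within the m/z tolerance; unmatched peaks contribute nothing to the dot product.
    """
    used = set()
    dot = 0.0
    for lib_mz, lib_intensity in library:
        best_index, best_diff = None, tolerance
        for i, (query_mz, _) in enumerate(query):
            diff = abs(query_mz - lib_mz)
            if i not in used and diff <= best_diff:
                best_index, best_diff = i, diff
        if best_index is not None:
            used.add(best_index)
            dot += lib_intensity * query[best_index][1]
    norm_query = math.sqrt(sum(intensity ** 2 for _, intensity in query))
    norm_library = math.sqrt(sum(intensity ** 2 for _, intensity in library))
    if norm_query == 0.0 or norm_library == 0.0:
        return 0.0
    return dot / (norm_query * norm_library)


# Made-up peak lists, for illustration only.
library_spectrum = [(175.12, 40.0), (262.15, 100.0), (375.23, 65.0)]
query_spectrum = [(175.12, 35.0), (262.16, 90.0), (375.23, 70.0), (500.30, 5.0)]
print(round(cosine_score(query_spectrum, library_spectrum), 3))
```

A score near 1 indicates that the measured fragment pattern closely follows the library entry, whereas spectra with no shared fragments score 0.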
Compatibility with commonly used software for peptide and protein analysis. In this study, we provide spectral assay libraries in Spectronaut's native format (.kit) and generic (.xls) format. The spectral libraries in generic format can be used with commonly used software such as Skyline for peptide and protein analysis.

Code availability
Software used in the generation of this project is third-party software as described in the Data Records section, i.e., Spectronaut and DIALib-QC.