High-pH reversed-phase fractionated neural retina proteome of normal growing C57BL/6 mouse

The retina is a key sensory tissue composed of multiple layers of cell populations that work coherently to process and decode visual information. Mass spectrometry-based proteomics has allowed high-throughput, untargeted protein identification, demonstrating the presence of these proteins in the retina and their involvement in biological signalling cascades. The comprehensive wild-type mouse retina proteome was prepared using a novel sample preparation approach, the suspension trapping (S-Trap) filter, and further fractionated with high-pH reversed-phase chromatography involving a total of 28 injections. This data-dependent acquisition (DDA) approach using a Sciex TripleTOF 6600 mass spectrometer identified a total of 7,122 unique proteins (1% FDR), and generated a spectral library of 5,950 proteins in the normal C57BL/6 mouse retina. Because the data-independent acquisition (DIA) approach relies on a large, high-quality spectral library to analyse chromatograms, this spectral library would enable access to SWATH-MS acquisition to provide unbiased, multiplexed quantification of proteins in the mouse retina, acting as the most extensive reference library to investigate retinal diseases using the C57BL/6 mouse model. Measurement(s): retina. Technology Type(s): mass spectrometry. Sample Characteristic - Organism: Mus musculus. Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13128044


Background & Summary
The retina is the site of many posterior ocular diseases, including retinal detachment 1 , diabetic retinopathy (DR) 2 , age-related macular degeneration (AMD) 3 , glaucoma 4 , and myopia 5 . It can be divided into ten major layers: (1) inner limiting membrane (ILM); (2) nerve fiber layer; (3) ganglion cell layer; (4) inner plexiform layer (IPL); (5) inner nuclear layer (INL); (6) outer plexiform layer (OPL); (7) outer nuclear layer; (8) outer limiting membrane (OLM); (9) photoreceptor layer (PL); and (10) retinal pigmented epithelium (RPE) monolayer 6 . The C57BL/6 mouse is the most commonly used inbred strain for research: its genome has been sequenced, and its permissive genetic background allows maximal expression of most mutations. Jeon et al. (1998) reported seven major cell populations in the C57BL/6 mouse retina, including rod photoreceptors, cone photoreceptors, Müller glia cells, retinal ganglion cells (RGC), horizontal cells, amacrine cells, and bipolar cells 7 . Modern transcriptomic analysis using single-cell RNA sequencing (scRNA-seq) has recently revealed 39 transcriptionally distinct cell populations, supporting the presence of novel candidate cell subtypes of microglia, retinal endothelial cells, and astrocytes 8 . In addition to the wild-type mouse (Mus musculus), other well-established mouse models include the retinal degeneration 10 (rd10) mutant, used to study neuronal degeneration of the retina 9 , and Ins2Akita, used for research into early retinal complications in diabetes 10 . The study of retinal diseases has been limited by a lack of retinal cell lines comparable to the neural retina, with only limited provision of RPE 11,12 (D407, ARPE19) and RGC 13 (RGC-5) cells. Therefore, profiling the retinal proteins of normally growing C57BL/6 mice could provide an important reference dataset to advance the understanding of retinal physiology and its ocular functions.
A similar approach has been applied to guinea pig 14 , human 15 and zebrafish 16
www.nature.com/scientificdata
samples (brain, gallbladder, pancreas, large intestine, small intestine, liver, lung, stomach, and urinary bladder) were reported with a total of 11,340 proteins, acquired in 180 samples, comprising 437 DDA scans 17 .
Highly sensitive mass spectrometry has become an indispensable tool to investigate protein expression, interaction, and post-translational modification (PTM). Data-dependent acquisition (DDA) selects the most abundant eluting precursor ions from a survey scan (MS1), which are then fragmented in a collision cell (MS2) to enable peptide sequencing and identification by software data processing. This abundance-biased selection has hindered reproducible quantification between sample runs. The generation of a spectral library supports a data-independent acquisition (DIA) approach, in particular sequential windowed acquisition of all theoretical fragment ion mass spectra (SWATH-MS), which allows reproducible and precise quantification of thousands of proteins in complex tissue 18 . New biological insights have recently been made possible by combining and reusing open-access retinal spectral libraries for interrogation by SWATH-MS 19 .
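The contrast between the two acquisition modes can be illustrated with a minimal Python sketch. It is a conceptual illustration only, not the instrument's acquisition logic: the peak values, the top-N setting, and the 25 m/z window width are all hypothetical and are not taken from this dataset.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (precursor m/z, intensity)


def select_top_n(survey_scan: List[Peak], n: int) -> List[Peak]:
    """DDA-style selection: keep the n most intense MS1 peaks for MS2."""
    return sorted(survey_scan, key=lambda peak: peak[1], reverse=True)[:n]


def swath_window(mz: float, start: float, width: float) -> int:
    """DIA-style assignment: index of the fixed isolation window containing mz."""
    return int((mz - start) // width)


# Hypothetical survey scan, for illustration only.
survey = [(400.2, 1500.0), (512.7, 90000.0), (623.3, 300.0), (701.9, 42000.0)]

# DDA fragments only the most intense precursors.
selected = select_top_n(survey, n=2)

# DIA assigns every precursor to a window (assumed: 25 m/z wide, starting at 400).
windows = {mz: swath_window(mz, start=400.0, width=25.0) for mz, _ in survey}

print(selected)
print(windows)
```

In this toy scan the low-intensity precursor at m/z 623.3 is never selected by the top-N rule, yet it still falls inside an isolation window and would be fragmented in every cycle under windowed acquisition.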
Here we present the most comprehensive DDA spectral library of the C57BL/6 mouse retina, compiled using offline high-pH fractionation, which contains 5,950 non-redundant proteins (54,865 peptides) at 1% FDR, accounting for 35% of the total reviewed proteins listed in the UniProt protein database (Mus musculus, UP000000589). The library was generated by combining the DDA results from a total of 38 injections, with 36 peptide samples and two pooled samples extracted from normal adult mouse retinas (Fig. 1 and Table 1). The dataset was generated with a sample preparation protocol combining suspension trapping (S-Trap) 20 with a commercially available high-pH fractionation kit, which allows ultrafast, reproducible lysis, digestion, and fractionation of the neural retina. This submission provides the reference list of proteins identified in-vitro in the C57BL/6 mouse neural retina for ocular proteomics research. The generated data have been accepted by the PRIDE repository 21 for open access.
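The 35% coverage figure follows from the library size and the 17,015 reviewed proteins in the UniProt reference proteome cited in the Methods. A short Python check, using only numbers stated in this paper (the variable names are illustrative):

```python
# Values taken from the text of this data descriptor.
library_proteins = 5950      # non-redundant proteins in the spectral library
library_peptides = 54865     # peptides in the spectral library
reviewed_proteins = 17015    # reviewed entries in UniProt proteome UP000000589

coverage = library_proteins / reviewed_proteins
peptides_per_protein = library_peptides / library_proteins

print(f"Coverage of reviewed proteome: {coverage:.1%}")
print(f"Mean peptides per protein: {peptides_per_protein:.1f}")
```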

Animals.
Mice were maintained as in-house breeding colonies at the centralised animal facility, The Hong Kong Polytechnic University. Wild-type C57BL/6 mice (n = 3) were refracted between 3 and 6 weeks of age to assess refractive development under unmanipulated visual conditions. The eye of each animal was examined at postnatal day 25 with high-resolution spectral-domain optical coherence tomography (SD-OCT) to confirm that the physiology of the external and internal ocular structures was normal. Animals were housed in standard mouse cages at 25 °C on a 12:12 h light/dark cycle, with mouse pellets and water available ad libitum.
Refractive error measurement with infrared photorefractor. Refractive error was measured with a customised infrared photorefractor (Steinbeis Transfer Centre, Germany) as previously described 22,23 . In brief, the pupil of each eye was dilated for 15 minutes with mydrin-P ophthalmic solution, containing 0.5% tropicamide and 0.5% phenylephrine HCl. The mouse was sedated (ketamine 70 mg/kg; xylazine 10 mg/kg delivered by intraperitoneal injection) and then placed on the SD-OCT cylindrical platform, with the distance from the camera based on the image acuity. The mouse eye was aligned to its Purkinje image with gaze control in the x- and y-axes less than or equal to 5. The software automatically collected 99 datapoints within the gaze control, and the measurement was repeated for three technical replicates. The refractive error is represented as the mean value in diopters (D).
Ocular dimension measurement using SD-OCT. The axial length and the dimension of each ocular component were analysed with an SD-OCT system (Envisu R4310, Leica, Germany) with a 50° mouse probe after refractive error measurement. First, the mouse was aligned and positioned at a close distance to the probe under free scan mode. The optic disc position was identified and adjusted to a position 2 mm above the optic nerve. This distance allowed the display of the whole eye structure from the corneal apex to the choroidal-sclera layer. The eye was scanned in radial volume mode (A-Scans = 1000 lines, B-Scans = 6 scans, 32 frames, 80 lines of inactive A-scans, 0.4 mm diameter), which was repeated to obtain three technical replicates. The length of each component is represented as its mean value in mm. Axial length (AXL) was defined as the distance from the corneal apex to the posterior retina 24 . Corneal thickness (CT) was measured from the corneal apex to the posterior of the cornea. Anterior chamber depth (ACD) was measured from the posterior of the cornea to the anterior of the lens. Lens thickness (LT) was measured up to the posterior lens. Vitreous chamber depth (VCD) was measured up to the retinal nerve fiber layer. Retinal thickness (RT) was measured up to the retinal pigment epithelial layer.
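Because the five components are measured end to end along the optical axis, the axial length equals their sum, as stated in the legend of Table 2. A minimal Python sketch of that relationship is given here. The class and field names are hypothetical, and the numbers are placeholders chosen for illustration, not measurements from this study.

```python
from dataclasses import dataclass


@dataclass
class OcularBiometry:
    """Ocular component dimensions in mm for one eye at one time point."""
    ct: float   # corneal thickness
    acd: float  # anterior chamber depth
    lt: float   # lens thickness
    vcd: float  # vitreous chamber depth
    rt: float   # retinal thickness

    def axial_length(self) -> float:
        """Axial length as the sum of the five components."""
        return self.ct + self.acd + self.lt + self.vcd + self.rt


# Placeholder values for illustration only (not data from this study).
eye = OcularBiometry(ct=0.09, acd=0.30, lt=1.80, vcd=0.60, rt=0.20)
print(f"AXL = {eye.axial_length():.2f} mm")
```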

Retina extraction.
Procedures employed were similar to published methods with minor modifications 25 .
Mice were sacrificed by cervical dislocation after in-vivo ocular measurement. Both eyes were enucleated and immediately stored in ice-cold phosphate-buffered saline (PBS). The cornea was separated after hemisecting the eye equatorially, and the crystalline lens was removed from the eye cup with forceps. The remaining tissue was submerged in ice-cold PBS in a cell culture dish. The tissue was shaken gently, with the forceps holding the posterior sclera only, until the retinal layer detached from the retinal pigmented epithelium (RPE) and choroidal-sclera compartment. The retina was rinsed with ice-cold PBS, transferred to a 1.7 mL Eppendorf tube and immediately snap-frozen in liquid nitrogen.
Mouse retinal protein extraction. Three [...] lysis buffer containing 5% sodium dodecyl sulfate (SDS) and 50 mM triethylammonium bicarbonate (TEAB).
Generation of SWATH spectral library. Raw data files from all the fractions were analysed using ProteinPilot 5.0.1 software (SCIEX, US). The Mus musculus (mouse) proteome database, containing 17,015 reviewed proteins, was acquired from the UniProt proteome dataset UP000000589. Trypsin was set as the digestion enzyme, and carbamidomethylation was set as a fixed modification, with all possible biological modifications selected. FDR analysis was performed by searching MS/MS spectra against the target database combined with a reversed-sequence decoy proteome database. Injection volumes ranged from 4 to 6 µL, corresponding to 1.2 to 2 µg of peptides, owing to varied peptide concentration (Table 1).
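The idea behind target-decoy FDR estimation can be sketched in a few lines of Python. This is a generic, simplified illustration and is not ProteinPilot's algorithm: it assumes that the number of decoy hits above a score threshold estimates the number of false target hits, and the function name and toy scores are hypothetical.

```python
from typing import List, Tuple


def accepted_at_fdr(hits: List[Tuple[float, bool]], max_fdr: float = 0.01) -> int:
    """Return the number of target hits in the largest top-ranked set whose
    estimated FDR (decoys / targets) does not exceed max_fdr.

    hits: (score, is_decoy) pairs; a higher score means a better match.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    targets = decoys = best = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = targets
    return best


# Toy score list for illustration only (not data from this study).
hits = (
    [(300.0 - i, False) for i in range(200)]    # 200 high-scoring targets
    + [(100.5, True)]                            # first decoy hit
    + [(100.0 - i, False) for i in range(50)]   # 50 further targets
    + [(50.5, True), (50.4, True)]               # two more decoy hits
    + [(50.0 - i, False) for i in range(40)]    # 40 low-scoring targets
)

print(accepted_at_fdr(hits, max_fdr=0.01))
print(accepted_at_fdr(hits, max_fdr=0.05))
```

Tightening the threshold from 5% to 1% excludes the low-scoring targets that rank below the later decoy hits, which is the trade-off visible when comparing identifications at different FDR cut-offs.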

Data Records
The raw DDA mass spectrometry files were generated with Sciex proprietary software in .wiff format for library generation. The converted XML files, consensus spectral library, and ProteinPilot group file containing protein identification (ProteinPilot version 5.0.1) have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository 26 .

Table 2. Dimensions of the ocular compartments between postnatal days 25 and 46. Rx = refraction in diopters; AXL = axial length (mm); CT = corneal thickness (mm); ACD = anterior chamber depth (mm); LT = lens thickness (mm); VCD = vitreous chamber depth (mm); RT = retinal thickness (mm). Axial length equals the sum of the ocular compartments (CT, ACD, LT, VCD, RT). Measurements are represented as the mean value of left and right eyes in mm, ± SD (n = 6).

Technical Validation
Validation of ocular parameters in normal C57BL/6 mouse. The nocturnal mouse model has the potential to become an important model for studies of genetic and proteome alteration in the control of eye growth and myopia. The proteome dataset was built on the background of untreated, normal C57BL/6 mice, with retinas collected at postnatal day 46 (P46). Biometric measurements were performed at P25, P32, P39, and P46. The dimensions of each ocular component between P25 and P46 are shown in Table 2. Interocular differences in each ocular component (corneal thickness, anterior chamber depth, lens thickness, vitreous chamber depth, and retinal thickness) are shown in Fig. 2. The growth of axial length and lens thickness could be described by linear regressions. The observed constant growth rate in the C57BL/6 mouse was similar to that previously reported 27 . The observed axial length of about 2.9 mm on day 22 was similar to that reported by a study determining values from age 22 to 100 d in C57BL/6 mice 28 . Despite similar axial lengths, there were noticeable differences in the distribution of other components, with a thicker corneal layer (0.0887 mm on P21) and increased anterior chamber depth (0.2854 mm on P21). However, there were no statistically significant differences in axial length, corneal thickness, anterior chamber depth, lens thickness, vitreous chamber depth, or retinal thickness from P25 to P46. The interocular difference was calculated by averaging and subtracting the dimension of the contralateral eye at the same time point.
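One possible reading of the interocular difference calculation is sketched here in Python: the left-eye value is subtracted from the right-eye value for each animal at a given time point, and the differences are averaged across animals. The direction of subtraction, the function name, and the values are assumptions made for illustration and are not taken from this study.

```python
from statistics import mean
from typing import Sequence


def interocular_difference(right: Sequence[float], left: Sequence[float]) -> float:
    """Mean of per-animal (right eye - left eye) differences at one time point."""
    if len(right) != len(left):
        raise ValueError("right and left must contain one value per animal")
    return mean([r - l for r, l in zip(right, left)])


# Placeholder axial lengths in mm for three animals (illustration only).
right_eyes = [3.01, 2.99, 3.02]
left_eyes = [3.00, 3.00, 3.00]

print(f"{interocular_difference(right_eyes, left_eyes):+.4f} mm")
```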
Characteristics of the proteomics dataset. The combination of S-Trap protein extraction and high-pH peptide fractionation procedures allowed for rapid and reproducible sample preparation of mouse retinas. This dataset represents the first application of the S-Trap protocol to retinal samples for biological investigation using the SWATH-MS approach. The overall characteristics of the generated DDA spectral library are shown in Fig. 3. The tryptic digestion resulted in uncontrolled C-terminal cleavage, with 46% and 52% attributed to lysine (K) and arginine (R) residues, respectively. Peptide charge states ranged from +2 to +5, with the majority (60%) of peptides found in the +2 charge state. The instrument has been maintained for high mass accuracy and positive correlation of ion intensity between MS1 and MS2. The false discovery rate (FDR) was controlled at 1% at the protein and peptide levels using ProteinPilot (SCIEX, US), the stringent community standard in proteomics assays (Fig. 4). This high-confidence retina proteome dataset presents the first and largest DDA spectral library, at the time of submission, for eye and vision research using the popular C57BL/6 mouse model.

Fig. 3 Statistical analysis of the proteomics data. (a) Enzymatic digestion efficiency, with 78.2% expected canonical sequence peptides, 5% over-cleaved sequence peptides, and 15.8% under-cleaved sequence peptides; the distribution of peptide precursor charge state, with 60% doubly charged, decreasing towards 5+. The tryptic digestion resulted in uncontrolled C-terminal cleavage, with 46% and 52% attributed to lysine and arginine, respectively. (b) Precursor mass error in ppm during acquisition, and the mass-to-charge value of each identified peptide sequence against its retention time in C18 chromatographic separation. (c) MS1 intensity correlated well with identified peptide intensity, and MS2 signals correlated well with MS1 intensity, showing the robustness of the mass spectrometry system in ion transmission and the corresponding confidence in the MS2 peptide mass fingerprint spectrum.

Fig. 4 False discovery rate control and TIC chromatogram. False discovery rate (FDR) cut-offs at 1%, 5%, and 10%, nonlinear fitting of ranked proteins, and ROC plot of identification and estimated FDR at the (a) protein and (b) peptide levels. At the 1% FDR cut-off, the dataset identified 5,950 proteins and 54,865 peptides. (c) Total ion chromatogram and heatmap of ion intensity acquired in a 120-minute gradient separation. Individual analysis of fractions 1 to 6 identified 2721, 4325, 4800, 4391, 3847, and 3766 unique proteins at 1% FDR, respectively.
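The digestion statistics described above (under-cleaved peptides and the C-terminal residue distribution) can be tallied from a peptide list with a short Python sketch. It assumes the common trypsin rule (cleavage after K or R unless followed by P), and it does not detect over-cleaved (semi-tryptic) peptides, which would require the parent protein sequence. The peptide sequences are hypothetical examples, not entries from this library.

```python
import re
from collections import Counter
from typing import List


def missed_cleavages(peptide: str) -> int:
    """Count internal K/R residues that are followed by a residue other than P.

    The lookahead requires a following residue, so a C-terminal K or R is
    never counted as a missed cleavage.
    """
    return len(re.findall(r"[KR](?=[^P])", peptide))


def c_terminal_counts(peptides: List[str]) -> Counter:
    """Tally the C-terminal residue of each peptide."""
    return Counter(peptide[-1] for peptide in peptides)


# Hypothetical peptide sequences for illustration only.
peptides = ["LSSPATLNSR", "AKDLGEEHFK", "VKPTEELR", "GDLGIEIPAEK"]

under_cleaved = [p for p in peptides if missed_cleavages(p) > 0]
print(under_cleaved)
print(c_terminal_counts(peptides))
```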