Comparative proteomics in captive giant pandas to identify proteins involved in age-related cataract formation

Approximately 20% of aged captive giant pandas (Ailuropoda melanoleuca) have cataracts that impair their quality of life. To identify potential biomarkers of cataract formation, we carried out a quantitative proteomics analysis of 10 giant pandas to find proteins differing in abundance between healthy and cataract-bearing animals. We identified almost 150 proteins exceeding our threshold for differential abundance, most of which were associated with GO categories related to extracellular localization. The most significant differential abundance was associated with components of the proteasome and other proteins with a role in proteolysis or its regulation, most of which were depleted in pandas with cataracts. Other modulated proteins included components of the extracellular matrix or cytoskeleton, as well as associated signaling proteins and regulators, but we did not find any differentially expressed transcription factors. These results indicate that the formation of cataracts involves a complex post-transcriptional network of signaling inside and outside lens cells to drive stress responses as a means to address the accumulation of protein aggregates triggered by oxidative damage. The modulated proteins also indicate that it should be possible to predict the onset of cataracts in captive pandas by taking blood samples and testing them for the presence or absence of specific protein markers.

Tryptic digestion and mass tag labeling.
Reconstituted depleted serum aliquots containing 10 µg of total protein were digested with trypsin according to the filter-aided sample preparation (FASP) protocol 19 with some modifications. Briefly, each protein sample was reduced with 10 mM dithiothreitol, 8 M urea and 100 mM triethylammonium bicarbonate (TEAB) buffer (pH 8.0) for 1 h at 60 °C, then alkylated with 50 mM iodoacetamide at room temperature for 40 min before centrifugal filtration at 12,000×g (10-kDa cut-off) for 20 min at room temperature. The filtrate was diluted in 100 µL 300 mM TEAB buffer and the centrifugation step was repeated twice. The filtrate was transferred to a fresh tube containing 100 µL 300 mM TEAB buffer, and on-filter digestion was performed overnight at 37 °C using 0.1 μg/μL sequencing-grade trypsin (Promega, Madison, WI, USA). The peptides were eluted by centrifugation at 12,000×g for 20 min at room temperature with one change of TEAB buffer, and the final eluate was lyophilized.
For mass tag labeling, the lyophilized peptides were reconstituted in 100 μL 200 mM TEAB buffer and stored at room temperature. The tandem mass tag (TMT) reagent (Thermo Fisher Scientific) was mixed with acetonitrile and centrifuged according to the manufacturer's instructions. This reagent consists of three chemical groups (reporter, balance and reaction groups) that have the same combined mass (producing a single peak in MS1) but split into different masses during fragmentation (producing separate peaks in MS2 that can be quantified across multiple samples). The peptides and TMT reagent were mixed and allowed to react at room temperature for 1 h before we terminated the reaction with 8 μL 5% hydroxylamine for 15 min. The labeled peptides were lyophilized and stored at -80 °C.
The samples were fractionated at a flow rate of 300 nL/min on an Acclaim PepMap RSLC C18 column (75 μm × 150 mm, 2 μm, 100 Å; Thermo Fisher Scientific) mounted on an EASY-nLC 1200 liquid phase system (Thermo Fisher Scientific) and were eluted in a gradient of mobile phases A (0.1% formic acid in water) and B (80% v/v acetonitrile in water with 0.1% formic acid). The gradient elution profile was: 0-55 min, 8% B; 55-79 min, 30% B; 79-80 min, 50% B; 80-90 min, 100% B. The separated peptides were introduced into a Q Exactive mass spectrometer (Thermo Fisher Scientific) via a nano-liter spray ion source (Thermo Fisher Scientific). The mass resolution of MS1 was set to 70,000, the automatic gain control value was 1 × 10^6, and the scanning range was 300-1600 m/z. The 10 most intense peaks were scanned in MS2 by high-energy collisional fragmentation in data-dependent positive ion mode, with the collision energy set to 32 eV, the resolution set to 17,500, the automatic gain control set to 2 × 10^5, the maximum ion accumulation time set to 80 ms, and the dynamic exclusion time set to 15 s.

Proteomic data analysis.
Proteomic data were analyzed using Xcalibur v2.1 and were screened against the UniProt panda database using Proteome Discoverer v2.2 (Thermo Fisher Scientific). The false discovery rate (FDR) for peptide identification was held below 1%. The screening criteria to classify differentially expressed proteins were a fold change of at least 0.5 and a significance of p < 0.05. The resulting datasets are available in the ProteomeXchange Consortium repository (http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD031039 or https://www.iprox.cn/page/project.html?id=IPX0004000000; Project ID IPX0004000000, accession number PXD031039). Quantitative bias within samples caused by differences in amino acid sequence length was eliminated by applying the IBAQ method as previously described 20. Furthermore, quantitative bias across samples caused by sampling and/or loading errors was eliminated by using the total quantity method, which consisted of (i) adding the IBAQ values of each protein identified in the sample to represent the total amount of protein in the sample, (ii) dividing the IBAQ of each protein in the sample by the total amount of protein in the sample, and (iii) multiplying this value by 100,000 to obtain the normalized protein quantitative value. Variation between the cataract and control groups was compared to within-group variation by one-way ANOVA (SPSS Statistics; IBM, Armonk, NY, USA).
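The three steps of the total quantity method can be illustrated with a minimal Python sketch. This is not the software used in the study: the function name `normalize_total_quantity`, the `scale` parameter and the IBAQ values are hypothetical and serve only to show the arithmetic.

```python
def normalize_total_quantity(ibaq_by_protein, scale=100_000):
    """Normalize the IBAQ values of one sample by the total quantity method.

    ibaq_by_protein maps a protein identifier to its IBAQ value in the sample.
    """
    # Step (i): sum the IBAQ values of all proteins identified in the sample.
    total = sum(ibaq_by_protein.values())
    if total <= 0:
        raise ValueError("total IBAQ must be positive")
    # Steps (ii) and (iii): divide each IBAQ by the total, then multiply by the scale.
    return {protein: value / total * scale
            for protein, value in ibaq_by_protein.items()}


# Made-up IBAQ values for a single sample (illustration only).
sample = {"PSMA2": 40.0, "VTN": 110.0, "KRT18": 50.0}
normalized = normalize_total_quantity(sample)
for protein, value in normalized.items():
    print(f"{protein}\t{value:.1f}")
```

Because every sample is rescaled to the same total, the normalized values of a given protein can be compared across samples even when the amount of protein loaded differs.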

Functional analysis of proteins differing in abundance between cataract and control samples.
The TMT quantitative proteomics data were screened for proteins differing in abundance between the pandas with cataracts (samples A1-A4) and healthy controls (samples in bands B or C or B + C). All differentially expressed proteins were then used as search queries against the Gene Ontology (GO) and KEGG pathway databases, as well as databases of protein interactions, allowing the analysis of comparative expression profiles as well as the construction of heat maps, Venn diagrams, and protein interaction network maps 21-23. Functional enrichment analysis was carried out to identify differentially expressed proteins significantly enriched in GO terms (biological processes, cellular localization and molecular functions) or KEGG metabolic pathways. Proteins were mapped to the GO and KEGG databases, the number of proteins representing each term or pathway was calculated, and hypergeometric tests were applied to identify significantly enriched GO terms or KEGG pathways in the protein list. GO terms and KEGG pathways were considered significant at q < 0.05.
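The enrichment test described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the text does not specify how q-values were derived, so the Benjamini-Hochberg adjustment is assumed here, and the function names, term names and counts are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: proteins in the background; K: background proteins annotated with the term;
    n: differentially abundant proteins; k: of those, the number annotated with the term.
    """
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (assumed q-value method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical counts: term -> (k, K); N and n are also made up.
N, n = 2000, 150
terms = {
    "extracellular region": (60, 400),
    "proteasome complex": (8, 30),
    "nucleus": (10, 300),
}
names = list(terms)
pvals = [hypergeom_pvalue(terms[t][0], n, terms[t][1], N) for t in names]
qvals = benjamini_hochberg(pvals)
significant = [t for t, q in zip(names, qvals) if q < 0.05]
print(significant)
```

The upper tail is used because enrichment asks whether a term is represented by at least as many proteins as observed; a two-sided test would also flag depleted terms.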

Protein interaction analysis.
Interactions between the differentially expressed proteins and KEGG pathways were visualized by integrating the data using the online platform Omicsbean to generate the network diagrams shown in Fig. 5. As well as showing that many of the differentially modulated proteins interact directly with each other, the diagrams revealed that the downregulated proteins were responsible for most of the predicted interactions and were almost exclusively responsible for interactions with the proteasome. This suggests that one of the key events during age-related cataract formation in pandas is the depletion or loss of proteasome components, which would contribute to the inability of the proteasome to maintain protein integrity in response to oxidative stress.

Discussion
Aging humans and other mammals can develop cataracts due to the accumulation of oxidative damage in the lens, and this is associated with a declining quality of life 1. In captive giant pandas, which can live 10-15 years longer than animals in the wild 3,4, the prevalence of cataracts in the aged population (≥ 19 years old) has now reached ~ 20%. The development of age-related cataracts is mainly promoted by environmental factors that cause oxidative damage to DNA and proteins 5-7,9. This triggers changes in DNA methylation and gene expression, and in previous studies we and others have identified target genes involved in apoptosis, DNA repair and oxidative stress responses that may represent an attempt to counter or reverse the damage that occurs during cataract formation 17,18,24-26. Epigenetic modifications such as DNA methylation can directly affect transcription, and we have previously reported correlations between the methylated loci detected in panda cataracts and the modulation of gene expression in affected vs unaffected individuals 17,18. However, the regulation of gene expression at the level of protein synthesis and turnover means that transcript and protein levels often do not correlate directly, and this [...] differentially regulated proteins we detected were annotated as proteins that bind RAGE receptors, including S100A12, S100AB, S100A9 and PSMA2, all of which were upregulated. However, we note that the preponderance of extracellular proteins in our differentially regulated dataset could have a more prosaic explanation: extracellular proteins are more likely to enter the bloodstream than intracellular proteins and are therefore more likely to be captured as serum markers. This may also explain the detection of proteins related to the coagulation pathway such as plasma kallikrein. However, several of the differentially regulated extracellular proteins we detected are components of the extracellular matrix that have previously been associated with cataract formation 37. For example, collagen 1A2 and the integrin-binding protein vitronectin were both downregulated in the affected animals (along with the intracellular integrin-binding protein TLN1), supporting the reported role of the extracellular matrix and integrin signaling in lens development and cataract formation 38,39. Interestingly, multiple keratin proteins were upregulated in the animals with cataracts (KRT18, KRT75, KRT1 and KRT10) whereas various cytoskeletal components and their interaction partners were downregulated, suggesting a complex network of signaling inside and outside the cell as a stress response to the accumulation of protein aggregates.
Although the above results paint an intriguing picture of the panda proteome related to cataract formation, it is important to highlight some weaknesses of the study that may influence the results. First, we acknowledge that the sample number is small, which reflects the limited availability of samples and the further limitations imposed by the ethical committee. This has the potential to introduce bias into the results caused by undetected disease in the test subjects despite our careful and strict inclusion criteria. Second, almost all recorded instances of captive giant pandas with age-related cataracts are female: only two male cases have been reported, one deceased with no material available and one living specimen. This has the potential to introduce bias into the results caused by sex-dependent factors, although we addressed this to a certain extent by dividing the control group into sex categories to enable female affected vs female unaffected and male unaffected vs female unaffected comparisons. These comparisons revealed some differences between males and females but not in the same pathways that discriminated between affected and unaffected animals, suggesting there is negligible sex-dependent interference.
In conclusion, we have generated multiple lines of evidence based on quantitative proteomics showing that age-related cataracts in pandas involve the dysregulation of the lens proteome, resulting in the detection of nearly 150 positive and negative protein markers in the blood. More than half of these markers are extracellular proteins, suggesting either that cataract formation causes the extensive modification of the extracellular proteome or that extracellular proteins are more likely to enter the bloodstream in sufficient quantities for detection. One of the most prominent aspects of the cataract serum profile was the depletion of proteasome components and their interaction partners, supporting previous results showing that the proteasome maintenance system deteriorates with age and is progressively less responsive to the accumulation of protein aggregates. Such aggregates may then exacerbate the problem by directly inhibiting the proteasome. Interestingly, we did not identify a single transcription factor among the differentially regulated proteins we detected, which indicates that the response to cataract formation is largely post-transcriptional. The identification of multiple protein markers correlating with the presence of cataracts could allow the development of bioassays for the early detection of cataracts in captive animals, based either on the most profound changes in abundance of key proteins such as those listed in Table 3, or the analysis of multiple biomarker profiles to define cataract-positive patterns, as shown for other diseases 40,41. In terms of clinical applications, proteins that increase in abundance during the formation of cataracts could be tested as new drug targets, whereas those depleted during the formation of cataracts could be evaluated as candidates for replacement therapy, leading to new opportunities for the prevention and/or treatment of cataracts in aging captive pandas.

Figure 1. Differential protein expression when comparing band A (pandas with cataracts) to bands B and C combined (pandas without cataracts). (a) The number of upregulated and downregulated proteins in band A compared to bands B + C. (b) Volcano plot showing the most meaningful differentially expressed proteins in band A compared to bands B + C by plotting significance on the y-axis against fold-change on the x-axis. Proteins to the left (green dots) are downregulated and those to the right (red dots) are upregulated according to the statistical threshold, below which the proteins are not considered to be differentially expressed (black dots).

Figure 2. Number of enriched GO biological process, cell component and molecular function classifications and KEGG pathways defined by the differentially regulated proteins in the A vs B + C comparison. The bar chart shows the number of enriched categories with no statistical cut-off (blue) and with a threshold of p < 0.05.

Figure 4. KEGG pathway enrichment analysis. (a) The top-10 pathways enriched among the proteins that differ in abundance between pandas with cataracts (band A) and those without cataracts (bands B + C). The enriched pathways are arranged left to right by p-value (more significant on the left), which is also represented by the logarithmic scale of the y-axis. The dotted lines represent cut-offs at significance values of p < 0.05 and p < 0.01. (b) Significant enrichment functional scatter plot in which the y-axis represents functional annotation, and the x-axis represents the Rich factor of that function (the number of differential genes enriched in a pathway divided by the total number of genes annotated in that pathway). The p-value is represented by the color of the dots, and the number of differentially expressed genes representing each function is shown by the size of the dots. Only the 14 highest enrichments are shown.

Figure 5. Protein interaction network visualized using the Omicsbean platform. Circles around the edge of the diagram represent the differentially regulated proteins (red = upregulated, green = downregulated, with different shades representing the fold change). The squares represent biological processes, cell localization, molecular functions or signaling pathways, and the color code indicates significance (yellow = low, blue = high, with deeper shades indicating greater significance). Solid lines represent protein-protein interactions, and dotted lines represent the involvement of proteins in metabolic/molecular pathways or functions.

Table 1. Description of giant panda sample donors. Sample band A features the affected females, whereas bands B and C feature the healthy females and males, respectively.

Table 2. Triplicate absorbance readings (and average reading) and the calculated concentration of proteins in the analyte (diluted for measurement) and the original depleted serum sample.