Proteomic analysis of human lacrimal and tear fluid in dry eye disease

To understand the pathophysiology of dry eye disease (DED), it is necessary to characterize proteins in the ocular surface fluids, including tear fluid (TF) and lacrimal fluid (LF). There have been several reports of TF proteomes, but few proteomic studies have examined LF secreted from the lacrimal gland (LG). Therefore, we characterized the proteins constituting TF and LF by liquid chromatography mass spectrometry. TF and LF were collected from patients with non-Sjögren syndrome DED and from healthy subjects. Through protein profiling and label-free quantification, 1165 proteins from TF and 1448 from LF were identified. In total, 849 proteins were present in both TF and LF. Next, candidate biomarkers were verified using the multiple reaction monitoring assay in both TF and LF of 17 DED patients and 17 healthy controls. As a result, 16 marker proteins were identified (fold-change > 1.5, p-value < 0.05), of which 3 were upregulated in TF and 8 up- and 5 down-regulated in LF. In conclusion, this study revealed novel DED markers originating from the LG and tears by in-depth proteomic analysis and comparison of TF and LF proteins.


Results
Comprehensive global proteome profiling of TF and LF. The proteins in pooled TF and LF samples from patients with DED (n = 5) and controls (n = 5) were analysed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) following trypsin digestion and high-pH reverse-phase liquid chromatography fractionation (Fig. 1a, Table 1). We identified 1165 proteins in TF and 1448 proteins in LF, with a false discovery rate of less than 1% at both the protein and peptide spectrum match levels. In total, 1764 proteins mapped to 1671 genes were identified across the TF and LF samples (Supplementary Table S1).
Among the proteins identified in TF and LF, only 48% (849) were common to both fluids; 599 proteins found in LF were not detected in TF. In addition, 316 proteins identified in TF were not found in LF, suggesting that these proteins originated from the ocular surface rather than the LG (Fig. 1b).
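The overlap figures follow from simple set arithmetic on the reported totals. The following minimal sketch uses only the counts given in the text (1165 TF proteins, 1448 LF proteins, and the 849 shared proteins stated in the abstract); the variable names are illustrative.

```python
# Counts reported in the text.
n_tf = 1165      # proteins identified in tear fluid (TF)
n_lf = 1448      # proteins identified in lacrimal fluid (LF)
n_shared = 849   # proteins identified in both fluids

tf_only = n_tf - n_shared                 # detected in TF but not in LF
lf_only = n_lf - n_shared                 # detected in LF but not in TF
n_union = tf_only + lf_only + n_shared    # all distinct proteins
shared_pct = 100 * n_shared / n_union     # share of proteins common to both

print(tf_only, lf_only, n_union, round(shared_pct))
```

The same arithmetic gives the fraction of TF proteins that are also found in LF (`n_shared / n_tf`, roughly three-quarters).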
To determine the cellular components of the TF and LF proteins, we conducted Gene Ontology (GO) analysis using the Database for Annotation, Visualization and Integrated Discovery (DAVID, version 6.7). Figure 1c shows that the TF proteins were mostly extracellular region proteins such as defensin β1 (DEFB1), tenascin XB (TNXB), and N-acetylgalactosaminyltransferase 1 (GALNT1), whereas the LF proteins were largely cytosolic proteins such as ribosomal proteins (RPL18, RPLP1, RPS16), histidyl-tRNA synthetase (HARS), and adenylosuccinate lyase (ADSL). Notably, the proteins specific to each fluid showed distinct biological characteristics. TF-specific proteins (316 in total) belonged to the oligosaccharide metabolic process and glycosylation-related process groups, whereas LF-specific proteins (599 in total) belonged to the translation and RNA processing groups (Supplementary Fig. S1).
Differentially expressed proteins in the TF and LF of patients with DED. We found 138 differentially expressed proteins (DEPs) in TF and 161 DEPs in LF (fold-change > 2) (Supplementary Table S2). Among these proteins, PLA2G2A, PCBP1, and YWHAB were upregulated and UBA52, CALML5, CEACAM1, and WFDC were downregulated in both the TF and LF of DED patients. Furthermore, the expression levels of FGB, IGHM, and GANAB were increased in LF but reduced in TF in DED patients compared with controls. In contrast, the expression levels of LAP3, SERPINB2, CTSL, and ZG16B were increased in TF but reduced in LF in DED. When all quantifiable proteins were considered, approximately 50% of the DEPs in TF and LF showed the same direction of change between DED and control samples. Supplementary Fig. S2 shows the alteration patterns of these proteins together with protein-protein interaction information from the Search Tool for the Retrieval of Interacting Genes (STRING) database, grouped by GO-BP (Biological Process) terms. Next, we compared our data with the proteins identified in four previous studies of DED biomarkers in TF, based on a fold-change > 1.5 [16][17][18][19]. Of the 77 TF proteins whose expression was reported to be altered by DED in previous studies, 37 showed a consistent alteration pattern in this study. The expression pattern of five downregulated proteins in TF, including defensin α family (DEFA1) and lactoperoxidase (LPO), corresponded to previously reported results. Furthermore, 11 proteins upregulated in this study were previously reported to be upregulated in TF as a result of DED. Interestingly, LF DEPs also showed alteration patterns consistent with those reported for DEPs in previous TF proteomics studies. Among them, 20 upregulated proteins [e.g., α-2-HS-glycoprotein (AHSG), valosin-containing protein (VCP), and orosomucoid 1 (ORM1)] and 7 downregulated proteins [e.g., lipocalin-1 (LCN1), zinc-alpha-2-glycoprotein (AZGP1), and lysozyme C (LYZ)] showed the same quantitative patterns as those reported in previous TF proteomics studies 23 (Supplementary Tables S3a,b).
Gene ontology analysis revealed a key regulator of DED function. To gain insight into the functional roles of DEPs associated with DED in TF and LF, we first performed GO-BP and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses using the up- and downregulated DEPs (Supplementary Tables S4a-d, S5a-c). Next, we generated a GO-BP heatmap to detect DED-associated alterations of the proteome at the molecular systems level. Forty-nine categories were enriched and are presented in the heatmap (Fig. 2).
Notably, the GO-BP category of 'immune/inflammatory response process' was markedly changed in LF, but not in TF. Most upregulated LF proteins, including complement C4b (C4B), haptoglobin (HP), alpha-1-acid glycoprotein 1 (ORM1), and clusterin (CLU), belonged to the GO-BP category of 'immune response', which was enriched only in LF. Moreover, the category of 'apoptosis regulation' was down-regulated in TF and up-regulated in LF, suggesting that cell death and protective mechanisms against immune-inflammatory cytokines are more active in the LG than on the ocular surface. Additionally, carbohydrate metabolism-related genes were up-regulated in TF, but not in LF.

Proteome interaction network model describing DED.
To understand the cellular networks altered in the TF and LF of DED patients, we constructed network models from the TF and LF DEPs using STRING database information (Fig. 3a). From this map, we explored the key proteins involved in crucial biological processes that may influence the pathogenesis of DED by calculating centrality (weighted closeness) based on degree and betweenness (Fig. 3b). The results comprised the top 29 proteins ranked by degree and betweenness centrality, which reflects the amount of control that a node exerts over the interactions of other nodes in the network 24. Heat shock protein HSP 90-alpha (HSP90AA1) showed the highest betweenness centrality in this network model. Additionally, these proteins formed a close network with the categories of 'apoptosis' or 'homeostasis' and showed strong centrality within the protein-protein network model. The proteins were related to the innate immune response, specifically the complement pathway, and were activated systematically in our DED network model.
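To illustrate what betweenness centrality measures, the following minimal sketch implements Brandes' algorithm for an unweighted, undirected graph. It is a generic illustration and not necessarily the procedure or software used to produce Fig. 3b; the toy graph and its node labels are hypothetical and do not represent real protein interactions.

```python
from collections import deque

def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours (Brandes' algorithm).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each node.
        dist = {s: 0}
        n_paths = {v: 0 for v in adj}
        n_paths[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered node pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical toy network: a hub with three neighbours and one tail node.
toy = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}
centrality = betweenness_centrality(toy)
ranked = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy graph the hub lies on the most shortest paths and therefore ranks first, which is the sense in which a high-betweenness protein "controls" interactions among other nodes.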

Verification of candidate biomarkers by multiple reaction monitoring.
We performed MRM analysis using 100 μg of protein from DED patients (TF, n = 17; LF, n = 17) and controls (TF, n = 17; LF, n = 17) to verify the DEPs in the LFQ data (Table 1). As a result, 62 proteins were confirmed as candidates and 269 Q3 transitions were generated (Fig. 4a, Supplementary Table S6). To verify the reproducibility of the LC-MRM-MS runs, we evaluated the coefficients of variation between triplicate results based on the best transition of each peptide. The overall median and average coefficients of variation of the target peptides were 6.7% and 11.7%, respectively. Based on MRM quantification, 11 and 24 proteins were significantly changed in the TF and LF of DED patients, respectively, compared with controls (p-value < 0.05, fold-change > 1.5) (Supplementary Table S7). Among these proteins, 3 up-regulated proteins in TF (GAA, NQO1, and VCP) showed alteration patterns consistent with the LFQ quantification set. Furthermore, 8 up-regulated proteins (LPO, PLA2G2A, HP, PKM, SERPINA3, P4HB, CBR1, and ORM1) and 5 down-regulated proteins (IGKC, LCN1, HBA1, HBB, and AZGP1) in LF were consistent with the LFQ DEPs. Their alteration patterns are represented as interactive box plots, which show that the marker candidates could clearly discriminate between DED patients and controls (Table 2 and Fig. 4b).
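The reproducibility check described above can be sketched as follows. This is a minimal illustration that assumes the coefficient of variation is the sample standard deviation divided by the mean of the triplicate peak areas; the peptide names and area values are hypothetical.

```python
from statistics import mean, median, stdev

def cv_percent(values):
    """Coefficient of variation (sample SD / mean) as a percentage."""
    return 100 * stdev(values) / mean(values)

# Hypothetical triplicate peak areas for three target peptides.
triplicates = {
    "peptide_1": [100.0, 110.0, 90.0],
    "peptide_2": [200.0, 202.0, 198.0],
    "peptide_3": [50.0, 60.0, 40.0],
}

cvs = {name: cv_percent(areas) for name, areas in triplicates.items()}
median_cv = median(list(cvs.values()))
mean_cv = mean(list(cvs.values()))
```

Reporting both the median and the mean, as in the text, indicates whether a few poorly reproducible peptides are inflating the average.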
To confirm the sensitivity and specificity of the 16 candidate markers for DED, we calculated the area under the receiver operating characteristic curve (AUC). All marker proteins showed AUC values exceeding 0.7, as shown in Fig. 5. In particular, the LG-secreted proteins LPO and PLA2G2A, which are related to the immune/inflammatory response, showed very high sensitivity and specificity in DED patients, with AUC values of 0.822 and 0.876, respectively.
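For readers unfamiliar with the AUC, the following minimal sketch computes it directly from its rank-based definition, namely the probability that a randomly chosen patient has a higher marker value than a randomly chosen control. It is an illustration only and not the SPSS procedure used in this study; the abundance values are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as P(case > control), with ties counted as one half.

    Equivalent to the Mann-Whitney U statistic divided by
    (number of cases * number of controls).
    """
    wins = 0.0
    for x in case_values:
        for y in control_values:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Hypothetical normalized abundances of one marker.
patients = [2.1, 1.8, 2.5, 0.95, 2.0]
controls = [1.0, 1.3, 0.9, 1.9, 1.1]
auc = roc_auc(patients, controls)
```

For a marker that is decreased in patients, the value computed this way falls below 0.5, and its discriminating power is given by one minus that value.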

Discussion
There were three important findings in this study. First, we isolated and identified more than 1700 proteins from ocular surface fluid, whereas only several hundred proteins have been previously documented in proteomics analyses of human TF [16][17][18][19]. Therefore, this study reports the largest number of ocular surface fluid proteins of DED patients identified to date. Second, this is the first study to compare the proteome of human LF with that of TF. Following the development of LF collection methods, we directly collected fluid originating from the LG, identified proteins and measured their expression levels, and compared them with the TF proteomics data. Accordingly, the most important pathways and proteins among the DEPs were in the 'immune/inflammatory response' categories in LF, not in TF. Finally, after the DEPs of TF and LF that discriminated between DED patients and controls were identified, they were validated by an MRM assay in a larger set of individual samples. Thus, we identified novel candidate markers with high sensitivity and specificity for DED patients.
In this study, we identified a larger number of proteins in LF than in TF (1448 vs. 1165), although TF receives contributions from a greater variety of cells and tissues, such as the corneal epithelium, immune cells, nerves, vascular cells, goblet cells, and meibomian gland cells. Moreover, approximately half of all identified proteins (849) were common to TF and LF. These results indicate that 316 TF proteins were not secreted from the LG and that the LG contributes approximately three-quarters of the TF protein composition. Interestingly, 599 proteins secreted from the LG were not found in TF. Although the fate of these proteins was not determined in the present study, there are two possible explanations for their loss: these proteins may be used on the ocular surface soon after secretion from the LG, or they may be readily degraded by proteases expressed on the ocular surface and/or in TF. In addition, a relatively large number of proteins from the corneal/conjunctival epithelium is thought to be secreted under DED conditions. Our GO-BP and KEGG analyses revealed that the most abundant TF proteins belonged to the extracellular region. Specifically, DEFB1 is a well-known antimicrobial peptide produced by the cornea and conjunctiva as well as the lacrimal gland 25. TNXB is thought to function in wound healing during matrix insults such as desiccating stress on the ocular surface. GALNT1 is one of the important enzymes involved in the initiation of mucin-type O-glycosylation in the human conjunctiva 26. In contrast, LF proteins predominantly belonged to the cytosol. Because LGs are exocrine glands that actively synthesize and secrete proteins into the aqueous layer of tears, LF contains ribosomal proteins (e.g., RPL18, RPLP1, RPS16) and HARS, which play essential roles in protein biosynthesis. ADSL is also important in sustaining the metabolic and energetic nucleotide cycles required for active exocytosis 27.
Collectively, LF and TF proteins originate from different pools and different cell types and therefore have some distinct physiological functions, although more than 800 proteins overlapped between the two fluids in the present study.
We found that several known TF biomarker proteins showed similar expression patterns in the LF of DED patients. In particular, reductions of LCN1, LTF, and LYZ, which are related to the immune response in TF, have been well documented in previous studies 18,28. The authors of those studies suggested that the decrease of these proteins in tears is a relevant indicator of LG dysfunction 18,28,29. Our results suggest that the down-regulation of these marker proteins arises from proteomic changes in the LF in DED, resulting in impairment of the bacterial defence system in the LF environment. This may render patients more susceptible to microorganism growth on the ocular surface. The expression levels of these proteins were previously shown to decrease only in TF, but our results demonstrate that they are also decreased at the LF level.
In this study, we verified the DEPs in an MRM assay, which is a highly sensitive and selective method for targeted quantification of protein abundance 30. To select proper MRM transitions, we utilized the Human SRMAtlas 31, which provides definitive and verified peptide transitions and collision energy information optimized for quadrupole-based mass spectrometry. Furthermore, the Agilent Bravo Platform, which uses an accurate liquid handling system, provided highly reproducible preparation of the 68 individual samples. The MRM assay revealed, in total, 16 marker proteins (3 in TF and 13 in LF) that were consistent with the LFQ quantification results. In addition, these proteins showed AUC values greater than 0.7 (Fig. 5) and may be novel biomarkers for discriminating between DED patients and controls. Of these proteins, immune response-related proteins in the LF, including PLA2G2A, LPO 32, HP 33, SERPINA3, and ORM1, were significantly increased in both the LFQ and MRM data. In particular, in the MRM verification set, PLA2G2A was significantly up-regulated in LF (fold-change = 1.92, p-value = 0.0015, AUC = 0.88), but not in TF. PLA2G2A is a pro-inflammatory enzyme that catalyses the initial step of the arachidonic acid pathway 34. Moreover, PLA2G2A in TF plays a major role in killing a broad spectrum of gram-positive and gram-negative bacteria at the ocular surface under physiological conditions 35,36. Previous studies found increased levels of PLA2G2A in the TF of DED patients and in the conjunctival tissue of DED mice 37,38. Although this biomarker candidate was previously found to be important in preventing microbial infections at the ocular surface under DED conditions, our results revealed that it is highly activated in the LF as well. Additionally, LCN1 was significantly down-regulated in the LF of DED patients (AUC = 0.83). This suggests that immune response-related proteins known to be biomarkers in TF may be more meaningful marker proteins in LF.
Figure legend (protein interaction network): The colours of the nodes represent proteins whose levels were greatly increased (red) or decreased (green) in dry eye disease. Each shape represents a distinguishable alteration condition (outlined in the box). The connections between nodes (grey lines) show either a regulatory role or a physical interaction between proteins. Large nodes represent a high degree of connectivity with other proteins in the network.
Notably, innate immune defence-related proteins, specifically those related to the complement pathway, were up-regulated at the protein level in LF. In particular, LPO, C4B, F5, FGG, FGA, FGB, KNG1, MIF, SERPINC1, SERPING1, SERPINA1, SERPINA3, and PRDX1 were increased in the LF of DED patients in the LFQ data. The complement pathway has not been investigated previously in either non-Sjögren syndrome or Sjögren syndrome DED. Complement pathway proteins are now thought to be important markers of age-related macular degeneration 39,40, bacterial keratitis 41, and ocular allergy 42.
In this study, we could not compare the expression patterns at the gene level in certain related human tissues, including the LG and cornea. It is difficult to carry out human LG biopsies because of the highly vascularized structure of the LG and ethical constraints. However, valuable information may be obtained when transcriptomic and proteomic analyses are performed with human ocular tissues in future studies. Another limitation of this study was the small cohort size. Although we identified 16 meaningful candidate marker proteins for DED through both LFQ and MRM analysis, further studies including more TF and LF samples from DED patients and control groups are needed to apply these markers in diagnosis.
In conclusion, we provide fundamental information regarding biomarker candidates for DED using proteomics of fluids from the human ocular surface and subsequent MRM verification in individual samples. Because there are no definitive diagnostic criteria or biomarkers for DED to date, the identification of biomarkers in DED may lead to more accurate diagnosis and grading of DED and ultimately to the development of targeted drug therapies. Further studies of larger patient cohorts are needed to determine and select more accurate markers representing the DED pathology, determine drug selection, and evaluate disease prognosis.

Materials and Methods
Patient enrollment and determination of ocular surface dryness. This cross-sectional, case-control clinical trial was conducted at two sites: Gangnam Severance Hospital (Department of Ophthalmology, Yonsei University College of Medicine, Seoul, Korea) and Sacred Heart Hospital (College of Medicine, Hallym University, Chuncheon, Gangwon-Do, Korea). All procedures conformed to the tenets of the Declaration of Helsinki. The study was approved by the Institutional Review Board of each hospital, and informed consent was obtained from all patients. DED was diagnosed according to the diagnostic criteria of the Asia Dry Eye Society 43. The inclusion criteria were as follows: one or more DED-related symptoms, including tightness, foreign body sensation, irritation, red eye, itching sensation, blurring, or pain; a Schirmer's test I result (without anesthesia) of <5 mm in 5 minutes, a tear break-up time of <5 seconds, or a typical DED pattern of superficial punctate erosion of the conjunctiva or cornea. Patients were excluded if they had (1) a history of using eye drops within the current month, (2) infection, trauma, an ocular procedure, or other surgery within the previous 6 months, (3) severe blepharitis with meibomian gland dysfunction, (4) a blinking abnormality (e.g., Parkinson's disease or facial nerve palsy), or (5)    by another author (E.J.C). All evaluations were performed with the examiners blinded to the disease status of the subjects. Table 1 summarizes the clinical parameters for the classification of DED and control samples at each step of the proteomic analysis, including global profiling and the MRM assay.

Sampling of tear fluid and lacrimal fluid for proteomics.
To measure and collect tear fluid from patients, a bonded 2.0 × 10-mm polyester fiber rod (TRANSORB® WICKS, FILTRONA, Richmond, VA, USA) was used as previously reported. Briefly, to collect TF, which is a mixture of secretions from the LGs, meibomian glands, and corneoconjunctival cells, a polyester wick was applied to the tear meniscus of the lower lid margin. The wick was then removed and placed into a 1.5-ml Eppendorf tube, which was stored at −70 °C until the mass spectrometric assay was performed. In addition to the TF, pure LF was collected. The detailed methods for collecting TF and LF have been published with a video file 44.
In-solution digestion.
Analysis of proteomics data. Further statistical and bioinformatics analyses were performed using Perseus software (v. 1.5.0.31). Proteins that were quantified by at least two peptides using the 'unique plus razor peptides' setting in MaxQuant were used for LFQ to prevent ambiguous abundance comparisons. Before the LFQ intensity data were loaded, hits to the reverse database, contaminants, and proteins only identified by site were eliminated. After the data were loaded, all duplicate data were grouped separately. All LFQ intensities were transformed to log2 values. Proteins that did not have valid values for all samples in at least one group were filtered out. Remaining missing values were replaced by imputation based on the normal distribution (using a width of 0.3 and a downshift of 1.8) 48. Proteins with a greater than 2-fold increase or decrease in LFQ intensity, assessed using Student's t-test, were classified as differentially expressed proteins (DEPs). Further annotation enrichment analyses were performed on the resulting significant DEPs.
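The log2 transformation, imputation, and fold-change steps can be sketched as follows. This is a minimal illustration rather than the Perseus implementation: it assumes that the width and downshift are expressed in units of the standard deviation of the observed log2 intensities within a sample, and all intensity values are hypothetical.

```python
import math
import random
from statistics import mean, stdev

def impute_downshifted(log2_values, width=0.3, downshift=1.8, rng=None):
    """Replace None entries with draws from a narrowed, down-shifted normal.

    Draws are centred at (mean - downshift * sd) of the observed values
    with standard deviation (width * sd), mimicking low-abundance signals.
    """
    rng = rng or random.Random(0)  # fixed seed for a reproducible sketch
    observed = [v for v in log2_values if v is not None]
    mu, sd = mean(observed), stdev(observed)
    return [
        v if v is not None else rng.gauss(mu - downshift * sd, width * sd)
        for v in log2_values
    ]

def log2_fold_change(group_a, group_b):
    """Difference of mean log2 intensities (group_a minus group_b)."""
    return mean(group_a) - mean(group_b)

# Hypothetical raw LFQ intensities for one sample; 0 marks a missing value.
raw = [4.0e6, 1.6e7, 0, 8.0e6, 3.2e7]
log2_sample = [math.log2(x) if x > 0 else None for x in raw]
filled = impute_downshifted(log2_sample)

# A two-fold cut-off corresponds to an absolute log2 fold-change above 1.
is_dep = abs(log2_fold_change([24.0, 24.4], [22.8, 23.0])) > 1
```

Because the imputed values are drawn from the low end of the intensity distribution, a protein detected in only one group tends to appear as a large fold-change, which is the intended behaviour of this type of imputation.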
Enrichment analysis using gene ontology and network analysis. A gene ontology (GO) search was performed to explore the biological processes and cellular components in TF and LF associated with DED. KEGG pathway mapping was also performed using DAVID freeware 49. GO biological processes enriched in the DEPs were identified as those with a p-value < 0.05. To construct a network depicting the enriched processes, we selected DEPs involved in the enriched biological processes. To reconstruct the network model for the DEPs, we collected protein-protein interactome information from the STRING 9.1 public database 50. The network model was built from the sorted DEPs and interactome data using Cytoscape.
Transition selection and LC-MRM analysis. Unique peptides in the human database were selected to represent the target proteins according to the following criteria: fully tryptic peptides with no missed cleavages, unique to a particular protein, and with a length between 6 and 30 amino acids. To quantify the proteins, at least two peptides and three product ions were selected. Optimized collision energies from the SRMAtlas database 31 were applied in our dynamic MRM (dMRM) analysis. To validate the existence of the target transitions, the selected transitions of the target proteins were tested with several MRM scans, and transitions with peak areas greater than 1000 were retained as final transitions. The final transitions were analyzed using digested peptides in dMRM mode. Digested peptides were separated on an RP-HPLC column (150 × 2.1 mm ID, Agilent Zorbax Eclipse Plus C18 Rapid Resolution HD, 1.8 μm particles) fitted to an Agilent 1290 LC system, and dMRM data were acquired on an Agilent 6490 triple-quadrupole mass spectrometer controlled by Agilent's MassHunter Workstation software (v.B.06.00).
The HPLC gradient was as follows: it started at 5% solvent B (90% ACN, 0.1% formic acid) for 2 min, increased to 30% solvent B over 36 min and then to 40% solvent B over 4 min, followed by a steep increase to 80% solvent B within 2 min. After 80% solvent B was held for 4 min, the column was equilibrated with 5% solvent B for 12 min. A post-column equilibration time of 4 min was used after every sample analysis. The following parameters were used for MRM acquisition: 3500 V capillary voltage, 300 V nozzle voltage, 11 L/min sheath gas flow at a temperature of 250 °C (ultra-high-purity nitrogen), 15 L/min drying gas flow at a temperature of 150 °C (ultra-high-purity nitrogen), 30 psi nebulizer gas pressure (ultra-high-purity nitrogen), 380 V default fragmentor voltage, 5 V cell accelerator potential, and wide (1.2 Da full-width-at-half-maximum) and unit (0.7 Da full-width-at-half-maximum) resolution in the first and third quadrupoles, respectively. To acquire a sufficient number of data points, dwell times for each transition were set between 6.55 and 248.88 ms, with the number of transitions being the maximum that could be monitored in a given 1000-ms cycle.
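The peptide selection criteria for transition design (fully tryptic, no missed cleavages, unique to one protein, 6 to 30 residues) can be sketched with a simple in silico digestion. This is a minimal illustration that assumes the common trypsin rule of cleavage after K or R except when followed by P; the function names are illustrative and the two toy sequences are hypothetical, not real proteins.

```python
import re
from collections import Counter

def tryptic_peptides(sequence):
    """Cleave after K or R unless followed by P; no missed cleavages."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def candidate_peptides(proteins, min_len=6, max_len=30):
    """Return {protein: peptides} that are unique to one protein and in range."""
    digests = {name: set(tryptic_peptides(seq)) for name, seq in proteins.items()}
    # Count in how many proteins each peptide sequence occurs.
    counts = Counter(p for peps in digests.values() for p in peps)
    return {
        name: sorted(
            p for p in peps
            if min_len <= len(p) <= max_len and counts[p] == 1
        )
        for name, peps in digests.items()
    }

# Hypothetical toy sequences sharing one peptide (EDPEPTIDEK).
toy_proteins = {
    "protA": "AGSLTEYKRPDVNQLIERSHAREDPEPTIDEK",
    "protB": "MSHARWWGGLLNNKEDPEPTIDEK",
}
candidates = candidate_peptides(toy_proteins)
```

In this toy example the shared peptide is discarded because it cannot distinguish the two proteins, and peptides shorter than six residues are dropped; a target protein would then need at least two surviving peptides to satisfy the quantification rule stated above.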

Automated Peptide Sample Preparation for LC-MRM analysis.
Data Analysis and Statistics. Peak areas were extracted using Skyline (version 3.7) 52. Savitzky-Golay smoothing was applied to improve the quality of the chromatograms. Peptide peak areas across the 204 runs were normalized to a β-galactosidase peptide (APLDNDIGVSEATR, 729.36 m/z (Q1) → 563.28 m/z (Q3), CV = 16.58%) to correct for experimental variation. The best transition was selected for quantification on the basis of intensity and consistency. Independent t-tests and receiver operating characteristic (ROC) analyses were conducted using SPSS version 21.0 (IBM Corp., Armonk, NY, USA) to determine the significance of the target proteins.
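Normalization to the reference peptide can be sketched as follows. The text does not specify the exact formula, so this minimal illustration assumes a simple ratio of each peptide area to the area of the reference peptide (APLDNDIGVSEATR) in the same run; the target peptide name and all area values are hypothetical.

```python
REFERENCE = "APLDNDIGVSEATR"  # reference peptide named in the text

def normalize_to_reference(run_areas, reference=REFERENCE):
    """Divide every peptide area in one run by the reference peptide area."""
    ref_area = run_areas[reference]
    if ref_area <= 0:
        raise ValueError("reference peptide area must be positive")
    return {pep: area / ref_area for pep, area in run_areas.items()}

# Hypothetical peak areas from two runs with different overall signal.
# "TARGETPEPTIDE" is a placeholder, not a real target sequence.
runs = [
    {REFERENCE: 2.0e5, "TARGETPEPTIDE": 4.0e4},
    {REFERENCE: 1.0e5, "TARGETPEPTIDE": 2.0e4},
]
normalized = [normalize_to_reference(run) for run in runs]
```

Although the raw areas of the placeholder peptide differ two-fold between the runs, the normalized values are equal, which is how this type of correction removes run-to-run variation in injection or instrument response.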