Rear 4-min Schirmer test, a modified indicator of Schirmer test in diagnosing dry eye

This study aimed to investigate the reliability and efficacy of the rear 4-min Schirmer test as a supplementary indicator for assessing tear secretion and diagnosing dry eye. A total of 180 participants were enrolled. Schirmer test I without anaesthesia was performed once on both eyes to obtain the value of the conventional Schirmer test, and tear secretion was recorded at each minute. Other examinations included the ocular surface disease index (OSDI), the standard patient evaluation of eye dryness (SPEED), fluorescein staining, tear film break-up time (BUT), and Meibomian gland (MG) secretion grading. The participants were divided into a dry eye (DE) group and a non-dry eye (ND) group. In the DE group, the values of the 2-min Schirmer test, rear 3-min Schirmer test, rear 4-min Schirmer test, and 5-min Schirmer test were 5.36 ± 4.63, 5.57 ± 2.11, 7.21 ± 4.13, and 10.93 ± 6.30, respectively. In the ND group, the corresponding values were 8.25 ± 6.80, 2.73 ± 2.31, 7.36 ± 3.42, and 11.84 ± 6.16. The rear 4-min Schirmer test correlated significantly with OSDI and SPEED in the DE group (r = −0.242/−0.183) and in the ND group (r = −0.316/−0.373). The rear 4-min Schirmer test also had a stronger correlation with fluorescein BUT (fBUT; r = 0.159) and MG secretion (r = −0.162) in the DE group, and it had higher accuracy in diagnosing severe DE and borderline DE. In conclusion, the rear 4-min Schirmer test may be a supplementary indicator for assessing tear secretion and diagnosing DE.

www.nature.com/scientificreports/
make up for the partial shortcomings of classical ST. Our purpose was to investigate the reliability and efficacy of the rear 4-min Schirmer test to help ophthalmologists diagnose DE.

Materials and methods
Subjects. This prospective study was performed at the outpatient clinic of the Second Affiliated Hospital of Zhejiang University School of Medicine in September 2019. Volunteers were recruited by convenience sampling: participants attending the outpatient department of a teaching hospital attached to a medical college were enrolled in the study. Sample size was estimated using the multiple rate comparison method in PASS version 15, and the final sample size allowed for a 10% dropout rate. All participants were divided into DE and non-dry eye (ND) groups. The criteria for DE diagnosis were OSDI > 13 points and fBUT < 10 s; participants not meeting these criteria were classified into the ND group. Both eyes of each participant were included in this study. The exclusion criteria were as follows: age < 18 years; current pregnancy; eye allergies, conjunctival inflammation, corneal ulcer, eyelid inflammation, palsy, valgus, and other ocular lesions; history of wearing contact lenses within 30 days; history of intraocular operation within 6 months; severe blepharitis; or severe systemic disease. This study followed the tenets of the Declaration of Helsinki and was approved by the Ethics Committee of the Second Affiliated Hospital of Zhejiang University School of Medicine (NO. 2019-307). All participants provided written informed consent after an explanation of the nature and possible consequences of the study.
DE symptom questionnaires. Ocular surface disease index. The ocular surface disease index (OSDI) is a validated dry eye questionnaire that measures the severity of DE, symptoms, functional problems, and environmental triggers over the past week 17. The OSDI has been regarded as an established standard questionnaire 4. Each of the 12 OSDI questions was graded on a scale of 0-4, and the total OSDI score ranged from 0 to 100. The results were interpreted as follows: normal (scores 0-12) and dry eye (scores 13-100) 18. This study used the Chinese version, which was validated in previous studies 19,20.
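As an illustration of the scoring and grouping described above, the following is a minimal Python sketch. The scaling `sum × 25 / number of questions answered` is the commonly published OSDI formula and is an assumption here, since this section only states the 0-4 grading and the 0-100 range. The function names `osdi_score` and `assign_group` are hypothetical.

```python
from typing import Optional, Sequence


def osdi_score(answers: Sequence[Optional[int]]) -> float:
    """Return the OSDI total (0-100) from up to 12 answers graded 0-4.

    None marks an unanswered question. Assumption: the standard OSDI
    scaling (sum * 25 / number answered), which this section does not spell out.
    """
    answered = [a for a in answers if a is not None]
    if not answered:
        raise ValueError("at least one question must be answered")
    if any(a < 0 or a > 4 for a in answered):
        raise ValueError("each answer must be graded 0-4")
    return sum(answered) * 25 / len(answered)


def assign_group(osdi: float, fbut_seconds: float) -> str:
    """Apply the study's grouping rule: DE if OSDI > 13 and fBUT < 10 s, else ND."""
    return "DE" if osdi > 13 and fbut_seconds < 10 else "ND"


if __name__ == "__main__":
    score = osdi_score([2] * 12)
    print(score, assign_group(score, fbut_seconds=6.0))
```

Note that the grouping rule requires both conditions, so a participant with a high OSDI score but fBUT of 10 s or more falls into the ND group under this sketch.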
Standard patient evaluation of eye dryness questionnaire. The standard patient evaluation of eye dryness (SPEED) questionnaire was administered to grade the level of DE symptomatology 21. The SPEED questionnaire is also useful for assessing dry eye symptoms in a nonclinical sample 22. The SPEED score is derived by summing the scores from the frequency and severity parts of the questionnaire, which cover the past 3 months. The frequency and severity values were obtained by summing the scores of the eight items (each rated from 0 to 4), and the total SPEED score ranged from 0 to 28. The results were interpreted as follows: normal (score 0) and dry eye (scores 1-28). We used a Chinese-language version validated in a previous study 23.

DE symptom examinations. ST I.
ST was performed once without topical anaesthesia, and both eyes were evaluated at the same time. The filter paper (Showa Yakuhin Kako, Tokyo, Japan) was folded and placed between the lower eyelid and the globe at the junction between the middle and lateral thirds of the eyelid. The participants were asked to close their eyes, as ST with closed eyes may be more reliable in DE 24. The test lasted 5 min, and the length of the wetted paper was read directly off the scale on the paper. Wetting was measured at 1, 2, 3, 4, and 5 min. Participants were excluded from the study if any of the scores exceeded 30 mm (the whole paper was wetted), because the amount of wetting could not then be measured exactly. The results were interpreted as follows: ≤ 5 mm, severe DE; > 5 mm and ≤ 10 mm, borderline DE; and > 10 mm, normal tear secretion 25. We observed multiple indicators of ST, as follows: (1) ST1: The value of the normal (5-min) ST.
Tear film BUT. Tear film instability was assessed using BUT measurements, which included fluorescein tear film break-up time (fBUT) and noninvasive tear film break-up time (NIBUT). fBUT: Fluorescein was added to the tear film of both eyes, and the patients were asked to blink several times to ensure uniform distribution. The time from the last blink to the appearance of the first dry spot on the tear film was measured. Three consecutive measurements were made, and their average was recorded. NIBUT: NIBUT was measured using the Keratograph 5M (Oculus Optikgerate GmbH, Wetzlar, Germany). The participants placed their lower jaw on the jaw rest and kept both eyes open. After successful focusing, they were asked to blink twice and look at the red point in the centre, keeping the eye open as long as possible until the next blink. The device then automatically detected the tear film distribution map and displayed the measured values. BUT was categorised as moderate (5-10 s) or severe (< 5 s) 26,27.
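To make the Schirmer indicators used in this study concrete, here is a minimal Python sketch that derives them from the five per-minute readings and applies the ST1 interpretation given in the ST I description. The definitions of the "rear" values are an assumption inferred from the text (the 5-min reading minus the 2-min or 1-min reading); the surviving text does not give them explicitly. The names `st_indicators` and `classify_st1` are hypothetical.

```python
from typing import Dict, Sequence


def st_indicators(wetting_mm: Sequence[float]) -> Dict[str, float]:
    """Derive four ST indicators from cumulative wetting (mm) read at 1-5 min.

    Assumption: rear 3-min ST = 5-min reading - 2-min reading, and
    rear 4-min ST = 5-min reading - 1-min reading.
    """
    if len(wetting_mm) != 5:
        raise ValueError("expected readings at 1, 2, 3, 4 and 5 min")
    if any(later < earlier for earlier, later in zip(wetting_mm, wetting_mm[1:])):
        raise ValueError("cumulative wetting cannot decrease over time")
    one_min, two_min, _, _, five_min = wetting_mm
    return {
        "st1": five_min,                    # conventional 5-min ST
        "st_2min": two_min,                 # reading at 2 min
        "rear_3min": five_min - two_min,    # wetting during minutes 3-5
        "rear_4min": five_min - one_min,    # wetting during minutes 2-5
    }


def classify_st1(st1_mm: float) -> str:
    """Interpret the 5-min ST: <= 5 mm severe, <= 10 mm borderline, else normal."""
    if st1_mm <= 5:
        return "severe DE"
    if st1_mm <= 10:
        return "borderline DE"
    return "normal"


if __name__ == "__main__":
    indicators = st_indicators([4, 6, 8, 9, 11])
    print(indicators, classify_st1(indicators["st1"]))
```

Under these assumptions, only the 1-min and 5-min readings are needed to obtain the rear 4-min ST.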
Meibomian gland secretion quality grading. The quality of the Meibomian gland (MG) secretion was assessed using a Meibomian Gland Evaluator (TearScience, Morrisville, NC, USA). A total of 15 Meibomian glands (in the nasal, middle, and temporal parts of the lower tarsus) were selected. The quality of meibum was graded as follows: 0, clear fluid; 1, thick and cloudy dropout; 2, inspissated and congealed dropout with the consistency of toothpaste; and 3, no dropout. The total scores ranged from 0 to 45.
CFS score. For corneal fluorescein staining (CFS), the participants' cornea was divided equally into five regions, and the score for each region was recorded after fluorescein staining, as follows: 0, no punctate staining; 1, fewer than 30 stained points; 2, more than 30 stained points but no fusion; and 3, entirely stained with fusion. The total score across the five regions ranged from 0 to 15.

Sequence of DE tests.
Participants completed the tests in the following order. First, each participant signed informed consent forms, completed the OSDI and SPEED questionnaires, and provided general information. Second, a Keratograph 5M measurement was performed to obtain the NIBUT. Third, fBUT, CFS, and Schirmer I tests were performed. Finally, MG secretion was measured using the MG evaluator. An intertest interval of ≥ 10 min was set, during which the patients were asked to rest with their eyes closed.
Statistics. All statistical analyses were performed using SPSS (version 24, IBM, Armonk, NY, USA) for Mac. A Kolmogorov-Smirnov test and P-P plot were used to assess the normality of the continuous variables. An independent samples t test was used to compare the parameters of the ST indicators and the demographic characteristics between the two groups. The chi-square test was used to compare the sex distribution between the two groups. Pearson's rank-order correlation was used to identify the correlation between ST, BUT, OSDI, SPEED, MG secretion grading, and CFS. A one-way ANOVA test was used to compare the values of the four ST indicators.
To investigate the reliability and efficacy of the four ST indicators, the true positive rate and true negative rate were calculated, and the chi-square test was used to compare the efficacy of the four indicators. The true positive rate is equal to true positive/(true positive + false negative) × 100%, that is, the proportion of patients who actually have dry eye who are judged to have dry eye according to the test criterion (tear secretion less than the cut-off). The true negative rate is equal to true negative/(true negative + false positive) × 100%, that is, the proportion of persons who are actually normal who are judged to be normal according to the test criterion (tear secretion more than the cut-off). The figures were created using SPSS and GraphPad Prism, version 8 (San Diego, CA). Data are presented as mean ± standard deviation. p < 0.05 was considered statistically significant.
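The two rates defined above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SPSS procedure: a test is taken as positive when the indicator is strictly below the cut-off, a value exactly at the cut-off is counted as negative, and the names `below_cutoff` and `tpr_tnr` are hypothetical.

```python
from typing import List, Sequence, Tuple


def below_cutoff(values_mm: Sequence[float], cutoff_mm: float) -> List[bool]:
    """Flag each value as test-positive when it is strictly below the cut-off.

    Assumption: a value exactly equal to the cut-off counts as test-negative.
    """
    return [value < cutoff_mm for value in values_mm]


def tpr_tnr(de_test_positive: Sequence[bool],
            nd_test_positive: Sequence[bool]) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate) as percentages.

    de_test_positive: test result for each eye in the dry eye group.
    nd_test_positive: test result for each eye in the non-dry eye group.
    """
    tp = sum(de_test_positive)
    fn = len(de_test_positive) - tp
    fp = sum(nd_test_positive)
    tn = len(nd_test_positive) - fp
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both groups must contain at least one eye")
    return tp / (tp + fn) * 100, tn / (tn + fp) * 100


if __name__ == "__main__":
    de_flags = below_cutoff([4.0, 5.5, 3.0, 8.0], cutoff_mm=6.0)
    nd_flags = below_cutoff([9.0, 5.0], cutoff_mm=6.0)
    print(tpr_tnr(de_flags, nd_flags))
```

Keeping the cut-off comparison separate from the rate calculation makes it easy to evaluate several indicators, each with its own cut-off, against the same group labels.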

Results
Participants' demographics and baseline characteristics. We included 360 eyes of 180 participants; of them, 240 eyes of 120 participants had DE and 120 eyes of 60 participants did not. Table 2. We also recorded the ST results at each minute. During the ST, we noticed a pattern in tear secretion: the mean wetting speed per minute slowed as the test progressed. The speed was approximately 1.25-4.56 mm/min in both the DE and ND groups (Fig. 2). The wetting speed in the first minute was unstable and varied unpredictably among participants. Therefore, we focused on the other ST indicators that removed the data from the first minute. The value of the rear 4-min ST was approximately 3/5 of ST1. Table 3 presents the details. Both ST1 and the rear 4-min ST were related to the other indicators. As expected, ST had a weak correlation with the questionnaire scores. However, the rear 4-min ST had a higher correlation with OSDI and SPEED than ST1.
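The observation above that the rear 4-min ST is roughly 3/5 of ST1 can be checked against the group means reported in the abstract. The short Python sketch below does only that arithmetic; it uses ratios of group means, which is an approximation and not necessarily how the authors derived the figure. The names `ratio_to_st1` and `GROUP_MEANS_MM` are hypothetical.

```python
def ratio_to_st1(rear_4min_mean_mm: float, st1_mean_mm: float) -> float:
    """Return the rear 4-min ST mean as a fraction of the ST1 (5-min) mean."""
    return rear_4min_mean_mm / st1_mean_mm


# Group means (rear 4-min ST, 5-min ST) as reported in the abstract.
GROUP_MEANS_MM = {"DE": (7.21, 10.93), "ND": (7.36, 11.84)}

ratios = {
    group: round(ratio_to_st1(rear, st1), 2)
    for group, (rear, st1) in GROUP_MEANS_MM.items()
}

if __name__ == "__main__":
    print(ratios)  # {'DE': 0.66, 'ND': 0.62}
```

Both ratios are close to 3/5. Scaling the conventional 10 mm/5 min cut-off by 3/5 gives 6 mm, which matches the rear 4-min threshold used later in the paper, although the text does not state that the threshold was derived this way.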

Table. Relationship and value (mean ± SD) for ST1, 2-min ST, rear 3-min ST, and rear 4-min ST.
the true negative rate (TNR) in the ND group. The TNR of ST1, 2-min ST, rear 3-min ST, and rear 4-min ST was 41.67% (50 eyes), 39.17% (47 eyes), 40.00% (48 eyes), and 45.00% (54 eyes), respectively. The TPR of the four indicators showed a significant difference (p < 0.001). Based on the above results, we chose one indicator, the rear 4-min ST, as a reliable supplement to ST. Of the participants whose rear 4-min ST was lower than 6 mm, 74.62% had tear secretion of less than 10 mm in 5 min. To provide a reference for clinical diagnosis, we divided the patients into severe DE (ST ≤ 5 mm) and borderline DE (ST 5-10 mm). The test values in severe and borderline dry eye are given in Supplementary Table 2. We compared the rear 4-min ST with ST1 in subjective questionnaires and objective examinations. As shown in Table 5, the correlation between the rear 4-min ST and the other examinations was stronger in patients with severe DE than in patients with borderline DE. This suggests that, in patients with severe DE, DE may be more easily diagnosed through the rear 4-min ST than through the classical ST.

Discussion
No consensus exists on the diagnostic criteria for DE in different areas, even though the International Dry Eye Workshop proposed criteria for DE diagnosis. Most ocular examinations are imperfect means of diagnosing DE 28 and suffer from low sensitivity and specificity in detecting symptoms. One of the major reasons may be the heterogeneity of DE. Moreover, no single existing test reliably diagnoses DE, and two or three tests are often required, making the process time and labor intensive. Therefore, it is crucial to simplify the process and improve the accuracy of the tests. ST is one of the most commonly used tests for assessing tear production because of its low cost and simplicity. However, despite improvements to the ST over decades, its cut-off value for diagnosing DE has not yet been unified. Jones 12 held that the cut-off for the basal ST should be set at 10 mm/5 min, whereas Van Bijsterveld 29 held that a wetting length of less than 5.5 mm should be defined as deficient aqueous production. A cut-off of 5.0 mm/5 min is recommended as meaningful 4,30. Even though ST without topical anaesthesia may cause patients discomfort, we think that reflex tear secretion is an indispensable indicator.
ST is easily influenced by many factors, such as reflex tearing, paper irritation, temperature, and humidity 31,32. Therefore, doctors have strived to improve the accuracy of ST in clinical practice. Schulze suggested replacing the ST with strip meniscometry because the procedure is rapid (5 s per eye) 33. ST without topical anaesthesia may induce reflex tearing, and reflex tear secretion is unstable. To reduce irritation from the filter paper, a new strip named K-Schirmer was developed in Korea, and its reliability in repeated measurements was higher than that of ST 34. Vasileios Karampatakis compared the 2- and 5-min tests and evaluated the reliability and tear secretion of the Schirmer test I at 2 min 15. He noticed that the speed of paper wetting slowed over time, and we observed a similar pattern. During the initial 1 min, the speed of tear production was faster than in the next minute; this phenomenon suggested an adaptation of the central nervous system 16. Another reason was that a few residual tears may remain in the lower fornix. The doctor should therefore make sure that no tears remain before the test. However, we realised that the technique and process of removing these tears were not convenient for doctors or patients. In our study, we performed the ST once as usual and recorded the value at each minute to determine which elements may be more reliable for assessing tear secretion. Our study included both patients and normal people, so we had enough samples to verify whether our selected indicator really worked. ST has long been performed only once in clinical practice. A single test may impair the accuracy of the results and the validity of the test. To increase the accuracy of examinations, more than one indicator should be considered.
Telles suggested measuring several dynamic wetting lengths when gauging human tear production 35. Therefore, we selected four indicators: ST1, 2-min ST, rear 3-min ST, and rear 4-min ST. In our study, the ST values correlated with the OSDI and SPEED scores, although the correlations were weak. This was consistent with a previous report 36. We also evaluated ST against other examinations, such as BUT, MG secretion grading, and CFS. BUT is considered the most common measure of tear film stability 37. MG secretion grading is used to assess the extent of gland obstruction 38. The fluorescein staining score assesses the condition of the cornea. Similar to the results above, ST was correlated with the fBUT and MG secretion grading in the DE group. Besides ST1, we noticed an interesting phenomenon: the rear 4-min ST was correlated with the fBUT and MG secretion grading in the DE group, and it was also correlated with the fBUT and MG secretion grading in the ND group. Finally, the rear 4-min ST had a higher TPR in diagnosing DE. This suggests that, for participants in our study regardless of group, the rear 4-min ST may be a reliable indicator for assessing tear secretion. The reasons may include adaptation to irritation, a stable speed of tear secretion, and reduced eye pain in patients. For doctors, it takes only a few seconds to record the ST values at the first and fifth minutes, which can provide reliable data for evaluating tear secretion. However, some participants may be sensitive to the paper and have massive tear secretion. This measurement does not suit these people, who need other tests.
In conclusion, our study found that the rear 4-min ST seemed to be a better assessment of tear production. Our data indicate that a rear 4-min ST < 6 mm and a 5-min ST < 10 mm likely indicate DE. One of the limitations of this study is that we did not account for the drainage and evaporation of tears.