Reliability and efficacy of maximum fluorescein tear break-up time in diagnosing dry eye disease

This study aims to investigate the reliability and efficacy of maximum fluorescein tear break-up time (FTBUTmax) in diagnosing dry eye disease (DED). A total of 147 participants were enrolled in this study. Ocular symptoms were assessed by the Ocular Surface Disease Index (OSDI). The fluorescein tear break-up time (FTBUT) examination, corneal fluorescein staining (CFS), and the Schirmer I test were performed on both eyes. Each participant underwent three consecutive FTBUT tests, and five types of FTBUT values were recorded: FTBUTmax, the minimum FTBUT (FTBUTmin), the first FTBUT (FTBUT1), the average of the three FTBUTs (FTBUT123) and the average of the first and second FTBUT (FTBUT12). FTBUTmax was larger than the other FTBUT values, but no differences were found among FTBUT1, FTBUT123, FTBUT12 and FTBUTmin. In the ROC analysis, FTBUTmax had the largest or second largest area under the ROC curve (AUROC) under all three DED diagnostic criteria, while FTBUTmin had the smallest AUROC. The ROC efficacy of FTBUTmax was significantly higher than that of FTBUT123, FTBUT12, FTBUT1 and FTBUTmin under the OSDI criterion, and higher than that of FTBUT1 and FTBUTmin under the Schirmer I test and CFS criteria. FTBUTmax correlates closely with OSDI, the Schirmer I test and CFS, and is an effective tool for DED diagnosis.

Dry eye disease (DED) is considered a complicated ocular disorder, with symptoms affecting 5-30% of the population worldwide [1][2][3]. The International Dry Eye Workshop (2017) defines DED as "a multifactorial disease of the ocular surface characterized by a loss of homeostasis of the tear film, and accompanied by ocular symptoms, in which tear film instability and hyperosmolarity, ocular surface inflammation and damage, and neurosensory abnormalities play etiological roles" 4. Several approaches, including staining of the cornea and conjunctiva, the Schirmer test, tear osmolarity examination and symptom questionnaires, have been introduced to detect DED 5, but their reliability and precision in DED diagnosis remain a challenge. According to previous research, measurement of tear break-up time (TBUT) has been the most widely adopted dry eye examination because of its convenience and efficacy 6,7.
TBUT was first introduced by Norn 8 to assess the stability of the tear film. It is traditionally defined as "the interval between the last complete blink and the first appearance of a dry spot or disruption in the tear film" 9. Over the past decades, fluorescein tear break-up time (FTBUT), measured with a slit-lamp microscope and fluorescein staining, has been extensively applied to DED diagnosis. However, its reliability is affected by various operational and environmental factors, including inter- and intra-observer bias, fluorescein concentration, incomplete blinks, uneven distribution of the tear film, air humidity and ambient temperature [10][11][12]. Still, FTBUT remains the most widely used test for detecting DED, especially in disciplines that overlap with ophthalmology, such as rheumatology- and gynecology-related DED, because of the limited use of non-invasive tear break-up time worldwide. Therefore, improving existing FTBUT methods and exploring new statistical approaches to FTBUT are warranted. Instead of using the single reading proposed by Norn, Gu et al. 13 proposed assessing tear film stability and dry eye severity with the average of two or three consecutive FTBUT readings. Interestingly, in our clinical experience, the longest FTBUT (FTBUTmax) among three consecutive measurements appeared to perform well in clinical DED diagnosis and treatment.
To determine which statistical type of FTBUT value is most applicable to DED diagnosis, five different FTBUT values were examined in our study: the maximum of three consecutive FTBUT values (FTBUTmax), the minimum of three consecutive FTBUT values (FTBUTmin), the first value (FTBUT1), the average of the three readings (FTBUT123) and the average of the first and second FTBUT values (FTBUT12). It was found

Classification and Evaluation of DED. Because FTBUT values were the target of this study, they could not be used to define and diagnose DED. Therefore, we chose three other parameters for DED detection: the Ocular Surface Disease Index (OSDI), corneal fluorescein staining (CFS) and the Schirmer I test. Specifically, an OSDI score > 13 points, a CFS score > 0 points or a Schirmer I test result < 10 mm was used as the dry eye diagnostic criterion according to Kim

FTBUT. FTBUT examination was conducted by an ophthalmologist with more than 20 years of clinical experience. FTBUT values were measured using commercially available sterile fluorescein paper strips (Jinming New Technological Development Co. Ltd., Tianjin, China). Briefly, approximately 5 μL (one drop) of normal saline was applied to the strip, which was then shaken to remove excess liquid and minimize the volume of fluorescein fluid. Afterwards, the strip was gently touched to the inferior temporal bulbar conjunctiva for 1-2 s. Participants were asked to blink three times naturally to distribute the fluorescein uniformly over the ocular surface. The time from the last blink to the appearance of the first dry spot on the tear film was measured under a cobalt-blue filter. Three consecutive measurements were recorded at intervals of 30 s. The two eyes were observed separately.
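As an illustration of the five FTBUT values defined above, the following minimal Python sketch derives them from three consecutive readings of one eye. The function name `ftbut_summaries` and the example readings are hypothetical and are not taken from the study data.

```python
from statistics import mean


def ftbut_summaries(readings):
    """Derive the five FTBUT values from three consecutive readings (seconds).

    `readings` must be ordered as measured: first, second, third.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three consecutive FTBUT readings")
    first, second, _third = readings
    return {
        "FTBUTmax": max(readings),          # longest of the three readings
        "FTBUTmin": min(readings),          # shortest of the three readings
        "FTBUT1": first,                    # first reading only
        "FTBUT123": mean(readings),         # average of all three readings
        "FTBUT12": mean([first, second]),   # average of the first two readings
    }


# Hypothetical readings for one eye, in seconds.
example = ftbut_summaries([4.0, 7.0, 5.5])
```

By construction, FTBUTmax is never smaller than any of the other four values for the same eye.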
Corneal fluorescein staining. Corneal fluorescein staining was assessed immediately after FTBUT testing with the same fluorescein strips. Corneal staining was evaluated under a yellow filter using the Oxford scale. Scores above 0 points were regarded as positive.

Schirmer I test. A sterile 5 mm × 30 mm Schirmer strip was gently inserted between the middle and lateral thirds of each lower lid margin. Participants were then instructed to close their eyes gently. Five minutes later, the wetted length of the strip was recorded in millimeters.
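The three reference criteria stated in the classification section (OSDI score > 13 points, CFS score > 0 points, Schirmer I test < 10 mm) can be expressed as a small Python sketch. Each criterion is kept as a separate label, since the ROC analysis treats them as three separate diagnostic criteria. The function name and the example values are hypothetical.

```python
def dry_eye_labels(osdi_score, cfs_score, schirmer_mm):
    """Return one DED label per reference criterion for a single eye.

    Thresholds follow the criteria stated in the text:
    OSDI > 13 points, CFS > 0 points, Schirmer I < 10 mm.
    """
    return {
        "OSDI": osdi_score > 13,
        "CFS": cfs_score > 0,
        "Schirmer I": schirmer_mm < 10,
    }


# Hypothetical eye: symptomatic by OSDI, no corneal staining, normal Schirmer I.
labels = dry_eye_labels(osdi_score=20, cfs_score=0, schirmer_mm=12)
```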

Statistics. All statistical analyses were performed with SPSS for Windows, version 25 (IBM, Chicago, IL, USA) and MedCalc for Windows, version 19 (MedCalc Software, Ostend, Belgium). Figures were created with SPSS and GraphPad Prism, version 8 (San Diego, CA). Descriptive statistics were summarized as mean ± standard deviation (SD). A Kolmogorov-Smirnov test was used to assess the normality of continuous variables. Raw FTBUT data were logarithmically transformed before statistical analysis. FTBUT values were compared by one-way ANOVA. The receiver operating characteristic (ROC) curve, area under the ROC curve (AUROC), cutoff point, sensitivity, specificity and Youden index were calculated. Correlations between FTBUT values and other dry eye parameters were explored using Pearson's correlation coefficient (r) on the logarithmic data. All tests were two-sided, and a p value < 0.05 was considered statistically significant.

Table 3. When DED was defined by the Schirmer I test, the ROC efficacy of FTBUTmax was significantly higher than that of FTBUT1 and FTBUTmin, but not different from that of FTBUT123 and FTBUT12. When DED was defined by OSDI, the ROC efficacy of FTBUTmax was significantly higher than that of FTBUT123, FTBUT12, FTBUT1 and FTBUTmin. When DED was defined by CFS, the ROC efficacy of FTBUTmax was significantly higher than that of FTBUT12, FTBUT1 and FTBUTmin, but not different from that of FTBUT123.
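For readers unfamiliar with the ROC quantities named in the Statistics section, the sketch below shows one common way to compute an AUROC and a Youden-index cutoff in plain Python. It is an illustration only, not the SPSS or MedCalc procedure used in the study: the function names and toy data are hypothetical, it assumes that shorter FTBUT indicates DED, and it does not reproduce the significance tests used to compare AUROCs.

```python
def auroc_lower_is_positive(values, labels):
    """AUROC as the probability that a DED eye has a shorter FTBUT than a
    non-DED eye (ties count as one half). Assumes shorter FTBUT indicates DED."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    if not pos or not neg:
        raise ValueError("both DED and non-DED eyes are required")
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_youden_cutoff(values, labels):
    """Return (cutoff, J, sensitivity, specificity) maximizing the Youden
    index J = sensitivity + specificity - 1, calling an eye positive when
    its FTBUT is <= cutoff. The smallest cutoff wins on ties."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    if not pos or not neg:
        raise ValueError("both DED and non-DED eyes are required")
    best = None
    for cutoff in sorted(set(values)):
        sensitivity = sum(v <= cutoff for v in pos) / len(pos)
        specificity = sum(v > cutoff for v in neg) / len(neg)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Toy data (hypothetical): FTBUT in seconds and DED status by one criterion.
ftbut = [3, 4, 5, 6, 8, 10]
is_ded = [True, True, False, True, False, False]

auc = auroc_lower_is_positive(ftbut, is_ded)
cutoff, youden_j, sens, spec = best_youden_cutoff(ftbut, is_ded)
```

In this toy example, 8 of the 9 DED/non-DED pairs are ordered correctly, so the AUROC is 8/9, and the Youden index is maximized at a cutoff of 4 s.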

Correlations between FTBUT values and other dry eye examinations. Pearson's correlation analysis was performed on both the logarithmic data and the raw data. According to Table 4

Discussion
In this study, the reliability and efficacy of FTBUTmax were investigated by comparing it with four other types of FTBUT values. All five types of FTBUT values were derived from three repeated FTBUT measurements. According to our results, FTBUTmax tended to be a reliable and effective parameter in DED diagnosis, as it had the largest or second largest AUROC under all three DED diagnostic criteria: the Schirmer I test, the OSDI score and CFS. In addition, the ROC results for FTBUTmax were consistent with its Pearson's correlation coefficients (r), which supports its reliability. The first and minimum FTBUT values, by contrast, appeared to be biased, given that their AUROC values were lower than those of the other types of FTBUT values.
Previous studies have questioned the reliability of FTBUT results because the instillation of fluorescein fluid may reduce the stability of the tear film 17,18. To limit the negative impact of fluorescein fluid, we shook the wet sodium fluorescein strip during the test to minimize the volume of fluid instilled into the eyes and to reduce irritation of the ocular microenvironment 19. As FTBUT was measured within 1-2 min, the influence of environmental factors was probably negligible. Close correlations were found between the five FTBUT values (FTBUTmax, FTBUTmin, FTBUT1, FTBUT12 and FTBUT123) and the dry eye examinations (Schirmer I test, OSDI and CFS), which supports the reliability of these FTBUT values in reflecting the severity of dry eye and its subjective symptoms.
The patient's blinking pattern has a great influence on tear film stability, because the completeness of a blink affects how the tear film re-forms 20. A complete blink induces the formation of the tear film. In this process, the tears from the upper and lower eyelids mix to form an aqueous layer while the Meibomian glands secrete lipids to generate a lipid layer 21. In an incomplete blink, however, the upper eyelid only partially covers the surface of the eye without making contact with the lower lid, so the aqueous layer, lipids and other secretions cannot spread fully over the ocular surface. Harrison et al. 22 found that subjects who often did not complete their blinks experienced a faster rate of tear film break-up and a shorter FTBUT. Our results imply that the first FTBUT value was more likely to be affected by incomplete blinks, partly because participants were not familiar with the fluorescein examination. This is consistent with the findings of Braun's study 23. In addition, unfamiliarity with the FTBUT examination and a lack of compliance may result in involuntary eye movements and thus less stable FTBUT values. Consequently, ophthalmologists might be misled by an unstable and biased FTBUT result and over-treat "patients" who are actually healthy 24.
Therefore, FTBUTmax, which reflects a relatively complete blink and a stable ocular environment between blinks, proved to be as reliable and effective as FTBUT123 and superior to FTBUTmin, FTBUT1 and FTBUT12 in diagnosing DED. Our study suggests that ophthalmologists should explain the full procedure and precautions clearly to patients before FTBUT examination. Several blinks and practice runs before the first examination might help reduce the error caused by unfamiliarity.
There are also some limitations in this study. Firstly, the lack of a "gold standard" for diagnosing DED made it difficult to identify "real" DED patients. FTBUT values themselves are considered one of the most important diagnostic examinations, which renders their sensitivity and specificity less reliable. Secondly, FTBUT values might be affected by observers and subjects. Camera recording of the FTBUT test and double-blind reading are recommended because they are more precise. However, such equipment was not available for our study. To improve the reliability of the FTBUT results, we relied on an experienced observer with over 20 years of clinical experience. Finally, it was hard to standardize the blink interval during FTBUT measurement. Different blink intervals and incomplete opening of the eyes have a complicated impact on the results.
In conclusion, we investigated the performance of a novel FTBUT statistical pattern, FTBUTmax, for tear film assessment in the detection of dry eye. Although the findings of our pilot study require validation by independent investigations, FTBUTmax seems to be a useful and reliable tool for dry eye diagnosis and the assessment of tear film stability, given that FTBUT measurement is still widely adopted in clinical practice.