Diagnostic Test Efficacy of Meibomian Gland Morphology and Function

Meibomian gland dysfunction (MGD) is the leading cause of dry eye, and proposed treatments are based on disease severity. Our purpose was to establish reliable morphologic measurements of meibomian glands for evaluating MGD severity. This retrospective, cross-sectional study included 100 MGD patients and 20 controls. The patients were classified into dry eye severity level (DESL) 1–4 based on symptoms and clinical parameters including tear-film breakup time, ocular staining and Schirmer I. Gland loss, length, thickness, density and distortion were analyzed. We compared the morphology between patients and controls and examined its correlations with meibum expressibility, meibum quality, and DESL. Relative to controls, the gland thickness, density and distortion were elevated in patients (p < 0.001 for all tests). The area under the receiver operating characteristic curve was 0.98 (95% confidence interval [CI], 0.96–1.0) for gland loss and 0.96 (CI 0.91–1.0) for gland distortion, with a cutoff value of six distorted glands yielding a sensitivity of 93% and a specificity of 97% for MGD diagnosis. Gland distortion was negatively correlated with meibum expressibility (r = −0.53; p < 0.001) and DESL (r = −0.22, p = 0.018). In conclusion, meibomian gland loss and distortion are valuable complementary clinical parameters for assessing MGD status.


Correlation between meibomian gland morphology and other clinical findings in MGD.
Reliability of measurements of meibomian gland morphology. Cohen's kappa and the intraclass correlation coefficient (ICC) were calculated to assess the reliability of the morphological quantifications. Cohen's kappa values were 0.84 and 0.81 for intraobserver and interobserver agreement, respectively, for the subjective meibograde. Similarly, the ICC values were between 0.81 and 0.89 for computerized MG dropout. Furthermore, the observers' ability to repeat and reproduce morphological quantifications of MG length, thickness, and density and of the number of distorted glands yielded ICC values ranging from 0.56 to 0.94 (Table 3), interpreted with reference to published agreement criteria [16][17][18]. Both the subjective meibograde and the computerized analysis of MG morphology demonstrated moderate to strong intra- and interobserver agreement.
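The agreement statistic for the subjective meibograde can be illustrated with a short sketch. The code below computes unweighted Cohen's kappa for two raters from first principles. It is an illustration only: the study used SPSS, the text does not state whether a weighted variant was applied or how the three observers were paired, and the function name and the example grades are hypothetical.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of images given the same grade by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over grades of the product of marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_chance = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical meibogrades (0-3) assigned by two observers to eight images.
observer_1 = [0, 1, 2, 3, 1, 0, 2, 3]
observer_2 = [0, 1, 2, 3, 1, 1, 2, 2]
kappa = cohens_kappa(observer_1, observer_2)
```

In this made-up example the observers agree on six of eight images, and kappa corrects that raw agreement for the agreement expected by chance.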

Discussion
The heterogeneous presentation of MGD complicates its detection and monitoring. Various treatments have been proposed based on MGD severity [15], but the lack of a universally accepted staging system for clinical severity makes severity-based treatment problematic. The Tear Film & Ocular Surface Society (TFOS) attempted to establish an MGD grading system focused on a limited set of clinical findings, including altered meibum expression, meibum quality, and ocular staining [15]. Such a grading system underestimates the significance of MG anatomical changes in progressive MGD.
Table 1. Comparison of morphologic and functional parameters between MGD patients and healthy controls. MG = meibomian gland. * Indicates significance after adjusting for the influence of age using a general linear model.
In the present study, we investigated multiple morphologic characteristics of MGs as assessed by meibography and examined their clinical application for evaluating MGD severity. We found that the meibograde, gland distortion, and MG length had excellent ability to discriminate between MGD patients and healthy subjects. Both meibograde and gland distortion were weakly correlated with DESL, meibum expressibility, and meibum quality, which suggests that MG morphology analysis is needed when evaluating MGD development. Moreover, both the subjective meibograde and the computerized quantification of MG loss showed moderate to strong interobserver agreement, indicating high reliability for both analysis methods. Gland distortion is an early pathogenic finding and is associated with progressive loss of MGs, which indicates severe MGD.
We found that a cut-off value of six distorted glands was sensitive and specific for diagnosing MGD. We also observed that meibum expressibility decreased with a progressive reduction in the number of distorted glands. Moreover, the MGD patients with the worst meibum expressibility (score of 3) had the lowest number of distorted glands. Similarly, the lowest number of distorted glands was found in patients with the highest meibograde (grade of 6). These findings suggest that MG torsion is, to a certain extent, pathogenic in early-stage MGD, and that this particular structural change disappears with disease progression as MGs start to drop out.
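How a fixed cut-off translates into sensitivity and specificity can be shown with a minimal sketch. The code assumes that an eye is called test-positive when its distorted-gland count is at or above the cut-off; the text does not state whether the threshold of six is inclusive, so that choice, the function name, and the example counts are all hypothetical.

```python
def sensitivity_specificity(counts, has_mgd, cutoff=6):
    """Return (sensitivity, specificity) when count >= cutoff is test-positive."""
    tp = fn = tn = fp = 0
    for count, diseased in zip(counts, has_mgd):
        positive = count >= cutoff  # assumed direction and inclusiveness
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical distorted-gland counts: five MGD patients, then five controls.
distorted = [8, 7, 6, 5, 9, 2, 1, 6, 3, 0]
mgd = [True] * 5 + [False] * 5
sens, spec = sensitivity_specificity(distorted, mgd)
```

Because the number of distorted glands is reported to fall again in advanced disease, a count-only rule of this kind would presumably be read alongside gland loss rather than on its own.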
The underlying mechanism of distorted MG development is unknown. Distorted MGs have also been observed in patients with allergic conjunctivitis [19], and duct distortion might represent an inflammatory process in the early phase of MGD. However, the status of allergic conjunctivitis in the MGD patients was not evaluated and could therefore be a confounding factor that contributes to the observation of distorted glands [19]. Moreover, we observed that decreased meibum secretion was also related to a reduced number of distorted glands.
Our results indicate that the finding of fewer distorted glands with increasing meibograde is associated with MGD development.
www.nature.com/scientificreports
In addition to the number of distorted glands, we also found that the subjective meibograde and gland length were effective discriminators of MGD. In line with previous reports [2,20], MGD patients had significantly higher MG dropout than the healthy controls. For this study, we used a modified four-point grading scale based on previously suggested cut-off values for MG dropout for discriminating between dry and normal eyes [21,22]. The results demonstrated a high efficacy of the meibograde for discriminating between MGD patients and healthy controls. Furthermore, a higher meibograde was associated with increased scores for both meibum expressibility and quality. Taken together, our findings confirm that quantitative assessment of gland dropout is a sensitive and specific indicator of MGD development and progression.
Herein, we observed that MG length, thickness, and density were all weakly correlated with the meibum expressibility score but were not related to the quality of expressed meibum. These morphological changes may affect the secretory ability of a gland but do not seem to affect the macroscopic quality of the secreted meibum. Our findings agree with one previous report [11] and contradict another [12], which complicates consistent interpretation of the results. It is, however, surprising that increased gland thickness was not related to altered meibum quality, as it has been hypothesized that gland obstruction and dilatation are partly driven by increased meibum viscosity [2]. On the other hand, it is impossible to evaluate meibum quality if a gland is completely obstructed and does not secrete meibum at all.
Tests of meibum expressibility and quality are considered surrogate measures of MG function [2,7], and our findings suggest that meibum quality is a sensitive and specific test for MGD. The diagnostic efficacy of meibum quality might be overestimated because the MGD classification was itself based on meibum quality and expressibility. Nevertheless, the efficacy of meibum expressibility was poor, which may result from assessing only a limited number of centrally located glands. The secretory activity of individual glands is known to vary depending on their location along the eyelid [23]. In healthy subjects, the nasal MGs tend to produce more meibum [24] and are more active even after considerable MG loss [25]. Thus, both nasal and temporal regions of the eyelid should be examined in future studies.
The reliability of clinical parameters is an important attribute of a consistent classification of MGD severity. In the present study, there was moderate to strong agreement among three observers regarding the quantification of morphologic features. Consistency was lowest for MG thickness and density. A possible explanation is interobserver disagreement in selecting the three most representative MGs, which will always be an issue when not all MGs are evaluated. The variability between observers was, however, mitigated by using the average of three independent observers. These findings indicate that clinical morphology parameters that can be measured repeatably should be chosen for evaluating MGD and its severity. This approach may be useful in clinical practice, allowing investigators to standardize the quantification of morphologic features and to compare results obtained at different locations.
There are some limitations to the present study. First, the estimates of the efficacy of the meibum expressibility and quality scores are subject to selection bias. The initial MGD diagnosis in the current study was based on altered meibum expressibility or quality (score >1), which might consequently have resulted in an overestimation of the diagnostic efficacy of those two tests. Second, some of the morphologic features were evaluated on the upper eyelids only. There are anatomical differences between the upper and lower eyelids: the lower eyelids have fewer glands [26], and although the lower eyelids have greater gland thickness, their gland length is shorter [11,21]. Future studies should include evaluation of the lower eyelids despite the strong correlation between the upper and lower eyelids [11,21]. Third, only the three most prominent glands were chosen for quantifying MG length, thickness, and density. There could potentially be interobserver disagreement in selecting the most representative glands. Despite the strong interobserver agreement in this study, a possible approach in future studies might be to investigate only a part of the eyelid, preferably the third of the eyelid corresponding to the site where meibum expressibility and quality are tested.
Moreover, the observed morphological changes in patients could also be a result of confounding variables, including the allergic conjunctivitis mentioned earlier and contact lens wear, which has been reported to be associated with loss of MGs [27]. However, the potential for confounding was reduced by randomization of the group sample of patients and controls. Furthermore, the use of a case-control design in a study of a diagnostic test may lead to inflated estimates of diagnostic accuracy compared with using a series of consecutive patients. Of note, none of the volunteers had any symptoms of ocular discomfort, which reduces the likelihood of additional conditions that could generate false-positive results. Lastly, prospective studies are needed to confirm the utility of the meibomian gland distortion cutoff as a diagnostic parameter for MGD.
In conclusion, structural MG changes are closely associated with MGD progression. More specifically, gland distortion has a diagnostic capability comparable to that of MG loss and MG quality and is therefore strongly affected by the pathological processes of MGD. Moreover, gland torsion is a pathogenic finding in the early stage, and

Materials and Methods
Study subjects. One hundred and nine MGD patients and twenty healthy volunteers of mainly Caucasian ethnicity were evaluated in this retrospective, cross-sectional, case-control study. MGD patients were selected from the patient pool of the Norwegian Dry Eye Clinic by simple random sampling. The results of a set of standardized clinical examinations performed at their initial presentation to the clinic were analyzed, including the Ocular Surface Disease Index (OSDI) questionnaire, tear-film break-up time (TFBUT), Schirmer I test, ocular staining, meibum expressibility and quality, and meibographic imaging.
MGD was assessed after diagnosing dry eye disease (DED), which was based on symptom assessment and clinical tests such as TFBUT, the Schirmer I test and ocular surface staining [2]. Subjects over 20 years old with (1) a score >1 for either meibum quality or expressibility or (2) a score = 1 for both meibum expressibility and meibum quality were classified as MGD patients [2]. The patients were further evaluated with regard to the dry eye severity level (DESL) and scored 1-4 according to the guidelines proposed by the 2007 International Dry Eye Workshop [28]. Briefly, the DESL score was based on a combination of the severity of ocular symptoms and clinical ocular surface parameters, including TFBUT, ocular staining and Schirmer I (Table 4).
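The two-part scoring rule for MGD classification can be written as a small predicate. The sketch below covers only the score and age criteria stated above. It does not model the preceding DED diagnosis, it treats "over 20 years old" as strictly greater than 20, and it takes the expressibility and quality scores as already-derived inputs. All of these are assumptions, as is the function name.

```python
def meets_mgd_criteria(expressibility, quality, age_years):
    """Apply the stated score rule: either score > 1, or both scores equal to 1."""
    if age_years <= 20:  # assumed reading of "over 20 years old"
        return False
    either_above_one = expressibility > 1 or quality > 1
    both_equal_one = expressibility == 1 and quality == 1
    return either_above_one or both_equal_one

# Hypothetical subjects: (expressibility score, quality score, age in years).
examples = [(2, 0, 45), (1, 1, 30), (1, 0, 30), (3, 3, 18), (0, 0, 50)]
flags = [meets_mgd_criteria(e, q, age) for e, q, age in examples]
```

Note that a subject with a score of 1 on only one of the two tests does not satisfy either condition under this reading.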
Twenty healthy volunteers without any systemic diseases, pre-existing ocular conditions or dry eye symptoms were recruited as a control group for this study. For the control group, clinical tests including TFBUT, Schirmer I, meibum expressibility, and meibum quality were performed, and meibography images were also obtained.
The study was conducted in accordance with the Declaration of Helsinki. The Regional Committee for Medical & Health Research Ethics, Section C, South East Norway (REC) reviewed the use of the data in this study. REC found the research project "Evaluation of data from the Norwegian Dry Eye Clinic" to be outside the remit of the Act on Medical and Health Research (2008) and concluded that it could therefore be implemented without specific approval. Written informed consent was obtained from all participants prior to data collection.
Morphology analysis. The morphology was evaluated by analyzing meibography images obtained with the non-contact infrared meibography system OCULUS Keratograph 5 (OCULUS, Wetzlar, Germany). Images were excluded based on the following criteria: 1) an interruption that prevented complete assessment of the eyelid; 2) inadequate exposure of the tarsal area; 3) strong reflection of illumination; or 4) lack of focus of the image. MG loss in each eyelid was evaluated subjectively using a four-point grading scale (meibograde) of 0-3 as described in our previous work [22]: grade 0: 0-25% loss; grade 1: 26-50% loss; grade 2: 51-75% loss; and grade 3: >75% loss. The grades for the upper and lower eyelids were summed to yield a total grade from 0 to 6 for each eye. MG dropout was also analyzed by computer using ImageJ software. Both the area of MG loss and the total tarsal area were measured as described by Pult et al. [21], and their ratio was presented as the MG dropout percentage (0-100%). Further computerized analyses of additional morphologic characteristics were performed on the upper eyelids only. For MG thickness and length measurements, three glands, mainly in the central region, whose length and thickness closely approximated those of the majority of the glands were subjectively chosen as the most representative glands and analyzed. MG area density was assessed by measuring the interglandular space between two adjacent MGs at three different sites on the eyelid (Fig. 2A) [22]. A larger interglandular space value indicated lower density. To measure MG length, a continuous line following the path of the gland and covering its entire visible length was drawn and measured. To measure MG thickness, a continuous horizontal line spanning the width of the gland was drawn and measured. To measure the interglandular space, a continuous horizontal line was drawn between the outer borders of two adjacent glands and measured.
Lastly, the number of distorted MGs (with torsion >45°) in the upper eyelid was counted (Fig. 2B) and represented the level of gland distortion for each eye (Fig. 3).
Three experienced observers analyzed the meibography images to assess the interobserver reliability. The observers repeated their analyses after a 2-week interval to evaluate intraobserver agreement. The observers were masked to the diagnosis, to the other observers' results, and to their own previous analyses.
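The meibograde scale and the dropout ratio described in this subsection can be expressed as a brief sketch. In the study the meibograde was assigned subjectively by observers, so the code only formalizes the published grade boundaries and the area ratio. The function names are hypothetical, and the handling of fractional percentages between the stated integer boundaries (for example 25.5%) is an assumption.

```python
def meibograde(loss_percent):
    """Grade one eyelid: 0 (0-25%), 1 (26-50%), 2 (51-75%), 3 (>75%)."""
    if not 0 <= loss_percent <= 100:
        raise ValueError("loss_percent must lie between 0 and 100")
    if loss_percent <= 25:
        return 0
    if loss_percent <= 50:  # assumption: values just above 25 fall in grade 1
        return 1
    if loss_percent <= 75:
        return 2
    return 3

def total_meibograde(upper_loss_percent, lower_loss_percent):
    """Sum the upper and lower eyelid grades into a 0-6 score for one eye."""
    return meibograde(upper_loss_percent) + meibograde(lower_loss_percent)

def dropout_percent(loss_area, tarsal_area):
    """MG dropout as the ratio of gland-loss area to total tarsal area, in %."""
    if tarsal_area <= 0 or not 0 <= loss_area <= tarsal_area:
        raise ValueError("areas must satisfy 0 <= loss_area <= tarsal_area")
    return 100.0 * loss_area / tarsal_area

# Hypothetical eye: 30% loss in the upper eyelid, 80% in the lower eyelid.
eye_grade = total_meibograde(30, 80)
# Hypothetical areas in arbitrary pixel units, as measured in image software.
eye_dropout = dropout_percent(150.0, 600.0)
```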
Clinical tests of meibomian gland function. All patients first completed a symptom questionnaire to obtain an OSDI score between 0 (no symptoms) and 100. Five MGs in the central area of the lower eyelids were tested for their ability to express meibum. The ability of these glands to secrete meibum was graded 0-3 based on the number of expressible glands as described by Pflugfelder et al. [29]: 0 = all glands expressible; 1 = 3-4 glands expressible; 2 = 1-2 glands expressible; and 3 = no glands expressible. Meibum quality was assessed on the central 8 MGs in the lower eyelids and rated on a 0-3 scale: 0 = clear fluid; 1 = cloudy fluid; 2 = cloudy, particulate fluid; and 3 = inspissated, toothpaste-like meibum [30]. The scores for the expressed glands were summed to yield a composite score [2]. The Schirmer I test was performed without anesthesia by inserting the test strip in the lateral third of the lower eyelid for 5 minutes [4]. TFBUT for each eye was measured 30 seconds after instillation of 5 µl of 2% fluorescein into the conjunctival sac. Ocular surface fluorescein staining was analyzed in a similar fashion and graded using the Oxford grading system [4,31].
Statistical analysis. Data were analyzed with SPSS (v24.0). Cohen's kappa values were calculated to evaluate the observers' agreement on the subjective meibograde, and the intraclass correlation coefficient (ICC) was calculated to evaluate the consistency of the computerized measurements of morphology. A principal component analysis (PCA) was performed to take into account and summarize the inter-eye correlation. PCA is a statistical data reduction technique used to explore the directions of maximal collinearity among a group of variables [32,33].
In this study, the results for each individual parameter from both eyes of each subject were optimally weighted using the PCA loadings, so that a single factor score characterizing each subject could be obtained and used for further statistical analysis [34]. Relationships between morphological features and MG function were determined by Pearson correlation. The patients and healthy subjects were compared using the Mann-Whitney U test and the Kruskal-Wallis test with Dunn's post-hoc test. The influence of age on between-group comparisons was adjusted for using a general linear model. A receiver operating characteristic (ROC) curve was generated to investigate the clinical application and optimal cut-off values of morphologic measures in MGD diagnostics. P < 0.05 was considered statistically significant.
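The idea of collapsing right-eye and left-eye values into one PCA-weighted score per subject can be sketched as follows. This is not the SPSS procedure used in the study, whose exact settings are not given in the text. It is a minimal NumPy illustration that standardizes the two eyes, takes the first principal component of their correlation matrix, and projects each subject onto it. The function name and the data are hypothetical.

```python
import numpy as np

def first_component_scores(right_eye, left_eye):
    """Combine paired eye measurements into one score per subject via PCA."""
    data = np.column_stack([right_eye, left_eye]).astype(float)
    # Standardize each eye so both contribute on the same scale.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns eigenvalues in ascending order; the last is the largest.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    weights = eigenvectors[:, -1]
    if weights.sum() < 0:  # fix the arbitrary sign of the eigenvector
        weights = -weights
    explained = eigenvalues[-1] / eigenvalues.sum()
    return z @ weights, weights, explained

# Hypothetical gland-length values (mm) for the two eyes of five subjects.
right = [1.0, 2.0, 3.0, 4.0, 5.0]
left = [1.1, 2.3, 2.9, 4.2, 5.0]
scores, weights, explained = first_component_scores(right, left)
```

With only two standardized, positively correlated variables, the first component weights both eyes equally, so the score behaves like a scaled average of the two eyes. The sketch assumes that each eye's measurements vary across subjects, since a constant column would make the standardization undefined.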

Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on request.