Glaucoma is a multifactorial disease where various systemic features are involved in the progression of the disease. Based on initial systemic profiles in electronic medical records, this study aimed to develop a model predicting factors of long-term rapid retinal nerve fiber layer (RNFL) thinning over 5 years in 505 patients with primary open-angle glaucoma. Eyes with faster or slower RNFL thinning were stratified using a decision tree model, and systemic and ophthalmic data were incorporated into the models based on random forest and permutation methods, with the models interpreted by Shapley additive explanation plots (SHAP). According to the decision tree, a higher lymphocyte ratio (> 34.65%) was the most important systemic variable discriminating faster or slower RNFL thinning. Higher mean corpuscular hemoglobin (> 32.05 pg) and alkaline phosphatase (> 88.0 IU/L) concentrations were distinguishing factors in the eyes with lymphocyte ratios > 34.65% and < 34.65%, respectively. SHAP demonstrated larger baseline RNFL thickness, greater fluctuation of intraocular pressure (IOP), and higher maximum IOP as the strongest ophthalmic factors, while higher lymphocyte ratio and higher platelet count as the strongest systemic factors associated with faster RNFL thinning. Machine learning-based modeling identified several systemic factors as well as previously acknowledged ophthalmic risk factors associated with long-term rapid RNFL thinning.
Glaucoma is a multifactorial disease1. Elevated intraocular pressure (IOP) and decreased blood flow are considered the two strongest factors associated with the pathogenesis of glaucoma. However, neither seems to fully explain the mechanism of glaucomatous optic neuropathy, because disease progression is still observed in patients who receive extensive treatment and those lacking evident factors for progression. The identification of factors other than IOP and blood flow may contribute to a greater understanding of the pathogenesis of glaucoma.
Population-based studies have identified several systemic and demographic factors that may be associated with glaucoma. For example, both the Barbados Eye Study2 and the Rotterdam study3 found that old age, male gender, and family history of glaucoma were associated with the prevalence of open-angle glaucoma (OAG). The Baltimore Eye Survey identified African race, systemic hypertension, low perfusion pressure, and family history of glaucoma as factors associated with the development of glaucomatous damage4, whereas the Blue Mountains Eye Study reported that old age, diabetes mellitus5, systemic hypertension6, and thyroid disease7 were associated with the prevalence of OAG. Moreover, the large randomized Collaborative Normal-Tension Glaucoma Study found that female gender, migraine, and African race were risk factors for the progression of normal tension glaucoma (NTG)8, and several other studies reported that old age, low blood pressure, and primary vascular dysregulation were risk factors for glaucomatous progression9,10,11,12. These population-based studies, however, have been limited by their incorporation of relatively few variables associated with risk, with some of these variables based on subjective evaluation by patients. Thus, these studies have been unable to analyze systemic risk factors comprehensively and objectively for glaucoma progression.
Electronic medical records (EMRs) provide large-scale medical databases that are readily available for systematic studies investigating risk factors for various diseases. Our institute has developed an EMR system, beginning in 2009, containing longitudinal data on large numbers of patients. This database is readily available for studies of ophthalmic and systemic factors associated with glaucoma progression. The present study sought to use the large-scale systemic database containing EMRs to identify systemic factors associated with long-term glaucoma progression by machine learning methods.
The study initially included 756 glaucomatous eyes that satisfied the inclusion criteria. Fifty-seven of the 80 identified variables that showed an importance value < 0.01 in the Random Forest (RF) were excluded by the stepwise method and the remaining 23 variables having an importance value ≥ 0.01 were included in the model. Of the 756 glaucomatous eyes, 251 were excluded because of missing data, resulting in a final sample of 505 eyes. The mean follow-up period was 9.6 ± 1.4 years (range 5.3–11.4 years). The clinical characteristics of the enrolled subjects are summarized in Table 1.
A decision tree model was used to stratify patients with faster or slower optical coherence tomography (OCT) retinal nerve fiber layer (RNFL) thinning, which identified six strongly discriminating variables, resulting in seven groups of eyes with different rates of RNFL thinning (Fig. 1). The three strongest variables were ophthalmic variables; eyes with IOP fluctuation (IOPfluc) > 7.325 mmHg (n = 6), maximum IOP (IOPmax) > 25.25 mmHg (n = 62), and baseline retinal nerve fiber layer thickness (RNFLT) > 58.5 μm (n = 65) were more likely to progress rapidly, with RNFL thinning rates of − 3.634 μm/year, − 1.353 μm/year, and − 0.806 μm/year, respectively. The model also identified three systemic discriminating variables for the eyes (n = 437) with smaller IOPfluc, lower IOPmax, and larger baseline RNFLT. The strongest systemic variable was blood lymphocyte ratio; the 200 eyes with blood lymphocyte ratio > 34.65% showed a faster rate of RNFL thinning (− 0.95 μm/year) than the 172 eyes with blood lymphocyte ratio ≤ 34.65% (− 0.638 μm/year). Analysis of the 200 eyes with blood lymphocyte ratio > 34.65% showed that the 40 eyes with mean corpuscular hemoglobin (MCH) > 32.05 pg had a faster rate of RNFL thinning (− 1.346 μm/year) than the 160 eyes with MCH ≤ 32.05 pg (− 0.851 μm/year). On the other hand, the analysis showed that the rate of RNFL thinning in the 172 eyes with a lower lymphocyte ratio (≤ 34.65%) was influenced by serum alkaline phosphatase (ALP), with faster thinning rates observed in the 37 eyes with serum ALP > 88.0 IU/L (− 0.985 μm/year) than in the 135 eyes with serum ALP ≤ 88.0 IU/L (− 0.543 μm/year).
Because systemic factors were significantly associated with RNFL thinning in the fourth node of the decision tree (Fig. 1), the interactions among systemic factors were evaluated in the 372 eyes included in the fourth node (Fig. 1) by partial interaction plots (PIPs) (Fig. 2). The PIPs showed an independent, and strong influence of blood lymphocyte ratio on rapid RNFL thinning when the level exceeded approximately 35%. While the rate of RNFL thinning tended to linearly increase with thicker baseline RNFL (Fig. 2A), the influence of ALP (Fig. 2B) and MCH (Fig. 2C) showed breakpoints (approximately 40 IU/L and 32 pg, respectively), above which the rate of RNFL thinning got faster.
Using RF13 and permutation14,15 methods, the effects of each variable on the rate of RNFL thinning were assessed, and the results were interpreted using the SHapley Additive exPlanations (SHAP) method. Figure 3 shows a SHAP plot demonstrating the features that strongly predict faster RNFL thinning. The three strongest features were ophthalmic variables, specifically baseline RNFLT, IOPfluc, and IOPmax (Fig. 3A). The SHAP plot showed that larger global RNFLT, larger IOPfluc, and higher IOPmax were associated with faster RNFL thinning (Fig. 3B). Lymphocyte ratio was the strongest systemic variable influencing the rate of RNFL thinning, followed by platelet (PLT) count (Fig. 3A). Higher lymphocyte ratio and lower PLT count were associated with faster rates of RNFL thinning (Fig. 3B).
Figure 4 shows the partial dependence plots (PDPs) of the ophthalmic (Fig. 4A–C) and systemic (Fig. 4D, E) variables that had the strongest impact on the rate of RNFL thinning in this model. Inflection points of the PDP curves for baseline RNFLT (Fig. 4A), IOPfluc (Fig. 4B), and IOPmax (Fig. 4C) were 57.0 μm, 2.22%, and 22 mmHg, respectively, indicating that values exceeding these cutoff points were associated with a faster rate RNFL thinning. Lymphocyte ratio > 34.08 (Fig. 4D), and PLT count < 196.0 or > 330.0 (Fig. 4E) were also associated with faster RNFL thinning.
This study was based on a large database containing the EMRs of patients with primary open-angle glaucoma (POAG) who had been followed up for a mean of 10 years. The database contained a wide range of potential systemic predictors of glaucoma progression, with the effects of these factors on the rate of progressive RNFL thinning quantified by various models. In addition to detecting ophthalmic factors, the models revealed several blood test parameters associated with the rate of RNFL thinning. These results suggested that systemic data in patient EMRs can predict glaucoma progression in patients with POAG.
Conventional regression models assume that all relationships are linear and that the interactions between variables are not of interest. These assumptions, however, are not true in real-world applications. The strength of machine learning models is their consideration of all potential nonlinear relationships and interactions among features, enabling the development of more realistic prediction models. The RF and permutation methods are recognized as the most powerful machine learning methods in the development of disease prediction models. The major limitation of these methods has been their high complexity, limiting the ability to interpret their results. However, the recent development of explainable artificial intelligence (i.e., SHAP) has facilitated the interpretation of the results of machine learning models. SHAP allowed the successful explanation of the prediction models, as well as determining the importance and dependence of each feature. The interactions between features could be readily illustrated by partial interaction plots. Another weakness of the machine learning methods is that they do not show the interaction between the features that could together contribute to the result. Despite that the RF is operated by the construction of multiple decision trees in the learning phase, a decision tree model itself can illustrate groups having similar characteristic features that may have interactions while affecting the dependent variable. Using the decision tree, we could stratify patients having different rates of RNFL thinning with different dominant features that may co-contribute to the rate of progressive RNFL thinning.
The decision tree analysis revealed that IOPfluc, IOPmax, and baseline RNFLT were the three variables most predictive of the rate of RNFL thinning. The SHAP feature importance model also showed that these three variables were the most important factors influencing the rate of RNFL thinning. IOP16,17 and its fluctuations18,19,20 have been identified as risk factors for glaucoma progression. Although this study focused on systemic factors, the results emphasize the importance of IOP-related variables as predictors of glaucomatous damage. Thicker baseline RNFL has been associated with faster RNFL thinning21,22. Because the rate of progression was calculated based on absolute, not percentage, RNFLT, the rate of RNFL thinning should be slower in eyes with thinner baseline RNFL when the same proportion of axons disappears23. It is also possible that, due to residual connective tissue24, which acts as a floor, RNFLT is unlikely to be below 40 μm, even in eyes with total axonal loss25. Therefore, eyes with thin RNFL on OCT progress more slowly than eyes with thicker RNFL.
The decision tree model showed that blood lymphocyte ratio, MCH, and ALP were important systemic variables determining the rate of RNFL thinning. Of these variables, the blood lymphocyte ratio was the strongest, a finding confirmed by SHAP. Inflammatory processes have been reported to be involved in the mechanism of glaucomatous optic neuropathy. For example, evaluations of aqueous humor26 and blood samples27,28 have found that inflammatory cytokine concentrations are elevated, and T-cell homeostasis altered in patients with POAG. These abnormalities in immune mediators suggest that abnormal immune responses may contribute to the initiation or exacerbation of glaucomatous damage29. Abnormal T-cell activity has been associated with glaucomatous degeneration of retinal ganglion cells (RGCs) in rodents30,31,32, and transient elevation of IOP was found to induce T-cell infiltration into the retinas of mice deficient in T and/or B-cells30. Taken together, these findings suggest that elevated serum lymphocyte ratio may indicate altered immunity, which may make the optic nerve more susceptible to glaucomatous damage.
Higher MCH was the second systemic determinant associated with faster RNFL thinning in the decision tree model. MCH represents the average quantity of hemoglobin in a single red blood cell. Thus, a decrease in blood MCH may indicate chronic anemic conditions, such as iron deficiency, whereas an increase in MCH may be associated with vitamin B12 (cobalamin) or B9 (folate) deficiency. The relationships between vitamin B12 / B9 deficiency and glaucoma vary among studies. For example, serum concentrations of vitamins B12 and B9 were found to be significantly lower in patients with than without OAG33. Other studies, however, found that the serum concentrations of vitamins B12 and B9 did not differ between patients with healthy and glaucomatous eyes34,35,36. The present study found that MCH was an important variable in certain conditions, including those with lower IOPfluc and IOPmax, larger baseline RNFLT, and higher lymphocyte ratio. These findings suggest that discrepancies among study results may be attributable to differences in subject characteristics. Vitamins B12 and B9 are both essential cofactors in homocysteine metabolism, and their insufficiency results in the accumulation of homocysteine, a neurotoxin that can induce RGC apoptosis via stimulation of the N-methyl-D-aspartate (NMDA) receptor36. Serum homocysteine concentrations have been reported to be elevated in serum and plasma samples from patients with OAG35,36. Moreover, the peripapillary RNFL in the temporal quadrant was thinner in patients with vitamin B12 deficiency than in healthy controls37. In contrast, higher MCH was more frequently observed in glaucoma patients with a longer duration than a shorter duration of disease38. The present study did not evaluate serum vitamin B12 / B9 or homocysteine concentrations. Thus, the effect of any inter-relationships among MCH, vitamin B, and homocysteine on glaucomatous RGC damage is not conclusive. However, our findings, together with previous results, suggest that high MCH could contribute to progressive RGC damage, under specific conditions.
The decision tree model identified ALP as the second systemic determinant. ALP, an enzyme involved in the hydrolysis of phosphate monoesters, is synthesized in bones, intestines, the placenta, and the hepatobiliary system39. Although serum ALP is metabolically inert, elevated serum ALP may indicate disorders involving the hepatobiliary system, bones, or blood. The role of elevated ALP in glaucomatous optic neuropathy is unclear. ALP concentrations were found to be significantly higher in the aqueous humor of eyes with POAG than in nonglaucomatous eyes40. Moreover, ALP activity was found to be higher in the trabecular meshwork tissue of glaucomatous eyes41. Because ALP is also regarded as a marker of calcification42,43, which is involved in the inactivation of mineralization inhibitors44, increased ALP activity in the trabecular meshwork might be indicative of a mineralization process during the development of glaucoma41. To our knowledge, this study is the first to identify serum ALP as a factor associated with faster RNFL thinning. High serum ALP was found to be important in patients with lower lymphocyte ratio in the present study, whereas high serum ALP concentration accompanied by low lymphocyte ratio has been associated with end-stage renal disease45,46 and cancerous conditions47,48,49,50. Additional studies are required to clarify the effects of ALP on glaucoma progression, and the mechanism by which serum ALP concentration is higher in patients with glaucoma.
The influence of blood lymphocyte ratio, ALP, and MCH seemed to have sharp cutoff points (approximately 35%, 40 IU/L, and 32 pg, respectively), above which the rate of RNFL thinning became remarkably faster (Fig. 2). Notably, that normal ranges are 20–40%, 44–147 IU/L, and 27.5–33.2 pg for blood lymphocyte ratio, ALP, and MCH, which fall near the cutoff points of respective variables. It can be speculated that these systemic variables become pathogenically meaningful when they are abnormally increased above certain levels.
The SHAP model showed that PLT count was the second strongest systemic factor associated with the rate of RNFL thinning, following the lymphocyte ratio. Abnormal PLT activity has been associated with the development and progression of glaucoma. Relative to controls, PLT activity was reported to increase significantly in eyes with POAG51, and especially in eyes with NTG52. PLT aggregation was higher in POAG patients with visual field progression than patients without in patients suspected of having glaucoma53. In contrast, PLT count is decreased in POAG, but PLT distribution width (PDW), a marker indicating a variation in PLT size, was associated with disease severity in patients with POAG54. Altered PLT aggregation may have a negative influence on blood flow in small branches of the short ciliary arteries supplying the optic disc53,55. In addition, aqueous outflow can be blocked by coagulated PLT in Schlemm’s canal56. Our PDP showed that both lower and higher PLT counts were associated with a faster rate of RNFL thinning, whereas values from the 20th to the 95th percentile of the average value were not. It can be speculated that abnormally high and low PLT both play a role in glaucomatous optic neuropathy. However, the influence of PDW on the rate of RNFL thinning was minimal in the present study. Although increased PDW indicated a worse disease stage in the glaucoma continuum54, it may not be a causal factor for disease progression. Further study is needed to clarify the causal relationship between PLT activity, PDW level, and glaucomatous progression.
On the other hand, PLT count was found significant only by the SHAP model but not by the decision tree model. A limitation of the decision tree model is that the importance of a variable may be weakened in subgroups after dividing the data by other variables that appear first in the analysis. The decision tree in our study did not find an association between PLT count and the rate of RNFL thinning after separating the eyes into subgroups with a cutoff lymphocyte ratio of 34.65%. RF and permutation methods do not rely on a single decision tree by combining multiple decision trees to create a model.
The strength of this study was that it was based on a large population that had been evaluated over a long period. This study, however, had several limitations, including its retrospective study design and potential selection bias, which may have been caused by the exclusion of some eyes with missing data during processing. In addition, this study was performed to determine the risk factors affecting rapid RNFL thinning within a limited cohort of selected glaucoma patients. Therefore, the results of the study should be validated by further experiments based on a larger database of a prospective study. Another limitation was that all participants were of South Korean ancestry; thus, these results may not be applicable to all ethnic populations. Larger, longer, prospective multicenter longitudinal studies including patients of different ethnic groups are required to confirm these findings.
In conclusion, this study identified some blood test results associated with long-term progressive RNFL thinning in patients with POAG. Identifying a systemic factor prognostic for faster glaucoma progression may enable clinicians to take steps to retard progression in patients predicted to progress despite IOP control. The machine learning approach used in the present study should be considered in any future attempt to discover new risk factors associated with glaucoma and to potentially test whether the systemic factors found in this study could be generalized to other populations.
Materials and methods
Participants and ophthalmic evaluation
This study involved the development of machine learning models based on retrospective data contained in patient EMRs of patients diagnosed with POAG between September 17, 2009, and February 11, 2014, at the Glaucoma Clinic of Seoul National University Bundang Hospital. The study protocol was approved by the Institutional Review Board of Seoul National University Bundang Hospital (B-2111-723-104) and adhered to the Declaration of Helsinki. Informed consent was waived by the Institutional Review Board of Seoul National University Bundang Hospital due to the retrospective nature that makes informed consent unfeasible.
At enrollment, all participants underwent comprehensive ophthalmic examinations, including assessments of best-corrected visual acuity (BCVA), Goldmann applanation tonometry, refraction tests, slit-lamp biomicroscopy, gonioscopy, dilated stereoscopic examination of the optic disc, disc photography, red-free fundus photography (EOS D60 digital camera; Canon, Utsunomiyashi, Tochigiken, Japan), spectral-domain optical coherence tomography (SD-OCT) scanning of the circumpapillary RNFL and the optic nerve head (ONH; Spectralis; Heidelberg Engineering, Heidelberg, Germany), and standard automated perimetry (Humphrey Field Analyzer II 750; 24-2 Swedish interactive threshold algorithm, Carl Zeiss Meditec, Dublin, CA, USA). Other ophthalmic examinations included measurement of corneal curvature (KR-1800; Topcon, Tokyo, Japan), central corneal thickness (Orbscan II; Bausch & Lomb Surgical, Rochester, NY, USA), and axial length (IOLMaster version 5; Carl Zeiss Meditec). Mean IOP (IOPmean) was defined as the average of all IOP values measured during the entire follow-up period, except for those measured within the first 6 months after starting IOP lowering medication. IOPfluc and IOPmax were defined as the standard deviation, and the highest IOP of the values that were used to calculate IOPmean, respectively.
Patients were included if they were diagnosed with POAG between September 17, 2009, and February 11, 2014, if their EMRs included systemic test results obtained within 6 months from the time of glaucoma diagnosis, and if they were followed up in the glaucoma clinic for > 5 years and underwent annual SD-OCT examinations to measure circumpapillary RNFL thickness.
A diagnosis of POAG was based on gonioscopy showing an open iridocorneal angle and signs of glaucomatous optic nerve damage (e.g., neuroretinal rim thinning, notching, or a RNFL defect) with consistent visual field defect. Glaucomatous visual field defect was defined as (1) outside normal limits on glaucoma hemifield tests, (2) a cluster of three or more non-edge points on a pattern deviation plot with a probability of < 5%, with one having a probability of < 1%, and/or (3) a < 5% probability of pattern standard deviation confirmed on two consecutive reliable tests, with fixation loss rates ≤ 20% and false-positive and false-negative error rates ≤ 25%.
Eyes were excluded if they had a BCVA worse than 20/40; a spherical equivalent ≤ 8.0 D or ≥ 3.0 D; a cylinder correction ≤ 3.0 D or ≥ 3.0 D; a history of intraocular surgery, except for uneventful cataract surgery; or a retinal (e.g., diabetic retinopathy, retinal vessel occlusion, or retinoschisis) or neurological (e.g., pituitary tumor) disease. When both eyes were eligible, one eye was randomly selected for this study.
EMR data source
Systemic data obtained within 1 year from the first OCT examination was extracted from the EMRs. These systemic evaluations included the results of regular health screenings, tests from other departments, or preoperative checkups. Factors evaluated included complete blood counts, such as absolute neutrophil counts (/μL), the percentages of atypical lymphocytes (%), band neutrophils (%), basophils (%), eosinophils (%), large unstained cells (%), lymphocytes (%), metamyelocytes (%), monocytes (%), myelocytes (%), normoblasts (%), and segmented neutrophils; erythrocyte sedimentation rate (mm/hr), hemoglobin concentration (g/dL), hematocrit (%), mean corpuscular hemoglobin (MCH, pg), mean corpuscular hemoglobin concentration (g/dL), mean corpuscular volume (MCV, fL), mean plasma volume (fL), procalcitonin (PCT, %), platelet distribution width (PDW, %), platelet (PLT, × 103/μL), red blood cell (× 106/μL), white blood cell (× 103/μL) counts, and red cell distribution width (as both coefficient of variation (%) and standard deviation (fL). Serum chemistry tests included serum concentrations of albumin (g/dL), alkaline phosphatase (ALP, IU/L), aspartate aminotransferase (IU/L), alanine aminotransferase (IU/L), blood urea nitrogen (mg/dL), total bilirubin (mg/dL), total CO2 (mmol/L), C-reactive protein (mg/dL), calcium (mg/dL), chloride (mmol/L), cholesterol (mg/dL), creatinine (mg/dL), glucose (mg/dL), phosphorus (mg/dL), potassium (mmol/L), total protein (g/dL), sodium (mmol/L), and estimated glomerular filtration rate. Serologic tests included concentrations of anti-HBs (IU/L), anti-HCV (positive/negative), HBsAg (positive/negative), HIV Ag/Ab (positive/negative), and rapid plasma reagin (reactive/nonreactive). Urinalysis tests included blood (+/−), bilirubin (+/−), color, ketone (+/−), nitrite (+/−), pH, protein (+/−), specific gravity, turbidity, urobilinogen (+/−), and white blood cell stick (+/−). Coagulation tests included prolongation time (%), prolongation time (international normalized ratio), prolongation time (seconds), and activated partial thromboplastin time (seconds).
To exclude variables with least importance, those with an importance value < 0.01 in the RF model were deleted by repeated calculations.
Determination of the rate of OCT RNFL thinning
Circumpapillary RNFLT was measured using a circular scan protocol of the Spectralis OCT system (Heidelberg Engineering, Heidelberg, Germany). Measurements at baseline were obtained by placing a circle of diameter 3.5 mm concentric with the optic disc. Follow-up scans were obtained using a built-in realignment procedure. The rate of change of global RNFLT (expressed as microns per year) for each subject was determined by linear regression analysis over time.
Only images with adequate quality of Spectralis Q (signal strength) > 15 were included. Images with motion artifacts, missing data, and/or centration errors were excluded. The accuracy of the segmentation of the RNFL was reviewed, and segmentation errors were manually corrected.
All records in the data source were thoroughly organized and indexed. Raw data were exported to python for processing and analysis. The decision tree model was used to stratify patients with faster or slower RNFL thinning, based on factors influencing the rate of RNFL thinning. The effect of each variable on the rate of RNFL thinning was assessed using random forest (RF)13 and permutation14,15 methods. Because machine learning models are difficult to interpret due to their complexity, the results of the models were interpreted using the SHAP method57, a type of explainable artificial intelligence method. The source code for our machine learning models is available at the GitHub repository (https://github.com/hyunjoongkim1/Systemic-factors-for-long-term-glaucoma-progression/new/main).
Decision tree analysis
Decision tree models classify variables into two groups by predicting the optimal cut-off value to divide subgroups. These models search for the best predictor and the corresponding cutoff value that splits one group into two subgroups, such that the responses of the two subgroups differ significantly. The present study used a decision tree model to stratify the variables in order of their contribution to the rate of RNFL thinning. All possibilities of nodal splits were considered, with stepwise elimination used to determine the model that best delineated the risk factors associated with a faster rate of RNFL thinning. Other methods, including RF, permutation, and SHAP feature importance, were used to reinforce the results derived from the decision tree and to check for overfitting. Interactions between features were visualized by partial interaction plots.
Random forest feature importance
RF is an ensemble learning method for classification and regression. During the learning phase, RF constructs multiple decision trees, reducing the risk of overfitting, a weakness of regression trees, by averaging over multiple decision trees13. When generating decision trees in RF, various decision trees are created using a random variable selection technique. It is well known that RF produces more accurate predictions than a single decision tree. Although RF has been shown to outperform the predictive performance of other machine learning methods, it is difficult to interpret the results predicted by the RF method, a disadvantage similar to that of other machine learning methods. The number of decision trees for RF learning was set at 200, with five-fold cross-validation showing that the optimal number of features for each node was three.
Permutation feature importance
The permutation method is a method of exploring the data space that a variable can make by randomly permuting values. The importance of a variable can be evaluated according to how much the predicted value after permutation differs from the value before permutation. If it is a significant variable, the prediction accuracy will be greatly reduced. Permutation feature importance is defined as a decrease in a model score or an increase in prediction error when a single feature value is randomly shuffled. Because this procedure reduces the relationship between the feature and the true outcome, a reduction in the model score is indicative of the magnitude of dependence of the model on the feature. The difference in performance is regarded as the importance of the feature14,15.
SHAP feature importance
The SHAP method was developed to explain the output of any machine learning model. Based on the output of a particular model, SHAP values can represent a fair or reasonable allocation of feature importance. After constructing a model with several features, the SHAP value was obtained by determining the average change relative to the presence or absence of any individual feature. The SHAP value of each feature was an indicator of the strength of that feature on the positive or negative prediction of the model, with a larger absolute SHAP indicating that the feature had a greater impact on prediction by the model57,58. SHAP values were calculated to determine the contribution of each variable and its correlation with the rate of RNFL thinning. In addition to the feature importance plot, the SHAP values were used to draw partial dependence plots, evaluating the marginal effects of strong systemic variables on the rate of RNFL thinning59.
Except where stated otherwise, data are presented as mean ± standard deviation. All statistical and machine learning analyses were performed using Python v3.8.5, with the scikit-learn package v1.0 used for regression analysis and decision tree construction and the shap package v0.39.0 used for SHAP analysis.
The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Kwon, Y. H., Fingert, J. H., Kuehn, M. H. & Alward, W. L. Primary open-angle glaucoma. N. Engl. J. Med. 360, 1113–1124 (2009).
Leske, M. C., Connell, A. M., Wu, S. Y., Hyman, L. G. & Schachat, A. P. Risk factors for open-angle glaucoma. The Barbados Eye Study. Arch. Ophthalmol. 113, 918–924 (1995).
Czudowska, M. A. et al. Incidence of glaucomatous visual field loss: a ten-year follow-up from the Rotterdam Study. Ophthalmology 117, 1705–1712 (2010).
Sommer, A. Glaucoma risk factors observed in the Baltimore Eye Survey. Curr. Opin. Ophthalmol. 7, 93–98 (1996).
Mitchell, P., Smith, W., Chey, T. & Healey, P. R. Open-angle glaucoma and diabetes: The Blue Mountains eye study, Australia. Ophthalmology 104, 712–718 (1997).
Mitchell, P., Lee, A. J., Rochtchina, E. & Wang, J. J. Open-angle glaucoma and systemic hypertension: The blue mountains eye study. J. Glaucoma 13, 319–326 (2004).
Lee, A. J. et al. Open-angle glaucoma and systemic thyroid disease in an older population: The Blue Mountains Eye Study. Eye (London) 18, 600–608 (2004).
Drance, S., Anderson, D. R. & Schulzer, M. Risk factors for progression of visual field abnormalities in normal-tension glaucoma. Am. J. Ophthalmol. 131, 699–708 (2001).
Leske, M. C. et al. Predictors of long-term progression in the early manifest glaucoma trial. Ophthalmology 114, 1965–1972 (2007).
Kaiser, H. J., Flammer, J., Graf, T. & Stumpfig, D. Systemic blood pressure in glaucoma patients. Graefes Arch. Clin. Exp. Ophthalmol. 231, 677–680 (1993).
Flammer, J. et al. The impact of ocular blood flow in glaucoma. Prog. Retin. Eye Res. 21, 359–393 (2002).
Gherghel, D. et al. Relationship between ocular perfusion pressure and retrobulbar blood flow in patients with glaucoma with progressive damage. Am. J. Ophthalmol. 130, 597–605 (2000).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Altmann, A., Tolosi, L., Sander, O. & Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics 26, 1340–1347 (2010).
Cava, W., Bauer, C., Moore, J. H. & Pendergrass, S. A. Interpretation of machine learning predictions for patient outcomes in electronic health records. AMIA Annu. Symp. Proc. 2019, 572–581 (2020).
Heijl, A. et al. Reduction of intraocular pressure and glaucoma progression: results from the Early Manifest Glaucoma Trial. Arch. Ophthalmol. 120, 1268–1279 (2002).
Kass, M. A. et al. The Ocular Hypertension Treatment Study: A randomized trial determines that topical ocular hypotensive medication delays or prevents the onset of primary open-angle glaucoma. Arch. Ophthalmol. 120, 701–830 (2002).
Lee, P. P., Walt, J. W., Rosenblatt, L. C., Siegartel, L. R. & Stern, L. S. Association between intraocular pressure variation and glaucoma progression: Data from a United States chart review. Am. J. Ophthalmol. 144, 901–907 (2007).
Nouri-Mahdavi, K. et al. Predictive factors for glaucomatous visual field progression in the Advanced Glaucoma Intervention Study. Ophthalmology 111, 1627–1635 (2004).
Rao, H. L. et al. Relationship between intraocular pressure and rate of visual field progression in treated glaucoma. J. Glaucoma 22, 719–724 (2013).
Leung, C. K. et al. Evaluation of retinal nerve fiber layer progression in glaucoma: a study on optical coherence tomography guided progression analysis. Investig. Ophthalmol. Vis. Sci. 51, 217–222 (2010).
Kim, Y. W., Lee, E. J., Kim, T. W., Kim, M. & Kim, H. Microstructure of beta-zone parapapillary atrophy and rate of retinal nerve fiber layer thinning in primary open-angle glaucoma. Ophthalmology 121, 1341–1349 (2014).
Lee, E. J. et al. Comparison between lamina cribrosa depth and curvature as a predictor of progressive retinal nerve fiber layer thinning in primary open-angle glaucoma. Ophthalmol. Glaucoma 1, 44–51 (2018).
Radius, R. L. & Anderson, D. R. The histology of retinal nerve fiber layer bundles and bundle defects. Arch. Ophthalmol. 97, 948–950 (1979).
Hood, D. C. et al. Retinal nerve fiber structure versus visual field function in patients with ischemic optic neuropathy. A test of a linear model. Ophthalmology 115, 904–910 (2008).
Kuchtey, J., Rezaei, K. A., Jaru-Ampornpan, P., Sternberg, P. Jr. & Kuchtey, R. W. Multiplex cytokine analysis reveals elevated concentration of interleukin-8 in glaucomatous aqueous humor. Investig. Ophthalmol. Vis. Sci. 51, 6441–6447 (2010).
Huang, P. et al. Serum cytokine alteration is associated with optic neuropathy in human primary open angle glaucoma. J. Glaucoma 19, 324–330 (2010).
Yang, X. et al. T-Lymphocyte subset distribution and activity in patients with glaucoma. Investig. Ophthalmol. Vis. Sci. 60, 877–888 (2019).
Wax, M. B. The case for autoimmunity in glaucoma. Exp. Eye Res. 93, 187–190 (2011).
Chen, H. et al. Commensal microflora-induced T cell responses mediate progressive neurodegeneration in glaucoma. Nat. Commun. 9, 3209 (2018).
Gramlich, O. W. et al. Adoptive transfer of immune cells from glaucomatous mice provokes retinal ganglion cell loss in recipients. Acta Neuropathol. Commun. 3, 56 (2015).
Wax, M. B. et al. Induced autoimmunity to heat shock proteins elicits glaucomatous loss of retinal ganglion cell neurons via activated T-cell-derived fas-ligand. J. Neurosci. 28, 12085–12096 (2008).
Coban, D. T., Ariturk, N., Elmacioglu, F. & Ulus, C. A. The relationship between glaucoma and serum vitamin B12, folic acid levels and nutrition. Acta Med. Mediterr. 31, 281–286 (2015).
Cumurcu, T., Sahin, S. & Aydin, E. Serum homocysteine, vitamin B 12 and folic acid levels in different types of glaucoma. BMC Ophthalmol. 6, 6 (2006).
Xu, F. et al. Homocysteine, B vitamins, methylenetetrahydrofolate reductase gene, and risk of primary open-angle glaucoma: A meta-analysis. Ophthalmology 119, 2493–2499 (2012).
Roedl, J. B. et al. Homocysteine levels in aqueous humor and plasma of patients with primary open-angle glaucoma. J. Neural Transm. (Vienna) 114, 445–450 (2007).
Türkyılmaz, K. et al. Evaluation of peripapillary retinal nerve fiber layer thickness in patients with vitamin B12 deficiency using spectral domain optical coherence tomography. Curr. Eye Res. 38, 680–684 (2013).
Firat, P. G., Demirel, E. E., Dikci, S., Kuku, I. & Genc, O. Evaluation of iron deficiency anemia frequency as a risk factor in glaucoma. Anemia 2018, 1456323. https://doi.org/10.1155/2018/1456323 (2018).
Karl, M. M. The serum alkaline phosphatase. JAMA 203, 591 (1968).
Latarya, G. et al. Human aqueous humor phosphatase activity in cataract and glaucoma. Investig. Ophthalmol. Vis. Sci. 53, 1679–1684 (2012).
Xue, W., Comes, N. & Borras, T. Presence of an established calcification marker in trabecular meshwork tissue of glaucoma donors. Investig. Ophthalmol. Vis. Sci. 48, 3184–3194 (2007).
Jono, S. et al. Phosphate regulation of vascular smooth muscle cell calcification. Circ. Res. 87, E10–E17 (2000).
Tanimura, A., McGregor, D. H. & Anderson, H. C. Calcification in atherosclerosis. I. Human studies. J. Exp. Pathol. 2, 261–273 (1986).
Magne, D. et al. Cartilage formation in growth plate and arteries: From physiology to pathology. BioEssays 27, 708–716 (2005).
Abramowitz, M. et al. Serum alkaline phosphatase and phosphate and risk of mortality and hospitalization. Clin. J. Am. Soc. Nephrol. 5, 1064–1071 (2010).
Matsumoto, Y. et al. Relationship between susceptibility to apoptosis and Fas expression in peripheral blood T cells from uremic patients: A possible mechanism for lymphopenia in chronic renal failure. Biochem. Biophys. Res. Commun. 215, 98–105 (1995).
Kottel, R. H. & Fishman, W. H. Developmental alkaline phosphatases as biochemical tumor markers. In Biochemical Markers for Cancer (ed. Chu, T. M.) 93–115 (Marcel Dekker, 1982).
Herz, F. Alkaline phosphatase isozymes in cultured human cancer cells. Experientia 41, 1357–1361 (1985).
Bone, G. & Lauder, I. Cellular immunity, peripheral blood lymphocyte count and pathological staging of tumours in the gastrointestinal tract. Br. J. Cancer 30, 215–221 (1974).
Ménétrier-Caux, C., Ray-Coquard, I., Blay, J. Y. & Caux, C. Lymphopenia in cancer patients and its effects on response to immunotherapy: An opportunity for combination with cytokines?. J. Immunother. Cancer 7, 85 (2019).
Hoyng, P. F., Greve, E. L., Frederikse, K., Geijssen, C. & Oosting, H. Platelet aggregation and glaucoma. Doc. Ophthalmol. 61, 167–173 (1985).
Matsumoto, M., Matsuhashi, H. & Nakazawa, M. Normal tension glaucoma and primary open angle glaucoma associated with increased platelet aggregation. Tohoku J. Exp. Med. 193, 293–299 (2001).
Hoyng, P. F., de Jong, N., Oosting, H. & Stilma, J. Platelet aggregation, disc haemorrhage and progressive loss of visual fields in glaucoma. A seven year follow-up study on glaucoma. Int. Ophthalmol. 16, 65–73 (1992).
Ma, Y. et al. Association between platelet parameters and glaucoma severity in primary open-angle glaucoma. J. Ophthalmol. 2019, 3425023. https://doi.org/10.1155/2019/3425023 (2019).
Pache, M. & Flammer, J. A sick eye in a sick body? Systemic findings in patients with primary open-angle glaucoma. Surv. Ophthalmol. 51, 179–212 (2006).
Watanabe, Y., Hamanaka, T., Takemura, T. & Murakami, A. Involvement of platelet coagulation and inflammation in the endothelium of Schlemm’s canal. Investig. Ophthalmol. Vis. Sci. 51, 277–283 (2010).
Lundberg, S. M. & Lee, S. I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems 4768–4777 (2017).
Rodriguez-Perez, R. & Bajorath, J. Interpretation of compound activity predictions from complex machine learning models using local approximations and shapley values. J. Med. Chem. 63, 8761–8777 (2020).
Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).
Supported by Seoul National University Bundang Hospital Research Fund (no. 14-2022-0024), the Patient-Centered Clinical Research Coordinating Center, funded by the Ministry of Health & Welfare, Republic of Korea (Grant Nos. HI19C0481, HC19C0276), and by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science, and Technology (No. 2016R1D1A1B02011696). The funding organizations had no role in the design or conduct of this research.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yoon, J.S., Kim, Ye., Lee, E.J. et al. Systemic factors associated with 10-year glaucoma progression in South Korean population: a single center study based on electronic medical records. Sci Rep 13, 530 (2023). https://doi.org/10.1038/s41598-023-27858-z