A literature-derived dataset on risk factors for dry eye disease

Dry eye disease (DED) is a common condition associated with disordered tear secretion. Research on risk factors for DED, such as depression, arthritis, thyroid disease, stroke and diabetes, is important for facilitating its diagnosis and prognosis. We created a publicly accessible dataset on risk factors for DED (DrDED) that provides up-to-date, validated data acquired from systematically searched and screened high-quality studies. The established DrDED contained 119 studies published between 2000 and 2022. The study sample sizes ranged from 43 to 4,871,504. The study types were as follows: cross-sectional (n = 92), retrospective cohort (n = 9), prospective cohort (n = 10), and case-control (n = 8) studies. Data collected from the eligible studies included the publication information, study characteristics, definition and prevalence of the disease, and risk factors for DED, together with the strength of association. As new relevant studies are published, the DrDED will be updated, and the data will be made accessible to users.

Design Type(s): Dataset creation objective
Measurement Type(s): Patient outcome • scientific publication • risk factors • dry eye disease
Technology Type(s): Digital curation • documenting • meta-analysis
Factor Type(s): Depression • arthritis • thyroid disease • stroke disease • diabetes
Sample Characteristic(s): Homo sapiens • dry eye disease • global


Background & Summary
Dry eye disease (DED) is a chronic inflammatory condition characterized by reduced tear film stability that affects the ocular surface 1,2. DED occurs when any of the three layers of a healthy tear film (aqueous fluid, mucus, and fatty oils) is affected; as a result, the surface of the eye is inadequately lubricated and becomes less smooth and clear 3.
The prevalence of DED is high among ophthalmic conditions and varies between 14.6% and 30% 4, depending on the diagnostic method used and the population characteristics 5. A recent systematic review revealed that one in five Asians experiences DED 6. DED is considered a disease of multifactorial etiology and is associated with the presence of certain systemic comorbidities 7. Findings on DED etiology and pathology indicate that patients with different DED etiologies may present with different signs and symptoms 8. Patients who smoke or have comorbidities have been reported to experience more severe symptoms of DED than patients in a control group 9. The potential factors that might increase the incidence of DED include primary lacrimal gland diseases, rheumatoid arthritis, ultraviolet ray exposure, improper wearing of contact lenses, older age, female sex, smoking, and some medicines, such as tricyclic antidepressants and antihypertensive drugs 3,6,10-12.
Risk factors for DED can be classified as modifiable or non-modifiable, and further graded into three categories: consistent, probable and inconclusive 13. For example, alcohol drinking, a modifiable factor, is considered inconclusive in terms of its association with DED. In contrast, Asian ethnicity is considered an essential, non-modifiable and consistent risk factor for DED 14. The increase in the incidence of DED observed in recent years may be associated with growing exposure to risk factors in the general population, such as increased sunlight exposure, smoking, and aging 12,15. Examining DED risk factors and the strength of their association with DED can provide valuable information for the prevention and treatment of DED.
The present study established a dataset on risk factors for dry eye disease (DrDED 16), which comprises published literature from high-quality clinical studies that reported DED risk factors and their strength of association. At present, the DrDED contains the following data obtained from 119 studies: publication details, the geographic regions from which the study subjects were recruited, the number and demographic characteristics of the study subjects, the DED risk factors under investigation, and the strength of association with the disease in the form of statistical measures, including odds ratios (ORs) and confidence intervals (CIs).

www.nature.com/scientificdata

Methods
Overview and literature searches. The present study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines 10. The clinical research question was formulated first: what are the risk factors for DED, and how strong is their association with the disease? The preparation stage of the DrDED development included a systematic literature search, the establishment of the inclusion and exclusion criteria, and study screening through the evaluation of the eligibility of each study. Next, the DrDED was created by acquiring data relevant to the research question from the eligible studies and entering these data in a Microsoft Excel spreadsheet. After data validation, the dataset was ready for use by researchers with access to the DrDED. All stages were conducted with a focus on the clinical research question (Fig. 1).
A systematic search was performed using the following Medical Subject Headings terms and free-text terms: "dry eye syndrome", "dry eye disease", "dry eye", "evaporative dry eye disease or syndrome", "population at risk", "risk factor", "risk score", and "health correlate". Terms within the disease group and within the risk factor group were combined with "OR", and the disease terms were then combined with the risk factor terms. The Medline (since 1946), Embase (since 1974), and Cochrane Central Register of Controlled Trials (since 1996) electronic databases were searched from inception to September 2022. No restriction was placed on the type of study. To ensure that no eligible study was missed, the references of previously published systematic reviews 11 and of the included studies were checked.
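The search logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual search string: the function names are hypothetical, the exact syntax differs between Medline, Embase and Cochrane interfaces, and joining the two term groups with "AND" is an assumption, since the text names only the "OR" operator explicitly.

```python
# Search terms quoted in the text, split into a disease group and a risk factor group.
DISEASE_TERMS = [
    "dry eye syndrome",
    "dry eye disease",
    "dry eye",
    "evaporative dry eye disease or syndrome",
]
RISK_TERMS = [
    "population at risk",
    "risk factor",
    "risk score",
    "health correlate",
]


def or_group(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(disease_terms, risk_terms):
    """Combine the two OR-groups into one boolean query string."""
    # ASSUMPTION: the groups are joined with AND; the source does not state this operator.
    return or_group(disease_terms) + " AND " + or_group(risk_terms)


query = build_query(DISEASE_TERMS, RISK_TERMS)
print(query)
```

Keeping the two term lists separate makes it easy to add synonyms to one group without touching the other.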
Inclusion and exclusion criteria. The following inclusion criteria were applied in the screening of the titles, abstracts and full-text articles: any type of primary patient research on risk factors or predictors for DED, including cross-sectional, case-control and cohort studies; DED was diagnosed according to the United States National Eye Institute/Industry Workshop criteria 12; data on risk factors for DED were provided; appropriate methods of study quality control were reported; a between-group comparability analysis was performed for the baseline data; confounding variables were adjusted for in the calculation of ORs for the studied factors; and complete study and participant information (author/s, year of publication, country, participant age, and gender) was available. Articles that reported duplicate data from the same study were excluded.
Risk of bias assessment. Two independent researchers assessed the risk of bias of each included study using the Newcastle-Ottawa Scale (NOS), as recommended by the Cochrane Handbook for Systematic Reviews. The NOS consists of eight items in three domains (selection, comparability and outcomes), and the total score ranges from 0 to 9. The studies were rated as follows: low risk of bias (high-quality study) for an NOS score of 7-9; high risk of bias for an NOS score of 4-6; and very high risk of bias (very low-quality study) for an NOS score of 0-3 17.
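The rating bands above map directly onto a small lookup function. The following is a minimal sketch; the function name and the returned labels are illustrative choices, not part of the DrDED.

```python
def nos_risk_of_bias(score: int) -> str:
    """Map a total Newcastle-Ottawa Scale score (0-9) to the rating bands in the text."""
    if not 0 <= score <= 9:
        raise ValueError(f"NOS total score must be between 0 and 9, got {score}")
    if score >= 7:
        return "low"        # 7-9: low risk of bias (high-quality study)
    if score >= 4:
        return "high"       # 4-6: high risk of bias
    return "very high"      # 0-3: very high risk of bias (very low-quality study)


# Print the band for every possible total score.
for total in range(10):
    print(total, nos_risk_of_bias(total))
```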

Data Records
As presented in Fig. 2, among the 3,300 citations found in the extensive search, 210 articles were identified after screening the titles and abstracts. After reviewing the full texts, a total of 119 relevant articles published between 1997 and 2022 met the study inclusion criteria 16. The types of studies were as follows: cross-sectional (n = 92), case-control (n = 8), retrospective cohort (n = 9), and prospective cohort (n = 10) studies. The study sample sizes ranged from 43 to 4,871,504. The reported risk factors and their strength of association with the disease (measured by multiple logistic regression models and reported as ORs with 95% CIs) were acquired from the eligible studies and included in the DrDED 16. Table 1 presents the structure of the DrDED. The data obtained from all eligible studies, according to the clinical research question, were entered in the database 16.
A total of 34 risk factors were reported in the DrDED, including thyroid disease, arthritis, hypertension, diabetes, stroke disease, smoking status, alcohol drinking status, depression, age (per 10 years as a continuous variable, or as categorical variables, as reported in the original studies), gender, caffeine use, antihistamine use, visual display terminal use, contact lens use, body mass index, site/geographical region, gout history, dyslipidemia, multivitamin use, pterygium, occupation or work activity environment, allergy, hormone replacement therapy, corticosteroid use, cataract or cataract surgery, glaucoma, residence at high altitude, under-correction of refractive error/refractive surgery, residence (rural or urban), benign prostatic hyperplasia, rosacea, diuretics, migraine headache, and sleep disorders.
To illustrate the use of the DrDED, a subset of 41 representative studies published between January 2000 and April 2021, each reporting one or more of the eight common risk factors for DED, was selected. In these studies, the participants were recruited from the same geographic location where the respective studies were conducted, and the baseline characteristics of the participants in the case and control groups were comparable. When the 95% CI of the OR did not include the no-effect value of 1, a significant association with DED was indicated, and the factor was considered a risk factor for DED.
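The significance rule above amounts to checking whether the 95% CI excludes 1. A minimal sketch follows; the function name is hypothetical, and the numeric values are invented for illustration only and are not taken from the DrDED.

```python
def is_significant(ci_lower: float, ci_upper: float, no_effect: float = 1.0) -> bool:
    """Return True when the confidence interval excludes the no-effect value."""
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    return not (ci_lower <= no_effect <= ci_upper)


# Illustrative (made-up) OR estimates as (odds ratio, CI lower bound, CI upper bound).
examples = {
    "factor A": (1.80, 1.20, 2.70),  # interval lies entirely above 1
    "factor B": (1.10, 0.85, 1.42),  # interval contains 1
}
for name, (odds_ratio, low, high) in examples.items():
    print(name, odds_ratio, is_significant(low, high))
```

Note that an interval lying entirely below 1 also passes this check; in that case the OR indicates reduced rather than increased odds of DED, so the direction of the estimate should be read alongside the test.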

Technical Validation
The gold-standard two-step process, in which two independent reviewers work separately and then reach consensus on their results, was strictly implemented throughout the creation of the DrDED. Two experienced reviewers with ophthalmologic expertise screened the titles and abstracts, screened the full texts, and independently acquired the data from eligible studies (Fig. 1). Each datapoint in the documents of the two reviewers was manually compared and validated using the statistical software Review Manager (RevMan, Version 5.4, the Cochrane Collaboration, 2020). The accuracy of the extracted sample sizes and of the OR and 95% CI values was examined by generating a forest plot in both numeric and graphic forms. If any discrepancy or outlier was identified in the data reported by the two reviewers, the full text was carefully reviewed, and the data were corrected accordingly. Any disagreement on study eligibility was resolved through team discussion. Furthermore, an external methodologist conducted the final data validation by checking the study flow, logic and accuracy of all data.

Table 1. Structure of the DrDED. Note: DrDED, dataset on risk factors for dry eye disease; ID, identifier; DED, dry eye disease; OR, odds ratio; CI, confidence interval. * The data were reported as categorical variables, unless specified. ** When subgroup data are available, they will be entered in the DrDED in multiple rows, and the study ID will be labelled with the number and information of the subgroups.
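The datapoint-by-datapoint comparison of the two reviewers' extractions described under Technical Validation could be automated along the following lines. This is a sketch under stated assumptions, not the authors' procedure (the text describes manual comparison with RevMan): the record layout, field names and study identifiers are hypothetical, and the values are invented.

```python
import math


def find_discrepancies(reviewer_a, reviewer_b, rel_tol=1e-9):
    """Compare two extraction dicts keyed by (study_id, risk_factor).

    Returns a list of (key, field, value_a, value_b) tuples for every mismatch.
    """
    discrepancies = []
    for key in sorted(set(reviewer_a) | set(reviewer_b)):
        rec_a, rec_b = reviewer_a.get(key), reviewer_b.get(key)
        if rec_a is None or rec_b is None:
            # One reviewer did not extract this study/factor pair at all.
            discrepancies.append((key, "missing record", rec_a, rec_b))
            continue
        for field in sorted(set(rec_a) | set(rec_b)):
            val_a, val_b = rec_a.get(field), rec_b.get(field)
            if isinstance(val_a, (int, float)) and isinstance(val_b, (int, float)):
                same = math.isclose(val_a, val_b, rel_tol=rel_tol)
            else:
                same = val_a == val_b
            if not same:
                discrepancies.append((key, field, val_a, val_b))
    return discrepancies


# Hypothetical extractions; field names and numbers are illustrative only.
a = {("S001", "diabetes"): {"n": 500, "or": 1.50, "ci_lower": 1.10, "ci_upper": 2.05}}
b = {("S001", "diabetes"): {"n": 500, "or": 1.50, "ci_lower": 1.01, "ci_upper": 2.05}}
for item in find_discrepancies(a, b):
    print(item)
```

Each flagged tuple points back to a study and field, so that the full text can be re-read and the value corrected as the validation process requires.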