Introduction

As the global burden of hepatic fibrosis continues to rise, there is a growing consensus among liver societies and guidelines to recommend screening for advanced hepatic fibrosis in specific patient groups1, such as those with obesity, chronically elevated liver enzymes, type 2 diabetes, and metabolic syndrome, as well as in individuals with fatty liver2,3,4,5. These patient groups are referred to as the 'at-risk population' for non-alcoholic fatty liver disease (NAFLD)-related fibrosis. Current guidelines advocate the use of the fibrosis-4 index (FIB-4) or the NAFLD fibrosis score (NFS) as initial screening tools to identify individuals at high risk of advanced hepatic fibrosis within this at-risk population6,7.

Indeed, the 'at-risk group' not only faces a higher risk of liver disease but also experiences extrahepatic adverse outcomes, including cardiovascular diseases and malignancies8,9,10. Therefore, it is crucial to develop a comprehensive risk assessment algorithm that can effectively evaluate the risk of mortality related to liver and cardiovascular diseases and extrahepatic malignancies. This approach enables a more holistic evaluation of the risk profile of at-risk individuals, thereby aiding in early detection and targeted interventions for better patient outcomes. Previous research has demonstrated that employing high cutoff values of FIB-4 and NFS tests can be effective in discriminating cardiovascular and overall mortality, in addition to assessing liver disease, in patients with NAFLD11. However, in real-world practice, non-invasive tests (NITs) are primarily used with a low cut-off as a rule-out strategy when identifying high-risk groups. To date, there is a lack of data on whether a low cut-off of NITs, which are used as the first tier for screening advanced hepatic fibrosis in the general population, can provide a holistic evaluation of high-risk groups, including mortality rates due to cardiovascular diseases and other non-liver-related conditions.

The steatosis-associated Fibrosis Estimator (SAFE) score not only demonstrated superior diagnostic performance in identifying subjects with significant fibrosis compared to FIB-4 and NFS but also exhibited a strong correlation with overall mortality in the general population. These findings suggest that the SAFE score has the potential to effectively stratify populations at risk of hepatic fibrosis and predict adverse clinical outcomes, particularly in primary care settings or populations with a low prevalence of advanced hepatic fibrosis12,13. However, further validation in other ethnic groups and studies related to hepatic and extrahepatic outcomes are necessary to establish its broader applicability. Additionally, future investigations should focus on assessing the ability of NITs to evaluate the overall mortality risk in high-risk populations, considering cardiovascular diseases, extrahepatic malignancies, and liver disease. Such comprehensive assessments will aid better risk stratification and early intervention strategies for improved patient management.

To date, few studies have explored the efficacy of various NITs as first-tier screening tools to identify high-risk groups in community cohorts, particularly concerning liver-related and hard cardiovascular outcomes. Our study aimed to investigate whether a low NIT cutoff could offer a comprehensive assessment of both hepatic and cardiovascular hard outcomes in a community cohort with a low prevalence of liver fibrosis.

Methods

Characteristics of cohort

The Cardiovascular Disease Association Study (CAVAS) was established as a part of the Korea Genomic Epidemiology Study (KoGES), which is a nationwide prospective cohort study led by the Korea Disease Control and Prevention Agency (KDCA)14. The KoGES-CAVAS study included six rural areas: the Multi-Rural Communities cohort (MRCohort) in Yangpyeong, Namwon, and Goryeong; the ARIRANG in Wonju and Pyeongchang; and the Kangwha cohort. These three cohorts were initially separate but were later combined into the CAVAS with a standardized protocol starting in 2008. A total of 21,715 participants who provided written informed consent were recruited between January 2005 and December 2011. Further information on the research design can be found in a previous study15. This study was conducted in accordance with the Declarations of Helsinki and Istanbul and was approved by the Institutional Review Board of Hanyang University Hospital (IRB No. HY-2022-11-012).

Follow up and mortality

Follow-up visits were conducted every 2–4 years from 2007 to 2017. The cause and time of death were determined as of December 2022 by linking the non-identifying information of the cohort patients with death statistics obtained from the Korea Statistical Information Service (KOSIS). Information on the cause of death was obtained from the ICD10-based diagnosis at the time of death. The causes of death, including cardiac, liver, and extrahepatic malignancies, and their matched ICD10 codes are detailed in Supplementary Table 1.

Inclusion and exclusion criteria

The inclusion criteria were as follows: (1) age of 40 years or older at the beginning of the observation period; (2) availability of information on the ICD10-based diagnosis of death (Supplementary Table 1); and (3) at least 1 year of medical history and 2 years of follow-up. The exclusion criteria were as follows: (1) loss to follow-up (n = 9); and (2) inappropriate information about BMI (n = 16), laboratory variables for calculating the three noninvasive tests (NITs) (liver enzymes [n = 35], platelets [n = 1050], globulin [n = 1824]), or diagnosis of metabolic syndrome (n = 70) (Fig. 1).

Figure 1

Study flowchart. BMI body mass index, FIB-4 fibrosis-4 index, LFT liver function test, SAFE steatosis-associated fibrosis estimator.

Target population of analysis

A total of 24,000 participants were included in the community-based rural cohorts. After applying the exclusion criteria, a total of 13,130 participants who had any risk factor, namely fatty liver (hepatic steatosis index [HSI] > 36), two or more metabolic abnormalities for diagnosing metabolic syndrome, diabetes mellitus, or abnormal liver enzymes, were selected as the at-risk population and analyzed. The mean follow-up was 12.4 years.

Clinical parameters

History of hypertension, diabetes, or dyslipidemia, intake of corresponding medications for these conditions, and social history of alcohol drinking status were obtained from the questionnaires. Alcohol drinking status was categorized as non-, past, or current drinker. The anthropometric measurements included waist circumference, blood pressure, height, weight, total fat mass, and lean mass. Additionally, fasting serum glucose, total cholesterol, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, triglycerides, total protein, albumin, AST, ALT, and γ-glutamyl transferase levels were measured.

Calculation of NITs (SAFE score, FIB-4 and NFS)

FIB-4 and NFS were calculated, and their cut-off values were selected based on a study by McPherson et al.16. In subjects aged over 65 years, the low/high cut-off values of FIB-4 and NFS were 2.0/2.67 and 0.12/0.676, respectively. In subjects aged less than 65 years, the low/high cut-off values of FIB-4 and NFS were 1.3/2.67 and -1.455/0.676, respectively. The SAFE score was calculated, and its cut-off values were selected based on the study by Sripongpun et al.12; the low/high cut-off values of the SAFE score were 0/100. If the NIT score (FIB-4, NFS, or SAFE score) was lower than the low cut-off value, the subject was classified into the low-risk group. If the score was between the low and high cut-off values, the subject was assigned to the intermediate-risk group. If the score was higher than the high cut-off value, the subject was assigned to the high-risk group.
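The three-tier classification described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function and dictionary names are hypothetical, the age split is assumed to fall at 65 years or older, and scores exactly equal to a cut-off are assumed to fall in the intermediate tier. The FIB-4 expression is the commonly published formula (age × AST / (platelets × √ALT)), which the text itself does not restate.

```python
import math

# Low/high cut-offs as given in the text. Assumption: "aged over 65" is
# implemented as age >= 65; the text does not say where exactly 65 falls.
CUTOFFS = {
    "FIB-4": {"under65": (1.3, 2.67), "65plus": (2.0, 2.67)},
    "NFS": {"under65": (-1.455, 0.676), "65plus": (0.12, 0.676)},
    "SAFE": {"under65": (0.0, 100.0), "65plus": (0.0, 100.0)},
}


def fib4(age_years, ast_iu_l, alt_iu_l, platelets_10e9_l):
    """Commonly published FIB-4 formula (not restated in the text)."""
    return age_years * ast_iu_l / (platelets_10e9_l * math.sqrt(alt_iu_l))


def classify(test, score, age_years):
    """Return 'low', 'intermediate', or 'high' for a given NIT score."""
    band = "65plus" if age_years >= 65 else "under65"
    low, high = CUTOFFS[test][band]
    if score < low:
        return "low"
    if score > high:
        return "high"
    # Assumption: values equal to a cut-off count as intermediate.
    return "intermediate"


if __name__ == "__main__":
    score = fib4(age_years=50, ast_iu_l=40, alt_iu_l=25, platelets_10e9_l=200)
    print(score, classify("FIB-4", score, age_years=50))
```

The same FIB-4 value can land in different tiers depending on age, which is the point of the age-specific low cut-off: a score of 1.5 is intermediate at age 50 but low at age 70.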

Disease definition of fatty liver, metabolic syndrome, and at-risk group

The ‘at risk group’ was defined as individuals with any of the following risk factors: fatty liver, two or more metabolic abnormalities, diabetes mellitus, or abnormal liver function test results (serum aspartate transaminase [AST] > 40 IU/L or serum alanine transaminase [ALT] > 40 IU/L)3. Metabolic risk factors for diagnosing metabolic syndrome were defined as follows17: (1) waist circumference ≥ 85 cm for women and ≥ 90 cm for men, (2) blood pressure ≥ 130/85 mmHg and/or medication history of anti-hypertensive medications, (3) serum triglycerides ≥ 150 mg/dL, (4) high-density lipoprotein cholesterol < 50 mg/dL for women and < 40 mg/dL for men, and (5) fasting glucose level ≥ 100 mg/dL with HbA1c ≥ 5.7% and/or medication history of anti-diabetes medications. Metabolic syndrome was defined as having three or more metabolic risk factors. This study focused on comparing the performance of various NITs for identifying individuals at high risk of death from various causes within a community-based population at risk of hepatic fibrosis, extending beyond patients with NAFLD. Unlike the typical approach to diagnosing NAFLD, this study considered all individuals with fatty liver as part of the at-risk population regardless of their alcohol drinking status.

The hepatic steatosis index was calculated to identify the patients with fatty liver disease18. Subjects with fatty liver were defined as those with a hepatic steatosis index > 36.
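The definitions in this section can be expressed as a short Python sketch. It is illustrative only and makes several labeled assumptions: the `Subject` fields and function names are hypothetical, the blood pressure criterion is read as systolic ≥ 130 or diastolic ≥ 85 mmHg, the glucose criterion is read as (fasting glucose ≥ 100 mg/dL and HbA1c ≥ 5.7%) or anti-diabetes medication, and the hepatic steatosis index is taken as a precomputed input.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    female: bool
    waist_cm: float
    systolic: float          # mmHg
    diastolic: float         # mmHg
    on_bp_meds: bool
    triglycerides: float     # mg/dL
    hdl: float               # mg/dL
    fasting_glucose: float   # mg/dL
    hba1c: float             # %
    on_dm_meds: bool
    diabetes: bool
    ast: float               # IU/L
    alt: float               # IU/L
    hsi: float               # hepatic steatosis index, precomputed


def metabolic_abnormalities(s):
    """Count the five metabolic risk factors listed in the text."""
    flags = [
        s.waist_cm >= (85 if s.female else 90),
        # Assumption: either pressure component meeting its threshold counts.
        s.systolic >= 130 or s.diastolic >= 85 or s.on_bp_meds,
        s.triglycerides >= 150,
        s.hdl < (50 if s.female else 40),
        # Assumption: one reading of "glucose with HbA1c and/or medication".
        (s.fasting_glucose >= 100 and s.hba1c >= 5.7) or s.on_dm_meds,
    ]
    return sum(flags)


def has_metabolic_syndrome(s):
    return metabolic_abnormalities(s) >= 3


def is_at_risk(s):
    """Any one of: fatty liver, >= 2 abnormalities, diabetes, abnormal LFT."""
    return (
        s.hsi > 36
        or metabolic_abnormalities(s) >= 2
        or s.diabetes
        or s.ast > 40
        or s.alt > 40
    )
```

Note that the at-risk definition is deliberately broader than metabolic syndrome: two abnormalities suffice, and an isolated enzyme elevation or a hepatic steatosis index above 36 is enough on its own.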

Statistical analyses

Continuous and categorical variables are presented as mean ± standard deviation and numbers and percentages, respectively. Continuous variables were analyzed using the Student’s independent t-test, and categorical variables were analyzed using the chi-square test. The areas under the receiver operating characteristic curves (AUROC) of FIB-4, NFS, and SAFE scores for predicting hard clinical outcomes were compared using DeLong’s test in MedCalc (version 20; MedCalc Software Ltd., Ostend, Belgium). Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) at low cutoff values were assessed. The incidence rates of the various mortalities were calculated by dividing the total number of cases by the observation time and were expressed per 1000 person–years. Curves of 100 − survival probability (%) versus years of follow-up were generated using the Kaplan–Meier method. Univariate and multivariate Cox regression results for clinical outcomes such as all-cause, cardiac, liver, and extrahepatic malignancy mortality according to grade of NITs were reported as hazard ratios (HR) and 95% confidence intervals (95% CI). The models were adjusted for sex, presence of hypertension, high triglyceride levels, and low HDL levels. The presence of diabetes was additionally adjusted for when estimating the adjusted HR for various deaths according to the grade of FIB-4. Statistical analyses were performed using SPSS software (version 26.0; IBM Corp., Armonk, NY, USA). Statistical significance was set at P < 0.05.
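As an illustration of two of the quantities defined above, the following Python sketch computes an incidence rate per 1000 person-years and the four diagnostic metrics from a 2 × 2 table. The study itself used SPSS and MedCalc; the function names and all example counts here are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def incidence_rate_per_1000_py(cases, person_years):
    """Number of events divided by observation time, scaled to 1000 person-years."""
    return 1000.0 * cases / person_years


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from a 2 x 2 table.

    'Positive' means a score above the low cut-off (rule-in); the outcome is
    death from the cause of interest.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(incidence_rate_per_1000_py(cases=12, person_years=8000))
    print(diagnostic_metrics(tp=30, fp=70, fn=10, tn=890))
```

With a rare outcome, as in the hypothetical table, NPV stays high almost regardless of the test, so sensitivity is the more informative figure when comparing rule-out cut-offs.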

Results

Baseline characteristics of ‘at-risk population’

In total, 70.2% (13,130/18,711) of the KoGES-CAVAS cohort was identified as ‘at-risk’ (Fig. 1). The prevalence of fatty liver, two or more metabolic abnormalities, diabetes mellitus, and abnormal liver function in this population was 24.4%, 65.7%, 12.3%, and 29.7%, respectively (Table 1). Those in the ‘at risk group’ were older and had more unfavorable metabolic profiles, such as higher body mass index (BMI), waist circumference, blood pressure, and serum triglyceride and glucose levels, and higher liver enzymes, compared with those not at risk. When the low cutoffs of the three NITs were used as the threshold for further evaluation, 27.7% (FIB-4), 23.0% (NFS), and 51.8% (SAFE score) of the at-risk population required a second-step test.

Table 1 Clinical characteristics of community based rural cohort according to the presence of the risk for hepatic fibrosis.

Predictive value of cardiovascular mortality using low cut-off among various NITs

During a median follow-up of 12.3 years, 1626 of the 13,130 individuals in the at-risk population died (Table 1). Of the total deaths, cardiovascular-related deaths accounted for 14.8% (240/1626), liver disease-related deaths for 7.0% (114/1626), and extrahepatic malignancy-related deaths for 28.0% (455/1626). The high-risk group selected by the high cutoff of all NITs (FIB-4, NFS, and SAFE score) showed the highest incidence rates for all types of mortality (cardiovascular, liver, and extrahepatic malignancy-related deaths). However, the intermediate-risk group, with scores between the low and high cutoffs of FIB-4 and NFS, did not show significant differences in the incidence rates of cardiovascular and extrahepatic malignancy-related deaths compared to the low-risk group (Fig. 2A). Only the SAFE classification showed a consistently increasing trend in incidence rates across all types of mortality.

Figure 2

Hard outcome events during the follow-up period. (A) Incidence rate (number of cases/1000 person-years) of overall mortality, cardiac mortality, liver mortality, and mortality from extrahepatic malignancy in the at-risk population according to the tiers of FIB-4, NFS, and SAFE score. The proportion of the low-risk and intermediate/high-risk groups identified by FIB-4 (B), NFS (C), and SAFE score (D) for each cause of death. FIB-4 fibrosis-4 index, Int intermediate, NAFLD nonalcoholic fatty liver disease, NFS NAFLD fibrosis score, SAFE score steatosis-associated fibrosis estimator score.

The low cutoff of the SAFE score can minimize the number of missed patients in all types of mortality

In practice, NITs are used to rule risk out or in by applying a low cut-off value. Of the total deaths caused by liver problems, the proportion in the group below the low cutoff value (low-risk group) was lower than that in the group above the low cutoff value (intermediate- or high-risk group) for all NITs (Fig. 2B–D). However, the proportion of cardiovascular deaths in the group below the low cutoff (low-risk group) was higher than that in the group above the low cutoff (intermediate- or high-risk group) for both FIB-4 (70.4% vs. 29.6%) and NFS (78.8% vs. 21.3%) (Fig. 2B,C). Similarly, the proportion of extrahepatic malignancy-related deaths in the group below the low cutoff was also higher than that in the group above the low cutoff for both FIB-4 (64.8% vs. 35.2%) and NFS (79.8% vs. 20.2%). Only for the SAFE score were the proportions of cardiovascular (26.7% vs. 73.3%) and extrahepatic malignancy-related (33.8% vs. 66.2%) deaths in the group below the low cutoff lower than those in the group above the cutoff (Fig. 2D). The low cutoff values of both FIB-4 and NFS did not effectively distinguish the risk of all-cause, cardiovascular, and extrahepatic malignancy mortality between the 'rule-out' and 'rule-in' groups. However, the SAFE score consistently demonstrated a clear trend across all mortality types, with a higher number of deaths in the 'rule-in' group than in the 'rule-out' group.

Diagnostic performance of three NITs for prediction of various mortalities

There were no differences in the AUROCs for predicting cardiac- and liver-related mortality among the three types of NITs (Table 2; Supplementary Table 2). For predicting overall mortality, the AUROC was highest for FIB-4 (0.688, 95% CI 0.674–0.702), followed by the SAFE score (0.678, 95% CI 0.664–0.692) and NFS (0.659, 95% CI 0.645–0.674) (P values, FIB-4 vs. SAFE score: 0.026; FIB-4 vs. NFS: < 0.001; SAFE score vs. NFS: 0.001). All NITs showed a distinctively higher AUROC for predicting liver mortality than for predicting other mortalities. The SAFE score exhibited a higher sensitivity for predicting various types of mortality than the FIB-4 or NFS scores. Specifically, it demonstrated 2–3 times higher sensitivity for overall mortality (72.1% for SAFE score vs. 36.7% for FIB-4 or 25.3% for NFS) and cardiac mortality (73.3% for SAFE score vs. 29.6% for FIB-4 or 21.3% for NFS). Positive predictive value (PPV) and negative predictive value (NPV) were comparable across all causes of death for the three types of NITs when the low cut-offs were applied.
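Because sensitivity at the low cut-off is the share of deaths that fall above that cut-off, the share of deaths left in the 'rule-out' group is simply 100 minus the sensitivity. The short Python sketch below works through that arithmetic using only the sensitivities reported in this paragraph; the dictionary and function names are hypothetical, and the results may differ from the figure panels in the last digit because the published percentages are rounded.

```python
# Sensitivities (%) at the low cut-off, as reported in the text.
SENSITIVITY = {
    "overall": {"SAFE": 72.1, "FIB-4": 36.7, "NFS": 25.3},
    "cardiac": {"SAFE": 73.3, "FIB-4": 29.6, "NFS": 21.3},
}


def missed_share(sensitivity_pct):
    """Share (%) of deaths below the low cut-off, i.e. 100 - sensitivity."""
    return round(100.0 - sensitivity_pct, 1)


if __name__ == "__main__":
    for outcome, by_test in SENSITIVITY.items():
        for test, sens in by_test.items():
            print(f"{outcome:8s} {test:6s} sensitivity={sens:5.1f}% "
                  f"missed={missed_share(sens):5.1f}%")
```

Read this way, a low cut-off FIB-4 would leave roughly seven in ten cardiac deaths in the group that is not referred for further testing, compared with roughly one in four for the SAFE score.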

Table 2 Predictive ability of FIB-4, NFS and SAFE score for mortality due to various causes by using their low cut-off values in the at-risk population.

Risk assessment for various mortalities by the three NITs

The survival curve showed the highest mortality rate in individuals above the high cutoff, regardless of the NIT type or cause of death (Supplementary Fig. 1). However, when we utilized the low cut-off as a 'rule-out' strategy, as commonly practiced, there was no significant difference in cardiovascular mortality between the 'rule-out' (low-risk) group and the 'rule-in' (intermediate- or high-risk) group when using FIB-4 or NFS classification (Fig. 3). For NFS, no distinction in overall and extrahepatic malignancy mortality was observed between the ‘rule-out’ and ‘rule-in’ groups. Only the SAFE score could effectively differentiate between the low- and intermediate-risk groups across all types of mortality; there was no overlap area in survival curves between low-risk and intermediate- or high-risk groups. The results of both univariate and multivariate analyses consistently demonstrated that only the SAFE score was able to differentiate the risk of all kinds of mortalities between the low-risk ('rule-out') and intermediate- or high-risk ('rule-in') groups (all P values for HR < 0.001) (Supplementary Table 3, Fig. 4).
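For readers unfamiliar with how the '100 − survival probability' curves are built, the following is a minimal pure-Python sketch of the Kaplan–Meier product-limit estimator. It is not the authors' analysis, which was run in statistical software; the function name and the toy follow-up data in the usage lines are hypothetical.

```python
def kaplan_meier_mortality(times, events):
    """Return [(time, cumulative mortality %)] at each time with >= 1 death.

    times  : follow-up time for each subject (e.g. years)
    events : 1 if the subject died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, 100.0 * (1.0 - survival)))
        # Deaths and censored subjects both leave the risk set after t.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical toy data: four subjects, one censored at year 2.
    print(kaplan_meier_mortality([1, 2, 3, 4], [1, 0, 1, 1]))
```

Computing one such curve per NIT tier and plotting them together gives the kind of display described for Fig. 3; non-overlapping curves between the low-risk and the intermediate- or high-risk tiers are what the text refers to as effective differentiation.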

Figure 3

Mortality curve (100-survival probability) from overall, cardiac, liver, and extrahepatic malignancy according to tiers of three NITs. ‘100-survival probability’ versus years of follow-up graphs were generated by the Kaplan–Meier method. FIB-4 fibrosis-4 index, Int intermediate, NAFLD nonalcoholic fatty liver disease, NFS NAFLD fibrosis score, NITs noninvasive tests, SAFE score steatosis-associated fibrosis estimator score.

Figure 4

Adjusted hazard ratio (aHR) for various mortalities according to tiers of three NITs. Various mortalities include overall death (A), cardiac death (B), liver death (C), and death from extrahepatic malignancy (D). Statistical analyses were performed using Cox regression and results were reported as aHR and 95% CI. All models were adjusted for sex, presence of hypertension, high triglyceride level, and low HDL level. The presence of diabetes was additionally adjusted for when estimating the aHR for various deaths according to tiers of FIB-4. CI confidence interval, FIB-4 fibrosis-4 index, HR hazard ratio, Int intermediate, NAFLD nonalcoholic fatty liver disease, NFS NAFLD fibrosis score, NITs noninvasive tests, SAFE score steatosis-associated fibrosis estimator score.

The diagnostic performance of SAFE score in older populations

As individuals progressed from their 40s to over 70 years of age, the proportion of those identified as the intermediate- or high-risk group (SAFE score > 0) also increased from 19.4% to 79.8% (Supplementary Table 4). In other words, approximately 80% of individuals over the age of 70 may be classified as positive for the test. Although the PPV increased with age (Supplementary Table 5), the absolute number of false positives increased only in older populations. Furthermore, the diagnostic performance of the SAFE score for overall mortality showed a low specificity and NPV of 20.8% and 64.1%, respectively, for those over 70 years of age (Supplementary Table 5). This finding suggests that the SAFE score has a lower ability to rule out clinical outcomes in older populations. In addition, the AUROCs for various mortality rates decreased with age. Taken together, the SAFE score showed low diagnostic performance for various mortalities in the older populations.

Discussion

To the best of our knowledge, this is the first study to compare the performance of various NITs using a low cutoff value to identify individuals at high risk of cardiovascular and extrahepatic malignancy-related mortality within a community-based at-risk group, in addition to liver-related hard outcomes. The SAFE score appears to enable a more comprehensive evaluation of the risk profile of at-risk individuals for mortality related to liver, cardiovascular diseases, and extrahepatic malignancies. The FIB-4 and NFS scores demonstrated good predictive capabilities for cardiovascular and extrahepatic malignancy-related mortality when higher cut-off values were used. However, the low cut-off values for FIB-4 and NFS did not effectively distinguish cardiovascular mortality in the intermediate- or high-risk groups from that in the low-risk group. Considering that low cutoff values are commonly used in clinical practice for various noninvasive tests to exclude high-risk individuals2,3,4, the SAFE score exhibits an advantage over FIB-4 and NFS in the comprehensive evaluation and selection of high-risk groups with elevated risks of not only liver disease but also cardiovascular mortality within a community-based cohort. In terms of diagnostic performance, the AUROC values for predicting liver- and cardiovascular-related mortality were similar among the three NITs. This indicates that the superiority of the SAFE score lies in a reasonably low cutoff value for identifying high-risk individuals within a community-based cohort, rather than in its overall diagnostic performance in predicting cardiovascular and liver-related factors, compared to FIB-4 and NFS.

If the low cutoff values of FIB-4 and NFS were used to exclude advanced hepatic fibrosis within the community-based cohort, a significant percentage of deaths related to cardiovascular issues (70.4% for FIB-4 and 78.8% for NFS) would be overlooked (Fig. 2B,C). This means that individuals classified as low risk by FIB-4 and NFS could include subjects with a considerable risk of cardiovascular-related death, which was not captured by the low cut-off values. In contrast, the SAFE score consistently showed superior sensitivity, exceeding 70%, not only for liver-related deaths but also for overall and cardiovascular-related deaths (Fig. 2D). Additionally, the low cut-off SAFE score showed comparable or better PPV in predicting various mortalities, despite the larger number of subjects classified as positive by the SAFE score (Table 2). These findings imply that the SAFE score can be a more attractive option for the holistic evaluation of at-risk groups at the primary care level.

However, caution should be exercised when applying SAFE scores to older populations. Age is considered not only a significant risk factor for hepatic fibrosis and mortality but also a confounding factor that affects the accuracy of NITs. As age increases, so does the NIT score, which may lead to an overestimation. To address this issue, FIB-4 and NFS utilize higher cut-off values for individuals aged 65 years or older than for those in other age groups. A recent study also reported that the SAFE score demonstrated a lower ability to rule out clinically significant fibrosis in older populations (aged 60–80)13. Similar findings were also observed in our study, as mentioned in the Results section. Consequently, special caution will be required when applying the SAFE score to older populations.

The FIB-4 and NFS are primarily designed to screen for advanced hepatic fibrosis in patients with NAFLD or viral hepatitis19,20. In other words, the FIB-4 index and NFS were developed to screen for advanced hepatic fibrosis in high-risk patients. In contrast, the SAFE score was specifically developed to screen for significant hepatic fibrosis in populations with a low prevalence of advanced hepatic fibrosis, or in primary care settings. Because a considerable proportion of patients in high-risk groups, such as those with NAFLD or viral hepatitis, can have advanced fibrosis, targeting advanced hepatic fibrosis as a screening strategy is a reasonable approach in these groups. However, in the general population or primary care settings, only a small proportion of individuals have advanced hepatic fibrosis. In this relatively benign population, targeting advanced hepatic fibrosis in the screening strategy is not a reasonable choice. Previous studies have also raised these concerns regarding the appropriate target or cutoff values of NITs in a population with a low prevalence of advanced hepatic fibrosis7,21. Moreover, the overall or cardiovascular disease mortality in patients with NAFLD showed a dose-dependent relationship with the degree of hepatic fibrosis in previous studies8,22,23. From this perspective, the SAFE score, which includes individuals with significant hepatic fibrosis as diagnostic targets, can be used to predict mortality. Another reason is that the SAFE score includes additional metabolic parameters such as BMI and diabetes, which are associated with cardiovascular and overall mortality, making it more comprehensive than FIB-4.

This study has several limitations. First, the hepatic steatosis index was used as a diagnostic tool to assess fatty liver. This tool is an indirect method for evaluating the degree of steatosis in the liver; therefore, it has lower accuracy than imaging tools that measure the degree of steatosis directly, such as ultrasonography or MRI-PDFF. However, it has shown good diagnostic performance for finding patients with NAFLD at higher cutoff values (specificity, 93.1%; PPV 86.7%)18. It may also be a more appropriate option for assessing fatty liver disease in primary care settings since its calculation variables (BMI, AST, ALT, presence of DM) are commonly used in such clinical settings. Moreover, our study focused on evaluating the predictive ability of the three NITs for long-term clinical outcomes in a heterogeneous group consisting of people with obesity, chronically elevated liver enzymes, type 2 diabetes, or metabolic syndrome, in addition to those with fatty liver. There are likely to be few people with fatty liver disease and no other risk factors. Therefore, we believe that misclassification of fatty liver would not have materially changed the overall results. Second, almost all the participants were residents of rural areas. Therefore, the proportion of elderly people was high. Additionally, the medications and underlying diseases of the participants were not considered owing to the design of the study. However, our cohort also has unique strengths. The cohort consisted of volunteers from six rural communities in South Korea. Because local residents from communities throughout South Korea participated, the cohort is expected to capture a more generalized picture of health and disease than a clinical disease-focused or health-screening cohort. The cohort size represented a substantial percentage (ranging from 4% to 10%) of the local population. The Epidemiological Data and Quality Management Center, which is responsible for overseeing the cohort centers in these communities, ensured the quality of data collection for this multicenter study, resulting in standardized and reliable data. Nevertheless, further prospective, large-scale studies are required.

In conclusion, our results consistently showed that a low cut-off SAFE score could differentiate the risk of overall mortality, cardiovascular mortality, and mortality from extrahepatic malignancy between the low- and intermediate- or high-risk groups. It outperformed the low cutoff values of FIB-4 and NFS, particularly in predicting outcomes other than liver-related mortality. The low cutoff SAFE score allowed for holistic evaluation of both hepatic and extrahepatic high-risk groups within the community cohort. By utilizing the SAFE score with a low cutoff, it is possible to identify individuals who are at an elevated risk for both liver- and non-liver-related adverse outcomes, providing a comprehensive assessment of their overall health status. This approach ensures that groups at high risk for various health conditions are not overlooked, leading to more effective preventive strategies and interventions.