Introduction

Non-alcoholic fatty liver disease (NAFLD) is a major cause of liver-related morbidity globally, with prevalence rates as high as 30%1, and steadily increasing number of cases, from 391.2 million in 1990 to 882.1 million in 20172.

It appears to be associated with one or more components of the metabolic syndrome, which include hypertension, dyslipidemia, obesity, and Type 2 diabetes mellitus and insulin resistance1,3. Although its pathogenesis has not yet been completely established, growing evidence shows that insulin resistance and dysregulation in lipid metabolism play huge roles in the development of hepatic steatosis. High fat diet, insulin resistance, obesity, dysregulated peripheral lipolysis, and other potential risk factors lead to increased entry of free fatty acids into the liver, which places hepatocytes under a ‘lipotoxic’ condition4. The accumulation of triacylglycerol in the hepatocyte cytoplasm presents histologically as steatosis. With the constant, repeated micro-hepatic injury, endoplasmic reticulum stress and mitochondrial dysfunction now arise. This then leads to lobular inflammation, cellular apoptosis, and hepatic fibrosis over time4.

If neglected, this treatable condition can lead to various severe consequences, such as advanced cirrhosis, hepatocellular cancer, and potentially cardiovascular morbidity5 and mortality6,7. Considering the prognostic consequences of NAFLD, effective therapy is warranted to prevent disease progression. Aside from weight loss and lifestyle modifications, pharmacologic treatments remain limited. In line with this, the role of a novel oral hypoglycemic agent called sodium–glucose cotransporter 2 (SGLT-2) inhibitor on the treatment of NAFLD has recently been investigated in various animal studies done on rodents models8,9,10 and human clinical trials11,12, with potential positive effects.

In a real-world study of 56 patients with NAFLD and Type 2 diabetes mellitus who received SGLT-2 inhibitors for 48 weeks, there were statistically significant decreases in both controlled attenuation parameter (CAP) from 312 dB/m at baseline to 280 dB/m at week 48, and liver stiffness measurement (LSM) from 9.1 kPa at baseline to 6.7 kPa at week 4811. Additionally, the SGLT-2 inhibitor group showed statistically significant reductions in body weight, alanine transaminase (ALT), uric acid, and Fibrosis-4 (FIB-4) index at week 48 in comparison to the non-SGLT-2 inhibitor group that received other oral hypoglycemic medications using the 1:1 propensity-score matched analysis11. Moreover, a meta-analysis of 10 randomized controlled trials with 573 participants showed that use of SGLT-2 inhibitors yielded statistically significant reductions in the levels of ALT, aspartate transaminase (AST), liver proton density fat fraction (MRI-PDFF), visceral fat mass area, and subcutaneous fat areas12.

The impetus for this systematic review and meta-analysis is to give a more precise estimate of effect and address variations with use of SGLT-2 inhibitors in the treatment of NAFLD to help guide clinical practice guideline development, with a renewed focus on the use of imaging and histopathology outcome measures as found in the up-to-date randomized controlled trials.

The objective of this meta-analysis is to evaluate the effectiveness of sodium–glucose cotransporter-2 (SGLT-2) inhibitors in improving hepatic steatosis and hepatic fibrosis using imaging biomarkers and histopathology in patients with non-alcoholic fatty liver disease.

Methodology

This present study was conducted in accordance with preferred reporting items for systematic review (PRISMA) guidelines13 (See also Supplementary Table S1).

Eligibility criteria

Studies were selected based on the following inclusion criteria: (1) randomized-controlled trials, regardless of blinding status, that include studies examining adult participants more than or equal to 18 years of age with diagnosed non-alcoholic fatty liver disease irrespective of other comorbidities with or without diabetes mellitus, obese or non-obese; (2) use of SGLT-2 inhibitors in the active experimental group and not restricted to a particular sub-type, dosage, frequency or duration; (3) for the comparators, studies with placebo or the usual standard of care as relevant controls.

Non-randomized clinical trials, cohort studies, case control studies or cross-sectional studies were excluded. There were no limitations regarding the type of setting and length of follow-up.

The primary outcomes included the controlled attenuation parameter (CAP) which is a method used to measure the degree of ultrasound attenuation by hepatic fat at the central frequency of FibroScan®. The CAP value is expressed in dB/m. The next primary outcome was the liver stiffness measurement (LSM) which refers to the non-invasive quantification of liver stiffness by transient elastography using FibroScan®. The LSM value is expressed as kilopascal or kPa. Another primary outcome of interest is the proportion of patients with at least one point improvement or one-stage reduction in the histological scores with respect to hepatic steatosis, hepatocellular ballooning, lobular inflammation, and fibrosis after treatment.

The secondary outcomes include the Fibrosis-4 (FIB-4) index which is a non-invasive scoring system from several laboratory tests that can predict significant hepatic fibrosis patients. It includes age (years), AST level (U/L), platelet count (109/L) and ALT (U/L). Other imaging biomarkers of hepatic steatosis included in this study were the liver-spleen attenuation ratio as quantified by computed tomography, and the magnetic resonance imaging derived proton-density-fat-fraction (MRI-PDFF) which enables a quantitative assessment of liver fat over the entire liver.

Information sources

Two independent reviewers (A.M.O.L., and J.A.P.) performed a comprehensive electronic database search via PUBMED14, Cochrane Central Register of Controlled Trials15, Embase16 and ACP Journal Club17, published from inception to 29 December 2023. Other unpublished trials and study registries were also sought after by scanning the ClinicalTrials.gov, ISRCTN Register, EU Clinical Trials Register and WHO ICTRP. The planned literature search was not restricted to the English language or publication date filter.

Search strategy

The MEDLINE search strategy entered at PubMED database14 was the use of the following terms: ((((((((((SGLT-2 inhibitor) OR (dapagliflozin)) OR (empagliflozin)) OR (ipragliflozin)) OR (tofogliflozin)) OR (canagliflozin)) OR (licogliflozin)) OR (luseogliflozin)) OR (ertugliflozin)) AND (non-alcoholic fatty liver disease)) OR (NAFLD) with filters for randomized controlled trial. These search terms were also similarly adapted for use with other electronic bibliographic databases15,16,17. We also attempted to look for other related articles that were not identified by electronic searches.

Selection process

Literature search results were imported to EPPI-Reviewer Version 4.14.2.0 for data management to facilitate the screening process18. Two independent reviewers (A.M.O.L, and J.A.P.) identified potentially eligible studies through screening of titles and abstracts, and assessed if these met the inclusion criteria. Any duplicate records of the same report were removed. For any disagreements between the authors, consensus of which articles to screen for full text were resolved by discussion. Subsequently, two researchers (A.M.O.L, and J.A.P.) independently screened full-text articles and scrutinized their eligibility. Again, any disagreements over the eligibility of the studies were resolved by consensus.

Data collection process

Two reviewers (A.M.O.L, and J.A.P.) independently extracted data from every report with the use of a standardized data collection form. The retrieved data were then compared. Any disagreements or discrepancies during the data collection process were resolved by reviewing the data again, and through a consensus discussion. A.M.O.L entered the data into Cochrane’s RevMan Web19, and again double checked for its completeness and accuracy. For missing data or unclear information, the reviewers attempted to correspond with the study investigators through their respective official email addresses.

Data items

We collected the following data from each included study: title and first author with citation details, study design, study location, total study duration, type and number of participants, study inclusion and exclusion criteria, baseline patient characteristics, detailed description of the intervention and control (type of drug used, dosage, frequency, duration of treatment), the outcome of interest and results.

Study risk of bias assessment

Two independent reviewers (A.M.O.L, and J.A.P.) assessed the risk of bias in included studies using the Cochrane Risk of Bias 2.0 (RoB 2.0) tool20. The domains in RoB 2.0 include bias arising from the randomization process; bias due to deviations from intended interventions; bias due to missing outcome data; bias in measurement of the outcome; and bias in selection of the reported result. The judgment of overall risk-of-bias were subdivided into “low risk of bias”, “some concerns” or “high risk of bias.”20. Any discrepancies in risk assessment were settled through constructive discussions between authors.

Effect measures

We used the unstandardized mean differences at 95% confidence interval for continuous data given the same measurement scales to assess outcomes. Additionally, we utilized the final measurement outcomes rather than the change-from-baseline outcomes as the latter was not reported in most of the studies and considering the difficulty to measure the outcomes precisely coupled with the skewness of the distribution21, hence the former was selected. For dichotomous data, we calculated the risk ratio (RR) and their 95% confidence interval.

Synthesis methods

To decide which studies were eligible for each synthesis, structured approaches were done with tabulation and coding of the main characteristics of the population, intervention, and outcomes. The intervention component of each study was coded along two dimensions namely SGLT-2 inhibitor only or SGLT-2 inhibitor plus another drug. The comparison group was likewise categorized into placebo only or standard of treatment, pioglitazone, ursodeoxycholic acid, tenegliptin, glimepiride, liraglutide, metformin, and metformin plus pioglitazone.

During the data synthesis, we noticed some continuous data presented as median and interquartile range. In this scenario, if the respective corresponding author cannot be contacted, we convert the study data into the estimated mean and standard deviation using the R-package ‘estmeansd 0.2.1’ for which was primarily referenced from the Mcgrath et al.22 2020 DEPRESsion Screening Data (DEPRESSD) Collaboration. There were also a few study data highlighted as mean and standard error (SE) for which it was converted to standard deviation (SD) using Revman Web by Cochrane19 and computed as SE multiplied by the square root of the sample size. In cases where there is no available information on variability measures, we did not exclude the studies from the meta-analysis. Instead, we imputed the missing data by getting the average standard deviations (SD) from other studies in the same meta-analysis.

To synthesize results, a meta-analysis of effect estimates was done using the Revman Web by Cochrane19. The random-effects meta-analysis model was chosen under the assumption that the true intervention effects are related through a distribution. The choice of this model was based on the clinical and methodological diversity of the included studies, and the concern for small-study effects. For continuous data, the inverse variance statistical method was used; while for dichotomous outcomes, the Mantel–Haenszel method was employed in consideration for some studies with small sample sizes and/or lower event rates21.

Heterogeneity among studies was identified in terms of clinical, methodological, and statistical factors. For statistical heterogeneity, assessment was done using the I2 statistics, with an interpretation of I2 value of 30–60% representing moderate heterogeneity, 50–90% indicating substantial heterogeneity, and 75–100% interpreted as considerable heterogeneity21. Whenever significant heterogeneity was detected, we proceeded with sensitivity analysis by omitting studies that are judged to be at high risk of bias, or excluding studies that are deemed ineligible such as unpublished data, inadequate sample sizes, substantial variances in patient characteristics, methodology or intervention21.

Reporting bias assessment

Funnel plot asymmetry was assessed for meta-analysis with at least 10 studies. It was interpreted by visual inspection and statistical tests using Egger regression and Begg & Mazumdar rank correlation. Additionally, assessment for selective outcome reporting or selective non-reporting of results was completed by directly examining if a study protocol/register is available prior to the published study, and if the outcomes stated in the protocol are concurrent with the published report. If a protocol was unavailable, we then compared the outcomes reported in the methods and results section of the published paper.

Certainty assessment

Two independent reviewers (A.M.O.L. and J.A.P.) evaluated the certainty of evidence for each outcome using the Grades of Recommendation, Assessment, Development and Evaluation Working Group (GRADE)23 approach. The rating criteria for considering lowering or raising the certainty of evidence was dependent on these factors, which consisted of risk of bias, inconsistency, indirectness, imprecision, publication bias, large effect, dose response and other plausible confounding bias. We utilized the online GRADEpro23 to generate the summary of findings table together with the footnotes as rationale for up-grade or down-grade the certainty of evidence.

Results

Study selection

In the initial database search, we found a total of 1089 potential records (see Fig. 1). After eliminating the duplicates, we proceeded with screening the 854 records, from which we reviewed 37 full-text articles, and finally, we included 18 randomized-controlled studies24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41. Also, we comprehensively searched the references and other documents which cited these studies. However, no additional articles that fulfilled inclusion criteria were found in these searches.

Figure 1
figure 1

PRISMA 2020 flow diagram for systematic reviews.

We excluded 19 studies from our review because of the following reasons: Population criteria was not met in seven studies; there were no imaging/pathologic markers of steatosis or fibrosis or outcome of interest in six studies; SGLT-2 inhibitor was not used in one study; there were secondary publications of three included studies; there was incomplete data in one study; and there was still ongoing recruitment/trial in one study.

Study characteristics

Across 18 trials, a total of 1330 participants were included, with 631 in the SGLT-2 inhibitor group and 657 in the control group. No major differences in baseline characteristics were identified among the included studies. The mean age of participants ranged from 43.8 to 65.0 years in the SGLT-2i group and 41.8–65.6 years in the control group. All studies enrolled patients with non-alcoholic fatty liver disease based on ultrasonography24,27,29,32,35,36,37,40, computed tomography32,33,35, magnetic resonance imaging28,34, fatty liver index26, or pre-treatment biopsy findings30,38,39. Subsequently, two out of 18 studies excluded Type 2 diabetes mellitus in their eligibility criteria37,40. Empagliflozin, dapagliflozin, tofogliflozin, luseogliflozin, licogliflozin and ipragliflozin were the studied sodium-glucose cotransporter-2 inhibitors for the experimental group; whereas the control group had varied medications used including pioglitazone, metformin, glimepiride, tenegliptin, liraglutide, placebo or the usual standard of care. Lifestyle modifications and anti-diabetic drugs (including metformin, DPP4-inhibitors, sulfonylureas, and insulin) without any SGLT-2 inhibitors, pioglitazone or GLP-1 agonist, comprise the usual standard of care for Type 2 diabetes mellitus, which was used by three studies as control34,36,38. The course of treatment ranged from 12 to 72 weeks depending on the included trial. Other in-depth characteristics of included studies were presented in Table 1 and in the Supplementary Tables S2–S4 online.

Table 1 Characteristics of included studies.

Risk of bias in studies

The risk of bias of individual trials were assessed using the RoB 2.0 tool20. A summary of these assessments is provided in Table 2. Overall, most of the included studies (16/18) presented some concerns for risk of bias, while two studies were assessed as having low risk of bias.

Table 2 Summarized RoB 2.0 of included studies.

Results of individual studies

For study results, see Figs. 2, 3, 4, 5, 6 and 7 which present the summary statistics for each of the individual studies, and the pooled effect estimates with their confidence intervals using Forest plots. The risk of bias judgments was also displayed beside the plot to consider the limitations of interpreting the findings.

Results of syntheses

Effect on hepatic steatosis

Controlled attenuation parameter (CAP)

Six randomized controlled trials (RCTs)24,29,31,36,37,39, including a combined total of 372 patients, compared the mean difference in controlled attenuation parameter (CAP) among non-alcoholic fatty liver patients given SGLT-2 inhibitor or the control intervention (see Fig. 2a). Each study utilized the transient liver elastography via FibroScan® to measure the controlled attenuation parameter as assessment for liver fat content. Five of the trials had some concerns for bias overall, owing to the lack of information on allocation concealment and/or use of per-protocol analysis. Empagliflozin was administered in two of the trials24,37, while ipragliflozin, tofogliflozin, and dapagliflozin were used in other trials as the active experimental group. For the control group, the medication given in each study slightly varies. Four trials were given standard of treatment for Type 2 diabetes mellitus, including one study receiving glimepiride39, metformin only31, respectively; the latter as combined metformin and pioglitazone29; whereas the other remaining two studies were given placebo24,37. After combining the results, the pooled mean difference in the reduction of CAP scores after randomly assigning to SGLT-2 inhibitor treatment versus the other comparators was (MD: − 10.59 dB/m, 95% CI [− 18.25, − 2.92], p = 0.007, I2 = 0%, moderate certainty of evidence).

Figure 2
figure 2

(a) Forest plot of comparison: summary of mean difference in controlled attenuation parameter (CAP) post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. (b) Forest plot of comparison [Sensitivity analysis, Taheri 2020 excluded]: summary of mean difference in controlled attenuation parameter (CAP) post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control.

On sensitivity analysis, the study by Taheri et al.37 was excluded as the sample population criteria did not include NAFLD patients with Type 2 diabetes mellitus (see Fig. 2b). As compared with the previous pooled result, findings showed a higher mean difference in the reduction of CAP scores (MD: − 13.77 dB/m, 95% CI [− 23.00, − 4.55], p = 0.003, I2 = 0%, moderate certainty of evidence).

Liver-to-spleen attenuation (L/S) ratio

Only three studies32,33,35 measured the degree of fatty liver using L/S ratio on CT tomography during the study period (see Fig. 3). The liver-to-spleen attenuation ratio was calculated as the average liver attenuation at the region of interest divided by the average spleen attenuation using the unenhanced CT tomography. In total, 80 patients received SGLT-2 inhibitors and 83 patients received the control intervention. One trial35 was assessed to have low risk of bias while the remaining two32,33 had some concerns for bias due to lack of information on allocation concealment or blinding of the outcome assessor. The results showed that the SGLT-2 inhibitors significantly increased the L/S ratio by weighted mean difference (0.11, 95% CI [0.01, 0.21], p = 0.04, I2 = 78%, low certainty of evidence) compared to the control group.

Figure 3
figure 3

Forest plot of comparison: summary of mean difference in liver-to-spleen attenuation (L/S) ratio post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. *Shibuya 2018 median and interquartile range study data converted to mean and SD using R-package ‘estmeansd 0.2.1’.

Liver proton density fat fraction (PDFF)

Five clinical trials27,28,30,34,41 provided data on the liver proton density fat fraction (PDFF) using magnetic resonance imaging (see Fig. 4a). A total of 175 cases and 155 controls had measured the MRI-PDFF (%) over the study period. The types of SGLT-2 inhibitors used include empagliflozin, dapagliflozin, licogliflozin, and tofogliflozin. The control intervention group consisted of placebo or standard of treatment for Type 2 diabetes mellitus other than SGLT-2 inhibitor, or pioglitazone treatment. Most of the studies presented some concerns for bias due to deviations from intended interventions with use of per-protocol analysis, and one trial34 lacked details on the concealment of the allocation sequence. The meta-analysis showed a trend towards a reduction in the liver proton density fat fraction at the end of treatment (MD: − 2.61%, 95% CI [− 5.05, − 0.17], p = 0.04, I2 = 78%, low certainty of evidence). In the sensitivity analysis, the study by Yoneda et al.41 was excluded owing to its control group being given pioglitazone, which is a medication shown to have some evidence in the treatment of NAFLD. On the other hand, the other trials consisted of placebo or standard of treatment groups. Results showed a consistent finding in the difference in means in the liver proton density fat fraction between SGLT-2 inhibitors and control with note of a decrease in heterogeneity (MD: − 3.65%, 95% CI [− 5.55, − 1.75], p = 0.0002, I2 = 59%, low certainty of evidence) (see Fig. 4b).

Figure 4
figure 4

(a) Forest plot of comparison: summary of mean difference in liver proton density fat fraction (PDFF) post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. *Harrison 2022 study—standard deviation derived from standard error calculated by Revman. **Ericksson 2018 and Harrison 2022 study measured as change-from-baseline scores. (b) Forest plot of comparison [Sensitivity analysis, Yoneda 2021 excluded]: summary of mean difference in liver proton density fat fraction (PDFF) post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. *Harrison 2022 study—standard deviation derived from standard error calculated by Revman. **Ericksson 2018 and Harrison 2022 study measured as change-from-baseline score.

Effect on hepatic fibrosis

Liver stiffness measurement (LSM)

Analysis on the effect of SGLT-2 inhibitors versus control on the decrease in liver stiffness measurement (LSM) using meta-analysis is provided in Fig. 5a. Across seven randomized controlled trials (RCTs)24,26,31,36,37,39,41, a total of 227 patients in the SGLT-2 group, and 220 patients in the control group were included. Only one trial37 excluded Type 2 diabetes mellitus in the non-alcoholic fatty liver disease population. All trials measured LSM using transient liver elastography or FibroScan® except for one trial40 which used magnetic resonance elastography (MRE). Two studies24,37 used empagliflozin as the SGLT-2 inhibitor while the other two trials39,41 received tofogliflozin, and the remaining ones had dapagliflozin31,36, or dapagliflozin plus liraglutide as intervention26. The control groups were given other standard Type 2 diabetes mellitus treatments excluding SGLT-2 inhibitor, or placebo with moderate intensity physical activity and standard dietary advice only37. Results showed that the SGLT-2 inhibitors slightly decrease the LSM level by mean difference (− 0.67 kPa, 95% CI [− 1.19, − 0.16], p = 0.010, I2 = 69%, low certainty of evidence) compared to comparators. After performing a sensitivity analysis for which the study by Taheri et al.37 was excluded due to its non-inclusion of Type 2 diabetes mellitus population, a trend was observed towards a moderate reduction in the LSM level with further decrease in heterogeneity (MD: − 0.87 kPa, 95% CI [− 1.30, − 0.44], p < 0.0001, I2 = 37%, low certainty of evidence) (see Fig. 5b).

Figure 5
figure 5

(a) Forest plot of comparison: summary of mean difference in liver stiffness measurement (LSM) post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. *Takeshita 2022 measured as change-from-baseline scores. (b) Forest plot of comparison [Sensitivity analysis, Taheri 2020 excluded]: summary of mean difference in liver stiffness measurement (LSM) post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. *Takeshita 2022 measured as change-from-baseline scores.

Fibrosis-4 index (FIB-4)

The FIB-4 index is a simple and valuable non-invasive tool in estimating the risk of liver fibrosis. The compiled studies calculated it using the following formula: FIB-4 index = (Age [years] × AST [U/L])/(platelet [109/L] × ALT [U/L]). The mean baseline FIB-4 index across 10 studies24,25,27,28,29,37,39,40,41 ranges from 0.775 to 1.50 in SGLT-2 group, and 0.826–2.12 in the control group. Almost all studies have some concern for bias except for one study24 with low risk of bias. The pooled estimate demonstrated that SGLT-2 inhibitor treatment significantly decreased the FIB-4 index as compared to control (MD: − 0.12, 95% CI [− 0.21, − 0.04], p = 0.005, I2 = 16%, moderate certainty of evidence) (see Fig. 6).

Figure 6
figure 6

Forest plot of comparison: summary of mean difference in FIB-4 index post-treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. *Takeshita 2022 median and interquartile range study data converted to mean and SD using R-package ‘estmeansd 0.2.1’.**Harrison 2022 with imputed standard deviation.

Effect on histopathology (steatosis, hepatocellular ballooning, lobular inflammation, liver fibrosis)

Only two trials39,40 histologically evaluated all biopsied specimens at baseline and post-treatment at 48 and 72 weeks respectively with a combined total of 41 patients in the SGLT-2 group and 45 patients in the control group. The biopsied liver tissues were scored for hepatic steatosis, hepatocellular ballooning, and lobular inflammation using the NAFLD activity score (NAS)42 while the liver fibrosis was classified according to Brunt et al.43 criteria. The pathological outcomes for each category were summarized separately in Fig. 7a–d. In the meta-analysis, the SGLT-2 inhibitor group was noted to have a higher likelihood of having at least a one-score or one-stage reduction after treatment than the control group with respect to hepatocellular ballooning (RR: 2.19, 95% CI [1.22, 3.94], p = 0.009, I2 = 0%, moderate certainty of evidence) and liver fibrosis (RR: 2.29, 95% CI [1.12, 4.68], p = 0.02, I2 = 33%, moderate certainty of evidence). There were no significant differences identified between the two groups with respect to the changes in steatosis or lobular inflammation.

Figure 7
figure 7figure 7

(a) Forest plot of comparison: summary of one-score or one-stage reduction in hepatic steatosis after treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. (b) Forest plot of comparison: summary of one-score or one-stage reduction in hepatocellular ballooning after treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. (c) Forest plot of comparison: summary of one-score or one-stage reduction in lobular inflammation after treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control. (d) Forest plot of comparison: Summary of one-score or one-stage reduction in liver fibrosis after treatment in patients with NAFLD randomized to either SGLT-2 inhibitor or control.

Reporting biases in syntheses

The funnel plot for Fig. 8 shows no evidence for publication bias. Egger’s test for a regression intercept gave a p-value of 0.129, while Begg and Mazumdar’s test for rank correlation gave a p-value of 0.655, indicating no publication bias.

Figure 8
figure 8

Funnel plot of the meta-analysis of studies based from FIB-4 index syntheses.

All outcomes were found to be stated in their respective protocols and reported in the final publication reports. Hence, no selective outcome reporting was identified.

Certainty of evidence

Using the GRADE approach23, these outcomes, which consist of CAP, FIB-4 index, one-score reduction in hepatic steatosis and fibrosis, were noted to have moderate certainty of evidence. Others were designated as having low certainty of evidence given the inconsistency of results and imprecision which downgraded the evidence level. Other detailed summaries of findings were presented in Table 3, with footnotes explaining judgments.

Table 3 Summary of findings table by GRADEpro.

Discussion

The efficacy of SGLT-2 inhibitors in treating hepatic steatosis and fibrosis utilizing several imaging biomarkers and histopathology in patients with non-alcoholic fatty liver disease was investigated in this systematic review and meta-analysis of sixteen randomized controlled trials.

Consolidating the findings above, this meta-analysis found significant mean differences in CAP, L/S ratio, and MRI-PDFF after treatment, favoring the effect of SGLT-2 inhibitor over that of control especially in NAFLD patients.

These translate to a probable or slight positive effect on hepatic steatosis. In contrast, no significant differences were identified between the two groups with regards to the changes in steatosis or lobular inflammation on biopsy post-treatment. Nonetheless, interpretation of such findings in the latter appears to be inconclusive given the small number of studies.

With regards to hepatic fibrosis, there was a slight reduction in the LSM and FIB-4 index with use of SGLT-2 inhibitors in comparison to controls among non-alcoholic fatty liver patients with Type 2 diabetes mellitus. This was also confirmed histologically with the meta-analysis results showing at least a one-stage reduction in the treatment group with respect to hepatocellular ballooning and liver fibrosis.

Overall, our findings appear to be in line with those of other earlier systematic reviews and meta-analyses studies which also investigated the effects of SGLT-2 medications on NAFLD patients. A systematic review of four RCTs and four observational studies done by Raj et al.44 reported favorable effects of SGLT-2 inhibitors on the level of liver enzymes, liver fat, and fibrosis in patients with NAFLD. Similar positive effects of SGLT-2 inhibitors on liver enzyme and fat levels were also seen in individuals with NAFLD, according to a comprehensive evaluation of seven systematic reviews conducted by Shao et al.45.

A meta-analysis of four randomized trials evaluating liver fat content (LFC) using MRI by Coelho et al.46 (2020) showed a decrease in hepatic steatosis with the use of SGLT-2 inhibitors (MD: − 3.39%, 95% CI [− 6.01, − 0.77], p = 0.01, I2 = 89%). Moreover, a meta-analysis of two trials by Xing et al.47 reported a reduction in MRI-PDFF with the treatment of SGLT-2 inhibitors (MD: − 2.07%, 95% CI [− 3.86, − 0.28], p = 0.02, I2 = 10%). Another meta-analysis of two to four trials done by Song et al.48 revealed concordant findings, showing significant reduction in the level of liver controlled attenuation parameter (CAP) (MD: 0.29, 95% CI [− 26.95 to − 13.64], p < 0.00001, I2 = 0%), MRI-PDFF (MD:1.97, 95% CI [− 3.49 to − 0.45], p = 0.01, I2 = 11%), NAFLD score (MD: 0.55, 95% CI [1.04 to − 0.05], p = 0.03, I2 = 0%), fatty liver index (FLI) (MD: 11.21, 95% CI [− 16.53 to − 5.89], p < 0.0001, I2 = 0%), FIB‐4 index (MD: 0.25, 95% CI [− 0.39 to − 0.11], p = 0.0007, I2 = 10%), and increase in L/S ratio (MD: 0.16, 95% CI [0.10–0.22], p < 0.00001, I2 = 49%) with SGLT-2 inhibitor use.

In contrast, the meta-analysis done by Amjad et al.49 showed no statistically significant difference in fibrosis regression utilizing FIB-4 score (SMD = − 0.12, 95% CI: − 0.41 to 0.1, p = 0.994, I2 = 0%) and hepatic steatosis by using MRI-PDFF (SMD = − 0.31, 95% CI: –0.68 to 0.07, p = 0.502, I2 = 0%) between SGLT-2 inhibitors versus controls. However, it must be noted that this study only analyzed a total of three trials for both outcomes with inclusion of non-NAFLD population in one of the trials.

Despite the relevant findings in this meta-analysis, there were still some limitations in the evidence included in the review. We identified a few eligible studies with small sample sizes and inadequate power, leading to imprecise estimates. Also, most of the included studies had some concerns for risk of bias due to allocation concealment issues, use of per-protocol analysis, or missing results which partially affected the certainty (or confidence) in the body of evidence. Likewise, moderate to substantial statistical heterogeneity (I2 > 50%) was observed in some of the analysis which can be explained partially by variability in the studied participants, interventions and control used, in addition to the duration of treatment. Methodologically, different studies have diverse trial design methods (i.e., 1:1, 1:1:1, 2:1; double-blinded, open-label, and sample size differences) and quality which may reflect also in the heterogeneity measures. In terms of applicability to specific ethnic populations, majority of the studies included were conducted within Asia, with many subjects coming from Japan or China. The limited racial diversity within the study population may hence slightly affect the generalizability of the findings of this study.

Nonetheless, this updated meta-analysis was conducted in a comprehensive manner with an adequate number of databases searched by more than one reviewer to screen and extract data. Authors of the studies included were also contacted for completion and clarification of any missing information. The PRISMA guide & checklist13, and the Risk of bias 2.0 tool20 were likewise followed for transparency and completeness in reporting of the systematic review. Moreover, the certainty of evidence was presented using the GRADE approach23, which may assist physicians in the clinical decision-making process together with the patients.

As to implications for practice and policy, the positive evidence from this meta-analysis considers the use of SGLT-2 inhibitors for the treatment of Type 2 diabetes mellitus with an obese profile in the setting of concomitant non-alcoholic fatty liver disease.

The addition of SGLT-2 inhibitors to the usual management of dietary and lifestyle modifications in Type 2 diabetic patients with NAFLD may potentially prevent disease progression and the various complications that accompany the disease. Until now, there has been no standard clinically approved pharmacologic treatments for NAFLD. However, there is some growing evidence on the efficacy of other anti-diabetic medications including pioglitazone and GLP-1 agonists in its management. Although some studies included in this meta-analysis opted to compare the effects of SGLT-2 inhibitors with standard treatments not known to have any effects on NAFLD, some used pioglitazone50 or liraglutide51 as its control, which may slightly affect the positive magnitude of the results. Nevertheless, as a whole, the results of this meta-analysis remain useful in the clinical setting, thus providing an alternative treatment for those who have contraindications to the use of either pioglitazone or GLP-1 agonists.

Current standards of care in the management of Type 2 diabetes mellitus focus on the goal of cardiorenal risk reduction in high-risk patients on top of achievement and maintenance of glycemic and weight control52. Given the growing prevalence of NAFLD and its complications, it may be warranted to further discuss NAFLD in conversations highlighting the list of potential therapies beyond screening measures.

In the context of implications for future research, more randomized controlled trials are needed with (1) larger sample sizes to improve precision; (2) enrollment of a wider range of ethnic populations; (3) study objectives evaluating superiority or efficacy over placebo or standard of care (not only equivalence trials); and (4) use of intention-to-treat analysis in generating results to address bias due to deviations from intended interventions.

Conclusion

The pooled meta-analysis suggests that sodium-glucose cotransporter 2 inhibitors slightly improve hepatic steatosis and fibrosis as compared to controls in adult patients with non-alcoholic fatty liver disease and Type 2 diabetes mellitus with low to moderate certainty of evidence.