Introduction

Non-alcoholic fatty liver disease (NAFLD) is a growing epidemic associated with obesity and the metabolic syndrome1,2 and now affects 25% of the world’s population3. NAFLD patients with obesity and type 2 diabetes are at particularly high risk for inflammatory disease progression (non-alcoholic steatohepatitis, NASH)4. In such patients, the stage of liver fibrosis is the strongest predictor of overall survival and of liver-related morbidity and mortality5,6,7. This disease burden is expected to increase and to pose serious challenges to health-care systems8.

Unfortunately, to date, the risk stratification of NAFLD patients relies on sophisticated histological assessment, which itself has poor inter-rater reliability and is contra-indicated in patients with suspected liver injuries4,9,10,11. However, to address the growing prevalence of fatty-liver disease adequately, non-invasive methods for disease staging and grading are required that are easy to perform, economically viable and widely available12. Over the last decade, ultrasound-based methods (namely liver elastography), serum-based fibrosis markers and anthropometry-based scores have been proposed for risk stratification of NAFLD13. Elastography exploits the fact that the liver hardens with increasing fibrosis, a property that can be measured by assessing the liver’s Young’s modulus. The available methods have their advantages and disadvantages, and none of them has sufficient diagnostic properties to warrant universal acceptance13,14. Hence, research has focused on combinations of methods to improve diagnostic performance; these often include laboratory parameters, conventional sonography and liver stiffness measurements (LSM) with elastography15. The evaluation of such step-wise algorithms is often limited by the high prevalence of advanced disease inherent to biopsy-controlled studies, rendering extrapolation to the screening setting bold at best16,17. Nonetheless, national and international guidelines recommend complex diagnostic workups, especially for screening purposes in primary care, including the important field of diabetology4,18. So far, these guidelines have not been widely adopted19,20 and concerns regarding feasibility in clinical practice have been voiced12,21.

Therefore, we designed a study to evaluate the current screening recommendations in the EASL-EASD-EASO and German guidelines (DGVS)4,18 for identifying patients at risk for NAFLD, especially regarding referral rates. We further identified factors that could facilitate referral decisions in primary care including a novel NASH surrogate score22.

Patients and methods

Ethical statement

The study was performed in accordance with the guidelines for good clinical practice (E6/R1) and the ethical guidelines of the Helsinki Declaration. The study was approved by the local ethical committee (University of Leipzig, registration number 035/17-ek) and registered in the German Clinical Trials Register (DRKS00012281). Informed written consent was obtained from all participants.

Design overview and patients

This was a prospective cross-sectional study designed to evaluate the performance of current guideline recommendations on NAFLD risk assessment in patients with diabetes. The corresponding analysis comprised referral rates for each guideline risk class along with the association according to fibrosis risk. In addition, simplified algorithms were identified, and their diagnostic properties evaluated. The potential use of genetic risk markers and a new NASH surrogate (FAST score22) were analysed.

From April 2017 to May 2018, consecutive adult patients (≥ 18 years) with established diagnosis of type 2 diabetes were invited to participate in the study. We considered patients at our hospital within a treatment programme for type 2 diabetes and its sequelae or those referred from specialized collaborating type 2 diabetes out-patient centres. Patients were eligible for inclusion when diagnosis of type 2 diabetes was established according to national guideline recommendations23. In brief, HbA1c ≥ 6.5% or fasting plasma glucose concentration ≥ 7.0 mmol/l at the time of initial diagnosis were required for type 2 diabetes diagnosis23,24. Exclusion criteria comprised liver transplantation or major liver surgery, pregnancy or lactation, malignant diseases with restricted life expectancy, or other types of diabetes. Patients who reported increased alcohol consumption (women > 140 g/week, men > 210 g/week, respectively4,25,26) were excluded from the final analysis.
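The numeric inclusion and exclusion thresholds above can be expressed as a short check. This is a minimal sketch, not the study's actual screening procedure: the class and function names are illustrative, the glycaemic values are assumed to be those recorded at the time of initial diagnosis, and the remaining exclusion criteria (transplantation, pregnancy, malignancy, other diabetes types) are left out.

```python
from dataclasses import dataclass

# Thresholds as quoted in the text; every name below is illustrative.
HBA1C_THRESHOLD_PERCENT = 6.5
FASTING_GLUCOSE_THRESHOLD_MMOL_L = 7.0
ALCOHOL_LIMIT_G_PER_WEEK = {"female": 140.0, "male": 210.0}


@dataclass
class Candidate:
    sex: str                       # "female" or "male"
    hba1c_percent: float           # at initial diagnosis (assumption)
    fasting_glucose_mmol_l: float  # at initial diagnosis (assumption)
    alcohol_g_per_week: float      # self-reported


def meets_diabetes_criterion(c: Candidate) -> bool:
    """HbA1c >= 6.5% or fasting plasma glucose >= 7.0 mmol/l."""
    return (
        c.hba1c_percent >= HBA1C_THRESHOLD_PERCENT
        or c.fasting_glucose_mmol_l >= FASTING_GLUCOSE_THRESHOLD_MMOL_L
    )


def has_increased_alcohol_intake(c: Candidate) -> bool:
    """Strictly above the sex-specific weekly limit, as in the text."""
    return c.alcohol_g_per_week > ALCOHOL_LIMIT_G_PER_WEEK[c.sex]
```

Note that a patient drinking exactly 140 g/week (women) or 210 g/week (men) is not excluded under this reading, because the text uses a strict inequality.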

Study examination

All patients underwent a thorough interview including medical history of type 2 diabetes and liver diseases as well as alcohol, smoking, and caffeine consumption habits. On the same day, a clinical examination including assessment of weight and height, abdominal ultrasound examination, liver stiffness measurement (LSM) with vibration controlled transient elastography (VCTE) combined with measurement of controlled attenuation parameter (CAP), and laboratory assessment were performed. All patients fasted overnight prior to the study examinations.

Ultrasound examination

A standardized abdominal ultrasound examination was performed by a certified experienced examiner (VB). A high-end ultrasound device (Toshiba Aplio 500, software version AB_V7.00*R003, Canon Medical Systems, Tustin, USA) equipped with a linear and curved array probe was used. Presence of hepatic steatosis was defined by a bright echo pattern of the liver parenchyma compared to the right kidney4. Biliary obstruction, liver congestion due to right heart failure, presence of ascites, or focal liver lesions at the LSM measurement site were ruled out in all patients.

Vibration controlled transient elastography (VCTE)

All participants underwent LSM with VCTE (FibroScan; Echosens, Paris, France), using the appropriate probe (M or XL-probe, 3.5/2.5 MHz, respectively), defined by the skin-to-liver-capsule distance (M-probe ≤ 25 mm; XL-probe > 25 mm). VCTE was performed by a trained certified examiner in accordance with guidelines14 as described previously27. Elevated LSM indicated suspicion of significant liver fibrosis. An intermediate fibrosis risk was defined by LSM between 7.9 and 9.6 kPa for the M-probe (7.2–9.3 kPa for the XL-probe18). In addition to the LSM value, the VCTE device calculates CAP as a surrogate for severity of hepatic steatosis, expressed in dB/m28. We used accepted cut-offs to define steatosis grading29.
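The probe selection and the probe-specific intermediate range described above can be sketched as follows. The intermediate ranges and the 25 mm rule are taken from the text; the labels "low" and "high" for values below and above that range, and the treatment of the boundary values as intermediate, are assumptions made for illustration.

```python
# Intermediate-risk LSM ranges in kPa, as quoted in the text.
INTERMEDIATE_RANGE_KPA = {"M": (7.9, 9.6), "XL": (7.2, 9.3)}


def probe_for_distance(skin_to_capsule_mm: float) -> str:
    """M-probe up to and including 25 mm, XL-probe above 25 mm."""
    return "M" if skin_to_capsule_mm <= 25 else "XL"


def lsm_risk_category(lsm_kpa: float, probe: str) -> str:
    """Classify an LSM value relative to the probe-specific range.

    Boundary values are counted as intermediate (assumption).
    """
    lower, upper = INTERMEDIATE_RANGE_KPA[probe]
    if lsm_kpa < lower:
        return "low"
    if lsm_kpa <= upper:
        return "intermediate"
    return "high"


# The same reading can fall into different categories depending on the probe.
print(lsm_risk_category(7.5, "M"), lsm_risk_category(7.5, "XL"))
```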

Laboratory analysis and laboratory-based fibrosis scores

Same day blood samples were taken, unless data were available from within eight weeks. The following parameters were recorded: standard blood count, alanine and aspartate aminotransferases (ALT and AST), gamma-glutamyl-transferase (GGT), glycohemoglobin (HbA1c), high-density lipoprotein (HDL) and triglycerides. In all patients, fasting glucose and insulin were determined after overnight fasting on the day of study inclusion.

Laboratory-based fibrosis risk indices were:

  • Non-alcoholic fatty liver disease fibrosis score (NFS): NFS = −1.675 + (0.037 × age[years]) + (0.094 × BMI[kg/m²]) + (1.13 × diabetes[yes = 1, no = 0]) + (0.99 × AST/ALT ratio) − (0.013 × platelet[Gpt/l]) − (0.66 × albumin[g/dl]). As sensitive/specific cut-offs we used −1.455 (age 36–65) and 0.12 (age ≥ 65)/0.676 (age ≥ 36), respectively30.

  • FIB-4 score: FIB-4 = (age[years] × AST[U/L])/(platelet[Gpt/l] × (ALT[U/L])^(1/2))31. As sensitive/specific cut-offs we used 1.3 (age < 65) and 2.0 (age ≥ 65)/2.67 (all ages), respectively30.
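The two indices and their age-adapted cut-offs can be written out as a short sketch. The coefficients and cut-offs are those listed above; the function names, the category labels and the handling of boundary values are illustrative assumptions. Because the quoted NFS age bands overlap at 65 years, the sketch applies the higher cut-off from 65 onwards, and it refuses ages below 36, for which no NFS cut-off is quoted.

```python
import math


def nfs(age_years, bmi, has_diabetes, ast, alt, platelets, albumin_g_dl):
    """NAFLD fibrosis score, term by term as listed in the text."""
    return (
        -1.675
        + 0.037 * age_years
        + 0.094 * bmi
        + 1.13 * (1 if has_diabetes else 0)
        + 0.99 * (ast / alt)
        - 0.013 * platelets
        - 0.66 * albumin_g_dl
    )


def fib4(age_years, ast, alt, platelets):
    """FIB-4 = (age * AST) / (platelets * sqrt(ALT))."""
    return (age_years * ast) / (platelets * math.sqrt(alt))


def nfs_cutoffs(age_years):
    """(sensitive, specific) NFS cut-offs; ages below 36 are not covered."""
    if age_years < 36:
        raise ValueError("no NFS cut-off is quoted for age < 36")
    return (-1.455 if age_years < 65 else 0.12), 0.676


def fib4_cutoffs(age_years):
    """(sensitive, specific) FIB-4 cut-offs."""
    return (1.3 if age_years < 65 else 2.0), 2.67


def risk_category(score, cutoffs):
    """Below the sensitive cut-off: low; above the specific one: high."""
    lower, upper = cutoffs
    if score < lower:
        return "low"
    if score <= upper:
        return "indeterminate"
    return "high"


# Hypothetical patient, not taken from the study data.
score = fib4(age_years=60, ast=40, alt=36, platelets=200)
print(score, risk_category(score, fib4_cutoffs(60)))
```

The age adaptation only moves the lower (sensitive) cut-off, so it changes who is classified as low risk but not who is classified as high risk.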

Application of guideline recommendations and screening strategies

Current international and national guidelines4,18 aim to identify patients at risk for fibrosis and suggest different follow-up scenarios or in-depth assessment depending on risk classification (Fig. 1). Because the joint guidelines of the European Associations for the Study of the Liver, Diabetes and Obesity (EASL-EASD-EASO) do not specify which fibrosis scores to use4, we considered all potential variants. National German guidelines do not yet consider age-adapted cut-offs; we nevertheless applied them to improve performance. Patients with increased alcohol consumption were not considered for the main analyses.

Figure 1

Diagnostic algorithms of guideline recommendations for screening and risk stratification in NAFLD patients4,18. NFS NAFLD Fibrosis score, FIB-4 Fibrosis-4, LSM liver stiffness measurement.

Furthermore, we aimed to identify parameters for an optimized referral strategy with reduced referral rates for further hepatological assessment and/or avoiding cost-intensive serum-markers/liver elastography. Based on the results of univariate and multivariate analysis, a simple proposal using AST as a single marker for risk stratification was further evaluated.

PNPLA3 and TM6SF2 genotyping

In addition to the guideline-based recommendations for NAFLD risk assessment, genotyping of common fibrosis risk alleles was performed32. In particular, the patatin-like phospholipase domain-containing protein 3 (PNPLA3) gene variant p.I148M (rs738409) and the transmembrane 6 superfamily member 2 (TM6SF2) gene variant p.E167K (rs58542926) were analysed; see Supplement for details.

FAST score

Newsome et al. recently evaluated a novel elastography-based scoring system (FAST score) using VCTE to identify patients with active NASH (NAFLD activity score, NAS ≥ 4) and significant fibrosis (F ≥ 2)22. The score is calculated from LSM, CAP and AST. At a cut-off of 0.35, the score achieved a sensitivity of ≥ 0.90 for the diagnosis of NASH; at a cut-off of 0.67, it had a specificity of ≥ 0.9022. For the present study, FAST score values were retrospectively calculated by the manufacturer of the VCTE device (Echosens, Paris, France).
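Given FAST values that have already been calculated, the two cut-offs partition patients into three zones. The sketch below takes the score as an input and does not reproduce the scoring formula itself; the zone labels, the handling of values exactly on a cut-off, and the example scores are assumptions for illustration only.

```python
from collections import Counter

# Cut-offs as quoted in the text.
FAST_LOWER_CUTOFF = 0.35
FAST_UPPER_CUTOFF = 0.67


def fast_zone(score: float) -> str:
    """Assign a precomputed FAST score to one of three zones.

    Values exactly on a cut-off are assigned to the higher zone (assumption).
    """
    if score < FAST_LOWER_CUTOFF:
        return "below_lower"
    if score < FAST_UPPER_CUTOFF:
        return "between"
    return "above_upper"


def zone_shares(scores):
    """Fraction of patients per zone."""
    counts = Counter(fast_zone(s) for s in scores)
    return {zone: counts[zone] / len(scores)
            for zone in ("below_lower", "between", "above_upper")}


# Hypothetical scores, not study data.
print(zone_shares([0.10, 0.20, 0.40, 0.70]))
```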

Statistical analysis

Statistical analyses were performed using the R statistical package (Version 3.4.2). Means and standard deviations are denoted by X ± Y and medians and interquartile ranges by X [Y, Z]. LSM was always treated on a logarithmic scale. Group comparisons of continuous variables were based on ANOVA; a chi-squared test without continuity correction was used to assess association in contingency tables, or Fisher’s exact test if expected counts were below 5. Pearson’s linear correlation was used with confidence intervals based on Fisher’s transformation. Linear models were employed to analyse the association between LSM and covariates found to have significant correlations, and logistic regression was used for LSM categories based on the established cut-offs18 (see above). Logistic regression and multivariate analysis were used to identify independent clinical and laboratory parameters and patient characteristics associated with an increased LSM. A p-value < 0.05 was considered significant.
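The rule for choosing between the chi-squared test and Fisher's exact test depends on the expected cell counts under independence. The analyses themselves were run in R; the following Python sketch only illustrates the selection rule, with hypothetical function names and hypothetical example tables.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table, min_expected=5):
    """Fisher's exact test if any expected count is below the threshold."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher_exact" if smallest < min_expected else "chi_squared"


# Hypothetical 2x2 tables, not study data.
print(choose_test([[10, 10], [10, 10]]))
print(choose_test([[1, 9], [2, 88]]))
```

The decision is driven by expected rather than observed counts, so a table with a small observed cell can still qualify for the chi-squared test.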

For the EASL-EASD-EASO guideline, LSM categories were used to define the fibrosis risk for calculating diagnostic properties. This cannot be applied to the German guidelines (DGVS) however, which already incorporate LSM. Patients with invalid VCTE examinations were treated as intermediate fibrosis risk with the need of further clarification according to the DGVS definition.

Results

Study population

184 of 204 enrolled subjects qualified for the final analysis (Fig. 2). The baseline characteristics of the study cohort, stratified for LSM, are summarized in Table 1.

Figure 2

Consort diagram. NAFLD Non-alcoholic fatty liver disease.

Table 1 Baseline characteristics, stratified by liver stiffness measurement (LSM).

Guideline recommendations

Figure 3 shows the results of applying the evaluated guidelines to our study cohort. For the DGVS guideline, 63% of patients are recommended for a long-term follow-up after having been referred to a specialist for LSM. Biopsies are recommended for 7%, and 18% are advised to return to the specialist for intensive management. The remaining 12% do not receive any recommendation although some (n = 9, 5%) have elevated LSM. Three LSM measurements were invalid due to high BMI (38, 43 and 58 kg/m²).

Figure 3

A/B: Guideline recommendations: clinical consequences and referral rates of diagnostic algorithms proposed by current guidelines in 184 patients with type 2 diabetes and NAFLD. DGVS DGVS S2k Guideline non-alcoholic fatty liver disease, EASL-EASD-EASO EASL–EASD–EASO Clinical practice guidelines for the management of NAFLD, LSM Liver stiffness measurement, FIB4 Fibrosis-4; age-adapted cut-offs for NAFLD fibrosis score and FIB4 score were used30.

The evaluated guidelines of the EASL-EASD-EASO and the German Society for Digestive and Metabolic Diseases (DGVS) use different risk scores (NFS or FIB4) and do not specify cut-offs (Fig. 4). Hence, we provide the results using different established cut-offs (NFS sens/spec, FIB4 sens/spec30). Applying the EASL-EASD-EASO guideline, between 60 and 77% of patients are referred to specialists, 17% to 34% are followed up at 2-year intervals and 6% have long-term follow-up, independent of the risk score and the cut-offs.

Figure 4

Risk stratification using non-invasive fibrosis scores with established age-adapted cut-offs (sensitive/specific). The risk categories are colour-coded. Each patient’s result is shown as a grey horizontal line. The more specific cut-off for the FIB4 score (3.25) is additionally illustrated. FIB4 Fibrosis-4, LSM liver stiffness measurement, VCTE vibration controlled transient elastography.

In a second approach, we analysed the diagnostic accuracy of the applied guideline recommendations according to the NFS and FIB-4 index and variable LSM cut-offs (Table 2). LSM was used as reference for advanced fibrosis for all patients. The German guideline recommendations imply use of the LSM in the diagnostic pathway; thus the PPV was 100% by definition and the number of false positive results cannot be defined. The sensitivity of the German NAFLD pathway ranged from 47 to 75% in our study cohort. The EASL-EASD-EASO guidelines have a sensitivity ranging from 86 to 96% and a specificity ranging from 30 to 52%.
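The diagnostic properties reported here follow from a 2×2 table that cross-classifies the algorithm's referral decision against the LSM-based reference. The sketch below shows the standard definitions; the class name and the counts in the example are hypothetical and are not taken from Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # referred, elevated LSM
    fp: int  # referred, LSM not elevated
    fn: int  # not referred, elevated LSM
    tn: int  # not referred, LSM not elevated


def diagnostic_properties(c: ConfusionCounts) -> dict:
    """Standard 2x2 metrics; assumes every denominator is non-zero."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "referral_rate": (c.tp + c.fp) / total,
    }


# Hypothetical counts that sum to 184 patients.
example = ConfusionCounts(tp=40, fp=80, fn=5, tn=59)
for name, value in diagnostic_properties(example).items():
    print(f"{name}: {value:.2f}")
```

When the reference measurement is itself a step of the pathway, as with LSM in the German guideline, no false positives can occur by construction, which is why the PPV is 100% by definition in that case.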

Table 2 Performance of diagnostic algorithms when applied to patients with type 2 diabetes.

Alternative proposal for risk stratification

A univariate analysis showed a correlation of NFS with log(FIB4) of 0.79 (95% CI 0.73–0.84). The correlation of NFS with log(LSM) was 0.46 (95% CI 0.33–0.56) and the correlation between log(FIB4) and log(LSM) was 0.52 (95% CI 0.41–0.62).
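Confidence intervals of this kind can be obtained from Fisher's transformation, as named in the statistical methods. The sketch below is a plain Python illustration rather than the R code used for the analysis, and the sample size of 184 is an assumption, since the number of complete pairs behind each correlation is not stated. With that assumption it reproduces the first reported interval to two decimals; because the reported coefficients are rounded, small deviations are possible for the others.

```python
import math


def pearson_ci(r: float, n: int, z_crit: float = 1.959964):
    """Confidence interval for Pearson's r via Fisher's z-transformation.

    z = atanh(r) is approximately normal with standard error 1/sqrt(n - 3);
    the interval is built on the z scale and transformed back with tanh.
    The default z_crit corresponds to a two-sided 95% interval.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# n = 184 is assumed, see the note above.
low, high = pearson_ci(0.79, 184)
print(f"{low:.2f} to {high:.2f}")  # 0.73 to 0.84
```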

In a multivariate analysis, LSM was significantly associated with AST and “Patient reports previous hepatic problem”, but not with sex, age, BMI or HbA1c, Table 3 (see supplement 1 for the analogous analysis using LSM categories).

Table 3 Multivariate analysis without cut-offs for LSM.

Multivariate analysis demonstrates that AST is highly relevant when identifying patients with elevated LSM. Based on this finding, a simple proposal based entirely on AST in type 2 diabetes patients would classify 75% (138/184) for long-term follow-up and the remaining 25% for specialist referral.

Our simple proposal using only AST > ULN had a sensitivity of 46% with a specificity of 88%. The further diagnostic properties of this proposal are shown in Table 2 (Panel B).

FAST score

The application of the recently introduced FAST score using the cut-offs published by Newsome et al.22 reduced the required referral to specialists to 35% with the lower cut-off or 12% with the upper one. A sensitivity/specificity calculation was not conducted because a reference standard was lacking. Figure 5 shows the distribution of the FAST score depending on the recommendation from the EASL-EASD-EASO guidelines. Considering only the patients recommended for specialist referral, we find 45 (42%) below the lower FAST cut-off, 42 (39%) between the two cut-offs, and 21 (19%) above the upper one. The patients above the lower cut-off had slightly higher mean HbA1c (7.3% vs 6.9%, p = 0.18), BMI (33.9 vs 32.5 kg/m², p = 0.30), duration of disease (13.8 vs 11.1 years, p = 0.19) and age (64.5 vs 61.8 years, p = 0.17). Although none of these differences was statistically significant, this provides an indication that these patients may be at higher risk according to known factors as well.

Figure 5

Box plots of FAST score according to EASL-EASD-EASO recommendations. The dashed lines refer to the cut-offs suggested for the FAST score22. For risk stratification, the specific cut-off of 2.67 was used for the FIB4 score30.

Genetic analysis for NAFLD risk genes

In addition to guideline recommendations, we looked at screening markers that may complement screening algorithms in the future. The common NAFLD genetic risk variants of the genes PNPLA3 and TM6SF2 are well suited for such an analysis. The risk variant (non-CC) of the gene PNPLA3 was found in 84 patients and the non-risk variant in 99. There were technical problems with the measurement in 1 case. The risk variant (non-CC) of the gene TM6SF2 was found in 29 patients and the non-risk variant in 155. Neither risk variant was associated with hepatic steatosis according to CAP or with fibrosis according to NFS. There was a significant association between PNPLA3 and LSM, which did not persist in a multivariate model containing AST. Specialist referral according to EASL-EASD-EASO and DGVS was also not associated with genetic risk variants (see Supplement).

Discussion

Recommendations for risk assessment in NAFLD patients have not been sufficiently assessed prospectively, in particular in type 2 diabetes. Our population of patients with type 2 diabetes shows (a) a high prevalence of elevated LSM indicative of advanced fibrosis, (b) very high referral rates of up to 77% for an in-depth hepatological work-up based on current international and national guidelines and (c) that a much simpler referral algorithm may produce comparably good results.

About 10% of the European general population has diabetes, and many of these patients have a substantial risk for disease progression to NASH and ultimately advanced fibrosis23,24,33,34. Liver involvement as a co-morbidity in diabetes has been identified as an important driver of disease progression to NASH or advanced fibrosis with associated complications such as hepatocellular carcinoma, and is a growing burden to health care. Increased awareness has led to screening guidelines that are unfortunately not sufficiently well known, nor is the infrastructure sufficient for their widespread application, especially in primary care19,21,35,36. This may prolong the diagnostic process and limit the chances of effective therapeutic interventions before manifestation of cirrhosis or liver cancer37.

The optimal screening strategy has excellent sensitivity and specificity, is cheap, widely available, easily administered and gives quick results. In our cohort, one quarter of the NAFLD patients show elevated LSM and thus are at risk for complications of liver fibrosis, a number comparable to what others have found in even larger populations16,34,38,39. Optimal screening strategies should identify roughly this number, but in our cohort the EASL-EASD-EASO guideline leads to a referral rate of up to 77% and the German guideline requires LSM in 76% of cases. This demonstrates that the diagnostic properties of the algorithms are sub-optimal. Moreover, clarification using LSM is not cheap because of personnel costs, nor is it available in primary care. Finally, the step-wise algorithms are not easily administered or understood (see Fig. 1) and, in fact, EASL-EASD-EASO fails to specify the cut-offs for the serum-based fibrosis indices (see Fig. 4).

We identified three major weaknesses in the EASL-EASD-EASO and German algorithms when applied to patients with type 2 diabetes. First, risk stratification according to presence of steatosis using conventional ultrasound identifies almost the entire population, but overlooks patients requiring attention in the small number excluded. These may be patients with advanced disease whose steatosis has already begun to recede (9/23 in our data). Second, the application of the original NFS without age-adaptation categorizes almost all diabetics as “at risk” and seems unsuitable for application in the obese diabetic cohort. Even with the age-adapted cut-offs used here, NFS remains quite unspecific. The FIB-4 score, on the other hand, seems promising with the age-adaptation. Unfortunately, there are many suggestions for cut-offs with little consensus (cf. Fig. 4 and e.g. the cut-offs at 2.67 and 3.25). Finally, NAFLD requires the exclusion of all other potentially relevant diseases including even fairly moderate alcohol consumption4,18. In our cohort, 19 patients were excluded based on this definition, 12 of whom consumed only up to two drinks per day, and half of whom are at risk for advanced fibrosis. Such alcohol consumption may well aggravate the diabetic effect on liver disease but falls between the cracks of disease definitions and guidelines. Self-reporting on alcohol consumption is known to be unreliable40 and it could be more sensible to include patients with moderate alcohol consumption for secondary preventive measures.

Liver stiffness can be considered the gold standard of non-invasive risk assessment, with VCTE the best evaluated and approved method for NAFLD13,14. A particularity of the German guideline is the implementation of VCTE in the screening algorithm, which was indicated in about 75% of our patients. LSM-based approaches were recently evaluated from a health economics standpoint and found to be potentially cost-saving in populations at risk41. However, universal access to this method would require large investments21 and modifications to the reimbursement system in most European countries, depending on national health care. Although this works in tertiary referral centres, extrapolation indicates that the method’s PPV could be a serious limitation in primary care16,42.
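The concern about PPV in primary care follows from Bayes' theorem: for fixed sensitivity and specificity, the PPV falls as the prevalence of advanced disease falls. The sketch below illustrates this dependence; the sensitivity and specificity of 0.90 and the two prevalence values are hypothetical and are not estimates for VCTE.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics in a referral-centre-like setting
# (prevalence 25%) and a screening-like setting (prevalence 5%).
for prevalence in (0.25, 0.05):
    print(prevalence, round(ppv(0.90, 0.90, prevalence), 2))
```

With these illustrative numbers, the same test yields a markedly lower PPV in the low-prevalence setting, so a larger share of positive results would be false positives.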

Our prospective data corroborate retrospective ones, suggesting that the EASL-EASD-EASO algorithms are not easily implementable in outpatient healthcare12. Moreover, they agree with a study showing that the high prevalence of liver steatosis constitutes a problem for the algorithms43. Hence, we see the need for very low-threshold identification strategies based on simple, reliable patient characteristics and low-cost laboratory data, e.g. BMI, AST and sex. Our multivariate analysis demonstrates that diabetic patients with elevated AST are prone to increased LSM. A simple proposal based exclusively on elevated AST in type 2 diabetics would be easy to implement and have acceptable diagnostic properties despite sending “only” 25% to liver specialists. Recent publications have also found AST to be a simple marker for fibrosis44 and to be associated with progression of fatty liver disease45.

The results of genetic testing for risk alleles were in line with the known distribution of disease severity32,46,47,48, but the relative risk was too low to suggest a meaningful diagnostic benefit.

The guideline recommendations were developed to identify patients at risk for liver fibrosis because of its association with liver-specific morbidity and overall mortality5,7,49. Ongoing phase 2 and 3 trials in drug development for the therapy of NASH suggest that medication will soon be available to prevent fibrosis progression in NASH50. Future screening strategies will include NASH, a condition which could not be identified without liver biopsy until now. The recently proposed FAST score is a composite of VCTE and AST and could be helpful for stratification of treatment decisions22. Applied to our cohort, the score showed a favourable distribution; in particular, patients with a high score who are recommended for specialist referral by the EASL-EASD-EASO algorithm may be interesting candidates for pharmacological therapy, but this requires further evaluation. Beyond the guideline-based recommendations, many serum markers such as M2BP, hyaluronic acid or type IV collagen 7 are being evaluated for fibrosis detection in different settings51.

We recognize that the lack of a histologically proven diagnosis as gold standard is a limitation. However, biopsies are often contraindicated in diabetes patients with cardiovascular comorbidities requiring anti-coagulation and/or antithrombotic therapy. Hence, to avoid selection bias, suitable surrogates such as VCTE represent the best available option16.

Even assuming a PPV as low as 50%, our data suggest advanced fibrosis in 12% of the cohort. While our cohort may have more severe disease than the general diabetic population, this bias is shared by almost all studies as it is inherent to a university hospital. Hence, our single centre study should be reproduced in a multi-centre approach, ideally in primary care. Such a study should be accompanied by a detailed health-economics analysis to verify potential cost benefits of such screening strategies41.

With increasing NAFLD prevalence, efficient screening strategies are of the utmost importance. The DGVS and EASL-EASD-EASO guidelines constitute an important starting point, but show high referral rates to liver specialists in diabetic populations and require prohibitively substantial resources. Greater focus on AST and less focus on conventional ultrasound and complex algorithms may be the next steps toward improvement.