Introduction

The number of patients diagnosed with non-alcoholic fatty liver disease (NAFLD) is rising worldwide1. Obesity is a key risk factor for NAFLD development, and it is estimated that 90% of obese people have NAFLD2. Obesity increases the risk of NAFLD almost fivefold3 and accelerates its progression to non-alcoholic steatohepatitis (NASH), advanced fibrosis, and cirrhosis4. Liver biopsy is currently the gold standard for diagnosing liver fibrosis, steatosis, inflammation, and hepatocyte ballooning in NAFLD. However, liver biopsy has limitations, including the invasiveness of the procedure and the risk of sampling error. Elastography was therefore developed as a non-invasive alternative. Accurate assessment of liver fibrosis is essential for patients with NAFLD because fibrosis is the most important pathological determinant of prognosis5,6,7.

The estimated prevalence of NAFLD among Asians is 27.4% (95% confidence interval [CI], 23.3–31.9%)8, and overweight and obese individuals are reported to be at increased risk, with 50.7% of them having NAFLD9. It remains unclear whether elastography has sufficient diagnostic ability in overweight and obese patients. In the evaluation of NAFLD, magnetic resonance imaging (MRI)-based techniques have been reported to have better diagnostic performance than vibration-controlled transient elastography (VCTE)10,11; however, MRI-based techniques are more expensive and time-consuming to implement than VCTE. Therefore, it is clinically useful to know the difference in diagnostic performance between these two techniques.

In this study, we investigated the accuracy of VCTE and magnetic resonance elastography (MRE) for detecting liver fibrosis, and the accuracy of controlled attenuation parameter (CAP) measurements and MRI-based proton density fat fraction (MRI-PDFF) for detecting liver steatosis, among overweight and obese patients.

Methods

Study design

We conducted a cross-sectional retrospective study of patients with NAFLD who had undergone all three examinations (liver biopsy, VCTE and MRE) within 6 months between January 2014 and March 2020 at Yokohama City University Hospital. The study protocol complied with the ethical principles of the 1975 Declaration of Helsinki and was approved by the Ethics Committee of Yokohama City University Hospital (Yokohama, Japan; approval number B2104, April 28, 2021), and all patients provided written informed consent. The inclusion and exclusion criteria are described in Supplementary Online Resource 1.

Definition of overweight and obese

According to the World Health Organization definition, obesity is defined as a body mass index (BMI) ≥ 30 kg/m2, and overweight is defined as a BMI ≥ 25 and < 30 kg/m212. Notably, Asians have a high prevalence of type 2 diabetes and cardiovascular risk factors, even among normal-weight populations (those with a BMI < 25 kg/m2)13. In 2004, the World Health Organization Western Pacific Regional Office established a different definition of obesity for Asians, in which BMI was categorized as “underweight” (< 18.5 kg/m2), “normal” (18.5–22.9 kg/m2), “overweight at risk (overweight)” (23.0–24.9 kg/m2), “obese I” (25.0–29.9 kg/m2), and “obese II” (≥ 30.0 kg/m2)13. In Japan, the Japanese Society for the Study of Obesity defines obesity as a BMI of 25 kg/m2 or above14. In the current study, we used international standards to define BMI < 25 kg/m2 as “normal,” BMI ≥ 25 and < 30 kg/m2 as “overweight,” and BMI ≥ 30 kg/m2 as “obese.”
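The grouping used in this study can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and assigning a BMI of exactly 25 kg/m2 to the overweight group is an assumption based on the "BMI ≥ 25 and < 30 kg/m2" wording used in the Results.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_group(bmi_value: float) -> str:
    """Study grouping: normal (< 25), overweight (25 to < 30), obese (>= 30).

    Assumption: a BMI of exactly 25 kg/m^2 counts as overweight.
    """
    if bmi_value < 25.0:
        return "normal"
    if bmi_value < 30.0:
        return "overweight"
    return "obese"
```

For example, `bmi_group(bmi(75.0, 1.65))` places a 75 kg, 1.65 m patient in one of the three analysis groups.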

Histological findings

Liver biopsy was performed in all participants. The procedure and methods are described in Supplementary Online Resource 2. Grading and staging were based on the NASH Clinical Research Network criteria, as previously reported. Steatosis affecting < 5%, 5–33%, 33–66%, and > 66% of hepatocytes was classified as grade 0, 1, 2, and 3, respectively. Lobular inflammation was graded according to the number of inflammatory foci per field of view at a magnification of 200×, with 0, < 2, 2–4, and > 4 foci per field classified as grades 0, 1, 2, and 3, respectively. Hepatocellular ballooning involving no, few, and many cells was classified as grade 0, 1, and 2, respectively15.
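The grading thresholds above can be written as simple lookup functions. The sketch below is illustrative only: the function and dictionary names are hypothetical, and because the text lists 33% in two adjacent ranges, assigning exactly 33% to grade 1 and exactly 66% to grade 2 is an assumption.

```python
def steatosis_grade(percent_hepatocytes: float) -> int:
    """Steatosis grade 0-3 from the percentage of affected hepatocytes.

    Assumption: boundary values of 33% and 66% fall in the lower grade.
    """
    if not 0 <= percent_hepatocytes <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_hepatocytes < 5:
        return 0
    if percent_hepatocytes <= 33:
        return 1
    if percent_hepatocytes <= 66:
        return 2
    return 3


def lobular_inflammation_grade(foci_per_field: float) -> int:
    """Lobular inflammation grade 0-3 from foci per 200x field."""
    if foci_per_field < 0:
        raise ValueError("foci count cannot be negative")
    if foci_per_field == 0:
        return 0
    if foci_per_field < 2:
        return 1
    if foci_per_field <= 4:
        return 2
    return 3


# Ballooning is graded from a qualitative description of cell involvement.
BALLOONING_GRADE = {"none": 0, "few": 1, "many": 2}
```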

Vibration controlled transient elastography

VCTE data were obtained using the M or XL probe of the FibroScan (EchoSens, Paris, France). The probe-to-liver capsule distance (PCD), defined as the distance between the skin and the liver capsule, was measured to determine which probe should be used. The M probe was used when the PCD was < 25 mm, and the XL probe was used when the PCD was ≥ 25 mm. The details of the methods and testing procedures have been reported previously16,17. The patient was placed in a supine position with the right arm raised to the maximum height, and the liver stiffness measurement (LSM) of the right lobe of the liver was obtained through an intercostal space. The LSM value was calculated as the median of the valid measurements and expressed in kilopascals (kPa).

The CAP values provided by the device were used for evaluation only if the VCTE was valid for the same signal. Thus, CAP values were simultaneously measured from the same volume of liver parenchyma as the VCTE.

Reliable VCTE examinations were defined as those with at least 10 valid measurements, a success rate (ratio of valid acquisitions to total acquisitions) of at least 60%, and an interquartile range-to-median ratio of < 30% (the interquartile range being the range of the middle 50% of the data). Unreliable VCTE examinations were defined as those with fewer than 10 valid acquisitions, a success rate < 60%, and/or an interquartile range/median ≥ 30%, as previously reported16,17.
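The probe selection and reliability criteria can be expressed as a short check. The following is a sketch under stated assumptions rather than the device's algorithm: the function names are hypothetical, the XL probe is assumed for a PCD of exactly 25 mm, and the quartiles are computed with Python's "inclusive" method, which may differ from how the FibroScan software derives the interquartile range.

```python
from statistics import median, quantiles


def select_probe(pcd_mm: float) -> str:
    """M probe for PCD < 25 mm, XL probe otherwise (25 mm itself is an assumption)."""
    return "M" if pcd_mm < 25.0 else "XL"


def vcte_summary(valid_kpa: list[float], total_attempts: int) -> dict:
    """Median LSM, IQR/median, success rate, and the reliability verdict."""
    n_valid = len(valid_kpa)
    if total_attempts <= 0 or n_valid > total_attempts:
        raise ValueError("total_attempts must be positive and >= valid readings")
    if any(v <= 0 for v in valid_kpa):
        raise ValueError("stiffness readings must be positive")
    success_rate = n_valid / total_attempts
    if n_valid < 2:
        # Too few readings to compute quartiles; cannot be reliable.
        return {"lsm_kpa": None, "iqr_over_median": None,
                "success_rate": success_rate, "reliable": False}
    lsm = median(valid_kpa)
    q1, _, q3 = quantiles(valid_kpa, n=4, method="inclusive")
    ratio = (q3 - q1) / lsm
    reliable = n_valid >= 10 and success_rate >= 0.60 and ratio < 0.30
    return {"lsm_kpa": lsm, "iqr_over_median": ratio,
            "success_rate": success_rate, "reliable": reliable}
```

An examination fails the check if any one of the three criteria is not met, which mirrors the "and/or" wording of the definition of unreliable VCTE.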

MRI proton density fat fraction and MR elastography

Liver MRE was performed utilizing 3.0 Tesla imagers (GE Healthcare, Milwaukee, WI) at our institution according to previously described methods18. MRI-PDFF was measured by a modified Dixon method with advanced processing (IDEAL IQ, GE Healthcare), utilizing the method previously published19,20,21. A region of interest was drawn to measure MRE, including only the parenchyma of the right lobe and avoiding the liver edge and surface, large vessels, bile ducts, gallbladder, tumor, and artifacts. The average of the measurements from four slices was used for the analysis. Examinations were considered adequate if the total number of pixels over four slices acquired in a participant was greater than or equal to 700 pixels22. To measure MRI-PDFF, another region of interest was drawn on the in- and out-of-phase images near the site where the region of interest was drawn for MRE.

MRE and MRI-PDFF data, which were obtained simultaneously, were registered in a database, extracted for this study, and analyzed by one of the authors (K. I.), who was blinded to the pathological results. With the equipment used in this study, MRI-PDFF images are displayed in black and white, which makes visual assessment of liver steatosis difficult. In contrast, MRE images can be colorized, allowing visual assessment of LSM (Fig. 1).

Figure 1. MRE colorized images. MRE, magnetic resonance elastography.

Clinical features, laboratory characteristics and scoring systems

Detailed histories, physical measurements, and biochemical tests were obtained from all participants. Each patient’s height and weight were measured using a certified scale after removal of shoes and any heavy clothing. The BMI was calculated as weight (kg) divided by the square of height (m). After overnight fasting (12 h), venous blood was drawn, and the following parameters were measured: platelet count (PC), albumin level, aspartate aminotransferase (AST) level, alanine aminotransferase (ALT) level, type IV collagen 7S, hyaluronic acid, fasting blood glucose level, and glycosylated hemoglobin level. Standard methods were used to measure these parameters.

The formulas for the Fibrosis-4 index23, AST to ALT ratio24, AST to platelet ratio index25, and NAFLD fibrosis score26 are described in the supplementary material.
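The supplementary material is not reproduced here. As an aid to the reader, the sketch below implements the commonly published forms of these four scores; whether the study used exactly these forms, and which upper limit of normal for AST it applied, are assumptions. Function and parameter names are hypothetical. Platelet counts are in 10^9/L, AST and ALT in U/L, and albumin in g/dL.

```python
from math import sqrt


def fib4(age_years: float, ast: float, alt: float, platelets: float) -> float:
    """Fibrosis-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast) / (platelets * sqrt(alt))


def ast_alt_ratio(ast: float, alt: float) -> float:
    """AST to ALT ratio."""
    return ast / alt


def apri(ast: float, ast_upper_limit: float, platelets: float) -> float:
    """AST to platelet ratio index: (AST / ULN) x 100 / platelets.

    ast_upper_limit is laboratory-specific and must be supplied by the caller.
    """
    return (ast / ast_upper_limit) * 100.0 / platelets


def nafld_fibrosis_score(age_years: float, bmi: float, ifg_or_diabetes: bool,
                         ast: float, alt: float, platelets: float,
                         albumin_g_dl: float) -> float:
    """NAFLD fibrosis score in its commonly published linear form."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast / alt)
            - 0.013 * platelets
            - 0.66 * albumin_g_dl)
```

Keeping each score as a separate pure function makes the unit conventions explicit, since a platelet count supplied in the wrong unit changes the Fibrosis-4 index and the AST to platelet ratio index by orders of magnitude.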

Statistical analysis

Statistical analysis was conducted using JMP statistical software (version 15.0.0; SAS Institute, Cary, NC, USA). Univariate comparisons between patient groups were performed using Student’s t test. The 95% CIs were calculated using the Woolf method. Jackknife tests were used to compare the area under the receiver operating characteristic curve (AUROC) between the two groups. Statistical significance was set at P < 0.05.
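To make the AUROC comparison concrete, the following sketch computes an AUROC from its rank-based (Mann-Whitney) definition and estimates the standard error of the difference between two tests by leave-one-out jackknife. It is not the JMP procedure used in the study: the function names are hypothetical, and treating the comparison as a paired one (two tests measured in the same patients) is an assumption.

```python
from math import sqrt


def auroc(scores, labels):
    """Probability that a diseased case scores higher than a non-diseased one.

    Ties count as one half. labels must be 1 (diseased) or 0 (non-diseased).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def jackknife_auroc_difference(scores_a, scores_b, labels):
    """Return (difference, standard error) for AUROC(A) - AUROC(B).

    Each patient is left out in turn; the spread of the leave-one-out
    differences gives the jackknife standard error. Raises ValueError if
    removing one patient leaves a class empty.
    """
    n = len(labels)
    full = auroc(scores_a, labels) - auroc(scores_b, labels)
    loo = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        sub_labels = [labels[j] for j in keep]
        d = (auroc([scores_a[j] for j in keep], sub_labels)
             - auroc([scores_b[j] for j in keep], sub_labels))
        loo.append(d)
    mean_loo = sum(loo) / n
    se = sqrt((n - 1) / n * sum((d - mean_loo) ** 2 for d in loo))
    return full, se
```

Dividing the difference by its standard error gives an approximate z statistic, which could then be compared with the P < 0.05 threshold stated above.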

Results

Patient characteristics

Between January 2014 and March 2020, 424 patients underwent liver biopsy, VCTE, and MRE at our center. Patients who did not undergo all three tests within 6 months were excluded (n = 191). Patients with insufficient information (n = 3), those with PCD ≥ 25 mm in whom an XL probe was not used (n = 32), and those with procedure failure (n = 10) or unreliable VCTE (n = 25) were also excluded. Finally, the data of 163 patients were analyzed. A flowchart of patient enrollment is shown in Fig. 2. The fibrosis stage and steatosis grade were evaluated in all patients based on the liver biopsy (Table 1). The numbers of patients with fibrosis stages 0, 1, 2, 3 and 4 assessed by liver biopsy were 2, 15, 18, 5, 13 and 0 for those with BMI < 25 kg/m2; 1, 10, 21, 25 and 11 for those with BMI ≥ 25 and < 30 kg/m2; and 1, 8, 11, 23 and 14 for those with BMI ≥ 30 kg/m2, respectively. None of the patients with a BMI < 25 kg/m2 had stage 4 fibrosis. The numbers of patients with steatosis grades 0, 1, 2 and 3 assessed by liver biopsy with BMI < 25, ≥ 25 and < 30, and ≥ 30 kg/m2 were 4, 18, 12 and 4; 2, 33, 21 and 12; and 1, 22, 21 and 13, respectively. Analyses were conducted according to three groups stratified by BMI: normal (BMI < 25 kg/m2, n = 38), overweight (BMI ≥ 25 and < 30 kg/m2, n = 68), and obese (BMI ≥ 30 kg/m2, n = 57). Patient characteristics and laboratory findings for the entire study population and by group are presented in Table 1. The M probe was used when the PCD was < 25 mm and the XL probe was used when the PCD was ≥ 25 mm; as BMI increased, the PCD tended to increase, and the XL probe was used in more patients.
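The enrollment arithmetic can be checked with a short tally. This sketch uses only the counts stated in the paragraph above; the variable and function names are hypothetical.

```python
screened = 424
exclusions = [
    ("not all three tests within 6 months", 191),
    ("insufficient information", 3),
    ("PCD >= 25 mm without XL probe", 32),
    ("procedure failure", 10),
    ("unreliable VCTE", 25),
]


def remaining_after_exclusions(start, steps):
    """Return (reason, patients remaining) after each exclusion step."""
    counts = []
    current = start
    for reason, n_excluded in steps:
        current -= n_excluded
        if current < 0:
            raise ValueError(f"more patients excluded than available at: {reason}")
        counts.append((reason, current))
    return counts


flow = remaining_after_exclusions(screened, exclusions)
analyzed = flow[-1][1]  # 163 patients enter the analysis

# The three BMI groups should account for every analyzed patient.
groups = {"normal": 38, "overweight": 68, "obese": 57}
assert sum(groups.values()) == analyzed
```

The running count before the last step is 188, which matches the denominator used later when the reliability of VCTE is examined by year.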

Figure 2. Flow chart of patient enrollment.

Table 1 Patient characteristics.

Analysis of liver fibrosis in groups according to BMI

We examined the diagnostic accuracy of LSM measured by VCTE and MRE by comparing the AUROCs. The analysis was performed according to the BMI groups specified above. Jackknife tests were performed to investigate whether there were significant differences in the results between the groups according to BMI (Table 2). VCTE and MRE predicted liver fibrosis of stage 2 or above in patients with NAFLD with AUROCs of 0.89/0.94/0.85/0.88 (MRE) and 0.90/0.95/0.83/0.94 (VCTE) for overall/normal/overweight/obese participants, respectively. VCTE and MRE predicted liver fibrosis of stage 3 or above in patients with NAFLD with AUROCs of 0.92/0.96/0.87/0.95 (MRE) and 0.93/0.95/0.91/0.92 (VCTE) for overall/normal/overweight/obese participants, respectively. VCTE and MRE predicted liver fibrosis of stage 4 (cirrhosis) in patients with NAFLD with AUROCs of 0.94/-/0.95/0.95 (MRE) and 0.89/-/0.88/0.87 (VCTE) for overall/normal/overweight/obese participants, respectively. The specific cut-offs used in this study are presented in Fig. 3. There were no significant differences in the AUROCs in any of the BMI groups (Fig. 4).

Table 2 Diagnostic ability of VCTE and MRI/MRE techniques to detect hepatic fibrosis and steatosis presented by AUROCs in patients with NAFLD.

Figure 3. Specific cut-offs to define each grade of steatosis and stage of fibrosis by VCTE and MRI/MRE techniques. MRE, magnetic resonance elastography; MRI, magnetic resonance imaging; VCTE, vibration-controlled transient elastography.

Figure 4. Comparison of the diagnostic accuracy of VCTE and MRI/MRE techniques to detect liver fibrosis in patients with NAFLD presented by AUROCs. MRE, magnetic resonance elastography; MRI, magnetic resonance imaging; VCTE, vibration-controlled transient elastography; AUROC, area under the receiver operating characteristic curve.

Analysis of liver steatosis measurements categorized by BMI

We examined the diagnostic accuracy of liver steatosis measured by CAP and MRI-PDFF among the different BMI categories by comparing their AUROCs (Table 2). CAP and MRI-PDFF detected liver steatosis of grade 1 or above in patients with NAFLD with AUROCs of 0.95/0.81/1.00/1.00 (MRI-PDFF) and 0.89/0.73/0.95/0.93 (CAP) for overall/normal/overweight/obese participants, respectively. CAP and MRI-PDFF predicted liver steatosis of grade 2 or above in patients with NAFLD with AUROCs of 0.90/0.93/0.93/0.83 (MRI-PDFF) and 0.77/0.81/0.82/0.68 (CAP) for overall/normal/overweight/obese participants, respectively. CAP and MRI-PDFF predicted liver steatosis of grade 3 in patients with NAFLD with AUROCs of 0.89/0.90/0.90/0.88 (MRI-PDFF) and 0.69/0.63/0.75/0.61 (CAP) for overall/normal/overweight/obese participants, respectively. The specific cut-offs used in this study are presented in Fig. 3.

When the AUROCs were compared using the jackknife test, there were significant differences between CAP and MRI-PDFF in the assessment of liver steatosis in the overall group, in the normal group for grades ≥ 1 and ≥ 2, and in the overweight group for grade ≥ 1. As the BMI increased, the difference between CAP and MRI-PDFF became more pronounced (Fig. 5).

Figure 5. Comparison of the diagnostic accuracy of VCTE and MRI/MRE techniques to detect liver steatosis in patients with NAFLD presented by AUROCs. MRE, magnetic resonance elastography; MRI, magnetic resonance imaging; VCTE, vibration-controlled transient elastography; AUROC, area under the receiver operating characteristic curve.

Factors influencing the success rate of VCTE

As shown in Fig. 2, 25 cases of unreliable VCTE were observed. There were no significant differences in sex, age, height, weight, BMI, PCD, LSM, or MRI-PDFF, or in the histological assessment of liver fibrosis and steatosis, between patients with reliable and unreliable VCTE (Table 3). Among the patients with sufficient data who received liver biopsy, VCTE, and MRE within the 6-month time frame described above (n = 188), the percentage of patients with a reliable VCTE increased each year between 2014 and 2020. The percentage of patients undergoing VCTE also increased annually. The rates are summarized in Supplementary Online Resource 3.

Table 3 Factors that may influence the implementation of reliable VCTE.

Discussion

Obesity is rising worldwide, and as the number of obese patients increases, so does the number of patients with NAFLD. It is anticipated that the diagnosis of NAFLD using non-invasive elastography will replace invasive diagnostic techniques, such as liver biopsy. Therefore, the diagnostic accuracy of elastography needs to be fully evaluated.

Asians are less likely to be obese than Caucasians but have a higher body fat percentage and a higher risk of death from cardiovascular and other diseases despite lower BMIs27; therefore, the World Health Organization has set criteria for obesity for Asians that differ from those for Caucasians13. It is imperative to correctly assess the extent of liver fibrosis not only in very obese people, but also in overweight and mildly obese people.

The choice of VCTE probe depends on the PCD. In a study of severely obese patients with an average BMI of 40 kg/m2, high diagnostic accuracy was achieved with proper use of the M and XL probes28. The present study was conducted in a population with a mean BMI of 28 kg/m2. The XL probe was used in one of 38 (2.6%) participants with normal BMI, eight of 68 (11.8%) overweight participants, and 23 of 57 (40.4%) obese participants. PCD tended to increase with increasing BMI. Even in the overweight group (mean BMI, 27.35 kg/m2), the XL probe was used in approximately 12% of patients, which is consistent with the findings of a report on the usefulness of the XL probe in overweight and obese patients29. In this group, PCD > 25 mm was found in 8% of patients with a BMI of 28–30 kg/m2, which was consistent with previous reports. In contrast, it has been reported that MRE can evaluate liver fibrosis in NAFLD with high diagnostic performance without being affected by the BMI or the degree of liver inflammation30. Although the degree of liver inflammation was not considered in our study, MRE showed high diagnostic performance in assessing liver fibrosis across the BMI groups, consistent with previous reports.

In the assessment of liver fibrosis, both VCTE and MRE could predict fibrosis of stage ≥ 2, ≥ 3, and 4 (cirrhosis) in patients with NAFLD with an AUROC ≥ 0.83 in all BMI groups. There were no significant differences between the VCTE and MRE results. Previous reports have shown that both VCTE and MRE can assess liver fibrosis with high diagnostic performance10,31,32, and MRE has been reported to be more accurate than VCTE in identifying and staging liver fibrosis10,11. In this study, VCTE performed as well as MRE in the evaluation of LSM, with no significant difference in diagnostic performance, although MRE had a higher AUROC for stage 4 fibrosis, in line with previous reports. The good VCTE results in this study may reflect the relatively low BMI values of the participants, which made the VCTE measurement easier to perform.

In the assessment of liver steatosis, by contrast, MRI-PDFF could detect liver steatosis of grades ≥ 2 and ≥ 3 in patients with NAFLD with a good AUROC across all BMI groups, whereas the diagnostic ability of CAP was lower than that of MRI-PDFF across all BMI groups. Further, compared to MRI-PDFF, the diagnostic performance of CAP for liver steatosis tended to decrease with increasing BMI. Recently, Beyer et al. compared MRI- and ultrasound-derived indices of liver steatosis in a pooled cohort of 580 patients with NAFLD (mean age, 56 years; sex, 60% women; mean BMI, 31.39 kg/m2 [26.8–36.8])33. Their study assessed liver steatosis in the largest number of patients to date and concluded that MRI-PDFF could accurately diagnose individuals across the range of histological steatosis, and that CAP is suitable for identifying those with lower levels of fat (steatosis of grade 1 or above). In our study, the AUROCs of MRI-PDFF were higher than those of CAP. Our results indicated that CAP tended to be effective in identifying lower-grade liver steatosis, specifically steatosis of grade 1, which is consistent with the findings reported by Beyer et al.33.

Regarding the examination success rate, this study found that the success rate of VCTE was 85.3% and that of MRE was 100%. Chen et al.34 reported that high BMI, increased chest circumference, and increased waist circumference were associated with unsuccessful VCTE, but that neither PCD nor the type of VCTE probe was a significant risk factor for unsuccessful VCTE examination. In our study, chest and waist circumference were not measured, but high BMI, PCD, and the type of VCTE probe were not risk factors for unsuccessful VCTE (Table 3). Further, Chen et al.34 reported that training or technical improvements improved VCTE AUROCs. We therefore examined the years in which the tests were carried out and calculated the success rate for each year. The success rate tended to increase with each passing year, suggesting that the performance of the machines and the skill of the examiners improved; this supports the postulation of Chen et al. that the AUROC of VCTE increases with the number of studies performed by the examiners and with the use of newer, better-performing machines.

Several factors affecting the success rate of MRE have been identified, including claustrophobia, obesity to the extent that the patient cannot fit in the magnet bore, and hemochromatosis35,36. The absence of patients with claustrophobia or hemochromatosis in this study likely contributed to the high MRE success rate.

There were some limitations to this study. First, the proportion of patients with advanced liver fibrosis in this population was small. Second, we cannot rule out referral bias and patient selection bias, as liver biopsy and elastography might have been performed more selectively in patients with NAFLD who may have already progressed to NASH. Therefore, the results of this study may not be representative of all patients with NAFLD. Third, this study was conducted in a single institution with a relatively small sample, and the histology was assessed by a pathologist at a single institution. Fourth, our study is a retrospective analysis of cases in which all three tests (MRE, VCTE, and liver biopsy) could be performed within a short period of time. In clinical practice, these investigations are not always accessible, and results may vary if they are performed prospectively. Fifth, previous studies have found a strong correlation between liver steatosis and viscosity measured by MRE37,38, but we have not addressed this issue. Viscosity is determined by performing MRE at several different frequencies and finding the difference between the measurements. In this retrospective study, viscosity was not evaluated because MRE was not performed at multiple frequencies. In addition, viscosity cannot be measured with the FibroScan used in this study. Therefore, evaluation of viscosity by BMI using non-invasive testing methods is an issue for future study.

In conclusion, to our knowledge, this is the first report focusing on the diagnosis of liver fibrosis and steatosis using non-invasive techniques in overweight patients with NAFLD. Compared to other ethnic groups, Asians are more likely to have NAFLD at a lower BMI, and this research shows the diagnostic accuracy of non-invasive examinations in patients with average Asian body compositions. The key prognostic factor in NAFLD is the development of liver fibrosis5,6,7, and elastography, a non-invasive test, has been reported to be useful for long-term patient follow-up39. This study examined the diagnostic performance of VCTE and MRI techniques for assessing liver fibrosis and steatosis in overweight and obese patients with NAFLD, and the results suggest that both are likely to be valuable for long-term follow-up.

In summary, VCTE and MRE provide comparably good assessments of liver fibrosis, whereas MRI-PDFF has greater diagnostic ability for liver steatosis than CAP in overweight and obese patients with NAFLD.