Introduction

Liver cirrhosis—the end-stage of various types of chronic liver disease—is the 11th most common cause of death worldwide1. Liver transplantation is the only definitive treatment1. Patient management otherwise crucially relies on the screening for and management of serious complications such as hepatocellular carcinoma and gastroesophageal varices2. Chronic infection with hepatitis C virus and hepatitis B virus and alcoholic liver disease are the most common etiologies of liver cirrhosis worldwide2,3. Identification of the underlying etiology is important for treatment selection, alleviation of disease progression, and the allocation of transplant organs including posttransplant follow-up4.

Liver biopsy has been the reference standard for diagnosing the etiology of liver cirrhosis5. However, biopsy is invasive, with an incidence of moderate to major procedure-related complications of approximately 1.2% and a mortality rate of 0.4%. Other limitations of liver biopsy include inter- and intraobserver variability and sampling error6,7,8. These limitations emphasize the need for an alternative noninvasive method for identifying the etiology of liver cirrhosis, especially in patients with indeterminate findings based on standard noninvasive diagnostic algorithms including physical examination, laboratory testing, biochemical markers, and imaging modalities5,6,9,10.

Gadoxetic acid-enhanced MRI provides morphological information on liver parenchyma, blood vessels, and the biliary tree (both anatomical and functional) while at the same time allowing detection and characterization of hepatic focal lesions as well as estimating functional liver capacity. Hepatocyte function—the main determinant of gadoxetic acid uptake and excretion—is known to be impaired in patients with liver cirrhosis11,12,13.

Radiomics analysis is a new technology based on the extraction of quantitative high-throughput features from radiologic images. The implementation of radiomics in radiology is gaining interest due to a wide range of applications such as its potential ability to characterize focal lesions including evaluation of tumor heterogeneity and microenvironment, phenotype classification, and prediction of response to treatment. In addition, radiomics models modified using the selected features can improve diagnostic accuracy, predict prognosis, and guide the clinical decision-making process10,14,15,16,17.

We hypothesize that a radiomics model based on features extracted from gadoxetic acid-enhanced MRI may allow identification and assessment of imaging features specific for different etiologies of liver cirrhosis and thus improve the classification of cirrhosis etiologies in patients with an inconclusive diagnosis based on currently available noninvasive diagnostic tests. Therefore, the purpose of our study was to develop, train, and validate a radiomics-based model as a noninvasive tool to predict the etiology of liver cirrhosis using features extracted from hepatobiliary phase (HBP) images of gadoxetic acid-enhanced MRI.

Results

Demographic data

The study included 248 patients (mean age, 60.5 ± 13.3 years; age range, 14–88 years), among them 179 men (mean age, 60 ± 12.8 years; age range, 14–81 years) and 69 women (mean age, 62 ± 14.3 years; age range, 21–88 years). Patient demographics are presented in Table 1.

Table 1 Summary of patient demographics.

Differentiating between all included etiologies of liver cirrhosis

In one-vs-one multiclass classification differentiating between all 6 etiologies in the training subset, the fivefold cross-validated linear support vector machine (SVM) yielded the highest accuracy (52.8–82.5% for models derived from two-dimensional (2D) regions of interest (ROIs) and 53.6–78.5% for models derived from three-dimensional (3D) volumes of interest (VOIs)). The highest sensitivity was noted for alcoholic cirrhosis (74.1% for 2D-ROI-derived models and 75.9% for 3D-VOI-derived models) and the highest specificity for cholestatic liver disease-induced cirrhosis (95% for 2D and 86.7% for 3D). Without validation, the fine K nearest neighbor (KNN) classifier had the highest accuracy of 100% (Table 2).

Table 2 Performance metrics of machine learning-based classification of radiomics features in the training subset.

Differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies

In binary classification differentiating cholestatic liver disease-induced cirrhosis (group 3) from noncholestatic etiologies of cirrhosis (groups 1, 2, 4–6) in the training subset, the fivefold cross-validated ensemble classifier—subspace discrimination—had the highest accuracy (87.6% and 85.6%), sensitivity (97.6% and 95.6%), positive predictive value (0.883 and 0.877), and the largest area under the curve (AUC) (0.83 and 0.80) in 2D- and 3D-derived models, respectively (Table 2). Confusion matrices are listed in Fig. 1.

Figure 1

Confusion matrices of the training subset showing the etiology predicted by the radiomics model in comparison to the diagnostically established etiology of liver cirrhosis. The shaded cells indicate correct predictions by the radiomics model. A and B are confusion matrices for all groups constructed using features extracted from 2-dimensional (2D) ROIs (A) and 3-dimensional (3D) VOIs (B). C and D are confusion matrices for noncholestatic (0) vs. cholestatic (1) liver cirrhosis in 2D (C) and 3D (D) models.

In logistic regression analysis of the testing subset, three of the 45 features extracted and analyzed were omitted (Histo_Excess Kurtosis, Histo_Entropy_log2, GLCM_Entropy_log2) because of collinearity. The largest AUC was observed for differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies (AUC = 0.960 for 2D-derived models and 0.910 for 3D-derived models, P < 0.001).

2D- vs. 3D-generated radiomics models

Comparison of radiomics models constructed using features extracted from 2D ROIs and 3D VOIs revealed larger AUCs in 2D-based models for viral hepatitis (P = 0.05), cholestatic liver disease (P = 0.05), and nonalcoholic steatohepatitis (NASH) (P = 0.87). In alcoholic cirrhosis (group 1), the model constructed using 3D features had a larger AUC (0.831 for 3D-VOI-derived models vs. 0.767 for 2D-ROI-derived models, P = 0.01) (Table 3, Fig. 2). The model differentiating cholestatic from noncholestatic liver cirrhosis had the largest number of statistically significant features (Table 4).

Table 3 Logistic regression analysis of the testing subset for alcoholic cirrhosis, viral hepatitis-induced cirrhosis, cholestatic liver disease-induced cirrhosis, and NASH-associated cirrhosis.
Figure 2

ROC curves of the testing subset for prediction of different etiologies of liver cirrhosis. Prediction of different etiologies of liver cirrhosis using one-vs-all multiclass logistic regression comparison between 2D- and 3D-extracted features in the following subgroups: alcoholic cirrhosis (a), viral hepatitis (b), cholestatic liver disease (c), and nonalcoholic steatohepatitis (NASH)-associated cirrhosis (d).

Table 4 List of statistically significant features in regression analysis for alcoholic cirrhosis, viral hepatitis-induced cirrhosis, cholestatic liver disease-induced cirrhosis, and NASH-associated cirrhosis (groups 1–4).

In least absolute shrinkage and selection operator (LASSO) analysis, the largest deviance ratio—in training (0.132 for 2D and 0.316 in 3D) and testing (0.131 for 2D and 0.196 in 3D) subsets—was observed for the model differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies (Table 5). The features selected with LASSO are listed in Table 6. Heat maps of the significant features were plotted (Fig. 3).

Table 5 Results of the least absolute shrinkage and selection operator (LASSO) logistic regression model for training and testing subsets.
Table 6 Features selected using the least absolute shrinkage and selection operator (LASSO) logistic regression analysis.
Figure 3

Heat maps generated from 2-dimensional (a) and 3-dimensional (b) ROIs segmented in HBP images, demonstrating the distribution of significant features in the study population.

Intra-class correlation coefficient (ICC)

The overall intra-class correlation coefficient (ICC) estimates and their 95% confidence intervals (CIs) for all features combined were 0.68 (CI 0.56–0.87) for 2D and 0.71 (CI 0.61–0.93) for 3D measurements, suggesting moderate reliability18. Individual ICC values for the most statistically relevant features in 2D and 3D measurements are listed in Table 7.

Table 7 Individual intraclass correlation coefficient (ICC) values for the most statistically relevant features.

Discussion

Liver cirrhosis is an increasing cause of death worldwide. Approximately 1 million people die from complications of cirrhosis all over the world each year. In addition, cirrhosis accounts for 1.6% and 2.1% of the worldwide burden of disability-adjusted life years and years of life lost, respectively1,6.

In the present study, we developed and evaluated a radiomics-based model to predict the etiology of liver cirrhosis from HBP images of gadoxetic acid-enhanced MRI. Gadoxetic acid-enhanced MRI has been previously investigated for the staging of liver fibrosis using radiomics analysis10 and a deep convolutional neural network (DCNN)7. To the best of our knowledge, no study to date has proposed that the etiology of liver cirrhosis can be predicted using a radiomics-based model.

The fivefold cross-validated radiomics models created using HBP images acquired with gadoxetic acid-enhanced MRI allowed classification of the etiology of liver cirrhosis with AUCs of 0.767–0.960, accuracies of 52.8–87.6%, and positive predictive values of 0.377–0.883. The highest diagnostic accuracy of 87.6% was achieved for the 2D-based model differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies.

During the process of radiomics model building, we investigated different techniques, including supervised machine learning and LASSO regression analysis, to explore characteristics and identify the optimal features for model construction. LASSO turned out to have several advantages, as it reduces redundancy, dependency, and dimensionality of the features and thus enhances model accuracy19. In addition, LASSO enables the generation of interpretable models using variable selection and regularization as well as integration of selected features into a radiomics signature19. As for classification algorithms, we used both binary and multinomial classification. A binary model yields a single probability value, which can be more readily interpreted, whereas a multinomial model is more complex and returns multiple probability values for different etiologies of liver cirrhosis10. This may explain the higher performance metrics we obtained for the binary model differentiating cholestatic liver disease-induced cirrhosis from noncholestatic cirrhosis.
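For reference, the L1-penalized (LASSO) logistic regression discussed here can be written in its standard textbook form. This is a generic formulation, not a description of the exact software settings used in this study; the symbols (N examinations, p features, binary label y_i, feature vector x_i, intercept β_0, coefficients β, penalty weight λ) are notation introduced only for illustration.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{N} \sum_{i=1}^{N}
        \left[
          y_i \left( \beta_0 + x_i^{\top} \beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\},
  \qquad y_i \in \{0, 1\}
```

The first term is the average negative log-likelihood of the logistic model, and the second term is the L1 penalty. As λ increases, more coefficients are shrunk to exactly zero, which is the mechanism behind the feature selection and dimensionality reduction described above; λ is typically chosen by cross-validation.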

In addition, we constructed the radiomics models using features generated from 2D ROIs and 3D VOIs. In general, segmentation of the 3D VOI was easier and faster. However, the 2D-generated models performed better in terms of accuracy, sensitivity, predictive values, and AUCs, except for alcoholic cirrhosis, where the 3D-based model performed better. A possible explanation is that the 2D ROIs captured a more representative sample of the liver parenchyma, since an entire axial section was segmented, while the 3D VOIs only covered a volume in the right lobe. This might have rendered 3D-based models less accurate when parenchymal involvement is inhomogeneously distributed, as in patients with primary sclerosing cholangitis (PSC). We considered segmentation of the entire liver parenchyma for 3D VOI measurements. However, segmentation of the whole liver in 3D might be a source of bias and render the results less accurate, since it would be difficult to exclude focal lesions, large (> 5 mm) blood vessels and bile ducts, as well as regions severely affected by artifacts. Furthermore, segmentation of the whole liver would be time-consuming, and our aim was to investigate a gadoxetic acid-enhanced MRI-based radiomics model that could be easily integrated into routine clinical practice.

Liver biopsy is indicated if noninvasive diagnostic tests fail to yield a definitive etiology of liver cirrhosis. It is a valuable means for diagnosis and differentiation of a wide range of liver diseases such as storage and metabolic diseases, autoimmune hepatitis (AIH), fatty liver diseases, and cholestatic liver diseases (such as small-duct PSC and immunoglobulin G4-associated cholangitis)20. Should the radiomics model prove to be comparable to liver biopsy in identifying the etiology of liver cirrhosis, this would have important benefits for patients, since radiomics analysis does not involve any invasive procedures.

Our study has several limitations. First, we used a retrospective study design. Second, the small population size and the uneven distribution of patients among subgroups are major limitations. Only a few MRI examinations were available for cross-validation, especially for NASH- and AIH-associated cirrhosis. We had to combine patients into broader categories including multiple related etiologies (as in cholestatic liver disease and viral hepatitis-induced cirrhosis) to further improve the models. Model performance might have been better if the number of MRI examinations had been larger and more balanced across different subgroups. Future studies need to involve more patients to further explore the possibility of subgroup analysis such as identification of different etiologies of cholestatic liver disease, which is currently an indication for liver biopsy. Third, radiomics is a technology still under development, and no standardized definition of radiomics-based features has been established21. Fourth, the ICC indicated only moderate reliability. This could be explained by the low number of patients, which might have hindered a robust estimation of interobserver reproducibility of the interpreted radiomics features. Fifth, we did not evaluate how model performance may be affected by various demographic characteristics and clinical settings such as patient age, aspartate transaminase-to-platelet ratio index (APRI) score, or presence of focal liver lesions. Sixth, it was not feasible to fix patient characteristics for either the training or the testing subset because of the random selection of MRI examinations in the fivefold cross-validation. Finally, we only used HBP images to evaluate radiomics features. Performance of the model might be improved by including several pulse sequences in the analysis, especially diffusion-weighted images, T1-weighted images (with and without fat suppression), and T2-weighted images (with and without fat suppression).

In conclusion, radiomics-based analysis of hepatobiliary phase images of gadoxetic acid-enhanced MRI may be a promising noninvasive method for identifying the etiology of liver cirrhosis with better performance of the 2D- compared with the 3D-generated models. This approach needs to be validated in future prospective studies in larger patient populations.

Patients and methods

Patient population and study design

We retrospectively identified all patients (n = 524) with confirmed etiology of liver cirrhosis who underwent gadoxetic acid-enhanced MRI (n = 766) at our institution between January 2014 and August 2019. The etiology of liver cirrhosis was diagnosed by hepatologists primarily based on clinical examination and laboratory parameters, supported by characteristic imaging findings on gadoxetic acid-enhanced MRI such as in patients with cholestatic liver disease-induced cirrhosis. Histopathological diagnosis (i.e., liver biopsy) was reserved for patients in whom definite diagnosis of the etiology of liver cirrhosis was not confirmed by the above-mentioned methods, primarily in patients with autoimmune hepatitis-induced cirrhosis, NASH-associated cirrhosis, and patients in group 6 (other etiologies).

The study was approved by the local institutional review board (ethics committee of the Charité–Universitätsmedizin Berlin) and carried out in accordance with relevant guidelines and regulations. Informed consent was waived by the ethics committee of the Charité–Universitätsmedizin Berlin.

Inclusion criteria were: a confirmed etiology of liver cirrhosis, no previous liver transplantation or cancer-related treatment including surgical resection or locoregional interventions for liver tumors, no infiltrative or large hepatic focal lesions which could preclude segmentation of ROIs, and completion of the MRI examination. Exclusion criteria were: unconfirmed etiology of liver cirrhosis (including 35 patients (38 MRI scans) who were diagnosed with cryptogenic cirrhosis), past history of liver transplantation, liver resection or locoregional intervention for management of hepatic malignancy, presence of infiltrative or large tumor for which it was difficult to draw ROIs, and nondiagnostic image quality due to severe artifacts or technical problems during acquisition resulting in incomplete MRI examination.

After exclusion, 248 patients who underwent 306 MRI examinations remained for analysis (Fig. 4). The study population included 8 patients (8 MRI scans) with malignant portal vein thrombosis (PVT), 4 patients (4 MRI scans) with benign PVT, and 2 patients (4 MRI scans) who were on systemic sorafenib therapy (Nexavar, Bayer Pharma AG, Berlin, Germany).

Figure 4

Flow chart of inclusion and exclusion of patients with liver cirrhosis who underwent gadoxetic acid-enhanced MRI. *7 patients (8 MRI) had malignant portal vein thrombosis. **MRI examinations were discontinued prematurely, and no hepatobiliary phase was acquired.

Etiology of liver cirrhosis

MRI examinations were divided into 6 groups based on the etiology of liver cirrhosis (Table 1):

  1. Group 1: Alcoholic cirrhosis (n = 108).
  2. Group 2: Viral hepatitis-induced cirrhosis (n = 93).
  3. Group 3: Cholestatic liver disease-induced cirrhosis (n = 58).
  4. Group 4: NASH-associated cirrhosis (n = 28).
  5. Group 5: AIH-associated cirrhosis (n = 8).
  6. Group 6: Other etiologies (n = 11).

Laboratory parameters and serum fibrosis/cirrhosis test

Liver function tests (aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, gamma-glutamyl transferase, serum total bilirubin, and serum albumin), kidney function tests (serum creatinine and estimated glomerular filtration rate), international normalized ratio, and platelets, performed within 1 month before or after gadoxetic acid-enhanced MRI, were selected for analysis. The APRI score (n = 281) was calculated as follows (ref. 22): (aspartate transaminase [IU/L]/aspartate transaminase upper normal limit)/platelet count [× 10⁹/L].
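The APRI calculation above can be sketched in a few lines of Python. The function and parameter names are hypothetical. The default follows the formula as written in the text; the widely used form of the index additionally multiplies the ratio by 100, which is exposed here through an assumed `scale` argument rather than presumed.

```python
def apri(ast_iu_per_l, ast_upper_limit_iu_per_l, platelets_1e9_per_l, scale=1.0):
    """AST-to-platelet ratio index.

    Computes (AST / upper normal limit of AST) / platelet count [10^9/L],
    optionally multiplied by `scale` (hypothetical parameter; use 100 for
    the commonly reported form of the index).
    """
    if ast_upper_limit_iu_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("upper normal limit and platelet count must be positive")
    ast_ratio = ast_iu_per_l / ast_upper_limit_iu_per_l
    return scale * ast_ratio / platelets_1e9_per_l


# Illustrative values only (not study data): AST 80 IU/L, limit 40 IU/L,
# platelets 100 x 10^9/L.
as_written = apri(80, 40, 100)
common_form = apri(80, 40, 100, scale=100)
```

With these illustrative inputs the ratio as written is 0.02, and the commonly reported form is 2.0.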

MRI examinations

All MRI examinations were performed on a 1.5 T Magnetom Aera (Siemens Healthcare, Erlangen, Germany) using an eight-channel body phased-array coil. Transverse T1-weighted images (T1WIs) (volume-interpolated breath-hold examination (VIBE) sequence covering the entire liver with 60–80 slices and an adjusted field of view of 255–300 × 340–400 mm) were acquired before and approximately 20 min after manual intravenous bolus administration of 0.1 ml per kg body weight of gadoxetic acid (Gd-EOB-DTPA, gadoxetate disodium; Primovist/Eovist, Bayer HealthCare, Berlin, Germany)23. Imaging parameters were as follows: repetition time (TR) of 4.58 ms, echo time (TE) of 2.25 ms, flip angle (FA) of 9°, slice thickness of 3 mm, and matrix size of 276 × 340.

Workflow of radiomics model

The workflow included four steps: liver parenchymal segmentation, feature extraction, model construction, and, finally, model evaluation.

A reader with 10 years of experience in abdominal imaging and MRI who was blinded to the patients’ clinical and laboratory findings reviewed all MRI examinations and extracted radiomics features. To assess interobserver reproducibility using intra-class correlation coefficient (ICC), a second reader with 5 years of experience extracted the features in a randomly selected group of 30 patients. Axial VIBE T1WIs acquired approximately 20 min after gadoxetic acid administration, i.e., in the HBP, were imported into the radiomics platform as Digital Imaging and Communications in Medicine (DICOM) files. Texture analysis was performed using LIFEx software, version 5.10 (French Alternative Energies and Atomic Energy Commission, http://www.lifexsoft.org)24.

Liver parenchymal segmentation

Two-dimensional ROIs and 3D VOIs were segmented using the drawing tools in the LIFEx software. The 2D ROI was drawn manually just above the level of the right portal vein, covering an entire slice of the liver parenchyma, using the 2D drawing tool (mean area, 41.56 ± 10.56 cm²; range, 17.8–74.3 cm²). A 3D VOI measuring about 40 cm³ (mean volume, 40.8 ± 6 cm³; range, 13.8–63.6 cm³) was segmented in the right posterior segment of the liver using the 3D drawing tool. The ROIs and VOIs were drawn 5 mm away from the liver capsule, avoiding large blood vessels (caliber > 5 mm), dilated bile ducts, tumor masses, and artifacts (Fig. 5). The performance of radiomics models generated using 2D- and 3D-extracted features was compared.

Figure 5

Gadoxetic acid-enhanced hepatobiliary phase (HBP) MR images showing region of interest (ROI) segmentation in two-dimensional (2D) and three-dimensional (3D) format. HBP images, axial before (a) and after (b,c) 2D (b) and 3D (c) ROI segmentation as well as coronal (d) reconstructed images showing 3D ROI segmentation. Patient 1 is a 36-year-old female with nonalcoholic steatohepatitis (NASH)-associated liver cirrhosis. Patient 2 is a 47-year-old male with primary sclerosing cholangitis complicated by liver cirrhosis.

Radiomics feature extraction

A total of 45 features were extracted from the delineated ROIs. The extracted features were divided into two categories: nontextural features and textural features. In the first order, nontextural features including histogram-based indices and conventional indices were extracted. In the second or higher order, textural features were extracted based on four textural matrices: grey-level co-occurrence matrix (GLCM), neighborhood grey-level difference matrix (NGLDM), grey-level run-length matrix (GLRLM), and grey-level size zone matrix (GLSZM). The extracted features are listed in supplementary Table 1.
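To illustrate what a second-order textural feature measures, the following minimal sketch builds a grey-level co-occurrence matrix for a small 2D image and computes its base-2 entropy. It is a simplified illustration, not the LIFEx implementation: it assumes a single pixel offset in 2D and a symmetric matrix, and the function names are hypothetical.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one offset (dx, dy).

    `image` is a list of rows of already discretized grey levels.
    Returns a dict mapping (level_a, level_b) to a probability.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # count both directions (symmetric matrix)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_entropy_log2(p):
    """Shannon entropy (base 2) of the co-occurrence probabilities."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


# Toy 2 x 2 image with two grey levels (illustrative only).
toy = [[1, 1],
       [2, 2]]
toy_entropy = glcm_entropy_log2(glcm(toy))
```

A perfectly uniform region yields an entropy of zero, whereas more heterogeneous grey-level patterns spread probability over more level pairs and raise the entropy.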

Preprocessing before feature extraction included image spatial resampling and gray-level normalization. Voxel sizes were resampled to the same size of 1.2 × 1.2 × 3 mm³ using the relative intensity resampling method between the minimum and the maximum in the VOI. Image gray-level intensity was normalized to a scale of 1 to 64 (ref. 24).
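The grey-level normalization step can be pictured with a generic min-max binning routine. This is a sketch under stated assumptions: it maps intensities between the minimum and maximum of the segmented region onto equal-width integer levels starting at 1, and it is not claimed to reproduce the exact rounding rule of the software used in the study. The function name and the default of 64 levels are illustrative.

```python
def discretize_relative(values, n_levels=64):
    """Map intensities to integer grey levels 1..n_levels.

    Bins have equal width between the minimum and maximum of `values`
    (relative resampling). The maximum is assigned to the top level.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant region carries no contrast; place it in level 1.
        return [1 for _ in values]
    width = (hi - lo) / n_levels
    return [min(int((v - lo) / width) + 1, n_levels) for v in values]


# Illustrative intensities only (not study data).
levels = discretize_relative([0, 32, 63.9, 64])
```

Because the bins are defined relative to each region's own intensity range, the resulting levels are comparable across examinations even when absolute MRI signal intensities are not.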

Radiomics model construction and evaluation

All MRI examinations included in the study were randomized in a 4:1 ratio into training (n = 245) and testing (n = 61) subsets using computer-generated random numbers without matching any patient characteristics.
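A 4:1 random split at the examination level can be sketched as follows. The seed, function name, and use of Python's standard `random` module are assumptions for illustration; the text does not state which random number generator was used.

```python
import random


def split_examinations(exam_ids, train_fraction=0.8, seed=0):
    """Shuffle examination identifiers and split them into two subsets."""
    ids = list(exam_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# 306 examinations, identified here by hypothetical integer indices.
train_ids, test_ids = split_examinations(range(306))
```

With 306 examinations and a training fraction of 0.8, this yields subsets of 245 and 61 examinations, matching the sizes reported above.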

In the training subset, the LASSO logistic regression model with fivefold cross-validation was used to select the optimal and most informative features for predicting the etiology of liver cirrhosis. The selected features were subjected to further selection and modeling via binary logistic regression with elastic net regularization. Elimination of unreliable and statistically insignificant features was important to avoid overfitting and thus decrease running time and increase accuracy of the radiomics model19. A further step was to choose suitable classifiers. The radiomics signature was calculated using supervised classification algorithms. One-vs-one multiclass classification was used to differentiate between all groups (6 classes) while binary classification was used to distinguish between cholestatic and noncholestatic liver cirrhosis. The classification algorithms used are listed in supplementary Table 2. Following completion of training on classifiers, the testing subset was analyzed to determine the diagnostic performance of the models constructed in predicting the etiology of liver cirrhosis. Sensitivity, specificity, predictive values, accuracy, and receiver operating characteristic (ROC) curves were analyzed to evaluate the performance of the radiomics models.
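The performance metrics named above follow directly from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from binary confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Made-up counts for illustration only.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

For a multiclass problem, the same function can be applied per class by treating that class as positive and all remaining classes as negative.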

Statistical analysis

LASSO logistic regression was performed using Stata/MP version 16.0 (StataCorp, College Station, Texas, USA). Performance of the logistic regression model was evaluated using the AUC of the ROC curve. Statistical comparison of 2D and 3D ROIs was performed using the chi-square test. Classification with different methods and ROC analysis were performed with MATLAB R2019b (MathWorks, Natick, MA, USA). Heat maps of the statistically significant features were plotted using the “heatmap.2” function of the R package gplots (R software version 3.6.3, R Foundation for Statistical Computing, Vienna, Austria, https://www.r-project.org). Other statistical analyses were performed with Stata/MP.
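The empirical AUC has a simple interpretation: it is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. It is an illustration of the quantity, not the code used in the study, and the scores are invented.

```python
def empirical_auc(scores_pos, scores_neg):
    """Area under the ROC curve as a pairwise ranking probability."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Invented predicted probabilities for illustration only.
auc_value = empirical_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
```

This pairwise quantity is the Mann-Whitney statistic on which nonparametric comparisons of AUCs are built.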

Intraclass correlation coefficient (ICC) was evaluated based on a two-way mixed-effects model for absolute agreement. The 95% confidence intervals were calculated using 1000 bootstrap iterations25,26. Comparison between different AUC values was calculated using the nonparametric method by DeLong et al.27 Categorical data are provided as absolute numbers (percentages) and continuous variables as mean ± SD. P-values < 0.05 were considered statistically significant.
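For readers unfamiliar with the absolute-agreement ICC, the following sketch computes it from two-way ANOVA mean squares. It assumes the single-measurement form, which the text does not specify, and a complete table with one value per subject and reader; the function name and data are hypothetical.

```python
def icc_absolute_agreement(ratings):
    """Two-way, absolute-agreement, single-measurement ICC.

    `ratings` is a list of rows, one row per subject and one column per
    reader. Uses ICC = (MSR - MSE) / (MSR + (k-1)*MSE + k*(MSC - MSE)/n).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy data: the second reader is always exactly 1 unit higher.
offset_icc = icc_absolute_agreement([[1, 2], [2, 3], [3, 4]])
```

In the toy data the two readers rank the subjects identically, yet the ICC is about 0.67 rather than 1 because absolute agreement penalizes the systematic offset between readers. A bootstrap confidence interval could be obtained by resampling the subject rows with replacement and recomputing the coefficient.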