A radiomics-based model to classify the etiology of liver cirrhosis using gadoxetic acid-enhanced MRI

The implementation of radiomics in radiology is gaining interest due to its wide range of applications. To develop a radiomics-based model for classifying the etiology of liver cirrhosis using gadoxetic acid-enhanced MRI, 248 patients with a known etiology of liver cirrhosis who underwent 306 gadoxetic acid-enhanced MRI examinations were included in the analysis. MRI examinations were classified into 6 groups according to the etiology of liver cirrhosis: alcoholic cirrhosis, viral hepatitis, cholestatic liver disease, nonalcoholic steatohepatitis (NASH), autoimmune hepatitis, and other. MRI examinations were randomized into training and testing subsets. Radiomics features were extracted from regions of interest segmented in the hepatobiliary phase images. The fivefold cross-validated models (2-dimensional (2D)- and 3-dimensional (3D)-based) differentiating cholestatic cirrhosis from noncholestatic etiologies had the best accuracy (87.5%, 85.6%), sensitivity (97.6%, 95.6%), predictive value (0.883, 0.877), and area under the curve (AUC) (0.960, 0.910). The AUC was larger in the 2D model for viral hepatitis, cholestatic cirrhosis, and NASH-associated cirrhosis (P-values of 0.05, 0.05, and 0.87, respectively). In alcoholic cirrhosis, the AUC for the 3D model was larger (P = 0.01). The overall intra-class correlation coefficient (ICC) estimates and their 95% confidence intervals (CI) for all features combined were 0.68 (CI 0.56–0.87) for 2D and 0.71 (CI 0.61–0.93) for 3D measurements, suggesting moderate reliability. Radiomics-based analysis of hepatobiliary phase images of gadoxetic acid-enhanced MRI may be a promising noninvasive method for identifying the etiology of liver cirrhosis with better performance of the 2D- compared with the 3D-generated models.


Differentiating between all included etiologies of liver cirrhosis.
In one-vs-one multiclass classification differentiating between all 6 etiologies in the training subset, the fivefold cross-validated linear support vector machine (SVM) yielded the highest accuracy (52. .5% for models derived from two-dimensional (2D) regions of interest (ROIs) and 53.6-78.5% for models derived from three-dimensional (3D) volumes of interest (VOIs)). The highest sensitivity was noted for alcoholic cirrhosis (74.1% for 2D-ROI-derived models and 75.9% for 3D-VOI-derived models) and the highest specificity for cholestatic liver disease-induced cirrhosis (95% for 2D and 86.7% for 3D). Without validation, the fine k-nearest neighbor (KNN) classifier had the highest accuracy of 100% (Table 2).
In logistic regression analysis of the testing subset, three of the 45 features extracted and analyzed were omitted (Histo_Excess Kurtosis, Histo_Entropy_log2, GLCM_Entropy_log2) because of collinearity. The largest AUC was observed for differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies (AUC = 0.960 for 2D-derived models and 0.910 for 3D-derived models, P < 0.001).
In least absolute shrinkage and selection operator (LASSO) analysis, the largest deviance ratio in both the training (0.132 for 2D and 0.316 for 3D) and testing (0.131 for 2D and 0.196 for 3D) subsets was observed for the model differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies (Table 5). The features selected with LASSO are listed in Table 6. Heat maps of the significant features were plotted (Fig. 3).
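As an illustration of how a deviance ratio can be read, the following minimal sketch assumes the common definition for binary logistic models, namely one minus the ratio of the model deviance to the deviance of an intercept-only model. The labels, probabilities, and function names are hypothetical and are not taken from the study or from the Stata output.

```python
import math

def binomial_deviance(y, p, eps=1e-12):
    """Binomial deviance: -2 * log-likelihood of labels y under probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -2.0 * total

def deviance_ratio(y, p):
    """Fraction of the null (intercept-only) deviance explained by the model."""
    base_rate = sum(y) / len(y)
    null_dev = binomial_deviance(y, [base_rate] * len(y))
    return 1.0 - binomial_deviance(y, p) / null_dev

# Hypothetical labels (1 = cholestatic, 0 = noncholestatic) and predicted probabilities.
labels = [1, 0, 0, 1, 0, 0, 0, 1]
probs = [0.8, 0.2, 0.3, 0.6, 0.1, 0.4, 0.2, 0.7]
ratio = deviance_ratio(labels, probs)
```

Under this assumed definition, a deviance ratio of 0.316 would mean that the model reduces the deviance by roughly one third relative to a model that predicts only the overall prevalence.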

Intra-class correlation coefficient (ICC). The overall intra-class correlation coefficient (ICC) estimates and their 95% confidence intervals (CI) for all features combined were 0.68 (CI 0.56-0.87) for 2D and 0.71 (CI 0.61-0.93) for 3D measurements, suggesting moderate reliability 18. Individual ICC values for the most statistically relevant features in 2D and 3D measurements are listed in Table 7.

Discussion
Liver cirrhosis is an increasing cause of death worldwide. Approximately 1 million people worldwide die from complications of cirrhosis each year. In addition, cirrhosis accounts for 1.6% and 2.1% of the worldwide burden of disability-adjusted life years and years of life lost, respectively 1,6.
In the present study, we developed and evaluated a radiomics-based model to predict the etiology of liver cirrhosis from HBP images of gadoxetic acid-enhanced MRI. Gadoxetic acid-enhanced MRI has been previously investigated for the staging of liver fibrosis using radiomics analysis 10 and a deep convolutional neural network (DCNN) 7. To the best of our knowledge, no study to date has proposed that the etiology of liver cirrhosis can be predicted using a radiomics-based model. The fivefold cross-validated radiomics models created using HBP images acquired with gadoxetic acid-enhanced MRI allowed classification of the etiology of liver cirrhosis with AUCs of 0.767-0.960, accuracies of 52.8-87.6%, and positive predictive values of 0.377-0.883. The highest diagnostic accuracy of 87.6% was achieved for the 2D-based model differentiating cholestatic liver disease-induced cirrhosis from noncholestatic etiologies.
During the process of radiomics model building, we investigated different techniques, including supervised machine learning and LASSO regression analysis, to explore characteristics and identify the optimal features for model construction. LASSO turned out to have several advantages, as it reduces redundancy, dependency, and dimensionality of the features and thus enhances model accuracy 19. In addition, LASSO enables the generation of interpretable models using variable selection and regularization as well as integration of selected features into a radiomics signature 19. As for classification algorithms, we used both binary and multinomial classification.
However, a binary model yields a single probability value, which can be more readily interpreted than the output of a multinomial model, which is more complex and returns multiple probability values for the different etiologies of liver cirrhosis 10. This explains the higher performance metrics we obtained for the binary model differentiating cholestatic liver disease-induced cirrhosis from noncholestatic cirrhosis. In addition, we constructed the radiomics models using features generated from 2D ROIs and 3D VOIs. In general, segmentation of the 3D VOI was easier and faster. However, the 2D-generated models performed better, as evidenced by higher accuracy, sensitivity, predictive values, and AUCs, except for alcoholic cirrhosis, where the 3D-based model performed better. A possible explanation might be that 2D ROIs provided a more representative sample of the liver parenchyma, since an entire axial section of the liver parenchyma was segmented, while 3D VOIs only covered a volume in the right lobe, which might have rendered 3D-based models less accurate in view of the inhomogeneous distribution of parenchymal involvement, as in patients with primary sclerosing cholangitis (PSC). We considered segmentation of the entire liver parenchyma for 3D VOI measurements. However, segmentation of the whole liver in a 3D VOI might be a source of bias and render the results less accurate, since it would be difficult to exclude focal lesions, large (> 5 mm) blood vessels and bile ducts, and regions severely affected by artifacts. Furthermore, segmentation of the whole liver would be time consuming, and our aim was to investigate a gadoxetic acid-enhanced MRI-based radiomics model that could be easily integrated into routine clinical practice.
Liver biopsy is indicated if noninvasive diagnostic tests fail to yield a definitive etiology of liver cirrhosis. It is a valuable means for diagnosis and differentiation of a wide range of liver diseases such as storage and metabolic diseases, autoimmune hepatitis (AIH), fatty liver diseases, and cholestatic liver diseases (such as small-duct PSC and immunoglobulin G4-associated cholangitis) 20. Should the radiomics model prove to be comparable to liver biopsy in identifying the etiology of liver cirrhosis, this would have important benefits for patients, since radiomics analysis does not involve any invasive procedures.
Table 3. Logistic regression analysis of the testing subset for alcoholic cirrhosis, viral hepatitis-induced cirrhosis, cholestatic liver disease-induced cirrhosis, and NASH-associated cirrhosis. NASH, nonalcoholic steatohepatitis.
Our study has several limitations. First, we used a retrospective study design. Second, the small population size and the uneven distribution of patients among subgroups are major limitations. Only a few MRI examinations were available for cross-validation, especially for NASH- and AIH-associated cirrhosis. We had to combine patients into broader categories including multiple related etiologies (as in cholestatic liver disease and viral hepatitis-induced cirrhosis) to further improve the models. Model performance might have been better if the number of MRI examinations had been larger and more balanced across subgroups. Future studies need to involve more patients to further explore the possibility of subgroup analysis such as identification of different etiologies of cholestatic liver disease, which is currently an indication for liver biopsy. Third, radiomics technology is still developing, which is another major limitation: no standardized definition of radiomics-based features has been established 21. Fourth, the ICC indicated only moderate reliability. This could be explained by the low number of patients, which might have hindered a robust estimation of interobserver reproducibility of the interpreted radiomics features. Fifth, we did not evaluate how model performance may be affected by various demographic characteristics and clinical settings such as patient age, aspartate transaminase-to-platelet ratio index (APRI) score, or presence of focal liver lesions. Sixth, patient characteristics could not be fixed for either the training or the testing collective because of the random selection of MRI examinations in the fivefold cross-validation. Finally, we only used HBP images to evaluate radiomics features. Performance of the model might be improved by including several pulse sequences in the analysis, especially diffusion-weighted images, T1-weighted images (with and without fat suppression), and T2-weighted images (with and without fat suppression).
Table 4. List of statistically significant features in regression analysis for alcoholic cirrhosis, viral hepatitis-induced cirrhosis, cholestatic liver disease-induced cirrhosis, and NASH-associated cirrhosis (groups 1-4). NASH, nonalcoholic steatohepatitis.
In conclusion, radiomics-based analysis of hepatobiliary phase images of gadoxetic acid-enhanced MRI may be a promising noninvasive method for identifying the etiology of liver cirrhosis with better performance of the 2D- compared with the 3D-generated models. This approach needs to be validated in future prospective studies in larger patient populations.

Patients and methods
Patient population and study design. We retrospectively identified all patients (n = 524) with confirmed etiology of liver cirrhosis who underwent gadoxetic acid-enhanced MRI (n = 766) at our institution between January 2014 and August 2019. The etiology of liver cirrhosis was diagnosed by hepatologists primarily based on clinical examination and laboratory parameters, supported by characteristic imaging findings on gadoxetic acid-enhanced MRI, for example in patients with cholestatic liver disease-induced cirrhosis. Histopathological diagnosis (i.e., liver biopsy) was reserved for patients in whom a definite diagnosis of the etiology of liver cirrhosis could not be established by the above-mentioned methods, primarily in patients with autoimmune hepatitis-induced cirrhosis, NASH-associated cirrhosis, and patients in group 6 (other etiologies). The study was approved by the local institutional review board (ethics committee of the Charité-Universitätsmedizin Berlin) and carried out in accordance with relevant guidelines and regulations. Informed consent was waived by the ethics committee of the Charité-Universitätsmedizin Berlin.
Inclusion criteria were: a confirmed etiology of liver cirrhosis, no previous liver transplantation or cancer-related treatment including surgical resection or locoregional interventions for liver tumors, no infiltrative or large hepatic focal lesions which could preclude segmentation of ROIs, and completion of the MRI examination. Exclusion criteria were: unconfirmed etiology of liver cirrhosis (including 35 patients (38 MRI scans) who were diagnosed with cryptogenic cirrhosis), past history of liver transplantation, liver resection or locoregional intervention for management of hepatic malignancy, presence of infiltrative or large tumor for which it was difficult to draw ROIs, and nondiagnostic image quality due to severe artifacts or technical problems during acquisition resulting in incomplete MRI examination.

MRI examinations.
All MRI examinations were performed on a 1.5 T Magnetom Aera (Siemens Healthcare, Erlangen, Germany) using an eight-channel body phased-array coil. Transverse T1-weighted images (T1WIs) (volume-interpolated breath-hold examination (VIBE) sequence covering the entire liver with 60-80 slices and an adjusted field of view of 255-300 × 340-400 mm) were acquired before and approximately 20 min after manual intravenous bolus administration of 0.1 ml per kg body weight of gadoxetic acid (Gd-EOB-DTPA, gadoxetate disodium; Primovist/Eovist, Bayer HealthCare, Berlin, Germany) 23.
Workflow of radiomics model. The workflow included four steps: liver parenchymal segmentation, feature extraction, model construction, and, finally, model evaluation.
A reader with 10 years of experience in abdominal imaging and MRI who was blinded to the patients' clinical and laboratory findings reviewed all MRI examinations and extracted radiomics features. To assess interobserver reproducibility using the intra-class correlation coefficient (ICC), a second reader with 5 years of experience extracted the features in a randomly selected group of 30 patients. Axial VIBE T1WIs acquired approximately 20 min after gadoxetic acid administration, i.e., in the HBP, were imported into the radiomics platform as Digital Imaging and Communications in Medicine (DICOM) files. Texture analysis was performed using LIFEx software, version 5.10 (French Alternative Energies and Atomic Energy Commission, http://www.lifexsoft.org) 24.
Liver parenchymal segmentation. Two-dimensional ROIs and 3D VOIs were segmented using the drawing tools in the LIFEx software. The 2D ROI was drawn manually just above the level of the right portal vein, covering an entire slice of the liver parenchyma, using the 2D drawing tool (mean area, 41.56 ± 10.56 cm²; range, 17.8-74.3 cm²). A 3D VOI measuring about 40 cm³ (mean volume, 40.8 ± 6 cm³; range, 13.8-63.6 cm³) was segmented in the right posterior segment of the liver using the 3D drawing tool. The ROIs and VOIs were drawn 5 mm away from the liver capsule, avoiding large blood vessels (caliber > 5 mm), dilated bile ducts, tumor masses, and artifacts (Fig. 5). The performance of radiomics models generated using 2D- and 3D-extracted features was compared.
Radiomics feature extraction. A total of 45 features were extracted from the delineated ROIs. The extracted features were divided into two categories: nontextural features and textural features. In the first order, nontextural features including histogram-based indices and conventional indices were extracted. In the second or higher order, textural features were extracted based on four textural matrices: grey-level co-occurrence matrix (GLCM), neighborhood grey-level difference matrix (NGLDM), grey-level run-length matrix (GLRLM), and grey-level size zone matrix (GLSZM). The extracted features are listed in supplementary Table 1.
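To make the notion of a second-order textural feature concrete, the following minimal sketch computes a base-2 entropy from a grey-level co-occurrence matrix for a single pixel offset in a small 2D array. It assumes already discretized grey levels and a symmetric, normalized matrix; the function name and the toy array are hypothetical, and the exact definitions used by LIFEx (for example, averaging over several directions) may differ.

```python
import math
from collections import Counter

def glcm_entropy_log2(image, offset=(0, 1)):
    """Base-2 entropy of a symmetric, normalized GLCM for one pixel offset."""
    d_row, d_col = offset
    counts = Counter()
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1  # co-occurrence in the forward direction
                counts[(b, a)] += 1  # mirrored entry makes the matrix symmetric
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical 3 x 3 region with discretized grey levels 1 to 3.
roi = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
entropy = glcm_entropy_log2(roi)
```

A perfectly homogeneous region yields an entropy of zero, whereas more heterogeneous grey-level pairings increase the value.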
Preprocessing before feature extraction included image spatial resampling and gray-level normalization. Voxels were resampled to a uniform size of 1.2 × 1.2 × 3 mm³, and image gray-level intensities were normalized to a scale of 1 to 64 using the relative intensity resampling method between the minimum and the maximum in the VOI 24.
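The gray-level normalization step can be sketched as follows. This is a minimal illustration that assumes equal-width binning between the minimum and maximum intensity of the VOI; the function name and intensity values are hypothetical, and the exact rounding convention of the software may differ.

```python
def discretize_relative(intensities, n_levels=64):
    """Map intensities to integer gray levels 1..n_levels between the VOI min and max."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A constant region carries no contrast; assign every voxel to level 1.
        return [1] * len(intensities)
    scale = n_levels / (hi - lo)
    # Equal-width bins; the maximum intensity is clipped into the top level.
    return [min(int((v - lo) * scale) + 1, n_levels) for v in intensities]

# Hypothetical signal intensities sampled from one VOI.
voi = [120.0, 340.5, 215.0, 560.0, 120.0, 433.2]
levels = discretize_relative(voi)
```

Because the bins are defined relative to each VOI, the resulting levels describe the distribution of contrast within the region rather than absolute signal intensity.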

Radiomics model construction and evaluation.
All MRI examinations included in the study were randomized in a 4:1 ratio into training (n = 245) and testing (n = 61) subsets using computer-generated random numbers without matching any patient characteristics.
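A minimal sketch of such a 4:1 randomization is shown below. The seed, the function name, and the use of integer examination identifiers are assumptions for illustration and do not reproduce the random numbers used in the study.

```python
import random

def split_examinations(exam_ids, train_fraction=0.8, seed=0):
    """Randomly split examination identifiers into training and testing subsets."""
    rng = random.Random(seed)  # fixed seed (assumed) for a reproducible split
    shuffled = list(exam_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 306 examinations, identified here by hypothetical integer indices.
train_ids, test_ids = split_examinations(range(306))
```

With 306 examinations this yields 245 training and 61 testing examinations. Because the split is made at the examination level, a grouped split by patient would be an alternative design when several examinations belong to the same patient.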
In the training subset, the LASSO logistic regression model with fivefold cross-validation was used to select the optimal and most informative features for predicting the etiology of liver cirrhosis. The selected features were subjected to further selection and modeling via binary logistic regression with elastic net regularization. Elimination of unreliable and statistically insignificant features was important to avoid overfitting and thus decrease running time and increase accuracy of the radiomics model 19. A further step was to choose suitable classifiers. The radiomics signature was calculated using supervised classification algorithms. One-vs-one multiclass classification was used to differentiate between all groups (6 classes), while binary classification was used to distinguish between cholestatic and noncholestatic liver cirrhosis. The classification algorithms used are listed in supplementary Table 2. After the classifiers had been trained, the testing subset was analyzed to determine the diagnostic performance of the constructed models in predicting the etiology of liver cirrhosis. Sensitivity, specificity, predictive values, accuracy, and receiver operating characteristic (ROC) curves were analyzed to evaluate the performance of the radiomics models.
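The one-vs-one scheme can be illustrated with a short sketch: one binary classifier is built for every pair of classes, and the class collecting the most pairwise votes is predicted. The stub classifiers and scores below are hypothetical stand-ins for trained models and are not the classifiers used in the study.

```python
from collections import Counter
from itertools import combinations

ETIOLOGIES = ["alcoholic", "viral", "cholestatic", "NASH", "autoimmune", "other"]

def one_vs_one_predict(sample, pairwise_classifiers):
    """Return the class that wins the most pairwise votes for one sample."""
    votes = Counter(clf(sample) for clf in pairwise_classifiers.values())
    return votes.most_common(1)[0][0]

def make_stub(class_a, class_b):
    """Hypothetical pairwise classifier: picks the class with the larger score."""
    return lambda scores: class_a if scores[class_a] >= scores[class_b] else class_b

# Six classes give 6 * 5 / 2 = 15 pairwise classifiers.
class_pairs = list(combinations(ETIOLOGIES, 2))
classifiers = {pair: make_stub(*pair) for pair in class_pairs}

# Hypothetical per-class scores for a single examination.
scores = {"alcoholic": 0.2, "viral": 0.1, "cholestatic": 0.9,
          "NASH": 0.3, "autoimmune": 0.05, "other": 0.15}
predicted = one_vs_one_predict(scores, classifiers)
```

In practice, each pairwise classifier would be trained only on the examinations of its two classes, which keeps the individual problems small but means that sparsely represented classes contribute few training cases.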
Statistical analysis. LASSO logistic regression was performed using Stata/MP version 16.0 (StataCorp, College Station, Texas, USA). Performance of the logistic regression model was evaluated using the AUC of the ROC curve. Statistical comparison of 2D and 3D ROIs was performed using the chi-square test. Classification with different methods and ROC analysis were performed with MATLAB R2019b (MathWorks, Natick, MA, USA). Heat maps of the statistically significant features were plotted using the "heatmap.2" function in R (R software version 3.6.3, R Foundation for Statistical Computing, Vienna, Austria, https://www.r-project.org). Other statistical analyses were performed with Stata/MP.
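For orientation, the AUC can be understood as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The minimal sketch below computes it in this rank-based form; the labels and scores are hypothetical, and the sketch does not reproduce the Stata or MATLAB routines used in the analysis.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = cholestatic) and model scores for eight examinations.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.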
The intraclass correlation coefficient (ICC) was evaluated based on a two-way mixed-effects model for absolute agreement. The 95% confidence intervals were calculated using 1000 bootstrap iterations 25,26. Comparison between different AUC values was calculated using the nonparametric method by DeLong et al. 27. Categorical data are provided as absolute numbers (percentages) and continuous variables as mean ± SD. P-values < 0.05 were considered statistically significant.
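The ICC computation with bootstrap confidence intervals can be sketched as follows. This is a minimal illustration that assumes the single-measurement, absolute-agreement form of the two-way ICC and a percentile bootstrap over patients; the simulated readings, seeds, and function names are hypothetical and do not reproduce the study data or the Stata implementation.

```python
import random

def icc_absolute_agreement(ratings):
    """Two-way, absolute-agreement, single-measurement ICC.

    ratings: one list per subject, holding one value per reader.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ms_rows = ss_rows / (n - 1)                                    # between subjects
    ms_cols = ss_cols / (k - 1)                                    # between readers
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def bootstrap_ci(ratings, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval obtained by resampling subjects with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        icc_absolute_agreement([rng.choice(ratings) for _ in ratings])
        for _ in range(n_boot)
    )
    low = estimates[int(alpha / 2 * n_boot)]
    high = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return low, high

# Simulated feature values for 30 patients measured by two readers (hypothetical).
sim = random.Random(1)
true_values = [sim.gauss(50, 10) for _ in range(30)]
readings = [[t + sim.gauss(0, 3), t + sim.gauss(0, 3)] for t in true_values]
icc = icc_absolute_agreement(readings)
ci_low, ci_high = bootstrap_ci(readings)
```

Resampling whole patients rather than individual readings keeps the pairing between the two readers intact in every bootstrap sample.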