Preoperative nomogram for microvascular invasion prediction based on clinical database in hepatocellular carcinoma

The presence of microvascular invasion (MVI) is a critical determinant of early hepatocellular carcinoma (HCC) recurrence and prognosis. We developed a nomogram integrating clinical laboratory examinations and radiological imaging results from our clinical database to predict the presence of MVI preoperatively in HCC patients. A total of 242 patients with pathologically confirmed HCC at the Ningbo Medical Centre Lihuili Hospital from September 2015 to January 2021 were included in this study. Baseline clinical laboratory examinations and radiological imaging results were collected from our clinical database. LASSO regression was used for data dimensionality reduction and element selection. Multivariate logistic regression analysis was performed to identify the independent risk factors associated with MVI, and a nomogram for predicting the presence of MVI in HCC was then established. Nomogram performance was assessed via internal validation and calibration curve statistics. Decision curve analysis (DCA) was conducted to determine the clinical usefulness of the nomogram by quantifying the net benefit across a range of threshold probabilities. Survival analysis indicated that the probabilities of overall survival (OS) and recurrence-free survival (RFS) differed significantly between patients with and without MVI (P < 0.05). Histopathologically identified MVI was found in 117 of 242 patients (48.3%). The preoperative factors associated with MVI were large tumor diameter (OR = 1.271, 95%CI: 1.137–1.420, P < 0.001), AFP level greater than 20 ng/mL (20–400 vs. ≤ 20, OR = 2.025, 95%CI: 1.056–3.885, P = 0.034; > 400 vs. ≤ 20, OR = 3.281, 95%CI: 1.661–6.480, P = 0.001), and total bilirubin (TB) level greater than 23 μmol/L (OR = 2.247, 95%CI: 1.037–4.868, P = 0.040). Incorporating tumor diameter, AFP and TB, the nomogram achieved a concordance index of 0.725 (95%CI: 0.661–0.788) in predicting MVI presence.
Nomogram analysis showed that the total factor score ranged from 0 to 160, and the corresponding risk rate ranged from 0.20 to 0.90. The DCA showed that when the threshold probability was > 5%, using the nomogram to diagnose MVI yielded more net benefit. The net benefit of the nomogram was also higher than that of any single variable within a threshold probability range of 0.3–0.8. In summary, the presence of MVI is an independent prognostic risk factor for RFS. The nomogram detailed here can preoperatively predict MVI presence in HCC patients and may constitute a useful clinical tool to guide rational and personalized therapeutic choices.

www.nature.com/scientificreports/
Microvascular invasion (MVI) is defined as the presence of tumor cells within a vascular lumen lined by endothelium that is visible only by microscopy 5,6, and is considered a critical determinant of early recurrence and survival in HCC. In the presence of MVI, tumor cells can spread and metastasize within the liver to form a portal vein tumor thrombus, multiple lesions, or distant metastases 7. Recurrence within 5 years develops in about 70% of HCC patients treated with curative surgical resection. By comparison, the 5-year recurrence rate after liver transplantation is 10-20%, lower than after surgical resection. Zhao et al. 8 found that patients with MVI benefited from anatomical hepatectomy in terms of disease-free survival rate compared with non-anatomical hepatectomy. Further research found that residual intrahepatic metastases were frequently considered the main cause of early recurrence within 2 years of tumor resection. Furthermore, early recurrence depends on the aggressiveness of the primary tumor, particularly the likelihood of MVI and satellitosis 9,10. Chan et al. 11 developed a practical statistical method that allows clinicians to estimate the risk of early recurrence from pre- and post-operative data, including MVI. Multiple features of MVI carry prognostic significance for HCC. Some studies have indicated that high-risk MVI patients did significantly worse with regard to both recurrence and survival 12-14.
At present, MVI is diagnosed by histopathological examination of surgical specimens obtained after HCC resection or liver transplantation. Consequently, the histopathological diagnosis has limited influence on preoperative decision making 15. An accurate preoperative estimate of MVI presence can help surgeons choose appropriate surgical approaches and improve HCC patients' prognosis based on risk-benefit assessment. However, surgeons face the challenge of accurately predicting MVI and of finding a uniform scheme or standard for doing so 16. Nomograms are considered reliable tools to integrate and quantify significant risk factors for disease prognosis 17,18. Therefore, the aim of our research was to construct a novel nomogram to predict the probability of MVI in HCC patients.

Results
Patient characteristics and prognostic outcomes. Survival analysis indicated that the probabilities of OS and RFS differed significantly between patients with and without MVI (P < 0.05, Fig. 1). The 1-year and 3-year probabilities of OS were 91.2% and 60.9%, respectively, for patients with MVI and 98.4% and 85.7%, respectively, for patients without MVI. The 1-year and 3-year probabilities of RFS were 61.2% and 45.6%, respectively, for patients with MVI and 85.0% and 65.8%, respectively, for patients without MVI.
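The survival probabilities above come from Kaplan-Meier curves. As a minimal sketch of the product-limit estimator behind such curves, the following Python function (the name `kaplan_meier` and the toy follow-up data are illustrative assumptions, not the study data or the R "survival" code used by the authors) multiplies the conditional survival fractions at each observed event time.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient (e.g. months since resection)
    events : 1 if the event (death or recurrence) was observed, 0 if censored
    """
    observed = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d > 0:
            # Conditional probability of surviving past t, given being at risk at t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Hypothetical follow-up data in months; NOT taken from the study cohort.
toy_times = [6, 12, 12, 18, 24, 30, 36, 36]
toy_events = [1, 1, 0, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"S({t}) = {s:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at observed events.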
Dimensionality reduction and element selection. AST, GGT, TB, AFP and tumor diameter were selected using LASSO binary logistic regression analysis. The LASSO coefficient profiles of the features were plotted (Fig. 2A). The optimum penalty parameter (lambda) in the LASSO model was selected by tenfold cross-validation using the minimum criterion. The partial likelihood deviance (binomial deviance) curve was plotted versus log(lambda). Dotted vertical lines were drawn at the values lambda.min and lambda.1se (Fig. 2B). Finally, we chose the value of lambda that minimized the cross-validated deviance (lambda.min).  Table 4). Incorporating tumor diameter, AFP and TB, the nomogram (Fig. 3A) achieved a concordance index of 0.725 (95%CI: 0.661-0.788), with 1000 bootstrap samples used to measure discrimination in predicting MVI presence (Fig. 3B). The sensitivity and specificity were 76.8% and 69.4%, respectively.
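The two candidate penalties marked in Fig. 2B can be read off the cross-validation summary. The sketch below shows, in plain Python, how lambda.min and lambda.1se are conventionally defined: lambda.min minimizes the mean cross-validated deviance, and lambda.1se is the largest lambda whose deviance lies within one standard error of that minimum. The function name `select_lambdas` and the toy numbers are assumptions for illustration; the study itself used the R package "glmnet".

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Pick lambda.min and lambda.1se from cross-validation summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated deviance for each lambda
    cv_se   : standard error of that mean across folds
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # lambda.1se: the largest (most heavily penalised) lambda whose
    # deviance is still within one standard error of the minimum
    lambda_1se = max(l for l, m in zip(lambdas, cv_mean) if m <= threshold)
    return lambdas[i_min], lambda_1se

# Hypothetical cross-validation results; NOT the values behind Fig. 2B.
toy_lambdas = [0.20, 0.10, 0.05, 0.02, 0.01]
toy_mean = [1.40, 1.31, 1.27, 1.25, 1.26]
toy_se = [0.03, 0.03, 0.03, 0.03, 0.03]
lam_min, lam_1se = select_lambdas(toy_lambdas, toy_mean, toy_se)
print("lambda.min =", lam_min, "lambda.1se =", lam_1se)
```

Choosing lambda.min, as done here, favors predictive fit, whereas lambda.1se would give a sparser model at a small cost in deviance.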

Development and validation of the MVI-predicting nomogram.
DCA showed that when the threshold probability was > 5%, using the nomogram to diagnose MVI was more beneficial, and its net benefit was considerably higher than that of the individual tumor diameter, AFP and TB models within a threshold probability range of 0.3-0.8 (Fig. 4).
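For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability p_t is commonly defined as TP/n − (FP/n) × p_t/(1 − p_t), and is compared with the "treat all" strategy. The following Python sketch illustrates that definition; the function names and the toy predictions are assumptions for illustration and do not reproduce the "rmda" output shown in Fig. 4.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical outcomes (1 = MVI present) and predicted risks; NOT study data.
toy_y = [1, 1, 1, 0, 1, 0, 0, 0]
toy_risk = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
for pt in (0.3, 0.5, 0.8):
    print(pt, net_benefit(toy_y, toy_risk, pt), net_benefit_treat_all(toy_y, pt))
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all curve and zero (the treat-none strategy).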

Discussion
According to statistical analysis of newly diagnosed HCC patients, patients with histopathologically identifiable MVI (48.3%) had a worse prognosis after hepatic resection, and MVI was an independent risk factor for RFS. Incorporating three variables, tumor diameter, AFP and TB, screened through multivariate logistic regression, we built and validated a new preoperative prediction nomogram for MVI in HCC patients. Individuals with higher total points were at greater risk of MVI. For example, if an HCC patient's tumor diameter was 5 cm, AFP > 400 ng/mL, and TB > 23 μmol/L, the total points were 116.5 and the corresponding probability of MVI was about 80%; thus, the predicted probability of MVI in such a patient can be regarded as very high. The resultant nomogram could preoperatively distinguish between patients with and without MVI, with good consistency between the predicted probability and the actual frequency of MVI. Anatomical resection or partial hepatic resection with a wide tumor margin has been recommended to eradicate MVI 19. Hirokawa et al. found that the disease-free survival rate associated with a surgical margin ≥ 10 mm was significantly better than that associated with a surgical margin < 10 mm in MVI-positive patients 20. Our nomogram provides an intuitive and easy-to-understand clinical tool to determine the risk of MVI for HCC patients.
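A nomogram is a graphical form of the underlying logistic model, in which each predictor shifts the log-odds by the logarithm of its odds ratio. The Python sketch below combines the odds ratios reported in the abstract for the example patient. Two points are assumptions rather than reported facts: the odds ratio for tumor diameter is taken to apply per 1 cm, and the model intercept is not given in this section, so `intercept` is a hypothetical argument and the sketch does not attempt to reproduce the nomogram's point scale or the 80% reading.

```python
import math

# Odds ratios as reported in the abstract for the three retained predictors.
OR_DIAMETER_PER_CM = 1.271  # assumption: per 1 cm increase in tumor diameter
OR_AFP = {"<=20": 1.0, "20-400": 2.025, ">400": 3.281}  # ng/mL, reference <= 20
OR_TB_GT_23 = 2.247  # total bilirubin > 23 umol/L versus <= 23

def log_odds_shift(diameter_cm, afp_group, tb_gt_23):
    """Log-odds contribution of the predictors relative to the reference levels."""
    shift = diameter_cm * math.log(OR_DIAMETER_PER_CM)
    shift += math.log(OR_AFP[afp_group])
    if tb_gt_23:
        shift += math.log(OR_TB_GT_23)
    return shift

def predicted_probability(intercept, diameter_cm, afp_group, tb_gt_23):
    """Logistic probability; `intercept` is hypothetical (not reported here)."""
    z = intercept + log_odds_shift(diameter_cm, afp_group, tb_gt_23)
    return 1.0 / (1.0 + math.exp(-z))

# Example patient from the text: 5 cm tumor, AFP > 400 ng/mL, TB > 23 umol/L.
multiplier = math.exp(log_odds_shift(5, ">400", True))
print(f"Odds multiplier versus the reference levels: {multiplier:.1f}")
```

The multiplier expresses how much the odds of MVI increase relative to the reference levels of all three predictors; converting it to an absolute probability requires the fitted intercept.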
Several studies have developed related preoperative prediction models for MVI in HCC. A preoperative prediction model for MVI in HBV-related HCC within the Milan criteria indicated that the preoperative factors associated with MVI were large tumor diameter, multiple nodules, incomplete capsule, AFP level, platelet count, HBV DNA load, and a typical dynamic pattern of tumors on contrast-enhanced MRI 15. Pan et al. 7 reported that tumor size, number of tumors, neutrophils and AFP were risk factors independently associated with MVI. A radiomic analysis of contrast-enhanced CT indicated that a nomogram with large-scale clinicoradiologic and radiomic features performed well in predicting MVI and clinical outcomes 21. Deng et al. 22 found that, when incorporating the independent risk factors for MVI, including tumor size, AFP and neutrophil-to-lymphocyte ratio (NLR), the resulting nomogram achieved a concordance index of 0.71. Although several studies have developed and validated preoperative prediction models for MVI in HCC patients, the clinical features of the recruited patients and the inclusion criteria in these studies were heterogeneous. Consequently, further research is needed to harmonize and improve MVI prediction accuracy. In our study, tumor diameter and AFP were independent predictors of MVI, consistent with previous studies. A study of a multicenter international database found that the incidence of MVI increased with tumor size in HCC resection patients (≤ 3 cm: 25%; 3.1-5 cm: 40%; 5.1-6.5 cm: 55%; > 6.5 cm: 63%) 23. Vessels that encapsulate tumor clusters (VETC), previously linked to HCC metastatic dissemination, were associated with high AFP levels and poor differentiation, and correlated well with MVI 24. In addition, we found, for the first time, that total bilirubin was also a significant predictive factor for MVI when combined with tumor diameter and AFP.
An elevated bilirubin level almost always indicates underlying abnormal liver function. Alteration of total bilirubin is significantly associated with the progression of liver cancer, and most liver function indexes become gradually dysregulated as HCC progresses 25. According to a recent study, a preoperative radiomics-based nomogram provided better predictive performance when AFP and TB were integrated. However, this was only found in patients with solitary hepatocellular carcinoma ≤ 5 cm 26. In very early and early HCC patients, TB was also a significant risk factor in prediction models for overall survival 27 and disease-free survival 28, but the mechanism of interaction between TB and MVI is still unclear. Although preoperative clinical biochemical results and radiological imaging are usually selected in prediction model design, in our study we chose the most frequently reported indicators, tumor diameter, number of
It is important to note that the rates of HBV infection and anti-HBV therapy did not differ between patients with and without MVI in this study. We also found no correlation between MVI and HBV in multivariate analysis. Compared with other studies that limited the predicted population to HCC patients with HBV infection, our nomogram is more broadly applicable to new patients. LASSO regression was used for data dimensionality reduction and element selection in our study. In previous studies, LASSO regression had rarely been used in prediction models 7,15,21,22,31. Overfitting, optimism, and miscalibration can be addressed and accounted for during model development by applying bootstrapping techniques and LASSO regression 32. Although the majority of previous studies randomly split the dataset into two subsets, a development sample and a validation sample, this was not done in our study.
Splitting the dataset into development and validation samples is statistically inefficient, as not all available data are used to develop the model 33. Internal validation is a necessary part of prediction model development 34. For external validation, however, substantial sample sizes are needed for sufficient power to detect clinically important changes in performance compared with the internally validated estimate 35. Owing to the sample size limitation, we were constrained to internal validation of the nomogram. We also opted to conduct a DCA to determine the clinical utility of our nomogram. To mitigate the limitations of our study, future external validation is necessary, since this was a single-center retrospective cohort. Secondly, our nomogram was developed based only on the clinical database; other prognosis-related factors and biomarkers need to be identified and incorporated to further improve its accuracy. Last but not least, the use of the nomogram to predict the risk of a patient harboring MVI is only a new methodology, because MVI status is not the only factor in deciding on therapeutic procedures for HCC patients, and the correlation between MVI and subsequent recurrence is far from definite.
The AFP cutoff value of 400 ng/mL was set by reviewing previous HCC studies 15,23,26,36. Patients were defined as hypertensive on the basis of the 'gold standard', with at least three consecutive measurements of systolic blood pressure (SBP) > 140 mm Hg and/or diastolic blood pressure (DBP) > 90 mm Hg 22. Controls had SBP < 120 mm Hg and DBP < 80 mm Hg, and no history of hypertension in first-degree relatives.
Type 2 diabetes mellitus (T2DM) was diagnosed according to the World Health Organization criteria: (1) fasting glucose level > 7 mmol/l; or (2) a 2-h oral glucose tolerance test showing a glucose level of ≥ 11.1 mmol/l; or (3) hemoglobin A1c ≥ 6.5%; or (4) a clinical diagnosis of the disease. Anatomic or nonanatomic resection was performed after the clinical evaluation, and all the obtained surgical specimens were histologically assessed by different pathologists to determine the presence of MVI as well as the Edmondson-Steiner grade. Pathological examination results included tumor diameter, number of tumors, cirrhosis and MVI. Radiological imaging (CECT and CEMRI) results included tumor diameter, number of tumors and cirrhosis. Baseline data were collected from our hospital clinical database. Patients were consistently followed up after HCC resection at intervals of 3 months. Follow-up was aimed at determining overall survival (OS) and recurrence-free survival (RFS). OS was measured from the date of HCC resection to the date of the patient's death or the date of the last follow-up visit. RFS was calculated from the date of HCC resection to the date when tumor recurrence was diagnosed. The preoperative diagnosis and the diagnosis of tumor recurrence were based on the criteria of the guidelines for diagnosis and treatment of primary liver cancer in China 37.

Methods
Statistical analysis. Numerical variables were compared using the unpaired Student's t-test for parametric data. Categorical variables were compared using Pearson's chi-square test or Fisher's exact test. Survival curves were calculated using the Kaplan-Meier method and compared using the log-rank test. A multivariate Cox proportional hazards regression model was used to evaluate the independent prognostic factors of overall survival and tumor recurrence. LASSO regression was used for data dimensionality reduction and element selection. Subsequently, stepwise multivariate logistic regression analysis was performed to identify the independent risk factors. Then, a nomogram was formulated to predict MVI based on the results of LASSO regression and multivariate logistic regression analysis. Nomogram performance was assessed via internal validation and calibration curve statistics (the concordance index was calculated with 1000 bootstrap resamples to measure discrimination). Decision curve analysis (DCA) was conducted to determine the clinical usefulness of the nomogram by quantifying the net benefit across increasing threshold probabilities. Student's t-test, Pearson's chi-square test or Fisher's exact test, survival analysis, and logistic regression analysis were performed using SPSS 25.0 (IBM Corporation, 2020, USA). LASSO regression, the nomogram, survival figures and decision curve analysis were performed or plotted using R version 3.6.2, and all figures were plotted in R (R: A Language and Environment for Statistical Computing, R Core Team, R Foundation for Statistical Computing, Vienna, Austria, 2019, http://www.r-project.org/), with the packages "rms", "glmnet", "rmda", "survival", "survminer" and "pROC". P < 0.05 was considered statistically significant.
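For a binary outcome such as MVI, the concordance index equals the probability that a randomly chosen MVI-positive patient receives a higher predicted risk than a randomly chosen MVI-negative patient. The Python sketch below computes it by pair counting and attaches a simple percentile bootstrap interval. This is an illustration under stated assumptions: the function names and toy data are hypothetical, and the percentile bootstrap is a simpler stand-in that may differ from the bootstrap procedure implemented in the R packages used in the study.

```python
import random

def c_index(y_true, risk):
    """Fraction of case/control pairs ranked correctly (ties count one half)."""
    cases = [r for y, r in zip(y_true, risk) if y == 1]
    controls = [r for y, r in zip(y_true, risk) if y == 0]
    concordant = 0.0
    for rc in cases:
        for rn in controls:
            if rc > rn:
                concordant += 1.0
            elif rc == rn:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

def bootstrap_ci(y_true, risk, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the concordance index."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both outcome classes
            stats.append(c_index(ys, [risk[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Hypothetical outcomes (1 = MVI present) and predicted risks; NOT study data.
toy_y = [1, 1, 1, 0, 1, 0, 0, 0]
toy_risk = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
print("C-index:", c_index(toy_y, toy_risk))
print("95% CI:", bootstrap_ci(toy_y, toy_risk))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients with and without MVI.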