Introduction

Hepatocellular carcinoma (HCC) is the most common primary liver neoplasm and the second leading cause of cancer-related mortality worldwide1. HCC is now generally treated according to the modified Barcelona Clinic Liver Cancer (BCLC) staging system and the treatment strategy recommended by the European Association for the Study of the Liver (EASL) guidelines. The BCLC staging system comprises five stages, of which the intermediate stage occupies the middle position. It is defined by a performance status of 0, preserved liver function and multinodular tumors1. Transcatheter arterial chemoembolization (TACE) is recommended as the first-line treatment for this stage, and median survival was slightly over 20 months in a randomized controlled trial2. Later cohort studies reported a median survival of around 40 months in well-selected candidates treated with good technique3,4.

In 2008, sorafenib became the first tyrosine kinase inhibitor (TKI) approved for use in patients with unresectable HCC. Patients who received sorafenib survived about 3 months longer than those who did not receive the drug5. A trial conducted in Japan compared TACE plus sorafenib with TACE alone and reported that the combination significantly improved progression free survival (PFS) in patients with intermediate stage HCC6. After 2017, several molecular-targeted agents (MTAs) and immunotherapy became available in Japan7,8,9,10,11.

Patients with intermediate stage HCC are heterogeneous with regard to tumor burden and liver function status12. Some subpopulations in this stage experience TACE failure or refractoriness. Therefore, sequential therapy with several MTAs or combination therapy with TACE and MTAs has been proposed for subpopulations of intermediate stage HCC13,14. Individual MTA therapy has been reported to show efficacy, but a complete response is not easy to achieve. Even if a partial response is achieved, tumor progression and/or new lesions may occur because of resistance to the previous therapy. Thus, sequential MTA treatment or MTA-TACE therapy has been administered and reported to prolong survival6,15,16,17,18. Nowadays, multi-MTA treatment has become the usual HCC therapy, whereas sorafenib was the predominant therapy in TACE-refractory patients before 2017. However, some problems remain. One is that it is difficult to evaluate the appropriateness of sequential systemic therapy for each patient, because high-quality clinical studies are hard to organize for so many sequential and/or combined treatments.

If we can predict the survival of an HCC patient under the previous therapy (especially that used prior to 2017, which included only TACE or sorafenib), then comparing this prediction with the outcome of a patient treated with the current sequential MTA and/or MTA-TACE therapy will enable us to evaluate the effectiveness of the current therapy. So far, several factors related to the prediction of HCC patients’ survival have been reported, such as gender, age, tumor morphology, the grade of liver function and the value of tumor markers19,20,21,22. The prediction of HCC patients’ survival is important for therapeutic management. Parametric models are generally used in survival studies for accurate predictions. Some parametric models take an accelerated failure time approach, which directly targets the prediction of patients’ survival time23. Therefore, parametric models could provide a better estimate of the duration of survival.

In this study, we investigated the predictors of survival in patients with intermediate stage HCC before 2017 and developed a mathematical model to estimate survival using a parametric distribution.

Methods

Study design and participants

In this retrospective cohort study, 753 HCC patients were included from 2002 to 2017 at eight liver centers in Japan (Akita University Hospital, Iwate Medical University Hospital, Tohoku University Hospital, Hirosaki University Hospital, Aomori Prefectural Central Hospital, Yamagata University Hospital, Fukushima Medical University Hospital and Ehime Prefectural Central Hospital). HCC stage was evaluated according to the BCLC classification, and all cases diagnosed as intermediate stage were enrolled in this cohort. The clinical information of the patients was extracted from their medical records, and their death status was followed up by phone call or by medical information sheets from their related hospitals. The survival time was defined as the time between the date of diagnosis of intermediate stage HCC and death. Death was considered the failure event. All therapies were allowed, but sorafenib was the only MTA that could be used. No patients underwent liver transplantation.

This study was approved by the institutional review board of Tohoku University Hospital (2021-1-377). The study was conducted in accordance with the principles of the Declaration of Helsinki (Fortaleza revision, 2013).

Data collection

The HCC nodules were characterized by contrast-enhanced computed tomography (CT) or magnetic resonance imaging (MRI), including the number of tumor nodules, the diameter of the largest nodule and the presence of vascular invasion. The following clinical parameters and biochemistry data were collected: age, gender, etiology, Eastern Cooperative Oncology Group (ECOG) performance status, total bilirubin, aspartate aminotransferase (AST), alanine aminotransferase (ALT), albumin, platelets, prothrombin time, α-fetoprotein (AFP), des-γ-carboxyprothrombin (DCP), tumor size and number, Child–Pugh score, albumin-bilirubin (ALBI) score and modified ALBI (mALBI) grade, Kinki criteria, tumor node metastasis stage of the Liver Cancer Study Group of Japan (TNM-LCSGJ) and treatment-naïve or recurrence status. The ALBI score was calculated using the formula: linear predictor = (log10 (total bilirubin × 17.1) × 0.66) + (albumin × 10 × (−0.085)), and the cut points of the mALBI grade were as follows: ≤ −2.60 (grade 1), > −2.60 to < −2.27 (grade 2a), ≥ −2.27 to ≤ −1.39 (grade 2b) and > −1.39 (grade 3)24,25. The Kinki criteria subclassify the BCLC intermediate stage into three substages: Child–Pugh scores of 5–7 points and ‘in’ the ‘up-to-seven’ criteria (B1), Child–Pugh scores of 5–7 points and ‘out’ of the ‘up-to-seven’ criteria (B2), and Child–Pugh scores of 8–9 points and either ‘in’ or ‘out’ of the ‘up-to-seven’ criteria (B3)26,27. Continuous variables are presented as the mean ± standard deviation or median (interquartile range) and categorical variables as numbers. Survival was calculated as the time from the date of the initial diagnosis of BCLC intermediate stage to death from HCC.
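The ALBI score, mALBI grade and Kinki substage defined above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study. The function names are hypothetical, total bilirubin is assumed to be in mg/dL and albumin in g/dL (as the conversion factors 17.1 and 10 in the formula suggest), and the 'up-to-seven' rule is assumed to take its usual definition (largest tumor diameter in cm plus number of tumors not exceeding 7), which the text does not spell out.

```python
import math


def albi_score(total_bilirubin_mg_dl: float, albumin_g_dl: float) -> float:
    """ALBI linear predictor, following the formula given in the text.

    Assumed units: bilirubin in mg/dL (x 17.1 gives umol/L) and
    albumin in g/dL (x 10 gives g/L).
    """
    return (math.log10(total_bilirubin_mg_dl * 17.1) * 0.66
            + albumin_g_dl * 10 * -0.085)


def malbi_grade(score: float) -> str:
    """Map an ALBI score to the mALBI grade using the stated cut points."""
    if score <= -2.60:
        return "1"
    if score < -2.27:
        return "2a"
    if score <= -1.39:
        return "2b"
    return "3"


def kinki_substage(child_pugh: int, largest_cm: float, n_nodules: int) -> str:
    """Kinki substage (B1/B2/B3) for a BCLC intermediate stage patient.

    Assumption: 'in' the up-to-seven criteria means
    largest diameter (cm) + number of nodules <= 7.
    """
    if not 5 <= child_pugh <= 9:
        raise ValueError("Kinki criteria cover Child-Pugh scores of 5-9 only")
    if child_pugh >= 8:
        return "B3"  # B3 regardless of the up-to-seven status
    within_up_to_seven = largest_cm + n_nodules <= 7
    return "B1" if within_up_to_seven else "B2"
```

For example, `malbi_grade(albi_score(1.0, 4.0))` evaluates a hypothetical patient with a total bilirubin of 1.0 mg/dL and an albumin of 4.0 g/dL, whose score falls between the grade 1 and grade 2b cut points.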

Statistical analysis and development of parametric models

Patient survival probability was analyzed using the Kaplan–Meier method. Survival-related factors were extracted using the Cox proportional-hazard regression model. Predictors for the survival model were selected based on multivariate significance (p < 0.05) and previously reported clinical relevance.
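As an illustration of the Kaplan–Meier (product-limit) method mentioned above, the sketch below estimates a survival curve and its median from right-censored data using only the Python standard library. The function names and the follow-up data are hypothetical and made up for illustration. The study itself used JMP, so this is not the authors' implementation.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct death time (product-limit estimator).

    events[i] is 1 if death was observed at times[i] and 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which S(t) falls to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up follow-up times (months) and death indicators, for illustration only.
toy_times = [5.0, 8.0, 12.0, 12.0, 20.0, 24.0, 30.0, 41.0]
toy_events = [1, 0, 1, 1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

Censored patients stay in the risk set until their last follow-up, so they contribute information without being counted as deaths.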

Three parametric models were applied: the exponential, Weibull and log-normal models. The fit of the models was evaluated using probability plots, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), with a smaller value indicating a better fit. The AIC and BIC are measures based on in-sample fit that estimate how well a model is likely to predict future values28,29. A plot of the log of the negative log of the estimated survivor function against log time provides a visual check of the appropriateness of the parametric model for the survival data30.
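The AIC and BIC comparison can be sketched as follows for two of the three candidate models. This is an illustrative example under stated assumptions, not the JMP procedure used in the study: the function names and data are hypothetical, the Weibull shape is found by a crude grid search, the exponential model is treated as a Weibull with shape 1, the log-normal model is omitted for brevity, and n in the BIC is taken to be the number of patients.

```python
import math
from typing import List, Tuple


def weibull_loglik(times: List[float], events: List[int],
                   shape: float, scale: float) -> float:
    """Right-censored log-likelihood with S(t) = exp(-(t/scale)**shape)."""
    total = 0.0
    for t, e in zip(times, events):
        z = (t / scale) ** shape
        if e == 1:  # observed death: log density
            total += math.log(shape / scale) + (shape - 1.0) * math.log(t / scale) - z
        else:       # censored: log survivor function
            total -= z
    return total


def scale_mle(times: List[float], events: List[int], shape: float) -> float:
    """Closed-form maximum likelihood scale for a fixed shape."""
    deaths = sum(events)
    return (sum(t ** shape for t in times) / deaths) ** (1.0 / shape)


def fit_weibull(times: List[float], events: List[int]) -> Tuple[float, float, float]:
    """Grid search over the shape (0.20 to 5.00); returns (shape, scale, loglik)."""
    scale = scale_mle(times, events, 1.0)
    best = (1.0, scale, weibull_loglik(times, events, 1.0, scale))
    for i in range(20, 501):
        shape = i / 100.0
        scale = scale_mle(times, events, shape)
        ll = weibull_loglik(times, events, shape, scale)
        if ll > best[2]:
            best = (shape, scale, ll)
    return best


def aic(loglik: float, k: int) -> float:
    return 2 * k - 2 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    return k * math.log(n) - 2 * loglik


# Made-up follow-up times (months) and death indicators, for illustration only.
toy_times = [5.0, 8.0, 12.0, 12.0, 20.0, 24.0, 30.0, 41.0]
toy_events = [1, 0, 1, 1, 0, 1, 1, 0]
n = len(toy_times)

exp_scale = scale_mle(toy_times, toy_events, 1.0)
exp_ll = weibull_loglik(toy_times, toy_events, 1.0, exp_scale)
wb_shape, wb_scale, wb_ll = fit_weibull(toy_times, toy_events)

print("exponential: AIC", aic(exp_ll, 1), "BIC", bic(exp_ll, 1, n))
print("Weibull:     AIC", aic(wb_ll, 2), "BIC", bic(wb_ll, 2, n))
```

Because the Weibull model has one more parameter than the exponential model, it is preferred only when its gain in log-likelihood outweighs the penalty term.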

Weibull distribution model

The survival model was based on the Weibull distribution. The cumulative failure probability was defined as F(x, α, β) = 1 − exp[−(x/β)^α], where x is the survival time, α the shape parameter and β the scale parameter. In this case, the median survival time can be calculated as follows:

$$ \begin{aligned} & F\left( x,\alpha,\beta \right) = 1 - \exp\left[ - \left( x/\beta \right)^{\alpha} \right] = 0.5 \\ & x = \beta \times \left( - \log 0.5 \right)^{1/\alpha} \\ \end{aligned} $$
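
The intermediate steps between the two lines above can be written out explicitly. The derivation below is an illustrative addition that uses the same symbols. Here log denotes the natural logarithm, as required for inverting the exponential, so that −log 0.5 = log 2 ≈ 0.693.

$$ \begin{aligned} & \exp\left[ - \left( x/\beta \right)^{\alpha} \right] = 0.5 \\ & \left( x/\beta \right)^{\alpha} = - \log 0.5 \\ & x/\beta = \left( - \log 0.5 \right)^{1/\alpha} \\ & x = \beta \times \left( - \log 0.5 \right)^{1/\alpha} \\ \end{aligned} $$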

β is composed of the statistical weights (coefficients) and the risk factors:

$$ {\beta } = {\text{ exp }}\left( {{\text{intercept }} + {\text{ coefficient1}}*\left( {{\text{factor1}}} \right) \, + {\text{ coefficient2}}*\left( {{\text{factor2}}} \right) + \cdots } \right) $$

The risk factors were extracted from the multivariate analysis of the Cox proportional-hazard regression model. The intercept and the coefficient of each risk factor were determined using JMP Pro 15.0.0 (SAS Institute, Cary, NC, USA).

Ethical approval

This study followed the principles of the Declaration of Helsinki (Fortaleza revision, 2013). Study approval statement: This study was reviewed and approved by the institutional review board of Tohoku University Hospital (approval number: 2021-1-377).

Consent to participate

Because this was a retrospective observational study, the institutional review board of Tohoku University Hospital waived the need for written informed consent. The identifying data of the enrolled patients were delinked, and the authors could not access the individual identifying data.

Results

The characteristics of the patients in this cohort study

The characteristics of the patients in the cohort are shown in Table 1. A total of 753 patients were enrolled. The median age was 70 years and the majority were male. Hepatitis C virus was the predominant etiology. All patients had a normal performance status. There were 300 (39.8%) patients with a Child–Pugh score of 5, 242 (32.1%) with a score of 6, 118 (15.7%) with a score of 7 and 93 (12.4%) with a score of 8 or higher, corresponding to 542 (72.0%) patients with Child–Pugh class A and 211 (28.0%) with class B. The median ALBI score was −2.22. There were 184 (24.4%) patients with mALBI grade 1, 150 (19.9%) with grade 2a, 355 (47.1%) with grade 2b and 64 (8.5%) with grade 3. The median size of the largest nodule was 3.0 cm and the median number of nodules was five. According to the Kinki criteria, there were 279 (37.1%) patients with stage B1, 381 (50.6%) with B2 and 93 (12.3%) with B3. The median AFP concentration was 39.6 ng/dL and the median DCP was 111 mAU/mL. At the diagnosis of intermediate stage HCC, 279 (37.1%) patients were treatment naïve, while 474 (62.9%) had recurrent disease. From the diagnosis of BCLC intermediate stage, the median overall survival (OS) was 24.05 months by the non-parametric estimator of the survival function (Fig. 1).

Table 1 Description of the patients (n = 753).
Figure 1 Kaplan–Meier curves for overall survival in this cohort study.

Selection of parametric models

We compared the fit of the three parametric models using the AIC and BIC, and the Weibull distribution showed the smallest values (Table 2). Probability plots were prepared for each of the three distributions to provide a visual check of their appropriateness. The Weibull plot appeared approximately linear (Fig. 2).
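
Linearizing the Weibull survivor function shows why a straight line on this plot supports the Weibull model. The short derivation below is an illustrative addition that uses the symbols α and β from the Methods, writes S(x) = 1 − F(x) for the survivor function and takes log as the natural logarithm.

$$ \begin{aligned} & S\left( x \right) = \exp\left[ - \left( x/\beta \right)^{\alpha} \right] \\ & \log\left[ - \log S\left( x \right) \right] = \alpha \log x - \alpha \log \beta \\ \end{aligned} $$

If the Weibull model holds, plotting log[−log S(x)] against log x should therefore give a line with slope α and intercept −α log β. The exponential model is the special case with slope 1.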

Table 2 The score of Akaike information criterion and Bayesian information criterion.
Figure 2 Estimation of the parametric model. Probability plots for each distribution. The straight line shows log–log survival against the log of failure time for each distribution. The dot-to-dot line shows the survival distribution.

Predictors of survival and survival prediction model

A univariate Cox proportional-hazard regression analysis was performed in this cohort (Table 3). Age, naïve or recurrence status, the size of the largest tumor nodule, the number of nodules, total bilirubin, albumin, AFP and DCP were significantly associated with survival. Based on these variables, a multivariate Cox regression analysis was conducted. Age, naïve or recurrence status, the size of the largest tumor nodule, the number of nodules, total bilirubin, albumin, AFP and DCP were selected as independent predictors of survival (Table 3).

Table 3 Univariate and multivariate analyses of derivation cohort by the Cox proportional-hazard regression model.

Development of a new estimated survival model

We developed a new mathematical survival model using the Weibull distribution with seven predictors selected by multivariate analysis. In this cohort, DCP was also an independent predictor. However, DCP could not be measured in patients receiving warfarin, and we therefore decided not to include it as a risk factor. Two issues remained in building this model. The first was how to count nodules accurately in patients with numerous nodules or diffuse-type tumors. Beyond ten nodules, we usually could not count them accurately. In this regard, one of the criteria for liver transplantation indicates that patients with fewer than 11 nodules can be candidates for transplantation. Therefore, we set the number of nodules to eleven when there were more than 10 HCC nodules or the tumor was of the diffuse type. The second was how to handle AFP, whose range was too wide to use directly as a parameter. Outliers are poorly suited as parameters for a mathematical model. Therefore, we defined outliers as values outside the 95% confidence interval (CI). Values greater than 5078 ng/dL fell outside the 95% CI, so we set AFP to 5078 when it exceeded 5078.

In the Weibull distribution model, the predicted survival time was derived as exp(intercept + coefficient1 × (factor1) + coefficient2 × (factor2) + …) × (−log 0.5)^(1/α). The seven factors related to overall survival were selected by Cox proportional-hazard regression analysis. The parameter estimates were calculated with JMP and are shown in Table 4. The 50% survival duration (months) predicted by this new Weibull model is given by EXP(4.02580 + (−0.0086253) × Age + (−0.34667) × (0 for naïve/1 for recurrence) + (−0.034962) × (number of nodules) + (−0.079447) × (the size of the largest nodule, cm) + (−0.21696) × (T-bil, mg/dl) + 0.27912 × (Alb, g/dl) + (−0.00014741) × (AFP, ng/dl)) × (−log0.5)^0.67250 (Supplementary Information 1).
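The prediction equation above can be transcribed into a short Python function. This is a sketch for illustration, not the authors' supplementary calculator. The coefficients, the exponent 0.67250, the AFP cap of 5078 and the nodule cap of eleven are taken from the text, while the function name, argument names and the example patient are hypothetical. The logarithm is taken as natural, consistent with the Weibull median formula.

```python
import math

# Coefficients as printed in the prediction equation in the text.
INTERCEPT = 4.02580
COEF = {
    "age": -0.0086253,          # years
    "recurrence": -0.34667,     # 0 for naive, 1 for recurrence
    "n_nodules": -0.034962,
    "largest_cm": -0.079447,    # size of the largest nodule, cm
    "t_bil": -0.21696,          # total bilirubin, mg/dl
    "albumin": 0.27912,         # g/dl
    "afp": -0.00014741,         # AFP, in the units used in the text
}
INV_ALPHA = 0.67250  # exponent 1/alpha in the printed equation
AFP_CAP = 5078.0     # AFP values above this are set to the cap
NODULE_CAP = 11      # used for more than 10 nodules or diffuse type


def predicted_median_months(age: float, recurrence: bool, n_nodules: int,
                            largest_cm: float, t_bil: float, albumin: float,
                            afp: float, diffuse: bool = False) -> float:
    """Predicted 50% survival duration (months) from the Weibull model."""
    nodules = NODULE_CAP if (diffuse or n_nodules > 10) else n_nodules
    afp_used = min(afp, AFP_CAP)
    linear_predictor = (
        INTERCEPT
        + COEF["age"] * age
        + COEF["recurrence"] * (1 if recurrence else 0)
        + COEF["n_nodules"] * nodules
        + COEF["largest_cm"] * largest_cm
        + COEF["t_bil"] * t_bil
        + COEF["albumin"] * albumin
        + COEF["afp"] * afp_used
    )
    return math.exp(linear_predictor) * (-math.log(0.5)) ** INV_ALPHA


# Hypothetical patient: age, nodule count, size and AFP are near the cohort
# medians; the bilirubin and albumin values are made up for illustration.
example = predicted_median_months(age=70, recurrence=True, n_nodules=5,
                                  largest_cm=3.0, t_bil=0.8, albumin=3.7,
                                  afp=39.6)
print(round(example, 1))
```

For this hypothetical patient the function returns a value of about 26 months, which is of the same order as the cohort's Kaplan–Meier median.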

Table 4 The coefficient of each factor in Weibull survival model.

Discussion

In this study, we identified the predictors of survival in intermediate stage HCC patients and developed a mathematical model to estimate survival time from these predictors using a parametric distribution. The Weibull distribution had the best fit among all investigated parametric models in our data. The EASL guidelines recommend treating intermediate stage HCC patients with TACE and give 2.5 years as the estimated mean survival time. After 2017, several MTAs and one immune checkpoint inhibitor combined with a single MTA were approved for HCC therapy in Japan. The effectiveness of each MTA was confirmed on the basis of overall survival and/or progression free survival in several randomized, double-blind clinical trials. However, even treatment with atezolizumab and bevacizumab achieves complete remission in only about 5% of patients with unresectable HCC. Therefore, in the real world, multiple treatments such as combination and/or sequential therapy with MTAs, TACE and radiation are usually administered to HCC patients in the intermediate stage. In practice, it is difficult to evaluate the effectiveness of the whole sequence of systemic therapy. Given these circumstances, we developed a mathematical model to estimate the survival duration in intermediate stage HCC patients. Using this new model, we can evaluate the efficacy of recent sequential therapy and determine whether it is appropriate.

Before 2018, TACE was the only treatment recommended for intermediate stage HCC patients. Several studies have reported predictors or prediction models. Tumor burden, liver function reserve and AFP are well-known predictors of survival in intermediate stage HCC patients undergoing TACE19,26,31,32,33. An intermediate stage subclassification system combined the Child–Pugh score with the up-to-seven criteria31. Recent studies suggest that the ALBI grade might be a better surrogate of liver function reserve in HCC patients treated with TACE34. Moreover, some prediction models that rely on post-TACE assessment have been reported, but they are not useful for the selection of treatment33,35. To develop a survival prediction model, we investigated the predictors of overall survival. By multivariate analysis, we identified age, naïve/recurrence status, number of nodules, size of the largest tumor, total bilirubin, albumin and AFP levels as independent predictors of overall survival in intermediate stage HCC patients. Our survival model was developed using these predictors. The model consists entirely of routinely available parameters and is simple to calculate with common calculation software.

We established the predictors of survival in HCC patients through a parametric survival modeling approach. Parametric models are well established for analyzing survival data. In the exponential, Weibull and lognormal models, survival time is assumed to follow a known distribution23. Parametric models that identified survival predictors in several diseases have been described, and a Weibull or lognormal distribution is typically used36,37,38,39. In our data, the Weibull model had the best fit among the three investigated parametric models according to the AIC. Interestingly, using our new parametric survival model, we achieved more flexibility for predicting the survival duration in patients with intermediate stage HCC. This model could be recommended for planning, health policy-making and the evaluation of treatments, and it may potentially contribute to improving the survival of patients with HCC.