The Prognostic Value of Alpha-Fetoprotein Response for Advanced-Stage Hepatocellular Carcinoma Treated with Sorafenib Combined with Transarterial Chemoembolization

This retrospective cohort study aimed to evaluate the prognostic value of the alpha-fetoprotein (AFP) response in patients with advanced-stage hepatocellular carcinoma (HCC) treated with sorafenib combined with transarterial chemoembolization (TACE). From May 2008 to July 2012, 118 HCC patients with baseline AFP levels >20 ng/ml who were treated with the combination therapy were enrolled. A receiver operating characteristic curve was used to generate a cutoff point for AFP changes for predicting survival. The AFP response was defined as an AFP decrease rate [ΔAFP(%)] greater than the cutoff point. The ΔAFP(%) was defined as the percentage change between the baseline and the nadir values within 2 months after therapy. The median follow-up time was 8.8 months (range 1.2–66.9). A level of 46% was chosen as the threshold value for ΔAFP(%) (sensitivity = 53.7%, specificity = 83.3%). The median overall survival was significantly longer in the AFP response group than in the AFP non-response group (12.8 vs. 6.4 months, P = 0.001). Multivariate analysis showed that ECOG ≥ 1 (HR = 1.95; 95% CI 1.24–3.1, P = 0.004) and AFP non-response (HR = 1.71; 95% CI 1.15–2.55, P = 0.009) were associated with an increased risk of death. In conclusion, the AFP response could predict the survival of patients with advanced-stage HCC at an early time point after combination therapy.

Currently, radiological imaging evaluation is widely used for the prognostic assessment of HCC. The Response Evaluation Criteria in Solid Tumors (RECIST) focus on whole-tumor shrinkage 8 . The modified RECIST (mRECIST) criteria measure the change in the tumor necrotic area. However, radiological imaging evaluation has several limitations 9 . First, it is challenging to measure tumor size when the tumor grows in a diffuse pattern. Second, radiological imaging evaluation is a relatively subjective assessment and lacks inter-observer reproducibility 10 . Third, our previous studies showed that the RECIST and mRECIST criteria fail to predict survival at an early time point 11 . Therefore, alternative methods to estimate treatment efficacy are needed.
Alpha-fetoprotein (AFP) is a glycoprotein that is secreted in approximately 70% of HCC cases 12 . As the most common biomarker of HCC, AFP has shown its value for screening and diagnosis in multiple studies 13 . Recently, several studies have consistently suggested that the AFP response is associated with longer overall survival (OS) in HCC patients after locoregional treatment modalities or systemic chemotherapy 14,15 . However, the prognostic value of the AFP response in patients with advanced-stage HCC who are treated with sorafenib combined with transarterial chemoembolization (TACE) remains unclear.
The aim of this study was to evaluate the prognostic value of the AFP response in patients with advanced HCC who were undergoing treatment with sorafenib combined with TACE and to explore the correlation between the AFP response and a radiological evaluation from an early time point.

Materials and Methods
All HCC patients consecutively admitted to our department between May 2008 and June 2012 who were treated with a combination therapy of sorafenib and TACE were retrospectively considered in our study. The inclusion criteria were as follows: 1) an age ≥ 18 years, 2) an interval between sorafenib and TACE of ≤ 60 days, 3) an Eastern Cooperative Oncology Group (ECOG) performance status score ≤ 2, 4) Child-Pugh class A or B (score ≤ 7), and 5) no other molecular targeted agents. The exclusion criteria were as follows: 1) main portal vein invasion, 2) concurrent malignancy, 3) the absence of a repeat AFP measurement within 2 months after treatment initiation, 4) a baseline AFP < 20 ng/ml, and 5) poor compliance. The diagnosis of HCC was based on the American Association for the Study of Liver Diseases (AASLD) criteria 16 . Histology was needed only in cases of diagnostic uncertainty. OS was measured from the beginning of combination therapy to the date of death or the last follow-up. The requirement to obtain informed consent was waived. The study protocol was approved by the ethics committees of Xijing Hospital. All the methods used in this study were carried out according to the approved guidelines.
Treatment and follow-up. The patients received sorafenib at an initial dose of 400 mg twice daily. The dose was subsequently modified according to the severity of adverse events (AEs), which were assessed according to the National Cancer Institute Common Terminology Criteria for Adverse Events version 4.0. In our clinical practice, patients continue sorafenib treatment if the AEs can be safely controlled. TACE was performed using 10-50 mg doxorubicin mixed with 5-20 mg lipiodol. Gelatin foam was injected until the tumor-feeding vessels were completely obstructed. TACE procedures were repeated according to the radiological response 5 . Combined therapy was defined as an interval between sorafenib and TACE of less than 60 days, regardless of the order of the two treatments. Standard follow-up evaluations, including contrast-enhanced computed tomography (CT) scans and laboratory assessments, were performed at weeks 4 and 8 after the initiation of treatment and every 8 weeks thereafter. Follow-up ended at death or on December 31, 2014.
AFP evaluation. The serum AFP concentration was measured at baseline (before the initiation of combined therapy) and at every follow-up visit using an electrochemiluminescence immunoassay (Elecsys Cobas e601, Roche). The AFP variation rate [ΔAFP(%)] was defined as the percentage change between the baseline value and the nadir value within 1-2 months after combination therapy.
$$\Delta\mathrm{AFP}(\%) = \frac{\mathrm{AFP}_{\mathrm{baseline}} - \mathrm{AFP}_{\mathrm{nadir}}}{\mathrm{AFP}_{\mathrm{baseline}}} \times 100\%$$
The AFP response was defined as a ΔAFP(%) greater than the cutoff point (ΔAFP(%) > cutoff point), whereas AFP non-response was defined as an AFP decrease rate less than the AFP variation cutoff point (ΔAFP(%) < cutoff point). The researcher who extracted the AFP data was blinded to the survival outcome.
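The ΔAFP(%) definition and the responder rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the sign convention (a positive value means a decline from baseline) is an assumption consistent with the description of ΔAFP(%) as a decrease rate, and the default cutoff of 46% is the value reported in this study.

```python
def delta_afp_percent(baseline: float, nadir: float) -> float:
    """Percentage decrease from the baseline AFP to the nadir AFP.

    Assumed convention: a positive result means AFP declined,
    a negative result means AFP rose above baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline AFP must be positive")
    return (baseline - nadir) / baseline * 100.0


def is_afp_responder(baseline: float, nadir: float, cutoff: float = 46.0) -> bool:
    """AFP response: the decrease rate is strictly greater than the cutoff."""
    return delta_afp_percent(baseline, nadir) > cutoff


if __name__ == "__main__":
    # Hypothetical patient: AFP falls from 400 ng/ml to 200 ng/ml within 2 months.
    print(delta_afp_percent(400.0, 200.0))  # 50.0
    print(is_afp_responder(400.0, 200.0))   # True
```

A patient whose AFP falls from 400 to 300 ng/ml has a ΔAFP(%) of 25% and would be classified as a non-responder under the 46% cutoff.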
Radiological evaluation and definitions. Radiological imaging assessments were performed with contrast-enhanced spiral computed tomography (CT) at baseline (before the initiation of combined therapy) and at every follow-up visit after combined therapy. The RECIST and mRECIST criteria were used for radiological evaluation. The treatment responses were blindly assessed by three experienced clinicians (Yan Zhao, JiaJia and Wei Bai). In cases of discrepancies, the images were jointly reviewed by all of the clinicians, and a consensus decision was reached. Patients evaluated as having a complete response (CR) or a partial response (PR) within 2 months after combination therapy were considered to be responders. Patients evaluated as having stable disease (SD) or progressive disease (PD) were considered to be non-responders 17 .
Statistical analyses. Continuous variables were presented as median values with ranges, and categorical variables were presented as frequencies with percentages. A receiver operating characteristic (ROC) curve was used to generate a cutoff point for AFP changes that predicted survival. The cutoff point with the highest sum of sensitivity and specificity on the ROC curve was chosen as the most discriminative value of the AFP response for predicting survival. The area under the curve (c-statistic) may range from 0 to 1, and a c-statistic > 0.7 is generally considered useful 18 . A Mann-Whitney U test was used to compare continuous variables, whereas a Chi-squared test was used to compare categorical variables between the AFP response and non-response groups.
The κ coefficient was used to measure the inter-method concordance of the radiological response and the AFP response. OS time was assessed by Kaplan-Meier methods, and the survival difference between groups was estimated by the log-rank test. Patients lost to follow-up or alive at the end-of-observation date were censored. Univariate and multivariate Cox regression analyses were used to test the prognostic factors of OS. Variables with a P value < 0.1 in the univariate analysis were included in the multivariate analysis. Statistical analyses were performed using SPSS version 16.0 (SPSS, Inc., Chicago, IL, USA). A two-sided P value < 0.05 was considered to be statistically significant.
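To make the survival methodology concrete, the Kaplan-Meier product-limit estimate with right censoring can be sketched in a few lines. This is an illustrative sketch only: the analyses in this study were run in SPSS, the function names are hypothetical, and the follow-up times in the usage example are invented rather than taken from the cohort.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time of each patient (e.g. months)
    events : True if the patient died at that time, False if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


if __name__ == "__main__":
    # Invented data: months of follow-up and death indicators.
    curve = kaplan_meier([2, 4, 4, 6, 8], [True, True, False, True, False])
    print(curve)
    print(median_survival(curve))
```

When most patients in a group are censored, the curve may never fall to 0.5, and `median_survival` returns `None`. This is the situation in which a median survival value cannot be obtained for a small group.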
A comparison between AFP response and non-response groups. The median time from the baseline treatment to AFP follow-up was 1.4 months (range 0.4-2.0). The area under the ROC curve (c-statistic) for predicting survival was 0.716 (Fig. 3). The most discriminative value of the ΔAFP(%) for predicting survival was 46%. This cutoff point had a sensitivity of 53.7% and a specificity of 83.3%.
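The cutoff selection rule described in the methods (the point with the highest sum of sensitivity and specificity) can be sketched as below. This is a hedged illustration: the function name is hypothetical, the data are invented, and the binary outcome label is an assumption, because the text does not state how survival was dichotomized for the ROC analysis.

```python
def best_cutoff(scores, labels):
    """Pick the threshold maximizing sensitivity + specificity.

    scores : marker value per patient (e.g. AFP decrease rate in percent)
    labels : True for the outcome class the marker should detect
             (assumed here to be a favourable survival status)
    A patient is called positive when score > threshold.
    Returns (threshold, sensitivity, specificity).
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best = None
    for c in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s > c)
        true_neg = sum(1 for s, y in zip(scores, labels) if not y and s <= c)
        sens = true_pos / positives
        spec = true_neg / negatives
        # Keep the first threshold that achieves the highest sum.
        if best is None or sens + spec > best[1] + best[2]:
            best = (c, sens, spec)
    return best


if __name__ == "__main__":
    # Invented, perfectly separated example.
    scores = [10, 20, 30, 60, 70, 80]
    labels = [False, False, False, True, True, True]
    print(best_cutoff(scores, labels))  # (30, 1.0, 1.0)
```

With real data the classes overlap, so the selected threshold trades sensitivity against specificity, as with the reported cutoff (53.7% sensitivity, 83.3% specificity).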
The correlation between AFP response and radiological evaluation. Of the 118 patients, 84 (71.2%) were properly evaluated according to both the RECIST and mRECIST criteria. Of the remaining 34 patients, 1 did not survive long enough to undergo a contrast-enhanced CT scan, 3 did not have a complete imaging examination due to clinical deterioration, 10 had non-measurable diffuse tumor lesions in the liver, and 20 did not have completely preserved follow-up image data. The median time for assessing the radiological imaging response was 1.2 months (range, 0.7-2.0 months). The rates of CR, PR, SD and PD were 0, 7 (8.3%), 66 (78.6%) and 11 (13.1%), respectively, according to the RECIST criteria, and 24 (28.6%), 23 (27.4%), 30 (35.7%) and 7 (8.3%), respectively, according to the mRECIST criteria. The response rates (CR and PR) and non-response rates (SD and PD) were 8.3% and 91.7% according to the RECIST criteria and 56.0% and 44.0% according to the mRECIST criteria, respectively. With the RECIST criteria, the median survival of the response group was not obtained because too few patients (n = 7) were classified into this group and 4 of them were censored. There was no significant difference in survival between the response and non-response groups (P = 0.132) (Fig. 4B). With the mRECIST criteria, the survival difference between the response and non-response groups was also not statistically significant [14.8 months (95% CI 10.9-18.7) vs. 10.3 months (95% CI 6.8-13.8), P = 0.075] (Fig. 4C). Multivariate analysis showed that neither the RECIST (HR = 2.2; 95% CI 0.9-5.6, P = 0.094) nor the mRECIST (HR = 2; 95% CI 0.9-2.2, P = 0.160) criteria were independent predictors of overall survival. The outcomes of both the radiological assessment and the AFP response are shown in Table 3.
The classification of patients into response categories differed markedly between the RECIST criteria and the AFP response (κ = 0.077), whereas the majority of patients were classified into the same response categories when assessed using the mRECIST criteria and the AFP response. However, the agreement between the mRECIST criteria and the AFP response was still weak (κ = 0.383).
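The κ values above quantify agreement beyond chance between two binary classifications (responder vs. non-responder). A minimal sketch of Cohen's κ for a 2 × 2 table follows. The function name is hypothetical and the counts in the usage example are invented for illustration; they are not the cross-tabulation from Table 3.

```python
def cohen_kappa(both_resp, only_first, only_second, both_nonresp):
    """Cohen's kappa for two binary classifications of the same patients.

    both_resp    : responders by both methods
    only_first   : responders by the first method only
    only_second  : responders by the second method only
    both_nonresp : non-responders by both methods
    """
    n = both_resp + only_first + only_second + both_nonresp
    if n == 0:
        raise ValueError("table is empty")
    observed = (both_resp + both_nonresp) / n
    # Chance agreement from the marginal totals of each method.
    expected = (
        (both_resp + only_first) * (both_resp + only_second)
        + (only_second + both_nonresp) * (only_first + both_nonresp)
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Invented counts: 70% of 50 patients are classified identically.
    print(cohen_kappa(20, 5, 10, 15))
```

In this invented example, 35 of 50 patients (70%) fall into the same category under both methods, yet κ is only about 0.4 because half of that agreement is expected by chance. This is how a majority of patients can be classified identically while the agreement is still rated as weak.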

[Table column headings: Variables; All patients (n = 118); AFP response (n = 49); AFP non-response (n = 69); P value]

Of the 34 patients without radiological evaluation, 8 and 26 patients were in the AFP response and AFP non-response groups, respectively. Among these patients, the median OS was significantly longer in the AFP response group than in the AFP non-response group (11.3 months vs. 3.9 months, P = 0.002) (Fig. 4D).

Discussion
AFP assessment is a simple and reproducible method for evaluating the efficacy of combination treatment, and our study demonstrates the feasibility of using the dynamic trend of AFP as an early biomarker for predicting survival outcomes after combination therapy in advanced HCC patients. AFP is a well-established tumor marker for screening and diagnosing HCC, and the AFP level appears to be associated with the prognosis of HCC patients 19 . Previous studies demonstrated that an elevated AFP level decreases in HCC patients after hepatic resection and rebounds in cases of HCC recurrence 20 . Recently, the AFP response has been reported to be a significant prognostic factor in HCC patients treated with different locoregional modalities or systemic chemotherapy 14,15,21 . To our knowledge, the current analysis is the first exploration of the potential prognostic value of the AFP response in HCC patients treated with sorafenib combined with TACE. In addition, our study population consisted mainly of advanced-stage HCC patients, which differs from previous reports. The major findings of this study were as follows: 1) the adaptive AFP variation cutoff point to predict prognosis was a 46% reduction, 2) the AFP response (a decline of more than 46% from baseline within 2 months after the initiation of combination therapy) was associated with longer OS in patients with advanced-stage HCC who were treated with sorafenib in combination with TACE, and 3) the AFP response could predict overall survival at an earlier time point than radiological assessment, particularly in circumstances in which radiological evaluation could not be performed.
In previous studies, the AFP response was defined as an AFP level that decreased by more than 20%, 30% or 50% 14,22,23 . However, these definitions mostly originated from personal clinical experience or speculation rather than from statistical analyses. In contrast, we used a ROC curve to generate an adaptive AFP variation cutoff point (an AFP reduction of 46%) for the AFP response. More importantly, with this cutoff point the AFP response group had significantly longer survival than the AFP non-response group, and the AFP response was an independent predictor of overall survival. Thus, the AFP level could be incorporated into the algorithm for assessing the prognosis of HCC patients. Additionally, it should be noted that patient selection in previous studies differed from ours. In previous studies, patients with a baseline AFP < 100 ng/ml or < 200 ng/ml were excluded to differentiate HCC from benign liver diseases 14,21 . Thus, the conclusions of these studies were applicable only to patients with a relatively high baseline AFP level. In contrast, our inclusion criteria were relatively broad: only patients with a baseline AFP < 20 ng/ml were excluded from our study, because not all HCC patients have an elevated AFP level.
Radiological evaluations, such as those based on the RECIST and mRECIST criteria, have been widely used in the prognostic assessment of HCC 17,24 . Radiological response has also been established to correlate with the pathological response 25,26 . However, in the current study, neither the RECIST nor the mRECIST assessment within 2 months after treatment was an independent predictor of overall survival. Additionally, the agreement between radiological assessment and the AFP response was weak regardless of whether the RECIST or mRECIST criteria were used, although a majority of patients were classified into the same response categories when assessed using the mRECIST criteria and the AFP response. These results were consistent with our previous study, which showed that the earliest time to evaluate the response to combination therapy was 3 months 11 . This result could be explained by the fact that the baseline tumor burden in Chinese patients is higher than that reported in western countries. A single TACE session may not be sufficient to achieve a complete tumor response. Moreover, the study by Georgiades et al. showed that initial non-responders after the first TACE session could obtain prolonged survival from further treatment 27 . Therefore, under these circumstances, radiological assessment could not be used as an early predictor of overall survival. Additionally, our study demonstrated that the AFP response could predict the prognosis of these patients in the absence of a radiological evaluation, especially in patients with diffuse malignant tumors that could not be evaluated by radiological criteria. Hypovascular or diffusely infiltrative tumor patterns are often present in real-world clinical settings 22 , and the AFP response has the potential to help assess treatment efficacy in clinical practice when the standard imaging findings are equivocal.
Another potential advantage of AFP assessment is a reduction in the cost burden of repeat radiological scans.
Several limitations of this study should be recognized. First, this was a retrospective study with a relatively small number of patients. A potential bias may exist because not all the patients had follow-up AFP assessments within 2 months after treatment, and consequently only the patients with complete follow-up information were included in the analysis. Further well-designed prospective studies with large sample sizes are needed to confirm the prognostic value of the AFP response. Second, the serum AFP concentration might be influenced by hepatitis, cirrhosis and liver cell necrosis. Not all HCC patients have a significantly elevated AFP level at baseline, and patients with viral hepatitis and other benign liver diseases may also have an elevated AFP level 28,29 . An AFP reduction might be induced not only by treatment for HCC but also by antiviral or anti-fibrosis therapy. Unfortunately, we could not assess this effect because we did not collect follow-up information about these types of therapies. Third, the 46% cutoff point was derived from this study cohort, which consisted mainly of advanced-stage HCC patients. Its application in patients with intermediate-stage HCC requires further validation.
In conclusion, our study suggests that the AFP response could predict overall survival in advanced-stage HCC patients at an early time point after treatment with sorafenib combined with TACE. Further prospective studies are necessary to validate a decline of 46% as an accurate AFP variation cutoff point.