Introduction

Hepatocellular carcinoma (HCC) is the third-leading cause of cancer-related deaths worldwide1; even with treatment with potentially curative therapies, HCC has a 5-year recurrence rate of approximately 35–70%2. Vascular invasion is direct evidence of the biological aggressiveness of a tumor. Microvascular invasion (MVI), which is defined as tumor invasion of a portal vein radicle, the large capsule vessel or a vascular space lined by endothelial cells, is detectable only by microscopy3. Relative to other HCC patients, patients with MVI have a markedly higher risk of early recurrence and poor postsurgical survival3,4, even in cases involving small, solitary HCC tumors5,6. Moreover, patients who do not satisfy the Milan criteria can nonetheless exhibit excellent outcomes if they are confirmed to be negative for MVI7.

The gold standard for diagnosing MVI is a histological examination, which requires extensive sampling. A noninvasive evaluation system capable of preoperatively identifying MVI would be of great clinical value for better determining optimal therapeutic strategies. If MVI is detected, an adjuvant therapy such as sorafenib treatment8 or trans-arterial chemoembolization (TACE)9 could be used, given that these approaches have been reported to improve survival for HCC patients with MVI.

Several promising markers of MVI have been identified, although assessments of such markers involve technically demanding methods, including genetic testing, protein analysis and radiographic examination. A non-smooth tumor margin, which manifests as a lobulated or irregularly shaped tumor with the focal/multifocal outgrowth of nodules protruding into the non-tumor parenchyma10,11, is a two-dimensional imaging feature observable via even the simplest imaging methods, including ultrasonography. Moreover, this feature has been reported to be strongly related to elevated MVI risk and has even been identified as the only significant factor in a multivariable analysis10,12,13,14. To better understand this topic, we performed a review of the diagnostic performance of a non-smooth tumor margin for preoperative MVI assessment.

Results

Study selection

We identified a total of 1101 studies using our search strategy (Fig. 1). Duplicate reports, reviews, case reports, editorials and conference abstracts were excluded (351 studies). After titles and abstracts were reviewed, 685 studies were excluded because they examined non-primary HCC, were in vitro experiments or were unrelated to MVI estimation. The remaining 65 studies were subjected to full-text review, and 59 of these studies were excluded because of overlapping populations (3 studies), a sample size smaller than 30 (1 study), gross analyses of non-smooth tumor margins (7 studies), or a lack of valid extractable data for our meta-analysis (48 studies). Six studies were selected for inclusion, and 5 additional studies were included after reviewing citations of the retrieved articles.
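The selection counts in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the numbers reported in the text; the variable names are illustrative and do not come from the original analysis.

```python
# Counts as reported in the study-selection paragraph.
identified = 1101
excluded_by_type = 351            # duplicates, reviews, case reports, editorials, abstracts
excluded_by_title_abstract = 685  # non-primary HCC, in vitro work, unrelated to MVI
full_text_exclusions = {
    "overlapping population": 3,
    "sample size smaller than 30": 1,
    "gross analysis of tumor margin": 7,
    "no valid extractable data": 48,
}
added_from_citations = 5

full_text_reviewed = identified - excluded_by_type - excluded_by_title_abstract
included_from_search = full_text_reviewed - sum(full_text_exclusions.values())
total_included = included_from_search + added_from_citations

print(full_text_reviewed, included_from_search, total_included)  # 65 6 11
```

The three printed values match the 65 full-text reviews, the 6 studies retained from the search and the 11 studies finally included.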

Figure 1

Flow chart diagram presenting the selection of eligible studies.

Study characteristics

Characteristics of the 11 studies are presented in Table 1, and additional information is presented in Table 1 of the Supplementary materials. The included studies were published between 2009 and 2016. In total, 618 pathologically diagnosed MVI-positive patients and 1030 MVI-negative patients were included in this meta-analysis. Two15,16 of the included studies were performed in European populations, and the remaining nine studies were performed in Asian populations. One study17 assessed HCC patients treated via liver transplantation, 8 studies assessed HCC patients treated via liver resection, and two10,16 studies included HCC patients treated via either modality. Tumor margins were evaluated using computed tomography (CT) in two studies10,12, magnetic resonance imaging (MRI) in 7 studies, and either of these approaches in two studies14,15. In Witjes et al.16, 22% of patients received preoperative radiofrequency ablation (RFA) or trans-arterial chemoembolization (TACE). Patients with macrovascular invasion and patients who received preoperative anti-tumor treatment were excluded from the remaining studies, with the exception of the study by Kim et al.11, which did not clearly address this issue.

Table 1 Characteristics of the 11 included studies.

Heterogeneity test and meta-regression analysis

The Spearman correlation coefficient was 0.06, and the corresponding P value was 0.85, suggesting the absence of threshold effect-derived heterogeneity. However, the inconsistency index I2 was 89.1%, indicating heterogeneity due to non-threshold effects. Meta-regression analysis indicated that the patients’ mean age may have contributed to this high heterogeneity. QUADAS score summaries are presented in Table 1 and Fig. 2. The QUADAS scores varied from 6 to 13, which indicated that another potential contributor to the heterogeneity was the risk of bias in patient selection due to the inappropriate inclusion and exclusion of patients (Ariizumi et al.13 included patients with recurrent HCC and a history of hepatectomy, and Yang et al.18 excluded patients with an interval between MRI and surgery longer than 14 days). This risk of bias may indirectly affect the mean age and the percentage of patients with a positive imaging parameter. We adopted a clinical perspective and conducted sensitivity and subgroup analyses based on the patients’ mean age (with a threshold of 60 years), the imaging method and the percentage of patients with a non-smooth tumor margin.
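A threshold effect is commonly screened for by correlating, across studies, the logit of the true-positive rate with the logit of the false-positive rate. The following is a minimal sketch of that calculation in plain Python, not the Meta-Disc implementation. The 2 × 2 counts are hypothetical, and adding 0.5 to every cell is an assumption made here to avoid division by zero. The P value is omitted.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    # Undefined (division by zero) if either input is constant.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_threshold(tables):
    """tables: list of (TP, FP, FN, TN) tuples, one per study."""
    logit_tpr, logit_fpr = [], []
    for tp, fp, fn, tn in tables:
        # Assumed continuity correction: add 0.5 to every cell.
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        logit_tpr.append(logit(tp / (tp + fn)))
        logit_fpr.append(logit(fp / (fp + tn)))
    # Spearman's rho is Pearson's r computed on the ranks.
    return pearson(ranks(logit_tpr), ranks(logit_fpr))

# Hypothetical studies in which sensitivity and false-positive rate rise together.
hypothetical_tables = [(10, 5, 20, 40), (20, 10, 10, 30), (25, 20, 5, 20)]
rho = spearman_threshold(hypothetical_tables)
print(round(rho, 2))
```

In these made-up tables the two rates increase together, so the coefficient is strongly positive, which is the pattern a threshold effect would produce. A coefficient near zero, such as the 0.06 reported here, shows no such pattern.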

Figure 2

Quality of included studies according to QUADAS-2 guidelines. Risk of bias and applicability concerns for each included study. Proportion of studies with risk of bias; proportion of studies with concerns regarding applicability. The risk of bias arose mainly from patient selection, owing to inappropriate inclusion and exclusion, and was rated as “high”. Another source of bias was unclear reporting of blinding and of the interval between imaging and pathological detection of MVI, which was rated as “unclear”.

Accuracy of a non-smooth tumor margin for predicting MVI

High heterogeneity among studies precluded the pooling of data to obtain a summary result. As indicated in Table 2, subgroup analysis stratified by age with a threshold of 60 years showed that a non-smooth tumor margin has strong diagnostic power for MVI in studies with a mean patient age older than 60. In particular, the diagnostic odds ratio (DOR) of a non-smooth tumor margin reached 21.30 (95% CI [12.52, 36.23]), with an area under the curve (AUC) of 0.90. The summary sensitivity and specificity for the subgroup with a mean age older than 60 were 0.81 [0.72, 0.89] and 0.81 [0.73, 0.87], respectively (Fig. 3 and Fig. 4). A non-smooth tumor margin had stronger diagnostic power when CT was used for imaging (2 studies) than when MRI was used, and strong diagnostic power was also observed in the studies in which more than 53% of patients had a non-smooth tumor margin (2 studies); the DORs for these two subgroups reached 19.42 (95% CI [7.53, 50.16]) and 28.78 (95% CI [13.92, 59.36]), respectively. Table 3 presents a comparison between using the single factor of a non-smooth tumor margin and using multivariable scoring systems for MVI assessment. Six scoring systems were identified19,20,21,22,23,24, and the comparison indicated that for the subgroups of patients older than 60 and patients who underwent CT imaging, a non-smooth tumor margin exhibited excellent diagnostic power. This power was equivalent to or greater than the diagnostic powers calculated for certain multivariable scoring systems.

Table 2 Pooled results of subgroup analysis.
Figure 3

Forest plots of sensitivity and exact 95% confidence interval for subgroups of age ≥ 60 and age < 60. CI: confidence interval. TP: true-positive, FP: false-positive, FN: false-negative, TN: true-negative. The summary sensitivity was 0.81 [0.72, 0.89] for the subgroup of age older than 60, and 0.37 [0.23, 0.54] for that of age younger than 60.

Figure 4

Forest plots of specificity and exact 95% confidence interval for subgroups of age ≥ 60 and age < 60. CI: confidence interval. TP: true-positive, FP: false-positive, FN: false-negative, TN: true-negative. The summary specificity was 0.81 [0.73, 0.87] for the subgroup of age older than 60, and 0.76 [0.70, 0.81] for that of age younger than 60.

Table 3 Comparisons with multivariable-based scoring systems.

Discussion

Preoperative MVI assessment has become clinically possible given the discovery of associations between a pathological diagnosis of MVI and imaging/clinical features, proteomics characteristics and gene signatures2,25,26,27. Pathologically, MVI has been reported to most frequently present at the site of extranodular extension10,12. Moreover, a non-smooth tumor margin can be regarded as indicative of HCC’s aggressive biological tendency to invade the tumor capsule and protrude into the non-tumoral parenchyma. In this meta-analysis, we systematically reviewed studies evaluating the diagnostic accuracy of a non-smooth tumor margin for MVI. High heterogeneity among the included studies precluded the pooling of results to draw clinically applicable findings. However, the subgroups of studies with a mean patient age older than 60 years and studies involving CT imaging exhibited low heterogeneity and indicated that the diagnostic power of a non-smooth tumor margin for MVI was strong. This diagnostic power was found to be equivalent to or even greater than the diagnostic powers of certain multivariable-based scoring systems. This result suggests that a non-smooth tumor margin is a promising indicator for preoperative MVI assessment. The diagnostic accuracy for MVI may be further improved by constructing a model that includes a non-smooth tumor margin in addition to other factors.

Several considerations warrant attention. First, image interpretation is equipment dependent. Our study suggests that CT is superior to MRI for evaluating a non-smooth tumor margin for MVI assessment. This result is reasonable given that CT has better spatial resolution than MRI and that spatial resolution is extremely important for distinguishing between non-smooth and smooth tumor margins. However, only 2 studies from the same research group (with different patient populations) addressed the use of CT to evaluate tumor margins for MVI assessment; therefore, the pooled result is not fully convincing, and more high-quality evidence is needed to reach more definitive conclusions. One possible explanation for why the subgroup of studies with a mean patient age older than 60 years exhibited relatively good pooled results is that CT was used for imaging at a higher rate in these studies than in studies involving younger patients (82% and 0%, respectively). Although CT performed better than MRI for tumor margin evaluation in our analysis, ultrasonography may also play an important role given its advantage of enabling the scanning of additional sections compared with CT and MRI. In our opinion, intercostal or subcostal ultrasonographic scanning can obtain sections along multiple axes of the tumor, which is an advantage when examining irregularly shaped tumors. Moreover, ultrasonography is economical and does not result in radiation exposure. However, to date, no published study has used ultrasonography for MVI-related assessments of tumor margins.

The results for the studies in which more than 53% of patients had a non-smooth tumor margin point to the most important limitation of our analysis. Image interpretation is both equipment dependent and operator dependent. Yamashita et al.6 reported that relative to pathological diagnosis, imaging diagnosis of a non-smooth tumor margin has a sensitivity of only 54%. Controversy exists with respect to evaluations of a non-smooth tumor margin. In 2 studies with nearly the same eligibility criteria, Chou et al.10,12 reported finding a non-smooth tumor margin in different percentages (53% and 39%, respectively) of patients from the same ethnic group, indicating that tumor margins may be misjudged and that the association of a non-smooth tumor margin with MVI risk may have been underestimated (odds ratios of 33.0 and 12.5, respectively). This possibility is consistent with the results of our meta-analysis. There are several additional limitations of the present study. First, considerable heterogeneity among studies was observed. Clinical heterogeneity (for example, ethnographic characteristics) may limit the reliability of pooled results. Moreover, a limited number of studies were included in this meta-analysis, and most of these studies had small sample sizes. Additional large-scale and high-quality studies are necessary to reach more definitive conclusions. In summary, our study suggests that a non-smooth tumor margin is a promising two-dimensional imaging feature for the preoperative assessment of MVI. Further investigation of this association is required, and this feature should be considered in the construction of future scoring systems.

Methods

Search strategy

Pertinent studies were identified by searching the PubMed, Embase, and Cochrane Library databases for articles published by April 12, 2017. No language or other restrictions were imposed. This search involved the use of “hepatocellular carcinoma” as a MeSH term and a free term and “microvascular invasion” as a free term combined with the search terms “margin”, “boundary”, “irregular”, “lobulated”, “confluence”, “infiltrative” and “extension”. The full search strategy is described in the Supplementary materials.

Inclusion and exclusion criteria

Clinical trials addressing the use of a non-smooth tumor margin for the preoperative evaluation of MVI in HCC patients were considered for inclusion. Studies were required to report sufficient data to construct a diagnostic 2 × 2 table. Investigations of non-primary HCC, studies involving gross analyses (without imaging) of non-smooth tumor margins and trials with sample sizes smaller than 30 subjects were excluded. When multiple publications assessed the same population, only data from the most recent comprehensive report were included. A literature flow diagram is depicted in Fig. 1.

Data extraction and quality assessment

All identified studies were screened for eligibility by one author (Y.H.), and a sample of 20% of these studies was independently assessed by another author (X.W.H.) to ensure consistent application of the eligibility criteria. Potentially eligible citations from retrieved articles were reviewed to identify additional studies. Both investigators (Y.H. and X.W.H.) used QUADAS (Quality Assessment of Diagnostic Accuracy Studies included in systematic reviews) to evaluate the level of evidence and reached consensus28. Two authors (H.T.H. and Q.Z.) independently extracted the following data from each included study: the first author’s last name, publication year, country, sample size, mean age and possible sources of bias. The possible sources of bias included mean tumor size, the percentage of male or MVI-positive patients and patients with a non-smooth tumor margin, the inclusion of patients with macrovascular invasion, the administration of preoperative anti-tumor treatment, surgical modality (i.e., resection or transplantation) and imaging modality. Data were extracted as true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN) to form a diagnostic 2 × 2 table. Disagreements during study selection, data extraction and quality assessment were resolved by consensus, with arbitration by a third author (W.W.) as needed.
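To make the role of the 2 × 2 table concrete, the sketch below derives the usual accuracy measures from the four cell counts of a single study. The function name and the example counts are hypothetical. The formulas are the standard definitions and assume that no cell is zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from one 2 x 2 table (no zero cells assumed)."""
    sen = tp / (tp + fn)    # sensitivity: share of MVI-positive cases flagged
    spe = tn / (tn + fp)    # specificity: share of MVI-negative cases cleared
    plr = sen / (1 - spe)   # positive likelihood ratio
    nlr = (1 - sen) / spe   # negative likelihood ratio
    dor = plr / nlr         # diagnostic odds ratio, equal to (tp * tn) / (fp * fn)
    return {"SEN": sen, "SPE": spe, "PLR": plr, "NLR": nlr, "DOR": dor}

# Hypothetical study: 50 MVI-positive and 100 MVI-negative patients.
for name, value in diagnostic_metrics(tp=40, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts the sensitivity and specificity are both 0.80 and the DOR is 16, which illustrates how a single DOR summarizes both error rates of a test.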

Statistical analysis

We used R (version 3.2.5; The R Foundation for Statistical Computing, Vienna, Austria) and Meta-Disc 1.4 for Windows (XI Cochrane Colloquium, Barcelona, Spain) for all statistical analyses. Summary results are presented in terms of sensitivity (SEN), specificity (SPE), positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR) and area under the receiver operating characteristic curve (AUC-ROC). Heterogeneity due to the threshold effect was investigated using the Spearman correlation coefficient in Meta-Disc 1.4. A strong positive correlation (P < 0.05) suggested a significant threshold effect. Heterogeneity caused by non-threshold effects was measured using the inconsistency index I2, with I2 > 50% indicating heterogeneity among studies. In cases involving heterogeneity, the DerSimonian-Laird random-effects method was used to calculate estimates. A meta-regression was performed to evaluate possible sources of bias listed in the Supplementary material (Table 1) and QUADAS scores, which were the basis for subgroup analyses. All tests of statistical significance were two-sided, and the chosen threshold for statistical significance was 0.05.
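The inconsistency index and the DerSimonian-Laird estimate can both be derived from Cochran's Q. The following is a minimal sketch of the textbook formulas, not a reproduction of the Meta-Disc or R routines used in this study. Pooling the log DOR, adding 0.5 to each cell and the function names are all assumptions made for illustration.

```python
import math

def log_dor_and_variance(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its variance (0.5 added to each cell, assumed)."""
    tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
    return math.log(tp * tn / (fp * fn)), 1 / tp + 1 / fp + 1 / fn + 1 / tn

def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, tau^2, I^2 in percent)."""
    w = [1 / v for v in variances]                    # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_re = [1 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, tau2, i2

# Hypothetical 2 x 2 tables (TP, FP, FN, TN) for three studies.
tables = [(40, 20, 10, 80), (15, 30, 25, 60), (30, 5, 8, 50)]
effects, variances = zip(*(log_dor_and_variance(*t) for t in tables))
pooled, se, tau2, i2 = dersimonian_laird(effects, variances)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled DOR {math.exp(pooled):.2f} (95% CI [{low:.2f}, {high:.2f}]), I2 {i2:.1f}%")
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights and the two pooled estimates coincide.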