Introduction

The key strategies for breast ductal carcinoma in situ (DCIS) management are to prevent its progression into invasive disease and to avoid disease recurrence, particularly invasive breast carcinoma, which accounts for half of the recurrences. Identifying high-risk DCIS that has the potential to invade is an excellent approach towards patient risk assessment and stratification for individualized management [1]. However, since the current clinicopathological parameters are inadequate to define DCIS risk precisely, identification of novel prognostic markers is necessary [2, 3]. Furthermore, the newly described genetic signatures such as Oncotype DX DCIS for prediction of recurrence show controversial results and need further validation [4,5,6,7,8,9]. Of note, genes included in the Oncotype DX signature are mainly related to cellular proliferation and metabolism as subsequent indicators of invasion rather than invasive potential [6]. In addition, currently available risk indices such as the Van Nuys Prognostic Index and nomograms rely mainly on clinicopathological parameters and to a lesser extent on markers related to the tumor cells, with little consideration for the surrounding microenvironment [7, 10,11,12]. With the emerging role of the tumor microenvironment and the related proteins in disease behavior [13], identification of more robust genetic signatures incorporating the crosstalk between tumor epithelial cells and the surrounding microenvironment might provide a better approach for DCIS risk assessment and hence better management.

Basement membrane degradation and stromal remodeling are fundamental steps in progression from DCIS into invasive disease. Although the key role of matrix metalloproteinases in stromal breakdown is undeniable, an explanation of DCIS progression into invasive disease that depends solely on them is insufficient. Studies that attempted to block metalloproteinase action in order to prevent disease progression reported unpromising results [14, 15]. Taken together, identification of novel markers that play a role in DCIS invasiveness might help in better understanding of the disease biology and risk stratification.

Legumain is a cysteine endopeptidase belonging to the asparaginyl endopeptidase family encoded by the legumain gene [16, 17]. Legumain activates zymogen gelatinase A by cleavage of pro-gelatinase A, which is an important mediator of extracellular matrix degradation, thereby helping the tumor to invade and metastasize [18,19,20]. It also activates other proteases that are key in regulating angiogenesis, growth, and other related functions in tumors [21]. Legumain is expressed at elevated levels in invasive breast cancer [16, 22], colorectal [23], prostate [24], and gastric carcinomas [17], and is related to poor prognosis [21]. Moreover, legumain is differentially expressed between normal breast tissue and invasive breast cancer [16]; however, the role of legumain in DCIS has yet to be established. In this study, we aim to assess the pattern of legumain expression and its prognostic significance in a large well-annotated DCIS cohort.

Materials and methods

Study cohort

A well-characterized annotated cohort of DCIS including pure DCIS (n = 776) and DCIS-mixed with invasive breast cancer (DCIS-mixed) (n = 239) diagnosed between 1990 and 2012 at Nottingham City Hospital (Nottingham, UK) was used as previously described [25]. Patients’ demographic data, histopathological parameters, and management including post-operative radiotherapy and development of local recurrence were collected (Supplementary Table 1). Local recurrence-free interval was defined as the time (in months) between 6 months after the first DCIS surgery and occurrence of ipsilateral local recurrence (either as DCIS or invasive breast cancer). Cases undergoing re-operation within the first 6 months due to close surgical margins or presence of residual disease were not counted as recurrence. Patients who developed contralateral disease following DCIS diagnosis were censored at the time of development of the contralateral cancer. Within a median follow-up period of 103 months (range 6–240 months), 83 cases (11%) developed a recurrence in the pure DCIS cohort, comprising 30 DCIS (36%) and 53 invasive cancers with or without DCIS (64%). Six recurrence events developed after mastectomy and 11 events after management with breast-conserving surgery followed by adjuvant radiotherapy, while the majority of the recurrences (n = 66) occurred after breast-conserving surgery alone.

Additionally, data on different molecular classes and tumor-infiltrating lymphocyte density were available for the cohort [25, 26]. To avoid selection bias, the DCIS-mixed cohort was selected with clinicopathological features comparable to the pure cohort regarding age at diagnosis, DCIS nuclear grade, and the presence of comedo necrosis.
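The definition of local recurrence-free interval given above can be made concrete with a short sketch. The field names and the Case structure are hypothetical, and treating any recurrence dated within the first 6 months as a non-event is an assumed reading of the re-operation rule, not a description of the authors' database.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

OFFSET_MONTHS = 6.0  # interval starts 6 months after the first DCIS surgery (from the text)


@dataclass
class Case:
    # All times are months since the first DCIS surgery (hypothetical field names).
    last_follow_up: float
    ipsilateral_recurrence: Optional[float] = None
    contralateral_disease: Optional[float] = None


def recurrence_free_interval(case: Case) -> Tuple[float, bool]:
    """Return (interval in months, whether an ipsilateral recurrence was observed)."""
    rec = case.ipsilateral_recurrence
    # Assumption: disease found within the first 6 months is not counted as recurrence.
    if rec is not None and rec <= OFFSET_MONTHS:
        rec = None
    # Censor at contralateral disease or last follow-up, whichever comes first.
    censor_at = case.last_follow_up
    if case.contralateral_disease is not None:
        censor_at = min(censor_at, case.contralateral_disease)
    if rec is not None and rec <= censor_at:
        return rec - OFFSET_MONTHS, True
    return max(censor_at - OFFSET_MONTHS, 0.0), False
```

For example, a patient with an ipsilateral recurrence 30 months after surgery contributes an event at 24 months, while a patient who develops contralateral disease at 50 months is censored at 44 months.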

Immunohistochemistry

Tissue microarrays were prepared from both cohorts. The TMA was constructed using a TMA GRAND MASTER 2.4-UG-EN MACHINE, using 1 mm punch sets. Cases with heterogeneous DCIS morphological patterns or grade were sampled from all representative areas. In addition, whole tissue sections from 20 cases comprising 10 pure DCIS and 10 DCIS-mixed cases were assessed to evaluate the pattern of legumain expression in malignant breast tissue and adjacent stroma and normal tissue.

Primary antibody specificity for rabbit polyclonal legumain antibody (ab125286, Abcam, UK) was validated using Western blot on whole-cell lysates of MCF7 and SKBR3 human breast cancer cell lines (obtained from the American Type Culture Collection, Rockville, MD, USA) as previously described [27,28,29]. Legumain antibody was used at a dilution of 1:500, which showed a single specific band at the predicted size of 56-kDa.

Expression of legumain protein in DCIS was assessed by immunohistochemistry using the Novocastra Novolink™ Polymer Detection Systems Kit (Code: RE7280-K; Leica Biosystems, UK). Tissue microarray and full-face sections (4 µm) were stained with rabbit polyclonal legumain (dilution 1:150), and then incubated for 24 h. Normal kidney tissue was used as a positive control, while a negative control was carried out by omitting the primary antibody.

Scoring of legumain expression

The percentage of cells showing cytoplasmic granular/vesicular staining [16] was estimated in tumor epithelial cells and the surrounding stromal fibroblasts, separately. Cores containing <15% tumor epithelial cells and/or stroma were excluded from the scoring. All scored cores showed representative areas of specialized stroma (within two high-power fields) [30] surrounding the malignant ducts. In addition, the few cores that included malignant epithelial cells only were excluded, as it was difficult to differentiate between an in situ and an invasive process and to determine the origin of these tumor cells. This method aimed to improve the reliability of the study, and the cases excluded were random. Cases with multiple cores were scored and the average score was used for the analysis. For the mixed cohort, each component, DCIS and invasive, was scored separately for the tumor epithelial cells and surrounding stroma. The cases were scored by two pathologists (MST and IMM) using a multiheaded microscope, considering the percentage of positive staining of any intensity. For dichotomization of protein expression, cut-off points for either malignant epithelial cells or stromal expression of legumain were defined according to the results obtained from X-tile bioinformatics software (Yale University, version 3.6.1) [31] based on local recurrence-free interval in the pure DCIS cohort. High legumain expression within tumor epithelial cells was considered when more than 65% of tumor cells showed staining, while expression in more than 10% of the surrounding fibroblasts was considered high expression.
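The scoring rule described above (average the cores of each case, then apply the two cut-off points) can be sketched as follows. The function and key names are hypothetical; only the 65% and 10% thresholds and the "more than" comparison come from the text.

```python
from statistics import mean
from typing import Dict, List

EPITHELIAL_CUTOFF = 65.0  # % positive tumor cells; "high" means more than this
STROMAL_CUTOFF = 10.0     # % positive fibroblasts; "high" means more than this


def classify_case(epithelial_cores: List[float], stromal_cores: List[float]) -> Dict[str, object]:
    """Average the per-core percentages for one case and dichotomize each compartment."""
    epithelial_score = mean(epithelial_cores)
    stromal_score = mean(stromal_cores)
    return {
        "epithelial_score": epithelial_score,
        "stromal_score": stromal_score,
        "epithelial_high": epithelial_score > EPITHELIAL_CUTOFF,
        "stromal_high": stromal_score > STROMAL_CUTOFF,
    }
```

A case with two cores scored 60% and 80% in tumor cells and 5% and 10% in fibroblasts would average 70% and 7.5%, so it would be classed as high for epithelial expression and low for stromal expression.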

Analysis of legumain mRNA expression in breast cancer

To emphasize the prognostic role of legumain in breast cancer and given the lack of data on the transcriptomic profiles of DCIS, legumain-normalized mRNA expression was evaluated as a potential prognostic marker in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) cohort dataset [32], which comprises a large (n = 1980) cohort of invasive breast cancer with comprehensive molecular characterization. Moreover, to validate the prognostic significance of legumain in breast cancer, analysis using the Breast Cancer Gene-Expression Miner v4.1 (bc-GenExMiner v4.1) database was carried out.

Statistical analysis

Statistical analyses were performed using SPSS v21 (Chicago, IL, USA) for Windows. Student’s t test and analysis of variance were used to assess the relationship between legumain mRNA level as a continuous variable and other clinicopathological parameters in METABRIC data. The association between legumain mRNA expression and breast cancer-specific survival was assessed after dichotomization of expression into high and low groups based on the median value.

Spearman’s ρ test was used to correlate legumain expression within the tumor epithelial cells with that in the stromal cells. Association between legumain expression and clinicopathological parameters in pure DCIS was assessed using χ2, Mann–Whitney, and Kruskal–Wallis tests. Wilcoxon's signed-rank test was used to compare the expression of legumain between the DCIS component and the invasive component within the DCIS-mixed cases. Univariate survival analysis against local recurrence-free interval was carried out using the log-rank test and Kaplan–Meier curves. A Cox regression model was used for multivariate analysis of legumain expression for all recurrences (either DCIS or invasive breast cancer) and invasive recurrences. For all tests, a two-tailed p-value of <0.05 was considered statistically significant.
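As an illustration of the Kaplan–Meier curves mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the SPSS procedure used in the study; the function name and the input layout (one time and one event flag per case) are assumptions made for the example.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return [(event time, estimated recurrence-free probability)] in time order."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0
        leaving = 0
        # Group all cases that share the same time point (events and censored cases).
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                recurrences += 1
            leaving += 1
            i += 1
        if recurrences:
            # Product-limit step: multiply by the fraction of at-risk cases without an event.
            survival *= 1.0 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Censored cases (for example, patients censored at contralateral disease) reduce the number at risk without lowering the curve, which is why the estimate only steps down at observed recurrence times.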

This work obtained ethics approval from the North West – Greater Manchester Central Research Ethics Committee under the title Nottingham Health Science Biobank (NHSB), reference number 15/NW/0685.

Results

Pattern of legumain expression

The evaluation of full-face tissue sections demonstrated representative distribution of legumain expression either in the tumor epithelial cells or in the surrounding specialized stroma throughout the whole section, indicating representability of tissue microarrays to assess legumain expression in our cohort. Adjacent normal breast terminal duct-lobular units showed negative or very faint cytoplasmic staining of legumain. Occasional inflammatory and stromal cells were also stained in few cores. When present, legumain was expressed in the cytoplasm of the epithelial tumor cells and surrounding fibroblasts (Fig. 1).

Fig. 1

a Normal breast ductolobular unit (×20) shows negative staining of legumain. b Negative legumain expression (×20) in a pure DCIS case. c Strong expression of legumain in tumor cells and surrounding fibroblasts (×20) in a pure DCIS case (inset: high power view showing the granular pattern of legumain expression). d High expression of legumain in the fibroblasts surrounding DCIS case (×40). e Expression of legumain in a mixed case (×40) showing strong staining in invasive component either within the tumor cells or surrounding stromal fibroblasts

After unbiased exclusion of uninformative cores (lost cores, folded tissue during processing and staining, and cores containing <15% tumor cells and/or stroma), the final number of cases suitable for scoring was 464 pure DCIS and 191 DCIS-mixed. Legumain expression showed a unimodal distribution. The median percentage of positive tumor epithelial cells was 25% in pure DCIS, 30% in the DCIS component of mixed cases, and 60% in the invasive component of the latter (all showed a range between 0 and 100%). For stromal expression, the median percentage of positive stromal cells was 5% in pure DCIS (range 0–80%), 70% in the DCIS component of mixed cases (range 0–80%), and 90% in the invasive component of the latter (range 0–90%). Within the pure DCIS cohort, high legumain expression was observed in 23% and 44% of cases in tumor epithelial and surrounding stromal cells, respectively. There was a positive correlation between expression of legumain within the epithelial cells and surrounding fibroblasts (r = 0.408, p < 0.0001, Spearman’s correlation).

The proportion of cases with high legumain was greater in DCIS-mixed than pure DCIS, both within the tumor epithelial cells (23% of pure DCIS cases vs. 36% of DCIS-mixed with invasive breast cancer, χ2 = 11.7, p = 0.001) and stromal cells (44% for pure DCIS vs. 86% of DCIS-mixed with invasion, χ2 = 95.5, p < 0.0001). Similar results were observed when the data were analyzed using a continuous scale (p = 0.049 and p < 0.0001, for tumor epithelial cells and stromal cells, respectively). Moreover, there was a statistically significant difference between legumain expression within the tumor epithelial cells of the DCIS component and invasive component of DCIS-mixed cases (p < 0.0001). Similarly, legumain staining was more frequent in the stromal fibroblasts surrounding the invasive component than those surrounding the DCIS component (p < 0.0001) (Fig. 2).
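The χ2 comparison of pure DCIS and DCIS-mixed cases above can be approximated with a short sketch. The counts below are not reported in the text; they are back-calculated from the rounded percentages and cohort sizes (an assumption), so the result only approximates the published statistic. The function computes Pearson's χ2 for a 2×2 table without continuity correction.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts back-calculated from rounded percentages (an assumption, not reported data):
# 23% of 464 pure DCIS is about 107 high; 36% of 191 DCIS-mixed is about 69 high.
pure_high, pure_low = 107, 464 - 107
mixed_high, mixed_low = 69, 191 - 69

chi2 = chi_square_2x2(pure_high, pure_low, mixed_high, mixed_low)
print(round(chi2, 1))
```

With these approximate counts the statistic comes out at roughly 11.8, close to the reported χ2 = 11.7 for tumor epithelial cells; the small difference is expected given that the counts were rebuilt from rounded percentages.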

Fig. 2

Bar chart showing differences in legumain expression between pure DCIS and DCIS-mixed both in tumor cells (black bars) and surrounding stroma (gray bars). P value from analysis of variance (ANOVA). Error bars represent +2 standard deviations

Significance of legumain expression in pure DCIS

High expression of legumain within the malignant epithelial cells and/or surrounding stromal fibroblasts in the pure DCIS was associated with various clinicopathological parameters characteristic of poor prognosis, including high nuclear grade, presence of comedo necrosis, hormonal receptor negativity, HER2 positivity, high proliferative index, and dense tumor-infiltrating lymphocytes (Table 1). Analysis of continuous data of legumain expression scores showed similar results (Supplementary Table 2).

Table 1 Correlation between legumain expression with different clinicopathological parameters in the pure DCIS cohort

To validate the prognostic value of legumain in invasive breast cancer, the METABRIC cohort [32] was used to assess the levels of legumain mRNA and correlate its expression with the clinicopathological variables and outcome. Higher legumain mRNA level was associated with high tumor grade (p = 0.03), lymph node metastasis (p = 0.04), estrogen receptor negativity (p = 0.001), and HER2 positivity (p = 0.006) in addition to shorter breast cancer-specific survival (hazard ratio (HR) = 1.3, 95% confidence interval (CI) = 1.1–1.5, p = 0.007) (Supplementary Table 3 and Supplementary Figure 1). Analysis using the Breast Cancer Gene-Expression Miner v4.1 (bc-GenExMiner v4.1) database showed that high legumain mRNA was associated with higher metastatic relapse and/or death (HR = 1.3, 95% CI = 1.1–1.5, p = 0.0001).

Outcome analysis in pure DCIS cohort

High legumain expression within tumor epithelial cells was associated with shorter local recurrence-free interval (all recurrences either as in situ or invasive disease) in pure DCIS (HR = 2.7, 95% CI = 1.6–4.8; p = 0.0002, Fig. 3). Association with shorter local recurrence-free interval was observed in patients treated with breast-conserving surgery without adjuvant radiotherapy (HR = 2.6, 95% CI = 1.4–4.4; p = 0.002, Fig. 3); however, the significant association with poor outcome was not maintained in patients treated with either mastectomy (HR = 0.9, 95% CI = 0.1–8.8; p = 0.9) or breast-conserving surgery followed by adjuvant radiotherapy (HR = 2.4, 95% CI = 0.8–10.4; p = 0.08). Interestingly, there was an association between high legumain expression and ipsilateral local recurrence as invasive disease (HR = 3.1, 95% CI = 1.5–6.5; p = 0.001, Fig. 3) particularly in patients treated with breast-conserving surgery without post-operative adjuvant radiotherapy (HR = 3.3, 95% CI = 1.5–7.3; p = 0.004). Supplementary Figure 2 shows forest plots illustrating the HR for disease recurrence of the different clinicopathological parameters in patients treated with breast-conserving surgery based on univariate survival analysis. Stromal expression of legumain did not show any significant association with tumor recurrence.

Fig. 3

Kaplan–Meier curves show that high expression of legumain within the tumor epithelial cells is associated with shorter ipsilateral local recurrence-free interval in the whole series (a), and in breast-conserving surgery (BCS) without adjuvant radiotherapy (b). High expression also showed an association with shorter local recurrence-free interval as invasive disease in the whole series (c) and in patients treated with breast-conserving surgery without adjuvant radiotherapy (d)

Multivariate survival analysis showed that high expression of legumain in tumor cells was a poor prognostic factor for tumor recurrence in patients treated with breast-conserving surgery, independent of other known determinants of high-risk DCIS, including age at diagnosis, DCIS size, presentation, nuclear grade, comedo necrosis, margin status, molecular classes, and radiotherapy, either for all recurrences (HR = 3.5, 95% CI = 1.8–4.9; p = 0.0003) or when the analysis was confined to invasive recurrences (HR = 3.4, 95% CI = 1.8–8.3; p = 0.002) (Table 2 and Fig. 4).
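For readers less familiar with the Cox regression model behind these hazard ratios, its standard form is sketched below. The symbols are generic textbook notation rather than the authors' own, and the Wald-type confidence interval is an assumption about how the reported intervals were computed.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_k x_k\right),
\qquad
\mathrm{HR}_j = e^{\beta_j},
\qquad
95\%\ \mathrm{CI}_j = \exp\!\left(\hat{\beta}_j \pm 1.96\,\mathrm{SE}(\hat{\beta}_j)\right)
```

Here h_0(t) is the baseline hazard of recurrence, each x_j is a covariate such as dichotomized legumain expression or nuclear grade, and HR_j is the hazard ratio for covariate j with the other covariates held fixed, which is the sense in which a factor is described as independent.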

Table 2 Multivariate survival analysis (Cox regression model) of variables predicting outcome in terms of ipsilateral local recurrence in patients treated by breast-conserving surgery in pure DCIS cohort
Fig. 4

Forest plots showing the hazard ratio of the different clinicopathological parameters and ipsilateral tumor recurrence for patients treated with breast-conserving surgery in pure DCIS cohort based on the multivariate analysis results for: a all recurrences whether DCIS or invasive breast cancer and b for invasive recurrences only

Interestingly, when legumain expression in tumor cells was incorporated with the other determinants of DCIS risk described by Van Nuys Prognostic Index [12], it provided better stratification for local recurrence risk, whereby high expression of legumain was associated with worse outcome in all risk groups when compared to similar groups with low legumain expression (Fig. 5).

Fig. 5

Kaplan–Meier curves show the association between DCIS risk and local recurrence-free interval in patients treated with breast-conserving surgery based on Van Nuys Prognostic Index alone (a), and when legumain was incorporated with the Van Nuys Prognostic Index (b)

Discussion

The underlying mechanisms promoting the transition from DCIS to invasive disease remain unclear and need to be better understood. Several studies and risk assessment models are available; however, none is adequate for patients’ risk stratification and hence a considerable percentage of patients with DCIS are either over-treated or under-treated. Furthermore, the biological and clinical heterogeneity of DCIS makes risk stratification quite challenging. An explanation of disease progression based exclusively on intrinsic tumor cell factors is insufficient, as there is a group of low-grade DCIS with indolent appearance and low proliferation index that nevertheless carries the potential to progress to invasive breast cancer [33]. Studying the role of the DCIS microenvironment and the interaction between its various components and understanding how this influences disease behavior could resolve the DCIS dilemma and provide a more adequate risk stratification model for personalized management [34,35,36,37]. As invasion through the outer myoepithelial layer and basement membrane degradation is a key step in DCIS progression to invasive cancer, studying potential markers that drive this process and their prognostic value is a convincing approach to refine DCIS risk.

The lysosomal cysteine protease legumain is a proteolytic enzyme that plays a role in autoimmunity and cancer [21, 38, 39]. Overexpression of legumain is linked with poor prognosis in different tumors including invasive breast cancer [17, 22,23,24, 40, 41]. Its action depends mainly on increasing the invasive and metastatic potential of the tumor via its proteolytic properties and stromal degradation [38]. Comparing legumain expression between normal, borderline, and invasive ovarian tissues reveals that it has a role not only in tumor migration and invasion but also in tumor development [40]. However, similar studies are lacking in breast cancer to assess the role of legumain in DCIS. It was reported that legumain is differentially expressed between normal breast tissue and invasive breast cancer [16, 21, 39]. Furthermore, using the METABRIC cohort for robust molecular data in a large number of invasive breast cancers, we have shown an association between aggressive behavior of invasive breast cancer and higher levels of legumain mRNA. These observations support our hypothesis that legumain is a promising candidate marker that requires additional studies to decipher its role in DCIS behavior.

Here we explored the expression of legumain in a large well-characterized cohort of DCIS and scored the protein expression in tumor cells and surrounding stromal fibroblasts. Interestingly, high legumain expression was associated with other features of high-risk DCIS. These findings support the role of legumain in DCIS progression. Supporting this, our results showed that legumain expression is higher in DCIS co-existing with invasive carcinoma than pure DCIS, and much higher in the invasive component either within the tumor cells or in the surrounding stromal fibroblasts.

The poor prognostic value of legumain was shown by a shorter recurrence-free interval in patients with high levels of legumain expression, independently of other clinicopathological factors. These findings were consistent for all recurrent events, either DCIS or invasive breast cancer, and when the analysis was confined to invasive recurrences only, which provides more evidence that legumain plays a key role in DCIS progression to invasive disease. Our study shows that expression of legumain in tumor epithelial cells, but not stromal cells, is associated with recurrence, a finding that might reflect the potential epithelial cell-intrinsic role of early-stage tumors in extracellular matrix degradation that facilitates tumor progression and the dual role of tumor and stromal cells in progression and aggressiveness of advanced tumors. The latter interaction is supported by the dramatic increase of legumain expression in stromal cells surrounding the invasive component compared to those surrounding the DCIS component in mixed cases or those surrounding pure DCIS. However, further functional studies are highly recommended to understand the underlying mechanisms and functions of legumain expression in carcinogenesis and tumor progression, whether from the tumor cells or the surrounding stroma.

Incorporation of legumain with the other clinicopathological factors provided a better identification of different risk groups. These findings indicate that legumain is a promising marker for better definition of high-risk DCIS as well as for the identification of patients with lower risk where radiotherapy could be omitted.

Thus far, little is known about the biological processes which involve legumain in cancer progression. However, a correlation was observed between tumor invasion and metastasis and the presence of cysteine endopeptidases, such as cathepsins B and L [42]. Protease zymogen cathepsins B and L may also be activated by legumain-mediated hydrolysis of asparaginyl bonds. Legumain acts as an asparaginyl endopeptidase in regulation of extracellular matrix remodeling through the activation of zymogen pro-gelatinase A, which is an important mediator of extracellular matrix degradation, or the degradation of fibronectin, which is a main component of the extracellular matrix [43, 44]. Animal tumor models generated with cells overexpressing legumain demonstrated an in vivo behavior that is vigorous with more invasive growth and metastasis [39]. This phenotype is proposed to result from the proteolytic function of legumain to activate other protease zymogens. The inhibitory effect of cystatins on tumor cells is consistent with the involvement of legumain, and perhaps other cysteine proteases, in tumor invasion and metastasis. Whether the tumor-suppressing effect is mediated through inhibition of legumain catalytic activity or other cysteine proteases is presently unknown [43].

Legumain is present intracellularly in a pro-active form [38, 39] and one of the activating mechanisms is low pH. Interestingly, our findings showed that legumain is associated with the presence of comedo-type necrosis, an environment that is consistent with low pH. Legumain is usually overexpressed in cells adjacent to necrosis [39], which was observed in our study as well, where central cells facing the comedo necrosis showed higher legumain expression than the peripheral cells within the ducts.

The role of legumain in tumor aggressiveness is not related solely to its proteolytic activity but also to its proliferation activation mechanisms. This may be related to decreased apoptotic activity of cells and increased calcium influx into cells [21, 45]. Supporting this possibility, our study showed that legumain was expressed in highly proliferative DCIS, which may further augment the adverse action of legumain in the context of disease outcome.

The role of legumain in autoimmune disease and inflammatory processes is undeniable [38]. The function of legumain in antigen presentation to inflammatory cells may be a cause of this phenomenon. Overexpression of legumain in tumor-associated macrophages and endothelial cells of the surrounding tissues has been reported [41]. Accordingly, the link between legumain and dense inflammatory cell infiltrates warrants investigation. We previously reported that dense tumor-infiltrating lymphocytes have poor prognostic significance in DCIS, a reverse phenomenon to the invasive disease for which the underlying mechanisms are unclear [26]. We saw a striking association between high stromal legumain and a dense lymphocytic infiltrate in pure DCIS (Table 1) that may be associated with an inflammatory function for legumain. Taken together, legumain may interact with the inflammatory cascade and affect DCIS behavior.

Extracellular matrix degradation is an essential step for DCIS progression to invasive disease. Legumain might have a potential role in DCIS aggressiveness through its proteolytic activity and regulatory mechanism in cellular proliferation. Additional functional studies to decipher the role of legumain and its mechanism of action in DCIS behavior are warranted. Legumain may also be a valuable prognostic indicator, especially for invasive recurrence.

This study was carried out on TMA sections, which might underestimate the role of tumor heterogeneity. However, all cases in our cohort were histologically reviewed before TMA construction, and multiple cores were used for cases with heterogeneous grades or morphological patterns. Moreover, our cohort did not include any patients treated with endocrine therapy.