Introduction

Worldwide, breast cancer accounts for 23% of total diagnosed cancer cases and is the second leading cause of cancer-related death among women [1]. In solid tumors, cell–cell decohesion is a recognized phenomenon allowing tumor cells to grow invasively into surrounding tissues [2]. E-cadherin, a calcium-dependent adhesion molecule encoded by the CDH1 gene located on chromosome 16q22.1 [3], has an important role in gland formation, cell differentiation, polarity, and maintaining the integrity of epithelial cells [4]. Consequently, decreased expression of E-cadherin, which is frequently seen in breast cancer, may lead to cellular de-differentiation and invasiveness [5, 6].

Reduced/loss of E-cadherin expression in the vast majority of invasive lobular carcinomas and lobular carcinoma in situ, together with loss of CDH1 gene copy number [7,8,9] or CDH1 gene mutation [10] in a large proportion of cases, suggests a plausible role for E-cadherin as a tumor suppressor gene [9, 11]. However, there is limited evidence to support a role for E-cadherin as a tumor suppressor gene in invasive ductal carcinoma [12]. In fact, ductal carcinoma in situ and low-grade invasive ductal carcinoma generally show stronger E-cadherin membrane staining than that seen in normal breast epithelial cells, denoting increased expression rather than a loss of expression [13]. Although some studies indicated that a proportion of invasive ductal carcinomas show loss/reduced E-cadherin protein expression, these tumors were typically high-grade aggressive tumors. Of note, accumulating evidence suggests that high-grade invasive ductal carcinomas are characterized by genomic instability, with loss of an increasing number of tumor suppressor genes during the carcinogenesis process that contributes to their aggressive behavior [12]. In addition, reduced/loss of E-cadherin expression is frequently associated with loss of estrogen receptor expression, larger tumor size, and the development of metastasis and recurrence [14,15,16,17]. These findings suggest that E-cadherin loss occurs as a late event in the process of carcinogenesis, arising in association with or as a part of genomic instability, rather than as an early neoplastic event as seen in invasive lobular carcinoma [13, 18, 19]. However, the reasons for dysregulation of E-cadherin protein expression remain ill-defined [20].

We therefore aimed to study the mechanisms underlying reduced/loss of E-cadherin expression in high-grade invasive ductal carcinoma compared with invasive lobular carcinoma and its potential molecular implications.

Materials and methods

Study cohort

This study was conducted on multiple well-characterized cohorts of high-grade invasive ductal carcinoma using different molecular techniques (Supplementary Table 1). First, a well-characterized cohort of primary grade 3 invasive ductal carcinoma from patients presenting to Nottingham City Hospital between 1989 and 1998 (n = 813), for whom detailed clinicopathologic data were available, was used to determine E-cadherin expression using immunohistochemistry [12]. The mean patient age was 52 years (range 18–71) and tumor size ranged in diameter from 0.1 to 5 cm at time of presentation, with a mean tumor size of 2 cm (Supplementary Table 2). To understand the molecular biology of E-cadherin expression, high-grade invasive ductal carcinoma cases (n = 883) in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) cohort [21] were used to investigate copy number alterations and CDH1 mRNA expression. The mean patient age was 59 years (range 26–96) and mean tumor size at time of presentation was 3 cm (range 1–18 cm). In the METABRIC invasive ductal carcinoma series, DNA/RNA was isolated from fresh frozen samples and genomic and transcriptional profiling was performed using the Affymetrix SNP 6.0 platform, the Illumina Total Prep RNA Amplification Kit, and Illumina Human HT-12 v3 Expression Bead Chips (Ambion, Warrington, UK). Copy number alteration was considered at the gene level by segments and the Šidák correction [22], whereas gene expression data were pre-processed and normalized as described previously [21]. In this cohort, patients with estrogen receptor-positive tumors and/or lymph node-negative status at time of diagnosis did not receive adjuvant chemotherapy, whereas those with estrogen receptor-negative tumors and lymph node-positive status received adjuvant treatment. Next-generation RNA sequencing (RNA-Seq) was conducted on an additional triple-negative breast cancer cohort (n = 106) to investigate reduced/loss of E-cadherin expression in this subtype of breast cancer. The mean patient age was 48 years (range 27–69) and tumor size ranged in diameter from 1 to 6 cm at time of presentation, with a mean tumor size of 2 cm (Supplementary Table 2).
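The Šidák correction mentioned for the copy number analysis adjusts a per-test significance threshold so that the family-wise error rate stays at a chosen level across independent tests. The following is a minimal sketch of that formula only; the family-wise α of 0.05 and the number of tests are hypothetical values chosen for illustration, and the function names are not taken from the METABRIC pipeline.

```python
def sidak_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Šidák correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


def sidak_adjust(p_value: float, n_tests: int) -> float:
    """Šidák-adjusted p-value for one test out of n_tests independent tests."""
    return 1.0 - (1.0 - p_value) ** n_tests


# Hypothetical example: 20 independent tests at a family-wise alpha of 0.05.
threshold = sidak_alpha(0.05, 20)
print(f"per-test threshold: {threshold:.5f}")
```

The threshold is slightly less conservative than the Bonferroni value of α divided by the number of tests, which is the usual reason for preferring it when tests can be treated as independent.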

Immunohistochemistry staining and scoring

Mouse monoclonal anti-E-cadherin antibody [clone 4A2C7, Ref#180223, LOT 954621A, Invitrogen, UK] was used to assess protein expression on immunohistochemically stained tissue sections after prior validation of the antibody by western blotting using MDA-MB-231 and MDA-MB-157 breast cancer cell lysates (obtained from American Type Culture Collection, Rockville, MD, USA). Immunohistochemistry staining was performed using the Novocastra Novolink™ Polymer Detection Systems kit (Code: RE7280-K, Leica Biosystems, UK) on 4 µm tissue microarray sections [20]. Sections were incubated for 24 h with the anti-CDH1 antibody at a dilution of 1:25. Scoring of membranous protein expression was performed using the modified histo-score [23]. We used the lower quartile of the modified histo-score (i.e., 85) to stratify the cohort into high and reduced/loss E-cadherin expression groups. Cases in the METABRIC cohort were stratified using a similar approach for total CDH1 mRNA expression. Copy number alteration and CDH1 mRNA expression were correlated with E-cadherin protein expression in the same cases where available (n = 131).
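The scoring and stratification described above can be sketched as follows. This is an illustration under stated assumptions: it uses the conventional histo-score weighting (1 × % weak + 2 × % moderate + 3 × % strong, range 0–300), which may differ in detail from the modified score used in the study; it assumes that cases at or below the lower quartile are labeled reduced/loss; and the scores listed are hypothetical, not study data.

```python
import statistics


def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """Histo-score: intensity-weighted sum of percentages, range 0-300."""
    if pct_weak + pct_moderate + pct_strong > 100:
        raise ValueError("percentages of stained cells cannot exceed 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def stratify(scores, cutoff):
    """Label each score 'reduced/loss' at or below the cutoff, otherwise 'high'."""
    return ["reduced/loss" if s <= cutoff else "high" for s in scores]


# Hypothetical H-scores for eight tumors (not data from the study).
scores = [10, 40, 85, 120, 150, 180, 210, 270]
# statistics.quantiles with n=4 returns the three quartile cut points;
# the first one is the lower quartile used as the cutoff.
lower_quartile = statistics.quantiles(scores, n=4, method="inclusive")[0]
groups = stratify(scores, lower_quartile)
print(lower_quartile, groups)
```

Different quartile conventions (inclusive versus exclusive interpolation) can shift the cutoff slightly in small samples, so the method should be fixed before comparing cohorts.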

RNA sequencing

RNA-Seq was performed on representative formalin-fixed paraffin-embedded blocks of triple-negative breast cancers (n = 106), which had also been assessed histopathologically for tumor burden. Invasive tumor cells were micro-dissected from unstained tissue sections where tumor burden was at least 50% of the tissue section area. Micro-dissected tissues were deparaffinized, rehydrated, and centrifuged to remove excess ethanol. RNA was extracted using the Omega Mag-Bind XP formalin-fixed paraffin-embedded RNA isolation kit (Omega, M2595-01) and Kingfisher Flex magnetic particle separator (ThermoFisher) as per the manufacturer’s instructions. RNA was measured with a Nanodrop 2000c spectrophotometer (Thermo Scientific). First-strand cDNA synthesis was performed on ~100 ng RNA at 25 °C for 10 min, 42 °C for 15 min, and 70 °C for 15 min using random hexamers and ProtoScript II Reverse Transcriptase (New England BioLabs, Ipswich, MA). Second-strand synthesis and RNA-Seq library preparation were performed using the Illumina TruSeq RNA access library kit (Illumina, RS-301-2002), and libraries were sequenced on an Illumina HiSeq 2500 using PE75 run chemistry. The targeted read count was 60 M total reads per sample. Sequencing was performed at the Emory Integrated Genomics Core Facility, Emory University, Atlanta, USA. Raw FASTQ read files were quality assessed and adapter trimmed using the Trim Galore wrapper for FastQC and Cutadapt, retaining reads with Phred scores > 30. The resultant quality-trimmed reads were aligned to the hg38 (GRCh38.83) build of the human genome using the STAR aligner. Transcript abundance quantification was performed using HTSeq [34]. Only one sample per patient, chosen by random selection, was included in downstream analyses. Differential gene expression was assessed using the Robina implementation of edgeR [24].

Pathway analysis

The publicly available web-based gene set analysis tool Webgestalt (http://www.webgestalt.org/option.php) was used to identify differentially regulated canonical pathways. This pathway analysis was based on transcripts differentially expressed at the p < 0.05 level and generated by Robina analysis, including only unbiased hits with significant z-scores based on network-adjusted p-values < 0.05 using the KEGG pathway database [25].
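Gene set tools of this kind commonly test whether a pathway is over-represented among the differentially expressed genes using a hypergeometric test. The sketch below illustrates that general idea only; it is an assumption that this resembles the analysis mode used here, and it does not reproduce the network-adjusted z-scores described above. All counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 4 in the pathway,
# 3 differentially expressed genes, 2 of which fall in the pathway.
p_small = hypergeom_enrichment_p(10, 4, 3, 2)
print(f"enrichment p-value: {p_small:.4f}")
```

In practice the resulting p-values would still need multiple-testing adjustment across all pathways examined.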

Statistical analysis

IBM SPSS 24.0 (Chicago, IL, USA) software was used for statistical analysis. The χ2-test was used to assess the effect of copy number alteration on reduced/loss of CDH1 mRNA expression. Furthermore, using the χ2-test, we evaluated copy number alteration of established tumor suppressor genes in METABRIC cases that exhibited both reduced/loss of CDH1 mRNA expression and CDH1 copy number loss, to infer genetic instability as the likely driver of the reduced/loss of CDH1 mRNA expression. The Mann–Whitney test was used to compare CDH1 mRNA expression with the expression of well-established transcription factors affecting E-cadherin expression [26]. We also evaluated expression of a set of genes previously demonstrated to have 93% predictive accuracy in distinguishing invasive lobular carcinoma from invasive ductal carcinoma via the prediction analysis for microarrays test [27]. Expression of proteins related to DNA repair and proliferation was compared with expression of the E-cadherin protein using the Mann–Whitney test. In addition, the association of E-cadherin protein expression with transcription factor mRNA expression (assessed using next-generation sequencing, HTSeq values) was evaluated using the Mann–Whitney test. A two-tailed p-value < 0.05 was considered statistically significant. RNA-Seq values were presented with standard errors of the mean (SEM) using GraphPad Prism v.7.
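The two tests named above can be illustrated with a short standard-library sketch. The study used SPSS, so this is not the authors' implementation: it computes a Pearson χ2 statistic without continuity correction for a 2 × 2 table, and only the U statistic (not the p-value) for the Mann–Whitney test. The counts and values are hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x, counting ties as one half."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0 for xi in x for yj in y)


# Hypothetical 2 x 2 table: rows = reduced/loss vs high CDH1 mRNA,
# columns = copy number loss vs no loss (not counts from the study).
stat, p_value = chi2_2x2(30, 70, 20, 80)
# Hypothetical expression values for two small groups.
u_stat = mann_whitney_u([1, 2, 3], [2, 4])
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}, U = {u_stat}")
```

A full Mann–Whitney analysis would convert U to a p-value using exact tables or a normal approximation with tie correction, which statistical packages handle automatically.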

Results

Evaluation of E-cadherin protein expression in the high-grade invasive ductal breast carcinoma cohort (n = 813)

The specificity of the E-cadherin antibody was validated by western blotting, which showed a single specific band at the expected molecular weight (~100 kDa). A total of 217/813 (27%) high-grade invasive ductal carcinomas and 46/106 (43%) triple-negative breast cancers showed reduced/loss of membrane expression of E-cadherin. Within the METABRIC cohort, reduced/loss CDH1 mRNA expression was observed in 208/883 (23%) cases. Furthermore, triple-negative breast cancer showed reduced/loss of CDH1 mRNA expression in 90/235 (38%) cases. By molecular subtype, reduced/loss CDH1 mRNA expression was observed in 104 basal (37%), 18 HER2-enriched (11%), 40 luminal A (27%), 29 luminal B (12%), and 17 normal-like (29%) cases (Supplementary Table 3 and Supplementary Figure 1). In the subset of cases that were included in the METABRIC dataset (n = 131), there was a positive linear correlation between CDH1 mRNA and the dichotomized E-cadherin protein expression (r = 0.27, p = 0.002).

Reduced/loss E-cadherin protein expression was associated with GammaH2AX (p < 0.0001) and phosphatase and tensin homolog (PTEN) (p = 0.003) protein expression (Table 1).

Table 1 Correlation between level of proteins associated with altered E-cadherin expression in high-grade invasive breast cancer cohort (n = 813)

E-cadherin copy number alteration in ductal breast cancer

To investigate whether reduced/loss of E-cadherin expression in the invasive ductal carcinoma cases is due to copy number alteration, we examined copy number alteration and CDH1 mRNA levels. Of the cases with reduced/loss CDH1 mRNA expression, 44/208 (21%) showed loss of CDH1 copy number, and copy number loss was significantly associated with reduced/loss CDH1 mRNA expression (p = 0.003) (Supplementary Table 4). Only one case with copy number loss did not show any association with the transcription factors investigated, while the remaining cases showed upregulation of one or more transcription factors (Supplementary Table 5). Interestingly, 77% of tumors presenting with reduced/loss CDH1 mRNA expression did not show CDH1 copy number loss, indicating that other mechanisms are implicated. In the triple-negative tumors, only 7/90 (8%) cases with reduced/loss CDH1 mRNA expression showed copy number loss, and there was no statistical association between copy number loss and reduced/loss of CDH1 mRNA expression (p = 0.10) (Supplementary Table 6). More importantly, among those seven cases, only one did not show any association with any transcription factor, while the remaining 6/90 (7%) cases showed upregulation of one or more transcription factors (Supplementary Table 7). Moreover, 83/90 (92%) of triple-negative tumors with reduced/loss CDH1 mRNA expression showed neutral/amplified CDH1 copy number.

In addition, reduced/loss CDH1 mRNA expression in invasive ductal carcinoma showed copy number loss of multiple well-established breast cancer tumor suppressor genes located at different chromosome loci: TP53, ATM, BRCA1, and BRCA2 (p < 0.001) (Supplementary Table 8).

Expression of E-cadherin suppressor transcription factors

In cases with reduced/loss E-cadherin expression (n = 208) from the METABRIC cohort, upregulated mRNA expression was observed for ZEB2 (56%), TWIST2 (54%), NFKB1 (54%), ZEB1 (53%), TWIST1 (52%), SLUG (51%), SNAIL (50%), GSK3BETA (49%), TGFB1 (47%), LLGL2 (38%), and CRUMBS3 (34%). Only 4% of the cases were affected by nine or more upregulated transcription factors (Supplementary Table 9 and Supplementary Figure 2). Upregulated expression of TWIST2, ZEB2, NFKB1, LLGL2, and CRUMBS3 was significantly associated with reduced/loss of CDH1 mRNA expression (Table 2). In triple-negative breast cancer with reduced/loss E-cadherin expression, upregulated mRNA expression was observed for ZEB2 (63%), SLUG (62%), TWIST2 (59%), TWIST1 (57%), ZEB1 (54%), SNAIL (52%), TGFB1 (51%), GSK3BETA (50%), NFKB1 (46%), LLGL2 (24%), and CRUMBS3 (24%) (Supplementary Table 10 and Supplementary Figure 3). Only 3% of the cases harbored nine or more upregulated transcription factors (Supplementary Table 7). Upregulated expression of TWIST2, TWIST1, ZEB2, ZEB1, SLUG, LLGL2, and CRUMBS3 was significantly associated with reduced/loss of CDH1 mRNA expression (Table 3).

Table 2 Correlation between mRNA levels of the genes associated with altered E-cadherin expression in breast cancer in the METABRIC cohort
Table 3 Correlation between mRNA level of the genes associated with E-cadherin expression in triple negative high-grade invasive ductal carcinoma in the METABRIC cohort

Proteins associated with E-cadherin expression in invasive triple-negative ductal breast carcinoma

There was no statistically significant correlation between reduced/loss of E-cadherin expression and transcription factors, DNA repair family members, or other markers such as Ki67, ATM, and PTEN at the protein level in triple-negative breast cancer (Table 4A, B and Supplementary Figure 4A).

Table 4A Correlation between level of proteins known to control E-cadherin expression using triple-negative invasive breast carcinoma cohort (n = 106)
Table 4B Correlation between mRNA expression levels of other genes known to control E-cadherin expression using triple-negative invasive breast carcinoma cohort (n = 106)

E-cadherin loss and expression of genes differentially expressed between invasive lobular carcinoma and invasive ductal carcinoma within the triple-negative breast cancer cohort

There was no significant association between reduced/loss of E-cadherin expression in high-grade triple-negative ductal cancer and the genes differentially expressed between invasive lobular and ductal carcinoma (Cathepsin B, TPI1, SPRY1, SCYA14, TFAP2B, thrombospondin 4, Osteopontin, HLA-G, CHC1) [27] (Table 5 and Supplementary Figure 4B).

Table 5 Genes differentially expressed between lobular vs. ductal breast carcinomas in triple-negative breast cancer cohort (n = 106)

Genomic study and pathway analysis

Next-generation sequencing identified 2143 differentially expressed genes (Benjamini–Hochberg; p < 0.05, differentially expressed by > two-fold, false discovery rate < 0.05). Triple-negative invasive ductal carcinoma with reduced/loss E-cadherin expression (n = 46) showed 849 significantly overexpressed and 1294 downregulated genes. Notably, pathway analysis highlighted dysregulation of genes regulating the Wnt signaling pathway, in which the top predicted master regulators of E-cadherin expression (ranked by p-value) whose activity could explain the protein expression differences were FZD2, GNG5, HLTF, WNT2, and CER1. For the PIK3-AKT signaling pathway, the top predicted master regulators controlling E-cadherin expression were FGFR2, GNF5, GNGT1, IFNA17, and IGF1 (Table 6). Importantly, key genes differentially expressed between invasive lobular carcinoma and invasive ductal tumors [27] did not show association with reduced/loss of E-cadherin expression in the invasive triple-negative ductal carcinoma (Table 5).
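The selection criteria reported above (Benjamini–Hochberg false discovery rate < 0.05 and a change of more than two-fold) can be sketched as follows. This is an illustrative implementation, not the Robina/edgeR code used in the study; the gene names are placeholders and the p-values and log2 fold changes are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(results, fdr=0.05, min_abs_log2fc=1.0):
    """Keep genes passing both the FDR threshold and a two-fold change."""
    names = list(results)
    q_values = benjamini_hochberg([results[g][0] for g in names])
    return [
        g for g, q in zip(names, q_values)
        if q < fdr and abs(results[g][1]) > min_abs_log2fc
    ]


# Hypothetical (p-value, log2 fold change) pairs; gene names are placeholders.
results = {
    "GENE_A": (0.001, 2.3),
    "GENE_B": (0.010, -1.8),
    "GENE_C": (0.030, 0.4),
    "GENE_D": (0.200, 3.0),
}
selected = select_genes(results)
print(selected)
```

A two-fold change corresponds to an absolute log2 fold change greater than 1, so the sign of the fold change then separates overexpressed from downregulated genes.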

Table 6 Pathway analysis results using Webgestalt to identify differentially regulated canonical pathways in the triple-negative breast cancer cohort

Discussion

Reduced/loss of E-cadherin expression is recognized as one of the main molecular events driving loss of cell–cell adhesion and thus facilitating cancer invasion and metastasis [28]. Some authors have suggested that E-cadherin can serve as a phenotypic marker to distinguish between invasive lobular carcinoma and non-invasive lobular tumors [27]. Mechanisms underlying reduced/loss of E-cadherin expression comprise CDH1 gene mutation [10], truncating mutation [29], promoter hypermethylation [30], and transcriptional inactivation [31]. Reduced/loss of E-cadherin expression is observed in 84% of invasive lobular carcinomas [9]. Several studies have shown that ~38% of high-grade invasive ductal tumors show reduced/loss of E-cadherin expression and this phenomenon has been linked to aggressive tumor behavior. Interestingly, CDH1 gene mutations were not identified in this subgroup [11, 19, 32].

One of the recognized mechanisms leading to reduced/loss of E-cadherin expression is loss of heterozygosity at chromosome 16q22.1, where the CDH1 gene is located [33]. Studies investigating the mechanism underlying reduced/loss of E-cadherin protein expression in invasive lobular carcinoma cases uncovered loss of the wild-type allele due to loss of heterozygosity at 16q22.1 occurring in > 70% of cases [7, 8]. Furthermore, CDH1 gene mutation and promoter hypermethylation were observed in 20% and 56% of invasive lobular carcinomas, respectively [7]. Interestingly, these mechanisms rarely co-occur in invasive lobular tumors [34]. Remarkably, mutational inactivation of the CDH1 gene mostly coexists with loss of the wild-type allele in invasive lobular carcinoma [35]. As reduced/loss of E-cadherin expression in invasive lobular tumors is predominantly caused by loss of heterozygosity, it has been suggested that copy number loss of the CDH1 gene can be used to discriminate between invasive ductal carcinoma and invasive lobular tumors when it is difficult to differentiate them based on histological evaluation [36]. Our investigation revealed that copy number loss occurred in only 21% of invasive ductal carcinomas displaying reduced/loss of E-cadherin expression. Therefore, other mechanisms must underlie the downregulation of E-cadherin in the majority of cases. Other mechanisms of E-cadherin reduced/loss of expression without copy number loss include DNA hypermethylation, a mechanism that may induce the reduced/loss of CDH1 mRNA expression detected in 60% of metastatic invasive ductal carcinoma [37].

Loss of CDH1 gene at 16q22.1 in invasive lobular carcinoma is one of the main genetic events and is observed early in the process of carcinogenesis in lobular carcinomas. We hypothesized that reduced/loss of E-cadherin expression in a subset of invasive ductal tumors might be the result of genomic instability and occurs as a late event during the process of cancer progression. Our results demonstrate that loss of CDH1 copy number is associated with copy number aberrations of multiple well-established breast cancer tumor suppressor genes located at different chromosomes; copy number loss of ATM (11q22.3), PTEN (10q23.31), RB1 (13q14.2), TP53 (17p13.1), BRCA1 (17q21.31), and BRCA2 (13q13.1) tumor suppressor genes. Moreover, DNA damage response pathways, which are crucial for detecting DNA lesions and arresting the cell cycle until the DNA is repaired or inducing cell death if cells sustain irreparable DNA damage [38], have key roles in preventing genetic instability and tumorigenesis [39]. Investigation of correlations between reduced/loss of E-cadherin expression and expression of biomarkers related to DNA damage response pathways in breast cancer revealed negative correlation between reduced/loss of E-cadherin protein expression and GammaH2AX and PTEN expression, suggesting that reduced/loss of E-cadherin expression is associated with impaired DNA damage response and, likely, genomic instability. Taken together, these results support our hypothesis that reduced/loss of E-cadherin expression in invasive ductal carcinomas is associated with genomic instability.

Reduced/loss of E-cadherin expression can also be caused by overexpression of its associated transcription factors [40,41,42, 26]. Our results showed a negative correlation between reduced/loss of CDH1 mRNA expression and the mRNA expression of transcription factors known to suppress E-cadherin expression and cause disruption of cell–cell adhesion [43, 44]; in fact, 76% of cases harboring reduced/loss of CDH1 mRNA show upregulation of one or more of these transcriptional repressors.

Remarkably, other key factors in epithelial–mesenchymal transition such as TGFBeta1, SNAIL, and SLUG did not show any correlation with reduced/loss of CDH1 mRNA expression. These observations suggest that reduced/loss of E-cadherin expression is not merely a surrogate for epithelial–mesenchymal transition but represents a readout of other pathways controlling E-cadherin expression at the membrane level.

Of note, reduced/loss of E-cadherin protein expression occurs in up to 50% of triple-negative invasive ductal carcinomas, which may contribute to increased lymph node metastasis and poor patient outcomes [45]. We observed a negative correlation between reduced/loss of CDH1 mRNA expression and the mRNA expression of multiple transcription factors known to suppress E-cadherin expression in our triple-negative breast cancer cohort. On the contrary, when we investigated the same genotype within the cohort tested by next-generation sequencing, none of these transcription factors showed statistically significant associations with E-cadherin expression. It is possible that different molecular mechanisms regulate E-cadherin expression, although we cannot exclude the possibility that our cohort is too small to detect such associations.

More importantly, the genes differentially expressed between invasive ductal and invasive lobular breast tumors identified by Waldman et al. [27], which could represent the effect of E-cadherin loss in lobular carcinoma compared with ductal tumors, showed no statistically significant difference at the mRNA level between breast cancer cases with reduced/loss of E-cadherin expression and tumors with normal expression. This may indicate not only that more complex molecular mechanisms are responsible for reduced/loss of E-cadherin protein expression in these cases but also that E-cadherin loss in ductal carcinoma does not produce the same effects as in lobular tumors. This may also be supported by the lack of morphological features and metastatic behavior characteristic of lobular carcinomas in ductal tumors lacking E-cadherin expression.

In this study, differential gene expression using next-generation sequencing investigating differences between cases with reduced/loss of E-cadherin expression and cases with normal/high expression showed dysregulation of genes regulating PIK3-AKT signaling pathway. Our analysis exposed a negative correlation between the genes regulating this pathway and reduced/loss of E-cadherin protein expression, suggesting that overexpression of those indicators may promote signaling via the PIK3-AKT pathway and thus negatively regulate E-cadherin expression. Receptors such as insulin-like growth factor receptor 1 can induce the activity of Akt pathway [46]. Our results are in agreement with reports indicating activation of PIK3-AKT represses E-cadherin expression and stimulates cell migration [47]. Nonetheless, dysregulation of genes regulating Wnt signaling pathway was also present in our results. Mutation or deregulation of gene expression of the canonical Wnt pathway is implicated in cancer [48,49,50].

A limitation of our study relates to comparing gene expression data obtained from microarrays, as used in the METABRIC cohort comprising different molecular subtypes of invasive ductal tumors, with the RNA-Seq dataset available for our triple-negative breast cancers only. We chose triple-negative breast cancer to study E-cadherin protein expression in invasive ductal carcinoma cases, as up to 50% of this molecular subtype show reduced/loss of E-cadherin protein expression [45, 51]. In contrast, studies have shown that reduced/loss of E-cadherin expression occurs in 23% and 27% of luminal and HER2-enriched subtypes, respectively [51]. RNA-Seq approaches cover multiple aspects of the transcriptome without any a priori knowledge, allowing identification of novel transcripts, splice junctions, and noncoding RNAs [52]. We acknowledge that these two approaches may not provide the same results because of intrinsic differences in assay design [53]. For instance, next-generation sequencing may have different lower limits of detection or may encompass different genomic regions [52]. More importantly, invasive ductal carcinoma cases in the METABRIC cohort comprise different molecular subtypes, whereas the RNA-Seq data were acquired for a triple-negative breast cancer cohort, which may also have influenced our results. Therefore, further validation of our findings is warranted.

Conclusion

Reduced/loss of E-cadherin expression in invasive ductal carcinoma is a complex biological phenomenon, which, according to the findings of this study, appears to be a part of the genomic instability process occurring late in carcinogenesis rather than an initial neoplastic event, and results in effects different from those produced in invasive lobular carcinomas. Using high-throughput next-generation sequencing, we have identified potential novel regulators controlling different signaling pathways that regulate E-cadherin protein expression in invasive ductal carcinoma. These regulators warrant further investigation and validation using different platforms.