Background

MicroRNAs (miRNAs) are short non-coding RNAs that post-transcriptionally regulate gene expression.1 They have been implicated in many diseases, including rare inherited syndromes arising from germline mutations in miRNA genes, and several cancer types.2 Research into the biology and pathology of these molecules has led to the identification of clinically useful genetic and epigenetic biomarkers and, more recently, novel therapeutic agents.3 These therapeutic agents are based on antagomiR technology, synthetic RNA molecules that bind their target miRNAs, and have shown promise in the control of disease symptoms and progression.

MicroRNA-196A (mature RNA MIR196A; non-human miR196a) is transcribed from two genomic loci, HOXC (Chr12 in humans, gene MIR196A2) and HOXB (Chr17 in humans, gene MIR196A1), each situated upstream of the respective HOX9 gene.4 The precursor transcript expressed from MIR196A2 (pre-MIR196A2) produces two mature miRNA molecules, miR-196a-5p (herein referred to as MIR196A; miR196a in non-human species) and miR-196a-3p, whilst the HOXB precursor gene MIR196A1 (pre-MIR196A1) also encodes miR-196a-5p but a different 3′ miRNA, miR-196a-1-3p. Early studies of miR196a function in mice and chickens demonstrated that miR196a expression is required to suppress Hoxb8 RNA, essentially controlling its spatiotemporal pattern along the anterior-posterior axis.5,6,7,8

MIR196A has been implicated in a range of cancers, primarily as an oncogene. For example, MIR196A is overexpressed in breast tumours relative to normal breast tissue,9 and a single nucleotide polymorphism (SNP; rs11614913, C>T) within the MIR196A2 gene is associated with a decreased risk of breast cancer.10 The decrease in risk from rs11614913 was found to be associated with reduced processing of the precursor transcript to mature miRNA, resulting in less MIR196A expression and strongly suggesting an oncogenic role in breast cancer. MIR196A has also been shown to target the 3′ UTR of Annexin-1 (ANXA1), an important mediator of apoptosis,11 in response to the pro-angiogenic vascular endothelial growth factor (VEGF), leading to alterations in angiogenesis, a hallmark of tumourigenesis.12 A separate study demonstrated that MIR196A could increase growth, migration and invasion of a non-small-cell lung cancer cell line through direct targeting of HOXA5.13 Two studies have recently shown that MIR196A can directly influence the cell cycle by targeting p27/Kip1, an inhibitor of cell cycle progression, to dramatically increase growth and pro-oncogenic features of cancer cell lines.14,15 Despite the clear importance of MIR196A in cancer, its transcriptional regulation remains poorly understood.

Transcriptional regulation is a complex, multi-faceted biological process that is significantly altered in cancer. MicroRNA genes are regulated transcriptionally in a similar manner to protein-coding and long non-coding RNA genes. Their promoters mostly lie upstream (within 10 kb of the mature miRNA) and contain a CpG island; in the active state, when the miRNAs are transcribed by RNA Pol II, they are enriched for H3K4me3 and lack H3K27me3, similar to protein-coding genes.16,17 Taken together, these data indicate that potential promoters for miRNAs can be identified with the methods used for protein-coding genes. Several instances of miRNA regulation by enhancers have been described, but this area is very much in its infancy.18,19

In this study, we aimed to characterise the expression landscape of MIR196A including factors regulating its expression and explore potential roles of regulatory elements and factors in breast cancer prognostication.

Material and methods

Cell culture

MCF7 cells for the development of endocrine-resistant sub-lines were obtained from AstraZeneca. MCF7, Tamoxifen-resistant (TAMR), Fulvestrant-resistant (FASR), and oestrogen-deprived (MCF7x) cells were cultured as described.20,21,22 All cell lines were cultured for less than 6 months after authentication by short-tandem repeat (STR) profiling (Cell Bank, Australia). MCF7 cells were cultured in RPMI (ThermoFisher, 11875-093) supplemented with 5% foetal calf serum (FCS, ThermoFisher, 1600-044). TAMR, FASR and MCF7x lines were cultured in phenol-red free RPMI (ThermoFisher, 11835-030) supplemented with 5% charcoal-stripped FCS (Sigma-Aldrich, F6765). Additionally, TAMR cells were maintained in 4-Hydroxytamoxifen (Sigma H7904, 10⁻⁷ M) and FASR cells in Fulvestrant (Herceptin®, Genentech, 10⁻⁷ M).

Cloning and reporter assays

All PCR products for luciferase reporter assays were ligated into Invitrogen’s pCR-Blunt (K270020) plasmid using T4 DNA Ligase (New England BioLabs, M0202S) at 4 °C overnight. MIR196A enhancers and promoters were digested from pCR-Blunt and cloned into the luciferase reporter plasmid pGL3-Basic (Promega, E1751). Enhancers were cloned into the BamHI/SalI site whilst promoters were cloned into the multiple cloning site immediately upstream of the luciferase gene. See Supplementary Table 1 for primers.

MCF7 cells were transfected in antibiotic-free media with 500 ng of modified pGL3 reporter constructs, 20 ng of pRL-TK (Renilla transfection control) and 0.5 μL of Lipofectamine 3000 (Life Technologies, L3000-008). At 48 h post transfection, luciferase readings were measured using a DTX-880 luminometer and the Dual-Glo Stop and Glo luciferase reporter kit (Promega, E2920), following the manufacturer’s recommended protocol.
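The normalisation applied to these readings (firefly signal relative to the co-transfected Renilla control, then relative to the empty pGL3 vector, as stated in the figure legends) can be sketched as follows. This is a minimal illustration only: the function names and the triplicate readings are hypothetical and are not taken from the study.

```python
from statistics import mean

def relative_luciferase(firefly, renilla):
    """Firefly/Renilla ratio for each replicate well (transfection control)."""
    return [f / r for f, r in zip(firefly, renilla)]

def fold_over_empty(construct_ratios, empty_ratios):
    """Mean construct ratio expressed relative to the mean empty-vector ratio."""
    return mean(construct_ratios) / mean(empty_ratios)

# Hypothetical triplicate readings in arbitrary luminometer units.
empty = relative_luciferase([100.0, 110.0, 90.0], [50.0, 55.0, 45.0])
pp1 = relative_luciferase([800.0, 900.0, 1000.0], [50.0, 50.0, 50.0])
print(fold_over_empty(pp1, empty))
```

A value above 1 indicates that the cloned element increases reporter output relative to the empty backbone.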

RNA extraction and gene expression

Cell lysates were prepared using Life Technologies TRIzol® reagent, and RNA was chloroform extracted and isopropanol precipitated. RNA was DNaseI treated with the DNA-free kit from Ambion (Life Technologies, AM1906). RNA for miRNA analysis was reverse transcribed using the miScript RT II kit from Qiagen (218161), following the manufacturer’s instructions. Assays for all miRNAs were performed with Qiagen’s miScript SYBR Green PCR Kit (218073). Primers specific to each mature or precursor miRNA were used together with a universal primer; see Supplementary Table 2 for assay IDs. Expression data for miRNAs were normalised to the small nuclear RNA RNU6b. All qRT-PCRs were performed using the protocols advised by the manufacturers on a Corbett Rotor-Gene 6000.
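As an illustration of normalising a miRNA Ct to RNU6b and to a vehicle control, the sketch below uses the common 2^(-ΔΔCt) calculation. The text does not state which formula was used, so this method, the assumed 100% amplification efficiency, and the Ct values are all assumptions.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by 2^(-ddCt), assuming 100% PCR efficiency."""
    # dCt: target Ct minus reference (e.g. RNU6b) Ct within each sample.
    d_treated = ct_target_treated - ct_ref_treated
    d_control = ct_target_control - ct_ref_control
    # ddCt: treated dCt minus vehicle-control dCt.
    return 2.0 ** -(d_treated - d_control)

# Hypothetical Ct values: precursor miRNA vs RNU6b, E2-treated vs vehicle.
print(fold_change_ddct(24.0, 18.0, 26.0, 18.0))
```

A lower Ct means more template, so a treated sample whose ΔCt is two cycles smaller than the control corresponds to a four-fold increase under these assumptions.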

Processed read counts for RNA-Seq on MCF7 cells following oestradiol treatment were sourced from K. Nephew (see author list).23 RNA-Seq data from Adriamycin (ADM) and paclitaxel (PTX) resistant MCF7-derived cells were sourced from GSE68815,24 as processed and normalised read counts. Expression of HOX genes in human breast cells was sourced from Gascard et al.25 as normalised read counts.

Genomic data analysis

Accession codes for publicly available data were as follows: MCF7 ChIP-Seq (GSE14664,26), GRO-Seq (GSE27463,27), ChIA-PET (GSE39495,28,29) and breast tumour ERα ChIP-Seq (GSE32222,30). MCF7 histone ChIP-Seq and breast cell 450K array data were sourced from ENCODE31 via http://genome.ucsc.edu/ENCODE/downloads.html. ChIP-Seq reads were adapter trimmed and mapped to the human genome (hg19) using Bowtie,32 peaks were called by MACS33 and the results were viewed in the Integrative Genomics Viewer (IGV)34 available through the Broad Institute servers. DNA methylation 450K array data for MCF7 and endocrine-resistant sublines were previously published; see Stone et al.35 Normalised DNA methylation of breast tumours was sourced from The Cancer Genome Atlas (TCGA).36 Methylation β-values were correlated to the gene expression of MIR196A from the TCGA cohort,36 and Pearson correlation coefficients are reported. For transcription factor (TF) binding to the MIR196A2 promoters, the genomic regions upstream of the MIR196A2 gene were visualised through the UCSC genome browser.37 ENCODE38 TF ChIP-Seq and JASPAR39 TF motifs were mapped across the putative promoter elements and snapshots from the browser were taken.
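The correlation of methylation β-values with MIR196A expression can be illustrated with a plain Pearson coefficient. The sketch below is a generic implementation with invented values for a single CpG; it is not the authors' pipeline.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical beta-values for one CpG and matched log2 expression per tumour.
beta = [0.1, 0.3, 0.5, 0.7, 0.9]
expression = [9.0, 8.0, 7.0, 6.0, 5.0]
print(pearson_r(beta, expression))
```

A coefficient near -1 corresponds to the pattern expected when promoter methylation is associated with reduced expression.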

Breast tumour expression analysis

METABRIC expression and clinical information were sourced from EGAS00000083 through a Material Transfer Agreement with the consortium.40,41 Expression values were pre-processed by METABRIC and were available as log2 array intensities. Clustering of Illumina Array and miR-Seq data was performed using the Multiple Experiment Viewer (MeV,42). Data were mean-centred and hierarchically clustered by average linkage with Manhattan distance for both rows and columns. Genes were correlated within clusters using the CORREL function of Microsoft Excel. The protein network was generated through cBioPortal (www.cbioportal.org,43) using the TCGA data.36 cBioPortal utilises protein data and visualisation tools through Cytoscape.44
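To make the clustering step concrete, the following sketch mean-centres each row and then performs agglomerative clustering with Manhattan distance and average linkage. It is a small stand-in for what MeV does internally, not a reproduction of it; the three gene profiles are invented, and only rows are clustered here (columns could be handled by transposing the matrix).

```python
from itertools import combinations

def mean_centre(row):
    """Subtract the row mean so each profile is centred on zero."""
    m = sum(row) / len(row)
    return [v - m for v in row]

def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def average_linkage(rows):
    """Agglomerate rows; return merges as (cluster_a, cluster_b, distance)."""
    def dist(c1, c2):
        # Average of all pairwise distances between members of two clusters.
        total = sum(manhattan(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of clusters at each step.
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges

# Hypothetical log2 intensities for three genes across three tumours.
genes = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 2.0, 1.0]]
centred = [mean_centre(g) for g in genes]
merges = average_linkage(centred)
print(merges)
```

Mean-centring makes the first two profiles identical, so they merge first at distance zero, which shows why centring groups genes by pattern and not by absolute intensity.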

Survival analysis

Tumour cohorts were based on immunohistochemistry of METABRIC40 patients, classified as either ER+ or triple-negative breast cancer (TNBC; ER−, PGR−, HER2−). Univariate and multivariate Cox proportional hazard regression analyses were performed using MedCalc for Windows, version 12.7 (MedCalc Software, Ostend, Belgium). Kaplan-Meier survival analysis and generation of survival curves were done in GraphPad Prism. Optimal cut-offs for low and high expression groups were determined using receiver operating characteristic (ROC) curves based on the expression of genes (MIR196A, HER2 and PGR) versus patient overall survival. The gene expression value that represents the maximum deviation from the ‘random guess’ line was used as the cut-off to discriminate low versus high expression. Lymph node status was designated as positive (+, ≥1 node presenting with disease at time of surgery) or negative (−). Tumour grade and stage clinical information were sourced from METABRIC.40 Tumour size was categorised as T1 (≤20 mm), T2 (>20 mm and <50 mm) and T3 (≥50 mm).
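The cut-off rule described above, the point on the ROC curve furthest from the random-guess diagonal, is commonly computed as the maximum of sensitivity minus false-positive rate (Youden's J). The sketch below assumes a binary outcome (1 = death, 0 = alive or censored) and assumes that high expression predicts the event; both are simplifications, and the data are invented.

```python
def optimal_cutoff(expression, event):
    """Return (cutoff, J) maximising TPR - FPR, calling 'high' expression >= cutoff."""
    positives = sum(event)
    negatives = len(event) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(expression)):
        tp = sum(1 for x, e in zip(expression, event) if x >= cutoff and e == 1)
        fp = sum(1 for x, e in zip(expression, event) if x >= cutoff and e == 0)
        # Vertical distance of this ROC point above the diagonal.
        j = tp / positives - fp / negatives
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical log2 expression values and outcomes for six patients.
expression = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
event = [0, 0, 0, 1, 1, 1]
print(optimal_cutoff(expression, event))
```

Patients at or above the returned cut-off would form the high-expression group for the Kaplan-Meier comparison.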

3C and ChIA-PET

Chromosome conformation capture (3C) was adapted from Vakoc 2005,45 Hagege 200746 and Tan-Wong 2008.47 Briefly, cells were grown to 60–80% confluency and fixed with 1% paraformaldehyde. Libraries were generated for each cell line using HindIII, with control libraries left undigested and unligated to represent native gDNA without chromosome conformation. GAPDH primers (the amplified fragment contains no cut sites for these enzymes) were used to determine the digestion and ligation efficiency of each library by comparing 3C-qPCR values to primers that amplify a fragment containing a HindIII cut site. For each 3C-qPCR, primers were designed 100–250 bp upstream or downstream of each HindIII cut site, with the primer across the putative enhancer used as bait. The bait primer was combined with each of the primers across the enhancer region, and Ct levels from each 3C-qPCR were normalised to the lowest Ct value (most abundant interaction) so that this interaction = 1 relative interaction. Enhancer-promoter interactions appear as a peak across a region of multiple primers.
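The normalisation of 3C-qPCR Ct values to the lowest Ct can be sketched as below. Converting Ct differences to relative interaction with 2^(-ΔCt) assumes 100% amplification efficiency, which the text does not state; the primer names and Ct values are hypothetical.

```python
def relative_interactions(ct_by_primer):
    """Scale each Ct to the lowest Ct so the strongest interaction equals 1."""
    ct_min = min(ct_by_primer.values())
    # Each extra cycle is treated as a halving of ligation-product abundance.
    return {primer: 2.0 ** -(ct - ct_min) for primer, ct in ct_by_primer.items()}

# Hypothetical Ct values for the bait primer paired with three test primers.
cts = {"frag_A": 30.0, "frag_B": 28.0, "frag_C": 29.0}
print(relative_interactions(cts))
```

Plotting these values against genomic position gives the peak profile used to call an enhancer-promoter interaction.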

Results

MIR196A expression correlates with HOXC genes in breast cancer

Several HOXC protein-coding and non-coding genes have known associations with breast cancer progression. We assessed expression patterns of HOXC genes and MIR196A (mature miRNA) in the METABRIC cohort of breast tumours (Supp Fig. 1). These data indicate that MIR196A expression correlates strongly with HOXC genes, particularly HOXC10, which lies directly upstream of this miRNA.

Next, we investigated whether these associations are also observed in normal cells of the human breast. Here, associations between mature MIR196A expression and HOXC genes are more limited, with the strongest correlations seen for HOXC11 and HOXC10, the genes upstream of the HOXC MIR196A2 gene (Supp Fig. 2A). Consistent with its role in degrading HOX transcripts, HOXC8, HOXB8 and HOXA7 (all validated targets) negatively correlate with MIR196A expression. MIR196A appears to be most highly expressed within the basal stem-cell (BSC) derived cells, whilst expression is lower in the more differentiated cell types (Supp Fig. 2B).

MIR196A expression is regulated by oestrogen

We and others have previously demonstrated regulation of HOXC genes by oestrogen in breast cancer.48,49,50,51,52 Given that MIR196A expression strongly correlates with expression of HOXC protein-coding genes in breast tumours (Supp Fig. 1), we sought to determine if oestrogen also regulates the HOXC-embedded MIR196A2 precursor gene. Chromatin immunoprecipitation sequencing (ChIP-Seq) for RNA polymerase II demonstrates that polymerase binding in the region surrounding the HOXC10 gene and MIR196A gene is dependent on oestrogen in MCF7 cells (Fig. 1a). Global run-on sequencing (GRO-Seq) measures nascent RNA, assessing changes in transcription with high sensitivity. Analysis of MCF7 GRO-Seq data indicates a dramatic increase in RNA production in the genomic region surrounding MIR196A2, peaking at 40 min following addition of oestradiol (E2) (Fig. 1b). This increase in RNA production from the HOXC locus was validated with qRT-PCR from MCF7 cells following addition of E2 (Fig. 1c). The regulation of HOXC10 by oestrogen has been previously established,51 and we find similar results, which indicate an increase in expression by E2 (Supp Fig. 3). We next analysed data from MCF7 cells where low levels of E2 (1 nM) were used and found a similar pattern of a rapid increase in pre-MIR196A2 expression (Fig. 1d). Additionally, there was no change in the expression of the HOXB precursor, pre-MIR196A1. Taken together, this suggests that MIR196A is transcriptionally regulated by oestrogen through its HOXC precursor, MIR196A2.

Fig. 1

E2 influences MIR196A2 expression in breast cancer. a Identification of putative promoter regions for the MIR196A2 gene using histone marks and ChIP-Seq as indicated in the figure. Refseq genes are indicated in blue at the top with coordinates based on hg19 chromosome 12. Putative promoter regions (PP1,2,3) and the previously implicated SNP (rs11614913) and CpG from Hoffman et al.10 are indicated by black rectangles. MCF7 DNA methylation 450K array data indicate unmethylated (black), partially methylated (blue) and methylated (red) sites. b GRO-Seq measurements of RNA Polymerase engagement and elongation points from the putative promoters, after E2 stimulation in MCF7 cells. Lower part, ChIP-Seq for binding of ERα and FOXA1 to the putative promoters. c qRT-PCR of the MIR196A2 response to E2 in MCF7 cells. Qiagen precursor primers were used to detect the precursor miRNA at the specified time points and CT values were normalised to a DMSO vehicle control and the qRT-PCR control of RNU6b. d MiRNA-Seq RPKM for the precursor miRNAs following E2 addition to MCF7 cells. e Luciferase reporter assay measuring the influence of MIR196A2 putative promoters on luciferase gene transcription. Measurements are RLU normalised to the renilla plasmid (pRL-TK) acting as a transfection control and to the pGL3/Empty plasmid. Experimental measures are done in triplicate with the experiment repeated, data not shown. E2 oestradiol, H3K4me3 Histone 3 Lysine 4 tri-methylation, H3K27ac Histone 3 Lysine 27 acetylation, H3K27me3 Histone 3 Lysine 27 tri-methylation, HMEC Human Mammary Epithelial Cell, GRO-Seq Global run-on sequencing, RPKM reads per kilobase per million, RLU relative light units

Transcriptional regulation of the MIR196A2 precursor gene

To identify the structural elements associated with the transcriptional regulation of MIR196A2, histone methylation patterns in the MCF7 breast cancer cell line were assessed. This analysis uncovered putative promoter elements upstream of MIR196A2, including a shared promoter with HOXC10 (Fig. 1a). Given the strong association of MIR196A and HOXC10 expression in breast and their co-regulation by oestrogen, it seems likely that they share a common promoter element, which we cloned as three separate elements labelled putative promoter 1 (PP1), Overlap (between PP1 and PP2) and PP2.

Given that MIR196A2 expression is regulated by oestrogen, we hypothesised that its transcription may be controlled by the oestrogen receptor (ER). Using publicly available datasets, we established that oestrogen-mediated upregulation of MIR196A2 expression is accompanied by binding of ERα and its pioneer factor FOXA1 to two putative promoter regions, PP1 and PP3, upstream of the MIR196A2 transcription start site (Fig. 1b).

The putative promoter elements were subsequently cloned into luciferase reporter vectors to assess transcriptional activity. PP1 and PP2 (modestly) increased luciferase gene transcription (Fig. 1e), with the most active promoter in MCF7 cells being PP1 (HOXC10 promoter).

Given that ERα often binds to distal enhancer elements to exert its function, we examined the hypothesis that MIR196A2 is controlled by long-range transcriptional regulation mediated by ERα-tethered gene looping. Using ChIA-PET (Chromatin Interaction Analysis by Paired-End Tags) data for genome-wide chromatin interactions that immunoprecipitate with either ERα or RNA Polymerase II (correlative with active promoters and enhancers), we identified two major sites of interaction with the MIR196A2/HOXC10 promoters (Fig. 2a). One of these is a previously identified HOTAIR enhancer (HOTAIR distal enhancer, HDE49) and the other a novel interacting partner (MIR196A2-Enhancer, mE). Chromosome conformation capture (3C) enzymatic digestion of the HOXC genomic locus results in two fragments covering the MIR196A2 region. 3C-qPCR analysis demonstrates that both enhancer elements physically interact with each of the MIR196A2/HOXC10 promoter regions (Fig. 2b). Cloning of these fragments downstream of the putative-promoter luciferase reporters demonstrates significant augmentation of transcription for both PP1 and PP2, with HDE appearing to be the most active in MCF7 cells (Fig. 2c).

Fig. 2

Distal putative enhancer elements of the MIR196A2 putative promoters. a Histone modification and ChIA-PET of ESR1 and RNA Pol II in MCF7 cells across the HOXC locus and corresponding gene desert. The histone modification H3K27ac is a measure of regulatory element activity and was assessed in HMEC and MCF7 cells plus or minus E2. ChIA-PET interactions are represented by red lines; solid rectangles indicate the sequenced tag and the two points that were physically interacting and tethered to either ESR1 or RNA Pol II. b Zoom-in of mE and HDE (left) and the putative promoter elements (right). Black rectangles indicate the genome fragment sizes post digestion with HindIII. Graph is 3C-qPCR for either the 3C PP1 or 3C PP2-3 fragments, with the Y-axis showing the relative interaction and the X-axis the genomic location. All genomic coordinates are based on chromosome 12 in hg19. c Luciferase reporter assay showing the augmentation of HOXC promoters with either mE or HDE, graphed as RLU normalised to the co-transfected control renilla plasmid and to the vector backbone pGL3-Basic. ChIA-PET Chromatin interaction analysis by paired-end tags, HDE HOTAIR distal enhancer, mE MIR196A2 enhancer, PP1/2/3 putative promoters 1, 2 and 3

Interestingly, a previous study10 identified a SNP (rs11614913) and an upstream CpG island that are both associated with a decrease in breast cancer risk. This SNP lies within the MIR196A2 gene and the CpG island (CpG_Hoffman) is immediately upstream, falling into the 3′ end of PP3. Analysis of DNA methylation reveals that this CpG island is mostly methylated in non-malignant MCF10A and cancerous MCF7 cells, whilst unmethylated in human mammary epithelial cells (HMEC) (Fig. 1a).

MIR196A is differentially expressed in breast cancer

Since MIR196A is regulated by ERα, we investigated its expression patterns in relation to commonly utilised molecular markers of breast tumours (Fig. 3a). This analysis identified four distinct clusters of MIR196A expression (Clusters 1–4). Interestingly, clusters 1 and 3 show a strong correlation to expression of hormone receptors (HR) (AR, ERα, PGR, HER2) and HR cofactors (Fig. 3b). In contrast, clusters 2 and 4 have significant negative correlation to expression of ERα, PGR, FOXA1 and GATA3, whilst associating with EGFR and HER2. This expression is further defined by the PAM50 intrinsic subtypes, where MIR196A is strongly expressed in the HER2 subtype, whilst in the luminal A and B subtypes expression is highly variable (Fig. 3c).

Fig. 3

MIR196A is differentially expressed in breast cancer. a Mean-centred log2-expression of MIR196A and commonly utilised breast cancer molecular markers. Expression values were hierarchically clustered and the PAM50 tumour subtypes are indicated above the plot. Expression values are indicated by colour scale bar. b Pearson correlation coefficients, and corresponding P-values, for each gene against the expression of MIR196A in either the orange or purple clusters. c Intensity values for the expression of MIR196A across the PAM50 molecular subtypes. d Correlation of HOXC methylation to expression of MIR196A for CpG dinucleotides upstream of its gene. Data were sourced from the METABRIC cohort40,41 for a, b and c. Methylation and expression data for d were sourced from the TCGA cohort of breast tumours.36 Basal (n = 179), HER2 (HER2-enriched, n = 112), LumA (Luminal A, n = 568), LumB (Luminal B, n = 354), N-Adj (Normal-adjacent, n = 116), N-Like (Normal-like, n = 82)

DNA methylation accumulates within promoters at CpG islands to suppress gene expression through inhibition of transcription factor binding.53 To predict further regulators of MIR196A expression in clusters 2 and 4, where negative correlation to ESR1 is seen, we investigated the DNA methylation of our putative promoter elements. The majority of upstream CpGs show a negative correlation to MIR196A expression, with the strongest correlation seen at sites within PP2 and A2-Gene (Fig. 3d). Utilising ENCODE TF ChIP-Seq data and motif sites from JASPAR, we identified 49 TFs binding within 100 bp of these methylation sites (Supp Figs. 4 and 5, Supp Table 3). Of the 49 TFs, 18 factors exhibit significant positive correlation in clusters 2 and 4, while 9 are significantly negatively correlated to MIR196A expression (Supp Fig. 6A, Supp Table 3). Interestingly, a cluster of factors (CEBPA, CEBPB, EBF1, EGR1, EGR2, EZH2, JUN, KLF4, KLF5, PPARG, RXRA) presents as highly interconnected through protein-protein interactions and transcriptional regulation and appears largely independent of ERα (Supp Fig. 6B, blue oval). These data suggest that in breast cancer, an interconnected group of transcription factors may influence expression of MIR196A independently of ERα.

MIR196A is a biomarker of breast cancer progression

To further explore the expression of MIR196A in breast cancer, we utilised expression data from the METABRIC cohort of breast tumours.40,41 Expression analysis of this miRNA indicates that it is significantly over-expressed in breast tumours compared to normal adjacent tissue and that over-expression is associated with an increase in tumour stage (Fig. 4a, b). Interestingly, high expression of MIR196A is associated with poor survival in oestrogen receptor positive (ER+) breast cancer, whilst high expression associates with a better outcome in triple-negative breast cancer (TNBC) over the first 5 to 10 years following initial diagnosis (Fig. 4c, d).

Fig. 4

MIR196A is a biomarker of breast cancer progression. a Log2 miR-Array intensity for the expression of MIR196A in normal adjacent tissue and breast tumours. Two-tailed T-test with p-value < 0.0001 (****). b Log2 intensity for the expression of MIR196A in normal adjacent tissue and tumours at stages 0 to 4. A One-Way ANOVA was used to find a significant trend with a p-value < 0.0001. c, d Kaplan-Meier curves stratifying OS of breast tumours by expression of MIR196A for patients with ER+ or TNBC. e Kaplan-Meier survival curves for patients with ER+ disease, treated with both CT and HT. Expression and survival data sourced from METABRIC.40,41 Log-rank p-value (P) and hazard ratios (HR) displayed. CT Chemotherapy, HT hormone therapy, Norm-Adj Normal Adjacent, OS overall survival and TNBC triple-negative breast cancer

Using MIR196A expression, overall survival of ER+ tumours responding to both hormone therapy (HT) and chemotherapy (CT) was stratified (Fig. 4e). Women with low MIR196A expression exhibited a high rate of survival (>95% at 10 years, HR = 8.003, P-value = 0.0125), whilst most women within the high expression group died within 17 years (61% survival at 10 years).

Given that MIR196A is regulated in part by oestrogen, and the disparity in prognostication of ER+ and TNBC, we investigated the effects of menopause on the stratification of survival for ER+ women. The effects of menopause on the human breast are largely unknown; however, serum levels of oestrogen and progesterone reduce dramatically post menopause. In pre-menopausal women, high expression of MIR196A is associated with improved overall survival in ER+ disease (HR = 0.463, P-value = 0.0288) (Table 1, Supp Fig. 7A). Multivariate analysis demonstrates that MIR196A is one of the few significant biomarkers for ER+ tumours arising before menopause. In post-menopausal women, all tested biomarkers were significant in ER+ disease, including MIR196A; however, high expression is now associated with decreased overall survival (HR = 1.847, P-value = 0.0005) (Table 1, Supp Fig. 7B). A similar trend was also observed in TNBC, where in pre-menopausal women high MIR196A expression correlates with a better outcome (Supp Fig. 7C). Stratification in post-menopausal women, however, found no significant trend (Supp Fig. 7D).

Table 1 Menopause affects the stratification of patient survival by MIR196A expression in ER+ disease

Therapeutic resistance leads to increases in MIR196A expression

TNBC is resistant to hormone-based therapies and HR+ disease often becomes resistant to anti-oestrogen treatment. Using established models of HR+ disease resistance, we found that MIR196A expression is significantly increased in tamoxifen-resistant MCF7 cells (TAMR) whilst it is almost depleted in acquired fulvestrant resistance (FASR) (Fig. 5a). These expression patterns match changes in DNA methylation at the HOXC10/MIR196A2 promoters in these same cells (Fig. 5b). For HR+ resistant tumours the only remaining therapeutic options are radiotherapy and chemotherapy. In RNA-Seq data for cell line models of resistance to Adriamycin (ADM) and paclitaxel (PTX), two commonly used chemotherapeutics, MIR196A expression again increases in resistant cell lines compared to the treatment-sensitive cell line (Fig. 5c). These data suggest an intrinsic requirement for elevated MIR196A expression in HR+ tumour resistance.

Fig. 5

Therapeutic resistance leads to an increase in MIR196A expression. a, d qRT-PCR of the relative expression for the mature MIR196A and precursor MIR196A2 transcripts and HOXA7 in MCF7-derived cell line models of endocrine therapy resistance. miRNA expression values are normalised to the expression of RNU6b and the MCF7-C cell line; HOXA7 expression was normalised to RPLP0 and MCF7-C expression. Error bars are the standard deviation of two technical replicates and four biological replicates. b Corresponding DNA methylation for MCF7-derived cell lines, as measured by 450K methylation array, for the MIR196A2 genomic region. c, e Log2 RPKM expression of MIR196A and HOXA7 in MCF7 wild-type and ADM- and PTX-derived resistance cell lines. f Peak scores for the binding of ERα to MIR196A2 regulatory elements in ER+ breast tumours. Peak scores were generated using MACS, normalised to the Input control for the ChIP-Seq library. Peak scores are the average for 9 responders, 9 non-responders and 3 metastases. Data are sourced from Ross-Innes et al.30 ADM Adriamycin/doxorubicin, FASR Fulvestrant resistant, MACS model-based analysis for ChIP-Seq, MCF7-C Control, MCF7-X Oestrogen deprived, PTX paclitaxel, RPKM reads-per-kilobase-per-million, TAMR Tamoxifen resistant

Several HOX genes are validated targets of MIR196A (HOXA7, HOXB8, HOXC8 and HOXD8). HOXA7 shows the greatest negative correlation to MIR196A in human breast cells (Supp Fig. 2A). Expression of HOXA7 strongly mirrors that of MIR196A in the panel of endocrine-resistant MCF7 sublines (Fig. 5d). Additionally, expression of HOXA7 decreases in the ADM- and PTX-resistant lines (Fig. 5e), in contrast to the increase in MIR196A expression. In these models of therapeutic resistance, MIR196A may be reducing expression of HOXA7.

Utilising ERα ChIP-Seq performed in human patients with HR+ disease,30 binding sites for ERα were identified in the genomic region of MIR196A. This tumour cohort contains three groups of tumours: (1) tumours from women who respond to HR therapy, (2) those who do not and (3) metastases from resistant tumours. An increase in ERα occupancy is seen at both enhancer and promoter regions of MIR196A in non-responders and metastases (Fig. 5f). The increased genome-wide ERα binding in the more resistant tumours was shown by the authors to associate with changes in expression patterns crucial for the tumour to survive therapy and become resistant.

Discussion

The expression of MIR196A in breast cancer is both dynamic and complex. In this paper, we have elucidated important elements, factors and mechanisms controlling the transcriptional regulation of MIR196A and shown that changes in regulation are associated with breast cancer progression and therapeutic resistance.

Several studies have demonstrated regulation of HOXC genes by oestrogen.49,51,54 The majority of HOXC genes are lowly expressed in breast luminal epithelial cells (BLEC), where ERα is most highly expressed and cells are responsive to oestrogen. The regulation of HOXC genes by ERα may be specific to cancer cells through an acquired mechanism of regulation. Several studies have shown that enhancers that are normally repressed can become activated in cancer,55,56 and given the extensive chromatin looping between the HOXC locus and its adjacent gene desert, this seems the likely mechanism for expression in cancer.

We have previously demonstrated that long-range regulation of HOXC genes occurs in breast cancer and is influenced by ERα and its associated cofactors.49 HOX gene expression is tightly controlled in a spatiotemporal manner to ensure proper axial formation along the anterior-posterior axis during embryonic development.57 Within the cell types of the human breast, HOX gene expression appears dynamic and the association between MIR196A and HOXC genes is not significant. The strong correlation in expression of all HOXC genes in breast tumours with MIR196A is in stark contrast to expression in normal tissues. Several instances have been described regarding the influence of multiple distal enhancers on gene expression, such as the well characterised locus-control-region (LCR) of the Beta-globin genes or the c-Myc enhancers active across multiple cancer types.45,58,59,60 Given the extensive interactions between this locus and its adjacent gene desert, we hypothesise that a concerted effort of multiple enhancers is responsible for the overexpression of these genes in cancer, possibly driven by extensive binding and activity of ERα. To explore this hypothesis, a high-resolution chromatin interaction analysis of this region in breast cancer cells would be required, such as 5C61 or NG Capture-C,62 coupled with ERα ChIP-Seq and ChIA-PET.29 In addition, we see an increase in MIR196A2 expression in response to low-dose E2, suggesting a direct influence by ERα that is further increased at higher dosages of 10 nM in our qRT-PCR assays. It would be interesting to explore how low dosages of E2 influence distal enhancer elements and whether more oestrogen is required for enhancer-promoter activity.

Whilst this manuscript was in preparation, new data came to light that corroborate our conclusions. Jiang et al.63 demonstrate that the mature MIR196A transcript responds positively to oestrogen stimulation in MCF7 cells, and that this is mediated by upstream ERα binding. This binding peak falls within PP3. Whilst we show that PP3 is not able to increase expression in a luciferase reporter assay, the binding of ERα may be important for the activity of the HOXC10 and MIR196A2 promoters. In our data we see a time delay in the processing of the precursor MIR196A2 transcript into mature MIR196A, suggesting a second, post-transcriptional mechanism of regulation. Evidence suggests that the levels of mature miRNAs are more reliant on the microprocessor complex and that individual miRNAs vary significantly in their maturation and stability.64

Previous genetic association studies have demonstrated that the SNP rs11614913 within the precursor gene, MIR196A2, confers a reduced risk of breast cancer incidence.10,65,66 This SNP is found within the MIR196A-3p sequence of the MIR196A2 precursor gene. Hoffman and colleagues10 demonstrated that rs11614913 reduces microRNA maturation, thereby reducing expression of the mature miRNA. They also identified that an upstream CpG island is associated with reduced risk when hypermethylated. Here we show that this upstream CpG island lies within the transcriptionally active region of HOXC10 and MIR196A2, as observed through GRO-Seq. Interestingly, this CpG island is completely methylated in models of oestrogen deprivation and fulvestrant treatment, but not in tamoxifen-resistant cells. As DNA methylation is most commonly associated with repressed transcription,67 hypermethylation of this transcriptionally active region may severely impair expression. Given that various transcription factors strongly influence transcription in endocrine-resistant breast cancer, these data suggest that binding of ERα accompanied by cofactors may be needed to maintain low methylation levels and active transcription in breast cancer.68,69,70,71,72

Using hierarchical clustering of breast tumour RNA-Seq data, we observed two distinct expression patterns associated with MIR196A expression. Interestingly, DNA methylation at several sites within the HOXC locus negatively correlates with the expression of this miRNA, supporting the notion of DNA methylation as a repressive epigenetic modification in this context.67 We demonstrated that several transcription factors that bind to these differentially methylated regions strongly associate with MIR196A expression in breast cancer, even in those tumours that show a negative correlation with ERα expression. These transcription factors appear to influence the expression of one another and in some cases form protein-protein complexes. Further investigations should aim to fully elucidate the role of this network and its influence on MIR196A expression.

High expression of MIR196A is a biomarker of poor prognosis in ER+ tumours, especially in those patients resistant to therapy. Expression of MIR196A increases in response to tamoxifen and chemotherapeutic agents in oestrogen-responsive MCF7 cells. This increase in expression is associated with loss of DNA methylation within the promoter regions of the miRNA. In poor responders with ER+ tumours, HOXC enhancer elements appear to bind the ER more readily. These data raise the possibility that the pathway to therapy resistance in ER+ tumours involves the de-repression and over-activation of promoter and enhancer elements. This is commonly seen throughout cancer,56,73,74 with suggestions that enhancer disruption can revert cells to a non-terminally-differentiated state, a common hallmark of tumourigenesis. Because HOX genes are essential in embryonic development, they would be a valuable asset for any tumour cell to use to sustain a stem-cell-like state.75,76

Breast cancer incidence and the relative frequency of subtypes change after menopause.77,78 In women younger than 45, luminal breast tumours account for 33–44% of cases.79,80 This increases to 70–72% in women older than 65. In contrast, basal-like tumours are more common in younger women, suggesting a switch or evolution in the factors driving cancer following menopause, most likely related to the decline in oestrogen production. It is therefore interesting to note that higher expression of MIR196A associates with good outcome in pre-menopausal women with ER+ tumours, but with poor outcome in ER+ tumours following menopause. Given the strong involvement of HOX genes in development, we hypothesise that there is a change in the regulation and expression of these genes through and following menopause, which in turn impacts their contribution to the development of certain breast cancer subtypes.

MIR196A is a dynamically expressed miRNA in both normal mammary cells and breast tumours. This miRNA is a possible biomarker for the progression of breast tumours to therapy resistance. Future studies should aim to uncover the purpose of increased MIR196A expression and whether it is required for the development of resistance alone or in combination with other HOXC genes.