Genome-wide methylation analysis identifies non-coding RNA genes dysregulated in breast tumours that metastasise to the brain

Brain metastases comprise 40% of all metastatic tumours, and breast tumours are among the tumours that most commonly metastasise to the brain; however, the role that epigenetic gene dysregulation plays in this process is not well understood. We carried out 450 K methylation array analysis to investigate epigenetically dysregulated genes in breast to brain metastases (BBM) compared to normal breast tissues (BN) and primary breast tumours (BP). For this, we compared 450 K methylation data for BBM tumours prepared in our laboratory with data for BN and BP from The Cancer Genome Atlas. Experimental validation of our initially identified genes, in an independent cohort of BP and in BBM and their originating primary breast tumours, using Combined Bisulphite and Restriction Analysis (CoBRA) and Methylation Specific PCR identified three genes (RP11-713P17.4, MIR124-2, NUS1P3) that are hypermethylated and three genes (MIR3193, CTD-2023M8.1 and MTND6P4) that are hypomethylated in breast to brain metastases. In addition, methylation differences in candidate genes between BBM tumours and originating primary tumours show that dysregulation of DNA methylation occurs either at an early stage of tumour evolution (in the primary tumour) or at a later evolutionary stage (where the epigenetic change is only observed in the brain metastasis). The epigenetic changes identified could also be found when analysing tumour free circulating DNA (tfcDNA) in patients' serum taken during BBM biopsies. Epigenetic dysregulation of RP11-713P17.4, MIR3193 and MTND6P4 is an early event, suggesting a potential use for these genes as prognostic markers.

More than 90% of cancer related deaths are attributed to metastases and about 80% of breast cancer deaths occur from metastases 1 . Between 15 and 25% of breast tumours metastasise to the brain during the course of disease 2 and, despite aggressive treatment strategies with surgical resection, stereotactic radiosurgery and whole brain radiotherapy (WBRT), the prognosis of breast to brain metastasis (BBM) patients remains poor 3 . Due to the challenges associated with the treatment of brain tumours, it is crucial to find novel prognostic markers that will inform the clinical management of breast cancer patients.
The seed and soil theory of metastases indicates that metastasising tumour cells have affinity for specific organs 4,5 . Metastases arise from metastatic initiating cells, which intravasate from the primary sites and remain as undetectable disseminated tumour cells before evolving into clinically visible lesions in distant organs 6 . These dormant cells are often found as micrometastases in bone marrow and lymph nodes many years after primary tumour treatment [7][8][9] .
Primary breast tumour receptor status (Estrogen Receptor (ER), Progesterone Receptor (PR) and Human Epidermal Growth Factor Receptor 2 (HER2)) can be used as a prognostic marker for brain metastasis risk 10,11 and BBM occurs more frequently in patients with ER-, HER2 + or triple negative breast tumours (ER-/PR-/HER2-) 12 . Of the different tumour subtypes, triple negative breast tumours have the worst prognosis 13 and have faster rates of metastases compared to HER2 + patients 14,15 . However, following primary tumour treatment and a period of apparent dormancy (in some cases lasting > 10 years), ER + tumours frequently metastasise to the brain [16][17][18] . Therefore, identifying genomic alterations that occur uniquely in primary tumours that eventually metastasise to specific organs could help clinical management of the disease 18 .
We have previously used a candidate-gene approach to identify epigenetically dysregulated genes in BBM 19 . Following on from this study, we wished to carry out a non-selective, genome-wide analysis to identify novel epigenetic changes that are common in breast tumours that metastasise to the brain. To identify candidate genes that are differentially methylated between primary breast tumours and BBM, we carried out 450 K methylation array analysis of BBM tumours and compared the genome-wide methylation status of these tumours to that of primary breast tumours from The Cancer Genome Atlas (TCGA). We then determined the methylation status of candidate genes in BBM and their originating primary breast tumours. This analysis has led to the identification of DNA methylation alterations that commonly occur in primary breast tumours that eventually metastasise to the brain. We have also identified epigenetic changes that occur only after the tumour cells have disseminated from the primary breast tumours. In addition, as a first step towards the development of a non-invasive prognostic tool, we have validated the methylation status of the genes identified in patients' serum. The identified epigenetic alterations may be used as potential non-invasive markers and new therapeutic targets for BBM patients.

Materials and methods
Patients and samples. Thirty fresh-frozen metastatic brain tumours (BBM) that originated from primary breast tumours (BP) were provided by The Walton Research Tissue Bank (WRTB), Liverpool and Brain Tumour North West (BTNW) Tissue Bank, Preston; BBM tumours were labelled BBM1 to BBM30. Formalin fixed paraffin embedded (FFPE) originating primary tumours (BP) from individual patients corresponding to their brain metastases (matched pairs) were available for 11 of these tumours (BBM1, BBM2, BBM5, BBM7, BBM8, BBM10, BBM11, BBM12, BBM13, BBM14 and BBM15). These primary and BBM pairs were labelled by individual patient (patient 1, patient 2, patient 5, etc.). Patients' serum was available for BBM1, BBM2, BBM5, BBM6, BBM7, BBM8, BBM10, BBM11, BBM12 and BBM13. Serum was collected at the time of BBM surgery. Receptor status information is available for 9 of the 11 primary tumour pairs: six are ER + ve, one is triple negative, one is ER/PR-ve, and one is HER2 + ve (with unknown ER/PR status). Additional clinical information is available in our previous publication 19 . The time between primary tumour surgery and removal of the brain metastasis ranged from 2 to 10 years.
An independent cohort of 40 primary breast tumours (BP) analysed during this study were ductal carcinomas; their clinical characteristics were described previously 19 . Molecular characterization was available for 20 of these tumours, 15 of these are ER + ve and three are triple negative. No brain metastases were observed in any of these patients. Seventeen of these patients had been screened for metastasis for ≥ 5 years from the time of primary tumour surgery and nine had been screened for ≥ 10 years.
The research ethics committee (North Wales REC: 11/WNo03/2) approved tissues from the research banks and informed consent was obtained from each patient. The project was carried out following local ethical approval (University of Wolverhampton Life Sciences Ethics Committee: LSEC/201,011/43). This study was conducted according to the principles expressed in the Declaration of Helsinki.
Genomic DNA/RNA extraction. Genomic DNA was extracted from fresh-frozen BBM tumours using the DNA isolation kit for cells and tissues (Roche, Germany) as previously described 19 . Briefly, 25 mg of tissue was homogenised using lysis buffer and incubated at 37 °C for 30 min, followed by the addition of Proteinase K and RNase solution. The samples were then centrifuged and processed according to the manufacturer's instructions. For FFPE samples, an FFPE DNA extraction kit (Qiagen, USA) was used as previously described 19 . Briefly, a small block of paraffin-embedded sample was cut into thin sections and mixed with xylene followed by 100% ethanol. The samples were then processed according to the manufacturer's instructions. The tumour-free circulating DNA from the patients' serum was extracted using the ZR serum DNA kit (Zymo Research, USA). Briefly, 2 ml plasma from each patient was transferred to a conical 50 ml universal tube. 8 ml of genomic lysis buffer and 10 μl of ZymoBeads were added to each sample, and the samples were placed in a shaker for two hours at room temperature. The samples were then processed further according to the manufacturer's instructions. Similarly, total RNA was extracted using the EZ-RNA extraction kit (Biological Industries, Israel). Briefly, fresh-frozen tumours were homogenised using lysis buffer followed by addition of extraction solution. The samples were then centrifuged and processed according to the manufacturer's instructions. The concentration of DNA and RNA was measured using a NanoDrop 2000 (Thermo Scientific, USA).

Illumina BeadChip 450 K HumanMethylation array. Of [...] Cambridge Genomic Services (CGS), UK. Chip processing was carried out based on the 450 K array design according to the manufacturer's instructions. Signal intensities generated by Illumina GenomeStudio were converted to β-values and BeadStudio software was used to remove biases between the Infinium I and II probes.
In order to remove the technical biases between the 450 K array data downloaded from the TCGA (14 normal breast tissues and 14 primary breast tumours) and our array data on 23 BBM, further normalisation was carried out using the R statistical package 20 . Processing of array data was performed with the R package RnBeads (version 0.99.10) and data were normalised using the SWAN normalisation option. The hg19 genome annotations were used during the analysis and sex chromosome data were excluded 21,22 . Data in figures and tables are derived from gene-gene comparisons using standard analysis options in the package. The tumour barcodes for the tumour data downloaded from TCGA are given in Supplementary Fig. 1.
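As an illustration of the β-value conversion mentioned for the array data, the sketch below uses the conventional Infinium definition, β = M / (M + U + offset), where M and U are the methylated and unmethylated signal intensities. The offset of 100 is the commonly used default and is an assumption here; the text does not state the exact settings used in GenomeStudio, and the function name is hypothetical.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Conventional Infinium beta-value: M / (M + U + offset).

    The offset (assumed to be 100 here) stabilises the ratio when both
    intensities are low. Negative intensities are clamped to zero.
    """
    m = max(float(methylated), 0.0)
    u = max(float(unmethylated), 0.0)
    return m / (m + u + offset)


# Illustrative intensities only, not values from the study.
fully_methylated_like = beta_value(9900, 0)   # close to 1
unmethylated_like = beta_value(0, 9900)       # exactly 0
```

A β-value near 0 indicates an unmethylated CpG and a value near 1 a fully methylated CpG, which is the scale the thresholds in the screening step refer to.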
Initial screening of candidate gene signatures hypermethylated or hypomethylated in brain metastases compared to primary breast tumours and normal breast tissues. In order to generate an initial candidate list of genes that are either hypomethylated or hypermethylated in BBM, we compared CpG methylation between normal breast tissues (BN), primary breast tumour tissues (BP) and breast to brain metastases (BBM) samples. For this, we downloaded 450 K methylation array data from The Cancer Genome Atlas (TCGA) for 14 normal breast tissues and 14 primary breast tumours (Supplementary Fig. 1A, B). Our analysis aimed to identify individual CpGs that gained methylation (hypermethylated) in BBM compared to primary breast tumour and normal breast tissue samples (see Supplementary Fig. 1C for an example). We also looked for individual CpGs that had lost methylation (hypomethylated) in BBM compared to primary breast tumours and normal breast tissues (see Supplementary Fig. 1D for an example). In order to generate an initial list of CpGs that have gained methylation in BBM, we retrieved individual CpGs (array probes) with a β-value of ≥ 0.4 in at least 50% of the BBM tumours and < 0.4 in at least 50% of the primary tumours and normal tissues; previous studies have shown that β-values of ≥ 0.4 are associated with silencing of genes or significant loss of expression [23][24][25][26] . From this list of methylated CpGs we retained only those CpGs that had a difference in β-value of ≥ 0.15 between the metastatic and primary tumour sets. Similarly, in order to identify candidate hypomethylated loci in BBM, we selected CpG probes that had β-values of < 0.4 in BBM and β-values of ≥ 0.4 in normal breast tissues and primary tumour tissues in at least 50% of each sample set, and then retained the CpG loci that had an average β-value difference of ≥ 0.15 between BBM and both normal breast tissues and primary breast tumours.
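The screening thresholds described above can be sketched for a single CpG probe as follows. This is a minimal illustration, not the authors' pipeline: it assumes that the "difference in β-value" is a difference of group means, it omits the subsequent P ≤ 0.05 filter, and the function names and example β-values are hypothetical.

```python
from statistics import mean


def fraction_at_or_above(values, cutoff):
    """Fraction of samples whose beta-value is at or above the cutoff."""
    return sum(v >= cutoff for v in values) / len(values)


def classify_probe(bbm, bp, bn, cutoff=0.4, min_fraction=0.5, min_delta=0.15):
    """Return 'hyper', 'hypo' or None for one CpG probe.

    bbm, bp, bn: lists of beta-values for the metastasis, primary tumour
    and normal tissue sample sets.
    """
    bbm_high = fraction_at_or_above(bbm, cutoff)
    bp_high = fraction_at_or_above(bp, cutoff)
    bn_high = fraction_at_or_above(bn, cutoff)

    # Hypermethylated in BBM: methylated in >= 50% of BBM, unmethylated in
    # >= 50% of BP and of BN, and mean BBM exceeds mean BP by >= 0.15.
    if (bbm_high >= min_fraction
            and 1 - bp_high >= min_fraction
            and 1 - bn_high >= min_fraction
            and mean(bbm) - mean(bp) >= min_delta):
        return "hyper"

    # Hypomethylated in BBM: the mirror image, with the mean difference
    # required against both BP and BN.
    if (1 - bbm_high >= min_fraction
            and bp_high >= min_fraction
            and bn_high >= min_fraction
            and mean(bp) - mean(bbm) >= min_delta
            and mean(bn) - mean(bbm) >= min_delta):
        return "hypo"

    return None


# Hypothetical beta-values for three probes.
gained = classify_probe([0.8, 0.7, 0.6, 0.2], [0.1, 0.2, 0.3, 0.5], [0.1, 0.1, 0.2, 0.1])
lost = classify_probe([0.1, 0.2, 0.3, 0.1], [0.6, 0.7, 0.5, 0.3], [0.5, 0.6, 0.7, 0.8])
unchanged = classify_probe([0.3, 0.3], [0.3, 0.3], [0.3, 0.3])
```

Applying the fraction and difference criteria together keeps only probes that are both consistently and substantially different between sample sets, which is the intent of the two-step filter in the text.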
The CpG loci that met these criteria and were significantly differentially methylated (P ≤ 0.05) across the different tissue types (Supplementary Table 1A) were carried forward for further analysis.
Experimental validation of methylation status of individual genes. The methylation status of each gene corresponding to differentially methylated probes was determined by Combined Bisulphite and Restriction Analysis (CoBRA). Semi-nested PCR was carried out and the DNA methylation status of candidate genes was determined by digesting CoBRA PCR products with BstUI or TaqI restriction enzymes (Fermentas, UK). CoBRA primers were designed based on the standard primer design criteria used in analysing bisulphite converted DNA [27][28][29] . The CoBRA primers were designed in such a way that the region analysed included the specific CpG identified by the 450 K array and additional local CpGs, to enable the reliable determination of methylation status by restriction digestion (Supplementary Table 2A). The methylation status of genes in patients' serum was determined using Methylation Specific PCR (Supplementary Table 2B).

Experimental validation of expression of selected genes. Expression analysis of candidate genes was carried out using quantitative reverse transcription PCR (qRT PCR). Due to the limited size of the tumour biopsies available, RNA was available only for BBM13, BBM15, BBM16, BBM20 and BBM30. Prior to the PCR, cDNA was prepared using the QuantiTect reverse transcription kit (QIAGEN, USA). The β-actin gene was used as an internal control for mRNA expression. The Ct values obtained were converted to the relative quantity of target genes, normalised with respect to the internal control and relative to a control sample. Due to the lack of normal tissue control samples (associated normal breast tissue was available only as FFPE), a BBM sample with the median level of expression for each qRT PCR among the sample set was used as the control sample to calculate the fold enrichment of the other samples.
Statistical analysis. Initial statistical tests to generate candidate CpG loci were carried out using the statistical package available in R 21 . During the experimental validation, only the samples where methylation differences were statistically significant between primary breast tumours and BBM were taken further. Fisher's exact test was used to determine the statistical significance of methylation differences between BP and BBM samples. P ≤ 0.05 was considered statistically significant. [...] 31 . This tool segregates patients into high- or low-expressing groups, using the median or a lower/upper quartile as a cut-off, to measure the statistical significance of the association between gene expression and patient survival. Logrank P < 0.05 was considered statistically significant.
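To make the Fisher's exact test comparison concrete, the sketch below computes a two-sided p-value for a 2 × 2 table of methylated versus unmethylated counts in BP and BBM, using only the Python standard library. It follows the usual definition (summing the probabilities of all tables that are no more likely than the observed one) and is not necessarily the implementation the authors used. The example counts are the CTD-2023M8.1 frequencies quoted in the Results (12/19 BP and 7/27 BBM methylated); the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows are the two tumour groups; columns are methylated / unmethylated.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated samples in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so that ties with the observed table count.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# BP: 12 methylated, 7 unmethylated; BBM: 7 methylated, 20 unmethylated.
p_ctd = fisher_exact_two_sided(12, 7, 7, 20)
is_significant = p_ctd <= 0.05
```

The same function can be applied to the counts for any other candidate gene, with the P ≤ 0.05 threshold from the text as the decision rule.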

Results
Identification of differentially methylated genes in BBM. We identified differentially methylated CpGs between breast-to-brain metastases (BBM) and non-metastatic primary breast tumours (BP) and normal breast tissue (BN) by comparing Illumina BeadChip 450 K methylation data that we generated for 23 BBM to data for normal breast tissues and primary tumours from TCGA. First, an initial screening of genes based on β-value (see Materials and methods for details) was carried out to identify CpG loci that were either hypermethylated (Fig. 1A,B) or hypomethylated in BBM (Fig. 1C,D). In addition, we examined the methylation level of all CpG loci on the 450 K array that are associated with our candidate genes, which we refer to as the regional methylation level of each gene. The 450 K array design provides annotations for each CpG, including its genomic location. We retrieved all CpGs corresponding to our candidate genes and outlined, individually for all samples, where each CpG site is located in relation to the gene structure: TSS1500 (1500 nucleotides upstream of the transcription start site), TSS200 (200 nucleotides upstream of the transcription start site), 1st Exon, or body regions (downstream of the TSS or the first exon) (Fig. 1E, Supplementary Fig. 3 and Supplementary Fig. 9; left panel). We determined the average methylation level of each CpG in each sample type (BN, BP and BBM) in the individual structural regions of each gene (Fig. 1E, Supplementary Fig. 3 and Supplementary Fig. 9; right panel). Regional methylation maps have been constructed only for those candidate genes where there is sufficient 450 K probe density to generate informative figures.
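The regional methylation summary described above amounts to grouping a gene's probes by their annotated region and averaging β-values within each sample type. The sketch below shows one way this could be done; the probe identifiers, region labels assigned to them and β-values are hypothetical, and averaging per probe first and then per region is an assumption about the order of aggregation.

```python
from collections import defaultdict
from statistics import mean


def regional_means(probe_regions, betas_by_group):
    """Average beta-value per gene region for each sample type.

    probe_regions: {probe_id: region}, e.g. 'TSS1500', 'TSS200', '1stExon', 'Body'.
    betas_by_group: {group: {probe_id: [beta, ...]}} with groups such as BN, BP, BBM.
    """
    summary = {}
    for group, probe_betas in betas_by_group.items():
        per_region = defaultdict(list)
        for probe_id, region in probe_regions.items():
            values = probe_betas.get(probe_id)
            if values:  # skip probes with no data in this group
                per_region[region].append(mean(values))
        summary[group] = {region: mean(vals) for region, vals in per_region.items()}
    return summary


# Hypothetical probes and beta-values for one gene.
probe_regions = {"cg_a": "TSS200", "cg_b": "TSS200", "cg_c": "Body"}
betas_by_group = {
    "BN": {"cg_a": [0.1, 0.1], "cg_b": [0.2, 0.2], "cg_c": [0.8, 0.6]},
    "BBM": {"cg_a": [0.6, 0.8], "cg_b": [0.5, 0.7], "cg_c": [0.7, 0.7]},
}
summary = regional_means(probe_regions, betas_by_group)
```

Comparing `summary["BBM"]["TSS200"]` with `summary["BN"]["TSS200"]` gives the kind of promoter-region contrast that the regional methylation maps are intended to show.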

RP11-713P17.4, MIR124-2 and NUS1P3 are hypermethylated in brain metastases compared to primary breast tumours. All candidate genes were validated for their methylation status using Combined Bisulphite and Restriction Analysis (CoBRA), a robust and reliable method that is not prone to false positive results 32 . Primers were designed to amplify the probe regions identified in the 450 K arrays using standard criteria published previously 19,23 .
First, we validated the six shortlisted candidate hypermethylated genes (see Supplementary Table 1B) by CoBRA in 30 BBM, which included the 23 BBM samples used in the methylation array. Then, using the CoBRA method, the methylation level of these genes was examined in an independent cohort of 20 primary breast tumours (BP) with no evidence of metastatic progression. Comparison of the results from these two tumour groups allowed us to determine whether methylation of these genes is more commonly found in brain metastases than in BP. Of the six candidate genes, two were infrequently methylated in BP tumours: RP11-713P17.4 and NUS1P3 (Fig. 2A,D and Supplementary Fig. 4B, S4C, S10A-C). These genes were found to be methylated in 10% and 7% of primary tumours but in 73% and 55% of BBM, respectively, while MIR124-2 was methylated in 55% of the primary tumours and 88% of BBM (Fig. 2A-D, Supplementary Fig. 4A). This suggests that RP11-713P17.4 and NUS1P3 are frequently methylated in BBM but not in BP, and that MIR124-2 methylation frequency is enriched in BBM compared to primary breast tumours (Fig. 2A-D and Supplementary Fig. 4A-C, Supplementary Table 1C). The methylation status of genes validated by CoBRA that were not differentially methylated is presented in Supplementary Fig. 6.
MIR3193, CTD-2023M8.1 and MTND6P4 are hypomethylated in brain metastases compared to primary breast tumours. Our methylation array analysis identified a panel of nine candidate genes that were hypomethylated in BBM compared to BP and BN (see Supplementary Table 1B). As above, these genes were validated using CoBRA in 30 BBM, which included the 23 BBM samples used in the methylation array, and in an independent cohort of 20 primary breast tumours. This CoBRA analysis identified three genes that were frequently methylated in BP and infrequently methylated in BBM tumours: CTD-2023M8.1, MIR3193 and MTND6P4 were methylated in 26%, 29% and 0% of BBM samples respectively (n = 30) (Fig. 2B-D and Supplementary Fig. 4D-F, S11A-C) and in 63%, 67% and 47% of non-metastatic BP respectively (n = 20) (Fig. 2B-D, Supplementary Fig. 4D-F).

Methylation status of candidate genes in BBM samples and their originating primary breast tumours.
To determine if the common BBM-associated methylation events are also detectable in the primary tumours that the metastases are derived from, the methylation status of these genes was determined in BP and associated secondary BBM tumours from 11 individuals. Primary tumour material was available in the form of FFPE sections. We could bisulphite convert, amplify and analyse the promoter regions for RP11-713P17.4, MIR3193, MTND6P4 and CTD-2023M8.1 in these corresponding primary tumours. However, we were unable to amplify some regions in all 11 samples, and the MIR124-2 and NUS1P3 promoter regions were refractory to amplification in all FFPE primary samples.
MTND6P4 and MIR3193 are frequently methylated in primary breast tumours (9/19, 16/19; see Fig. 2). However, we found that these regions are commonly unmethylated in metastasis-originating primary tumours (0/5, 3/9) and their corresponding BBM tumours (Fig. 3A; left panel, 3C and Supplementary Fig. 5A,C). Similarly, RP11-713P17.4 is infrequently methylated in non-metastatic primary breast tumours (2/19; see Fig. 2). However, in the primary tumours that proceed to metastasise to the brain this region is found to be methylated (3/6) (Fig. 3A; right panel, 3C, Supplementary Fig. 5B). These results suggest that the differential methylation we observed between unrelated primary tumours and BBM may be a result of differences that occur early during the development of the tumours that metastasise, as they are common to the originating primary tumour and the associated metastatic tumour. From here on we will refer to these as early events (in metastatic tumour evolution).
CTD-2023M8.1 is frequently methylated in primary tumours with no history of distant metastasis (12/19) but infrequently methylated in BBM (7/27) (see Fig. 2). CTD-2023M8.1 was also found to be methylated in metastasis-originating primary tumours. However, CTD-2023M8.1 was not methylated in the corresponding BBM. This suggests that this genomic change (the loss of methylation) was selected for after the metastasising tumour cells had left the primary tumour (Fig. 3B,C, Supplementary Fig. 5D). From here on we will refer to this as a late event (in metastatic tumour evolution).
In addition, we carried out quantitative reverse transcription PCR (qRT PCR) to determine the expression status of these genes in the same cohort of BBM tumours that was used for 450 K methylation and experimental validation of methylation status (Fig. 3D, Supplementary Fig. 7). The expression was normalised against β-actin. As there was no RNA available from the primary tumours, the fold change was determined relative to the median ΔCt value for each transcript analysed. We found that the genes that have promoter methylation have relatively low RNA levels (relative expression < 1) in those tumours. Similarly, the genes that are unmethylated are expressed in those tumours (relative expression > 1). We also found that some of the samples/genes that are not methylated have low-level expression, which could be attributed to genomic changes other than DNA methylation (Fig. 3D, Supplementary Fig. 7).
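The relative expression calculation described above can be sketched with the standard 2^(-ΔΔCt) approach, using the median ΔCt across samples as the calibrator. This assumes roughly 100% amplification efficiency for both target and reference, which the text does not state; the function name and Ct values are hypothetical.

```python
from statistics import median


def relative_expression(ct_target, ct_reference):
    """Relative expression per sample, calibrated to the median delta-Ct.

    ct_target, ct_reference: {sample: Ct} for the gene of interest and the
    internal control (beta-actin in the text).
    Returns {sample: 2 ** -(delta_ct - median_delta_ct)}.
    """
    delta_ct = {s: ct_target[s] - ct_reference[s] for s in ct_target}
    calibrator = median(delta_ct.values())
    return {s: 2.0 ** -(d - calibrator) for s, d in delta_ct.items()}


# Hypothetical Ct values for three samples.
target = {"S1": 25.0, "S2": 27.0, "S3": 29.0}
actin = {"S1": 20.0, "S2": 20.0, "S3": 20.0}
fold = relative_expression(target, actin)
```

With this calibrator the median sample has a relative expression of 1, so values below 1 and above 1 correspond to the "relatively low" and "expressed" categories used in the text.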

Examination of methylation status of candidate genes in tumour free circulating (tfc) DNA.
In order to investigate if the methylation status of candidate genes in BBM is similar to their methylation status in circulating DNA isolated from patients' serum (at the time of metastasis surgery), we determined the methylation status of MIR124-2, CTD-2023M8.1, MIR3193 and CCDC8 in patients' serum by methylation-specific PCR (MSP); CCDC8 had previously been identified as methylated in BBM in our earlier study 19 . Serum was available from those patients whose originating primary tumours and BBM were also available. The methylation status of tfc DNA isolated from serum at MIR124-2, CTD-2023M8.1, CCDC8 and MIR3193 was the same as that seen in BBM in 100%, 100%, 83% and 50% of the samples respectively (Fig. 4A-E, Supplementary Figs. 8, 12). It is important to note that, while we believe this analysis is useful in showing that these epigenetic marks can be identified in tfc DNA, it is limited because blood samples matched to the retrospectively collected primary breast tumours were not available.
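The percentage agreement between serum and tumour methylation calls reported above is a simple concordance over matched samples. A minimal sketch, with hypothetical sample labels and calls, could look like this:

```python
def concordance(tumour_calls, serum_calls):
    """Fraction of matched samples with the same methylation call.

    tumour_calls, serum_calls: {sample: True/False} where True means methylated.
    Only samples present in both dictionaries are compared.
    """
    shared = [s for s in tumour_calls if s in serum_calls]
    if not shared:
        raise ValueError("no matched samples to compare")
    agreeing = sum(tumour_calls[s] == serum_calls[s] for s in shared)
    return agreeing / len(shared)


# Hypothetical calls for one gene in six matched tumour/serum pairs.
tumour = {"P1": True, "P2": True, "P3": False, "P4": True, "P5": True, "P6": False}
serum = {"P1": True, "P2": True, "P3": False, "P4": False, "P5": True, "P6": False}
agreement = concordance(tumour, serum)
```

Restricting the comparison to samples with both a tumour and a serum call avoids counting missing data as disagreement.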

Expression status of MIR124-2 gene correlates to the clinical prognosis of patients. We wished to carry out survival analyses of breast cancer patients to investigate the correlation of expression of candidate genes with clinical prognosis using KM-plotter tools and data from the Gene Expression Omnibus (GEO) 31 . However, prognosis data were not available for MIR3193 and NUS1P3 or the long non-coding RNAs that we have identified. Data were available for MIR124-2 and, in addition, we also performed KM analysis for three genes that we have previously found to be frequently methylated in BBM (BNC1, CCDC8 and GALNT9) 19 .
Kaplan-Meier analyses showed that low expression of MIR124-2 (which we have shown here to be frequently hypermethylated in BBM) correlates to poorer clinical outcomes (Fig. 5). We found that high expression of MIR124-2 correlates to better relapse-free survival (RFS) of breast cancer patients (p = 0.00004) (Fig. 5A). Furthermore, we carried out combined prognosis analysis of the four candidate genes (BNC1, CCDC8, GALNT9, MIR124-2), three of which (BNC1, CCDC8, GALNT9) were reported to have metastatic suppressive functions in our previous study 19 . This combined analysis shows that higher combined expression of these four genes correlated to better RFS (P = 0.0046) (Fig. 5B). In addition, we carried out survival analyses of ER + and ER- patients separately to investigate if expression of our candidate genes is associated with ER expression in breast cancer patients; the data indicate that the association of MIR124 expression with prognosis is independent of estrogen receptor (ER + or ER-) status in breast cancer patients (Fig. 5C).
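For readers unfamiliar with the analysis behind these survival curves, the sketch below shows the two underlying steps in a simplified form: splitting patients into high and low expression groups at the median, and computing a Kaplan-Meier survival estimate for a group. It is not the KM-plotter implementation, it omits the log-rank test, and all names and values are hypothetical.

```python
from statistics import median


def split_by_median(expression):
    """Split samples into (high, low) groups at the median expression value."""
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    times: follow-up time per patient; events: 1 for relapse, 0 for censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    index = 0
    while index < len(data):
        t = data[index][0]
        at_this_time = [e for tt, e in data if tt == t]
        relapses = sum(at_this_time)
        if relapses:
            survival *= 1 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= len(at_this_time)   # relapsed and censored both leave the risk set
        index += len(at_this_time)
    return curve


# Hypothetical follow-up data (time in months; 1 = relapse, 0 = censored).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
high, low = split_by_median({"P1": 2.0, "P2": 5.0, "P3": 9.0, "P4": 1.0})
```

In a full analysis, one curve would be computed for each expression group and the two curves compared with a log-rank test, using the P < 0.05 threshold given in the Methods.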

Discussion
The aim of this study was to identify genes that are frequently epigenetically dysregulated in breast to brain metastases (BBM) using a genome-wide approach. [...] 35 . There is growing evidence that non-coding RNAs could be a class of novel biomarkers or therapeutic targets in multiple cancers 36,37 , and previous studies have reported that non-coding RNAs are epigenetically dysregulated in cancer 38,39 . This study has identified non-coding RNAs that are dysregulated by DNA methylation and that could, potentially, be used as DNA methylation prognostic markers in BBM.
The methylation level of the non-protein coding genes MIR124-2, NUS1P3 and RP11-713P17.4 was enriched in BBM tumours compared to primary breast tumours. Previous studies have reported that MIR124 is associated with inhibition of invasion and metastases of breast and lung cancers 40 , oral squamous cell carcinoma 41 and pancreatic adenocarcinoma 42 . MIR124 is epigenetically dysregulated in hepatocellular carcinoma (HCC) cell lines 43 and its silencing is also associated with poor clinical prognosis of colorectal carcinoma 44 . Recently, the methylation status of MIR124-2 has been proposed as a prognostic marker associated with cervical cancer in human papillomavirus (HPV) positive women 45,46 . MIR124-2 is one of three independent precursor genes (MIR124-1, MIR124-2 and MIR124-3) that are processed to form MIR124. Interestingly, the other precursors MIR124-1 and MIR124-3 are also methylated in BBM compared to BP and BN (Supplementary Fig. 3), suggesting that MIR124 is dysregulated in BBM. Recent studies have reported that MIR124-2 suppresses proliferation, [...] 47,48 . MIR124 is abundantly expressed in the nervous system, where it contributes to regulation of alternative splicing and plays a crucial role in the differentiation of progenitor neuronal cells 49,50 . MIR124 also contributes to glial cell quiescence and is involved in repression of migration and invasion of various cancers through its targets 49 . NUS1P3 is a processed pseudogene of its parental gene NUS1 (Dehydrodolichyl Diphosphate Synthase Subunit). NUS1, also known as NOGO-B receptor, is expressed in most tissues 44,51,52 . NUS1 down-regulates epithelial markers such as E-cadherin and increases mesenchymal markers, contributing to EMT in cervical cancer and promoting invasion and metastasis 53 . In addition, NUS1 dysregulation is associated with various cancer types 54 including ER/PR/HER2 positive breast tumours 55 . There is growing evidence that pseudogenes are dysregulated in cancer and that this dysregulation may modulate their interaction with either parental genes or other gene loci regulating transcriptional and post-transcriptional activities 56,57 . In our study, NUS1P3 is unmethylated in primary tumours, suggesting that it is expressed, which may lead to increased expression of NUS1. Furthermore, increased expression of NUS1 could promote Epithelial to Mesenchymal Transition (EMT) in breast cancer, contributing to invasion and metastases to the brain. It is possible that NUS1P3 acts as a competitive endogenous RNA (ceRNA) for NUS1 and that the silencing of NUS1P3 in metastases found in the brain leads to down-regulation of NUS1, concordant with Mesenchymal to Epithelial Transition (MET). These observations are consistent with our finding that silencing of MIR124-2 and NUS1P3 through promoter methylation in BBM samples may provide a selective advantage for metastasised tumours to survive and to proliferate in the brain microenvironment.
Figure 5 (caption fragment). [...] shows that loss of expression of these genes in combination correlates to poor RFS of breast cancer patients (p = 0.0046). (C) Low expression of MIR124-2 is associated with poor prognosis of ER + and ER- breast cancer patients (p = 0.007 and 0.017 respectively).
RP11-713P17.4 is a long intergenic non-coding RNA (lincRNA) gene. LincRNAs are gene-associated transcripts that are associated with open chromatin marks such as histone modification sites and with epigenetic regulation of transcription, RNA stability, and recruitment of protein complexes [58][59][60] . LincRNAs are associated with crucial biological functions such as cellular growth and differentiation, development, and apoptosis 59,60 . There is growing evidence of epigenetic regulation of lincRNAs in cancers 61,62 , and further studies are required to investigate the mechanistic role of RP11-713P17.4 in BBM and cancer metastases.
Three non-protein coding genes MIR3193, MTND6P4 and CTD-2023M8.1 are hypomethylated in BBM compared to primary tumours and normal breast tissues. The microRNA, MIR3193 was one of 209 novel micro RNAs identified by deep sequencing melanomas 63 . Furthermore, MIR3193 is frequently upregulated in glioma compared to normal brain tissue 64 .
MTND6P4 is a processed pseudogene of its parental gene MTND6 (ND6), which codes for the protein Mitochondrially Encoded NADH Dehydrogenase 6 (MTND6). MTND6 provides a quinone binding site and is one of the subunits (ND1-ND6) of complex I in the electron transport system (ETS) in mitochondria. Mutations in MTND6 are associated with an increase in metastatic potential that was associated with low NADH and high reactive oxygen species (ROS) in lung and breast cancer cell lines 65,66 and hepatocellular carcinoma 67 . Mutation in one of the subunits of the ETS leads to low oxidative phosphorylation and increased glycolytic activity, contributing to the aggressiveness of childhood Acute Lymphoblastic Leukemia 68 . It is possible that epigenetic dysregulation of MTND6P4 may contribute to an energy shift towards glycolysis, leading to acidosis with microenvironment changes that provide powerful growth advantages and invasive potential to the tumour cell 69 . A principal emerging role of pseudogenes is to act as competitive endogenous RNAs (ceRNA), a sponge for molecules that interact with mRNA (such as miRNA), thus positively influencing the expression of their parental gene 70 . MTND6P4 may regulate its parental gene MTND6, and hypomethylation and subsequent overexpression of MTND6P4 may lead to changes in oxidative phosphorylation activities, glycolysis and the brain microenvironment, contributing to growth advantages for tumours via upregulation of MTND6.
In our study, MIR3193, MTND6P4 and CTD-2023M8.1 were frequently methylated in a cohort of non-metastatic primary breast tumours and infrequently methylated in BBM. The two genes MIR3193 and MTND6P4 were commonly unmethylated in BBM and their originating primary breast tumours in individual patients, suggesting that the demethylation/hypomethylation of MIR3193 and MTND6P4 is an early event during tumour evolution. It further suggests that MIR3193 and MTND6P4 have metastasis-promoting functions and are silenced in normal breast tissues and primary breast tumours by methylation. Similarly, CTD-2023M8.1 is frequently methylated in non-metastatic primary breast tumours. It is also frequently methylated in metastasis-originating primary tumours. However, it is frequently unmethylated in BBM. This suggests that the hypomethylation of CTD-2023M8.1 in BBM is a late event that may occur only after the tumour cells have left the primary site.
RP11-713P17.4 is infrequently methylated in primary breast tumours and normal breast tissues but frequently methylated in BBM and their originating primary tumours in individual patients suggesting that the promoter hypermethylation of RP11-713P17.4 is an early event during BBM. The novel non-protein coding genes identified in this study may regulate invasion and metastasis directly or by regulating other protein coding genes. However, their regulatory functions and their target genes have not been reported before. Functional studies are required to determine the role of these genes in BBM.
In addition, our study shows that patients' serum could be useful to detect the methylation status of tumour associated circulating DNA in BBM, suggesting a potential method of prognostic analysis. For this, a panel of genes could possibly be developed as prognostic markers for BBM. However, in this study, patients' serum taken at the time of primary tumour diagnosis was not available to determine if the methylation status of these genes could be used as non-invasive prognostic markers.
Taken together, our study has identified a panel of six novel non-protein coding genes (miRNAs, pseudogenes and long intergenic/non-coding RNAs), of which RP11-713P17.4, NUS1P3, MIR3193, MTND6P4 and CTD-2023M8.1 are reported for the first time as epigenetically dysregulated genes in cancer and in metastases. MIR124-2 has previously been reported as an epigenetically dysregulated gene in cancer [43][44][45][46][71][72][73][74] ; however, its role in breast cancer metastases has not been reported previously. The non-coding RNA genes 75 are part of a broad epigenetic network whose members are emerging as critical regulators in human diseases and cancers 76,77 . The genes