Introduction

Colorectal cancer (CRC) represents the third most frequently diagnosed cancer, and the fourth prime cause of cancer-related mortality worldwide1,2. Although the etiology of colorectal cancer is multifactorial, genetic and epigenetic alterations of proto-oncogenes and tumor suppressor genes remain the fundamental mechanism of tumorigenesis3. Increasing interest in non-coding genomic sequences has revealed the recent implication of several classes of non-coding RNAs (ncRNAs), including microRNAs (miRNAs) as a type of posttranscriptional regulators in carcinogenesis4. This group of ncRNAs could exhibit differential expression in many types of cancer, including CRC5, and their dysregulation could promote every stage of carcinogenesis, including cell proliferation, invasion, and metastasis, and confer resistance to apoptosis through interaction with several intracellular signaling networks6,7,8. The thermodynamics of miRNA-mRNA target interactions may be influenced by single nucleotide polymorphisms (SNPs) occurring in the mature sequence of miRNA, resulting in target gene dysregulation with consequent phenotype variations and/or cancer susceptibility9,10.

MiR-34a is one of the emerging microRNAs that are implicated in many cancers, including CRC11,12,13,14. It has been shown that the expression of miR-34a is reduced in primary CRC tissues15. Moreover, CpG methylation of the miR-34a gene (MIR34A) promoter is detected in some colon cancer cell lines16, and its expression could be induced upon p53 activation17. Also, miR-34a could lower cell cycle progression through p53-dependent induction of p21 to alter colon cancer cell proliferation through direct or indirect regulation of the E2F transcription factor family15. Contrary to evidence on the pro-apoptotic role of miR-34a, however, also exists in the literature. It has been demonstrated that miR-34a may cooperate with p21 and 14-3-3σ to override the apoptotic signals generated by p53 activation16. As a controversy of miR-34a role in CRC still exists and also as sequence variations in the miRNA-binding sites could affect either the expression level and/or the oncogenic or tumor-suppressing functions of cancer-associated miRNAs6,9,10, the current study aims to analyze MIR34A expression and rs2666433 (A/G) variant in preliminary samples of archived CRC tissue specimens in comparison to non-cancer tissues and correlate the results to the available clinicopathological data. This could help improve our understanding of the impact of such type of miRNAs in CRC and its potential role as a candidate for the future molecular-based individualized therapy of such lethal cancer.

Results

In silico analysis of miR-34a

Our bioinformatics analysis identified miR-34a-5p to be the most highly significant non-coding microRNA enriched in the colorectal cancer pathway (Fig. 1). It can complement and bind 30 target genes and fine-tuning their expression profile (Fig. 2).

Figure 1
figure 1

Top microRNAs involved in the colorectal cancer pathway. (A) microRNAs involved in the colorectal cancer KEGG pathway (https://www.kegg.jp/kegg/kegg1.html)18,19. (B) Functional enrichment pathway analysis of miR-34a target genes. [Data source: Diana lab tools]. The top bar indicates the log (P-value) of the implication of the specified pathway in colon cancer. The direction towards red color indicates more significance as the log P < 0.05 equivalent to P < -1.30.

Figure 2
figure 2

MicroRNA-34a targets the colorectal cancer pathway. Genes targeted by miRNA34a are colored in orange. In the CRC pathway (KEGG: hsa05210)18,19, miR-34a-5p significantly targets 30 genes (P = 0.0013); including B-Raf proto-oncogene, serine/threonine kinase (BRAF), Ras-Related C3 Botulinum Toxin Substrate 2 (RAC2), phosphoinositide-3-kinase regulatory subunit 2 (PIK3R2), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA), RAF1, B-cell lymphoma-2 (BCL2), BCL2 associated agonist of cell death (BAD), Baculoviral IAP repeat containing 5 (BIRC5), transforming growth factor beta 1 (TGFB1), TGFB3, TGFB-receptor 2 (TGFBR2), TCF7L1 (Transcription Factor 7 Like 1), ARAF, Tumor Protein P53 (TP53), AKT serine/threonine kinase 2 (AKT2), Jun proto-oncogene, AP-1 transcription factor subunit (JUN), cyclin D1 (CCND1), mothers against DPP homolog 4 (SMAD4), catenin beta 1 (CTNNB1), MutS homolog 6 (MSH6), axis inhibition proteins 2 (AXIN2), MYC proto-oncogene, BHLH transcription factor (MYC), mitogen-activated protein kinase 1 (MAPK1), MAPK3, MAPK8, mitogen-activated protein kinase kinase 1 (MAP2K1), caspase-9 (CASP9), Ral guanine nucleotide dissociation stimulator (RALGDS), lymphoid enhancer binding factor 1 (LEF1), cytochrome C, somatic (CYCS).

Functional enrichment analysis of miR-34a

Both miR-34a-5p and miR-34a-3p were mostly significantly involved in two pathways: namely fatty acid biosynthesis (hsa00061) and fatty acid metabolism (hsa01212). miR-34a-5p was also identified to target specific cancer types; including colorectal cancer, thyroid cancer, non-small cell lung cancer, chronic myeloid leukemia, bladder cancer, pancreatic cancer, glioma, and melanoma, in addition to multiple cancer-related pathways as cell cycle, pathways in cancer, p53 signaling pathway, and proteoglycans in cancer.

In the CRC pathway (KEGG: hsa05210)18,19, miR-34a-5p significantly targets 30 genes (P = 0.0013); which are involved in all steps of colorectal development and progression (Fig. 2). These gene lists included apoptotic genes (BCL2, BAD, BIRC5, and CASP9), proliferative genes (CCND1, TGFB1, and TGFB3), tumor suppressor genes (TGFBR2, TP53, and SMAD4), DNA repair gene (MSH6), oncogene (CTNNB1), transcription factors (JUN, MYC, TCF7L1), and serine-threonine kinases (BRAF, RAF1, ARAF, AKT2, MAPK1, MAPK3, and MAPK8) (Fig. 2). Enrichment of miR-34a in hallmarks of cancer20 revealed to be involved in two main functions: namely resisting cell death (gray color, Fig. 3) and tumor invasion and metastasis (black color, Fig. 3).

Figure 3
figure 3

MicroRNA-34a is involved in the cancer hallmarks. The thirty target genes in the colorectal cancer pathway are functionally enriched in diverse cancer hallmarks. Data source: Cancer Hallmarks Analytics Tool (https://chat.lionproject.net/)20. Each color indicates a specific cancer hallmark, as indicated in the left colored legend.

Genotyping of MIR34A variant

Genotype frequencies of rs2666433 agreed with HWE in patients (P = 0.34) and controls (P = 0.13). MAF (A allele) accounted for 0.31 in controls. According to the 1000 Genome Project, the same allele frequencies were 0.43 in Africans, 0.31 in East Asians, 0.23 in South Asians, 0.17 in Americans, and 0.09 in Europeans.

Impact of genotypes on cancer risk

On the comparison between malignant and adjacent colon tissues, A allele was highly prevalent in cancer tissues representing a frequency of 0.56, P < 0.001. Correspondingly, AA and AG genotypes were predominant in cancer specimens (34.5% and 43.1%) compared to counterpart non-cancerous tissues (13.8% and 34.5%), respectively. AA and AG were 5.7 and 2.8 more likely to develop cancer than GG (AA versus GG: OR 5.76, 95%CI: 2.02–16.43, AG versus GG: OR 2.88, 95% CI 1.20–6.93) (Table 1).

Table 1 Genetic association models for MIR34A variant and cancer risk.

Somatic mutation burden analysis

Tumor-normal paired analysis revealed genotype concordance in 33 out of 58 tissue samples. However, the rest of the specimen (43.1%) showed a tendency for G to A shift; 8 (13.8%) controls with AG genotype were substituted to AA in paired adjacent cancer tissue, 13 non-malignant samples (22.4%) changed from GG to AG, and 4 samples (6.9%) with GG genotype showed double mutations to AA at both gene loci in malignant tissues derived from the same patients (Table 2).

Table 2 Somatic mutations of rs2666433 (A/G) genotypes in cancer and paired non-cancer tissues.

Transcriptomic signature of miR-34a in CRC

The mean quantitative threshold cycle (Cq) for RNU6B was 38.5 ± 3.23 in cancer specimens compared to 39.4 ± 5.38 in paired samples (P > 0.05). Hence, RNU6B was used in the downstream analysis as an endogenous control. In contrast, Cq of miR-34a-5p varied significantly between the study population; 35.1 ± 2.6 in cancer and 36.7 ± 3.8 in non-cancer (P < 0.001). Marked over-expression of miR-34a was observed in the cancer specimen with a median and quartile levels of 1436.8 (312.6–11,551.6) (Fig. 4A). Therefore, log-transformed values were used for further analyses. No differential expression level of miR-34a was found in patients with different rs2666433 genotypes (P = 0.82) (Fig. 4B).

Figure 4
figure 4

The relative expression profile of the MIR34A gene in colon cancer specimens. Data are shown as medians and quartiles. Box plot values were log-transformed as data were non-parametric. The box defines upper and lower quartiles (25 and 75%, respectively), and the error bars indicate upper and lower adjacent limits. The fold change was normalized to RNU6B and calculated using the delta-delta quantitative cycle (Cq) method [= 2(-DDCq)] in comparison to non-cancer adjacent tissues. The red dotted line represents the control level. Mann–Whitney U and Kruskal–Wallis tests were applied. (A) Overall samples. (B) Stratified by rs2666433 genotype.

Association of MIR34A expression and variant with clinicopathological features

As depicted in Table 3 and Suppl. Fig. S1, no association was found between any of the patient characteristics and miR-34a expression or polymorphism. However, patients harboring G > A somatic mutation had a more prolonged DFS (P = 0.003) and OS (P < 0.001) than non-carriers. Also, unlike all other types of colon cancer, a higher frequency of somatic mutation (92%) was observed in adenocarcinoma (P = 0.006) (Table 4).

Table 3 Univariate association analysis of MIR34A expression and variant with clinicopathological features.
Table 4 Comparison between patients with and without somatic mutation.

Discussion

Given the advancement in the "high-throughput genome-wide profiling" and "screening technologies", newly emerged miRNA signatures and several "miRNA–mRNA" crosstalk have been identified in CRC21. An example of those signatures is the MIR34A gene expression, which plays a critical role in all stages of colorectal carcinogenesis, starting from colon epithelium proliferation, dysplasia, early/late adenoma, and progression to malignant neoplasm (Fig. 2). The present study identified significant upregulation of miR-34a in CRC tissues relative to normal tissues. In contrast to other studies that reported p53- and other molecular players-mediated miR-34a down-regulation in CRC tissue/plasma samples15,22,23,24,25,26,27, our finding was in line with that of Aherne and colleagues, who found a significant increase of miR-34a tissue expression in early-stage CRC samples compared to non-malignant ones and in colorectal adenomas relative to polyp and normal tissues28. Interestingly, the latter findings corresponded to the same changes in miR-34a circulating levels in CRC patients, in an independent cohort explored by the same authors. Brunet et al. also, reported overexpression of miR-34a in CRC (stage III) tissue samples relative to normal ones, which support the oncogenic role of miR-34a in the CRC. The observed controversy in results̕ reproducibility in aforementioned studies could reflect variable miRNA expression signatures due to disparities in participant age and/or time of sample collection27, varied sex distribution in the specified study29, racial difference30, tumor sample heterogeneity, and different detection approaches28. Additionally, miR-34a has multiple targets even in the same type of cancer31 (Figs. 2 and 3), as well as being itself a target for other coding32,33 and non-coding RNAs34,35,36,37,38, creating multiple circRNAs/lncRNA-miRNA-mRNA crosstalk networks that either promote or inhibit carcinogenesis in a spatial-, temporal- and cell type-specific pattern. In this sense, more "gene–gene interaction" analyses will better uncover miR-34a and its regulatory genes implicated in the pathogenesis of CRC.

Several molecular pathways have been identified to mediate the miRNA-34a role in this context, including Notch-1 and Notch-2 pathway suppression, which implicated in self-renewal and colon stem cells differentiation39,40, tumor-initiating cells (cancer stem cells) regulation41,42, and Fos-related antigen-1 (FRA1) targeting23 which plays an essential role in mediating the crosstalk between the oncogenic RAS-ERK and TGFβ signaling networks implied in "epithelial-mesenchymal plasticity" during CRC progression43.

It has been reported that miRNA SNPs might also cause an aberrant function of the miRNA in regulating the putative target genes44. Previous researches have shown that MIR34A variants could modulate the susceptibility of individuals to multiple human cancers, including osteosarcoma45, colon cancer46, and breast cancer47. The present results revealed that MIR34A rs2666433 AA and AG genotype carriers were 5.7 and 2.8 more likely to develop cancer than GG carriers. A part of one recent publication related to the study of the MIR34A rs2666433 association with ischemic stroke in Chinese population48, the impact of this variant on CRC risk (or other types of cancer) has not been reported previously. Interestingly, we also found that nearly 43% of the cancer tissues showed a tendency for G to A shift, and a higher frequency of somatic mutation (92%) was observed in the adenocarcinoma subtype of CRC. Although the normal and cancer colon tissues were exposed to the same environmental insult, only the cancer tissues showed the transformation into malignancy, which confirms the contribution of the cell genetic and epigenetic makeup to this transformation. Recently, Sun et al., suggested that the rs2666433 variant may affect the binding of transcription factors to MIR34A promoter sequences49.

Furthermore, Wei et al. reported that ischemic stroke patients with rs2666433 (AA) genotype had a higher level of miR-34a than those with (GG + GA) genotypes48, suggesting that rs2666433 may influence miR-34a expression level in their population. However, we could not find a significant association between the specified microRNA variant and its tissue expression levels in the present samples. The authors confirm the specificity of miRNAs, which is related to the type of the disease (ischemic stroke vs. cancer), the type of cancer (the CRC in the present study), the type of samples (body fluids vs. tissues), and the study population (i.e. ethnicity) among others. The negative result could also be partly related to the limited sample size that warrants further large-scale studies to confirm this finding in CRC tissues. It's worth noting that although this limitation above, an essential element of the validity of our study is its agreement with HWE in both study groups, particularly the controls which exclude any genotyping errors or guided sample selection by the authors. Another raised limitation in this study could be related to evaluation of the study variant in FFPE normal colon tissue samples, which, however, is "a very common source for DNA extraction in the studies regarding microRNAs"47.

In conclusion, the present study revealed miR-34a upregulation in CRC tissues compared to paired non-cancer ones. Moreover, for the first time, the authors reported an association between MIR34A rs2666433 (A/G) variant and CRC risk in the study population with a high rate of the specified miRNA mutation in cancer tissues relative to controls. These results could support the previous evidence of miR-34a implication in CRC pathogenesis and its potential use as a biomarker with other molecular panels or as an individualized therapeutic target in the near future. For results validation, further large-scale studies, including several miRNAs combinations in ethnic different populations, are highly recommended.

Methods

Bioinformatics-based selection of microRNA and its variant

Screening microRNAs involved in the colorectal cancer pathway [KEGG: hsa05210]18,19, we found 621 miRNAs targeting this particular pathway, in which miR-34a-5p ranked the top among them (Fig. 1). Experimentally validated gene targets for miR-34a-5p and miR-34a-3p were obtained from DIANA (https://diana.imis.athena-innovation.gr/), miRDB (https://mirdb.org), miRTarBase v.20 (https://mirtarbase.mbc.nctu.edu.tw/), TargetScanHuman v6.2 (https://www.targetscan. org/), PicTar (https://pictar.mdc-berlin.de/), and miRNAMap v2.0. Network analysis of target genes was carried out via a network analyst, web server (https://www.networkanalyst.ca). Functional enrichment pathway analysis was performed using miRTar. Human tool (https://miRTar.mbc.nctu.edu.tw/) and DIANA-miRPath v3.0 (https://snf-515788.vm.okeanos.grnet.gr/index.php?r=mirpath/reverse) which integrate all miRNA targets at the coding sequences (CDS), 3′ untranslated region )3′UTR), or 5′UTR regions into significant Kyoto encyclopedia of genes and genomes (KEGG) pathways and gene ontology (GO). Next, miR-34a disease interactions were retrieved from the PhenomiR database (https://mips.helmholtz-muenchen.de/phenomir/), miR2Disease database (https://www.miR2Disease.org), miRCancer database (https://mircancer.ecu.edu/about.jsp), ExcellmiRDB (www.excellmirdb.brfjaisalmer.com/), and Human miRNA Disease Database (HMDD) v2.0 (https://cmbi.bjmu.edu.cn/hmdd).

The genetic variant of MIR34A (rs2666433; A>G) is located 2 kb upstream to the mature miRNA sequence and within the intron of the MIR34A host gene (MIR34AHG; n.333-1283T > C) (ensemble.org). It was registered with an overall minor allele frequency (MAF: A allele) in the 1000Genome Project of 0.260.

Sample collection

A total of 116 formalin-fixed, paraffin-embedded (FFPE) specimens were collected retrospectively, including 58 CRC samples and paired 58 non-cancer colon tissues. Specimens were obtained from patients who underwent colon resection for histologically confirmed carcinoma. Paired controls were adjacent tissues obtained from the surgical free margins of each specimen and recorded to be normal by microscopic examination before its parafinization. All retrieved cases were archived in the Department of Pathology, Mansoura University, between 2013 and 2017. Patient data were obtained from medical records. There was no history of neoadjuvant therapy before surgery. Direct contact of patients was performed to complete missing data and follow-up (the last contact was in July 2019). The available follow-up period ranged from 20 to 68 months. Samples with incomplete clinical data or follow-up period, history of receiving any treatment before surgery, and/or diagnosis with malignant disease primarily arising from other organs were excluded. The study was conducted according to the ethical and legal guidelines adopted by the Declaration of Helsinki. Ethical approval for this study was granted by the local Research Ethics Committee (No. MED-2018-3-9-F-7825). The informed consent from the patients was waived from the ethical committee as the authors worked on archived samples.

Histopathological examination

Specimens included adenocarcinoma (n = 39; 67.2%), mucinous carcinoma (n = 8; 13.8%), signet ring cell carcinoma (n = 6; 10.3%), and undifferentiated type (n = 5; 8.60%). Apart from the limited sample size in this pilot study, the low frequency of undifferentiated carcinoma subtype in our cases could be congruent with the relative low frequency of such type of CRC as evidenced previously50, and including only the confirmed immunohistochemical staining for cytokeratin (CK) cases which could additionally contribute to the low number of such cases. Sections were examined for histopathologic diagnosis and tumor, node, metastasis (TNM) staging by an expert pathologist51. Other sections (5 to 8 μm thick) for cancer and paired non-cancer tissues were collected in separate Eppendorf tubes for both miRNA expression and SNP identification analyses.

Gene expression profiling

Total RNAwas purified from the FFPE colon sections using a Qiagen miRNeasy FFPE Kit (Cat # 217504) following the manufacturer's instructions12. RNAconcentration and purity were assessed by a NanoDrop ND-1000 spectrophotometer (NanoDrop Tech., Inc. Wilmington, DE, USA), and the integrity was checked by gel electrophoresis. Specific complementary DNA(cDNA) was prepared using TaqMan MicroRNA Reverse Transcription (RT) kit (P/N 4366596; (Thermo Fisher Scientific, Applied Biosystems, Foster City, CA, USA) for miR-34a-5p (assay ID 000426) as described in our previous publication52. RNU6B exhibited a uniform and stable expression in colon tissues with no significant difference between cancer and non-cancer samples; thus was used as an endogenous control (assay ID 001093). T-Professional Basic, Biometra PCRSystem (Biometra, Germany) was used. Appropriate negative controls were applied in each run to exclude amplicon contamination. The PCRreactions were carried out in triplicate in StepOne Real-Time PCRSystem (Applied Biosystems) using specific TaqMan small RNA assay53. All the steps of the quantitative Real-Time reverse transcription-polymerase chain reaction (qRT-PCR) were run according to the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines54. The relative MIR34A expression levels were calculated using the LIVAK method (2-ΔΔCq), where Delta-Delta quantitative cycle (Cq) = (Cq MIR34A − Cq RNU6B)CRC − (Cq MIR34A − Cq RNU6B)NAT55.

Allelic discrimination analysis

QIAamp DNA FFPE tissue kit (QIAGEN, 56,404) was used for DNA extraction according to the guided protocol. DNA quality/purity was evaluated, as mentioned above. DNA samples from the 58 cancer specimens and 58 non-cancer tissues were genotyped for the MIR34A variant (rs2666433 (A/G), assay ID C___2800266_10) using Taqman Real-Time PCR method as detailed previously56. Appropriate negative controls were applied in each PCR run to avoid the false positive of amplicon contamination. Real-time PCR amplification was performed on StepOne Real-Time PCR System (Applied Biosystems) using the following conditions: an initial hold (95 °C for 10 min) followed by a 40-cycle two-step PCR (95 °C denaturation for 15 s and annealing/extension 60 °C for 1 min). Allelic discrimination was called by the SDS software version 1.3.1 (Applied Biosystems). Genotyping was performed by two persons independently blinded to case/control status. Ten percent of the randomly selected samples were re-genotyped in separate runs to exclude the possibility of false genotype calls, with 100% concordance of the results.

Statistical analysis

Data were managed using SPSS version 24.0, the R packages, and GraphPad Prism version 7.0. Genotype and allele frequencies were calculated within each group. Hardy–Weinberg equilibrium (HWE) was estimated online (https://www.oege.org/software/hwe-mr-calc.shtml) and tested by the goodness of fit. Overall comparison and subgroup analyses were performed. Adjusted odds ratios (OR) with a 95% confidence interval (CI) was calculated to identify the strength of the association between the SNP and cancer risk under various genetic association models48; allelic model (G versus A), homozygote comparison (GG versus AA), heterozygote comparison (AG versus AA), dominant model (GG + AG versus AA), and recessive model (GG versus AG + AA). The Wilcoxon matched-pair signed-rank test was carried out to compare the expression level between cancer samples and their corresponding adjacent non-cancer tissues. Chi-square (2) and Fisher's exact tests were used for qualitative parameters, while quantitative variables were shown as mean ± standard deviation (SD) or median (quartiles) according to data distribution. Spearman's correlation test was applied for correlation analysis. Overall survival time was counted (months) from the date of diagnosis to the date of death or last follow-up before study finalization. The Kaplan–Meier method and the Cox proportional hazard model were carried out to assess survival rates among groups. A two-tailed P-value < 0.05 was considered significant.

Ethical approval

This study had been approved by the local Research & Ethics Committee, NBU, Arar, Saudi Arabia. The informed consent from the patients was waived from the ethical committee as the authors worked on archived FFPE samples.