Introduction

The worldwide cancer burden continues to increase; however, the precise mechanisms of carcinogenesis remain largely unknown. A number of investigators have demonstrated that genetic factors play a significant role in an individual's risk of cancer. MicroRNAs (miRNAs) are naturally occurring, small, noncoding, single-stranded RNA molecules that regulate gene expression by base pairing with the 3′ untranslated region of their target mRNAs, leading to mRNA cleavage or translational repression1. Numerous studies have demonstrated that miRNAs regulate a variety of biological processes, including cell proliferation, differentiation, apoptosis and development; dysregulation of these processes is therefore closely associated with carcinogenesis2,3.

Recently, single nucleotide polymorphisms (SNPs) located in miRNAs, termed MirSNPs, have attracted increasing attention due to their possible involvement in the development of various types of cancer. Such MirSNPs may play functional roles by affecting the transcription of the primary target gene, altering pri-miRNA/pre-miRNA processing, or exerting effects on miRNA-mRNA interactions4. We performed a literature search and review of the association of common MirSNPs, including rs1834306, rs2043556, rs3746444, rs4919510, rs2910164, rs11614913, rs895819, rs2292832 and rs6505162, with the risk of cancer. However, the conclusions of the relevant studies were inconsistent, in part because of the heterogeneity of the types of cancer studied, the small sample sizes and the varied ethnicity of the patients. Therefore, there is an urgent need to further investigate the association of cancer-related MirSNPs with the risk of various types of cancer. Although the identification of cancer-related miRNAs based on gene association studies has become increasingly popular5, no study has yet investigated the association of cancer-related MirSNPs with the risk of various types of cancer based on an analysis of a large number of MirSNP association studies.

Therefore, we conducted a candidate-gene association study employing large numbers of cases and controls for eight kinds of cancer that commonly jeopardize human health (bladder cancer, breast cancer, esophageal squamous cell carcinoma (ESCC), gastric cancer, lung cancer, pancreatic cancer, prostate cancer and renal cell carcinoma (RCC)) and analyzed these nine MirSNPs (either by direct genotyping or imputation) to further determine their association with the risk of developing cancer. Cross phenotype meta-analysis (CPMA) was performed to analyze the association of the MirSNPs with overall cancer risk, and the risk of specific cancers is further discussed.

Results

Patient characteristics

The risk of developing eight different types of cancer (bladder cancer, breast cancer, lung cancer, pancreatic cancer, RCC, prostate cancer, ESCC and gastric cancer) was assessed. The patients and controls in the gastric cancer and ESCC study were from an Asian population, while the patients and controls in the six other cancer studies were from Caucasian populations.

Quantitative analysis

Primary analyses were conducted using unconditional logistic regression models for genotype trend effects (1 degree of freedom), adjusted for eigenvectors, gender and cohort. The false discovery rate (FDR) method was used to correct for multiple testing. The results revealed significant associations between rs2910164 C vs. G and the risk of bladder cancer (OR = 1.12, 95% CI: 1.04–1.21, P = 2.06E-03 (PFDR = 0.0297)) and gastric cancer (OR = 0.85, 95% CI: 0.77–0.93, P = 5.98E-04 (PFDR = 0.0108)); rs2043556 C vs. T and the risk of bladder cancer (OR = 1.19, 95% CI: 1.10–1.28, P = 1.44E-05 (PFDR = 5.18E-04)); rs6505162 C vs. A and the risk of bladder cancer (OR = 1.12, 95% CI: 1.068–1.18, P = 4.05E-05 (PFDR = 9.72E-04)); and rs895819 C vs. T and the risk of bladder cancer (OR = 1.19, 95% CI: 1.10–1.28, P = 6.70E-06 (PFDR = 4.82E-04)). We further repeated the above analyses 1000 times, each time on a randomly selected 70% of cases and controls, and the five associations listed above remained consistent (P < 0.001).
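For readers who wish to relate the reported odds ratios, confidence intervals and P values to one another, the conversion can be sketched as follows. This is an illustration only, assuming a Wald test on the per-allele log-odds coefficient; the function names are hypothetical and are not part of the study's pipeline. Because the inputs are the rounded values quoted in the text, the resulting P value will only approximate the reported one.

```python
import math

def odds_ratio_summary(beta, se, z_crit=1.959964):
    """Convert a per-allele log-odds estimate and its standard error into
    an odds ratio, a 95% confidence interval and a two-sided Wald P value."""
    or_ = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    # Two-sided P value from the standard normal: P = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return or_, lower, upper, p

def se_from_ci(lower, upper, z_crit=1.959964):
    """Recover the standard error of log(OR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)

# Rounded rs2910164 / bladder cancer values quoted in the text.
beta = math.log(1.12)
se = se_from_ci(1.04, 1.21)
print(odds_ratio_summary(beta, se))
```

The same helper can be applied to any of the other OR and CI pairs reported in this section.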

CPMA was performed to assess the association of each MirSNP with the overall risk of cancer; it suggested that rs2910164 C (P = 1.11E-03), rs2043556 C (P = 0.0165), rs6505162 C (P = 2.05E-03) and rs895819 (P = 0.0284) were associated with cancer occurrence (Table 2). A meta-analysis with inverse-variance weighting, using a fixed- or random-effects model depending on the heterogeneity among the study results for each MirSNP, is also provided (Figure 1).

Table 1 Summary of the MirSNPs studied
Table 2 Stratification analyses of the association of the nine MirSNPs with the overall cancer risk and risk of specific types of cancer
Figure 1

Meta-analysis of nine MirSNPs and their association with the overall risk of cancer.

(a) rs2292832; (b) rs2910164; (c) rs2043556; (d) rs4919510; (e) rs1834306; (f) rs11614913; (g) rs6505162; (h) rs895819; (i) rs3746444.

Begg's test was used to investigate publication bias in the literature. The shapes of the funnel plots showed no obvious asymmetry and no statistical evidence of bias existed (Figure 2).

Figure 2

Begg's funnel plot with pseudo 95% confidence limits for publication bias of the MirSNPs in meta-analysis.

Begg's test offers no evidence of publication bias. (a) rs2292832 (P = 0.386); (b) rs2910164 (P = 0.536); (c) rs2043556 (P = 0.386); (d) rs4919510 (P = 1.000); (e) rs1834306 (P = 0.174); (f) rs11614913 (P = 0.536); (g) rs6505162 (P = 0.174); (h) rs895819 (P = 0.386); (i) rs3746444 (P = 1.000).

Discussion

Approximately 50% of all annotated human miRNA genes are located in fragile sites or areas of the genome that are frequently associated with cancer. SNPs, the most common type of genetic variation in the human genome, result in phenotypic differences6; such sequence variations in miRNA genes may potentially affect the processing of miRNAs, pri-miRNAs, pre-miRNAs and/or mature miRNAs, and/or target selection and may thus significantly affect an individual's risk of cancer7.

Here we evaluated the associations between nine common MirSNPs (rs1834306, rs2043556, rs3746444, rs4919510, rs2910164, rs11614913, rs895819, rs2292832 and rs6505162) and susceptibility to cancer using data from seven published studies; each study investigated a single type of cancer, except for one study which investigated both gastric adenocarcinoma and ESCC. This study was therefore a large, population-based, multi-cancer stratified investigation. We observed significant associations between the MirSNPs rs2910164, rs2043556 and rs6505162 and the overall risk of developing cancer using FDR-adjusted CPMA. CPMA takes association P values and examines whether the observed P values diverge from the distribution expected under the null hypothesis of no additional associations beyond those already known. CPMA is especially well suited to broad phenotypic surveys because it benefits from increased numbers of phenotypes8.

The rs2910164 G/C polymorphism of the miR-146a gene is situated in the stem structure opposite the mature miR-146 sequence and leads to a change from a G:U pair to a C:U mismatch in the stem region of the miR-146a precursor. The G allele of the miR-146a precursor might influence the generation of mature miR-146a and affect target mRNA binding9,10. Our study revealed an association between rs2910164 and the overall risk of cancer by CPMA, which is inconsistent with the findings of He et al., who used random-effects meta-analysis11. Although random-effects meta-analysis accommodates a moderate level of heterogeneity, it is not well suited to cases in which a genetic variant has opposite effects on different phenotypes. For rs2910164, the discrepancy between the two meta-analyses may be due to opposite effects of the MirSNP in different types of cancer; thus, the use of CPMA seems more reasonable10,11,12,13. Interestingly, the amount of mature miR-146a produced from the C allele was reduced 1.8-fold compared with the G allele in papillary thyroid carcinoma, while miR-146a levels in the CC genotype were significantly increased compared with the GG genotype in gastric cancer13. The rs2910164 C allele was associated with a decreased risk of gastric cancer in the Asian population, a finding supported by Xu et al.14. In the Caucasian population, the rs2910164 C allele was associated with an increased risk of bladder cancer. However, a study by Wang et al. indicated a reduced risk of bladder cancer for the rs2910164 C allele in an Asian population15. These results suggest that the rs2910164 polymorphism may have varying effects in different genetic backgrounds or in patients of different ethnicity, and/or during the pathogenesis of different types of cancer.

The rs6505162 SNP, located in pre-miR-423, 12 base pairs 5′ of miR-423-5p, was associated with cancer development based on CPMA. So far, most research on miR-423 has concentrated on expression analyses, in which aberrant expression of both mature forms of the miRNA has been seen in cancer, as well as during cellular differentiation16,17,18,19. Studies have shown that SNPs in pre-miRNAs can affect the production of the mature forms and the binding of nuclear factors related to miRNA processing20,21,22. We hypothesize that rs6505162 might affect the expression or processing of miR-423; studies evaluating the effect of this SNP on miRNA function are therefore required. However, studies of the effect of the rs6505162 polymorphism on cancer risk have yielded inconsistent results23,24,25. The first of these studies was conducted in 2009 on ESCC in a population of 346 Caucasian ESCC patients and suggested that the C allele of rs6505162 was significantly more frequent in cancer patients than in controls23. A study performed in 2012 indicated that the C genotype of the rs6505162 SNP reduces the risk of breast cancer development; however, another study undertaken in 2009 suggested that the C genotype of rs6505162 conferred an increased risk of developing both ovarian and breast cancer in Breast Cancer Associated 2 (BRCA2) mutation carriers26. We observed an increased risk of bladder cancer for the rs6505162 C allele in the Caucasian population. To our knowledge, this is the first study to show a relation between this SNP and bladder cancer, and the finding needs further validation.

The C allele of rs2043556, located in miR-605, was marginally associated with the risk of developing cancer; this is the first study to associate this MirSNP with cancer development, and the finding needs to be validated by further studies. Stratified analysis revealed that the miR-605 C allele was associated with an increased risk of developing bladder cancer in the Caucasian population. Recently, an analysis of this SNP in gastrointestinal cancer among Asians produced data similar to our own, with the C allele being significantly less frequent in controls than in cancer patients27. miR-605 has been found to be an element of the p53 network, forming a positive feedback loop in response to stress28; thus, miR-605 may play a key role in carcinogenesis. Investigating the association between this SNP and miRNA expression might help to explain the relation between the SNP and cancer risk.

The C allele of the MirSNP rs895819, located in the terminal loop of pre-miR-27a, was associated with an increased risk of bladder cancer in the Caucasian population; this is the first study to address the association between this MirSNP and bladder cancer. MiR-27a has been investigated in several types of cancer, with inconsistent results. MiR-27a functions as a tumor suppressor in ESCC and hepatocellular carcinoma, but serves as a promoting factor in gastric tumorigenesis29,30. Therefore, we assume that miR-27a plays pleiotropic signaling roles in regulating tumorigenesis. The MirSNP rs895819 was initially reported to be associated with a reduced risk of familial breast cancer (P = 0.0215) in a Caucasian population31; however, no significant association of rs895819 with the risk of breast cancer was observed in a Chinese population32. A previous study suggested no association between rs895819 and the risk of colorectal cancer in a Central-European Caucasian population, a population with an extremely high incidence of sporadic colorectal cancer33; this observation is supported by our results. Given the high probability that the MirSNP rs895819 is involved in carcinogenesis, these conflicting results may be due to varying sample sizes, and analyses of larger cohorts are warranted to clearly establish the impact of rs895819 on the risk of cancer.

The rs11614913 polymorphism of miR-196a2 has a significant impact on the expression of miR-196a2 and is associated with carcinogenesis in various types of cancer34,35. Previous meta-analyses suggested a significant association between rs11614913 and the overall risk of cancer in the Asian population, which is inconsistent with our results11,36,37. Our study suggests that the rs11614913 C allele might protect against lung cancer in the Caucasian population, but the association was no longer significant after FDR adjustment (P = 0.197), whereas Tian et al. found that the rs11614913 C allele was associated with a significantly increased risk of lung cancer in a Chinese population38. This suggests that the effect of the rs11614913 polymorphism may depend on the genetic background or ethnicity of the patients and/or the effects of the environment, in agreement with the reports of Chu et al.37 and Wang et al.36. The effect of rs11614913 on the risk of different types of cancer needs to be confirmed in additional studies.

No significant associations were observed for the rs1834306, rs4919510, rs2292832 and rs3746444 polymorphisms in terms of the overall risk of cancer or the risk of specific types of cancer.

miR-100 has been shown to suppress the expression of proteins in the insulin-like growth factor (IGF)/mammalian target of rapamycin (mTOR) signaling cascade in childhood adrenocortical tumors39 and clear cell ovarian cancer40, thus suppressing tumorigenesis, whereas it acts as an oncogene in acute myeloid leukemia41. Our results showed that the miR-100 polymorphism rs1834306, located in the pri-miR-100 region, had no relation with the risk of cancer. Rs4919510 lies within the mature miR-608 sequence and is located at the junction between the stem and canonical hairpin loop42. The rs4919510 G allele was associated with an increased risk of bladder cancer, gastric cancer and prostate cancer; however, these associations were no longer significant after FDR adjustment and need to be validated by further studies. Rs2292832, located in pre-miR-149, was previously reported to have no significant association with the risk of breast cancer43, lung cancer44 or gastrointestinal cancer45, in agreement with the results of this study. Although a sizeable number of studies have been performed to investigate the role of the miR-499 rs3746444 polymorphism in several types of cancer, including breast cancer43,46, lung cancer44, gastric cancer47 and bladder cancer48, these existing studies have yielded contradictory results. These discrepancies may be due to the study of different populations from different areas and variations in the selection of the case groups; therefore, the effect of the miR-499 rs3746444 polymorphism needs to be investigated further.

One limitation of the present study is the multiple-comparison problem arising from the number of MirSNPs tested; the FDR method was therefore used to correct for multiple testing.

Second, several MirSNPs were imputed rather than directly genotyped in this study. Although using imputed MirSNPs might lead to less accurate results, we ensured that only SNPs with high imputation confidence (>95%) were included in further analysis.

Taken together, the findings of the present study have substantial scientific significance and may have implications in the clinical setting. Our results suggest that common MirSNPs may contribute to an individual's susceptibility to diverse types of cancer. Further functional characterization of MirSNPs and their influence on their target mRNAs may reveal the underlying mechanisms responsible for the associations between these polymorphisms and the etiology of cancer. Further prospective investigations of larger numbers of cases and controls are required in order to clarify the inconsistent associations between MirSNPs and the risk of cancer.

Methods

Identification of eligible studies

We evaluated the effect of nine MirSNPs on the risk of bladder cancer in 3527 cases and 5119 controls from the Maryland bladder cancer study (dbGAP number: phg0000132.v1) performed among the Caucasian population of the United States49; the risk of breast cancer in 1145 postmenopausal women of European ancestry with invasive breast cancer and 1142 controls from the Massachusetts breast cancer study (dbGAP number: phg000032.v1) performed in the United States50; the risk of lung cancer in 3782 cases and 3840 controls from the Maryland lung cancer study (dbGAP number: phg000124.v1) performed in the United States51; the risk of pancreatic cancer in 2452 affected individuals (cases) and 2461 unaffected controls from the Minnesota pancreatic cancer study (dbGAP number: phg000089.v1) performed in the United States52; the risk of prostate cancer in a nested case-control study (dbGAP number: phg000067.v1) including 659 cases and 1593 controls of European origin performed in the United States53; the risk of RCC in 1311 affected individuals and 3424 controls with a European background from the Maryland renal cell carcinoma study (dbGAP number: phg000123.v1) performed in the United States54; and the risk of gastric adenocarcinoma and ESCC in a study (dbGAP number: phg000128.v1) performed in the United States of individuals of Chinese ethnicity, including 1625 cases of gastric cancer, 1898 cases of ESCC and 2100 controls55.

This study is based on an in silico re-analysis of human genotyping data downloaded from dbGAP (www.ncbi.nlm.nih.gov/gap). The data submitters obtained informed consent from each participant.

Selection of SNPs

We carried out a search of the PubMed and Embase databases for all relevant reports on the association of MirSNPs with the risk of cancer. The following candidate MirSNPs were selected for this study: miR-605 A/G (rs2043556), miR-499a/b A/G (rs3746444), miR-608 C/G (rs4919510), miR-146a G/C (rs2910164), miR-196a2 C/T (rs11614913), miRNA-27a T/C (rs895819), miR-149 C/T (rs2292832), miR-423 A/C (rs6505162) and miR-100 T/C (rs1834306), which are present in the pre-miRNA regions of miR-196a2, miR-146a, miR-499a/b, miR-423, miR-608, miRNA-27a, miR-149 and miR-605 and the pri-miRNA region of miR-100, respectively (Table 1).

Imputation of the MirSNPs

The SNPs not present on the original chip were imputed with the program IMPUTE2, using both HapMap (NCBI Build 36 (db126b)) CEU data and 1000 Genomes data as reference haplotype sets. All SNPs showed high imputation confidence (>95%). Rs2292832, rs2043556 and rs11614913 were directly genotyped; the others were imputed.

Association testing and adjustment for covariates

All association tests were performed with Plink v1.07 using additive logistic regression models. To account for potential population stratification or admixture in these samples, principal component analysis (PCA) was carried out using EIGENSTRAT56. After adjustment for the significant principal components (PCs) in each study, selected where the PCA scree plot leveled off, comparison of the observed and expected distributions showed no evidence of large-scale inflation of the association test statistics, ruling out significant hidden population substructure. The principal component scores for each individual were included as covariates in all logistic regression models, along with gender and cohort. Multivariate logistic regression was performed in the R software package (http://www.r-project.org/). The FDR method was used to correct for multiple testing (FDR q < 0.05).
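The FDR correction can be illustrated with a short sketch. The text does not state which FDR procedure was applied; the example below assumes the Benjamini-Hochberg step-up procedure, and both the function name and the raw P values are hypothetical (they are not study data).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values (q values),
    in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that the adjusted values
    # are monotone in the rank of the raw P values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw P values, for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(raw)
significant = [qi < 0.05 for qi in q]
print(q, significant)
```

Under this assumption, a test is declared significant when its q value falls below the 0.05 threshold given above.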

Resampling

To examine the robustness of the associations, we conducted a resampling analysis in accordance with Li et al.57. Using the association test described above, P values (Prandom) were obtained by performing the test 1,000 times, each time on a randomly selected 70% of the population in the corresponding study. We then tested the null hypothesis Prandom ≥ 0.05 (Supplementary Table S1).
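The resampling scheme can be sketched as below. Two simplifications are assumed and should be read as such: the covariate-adjusted logistic regression is replaced by an unadjusted Cochran-Armitage trend test (1 degree of freedom, computed as N times the squared correlation between allele dosage and case status), and the summary returned is simply the fraction of subsamples with P < 0.05. The function names and the toy genotype counts are hypothetical.

```python
import math
import random

def trend_test_p(genotypes, status):
    """Cochran-Armitage trend test P value, computed as N * r^2 where r is
    the Pearson correlation between dosage (0/1/2) and status (0/1)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(status) / n
    sgy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, status))
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in status)
    if sgg == 0 or syy == 0:
        return 1.0  # no variation, hence no evidence of association
    chi2 = n * sgy * sgy / (sgg * syy)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

def resample_fraction_significant(genotypes, status, n_iter=1000,
                                  frac=0.7, alpha=0.05, seed=0):
    """Repeat the test on random subsamples (without replacement) and
    return the fraction of subsamples with P < alpha."""
    rng = random.Random(seed)
    n = len(genotypes)
    k = int(round(frac * n))
    hits = 0
    for _ in range(n_iter):
        idx = rng.sample(range(n), k)
        p = trend_test_p([genotypes[i] for i in idx],
                         [status[i] for i in idx])
        if p < alpha:
            hits += 1
    return hits / n_iter

# Toy data for illustration only (not study data): 200 cases, 200 controls.
case_genotypes = [0] * 50 + [1] * 100 + [2] * 50
control_genotypes = [0] * 100 + [1] * 80 + [2] * 20
genotypes = case_genotypes + control_genotypes
status = [1] * 200 + [0] * 200
print(resample_fraction_significant(genotypes, status))
```

A fraction close to 1 indicates that the association does not depend on a particular subset of individuals.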

Statistical analysis

The associations of the nine MirSNPs with the risk of cancer were examined by performing meta-analysis using the inverse-variance method. We examined the association of the MirSNPs with the overall risk of cancer as measured by odds ratios (ORs) and 95% confidence intervals (CIs). Stratified analyses by type of cancer were also performed for each MirSNP. Heterogeneity between studies was evaluated using the Chi-square-based Q statistical test, with a heterogeneity P value (Ph) < 0.05 considered significant. A fixed-effect model using the Mantel–Haenszel method and a random-effects model using the DerSimonian and Laird method were used to pool the data according to the cancer types and individual MirSNPs. The random-effects model was used when heterogeneity existed in the results of the studies; otherwise the fixed-effect model was used.
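The model-selection rule described above can be sketched as follows. As a simplification, the fixed-effect estimate here uses inverse-variance weights on log odds ratios rather than the Mantel–Haenszel method, which requires the underlying 2×2 counts; the random-effects estimate uses the DerSimonian and Laird moment estimator. The function names and input values are hypothetical, and at least two studies are required.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        sf, d = math.exp(-half), 2
    else:
        sf, d = math.erfc(math.sqrt(half)), 1
    # Recurrence: SF(x; d + 2) = SF(x; d) + half^(d/2) * exp(-half) / Gamma(d/2 + 1)
    while d < df:
        sf += half ** (d / 2.0) * math.exp(-half) / math.gamma(d / 2.0 + 1.0)
        d += 2
    return sf

def pool_log_odds(betas, ses, alpha_het=0.05):
    """Pool per-study log odds ratios; use the random-effects model only
    when Cochran's Q indicates heterogeneity (Ph < alpha_het)."""
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    p_het = chi2_sf(q, k - 1)
    # DerSimonian-Laird between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    if p_het < alpha_het:
        w_star = [1.0 / (se ** 2 + tau2) for se in ses]
        est = sum(wi * b for wi, b in zip(w_star, betas)) / sum(w_star)
        se_est = math.sqrt(1.0 / sum(w_star))
        model = "random"
    else:
        est, se_est, model = fixed, math.sqrt(1.0 / sum_w), "fixed"
    return {"model": model, "log_or": est, "se": se_est,
            "q": q, "p_het": p_het, "tau2": tau2}

# Hypothetical per-study log ORs and standard errors, for illustration only.
result = pool_log_odds([0.11, -0.16, 0.02, 0.05], [0.04, 0.05, 0.06, 0.05])
print(result["model"], math.exp(result["log_or"]))
```

The pooled log odds ratio is exponentiated to report an OR, and its standard error gives the 95% CI in the usual way.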

Additionally, cross phenotype meta-analysis (CPMA) was performed to determine the associations of the MirSNPs with the overall cancer risk; P < 0.05 after FDR adjustment was considered significant. The CPMA statistic assesses the evidence for the hypothesis that a single SNP has multiple phenotypic associations. The CPMA statistic is agnostic to the direction of effect in each disease. It has one degree of freedom, as it measures a deviation in P value behavior instead of testing all possible combinations of diseases for association with each SNP, and therefore provides high power to reject the null hypothesis58,59.
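The text does not spell out the CPMA formula. A minimal sketch is given below under the assumption that the statistic takes the likelihood-ratio form in which −ln(P) is exponentially distributed with rate λ = 1 under the null hypothesis, and the observed rate is compared with this null value using a chi-square distribution with one degree of freedom. The function name and the input P values are hypothetical.

```python
import math

def cpma(pvalues):
    """Likelihood-ratio test of whether -ln(P) across phenotypes follows an
    exponential distribution with rate 1 (assumed form of the CPMA statistic).
    Returns (statistic, estimated rate, P value)."""
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("P values must lie in (0, 1]")
    x = [-math.log(p) for p in pvalues]
    n = len(x)
    total = sum(x)
    if total == 0.0:
        return 0.0, float("inf"), 1.0  # all P values equal 1
    lam = n / total  # maximum-likelihood estimate of the exponential rate
    # -2 * ln(L(rate = 1) / L(rate = lam)); truncated at 0 against rounding
    stat = max(0.0, 2.0 * (n * math.log(lam) - n + total))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, lam, p

# Hypothetical per-cancer association P values for one SNP (not study data).
stat, lam, p = cpma([1e-5, 2e-3, 0.5, 0.6, 0.03, 0.4, 0.9, 0.2])
print(stat, lam, p)
```

An estimated rate below 1 corresponds to an excess of small P values across phenotypes. Because only P values enter the calculation, the statistic does not depend on the direction of effect in each disease.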

All statistical tests for the meta-analysis were performed with Review Manager version 5.2 (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark). Begg's test was used to evaluate publication bias.
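Begg's rank correlation test can be sketched as follows. The example assumes the usual formulation (Kendall's tau between the standardized effect estimates and their variances, with a normal approximation), applies a continuity correction as an additional assumption, and ignores ties. The function name and the input values are hypothetical, and at least three studies are assumed.

```python
import math

def begg_test(effects, variances):
    """Begg's rank correlation test for funnel plot asymmetry.
    Returns (Kendall S, z score, two-sided P value)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    pooled = sum(wi * t for wi, t in zip(w, effects)) / sum_w
    # Standardized deviates: Var(t_i - pooled) = v_i - 1 / sum(w)
    std = [(t - pooled) / math.sqrt(v - 1.0 / sum_w)
           for t, v in zip(effects, variances)]
    # Kendall's S: concordant minus discordant pairs
    s = 0
    for i in range(k):
        for j in range(i + 1, k):
            prod = (std[i] - std[j]) * (variances[i] - variances[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    var_s = k * (k - 1) * (2 * k + 5) / 18.0
    if s == 0:
        z = 0.0
    else:
        # Continuity-corrected normal approximation (assumed here)
        z = math.copysign((abs(s) - 1) / math.sqrt(var_s), s)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p

# Hypothetical log ORs and variances in which larger effects come from
# less precise studies, for illustration only.
s, z, p = begg_test([0.0, 0.5, 1.0, 1.5, 2.0],
                    [0.01, 0.02, 0.03, 0.04, 0.05])
print(s, z, p)
```

A small P value would indicate that effect size is correlated with study precision, the pattern expected under publication bias.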