MicroRNA sequence polymorphisms and the risk of different types of cancer

MicroRNAs (miRNAs) participate in diverse biological pathways and may act as oncogenes or tumor suppressors. Single nucleotide polymorphisms (SNPs) in miRNAs (MirSNPs) might promote carcinogenesis by affecting miRNA function and/or maturation; however, the reported associations between MirSNPs and cancer risk remain inconsistent. Here, we investigated the association between nine common MirSNPs and cancer risk using data from large-scale case-control studies. Eight precursor-miRNA (pre-miRNA) SNPs (rs2043556/miR-605, rs3746444/miR-499a/b, rs4919510/miR-608, rs2910164/miR-146a, rs11614913/miR-196a2, rs895819/miR-27a, rs2292832/miR-149, rs6505162/miR-423) and one primary-miRNA (pri-miRNA) SNP (rs1834306/miR-100) were analyzed in 16,399 cases and 21,779 controls from seven published studies of eight common cancers. Cross phenotype meta-analysis (CPMA), a novel statistic for testing the association of a SNP with multiple phenotypes, indicated that rs2910164 C (P = 1.11E-03), rs2043556 C (P = 0.0165), rs6505162 C (P = 2.05E-03) and rs895819 (P = 0.0284) were significantly associated with the overall risk of cancer. In conclusion, MirSNPs might affect an individual's susceptibility to various types of cancer.

The worldwide cancer burden continues to increase; however, the precise mechanisms of carcinogenesis remain largely unknown. A number of investigators have demonstrated that genetic factors play a significant role in an individual's risk of cancer. MicroRNAs (miRNAs) are naturally occurring, small, noncoding, single-stranded RNA molecules that regulate gene expression by base pairing with the 3′ untranslated region of their target mRNAs, leading to mRNA cleavage or translational repression 1 . Numerous studies have demonstrated that miRNAs regulate a variety of biological processes, including cell proliferation, differentiation, apoptosis and development; thus, dysregulation of these processes is closely associated with carcinogenesis 2,3 .
Recently, single nucleotide polymorphisms (SNPs) located in miRNAs, termed MirSNPs, have attracted increasing attention because of their possible involvement in the development of various types of cancer. Such MirSNPs may play functional roles by affecting the transcription of the primary target gene, altering pri-miRNA/pre-miRNA processing, or influencing miRNA-mRNA interactions 4 . We performed a literature search and review of the association of common MirSNPs, including rs1834306, rs2043556, rs3746444, rs4919510, rs2910164, rs11614913, rs895819, rs2292832 and rs6505162, with the risk of cancer. The conclusions of the relevant studies were inconsistent, in part because of the heterogeneity of the types of cancer studied, the small sample sizes, and the varied ethnicity of the patients. Therefore, there is an urgent need to further investigate the association of cancer-related MirSNPs with the risk of various types of cancer. Although the identification of cancer-related miRNAs based on gene association studies has become increasingly popular 5 , no study has yet investigated the association of cancer-related MirSNPs with the risk of various types of cancer based on an analysis of a large number of MirSNP association studies. We therefore conducted a candidate-gene association study employing large numbers of cases and controls for eight kinds of cancer that commonly jeopardize human health (bladder cancer, breast cancer, esophageal squamous cell carcinoma (ESCC), gastric cancer, lung cancer, pancreatic cancer, prostate cancer, and renal cell carcinoma (RCC)), and analyzed these nine MirSNPs (either by direct genotyping or imputation) to further determine their association with the risk of developing cancer. Cross phenotype meta-analysis (CPMA) was performed to analyze the association between MirSNPs and the overall cancer risk, and the risk of specific cancers is further discussed.

Results
Patient characteristics. The risk of developing eight different types of cancer, namely bladder cancer, breast cancer, lung cancer, pancreatic cancer, RCC, prostate cancer, ESCC, and gastric cancer, was assessed. The patients and controls in the gastric cancer and ESCC study were from an Asian population, while the patients and controls in the six other cancer studies were from a Caucasian population.
CPMA was performed to assess the association of each MirSNP with the overall risk of cancer, and suggested that rs2910164 C (P = 1.11E-03), rs2043556 C (P = 0.0165), rs6505162 C (P = 2.05E-03) and rs895819 (P = 0.0284) were involved in cancer occurrence (Table 2). A meta-analysis with inverse-variance weighting, using a fixed- or random-effects model depending on the heterogeneity among the study results for each MirSNP, is also provided (Figure 1).
Begg's test was used to investigate publication bias in the literature. The funnel plots showed no obvious asymmetry, and there was no statistical evidence of bias (Figure 2).

Discussion
Approximately 50% of all annotated human miRNA genes are located in fragile sites or areas of the genome that are frequently associated with cancer. SNPs, the most common type of genetic variation in the human genome, result in phenotypic differences 6 ; such sequence variations in miRNA genes may potentially affect the processing of miRNAs, pri-miRNAs, pre-miRNAs and/or mature miRNAs, and/or target selection and may thus significantly affect an individual's risk of cancer 7 .
Here we evaluated the associations between nine common MirSNPs (rs1834306, rs2043556, rs3746444, rs4919510, rs2910164, rs11614913, rs895819, rs2292832 and rs6505162) and susceptibility to cancer using data from seven published studies; each study investigated a single type of cancer, except for one study that investigated both gastric adenocarcinoma and ESCC. This study was therefore a large, population-based, multi-cancer stratified investigation. We observed significant associations between the MirSNPs rs2910164, rs2043556 and rs6505162 and the overall risk of developing cancer using FDR-adjusted CPMA. CPMA takes association P values as input and examines whether the observed P values diverge from the distribution of P values expected under the null hypothesis of no additional associations besides those already known. CPMA is especially well suited to broad phenotypic surveys because it benefits from larger numbers of phenotypes 8 .
The rs2910164 G/C polymorphism of the miR-146a gene is situated in the stem structure opposite the mature miR-146 sequence, and leads to a change from a G:U pair to a C:U mismatch in the stem region of the miR-146a precursor. The G allele of the miR-146a precursor might influence the generation of mature miR-146a and affect target mRNA binding 9,10 . Our study revealed an association between rs2910164 and the overall risk of cancer by CPMA, which is inconsistent with the findings of He et al., who used random-effects meta-analysis 11 . Although random-effects meta-analysis accommodates a moderate level of heterogeneity, it is not well suited to cases in which a genetic variant has opposite effects on different phenotypes. For rs2910164, the discrepancy between the two meta-analyses may be due to opposite effects of the MirSNP in different types of cancer, so the use of CPMA seems more reasonable 10-13 . Notably, the amount of mature miR-146a produced from the C allele was reduced 1.8-fold compared with the G allele in papillary thyroid carcinoma, whereas miR-146a levels in the CC genotype were significantly increased compared with the GG genotype in gastric cancer 13 . The rs2910164 C allele was associated with a decreased risk of gastric cancer in the Asian population, a finding supported by Xu et al. 14 . The rs2910164 C allele was associated with an increased risk of bladder cancer in the Caucasian population. However, a study by Wang et al. indicated a reduced risk of bladder cancer for the rs2910164 C allele in an Asian population 15 . These results suggest that the rs2910164 polymorphism may have varying effects in different genetic backgrounds or ethnic groups, and/or during the pathogenesis of different types of cancer.
The rs6505162 SNP, located in pre-miR-423, 12 base pairs 5′ of miR-423-5p, was associated with cancer development based on CPMA. So far, most research on miR-423 has concentrated on expression analyses, in which aberrant expression of both mature forms of the miRNA has been seen in cancer, as well as during cellular differentiation 16-19 . Studies have shown that pre-miRNA SNPs can affect the production of mature forms and the binding of nuclear factors related to miRNA processing 20-22 . We hypothesize that rs6505162 might affect the expression or processing of miR-423; studies evaluating the effect of this SNP on miRNA function are therefore required. Studies of the rs6505162 polymorphism and cancer risk have yielded inconsistent results 23-25 . The first of these studies was conducted in 2009 on ESCC in a population of 346 Caucasian ESCC patients and suggested that the frequency of the rs6505162 C allele was significantly higher in cancer patients than in controls 23 . A study performed in 2012 indicated that the C genotype of rs6505162 reduces the risk of breast cancer development, whereas another study undertaken in 2009 suggested that the C genotype of rs6505162 conferred an increased risk of developing both ovarian and breast cancer in Breast Cancer Associated 2 (BRCA2) mutation carriers 26 . We observed an increased risk of bladder cancer for the rs6505162 C allele in the Caucasian population. To our knowledge, this is the first study to show a relation between this SNP and bladder cancer, so the finding needs further validation.
The C allele of rs2043556, located in miR-605, was marginally associated with the risk of developing cancer; this is the first study to associate this MirSNP with cancer development, and the finding needs to be validated by further studies. Stratified analysis revealed that the miR-605 C allele was associated with an increased risk of developing bladder cancer in the Caucasian population. A recent analysis of this SNP in gastrointestinal cancer among Asians produced data similar to our own, with the C allele being significantly less frequent in controls than in cancer patients 27 . miR-605 has been found to be an element of the p53 network, forming a positive feedback loop in response to stress 28 ; thus, miR-605 may play a key role in carcinogenesis. Investigating the association between this SNP and miRNA expression would be informative and might help to explain the relation between the SNP and cancer risk.
The C allele of the MirSNP rs895819, located in the terminal loop of pre-miR-27a, was associated with an increased risk of bladder cancer in the Caucasian population; this is the first study to address the association between this MirSNP and bladder cancer. MiR-27a has been investigated in several types of cancer, with inconsistent results. MiR-27a functions as a tumor suppressor in ESCC and hepatocellular carcinoma, whereas it acts as a promoting factor in gastric tumorigenesis 29,30 . We therefore assume that miR-27a plays pleiotropic signaling roles in regulating tumorigenesis. The MirSNP rs895819 was initially reported to be associated with a reduced risk of familial breast cancer (P = 0.0215) in a Caucasian population 31 ; however, no significant association of rs895819 with the risk of breast cancer was observed in a Chinese population 32 . A previous study suggested no association between rs895819 and the risk of colorectal cancer in a Central-European Caucasian population, a population with an extremely high incidence of sporadic colorectal cancer 33 ; this observation is supported by our results. Given the high probability that rs895819 is involved in carcinogenesis, these conflicting results may be due to the varying sample sizes analyzed, and warrant further analysis of larger cohorts to clearly establish the impact of rs895819 on the risk of cancer.
The rs11614913 polymorphism of miR-196a2 has a significant impact on the expression of miR-196a2 and is associated with carcinogenesis in various types of cancer 34,35 . Previous meta-analyses suggested a significant association between rs11614913 and the overall risk of cancer in the Asian population, which was inconsistent with our results 11,36,37 . Our study suggests that the rs11614913 C allele might protect against lung cancer in the Caucasian population, although the association was no longer significant after FDR adjustment (P = 0.197), whereas Tian et al. found the rs11614913 C allele to be associated with a significantly increased risk of lung cancer in Chinese 38 . This suggests that the effect of the rs11614913 polymorphism may depend on the genetic background or ethnicity of the patients and/or on environmental effects, in agreement with the reports of Chu et al. 37 and Wang et al. 36 . The effect of rs11614913 on the risk of different types of cancer needs to be confirmed in additional studies.
No significant associations were observed for the rs1834306, rs4919510, rs2292832 and rs3746444 polymorphisms in terms of the overall risk of cancer or the risk of specific types of cancer.
Although miR-100 has been shown to suppress the expression of proteins in the insulin-like growth factor (IGF)/mammalian target of rapamycin (mTOR) signaling cascade in childhood adrenocortical tumors 39 and clear cell ovarian cancer 40 , thus suppressing tumorigenesis, it acts as an oncogene in acute myeloid leukemia 41 . Our results showed that the miR-100 polymorphism, located in the pri-miR-100 region, had no relation with the risk of cancer. Rs4919510 lies within the mature miR-608 sequence, and is located at the junction between the stem and the canonical hairpin loop 42 . The rs4919510 G allele was related to an increased risk of bladder cancer, gastric cancer and prostate cancer; however, the associations were no longer significant after FDR adjustment and need to be validated by further studies. Rs2292832, located in pre-miR-149, was previously reported to have no significant associations with the risk of evaluated in breast cancer 43 48 , these existing studies have yielded contradictory results. These discrepancies may be due to the study of different populations from different areas and to variations in the selection of the case groups; therefore, the effect of the miR-499 rs3746444 polymorphism needs to be investigated further.
One limitation of the present study is the multiple-comparison problem arising from the number of MirSNPs tested; the FDR method was therefore used to correct for multiple testing.
Second, several MirSNPs were imputed rather than directly genotyped in this study. Although using imputed MirSNPs might lead to less accurate results, we ensured that only SNPs with high imputation confidence (> 95%) were included in further analysis.
Taken together, the findings of the present study have substantial scientific significance and may have implications in the clinical setting. Our results suggest that common MirSNPs may contribute to an individual's susceptibility to diverse types of cancer. Further functional characterization of MirSNPs and their influence on their target mRNAs may reveal the underlying mechanisms responsible for the associations between these polymorphisms and the etiology of cancer. Further prospective investigations of larger numbers of cases and controls are required in order to clarify the inconsistent associations between MirSNPs and the risk of cancer.

Methods
This study is based on an in silico re-analysis of human genotyping data downloaded from dbGaP (www.ncbi.nlm.nih.gov/gap). The data submitters obtained informed consent from each participant.
Imputation of the MirSNPs. The SNPs not present on the original chip were imputed with the program IMPUTE2, using both HapMap (NCBI Build 36 (db126b)) CEU data and 1000 Genomes data as the reference haplotype set. All SNPs showed high imputation confidence (> 95%). Rs2292832, rs2043556 and rs11614913 were directly genotyped; the others were imputed.
Association testing and adjustment for covariates. All association tests were performed in Plink v1.07 using additive logistic regression models. To account for potential population stratification or admixture in these samples, principal component analysis (PCA) was carried out using EIGENSTRAT 56 . After adjustment for the significant principal components (PCs) in each study, selected according to the leveling off of the PCA scree plot, comparison of the observed and expected distributions showed no evidence of large-scale inflation of the association test statistics, ruling out significant hidden population substructure. The principal component scores for each individual were included as covariates in all logistic regression models, along with gender and cohort. Multivariate logistic regression was performed in the R software package (http://www.r-project.org/). The FDR method was used to correct for multiple testing (FDR q < 0.05).
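As an illustration of the multiple-testing correction mentioned above, the following is a minimal sketch of FDR adjustment, assuming the Benjamini-Hochberg step-up procedure (the text says only "the FDR method", so the choice of procedure is an assumption). The function name `benjamini_hochberg` is hypothetical and is not part of Plink or R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order.

    Assumption: "the FDR method" is taken to be Benjamini-Hochberg.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    # Made-up P values, for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

A SNP would then be called significant when its q-value falls below 0.05, matching the threshold stated in the text.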
Resampling. To examine the robustness of the associations, we conducted a resampling analysis in accordance with Li et al. 57 . Using the association test described above, P values (P_random) were obtained by performing the test 1,000 times, each time on a randomly selected 70% of the population in the corresponding study. We then tested the null hypothesis P_random ≥ 0.05 (Supplementary Table S1).
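The resampling scheme can be sketched as follows. This is a hedged illustration, not the analysis pipeline used in the study: for self-containment it replaces the covariate-adjusted logistic regression with a simple 2x2 allelic chi-square test, and the names `allelic_p_value` and `resample_p_values` as well as the `(is_case, minor_allele_count)` data layout are assumptions.

```python
import math
import random


def allelic_p_value(samples):
    """P value of a 1-df allelic chi-square test.

    samples: list of (is_case, minor_allele_count), count in {0, 1, 2}.
    Stand-in for the logistic regression described in the text.
    """
    n_case = sum(1 for is_case, _ in samples if is_case)
    n_ctrl = len(samples) - n_case
    a = sum(g for is_case, g in samples if is_case)      # minor alleles, cases
    c = sum(g for is_case, g in samples if not is_case)  # minor alleles, controls
    b = 2 * n_case - a                                   # major alleles, cases
    d = 2 * n_ctrl - c                                   # major alleles, controls
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence of association
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def resample_p_values(samples, n_iter=1000, fraction=0.7, seed=0):
    """Repeat the test on random subsamples drawn without replacement."""
    rng = random.Random(seed)
    k = int(round(fraction * len(samples)))
    return [allelic_p_value(rng.sample(samples, k)) for _ in range(n_iter)]


if __name__ == "__main__":
    # Synthetic genotypes, for illustration only.
    rng = random.Random(42)
    data = [(True, rng.choice([0, 1, 1, 2])) for _ in range(300)]
    data += [(False, rng.choice([0, 0, 1, 2])) for _ in range(300)]
    p_random = resample_p_values(data)
    print(sum(p < 0.05 for p in p_random) / len(p_random))
```

The fraction of subsamples with P_random < 0.05 gives a rough indication of how stable an association is under subsampling.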
Statistical analysis. The associations of the nine MirSNPs with the risk of cancer were examined by meta-analysis using the inverse-variance method. We examined the association of the MirSNPs with the overall risk of cancer as measured by odds ratios (ORs) and 95% confidence intervals (CIs). Stratified analyses by type of cancer were also performed for each MirSNP. Heterogeneity between studies of different cancer types was evaluated using the chi-square-based Q test, with a P value for heterogeneity (Ph) < 0.05 considered significant. A fixed-effect model using the Mantel-Haenszel method and a random-effects model using the DerSimonian and Laird method were used to pool the data according to the cancer types and individual MirSNPs. The random-effects model was used when heterogeneity existed among the results of the studies; otherwise, the fixed-effect model was used.
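The pooling step can be illustrated with a short sketch that computes the inverse-variance fixed-effect estimate, Cochran's Q, and the DerSimonian and Laird random-effects estimate from per-study log odds ratios and standard errors. It is a simplified illustration rather than a reproduction of the Review Manager computation: in particular, the fixed-effect estimate here uses inverse-variance weights instead of Mantel-Haenszel weights, and the function name `pool_log_odds_ratios` is hypothetical.

```python
import math


def pool_log_odds_ratios(log_ors, ses):
    """Pool per-study log ORs; returns a dict of summary statistics."""
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum_w

    # Cochran's Q measures between-study heterogeneity (df = k - 1).
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1

    # DerSimonian-Laird moment estimate of between-study variance.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_re = sum(re_weights)
    rand = sum(w * y for w, y in zip(re_weights, log_ors)) / sum_re

    def summary(est, se):
        return {
            "or": math.exp(est),
            "ci95": (math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)),
        }

    return {
        "q": q,
        "df": df,
        "tau2": tau2,
        "fixed": summary(fixed, math.sqrt(1.0 / sum_w)),
        "random": summary(rand, math.sqrt(1.0 / sum_re)),
    }


if __name__ == "__main__":
    # Made-up log ORs and standard errors, for illustration only.
    print(pool_log_odds_ratios([0.10, 0.25, -0.05], [0.08, 0.10, 0.12]))
```

Following the rule stated above, the random-effects summary would be reported when the Q test indicates heterogeneity, and the fixed-effect summary otherwise.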
Additionally, cross phenotype meta-analysis (CPMA) was performed to determine the associations of the MirSNPs with the overall cancer risk; P < 0.05 after FDR adjustment was considered significant. The CPMA statistic assesses the evidence for the hypothesis that a single SNP has multiple phenotypic associations. The CPMA statistic is agnostic to the direction of effect in each disease. It has one degree of freedom because it measures a deviation in P value behavior instead of testing all possible combinations of diseases for association with each SNP, and therefore provides high power to reject the null hypothesis 58,59 .
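For intuition, a CPMA-type statistic can be sketched as a likelihood-ratio test on the per-phenotype P values of one SNP. The sketch assumes one commonly described formulation, in which −ln(P) is exponentially distributed with rate 1 under the null and the alternative allows any other rate, giving a statistic with one degree of freedom; the exact formulation used in the study is not spelled out in the text, and the function name `cpma` is hypothetical.

```python
import math


def cpma(p_values):
    """Return (statistic, estimated rate, P value) for one SNP.

    p_values: association P values of the SNP across phenotypes,
    each in (0, 1]. Assumed model: -ln(P) ~ Exponential(rate).
    """
    if not all(0.0 < p <= 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    x = [-math.log(p) for p in p_values]
    n, total = len(x), sum(x)
    if total == 0.0:
        raise ValueError("at least one P value must be below 1")

    rate_hat = n / total  # maximum-likelihood estimate of the rate
    # Exponential log-likelihood: n * ln(rate) - rate * sum(x).
    loglik_null = -total                    # rate fixed at 1
    loglik_alt = n * math.log(rate_hat) - n
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))

    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, rate_hat, p_value


if __name__ == "__main__":
    # Made-up P values across eight phenotypes, for illustration only.
    print(cpma([0.003, 0.2, 0.04, 0.6, 0.01, 0.35, 0.08, 0.02]))
```

Because only the P values enter the statistic, the direction of effect in each phenotype plays no role, which is the property the text highlights.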
All statistical tests for the meta-analysis were performed with Review Manager version 5.2 (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark). Begg's test was used to evaluate publication bias.