Analyzing large-scale samples confirms the association between rs16892766 polymorphism and colorectal cancer susceptibility

Colorectal cancer (CRC) is a common complex disease caused by the combination of genetic variants and environmental factors. Genome-wide association studies (GWAS) have been performed and reported some novel CRC susceptibility variants. The rs16892766 (8q23.3) polymorphism was first identified to be significantly associated with CRC in European ancestry. The following studies investigated this association in Chinese, Japanese, Romanian, Swedish, African American, European American, and Croatian populations. These studies reported consistent and inconsistent results. Here, we reevaluated this association using the relatively large-scale samples from 13 studies (N = 59737, 26237 cases and 33500 controls) using a meta-analysis by searching the PubMed, Google Scholar and CRCgene databases. We observed no significant heterogeneity among the included studies. Our results showed significant association between rs16892766 polymorphism and CRC (P = 1.33E-35, OR = 1.23, 95% CI 1.20-1.27). Collectively, our analysis further supports previous findings that the rs16892766 polymorphism is significantly associated with CRC susceptibility. We believe that our findings will be very useful for future genetic studies on CRC.

Colorectal cancer (CRC) is a common complex disease caused by the combination of genetic variants and environmental factors. Genome-wide association studies (GWAS) have been performed and reported some novel CRC susceptibility variants. The rs16892766 (8q23.3) polymorphism was first identified to be significantly associated with CRC in European ancestry. The following studies investigated this association in Chinese, Japanese, Romanian, Swedish, African American, European American, and Croatian populations. These studies reported consistent and inconsistent results. Here, we reevaluated this association using the relatively large-scale samples from 13 studies (N 5 59737, 26237 cases and 33500 controls) using a meta-analysis by searching the PubMed, Google Scholar and CRCgene databases. We observed no significant heterogeneity among the included studies. Our results showed significant association between rs16892766 polymorphism and CRC (P 5 1.33E-35, OR 5 1.23, 95% CI 1. 20-1.27). Collectively, our analysis further supports previous findings that the rs16892766 polymorphism is significantly associated with CRC susceptibility. We believe that our findings will be very useful for future genetic studies on CRC.
C olorectal cancer (CRC), also called colon cancer or large bowel cancer, is the third most common form of cancer and the second leading cause of cancer-related death in the Western world and its lifetime risk in the United States is about 7% 1 . CRC is a common complex disease caused by the combination of genetic variants and environmental factors 1 . Genome-wide association studies (GWAS) are considered to be a new and power approach to detect the genetic variants of human complex diseases. Recently, GWAS have been performed and reported some novel CRC susceptibility variants 2-6 .
Recent studies investigated the influence of rs16892766 in Lynch syndrome. Wijnen  Registry whose mutation carrier status was known 17 . They found a significant association between CRC risk and rs16892766 (8q23.3). The possession of the C-allele was associated with an elevated risk of CRC in a dose-dependent fashion, with homozygosity for CC being associated with a 2.16-fold increased risk 17 . Talseth-Palmer et al. investigated whether the rs16892766 (8q23.3) acts as modifier of disease risk in patients with Lynch syndrome using 684 mutationpositive patients with Lynch syndrome from 298 Australian and Polish families 18 . They identified an association between rs16892766 on chromosome 8q23.3 and the risk of developing CRC and age of diagnosis was found in MLH1 mutation carriers 18 .
It is reported that meta-analysis method involves combining and analyzing quantitative evidence from related studies to produce results based on a whole body of research 19 . It is a quantitative, formal, epidemiological study design used to systematically assess previous research studies to derive conclusions about that body of research 20 . The motivation of a meta-analysis is to aggregate information in order to achieve a higher statistical power. Considering the important role of rs16892766 polymorphism in CRC risk and inconsistent results reported by previous studies, we reevaluated this association using the relatively large-scale samples from 13 studies (N 5 59737, 26237 cases and 33500 controls) using meta-analysis method by searching the PubMed, Google Scholar and CRCgene databases 21 .

Methods
Literature search. We searched the PubMed database to select all possible studies with key words including 'rs16892766' and 'colorectal cancer' or '8q23.3' and 'colorectal cancer'. The literature search was updated on June 5, 2014. Meanwhile, we used the Google Scholar (http://scholar.google.com/) to query the articles citing the studies and all references in these studies identified by the PubMed. We selected only published articles written in English. Theodoratou et al. report the first comprehensive field synopsis and creation of a parallel publicly available and regularly updated database (CRCgene) that catalogs all genetic association studies on colorectal cancer (http://www.chs.med.ed.ac.uk/CRCgene/) 21 . They carried out metaanalyses to derive summary effect estimates for 92 polymorphisms in 64 different genes.
Inclusion criteria. We selected the studies meeting the following criteria: (1) the study was conducted by a case-control design; (2) the study evaluated the association between rs16892766 polymorphism and CRC; (3) the study provided the numbers of rs16892766 genotypes or (4) the study must provided sufficient data to calculate the numbers of rs16892766 genotypes or (5) the study provided an OR with 95% CI as well as the P value; or (6) the study must provided sufficient data to calculate the OR and 95% CI; Data extraction. We extracted the following information from each study: (1) the name of the first author; (2) the year of publication; (3) the population and ethnicity; (4) the numbers of AD cases and controls; (5) the genotype numbers of rs16892766 polymorphism in cases and controls; (6) the numbers of rs16892766 genotypes or (7) to calculate the numbers of rs16892766 genotypes; (8) the OR with 95% CI or (9) to calculate the OR and 95% CI; All relevant calculations were completed using the program R (http://www.r-project.org/). Genetic model. The rs16892766 polymorphism has two alleles including C and A. C is the minor allele. We assume that C is the high-risk allele and A is the lower-risk allele. We selected the additive genetic model for further meta-analysis. The additive model can be described as C allele versus A allele 22 .
Heterogeneity test. We evaluated the genetic heterogeneity among the studies included using Cochran's Q test, which approximately follows a X 2 distribution with k-1 degrees of freedom (k stands for the number of studies for analysis).
, which ranges from 0 to 100%, was also used 23 . I 2 is a measure of heterogeneity and a statistic that indicates the percentage of variance in a meta-analysis that is attributable to study heterogeneity 24 . Low, moderate, large and extreme heterogeneity corresponded to 0-25%, 25-50%, 50-75% and 75-100%, respectively 23 . The significant levels for heterogeneity are defied to be with P , 0.01 and I 2 . 50%.

Meta-analysis.
If there is no significant heterogeneity among the included studies, the pooled OR is calculated by the fixed effect model (Mantel-Haenszel), otherwise the OR is calculated by random-effect model (DerSimonian-Laird). Z test is used to determine the significance of OR. All statistical tests for heterogeneity and metaanalysis were computed using R Package (http://cran.r-project.org/web/packages/ meta/index.html).
Sensitivity and publication bias analyses. We evaluated the relative influence of each study by omitting each study at a time. Meanwhile, we used funnel plots to evaluate the potential publication bias 25 . Begg and Egger's tests were used to evaluate the asymmetry of the funnel plot 25 .

Results
Literature search. We selected 41 articles from PubMed and Google Scholar databases, and two articles from the CRCgene database. Finally, 9 articles including 13 independent studies were included for our following analysis. More detailed information about the inclusion or exclusion of selected studies was described in Figure 1.
The main characteristics of the included studies are described in Table 1, which included the name of the first author, the year of publication, the population or ethnicity, the numbers of AD cases and controls, and the OR with 95% CI.
Heterogeneity test. We evaluated the genetic heterogeneity of rs16892766 polymorphism among the selected studies using additive model and I 2~Q { k{1 ð Þ ð Þ = Q |100% as well as P value.
We did not identify significant heterogeneity among these 13 studies using additive model (P 5 0.8239 and I 2 5 0%).
Meta-analysis. As described above, we identified no significant heterogeneity among these 13 studies. We then performed a metaanalysis. We calculated the overall OR by the fixed effect model. Our results showed significant association between rs16892766 polymorphism and CRC using additive model (P 5 1.33E-35, OR 5 1.23, 95% CI 1.20-1.27). In Figure 2, for each study, we list the name of the first author, the year of publication, the population or ethnicity, the OR with 95% CI and the weight in meta-analysis. Detailed results are described in Figure 2.
Sensitivity analysis and publication bias analysis. By excluding any one study, we identified that the association between rs16892766 polymorphism and CRC did not vary substantially. The funnel plots are symmetrical inverted funnels for models (Figure 3), which suggest no significant publication bias for the additive model (Begg's test, P 5 0.2206 and Egger's test, P 5 0.2206).

Discussion
Recent GWAS identified rs16892766 (8q23.3) polymorphism to be significantly associated with CRC in European ancestry 6 . The following studies investigated this association and reported consistent and inconsistent results. It is important to assess the genetic architecture of rs999737 polymorphism across different populations. Here, we reevaluated this association using the relatively large-scale samples from 13 studies by searching the PubMed, Google Scholar and CRCgene databases. We first evaluated the genetic heterogeneity of rs16892766 polymorphism among the selected studies. We did not identify significant heterogeneity among these 13 studies using additive model (P 5 0.8239 and I 2 5 0%). We then conducted a meta-analysis using fixed effect model. Our results showed significant association between rs16892766 polymorphism and CRC using additive model (P 5 1.33E-35, OR 5 1.23, 95% CI 1.20-1.27). Collectively, our analysis further supports previous findings that the rs16892766 polymorphism is significantly associated with CRC susceptibility. We believe that our findings will be very useful for future genetic studies on CRC. Before our submission, we accessed the PubMed and Google Scholar databases using the key words 'rs16892766' and 'meta'. We identified two articles 8,21 . Hutter et al. examined potential effectmodification between 10 loci and probable or established environmental risk factors for CRC in 7,016 CRC cases and 9,723 controls from nine cohort and case-control studies 8 . They used meta-analysis of an efficient empirical-Bayes estimator to detect potential multiplicative interactions between each of the SNPs and select major CRC risk factors 8 . The strongest statistical evidence for a gene-envir-www.nature.com/scientificreports SCIENTIFIC REPORTS | 5 : 7957 | DOI: 10.1038/srep07957 onment interaction across studies was for vegetable consumption and rs16892766, located on chromosome 8q23.3, near the EIF3H and UTP23 genes 8 . Theodoratou et al. carried out meta-analyses to derive summary effect estimates for 92 polymorphisms in 64 different genes and constructed the CRCgene database (http://www.chs. med.ed.ac.uk/CRCgene/) 21 .
Our study is different from previous studies 8, 21 . Hutter et al. investigated the gene-environment interaction between each of the SNPs and select major CRC risk factors 8 . We accessed CRCgene databases and found two articles including three studies investigating rs16892766 polymorphism. Here, we conducted an updated analysis to reevaluate the association between rs16892766 polymorphism and CRC using the relatively large-scale samples by searching the PubMed and Google Scholar databases. We observed no significant heterogeneity among the included studies. Our results from this meta-analysis are consistent with the findings from CRC GWAS.  (1) was conducted by a case-control design; (2) evaluated the association between rs16892766 polymorphism and CRC; (3) provided the numbers of rs16892766 genotypes or (4) must provided sufficient data to calculate the numbers of rs16892766 genotypes or (5) provided an OR with 95% CI as well as the P value; or (6) must provided sufficient data to calculate the OR and 95% CI; OR, odds ratio; CI, confidence interval.  26 . Four most significantly associated SNPs-rs16892766, Novel 28, rs16888589 and rs11986063 are strongly correlated with one another (pairwise r2 . 0.75) and constitute a single risk haplotype 26 . Reporter gene studies demonstrated that the rs16888589, which was in high LD with rs16892766, acts as an allele-specific transcriptional repressor 26 . Chromosome conformation capture analysis showed that the genomic region harboring rs16888589 interacts with the promoter of gene for eukaryotic translation initiation factor 3, subunit H (EIF3H) 26 . EIF3H is located at 8q23 and identified to be a CRC susceptibility gene by previous GWAS 6,27 . Increased expression of EIF3H gene increases CRC growth and invasiveness thereby providing a biological mechanism for the 8q23.3 association 26 .
Despite these interesting results, our study has a limitation. Here, we investigated the association between rs16892766 and CRC using additive model. It is reported that most meta-analyses used an additive genetic model 28 . In general, this model performs well when the true underlying genetic model is uncertain 28 . It was also important to analyze the association between rs16892766 and CRC using dominant model (CC1CA versus AA) and recessive model (CC versus were included for meta-analysis. The heterogeneity among these 13 studies was evaluated by I 2~Q{ k{1 ð Þ ð Þ = Q |100% as well as P value. For each study, we list the name of the first author, the year of publication, the population or ethnicity, the OR with 95% CI and the weight in metaanalysis. For the meta-analysis, the overall OR was calculated by the fixed effect model. OR, odds ratio; CI, confidence interval; fixed, fixed effect model.  CA1AA) 22 . Exact genotype numbers of all studies used in our analysis are required for the dominant and recessive models. We attempted to obtain these genotype numbers but were not successful. Considering that the original genotype data are not publicly available for us, future replication studies using genotype data are required to replicate our findings.