Published: 18 May 2014

Large-scale genetic study in East Asians identifies six new loci associated with colorectal cancer risk

Ben Zhang¹,
Wei-Hua Jia²,
Koichi Matsuda³ (ORCID: orcid.org/0000-0001-7292-2686),
Sun-Seog Kweon^4,5,
Keitaro Matsuo⁶,
Yong-Bing Xiang⁷,
Aesun Shin^8,9,
Sun Ha Jee¹⁰,
Dong-Hyun Kim¹¹,
Qiuyin Cai¹,
Jirong Long¹,
Jiajun Shi¹ (ORCID: orcid.org/0000-0001-5194-0009),
Wanqing Wen¹,
Gong Yang¹,
Yanfeng Zhang¹,
Chun Li¹²,
Bingshan Li¹³,
Yan Guo¹⁴,
Zefang Ren¹⁵,
Bu-Tian Ji¹⁶,
Zhi-Zhong Pan²,
Atsushi Takahashi¹⁷,
Min-Ho Shin⁴,
Fumihiko Matsuda¹⁸,
Yu-Tang Gao⁷,
Jae Hwan Oh¹⁹,
Soriul Kim¹⁰,
Yoon-Ok Ahn⁹,
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO),
Andrew T Chan^20,21,
Jenny Chang-Claude²²,
Martha L Slattery²³,
Colorectal Transdisciplinary (CORECT) Study,
Stephen B Gruber²⁴,
Fredrick R Schumacher²⁴,
Stephanie L Stenzel²⁴,
Colon Cancer Family Registry (CCFR),
Graham Casey²⁴,
Hyeong-Rok Kim²⁵,
Jin-Young Jeong¹¹,
Ji Won Park^19,26,
Hong-Lan Li⁷,
Satoyo Hosono⁶,
Sang-Hee Cho²⁷,
Michiaki Kubo¹⁷,
Xiao-Ou Shu¹,
Yi-Xin Zeng² &
…
Wei Zheng¹

Nature Genetics volume 46, pages 533–542 (2014)

Abstract

Known genetic loci explain only a small proportion of the familial relative risk of colorectal cancer (CRC). We conducted a genome-wide association study of CRC in East Asians with 14,963 cases and 31,945 controls and identified 6 new loci associated with CRC risk (P = 3.42 × 10⁻⁸ to 9.22 × 10⁻²¹) at 10q22.3, 10q25.2, 11q12.2, 12p13.31, 17p13.3 and 19q13.2. Two of these loci map to genes (TCF7L2 and TGFB1) with established roles in colorectal tumorigenesis. Four other loci are located in or near genes involved in transcriptional regulation (ZMIZ1), genome maintenance (FEN1), fatty acid metabolism (FADS1 and FADS2), cancer cell motility and metastasis (CD9), and cell growth and differentiation (NXN). We also found suggestive evidence for three additional loci associated with CRC risk near genome-wide significance at 8q24.11, 10q21.1 and 10q24.2. Furthermore, we replicated 22 previously reported CRC-associated loci. Our study provides insights into the genetic basis of CRC and suggests the involvement of new biological pathways.

Main

CRC is a leading cause of cancer morbidity and mortality worldwide¹. It is well established that genetic factors have an important role in the etiology of CRC^2,3. Deleterious germline mutations in known susceptibility genes, notably APC (adenomatous polyposis coli), MLH1, MSH2, MSH6 and PMS2, confer high risk of CRC in hereditary cancer syndromes^3,4,5,6. Most sporadic CRC cases, however, do not carry these high-penetrance mutations^3,4. Since 2007, genome-wide association studies (GWAS) and subsequent fine-mapping analyses conducted in individuals of European descent have identified 21 low-penetrance susceptibility loci associated with CRC risk^{7,8,9,10,11,12,13,14,15,16,17}. Together, these common loci explain less than 10% of the familial relative risk of CRC in European populations^13,14. In a GWAS of 7,456 CRC cases and 11,671 controls conducted as part of the Asia Colorectal Cancer Consortium, we identified 3 new loci at 5q31.1 (near PITX1), 12p13.32 (near CCND2) and 20p12.3 (near HAO1) associated with CRC risk¹⁸. In addition, we discovered a new risk variant in the SMAD7 gene associated with CRC in East Asians¹⁹. Over the past 2 years, we have doubled the sample size in the Asia Colorectal Cancer Consortium and conducted a 4-stage GWAS, including 14,963 CRC cases and 31,945 controls, to identify additional susceptibility loci for CRC.

Results

Study overview

We performed a fixed-effects meta-analysis to evaluate approximately 2.4 million genotyped or imputed SNPs on 22 autosomes from 5 GWAS (stage 1) conducted in China, Japan and South Korea, including in total 2,098 CRC cases and 6,172 cancer-free controls (Supplementary Tables 1 and 2). There was little evidence of population stratification in these studies (Supplementary Figs. 1 and 2), with genomic inflation factor λ < 1.04 in each of the five studies and the meta-analysis (λ_1,000 = 1.01). We selected 8,539 SNPs showing evidence of association with CRC risk (P < 0.05) according to prespecified criteria (Online Methods). We also included the 31 risk-associated variants identified by previous GWAS^{7,8,9,10,11,12,13,14,15,16,17,18,19,20}, resulting in a total of 8,570 SNPs. Of these, 7,113 SNPs were successfully designed using Illumina Infinium assays as part of a large genotyping effort for multiple projects. Using this customized array, we genotyped an independent set of 3,632 CRC cases and 6,404 controls recruited in 3 studies (stage 2) conducted in China. After quality control exclusions, 6,899 SNPs remained for analysis in 3,519 cases and 6,275 controls. We evaluated associations between CRC risk and these SNPs in each study separately and then performed a fixed-effects meta-analysis to obtain summary estimates. Again, we observed little evidence of population stratification, either in the three studies individually (λ < 1.05) or combined (λ = 1.05, λ_1,000 = 1.01) (Supplementary Fig. 3). In a meta-analysis of data from stages 1 and 2, we identified 559 SNPs showing evidence of association at P < 0.005. We then evaluated these SNPs using data from a large Japanese CRC GWAS (stage 3) with 2,814 CRC cases and 11,358 controls²⁰. Thirty SNPs in 25 new loci were associated with CRC risk at P < 0.0001 in the meta-analysis of data from stages 1–3 and at P < 0.01 in the meta-analysis of stages 2 and 3. 
Of these SNPs, 29 were successfully genotyped in an independent sample of 6,532 CRC cases and 8,140 controls from 5 additional studies (stage 4) conducted in China, South Korea and Japan.
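
As a rough illustration of the two calculations described above, the following Python sketch pools per-study log odds ratios with inverse-variance weights and rescales a genomic inflation factor to 1,000 cases and 1,000 controls. The function names are hypothetical, and inverse-variance weighting is assumed here as one common fixed-effects scheme; the passage above does not state which weighting or software was used.

```python
import math

def fixed_effects_meta(betas, ses):
    """Pool per-study log odds ratios using inverse-variance weights (assumed scheme)."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    # Two-sided P value from the standard normal; erfc avoids underflow for large |z|.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

def lambda_1000(lam, n_cases, n_controls):
    """Rescale an inflation factor to a study of 1,000 cases and 1,000 controls."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale

# Stage 2 values quoted in the text: lambda = 1.05 with 3,519 cases and 6,275 controls.
stage2_lambda_1000 = lambda_1000(1.05, 3519, 6275)
```

With the stage 2 values quoted above, the rescaled factor comes to approximately 1.01, in line with the reported λ_1,000.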

Newly identified risk-associated loci for CRC

In the meta-analysis of all data for the 29 SNPs from stages 1–4 with 14,963 CRC cases and 31,945 controls, signals from 10 SNPs, representing 6 new loci, showed convincing evidence of an association with CRC risk at the genome-wide significance level (P < 5 × 10⁻⁸), including rs704017 at 10q22.3; rs11196172 at 10q25.2; rs174537, rs4246215, rs174550 and rs1535 at 11q12.2; rs10849432 at 12p13.31; rs12603526 at 17p13.3; and rs1800469 and rs2241714 at 19q13.2 (Table 1, Supplementary Fig. 4 and Supplementary Tables 3 and 4). Associations of CRC risk with the top SNPs in each of the six loci were consistent across almost all studies, with no evidence of heterogeneity (Fig. 1). With the exception of the intergenic SNP rs10849432 at 12p13.31, the remaining nine newly identified risk-associated variants were located in exonic, promoter, 3′ UTR or intronic regions of known genes (Table 1). The linkage disequilibrium (LD) blocks (r² > 0.5) tagged by rs704017 (10q22.3), rs174537 (11q12.2) and rs1800469 (19q13.2) each span multiple genes (Supplementary Table 5). The LD blocks tagged by rs11196172 (10q25.2) and rs12603526 (17p13.3) each lie within a single gene. The LD block tagged by rs10849432 (12p13.31) does not contain any known gene. Stratification analyses of the newly identified risk variants by tumor anatomical site (colon or rectum), population (Chinese, Korean or Japanese) and sex (male or female) did not identify any significant heterogeneity (Supplementary Tables 6, 7, 8). In addition to the six newly identified loci, three additional regions showed association with CRC risk near genome-wide significance at 8q24.11 (rs6469656; P = 5.38 × 10⁻⁸), 10q21.1 (rs4948317; P = 7.14 × 10⁻⁸) and 10q24.2 (rs12412391; P = 7.41 × 10⁻⁷). Results for all 29 SNPs across stages 1–4 are presented in Supplementary Table 3.

Table 1 Summary results for risk variants in the six newly identified loci associated with CRC in East Asians

**Figure 1: Forest plots for risk-associated variants in the six newly identified loci.**

We performed conditional analyses for SNPs located within a 1-Mb region centered on the index SNP in each of the six newly identified loci. No second association signal was identified at P < 0.01 after adjusting for the respective index SNP (data not shown). Four SNPs at 11q12.2 and 2 SNPs at 19q13.2 showed association with CRC risk at P < 5 × 10⁻⁸, and we thus performed haplotype analysis for these 2 loci using genotype data available for 10,051 CRC cases and 14,415 controls (stages 2 and 4). Two common haplotypes were found in the 11q12.2 locus, accounting for more than 99% of the haplotypes constructed using the four highly correlated SNPs. The haplotype with all four risk-associated alleles (frequency = 0.574 in controls) was strongly associated with CRC risk (odds ratio (OR) = 1.40, 95% confidence interval (CI) = 1.29–1.51; P = 3.69 × 10⁻¹⁶) (Supplementary Table 9). Similarly, we identified two common haplotypes at the 19q13.2 locus, accounting for more than 99% of the haplotypes constructed using the two highly correlated SNPs. The haplotype with the risk-associated allele at both SNPs (frequency = 0.485 in controls) was also associated with increased risk of CRC (OR = 1.16, 95% CI = 1.08–1.26; P = 1.18 × 10⁻⁴) (Supplementary Table 10). Overall, these analyses did not identify an independent signal in any of the six newly identified loci.
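
The relationship between the odds ratios, confidence intervals and P values reported for the haplotypes can be sketched as follows. This is a minimal illustration that assumes a Wald-type interval on the log odds ratio scale and a two-sided normal test; the function name is hypothetical and the haplotype analysis software used in the study is not implied.

```python
import math
from statistics import NormalDist

def p_from_or_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover the standard error, z score and two-sided P from an OR and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    # The CI is assumed symmetric on the log scale: log(OR) +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Values quoted in the text for the 11q12.2 and 19q13.2 risk haplotypes.
se_11q, z_11q, p_11q = p_from_or_ci(1.40, 1.29, 1.51)
se_19q, z_19q, p_19q = p_from_or_ci(1.16, 1.08, 1.26)
```

Because the published ORs and CIs are rounded, P values recomputed this way agree with the reported ones only approximately.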

We examined potential SNP-SNP interactions between the 6 new risk-associated variants identified in this study (rs704017, rs11196172, rs174537, rs10849432, rs12603526 and rs1800469) and also between these 6 SNPs and the risk-associated variants in 25 previously reported loci (Supplementary Table 11). Multiplicative interactions were found with suggestive evidence of association (P < 0.05) for seven pairs of SNPs. None of these interactions, however, remained statistically significant after correcting for multiple comparisons in 180 tests (adjusted P = 0.00028).
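
The multiple-comparison threshold quoted above follows from a Bonferroni correction, which can be written out as a short sketch. The function names are hypothetical, and a family-wise error rate of 0.05 is assumed because it reproduces the stated threshold.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def passes_correction(p_value, alpha, n_tests):
    """True if a P value remains significant after the correction."""
    return p_value < bonferroni_threshold(alpha, n_tests)

# 180 interaction tests at an assumed family-wise alpha of 0.05.
interaction_threshold = bonferroni_threshold(0.05, 180)
```

A pairwise interaction with a nominal P of 0.01, for example, would meet the suggestive level of 0.05 but not this corrected threshold.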

We evaluated associations of the 10 newly identified SNPs with CRC risk in individuals of European descent using data from 3 consortia, the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO)¹⁷, the Colorectal Transdisciplinary (CORECT) Study and the Colon Cancer Family Registry (CCFR)²¹, with a total sample size of 16,984 CRC cases and 18,262 controls (Supplementary Table 12). In a meta-analysis of data from these consortia, all ten SNPs showed association with CRC risk in the same direction as observed in East Asians (Table 2). Five SNPs in two loci (10q22.3 and 11q12.2) were associated with CRC risk at P < 0.008 (corrected for multiple comparisons of six loci). These associations in individuals of European descent, however, were weaker than in East Asians. Tests showed statistically significant evidence of heterogeneity for risk variants at 11q12.2 and 19q13.2 (P < 0.008). The frequency of the risk-associated allele was also considerably different in East Asians and individuals of European ancestry for SNPs in five loci (Supplementary Table 13). For example, the minor allele (C) of rs12603526 is common in East Asians, whereas the minor allele frequency (MAF) is <0.02 in individuals of European descent. These differences might in part reflect distinct patterns of LD between the index SNPs and causal SNPs in these two populations. As expected, LD patterns for most of the newly identified loci were considerably different in East Asians and individuals of European descent (Supplementary Fig. 5). Large-scale fine-mapping of these loci will be helpful in identifying causal variants.
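
A heterogeneity test between two ancestry groups, such as the one referred to above, can be sketched with Cochran's Q statistic, which for two estimates reduces to a single squared difference with one degree of freedom. The input values below are hypothetical placeholders, not estimates from this study, and the specific test used by the authors is not stated in this passage.

```python
import math

def heterogeneity_two_groups(beta_a, se_a, beta_b, se_b):
    """Cochran's Q for two log odds ratios (1 degree of freedom) and its P value."""
    q = (beta_a - beta_b) ** 2 / (se_a ** 2 + se_b ** 2)
    # For 1 degree of freedom the chi-square tail equals erfc(sqrt(q / 2)).
    p = math.erfc(math.sqrt(q / 2.0))
    return q, p

# Hypothetical per-allele log odds ratios and standard errors for two populations.
q_stat, p_het = heterogeneity_two_groups(0.15, 0.02, 0.03, 0.02)
```

Under these placeholder inputs the difference in effect size would be declared heterogeneous at the P < 0.008 level used in the text.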

Table 2 Associations of risk variants in the six newly identified loci with CRC in individuals of European descent

Putative functional variants and candidate genes

We evaluated and annotated putative functional variants and candidate genes in each of the six newly identified loci using data from the 1000 Genomes Project²², HapMap 2 (ref. 23), the Encyclopedia of DNA Elements (ENCODE)²⁴, expression quantitative trait locus (eQTL) databases^25,26,27,28, the Catalogue of Somatic Mutations in Cancer (COSMIC)²⁹, The Cancer Genome Atlas (TCGA) CRC project³⁰, the Expression Atlas³¹, PubMed and Online Mendelian Inheritance in Man (OMIM) (Online Methods). We summarize the results below for each locus.

At the 10q25.2 locus, rs11196172 is located in intron 4 of the TCF7L2 gene. This SNP and other correlated SNPs (r² > 0.5) fall within a region with strong enhancer activity and a DNase I hypersensitivity site annotated by ENCODE (Supplementary Table 14), suggesting a potential functional role for these SNPs. We found that the risk-associated allele of rs11196172 was significantly associated with higher expression of the TCF7L2 gene (P = 0.003) in colon tumor tissue using TCGA data (Fig. 2). The TCF7L2 gene encodes TCF7L2 (previously known as TCF4), which is a key transcription factor in the Wnt signaling pathway. Aberrant activation of Wnt signaling is found in more than 90% of CRCs³⁰, and TCF7L2 is a known tumor suppressor in CRC. Loss of TCF7L2 function enhances CRC cell growth, whereas gain of function suppresses CRC cell growth^32,33. The TCF7L2 gene is one of the most frequently mutated genes in CRC, with estimated point mutation rates of approximately 8–12.5% (refs. 29,30). Although TCF7L2 is the only gene in this locus (Supplementary Fig. 4), we also found that the risk-associated allele of rs11196172 was significantly associated with higher expression of the VTI1A gene (P = 5.1 × 10⁻⁴) in colon tumor tissue (Fig. 2). The VTI1A gene is located approximately 131 kb upstream of the TCF7L2 gene, and mRNA levels for these two genes are highly correlated in colon tumor tissue (r = 0.71; P < 0.0001). Recently, a recurrent gene fusion connecting the first three exons of VTI1A to the fourth exon of TCF7L2 was identified in approximately 3% of colorectal tumors³⁴. It is possible that the VTI1A gene might also be involved in the association between rs11196172 and CRC risk.

**Figure 2: Association of selected risk variants identified in this study with gene expression in colon tumor tissue.**

At the 19q13.2 locus, we identified two perfectly correlated SNPs (rs1800469 and rs2241714; r² = 1) associated with CRC risk. Of these, rs1800469 has previously been investigated with respect to CRC risk in many small candidate gene association studies, with conflicting results⁵. We herein provide for the first time, to our knowledge, convincing evidence of association for rs1800469 through our GWAS analysis. SNP rs1800469 maps to the promoter of the TGFB1 gene, and rs2241714 is a nonsynonymous SNP that results in an amino acid substitution at residue 11 of the B9D2 protein. The A allele of rs1800469 has been related to higher transcriptional activity for the TGFB1 gene and higher circulating levels of the transforming growth factor (TGF)-β1 protein than the G allele³⁵. Both rs1800469 and rs2241714 are in perfect LD with another nonsynonymous SNP, rs1800470, which causes a proline-to-leucine substitution at residue 10 of the TGF-β1 protein. Although the two nonsynonymous SNPs are predicted to be tolerated³⁶ or benign³⁷, the Pro10 variant encoded by rs1800470 has also been associated with an increase in TGFB1 gene expression, TGF-β1 protein secretion and circulating levels of TGF-β1 protein^38,39,40. Whereas rs2241714 is an eQTL for TGFB1, both rs1800469 and rs2241714 are also eQTLs for other genes in this locus (Supplementary Table 15). In addition to these three SNPs, we suggest that many highly correlated SNPs located in the TGFB1 gene might potentially have regulatory functions (Supplementary Table 14). The TGF-β1 protein is a member of the TGF-β signaling pathway. Somatic alterations of certain components in this pathway (TGFBR2, SMAD2, SMAD3 and SMAD4) are estimated to be present in almost half of CRCs⁴¹. High-penetrance germline mutations in the SMAD4 gene are known to cause juvenile polyposis, an autosomal dominant polyposis syndrome linked to a high risk of CRC⁴². 
Germline, allele-specific expression of the TGFBR1 gene has also been shown to contribute to increased risk of CRC⁴³. Thus far, GWAS have identified at least six other independent SNPs that are located in or proximal to genes in the TGF-β signaling pathway (SMAD7, GREM1, BMP2, BMP4 and RHPN2)^9,10,13,19. Our finding of an association between a genetic variant in the TGFB1 gene and CRC risk adds further evidence for the critical role of this pathway in colorectal tumorigenesis.

At the 11q12.2 locus, the four perfectly correlated SNPs rs174537, rs4246215, rs174550 and rs1535 lie in intron 24 of MYRF, the 3′ UTR of FEN1, intron 7 of FADS1 and intron 1 of FADS2, respectively. Of these SNPs, rs4246215 is an eQTL for the FEN1 gene in normal colorectal tissue⁴⁴ and is predicted to affect microRNA (miRNA) binding site activity⁴⁵. SNP rs174537 is an eQTL for the FADS1 and FADS2 genes in whole blood and other types of tissue (Supplementary Table 15). Using data from TCGA, we identified a strong correlation of rs1535 genotypes with FADS2 gene expression (P = 1.4 × 10⁻⁵) in colon tumor tissue (Fig. 2). These findings suggest that the potential functions of these SNPs might be mediated through their effects on their host genes. We also found that the FEN1, FADS1 and FADS2 genes are all highly expressed in colon tumor tissue compared with normal colon tissue (Supplementary Table 16). The FEN1 gene encodes flap structure–specific endonuclease 1, a protein that is essential for DNA repair, replication and degradation and that has a critical role in maintaining genome stability and protecting against carcinogenesis⁴⁶. FEN1 mutations have been found in several human cancers⁴⁷. Mouse models with haploinsufficiency for Fen1 showed rapid progression of CRC and reduced survival⁴⁸. Two other genes in this locus, FADS1 and FADS2, respectively encode delta-5 and delta-6 desaturases, which are key enzymes in the metabolism of polyunsaturated fatty acids. Of these proteins, delta-6 desaturase is responsible for the synthesis of arachidonic acid⁴⁹, the precursor of prostaglandin E₂ (PGE₂), which is a key molecule mediating the effect of cyclooxygenase-2 in colorectal carcinogenesis⁵⁰. Notably, SNPs in perfect LD with the risk-associated variants for CRC identified in this study are strongly associated with circulating arachidonic acid levels⁴⁹. 
We have shown previously that high levels of the PGE₂ metabolite in urine, a marker of endogenous PGE₂ production, are strongly related to higher risk of CRC⁵¹. Because the LD block of approximately 190 kb tagged by the four risk-associated variants covers many putatively functional SNPs that are located in the FEN1, FADS1 and FADS2 genes (Supplementary Fig. 6 and Supplementary Table 14), it is difficult to pinpoint a single SNP or gene that might be responsible for the association with CRC risk in this locus. Nevertheless, our study provides evidence of a potentially important role for the FEN1, FADS1 and FADS2 genes in the etiology of CRC.

At the 10q22.3 locus, rs704017 is located in intron 3 of the ZMIZ1-AS1 gene and resides in a strong enhancer region predicted using ENCODE data (Supplementary Fig. 6 and Supplementary Table 14). It also maps to a DNase I hypersensitivity site identified in the Caco-2 CRC cell line. In addition to the ZMIZ1-AS1 gene, the LD block tagged by rs704017 also includes the ZMIZ1 gene, whose expression is downregulated in the Caco-2 and HT-29 CRC cell lines³¹. In line with these observations, we found in TCGA data that ZMIZ1 gene expression is lower in colon tumor tissue compared with normal colon tissue (P = 3.28 × 10⁻⁶). In addition, somatic mutations in the ZMIZ1 gene have been reported in more than 2% of colon tumors²⁹. Whereas ZMIZ1-AS1 is a miscellaneous RNA (miscRNA) gene with unknown function, the ZMIZ1 gene encodes the protein ZMIZ1, which regulates the activity of several transcription factors, including AR, SMAD3, SMAD4 and p53. It has been shown that ZMIZ1 might have a broader role in epithelial cancers, including CRC⁵². SNP rs704010, located in intron 1 of the ZMIZ1 gene, has been associated with breast cancer⁵³. However, this SNP, which is in weak LD (r² = 0.09) with the risk-associated variant we identified for CRC, was not associated with CRC in this study (data not shown). Given the biological function of the ZMIZ1 gene, it is possible that this gene is involved in the association observed in this locus.

In the 12p13.31 locus, rs10849432 maps to an LD block of approximately 52 kb with no known genes. ENCODE data suggest that rs4764551 and rs4764552, perfectly correlated with rs10849432, might be located in a strong enhancer region (Supplementary Table 14). Notably, rs4764551 also maps to a DNase I hypersensitivity site in the HCT-116 CRC cell line and a binding site for the CTCF protein in the Caco-2 CRC cell line. Using data from TCGA, we showed that the closest genes to rs10849432 (CD9, PLEKHG6 and TNFRSF1A) all have downregulated expression in colon tumor tissue (Supplementary Table 16). The CD9 gene encodes the CD9 antigen, which participates in many cellular processes, including differentiation, adhesion and signal transduction. Notably, CD9 has a critical role in the suppression of cancer cell motility and metastasis⁵⁴, and overexpression of the CD9 gene is associated with favorable prognosis for patients with CRC⁵⁵. CD9 is also involved in suppressing Wnt signaling⁵⁶. Although the function of the PLEKHG6 gene is less clear, somatic mutations in this gene were found in approximately 2% of colon tumors²⁹. The protein encoded by TNFRSF1A is a major receptor for tumor necrosis factor (TNF)-α and is known to be involved in cytokine-induced senescence in cancer⁵⁷. In addition to evidence for the three nearby genes, we also found that rs4764552 is an eQTL for the LTBR gene (Supplementary Table 15). The LTβR protein has an essential role in lymphoid organ formation and has also been linked to cancer⁵⁸, including CRC⁵⁹. On the basis of these data, we propose that the CD9 gene is the most likely candidate to explain the association identified in this locus. However, potential roles for other genes cannot be excluded.

At the 17p13.3 locus, rs12603526 lies in intron 1 of the NXN gene, in a region covering several regulatory elements, including a DNase I hypersensitivity site, a strong enhancer region and a site with an effect on regulatory motifs as annotated by ENCODE (Supplementary Table 14). NXN gene expression was lower in the colon tumor tissue samples included in TCGA (P = 2.83 × 10⁻⁵). Nucleoredoxin, encoded by the NXN gene, has functions related to cell growth and differentiation⁶⁰. Overexpression of the NXN gene has been found to suppress the Wnt signaling pathway, and nucleoredoxin dysfunction might cause activation of the transcription factor TCF (T cell factor), accelerated cell proliferation and enhancement of oncogenicity⁶¹. Further research is needed to determine the causal variant and biological mechanism for the association at this locus.

Previously reported CRC-associated loci in East Asians

We evaluated association evidence for 31 SNPs in 25 established CRC susceptibility loci^{7,8,9,10,11,12,13,14,15,16,17,18,19,20} by analyzing data from stages 1–3 and our previous GWAS^18,19 with a total sample size of up to 11,934 CRC cases and 28,282 controls (Table 3 and Supplementary Table 17). We found further evidence to support the associations of the four loci identified previously in our GWAS conducted among East Asians (P = 1.40 × 10⁻¹⁰ to 3.05 × 10⁻¹⁵). Of the 23 SNPs in the 18 susceptibility loci previously identified by GWAS of individuals of European descent, 20 showed association with CRC risk at P < 0.05 in East Asians in the same direction as reported in the original studies^{7,8,9,10,11,12,13,14,15,16,17}. These signals included 6 SNPs in 4 loci (1q41, 8q24.21, 10p14 and 18q21.1) with association at P < 5 × 10⁻⁸, 6 SNPs in 6 loci with association at P < 0.002 (significance level adjusted for multiple comparisons of 25 independent loci) and 8 SNPs in 8 additional loci with association at P < 0.05. Three SNPs in three loci were not associated with CRC risk (P > 0.05). Given that our study had a statistical power of >80% to identify an association with an OR of 1.05 at P = 0.05 for SNPs with a MAF of 0.20, it is unlikely that these three SNPs confer substantial risk of CRC in East Asian populations. In general, loci initially identified in individuals of European descent had smaller ORs in East Asians, with evidence of heterogeneity noted for three SNPs (P < 0.002). SNPs rs6691170 and rs16892766, identified by previous GWAS of individuals of European descent, are not polymorphic in East Asians, and SNP rs5934683 is located on the X chromosome. We did not have data to evaluate the associations of these three SNPs with CRC risk in this study.

Table 3 Association evidence in East Asians for risk variants in previously reported CRC susceptibility loci

Familial relative risk explained by CRC-associated loci

The six newly identified loci in this study explain approximately 2.1% of the familial relative risk of CRC in East Asians (Supplementary Table 18). These variants, along with the four SNPs identified in our previous GWAS, explain approximately 4.3% of the familial relative risk of CRC in East Asians. An additional 3.4% of the familial relative risk in East Asians can be explained by 18 independent SNPs initially identified in studies conducted among individuals of European descent and confirmed in this study. On the basis of per-allele OR values derived from previously published GWAS^{7,8,9,10,11,12,13,14,15,16,17,18} and this study, we estimate that the SNPs in the 31 loci identified thus far explain approximately 9% of the familial relative risk of CRC in individuals of European descent (Supplementary Table 19), a level slightly higher than the 7.7% explained in East Asians.
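
One widely used way to estimate the proportion of familial relative risk explained by a set of variants is sketched below. It assumes a multiplicative (log-additive) model in which the per-allele OR approximates the relative risk, so that a locus with risk-allele frequency p and per-allele risk r contributes λ = (p·r² + q)/(p·r + q)², with q = 1 − p, and the explained fraction is the sum of ln λ over loci divided by ln λ₀. Whether this is exactly the calculation behind the percentages above is an assumption; the allele frequencies, ORs and the overall familial relative risk λ₀ used in the example are hypothetical.

```python
import math

def locus_frr(risk_allele_freq, per_allele_or):
    """Familial relative risk attributable to one locus under a multiplicative model."""
    p = risk_allele_freq
    q = 1.0 - p
    r = per_allele_or
    return (p * r * r + q) / (p * r + q) ** 2

def fraction_explained(variants, overall_frr):
    """Share of the overall familial relative risk explained by independent variants."""
    total = sum(math.log(locus_frr(p, r)) for p, r in variants)
    return total / math.log(overall_frr)

# Hypothetical (risk-allele frequency, per-allele OR) pairs and an assumed overall
# familial relative risk of 2.0; none of these numbers are taken from the article.
example_variants = [(0.30, 1.10), (0.50, 1.08)]
example_fraction = fraction_explained(example_variants, overall_frr=2.0)
```

Each common variant with a modest OR contributes only a fraction of a percent in this framework, which is why many loci are needed to account for a noticeable share of familial risk.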

Discussion

In the largest GWAS conducted thus far among East Asians, we identified six new genetic loci associated with CRC risk and provided suggestive evidence for three additional previously unreported loci. In addition, we replicated 22 previously reported CRC susceptibility loci. Of the six newly identified loci, two map to genes (TCF7L2 and TGFB1) that have established roles in colorectal tumorigenesis. The other four loci are located in or proximal to genes that are functionally important in transcription regulation (ZMIZ1), genome maintenance (FEN1), fatty acid metabolism (FADS1 and FADS2), cancer cell motility and metastasis (CD9), and cell growth and differentiation (NXN). Risk-associated variants at some loci fall within potential functional regions, and two are associated with the expression levels of the TCF7L2 and FADS2 genes. This study expands current understanding of the genetic basis of CRC risk and provides evidence for new genes and biological pathways that might be involved in colorectal tumorigenesis.

On the basis of a large twin study conducted in Sweden, Denmark and Finland², the heritabilities estimated for CRC, breast cancer and prostate cancer were 35%, 27% and 42%, respectively. Thus far, more than 70 low-penetrance susceptibility loci have been identified in GWAS for breast cancer⁶² or prostate cancer⁶³, and these loci together explain approximately 14% and 30%, respectively, of the familial relative risk of these cancers in individuals of European descent. For CRC, however, only 31 low-penetrance susceptibility loci have been identified, explaining approximately 9% of the familial relative risk of CRC in individuals of European descent. Compared with GWAS of breast cancer and prostate cancer, studies conducted for CRC have been relatively small. Our study, in which we evaluated approximately 7,000 promising variants identified by GWAS in the replication stages, represents one of the largest efforts thus far to follow up genetic variants identified by GWAS. We identified six new loci, representing the largest number of new loci identified for CRC risk in a single study. Although multiple GWAS with sample sizes larger than the one in this study have been conducted among individuals of European descent^13,14,16, we were still able to identify risk-associated variants with relatively large effect sizes. Our study further highlights the value of conducting GWAS in non-European populations to discover new susceptibility loci for CRC.

In summary, we have identified six new loci associated with CRC risk in this large GWAS conducted among East Asians. These new loci contain genes with established connections to colorectal tumorigenesis through major biological pathways such as Wnt and TGF-β signaling, as well as genes with important biological functions that have not yet been well linked to CRC. Our study considerably expands knowledge of the genetic landscape of CRC and provides direction for future studies to characterize the causal variants and functional mechanisms of these GWAS-identified loci.

Methods

Study participants.

This GWAS was conducted as part of the Asia Colorectal Cancer Consortium, comprising a total of 14,963 CRC cases and 31,945 controls of East Asian ancestry from 14 studies conducted in China, South Korea and Japan (Supplementary Table 1). Specifically, stage 1 (GWAS discovery) consisted of 5 studies: Shanghai CRC Study 1 (Shanghai-1; n = 3,102), Shanghai CRC Study 2 (Shanghai-2; n = 908), Guangzhou CRC Study 1 (Guangzhou-1; n = 1,603), Aichi CRC Study 1 (Aichi-1; n = 1,346) and Korean Cancer Prevention Study-II CRC (KCPS-II; n = 1,301). With the exception of Shanghai-2, for which we added 423 controls from other studies^64,65, samples for the remaining 4 studies were the same as we reported in our previous study¹⁸. Stage 2 consisted of 3 studies: Shanghai CRC Study 3 (Shanghai-3; n = 6,577), Guangzhou CRC Study 2 (Guangzhou-2; n = 809) and Guangzhou CRC Study 3 (Guangzhou-3; n = 2,408). Stage 3 included 1 study: the BioBank Japan CRC Study (BBJ; n = 14,172). Stage 4 consisted of 5 studies: Guangzhou CRC Study 4 (Guangzhou-4; n = 1,791), Aichi CRC Study 2 (Aichi-2; n = 708), Korean–National Cancer Center CRC Study (Korea-NCC; n = 2,721), Seoul CRC Study (Korea-Seoul; n = 1,522) and Hwasun Cancer Epidemiology Study–Colon and Rectum Cancer (HCES-CRC; n = 7,930). We estimated that our study had a statistical power of >80% to identify an association with an OR of 1.10 or greater at P < 5 × 10⁻⁸ for SNPs with a MAF of as low as 0.30. We evaluated the generalizability of the newly identified associations with CRC risk in individuals of European descent in data from 3 consortia including 23 studies (Supplementary Table 12) with a total sample size of 16,984 cases and 18,262 controls recruited in the United States, Europe, Canada and Australia: the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO)¹⁷, the Colorectal Transdisciplinary (CORECT) Study and the Colon Cancer Family Registry (CCFR)²¹. 
Summary descriptions of participating studies are presented in the Supplementary Note. Study protocols were approved by the relevant review boards in the respective institutions, and informed consent was obtained from all study participants.

Laboratory procedures.

Genotyping of samples in stage 1 was conducted as described previously^{18,64,65,66,67,68,69} using the following platforms: the Affymetrix Genome-Wide Human SNP Array 6.0, the Illumina HumanOmniExpress BeadChip, the Illumina Infinium HumanHap550 BeadChip, the Illumina 660W-Quad BeadChip, the Illumina Human610-Quad BeadChip, the Illumina Infinium HumanHap610 BeadChip and the Affymetrix Genome-Wide Human SNP Array 5.0. We used a uniform quality control protocol as recently described¹⁸ to filter samples and SNPs. Genotyping and quality control methods are also presented in the Supplementary Note. After quality control exclusions, we obtained 502,145 autosomal SNPs for samples in Shanghai-1, 245,961 SNPs in Shanghai-2, 250,612 SNPs in Guangzhou-1, 232,426 SNPs in Aichi-1 and 312,869 SNPs in KCPS-II (Supplementary Table 2).

Genotyping for 3,632 cases and 6,404 controls in stage 2 was completed using Illumina Infinium assays as part of the customer add-on content for multiple projects to the Illumina HumanExome BeadChip (see URLs). Details of array design, genotyping, genotype calling and quality control are provided in the Supplementary Note. Samples were excluded according to the following criteria: (i) genotype call rate of <98%, (ii) genetically identical or duplicated samples, (iii) sex determined using genetic data inconsistent with epidemiological or clinical data, (iv) first- or second-degree relatives, (v) ancestry outliers or (vi) heterozygosity outliers. Genetic markers were excluded using the following criteria: (i) MAF = 0, (ii) genotype call rate of <98%, (iii) consistency rate of <98% in positive quality control samples, (iv) P for Hardy-Weinberg equilibrium < 1 × 10⁻⁵ in controls or (v) caution SNPs revealed by the Exome Chip Design group (see URLs). We obtained a final data set including 6,899 SNPs genotyped in 3,519 cases and 6,275 controls for this project.
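The marker-level exclusions listed above can be sketched as a simple filter. The thresholds come from the text; everything else is an assumption made for illustration: the function names are hypothetical, and the Hardy-Weinberg P value is computed with a 1-d.f. chi-square test, whereas the test actually used (for example, an exact test) is not specified here.

```python
from math import erfc, sqrt


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-d.f. chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic marker: the test is not informative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 d.f.
    return erfc(sqrt(chi2 / 2.0))


def marker_passes_qc(maf, call_rate, qc_concordance, hwe_p_controls, caution_flag):
    """Apply the stage 2 marker exclusion criteria quoted in the text."""
    return (
        maf > 0.0                    # (i) exclude monomorphic markers
        and call_rate >= 0.98        # (ii) genotype call rate
        and qc_concordance >= 0.98   # (iii) consistency in positive QC samples
        and hwe_p_controls >= 1e-5   # (iv) Hardy-Weinberg equilibrium in controls
        and not caution_flag         # (v) not on the caution list
    )


# Hypothetical genotype counts in controls for one marker.
p_hwe = hwe_chi2_p(n_aa=2500, n_ab=2900, n_bb=875)
print(marker_passes_qc(0.37, 0.995, 1.0, p_hwe, caution_flag=False))
```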

Cases and controls in stage 3 were genotyped using the Illumina HumanHap610-Quad BeadChip. Quality control filters were based on criteria described previously²⁰. Methods of genotyping and quality control procedures are also presented in the Supplementary Note. After sample and SNP exclusions, we generated a data set comprising 2,814 cases and 11,358 controls with 460,463 SNPs.

Stage 4 genotyping for 29 SNPs was conducted using the iPLEX Sequenom MassARRAY platform according to manufacturer's protocols at the Vanderbilt Molecular Epidemiology Laboratory (Nashville, Tennessee, USA). Details of genotyping and quality control are provided in the Supplementary Note. We filtered out SNPs with (i) genotype call rate of <95%, (ii) genotyping consistency rate of <95% in positive control samples, (iii) an unclear genotype call or (iv) P for Hardy-Weinberg equilibrium of <1 × 10⁻⁵ in controls. The average consistency rate of these SNPs passing quality control filters was 99.9% with a median value of 100% in each of the five participating studies included in this stage.

Samples in GECCO, CORECT and CCFR were genotyped with Illumina and Affymetrix arrays^17,21. Genotyping, quality control and imputation have been reported previously^17,21 and are described in the Supplementary Note.

SNP selection.

Selection of SNPs for stage 2 replication was primarily based on the following criteria: (i) P < 0.05 in meta-analysis, (ii) P for heterogeneity > 0.0001, (iii) imputation R² > 0.5 in each of the included studies, (iv) MAF > 0.05 in each of the included studies, (v) SNPs uncorrelated with established CRC SNPs (defined as r² < 0.2 in the HapMap Asian population), (vi) SNPs uncorrelated with other SNPs identified in this project (r² < 0.2) and (vii) data available in at least two studies (Supplementary Note). We included multiple SNPs in some regions with a prior association P value of <0.002 or with genes of interest. Risk variants identified from previously published GWAS were also included in the assay^7–20. In total, 8,570 unique SNPs were selected. Of these, 7,113 SNPs were successfully designed. For stage 3 replication, we selected 559 SNPs according to the following criteria: (i) P < 0.005 in the meta-analysis of data from stages 1 and 2, (ii) association in the same direction in both stages and (iii) P for heterogeneity > 0.0001. For stage 4, we selected 30 SNPs on the basis of the following criteria: (i) P < 0.0001 in the meta-analysis of stages 1–3, (ii) P < 0.01 in the meta-analysis of stages 2 and 3, (iii) association in the same direction in the three stages and (iv) P for heterogeneity > 0.0001.
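The stage 3 selection rule described above amounts to a three-part predicate on each SNP's summary statistics. The sketch below is illustrative only: the `SnpResult` record, its field names and the example SNP identifiers are hypothetical, and only the three numeric criteria are taken from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SnpResult:
    snp_id: str
    p_meta: float                    # P from the meta-analysis of stages 1 and 2
    beta_by_stage: Tuple[float, ...]  # per-stage log odds ratios
    p_het: float                     # P for heterogeneity


def eligible_for_stage3(result: SnpResult) -> bool:
    """Stage 3 criteria: P < 0.005, consistent direction, P_het > 0.0001."""
    betas = result.beta_by_stage
    same_direction = all(b > 0 for b in betas) or all(b < 0 for b in betas)
    return result.p_meta < 0.005 and same_direction and result.p_het > 0.0001


# Hypothetical records, not results from the study.
candidates = [
    SnpResult("rs_example_1", 0.0010, (0.08, 0.05), 0.30),
    SnpResult("rs_example_2", 0.0010, (0.08, -0.05), 0.30),  # opposite directions
    SnpResult("rs_example_3", 0.0200, (0.04, 0.03), 0.60),   # P too large
]
selected = [c.snp_id for c in candidates if eligible_for_stage3(c)]
print(selected)
```

The stage 4 rule has the same shape, with stricter P thresholds and three stages in `beta_by_stage`.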

Statistical and bioinformatics analysis.

Details of imputation and population substructure evaluation are provided in the Supplementary Note. Briefly, stage 1 imputation was performed with the CHB (Han Chinese in Beijing, China) and JPT (Japanese in Tokyo, Japan) HapMap 2 panel as the reference using the MACH v1.0 program⁷⁰ (see URLs). Stage 3 imputation was conducted with phased data for JPT, CHS (Southern Han Chinese, China) and CHD (Chinese in Metropolitan Denver, Colorado) participants from 1000 Genomes Project phase 1 release v3 as the reference using MACH v1.0 and Minimac⁷¹ (see URLs). Regional imputation of genotype data from TCGA³⁰ (see URLs) was performed with the GIANT ALL reference panel from 1000 Genomes Project phase 1 release v3 using MACH v1.0 and Minimac (see URLs). To evaluate imputation quality in our study, we directly genotyped the 10 newly identified risk variants in the approximately 2,800 samples included in stage 1. The concordance between imputed and genotyped data was very high, with mean values ranging from 96.00% to 99.96% for the ten SNPs (Supplementary Table 20). For rs10849432, the imputation quality for the Aichi-1 study was relatively low (R² = 0.57), and data from this study were therefore not included in our final analysis. We evaluated population structure in studies included in stages 1 and 2 using principal-components analysis with EIGENSTRAT software⁷² (see URLs). On the basis of adjusted regression models including the first ten principal components, the genomic inflation factor λ was <1.04 in each of the five studies included in stage 1 and 1.0368 in the meta-analysis of all five studies (Supplementary Fig. 2). The λ value was <1.05 in each of the three studies included in stage 2 and 1.0525 in the meta-analysis of all three studies (Supplementary Fig. 3). 
A rescaled inflation statistic, λ_1,000, representing the equivalent value for a study with 1,000 cases and 1,000 controls using the formula⁷³ λ_1,000 = 1 + 500 × (λ − 1) × (1/N_cases + 1/N_controls) was 1.01 in both stages 1 and 2. These findings show little evidence of population stratification in our studies.
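The rescaling formula can be checked numerically. The sketch below applies it to the stage 2 meta-analysis value (λ = 1.0525) using the stage 2 sample counts given under Laboratory procedures (3,519 cases and 6,275 controls); the function name `lambda_1000` is ours, and we assume those counts are the ones relevant to the rescaling.

```python
def lambda_1000(lam, n_cases, n_controls):
    """Rescale a genomic inflation factor to 1,000 cases and 1,000 controls."""
    return 1.0 + 500.0 * (lam - 1.0) * (1.0 / n_cases + 1.0 / n_controls)


stage2 = lambda_1000(1.0525, n_cases=3519, n_controls=6275)
print(f"{stage2:.2f}")  # agrees with the value of 1.01 reported in the text
```

As a sanity check, a study that already has 1,000 cases and 1,000 controls is left unchanged by the formula.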

Associations between SNPs and CRC risk were evaluated on the basis of the log-additive model using Mach2dat⁷⁰, PLINK (version 1.0.7)⁷⁴, R version 3.0.0 and SAS version 9.3 (for all, see URLs). Per-allele OR estimates and 95% CIs were derived from logistic regression models, adjusting for age, sex and the first ten principal components when appropriate. Association analysis was conducted for each participating study separately, and a fixed-effects meta-analysis was conducted to obtain summary results for each of the four stages and all stages combined with the inverse-variance method using the Metal⁷⁵ program. SNPs showing an association at P < 5 × 10⁻⁸ in the combined analysis of all studies were considered genome-wide significant. We also performed stratified analyses for the top SNPs by tumor anatomical site (colon or rectum), population (Chinese, Korean or Japanese) and sex (male or female). We estimated heterogeneity across studies and subgroups with a Cochran's Q test⁷⁶, with P for heterogeneity < 0.008 set as statistically significant when considering multiple comparisons of six independent loci. Independent signals in a locus were identified using stepwise logistic regression models conditioning on the top risk-associated variant we identified in each of the new loci using R software (see URLs). We estimated haplotype frequencies using Haploview (version 4.2)⁷⁷ (see URLs) and conducted haplotype association analysis for two loci (11q12.2 and 19q13.2) where two or more SNPs were identified using SAS Genetics v9.3 with logistic regression models. Pairwise SNP-SNP interactions between 6 top risk-associated variants in the newly identified loci with association P < 5 × 10⁻⁸ and also between these 6 SNPs and the risk-associated variants in 25 previously reported loci were evaluated using the maximum-likelihood ratio test with inclusion of interaction terms in logistic regression models. 
Interactions with P < 0.00028 were considered statistically significant with adjustment for multiple comparisons of 180 tests.
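The inverse-variance fixed-effects combination and Cochran's Q statistic can be sketched in a few lines. This is a generic illustration of the technique, not the Metal program's code; the function name and the example inputs are hypothetical, and the P value for Q (a chi-square with k − 1 degrees of freedom) is left out to keep the example dependency-free.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effects meta-analysis of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = sqrt(1.0 / total_weight)
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided normal P value, stable in the far tail
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return {"beta": beta, "se": se, "z": z, "p": p, "q": q, "df": len(betas) - 1}


# Hypothetical per-study estimates for one SNP.
result = fixed_effect_meta(betas=[0.09, 0.12, 0.07], ses=[0.03, 0.04, 0.05])
print(result)
```

Using `erfc` rather than `1 - cdf` keeps very small P values, such as those near 10⁻²¹, from rounding to zero.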

The familial relative risk (λ) for the offspring of an affected individual due to a single locus was estimated using a log-additive model: λ = (pr² + q)/(pr + q)², where p is the frequency of the risk allele, q = 1 − p is the frequency of the reference allele and r is the per-allele relative risk⁷⁸. The proportion of the familial relative risk explained by this locus, assuming a multiplicative interaction between markers in the locus and other loci, was calculated as log(λ)/log(λ₀), where λ₀ is the overall familial relative risk. λ₀ was set to 2.2 for CRC, as estimated from a meta-analysis⁷⁹. Assuming that the risks associated with individual loci combine multiplicatively, the familial relative risks also multiply. Thus, the combined contribution of multiple loci to the familial relative risk is equal to Σᵢ log(λᵢ)/log(λ₀), where λᵢ is the familial relative risk due to locus i.
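A short numerical sketch of these quantities follows. Because the per-locus familial relative risks multiply, their logarithms add, so the combined proportion is the sum of the per-locus values of log(λ)/log(λ₀). The function names and the example allele frequencies and relative risks are hypothetical; only the formula and λ₀ = 2.2 come from the text.

```python
from math import log


def locus_frr(p, r):
    """Familial relative risk due to one locus under a log-additive model."""
    q = 1.0 - p
    return (p * r * r + q) / (p * r + q) ** 2


def proportion_explained(loci, lambda0=2.2):
    """Summed share of log(lambda0) explained by (risk allele frequency, RR) pairs."""
    return sum(log(locus_frr(p, r)) for p, r in loci) / log(lambda0)


# Hypothetical loci: (risk allele frequency, per-allele relative risk).
example_loci = [(0.30, 1.10), (0.45, 1.08), (0.20, 1.12)]
print(f"{100 * proportion_explained(example_loci):.2f}% of familial relative risk")
```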

We generated forest plots and quantile-quantile plots using R software (see URLs). Regional association plots for SNPs in newly identified loci were generated using the website-based tool LocusZoom (version 1.1)⁸⁰ (see URLs). LD structure between SNPs was determined on the basis of data from 1000 Genomes Project Pilot 1 or HapMap 2 as provided by the website-based tool SNAP⁸¹ (see URLs) and plotted using Haploview, SNAP and the UCSC Genome Browser (see URLs). LD blocks were defined using HapMap recombination rates and hotspots²³. All genomic coordinates are based on NCBI Build 36.

To find putative functional variants for newly identified loci, we identified all SNPs in LD (r² > 0.5) with the risk-associated variants using data from the 1000 Genomes Project²² and HapMap 2 (ref. 23). We mapped the genomic locations of these SNPs to nonsynonymous sites, splice sites, promoters, nearGene-3 regions, nearGene-5 regions, 3′ UTRs, 5′ UTRs, introns and intergenic regions. We evaluated the potential functional effect of nonsynonymous SNPs using the prediction algorithms SIFT³⁶ and PolyPhen-2 (ref. 37) (see URLs). We predicted the putative function of SNPs in promoters, nearGene-3 regions, nearGene-5 regions, 3′ UTRs and 5′ UTRs with the SNPinfo Web Server⁴⁵ (see URLs). We conducted analyses to evaluate the potential regulatory effect of SNPs in noncoding regions on transcription using the ENCODE tool HaploReg (v2)⁸² and the UCSC Genome Browser (see URLs) on the basis of their location within regions of promoter or enhancer activity, DNase I hypersensitivity, local histone modification, proteins bound to these regulatory sites, cis-eQTL and transcription factor binding motifs. We obtained additional functional evidence for these SNPs from the published literature.
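For readers unfamiliar with the LD threshold used above, the sketch below computes r² between two biallelic SNPs from phased haplotypes. It is a textbook calculation shown for illustration, not the procedure used by SNAP or Haploview; the function name and the toy haplotypes are hypothetical.

```python
def ld_r2(haplotypes):
    """r^2 between two SNPs from phased haplotypes given as (a, b) pairs coded 0/1."""
    haps = list(haplotypes)
    n = len(haps)
    p_a = sum(a for a, _ in haps) / n  # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haps) / n  # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haps if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b               # linkage disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    return d * d / denom if denom > 0 else float("nan")


# Toy data: ten haplotypes, mostly concordant between the two SNPs.
toy = [(1, 1)] * 4 + [(1, 0)] * 1 + [(0, 0)] * 5
r2 = ld_r2(toy)
print(r2 > 0.5)  # this toy pair would count as "in LD" under the r² > 0.5 rule
```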

We identified all genes that localize to 1-Mb windows centered on the top risk-associated variants in our newly identified loci, including SNPs correlated (r² > 0.5) with the top risk variants. To determine whether these genes might explain the observed associations in these loci, we first examined genome-wide cis-eQTL data in multiple tissues from four major eQTL databases: the Blood eQTL Browser²⁵, the eQTL Browser²⁶, the Genotype-Tissue Expression (GTEx) Project²⁷ and the Multiple Tissue Human Expression Resource (MuTHER) Project²⁸. The significance threshold for these analyses was set to P < 0.008 to account for six tests. Somatic mutations of these genes were evaluated using data from COSMIC²⁹ (see URLs). Expression levels of these genes in CRC cell lines were assessed using data from the Expression Atlas³¹ (see URLs). To correct for multiple comparisons of the 11 key genes, associations with P < 0.0045 were considered to be statistically significant. We searched the published literature for these genes with respect to CRC in PubMed and OMIM (see URLs).
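The significance thresholds quoted in the Methods are consistent with a Bonferroni correction at α = 0.05, which the following arithmetic check illustrates. That the authors used exactly α = 0.05 and rounded the result is our inference from the numbers, not something stated explicitly.

```python
def bonferroni(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


for n_tests, quoted in [(6, 0.008), (11, 0.0045), (180, 0.00028)]:
    threshold = bonferroni(0.05, n_tests)
    print(f"{n_tests:>3} tests: 0.05/{n_tests} = {threshold:.6f} (quoted as {quoted})")
```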

Expression analysis.

We downloaded RNA sequencing (level 1) and SNP array (level 2) data for 364 colon adenocarcinoma and 18 normal colon tissue samples from TCGA³⁰ (see URLs). To quantify expression levels of candidate genes in the newly identified loci, we normalized gene expression levels using RPKM (reads per kilobase of exon per million mapped reads) values as previously described⁸³. Expression differences between tumor and normal samples for each gene were evaluated on the basis of RPKM values with the Wilcoxon rank-sum test. Associations between gene RPKM values and SNP genotypes were analyzed using a linear regression model including age and sex as covariates. We converted the RPKM value of a gene to log scale for analysis if it was not normally distributed. We considered P < 0.0045 to be statistically significant with adjustment for testing of the 11 key genes.
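The RPKM normalization named above follows directly from its definition: reads mapped to a gene, divided by exon length in kilobases and by total mapped reads in millions. The sketch below shows only that arithmetic; the function name and the example counts are hypothetical, and it is not the pipeline used to process the TCGA data.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    # The factor 1e9 combines the kilobase (1e3) and million-read (1e6) scalings.
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads, 2,000 bp of exon, 25 million mapped reads.
print(rpkm(500, 2000, 25_000_000))
```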

URLs.

1000 Genomes Browser, http://browser.1000genomes.org/index.html; BioBank Japan (in Japanese), http://biobankjp.org/; Blood eQTL browser, http://genenetwork.nl/bloodeqtlbrowser/; Catalogue of Somatic Mutations in Cancer (COSMIC), http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/; database of Genotypes and Phenotypes (dbGaP), http://www.ncbi.nlm.nih.gov/gap; EIGENSTRAT, http://genepath.med.harvard.edu/~reich/EIGENSTRAT.htm; eQTL Browser from the University of Chicago, http://eqtl.uchicago.edu/Home.html; GTEx eQTL Browser, http://www.ncbi.nlm.nih.gov/projects/gap/eqtl/index.cgi/; Expression Atlas, http://www.ebi.ac.uk/gxa/; Haploview, http://www.broad.mit.edu/mpg/haploview/; HaploReg v2, http://www.broadinstitute.org/mammals/haploreg/haploreg.php; HapMap Project, http://hapmap.ncbi.nlm.nih.gov/; Illumina HumanExome-12v1_A BeadChip, http://genome.sph.umich.edu/wiki/Exome_Chip_Design; International Mouse Phenotyping Consortium (IMPC), https://www.mousephenotype.org/; LocusZoom, http://csg.sph.umich.edu/locuszoom/; MACH 1.0, http://www.sph.umich.edu/csg/abecasis/MACH/; Mach2dat, http://genome.sph.umich.edu/wiki/Mach2dat:_Association_with_MACH_output; Minimac, http://genome.sph.umich.edu/wiki/Minimac; Metal, http://www.sph.umich.edu/csg/abecasis/Metal/; Multiple Tissue Human Expression Resource (MuTHER) Project, http://www.muther.ac.uk/; Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/omim/; PLINK version 1.07, http://pngu.mgh.harvard.edu/~purcell/plink/; PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/; R version 3.0.0, http://www.r-project.org/; SAS version 9.2, http://www.sas.com/; SIFT, http://sift.jcvi.org/; SNAP, http://www.broadinstitute.org/mpg/snap/; The Cancer Genome Atlas (TCGA), http://cancergenome.nih.gov/; TRANSFAC, http://www.gene-regulation.com/pub/databases.html; UCSC Genome Browser, http://genome.ucsc.edu/.

References

Jemal, A. et al. Global cancer statistics. CA Cancer J. Clin. 61, 69–90 (2011).
Lichtenstein, P. et al. Environmental and heritable factors in the causation of cancer—analyses of cohorts of twins from Sweden, Denmark, and Finland. N. Engl. J. Med. 343, 78–85 (2000).
de la Chapelle, A. Genetic predisposition to colorectal cancer. Nat. Rev. Cancer 4, 769–780 (2004).
Aaltonen, L., Johns, L., Jarvinen, H., Mecklin, J.P. & Houlston, R. Explaining the familial colorectal cancer risk associated with mismatch repair (MMR)-deficient and MMR-stable tumors. Clin. Cancer Res. 13, 356–361 (2007).
Ma, X., Zhang, B. & Zheng, W. Genetic variants associated with colorectal cancer risk: comprehensive research synopsis, meta-analysis, and epidemiological evidence. Gut 63, 326–336 (2014).
Palles, C. et al. Germline mutations affecting the proofreading domains of POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat. Genet. 45, 136–144 (2013).
Zanke, B.W. et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24. Nat. Genet. 39, 989–994 (2007).
Tomlinson, I. et al. A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21. Nat. Genet. 39, 984–988 (2007).
Broderick, P. et al. A genome-wide association study shows that common alleles of SMAD7 influence colorectal cancer risk. Nat. Genet. 39, 1315–1317 (2007).
Jaeger, E. et al. Common genetic variants at the CRAC1 (HMPS) locus on chromosome 15q13.3 influence colorectal cancer risk. Nat. Genet. 40, 26–28 (2008).
Tenesa, A. et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat. Genet. 40, 631–637 (2008).
Tomlinson, I.P. et al. A genome-wide association study identifies colorectal cancer susceptibility loci on chromosomes 10p14 and 8q23.3. Nat. Genet. 40, 623–630 (2008).
Houlston, R.S. et al. Meta-analysis of genome-wide association data identifies four new susceptibility loci for colorectal cancer. Nat. Genet. 40, 1426–1435 (2008).
Houlston, R.S. et al. Meta-analysis of three genome-wide association studies identifies susceptibility loci for colorectal cancer at 1q41, 3q26.2, 12q13.13 and 20q13.33. Nat. Genet. 42, 973–977 (2010).
Tomlinson, I.P. et al. Multiple common susceptibility variants near BMP pathway loci GREM1, BMP4, and BMP2 explain part of the missing heritability of colorectal cancer. PLoS Genet. 7, e1002105 (2011).
Dunlop, M.G. et al. Common variation near CDKN1A, POLD3 and SHROOM2 influences colorectal cancer risk. Nat. Genet. 44, 770–776 (2012).
Peters, U. et al. Identification of genetic susceptibility loci for colorectal tumors in a genome-wide meta-analysis. Gastroenterology 144, 799–807 (2013).
Jia, W.H. et al. Genome-wide association analyses in East Asians identify new susceptibility loci for colorectal cancer. Nat. Genet. 45, 191–196 (2013).
Zhang, B. et al. Genome-wide association study identifies a new SMAD7 risk variant associated with colorectal cancer risk in East Asians. Int. J. Cancer 10.1002/ijc.28733 (21 January 2014).
Cui, R. et al. Common variant in 6q26-q27 is associated with distal colon cancer in an Asian population. Gut 60, 799–805 (2011).
Figueiredo, J.C. et al. Genotype-environment interactions in microsatellite stable/microsatellite instability-low colorectal cancer: results from a genome-wide association study. Cancer Epidemiol. Biomarkers Prev. 20, 758–766 (2011).
Abecasis, G.R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
Frazer, K.A. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Westra, H.J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).
Degner, J.F. et al. DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482, 390–394 (2012).
GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Grundberg, E. et al. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat. Genet. 44, 1084–1089 (2012).
Forbes, S.A. et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 39, D945–D950 (2011).
Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).
Kapushesky, M. et al. Gene Expression Atlas update—a value-added database of microarray and sequencing-based functional genomics experiments. Nucleic Acids Res. 40, D1077–D1081 (2012).
Tang, W. et al. A genome-wide RNAi screen for Wnt/β-catenin pathway components identifies unexpected roles for TCF transcription factors in cancer. Proc. Natl. Acad. Sci. USA 105, 9697–9702 (2008).
Angus-Hill, M.L., Elbert, K.M., Hidalgo, J. & Capecchi, M.R. T-cell factor 4 functions as a tumor suppressor whose disruption modulates colon cell proliferation and tumorigenesis. Proc. Natl. Acad. Sci. USA 108, 4914–4919 (2011).
Bass, A.J. et al. Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat. Genet. 43, 964–968 (2011).
Grainger, D.J. et al. Genetic control of the circulating concentration of transforming growth factor type β1. Hum. Mol. Genet. 8, 93–97 (1999).
Kumar, P., Henikoff, S. & Ng, P.C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081 (2009).
Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Dunning, A.M. et al. A transforming growth factor β1 signal peptide variant increases secretion in vitro and is associated with increased incidence of invasive breast cancer. Cancer Res. 63, 2610–2615 (2003).
Suthanthiran, M. et al. Transforming growth factor-β1 hyperexpression in African-American hypertensives: a novel mediator of hypertension and/or target organ damage. Proc. Natl. Acad. Sci. USA 97, 3479–3484 (2000).
Yamada, Y. et al. Association of a polymorphism of the transforming growth factor-β1 gene with genetic susceptibility to osteoporosis in postmenopausal Japanese women. J. Bone Miner. Res. 13, 1569–1576 (1998).
Markowitz, S.D. & Bertagnolli, M.M. Molecular origins of cancer: molecular basis of colorectal cancer. N. Engl. J. Med. 361, 2449–2460 (2009).
Howe, J.R. et al. Mutations in the SMAD4/DPC4 gene in juvenile polyposis. Science 280, 1086–1088 (1998).
Valle, L. et al. Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer. Science 321, 1361–1365 (2008).
Liu, L. et al. Functional FEN1 genetic variants contribute to risk of hepatocellular carcinoma, esophageal cancer, gastric cancer and colorectal cancer. Carcinogenesis 33, 119–123 (2012).
Xu, Z. & Taylor, J.A. SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies. Nucleic Acids Res. 37, W600–W605 (2009).
Zheng, L. et al. Functional regulation of FEN1 nuclease and its link to cancer. Nucleic Acids Res. 39, 781–794 (2011).
Zheng, L. et al. Fen1 mutations result in autoimmunity, chronic inflammation and cancers. Nat. Med. 13, 812–819 (2007).
Kucherlapati, M. et al. Haploinsufficiency of Flap endonuclease (Fen1) leads to rapid tumor progression. Proc. Natl. Acad. Sci. USA 99, 9924–9929 (2002).
Schaeffer, L. et al. Common genetic variants of the FADS1-FADS2 gene cluster and their reconstructed haplotypes are associated with the fatty acid composition in phospholipids. Hum. Mol. Genet. 15, 1745–1756 (2006).
Castellone, M.D., Teramoto, H., Williams, B.O., Druey, K.M. & Gutkind, J.S. Prostaglandin E2 promotes colon cancer cell growth through a Gs-axin–β-catenin signaling axis. Science 310, 1504–1510 (2005).
Cai, Q. et al. Prospective study of urinary prostaglandin E2 metabolite and colorectal cancer risk. J. Clin. Oncol. 24, 5010–5016 (2006).
Rogers, L.M., Riordan, J.D., Swick, B.L., Meyerholz, D.K. & Dupuy, A.J. Ectopic expression of Zmiz1 induces cutaneous squamous cell malignancies in a mouse model of cancer. J. Invest. Dermatol. 133, 1863–1869 (2013).
Turnbull, C. et al. Genome-wide association study identifies five new breast cancer susceptibility loci. Nat. Genet. 42, 504–507 (2010).
Ovalle, S. et al. The tetraspanin CD9 inhibits the proliferation and tumorigenicity of human colon carcinoma cells. Int. J. Cancer 121, 2140–2152 (2007).
Mori, M. et al. Motility related protein 1 (MRP1/CD9) expression in colon cancer. Clin. Cancer Res. 4, 1507–1510 (1998).
Lee, J.H. et al. Glycoprotein 90K, downregulated in advanced colorectal cancer tissues, interacts with CD9/CD82 and suppresses the Wnt/β-catenin signal via ISGylation of β-catenin. Gut 59, 907–917 (2010).
Braumüller, H. et al. T-helper-1-cell cytokines drive cancer into senescence. Nature 494, 361–365 (2013).
Wolf, M.J., Seleznik, G.M., Zeller, N. & Heikenwalder, M. The unexpected role of lymphotoxin β receptor signaling in carcinogenesis: from lymphoid tissue formation to liver and prostate cancer development. Oncogene 29, 5006–5018 (2010).
Lukashev, M. et al. Targeting the lymphotoxin-β receptor with agonist antibodies as a potential cancer therapy. Cancer Res. 66, 9617–9624 (2006).
Funato, Y. & Miki, H. Nucleoredoxin, a novel thioredoxin family member involved in cell growth and differentiation. Antioxid. Redox Signal. 9, 1035–1057 (2007).
Funato, Y., Michiue, T., Asashima, M. & Miki, H. The thioredoxin-related redox-regulating protein nucleoredoxin inhibits Wnt–β-catenin signalling through Dishevelled. Nat. Cell Biol. 8, 501–508 (2006).
Michailidou, K. et al. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat. Genet. 45, 353–361 (2013).
Eeles, R.A. et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat. Genet. 45, 385–391 (2013).
Abnet, C.C. et al. A shared susceptibility locus in PLCE1 at 10q23 for gastric adenocarcinoma and esophageal squamous cell carcinoma. Nat. Genet. 42, 764–767 (2010).
Amundadottir, L. et al. Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nat. Genet. 41, 986–990 (2009).
Bei, J.X. et al. A genome-wide association study of nasopharyngeal carcinoma identifies three new susceptibility loci. Nat. Genet. 42, 599–603 (2010).
Nakata, I. et al. Association between the SERPING1 gene and age-related macular degeneration and polypoidal choroidal vasculopathy in Japanese. PLoS ONE 6, e19108 (2011).
Jee, S.H. et al. Adiponectin concentrations: a genome-wide association study. Am. J. Hum. Genet. 87, 545–552 (2010).
Zheng, W. et al. Genome-wide association study identifies a new breast cancer susceptibility locus at 6q25.1. Nat. Genet. 41, 324–328 (2009).
Li, Y., Willer, C.J., Ding, J., Scheet, P. & Abecasis, G.R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34, 816–834 (2010).
Howie, B., Fuchsberger, C., Stephens, M., Marchini, J. & Abecasis, G.R. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012).
Price, A.L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
Freedman, M.L. et al. Assessing the impact of population stratification on genetic association studies. Nat. Genet. 36, 388–393 (2004).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Willer, C.J., Li, Y. & Abecasis, G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Lau, J., Ioannidis, J.P. & Schmid, C.H. Quantitative synthesis in systematic reviews. Ann. Intern. Med. 127, 820–826 (1997).
Barrett, J.C., Fry, B., Maller, J. & Daly, M.J. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265 (2005).
Zheng, W. et al. Common genetic determinants of breast-cancer risk in East Asian women: a collaborative study of 23 637 breast cancer cases and 25 579 controls. Hum. Mol. Genet. 22, 2539–2550 (2013).
Johns, L.E. & Houlston, R.S. A systematic review and meta-analysis of familial colorectal cancer risk. Am. J. Gastroenterol. 96, 2992–3003 (2001).
Pruim, R.J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).
Johnson, A.D. et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939 (2008).
Ward, L.D. & Kellis, M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 40, D930–D934 (2012).
Yan, G. et al. Genome sequencing and comparison of two nonhuman primate animal models, the cynomolgus and Chinese rhesus macaques. Nat. Biotechnol. 29, 1019–1023 (2011).


Acknowledgements

The authors are solely responsible for the scientific content of this paper. The sponsors of this study had no role in study design, data collection, analysis or interpretation, writing of the report or the decision for submission. We thank all study participants and research staff of all parent studies for their contributions and commitment to this project, R. Courtney for DNA preparation, J. He for data processing and analyses, X. Guo for suggestions on bioinformatics analysis, and M.J. Daly and B.J. Rammer for editing and preparing the manuscript. The work at the Vanderbilt University School of Medicine was supported by US National Institutes of Health (NIH) grants R37CA070867, R01CA082729, R01CA124558, R01CA148667 and R01CA122364, as well as by Ingram Professorship and Research Reward funds from the Vanderbilt University School of Medicine. Studies (grant support) participating in the Asia Colorectal Cancer Consortium include the Shanghai Women's Health Study (US NIH, R37CA070867), the Shanghai Men's Health Study (US NIH, R01CA082729), the Shanghai Breast and Endometrial Cancer Studies (US NIH, R01CA064277 and R01CA092585; contributing only controls), Shanghai Colorectal Cancer Study 3 (US NIH, R37CA070867 and Ingram Professorship funds), the Guangzhou Colorectal Cancer Study (National Key Scientific and Technological Project, 2011ZX09307-001-04; the National Basic Research Program, 2011CB504303, contributing only controls; the Natural Science Foundation of China, 81072383, contributing only controls), the Japan BioBank Colorectal Cancer Study (grant from the Ministry of Education, Culture, Sports, Science and Technology of the Japanese government), the Hwasun Cancer Epidemiology Study–Colon and Rectum Cancer (HCES-CRC; grants from the Korea Center for Disease Control and Prevention and the Jeonnam Regional Cancer Center), the Aichi Colorectal Cancer Study (Grant-in-Aid for Cancer Research, grant for the Third Term Comprehensive Control Research for Cancer and 
Grants-in-Aid for Scientific Research from the Japanese Ministry of Education, Culture, Sports, Science and Technology, 17015018 and 221S0001), the Korea-NCC (National Cancer Center) Colorectal Cancer Study (Basic Science Research Program through the National Research Foundation of Korea, 2010-0010276; National Cancer Center Korea, 0910220), the Korea-Seoul Colorectal Cancer Study (none reported) and the KCPS-II Colorectal Cancer Study (National R&D Program for Cancer Control, 1220180; Seoul R&D Program, 10526).

We also thank all participants, staff and investigators from the GECCO, CORECT and CCFR consortia for making it possible to present results from populations of European ancestry for the new CRC-associated loci identified among East Asians. GECCO, CORECT and CCFR are directed by U. Peters, S. Gruber and G. Casey, respectively. Complete lists of investigators from the GECCO, CORECT and CCFR consortia are provided below.

Investigators (institution and location) in the GECCO consortium include (in alphabetical order) John A. Baron (Division of Gastroenterology and Hepatology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA), Sonja I. Berndt (Division of Cancer Epidemiology and Genetics, National Cancer Institute, US NIH, Bethesda, Maryland, USA), Stéphane Bezieau (Service de Génétique Médicale, Centre Hospitalier Universitaire (CHU) Nantes, Nantes, France), Hermann Brenner (Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany), Bette J. Caan (Division of Research, Kaiser Permanente Medical Care Program, Oakland, California, USA), Christopher S. Carlson (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, School of Public Health, University of Washington, Seattle, Washington, USA), Graham Casey (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Andrew T. Chan (Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA and Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA), Jenny Chang-Claude (Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany), Stephen J. Chanock (Division of Cancer Epidemiology and Genetics, National Cancer Institute, US NIH, Bethesda, Maryland, USA), David V. Conti (Department of Preventive Medicine, University of Southern California, Los Angeles, California, USA), Keith Curtis (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), David Duggan (Translational Genomics Research Institute, Phoenix, Arizona, USA), Charles S. 
Fuchs (Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA and Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA), Steven Gallinger (Department of Surgery, Mount Sinai Hospital, Toronto, Ontario, Canada and Samuel Lunenfeld Research Institute, Toronto, Ontario, Canada), Edward L. Giovannucci (Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA, Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA and Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts, USA), Stephen B. Gruber (University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Robert W. Haile (Department of Preventive Medicine, University of Southern California, Los Angeles, California, USA), Tabitha A. Harrison (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Richard B. Hayes (Division of Epidemiology, Department of Environmental Medicine, New York University School of Medicine, New York, New York, USA), Michael Hoffmeister (Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany), John L. Hopper (Melbourne School of Population Health, The University of Melbourne, Melbourne, Victoria, Australia), Li Hsu (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA and Department of Biostatistics, University of Washington, Seattle, Washington, USA), Thomas J. Hudson (Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada and Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada), David J. Hunter (Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA), Carolyn M. 
Hutter (Division of Cancer Control and Population Sciences, National Cancer Institute, US NIH, Bethesda, Maryland, USA), Rebecca D. Jackson (Division of Endocrinology, Diabetes and Metabolism, Ohio State University, Columbus, Ohio, USA), Mark A. Jenkins (Melbourne School of Population Health, The University of Melbourne, Melbourne, Victoria, Australia), Shuo Jiao (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Sébastien Küry (Service de Génétique Médicale, CHU Nantes, Nantes, France), Loic Le Marchand (Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, USA), Mathieu Lemire (Ontario Institute for Cancer Research, Toronto, Ontario, Canada), Noralane M. Lindor (Department of Health Sciences Research, Mayo Clinic, Scottsdale, Arizona, USA), Jing Ma (Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA), Polly A. Newcomb (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA and Department of Epidemiology, University of Washington School of Public Health, Seattle, Washington, USA), Ulrike Peters (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA and Department of Epidemiology, University of Washington School of Public Health, Seattle, Washington, USA), John D. Potter (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA, Department of Epidemiology, University of Washington School of Public Health, Seattle, Washington, USA and Centre for Public Health Research, Massey University, Palmerston North, New Zealand), Conghui Qu (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Thomas Rohan (Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Yeshiva University, Bronx, New York, USA), Robert E. 
Schoen (Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA), Fredrick R. Schumacher (Department of Preventive Medicine, University of Southern California, Los Angeles, California, USA), Daniela Seminara (Division of Cancer Control and Population Sciences, National Cancer Institute, US NIH, Bethesda, Maryland, USA), Martha L. Slattery (Department of Internal Medicine, University of Utah Health Sciences Center, Salt Lake City, Utah, USA), Stephen N. Thibodeau (Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA and Department of Laboratory Genetics, Mayo Clinic, Rochester, Minnesota, USA), Emily White (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA and Department of Epidemiology, University of Washington School of Public Health, Seattle, Washington, USA) and Brent W. Zanke (Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada).

Investigators (institution and location) from the CORECT consortium include (in alphabetical order) Kendra Blalock (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Peter T. Campbell (Epidemiology Research Program, American Cancer Society, Atlanta, Georgia, USA), Graham Casey (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), David V. Conti (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Christopher K. Edlund (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Jane Figueiredo (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), W. James Gauderman (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Jian Gong (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Roger C. Green (Faculty of Medicine, Memorial University of Newfoundland, St. John's, Newfoundland, Canada), Stephen B. Gruber (University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), John F. Harju (University of Michigan Comprehensive Cancer Center, University of Michigan, Ann Arbor, Michigan, USA), Tabitha A. Harrison (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Eric J. Jacobs (Epidemiology Research Program, American Cancer Society, Atlanta, Georgia, USA), Mark A. 
Jenkins (Melbourne School of Population Health, The University of Melbourne, Melbourne, Victoria, Australia), Shuo Jiao (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Li Li (Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, Ohio, USA), Yi Lin (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Frank J. Manion (University of Michigan Comprehensive Cancer Center, University of Michigan, Ann Arbor, Michigan, USA), Victor Moreno (Institut d'Investigació Biomèdica de Bellvitge, Institut Catala d'Oncologia, Hospitalet, Barcelona, Spain), Bhramar Mukherjee (University of Michigan Comprehensive Cancer Center, University of Michigan, Ann Arbor, Michigan, USA), Ulrike Peters (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA), Leon Raskin (University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Fredrick R. Schumacher (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA), Daniela Seminara (Division of Cancer Control and Population Sciences, National Cancer Institute, US NIH, Bethesda, Maryland, USA), Gianluca Severi (Melbourne School of Population Health, The University of Melbourne, Melbourne, Victoria, Australia), Stephanie L. Stenzel (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA) and Duncan C. Thomas (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA).

The CCFR consortium is represented by Graham Casey (Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA).

We also thank B. Buecher of ASTERISK; U. Handte-Daub, M. Celik, R. Hettler-Jensen, U. Benscheid and U. Eilber of DACHS; and P. Soule, H. Ranu, I. Devivo, D.J. Hunter, Q. Guo, L. Zhu and H. Zhang of HPFS, NHS and PHS, as well as the following state cancer registries for their help: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Nebraska, New Hampshire, New Jersey, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, Tennessee, Texas, Virginia, Washington and Wyoming. We thank C. Berg and P. Prorok of PLCO; T. Riley of Information Management Services, Inc.; B. O'Brien of Westat, Inc.; B. Kopp and W. Shao of SAIC-Frederick; the WHI investigators (see https://www.whi.org/researchers/SitePages/Write%20a%20Paper.aspx) and the GECCO Coordinating Center. Participating studies (grant support) in the GECCO, CORECT and CCFR GWAS meta-analysis are GECCO (US NIH, U01CA137088 and R01CA059045), DALS (US NIH, R01CA048998), DACHS (German Federal Ministry of Education and Research, BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, CH 117/1-1, 01KH0404 and 01ER0814), HPFS (US NIH, P01CA055075, UM1CA167552, R01137178 and P50CA127003), NHS (US NIH, R01137178, P50CA127003 and P01CA087969), OFCCR (US NIH, U01CA074783), PMH (US NIH, R01CA076366), PHS (US NIH, R01CA042182), VITAL (US NIH, K05CA154337), WHI (US NIH, HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, HHSN271201100004C and 268200764316C) and PLCO (US NIH, Z01CP 010200, U01HG004446 and U01HG 004438). 
CORECT is supported by the National Cancer Institute as part of the GAME-ON consortium (US NIH, U19CA148107) with additional support from National Cancer Institute grants (R01CA81488 and P30CA014089), the National Human Genome Research Institute at the US NIH (T32HG000040) and the National Institute of Environmental Health Sciences at the US NIH (T32ES013678). CCFR is supported by the National Cancer Institute, US NIH under RFA CA-95-011 and through cooperative agreements with members of the Colon Cancer Family Registry and principal investigators of the Australasian Colorectal Cancer Family Registry (US NIH, U01CA097735), the Familial Colorectal Neoplasia Collaborative Group (US NIH, U01CA074799) (University of Southern California), the Mayo Clinic Cooperative Family Registry for Colon Cancer Studies (US NIH, U01CA074800), the Ontario Registry for Studies of Familial Colorectal Cancer (US NIH, U01CA074783), the Seattle Colorectal Cancer Family Registry (US NIH, U01CA074794) and the University of Hawaii Colorectal Cancer Family Registry (US NIH, U01CA074806). The GWAS work was supported by a National Cancer Institute grant (US NIH, U01CA122839). OFCCR was supported by a GL2 grant from the Ontario Research Fund, Canadian Institutes of Health Research and a Cancer Risk Evaluation (CaRE) Program grant from the Canadian Cancer Society Research Institute. T.J. Hudson and B.W. Zanke are recipients of Senior Investigator Awards from the Ontario Institute for Cancer Research, through support from the Ontario Ministry of Economic Development and Innovation. ASTERISK was funded by a Regional Hospital Clinical Research Program (PHRC) and supported by the Regional Council of Pays de la Loire, the Groupement des Entreprises Françaises dans la Lutte contre le Cancer (GEFLUC), the Association Anne de Bretagne Génétique and the Ligue Régionale Contre le Cancer (LRCC). 
PLCO data sets were accessed with approval through dbGaP (CGEMS prostate cancer scan, phs000207.v1.p1; CGEMS pancreatic cancer scan, phs000206.v4.p3; and GWAS of Lung Cancer and Smoking, phs000093.v2.p2, which was funded by Z01CP 010200, U01HG004446 and U01HG 004438 from the US NIH).

Author information

Authors and Affiliations

Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Ben Zhang, Qiuyin Cai, Jirong Long, Jiajun Shi, Wanqing Wen, Gong Yang, Yanfeng Zhang, Xiao-Ou Shu & Wei Zheng
State Key Laboratory of Oncology in South China, Cancer Center, Sun Yat-sen University, Guangzhou, China
Wei-Hua Jia, Zhi-Zhong Pan & Yi-Xin Zeng
Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
Koichi Matsuda
Department of Preventive Medicine, Chonnam National University Medical School, Gwangju, South Korea
Sun-Seog Kweon & Min-Ho Shin
Jeonnam Regional Cancer Center, Chonnam National University Hwasun Hospital, Hwasun, South Korea
Sun-Seog Kweon
Department of Preventive Medicine, Kyushu University Faculty of Medical Sciences, Fukuoka, Japan
Keitaro Matsuo & Satoyo Hosono
Department of Epidemiology, Shanghai Cancer Institute, Shanghai, China
Yong-Bing Xiang, Yu-Tang Gao & Hong-Lan Li
Molecular Epidemiology Branch, National Cancer Center, Goyang-si, South Korea
Aesun Shin
Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, South Korea
Aesun Shin & Yoon-Ok Ahn
Department of Epidemiology and Health Promotion, Institute for Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, South Korea
Sun Ha Jee & Soriul Kim
Department of Social and Preventive Medicine, Hallym University College of Medicine, Okcheon-dong, South Korea
Dong-Hyun Kim & Jin-Young Jeong
Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Chun Li
Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Bingshan Li
Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Yan Guo
School of Public Health, Sun Yat-sen University, Guangzhou, China
Zefang Ren
Division of Cancer Epidemiology & Genetics, National Cancer Institute, Bethesda, Maryland, USA
Bu-Tian Ji
Center for Integrative Medical Sciences, RIKEN, Kanagawa, Japan
Atsushi Takahashi & Michiaki Kubo
Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
Fumihiko Matsuda
Center for Colorectal Cancer, National Cancer Center, Goyang-si, South Korea
Jae Hwan Oh & Ji Won Park
Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
Andrew T Chan
Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
Andrew T Chan
Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
Jenny Chang-Claude
Department of Internal Medicine, University of Utah Health Sciences Center, Salt Lake City, Utah, USA
Martha L Slattery
University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA
Stephen B Gruber, Fredrick R Schumacher, Stephanie L Stenzel & Graham Casey
Department of Surgery, Chonnam National University Medical School, Gwangju, South Korea
Hyeong-Rok Kim
Department of Surgery, Seoul National University Hospital, Seoul, South Korea
Ji Won Park
Department of Hemato-oncology, Chonnam National University Medical School, Gwangju, South Korea
Sang-Hee Cho

Authors

Ben Zhang
Wei-Hua Jia
Koichi Matsuda
Sun-Seog Kweon
Keitaro Matsuo
Yong-Bing Xiang
Aesun Shin
Sun Ha Jee
Dong-Hyun Kim
Qiuyin Cai
Jirong Long
Jiajun Shi
Wanqing Wen
Gong Yang
Yanfeng Zhang
Chun Li
Bingshan Li
Yan Guo
Zefang Ren
Bu-Tian Ji
Zhi-Zhong Pan
Atsushi Takahashi
Min-Ho Shin
Fumihiko Matsuda
Yu-Tang Gao
Jae Hwan Oh
Soriul Kim
Yoon-Ok Ahn
Andrew T Chan
Jenny Chang-Claude
Martha L Slattery
Stephen B Gruber
Fredrick R Schumacher
Stephanie L Stenzel
Graham Casey
Hyeong-Rok Kim
Jin-Young Jeong
Ji Won Park
Hong-Lan Li
Satoyo Hosono
Sang-Hee Cho
Michiaki Kubo
Xiao-Ou Shu
Yi-Xin Zeng
Wei Zheng

Consortia

Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO)

John A Baron, Sonja I Berndt, Stéphane Bezieau, Hermann Brenner, Bette J Caan, Christopher S Carlson, Graham Casey, Andrew T Chan, Jenny Chang-Claude, Stephen J Chanock, David V Conti, Keith Curtis, David Duggan, Charles S Fuchs, Steven Gallinger, Edward L Giovannucci, Stephen B Gruber, Robert W Haile, Tabitha A Harrison, Richard B Hayes, Michael Hoffmeister, John L Hopper, Li Hsu, Thomas J Hudson, David J Hunter, Carolyn M Hutter, Rebecca D Jackson, Mark A Jenkins, Shuo Jiao, Sébastien Küry, Loic Le Marchand, Mathieu Lemire, Noralane M Lindor, Jing Ma, Polly A Newcomb, Ulrike Peters, John D Potter, Conghui Qu, Thomas Rohan, Robert E Schoen, Fredrick R Schumacher, Daniela Seminara, Martha L Slattery, Stephen N Thibodeau, Emily White & Brent W Zanke

Colorectal Transdisciplinary (CORECT) Study

Kendra Blalock, Peter T Campbell, Graham Casey, David V Conti, Christopher K Edlund, Jane Figueiredo, W James Gauderman, Jian Gong, Roger C Green, Stephen B Gruber, John F Harju, Tabitha A Harrison, Eric J Jacobs, Mark A Jenkins, Shuo Jiao, Li Li, Yi Lin, Frank J Manion, Victor Moreno, Bhramar Mukherjee, Ulrike Peters, Leon Raskin, Fredrick R Schumacher, Daniela Seminara, Gianluca Severi, Stephanie L Stenzel & Duncan C Thomas

Colon Cancer Family Registry (CCFR)

Graham Casey

Contributions

W.Z. conceived and directed the Asia Colorectal Cancer Consortium and the Shanghai-Vanderbilt Colorectal Cancer Genetics Project. W.-H.J. and Y.-X.Z.; K. Matsuda; S.-S.K.; K. Matsuo; X.-O.S., Y.-B.X. and Y.-T.G.; A.S.; S.H.J.; and D.-H.K. directed CRC projects for the Guangzhou Colorectal Cancer Study, the BioBank Japan Colorectal Cancer Study, the Hwasun Cancer Epidemiology Study–Colon and Rectum Cancer (HCES-CRC), the Aichi Colorectal Cancer Study, the Shanghai studies, the Korea-NCC (National Cancer Center) Colorectal Cancer Study, the KCPS-II Colorectal Cancer Study and the Korea-Seoul Colorectal Cancer Study, respectively. B.Z., Q.C. and W.W. coordinated the project. Q.C. directed laboratory operations. J.S. performed the genotyping experiments. B.Z. performed the statistical and bioinformatics analyses. W.W. contributed to the statistical analyses and data interpretation. A.T. conducted the statistical analyses and imputation for BioBank Japan. B.Z., W.W. and J.L. managed the data. Y.Z. and B.Z. performed the expression analysis for TCGA data. B.Z. and W.Z. wrote the manuscript with significant contributions from X.-O.S., Q.C., J.L., W.W., B.L. and Y.Z. All authors contributed to data and biological sample collection in the original studies included in this project and to manuscript revision. All authors have reviewed and approved the content of the paper.

Corresponding author

Correspondence to Wei Zheng.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

A complete list of members and affiliations appears in the Acknowledgments.

Supplementary information

Supplementary Text and Figures

Supplementary Note, Supplementary Tables 1–20 and Supplementary Figures 1–6 (PDF 9786 kb)

About this article

Cite this article

Zhang, B., Jia, WH., Matsuda, K. et al. Large-scale genetic study in East Asians identifies six new loci associated with colorectal cancer risk. Nat Genet 46, 533–542 (2014). https://doi.org/10.1038/ng.2985

Received: 19 November 2013
Accepted: 21 April 2014
Published: 18 May 2014
Issue Date: June 2014
DOI: https://doi.org/10.1038/ng.2985
