Abstract
Colorectal cancer (CRC) is a leading cause of mortality worldwide. We conducted a genome-wide association study meta-analysis of 100,204 CRC cases and 154,587 controls of European and east Asian ancestry, identifying 205 independent risk associations, of which 50 were unreported. We performed integrative genomic, transcriptomic and methylomic analyses across large bowel mucosa and other tissues. Transcriptome- and methylome-wide association studies revealed an additional 53 risk associations. We identified 155 high-confidence effector genes functionally linked to CRC risk, many of which had no previously established role in CRC. These have multiple different functions and specifically indicate that variation in normal colorectal homeostasis, proliferation, cell adhesion, migration, immunity and microbial interactions determines CRC risk. Cross-tissue analyses indicated that over a third of effector genes most probably act outside the colonic mucosa. Our findings provide insights into colorectal oncogenesis and highlight potential targets across tissues for new CRC treatment and chemoprevention strategies.
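The abstract does not state which meta-analysis model was used, so the following is only a minimal sketch of one common approach, an inverse-variance-weighted fixed-effect meta-analysis of per-study log odds ratios for a single variant. The function name `fixed_effect_meta`, the input values and the 5e-8 threshold (the conventional genome-wide significance level) are illustrative assumptions, not details taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study effect estimates for one variant.

    betas: per-study log odds ratios (hypothetical inputs).
    ses:   matching standard errors.
    Returns (pooled beta, pooled standard error, z score, two-sided P value).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical example: two studies reporting the same effect size.
beta, se, z, p = fixed_effect_meta([0.10, 0.10], [0.02, 0.02])
genome_wide_significant = p < 5e-8  # conventional threshold, assumed here
```

With equal standard errors the pooled estimate equals the shared per-study estimate and the pooled standard error shrinks by a factor of the square root of the number of studies, which is why combining cohorts increases power to detect small effects.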
Data availability
Summary-level data for the full set of Asian and European GWASs are available through the GWAS catalog (accession no. GCST90129505). For individual-level data, CCFR, CORECT, CORSA_2 and GECCO are deposited in dbGaP (accession nos. phs001415.v1.p1, phs001315.v1.p1, phs001078.v1.p1, phs001903.v1.p1, phs001856.v1.p1 and phs001045.v1.p1). NSCCG and COIN are available in the European Genome–Phenome Archive under accession nos. EGAS00001005412 (NSCCG) and EGAS00001005421 (COIN). UK Biobank data are available through http://www.ukbiobank.ac.uk and Finnish data through THL Biobank. Access to individual-level data for the remaining studies is controlled through oversight committees. CCFR 1 and CCFR 2 data can be requested by submitting an application for collaboration to the CCFR (forms, instructions and contact information can be located at www.coloncfr.org/collaboration). Applications for individual-level data from the QUASAR2 and SCOT clinical trials will be assessed by the translational research steering committees that oversee those studies. Individual-level data from the CORGI (UK1) study will be made available subject to standard institutional agreements. Application forms for these three studies, and for Scotland Phase 1, Scotland Phase 2, SOCCS, DACHS4 and Croatia, will be provided by emailing a request to access.crc.gwas.data@outlook.com. For access to CORSA_1, please contact gecco@fredhutch.org. For Generation Scotland (GS), access is through the GS Access Committee (access@generationscotland.org). Applications for the Lothian Birth Cohort data should be made through https://www.ed.ac.uk/lothian-birth-cohorts/data-access-collaboration. For details of the application process for Aichi1, Aichi2, BBJ, Guanzhou1, HCES, HCES2, Korea and Shanghai cohorts, please go to https://swhs-smhs.app.vumc.org or contact W.Z. at wei.zheng@vanderbilt.edu.
CRC-relevant epigenome data were obtained from the National Center for Biotechnology Information’s Gene Expression Omnibus database under accession nos. GSE77737 and GSE36401. Genetically predicted models of gene expression and methylation have been deposited in the Zenodo repository (https://zenodo.org/deposit/6472285).
Code availability
All bioinformatics and statistical analysis tools used in the present study are open source, details of which are available in Methods and Nature Portfolio Reporting Summary. No customized code was used to process or analyze data. Details on URLs used can be found in Supplementary Note.
Change history
13 February 2023
A Correction to this paper has been published: https://doi.org/10.1038/s41588-023-01334-w
Acknowledgements
At the Institute of Cancer Research, this work was supported by Cancer Research UK (CRUK, grant no. C1298/A25514 to R.S.H.). Additional support was provided by the National Cancer Research Network. In Edinburgh, the work was supported by program grant funding from CRUK (grant nos. C348/A12076 to M.G.D. and C6199/A16459 to I.T.), EU European Research Council Advanced Grant EVOCAN, and the infrastructure and staffing of the Edinburgh CRUK Cancer Research Centre. C.F.R. was supported by a Marie Sklodowska-Curie Intra-European Fellowship Action (grant no. IEF-301077) for the INTERMPHEN project and received considerable help from many staff in the Department of Endoscopy at the John Radcliffe Hospital in Oxford. Support from the European Union (grant nos. FP7/2007–2013 and 258236), FP7 collaborative project SYSCOL and COST Actions EuColonGene and TransColonCan is also acknowledged (grant nos. BM1206 and CA17118 to I.T.). We are grateful to many colleagues within UK clinical genetics departments (for CORGI) and to many collaborators who participated in the VICTOR, QUASAR2 and SCOT trials. We also thank colleagues from the UK National Cancer Research Network (for NSCCG). I.T. acknowledges funding from CRUK program grant no. C6199/A27327. The work at Vanderbilt University Medical Center was supported by the US National Institutes of Health (NIH; grant nos. R01CA188214, R37CA070867, UM1CA182910, R01CA124558, R01CA158473 and R01CA148667), as well as Anne Potter Wilson Chair funds from the Vanderbilt University School of Medicine (to W.Z.). Sample preparation and genotyping assays at Vanderbilt University were conducted at the Survey and Biospecimen Shared Resources and Vanderbilt Microarray Shared Resource, supported in part by the Vanderbilt-Ingram Cancer Center (grant no. P30CA068485). Statistical analyses were performed on servers maintained by the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University.
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO): National Cancer Institute (NCI), NIH, US Department of Health and Human Services provided grant nos. U01 CA164930, U01 CA137088, R01 CA059045, R01201407 and R01CA206279. Genotyping services were provided by the Center for Inherited Disease Research (CIDR) contract no. HHSN268201200008I. This research was funded in part through the NIH/NCI Cancer Center support grant no. P30 CA015704. Scientific Computing Infrastructure at the Fred Hutchinson Cancer Research Center was funded by ORIP grant no. S10OD028685 (to U.P.). The Colorectal Cancer Transdisciplinary (CORECT) study was supported by the NCI/NIH, US Department of Health and Human Services (grant nos. U19 CA148107, R01 CA81488, P30 CA014089, R01 CA197350, P01 CA196569 and R01 CA201407) and the National Institute of Environmental Health Sciences, NIH (grant no. T32 ES013678). The Colon Cancer Family Registry (CCFR) participant recruitment and collection of data and biospecimens used in the present study were supported by the NCI, NIH (grant no. U01 CA167551). OFCCR was supported through funding allocated to the Ontario Registry for Studies of Familial Colorectal Cancer (grant no. U01 CA074783). The content of this manuscript does not necessarily reflect the views or policies of the NCI or any of the collaborating centers in the CCFR, and the mention of trade names, commercial products or organizations does not imply endorsement by the US Government, any cancer registry or the CCFR.
Author information
Contributions
C.F.R., M.N.T., P.J.L., V.M., G.C., S.B.G., I.T., W.Z., M.G.D., R.S.H. and U.P. designed the study. C.F.R., C.P., S.M.F., J.P.B., P.G.V.S., X.O.S., J.L., Q.C., X.G., Y.L.U., P.B., J.S., T.A.H., D.V.C., M.M., G.R., M.O.S., J.O., D.K., S.J., K.J., S.S.K., A.E.S., M.H.S., Y.A., J.E.K., I.O., W.W., K.E.M., K.O.M., C.T., Z.R., Y.G., W.J., J.L.H., M.A.J., A.K.W., R.K.P., J.C.F., R.W.H., S.G., M.O.W., P.A.N., J.P.C., R.K., T.S.M., R.S.K., D.J.K., I.K., J.B., L.P.M., P.J., P.K., L.A.A., H.R., E.P., J.G.E., T.C., U.H., J.O.K., K.P., T.T., L.R., B.Z., S.M., D.A., J.R.P., D.D.B., E.A.P., N.U., E.M.S., S.B.R., A.G., P.T.C., V.M.S., J.C.C., M.H., H.B., M.L.S., J.D.P., M.B.S., M.J.G., N.M., A.C., S.C.B., L.M., V.A., M.S., B.E.P., D.T.B., G.G.G., C.H.H., M.C.S., G.E.I., K.J.M., A.F.Z., J.K.G., K.A.S., F.L., K.O., Y.S., T.O.K., B.V.G., T.J.H., H.H., R.P., R.B.H., M.E.M., P.P., S.C.L., Y.Y., H.J.L., E.W., L.L., A.T.C., M.C.C., A.L., D.J.H., C.S., P.C.S., D.A.N., R.E.S., J.H., Z.K.S., P.E.V., L.V., V.V., N.P., D.S., A.E.T., S.D.M., S.J.C., F.v.D., E.J.M.F., M.G.D., A.W., A.N., B.A.P., L.M.F., L.S.C., S.O., C.K., C.I.L., R.L.P., C.X.Q., S.B.E., C.M.T., E.R.M., L.L.M., A.H.W., C.E.M., G.A.C., C.H., I.J.D., S.E.H., E.T., S.J.R., M.W., L.Y.O., M.A.D., T.U.S., T.Y., N.S., M.I., V.M., G.C., S.B.G., I.T., W.Z., M.D., R.S.H. and U.P. recruited patients and collected samples. 
C.F.R., M.N.T., P.J.L., S.L.S., V.D.O., C.P., S.E.B., V.S., K.D., S.M.F., P.G.V.S., J.L., Q.C., X.G., Y.L.U., P.B., J.S., J.R.H., T.A.H., D.V.C., C.H.D., M.D., F.R.S., M.M., G.R., M.O.S., W.W., J.L.H., D.D., J.P.C., R.K., R.S.K., D.J.K., K.P., D.A., S.J.W., E.A.R.N., J.R.P., E.A.P., K.V., N.U., E.M.S., P.T.C., J.C.C., M.H., H.B., M.L.S., M.J.G., A.C., S.C.B., L.M., B.E.P., M.C.S., G.E.I., A.F.Z., J.K.G., K.A.S., F.L., R.S., T.O.K., S.I.B., S.T., D.A.C., P.P., H.J.L., E.W., K.F.D., E.W.P., A.T.C., A.L., A.D.J., C.S., P.C.S., J.H., C.K.E., D.C.T., A.E.K., F.v.D., E.J.M.F., L.C.S., M.G.D., A.W., L.M.F., S.O., S.A.B., C.K., Y.L.I., C.X.Q., L.L.M., C.Q., C.E.M., S.E.H., E.T., S.J.R., V.M., G.C., S.B.G., I.T., W.Z., M.D., R.S.H. and U.P. carried out the molecular analysis. C.F.R., M.N.T., P.J.L., M.T., Z.C., S.L.S., V.D.O., L.H., J.F.T., C.P., K.I.S., V.S., K.D., J.R.H., M.M., F.M.N., K.P., A.N.S., A.B.K., C.K.E., W.J.G., D.C.T., Y.L.I., C.X.Q., C.Q., S.B.G., I.T., W.Z., M.D., R.S.H. and U.P. analyzed the data. C.F.R., M.N.T., P.J.L., M.T., Z.C., S.L.S., V.D.O., L.H., J.F.T., K.I.S., J.R.H., A.K.W., J.C.F., R.W.H., P.T.C., K.K.T., M.J.G., A.N.S., B.E.P., D.A.C., P.P., M.C.C., A.B.K., L.C.S., S.O., R.L.P., V.M., G.C., S.B.G., I.T., W.Z., M.D., R.S.H. and U.P. interpreted the data. All authors drafted or substantially revised the manuscript. C.F.R., V.M., S.B.G., I.T., M.D., R.S.H. and U.P. supervised the study and acquired funding.
Ethics declarations
Competing interests
A.C. is a consultant to Bayer Pharma AG, Boehringer Ingelheim and Pfizer Inc. for work unrelated to this manuscript. A.S. is an employee at Insitro and reports consulting fees from BMS. H.H. serves on the scientific advisory boards of Invitae Genetics, Promega and Genome Medical, and holds stock or stock options in Genome Medical and GI OnDemand. J.K. is a consultant for Guardant Health. N.P. is a collaborator for Thrive and Exact, PGDx, CAGE, NeoPhore, Vidium and ManaTbio, and receives royalties for licensed technologies according to JHU rules. R.K.P. collaborates with Eli Lilly, AbbVie, Allergan, Verily and Alimentiv, which includes consulting fees (outside the submitted work). S.A.B. has a financial interest in Adaptive Biotechnologies. S.B.G. is a co-founder of Brogent International LLC. T.S.M. receives research and honoraria from Merck Serono. One of Z.K.S.’s immediate family members serves as a consultant in ophthalmology for Alcon, Adverum, Gyroscope Therapeutics Limited, Neurogene and RegenexBio (outside the submitted work). V.M. has research projects with and owns stock in Aniling. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–5 and Note with references.
Supplementary Tables
Legends and data for Supplementary Tables 1–21.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fernandez-Rozadilla, C., Timofeeva, M., Chen, Z. et al. Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries. Nat Genet 55, 89–99 (2023). https://doi.org/10.1038/s41588-022-01222-9
DOI: https://doi.org/10.1038/s41588-022-01222-9