Combining Asian and European genome-wide association studies of colorectal cancer improves risk prediction across racial and ethnic populations

Thomas, Minta; Su, Yu-Ru; Rosenthal, Elisabeth A.; Sakoda, Lori C.; Schmit, Stephanie L.; Timofeeva, Maria N.; Chen, Zhishan; Fernandez-Rozadilla, Ceres; Law, Philip J.; Murphy, Neil; Carreras-Torres, Robert; Diez-Obrero, Virginia; van Duijnhoven, Franzel J. B.; Jiang, Shangqing; Shin, Aesun; Wolk, Alicja; Phipps, Amanda I.; Burnett-Hartman, Andrea; Gsur, Andrea; Chan, Andrew T.; Zauber, Ann G.; Wu, Anna H.; Lindblom, Annika; Um, Caroline Y.; Tangen, Catherine M.; Gignoux, Chris; Newton, Christina; Haiman, Christopher A.; Qu, Conghui; Bishop, D. Timothy; Buchanan, Daniel D.; Crosslin, David R.; Conti, David V.; Kim, Dong-Hyun; Hauser, Elizabeth; White, Emily; Siegel, Erin; Schumacher, Fredrick R.; Rennert, Gad; Giles, Graham G.; Hampel, Heather; Brenner, Hermann; Oze, Isao; Oh, Jae Hwan; Lee, Jeffrey K.; Schneider, Jennifer L.; Chang-Claude, Jenny; Kim, Jeongseon; Huyghe, Jeroen R.; Zheng, Jiayin; Hampe, Jochen; Greenson, Joel; Hopper, John L.; Palmer, Julie R.; Visvanathan, Kala; Matsuo, Keitaro; Matsuda, Koichi; Jung, Keum Ji; Li, Li; Le Marchand, Loic; Vodickova, Ludmila; Bujanda, Luis; Gunter, Marc J.; Matejcic, Marco; Jenkins, Mark A.; Slattery, Martha L.; D’Amato, Mauro; Wang, Meilin; Hoffmeister, Michael; Woods, Michael O.; Kim, Michelle; Song, Mingyang; Iwasaki, Motoki; Du, Mulong; Udaltsova, Natalia; Sawada, Norie; Vodicka, Pavel; Campbell, Peter T.; Newcomb, Polly A.; Cai, Qiuyin; Pearlman, Rachel; Pai, Rish K.; Schoen, Robert E.; Steinfelder, Robert S.; Haile, Robert W.; Vandenputtelaar, Rosita; Prentice, Ross L.; Küry, Sébastien; Castellví-Bel, Sergi; Tsugane, Shoichiro; Berndt, Sonja I.; Lee, Soo Chin; Brezina, Stefanie; Weinstein, Stephanie J.; Chanock, Stephen J.; Jee, Sun Ha; Kweon, Sun-Seog; Vadaparampil, Susan; Harrison, Tabitha A.; Yamaji, Taiki; Keku, Temitope O.; Vymetalkova, Veronika; Arndt, Volker; Jia, Wei-Hua; Shu, Xiao-Ou; Lin, Yi; Ahn, Yoon-Ok; Stadler, Zsofia K.; Van Guelpen, Bethany; Ulrich, Cornelia M.; Platz, Elizabeth A.; Potter, 
John D.; Li, Christopher I.; Meester, Reinier; Moreno, Victor; Figueiredo, Jane C.; Casey, Graham; Lansdorp Vogelaar, Iris; Dunlop, Malcolm G.; Gruber, Stephen B.; Hayes, Richard B.; Pharoah, Paul D. P.; Houlston, Richard S.; Jarvik, Gail P.; Tomlinson, Ian P.; Zheng, Wei; Corley, Douglas A.; Peters, Ulrike; Hsu, Li

doi:10.1038/s41467-023-41819-0

Article
Open access
Published: 02 October 2023

Nature Communications volume 14, Article number: 6147 (2023) Cite this article

Abstract

Polygenic risk scores (PRS) have great potential to guide precision colorectal cancer (CRC) prevention by identifying those at higher risk to undertake targeted screening. However, current PRS using European ancestry data have sub-optimal performance in non-European ancestry populations, limiting their utility among these populations. Towards addressing this deficiency, we expand PRS development for CRC by incorporating Asian ancestry data (21,731 cases; 47,444 controls) into European ancestry training datasets (78,473 cases; 107,143 controls). The AUC estimates (95% CI) of PRS are 0.63 (0.62–0.64), 0.59 (0.57–0.61), 0.62 (0.60–0.63), and 0.65 (0.63–0.66) in independent datasets including 1681–3651 cases and 8696–115,105 controls of Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White, respectively. They are significantly better than the European-centric PRS in all four major US racial and ethnic groups (p-values < 0.05). Further inclusion of non-European ancestry populations, especially Black/African American and Latinx/Hispanic, is needed to improve the risk prediction and enhance equity in applying PRS in clinical practice.

Introduction

Colorectal cancer (CRC) is a leading cause of cancer death, yet it is among the most preventable cancers via screening¹. Together with the detection of CRC at early stages, which dramatically improves prognosis, optimal screening has the potential for a major impact on CRC mortality. However, current screening programs are primarily age and family-history based and more refinement through risk-based screening recommendations could be instrumental in improving their effectiveness.

Genetics plays a key role in CRC development and, as for most cancers and other common diseases, the risk is polygenic². As such, we can utilize the polygenic risk structure to develop a polygenic risk score (PRS) to quantify an individual’s inherited risk of developing CRC. As the predictive performance improves, a PRS can become clinically useful as a risk stratification tool for targeted screening and chemoprevention. However, PRS built on European ancestry data have sub-optimal performance in other ancestral populations³ because of differential linkage disequilibrium (LD) patterns and allele frequencies across racial and ethnic groups for disease risk variants of CRC^4,5,6,7,8,9. The poor transferability of PRS across racial and ethnic groups has raised concern that its application in clinical practice may exacerbate existing health disparities⁷. As a result, there is a need to improve the accuracy of polygenic prediction across different racial and ethnic groups to maximize the clinical and public-health translational potential of PRS and enhance equity in precision medicine.

Developing ancestry-specific PRS requires sufficient sample sizes for each ancestral group; however, the sample sizes for non-European ancestry groups, while increasing, remain only a fraction of the sample size for European ancestry. Existing studies suggest that leveraging information from other ancestries can improve ancestry-specific PRS^10,11. As an alternative to developing ancestry-specific PRS, one may develop a single cross-ancestry PRS based on meta-analysis of genome-wide association studies (GWAS) across all available ancestral groups^12,13,14. To our knowledge, there is no study of PRS for non-European ancestral populations for CRC. Here we consider two different approaches to PRS development: (1) ancestry-specific PRS using PRS-CSx¹⁵ based on ancestry-specific GWAS while leveraging cross-ancestry information and (2) a single cross-ancestry Asian-European PRS using LDpred2¹⁶ based on combined meta-analysis summary statistics and LD matrices across Asian and European ancestries. Using independent racially and ethnically diverse datasets, we evaluated the performance of these two PRS and compared them with a genome-wide PRS built using European-only GWAS data³ and a PRS based on 204 known CRC loci^17,18,19,20. To facilitate understanding of its clinical utility, we used decision-curve analyses²¹ to assess the standardized net benefit for the model based on family-history and PRS and compared it with the family-history-only model, as the latter is currently used to decide at what age screening starts.

Results

For developing PRSs, we used GWAS summary statistics of 1,020,293 SNPs based on 21,731 cases and 47,444 controls of Asian and 78,473 cases and 107,143 controls of European ancestries. We evaluated the performance of the PRS in independent validation individual-level data sets including 12,025 Asian (2420 cases; 9605 controls), 13,823 Black/African-American (1954 cases; 11,869 controls), 10,378 Latinx/Hispanic (1682 cases; 8696 controls) and 118,756 non-Hispanic White (3651 cases; 115,105 controls) participants. More details about study participant characteristics for training and validation data sets are included in Table 1, Supplementary Data 1, and Supplemental Material and Methods.

Table 1 Characteristics of the validation studies

Discriminatory accuracy of Asian-European PRS

The single cross-ancestry Asian-European PRS derived using the combined Asian-European GWAS meta-analysis summary statistics and LD matrices with LDpred2 improved the discriminatory accuracy in the Asian population compared to the European-centric PRS (AUC = 0.63 vs. 0.59, p-value < 4.5e−09, Table 2). It also improved the AUC significantly in the non-Hispanic White population (AUC = 0.65 vs. 0.63, p-value = 6.0e−03). Despite lack of Black/African American and Hispanic individuals in deriving the PRS, the Asian-European PRS improved the AUC for Black/African American (AUC = 0.59 vs. 0.58, p-value = 0.05) and Hispanic individuals (AUC = 0.62 vs. 0.59, p-value = 5.0e−03). The Asian-European PRS improved the AUC in all racial and ethnic groups compared to the known-loci PRS (all p-values < 0.05).

Table 2 AUC estimates (95% confidence interval) for European-centric PRS, known-loci PRS, PRS-CSx and LDpred2

The ancestry-specific PRS derived using PRS-CSx improved the discriminatory accuracy in the Asian population compared to the European-centric PRS (AUC = 0.64 vs. 0.59), although the difference was not statistically significant (p-value = 0.06, Table 2). The AUC for the non-Hispanic White-specific PRS was also not statistically different from that of the European-centric PRS (p-value = 0.15) in the non-Hispanic White population; however, it was significantly higher than that of the known-loci PRS (p-value = 1.8e−05). The ancestry-specific PRS-CSx is not relevant for Black/African American and Hispanic groups, because no GWAS for these groups were included in the training datasets.

There was little variation in AUC estimates across studies (Supplemental Table 1). Comparing the two approaches, the Asian-European PRS using the combined Asian-European summary statistics in LDpred2 had greater discriminatory accuracy in non-Hispanic White individuals than the non-Hispanic White-specific PRS from PRS-CSx (p-value = 3.0e−03). However, we did not observe a statistically significant difference in Asian individuals (p-value = 0.75). Taken together, the single cross-ancestry Asian-European PRS using LDpred2 performs among the best in terms of AUC, with much narrower confidence intervals; hereafter we focus only on this PRS. The ROC curves for the cross-ancestry Asian-European PRS showed a similar pattern to the AUC for Asian, Black/African American, Hispanic, and non-Hispanic White participants (Supplemental Fig. 1).

PRS distribution across racial and ethnic groups

As expected, the PRS distributions varied across the racial and ethnic groups (Fig. 1A and Supplemental Fig. 2). After trans-ancestry correction, the PRS distributions largely overlapped except for the MG-JPN study (Fig. 1B and Supplemental Fig. 3). This may be because the imputation reference panel for MG-JPN included only Asian individuals from the 1000 Genomes Project, whereas all other studies used all 1000 Genomes Project samples in the reference panel. We thus performed an additional mean adjustment to the PRS for the MG-JPN study. After this adjustment, all PRS distributions overlapped (Fig. 1C).

Cases had higher mean PRS than controls across all racial and ethnic groups (Supplemental Fig. 4). The OR estimates per SD of PRS (95% CI) were 1.64 (1.55–1.74), 1.39 (1.31–1.47), 1.62 (1.51–1.73) and 1.67 (1.60–1.75) for Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White participants, respectively, with p-value < 2.0e−18 for all four groups (Fig. 1D and Table 3).

Table 3 Odds ratios (OR), 95% confidence interval (95% CI) and two-sided p-values for PRS per SD for all and stratified by family-history and age

Compared to the mean risk, the relative risks of PRS at any given percentile were similar for all racial and ethnic groups except for Black/African American participants, for whom it was attenuated (Fig. 2). The relative risk at the 90th percentile of the PRS distribution compared to the mean was 1.67, 1.44, 1.65, and 1.69 for Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White participants, respectively.
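The link between the per-SD odds ratios and the relative risk at a given percentile can be illustrated with a minimal sketch. It assumes (these are assumptions, not statements about the authors' procedure) that the standardized PRS is normally distributed, that risk is log-linear in the PRS, and that the OR approximates the relative risk because CRC is rare. Under those assumptions the mean of exp(beta × Z) is exp(beta²/2), so the relative risk at a percentile versus the mean risk is exp(beta × z − beta²/2). The function name is hypothetical.

```python
from math import exp, log
from statistics import NormalDist


def relative_risk_vs_mean(or_per_sd: float, percentile: float) -> float:
    """Relative risk at a PRS percentile versus the population mean risk.

    Assumes a standardized PRS ~ N(0, 1) and risk proportional to
    exp(beta * PRS), so that E[exp(beta * Z)] = exp(beta**2 / 2).
    """
    beta = log(or_per_sd)                    # log odds ratio per SD of PRS
    z = NormalDist().inv_cdf(percentile)     # standard-normal quantile
    return exp(beta * z - beta ** 2 / 2)


# Per-SD odds ratios reported in the text for the four groups.
reported_or = {
    "Asian": 1.64,
    "Black/African American": 1.39,
    "Latinx/Hispanic": 1.62,
    "non-Hispanic White": 1.67,
}
for group, or_sd in reported_or.items():
    print(group, round(relative_risk_vs_mean(or_sd, 0.90), 2))
```

With the reported per-SD odds ratios as input, this approximation gives values close to the 90th-percentile relative risks quoted above, which suggests the attenuation in Black/African American participants follows directly from the smaller per-SD OR.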

The model-based relative risk was calibrated well across the PRS range in all racial and ethnic groups (Fig. 3).

**Fig. 3: Relative Risk Calibration of PRS.**

Odds ratios (ORs) for PRS stratified by family-history and age

Across all racial and ethnic groups, the ORs for the PRS were higher in those without a family-history than in those with a family-history, with p-values of 0.21, 0.01, 3.0e−3, and 0.11 for Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White participants, respectively (Table 3). The estimates were consistent across studies (Supplemental Table 2).

The strength of the association between PRS and CRC decreased with increasing age in each racial and ethnic group, with trend test p-values of 0.07, 0.11, 2.8e−4, and 1.2e−03 for Asian, Black/African American, Latinx/Hispanic, and non-Hispanic White participants, respectively. The ORs, 95% CI and trend p-value for each racial and ethnic group are given in Table 3. The estimates were consistent across studies (Supplemental Table 2).

Clinical utility for model based on PRS and family-history

We calculated the standardized net benefit (sNB) to assess the clinical utility of using a model based on PRS and family-history to recommend an intervention (such as screening) for participants <50 years of age. We used the average 10-year risk of developing CRC at age 45 as the risk threshold, because the current CRC-screening guidelines recommend that an average-risk individual start screening at 45 years of age. Using the GERA cohort, we estimated the 10-year risk to be 0.29% across all racial and ethnic groups. At this risk threshold, the risk model based on PRS and family-history achieved 37.3% (95% CI: 23.8%–50.8%) of the maximum possible achievable utility. This was greater than that of the model based on family-history alone (sNB = 21.7%, 95% CI: 12.4%–33%, p-value 0.02) and that of hypothetically intervening on all or no people (Fig. 4a), a pattern that generally holds for each racial and ethnic group (Supplemental Fig. 5).

**Fig. 4: Standardized net benefit analysis.**

We observed a similar pattern for participants between the ages of 50 and 60 years (Supplemental Fig. 6). We also used the 10-year risks of 0.39% at age 50 and 0.49% at age 55 as the risk thresholds. The risk model based on PRS and family history achieved greater sNB (sNB = 24.8% and 21.6%, respectively) than the model based on family history alone (sNB = 19.3% and 15.9%, respectively).

At the risk threshold of 0.29% in the GERA cohort, the true-positive and false-positive rates for the model based on family history and PRS were 70% and 37%, respectively, whereas those for the model based on family history only were 31% and 10%, respectively (Fig. 4b). About 8472 of 22,628 individuals aged 40–49 were deemed to be at high risk based on our model of family history and PRS. Among these, 99 developed CRC in the next 10 years, out of a total of 149 individuals in this age group who developed CRC. In contrast, for the model based on family history only, at the same risk threshold, about 2357 would be deemed at high risk, of whom 37 developed CRC (Fig. 4c, d).

Table 4 provides more detailed results of the net benefit (NB) analysis for our proposed family history and PRS-based model and the family history-based model compared to treat all, for risk thresholds from 0 to 0.32%, the point at which NB for treat all becomes negative. Using the same risk threshold of 0.29% as in the previous example, the NB of our model is 0.11%. This means that, compared with no individuals receiving the intervention, our model leads to the equivalent of a net 11 true-positives per 10,000 individuals without an increase in the number of false-positives. Moreover, the net benefit for the model was 0.08% greater than assuming all individuals had the intervention and 0.04% greater than the family history-based model. We also calculated the reduction in the number of false positives per 100 patients, as described previously²². There were 30 fewer false-positives per 100 individuals for our model, whereas there were only 15 fewer false-positives for the family history-based model.
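The decision-curve quantities used here can be sketched as follows. This is a minimal illustration based on the standard decision-curve definitions, not the authors' code: net benefit is the true-positive fraction minus the false-positive fraction weighted by the odds of the risk threshold, the standardized net benefit divides by the event rate, and the reduction in false positives per 100 individuals rescales the difference in net benefit from treat all by the threshold odds. The function names are hypothetical, and setting the event rate equal to the 0.29% threshold in the example is an assumption made for illustration.

```python
def net_benefit(tpr: float, fpr: float, event_rate: float, threshold: float) -> float:
    """Net benefit of intervening on those whose predicted risk exceeds threshold."""
    threshold_odds = threshold / (1 - threshold)
    return event_rate * tpr - (1 - event_rate) * fpr * threshold_odds


def standardized_net_benefit(tpr: float, fpr: float, event_rate: float, threshold: float) -> float:
    """Net benefit as a fraction of the maximum achievable (the event rate)."""
    return net_benefit(tpr, fpr, event_rate, threshold) / event_rate


def net_benefit_treat_all(event_rate: float, threshold: float) -> float:
    """Intervening on everyone corresponds to TPR = FPR = 1."""
    return net_benefit(1.0, 1.0, event_rate, threshold)


def fewer_false_positives_per_100(nb_model: float, nb_treat_all: float, threshold: float) -> float:
    """Net reduction in false positives per 100 individuals versus treat all."""
    return (nb_model - nb_treat_all) / (threshold / (1 - threshold)) * 100


# Illustration with the rounded rates quoted in the text. The event rate is
# ASSUMED equal to the 0.29% risk threshold; because the inputs are rounded,
# the result will not exactly reproduce the reported sNB values.
threshold = event_rate = 0.0029
snb_model = standardized_net_benefit(0.70, 0.37, event_rate, threshold)
snb_famhx = standardized_net_benefit(0.31, 0.10, event_rate, threshold)
print(f"sNB, family history + PRS: {snb_model:.2f}")
print(f"sNB, family history only:  {snb_famhx:.2f}")
```

When the event rate equals the threshold, the standardized net benefit reduces to the difference between the true-positive and false-positive rates, which makes the advantage of the combined model easy to see from the rates alone.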

Table 4 Net benefit (NB) of intervention (e.g., screening) for 22,628 participants aged 40–49 from the GERA cohort, according to the proposed family history (FamHx) + PRS model and the FamHx only model for a given risk threshold

In addition, we estimated the number of unnecessary interventions avoided for individuals aged 40–49 years, as shown in Supplemental Fig. 7 and Table 5. Continuing with the 0.29% threshold as an example, risk stratification based on family history and PRS would avoid 17 more interventions per 100 individuals, compared with the model based on family history, which would avoid 13 interventions per 100 individuals compared to intervening on all.

Table 5 Unnecessary interventions avoided per 100 individuals with age 40–49 for different risk thresholds, 0.29%, 0.39% and 0.49% corresponding the average 10-year risk of developing CRC at ages 45, 50 and 55 years, for the proposed family history (FamHx) + PRS model and the FamHx only model

Assessing CRC probabilities for PRS

We estimated age-specific probabilities for developing CRC by age 80, stratified by family history status and by quantiles of PRS (top 5%, top 25%, 25%–75%, bottom 25% and bottom 5%), for different racial/ethnic groups of GERA participants. There was clear separation between those in the bottom and top PRS quantiles across ancestral groups, except for the African American group, where the separation was less obvious owing to the lower performance and very limited number of CRC cases in this group. The probabilities of developing CRC by age 70 for the top 5% of PRS ranged from 2.2 to 4.7% across the four racial and ethnic groups. In comparison, the probabilities of developing CRC for those with a positive family history were 1.9–5% (Supplemental Figs. 8 and 9).

Discussion

Using large-scale Asian and European GWAS data, we demonstrate that combining Asian and European summary statistics in deriving PRS led to statistically significant improvement in discriminatory accuracy across Asian, Black/African American, Latinx/Hispanic and non-Hispanic White groups, although the improvement was less marked in Latinx/Hispanic and Black/African American participants. We further show that across all groups, the PRS has stronger associations with CRC-risk in younger individuals and in those without a family-history of CRC, which will likely increase the possible clinical utility of the PRS given the rising young-onset CRC incidence rates in recent decades, mostly in individuals without a known family-history. This is supported by our decision-curve analysis demonstrating that adding PRS improves the maximum achievable clinical utility over the model based on family-history only for ages 40–60 years.

A challenge in moving PRS to clinical implementation is ensuring that the PRS is equally applicable to individuals across all racial and ethnic groups to prevent an increase in health disparities. Relevant to this objective, we evaluated two broad categories of approaches (ancestry-specific PRS while leveraging cross-ancestry information and single cross-ancestry PRS based on the combined cross-ancestry GWAS) for improving the prediction in under-represented groups, and our observations on the performance of these approaches could be generalized to other traits besides CRC. We found that both approaches performed similarly in Asian and non-Hispanic White individuals. Further, the cross-ancestry Asian-European PRS also improved risk prediction performance in Hispanic individuals and, to a smaller extent, in Black/African American individuals. We also show that we can correct this raw PRS for genetic ancestry and create a common distribution that can be used across racial and ethnic groups, avoiding the potential difficulty of using ancestry-specific PRS in admixed populations. Accordingly, our cross-ancestry Asian-European PRS has the potential to reduce health disparities between non-European ancestry populations and the European ancestry population.

As there is growing interest in clinical use of PRS, it is important to point out that the purpose of PRS is not to identify CRC, but rather stratify individuals into different risk strata for which different levels of cancer preventive interventions may be devised.^23,24 Their performance should thus be compared with risk factors currently used for risk stratification such as family-history in terms of cost effectiveness. In this paper, we performed a decision-curve analysis that has been used in cancer research for assessing the potential population impact of incorporating a risk prediction model into clinical practice^22,25,26. The risk model that incorporates both the PRS and family-history achieves 37.3% of the maximum possible achievable utility for those 40–49 years old, significantly greater than 21.7% under the family-history-only model. Recently the US Preventive Services Task Force recommended lowering the age at screening initiation to 45 years for individuals at average risk²⁷. However, given the substantial burden of additional approximately 22 million people becoming eligible for screening and the fact that CRC remains a rare event in younger individuals, there has been critique of the universal change to the initial screening age that, instead, emphasizes the importance of targeted screening based on an individual’s risk factors^28,29,30. The results from the decision-curve analysis suggest that there is clinical utility to adding a PRS to the family-history-only model in risk stratification for CRC prevention. In decision curve analysis, we assumed the decision in question was whether an individual in the general population should undergo intervention (e.g., colonoscopy procedure), based on their risk. Overall, the model with the highest (standardized) net benefit is considered the “best” strategy in decision curve analysis. 
However, as argued in Kerr et al.²¹, decision curves cannot be used to choose a risk threshold; rather, they summarize the costs and benefits of intervention under the risk model at different risk thresholds. To fully evaluate the effectiveness of including PRS as part of risk stratification, full decision-analytic modeling that incorporates other aspects such as different screening methods, implementation factors, behavioral factors, and corresponding costs is warranted³¹.

Recent efforts^32,33 in clinical implementation of PRS show the potential of PRS to effectively stratify the risk of disease development and guide screening. BOADICEA v5 (as implemented in the CanRisk tool)³² already implements a 313-variant PRS of breast cancer and currently supports hundreds of thousands of women, doctors, and genetic counselors annually in >90 countries in making treatment decisions. PRS-guided mammographic screening is also being tested in the WISDOM and PERSPECTIVE I&I studies³³. The GenoVA Study³⁴ is a clinical trial in which patients and their primary care physicians receive a clinical PRS laboratory report on five diseases including CRC. MyOme implements a cross-ancestry risk score for breast cancer risk stratification³⁵. As CRC has an effective screening intervention, it would be of great interest to explore implementation of PRS for guiding personal screening recommendations.

This study has several strengths. We brought together most of the globally available GWAS of CRC for Asian and European ancestry populations as our training data, which is an important factor for the improved performance of the proposed PRS. Further, we used multiple independent evaluation data sets that were part of neither our training data nor GWAS discovery, providing an unbiased evaluation of the developed models. Moreover, the single cross-ancestry PRS derived in this study is easy to implement in any admixed population.

The results of this investigation should be interpreted in the context of its limitations. The discriminatory accuracy remains lower in Latinx/Hispanic and particularly in Black/African American individuals due to their limited sample sizes in training data. Future studies more inclusive of these individuals are warranted for deriving PRS to enhance the discriminatory accuracy. Furthermore, we have not been able to evaluate the performance of these models in other racial and ethnic groups, including Alaskan Native, Native American and Pacific Islander individuals. Lastly, we expect to further improve risk prediction by combining the PRS with non-genetic risk factors such as obesity, diet, and aspirin use, as previously shown^24,36.

Advances in PRS development have promoted the use of PRS-enhanced models to determine and stratify disease risk, which could improve disease prevention and management through screening and early detection. Our cross-ancestry Asian-European PRS, built upon data on both Asian and European ancestry individuals, improves the PRS performance in Asian, Black/African American, and Latinx/Hispanic individuals considerably. Combining PRS and other CRC-associated risk factors such as lifestyle/environmental risk factors and high penetrance genes will likely further improve the prediction performance³⁶. We anticipate that the continuous expansion of PRS development and validation to include more diverse populations and prospective evaluation of PRS-enhanced risk prediction models in clinical trials, along with decreasing genotyping costs and adaptation of health care systems to accommodate genetic data and prediction algorithms, will bring the implementation of PRS in clinical practice closer.

Methods

Training data sets

To develop polygenic risk scores (PRS) across populations, we used the genome-wide association study (GWAS) summary statistics of 1,020,293 SNPs based on 78,473 cases and 107,143 controls of European (EUR) and 21,731 cases and 47,444 controls of Asian ancestries from the GWAS Catalog under accession code GCST90129505 (Supplementary Data 1)^17,18,19. For this, we grouped participants into analytical units by study or genotyping platform, consistent with the original reports^17,18,19,20,37,38. Ancestry was determined by genetic principal component analysis. Studies that contributed to more than one prior genome-wide association analysis were analyzed only once. In total, there were 31 analytical units (17 from EUR descent populations and 14 from Asian descent populations), totaling 100,204 CRC cases and 154,587 controls. Comprehensive details on the participants, genotyping and standard quality control (QC) procedures are summarized in Supplementary Data 1. All study protocols were approved by the relevant Institutional Review Boards, and informed consent was obtained from all study participants in accordance with the Helsinki accord.

Independent validation data sets

We evaluated the performance of each of the developed PRS in the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort; Minority GWAS Japanese study (MG-JPN)³⁹; Minority GWAS African American study (MG-AA)⁴⁰; Hispanic Colorectal Cancer Study (HCCS)⁴¹; Multiethnic Cohort study (MEC); Cancer Prevention Study II (CPSII)⁴²; Basque-colon cohort (BCC); and Electronic Medical Records and Genomics (eMERGE) study. Racial and ethnic identification in these studies was self-reported. In total, there were 12,025 Asian (2420 cases; 9605 controls), 13,823 Black/African-American (1954 cases; 11,869 controls), 10,378 Latinx/Hispanic (1682 cases; 8696 controls) and 118,756 non-Hispanic White (3651 cases; 115,105 controls) participants. None of these samples was included in the training data sets for model building. More details about study participant characteristics are included in Table 1.

CRC status (Yes/No) was determined from cancer-registry data. Family-history of CRC (≥1 first-degree relative with CRC) was ascertained through the baseline study questionnaire or electronic medical records at study entry.

Approaches for deriving PRS

We compared two different approaches for PRS development: (1) ancestry-specific PRS using PRS-CSx, which integrates genome-wide Asian and European summary statistics and LD matrices; and (2) a single cross-ancestry PRS using LDpred2, which combines genome-wide Asian and European summary statistics and a weighted LD matrix, with weights defined as the proportion of participants from each ancestry in the summary statistics. Figure 5 summarizes these PRS derivations.

**Fig. 5: Approaches for deriving polygenic risk scores (PRS) for colorectal cancer.**

PRS-CSx¹⁵ derives ancestry-specific PRS while leveraging GWAS summary statistics from other ancestral groups. We first obtained ancestry-specific PRS for Asian and non-Hispanic White participants, each using the ancestry-specific GWAS summary statistics and LD matrix based on ~1M genome-wide SNPs while leveraging GWAS from the other ancestral group. We denoted these PRS by PRS_Asian and PRS_European, respectively. We then improved the ancestry-specific PRS by taking a weighted sum of these PRS to predict CRC in the respective ancestral group. To derive the PRS for the Asian population, we calculated a weighted sum of PRS_Asian and PRS_European (α₁ PRS_European + β₁ PRS_Asian) and obtained α₁ and β₁ from a logistic regression model using the MG-JPN study. Similarly, to derive the PRS for the European population, we calculated a weighted sum of PRS_Asian and PRS_European (α₂ PRS_European + β₂ PRS_Asian), where α₂ and β₂ were obtained based on the pooled BCC and CPSII studies.

To derive the single cross-ancestry PRS using LDpred2¹⁶, we combined the summary statistics from the Asian and European GWAS using the inverse variance weighted estimator⁴³ and combined the LD matrices, as the weighted sum of the Asian and European-specific LD matrices with the weights proportional to the sample sizes of the Asian and European individuals in the combined summary statistics.
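The two combination steps described in this paragraph can be sketched in a few lines. This is an illustrative sketch rather than the authors' pipeline: the function names are hypothetical, the LD matrices are toy 2 × 2 examples, and using total case plus control counts as the sample sizes for the LD weights is an assumption (the text does not state whether total or effective sample sizes were used).

```python
def ivw_combine(betas, standard_errors):
    """Inverse variance weighted estimate for one SNP across ancestries.

    Returns the combined effect size and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    return beta, (1.0 / total_weight) ** 0.5


def weighted_ld(ld_a, ld_b, n_a, n_b):
    """Element-wise weighted sum of two LD matrices (lists of rows),
    with weights proportional to the two sample sizes."""
    w_a = n_a / (n_a + n_b)
    w_b = 1.0 - w_a
    return [
        [w_a * a + w_b * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(ld_a, ld_b)
    ]


# Hypothetical per-SNP log odds ratios and standard errors (Asian, European).
beta, se = ivw_combine([0.10, 0.06], [0.04, 0.02])

# Sample sizes taken as cases + controls from the training data (assumption).
n_asian = 21_731 + 47_444
n_european = 78_473 + 107_143

# Toy LD matrices for two SNPs in each ancestry.
ld_asian = [[1.0, 0.2], [0.2, 1.0]]
ld_european = [[1.0, 0.6], [0.6, 1.0]]
ld_combined = weighted_ld(ld_asian, ld_european, n_asian, n_european)
print(beta, se, ld_combined)
```

The combined effect sizes and the combined LD matrix would then be passed to the PRS method in place of single-ancestry inputs.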

We compared ancestry-specific and single cross-ancestry PRS from PRS-CSx and LDpred2 with a previously published European-centric genome-wide PRS³ and a known-loci PRS consisting of 204 independently CRC-associated variants based on GWAS of European and Asian ancestries^17,18,19,20 (Supplementary Data 2). Our model was focused on only PRS development and did not include any lifestyle and environmental risk factors.

Evaluation of model performance

We evaluated the model performance using a wide range of metrics (the area under the receiver operating characteristic curve, ancestry adjustment of the PRS distribution, odds ratio estimates, and relative risk calibration) based on all of the validation datasets listed in Table 1. The decision curve analysis is based on the GERA study, which was the only cohort study among our independent validation datasets.

The area under the receiver operating characteristics curve (AUC)

We evaluated the predictive performance of the PRS by the area under the receiver operating characteristics curve (AUC) in each of the racial and ethnic groups⁴⁴. We calculated the adjusted AUC of PRS for each study using the ROCt R package⁴⁵, adjusting for the covariates age, sex and four PCs. We emphasize that the AUC estimate was for the PRS only; the covariates were included as potential confounders and were not used as predictors alongside the PRS. We then combined the AUC estimates of PRS across studies for each ancestry using the inverse variance weighted estimator.

We obtained bootstrap-based standard errors (se), 95% confidence intervals (CI; estimate ± 1.96 × se), and two-sided p-values for comparisons across various subgroups using 500 bootstrap samples.
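The bootstrap procedure can be sketched as follows. This is a simplified illustration only: it uses an unadjusted AUC computed from case-control pairs rather than the covariate-adjusted AUC from the ROCt package, and it resamples cases and controls separately, which is an assumption of this sketch. The scores are invented.

```python
import math
import random

def auc(case_scores, control_scores):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            wins += 1.0 if c > k else 0.5 if c == k else 0.0
    return wins / (len(case_scores) * len(control_scores))

def bootstrap_auc(case_scores, control_scores, n_boot=500, seed=0):
    """Return the AUC, its bootstrap se, and the interval estimate +/- 1.96 * se."""
    rng = random.Random(seed)
    estimate = auc(case_scores, control_scores)
    reps = []
    for _ in range(n_boot):
        cases = rng.choices(case_scores, k=len(case_scores))
        controls = rng.choices(control_scores, k=len(control_scores))
        reps.append(auc(cases, controls))
    mean = sum(reps) / n_boot
    se = math.sqrt(sum((r - mean) ** 2 for r in reps) / (n_boot - 1))
    return estimate, se, (estimate - 1.96 * se, estimate + 1.96 * se)

# Hypothetical PRS values for five cases and six controls.
cases = [1.2, 0.8, 2.1, 0.3, 1.7]
controls = [0.1, 0.9, -0.4, 0.5, 1.0, -1.2]
auc_hat, auc_se, (ci_low, ci_high) = bootstrap_auc(cases, controls)
```

With these scores, 25 of the 30 case-control pairs are correctly ordered, so the point estimate is 25/30; the standard error depends on the random resamples.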

Ancestry adjustment of PRS distribution

As the PRS distributions differed across racial and ethnic groups due to different allele frequencies, we used a modified trans-ancestry adjustment of PRS to align the PRS distributions⁴⁶. We used the 1000 Genomes dataset to estimate the ancestry adjustment following the approach in Khera et al.⁴⁶. Specifically, we derived principal components (PCs) based on 343,662 ancestry-informative SNPs with little overlap (0.3%) with the SNPs used in PRS development. To correct for the mean and variance differences between ancestry groups, we fit two linear regression models to predict the mean and variance of PRS based on the first four PCs. To correct the raw PRS distribution in our data set, we first calculated the PCs using the same loadings for the top 4 PCs from the 1000 Genomes dataset. We then obtained the ancestry-adjusted PRS for each individual by subtracting the predicted mean based on the 4 PCs from the individual’s raw PRS and dividing the result by the predicted standard deviation based on the 4 PCs. Additional adjustments are needed for data sets with different imputation panels. The ancestry-adjusted PRS is computed as given below:

$$\mathrm{PRS}_{\mathrm{adjusted}}=\frac{\mathrm{PRS}_{\mathrm{sample}}-\left(\alpha_{0}+\sum_{i=1}^{4}\alpha_{i}\,\mathrm{PC}_{i}\right)}{\sqrt{\beta_{0}+\sum_{i=1}^{4}\beta_{i}\,\mathrm{PC}_{i}}} \tag{1}$$
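A small sketch of Eq. 1 follows. The coefficient values, PC values, and the function name `ancestry_adjusted_prs` are hypothetical; in practice the coefficients would come from the two regressions fitted in the reference panel.

```python
import math

def ancestry_adjusted_prs(raw_prs, pcs, alpha, beta):
    """Apply Eq. 1: alpha and beta each hold an intercept followed by four PC coefficients."""
    predicted_mean = alpha[0] + sum(a * pc for a, pc in zip(alpha[1:], pcs))
    predicted_var = beta[0] + sum(b * pc for b, pc in zip(beta[1:], pcs))
    if predicted_var <= 0:
        raise ValueError("predicted variance must be positive")
    return (raw_prs - predicted_mean) / math.sqrt(predicted_var)

# Hypothetical regression coefficients and one individual's data.
alpha = [0.5, 1.0, 0.5, 0.0, -1.0]
beta = [1.0, 0.5, 0.0, 0.0, 0.0]
adjusted = ancestry_adjusted_prs(1.2, [0.1, -0.2, 0.0, 0.3], alpha, beta)
```

Here the predicted mean is 0.2 and the predicted variance is 1.05, so the adjusted score is (1.2 − 0.2)/√1.05 ≈ 0.976. The guard on the variance is a design choice of this sketch, since a linear variance model can predict non-positive values for extreme PCs.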

Odds ratio (OR) estimates

We estimated the OR and 95% CI for CRC risk per SD change in PRS using a logistic regression model, overall and stratified by family history and age. For each racial and ethnic group, we estimated the AUC and OR by study and combined the estimates using the inverse variance weighted estimator. In addition, we estimated the OR stratified by family history of CRC in a first-degree relative (yes, no) and by age (<50, 50–59, 60–69, 70–79, and >80). All analyses were adjusted for age, sex, and the top 4 principal components of ancestry.

Relative risk calibration of PRS

We binned PRS into 5% strata and defined the reference group as PRS in the 40–60% stratum. The expected OR for a PRS stratum is the ratio, between that stratum and the reference stratum, of the within-stratum geometric mean of individuals’ model-based ORs, where an individual’s model-based OR is the exponential of their PRS times log(OR). We estimated the observed OR and its 95% CI by fitting a logistic regression model with CRC disease status as the outcome and a binary variable with 1 indicating a specific stratum and 0 indicating the reference stratum, adjusting for age, sex, and the first four principal components.
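The expected OR defined above can be sketched in a few lines. The PRS values and the OR are invented, and the sketch assumes the OR is expressed per unit of the PRS scale used (for example, per SD of a standardized PRS).

```python
import math

def geometric_mean_or(prs_values, log_or):
    """Geometric mean of exp(PRS * log OR), computed on the log scale."""
    return math.exp(sum(p * log_or for p in prs_values) / len(prs_values))

def expected_or(prs_stratum, prs_reference, or_per_unit):
    """Ratio of within-stratum geometric mean ORs: stratum versus reference."""
    log_or = math.log(or_per_unit)
    return geometric_mean_or(prs_stratum, log_or) / geometric_mean_or(prs_reference, log_or)

# Hypothetical standardized PRS values in a high stratum and in the reference stratum.
top = expected_or([1.0, 2.0], [-0.5, 0.5], or_per_unit=2.0)
```

Because the geometric mean is taken on the log scale, the expected OR reduces to exp{log(OR) × (mean PRS in the stratum − mean PRS in the reference)}, here 2^1.5 ≈ 2.83.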

Decision curve analysis

The decision-curve analysis was performed by calculating the standardized net benefit (sNB), defined as the net benefit divided by the maximum possible net benefit²¹, to assess the potential clinical impact of the risk prediction models on recommended interventions (i.e., screening). For a given risk threshold, the NB was defined as

$$\mathrm{NB}=\mathrm{sensitivity}\times p-(1-\mathrm{specificity})\times (1-p)\times w, \tag{2}$$

where w was the odds at the threshold, sensitivity was the proportion of cases above the risk threshold based on the model, specificity was the proportion of controls below the risk threshold based on the model, and p was the disease probability at the landmark time. As it was difficult to interpret NB itself, we followed the approach proposed by Kerr et al.²¹ to calculate sNB, i.e., dividing NB by the maximum NB, which is achieved when sensitivity = 1 and specificity = 1. Hence, the sNB was equal to

$$\mathrm{sNB}=\mathrm{sensitivity}-(1-\mathrm{specificity})\times \frac{1-p}{p}\times w. \tag{3}$$

The sNB gives a sense of magnitude on a percent scale and is interpreted as the relative utility, which has a maximum value of 1. For example, sNB = 0.4 means that the risk model achieves 40% of the maximum achievable utility.
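A short worked example of Eqs. 2 and 3, with invented values for sensitivity, specificity, disease probability, and risk threshold; the function names are illustrative only.

```python
def net_benefit(sensitivity, specificity, p, threshold):
    """Eq. 2, with w equal to the odds at the risk threshold."""
    w = threshold / (1.0 - threshold)
    return sensitivity * p - (1.0 - specificity) * (1.0 - p) * w

def standardized_net_benefit(sensitivity, specificity, p, threshold):
    """Eq. 3: net benefit divided by its maximum, p (sensitivity = specificity = 1)."""
    w = threshold / (1.0 - threshold)
    return sensitivity - (1.0 - specificity) * (1.0 - p) / p * w

# Hypothetical inputs: 60% sensitivity, 80% specificity, 2% disease probability, 3% threshold.
nb = net_benefit(0.6, 0.8, 0.02, 0.03)
snb = standardized_net_benefit(0.6, 0.8, 0.02, 0.03)
```

With these inputs NB ≈ 0.0059 and sNB ≈ 0.297, so this hypothetical model would achieve about 30% of the maximum achievable utility at a 3% threshold.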

To calculate the NB in the presence of competing risks⁴⁷, we let rt denote the risk threshold and I(t) the cumulative incidence of developing CRC for an individual by time t in the presence of competing risks (here, death). Further, we define z = 1 to indicate that an individual is at high risk, that is, their predicted t-year risk from the model is greater than or equal to rt, and z = 0 otherwise. We chose the landmark time t = 10 years. At each rt, we calculated the number of true and false positives, TP_rt and FP_rt, by

$$\mathrm{TP}_{rt}=I(t\mid z=1)\times P(z=1)\times N \tag{4}$$

$$\mathrm{FP}_{rt}=\{1-I(t\mid z=1)\}\times P(z=1)\times N \tag{5}$$

where N is the total number of participants. The true-positive rate was then calculated as TP_rt /TP_rt=0 and the false-positive rate as FP_rt /FP_rt=0. We also calculated the reduction in the number of false positives per 100 patients as²²: (net benefit of the model − net benefit of treat all)/{rt/(1 − rt)} × 100. We compared the model based on PRS and family history with the model based on family history alone, as well as with two hypothetical extreme scenarios: intervention (e.g., screening) for all and intervention for none. We calculated the sNB under the competing risks framework⁴⁸, where the observation time is the minimum of time to CRC, time to death, and time at last observation, and the disease status is 1 if the study participant had CRC, 2 if the participant died (competing event), and 0 otherwise. We plotted decision curves of sNB at the 10-year landmark time vs. risk threshold for ages at study entry of 40–49 and 50–59 years, because average-risk individuals in these age groups are recommended to start CRC screening.
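The quantities above can be traced with a toy calculation. The cumulative incidences and proportions are invented and are supplied directly rather than estimated from competing-risks data. Net benefit is written here as (TP − FP × w)/N with w = rt/(1 − rt), which is an assumed rewriting of Eq. 2 in terms of counts.

```python
def decision_counts(cuminc_high_risk, prop_high_risk, n):
    """Eqs. 4 and 5: expected true and false positives among those flagged high risk."""
    tp = cuminc_high_risk * prop_high_risk * n
    fp = (1.0 - cuminc_high_risk) * prop_high_risk * n
    return tp, fp

def net_benefit(tp, fp, rt, n):
    """Net benefit from counts, weighting false positives by the threshold odds."""
    return (tp - fp * rt / (1.0 - rt)) / n

n, rt = 1000, 0.03
# Hypothetical: 20% of people are flagged at rt, with 10-year cumulative incidence 6%.
tp, fp = decision_counts(0.06, 0.20, n)
# At rt = 0 everyone is flagged; the overall cumulative incidence is assumed to be 2%.
tp_all, fp_all = decision_counts(0.02, 1.0, n)

tpr, fpr = tp / tp_all, fp / fp_all
nb_model = net_benefit(tp, fp, rt, n)
nb_all = net_benefit(tp_all, fp_all, rt, n)
fp_reduction_per_100 = (nb_model - nb_all) / (rt / (1.0 - rt)) * 100
```

In this toy setting the true-positive rate is 0.6, the false-positive rate is about 0.19, and the model avoids about 53 false positives per 100 people relative to intervening on everyone.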

We performed the analyses using R version 4.0.0²²,⁴⁵,⁴⁹,⁵⁰,⁵¹. A two-sided p-value < 0.05 was considered statistically significant.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Summary-level data for the full set of Asian and European GWASs used in this study are available in the GWAS Catalog under accession code GCST90129505. Genotype data of GERA participants who consented to having their data shared with dbGaP are available from dbGaP under accession phs000674.v2.p2. The complete GERA data are available upon successful application to the KP Research Bank. Genotype data of eMERGE participants are available from dbGaP under accession number phs001616.v1.p1. Individual-level data for MEC, CCFR, the MD Anderson Colorectal Cancer Case Control Study, and HCCS are deposited in dbGaP (phs000220.v2.p2, phs002733.v1.p1, phs002691.v1.p1, phs001193.v1.p1), as are data for PLCO (phs001286.v3.p2). SCCS and CanCORS data can be accessed via the websites http://ors.southerncommunitystudy.org and http://outcomes.cancer.gov/cancors/. For the remaining studies, please contact the corresponding PIs: CR2&3 (Loic Le Marchand at loic@cc.hawaii.edu), Fukuoka (Loic Le Marchand at loic@cc.hawaii.edu), Nagano, JPHC (Motoki Iwasaki at moiwasak@ncc.go.jp), UNC-Rectal (Temitope Keku at temitope_keku@med.unc.edu), and Basque Study (Prof Luis Bujanda at LUIS.BUJANDAFERNANDEZDEPIEROLA@osakidetza.eus). The 1000 Genomes phase 3 dataset (GRCh37) is available in PLINK2 binary format at PLINK 2.0 Resources (https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3). The PRS weight files generated by this study are available in the PGS Catalog (https://www.pgscatalog.org/) under accession number PGS003852.

Code availability

All data and statistical analysis tools used in the present study are open source, details of which are available in Methods and Nature Portfolio Reporting Summary. No customized code was used to process or analyze data.

References

Murphy, C. C. et al. Decrease in incidence of colorectal cancer among individuals 50 years or older after recommendations for population-based screening. Clin. Gastroenterol. Hepatol. 15, 903–909.e6 (2017).
Hikino, K. et al. Genome-wide association study of colorectal polyps identified highly overlapping polygenic architecture with colorectal cancer. J. Hum. Genet. 67, 149–156 (2022).
Thomas, M. et al. Genome-wide modeling of polygenic risk score in colorectal cancer risk. Am. J. Hum. Genet. 107, 432–444 (2020).
Peterson, R. E. et al. Genome-wide association studies in ancestrally diverse populations: opportunities, methods, pitfalls, and recommendations. Cell 179, 589–603 (2019).
Vassos, E. et al. An examination of polygenic score risk prediction in individuals with first-episode psychosis. Biol. Psychiatry 81, 470–477 (2017).
Duncan, L. et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat. Commun. 10, 3328 (2019).
Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
Ping, J. et al. Developing and validating polygenic risk scores for colorectal cancer risk prediction in East Asians. Int. J. Cancer 151, 1726–1736 (2022).
Ge, T. et al. Development and validation of a trans-ancestry polygenic risk score for type 2 diabetes in diverse populations. Genome Med. 14, 70 (2022).
Song, S., Jiang, W., Hou, L. & Zhao, H. Leveraging effect size distributions to improve polygenic risk scores derived from summary statistics of genome-wide association studies. PLoS Comput. Biol. 16, e1007565 (2020).
Grinde, K. E. et al. Generalizing polygenic risk scores from Europeans to Hispanics/Latinos. Genet. Epidemiol. 43, 50–62 (2019).
Márquez-Luna, C. & Loh, P.-R. South Asian Type 2 Diabetes (SAT2D) Consortium, SIGMA Type 2 Diabetes Consortium & Price, A. L. Multiethnic polygenic risk scores improve risk prediction in diverse populations. Genet. Epidemiol. 41, 811–823 (2017).
Chen, F. et al. Validation of a multi-ancestry polygenic risk score and age-specific risks of prostate cancer: a meta-analysis within diverse populations. eLife 11, e78304 (2022).
Ruan, Y. et al. Improving polygenic prediction in ancestrally diverse populations. Nat. Genet. 54, 573–580 (2022).
Privé, F., Arbel, J. & Vilhjálmsson, B. J. LDpred2: better, faster, stronger. Bioinformatics 36, 5424–5431 (2020).
Huyghe, J. R. et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat. Genet. 51, 76–87 (2019).
Lu, Y. et al. Large-scale genome-wide association study of east asians identifies loci associated with risk for colorectal cancer. Gastroenterology 156, 1455–1466 (2019).
Law, P. J. et al. Association analyses identify 31 new risk loci for colorectal cancer susceptibility. Nat. Commun. 10, 2154 (2019).
Fernandez-Rozadilla, C. et al. Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries. Nat. Genet. 55, 89–99 (2023).
Kerr, K. F., Brown, M. D., Zhu, K. & Janes, H. Assessing the clinical impact of risk prediction models with decision curves: guidance for correct interpretation and appropriate use. J. Clin. Oncol. 34, 2534–2540 (2016).
Vickers, A. J. & Elkin, E. B. Decision curve analysis: a novel method for evaluating prediction models. Med. Decis. Mak. 26, 565–574 (2006).
Polygenic Risk Score Task Force of the International Common Disease Alliance. Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. Nat. Med. 27, 1876–1884 (2021).
Lambert, S. A., Abraham, G. & Inouye, M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 28, R133–R142 (2019).
Den, R. B. et al. Genomic classifier identifies men with adverse pathology after radical prostatectomy who benefit from adjuvant radiation therapy. J. Clin. Oncol. 33, 944–951 (2015).
Choi, E. et al. Development and validation of a risk prediction model for second primary lung cancer. J. Natl Cancer Inst. 114, 87–96 (2022).
US Preventive Services Task Force et al. Screening for colorectal cancer: US preventive services task force recommendation statement. JAMA 325, 1965–1977 (2021).
Campos, F. G. Colorectal cancer in young adults: a difficult challenge. World J. Gastroenterol. 23, 5041–5044 (2017).
Weinberg, B. A. & Marshall, J. L. Colon cancer in young adults: trends and their implications. Curr. Oncol. Rep. 21, 3 (2019).
Hull, M. A., Rees, C. J., Sharp, L. & Koo, S. A risk-stratified approach to colorectal cancer prevention and diagnosis. Nat. Rev. Gastroenterol. Hepatol. 17, 773–780 (2020).
Loeve, F., Boer, R., van Oortmarssen, G. J., van Ballegooijen, M. & Habbema, J. D. The MISCAN-COLON simulation model for the evaluation of colorectal cancer screening. Comput. Biomed. Res. 32, 13–33 (1999).
Carver, T. et al. CanRisk Tool-A Web Interface for the Prediction of Breast and Ovarian Cancer Risk and the Likelihood of Carrying Genetic Pathogenic Variants. Cancer Epidemiol. Biomark. Prev. 30, 469–473 (2021).
Esserman, L. J. & WISDOM Study and Athena Investigators. The WISDOM Study: breaking the deadlock in the breast cancer screening debate. NPJ Breast Cancer 3, 34 (2017).
Hao, L. et al. Development of a clinical polygenic risk score assay and reporting workflow. Nat. Med. 28, 1006–1013 (2022).
Harnessing the True Power of the Genome - MyOme. https://www.myome.com/?utm_source=PRNewsWire&utm_medium=press_release&utm_campaign=ASHG_2022&utm_content=top.
Jeon, J. et al. Determining risk of colorectal cancer and starting age of screening based on lifestyle, environmental, and genetic factors. Gastroenterology 154, 2152–2164.e19 (2018).
Lu, Y. et al. Identification of novel loci and new risk variant in known loci for colorectal cancer risk in east asians. Cancer Epidemiol. Biomark. Prev. 29, 477–486 (2020).
Schmit, S. L. et al. Novel common genetic susceptibility loci for colorectal cancer. J. Natl Cancer Inst. 111, 146–157 (2019).
Wang, H. et al. Trans-ethnic genome-wide association study of colorectal cancer identifies a new susceptibility locus in VTI1A. Nat. Commun. 5, 4613 (2014).
Wang, H. et al. Fine-mapping of genome-wide association study-identified risk loci for colorectal cancer in African Americans. Hum. Mol. Genet. 22, 5048–5055 (2013).
Schmit, S. L. et al. Genome-wide association study of colorectal cancer in Hispanics. Carcinogenesis 37, 547–556 (2016).
Calle, E. E. et al. The American Cancer Society Cancer Prevention Study II Nutrition Cohort: rationale, study design, and baseline characteristics. Cancer 94, 500–511 (2002).
Hartung, J., Knapp, G. & Sinha, B. K. Statistical Meta-Analysis with Applications (John Wiley & Sons, Inc., 2008).
Heagerty, P. J., Lumley, T. & Pepe, M. S. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 56, 337–344 (2000).
Le Borgne, F. et al. Standardized and weighted time-dependent receiver operating characteristic curves to evaluate the intrinsic prognostic capacities of a marker by taking into account confounding factors. Stat. Methods Med. Res. 27, 3397–3410 (2018).
Khera, A. V. et al. Whole-genome sequencing to characterize monogenic and polygenic contributions in patients hospitalized with early-onset myocardial infarction. Circulation 139, 1593–1602 (2019).
Vickers, A. J., Cronin, A. M., Elkin, E. B. & Gonen, M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med. Inform. Decis. Mak. 8, 53 (2008).
Zhang, Z. Survival analysis in the presence of competing risks. Ann. Transl. Med. 5, 47 (2017).
Privé, F., Aschard, H., Ziyatdinov, A. & Blum, M. G. B. Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr. Bioinformatics 34, 2781–2787 (2017).
Gerds, T. A. & Kattan, M. W. Medical Risk Prediction: with Ties to Machine Learning (Chapman and Hall/CRC, 2021).
Team, R. C. R: A language and environment for statistical computing (2013).


Acknowledgements

National Cancer Institute, National Human Genome Research Institute (R01 CA244588 (Ulrike Peters), U01 CA164930 (Ulrike Peters), U01 CA137088 (Ulrike Peters), R01 CA059045 (Ulrike Peters), R01 CA201407 (Ulrike Peters), R01 CA206279 (Ulrike Peters), U01 CA261339 (David V Conti), R01 CA185094 (Ulrike Peters), U01 HG008657 (Gail P. Jarvik)).

Author information

Authors and Affiliations

Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Minta Thomas, Yu-Ru Su, Lori C. Sakoda, Shangqing Jiang, Amanda I. Phipps, Conghui Qu, Emily White, Jeroen R. Huyghe, Jiayin Zheng, Michelle Kim, Polly A. Newcomb, Robert S. Steinfelder, Ross L. Prentice, Tabitha A. Harrison, Yi Lin, John D. Potter, Christopher I. Li, Ulrike Peters & Li Hsu
Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, USA
Yu-Ru Su
Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, 98195, USA
Elisabeth A. Rosenthal & Gail P. Jarvik
Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
Lori C. Sakoda, Jennifer L. Schneider, Natalia Udaltsova & Douglas A. Corley
Genomic Medicine Institute, Cleveland Clinic, Cleveland, OH, USA
Stephanie L. Schmit
Population and Cancer Prevention Program, Case Comprehensive Cancer Center, Cleveland, USA
Stephanie L. Schmit
Danish Institute for Advanced Study (DIAS), Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Odense, Denmark
Maria N. Timofeeva
Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
Maria N. Timofeeva & Malcolm G. Dunlop
Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
Zhishan Chen, Qiuyin Cai & Wei Zheng
Instituto de Investigacion Sanitaria de Santiago (IDIS), Choupana sn, 15706, Santiago de Compostela, Spain
Ceres Fernandez-Rozadilla
Edinburgh Cancer Research Centre, Institute of Genomics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK
Ceres Fernandez-Rozadilla & Ian P. Tomlinson
Division of Genetics and Epidemiology, The Institute of Cancer Research, London, SW7 3RP, UK
Philip J. Law & Richard S. Houlston
Nutrition and Metabolism Branch, International Agency for Research on Cancer, World Health Organization, Lyon, France
Neil Murphy & Marc J. Gunter
Digestive Diseases and Microbiota Group, Girona Biomedical Research Institute (IDIBGI), Salt, 17190, Girona, Spain
Robert Carreras-Torres
Unit of Biomarkers and Susceptibility, Oncology Data Analytics Program, Catalan Institute of Oncology, Barcelona, 08908, Spain
Virginia Diez-Obrero
Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute, Barcelona, 08908, Spain
Virginia Diez-Obrero
Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, 08908, Spain
Virginia Diez-Obrero
Division of Human Nutrition and Health, Wageningen University & Research, Wageningen, The Netherlands
Franzel J. B. van Duijnhoven
Department of Preventive Medicine, Seoul National University College of Medicine, Seoul National University Cancer Research Institute, Seoul, South Korea
Aesun Shin & Yoon-Ok Ahn
Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
Alicja Wolk
Department of Epidemiology, University of Washington, Seattle, WA, USA
Amanda I. Phipps & Ulrike Peters
Institute for Health Research, Kaiser Permanente Colorado, Denver, CO, USA
Andrea Burnett-Hartman
Center for Cancer Research, Medical University Vienna, Vienna, Austria
Andrea Gsur & Stefanie Brezina
Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
Andrew T. Chan & Mingyang Song
Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Andrew T. Chan
Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
Andrew T. Chan & Mingyang Song
Broad Institute of Harvard and MIT, Cambridge, MA, USA
Andrew T. Chan
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
Andrew T. Chan
Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
Andrew T. Chan
Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Ann G. Zauber
University of Southern California, Preventative Medicine, Los Angeles, CA, USA
Anna H. Wu
Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
Annika Lindblom
Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
Annika Lindblom
Department of Population Science, American Cancer Society, Atlanta, GA, USA
Caroline Y. Um & Christina Newton
SWOG Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Catherine M. Tangen
Colorado Center for Personalized Medicine, University of Colorado - Anschutz Medical Campus, Aurora, CO, USA
Chris Gignoux
Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Christopher A. Haiman, David V. Conti & Jane C. Figueiredo
Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
D. Timothy Bishop
Colorectal Oncogenomics Group, Department of Clinical Pathology, The University of Melbourne, Parkville, VIC, 3000, Australia
Daniel D. Buchanan
University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Parkville, VIC, 3000, Australia
Daniel D. Buchanan & Mark A. Jenkins
Genomic Medicine and Family Cancer Clinic, The Royal Melbourne Hospital, Parkville, VIC, 3000, Australia
Daniel D. Buchanan
Department of Bioinformatics and Medical Education, University of Washington Medical Center, Seattle, WA, 98195, USA
David R. Crosslin
Department of Social and Preventive Medicine, Hallym University College of Medicine, Okcheon-dong, South Korea
Dong-Hyun Kim
VA Cooperative Studies Program Epidemiology Center, Durham Veterans Affairs Health Care System, Durham, NC, USA
Elizabeth Hauser
Duke Molecular Physiology Institute, Duke University Medical Center, Durham, NC, USA
Elizabeth Hauser
Department of Epidemiology, University of Washington School of Public Health, Seattle, WA, USA
Emily White
Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
Erin Siegel
Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
Fredrick R. Schumacher
Department of Community Medicine and Epidemiology, Lady Davis Carmel Medical Center, Haifa, Israel
Gad Rennert
Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
Gad Rennert
Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, VIC, Australia
Graham G. Giles
Division of Human Genetics, Department of Internal Medicine, The Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
Heather Hampel & Rachel Pearlman
Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
Hermann Brenner, Michael Hoffmeister & Volker Arndt
Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
Hermann Brenner
Division of Cancer Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, Japan
Isao Oze
Research Institute and Hospital, National Cancer Center, Goyang, South Korea
Jae Hwan Oh
Department of Gastroenterology, Kaiser Permanente San Francisco Medical Center, San Francisco, CA, USA
Jeffrey K. Lee
Department of Pathology, University of Michigan, Ann Arbor, MI, 48104, USA
Jeffrey K. Lee & Joel Greenson
Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
Jenny Chang-Claude
University Medical Centre Hamburg-Eppendorf, University Cancer Centre Hamburg (UCCH), Hamburg, Germany
Jenny Chang-Claude
Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, National Cancer Center, Gyeonggi-do, South Korea
Jeongseon Kim
Department of Medicine I, University Hospital Dresden, Technische Universität Dresden (TU Dresden), Dresden, Germany
Jochen Hampe
Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC, Australia
John L. Hopper & Mark A. Jenkins
Department of Epidemiology, School of Public Health and Institute of Health and Environment, Seoul National University, Seoul, South Korea
John L. Hopper
Slone Epidemiology Center, School of Medicine, Boston University, Boston, MA, USA
Julie R. Palmer
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Kala Visvanathan & Elizabeth A. Platz
Division of Molecular and Clinical Epidemiology, Aichi Cancer Center Research Institute, Nagoya, Japan
Keitaro Matsuo
Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
Koichi Matsuda
Institute for Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, Korea
Keum Ji Jung
Department of Family Medicine, University of Virginia, Charlottesville, VA, USA
Li Li
University of Hawaii Cancer Center, Honolulu, HI, USA
Loic Le Marchand
Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
Ludmila Vodickova, Pavel Vodicka & Veronika Vymetalkova
Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
Ludmila Vodickova, Pavel Vodicka & Veronika Vymetalkova
Faculty of Medicine and Biomedical Center in Pilsen, Charles University, Pilsen, Czech Republic
Ludmila Vodickova
Department of Gastroenterology, Biodonostia Health Research Institute, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Universidad del País Vasco (UPV/EHU), San Sebastián, Spain
Luis Bujanda
Moffitt Cancer Center, Tampa, FL, USA
Marco Matejcic
Department of Internal Medicine, University of Utah, Salt Lake City, UT, USA
Martha L. Slattery
Department of Medicine and Surgery, LUM University, Casamassima, Italy
Mauro D’Amato
Gastrointestinal Genetics Lab, CIC bioGUNE-BRTA, Derio, Spain
Mauro D’Amato
Department of Environmental Genomics, School of Public Health, Nanjing Medical University, Nanjing, China
Meilin Wang
Memorial University of Newfoundland, Discipline of Genetics, St. John’s, Canada
Michael O. Woods
Departments of Epidemiology and Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
Mingyang Song & Susan Vadaparampil
Division of Epidemiology, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
Motoki Iwasaki & Taiki Yamaji
Division of Cohort Research, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
Motoki Iwasaki, Norie Sawada & Shoichiro Tsugane
Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
Mulong Du
Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Mulong Du
Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
Peter T. Campbell
Department of Laboratory Medicine and Pathology, Mayo Clinic Arizona, Scottsdale, AZ, USA
Rish K. Pai
Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
Robert E. Schoen
Samuel Oschin Comprehensive Cancer Institute, CEDARS-SINAI, Los Angeles, CA, USA
Robert W. Haile
Department of Public Health, Erasmus University Medical Center, Rotterdam, The Netherlands
Rosita Vandenputtelaar, Reinier Meester & Iris Lansdorp Vogelaar
Nantes Université, CHU Nantes, Service de Génétique Médicale, F-44000, Nantes, France
Sébastien Küry
Gastroenterology Department, Hospital Clínic, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), University of Barcelona, Barcelona, Spain
Sergi Castellví-Bel
Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
Sonja I. Berndt, Stephanie J. Weinstein & Stephen J. Chanock
National University Cancer Institute, Singapore, Singapore
Soo Chin Lee
Department of Epidemiology and Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, Korea
Sun Ha Jee
Department of Preventive Medicine, Chonnam National University Medical School, Gwangju, Korea
Sun-Seog Kweon
Jeonnam Regional Cancer Center, Chonnam National University Hwasun Hospital, Hwasun, Korea
Sun-Seog Kweon
Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, NC, USA
Temitope O. Keku
State Key Laboratory of Oncology in South China, Cancer Center, Sun Yat-sen University, Guangzhou, China
Wei-Hua Jia
Vanderbilt University Medical Center, Nashville, TN, USA
Xiao-Ou Shu
Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Zsofia K. Stadler
Department of Radiation Sciences, Oncology Unit, Umeå University, Umeå, Sweden
Bethany Van Guelpen
Wallenberg Centre for Molecular Medicine, Umeå University, Umeå, Sweden
Bethany Van Guelpen
Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT, USA
Cornelia M. Ulrich
Oncology Data Analytics Program, Catalan Institute of Oncology-IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
Victor Moreno
CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
Victor Moreno
Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
Victor Moreno
ONCOBELL Program, Bellvitge Biomedical Research Institute (IDIBELL), L’Hospitalet de Llobregat, Barcelona, Spain
Victor Moreno
Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Jane C. Figueiredo
Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
Graham Casey
Department of Medical Oncology & Therapeutics Research, City of Hope National Medical Center, Duarte, CA, USA
Stephen B. Gruber
Division of Epidemiology, Department of Population Health, New York University School of Medicine, New York, NY, USA
Richard B. Hayes
Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Paul D. P. Pharoah
Department of Gastroenterology, Kaiser Permanente Medical Center, San Francisco, CA, USA
Douglas A. Corley
Department of Biostatistics, University of Washington, Seattle, WA, USA
Li Hsu

Authors

Minta Thomas
Yu-Ru Su
Elisabeth A. Rosenthal
Lori C. Sakoda
Stephanie L. Schmit
Maria N. Timofeeva
Zhishan Chen
Ceres Fernandez-Rozadilla
Philip J. Law
Neil Murphy
Robert Carreras-Torres
Virginia Diez-Obrero
Franzel J. B. van Duijnhoven
Shangqing Jiang
Aesun Shin
Alicja Wolk
Amanda I. Phipps
Andrea Burnett-Hartman
Andrea Gsur
Andrew T. Chan
Ann G. Zauber
Anna H. Wu
Annika Lindblom
Caroline Y. Um
Catherine M. Tangen
Chris Gignoux
Christina Newton
Christopher A. Haiman
Conghui Qu
D. Timothy Bishop
Daniel D. Buchanan
David R. Crosslin
David V. Conti
Dong-Hyun Kim
Elizabeth Hauser
Emily White
Erin Siegel
Fredrick R. Schumacher
Gad Rennert
Graham G. Giles
Heather Hampel
Hermann Brenner
Isao Oze
Jae Hwan Oh
Jeffrey K. Lee
Jennifer L. Schneider
Jenny Chang-Claude
Jeongseon Kim
Jeroen R. Huyghe
Jiayin Zheng
Jochen Hampe
Joel Greenson
John L. Hopper
Julie R. Palmer
Kala Visvanathan
Keitaro Matsuo
Koichi Matsuda
Keum Ji Jung
Li Li
Loic Le Marchand
Ludmila Vodickova
Luis Bujanda
Marc J. Gunter
Marco Matejcic
Mark A. Jenkins
Martha L. Slattery
Mauro D’Amato
Meilin Wang
Michael Hoffmeister
Michael O. Woods
Michelle Kim
Mingyang Song
Motoki Iwasaki
Mulong Du
Natalia Udaltsova
Norie Sawada
Pavel Vodicka
Peter T. Campbell
Polly A. Newcomb
Qiuyin Cai
Rachel Pearlman
Rish K. Pai
Robert E. Schoen
Robert S. Steinfelder
Robert W. Haile
Rosita Vandenputtelaar
Ross L. Prentice
Sébastien Küry
Sergi Castellví-Bel
Shoichiro Tsugane
Sonja I. Berndt
Soo Chin Lee
Stefanie Brezina
Stephanie J. Weinstein
Stephen J. Chanock
Sun Ha Jee
Sun-Seog Kweon
Susan Vadaparampil
Tabitha A. Harrison
Taiki Yamaji
Temitope O. Keku
Veronika Vymetalkova
Volker Arndt
Wei-Hua Jia
Xiao-Ou Shu
Yi Lin
Yoon-Ok Ahn
Zsofia K. Stadler
Bethany Van Guelpen
Cornelia M. Ulrich
Elizabeth A. Platz
John D. Potter
Christopher I. Li
Reinier Meester
Victor Moreno
Jane C. Figueiredo
Graham Casey
Iris Lansdorp Vogelaar
Malcolm G. Dunlop
Stephen B. Gruber
Richard B. Hayes
Paul D. P. Pharoah
Richard S. Houlston
Gail P. Jarvik
Ian P. Tomlinson
Wei Zheng
Douglas A. Corley
Ulrike Peters
Li Hsu

Contributions

M.T., S.L.S., L.L.M., M.A.J., S.T.V., G.P.J., U.P., and L.H. designed the study. F.J.BvD., A.S., A.B.H., A.G., D.D.B., D.V.C., D.H.K., E.W., H.H., H.B., J.C.C., J.K., J.H., L.L.M., L.B., M.A.J., M.D.A., M.L.W., M.S., M.L.D., R.Pe., R.E.S., R.W.H., S.Kü., S.C.B., S.T., S.B., T.A.H., V.V., V.A., J.D.P., C.I.L., J.C.F., I.L.V., P.D.P.P., R.S.H., G.P.J., I.P.T., W.Z., D.A.C., U.P., and L.H. recruited patients and collected samples. M.T., Y.S., E.A.R., L.C.S., S.L.S., M.N.T., Z.C., C.F.R., P.L., N.M., R.C.T., V.D.O., S.J., A.W., A.I.P., A.T.C., A.G.Z., A.H.W., A.L., C.Y.U., C.M.T., C.G., C.N., C.A.H., C.Q., D.T.B., D.D.B., D.R.C., D.V.C., D.H.K., E.H., E.S., F.R.S., G.R., G.G.G., H.H., I.O., J.H.O., J.K.L., J.L.S., J.C.C., J.K., J.R.H., J.Z., J.G., J.L.H., J.R.P., K.V., Ke.M., Ko.M., K.J.J., L.L., L.L.M., L.V., L.B., M.J.G., M.M., M.L.S., M.D.A., M.L.W., M.H., M.O.W., M.S.K., M.S., M.I., M.L.D., N.U., N.S., P.V., P.T.C., P.A.N., Q.C., R.Pe., R.Pa., R.E.S., R.S.S., R.W.H., R.V., R.Pr., S.Kü., S.C.B., S.T., S.I.B., L.S.C., S.B., S.J.W., S.J.C., S.H.J., S.Kw., T.A.H., T.Y., T.O.K., V.V., V.A., W.J., X.S., Y.L., Y.A., Z.K.S., B.V.G., C.M.U., E.A.P., J.D.P., C.I.L., R.M., V.M., J.C.F., G.C., I.L.V., M.G.D., S.B.G., R.B.H., P.D.P.P., R.S.H., G.P.J., I.P.T., W.Z., D.A.C., U.P., and L.H. analyzed and interpreted the data. All authors drafted or substantially revised the manuscript. L.H. and U.P. supervised the study, acquired funding, and are the corresponding authors.

Corresponding authors

Correspondence to Ulrike Peters or Li Hsu.

Ethics declarations

Competing interests

D.A.C. receives funds from NCI. K.V. receives related research support from Cepheid and has a non-financial collaboration with Optra Health. L.B. is a consultant or has received research funding from Ikan Biotech. R.E.S. received research support from Immunovia, Freenome, and Exact Sciences. Z.K.S.’s immediate family member serves as a consultant in Ophthalmology for Adverum, Genentech, Gyroscope Therapeutics Limited, Neurogene, Optos Plc, Outlook Therapeutics, RegenexBio, and Regeneron (outside the submitted work). The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Thomas, M., Su, YR., Rosenthal, E.A. et al. Combining Asian and European genome-wide association studies of colorectal cancer improves risk prediction across racial and ethnic populations. Nat Commun 14, 6147 (2023). https://doi.org/10.1038/s41467-023-41819-0

Received: 05 January 2023
Accepted: 19 September 2023
Published: 02 October 2023
DOI: https://doi.org/10.1038/s41467-023-41819-0
