Abstract
Polygenic risk scores (PRSs) summarize the genetic predisposition of a complex human trait or disease and may become a valuable tool for advancing precision medicine. However, PRSs that are developed in populations of predominantly European genetic ancestries can increase health disparities due to poor predictive performance in individuals of diverse and complex genetic ancestries. We describe genetic and modifiable risk factors that limit the transferability of PRSs across populations and review the strengths and weaknesses of existing PRS construction methods for diverse ancestries. Developing PRSs that benefit global populations in research and clinical settings provides an opportunity for innovation and is essential for health equity.
Acknowledgements
This Review was supported by the National Institutes of Health (NIH) for the Polygenic Risk Methods in Diverse Populations (PRIMED) Consortium, with grant funding for the Coordinating Center (U01HG011697) and the study sites PREVENT (U01HG011710), CAPE (U01HG011715), CARDINAL (U01HG011717), FFAIRR-PRS (U01HG011719), EPIC-PRS (U01HG011720), D-PRISM (U01HG011723) and PRIMED-Cancer (U01CA261339). Additional funding was received from the NIH: R00CA246076 (to L.K.), R01HG010480 and U01CA249866 (to N.C.), R35GM140487 (to D.J.S.), R01CA241410 (to J.S.W.) and R01HG012354 (to T.G.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The authors thank Y. Ding and H. Zhang for their help with creating the figures in this Review.
Author information
Contributions
L.K., B.P., J.S.W. and T.G. conceptualized the Review. L.K., N.C., J.H. and T.G. drafted the manuscript with input from D.J.S., I.M., I.J.K., E.E.K., B.P. and J.S.W. All authors contributed to the literature search, synthesis and interpretation of findings, and reviewed and/or edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Reviews Genetics thanks Michael Inouye and the other, anonymous, reviewer for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
BridgePRS: https://github.com/clivehoggart/BridgePRS
CT-SLEB: https://github.com/andrewhaoyu/CTSLEB
ME-Bayes SL: https://github.com/Jin93/MEBayesSL
PolyPred-S+/PolyPred-P+: https://github.com/omerwe/polyfun
PROSPER: https://github.com/Jingning-Zhang/PROSPER
PRS-CSx(-auto): https://github.com/getian107/PRScsx
SDPRX: https://github.com/eldronzhou/SDPRX
ShaPRS: https://github.com/mkelcb/shaprs
TL-Multi: https://github.com/mxxptian/TLMulti
TL-PRS/MTL-PRS: https://github.com/ZhangchenZhao/TLPRS
X-Wing: https://github.com/qlu-lab/X-Wing
XP-BLUP: https://github.com/tanglab/XP-BLUP
XPASS(+): https://github.com/YangLabHKUST/XPASS
Glossary
- Absolute risk
The probability that a person or group of individuals who are free of a certain disease at a given point in time will develop that disease over a certain time period. Absolute risks are typically expressed as percentages from 0% to 100%.
- Admixture
The process by which two or more previously separated populations come into contact, often through migration, generating a descendant population with a mixed mosaic of genetic material.
- Admixture mapping
An approach that consists of inferring local genetic ancestry and testing for association between local ancestry segments derived from different ancestral populations and the phenotype.
- Area under the receiver operating characteristic curve
(AUC). A measure of how well a model discriminates between diseased and disease-free individuals, obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 – specificity) across all classification thresholds and taking the area under the resulting curve. An AUC of 0.50 indicates that the classification accuracy of a model is equal to chance; an AUC of 1.0 indicates perfect discrimination.
- Clumping
A procedure that iteratively selects the variant with the lowest P-value within a specified window from genome-wide association study (GWAS) results and removes nearby variants that are correlated with the selected variant above a specified linkage disequilibrium (LD) threshold.
- Genetic architecture
The genetic basis of a trait described by the number, frequency and magnitude of effect size of genetic variants contributing to its heritability.
- Genetic correlation
The correlation between the genetic influences on two traits, or the proportion of variance that two traits share due to genetics.
- Haplotype
A cluster of polymorphisms or alleles that typically reside near each other on a chromosome and tend to be inherited together.
- Linkage disequilibrium
(LD). Non-random association of alleles at different genetic loci, often measured as the square of the correlation coefficient between two alleles. LD is, on average, lower in African populations compared with European and Asian populations.
- Meta-analysis
Statistical analysis that combines results from multiple studies.
- Net reclassification indices
Metrics that measure the extent to which a new model improves classification as compared with an old model, calculated as the difference between the proportion of individuals who are correctly reclassified and the proportion of individuals who are incorrectly reclassified.
- P-value thresholding
A procedure that selects the genetic variants whose P-value is below a threshold in a genome-wide association study (GWAS).
- Polygenic risk scores
(PRSs; also known as genetic risk scores). Single values that quantify an individual’s genetic predisposition to a discrete health outcome, calculated as a sum of alleles weighted by effect sizes corresponding to a relative magnitude of association.
- Polygenic scores
Single values that quantify an individual’s genetic predisposition calculated as a sum of trait-associated alleles weighted by their additive, per-allele effect sizes, typically derived from genome-wide association studies (GWAS).
- Population structure
The presence of multiple genetically distinct subpopulations that differ in their allele frequencies and mean phenotypic values. Not accounting for this structure can lead to spurious associations in genome-wide association studies (GWAS) and polygenic risk score (PRS) analyses.
- Relative risk
The probability that a certain health outcome will occur in a person or group of individuals relative to the probability that this event will occur in a reference population. Relative risks are typically expressed as ratios, with 1.0 indicating no difference between the comparison groups.
- Risk stratification
The process of classifying and ordering individuals according to their specific risk estimates.
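Several glossary entries (clumping, P-value thresholding, polygenic scores and AUC) describe steps that are often chained together. The following is a minimal sketch of how they could fit together on toy data, not an implementation of any of the methods listed under Related links. All function names, parameter defaults and input values are hypothetical and chosen only for illustration; the AUC is computed through its equivalent pairwise formulation (the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half).

```python
import numpy as np


def clump(p_values, positions, ld_r2, window_bp=250_000, r2_threshold=0.1):
    """Greedy clumping: repeatedly keep the remaining variant with the lowest
    P-value and drop variants within window_bp whose r^2 with it exceeds
    r2_threshold. Defaults are illustrative assumptions."""
    remaining = set(range(len(p_values)))
    kept = []
    for idx in np.argsort(p_values):  # most significant variant first
        idx = int(idx)
        if idx not in remaining:
            continue  # already removed by a more significant index variant
        kept.append(idx)
        remaining.discard(idx)
        remaining -= {
            j for j in remaining
            if abs(positions[j] - positions[idx]) <= window_bp
            and ld_r2[idx, j] > r2_threshold
        }
    return sorted(kept)


def p_value_threshold(p_values, indices, p_threshold):
    """Keep only the variants whose GWAS P-value is below p_threshold."""
    return [i for i in indices if p_values[i] < p_threshold]


def polygenic_score(genotypes, effect_sizes, indices):
    """Sum of allele counts (0/1/2) weighted by per-allele effect sizes."""
    return genotypes[:, indices] @ effect_sizes[indices]


def auc(scores, is_case):
    """Pairwise AUC: P(case score > control score) + 0.5 * P(tie)."""
    diff = scores[is_case][:, None] - scores[~is_case][None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


# Hypothetical toy inputs: 5 variants, 4 individuals.
p_values = np.array([1e-9, 1e-6, 0.03, 1e-8, 0.4])
positions = np.array([100_000, 120_000, 150_000, 900_000, 950_000])
effect_sizes = np.array([0.2, 0.15, 0.01, -0.1, 0.0])
ld_r2 = np.eye(5)
ld_r2[0, 1] = ld_r2[1, 0] = 0.8   # variants 0 and 1 are in strong LD
ld_r2[0, 2] = ld_r2[2, 0] = 0.05  # variants 0 and 2 are nearly independent
ld_r2[3, 4] = ld_r2[4, 3] = 0.6   # variants 3 and 4 are in moderate LD
genotypes = np.array([
    [2, 1, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 2, 2, 2],
    [0, 1, 0, 1, 0],
])
is_case = np.array([True, True, False, False])

clumped = clump(p_values, positions, ld_r2)
selected = p_value_threshold(p_values, clumped, p_threshold=5e-8)
scores = polygenic_score(genotypes, effect_sizes, selected)
print(selected, scores, auc(scores, is_case))
```

In practice, the LD matrix would be estimated from a reference panel, and the P-value and LD thresholds would be tuned in a separate data set; this sketch fixes them only to keep the example short.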
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kachuri, L., Chatterjee, N., Hirbo, J. et al. Principles and methods for transferring polygenic risk scores across global populations. Nat Rev Genet (2023). https://doi.org/10.1038/s41576-023-00637-2
DOI: https://doi.org/10.1038/s41576-023-00637-2