Abstract
Genome-wide association studies (GWASs) have been mostly conducted in populations of European ancestry, which currently limits the transferability of their findings to other populations. Here, we show, through theory, simulations and applications to real data, that adjustment of GWAS analyses for polygenic scores (PGSs) increases the statistical power for discovery across all ancestries. We applied this method to analyze seven traits available in three large biobanks with participants of East Asian ancestry (n = 340,000 in total) and report 139 additional associations across traits. We also present a two-stage meta-analysis strategy whereby, in contributing cohorts, a PGS-adjusted GWAS is rerun using PGSs derived from a first round of a standard meta-analysis. On average, across traits, this approach yields a 1.26-fold increase in the number of detected associations (range 1.07- to 1.76-fold increase). Altogether, our study demonstrates the value of using PGSs to increase the power of GWASs in underrepresented populations and promotes such an analytical strategy for future GWAS meta-analyses.
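The intuition behind PGS adjustment can be illustrated with a small simulation. The sketch below is not the authors' pipeline (their shell and R scripts are listed under Code availability); it is a minimal, standard-library Python illustration under stated assumptions: a single tested variant, a PGS that is independent of that variant (as would hold, for example, for a score built without the variant's chromosome) and that explains a fraction `r2_pgs` of trait variance. All function names and parameter values are hypothetical. Under these assumptions, adding the PGS as a covariate shrinks the residual variance by roughly a factor of (1 − `r2_pgs`), so the association test statistic for the variant is expected to grow by about 1 / (1 − `r2_pgs`).

```python
import math
import random


def center(x):
    """Subtract the mean from every element."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def residualize(y, c):
    """Residuals of y after simple linear regression on covariate c."""
    y, c = center(y), center(c)
    slope = sum(yi * ci for yi, ci in zip(y, c)) / sum(ci * ci for ci in c)
    return [yi - slope * ci for yi, ci in zip(y, c)]


def chisq(y, g, n_covariates=0):
    """Squared t-statistic for the slope of y on g (ordinary least squares)."""
    y, g = center(y), center(g)
    sgg = sum(gi * gi for gi in g)
    b = sum(yi * gi for yi, gi in zip(y, g)) / sgg
    rss = sum((yi - b * gi) ** 2 for yi, gi in zip(y, g))
    df = len(y) - 2 - n_covariates
    se = math.sqrt(rss / df / sgg)
    return (b / se) ** 2


def mean_chisq(n_reps=50, n=2000, beta=0.1, maf=0.3, r2_pgs=0.5, seed=1):
    """Average test statistic without and with PGS adjustment."""
    rng = random.Random(seed)
    unadj_total, adj_total = 0.0, 0.0
    for _ in range(n_reps):
        # Genotype: allele count (0, 1 or 2) at the tested variant.
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
        # Assumption: the PGS is independent of g and has variance r2_pgs.
        pgs = [rng.gauss(0.0, math.sqrt(r2_pgs)) for _ in range(n)]
        noise = [rng.gauss(0.0, math.sqrt(1.0 - r2_pgs)) for _ in range(n)]
        y = [beta * gi + pi + ei for gi, pi, ei in zip(g, pgs, noise)]
        # Standard GWAS test: regress the trait on the genotype alone.
        unadj_total += chisq(y, g)
        # PGS-adjusted test: partial the PGS out of both trait and genotype,
        # which reproduces the multiple-regression test for g.
        adj_total += chisq(residualize(y, pgs), residualize(g, pgs),
                           n_covariates=1)
    return unadj_total / n_reps, adj_total / n_reps


unadj, adj = mean_chisq()
print(f"mean chi-square, unadjusted:   {unadj:.1f}")
print(f"mean chi-square, PGS-adjusted: {adj:.1f}")
```

Residualizing both the trait and the genotype on the PGS keeps the example free of matrix algebra while still giving the same test as fitting the variant and the PGS jointly. In real data the PGS would be estimated with error and may correlate with the tested variant, so the gain is typically smaller than this idealized setting suggests.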



Data availability
Data derived from the Taiwan Biobank are restricted. Access to data in general (including genome-wide association study (GWAS) summary statistics) requires transfer agreements and other requirements. Specific inquiries regarding how to access these resources should be sent to Taiwan Biobank researchers. Phenotypes and genotypes from participants of the Health and Retirement Study (HRS) can be accessed from dbGaP under accession no. phs000428.v2.p2. Summary statistics of GWAS meta-analyses for the seven traits measured across BioBank Japan, the Taiwan Biobank and the Korean Genome and Epidemiology Study, as well as GWASs of X-chromosome variants conducted in the UK Biobank, are publicly available at https://zenodo.org/record/8213134 (version 3). Source data are provided with this paper.
Code availability
Source code (shell and R scripts) used to run simulations and polygenic score-adjusted genome-wide association study analyses can be publicly downloaded at https://zenodo.org/record/8213134 (version 3).
Acknowledgements
L.Y. was supported by the Australian Research Council (DE200100425, FT220100069). P.M.V. was supported by the Australian Research Council (FL180100072). Y.-F.L. was supported by the National Health Research Institutes (NP-109-PP-09, NP-110-PP-09) and the National Science and Technology Council (109-2314-B-400-017, 110-2314-B-400-028-MY3) of Taiwan. Y.-C.A.F. acknowledges support from the National Taiwan University (NTU-112L7404), the Yushan Young Fellow Program provided by the Ministry of Education (MOE; NTU-112V1020-2), the National Science and Technology Council (NSTC 112-2314-B-002-200-MY3) and the Population Health Research Center from Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the MOE in Taiwan (NTU-112L9004). Y.O. was supported by the Japan Society for the Promotion of Science KAKENHI (Grants-in-Aid for Scientific Research; 22H00476), Japan Agency for Medical Research and Development (JP21gm4010006, JP22km0405211, JP22ek0410075, JP22km0405217, JP22ek0109594), Japan Science and Technology Agency's Moonshot R&D Program (JPMJMS2021, JPMJMS2024), Takeda Science Foundation and Bioinformatics Initiative of Osaka University Graduate School of Medicine. S.N. was supported by the Takeda Science Foundation. The Korean Genome and Epidemiology Study (KoGES) was supported by the Brain Pool Plus (BP+, Brain Pool+) Program through the National Research Foundation of Korea funded by the Ministry of Science and ICT (2020H1D3A2A03100666). This study includes data from the KoGES (4851-302), National Research Institute of Health, Centers for Disease Control and Prevention, Ministry for Health and Welfare, Republic of Korea. This research was conducted using the Taiwan Biobank resource. We thank all participants and investigators of the Taiwan Biobank. We thank the National Center for Genome Medicine of Taiwan for the technical support in genotyping.
We thank the National Core Facility for Biopharmaceuticals (MOST 106-2319-B-492-002) and the National Center for High-Performance Computing of the National Applied Research Laboratories of Taiwan for providing computational and storage resources. The Health and Retirement Study (HRS) was supported by the National Institute on Aging (U01AG009740). HRS genotyping received additional support from the National Institute on Aging (RC2 AG036495, RC4 AG039029). HRS data were obtained from dbGaP (database of Genotypes and Phenotypes, accession no. phs000428.v2.p2). We thank D.J. Benjamin, P. Turley and M.E. Goddard for helpful and constructive discussions.
Author information
Contributions
L.Y. designed the study, derived the theory and ran simulations. A.I.C. ran simulations, performed simulated and real data analysis and drafted the manuscript. S.N., S.-C.L., K.N., J.S., H.W., Y.K., L.-H.W., S.L., Y.-F.L., Y.-C.A.F., Y.O. and P.M.V. curated the data, performed quality control, performed statistical analyses and interpreted the results. All authors contributed to the writing and revision of the manuscript.
Ethics declarations
Competing interests
A.I.C. is currently an employee of the Regeneron Genetics Center, a wholly owned subsidiary of Regeneron Pharmaceuticals, Inc., and may own stocks or stock options. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Cassandra Spracklen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Methods, Note, Figs. 1–16 and Tables 1–10.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Campos, A.I., Namba, S., Lin, SC. et al. Boosting the power of genome-wide association studies within and across ancestries by using polygenic scores. Nat Genet (2023). https://doi.org/10.1038/s41588-023-01500-0
DOI: https://doi.org/10.1038/s41588-023-01500-0