Genome-wide association studies (GWASs) have been mostly conducted in populations of European ancestry, which currently limits the transferability of their findings to other populations. Here, we show, through theory, simulations and applications to real data, that adjustment of GWAS analyses for polygenic scores (PGSs) increases the statistical power for discovery across all ancestries. We applied this method to analyze seven traits available in three large biobanks with participants of East Asian ancestry (n = 340,000 in total) and report 139 additional associations across traits. We also present a two-stage meta-analysis strategy whereby, in contributing cohorts, a PGS-adjusted GWAS is rerun using PGSs derived from a first round of a standard meta-analysis. On average, across traits, this approach yields a 1.26-fold increase in the number of detected associations (range 1.07- to 1.76-fold increase). Altogether, our study demonstrates the value of using PGSs to increase the power of GWASs in underrepresented populations and promotes such an analytical strategy for future GWAS meta-analyses.
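The power gain described above can be illustrated with a small simulation. The sketch below is not the authors' code (their shell and R scripts are in the Zenodo archive); it is a minimal illustration using only the Python standard library, and every function name and parameter value in it is hypothetical. It assumes a standardized trait, a single tested variant explaining a fraction q2 of trait variance, and a PGS that explains a fraction r2_pgs of trait variance and is independent of the tested variant (for example, a score built from other chromosomes). Under those assumptions, ordinary linear-regression theory suggests that adjusting for the PGS shrinks the residual variance from about 1 to about 1 - r2_pgs, so the expected association statistic grows by roughly 1 / (1 - r2_pgs).

```python
import math
import random


def residualize(values, covariate=None):
    """Remove the least-squares fit on an intercept and one optional covariate."""
    n = len(values)
    mean_v = sum(values) / n
    centered = [v - mean_v for v in values]
    if covariate is None:
        return centered
    mean_c = sum(covariate) / n
    cov_centered = [c - mean_c for c in covariate]
    slope = (sum(a * b for a, b in zip(centered, cov_centered))
             / sum(b * b for b in cov_centered))
    return [a - slope * b for a, b in zip(centered, cov_centered)]


def assoc_chi2(y, g, covariate=None):
    """Squared t statistic for the variant effect in y ~ g (+ covariate).

    Uses the Frisch-Waugh-Lovell identity: residualize y and g on the
    covariates, then run a simple regression on the residuals.
    """
    ry = residualize(y, covariate)
    rg = residualize(g, covariate)
    sgg = sum(x * x for x in rg)
    beta = sum(a * b for a, b in zip(ry, rg)) / sgg
    n_params = 2 if covariate is None else 3  # intercept, variant, optional PGS
    rss = sum((a - beta * b) ** 2 for a, b in zip(ry, rg))
    se2 = rss / (len(y) - n_params) / sgg
    return beta * beta / se2


def approx_gain(r2_pgs):
    """Approximate fold increase in the non-centrality parameter (assumes small q2)."""
    return 1.0 / (1.0 - r2_pgs)


def simulate(n=1000, n_reps=100, maf=0.3, q2=0.01, r2_pgs=0.5, seed=1):
    """Return mean test statistics for (standard GWAS, PGS-adjusted GWAS)."""
    rng = random.Random(seed)
    # Per-allele effect chosen so that the variant explains q2 of trait variance.
    b = math.sqrt(q2 / (2 * maf * (1 - maf)))
    sd_pgs = math.sqrt(r2_pgs)
    sd_err = math.sqrt(1 - r2_pgs - q2)
    total_std = total_adj = 0.0
    for _ in range(n_reps):
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
        pgs = [rng.gauss(0, sd_pgs) for _ in range(n)]  # assumed independent of g
        y = [b * gi + pi + rng.gauss(0, sd_err) for gi, pi in zip(g, pgs)]
        total_std += assoc_chi2(y, g)
        total_adj += assoc_chi2(y, g, pgs)
    return total_std / n_reps, total_adj / n_reps


if __name__ == "__main__":
    mean_std, mean_adj = simulate()
    print(f"mean statistic, standard GWAS:     {mean_std:.2f}")
    print(f"mean statistic, PGS-adjusted GWAS: {mean_adj:.2f}")
    print(f"approximate theoretical gain:      {approx_gain(0.5):.2f}")
```

The value r2_pgs = 0.5 is deliberately large to make the effect easy to see in a small simulation. Real PGSs typically explain less trait variance, in which case the same approximation implies a correspondingly smaller gain.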
Data derived from the Taiwan Biobank are restricted. Access to these data, including genome-wide association study (GWAS) summary statistics, is subject to data transfer agreements and other conditions. Specific inquiries regarding how to access these resources should be sent to Taiwan Biobank researchers. Phenotypes and genotypes from participants of the Health and Retirement Study (HRS) can be accessed from dbGaP under accession no. phs000428.v2.p2. Summary statistics of GWAS meta-analyses for the seven traits measured across BioBank Japan, the Taiwan Biobank and the Korean Genome and Epidemiology Study, as well as GWASs of X-chromosome variants conducted in the UK Biobank, are publicly available at https://zenodo.org/record/8213134 (version 3). Source data are provided with this paper.
Source code (shell and R scripts) used to run simulations and polygenic score-adjusted genome-wide association study analyses can be publicly downloaded at https://zenodo.org/record/8213134 (version 3).
L.Y. was supported by the Australian Research Council (DE200100425, FT220100069). P.M.V. was supported by the Australian Research Council (FL180100072). Y.-F.L. was supported by the National Health Research Institutes (NP-109-PP-09, NP-110-PP-09) and the National Science and Technology Council (109-2314-B-400-017, 110-2314-B-400-028-MY3) of Taiwan. Y.-C.A.F. acknowledges support from the National Taiwan University (NTU-112L7404), the Yushan Young Fellow Program provided by the Ministry of Education (MOE; NTU-112V1020-2), the National Science and Technology Council (NSTC 112-2314-B-002-200-MY3) and the Population Health Research Center from Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the MOE in Taiwan (NTU-112L9004). Y.O. was supported by the Japan Society for the Promotion of Science KAKENHI (Grants-in-Aid for Scientific Research; 22H00476), Japan Agency for Medical Research and Development (JP21gm4010006, JP22km0405211, JP22ek0410075, JP22km0405217, JP22ek0109594), Japan Science and Technology Agency’s Moonshot R&D Program (JPMJMS2021, JPMJMS2024), Takeda Science Foundation and Bioinformatics Initiative of Osaka University Graduate School of Medicine. S.N. was supported by the Takeda Science Foundation. The Korean Genome and Epidemiology Study (KoGES) was supported by the Brain Pool Plus (BP+, Brain Pool+) Program through the National Research Foundation of Korea funded by the Ministry of Science and ICT (2020H1D3A2A03100666). This study includes data from the KoGES (4851-302), National Research Institute of Health, Centers for Disease Control and Prevention, Ministry for Health and Welfare, Republic of Korea. This research was conducted using the Taiwan Biobank resource. We thank all participants and investigators of the Taiwan Biobank. We thank the National Center for Genome Medicine of Taiwan for the technical support in genotyping. 
We thank the National Core Facility for Biopharmaceuticals (MOST 106-2319-B-492-002) and the National Center for High-Performance Computing of the National Applied Research Laboratories of Taiwan for providing computational and storage resources. The Health and Retirement Study (HRS) was supported by the National Institute on Aging (U01AG009740). HRS genotyping received additional support from the National Institute on Aging (RC2 AG036495, RC4 AG039029). HRS data were obtained from dbGaP (database of Genotypes and Phenotypes, accession no. phs000428.v2.p2). We thank D.J. Benjamin, P. Turley and M.E. Goddard for helpful and constructive discussions.
A.I.C. is currently an employee of the Regeneron Genetics Center, a wholly owned subsidiary of Regeneron Pharmaceuticals, Inc., and may own stocks or stock options. The remaining authors declare no competing interests.
Peer review information
Nature Genetics thanks Cassandra Spracklen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Campos, A.I., Namba, S., Lin, SC. et al. Boosting the power of genome-wide association studies within and across ancestries by using polygenic scores. Nat Genet (2023). https://doi.org/10.1038/s41588-023-01500-0