Tutorial: a guide to performing polygenic risk score analyses

Abstract

A polygenic score (PGS) or polygenic risk score (PRS) is an estimate of an individual’s genetic liability to a trait or disease, calculated according to their genotype profile and relevant genome-wide association study (GWAS) data. While present PRSs typically explain only a small fraction of trait variance, their correlation with the single largest contributor to phenotypic variation—genetic liability—has led to the routine application of PRSs across biomedical research. Among a range of applications, PRSs are exploited to assess shared etiology between phenotypes, to evaluate the clinical utility of genetic data for complex disease and as part of experimental studies in which, for example, experiments are performed that compare outcomes (e.g., gene expression and cellular response to treatment) between individuals with low and high PRS values. As GWAS sample sizes increase and PRSs become more powerful, PRSs are set to play a key role in research and stratified medicine. However, despite the importance and growing application of PRSs, there are limited guidelines for performing PRS analyses, which can lead to inconsistency between studies and misinterpretation of results. Here, we provide detailed guidelines for performing and interpreting PRS analyses. We outline standard quality control steps, discuss different methods for the calculation of PRSs, provide an introductory online tutorial, highlight common misconceptions relating to PRS results, offer recommendations for best practice and discuss future challenges.
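As a rough illustration of what "calculated according to their genotype profile and relevant GWAS data" can mean in the simplest case, the sketch below computes a score as a weighted sum of effect-allele counts, with GWAS effect sizes as weights. The function name, argument names and example values are hypothetical, and the sketch deliberately omits the quality control, linkage disequilibrium handling and shrinkage steps that the protocol discusses.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], betas: Sequence[float]) -> float:
    """Weighted sum of effect-allele counts for one individual.

    dosages: number of effect alleles carried at each SNP (0, 1 or 2,
             or a fractional dosage for imputed genotypes).
    betas:   per-SNP effect sizes taken from GWAS summary statistics
             (assumed here to be aligned to the same effect allele).
    """
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must cover the same SNPs")
    return sum(beta * dosage for beta, dosage in zip(betas, dosages))


# Hypothetical three-SNP example (values invented for illustration only).
example_betas = [0.10, -0.05, 0.20]
example_dosages = [2, 1, 0]
example_score = polygenic_score(example_dosages, example_betas)
```

In this toy example the score is approximately 0.15 (2 × 0.10 − 1 × 0.05 + 0 × 0.20). Real analyses typically rely on dedicated software such as the tools listed under Related links rather than hand-written code.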

Fig. 1: The PRS analysis process.
Fig. 2: Shown is a flow chart of suggested analytical steps that can be followed to perform QC and select software for PRS analyses.
Fig. 3: Illustration of major sources of inflation/deflation of PRS-trait associations.
Fig. 4: Results from a simulation study comparing Nagelkerke pseudo-R² with the pseudo-R² proposed by Lee et al. (ref. 75) that incorporates adjustment for the sample case/control ratio.
Fig. 5: Three different ways of representing the same data.
Fig. 6: Examples of the performance of PRS analyses on real data by validation sample size, according to (a) phenotypic variance explained (R²) and (b) association P value.
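For readers unfamiliar with the Nagelkerke pseudo-R² named in the Fig. 4 caption, the following is a minimal sketch of its standard definition for a binary outcome, computed from the log-likelihoods of an intercept-only model and a fitted model. The function names are hypothetical, the fitted probabilities are assumed to come from a logistic regression fitted elsewhere, and the case/control-ratio adjustment of Lee et al. is not implemented here.

```python
import math
from typing import Sequence


def log_likelihood(y: Sequence[int], p: Sequence[float]) -> float:
    """Bernoulli log-likelihood of 0/1 outcomes y given probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def nagelkerke_r2(y: Sequence[int], p_full: Sequence[float]) -> float:
    """Nagelkerke pseudo-R2: Cox-Snell R2 rescaled by its maximum value.

    y:      case/control status coded 1/0 (both classes must be present).
    p_full: fitted probabilities from the full model (assumed input).
    """
    n = len(y)
    base_rate = sum(y) / n
    if base_rate in (0.0, 1.0):
        raise ValueError("y must contain both cases and controls")
    ll_null = log_likelihood(y, [base_rate] * n)  # intercept-only model
    ll_full = log_likelihood(y, p_full)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical toy data: two cases, two controls.
status = [1, 0, 1, 0]
r2_uninformative = nagelkerke_r2(status, [0.5, 0.5, 0.5, 0.5])
r2_informative = nagelkerke_r2(status, [0.9, 0.1, 0.8, 0.2])
```

A model that predicts the base rate for everyone gives a value of 0, while better-separated predictions move the value towards 1. Because the null likelihood depends on the proportion of cases in the sample, the value is sensitive to the case/control ratio, which is the issue the comparison in Fig. 4 concerns.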

References

1. Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).
2. Kunkle, B. W. et al. Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat. Genet. 51, 414 (2019).
3. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
4. Yang, J. et al. Common SNPs explain a large proportion of heritability for human height. Nat. Genet. 42, 565–569 (2010).
5. Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
6. Dudbridge, F. Polygenic epidemiology. Genet. Epidemiol. 40, 268–272 (2016).
7. Yang, J. et al. GCTA: A tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
8. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
9. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
10. Palla, L. & Dudbridge, F. A fast method that uses polygenic scores to estimate the variance explained by genome-wide marker panels and the proportion of variants affecting a trait. Am. J. Hum. Genet. 97, 250–259 (2015).
11. Purcell, S. M. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).
12. Wray, N. R. et al. Research review: polygenic methods and their application to psychiatric traits. J. Child Psychol. Psychiatry 55, 1068–1087 (2014).
13. Euesden, J., Lewis, C. M. & O’Reilly, P. F. PRSice: polygenic risk score software. Bioinformatics 31, 1466–1468 (2015).
14. Choi, S. W. & O’Reilly, P. F. PRSice-2: polygenic risk score software for biobank-scale data. Gigascience 8, giz082 (2019).
15. Agerbo, E. et al. Polygenic risk score, parental socioeconomic status, family history of psychiatric disorders, and the risk for schizophrenia: a Danish population-based study and meta-analysis. JAMA Psychiatry 72, 635–641 (2015).
16. Mavaddat, N. et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am. J. Hum. Genet. 104, 21–34 (2019).
17. Mullins, N. et al. Polygenic interactions with environmental adversity in the aetiology of major depressive disorder. Psychol. Med. 46, 759–770 (2016).
18. Natarajan, P. et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation 135, 2091–2101 (2017).
19. Mak, T. S. H. et al. Polygenic scores via penalized regression on summary statistics. Genet. Epidemiol. 41, 469–480 (2017).
20. Ge, T. et al. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1–10 (2019).
21. Speed, D. & Balding, D. J. MultiBLUP: improved SNP-based prediction for complex traits. Genome Res. 24, 1550–1557 (2014).
22. Zhou, X., Carbonetto, P. & Stephens, M. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet. 9, e1003264 (2013).
23. Shi, J. et al. Winner’s curse correction and variable thresholding improve performance of polygenic risk modeling based on genome-wide association study summary-level data. PLoS Genet. 12, e1006493 (2016).
24. Lello, L. et al. Accurate genomic prediction of human height. Genetics 210, 477–497 (2018).
25. Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
26. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
27. Evans, L. M. et al. Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits. Nat. Genet. 50, 737–745 (2018).
28. Coleman, J. R. I. et al. Quality control, imputation and analysis of genome-wide genotyping data from the Illumina HumanCoreExome microarray. Brief. Funct. Genomics 15, 298–304 (2016).
29. Marees, A. T. et al. A tutorial on conducting genome-wide association studies: quality control and statistical analysis. Int. J. Methods Psychiatr. Res. 27, e1608 (2018).
30. Anderson, C. A. et al. Data quality control in genetic case-control association studies. Nat. Protoc. 5, 1564–1573 (2010).
31. Speed, D. & Balding, D. J. SumHer better estimates the SNP heritability of complex traits from summary statistics. Nat. Genet. 51, 277–284 (2019).
32. Han, B. & Eskin, E. Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. Am. J. Hum. Genet. 88, 586–598 (2011).
33. Drepper, U., Miller, S. & Madore, D. md5sum(1): compute/check MD5 message digest. Linux man page (accessed 20 October 2018); https://linux.die.net/man/1/md5sum
34. National Center for Biotechnology Information. US National Library of Medicine. Data changes that occur between builds. in SNP FAQ Archive. NCBI Help Manual. https://www.ncbi.nlm.nih.gov/books/NBK44467/ (2005).
35. Hinrichs, A. S. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 34, D590–D598 (2006).
36. Niemi, M. E. K. et al. Common genetic variants contribute to risk of rare severe neurodevelopmental disorders. Nature 562, 268–271 (2018).
37. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
38. Chen, L. M. et al. PRS-on-Spark (PRSoS): a novel, efficient and flexible approach for generating polygenic risk scores. BMC Bioinforma. 19, 295 (2018).
39. Accounting for sex in the genome. Nat. Med. 23, 1243–1243 (2017).
40. König, I. R. et al. How to include chromosome X in your genome-wide association study. Genet. Epidemiol. 38, 97–103 (2014).
41. Wray, N. R. et al. Pitfalls of predicting complex traits from SNPs. Nat. Rev. Genet. 14, 507–515 (2013).
42. Viechtbauer, W. & Cheung, M. W.-L. Outlier and influence diagnostics for meta-analysis. Res. Synth. Methods 1, 112–125 (2010).
43. Socrates, A. et al. Polygenic risk scores applied to a single cohort reveal pleiotropy among hundreds of human phenotypes. Preprint at https://www.biorxiv.org/content/10.1101/203257v1 (2017).
44. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).
45. Vilhjálmsson, B. J. et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576–592 (2015).
46. Newcombe, P. J. et al. A flexible and parallelizable approach to genome-wide polygenic risk scores. Genet. Epidemiol. 43, 730–741 (2019).
47. Loh, P.-R. et al. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).
48. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
49. Privé, F. et al. Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr. Bioinformatics 34, 2781–2787 (2018).
50. Márquez‐Luna, C., Loh, P.-R. & Price, A. L. Multiethnic polygenic risk scores improve risk prediction in diverse populations. Genet. Epidemiol. 41, 811–823 (2017).
51. Clayton, D. Link functions in multi-locus genetic models: implications for testing, prediction, and interpretation. Genet. Epidemiol. 36, 409–418 (2012).
52. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
53. Astle, W. & Balding, D. J. Population structure and cryptic relatedness in genetic association studies. Stat. Sci. 24, 451–471 (2009).
54. Kong, A. et al. The nature of nurture: effects of parental genotypes. Science 359, 424–428 (2018).
55. Selzam, S. et al. Comparing within- and between-family polygenic score prediction. Am. J. Hum. Genet. 105, 351–363 (2019).
56. Price, A. L. et al. New approaches to population stratification in genome-wide association studies. Nat. Rev. Genet. 11, 459–463 (2010).
57. Cheesman, R. et al. Comparison of adopted and nonadopted individuals reveals gene–environment interplay for education in the UK Biobank. Psychol. Sci. 31, 582–591 (2020).
58. Mostafavi, H. et al. Variable prediction accuracy of polygenic scores within an ancestry group. eLife 9, e48376 (2020).
59. Young, A. I. et al. Deconstructing the sources of genotype-phenotype associations in humans. Science 365, 1396–1400 (2019).
60. Kim, M. S., Patel, K. P., Teng, A. K., Berens, A. J. & Lachance, J. Genetic disease risks can be misestimated across global populations. Genome Biol. 19, 179 (2018).
61. Martin, A. R. et al. Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 100, 635–649 (2017).
62. Duncan, L. et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat. Commun. 10, 3328 (2019).
63. Selzam, S. et al. Predicting educational achievement from DNA. Mol. Psychiatry 22, 267–272 (2017).
64. Lee, J. J. et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112–1121 (2018).
65. Grotzinger, A. D. et al. Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019).
66. Maier, R. M. et al. Improving genetic prediction by leveraging genetic correlations among human diseases and traits. Nat. Commun. 9, 989 (2018).
67. Krapohl, E. et al. Multi-polygenic score approach to trait prediction. Mol. Psychiatry 23, 1368–1374 (2018).
68. Ruderfer, D. M. et al. Polygenic dissection of diagnosis and clinical dimensions of bipolar disorder and schizophrenia. Mol. Psychiatry 19, 1017–1024 (2014).
69. Bipolar Disorder and Schizophrenia Working Group of the Psychiatric Genomics Consortium. Genomic dissection of bipolar disorder and schizophrenia, including 28 subphenotypes. Cell 173, 1705–1715.e16 (2018).
70. Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).
71. Rheenen, W. et al. Genetic correlations of polygenic disease traits: from theory to practice. Nat. Rev. Genet. 20, 567–581 (2019).
72. Visscher, P. M. et al. Statistical power to detect genetic (co)variance of complex traits using SNP data in unrelated samples. PLoS Genet. 10, e1004269 (2014).
73. Janssens, M. J. J. Co-heritability: its relation to correlated response, linkage, and pleiotropy in cases of polygenic inheritance. Euphytica 28, 601–608 (1979).
74. Pirinen, M., Donnelly, P. & Spencer, C. C. A. Including known covariates can reduce power to detect genetic effects in case-control studies. Nat. Genet. 44, 848–851 (2012).
75. Lee, S. H. et al. A better coefficient of determination for genetic profile analysis. Genet. Epidemiol. 36, 214–224 (2012).
76. Heinzl, H., Waldhör, T. & Mittlböck, M. Careful use of pseudo R-squared measures in epidemiological studies. Stat. Med. 24, 2867–2872 (2005).
77. Lee, S. H. et al. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 88, 294–305 (2011).
78. Won, H.-H. et al. Disproportionate contributions of select genomic compartments and cell types to genetic risk for coronary artery disease. PLoS Genet. 11, e1005622 (2015).
79. Santoro, M. L. et al. Polygenic risk score analyses of symptoms and treatment response in an antipsychotic-naive first episode of psychosis cohort. Transl. Psychiatry 8, 1–8 (2018).
80. Power, R. A. et al. Polygenic risk scores for schizophrenia and bipolar disorder predict creativity. Nat. Neurosci. 18, 953–955 (2015).
81. Mullins, N. et al. GWAS of suicide attempt in psychiatric disorders and association with major depression polygenic risk scores. Am. J. Psychiatry 176, 651–660 (2019).
82. Khera, A. V. et al. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596.e9 (2019).
83. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
84. Du Rietz, E. et al. Association of polygenic risk for attention-deficit/hyperactivity disorder with co-occurring traits and disorders. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 3, 635–643 (2018).
85. Clayton, D. G. Prediction and interaction in complex disease genetics: experience in type 1 diabetes. PLoS Genet. 5, e1000540 (2009).
86. Dudbridge, F., Pashayan, N. & Yang, J. Predictive accuracy of combined genetic and environmental risk scores. Genet. Epidemiol. 42, 4–19 (2018).
87. Torkamani, A., Wineinger, N. E. & Topol, E. J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 19, 581–590 (2018).
88. Lambert, S. A., Abraham, G. & Inouye, M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 28(R2), R133–R142 (2019).
89. Gibson, G. On the utilization of polygenic risk scores for therapeutic targeting. PLoS Genet. 15, e1008060 (2019).
90. Rose, G. Sick individuals and sick populations. Int. J. Epidemiol. 30, 427–432 (2001).
91. Wynants, L., Collins, G. S. & Van Calster, B. Key steps and common pitfalls in developing and validating risk models. BJOG 124, 423–432 (2017).
92. Janssens, A. C. J. W. & Joyner, M. J. Polygenic risk scores that predict common diseases using millions of single nucleotide polymorphisms: is more, better? Clin. Chem. 65, 609–611 (2019).
93. Baverstock, K. Polygenic scores: are they a public health hazard? Prog. Biophys. Mol. Biol. 149, 4–8 (2019).
94. Janssens, A. C. J. W. Validity of polygenic risk scores: are we measuring what we think we are? Hum. Mol. Genet. 28, R143–R150 (2019).
95. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
96. Sniekers, S. et al. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat. Genet. 49, 1107–1112 (2017).
97. Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
98. Hartwig, F. P. et al. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int. J. Epidemiol. 45, 1717–1726 (2016).
99. Krapohl, E. et al. Phenome-wide analysis of genome-wide polygenic scores. Mol. Psychiatry 21, 1188–1193 (2016).
100. Pingault, J.-B. et al. Using genetic data to strengthen causal inference in observational research. Nat. Rev. Genet. 19, 566–580 (2018).
101. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 58, 267–288 (1996).
102. Mak, T. S. H. et al. Polygenic scores for UK Biobank scale data. Preprint at https://www.biorxiv.org/content/10.1101/252270v3 (2018).
103. Machiela, M. J. et al. Evaluation of polygenic risk scores for predicting breast and prostate cancer risk. Genet. Epidemiol. 35, 506–514 (2011).
104. Falconer, D. S. Introduction to Quantitative Genetics (Ronald Press, 1960).
105. Jaffee, S. & Price, T. Gene–environment correlations: a review of the evidence and implications for prevention of mental illness. Mol. Psychiatry 12, 432–442 (2007).

Acknowledgements

We thank the participants in the UK Biobank and the scientists involved in the construction of this resource. We thank Jonathan Coleman and Kylie Glanville for help in management of the UK Biobank resource at King’s College London, and we thank Jack Euesden, Carla Giner-Delgado, Clive Hoggart, Hei Man Wu, Tom Bond, Gerome Breen, Cathryn Lewis, Cecile Janssens and Pak Sham for helpful discussions. This research has been conducted using the UK Biobank Resource under application 18177 (P.F.O.). P.F.O. receives funding from the UK Medical Research Council (MR/N015746/1). S.W.C. is funded by the UK Medical Research Council (MR/N015746/1). This report represents independent research partially funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

Author information

Contributions

P.F.O. conceived, prepared and wrote the manuscript, with feedback from S.W.C. and T.S.-H.M. S.W.C. performed the analyses and produced the online tutorial, with feedback from P.F.O.

Corresponding author

Correspondence to Paul F. O’Reilly.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature Protocols thanks Dorret Boomsma, Brandon Johnson and Anubha Mahajan for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

PRS tutorial: https://choishingwan.github.io/PRS-Tutorial/

GWAS Tutorial: https://github.com/MareesAT/GWA_tutorial

PRSice software: https://www.prsice.info

LDpred software: https://github.com/bvilhjal/ldpred

Lassosum software: https://github.com/tshmak/lassosum/blob/master/README.md

The R project for statistical computing: https://www.r-project.org

Cite this article

Choi, S.W., Mak, T.S. & O’Reilly, P.F. Tutorial: a guide to performing polygenic risk score analyses. Nat Protoc (2020). https://doi.org/10.1038/s41596-020-0353-1
