Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases

Abstract

Genome-wide association studies (GWASs) are a valuable tool for understanding the biology of complex human traits and diseases, but associated variants rarely point directly to causal genes. In the present study, we introduce a new method, polygenic priority score (PoPS), that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. Using a large evaluation set of genes with fine-mapped coding variants, we show that PoPS and the closest gene individually outperform other gene prioritization methods, but observe the best overall performance by combining PoPS with orthogonal methods. Using this combined approach, we prioritize 10,642 unique gene–trait pairs across 113 complex traits and diseases with high precision, not only finding well-established gene–trait relationships but also nominating new genes at unresolved loci, such as LGR4 for estimated glomerular filtration rate and CCR7 for deep vein thrombosis. Overall, we demonstrate that PoPS provides a powerful addition to the gene prioritization toolbox.

Fig. 1: Overview of PoPS.
Fig. 2: Evaluation of PoPS and comparison to other similarity-based methods.
Fig. 3: Most informative gene features used by PoPS.
Fig. 4: Comparing and combining PoPS with locus-based methods.
Fig. 5: High-confidence genes for selected traits.
Fig. 6: Known and new biological examples.

Data availability

A repository of processed gene features, visualizations of top derived features and code to reproduce these analyses are available on GitHub at https://github.com/FinucaneLab/gene_features. Complete PoPS results for 95 complex traits in the UK Biobank and 18 additional disease traits, as well as results for PoPS and locus-based methods in genome-wide significant loci, are available at https://www.finucanelab.org/data.

Code availability

PoPS is available as an open-source Python package at https://github.com/FinucaneLab/pops. A static version of the PoPS method used in the present study is available at https://doi.org/10.5281/zenodo.8002379.

Acknowledgements

We thank K. Aragam, A. Butterworth, M. Daly, N. Artomov, Y. Reshef and all members of the Finucane lab for helpful discussions. This research was conducted using the UK Biobank Resource under project 31063. H.K.F. was funded by a National Institutes of Health (NIH) grant (no. DP5 OD024582) and by Eric and Wendy Schmidt. J.M.E. was supported by a Pathway to Independence Award (grant nos. K99HG00917 and R00HG009917), the Harvard Society of Fellows and the Base Research Initiative at Stanford University. J.M. and J.N.H. were supported by an NIH grant (no. R01DK075787). R.S.F. was supported by National Human Genome Research Institute, NIH (grant no. F31HG009850). J.O.-M. was supported by the Richard and Susan Smith Family Foundation, the HHMI Damon Runyon Cancer Research Foundation Fellowship (no. DRG-2274-16), the AGA Research Foundation’s AGA-Takeda Pharmaceuticals Research Scholar Award in Inflammatory Bowel Disease (grant no. AGA2020-13-01), the HDDC Pilot and Feasibility (grant no. P30 DK034854) and the Food Allergy Science Initiative.

Author information

Contributions

E.M.W. and H.K.F. conceived of the study. E.M.W., J.C.U., N.Y.C. and H.K.F. designed the research, performed the experiments, analyzed the data and interpreted the results. B.L.T. and R.S.F. designed and performed the enrichment-based validations. J.M., T.A.P., M.K., J.N., C.P.F., K.C.T., F.A., T.L., J.O.-M., C.S.S., M.B., A.K.S., A.N.A., R.J.X., A.R., R.M.G., K.L., K.G.A., J.N.H. and J.M.E. provided data or analysis tools used by PoPS or other gene prioritization methods. E.S.L. helped advise the project. E.M.W., J.C.U. and H.K.F. wrote the manuscript with input from all authors. H.K.F. supervised the project.

Corresponding authors

Correspondence to Elle M. Weeks or Hilary K. Finucane.

Ethics declarations

Competing interests

J.C.U. reports compensation from consulting services with Goldfinch Bio and is an employee of Illumina. R.S.F. is an employee of Vertex Pharmaceuticals Incorporated. C.P.F. is an employee of Bristol Myers Squibb. J.O.-M. reports compensation for consulting services with Cellarity. A.R. is a cofounder and equity holder of Celsius Therapeutics and an equity holder in Immunitas, and was an SAB member of Thermo Fisher Scientific, Syros Pharmaceuticals, Neogene Therapeutics and Asimov until 31 July 2020. From 1 August 2020, A.R. is an employee of Genentech. J.N.H. served on the Scientific Advisory Board of and consults for Camp4 Therapeutics. E.S.L. serves on the Board of Directors for Codiak BioSciences and Neon Therapeutics, and serves on the Scientific Advisory Board of F-Prime Capital Partners and Third Rock Ventures; he is also affiliated with several nonprofit organizations including serving on the Board of Directors of the Innocence Project, Count Me In and Biden Cancer Initiative, and the Board of Trustees for the Parker Institute for Cancer Immunotherapy. He has served and continues to serve on various federal advisory committees. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 PoPS model parameter choices and feature selection.

a-c, Results using Benchmarker to compare different parameter choices for fitting the PoPS model, meta-analyzed across independent traits (n = 46). Error bars represent 95% confidence intervals around the meta-analyzed point estimate. a, Feature selection: GLS with an L1 penalty on the full set of features performs less well than GLS after marginal selection using a P value < 0.05 threshold from the two-sided Wald test. b, Error model: ordinary least squares (OLS) performs less well than generalized least squares (GLS) using marginal selection from a. c, Joint model regularization: GLS after marginal feature selection with an L2 penalty performs better than similar models with an L1 penalty or no penalty. d, Number of features selected (marginal P value < 0.05 from the two-sided Wald test) and included in the joint predictive model for PoPS for each trait. A legend for trait domain colors is provided in Fig. 2.
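The best-performing configuration in panels a-c (marginal feature selection at P < 0.05, a GLS error model, and an L2 penalty on the joint fit) can be illustrated with a minimal NumPy sketch. This is not the PoPS package code: the function names, the normal approximation to the Wald statistic, the whitening by a user-supplied error covariance and the default penalty strength are all assumptions made for illustration.

```python
import math
import numpy as np


def whiten(design, y, error_cov):
    """Turn a GLS fit into an OLS fit by whitening with the Cholesky factor of the error covariance."""
    chol = np.linalg.cholesky(error_cov)
    return np.linalg.solve(chol, design), np.linalg.solve(chol, y)


def marginal_wald_pvalues(features, y, error_cov):
    """Two-sided Wald P value for each feature fitted on its own (plus an intercept) under GLS."""
    n, p = features.shape
    pvalues = np.empty(p)
    for j in range(p):
        design = np.column_stack([np.ones(n), features[:, j]])
        dw, yw = whiten(design, y, error_cov)
        xtx_inv = np.linalg.inv(dw.T @ dw)
        beta = xtx_inv @ dw.T @ yw
        resid = yw - dw @ beta
        sigma2 = resid @ resid / (n - 2)
        z = beta[1] / math.sqrt(sigma2 * xtx_inv[1, 1])
        # Assumption: normal approximation to the Wald statistic, two-sided tail.
        pvalues[j] = math.erfc(abs(z) / math.sqrt(2))
    return pvalues


def fit_joint_ridge(features, y, error_cov, selected, l2_penalty):
    """GLS fit of the selected features with an L2 penalty on every coefficient except the intercept."""
    n = features.shape[0]
    design = np.column_stack([np.ones(n), features[:, selected]])
    dw, yw = whiten(design, y, error_cov)
    penalty = l2_penalty * np.eye(design.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(dw.T @ dw + penalty, dw.T @ yw)


def fit_sketch(features, y, error_cov, p_threshold=0.05, l2_penalty=1.0):
    """Marginal selection followed by a joint ridge GLS fit on the surviving features."""
    pvalues = marginal_wald_pvalues(features, y, error_cov)
    selected = np.flatnonzero(pvalues < p_threshold)
    beta = fit_joint_ridge(features, y, error_cov, selected, l2_penalty)
    return selected, beta
```

In this sketch `features` is a genes-by-features matrix and `y` a per-gene association score; `fit_sketch` returns the indices of the retained features and the joint coefficients, with the intercept first.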

Extended Data Fig. 2 Additional comparisons using closest gene metric.

a, Results using closest gene enrichment to compare similarity-based gene prioritization methods, meta-analyzed within each trait domain across independent traits (n = 46). Error bars represent 95% confidence intervals around the meta-analyzed point estimate. b, Results using closest gene enrichment to compare PoPS results using different feature sets, meta-analyzed within each trait domain across independent traits (n = 46). Error bars represent 95% confidence intervals around the meta-analyzed point estimate.
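The legend does not state how per-trait estimates were pooled. As one common possibility, a fixed-effect inverse-variance meta-analysis with a normal-approximation 95% confidence interval could look like the following sketch; the choice of estimator and the function name are assumptions, not details taken from the article.

```python
import math


def inverse_variance_meta(estimates, standard_errors, z=1.96):
    """Fixed-effect pooled estimate and confidence interval from per-trait estimates.

    Assumes the traits are independent and each estimate has a known standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, (pooled - z * pooled_se, pooled + z * pooled_se)
```

Traits with smaller standard errors receive more weight, which is why the pooled interval is narrower than any single-trait interval.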

Extended Data Fig. 3 Comparison of gene expression features derived from bulk and single-cell RNA-seq datasets.

a, Results using Benchmarker to compare PoPS results using different feature sets, meta-analyzed within each trait domain across independent traits (n = 46). Error bars represent 95% confidence intervals around the meta-analyzed point estimate. b, Results using closest gene enrichment to compare PoPS results using different feature sets, meta-analyzed within each trait domain across independent traits (n = 46). Error bars represent 95% confidence intervals around the meta-analyzed point estimate.

Extended Data Fig. 4 Comparison of similarity-based methods using precision and recall.

Precision-recall plot showing performance of similarity-based methods.

Extended Data Fig. 5 Comparing prioritization criteria.

Precision-recall plots for each method with varying prioritization criteria. Each point shows the precision and recall for a set of prioritized genes selected using prioritization criteria based on absolute thresholds and/or relative rank in a locus. For all methods, the star represents the final chosen criteria. a, Circles: PoPS scores ranked ≤ 2–5 in the locus. Star: highest PoPS score in the locus. b, Plus: significant TWAS P value after Bonferroni correction (P < 0.05/235,584). Circles: TWAS P values ranked ≤ 2–5 in the locus. Star: significant TWAS P value after Bonferroni correction (P < 0.05/235,584) and the most significant in the locus. c, Pluses: CLPP > 0.01, 0.1, 0.5, 0.9, and 0.99. Circles: CLPP > 0.01, 0.1, 0.5, 0.9, and 0.99 and also the highest CLPP in the locus. Star: CLPP > 0.1 and also the highest CLPP in the locus. d, Plus: any predicted connection from ABC. Circles: ABC connection strength ranked ≤ 2–5 in the locus. Star: highest ABC connection strength in the locus. e, Pluses: any predicted connection from PCHi-C for individual datasets. Triangle: any predicted connection from PCHi-C in any dataset. Circles: highest connection strength in the locus for individual datasets. Star: highest connection strength in the locus in any dataset. f, Pluses: any predicted connection from E-P correlation for individual datasets. Triangle: any predicted connection from E-P correlation in any dataset. Circles: highest connection strength in the locus for individual datasets. Star: highest connection strength in the locus in any dataset. g, Circle: closest gene by distance to the transcription start site. Star: closest gene by distance to the gene body. h, Circles: MAGMA z-scores ranked ≤ 2–5 in the locus. Star: highest MAGMA score in the locus. i, Plus: significant SMR P value after Bonferroni correction (P < 0.05/18,383). Circles: SMR P values ranked ≤ 2–5 in the locus. Star: significant SMR P value after Bonferroni correction (P < 0.05/18,383) and the most significant in the locus.
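The starred criteria combine two ingredients: an absolute threshold (for example, a Bonferroni-corrected P value or CLPP > 0.1) and a requirement that the gene be the best-ranked in its locus. The sketch below shows that logic in plain Python; the function names and the dictionary layout of per-locus scores are hypothetical, and the gene names in any call are placeholders.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold, e.g. 0.05 / 235,584 for the TWAS criterion."""
    return alpha / n_tests


def top_gene(scores, higher_is_better=True):
    """Best-scoring gene among the genes of one locus, or None for an empty locus."""
    if not scores:
        return None
    pick = max if higher_is_better else min
    return pick(scores, key=scores.get)


def prioritize_by_pvalue(pvalues, n_tests, alpha=0.05):
    """Most significant gene in the locus, kept only if it passes Bonferroni correction."""
    gene = top_gene(pvalues, higher_is_better=False)
    if gene is not None and pvalues[gene] < bonferroni_threshold(alpha, n_tests):
        return gene
    return None


def prioritize_by_score(scores, min_score=None):
    """Highest-scoring gene in the locus, optionally required to exceed an absolute threshold."""
    gene = top_gene(scores)
    if gene is None:
        return None
    if min_score is not None and scores[gene] <= min_score:
        return None
    return gene
```

Under these assumptions, the TWAS and SMR stars correspond to `prioritize_by_pvalue` with 235,584 and 18,383 tests respectively, the CLPP star to `prioritize_by_score` with `min_score=0.1`, and the PoPS, ABC and MAGMA stars to `prioritize_by_score` with no threshold.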

Extended Data Fig. 6 Performance of PoPS and locus-based gene prioritization methods by trait.

Precision-recall plots for each method. Each point represents a single trait colored by trait domain. Only traits for which the method prioritized at least five genes in the validation loci were included. A legend for trait domain colors is provided in Fig. 2.
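A per-trait point of this kind can be computed as in the sketch below. It assumes that precision is the fraction of a method's prioritized genes that belong to the evaluation set and that recall is the fraction of evaluation genes the method prioritized; the exact definitions used in the article are not given in this legend, and all names here are hypothetical.

```python
def per_trait_points(prioritized_by_trait, truth_by_trait, min_prioritized=5):
    """Map each trait to (precision, recall), skipping traits with too few prioritized genes.

    Both arguments map a trait name to a set of gene names.
    """
    points = {}
    for trait, genes in prioritized_by_trait.items():
        truth = truth_by_trait.get(trait, set())
        if len(genes) < min_prioritized or not truth:
            continue  # mirrors the "at least five genes" inclusion rule
        true_positives = len(genes & truth)
        points[trait] = (true_positives / len(genes), true_positives / len(truth))
    return points
```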

Extended Data Fig. 7 Additional performance metrics using evaluation gene set in 1,348 non-coding loci containing genes that harbor fine-mapped protein coding variants.

a, Sensitivity-specificity plot showing performance of locus-based methods, PoPS, intersections of pairs of locus-based methods, and intersections of PoPS with locus-based methods on the evaluation gene set of 589 genes with fine-mapped protein coding variants. b, Heatmap showing performance using the F-score of locus-based methods, PoPS, intersections of pairs of locus-based methods, and intersections of PoPS with locus-based methods.
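For readers who want the metrics spelled out, the following sketch computes sensitivity, specificity and an F-score from sets of gene names, and forms an intersection of two methods with ordinary set intersection. The helper names are hypothetical, the balanced F1 default (`beta=1.0`) is an assumption because the legend does not state the weighting, and the functions assume non-empty denominators.

```python
def confusion_counts(prioritized, causal, candidates):
    """True/false positives and negatives for one method over all candidate genes."""
    tp = len(prioritized & causal)
    fp = len(prioritized - causal)
    fn = len(causal - prioritized)
    tn = len(candidates - prioritized - causal)
    return tp, fp, fn, tn


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def f_score(tp, fp, fn, beta=1.0):
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


def intersect_methods(prioritized_a, prioritized_b):
    """Genes prioritized by both methods."""
    return prioritized_a & prioritized_b
```

An intersection can only shrink the prioritized set, so it typically trades sensitivity for specificity relative to either method alone.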

Extended Data Fig. 8 Number of prioritized genes for non-UK Biobank traits.

Number of unique gene-trait pairs prioritized by PoPS, locus-based gene prioritization methods, and their intersections, sorted by estimated precision. The full height of each bar represents the total number of genes prioritized. The opaque portion of each bar represents the expected number of true causal genes prioritized. Methods to the left of the dashed line achieve precision greater than 75%.
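The relationship between the two bar heights can be read as simple arithmetic: the opaque portion is the total count scaled by the method's estimated precision. A minimal sketch follows, with hypothetical names and a made-up summary layout (a dict with `n` and `precision` per method).

```python
def expected_true_positives(n_prioritized, precision):
    """Expected number of true causal genes among the prioritized genes."""
    return n_prioritized * precision


def high_precision_methods(summary, min_precision=0.75):
    """Method names sorted by decreasing precision, keeping those above the cutoff."""
    ranked = sorted(summary.items(), key=lambda item: item[1]["precision"], reverse=True)
    return [name for name, stats in ranked if stats["precision"] > min_precision]
```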

Extended Data Fig. 9 Known example RBM38.

Top: summary statistics colored by LD to the lead variant and fine-mapping results for variants in the locus colored by credible set. Bottom: results from PoPS and locus-based methods for all genes in the locus. Genes are colored by strength of prediction for each method with a star denoting the prioritized gene. Variant rs737092, RBM38 for mean corpuscular hemoglobin (MCH).

Extended Data Fig. 10 Sensitivity of precision and recall estimates to locus definition.

a, Loci defined as +/− 100 kb on either side of the lead variant. b, Loci defined as +/− 1 Mb on either side of the lead variant. c, Results restricted to loci in fine-mapped regions with three or fewer independent credible sets. d, Results restricted to loci in fine-mapped regions with five or fewer independent credible sets.
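The alternative locus definitions amount to changing a window half-width around the lead variant and, for panels c and d, filtering regions by their number of credible sets. The sketch below illustrates both steps; the names are hypothetical, and assigning a gene to a locus when its body overlaps the window is an assumption, since the legend does not specify the overlap rule.

```python
def locus_window(lead_position, half_width):
    """Start and end of a window centered on the lead variant, clipped at position 0."""
    return max(0, lead_position - half_width), lead_position + half_width


def genes_in_locus(lead_position, half_width, genes):
    """Names of genes whose (start, end) interval overlaps the window.

    `genes` maps a gene name to (start, end) on the same chromosome as the lead variant.
    """
    lo, hi = locus_window(lead_position, half_width)
    return sorted(name for name, (start, end) in genes.items() if start <= hi and end >= lo)


def keep_regions(regions, max_credible_sets):
    """Keep fine-mapped regions with at most `max_credible_sets` independent credible sets."""
    return [region for region in regions if region["n_credible_sets"] <= max_credible_sets]
```

Calling `genes_in_locus` with `half_width=100_000` or `half_width=1_000_000` reproduces the windows in panels a and b, and `keep_regions` with 3 or 5 mirrors the restrictions in panels c and d.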

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Weeks, E.M., Ulirsch, J.C., Cheng, N.Y. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat Genet 55, 1267–1276 (2023). https://doi.org/10.1038/s41588-023-01443-6

  • DOI: https://doi.org/10.1038/s41588-023-01443-6
