Polygenic architecture of rare coding variation across 394,783 exomes

Abstract

Both common and rare genetic variants influence complex traits and common diseases. Genome-wide association studies have identified thousands of common-variant associations, and more recently, large-scale exome sequencing studies have identified rare-variant associations in hundreds of genes1,2,3. However, rare-variant genetic architecture is not well characterized, and the relationship between common-variant and rare-variant architecture is unclear4. Here we quantify the heritability explained by the gene-wise burden of rare coding variants across 22 common traits and diseases in 394,783 UK Biobank exomes5. Rare coding variants (allele frequency < 1 × 10^−3) explain 1.3% (s.e. = 0.03%) of phenotypic variance on average, much less than common variants, and most burden heritability is explained by ultrarare loss-of-function variants (allele frequency < 1 × 10^−5). Common and rare variants implicate the same cell types, with similar enrichments, and they have pleiotropic effects on the same pairs of traits, with similar genetic correlations. They partially colocalize at individual genes and loci, but not to the same extent: burden heritability is strongly concentrated in significant genes, while common-variant heritability is more polygenic, and burden heritability is also more strongly concentrated in constrained genes. Finally, we find that burden heritability for schizophrenia and bipolar disorder6,7 is approximately 2%. Our results indicate that rare coding variants will implicate a tractable number of large-effect genes, that common and rare associations are mechanistically convergent, and that rare coding variants will contribute only modestly to missing heritability and population risk stratification.

Fig. 1: Overview of BHR.
Fig. 2: Burden heritability of 22 complex traits and common diseases in UK Biobank.
Fig. 3: Burden heritability explained by significant genes.
Fig. 4: Common- and rare-variant heritability enrichments.
Fig. 5: Burden genetic correlations between variant classes and traits.
Fig. 6: Burden heritability of schizophrenia and bipolar disorder.

Data availability

All data used in this manuscript are publicly available and documented in the Supplementary Tables. All results are available in the Supplementary Tables. Neale Lab UKB GWAS summary statistics are available at http://www.nealelab.is/uk-biobank/. Genebass summary statistics are available at https://app.genebass.org. SCHEMA is available at https://schema.broadinstitute.org. BipEx is available at https://bipex.broadinstitute.org. Differentially expressed gene sets are available at https://alkesgroup.broadinstitute.org. Gene-level constraint data are available at https://gnomad.broadinstitute.org. COSMIC cancer gene sets are available at https://cancer.sanger.ac.uk/census.

Code availability

BHR (v.0.1.0) is implemented in R, and its source code is publicly available at GitHub (https://github.com/ajaynadig/bhr) and Zenodo (https://doi.org/10.5281/zenodo.7382799). We have also published scripts enabling the results of the manuscript to be reproduced using publicly available data (Data availability); these are implemented in R, Python, Hail and MATLAB. We also used AMM (https://github.com/danjweiner/AMM21), LDSC (v.1.0.1; https://github.com/bulik/ldsc), HESS (v.0.5.3; https://huwenboshi.github.io/hess/), Genomic SEM (v.0.0.5c; https://github.com/GenomicSEM/GenomicSEM) and GCTA (v.1.94.1; https://yanglab.westlake.edu.cn/software/gcta/#GREMLanalysis).

References

  1. Sun, B. B. et al. Genetic associations of protein-coding variants in human disease. Nature 603, 95–102 (2022).

  2. Wang, Q. et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature 597, 527–532 (2021).

  3. Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).

  4. Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).

  5. Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genomics 2, 100168 (2022).

  6. Singh, T. et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature 604, 509–516 (2022).

  7. Palmer, D. S. et al. Exome sequencing in bipolar disorder identifies AKAP11 as a risk gene shared with schizophrenia. Nat. Genet. 54, 541–547 (2022).

  8. International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).

  9. Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).

  10. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  11. Yang, J. et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat. Genet. 47, 1114–1120 (2015).

  12. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  13. Brainstorm Consortium. Analysis of shared heritability in common disorders of the brain. Science 360, eaap8757 (2018).

  14. Watanabe, K. et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 51, 1339–1348 (2019).

  15. Gusev, A. et al. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. Am. J. Hum. Genet. 95, 535–552 (2014).

  16. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).

  17. Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).

  18. Jagadeesh, K. A. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat. Genet. 54, 1479–1492 (2022).

  19. O’Connor, L. J. et al. Extreme polygenicity of complex traits is explained by negative selection. Am. J. Hum. Genet. 105, 456–476 (2019).

  20. Zeng, J. et al. Signatures of negative selection in the genetic architecture of human complex traits. Nat. Genet. 50, 746–753 (2018).

  21. Gazal, S. et al. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017).

  22. Gazal, S. et al. Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations. Nat. Genet. 50, 1600–1607 (2018).

  23. Liu, D. J. & Leal, S. M. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations. Am. J. Hum. Genet. 91, 585–596 (2012).

  24. Wainschtein, P. et al. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat. Genet. 54, 263–273 (2022).

  25. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  26. Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).

  27. Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).

  28. Speed, D., Hemani, G., Johnson, M. R. & Balding, D. J. Improved heritability estimation from genome-wide SNPs. Am. J. Hum. Genet. 91, 1011–1021 (2012).

  29. Jang, S.-K. et al. Rare genetic variants explain missing heritability in smoking. Nat. Hum. Behav. 6, 1577–1586 (2022).

  30. Loh, P.-R. et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 47, 1385–1392 (2015).

  31. Shi, H., Kichaev, G. & Pasaniuc, B. Contrasting the genetic architecture of 30 complex traits from summary association data. Am. J. Hum. Genet. 99, 139–153 (2016).

  32. Palmer, C. & Pe’er, I. Statistical correction of the winner’s curse explains replication variability in quantitative trait genome-wide association studies. PLoS Genet. 13, e1006916 (2017).

  33. Weiner, D. J., Gazal, S., Robinson, E. B. & O’Connor, L. J. Partitioning gene-mediated disease heritability without eQTLs. Am. J. Hum. Genet. 109, 405–416 (2022).

  34. Sondka, Z. et al. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696–705 (2018).

  35. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).

  36. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  37. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).

  38. Mostafavi, H., Spence, J. P., Naqvi, S. & Pritchard, J. K. Limited overlap of eQTLs and GWAS hits due to systematic differences in discovery. Preprint at bioRxiv https://doi.org/10.1101/2022.05.07.491045 (2022).

  39. Gardner, E. J. et al. Reduced reproductive success is associated with selective constraint on human genes. Nature 603, 858–863 (2022).

  40. Sebat, J. et al. Strong association of de novo copy number mutations with autism. Science 316, 445–449 (2007).

  41. Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012).

  42. Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014).

  43. Simons, Y. B., Bullaughey, K., Hudson, R. R. & Sella, G. A population genetic interpretation of GWAS findings for human quantitative traits. PLoS Biol. 16, e2002985 (2018).

  44. Fu, J. M. et al. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat. Genet. 54, 1320–1331 (2022).

  45. Border, R. et al. Cross-trait assortative mating is widespread and inflates genetic correlation estimates. Science 378, 754–761 (2022).

  46. Genovese, G. et al. Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat. Neurosci. 19, 1433–1441 (2016).

  47. Kosmicki, J. A. et al. Refining the role of de novo protein-truncating variants in neurodevelopmental disorders by using population reference samples. Nat. Genet. 49, 504–510 (2017).

  48. Baselmans, B. M. L., Yengo, L., van Rheenen, W. & Wray, N. R. Risk in relatives, heritability, SNP-based heritability, and genetic correlations in psychiatric disorders: a review. Biol. Psychiatry 89, 11–19 (2021).

  49. Samocha, K. E. et al. Regional missense constraint improves variant deleteriousness prediction. Preprint at bioRxiv https://doi.org/10.1101/148353 (2017).

  50. Lefebvre, S. et al. Identification and characterization of a spinal muscular atrophy-determining gene. Cell 80, 155–165 (1995).

  51. Mendell, J. R. et al. Single-dose gene-replacement therapy for spinal muscular atrophy. N. Engl. J. Med. 377, 1713–1722 (2017).

  52. Pritchard, J. K. Are rare variants responsible for susceptibility to complex diseases? Am. J. Hum. Genet. 69, 124–137 (2001).

  53. Kim, S. S. et al. Genes with high network connectivity are enriched for disease heritability. Am. J. Hum. Genet. 104, 896–913 (2019).

  54. Forgetta, V. et al. An effector index to predict target genes at GWAS loci. Hum. Genet. 141, 1431–1447 (2022).

  55. Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034 (2019).

  56. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).

  57. Fahed, A. C. et al. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat. Commun. 11, 3635 (2020).

  58. Khera, A. V. et al. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596 (2019).

  59. Biddinger, K. J. et al. Rare and common genetic variation underlying the risk of hypertrophic cardiomyopathy in a national biobank. JAMA Cardiol. 7, 715–722 (2022).

  60. Bishop, S. L., Thurm, A., Robinson, E. & Sanders, S. J. Prevalence of returnable genetic results based on recognizable phenotypes among children with autism spectrum disorder. Preprint at bioRxiv https://doi.org/10.1101/2021.05.28.21257736 (2021).

  61. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  62. Fry, A. et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am. J. Epidemiol. 186, 1026–1034 (2017).

  63. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).

  64. 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  65. Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32, 283–285 (2016).

  66. Schoech, A. P. et al. Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nat. Commun. 10, 790 (2019).

  67. Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634–639 (2020).

  68. Grotzinger, A. D. et al. Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019).

Acknowledgements

We thank S. Gazal, D. King, A. Price and K. Samocha for analytic assistance and comments on this manuscript; and J. Duan for identifying an issue in the first draft of our manuscript. We acknowledge support from the National Institute of Mental Health (F30MH129009 to D.J.W.), the National Library of Medicine (T15LM007092 to D.J.W.), the National Institute of General Medical Sciences (T32GM007753 to A.N.), the Simons Foundation Autism Research Initiative (704413 to E.B.R. and L.J.O.) and the Broad Institute.

Author information

Contributions

D.J.W., A.N. and L.J.O. conceived and designed experiments. K.A.J. and K.K.D. suggested analyses. B.M.N., E.B.R., K.J.K. and L.J.O. supervised the project. D.J.W., A.N. and L.J.O. performed analyses. D.J.W., A.N. and L.J.O. wrote the manuscript.

Corresponding authors

Correspondence to Daniel J. Weiner, Ajay Nadig or Luke J. O’Connor.

Ethics declarations

Competing interests

K.J.K. is a consultant for Vor Biopharma and AlloDx. B.M.N. is a member of the scientific advisory board at Deep Genomics and Neumora, consultant of the scientific advisory board for Camp4 Therapeutics and consultant for Merck. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Doug Speed and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Performance of BHR in exome-scale simulations with no individual-level data.

We performed an extended set of simulations to assess the performance of BHR. The MAF groups are < 1e-5 (group 1), 1e-5 - 1e-4 (group 2), 1e-4 - 1e-3 (group 3), and 1e-3 - 1e-2 (group 4), respectively; the grey and red boxplots indicate the distribution of estimates in null and non-null simulations (true burden h2 = 0%, 0.5% respectively). A minor difference in the way that BHR was applied to simulated vs. real data is that in simulated data, significant genes were identified without any attempt to correct for population stratification, whereas in our real-trait analyses, they were identified using SAIGE-GENE1. We started with a realistic set of parameters (see Methods) and varied one simulation parameter in each simulation. (A) We increased the sample size from 5e5 to 2e6. This increase amplifies the uncorrected population stratification, causing false positive significant genes and upward bias in BHR (no bias is observed in estimates without significant genes). (B) We added overdispersion effects with the same distribution of effect sizes as the burden effects, i.e. with per-allele effect size variance drawn from a discrete mixture distribution (see Methods). This distribution differs from the BHR model, which assumes that overdispersion effects have a constant per-s.d. effect size variance, but this form of misspecification does not lead to bias. (C) We performed simulations with realistic parameters, including stratification and selection (see Methods and Fig. 1c). (D) We decreased the sample size from 5e5 to 1e5. (E) We increased the strength of population stratification (including the minor-allele biased stratification) by a factor of 10, from a per-s.d. effect size mean of 1e-7 and a variance of 1e-5 to a mean of 1e-6 and a variance of 1e-4. (F) We increased the strength of selection, from mean Ns = 1 to mean Ns = 10. There were extremely few variants with allele frequency greater than 1e-3, so MAF group 4 estimates are not shown.
Numerical results are contained in Supplementary Table 4. Boxplots denote median, quartiles and range of distribution (excepting outliers).
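
The allele-frequency bins used in this caption can be illustrated with a small lookup. This is a minimal sketch and not part of BHR: the function name `maf_group` and the constant `MAF_EDGES` are hypothetical, the 1e-2 ceiling for group 4 is an assumption, and so is the choice to place a frequency that falls exactly on a boundary in the higher group.

```python
from bisect import bisect_right
from typing import Optional

# Bin edges for MAF groups 1-4. The 1e-2 ceiling on group 4 is an
# assumption of this sketch.
MAF_EDGES = [1e-5, 1e-4, 1e-3, 1e-2]


def maf_group(allele_frequency: float) -> Optional[int]:
    """Return the MAF group (1-4), or None if the frequency is outside all bins."""
    if not 0.0 < allele_frequency < MAF_EDGES[-1]:
        return None
    # bisect_right counts the edges that are <= the frequency, so a variant
    # below 1e-5 gets index 0 (group 1), and so on.
    return bisect_right(MAF_EDGES, allele_frequency) + 1
```

For example, `maf_group(5e-6)` falls in group 1 (ultra-rare), while `maf_group(5e-3)` falls in group 4, the bin that is not shown in the figure.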

Extended Data Fig. 2 Comparison of BHR and GCTA in null simulations with individual-level genotypes and phenotypes, and different patterns of population stratification.

There are four demographic models: no stratification; north-south stratification; north-south stratification with smaller population size in the northern deme; and local stratification with very small population size in one deme (see Methods). Under each model, we performed simulations with and without selection, mimicking pLoF and synonymous variants respectively. (a) BHR burden heritability estimates with no correction for minor allele-biased stratification. (b) GCTA heritability estimates with no correction for ancestry. (c) BHR burden heritability estimates, correcting for minor allele-biased stratification. (d) GCTA heritability estimates, correcting for ancestry by providing the deme from which each individual was sampled as a covariate. Boxplots denote median, quartiles and range of distribution (excluding outliers).

Extended Data Fig. 3 Genome-wide mean minor allele effect sizes.

We define the “mean effect” as the effect size of the genome-wide burden, summing all minor alleles across genes within a category, on the phenotype. For synonymous variants, a nonzero mean effect is interpreted as evidence of minor-allele biased population stratification, and this type of stratification produces upward bias in BHR heritability estimates (see Methods). (a–c) Mean effect of synonymous variants vs. mean effect of missense benign, missense other, and pLoF variants respectively. The lack of correlation in (c) suggests that for pLoFs, the nonzero mean effect is mostly biological. (d) Mean effect of synonymous variants vs. the resulting bias in heritability estimates, for synonymous variants (left y axis) or for pLoFs (right y axis). These differ by a constant factor due to the larger number of synonymous variants than pLoFs. (e) Mean effect of pLoF variants vs. the contribution of these effects to burden heritability. These estimates are a small fraction of the total pLoF burden heritability. Error bars represent standard errors, which are computed by assuming independence across genes.
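
The definition of the mean effect can be made concrete with a toy calculation on individual-level data. This is only a sketch under simplifying assumptions: the analyses in the paper work from summary statistics, whereas the hypothetical functions `genome_wide_burden` and `mean_effect` below take per-individual minor-allele counts and fit an ordinary least-squares slope without covariates.

```python
from statistics import mean


def genome_wide_burden(minor_allele_counts):
    """Sum minor-allele counts across all genes for each individual.

    minor_allele_counts: one list per individual, one count per gene
    (all variants assumed to belong to a single functional category).
    """
    return [sum(per_gene) for per_gene in minor_allele_counts]


def mean_effect(minor_allele_counts, phenotype):
    """OLS slope of the phenotype on the genome-wide burden (no covariates)."""
    burden = genome_wide_burden(minor_allele_counts)
    b_bar, y_bar = mean(burden), mean(phenotype)
    cov = sum((b - b_bar) * (y - y_bar) for b, y in zip(burden, phenotype))
    var = sum((b - b_bar) ** 2 for b in burden)
    if var == 0:
        raise ValueError("genome-wide burden has no variance")
    return cov / var
```

Under this toy model, a nonzero slope for synonymous variants would be read as a sign of minor-allele biased stratification rather than biology.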

Extended Data Fig. 4 Burden heritability estimates with effect-allele-permuted burden statistics.

We assessed the potential for confounding in our results by repeating our analyses with ultra-rare pLoF burden statistics whose effect alleles were randomly permuted. This permutation is expected to eliminate the burden heritability while not affecting any form of confounding that is symmetrical with respect to the minor vs. major allele. Boxplots indicate the distribution of burden heritability estimates before and after the permutation (non-null and null, respectively), with median, quartiles and range (excepting outliers).
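
The logic of the effect-allele permutation can be sketched as follows. The data layout (signed per-variant scores grouped by gene), the use of a simple sum as the gene-level burden statistic, and the function names are all assumptions made for illustration; they are not taken from the BHR source code.

```python
import random


def permute_effect_alleles(variant_scores_by_gene, seed=0):
    """Randomly flip the effect allele (sign) of every variant score."""
    rng = random.Random(seed)
    return {
        gene: [score * rng.choice((-1.0, 1.0)) for score in scores]
        for gene, scores in variant_scores_by_gene.items()
    }


def gene_burden(variant_scores_by_gene):
    """Toy burden statistic: sum of signed variant scores within each gene."""
    return {gene: sum(scores) for gene, scores in variant_scores_by_gene.items()}
```

Flipping signs leaves each variant's magnitude unchanged, so any confounding that is symmetric between minor and major alleles survives the permutation, while the consistent direction of effect that drives the burden signal is scrambled.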

Extended Data Fig. 5 Proportion of common variant heritability explained by LD-independent blocks with significant heritability.

For each trait, we used HESS to identify which of the 1651 LD-independent blocks from Berisa2 have Bonferroni-significant heritability, and then computed the proportion of the overall HESS heritability mediated by each block. Although these blocks aggregate over many variants in many genes, the proportion of heritability explained by individual significant blocks is still less than the proportion of burden heritability explained by individual significant genes in BHR (Fig. 3).
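
The block-level calculation described here can be sketched as below. The inputs are hypothetical per-block heritability estimates and standard errors; the one-sided normal test is an assumption of this sketch and may differ from the test HESS itself applies.

```python
from math import erfc, sqrt

N_BLOCKS = 1651  # number of LD-independent blocks stated in the caption


def significant_block_shares(h2_by_block, se_by_block, n_tests=N_BLOCKS, alpha=0.05):
    """Share of total heritability carried by each Bonferroni-significant block."""
    threshold = alpha / n_tests  # Bonferroni-corrected significance level
    total_h2 = sum(h2_by_block.values())
    shares = {}
    for block, h2 in h2_by_block.items():
        z = h2 / se_by_block[block]
        p_value = 0.5 * erfc(z / sqrt(2.0))  # one-sided normal p-value (assumption)
        if p_value < threshold:
            shares[block] = h2 / total_h2
    return shares
```

With 1651 tests the per-block threshold is roughly 3 × 10^−5, so only blocks whose estimates lie about four or more standard errors above zero are retained.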

Extended Data Fig. 6 Comparison of burden versus common variant heritability explained by exome-wide significant genes.

Each point represents a trait-gene significant burden association from the Genebass dataset. X axis values are the fraction of common variant heritability (estimated with HESS) explained by the LD-independent block containing that gene. Y axis values are the fraction of burden heritability (estimated with BHR) explained by the significant gene.

Extended Data Fig. 7 Absolute mean minor allele effect size of ultra-rare pLoF variants genome wide, vs. the constrained gene enrichment of each trait.

(+) and (−) denote the sign of the mean minor allele effects. For numerical results, see Supplementary Tables 7, 16, and 17.

Extended Data Fig. 8 Genetic correlation estimates across 37 traits, for common variants (upper triangle) and rare coding variants (lower).

Asterisks indicate nominally significant genetic correlation estimates (two-tailed p < 0.05). Grey boxes not on the diagonal indicate cross-trait LDSC point estimates that are outside of [−1.25, 1.25], which cross-trait LDSC does not report by default. For numerical results, see Supplementary Table 19.

Extended Data Fig. 9 Comparison of common coding vs. common whole-genome genetic correlations.

(a) We evaluated whether common coding variants, similar to rare coding variants, have stronger genetic correlations than common variants overall. The fit line indicates the Deming regression slope, which allows for uncertainty in both the X and Y axis values. (b) We also assessed the stability of the Deming regression slope for the burden genetic correlation vs. the common-variant genetic correlation on chromosomes 1–8 and chromosomes 9–22.
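
Deming regression, which this caption uses to allow for error in both axes, has a closed-form slope. The sketch below is the standard textbook estimator, not the authors' script; the function name is hypothetical, and `delta` (the ratio of the y-error variance to the x-error variance) defaults to 1, an assumed value corresponding to orthogonal regression, since the ratio used in the paper is not stated in this caption.

```python
from math import sqrt
from statistics import mean


def deming_slope_intercept(x, y, delta=1.0):
    """Closed-form Deming regression.

    delta is the ratio of the y-error variance to the x-error variance;
    delta = 1 (an assumed default) gives orthogonal regression.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = s_yy - delta * s_xx
    slope = (diff + sqrt(diff ** 2 + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)
    return slope, y_bar - slope * x_bar
```

Unlike ordinary least squares, this estimator is not attenuated towards zero when the x-axis values (here, genetic correlation estimates) are themselves noisy.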

Extended Data Fig. 10 Burden heritability enrichments of drug target gene sets.

We used BHR to estimate the ultra-rare loss-of-function burden heritability enrichment in sets of manually curated drug target genes from a previous publication6. For all panels, error bars are standard errors, and bars are shaded in blue if the enrichment is significantly greater than 1. (A) Burden heritability enrichment in n = 14 blood pressure drug target genes (union of diastolic and systolic blood pressure gene sets from reference publication). (B) Burden heritability enrichment in n = 8 bone mineral density drug target genes. (C) Burden heritability enrichment in n = 6 calcium drug target genes. (D) Burden heritability enrichment in n = 10 lipid drug target genes (union of LDL and triglyceride gene sets from reference publication). (E) Burden heritability enrichment in n = 6 red blood cell drug target genes. (F) Burden heritability enrichment in n = 7 type 2 diabetes drug target genes.

Supplementary information

Supplementary Information

Legends for Supplementary Figs. 1–8, legends for Supplementary Tables 1–22 and additional references.

Reporting Summary

Supplementary Figs. 1–8

Supplementary Figs. 1–8.

Supplementary Tables 1–22

Supplementary Tables 1–22.

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Weiner, D.J., Nadig, A., Jagadeesh, K.A. et al. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499 (2023). https://doi.org/10.1038/s41586-022-05684-z

  • DOI: https://doi.org/10.1038/s41586-022-05684-z
