Polygenic architecture of rare coding variation across 394,783 exomes

Weiner, Daniel J.; Nadig, Ajay; Jagadeesh, Karthik A.; Dey, Kushal K.; Neale, Benjamin M.; Robinson, Elise B.; Karczewski, Konrad J.; O’Connor, Luke J.

doi:10.1038/s41586-022-05684-z

Article
Published: 08 February 2023

Nature volume 614, pages 492–499 (2023)

Abstract

Both common and rare genetic variants influence complex traits and common diseases. Genome-wide association studies have identified thousands of common-variant associations, and more recently, large-scale exome sequencing studies have identified rare-variant associations in hundreds of genes¹,²,³. However, rare-variant genetic architecture is not well characterized, and the relationship between common-variant and rare-variant architecture is unclear⁴. Here we quantify the heritability explained by the gene-wise burden of rare coding variants across 22 common traits and diseases in 394,783 UK Biobank exomes⁵. Rare coding variants (allele frequency < 1 × 10⁻³) explain 1.3% (s.e. = 0.03%) of phenotypic variance on average, much less than common variants, and most burden heritability is explained by ultrarare loss-of-function variants (allele frequency < 1 × 10⁻⁵). Common and rare variants implicate the same cell types, with similar enrichments, and they have pleiotropic effects on the same pairs of traits, with similar genetic correlations. They partially colocalize at individual genes and loci, but not to the same extent: burden heritability is strongly concentrated in significant genes, while common-variant heritability is more polygenic, and burden heritability is also more strongly concentrated in constrained genes. Finally, we find that burden heritability for schizophrenia and bipolar disorder⁶,⁷ is approximately 2%. Our results indicate that rare coding variants will implicate a tractable number of large-effect genes, that common and rare associations are mechanistically convergent, and that rare coding variants will contribute only modestly to missing heritability and population risk stratification.

**Fig. 2: Burden heritability of 22 complex traits and common diseases in UK Biobank.**

**Fig. 3: Burden heritability explained by significant genes.**

**Fig. 4: Common- and rare-variant heritability enrichments.**

**Fig. 5: Burden genetic correlations between variant classes and traits.**

**Fig. 6: Burden heritability of schizophrenia and bipolar disorder.**

Data availability

All data used in this manuscript are publicly available and documented in the Supplementary Tables. All results are available in the Supplementary Tables. Neale Lab UKB GWAS summary statistics are available at http://www.nealelab.is/uk-biobank/. Genebass summary statistics are available at https://app.genebass.org. SCHEMA is available at https://schema.broadinstitute.org. BipEx is available at https://bipex.broadinstitute.org. Differentially expressed gene sets are available at https://alkesgroup.broadinstitute.org. Gene-level constraint data are available at https://gnomad.broadinstitute.org. COSMIC cancer gene sets are available at https://cancer.sanger.ac.uk/census.

Code availability

BHR (v.0.1.0) is implemented in R, and its source code is publicly available at GitHub (https://github.com/ajaynadig/bhr) and Zenodo (https://doi.org/10.5281/zenodo.7382799). We have also published scripts enabling the results of the manuscript to be reproduced using publicly available data (Data availability); these are implemented in R, Python, Hail and MATLAB. We also used AMM (https://github.com/danjweiner/AMM21), LDSC (v.1.0.1; https://github.com/bulik/ldsc), HESS (v.0.5.3; https://huwenboshi.github.io/hess/), Genomic SEM (v.0.0.5c; https://github.com/GenomicSEM/GenomicSEM) and GCTA (v.1.94.1; https://yanglab.westlake.edu.cn/software/gcta/#GREMLanalysis).

References

Sun, B. B. et al. Genetic associations of protein-coding variants in human disease. Nature 603, 95–102 (2022).
Wang, Q. et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature 597, 527–532 (2021).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).
Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genomics 2, 100168 (2022).
Singh, T. et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature 604, 509–516 (2022).
Palmer, D. S. et al. Exome sequencing in bipolar disorder identifies AKAP11 as a risk gene shared with schizophrenia. Nat. Genet. 54, 541–547 (2022).
International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752 (2009).
Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).
Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
Yang, J. et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat. Genet. 47, 1114–1120 (2015).
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Brainstorm Consortium. Analysis of shared heritability in common disorders of the brain. Science 360, eaap8757 (2018).
Watanabe, K. et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 51, 1339–1348 (2019).
Gusev, A. et al. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. Am. J. Hum. Genet. 95, 535–552 (2014).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).
Jagadeesh, K. A. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat. Genet. 54, 1479–1492 (2022).
O’Connor, L. J. et al. Extreme polygenicity of complex traits is explained by negative selection. Am. J. Hum. Genet. 105, 456–476 (2019).
Zeng, J. et al. Signatures of negative selection in the genetic architecture of human complex traits. Nat. Genet. 50, 746–753 (2018).
Gazal, S. et al. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017).
Gazal, S. et al. Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations. Nat. Genet. 50, 1600–1607 (2018).
Liu, D. J. & Leal, S. M. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations. Am. J. Hum. Genet. 91, 585–596 (2012).
Wainschtein, P. et al. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat. Genet. 54, 263–273 (2022).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Speed, D., Hemani, G., Johnson, M. R. & Balding, D. J. Improved heritability estimation from genome-wide SNPs. Am. J. Hum. Genet. 91, 1011–1021 (2012).
Jang, S.-K. et al. Rare genetic variants explain missing heritability in smoking. Nat. Hum. Behav. 6, 1577–1586 (2022).
Loh, P.-R. et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 47, 1385–1392 (2015).
Shi, H., Kichaev, G. & Pasaniuc, B. Contrasting the genetic architecture of 30 complex traits from summary association data. Am. J. Hum. Genet. 99, 139–153 (2016).
Palmer, C. & Pe’er, I. Statistical correction of the winner’s curse explains replication variability in quantitative trait genome-wide association studies. PLoS Genet. 13, e1006916 (2017).
Weiner, D. J., Gazal, S., Robinson, E. B. & O’Connor, L. J. Partitioning gene-mediated disease heritability without eQTLs. Am. J. Hum. Genet. 109, 405–416 (2022).
Sondka, Z. et al. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696–705 (2018).
Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Mostafavi, H., Spence, J. P., Naqvi, S. & Pritchard, J. K. Limited overlap of eQTLs and GWAS hits due to systematic differences in discovery. Preprint at bioRxiv https://doi.org/10.1101/2022.05.07.491045 (2022).
Gardner, E. J. et al. Reduced reproductive success is associated with selective constraint on human genes. Nature 603, 858–863 (2022).
Sebat, J. et al. Strong association of de novo copy number mutations with autism. Science 316, 445–449 (2007).
Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012).
Purcell, S. M. et al. A polygenic burden of rare disruptive mutations in schizophrenia. Nature 506, 185–190 (2014).
Simons, Y. B., Bullaughey, K., Hudson, R. R. & Sella, G. A population genetic interpretation of GWAS findings for human quantitative traits. PLoS Biol. 16, e2002985 (2018).
Fu, J. M. et al. Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat. Genet. 54, 1320–1331 (2022).
Border, R. et al. Cross-trait assortative mating is widespread and inflates genetic correlation estimates. Science 378, 754–761 (2022).
Genovese, G. et al. Increased burden of ultra-rare protein-altering variants among 4,877 individuals with schizophrenia. Nat. Neurosci. 19, 1433–1441 (2016).
Kosmicki, J. A. et al. Refining the role of de novo protein-truncating variants in neurodevelopmental disorders by using population reference samples. Nat. Genet. 49, 504–510 (2017).
Baselmans, B. M. L., Yengo, L., van Rheenen, W. & Wray, N. R. Risk in relatives, heritability, SNP-based heritability, and genetic correlations in psychiatric disorders: a review. Biol. Psychiatry 89, 11–19 (2021).
Samocha, K. E. et al. Regional missense constraint improves variant deleteriousness prediction. Preprint at bioRxiv https://doi.org/10.1101/148353 (2017).
Lefebvre, S. et al. Identification and characterization of a spinal muscular atrophy-determining gene. Cell 80, 155–165 (1995).
Mendell, J. R. et al. Single-dose gene-replacement therapy for spinal muscular atrophy. N. Engl. J. Med. 377, 1713–1722 (2017).
Pritchard, J. K. Are rare variants responsible for susceptibility to complex diseases? Am. J. Hum. Genet. 69, 124–137 (2001).
Kim, S. S. et al. Genes with high network connectivity are enriched for disease heritability. Am. J. Hum. Genet. 104, 896–913 (2019).
Forgetta, V. et al. An effector index to predict target genes at GWAS loci. Hum. Genet. 141, 1431–1447 (2022).
Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034 (2019).
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Fahed, A. C. et al. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat. Commun. 11, 3635 (2020).
Khera, A. V. et al. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596 (2019).
Biddinger, K. J. et al. Rare and common genetic variation underlying the risk of hypertrophic cardiomyopathy in a national biobank. JAMA Cardiol. 7, 715–722 (2022).
Bishop, S. L., Thurm, A., Robinson, E. & Sanders, S. J. Prevalence of returnable genetic results based on recognizable phenotypes among children with autism spectrum disorder. Preprint at bioRxiv https://doi.org/10.1101/2021.05.28.21257736 (2021).
Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
Fry, A. et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am. J. Epidemiol. 186, 1026–1034 (2017).
Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).
1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Berisa, T. & Pickrell, J. K. Approximately independent linkage disequilibrium blocks in human populations. Bioinformatics 32, 283–285 (2016).
Schoech, A. P. et al. Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nat. Commun. 10, 790 (2019).
Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634–639 (2020).
Grotzinger, A. D. et al. Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019).

Acknowledgements

We thank S. Gazal, D. King, A. Price and K. Samocha for analytic assistance and comments on this manuscript; and J. Duan for identifying an issue in the first draft of our manuscript. We acknowledge support from the National Institute of Mental Health (F30MH129009 to D.J.W.), the National Library of Medicine (T15LM007092 to D.J.W.), the National Institute of General Medical Sciences (T32GM007753 to A.N.), the Simons Foundation Autism Research Initiative (704413 to E.B.R. and L.J.O.) and the Broad Institute.

Author information

These authors contributed equally: Daniel J. Weiner, Ajay Nadig

Authors and Affiliations

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Daniel J. Weiner, Ajay Nadig, Karthik A. Jagadeesh, Kushal K. Dey, Benjamin M. Neale, Konrad J. Karczewski & Luke J. O’Connor
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
Daniel J. Weiner, Ajay Nadig, Benjamin M. Neale, Elise B. Robinson & Konrad J. Karczewski
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Daniel J. Weiner, Ajay Nadig, Benjamin M. Neale, Elise B. Robinson & Konrad J. Karczewski
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Karthik A. Jagadeesh & Kushal K. Dey
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
Elise B. Robinson

Contributions

D.J.W., A.N. and L.J.O. conceived and designed experiments. K.A.J. and K.K.D. suggested analyses. B.M.N., E.B.R., K.J.K. and L.J.O. supervised the project. D.J.W., A.N. and L.J.O. performed analyses. D.J.W., A.N. and L.J.O. wrote the manuscript.

Corresponding authors

Correspondence to Daniel J. Weiner, Ajay Nadig or Luke J. O’Connor.

Ethics declarations

Competing interests

K.J.K. is a consultant for Vor Biopharma and AlloDx. B.M.N. is a member of the scientific advisory board at Deep Genomics and Neumora, consultant of the scientific advisory board for Camp4 Therapeutics and consultant for Merck. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Doug Speed and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Performance of BHR in exome-scale simulations with no individual-level data.

We performed an extended set of simulations to assess the performance of BHR. The MAF groups are < 1e-5 (group 1), 1e-5 - 1e-4 (group 2), 1e-4 - 1e-3 (group 3), and 1e-3 - 1e-2 (group 4), respectively; the grey and red boxplots indicate the distribution of estimates in null and non-null simulations (true burden h² = 0% and 0.5%, respectively). A minor difference in the way that BHR was applied to simulated vs. real data is that in simulated data, significant genes were identified without any attempt to correct for population stratification, whereas in our real-trait analyses, they were identified using SAIGE-GENE¹. We started with a realistic set of parameters (see Methods) and varied one simulation parameter in each simulation. (A) We increased the sample size from 5e5 to 2e6. This increase amplifies the uncorrected population stratification, causing false positive significant genes and upward bias in BHR (no bias is observed in estimates without significant genes). (B) We added overdispersion effects with the same distribution of effect sizes as the burden effects, i.e. with per-allele effect size variance drawn from a discrete mixture distribution (see Methods). This distribution differs from the BHR model, which assumes that overdispersion effects have a constant per-s.d. effect size variance, but this form of misspecification does not lead to bias. (C) We performed simulations with realistic parameters, including stratification and selection (see Methods and Fig. 1c). (D) We decreased the sample size from 5e5 to 1e5. (E) We increased the strength of population stratification (including the minor-allele biased stratification) by a factor of 10, from a per-s.d. effect size mean of 1e-7 and a variance of 1e-5 to a mean of 1e-6 and a variance of 1e-4. (F) We increased the strength of selection, from mean Ns = 1 to mean Ns = 10. There were extremely few variants with allele frequency greater than 1e-3, so MAF group 4 estimates are not shown.
Numerical results are contained in Supplementary Table 4. Boxplots denote median, quartiles and range of distribution (excepting outliers).
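
The MAF grouping used in this legend can be illustrated with a minimal Python sketch. The function name `maf_group` is hypothetical, and the boundary handling (a variant exactly at an edge goes to the higher group) is an assumption, since the legend does not say which side is inclusive. Group 4 is taken here to be everything at or above 1e-3, with no upper limit enforced.

```python
from bisect import bisect_right

# Upper edges of MAF groups 1-3 as listed in the legend.
# Frequencies at or above 1e-3 are assigned to group 4 (assumption: no upper cap).
MAF_GROUP_EDGES = [1e-5, 1e-4, 1e-3]


def maf_group(maf: float) -> int:
    """Return the 1-based MAF group for a minor allele frequency.

    Assumption: intervals are half-open, [lower, upper), so a value
    equal to an edge is placed in the higher group.
    """
    if not 0.0 < maf <= 0.5:
        raise ValueError("minor allele frequency must be in (0, 0.5]")
    return bisect_right(MAF_GROUP_EDGES, maf) + 1


if __name__ == "__main__":
    for freq in (5e-6, 5e-5, 5e-4, 2e-3):
        print(freq, "-> group", maf_group(freq))
```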

Extended Data Fig. 2 Comparison of BHR and GCTA in null simulations with individual-level genotypes and phenotypes, and different patterns of population stratification.

There are four demographic models: no stratification; north-south stratification; north-south stratification with smaller population size in the northern deme; and local stratification with very small population size in one deme (see Methods). Under each model, we performed simulations with and without selection, mimicking pLoF and synonymous variants respectively. (a) BHR burden heritability estimates with no correction for minor allele-biased stratification. (b) GCTA heritability estimates with no correction for ancestry. (c) BHR burden heritability estimates, correcting for minor allele-biased stratification. (d) GCTA heritability estimates, correcting for ancestry by providing the deme from which each individual was sampled as a covariate. Boxplots denote median, quartiles and range of distribution (excluding outliers).

Extended Data Fig. 3 Genome-wide mean minor allele effect sizes.

We define the “mean effect” as the effect size of the genome-wide burden, summing all minor alleles across genes within a category, on the phenotype. For synonymous variants, a nonzero mean effect is interpreted as evidence of minor-allele biased population stratification, and this type of stratification produces upward bias in BHR heritability estimates (see Methods). (a–c) Mean effect of synonymous variants vs. mean effect of missense benign, missense other, and pLoF variants respectively. The lack of correlation in (c) suggests that for pLoFs, the nonzero mean effect is mostly biological. (d) Mean effect of synonymous variants vs. the resulting bias in heritability estimates, for synonymous variants (left y axis) or for pLoFs (right y axis). These differ by a constant factor due to the larger number of synonymous variants than pLoFs. (e) Mean effect of pLoF variants vs. the contribution of these effects to burden heritability. These estimates are a small fraction of the total pLoF burden heritability. Error bars represent standard errors, which are computed by assuming independence across genes.
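
As a rough illustration of the "mean effect" defined above, the following Python sketch computes a genome-wide burden per individual (the total minor-allele count across all variants in a category) and regresses the phenotype on it. This is a sketch under stated assumptions, not the authors' implementation: it uses individual-level genotypes and an unweighted per-allele scale, whereas the analyses described here work from summary statistics. The function name and input layout are hypothetical.

```python
def mean_minor_allele_effect(minor_allele_counts, phenotype):
    """Least-squares slope of phenotype on genome-wide burden.

    minor_allele_counts: one row per individual, one column per variant,
    each entry a minor-allele count (hypothetical input layout).
    phenotype: one value per individual.
    """
    # Genome-wide burden: total minor alleles carried by each individual.
    burdens = [sum(row) for row in minor_allele_counts]
    n = len(burdens)
    if n != len(phenotype):
        raise ValueError("genotype rows and phenotype values must align")

    burden_mean = sum(burdens) / n
    pheno_mean = sum(phenotype) / n
    sxx = sum((b - burden_mean) ** 2 for b in burdens)
    if sxx == 0:
        raise ValueError("burden has no variance across individuals")
    sxy = sum((b - burden_mean) * (y - pheno_mean)
              for b, y in zip(burdens, phenotype))
    return sxy / sxx
```

Applied to synonymous variants, a slope clearly different from zero would be read, following the legend, as a sign of minor-allele biased stratification rather than biology.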

Extended Data Fig. 4 Burden heritability estimates with effect-allele-permuted burden statistics.

We assessed the potential for confounding in our results by repeating our analyses with ultra-rare pLoF burden statistics whose effect alleles were randomly permuted. This permutation is expected to eliminate the burden heritability while not affecting any form of confounding that is symmetrical with respect to the minor vs. major allele. Boxplots indicate the distribution of burden heritability estimates before and after the permutation (non-null and null, respectively), with median, quartiles and range (excepting outliers).
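
The logic of this permutation can be sketched in Python. The sketch assumes that permuting the effect allele amounts to negating each variant's signed effect estimate with probability one half, and it uses an unweighted sum as the gene-level burden; both are simplifying assumptions made for illustration, and all names are hypothetical.

```python
import random


def permute_effect_alleles(variant_effects, rng):
    """Randomly relabel the effect allele of each variant.

    Swapping which allele is treated as the effect allele negates the
    signed effect estimate and leaves its magnitude unchanged.
    """
    return [beta if rng.random() < 0.5 else -beta for beta in variant_effects]


def burden_effect(variant_effects):
    """Unweighted gene-level burden effect: sum of signed variant effects."""
    return sum(variant_effects)


def sum_of_squares(variant_effects):
    """A quantity symmetric in minor vs. major allele, unchanged by relabelling."""
    return sum(beta * beta for beta in variant_effects)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
effects = [0.2] * 50    # hypothetical gene: 50 variants with concordant effects
permuted = permute_effect_alleles(effects, rng)

print("burden before:", burden_effect(effects))
print("burden after: ", burden_effect(permuted))
print("sum of squares unchanged:",
      sum_of_squares(effects) == sum_of_squares(permuted))
```

Concordant minor-allele effects add up in the original burden but partly cancel after relabelling, while sign-symmetric quantities are untouched, which is why the permuted analysis serves as a null for burden heritability without removing symmetric confounding.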

Extended Data Fig. 5 Proportion of common variant heritability explained by LD-independent blocks with significant heritability.

For each trait, we used HESS to identify which of the 1651 LD-independent blocks from Berisa² have Bonferroni-significant heritability, and then computed the proportion of the overall HESS heritability mediated by each block. Although these blocks aggregate over many variants in many genes, the proportion of heritability explained by individual significant blocks is still less than the proportion of burden heritability explained by individual significant genes in BHR (Fig. 3).
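
The block-level calculation described in this legend can be sketched as follows. The sketch assumes that each block comes with a local heritability estimate and a standard error, that significance is assessed with a one-sided z-test, and that the family-wise level is 0.05; none of these details is stated in the legend, and the function name is hypothetical rather than part of HESS.

```python
import math


def significant_block_shares(block_h2, block_se, alpha=0.05):
    """Share of total heritability in each Bonferroni-significant block.

    block_h2, block_se: per-block local heritability estimates and
    standard errors (hypothetical inputs).
    Returns {block index: block_h2 / total_h2} for significant blocks.
    """
    if len(block_h2) != len(block_se):
        raise ValueError("estimates and standard errors must align")
    n_blocks = len(block_h2)
    total_h2 = sum(block_h2)
    threshold = alpha / n_blocks  # Bonferroni correction over all blocks

    shares = {}
    for index, (h2, se) in enumerate(zip(block_h2, block_se)):
        z = h2 / se
        # One-sided p-value for h2 > 0 under a normal approximation.
        p_value = 0.5 * math.erfc(z / math.sqrt(2))
        if p_value < threshold:
            shares[index] = h2 / total_h2
    return shares
```

The same share, computed per significant gene from burden heritability estimates, is the quantity against which these block-level proportions are compared.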

Extended Data Fig. 6 Comparison of burden versus common variant heritability explained by exome-wide significant genes.

Each point represents a trait-gene significant burden association from the Genebass dataset. X axis values are the fraction of common variant heritability (estimated with HESS) explained by the LD-independent block containing that gene. Y axis values are the fraction of burden heritability (estimated with BHR) explained by the significant gene.

Extended Data Fig. 7 Absolute mean minor allele effect size of ultra-rare pLoF variants genome wide, vs. the constrained gene enrichment of each trait.

(+) and (−) denote the sign of the mean minor allele effects. For numerical results, see Supplementary Tables 7, 16, and 17.

Extended Data Fig. 8 Genetic correlation estimates across 37 traits, for common variants (upper triangle) and rare coding variants (lower).

Asterisks indicate nominally significant genetic correlation estimates (two-tailed p < 0.05). Grey boxes not on the diagonal indicate cross-trait LDSC point estimates that are outside of [−1.25, 1.25], which cross-trait LDSC does not report by default. For numerical results, see Supplementary Table 19.

Extended Data Fig. 9 Comparison of common coding vs. common whole-genome genetic correlations.

(a) We evaluated whether common coding variants, similar to rare coding variants, have stronger genetic correlations than common variants overall. The fit line indicates the Deming regression slope, which allows for uncertainty in both the X and Y axis values. (b) We also assessed the stability of the Deming regression slope for the burden genetic correlation vs. the common-variant genetic correlation on chromosomes 1–8 and chromosomes 9–22.
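
For readers unfamiliar with Deming regression, the following Python sketch gives the textbook closed-form slope. It assumes a single known ratio of error variances (y errors over x errors, defaulting to 1, which gives orthogonal regression); the legend does not say how the ratio was set or whether per-point standard errors were used, so this is an illustration rather than the procedure used for the figure.

```python
import math


def deming_slope(x, y, error_variance_ratio=1.0):
    """Deming regression slope of y on x.

    error_variance_ratio: variance of errors in y divided by variance of
    errors in x (assumed known and constant across points).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")

    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when x and y are uncorrelated")

    d = error_variance_ratio
    diff = syy - d * sxx
    return (diff + math.sqrt(diff * diff + 4 * d * sxy * sxy)) / (2 * sxy)
```

Unlike ordinary least squares, which attributes all noise to the Y axis and so attenuates the slope when X is also noisy, this estimator treats both sets of genetic correlation estimates as measured with error.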

Extended Data Fig. 10 Burden heritability enrichments of drug target gene sets.

We used BHR to estimate the ultra-rare loss-of-function burden heritability enrichment in sets of manually curated drug target genes from a previous publication⁶. For all panels, error bars are standard errors, and bars are shaded in blue if the enrichment is significantly greater than 1. (A) Burden heritability enrichment in n = 14 blood pressure drug target genes (union of diastolic and systolic blood pressure gene sets from reference publication). (B) Burden heritability enrichment in n = 8 bone mineral density drug target genes. (C) Burden heritability enrichment in n = 6 calcium drug target genes. (D) Burden heritability enrichment in n = 10 lipid drug target genes (union of LDL and triglyceride gene sets from reference publication). (E) Burden heritability enrichment in n = 6 red blood cell drug target genes. (F) Burden heritability enrichment in n = 7 type 2 diabetes drug target genes.

Supplementary information

Supplementary Information

Legends for Supplementary Figs. 1–8, legends for Supplementary Tables 1–22 and additional references.

Reporting Summary

Supplementary Figs. 1–8

Supplementary Tables 1–22

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Weiner, D.J., Nadig, A., Jagadeesh, K.A. et al. Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499 (2023). https://doi.org/10.1038/s41586-022-05684-z

Received: 29 June 2022
Accepted: 22 December 2022
Published: 08 February 2023
Issue Date: 16 February 2023
DOI: https://doi.org/10.1038/s41586-022-05684-z
