We develop a Bayesian mixed linear model that simultaneously estimates single-nucleotide polymorphism (SNP)-based heritability, polygenicity (proportion of SNPs with nonzero effects), and the relationship between SNP effect size and minor allele frequency for complex traits in conventionally unrelated individuals using genome-wide SNP data. We apply the method to 28 complex traits in the UK Biobank data (N = 126,752) and show that on average, 6% of SNPs have nonzero effects, which in total explain 22% of phenotypic variance. We detect significant (P < 0.05/28) signatures of natural selection in the genetic architecture of 23 traits, including reproductive, cardiovascular, and anthropometric traits, as well as educational attainment. The significant estimates of the relationship between effect size and minor allele frequency in complex traits are consistent with a model of negative (or purifying) selection, as confirmed by forward simulation. We conclude that negative selection acts pervasively on the genetic variants associated with human complex traits.
Access optionsAccess options
Subscribe to Journal
Get full journal access for 1 year
only $18.75 per issue
All prices are NET prices.
VAT will be added later in the checkout.
Rent or Buy article
Get time limited or full article access on ReadCube.
All prices are NET prices.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Johnson, T. & Barton, N. Theoretical models of selection and mutation on quantitative traits. Phil. Trans. R. Soc. Lond. B 360, 1411–1425 (2005).
Hancock, A. M. et al. Colloquium paper: human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency. Proc. Natl. Acad. Sci. USA 107(Suppl 2), 8924–8930 (2010).
Pritchard, J. K., Pickrell, J. K. & Coop, G. The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Curr. Biol. 20, R208–R215 (2010).
Smith, J. M. & Haigh, J. The hitch-hiking effect of a favourable gene. Genet. Res. 23, 23–35 (1974).
Pritchard, J. K. Are rare variants responsible for susceptibility to complex diseases? Am. J. Hum. Genet. 69, 124–137 (2001).
Eyre-Walker, A. Evolution in health and medicine Sackler colloquium: genetic architecture of a complex trait and its implications for fitness and genome-wide association studies. Proc. Natl. Acad. Sci. USA 107(Suppl 1), 1752–1756 (2010).
Mancuso, N. et al. The contribution of rare variation to prostate cancer heritability. Nat. Genet. 48, 30–35 (2016).
Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).
Lee, S. H., Wray, N. R., Goddard, M. E. & Visscher, P. M. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 88, 294–305 (2011).
Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).
Yang, J. et al. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 43, 519–525 (2011).
Gratten, J., Wray, N. R., Keller, M. C. & Visscher, P. M. Large-scale genomics unveils the genetic architecture of psychiatric disorders. Nat. Neurosci. 17, 782–790 (2014).
Habier, D., Fernando, R. L., Kizilkaya, K. & Garrick, D. J. Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics 12, 186 (2011).
Moser, G. et al. Simultaneous discovery, estimation and prediction analysis of complex traits using a Bayesian mixture model. PLoS Genet. 11, e1004969 (2015).
de Los Campos, G., Hickey, J. M., Pong-Wong, R., Daetwyler, H. D. & Calus, M. P. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 193, 327–345 (2013).
Zhou, X., Carbonetto, P. & Stephens, M. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet. 9, e1003264 (2013).
Lloyd-Jones, L. R. et al. Inference on the genetic basis of eye and skin color in an admixed population via Bayesian linear mixed models. Genetics 206, 1113–1126 (2017).
Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Lloyd-Jones, L. R. et al. The genetic architecture of gene expression in peripheral blood. Am. J. Hum. Genet. 100, 228–237 (2017).
Speed, D., Hemani, G., Johnson, M. R. & Balding, D. J. Improved heritability estimation from genome-wide SNPs. Am. J. Hum. Genet. 91, 1011–1021 (2012).
Lee, S. H. et al. Estimation of SNP heritability from dense genotype data. Am. J. Hum. Genet. 93, 1151–1155 (2013).
Yang, J. et al. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat. Genet. 47, 1114–1120 (2015).
Neal, R. M. MCMC using Hamiltonian dynamics. in Handbook of Markov Chain Monte Carlo (eds Brooks, S., Gelman, A., Jones, G. & Meng, X.-L.) 113–162 (CRC Press, Boca Raton, FL, 2011).
Fernando, R. L., Dekkers, J. C. & Garrick, D. J. A class of Bayesian methods to combine large numbers of genotyped and non-genotyped animals for whole-genome analyses. Genet. Sel. Evol. 46, 50 (2014).
Walter, K. et al. The UK10K project identifies rare variants in health and disease. Nature 526, 82–90 (2015).
Wood, A. R. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014).
Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).
Robinson, M. R. et al. Population genetic differentiation of height and body mass index across Europe. Nat. Genet. 47, 1357–1362 (2015).
Marouli, E. et al. Rare and low-frequency coding variants alter human adult height. Nature 542, 186–190 (2017).
Turcot, V. et al. Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity. Nat. Genet. 50, 26–41 (2018).
de Koning, L., Merchant, A. T., Pogue, J. & Anand, S. S. Waist circumference and waist-to-hip ratio as predictors of cardiovascular events: meta-regression analysis of prospective studies. Eur. Heart J. 28, 850–856 (2007).
Wass, P., Waldenström, U., Rössner, S. & Hellberg, D. An android body fat distribution in females impairs the pregnancy rate of in-vitro fertilization-embryo transfer. Hum. Reprod. 12, 2057–2060 (1997).
Day, F. R. et al. Large-scale genomic analyses link reproductive aging to hypothalamic signaling, breast cancer susceptibility and BRCA1-mediated DNA repair. Nat. Genet. 47, 1294–1303 (2015).
Fuchsberger, C. et al. The genetic architecture of type 2 diabetes. Nature 536, 41–47 (2016).
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Hill, W. G., Goddard, M. E. & Visscher, P. M. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 4, e1000008 (2008).
Visscher, P. M., Goddard, M. E., Derks, E. M. & Wray, N. R. Evidence-based psychiatric genetics, AKA the false dichotomy between common and rare variant hypotheses. Mol. Psychiatry 17, 474–485 (2012).
Simons, Y. B., Turchin, M. C., Pritchard, J. K. & Sella, G. The deleterious mutation load is insensitive to recent population history. Nat. Genet. 46, 220–224 (2014).
Uricchio, L. H., Zaitlen, N. A., Ye, C. J., Witte, J. S. & Hernandez, R. D. Selection and explosive growth alter genetic architecture and hamper the detection of causal rare variants. Genome Res. 26, 863–873 (2016).
Kelleher, J., Etheridge, A. M. & McVean, G. Efficient coalescent simulation and genealogical analysis for large sample sizes. PLoS Comput. Biol. 12, e1004842 (2016).
Konarzewski, M. & Książek, A. Determinants of intra-specific variation in basal metabolic rate. J. Comp. Physiol. B 183, 27–41 (2013).
Nyholt, D. R., Gillespie, N. A., Heath, A. C. & Martin, N. G. Genetic basis of male pattern baldness. J. Invest. Dermatol. 121, 1561–1564 (2003).
Hyde, C. L. et al. Identification of 15 genetic loci associated with risk of major depression in individuals of European descent. Nat. Genet. 48, 1031–1036 (2016).
de Moor, M. H. et al. Meta-analysis of genome-wide association studies for neuroticism, and the polygenic association with major depressive disorder. JAMA Psychiatry 72, 642–650 (2015).
Davies, G. et al. Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N = 112 151). Mol. Psychiatry 21, 758–767 (2016).
Gravel, S. et al. Demographic history and rare allele sharing among human populations. Proc. Natl. Acad. Sci. USA 108, 11983–11988 (2011).
Altshuler, D. M. et al. International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).
Torgerson, D. G. et al. Evolutionary processes acting on candidate cis-regulatory regions in humans inferred from patterns of polymorphism and divergence. PLoS Genet. 5, e1000592 (2009).
McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
Hernandez, R. D. et al. Classic selective sweeps were rare in recent human evolution. Science 331, 920–924 (2011).
Turchin, M. C. et al. Evidence of widespread selection on standing variation in Europe at height-associated SNPs. Nat. Genet. 44, 1015–1019 (2012).
Berg, J. J. & Coop, G. A population genetic signal of polygenic adaptation. PLoS Genet. 10, e1004412 (2014).
Field, Y. et al. Detection of human adaptation during the past 2000 years. Science 354, 760–764 (2016).
Gazal, S. et al. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Speed, D., Cai, N., Johnson, M. R., Nejentsev, S. & Balding, D. J. Reevaluation of SNP heritability in complex human traits. Nat. Genet. 49, 986–992 (2017).
Guan, Y. & Stephens, M. Bayesian variable selection regression for genome-wide association studies and other large-scale problems. Ann. Appl. Stat. 5, 1780–1815 (2011).
Zeng, J. Whole Genome Analyses Accounting for Structures in Genotype Data. PhD dissertation, Iowa State University, Chapter 2, 6-33 (2015).
Fernando, R. L. & Garrick, D. Bayesian methods applied to GWAS. Methods Mol. Biol. 1019, 237–274 (2013).
Gelman, A. et al. Bayesian Data Analysis. (CRC Press, Boca Raton, FL, 2014).
Fernando, R. L. et al. Controlling the proportion of false positives in multiple dependent tests. Genetics 166, 611–619 (2004).
Storey, J. D. The optimal discovery procedure: a new approach to simultaneous significance testing. J. R. Stat. Soc. Series B Stat. Methodol. 69, 347–368 (2007).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
UK Biobank Genotyping and quality control of UK Biobank, a large-scale, extensively phenotyped prospective resource. UK Biobank http://biobank.ctsu.ox.ac.uk/crystal/docs/genotyping_qc.pdf (2015).
Cross-Disorder Group of the Psychiatric Genomics Consortium. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet 381, 1371–1379 (2013).
Messer, P. W. SLiM: simulating evolution with selection and linkage. Genetics 194, 1037–1039 (2013).
Palamara, P. F. et al. Leveraging distant relatedness to quantify human mutation and gene-conversion rates. Am. J. Hum. Genet. 97, 775–789 (2015).
Enard, D., Messer, P. W. & Petrov, D. A. Genome-wide signals of positive selection in human evolution. Genome Res. 24, 885–895 (2014).
Abecasis, G. R. et al. 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
We thank The University of Queensland’s Research Computing Centre (RCC) for its support in this research. We thank F. Zhang for building the website for the software tool GCTB. This research was supported by the Australian Research Council (DP160101343, DP160101056, DP160103860, and DP160102400), the Australian National Health and Medical Research Council (1107258, 1078901, 1078037, 1083656, 1078399, 1046880, and 1113400), the US National Institutes of Health (MH100141, GM099568, ES025052, and AG042568), and the Sylvia & Charles Viertel Charitable Foundation (Senior Medical Research Fellowship). R.d.V. acknowledges funding from an ERC consolidator grant (647648 EdGe, awarded to Philipp Koellinger). This study makes use of data from dbGaP (accessions: phs000090 and phs000091), UK10K project (EGA accessions: EGAS00001000108 and EGAS00001000090), and UK Biobank Resource (application number: 12514). A full list of acknowledgements for these datasets can be found in part 19 of the Supplementary Note.
About this article
Nature Communications (2019)
Bivariate causal mixture model quantifies polygenic overlap between complex traits beyond genetic correlation
Nature Communications (2019)
Behavior Genetics (2019)
Human Genetics (2019)
Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits
Nature Genetics (2018)