The genetic architectures of common, complex diseases are largely uncharacterized. We modeled the genetic architecture underlying genome-wide association study (GWAS) data for rheumatoid arthritis and developed a new method using polygenic risk-score analyses to infer the total liability-scale variance explained by associated GWAS SNPs. Using this method, we estimated that, together, thousands of SNPs from rheumatoid arthritis GWAS explain an additional 20% of disease risk (excluding known associated loci). We further tested this method on datasets for three additional diseases and obtained comparable estimates for celiac disease (43% excluding the major histocompatibility complex), myocardial infarction and coronary artery disease (48%) and type 2 diabetes (49%). Our results are consistent with simulated genetic models in which hundreds of associated loci harbor common causal variants and a smaller number of loci harbor multiple rare causal variants. These analyses suggest that GWAS will continue to be highly productive for the discovery of additional susceptibility loci for common diseases.
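As an illustration of the quantities the abstract refers to, here is a minimal sketch of a polygenic risk score and of the threshold on a liability scale. It is not the authors' implementation and does not reproduce their inference procedure. The function names, the toy allele counts and weights, and the prevalence value are all hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def polygenic_risk_score(allele_counts, log_odds_ratios):
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP) for one individual.

    The weights are assumed to be per-SNP log odds ratios estimated in a
    separate discovery sample (an assumption of this sketch).
    """
    if len(allele_counts) != len(log_odds_ratios):
        raise ValueError("one weight is required per SNP")
    return sum(count * weight for count, weight in zip(allele_counts, log_odds_ratios))


def liability_threshold(prevalence):
    """Threshold on a standard-normal liability scale.

    Under the liability-threshold model, individuals whose liability exceeds
    this value are affected, so the upper tail area equals the prevalence.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - prevalence)


# Toy example: three SNPs with hypothetical weights.
score = polygenic_risk_score([0, 1, 2], [0.10, 0.20, 0.30])
# Hypothetical prevalence of 1%, used only to show the call.
threshold = liability_threshold(0.01)
```

In a score analysis of this kind, the score is computed for every individual in a target sample and its association with case-control status is then summarized, for example as a proportion of variance explained.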
R.M.P. is supported by grants from the US National Institutes of Health (NIH) (R01-AR057108, R01-AR056768, U01-GM092691 and R01-AR059648) and holds a Career Award for Medical Scientists from the Burroughs Wellcome Fund. S.R. is supported by an NIH Career Development Award (K08AR055688-01A1). The Brigham Rheumatoid Arthritis Sequential Study Registry is supported by a grant from Crescendo and Biogen-Idec. The North American Rheumatoid Arthritis Consortium is supported by the NIH (NO1-AR-2-2263 and RO1-AR44422). This research was also supported in part by the Intramural Research Program of the National Institute of Arthritis, Musculoskeletal and Skin Diseases of the NIH and by a Canada Research Chair and grants to K.A.S. from the Canadian Institutes for Health Research (MOP79321 and IIN-84042) and the Ontario Research Fund (RE01061). We acknowledge S. Purcell, A. Price and N. Zaitlen for help with the design and implementation of the study and analysis.
The authors declare no competing financial interests.
Stahl, E., Wegmann, D., Trynka, G. et al. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nat Genet 44, 483–489 (2012). https://doi.org/10.1038/ng.2232