An atlas of genetic associations in UK Biobank

Abstract

Genome-wide association studies (GWAS) have identified many loci contributing to variation in complex traits, yet the majority of loci that contribute to the heritability of complex traits remain elusive. Large study populations with sufficient statistical power are required to detect the small effect sizes of the yet unidentified genetic variants. However, the analysis of huge cohorts, like UK Biobank, is challenging. Here, we present an atlas of genetic associations for 118 non-binary and 660 binary traits of 452,264 UK Biobank participants of European ancestry. Results are compiled in a publicly accessible database that allows querying genome-wide association results for 9,113,133 genetic variants, as well as downloading GWAS summary statistics for over 30 million imputed genetic variants (>23 billion phenotype–genotype pairs). Our atlas of associations (GeneATLAS, http://geneatlas.roslin.ed.ac.uk) will help researchers to query UK Biobank results in an easy and uniform way without the need to incur high computational costs.

Fig. 1: The effect of sample size on the number of GWAS hits and their estimated effects.
Fig. 2: Histograms of numbers of significant associations (two-sided t-test, P < 10⁻⁸).
Fig. 3: Number of significant associations (two-sided t-test, P < 10⁻⁸).
Fig. 4: Relationship between estimated SNP heritability and numbers of genome-wide significant associations (two-sided t-test, P < 10⁻⁸).
Fig. 5: Manhattan plots for selected phenotypes.
Fig. 6: Numbers of phenotypes of different SNP heritability.
Fig. 7: Phenotypic prediction accuracy from genetic markers.

Data availability

All summary results from the analyses performed are available at the GeneATLAS website, http://geneatlas.roslin.ed.ac.uk/.

Acknowledgements

This research has been conducted using the UK Biobank Resource under project 788. The work was funded by the Roslin Institute Strategic Programme Grant from the BBSRC (BB/P013732/1) and MRC grant (MR/N003179/1) granted to A.T. A.T. also acknowledges funding from the Medical Research Council and O.C.-X. from MRC fellowship MR/R025851/1. Analyses were performed using the ARCHER UK National Supercomputing Service.

Author information

Contributions

All authors contributed equally to the design, running of the analyses, and writing of the manuscript.

Corresponding author

Correspondence to Albert Tenesa.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–24 and Supplementary Note

Reporting Summary

Supplementary Tables 1, 2 and 4–13

Supplementary Table 3

List of lead variants for each phenotype

About this article

Cite this article

Canela-Xandri, O., Rawlik, K. & Tenesa, A. An atlas of genetic associations in UK Biobank. Nat Genet 50, 1593–1599 (2018). https://doi.org/10.1038/s41588-018-0248-z
