The UK Biobank is a prospective cohort study with deep genetic, physical and health data collected on ~500,000 individuals across the United Kingdom between 2006 and 2010. This unprecedented open access database has enabled studies an order of magnitude larger than before on genetic and epidemiological associations for an extensive range of health-related traits. The UK Biobank has generously made their datasets, and the research results derived from them, accessible to researchers as an open access resource to benefit public health.
This collection accompanies the publication of the first main papers from UK Biobank in Nature and associated commentaries. We also highlight a selection of research publications from Nature journals that showcase how these UK Biobank datasets have already been widely used in a broad range of studies in order to advance the understanding of the genetic basis of disease, genetic epidemiology and public health.
Deep phenotype and genome-wide genetic data from 500,000 individuals in the UK Biobank are presented, together with a description of population structure and relatedness in the cohort and imputation that increases the number of testable variants to 96 million.
Genome-wide association studies of brain imaging data from 8,428 individuals in UK Biobank show that many of the 3,144 traits studied are heritable, and genes associated with individual phenotypes are identified.
The UK Biobank combines detailed phenotyping and genotyping with tracking of long-term health outcomes in a large cohort. This study describes the recently launched brain-imaging component that will ultimately scan 100,000 individuals. Results from the first 5,000 subjects are reported, including thousands of associations, population modes and hypothesis-driven results.
UK Biobank contains a wealth of data on genetics, health and more from 500,000 participants. A detailed overview of the biobank and an analysis of its brain-imaging data show the value of this resource.
Two studies in Nature describe the full data set of the UK Biobank resource, which contains genome-wide genetic data, clinical measurements and health records for ~500,000 individuals, and reveal insights into the brain’s genetic architecture.
Analysis of 329,000 individuals in the UK Biobank identifies 116 loci associated with neuroticism. Genes implicated are enriched in neuronal differentiation pathways, and genetic correlations between neuroticism and other mental health traits are elucidated.
Genome-wide association study for osteoarthritis using data from UK Biobank identifies loci for knee- and hip-specific disease. Functional analyses of chondrocytes provide further insight into candidate causal genes.
Genome-wide meta-analysis identifies >100 loci associated with hair color variation in humans of European ancestry. These loci explain a large portion of the heritability of this trait and provide insights into pathways regulating hair pigmentation.
BayesS estimates SNP-based heritability, polygenicity, and the relationship between effect size and minor allele frequency using genome-wide SNP data. Applying BayesS to UK Biobank data identifies signatures of natural selection for 23 complex traits.
A meta-analysis of genome-wide association studies for neuroticism identifies novel loci, pathways and potential drug targets. Further analysis implicates specific brain regions and evaluates genetic overlap with other neuropsychiatric traits.
This large, multi-ethnic genome-wide association study identifies 97 loci significantly associated with atrial fibrillation. These loci are enriched for genes involved in cardiac development, electrophysiology, structure and contractile function.
Large-scale association analyses identify 142 independent risk variants for atrial fibrillation. Pathway and functional enrichment analyses suggest that many of the putative risk genes act via cardiac structural remodeling.
Genome-wide analyses identify 42 risk loci for diverticular disease, 39 of which are new. Genes in associated regions are enriched for expression in connective tissue cell types and are coexpressed with genes involved in vascular and mesenchymal biology.
Association analyses in over 1 million individuals identify 535 new loci influencing blood pressure traits. The results provide new insights into blood pressure regulation and highlight shared genetic architecture between blood pressure and lifestyle exposures.
Neuroticism can be assessed as a composite score of individual items. Here, Nagel et al. perform genetic association studies for 12 neuroticism items and the sum-score and demonstrate genetic heterogeneity at the item-level.
Examination of predicted loss-of-function (pLOF) genetic variants allows direct identification of genes with therapeutic potential. Here, Emdin et al. perform association analysis for 3759 pLOF variants with 24 traits and highlight protective variants against cardiometabolic and immune phenotypes.
Protein-truncating variants (PTVs) are predicted to significantly affect a gene’s function and, thus, human traits. Here, DeBoever et al. systematically analyze PTVs in more than 300,000 individuals across 135 phenotypes and identify 27 associations between PTVs and medical conditions.
Testing the association between genetic variants and a range of phenotypes can assist drug development. Here, in a phenome-wide association study in up to 697,815 individuals, Diogo et al. identify genotype–phenotype associations predicting efficacy, alternative indications or adverse drug effects.
Little is known about the genetic determinants of social isolation and loneliness despite their well-established importance for health. Here, using multi-trait GWAS, Day et al. identify 15 genomic loci for loneliness and further show a bidirectional causal relationship between BMI and loneliness by Mendelian randomization (MR).
Emma Clifton et al. report a genome-wide association study of risk-taking propensity among UK Biobank participants. They identify 26 loci, 24 of which are novel, and use Mendelian randomisation analysis to explore the relationship between risk-taking propensity and BMI.
Konstantinos Hatzikotoulas et al. report the largest genome-wide association study to date for developmental dysplasia of the hip using national clinical audit data from the UK. They find a significant association with the GDF5 locus and evidence for shared genetic architecture with hip osteoarthritis.
Rosa Thorolfsdottir et al. report a genome-wide association study of atrial fibrillation in 29,502 cases and 767,760 controls from Iceland and the UK Biobank. They identify a significant association with coding variants in RPL3L, the first ribosomal gene implicated in atrial fibrillation, and MYZAP, an intercalated disc gene.
SAIGE (Scalable and Accurate Implementation of GEneralized mixed model) is a generalized mixed model association test that can efficiently analyze large data sets while controlling for unbalanced case-control ratios and sample relatedness, as shown by applying SAIGE to the UK Biobank data for >1,400 binary phenotypes.
The heteroskedastic linear mixed model is a new framework for testing both mean and variance effects on quantitative traits. Applying the heteroskedastic linear mixed model to body mass index in the UK Biobank shows that the approach increases the power to detect associated loci.
MTAG is a new method for joint analysis of summary statistics from genome-wide association studies of different traits. Applying MTAG to summary statistics for depressive symptoms, neuroticism and subjective well-being increased discovery of associated loci as compared to single-trait analyses.