Obesity is a worldwide epidemic, with major health and economic costs. Here we estimate heritability for body mass index (BMI) in 172,000 sibling pairs and 150,832 unrelated individuals and explore the contribution of genotype–covariate interaction effects at common SNP loci. We find evidence for genotype–age interaction (likelihood ratio test (LRT) = 73.58, degrees of freedom (df) = 1, P = 4.83 × 10⁻¹⁸), which contributed 8.1% (1.4% s.e.) to BMI variation. Across eight self-reported lifestyle factors, including diet and exercise, we find genotype–environment interaction only for smoking behavior (LRT = 19.70, P = 5.03 × 10⁻⁵ and LRT = 30.80, P = 1.42 × 10⁻⁸), which contributed 4.0% (0.8% s.e.) to BMI variation. Bayesian association analysis suggests that BMI is highly polygenic, with 75% of the SNP heritability attributable to loci that each explain <0.01% of the phenotypic variance. Our findings imply that substantially larger sample sizes across ages and lifestyles are required to understand the full genetic architecture of BMI.
- Supplementary Figure 1: Simulation study of genotype–covariate interaction using observed genotype data from 25,000 randomly selected unrelated individuals in the UK Biobank study. (339 KB)
(a) Estimates of the variance tagged by common SNP markers from multivariate REML (MV-GREML) and random regression REML (RR-GREML) models across five different simulation scenarios (see Online Methods). For each simulation scenario, there were five measurement points, representing different points along a continuous gradient. The estimates of the RR-GREML models come from a model fitting the highest order of Legendre polynomial function that gave a significant improvement in model fit based on a log-likelihood ratio test (LRT). Estimates are the mean value obtained over 10 simulation replicates and error bars give the s.d. across replicates. The solid lines give the pattern of variance change predicted from the polynomial function of the RR-GREML models and dotted lines give the approximate s.d. across replicates (see Online Methods). (b) The analysis of a is repeated but the phenotype is standardized with an inverse normal transformation within each measurement point. (c) Simulated and estimated correlation in common SNP marker effects from MV-GREML models (green circles) and RR-GREML models of first order polynomial (blue circles). (d) P-value of a LRT with two degrees of freedom of a first order polynomial RR-GREML model versus a zero order RR-GREML model for simulation scenarios with and without genotype-covariate interaction and with and without standardization. Points give the mean, and error bars show the full distribution of the LRT statistic P-values across all 10 replicates. (e) P-value of a LRT of increasing orders of polynomial (k = 1, 2, 3, 4) RR-GREML models across different simulation scenarios.
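The within-measurement-point standardization mentioned for panel b can be illustrated with a minimal sketch of a rank-based inverse normal transformation. The function names, the `(rank - 0.5) / n` offset and the average-rank handling of ties are assumptions made for illustration; the caption does not specify which variant was used.

```python
from statistics import NormalDist


def rank_inverse_normal(values, offset=0.5):
    """Map values to standard normal quantiles by rank (ties share the average rank).

    The (rank - offset) / n rule with offset = 0.5 is one common convention,
    assumed here for illustration only.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


def transform_within_groups(values, groups):
    """Apply the transformation separately within each group (measurement point)."""
    by_group = {}
    for idx, g in enumerate(groups):
        by_group.setdefault(g, []).append(idx)
    out = [0.0] * len(values)
    for idxs in by_group.values():
        transformed = rank_inverse_normal([values[i] for i in idxs])
        for i, t in zip(idxs, transformed):
            out[i] = t
    return out
```

Because each group is ranked on its own, every group ends up with the same distribution of transformed values, which removes differences in phenotypic mean and variance between measurement points.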
- Supplementary Figure 2: Simulation study of genotype–covariate interaction using observed genotype data from 25,000 randomly selected unrelated individuals in the UK Biobank study. (227 KB)
(a) Estimates of SNP heritability, V(G), and variance attributable to genotype-covariate interaction effects, V(GCI), from genotype-covariate interaction (GCI-GREML) models across five simulation scenarios described in Supplementary Figure 1, for both rank inverse normal transformed phenotypes and unstandardized ones. For each simulation scenario, there were five measurement points, representing different points along a continuous gradient, and the estimates are the mean value obtained over 10 simulation replicates with error bars giving the s.d. across replicates. (b) We repeat the GCI-GREML models but we only use the extreme measurement points. Expectations for V(G) and V(GCI) were derived from our theory (see Supplementary Note). (c) The significance of the likelihood ratio tests comparing the fit to the data of a GCI-GREML model (fitting a genetic variance component and a genetic interaction variance component) and a null GREML model (fitting a single genetic variance component) across the simulation scenarios.
- Supplementary Figure 3: Phenotypic variance and variance tagged by common HapMap3 SNP loci for body mass index (BMI) and height in the AHTHEL sample. (161 KB)
(a,b) The phenotypic variance of BMI (a) and height (b) was left unstandardized within age groups, allowing testing for variance heterogeneity for both traits. The phenotypic variance is shown in grey circles. The variance captured by common HapMap3 SNP loci from an MV-GREML model is shown in blue circles for BMI, and green circles for height. The estimates from a zero order RR-GREML model are shown by a solid blue line for BMI, and a solid green line for height, with dashed lines giving the approximate s.e. LRT values give the likelihood ratio test statistic values for a RR-GREML model with a first order polynomial as compared to a RR-GREML model of zero order, to test for the presence of changes in variance tagged by SNP markers.
- Supplementary Figure 4: No evidence for genotype–age interaction for BMI in 107,488 individuals from the UK Biobank study. (203 KB)
(a,b) The UK Biobank sample contained individuals measured between the ages of 40 and 73 and we divided the age distribution into deciles of approximately equal sample size. The age ranges were: (1) 9,655 individuals aged 40 to 44; (2) 10,379 individuals aged 45 to 48; (3) 9,205 individuals aged 49 to 51; (4) 13,669 individuals aged 52 to 55; (5) 7,689 individuals aged 56 and 57; (6) 8,590 individuals aged 58 and 59; (7) 11,211 individuals aged 60 and 61; (8) 11,019 individuals aged 62 and 63; (9) 14,559 individuals aged 64 to 66; and (10) 11,612 individuals aged 67 and above. We corrected for sex effects and standardized BMI measures within each decile with a rank inverse normal transformation to remove differences in phenotypic mean and variance across age. We used a full MV-GREML model to estimate the proportion of variance attributable to common HapMap3 SNPs for each age group, as shown in blue dots with s.e. bars (a), and the correlation in genome-wide SNP effects across age groups (b), with s.e. of the genetic correlations among age groups given in brackets. We used a RR-GREML model with a first order polynomial and compared the model fit to the data to a RR-GREML model of zero order, to test for the presence of genotype-age interaction effects using a likelihood-ratio test (LRT). We find no evidence for genotype-age interaction effects (LRT = 1.02, P = 0.601) and thus the estimates of the RR-GREML model of zero order are presented in a with dashed lines giving the approximate s.e.
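As a rough arithmetic check on the test reported in this caption, the sketch below converts an LRT statistic into a P-value. It assumes the same two-degree-of-freedom chi-square reference distribution described for the first order versus zero order comparison in Supplementary Figure 1d; this caption does not state the degrees of freedom, and the function names and log-likelihood values are hypothetical.

```python
import math


def lrt_statistic(loglik_full, loglik_null):
    """Likelihood ratio test statistic: twice the difference in log-likelihood."""
    return 2.0 * (loglik_full - loglik_null)


def chi2_sf(stat, df):
    """Upper-tail chi-square probability, using closed forms for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Assumed reference distribution: chi-square with 2 df (not stated in this caption).
p_value = chi2_sf(1.02, df=2)
print(f"P = {p_value:.2f}")
```

Under that assumption the statistic of 1.02 gives a P-value of about 0.60, which agrees with the reported P = 0.601 once rounding of the LRT statistic is allowed for.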
- Supplementary Figure 5: Correlations among the self-reported lifestyle factors found to influence BMI of 97,510 individuals of the UK Biobank study. (172 KB)
- Supplementary Figure 6: Phenotypic variance and variance tagged by common HapMap3 SNP loci for BMI of 97,510 individuals within the UK Biobank study across self-reported lifestyle factors. (172 KB)
A series of self-reported lifestyle factors shown to significantly influence mean BMI within the UK Biobank were used to group individuals. We corrected BMI for sex, age and the effects of all lifestyle factors and then converted BMI values to a z-score across but not within groups. This means that the phenotypic variance of BMI was unstandardized within groups, allowing testing for variance heterogeneity, and these values are shown in grey circles. Estimates of the variance captured by common HapMap3 SNP loci from a MV-GREML model are shown in blue circles, and those from a first order RR-GREML model are given by a solid blue line, with dashed lines giving the approximate s.e. For the RR-GREML model estimates, we present the model of best fit to the data, as assessed by likelihood-ratio test statistics (LRT), which are given in Supplementary Table 6.
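The point about standardizing across but not within groups can be shown with a small sketch. The data values and function names are hypothetical, and the correction for sex, age and lifestyle effects described in the caption is omitted; only the z-score step is illustrated.

```python
from statistics import mean, pstdev, pvariance


def zscore_across(values):
    """Standardize with one overall mean and s.d., ignoring group membership."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def variance_by_group(values, groups):
    """Population variance of the values within each group."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    return {g: pvariance(vs) for g, vs in by_group.items()}


# Hypothetical toy data: group "b" is three times as spread out as group "a".
values = [-1.0, 1.0, -1.0, 1.0, -3.0, 3.0, -3.0, 3.0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

z = zscore_across(values)
print(variance_by_group(z, groups))
```

A single overall rescaling leaves the ratio of the group variances unchanged, so differences in phenotypic variance between groups remain available for testing. Standardizing within each group would instead force every group variance to one.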
- Supplementary Text and Figures (1,823 KB)
Supplementary Figures 1–6, Supplementary Tables 1–6 and Supplementary Note.