Introduction

The World Health Organization defines overweight and obesity as a body mass index (BMI) of ≥25 and ≥30 kg m−2, respectively.1 Overweight and obese people have a significantly higher mortality rate than those with normal weight owing to various health disorders.2, 3, 4, 5 Therefore, preventing overweight and obesity should reduce the onset of these disorders and their associated adverse effects. Furthermore, decreasing the obese population will lower medical costs. The estimated annual costs of overweight and obesity are $498 and $1630 per capita, respectively.6

Obesity is associated with environmental factors, such as dietary intake and physical activity, as well as genetic factors.1, 7 Obesity is due to an increase in total energy intake, although the precise contributions of nutrients (for example, carbohydrates, fat, protein and fiber) are not fully understood.8 Low physical activity and a sedentary lifestyle exacerbate obesity.8 Speliotes et al.9 performed a meta-analysis of large genome-wide association studies (GWAS) and extracted genome-wide significant single-nucleotide polymorphisms (SNPs) representing 32 loci on BMI-related genes.9 The genetic risk score (GRS) indicates a genetic predisposition to obesity, and is calculated from these SNPs. Several studies have investigated gene–environment interactions using the GRS. The GRS and the consumption of sugar-sweetened beverages and fried foods are reported to be associated with obesity.10, 11, 12 However, associations between the GRS and dietary factors besides sugar-sweetened beverages and fried foods remain unclear.13 Several studies have reported an association between the GRS and physical activity.14, 15, 16 Hence, further information about gene–environment interactions will promote the establishment of an effective prevention method against obesity.

Elucidating gene–environment interactions across a wider range of environmental factors enables the determination of individualized and detailed risks for the development of obesity. Such personalized prevention may reduce the risk of obesity more effectively than generalized prevention methods. Prospective cohort studies are a useful way to accurately determine gene–environment interactions, mainly because of reduced bias.17 Thus, prospective studies that precisely evaluate the risk of obesity are needed.

As part of the Yamagata Study (Takahata),18, 19, 20 the present study examined the risk of obesity and gene–environment interactions using the GRS. The GRS was calculated from known SNPs representing 29 loci associated with BMI,21 and was analyzed in relation to BMI at baseline and longitudinal changes in BMI, along with dietary intake, physical activity and smoking status.

Materials and methods

Study population

Takahata City is 300 km north of Tokyo, Japan; its population over 40 years of age was 15 244 in 2010. The Yamagata Study (Takahata) is a population-based cohort study of Japanese people over 40 years that aims to clarify risk factors for certain lifestyle-related diseases such as diabetes and obesity. The baseline survey of 3522 participants was conducted from 2004 to 2006.18, 19, 20 Among them, 2124 participants completed the follow-up survey in 2011, 5–7 years after the baseline survey. This study was approved by the Ethics Committee of the Yamagata University School of Medicine, and written informed consent was obtained from all participants.

Assessment of BMI and lifestyle factors

Weight and height were measured by an examiner and used to calculate BMI (kg m−2). BMI <18.5, 18.5–24.99 and ≥25 kg m−2 were classified as underweight, normal weight and overweight, respectively.22 The change in BMI (kg·m−2·year−1) of each participant was calculated by subtracting the BMI at baseline from that at the follow-up survey and dividing by the number of years of follow-up. Underweight or overweight participants at baseline were excluded when assessing the change in BMI. Other characteristics of the Yamagata Study (Takahata) have been described in detail elsewhere.18, 19, 20
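The BMI definitions above can be illustrated with a minimal sketch. The function names are hypothetical, and the sign of the change follows the Figure 1 legend (follow-up minus baseline, per year of follow-up); this is an illustration, not the code used in the study.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg m^-2."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Classify BMI with the cut-offs given in the text (18.5 and 25 kg m^-2)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    return "overweight"


def annual_bmi_change(bmi_baseline, bmi_followup, years_of_followup):
    """Change in BMI per year: (follow-up - baseline) / years of follow-up."""
    if years_of_followup <= 0:
        raise ValueError("years_of_followup must be positive")
    return (bmi_followup - bmi_baseline) / years_of_followup
```

For example, a participant whose BMI rose from 23.0 to 23.6 kg m−2 over 6 years would have an annual change of about 0.1 kg m−2 year−1.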

Characteristics besides BMI, such as the fasting plasma glucose and fasting serum insulin levels, were also analyzed. The homeostasis model assessment ratio (HOMA-R) was calculated from fasting plasma glucose and fasting serum insulin levels. It was only calculated for participants with fasting plasma glucose levels <140 mg dl−1 to improve precision.23 The Brinkman index was calculated as the number of cigarettes smoked per day multiplied by the number of smoking years.24
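As a hedged illustration of these two indices, the sketch below assumes the conventional HOMA formula (fasting glucose in mg dl−1 × fasting insulin in µU ml−1 / 405); the text does not state the formula or the insulin units, so both are assumptions. The function names are hypothetical.

```python
def homa_r(fasting_glucose_mg_dl, fasting_insulin_uU_ml, glucose_limit=140.0):
    """HOMA-R under the conventional formula (assumed, not stated in the text).

    Returns None when fasting glucose is at or above the limit, mirroring the
    restriction to participants with fasting plasma glucose <140 mg dl^-1.
    """
    if fasting_glucose_mg_dl >= glucose_limit:
        return None
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


def brinkman_index(cigarettes_per_day, smoking_years):
    """Brinkman index: cigarettes smoked per day x number of smoking years."""
    return cigarettes_per_day * smoking_years
```

Returning `None` rather than a number for excluded participants keeps them from silently entering downstream models as valid values.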

Daily nutritional intake status was assessed using the brief self-administered diet history questionnaire, which involves the recollection of dietary habits over 1 month.25 We evaluated the daily nutrition intake in grams. Physical activity status was assessed using the Japan Arteriosclerosis Longitudinal Study Physical Activity Questionnaire, which allows total and activity-specific energy to be quantified in metabolic equivalents-hours per day (METs-h day−1).26

Genotyping and imputation

A total of 1620 DNA samples were extracted from blood samples. SNP genotyping was performed using the Infinium 660 W BeadChip (Illumina, San Diego, CA, USA). SNPs with a minor allele frequency of <0.5% or a call rate of <95% were excluded. Genotype imputation was subsequently performed using the MACH-Admix program27 with 194 ASN (68 CHB, 25 CHS, 84 JPT and 17 MXL) reference genotyped data from the 1000 Genomes Project (released August 2010). The final set of genotyped or imputed SNPs known to be associated with BMI was drawn from 10 524 403 imputed autosomal markers in 1620 participants (Supplementary Figure S1).
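The SNP quality-control thresholds can be sketched as follows. This is a simplified illustration that reads the two exclusion criteria as alternatives (a SNP failing either threshold is dropped), which is an assumption about the authors' procedure. Genotypes are coded as counts of one allele (0, 1 or 2), with `None` for a failed call; the function names are hypothetical.

```python
def call_rate(genotypes):
    """Fraction of samples with a successful genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among successfully called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(genotypes, min_maf=0.005, min_call_rate=0.95):
    """Keep a SNP only if it meets both thresholds (MAF >= 0.5%, call rate >= 95%)."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)
```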

Genetic risk score

The GRS was calculated from the β coefficients of 29 SNPs reported by Lu et al.21 in an East Asian population using a previously reported weighting method, as our cohort participants were Japanese.21, 28 In brief, the GRS was calculated by multiplying the number of effect alleles (0, 1 or 2) at each locus by the β coefficient of that SNP obtained from the previous GWAS, summing those values, dividing by 1.78 (the maximum allowable sum of the β coefficients) and then multiplying by 58 (twice the number of SNPs, that is, the total number of alleles). Each point of this GRS therefore corresponded to one risk allele.
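The weighting scheme described above can be sketched in a few lines. This is an illustration under one stated assumption: the divisor (1.78 in the text) is taken to be the largest weighted sum a participant can attain, that is, twice the sum of the β coefficients, so that the score ranges from 0 to twice the number of SNPs. The function name and the example β values are hypothetical and are not the coefficients reported by Lu et al.

```python
def weighted_grs(effect_allele_counts, betas):
    """Weighted genetic risk score rescaled to the total number of alleles.

    effect_allele_counts: number of effect alleles (0, 1 or 2) per SNP.
    betas: per-SNP beta coefficients from a previous GWAS, in the same order.
    """
    if len(effect_allele_counts) != len(betas):
        raise ValueError("one allele count is required per beta coefficient")
    if any(count not in (0, 1, 2) for count in effect_allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    weighted_sum = sum(c * b for c, b in zip(effect_allele_counts, betas))
    # Assumption: the divisor is the maximum attainable weighted sum (2 x sum of betas).
    max_weighted_sum = 2 * sum(betas)
    total_alleles = 2 * len(betas)  # 58 for the 29 SNPs used in the study
    return weighted_sum / max_weighted_sum * total_alleles
```

With equal β coefficients the score reduces to a simple count of risk alleles, which is why each point can be read as one risk allele.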

Statistical analysis

The exact test of the Hardy–Weinberg equilibrium was performed for each SNP with the ‘hwexact’ function from the ‘hwde’ package in R.29 Multivariate linear regression models were constructed to estimate the effects of the GRS and lifestyle factors on BMI at baseline and on the change in BMI. The outcomes were baseline BMI or the change in BMI, and the explanatory variable was the GRS, with the following covariates: model 0: age, age² and sex; model 1: age, age², sex, METs-h day−1, energy intake and HOMA-R; and model 2: age, age², sex, METs-h day−1, Brinkman index, alcohol intake, carbohydrate intake, animal and vegetable fat intake, animal and vegetable protein intake, fiber intake and HOMA-R. Additionally, BMI at baseline was included as a covariate in regression model 3, which was analyzed only for the change in BMI. We calculated standardized β coefficients of the GRS, age, sex and lifestyle factors to compare the relative effects of these factors. Regression model 2 was also used to estimate the associations between BMI at baseline and lifestyle factors in subgroups stratified according to the tertiles of the GRS. In this analysis, the gene–environment interactions between the GRS and each lifestyle factor were tested for their effects on BMI by including the respective interaction terms in the models (for example, GRS (as a continuous variable) × energy intake). Because these models test the hypothesis that the GRS modifies the effects of the environmental factors, they are not adjusted for the GRS per se. The rationale is that the GRS is assumed not to act on BMI on its own: genetic risk itself does not increase BMI, whereas each individual’s intake of energy or nutrients does. All P-values are two-sided; the level of significance was set at P<0.05. Statistical analyses were performed with R software version 3.0.2 (R Foundation for Statistical Computing, Vienna, Austria).
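The structure of the interaction models can be illustrated with a small ordinary least-squares sketch. It is written in plain Python rather than R, solves the normal equations directly, and uses hypothetical names throughout; it is not the authors' analysis code. Following the description above, the design includes the lifestyle factor and the GRS × factor product but no separate GRS term.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(predictor_rows, outcome):
    """Least-squares coefficients; an intercept is prepended automatically."""
    design = [[1.0] + list(row) for row in predictor_rows]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * y for r, y in zip(design, outcome)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def interaction_design(grs, factor, covariate_rows):
    """Rows of [factor, GRS x factor, covariates...], without a GRS main effect."""
    return [[f, g * f] + list(c) for g, f, c in zip(grs, factor, covariate_rows)]
```

The coefficients returned by `ols(interaction_design(...), bmi_values)` are ordered as intercept, lifestyle factor, GRS × factor, then covariates; the third coefficient is the interaction estimate. The sketch returns point estimates only, so P-values would need standard errors that are not computed here.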

Results

Characteristics of the Yamagata Study (Takahata)

The study profile is shown in Figure 1. A total of 1620 participants with a median age of 62 years (range, 40–87 years) were analyzed; they included 726 (44.8%) men and 894 (55.2%) women with median ages of 63 years (range, 40–87 years) and 61 years (range, 40–83 years), respectively. The change in BMI was analyzed in 708 participants, including 324 (45.8%) men and 384 (54.2%) women. The mean (SD) BMI at baseline was 23.4 (3.1) kg m−2, and the mean annual change in BMI was −0.017 (0.223) kg m−2 year−1. Of 708 participants whose change in BMI was analyzed, 56 (7.9%) became overweight or obese and 23 (3.2%) became underweight. The mean GRS was 26.1 (3.9). The genotype distributions and allele frequencies of the SNPs are shown in Supplementary Table 1.

Figure 1

Flowchart of participants enrolled in the Yamagata Study (Takahata). ¹HOMA-R, homeostasis model assessment ratio. ²The change in body mass index (BMI) (kg·m−2·year−1) was calculated by subtracting the BMI at baseline from that at the follow-up survey, and dividing by the number of years of follow-up of each individual.

Effect of the GRS on BMI

The effect of the GRS on BMI at baseline per increment of one risk allele is shown in Table 1. In the regression model adjusted for age, age² and sex (model 0 in Table 1), BMI at baseline increased 0.12 kg m−2 per increment of one risk allele. The associations between the GRS and the BMI at baseline were significant in all models (P<0.001), indicating a direct association between the GRS and the BMI at baseline.

Table 1 Increase in BMI per increment of one risk allele

Effects of lifestyle factors

Table 2 shows the results of multivariate linear regression models used to compare the standardized β coefficients of the GRS, age and lifestyle factors. The multivariate linear regression model evaluating factors, including nutrition, showed that the GRS, age and HOMA-R were significantly associated with the BMI at baseline (model fit: adjusted R², 0.207, P<0.001).

Table 2 Standardized β coefficients of each factor in the multivariate linear regression models

Lifestyle factors and BMI according to the GRS tertile

Figure 2 shows the BMI at baseline associated with the increments of lifestyle factors in subgroups stratified according to the GRS tertile. Significant gene–environment interactions were observed for all six nutritional factors, METs-h day−1, HOMA-R (all P<0.001 for interaction) and alcohol intake (P=0.014; details are shown in Supplementary Table S2). In the highest GRS tertile, the BMI at baseline was significantly higher per increment of fiber intake (0.15 kg m−2 g−1; P=0.01). In contrast, in the same tertile, the BMI at baseline was significantly lower per increment of vegetable fat intake (−0.05 kg m−2 g−1; P=0.04) and animal protein intake (−0.05 kg m−2 g−1; P=0.005). In particular, each additional 1 g of fiber intake was associated with a difference in BMI at baseline of 0.01 kg m−2 (95% confidence interval (CI): −0.08, 0.10), −0.02 kg m−2 (95% CI: −0.10, 0.07) and 0.15 kg m−2 (95% CI: 0.04, 0.26) for the first, second and third GRS tertiles, respectively.
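Stratification by GRS tertile can be sketched as below. The text does not say how the tertile boundaries were computed, so the cut points here (from Python's `statistics.quantiles` with its default method) are an assumption, and the function names are hypothetical.

```python
import statistics


def assign_tertiles(scores):
    """Return 1, 2 or 3 for each score according to the sample tertile cut points."""
    cut1, cut2 = statistics.quantiles(scores, n=3)
    return [1 if s <= cut1 else 2 if s <= cut2 else 3 for s in scores]


def group_by_tertile(scores, records):
    """Split per-participant records into three subgroups by score tertile."""
    groups = {1: [], 2: [], 3: []}
    for tertile, record in zip(assign_tertiles(scores), records):
        groups[tertile].append(record)
    return groups
```

Each subgroup returned by `group_by_tertile` could then be analyzed separately, as in the stratified estimates reported for the first, second and third tertiles.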

Figure 2

BMI at baseline associated with each lifestyle factor according to the tertiles of GRS. GRS, genetic risk score. Data are effect sizes (β coefficients (95% confidence intervals)) of the increments of lifestyle factors on body mass index (BMI) stratified according to the GRS tertile. The median scores in the first (T1), second (T2) and third (T3) tertiles were 21.8 (range, 14.5–24.3; n=503), 26.0 (range, 24.3–27.7; n=511) and 30.4 (range, 27.8–42.3; n=495), respectively. Data were adjusted for age; age²; sex; metabolic equivalents (METs)-h day−1; the Brinkman index; the homeostasis model assessment ratio (HOMA-R); and alcohol, carbohydrate, animal fat, vegetable fat, animal protein, vegetable protein and fiber intake. P-values are for interactions. Bars indicate 95% confidence intervals. When the analysis was further adjusted for GRS, P-values for each calculation were as follows: carbohydrate intake, P=0.206; animal protein intake, P=0.265; vegetable protein intake, P=0.108; animal fat intake, P=0.094; vegetable fat intake, P=0.130; fiber intake, P=0.041; alcohol intake, P=0.536; METs-h day−1, P=0.042; Brinkman index, P=0.608; HOMA-R, P=0.230.

Changes in BMI according to longitudinal observations

The effect of the GRS on the change in BMI per increment of one risk allele was −0.001 (model 0 in Table 1). The GRS was not significantly associated with the change in BMI from baseline in any model (models 0–3 in Tables 1 and 2). The multivariate linear regression model evaluating factors, including nutrition (models 2 and 3 in Table 2), showed that only the BMI at baseline was significantly associated with a change in BMI (standardized β coefficient, −0.134, P=0.001); the adjusted R² for the model fit was 0.042 (P<0.001). The findings regarding the change in BMI did not differ after including underweight and overweight participants at baseline.

Discussion

Because we corroborated the effect of gene–environment interactions on the risk of obesity, this study demonstrated the feasibility of applying the results of previous GWAS to preventive medicine. The GRS of participants of the Yamagata Study (Takahata) was associated with obesity, which is consistent with the GRS from previous large-scale GWAS.9, 10, 30, 31

The present study indicates that genomic information from previous large-scale GWAS can be used to evaluate the genetic risk of obesity. This approach is inexpensive, as it involves typing only approximately 20–30 SNPs. Moreover, it enables the assessment of gene–environment interactions. Because GWAS typically use case–control designs and do not take environmental factors into account,32 it is difficult to assess the effects of environmental factors on disease development from GWAS alone.33 Gene–environment interactions can instead be assessed in the same manner as in the present study. Gene–environment interactions have an important role in the etiology of common diseases.8, 34 Thus, they would have a substantial role in the development of personalized preventive medicine for common diseases.

The results of this study advance the possibility of using the GRS in the practice of preventive medicine. Incorporating gene–environment interactions into preventive medicine involves considering how genetic factors modify the effects of lifestyle factors on obesity in interventions.34, 35 In the present study, there were significant interactions between the GRS and several lifestyle factors. Furthermore, the difference in BMI at baseline associated with incremental increases in lifestyle factors varied among GRS tertiles; participants in some of the GRS tertiles exhibited significant differences in BMI at baseline, suggesting that the effects of intervention may differ with respect to the GRS. For example, in the third GRS tertile, BMI at baseline was significantly higher per gram increase in fiber intake, whereas it was lower per gram increase in vegetable fat and animal protein intake. In the first GRS tertile, a higher amount of smoking was associated with a lower BMI at baseline, and high carbohydrate intake and a sedentary lifestyle tended to be associated with a higher BMI at baseline.

The present findings will aid in the development and administration of personalized preventive medicine because the GRS enables the selection of optimal lifestyle factors that are expected to strongly influence obesity as intervention targets. This also improves the efficiency of the use of health-care resources, as specific lifestyle factors can be targeted in contrast to generalized comprehensive interventions. Although our results demonstrate interactions between genetic and environmental factors, they do not confirm causal relationships. Notably, lifestyle data were obtained only at baseline. Therefore, we emphasize the need for additional studies before our results can be put to practical use. In particular, our data regarding dietary fiber intake are inconsistent with those of previous studies, which indicate that fiber intake is associated with a decrease in weight.36, 37, 38 The BMI at baseline was directly associated with the GRS, suggesting that participants with a higher GRS had a higher BMI; thus, participants with a higher GRS may be more likely to receive interventions, resulting in a higher dietary fiber intake. Another possible reason for this discrepancy is that dietary fiber intake was insufficient to control weight among participants with a higher GRS. Furthermore, although there are several types of fiber, they were not differentiated in the analysis. Finally, there may be unknown confounding factors in foods containing fiber. These explanations may also account for other discrepancies between the present and previous studies, including those with respect to vegetable fat and animal protein intake. Furthermore, we could not replicate previous results with respect to physical activity.14, 15 Randomized intervention studies according to genetic predisposition must be conducted to resolve these discrepancies and clarify gene–environment interactions.

The present study revealed an association between the GRS and the BMI at baseline as well as the effects of gene–environment interactions on obesity. In contrast, there was no significant relationship between the GRS and the changes in BMI according to longitudinal observations. This can be explained as follows. First, the GRS may be an insufficient measure of the risk of changes in BMI. In previous GWAS, the cross-sectional baseline BMI was the dependent variable used to calculate the GRS; hence, the GRS reflects the genetic risk of cross-sectional BMI, not changes in BMI. However, the genetic risk of changes in BMI would be more relevant to the field of preventive medicine. If GWAS were conducted to analyze changes in BMI using longitudinal data, particularly data from adolescents, it may be possible to detect new loci other than those described herein. The existence of such loci would indicate a biological pathway responsible for changes in BMI that is distinct from the currently recognized pathways responsible for cross-sectional or acquired BMI. This is further supported by the fact that an increase in BMI induces epigenetic changes.39 Nevertheless, additional research is required to clarify this aspect. Second, the change in BMI was small, as only adults were included. Increases in BMI and the effects of obesity susceptibility genes are greater in adolescence than in adulthood, and peak BMI occurs at approximately 55 and 60 years in men and women, respectively.2, 40, 41 All participants of the Yamagata Study (Takahata) were older than 40 years (median, 62 years). Therefore, the change in BMI was negligible; the mean (SD) annual change in BMI among all participants was −0.017 (0.223) kg m−2 year−1. Third, this study recruited participants who opted to undergo health examinations. Therefore, the increase in BMI may have been skewed as a result of health-care interventions prompted by the examination. Nevertheless, the main limitation of assessing changes in BMI remains the lack of statistical power due to a small sample size.

The major strength of this study is that we evaluated gene–environment interactions with respect to the risk of obesity onset using more environmental factors than previous studies. This also afforded useful findings for the development of personalized preventive medicine. However, the statistical power of the analysis was insufficient owing to the small number of subjects, as only 1620 individuals participated. The 95% CIs in the figures overlapped among GRS tertiles, even though some tertiles exhibited significant increases in BMI (Figure 2). Thus, a larger sample size would provide clearer and more significant findings, especially regarding factors with significant P-values for interactions. Limitations with respect to the small sample size can be overcome by combining several ongoing large-scale genomic cohort studies in Japan,42, 43, 44 including our own,42 which together comprise over 100 000 subjects. There are additional issues that warrant attention. First, although validated methods were used to assess nutrition and physical activity, they were self-reported.25, 26 Therefore, self-reporting bias may have affected the results. Second, unrecognized variables may have more substantial effects on obesity. The results indicate that the GRS explains no more than 2% of the variation in the BMI at baseline. Even after including age, sex, lifestyle factors and HOMA-R as covariates, the results only explain approximately 30% of the observed variation. The residual variation may be explained by unrecognized factors, specifically unknown rare variants45 that have large effect sizes as well as geological, climatic and cultural factors. Cultural factors and socioeconomic status shape obesogenic environments.46 Other factors of obesogenic environments include an urban or rural residence, residential or commercial surroundings, distance to grocery stores, accessibility to public transportation and frequency of automobile use, among others. These variables should be explored in future studies.

In conclusion, our study showed that evaluating gene–environment interactions using known genetic information can be applied to personalized preventive medicine. Future studies will be aimed at discovering new obesity susceptibility loci and their SNPs, including rare variants, and consequently unveiling genetic architecture. Such investigations will elucidate new biological pathways, especially pathways affecting changes in BMI, and increase the accuracy of the GRS. The findings of these studies, combined, will establish practical personalized preventive medicine.