Conditional and unconditional genome-wide association study reveal complicate genetic architecture of human body weight and impacts of smoking

To reveal the impacts of smoking on genetic architecture of human body weight, we conducted a genome-wide association study on 5,336 subjects in four ethnic populations from MESA (The Multi-Ethnic Study of Atherosclerosis) data. A full genetic model was applied to association mapping for analyzing genetic effects of additive, dominance, epistasis, and their ethnicity-specific effects. Both the unconditional model (base) and conditional model including smoking as a cofactor were investigated. There were 10 SNPs involved in 96 significant genetic effects detected by the base model, which accounted for a high heritability (61.78%). Gene ontology analysis revealed that a number of genetic factors are related to the metabolic pathway of benzopyrene, a main compound in cigarettes. Smoking may play important roles in genetic effects of dominance, dominance-related epistasis, and gene-ethnicity interactions on human body weight. Gene effect prediction shows that the genetic effects of smoking cessation on body weight vary from different populations.

Scientific RepoRtS | (2020) 10:12136 | https://doi.org/10.1038/s41598-020-68935-x www.nature.com/scientificreports/ was set as a covariant by conducting conditional association mapping 12 in the conditional model (WT|SMK). For analyzing genetic models, estimating genetic effects, and improving test efficiency at reasonable computational costs, a mixed linear model approach was used 13 . The findings will provide useful information for tailored prevention and precision treatment.

Materials and methods
Study sample. The Multi-Ethnic Study of Atherosclerosis (MESA) is a study of the characteristics of cardiovascular disease and its physiological and social psychological risk factors 14 . Several thousand participants recruited form six US communities had physical examinations, and the genotype data was obtained from gene sequencing. 5,336 individuals passed the preliminary screening (repeat of two years with missing data, contained 9,984 observations). There are 38% of European-American (E-A population), 28% of African-American (A-A population), 22% of Hispanic-American (H-A population), 12% of Chinese-American (C-A population). The gender is basically evenly distributed in each ethnic population. The MESA dataset was obtained from the dbGaP (the database of Genotypes and Phenotypes) of US National Center for Biotechnology Information website (https ://www.ncbi.nlm.nih.gov/gap, accession number: phs000209.v11.p3).
Genotype and quality control. Before the formal study began, we performed quality control on the original genotype data. Single nucleotide polymorphism (SNP) with call rate less than 95% was deleted. Polymorphic at least in one ethnic group monomorphic SNPs in each ethnic group were deleted. Due to the allelic frequency differences between ethnic groups, no SNPs were filtered based on the minor allele frequency. The uniform heterozygosity distribution is limited to the range of 0-53%, and SNPs with heterozygosity larger than 53% were removed. And we also filtered the SNP with Hardy weinberg equilibrium (HWE) P-value larger than 1 × 10 -4 . Finally, 866,435 SNPs from 22 autosomes met the quality control standards and were used for our analysis.
phenotype and the cofactor. The age, gender, ethnicity, and other basic information including smoking habits of each study participant were assessed in the form of a set of self-reporting questionnaires. Physiological index like body weight and height was obtained by physical examinations, which were conducted twice, the second time was two years later than the first time. We took body weight (the variable name is "wtlb1", the values were recorded in pounds) as the main object. Smoking (the variable name is "pkyrslc", which means the average number of packets of cigarettes someone smoked per year) was chosen in this study as covariant.
Statistical analysis. In this study, mixed linear model approaches were used for genome-wide association studies. The full model includes additive and dominance effects of singular SNPs (a, d), epistasis effects between pairs of SNPs (aa, ad, da, dd), and ethnicity-specific effects (e, ae, de, aae, ade, dae, dde). The phenotypic values of the k-th individual in the h-th ethnic population ( y hk ) can be expressed as where µ is the population mean; c hk is the fixed cofactor effects; s hk is the fixed gender effects; a i is the additive effect of the i-th locus with coefficient u A ik ; d i is the dominance effect of the i-th locus with coefficient u D ik ; aa ij , ad ij , da ij and dd ij are the digenic epistasis effects with coefficients u AA ijk , u AD ijk , u DA ijk and u DD ijk ; e h is the effect of the h-th ethnic population (1 for E-A, 2 for C-A, 3 for A-A, 4 for H-A); ae ih is the additive × ethnicity interaction effect of the i-th locus in the h-th ethnic population with coefficient u AE ihk ; de ih is the dominance × ethnicity interaction effect of the i-th locus in the h-th ethnic population with coefficient u DE ihk ; aae ijh , ade ijh , dae ijh and dde ijh are the ethnicity-specific epistasis effects in the h-th ethnic population with coefficient u AAE ijhk , u ADE ijhk , u DAE ijhk and u DDE ijhk ; and ε hk is the residual effect of the k-th individual in the h-th ethnic population.
In our mixed linear model, we take ethnicity-specific effects into consideration, instead of detecting them separately in each ethnic population. The general effects (a, d, aa, ad, da, dd) are effects stably expressed in each ethnic population, while the ethnicity-specific effects are specifically expressed in a certain population.
Conditional genetic model is an effective method to reveal conditional effects excluding interference factors 15 . If we add smoking variables in the mixed linear model as cofactor, it could reveal net genetic effects after removing influence of smoking.
To reduce the computational burden in mixed model-based GWAS analysis, we first used GMDR (Generalized Multifactor Dimensionality Reduction) 16 method to scan 866,435 SNPs of 9,984 observations in one dimension (single SNP), two dimensions (two SNPs epistasis), and three dimensions (three SNPs epistasis), and a set of 807 candidate SNPs were obtained. In the second step, a GPU parallel computing software QTXNetwork was used to run the mixed linear model on 807 candidate SNPs. The estimated genetic effects and the standard errors (SE) of detected SNPs were obtained by Monte Carlo Markov Chain method with 20,000 Gibbs sampler iterations. The experiment-wise P values (P EW -value) for controlling the experiment-wise type I error (P EW < 0.05) were calculated by 2,000 permutation tests 17 .  20 to build a database of biomedical concepts. In this research, we use BiopubInfo to search genetic networks between weight-related genes with chemicals, diseases, and gene-gene interactions.
Estimation of genetic effects. As is mentioned above, mixed linear model can estimate genetic effects with their standard errors 12 . For a person of a certain ethnicity, we calculate the sum of the general effects and the ethnicity-specific effect of this ethnicity of all the weight-related SNPs to obtain the estimated value of his body weight. For simplicity and intuitively, we showed the estimation data, a 5,336 × 15 matrix, in the form of figure.
We compared estimated genetic effects of base model (WT) to conditional model (WT|SMK), and revealed impacts of smoking on gene effects of body weight.

Results
Detected Snps. Eleven SNPs were identified significantly by two models (seven SNPs in the base model WT and nine SNPs in the conditional model WT|SMK). Highly significant SNPs at the experiment-wise level (P EW < 10 -5 ) are presented in Fig. 1. We use different shapes (circle and square) and colors (green, red, blue and black) to show different kinds of genetic effects of SNPs. The estimated genetic effects (represented by the various colors and shapes in Fig. 1) and heritabilities of two models are summarized in Tables 1, 2   . Genetic architecture of highly significant SNPs at the experiment-wise level (P EW < 10 -5 ) for body weight in two models. WT = the base model with no cofactor (left box). WT|SMK = the conditional model on smoking (right box). The left axis is the SNP IDs (Chromosome_SNP_Allele), and each line represents a SNP. The different shape symbols on each line represent the different effects of that SNP: Circle = additive effect; Square = dominance effect; Black color = only epistasis effects (not include single effects); Red color = general effects for four ethnic groups; Green color = ethnicity-specific effects; Blue color = both general and ethnicityspecific effects; Lines = epistasis effect of two SNPs, and the color of the lines have the same meaning as above.  Genetic effects of SNPs related to human body weight. There were 32 highly significant genetic effects (P EW -value < 1 × 10 −5 ) of human body weight detected in the base model (Table 2). Among all the genetic effects, nearly half of them are ethnicity-specific effects, but C-A population has fewer (2/18) ethnicity-specific effects, suggesting that weight-related genes have minor specific effects in Chinese-American. Table 2. Estimated genetic effects of highly significant loci for body weight in the base and conditional models. Model: Smoke = model setting smoking as a cofactor. √: effect caused by smoking. ×: effect not affected by smoking. Genetic effect: a = additive effect, d = dominance effect, aa = additive × additive epistasis effect, ad = additive × dominance epistasis effect, da = dominance × additive epistasis effect, dd = dominance × dominance epistasis effect, ae 2 = C-A specific additive effect, ae 3 = A-A specific additive effect, ae 4 = H-A specific additive effect, de 1 = E-A specific dominance effect, de 3 = A-A specific dominance effect, de 4 = H-A specific dominance effect, aae 1 = E-A specific additive × additive epistasis effect, aae 2 = C-A specific additive × additive epistasis effect, aae 3 = A-A specific additive × additive epistasis effect, aae 4 = H-A specific additive × additive epistasis effect, ade 1 = E-A specific additive × dominance epistasis effect, ade 3 = A-A specific additive × dominance epistasis effect, dae 1 = E-A specific dominance × additive epistasis effect. − LogP EW = minus log 10 (experiment-wise P-value). h 2 = heritability.  Table 2). The homozygote genotype of rs9639575 has additive effect ( a ∧ = -5.12 for T/T and 5.12 for G/G), which reveals that major frequency homozygous (T/T) has negative additive effect, but minor frequency homozygous (G/G) has positive additive effect. The heterozygote genotype T/C of rs11684785 has a positive dominance effect ( d ∧ = 6.26) and three negative epistasis effects with interaction to homozygote genotype T/T and heterozygote genotype T/C of rs620175 ( da ∧ = -4.02 for T/C × T/T, but 4.02 for T/C × C/C; dae 1 ∧ = -6.92 for T/C × T/T, but 6.92 for T/C × C/C. dd ∧ = -7.74). One dominance effect and three ethnicity-specific dominance effects of the heterozygote genotype G/A of rs2504934 will increase body weight by 4.22 lb, but decrease body weight by -7.59 lb in E-A population and -13.67 in H-A population, increase body weight by 22.27 lb in A-A population, respectively. Interestingly, this SNP not only has the largest positive effect but also the smallest negative effect among all genetic effects with highly significance ( de 3 ∧ = 22.27, P EW -value < 1 × 10 −47 ; de 4 ∧ = -13.67, P EW -value < 1 × 10 −32 ). This means SNPs can have totally different effects in different ethnic populations. The homozygote genotype G/G of this SNP also has five different epistasis effects with interaction to rs620175 in the SDK1 gene (additive × additive effect and additive × additive epistasis effects in all four populations, increases body weight in E-A and H-A populations, decreases body weight in C-A and A-A population), indicating that this epistasis regulates body weight through different metabolic networks in different populations and has widespread effects in different populations. These results show that SNP rs2504934 in the SLC22A3 gene has major effects on body weight, and it is not affected by smoking.
Genetic effects of rs620175 in the SDK1 gene have large parts of total effects (11/21). In addition to single effects ( ae 3 ∧ = 8.74 for T/T but -8.74 for C/C, ae 4 ∧ = -5.68 for T/T but 5.68 for C/C), SNP rs620175 also has many epistasis effects with interaction to three SNPs (rs11684785 in the TMEM163 gene, rs2504934 in the SLC22A3 gene, and rs1277840 in the CACNB2 gene). Epistasis rs620175 × rs1277840 has positive dominance × dominance Table 3. Estimated genetic effects of highly significant loci for body weight only in conditional model. Genetic effect: a = additive effect, d = dominance effect, aa = additive × additive epistasis effect, ad = additive × dominance epistasis effect, da = dominance × additive epistasis effect, dd = dominance × dominance epistasis effect, aae 1 = E-A specific additive × additive epistasis effect, ade 1 = E-A specific additive × dominance epistasis effect, ade 2 = C-A specific additive × dominance epistasis effect, ade 3 = A-A specific additive × dominance epistasis effect, ade 4 = H-A specific additive × dominance epistasis effect, dae 1 = E-A specific dominance × additive epistasis effect, dae 3 = A-A specific dominance × additive epistasis effect, dae 4 = H-A specific dominance × additive epistasis effect. − LogP EW = minus log 10 (experiment-wise P-value). h 2 = heritability.  Genetic effects of SNPs caused by smoking. Eleven highly significant genetic effects were detected due to smoking ( Table 2). All five single effects and ethnicity-specific effects in E-A population are positive, but ethnicity-specific effects in other three ethnic populations are negative. This indicates that smoking not only has overall effects of increasing body weight in all populations, but also can increase weight in E-A population and decrease weight in C-A, A-A, and H-A populations.
Compared to effects of SNPs not affected by smoking, genetic effects caused by smoking are smaller. The largest effect is 6.94 with heritability 1.74% of rs17094894 near the VRK1 gene. This shows that effect of smoking on weight-related genes is not so significant as compared to their own effects on body weight. This SNP also has ethnicity-specific additive × dominance epistasis effect with interaction to rs11684785 in the TMEM163 gene in E-A population ( dae 1 ∧ = 4.51 for T/C × C/C and -4.51 for T/C × T/T), indicating that this SNP is a key factor in the mechanism of weight gain contributed by smoking in all populations, but is also a smoking-related factor in the mechanism of weight gain in E-A population. SNP rs620175 has many additive-related effects, including ethnicity-specific additive effects in A-A population and three epistasis effects with rs2504934, rs9639575, and rs1277840. The only additive effect due to homozygote genotype A/A of rs1415210 in the CRB1 gene is highly significant ( a Snps suppressed by smoking. There were seven highly significant SNPs (P EW -value < 1 × 10 −5 ) detected only in the conditional model with 25 genetic effects (Table 3).
SNP rs1467194 could only be detected in the conditional model; whereas SNP rs11684785 could not be detected in this model. However, these two SNPs are located close to each other in the same gene TMEM163. The homozygote genotype of rs1467194 takes part in two epistasis effects: additive × additive effect with interaction to rs9639575 ( aa ∧

for G/G × C/T but -8.for A/A × C/T).
The dominance × additive effect ( dae 3 ∧ = -11.76 for G/A × T/T but 11.76 for G/A × C/C) of rs2504934 × rs620175 is the negatively largest effect suppressed by smoking with the heritability of 4.42%, which contributes to a large part of the total heritability of epistasis effects in the conditional model. SNP rs2656825 was not detected in the base model, but has highly significant additive effect ( d ∧ = 9.06, P EW -value < 1 × 10 −64 ) in the conditional model. Apart from this single effect, six epistasis effects were detected due to interactions with three SNPs (rs620175, rs1277840 and rs2145965), implying that this SNP is involved in body weight regulation and could be suppressed by smoking.
Smoking also suppresses the ways in which SNPs regulates the body weight via other two important SNPs (rs1415210 and rs2145965). SNP rs1415210 has epistasis effects with rs17094894 ( aa Gene ontology analysis. Using BioPubInfo (https ://ibi.zju.edu.cn/biopu binfo /), we also assessed the structural and functional connectivity for the detected candidate genes. Figure 2a shows the network of two SNP-related genes (CREB5 and SLC22A3) not caused or suppressed by smoking. The CREB5 gene is connected to the SLC22A3 gene via three kinds of chemicals (estradiol, polyestradiol phosphate, and estradiol sulfate), which are all related to estrogen. Gonadal hormones can effectively control body weight, the activation of estradiol will affect the body's homeostasis for a long time in female animals, it has similar effects in humans 21 . Figure 2b shows three SNP-related genes with the largest effects caused by smoking (CRB1, SDK1, SLC22A3). The SDK1 gene is connected to the CRB1 gene through valproic acid and to the SLC22A3 gene via tetrachlorodibenzo-p-dioxin (TCDD). These two chemicals were both proved to be related to body weight. A common adverse reaction of valproic acid is weight gain 22 . TCDD is a main compound in cigarettes, and animals treated with TCDD will develop overeating and become obese 23 . SNP rs2504934 in the SLC22A3 gene has seven kinds of epistasis effects with interaction to rs620175 in SDK1 gene in all populations (Table 3), indicating that this epistasis is important in body weight regulation, and that these effects may be caused by smoking through TCDD. From gene ontology analysis, the relationship between these three genes became clearer for smoking and body weight. Figure 2c shows three SNP-related genes (TMEM163, VRK1, SLC22A3) with the largest effects suppressed by smoking. All the three genes are connected to each other by benzopyrene. Long-term exposure to benzopyrene may cause obesity 24 , and overweight/obesity can be experimentally induced by benzopyrene 25 , which is a main compound in cigarettes 26 , and all these three genes have effects contributed or suppressed by smoking. This gene ontology plot proves the reliability of our results in some ways. Figure 2d shows other three key genes (CACNB2, VRK1, CREB5) that have interesting stories about body weight. The VRK1 and CREB5 genes are connected via 7,8-dihydro-7,8-dihydroxybenzo(a)pyrene-9,10-oxide, which is an analogue of benzopyrene. Cyclosporine is connected to genes CACNB2, VRK1 and Scientific RepoRtS | (2020) 10:12136 | https://doi.org/10.1038/s41598-020-68935-x www.nature.com/scientificreports/ CREB5. Cyclosporine, which is widely used in organ transplantation, was found to cause weight gain after transplantation 27 . The CREB5 gene and the VRK1 gene are connected through the JUN gene and the ATF2 gene. CRE (cAMP response element) can bind to the product of the CREB5 gene with c-Jun (another name of the JUN gene). The protein encoded by the ATF2 gene is closely related to CRE and c-Jun. An essay clearly described the regulatory principles and the complex metabolic interaction of CRE, c-Jun, the ATF2 gene, and the VRK1 gene 28 . This network may be a way of controlling body weight and the reasons are as follows. CREB (CRE binding protein) and ATF2 could affect the expression of UCP1, a protein essential for the thermogenic function of brown adipose tissue. The brown adipose tissue is related with heat production in human body, and there is a weight loss therapy for the purpose of activating the tissue 29 . Thus, we may have discovered a novel mechanism involving CRE of these four genes that has impacts on body weight and weight-related diseases through a complex metabolic network, the specific reason remained to be revealed.
Gene effect prediction. There were 15 SNP/epistasis detected in the base model and 22 SNP/epistasis detected in the conditional model. All the 15 SNP/epistasis were also detected in the conditional model. The patterns of gene effects for 15 SNP/epistasis in the conditional model are almost identical with those in the base model (some new effects were detected when setting smoking as a covariant, indicating that these effects are suppressed by smoking). This shows the stability and reasonableness of our conditional method. Although some effects of the 15 SNP/epistasis were not detected in the conditional model (some effects disappeared when setting smoking as a covariant, indicating that these effects are caused by smoking), these effects have little effects on the population and are not large in number. Therefore, we believed that the overall impact of smoking on body weight is produced by suppressing the expression of certain weight-related SNP/epistasis. In Fig. 3, we present genetic effects of the rest seven SNP/epistasis in the conditional model, which are suppressed by smoking. We arranged individuals of same population together, did clustering on the effects in each population, and then visualized the result by R software.
The most important result of Fig. 3 is that smoking has different effects on different populations. The effects of these seven SNP/epistasis were smaller in effect size and less in effects number compared to other three ethnic populations (represented in Fig. 3 as the region of C-A population has large area of blank or light color), showing that smoking has limited effects on C-A population. However, A-A population has large area of dark color, and the color is the darkest in all effects. From the point of single SNP/epistasis, rs620175 × rs2656825 has negative effects (purple color) on some people in A-A population but not in other populations; rs1277840 × rs2656825 has positive effects (orange color) on some people in H-A population but not in other populations. Some other SNP/epistasis have the same phenomenon. www.nature.com/scientificreports/ The first column is the sum of all columns, from which we can infer when people quit smoking, he or she will gain or lose how much body weight. In E-A and H-A populations, a small number of people will gain weight heavily (people have red color in the first column), but also a small number of people will lose weight heavily (people have blue color in the first column), and major part of people will have slight weight change. In C-A population, smoking cessation has limited effects on body weight, but in A-A population, a majority of people will lose weight when quitting smoking.

Discussion
Body weight is a typical complex trait in human that is affected by genetic and environment factors as well as their interactions. In our study, we assessed the genetic effects of individual genes, epistasis effects between pairs of genes, and ethnic-specific interaction effects by mixed linear model approaches. We identified eleven significant SNPs for body weight with P EW -value less than 1 × 10 -5 , and seven of them were found to be within the introns of identified genes. Among all the genetic loci detected, one gene (CRB1) has been reported by recent GWAS studies 30 , and three genes (TMEM163, SLC22A3, CACNB2) have been shown to have associations with human body weight.
Gene ontology analysis indicates that there may be new mechanism involving cAMP response element, which may affect body weight. Also, three chemicals (cyclosprine, estradiol, and benzopyrene) have a vast number of associations with genes detected through our gene ontology analysis, which have been reported to be related to body weight.
Smoking behavior as well as body-weight both may have correlative association with some SNPs, but direct biological evidences are insufficient. We have conducted a conditional GWAS to detect the impact of smoking on body weight by comparing the different results between two models (WT and WT|SMK), rather than directly studying the interaction with them. GWAS is hypothesis-free and does not require prior knowledge of a SNP's biological impact on body weight.
Setting the first few principal components of genetic relationship matrix as covariates in the model is a common method for controlling population stratification in GWAS. The ethnicity-specific effect in our model is not a covariant, but an effect with estimate value, standard error, P-value and heritability. The population stratification www.nature.com/scientificreports/ due to different ethnic populations can be effectively controlled by ethnicity-specific effect, and there is no need to set principal components in the model as covariates. Conditional genetic model is an effective method to reveal conditional effects excluding interference factors. When we added smoking variables in the mixed linear model as cofactor, it could reveal net genetic effects after removing influence of smoking. In this way, we can detect SNPs associated with body weight which are affected by smoking. It is worth mentioning that, a SNP will not be detected in the conditional model if it affect body weight only through smoking.
Gene effect prediction shows that smoking has different impacts on different human populations. The genetic effects of smoking cessation on body weight vary from different populations. Body weight of C-A population is not sensitive to smoking, while A-A population has opposite situation. In E-A and H-A populations, people can be divided into several subgroups, and precision medicine and intervention of living habits can be applied on each subgroup for smoking people to keep good fit.
Comparing with low heritability (less than 10%) of weight-related genes estimated by most of the previous GWASs 2,3 , the total heritability of human body weight in this study is up to 61.78% (base model). Majority of the heritability is contribute by ethnicity-specific epistasis h 2 The heritability of epistasis × ethnicity effects accounts for the largest proportion of total heritability and more than half of the effects are ethnicity-specific effects. The same singular gene could have different impacts on different ethnic groups, therefore ethnicity-specific effects should be considered when designing personalized medicine therapies for weight related diseases.
Over the past decade, dozens of genes associated with body weight have been identified, but biologists have not confirmed most of them. A highly significant gene associated with human body weight detected in our study have been found to be significant in the previous GWAS studies on traits related with body weight. One GWAS assessed the relationship between the CRB1 gene and human body weight in Wight American 30 . The SNP rs1415210 in the first intron of CRB1 gene has single dominance effects and also epistasis effects with interaction to many other SNPs (rs7006789, rs2504934, rs17094894), indicating that rs1415210 regulates body weight through different metabolic networks. This gene plays a role in photoreceptor morphogenesis in the retina and may maintain cell polarization and adhesion. However, no evidence has been revealed to support the relationship between this gene and weight.
In this study, we found eight new significant SNPs associated with human body weight. The most important SNP is rs2504934. This SNP not only has the largest positive effect, but also has the largest negative effects that are highly significant ( de 3 ∧ = 22.27, P EW -value < 1 × 10 −47 ; de 4 ∧ = -13.67, P EW -value < 1 × 10 −32 ), indicating that this SNP is a key factor on body weight regulation. SNP rs2504934 of the first intron of gene SLC22A3 was detected to be highly significant in both base and conditional models, and has large positive effects. This gene is one of three similar cation transporter genes located in a cluster on chromosome 6, and this cluster was identified as a strong susceptibility locus for coronary artery disease through a genome-wide haplotype association study 31 . Coronary artery disease is one of the obesity-related diseases 32,33 .
SNP rs11684785 and SNP rs1467194 are located in the fourth intron of the TMEM163 gene. This gene encodes a protein of the same name that is associated with cellular zinc homeostasis 34 . Zinc is of great importance on metabolism, and seriously affects weight and the incidence of type II diabetes 35,36 . In a GWAS of Indian population for type II diabetes, TMEM163 is the most significant gene associated with type II diabetes 37 .
SNP rs17094894 is near the VRK1 gene, a gene has been identified to be associated with childhood obesity 38 . SNP rs17094894 has the largest effect with high heritability among effects caused by smoking. In a GWAS that studied the growth of boars, researchers found the association between the VRK1 gene and weight of boars, and this gene also related to cell growth and division 39 .
SNP rs620175 in the SDK1 gene has most effects among all SNPs. It has epistasis effects with interactions to other five SNPs, indicating that this gene regulates body weight by lots of mechanisms. Food conversation, which is closely connected to weight growth, is partly contributed by extracellular matrix and cell adhesion protein as SDK1 40 .
SNP rs1277840 is in the fourth intron of the CACNB2 gene, and most of the effects involving rs1277840 are general effects, indicating that this gene can be expressed stably in all the ethnic groups. This gene is significantly associated with obesity-related diseases involving blood pressure 41,42 . A study on agouti showed that agouti regulates fatty acid synthase and fat storage via calcium channel, which is related to the protein encoded by this gene 43 . This gene was also found in human, and this illustrated that this gene may regulate human body weight via similar ways. In addition, increasing adipocyte intracellular calcium resulted in fat production, and increasing dietary calcium suppressed adipocyte intracellular calcium and thereby modulated energy metabolism and attenuates obesity risk 44 .
Fewer genetic effects were identified in C-A population, compared to the other populations. One possible reason is that the sample population consisted of only 12% Asian Americans. Perhaps a larger sample might help reveal more genetic structure of C-A population. Another problem is that human body weight may be affected by rare alleles, but we used a relatively conservative threshold in the first screen of SNPs to ensure the confidence; some rare alleles with large effects may be filtered out because of low P-value. Nonetheless, we still obtained a high total heritability and detected a number of SNPs in four populations with smoking as a covariant. Our results provide new insight in dissecting and understanding the sophisticated genetic architecture of human body weight, which will be useful for designing personalized medicine therapies for complex diseases.

Data availability
Raw data is available on request.