Introduction

Obesity in childhood and adolescence can lead to lifelong health complications. Obese and overweight adolescents are more likely to have respiratory, sleep-related, behavioral and mental health problems1. Childhood obesity can also persist into adulthood, making such individuals vulnerable to diseases such as cancer, arthritis, coronary heart disease and chronic metabolic diseases such as type 2 diabetes2,3. In 2015, there were 42 million overweight children under 5 years of age worldwide4. By contributing to the growing prevalence of associated complications, childhood obesity is placing an increasing burden on global public health systems5. Hence, it is necessary to study the factors that predispose to it.

Certain populations have a stronger genetic predisposition to obesity than others6, with varying degrees of susceptibility among individuals within a population7. The available literature suggests a strong genetic basis of obesity8,9,10,11. Genetic studies have implicated 227 genetic variants in the etiology of common polygenic obesity; these variants lie in genes from various signaling pathways (neuro-endocrine coordination, insulin signaling, lipid metabolism, adipocyte differentiation, muscle and liver biology, maintenance of gut microbiota)9. The environment also plays a significant role in the development of obesity12. Excessive food intake and insufficient physical activity can disrupt the body's energy homeostasis13. This disruption can lead to changes in complex biochemical and signaling pathways involved in the alimentary, neuroendocrine and immune systems. Such changes in the internal environment of the body manifest at the cellular level as altered transcriptional programs. Recent research has shown that chromatin modifications allow cells to make rapid and context-specific transcriptional changes13. Several proteins have been identified as components of chromatin modification complexes that control the accessibility of DNA to transcription factors, in effect controlling transcription in various disease states14. It is therefore important to understand the role of chromatin modification in normal and disease states, including obesity.

In the context of obesity, a few chromatin modifying proteins have been identified that play roles in controlling transcriptional rewiring in response to the environment15,16. A United Kingdom-based study detected mutations in DNMT3A (a chromatin modifier gene) in children with overgrowth syndrome17. Functional studies to identify novel players are confounded by the fact that obesity is a systemic condition affecting multiple tissues. Different players may serve the same purpose in different tissues, which adds complexity to the identification of chromatin modifiers involved in obesity. Candidate gene-based association studies could provide a simpler approach to identifying such genes. Here, we hypothesized that genetic variants in chromatin modifying genes could play an important role in the development of obesity. To test this hypothesis, we conducted the first candidate gene-based association study of chromatin modifying genes to identify genetic variants associated with adolescent obesity.

Results

Anthropometric and clinical characteristics of study participants are provided in Table 1. Association analysis in stage 1 revealed associations of 28 variants in 13 genes with overweight/obesity at P < 0.05 (Supplementary Table 1). The top signals for overweight/obesity were identified at rs4589135 (P = 1.2 × 10−4) and rs6598860 (P = 7.8 × 10−4) in the ARID1A gene. Association analysis with BMI identified rs907092 in IKZF3 (P = 3.6 × 10−6) as the top signal in stage 1.

Table 1 Clinical status of study participants.

We successfully genotyped 18 SNPs in stage 2, of which 13 passed QC (Supplementary Table 2). We replicated the associations of rs6598860 in ARID1A (P = 0.04) and rs17003998 in SMARCE1 (P = 0.01) with overweight/obesity (Table 2). Subsequent meta-analysis of summary results from stage 1 and stage 2 revealed significant associations of rs6598860 (P = 1.58 × 10−4) and rs4589135 (P = 3.72 × 10−4) in ARID1A with overweight/obesity after multiple testing correction (significance threshold P = 6.33 × 10−4). Meta-analysis of BMI data showed significant associations of rs3804562 in KAT2B (P = 1.35 × 10−4) along with rs4589135 (P = 3.57 × 10−5) and rs6598860 (P = 1.16 × 10−5) in ARID1A after multiple testing correction (Table 2). The identified variants were also associated with other adiposity measures (Fig. 1, Supplementary Table 3). Variations in adiposity measures according to the genotypes of the identified SNPs are shown in Fig. 2. The study has more than 80% power to detect the association of a variant with an observed allele frequency of 0.30 and an effect size of 1.20–1.30 (Supplementary Figure 1).

Table 2 Association of significant SNPs with obesity and BMI.
Figure 1

Associations of significant SNPs with measures of obesity. Association of overweight/obese associated SNPs with anthropometric measures of obesity (weight, BMI, WC, HC) in meta-analysis results. The z score change per risk allele for associated SNPs in meta-analysis has been plotted against corresponding phenotypes.

Figure 2

Effect of genotype of significant SNPs on z scores of adiposity measures. Variation of adiposity measures with the different genotypes of associated SNPs. For SNPs associated with adiposity measures, the average z score is plotted on the y-axis against the SNP genotypes on the x-axis. The analysis was performed on the total samples obtained after combining stage 1 and stage 2.

We searched for associations of the identified variants with obesity-related phenotypes in Genetic Investigation of Anthropometric Traits (GIANT)18 consortium data. The analysis revealed that all the identified SNPs are associated with either BMI or other related traits (Table 3) with similar effect sizes at P ≤ 0.05.

Table 3 Association status of significant SNPs with obesity and other related parameters in GIANT consortium data.

Discussion

This study used an already established cohort for childhood/adolescent obesity. In stage 1 and stage 2, respectively, 1280 and 863 samples were shared between the current study and our earlier work by Tabassum et al.19, which investigated the association of common variants in inflammatory marker genes with overweight or obesity in Indian children/adolescents. Our results demonstrate the associations of two SNPs in ARID1A (rs6598860 and rs4589135) with the risk of overweight/obesity in urban Indian adolescents. A moderate LD (r2 = 0.57) between rs6598860 and rs4589135 was observed in the combined samples from stage 1 and stage 2. Variant rs6598860 (RegulomeDB score of 2b) lies in the promoter of ARID1A and might affect the binding affinity of transcription machinery units on the promoter (RegulomeDB). Variant rs6598860 has also been associated with leptin level and birth length at the nominal significance level (P ≤ 0.05) in Europeans20. Variant rs4589135 has been associated with high-density lipoprotein (HDL) levels in European and mixed ancestry samples in the Global Lipids Genetics Consortium (GLGC) study at the nominal significance level (P ≤ 0.05)21. It has also been associated with triglyceride levels in a European population22.

Although no genetic study has linked ARID1A with adult or childhood obesity, a functional study of ARID1A-deficient mice showed higher expression of Interleukin 6 (IL6), a known inflammatory cytokine23. ARID1A is a transcription factor, and its depletion has been reported to affect the levels of proteins related to cholesterol synthesis and glycogen metabolism in ovarian cancer cell lines24. A study in skeletal muscle showed a positive correlation between ARID1A expression and BMI in humans25. This evidence suggests that ARID1A may affect obesity through cytokine (IL6), adipokine (leptin) or lipid-mediated pathways.

We also found an association of rs3804562 in KAT2B, which codes for a histone acetyltransferase, with BMI. KAT2B knockdown mice showed a reduction in body weight and hyperglycemia in comparison to control mice26. Also, its role in gluconeogenesis and energy maintenance mechanism has been suggested by a previous study27.

In conclusion, our data revealed that common variants of ARID1A and KAT2B are associated with increased susceptibility to overweight/obesity in Indian urban adolescents. Because our study included overweight individuals along with obese individuals in the case group, the effect sizes of the identified associations may be underestimated and should be interpreted cautiously for obese subjects. The current study aimed to investigate important genes from the chromatin modifier pathway, and the list of genes included might not be exhaustive. An exhaustive investigation of all the genes listed in the literature might help to identify more genetic variants in chromatin modifier genes associated with childhood obesity. Although the case-control study design limits the potential to identify causal relationships, our study provides a lead for future investigations into the contribution of epigenetic modifiers to genetic predisposition to obesity in adolescents. This would help in understanding the molecular mechanisms and exploring therapeutic options for the prevention of childhood obesity.

Methods

The study involved 3,530 adolescents (aged 11–17 years), including 2,539 normal-weight (NW group) and 991 overweight/obese (OW/OB group) participants. All the participants were of Indo-European ethnicity and were recruited from school health surveys in five different zones of Delhi (north, south, east, west, and central regions) and the National Capital Region, as described previously19,28,29. Prior permission from school authorities, informed consent from parents/guardians and verbal consent from the participants themselves were obtained before participation in the study. The study plan was discussed in detail with school authorities for administrative approval. A written plan was circulated to the parents through the school. The study protocol was approved by the ethics committees of CSIR-Institute of Genomics and Integrative Biology and All India Institute of Medical Sciences. The study was conducted according to the principles of the Declaration of Helsinki. Anthropometric measurements including height, weight, waist circumference (WC) and hip circumference (HC) were taken using standard methods, and BMI was calculated. Blood samples were drawn from participants after an overnight fast, and DNA was extracted as described previously30. Participants were classified as normal weight or overweight/obese according to the age- and sex-specific cutoffs provided by Cole et al., 200231.

In stage 1, we initially selected 203 SNPs in 37 genes for genotyping after an exhaustive literature survey on chromatin modifiers. Most of the genes included in this study were selected from You et al., who reviewed the literature available until 2012 on epigenetic process-related genes in cancer32. We also included genes that were absent from this review but whose involvement in epigenetic processes was revealed by a literature search. The SNPs were selected on the basis of their presence in functionally important regions of genes, previous reports of association with metabolic disorders and a minor allele frequency greater than or equal to 0.05. Genotyping was done on 1283 participants using the Illumina Golden Gate assay (Illumina, San Diego, CA). Genotyping data were subjected to extensive quality control (QC). SNPs with a genotype confidence score (a confidence value assigned to each called genotype that ranges between 0 and 1, with less reliable calls assigned lower values) less than 0.25 were removed. We also removed SNPs with a GenTrain score (a statistical score that mimics evaluations made by a human expert's visual and cognitive systems about the clustering behavior of a locus, with less reliable clusters assigned lower values) less than 0.6. SNPs with a cluster separation score (a measurement of cluster separation between the different genotypes of an SNP that ranges between 0 and 1) less than 0.4 or a call rate < 0.9 were also removed. Further, SNPs with a Hardy-Weinberg equilibrium P value less than 0.01 in any of the NW, OW/OB and combined sample groups were removed. Out of the selected 203 SNPs in 35 genes, 5 SNPs (rs66797130, rs9909489, rs16834954, rs6576 and rs341530259) in 4 genes were non-polymorphic in our samples. In total, 24 SNPs in 17 different genes failed in the assay. The final analysis was done on 1283 adolescents (830 NW and 453 OW/OB) for 179 SNPs from 35 genes encoding DNA methylation enzymes, histone modifiers and chromatin modifiers (Supplementary Table 1).
After QC, SNPs (n = 179) had a call rate of 98% and a concordance rate of 99.97% with 5% duplicate samples. Genotype frequencies for all the SNPs are provided in Supplementary Table 1.
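As an illustration of the Hardy-Weinberg filter in the QC steps above, the following minimal Python sketch runs a one-degree-of-freedom Pearson χ2 test on genotype counts and flags an SNP when its P value falls below the stage 1 cutoff of 0.01. It is not the PLINK routine used in the study; the function name `hwe_chi2_test`, the constant name and the example counts are hypothetical and chosen only for demonstration.

```python
import math

HWE_P_CUTOFF_STAGE1 = 0.01  # stage 1 cutoff stated in the text


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes (AA, AB, BB) and
    returns (chi2, p_value). Hypothetical helper, not PLINK's routine.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        # Non-polymorphic SNP: the test is not informative.
        return 0.0, 1.0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts, for demonstration only.
    chi2, p_value = hwe_chi2_test(420, 330, 80)
    print(f"chi2 = {chi2:.3f}, P = {p_value:.3g}")
    print("remove SNP" if p_value < HWE_P_CUTOFF_STAGE1 else "keep SNP")
```

In the study the same check was applied separately to the NW, OW/OB and combined groups, so a sketch like this would be called once per group for each SNP.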

Genotyping of 18 obesity-associated SNPs from stage 1 (17 direct SNPs and one proxy SNP, rs56315139 for rs6504550, as shown in Supplementary Table 2) was performed in 2247 adolescents (1709 normal weight and 538 overweight/obese) using iPLEX (Sequenom, San Diego, CA; now Agena Bioscience, Hamburg, Germany). Primers for the remaining SNPs could not be designed in a single plex using Agena Assay Design Suite (https://seqpws1.agenacx.com/AssayDesignerSuite.html). Stringent QC was performed on the genotype data. We removed two SNPs with a call rate less than 90% and three SNPs with HWE P-value < 2.78 × 10−3 (0.05/18) during analysis in stage 2. Finally, we analyzed 13 SNPs in stage 2. The average genotyping success rate for the remaining SNPs was 96% (range = 91–100%), with 99.7% consistency in genotyping across 10% duplicates.

Statistical analysis was performed using PLINK version 1.07 (http://pngu.mgh.harvard.edu/~purcell/plink)33,34 and R version 3.1.0. Genotype frequencies were checked for Hardy-Weinberg equilibrium using the χ2 test. Prior to analysis, internal age- and sex-specific z scores were calculated for continuous variables as described previously31. The z scores were inverse normal transformed to achieve a normal distribution. Logistic regression analysis under a log-additive model adjusted for age and sex was performed in PLINK to test the association of QC-passed SNPs with overweight/obesity. Associations with continuous obesity-related traits were tested using a linear regression model adjusted for age and sex, assuming an additive mode of inheritance. Meta-analysis of summary statistics from the stage 1 and stage 2 associations was performed with the fixed-effect inverse variance method using METAL (http://www.sph.umich.edu/csg/abecasis/Metal/). After meta-analysis, a P-value of 6.33 × 10−4 (0.05/79) was considered significant for obesity and BMI, correcting for 79 independent loci (r2 < 0.8). We did not correct for multiple phenotypes because the tested phenotypes (obesity and BMI) are correlated with each other.
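The fixed-effect inverse variance method and the Bonferroni-style thresholds described above can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not METAL itself: the function names `fixed_effect_meta` and `bonferroni_threshold` are hypothetical, and the per-stage effect estimates and standard errors in the example are made up for demonstration rather than taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study effects with inverse-variance weights.

    betas: per-study effect estimates (for example log odds ratios).
    ses:   their standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / SE_i^2
    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    z = pooled_beta / pooled_se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


def bonferroni_threshold(alpha, n_tests):
    """Significance threshold after correcting for n_tests independent tests."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical stage 1 and stage 2 estimates, for demonstration only.
    beta, se, z, p = fixed_effect_meta([0.30, 0.15], [0.09, 0.07])
    threshold = bonferroni_threshold(0.05, 79)  # 79 independent loci
    print(f"pooled beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.3g}")
    print(f"threshold = {threshold:.3g}, significant = {p < threshold}")
```

The same `bonferroni_threshold` helper reproduces the stage 2 Hardy-Weinberg cutoff of 0.05/18 mentioned earlier in the Methods.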

We collected samples from a small geographical region that forms a homogeneous cluster, as shown by Dwivedi OP et al., 201230. Principal component analysis of the genetic data (Axiom™ Genome-Wide EUR 1 Array) available for 1095 participants revealed that our samples are genetically homogeneous (Supplementary Figure 2). The statistical power of the study was calculated under the log-additive model of inheritance, assuming a 24% prevalence of overweight/obesity2, for SNPs with allele frequencies ranging from 0.05 to 0.50 and effect sizes of 1.05–2 at α = 6.33 × 10−4.

The raw genotype data used in the study are included with the manuscript as Supplementary Dataset S1.