
A novel gene THSD7A is associated with obesity


Abstract

Body mass index (BMI) is a non-invasive measure of obesity that is commonly used to assess adiposity and to predict obesity-related risk. Genetic differences between ethnic groups are an important source of variation in phenotypic effects. India was inhabited by the first out-of-Africa human population, and contemporary Indian populations are an admixture of two ancestral populations: ancestral north Indians (ANI) and ancestral south Indians (ASI). Although ANI are related to Europeans, ASI are not related to any group outside the Indian subcontinent. Hence, we expected to find novel genetic loci associated with BMI. In the association analysis, we found eight genic SNPs in the extreme of the P-value distribution (P≤3.75 × 10−5); the SNP in WWOX was excluded from further study because WWOX has already been reported to be associated with obesity-related traits. Interestingly, we observed that rs1526538, an intronic SNP of the novel gene THSD7A, was significantly associated with obesity (P=2.88 × 10−5, 8.922 × 10−6 and 2.504 × 10−9 in the discovery, replication and combined stages, respectively). THSD7A is a neural N-glycoprotein that promotes angiogenesis, and angiogenesis is well known to modulate obesity, adipose metabolism and insulin sensitivity, which is consistent with our result. This information may be useful for identifying drug targets and for the early diagnosis and treatment of obesity.


Introduction

Overweight and obesity have been increasing at an alarming rate worldwide over the past decades. Obesity is one of the major risk factors for chronic diseases and has a central role in insulin resistance and metabolic syndromes, including hyperinsulinemia, hypertension, hyperlipidemia, type 2 diabetes mellitus and atherosclerotic cardiovascular disease.1, 2 Body mass index (BMI) is a non-invasive measure of obesity that predicts the risk of related complications. Identifying genetic determinants of BMI could lead to a better understanding of the biological basis of overweight and obesity. In recent years, genome-wide association studies (GWAS) and next generation sequencing analyses have greatly accelerated the discovery of novel genetic loci. Several loci, including FTO, MC4R, TMEM18, GNPDA2, BDNF, NEGR1, SH2B1, ETV5, MTCH2 and KCTD15, have recently been reported to be associated with BMI.3, 4, 5, 6, 7, 8, 9, 10, 11, 12 Even though BMI has >50% heritability, the known loci explain only 2% of the variation in the phenotype.

Ethnic difference is an important factor that contributes to variation in obesity-related traits among populations.13 With 4635 anthropologically defined groups, India is inhabited by some of the most genetically diverse populations. In our earlier study, we inferred that prehistoric India had only two ancestral populations: ancestral north Indians (ANI) and ancestral south Indians (ASI). Intriguingly, ASI are not related to any group outside the Indian subcontinent and therefore contribute a specific evolutionary history and genetic architecture.14, 15 Considering these facts, we expected novel genetic loci to be associated with BMI in Indian populations and set out to identify them by GWAS.

Materials and methods

Sample details, genotyping and statistical analysis

Blood samples of 208 healthy individuals aged 20–30 years, together with their BMI details and informed written consent, were collected from diverse ethnic backgrounds and different geographical locations (Supplementary Figure S1). Complete details of each individual are given in Supplementary Table S1. We excluded individuals who smoked or who had chronic systemic diseases such as diabetes, hypertension and metabolic syndromes. Those without these conditions were considered ‘healthy subjects’ and included in this study. To categorize subjects, we used the definitions of underweight (<18.5), normal (18.5–22.9) and overweight (≥23; obese >25), following WHO criteria.16 Individuals were genotyped on the Affymetrix SNP array (6.0) using the recommended protocol, and genotypes were called using Birdsuite.17 A Birdsuite utility was then used to convert the files to Plink format. This study was approved by the Institutional ethical committee of CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India.
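The BMI categories used for grouping subjects can be written as a small function. This is a minimal sketch only: the function name `bmi_category` is hypothetical, and the handling of the boundary values (a BMI of exactly 25 counted as overweight, values between 22.9 and 23 counted as normal) is an assumption, since the text gives only the rounded cut-offs.

```python
def bmi_category(bmi: float) -> str:
    """Classify a BMI value (kg/m^2) with the cut-offs quoted in the text.

    Boundary handling is an assumption: <18.5 underweight, <23 normal,
    23 to 25 inclusive overweight, >25 obese.
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal"
    if bmi <= 25.0:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    for value in (17.3, 20.8, 23.8, 26.8):
        print(value, bmi_category(value))
```

Keeping the thresholds in one place makes it easy to swap in a different classification scheme if another population standard is preferred.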

To detect stratification and to exclude outlier samples, principal component analysis (PCA) was performed using SmartPCA.18 Individuals lying more than 6 σ from the mean on any of eigenvectors 1–10 were excluded from further analysis over 10 iterations (Supplementary Table S2). For quantitative trait association analysis, the Wald test statistic from Plink was used.19 Markers with a genotyping rate >90% and in Hardy–Weinberg equilibrium (HWE; P-value>0.001) were considered for the analysis. To investigate the missing heritability of BMI, we performed restricted maximum likelihood analysis with the genome-wide complex trait analysis tool.20
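As an illustration of the quantitative trait test, the sketch below regresses a phenotype on allele dosage (0, 1 or 2 copies) under an additive model and forms a Wald statistic from the slope and its standard error. It is not the Plink implementation: the function name `wald_test` is hypothetical, there are no covariates, and the P-value uses a normal approximation rather than the t distribution.

```python
import math


def wald_test(dosages, phenotypes):
    """Additive-model linear regression of phenotype on allele dosage.

    Returns (beta, standard error, Wald statistic, two-sided P-value).
    The P-value is a normal approximation (an assumption of this sketch).
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("monomorphic marker: no variation in dosage")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares and the standard error of the slope.
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, phenotypes))
    if rss == 0:
        raise ValueError("perfect fit: standard error is zero")
    se = math.sqrt(rss / (n - 2) / sxx)
    stat = beta / se
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return beta, se, stat, p_value


if __name__ == "__main__":
    # Toy data, not from the study.
    print(wald_test([0, 0, 1, 1, 2, 2], [20, 22, 21, 23, 24, 26]))
```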

Replication study

In the replication study, we genotyped four markers (rs9551868, rs7681875, rs1526538 and rs2281524) in 655 samples by the Sanger sequencing method (Supplementary Table S3; a few samples were removed owing to poor sequence quality; details are given in Supplementary Table S4). To genotype these four markers, we used reference sequences from the Ensembl database. Sequence-specific primers were designed with the Primer 3.0 tool, and their specificity was subsequently tested against the human genome reference sequence with NCBI Primer-BLAST. Details of the primer sequences and their respective thermocycling conditions are given in Supplementary Table S5. PCR was carried out in a 10 μl volume containing 5 μl of 2 × EmeraldAmp GT PCR master mix, 10 ng of genomic DNA and 0.1 pm (final concentration) of each primer. Amplicons were cleaned with Exo SAP-IT (USB, Affymetrix, Santa Clara, CA, USA) using the recommended protocol, and 1.0 μl of purified product was used as template for sequencing with the BigDye terminator (v3.1) cycle sequencing kit (Applied Biosystems, Foster City, CA, USA) and analyzed on an ABI 3730xl DNA Analyzer (Applied Biosystems). Sequence data were edited and assembled using sequence analysis (v3.4) and AutoAssembler (v1.0), respectively. Statistical analysis was performed in R, and power calculations were performed with Quanto.


Results

Discovery stage: cohort-I

Population stratification

A total of 208 males aged 20–30 years were selected for quantitative trait association analysis. As systematic ancestral differences among individuals can cause spurious association, it is logical to exclude individuals who are genetically extreme. To achieve this, we pruned 440 987 SNPs that were in linkage disequilibrium (r2>0.75) and performed PCA with the remaining 386 722 SNPs. We found four individuals to be outliers, with σ-values of −10.017, −6.123, 6.628 and 6.527 on eigenvectors 0, 7, 3 and 3 in iterations 1, 1, 2 and 3, respectively, and removed them from further analysis. The remaining 204 individuals consisted of 97 normal weight (BMI=20.83±1.19), 38 underweight (BMI=17.31±1.06), 38 overweight (BMI=23.79±0.52) and 31 obese (BMI=26.80±1.39) individuals. Even when the 204 individuals were grouped on the basis of BMI, we did not find any significant differences between the groups (Figure 1). On eigenvector 1, the analysis of variance P-values for overweight vs obese, overweight vs normal, overweight vs underweight, obese vs normal, obese vs underweight and normal vs underweight were 0.902, 0.523, 0.924, 0.430, 0.832 and 0.626, whereas on eigenvector 2 the P-values were 0.874, 0.121, 0.395, 0.277, 0.566 and 0.855, respectively.
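The outlier rule can be sketched as a single pass over precomputed principal component scores. This is a simplified illustration, not SmartPCA itself: SmartPCA recomputes the eigenvectors after each round of removals, whereas the hypothetical helper `flag_outliers` below only flags samples that lie more than a chosen number of standard deviations from the mean on any supplied eigenvector.

```python
from statistics import mean, stdev


def flag_outliers(scores, threshold=6.0):
    """Return sorted indices of samples beyond `threshold` SDs on any eigenvector.

    scores[i][k] is the coordinate of sample i on eigenvector k.
    One pass only; an iterative procedure would recompute the PCA after removal.
    """
    n_vectors = len(scores[0])
    outliers = set()
    for k in range(n_vectors):
        column = [row[k] for row in scores]
        mu = mean(column)
        sd = stdev(column)
        if sd == 0:
            continue  # no spread on this eigenvector, nothing to flag
        for i, value in enumerate(column):
            if abs(value - mu) / sd > threshold:
                outliers.add(i)
    return sorted(outliers)


if __name__ == "__main__":
    # Synthetic scores: 99 ordinary samples and one extreme sample.
    demo = [[float((-1) ** i), 0.0] for i in range(99)] + [[50.0, 0.0]]
    print(flag_outliers(demo))
```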

Figure 1

Principal component analysis to remove outlier samples: (a) PCA of 208 samples; (b) PCA of 204 samples after removal of outlier samples. The point size (Si) of each sample reflects its BMI. For both PCA plots, we used BMImax=31.7 and BMImin=14.01, obtained from the distribution of the 208 subjects.

Association analysis

To find loci associated with the phenotype, we performed quantitative trait association analysis. First, out of 827 709 SNPs, 35 042 were excluded because they were not in Hardy–Weinberg equilibrium (P-value<0.001). A further eight SNPs with a genotyping rate <0.9 were also excluded, and the analysis was performed with the remaining 792 659 SNPs. The QQ-plot showed no evidence of stratification (Supplementary Figure S2). We selected the 23 SNPs with the most extreme Wald test asymptotic P-values (≤3.75 × 10−5) and used only genic SNPs (markers lying within a gene) for further analysis. Eight of the 23 SNPs were genic (Supplementary Table S4) and were located within introns of five genes: rs11645605 (WWOX; P=2.11 × 10−5); rs9551868 and rs2231998 (KATNAL1; P=2.21 × 10−5 and 3.42 × 10−5, respectively); rs2281524 and rs2815429 (MTF2; P=2.28 × 10−5 and 2.94 × 10−5, respectively); rs1526538 (THSD7A; P=2.88 × 10−5); and rs7681875 and rs7682470 (LIMCH1; P=3.32 × 10−5 and 3.75 × 10−5, respectively) (Supplementary Table S4, Supplementary Table S6, Figure 2 and Supplementary Figure S3). Of the five genes, WWOX has already been reported to be associated with obesity and obesity-related traits.22 Hence, we excluded rs11645605 from further study. For KATNAL1, MTF2 and LIMCH1, each of which had two associated markers, we selected the marker with the lowest P-value for the replication study.
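The marker filtering step can be illustrated with a short sketch. It uses the Pearson chi-square goodness-of-fit test for Hardy–Weinberg equilibrium, which is an assumption made for simplicity (Plink can also apply an exact test), and the helper names `hwe_chi2_p` and `passes_qc` are hypothetical. The thresholds are those quoted in the text.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P-value of a 1-d.f. chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, n_samples, min_call_rate=0.9, min_hwe_p=0.001):
    """Keep a marker if its call rate and HWE P-value exceed the thresholds."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / n_samples
    return call_rate > min_call_rate and hwe_chi2_p(n_aa, n_ab, n_bb) > min_hwe_p


if __name__ == "__main__":
    print(passes_qc(50, 100, 50, 204))  # toy genotype counts in equilibrium
    print(passes_qc(100, 0, 100, 204))  # toy counts with no heterozygotes
    # Marker bookkeeping reported in the text.
    print(827709 - 35042 - 8)
```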

Figure 2

Regional plots of the four associated SNPs that were taken forward for replication in cohort-II. In each plot, the x axis represents physical position (hg19) in kilobase pairs, and the arrow marks the associated SNP used for replication. The inset shows the distribution of BMI for each genotype, with alleles coded on the positive strand. (a) rs9551868: KATNAL1 (T, ancestral allele); (b) rs2281524: MTF2 (T, ancestral allele); (c) rs1526538: THSD7A (G, ancestral allele); and (d) rs7681875: LIMCH1 (A, ancestral allele).

Missing heritability explained by 792 659 SNPs

Despite the high heritability of BMI (>0.5), associated variants explain only <0.02 of the variation in the phenotype. Many common variants with small effects on the phenotype cannot be detected by contemporary GWAS analysis, and this might account for the missing heritability. We therefore explored it with restricted maximum likelihood analysis as implemented in genome-wide complex trait analysis.

Initially, to exclude cryptic relatedness, we calculated the genetic relationship matrix (GRM) for the 208 individuals and excluded 8 individuals at a GRM cutoff of 0.025. We then performed PCA and extracted 10 eigenvectors. Using the GRM and the 10 eigenvectors, we estimated that 0.16±1.76 of the BMI variation is explained by the 792 659 markers, which suggests that, besides the strongly associated SNPs, the data contain several loci with small but collective effects on this polygenic trait.
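A genetic relationship matrix of the kind used here can be sketched as follows. Each entry averages, over SNPs, the product of two individuals' centred genotypes divided by the expected genotype variance 2p(1−p). This is a simplified illustration under stated assumptions, not the GCTA implementation: allele frequencies are estimated from the same sample, the diagonal uses the same formula as the off-diagonal entries, and the function names are hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Compute a simple GRM from genotypes[snp][individual] coded 0, 1 or 2."""
    n_ind = len(genotypes[0])
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    n_used = 0
    for snp in genotypes:
        p = sum(snp) / (2 * n_ind)  # sample allele frequency
        variance = 2 * p * (1 - p)
        if variance == 0:
            continue  # monomorphic SNPs carry no information
        n_used += 1
        centred = [x - 2 * p for x in snp]
        for j in range(n_ind):
            for k in range(n_ind):
                grm[j][k] += centred[j] * centred[k] / variance
    if n_used == 0:
        raise ValueError("no polymorphic SNPs")
    return [[value / n_used for value in row] for row in grm]


def related_pairs(grm, cutoff=0.025):
    """List pairs (j, k), j < k, whose estimated relatedness exceeds the cutoff."""
    n = len(grm)
    return [(j, k) for j in range(n) for k in range(j + 1, n) if grm[j][k] > cutoff]


if __name__ == "__main__":
    toy = [[0, 0, 2], [0, 0, 2]]  # toy genotypes, not study data
    print(related_pairs(genetic_relationship_matrix(toy)))
```

In practice, one member of each flagged pair would then be dropped before the variance component analysis.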

Replication stage: cohort-II and combined analysis

For the replication analysis, 654 males, comprising 140 underweight (17.38±0.99), 321 normal weight (20.76±1.29), 106 overweight (23.86±0.49) and 87 obese (26.95±2.09) individuals, were genotyped by the Sanger sequencing method (Supplementary Figure S4). None of the SNPs deviated from HWE (rs2281524: P=0.887, rs1526538: P=0.1376 and rs7681875: P=1) except one (rs9551868: P=4.058 × 10−5). The Wald test asymptotic P-values were 0.2428, 0.1281, 8.922 × 10−6 and 0.01021 for rs9551868, rs2281524, rs1526538 and rs7681875, respectively. Only rs1526538 (THSD7A) was strongly associated, whereas rs7681875 was marginally associated. We then combined the data of both cohorts for rs1526538 and observed a P-value of 2.504 × 10−9. We also observed that rs1526538 explains 0.08495% of the BMI variation in cohort-I, 0.04176% in cohort-II and 0.05192% in cohort-I+cohort-II.
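One common way to express the share of phenotypic variation explained by a single SNP is the coefficient of determination (R²) of the regression of the phenotype on allele dosage, which for one predictor equals the squared Pearson correlation. The sketch below shows that calculation; it is an assumption that this corresponds to the measure reported here, and the function name `variance_explained` is hypothetical.

```python
def variance_explained(dosages, phenotypes):
    """R^2 of a simple linear regression of phenotype on allele dosage."""
    n = len(dosages)
    if n != len(phenotypes) or n < 2:
        raise ValueError("need at least 2 paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in dosage or phenotype")
    # Squared correlation between dosage and phenotype.
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Toy data, not from the study; multiply by 100 to report a percentage.
    r2 = variance_explained([0, 0, 1, 1, 2, 2], [20, 22, 21, 23, 24, 26])
    print(round(100 * r2, 2))
```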


Discussion

In the current study, we explored loci associated with BMI in the Indian population. Our results suggest that one SNP, rs1526538 (THSD7A), is associated with a risk of obesity. Our sample size of 669 has 99.99 and 90.85% power to detect an effect on BMI (β=−1.0140) in an additive model at α of 0.05 and 3.779338 × 10−6 (Bonferroni-corrected P-value for 792 659 SNPs), respectively. Although we failed to observe the previously reported loci FTO and MC4R, we found WWOX, previously known to be associated with obesity-related traits, in the extreme of the P-value distribution. rs12970134 (MC4R) and rs9939609 (FTO), previously found to be associated with BMI (P=5 × 10−4 and 1 × 10−2, respectively) in Asian Sikhs and south Indians, respectively, had P-values of 0.4763 and 0.4312 in our samples.23, 24 Intriguingly, rs9935403 of FTO had a P-value of 5.856 × 10−3 in cohort-I. We also calculated the statistical power of our study for previously reported BMI-associated variants. The power for these variants was <0.063 (Supplementary Table S7), which may explain why none of the previously reported variants were associated with BMI in the present study.

We explored the THSD7A locus with a ±10 kb flanking region in the GIANT database and observed 747 SNPs within the region. None of the SNPs was significantly associated (P-value>0.0219) (Supplementary Figure S5). We also explored the GWAS central database, again considering a ±10 kb flanking region of THSD7A. In total, we found seven studies with 31 markers showing –log10 P-value>3. Of these, one study was related to body height in British populations (HGVRS572 and HGVRS573). Exploring this further, we found that rs2355068 had a –log10 P-value of 3.15, which is not significant at the genome-wide level. Since no study so far has reported an association of THSD7A with BMI, it appears that THSD7A may be a susceptibility gene for obesity that is unique to Indian/South Asian populations.

THSD7A is a neural N-glycoprotein that promotes angiogenesis through endothelial cell migration and tube formation.25, 26, 27 Moreover, angiogenesis is reported to modulate obesity, adipose metabolism and insulin sensitivity.28 Hence, we speculate that THSD7A may influence higher BMI or obesity through its role in angiogenesis.


References

  1. Kopelman PG. Obesity as a medical problem. Nature 2000; 404: 635–643.
  2. Roth J, Qiang X, Marban SL, Redelt H, Lowell BC. The obesity pandemic: where have we been and where are we going? Obes Res 2004; 12 (Suppl 2): 88S–101S.
  3. Wen W, Zheng W, Okada Y, Takeuchi F, Tabara Y, Hwang JY et al. Meta-analysis of genome-wide association studies in East Asian-ancestry populations identifies four new loci for body mass index. Hum Mol Genet 2014; 23: 5492–5504.
  4. Graff M, Ngwa JS, Workalemahu T, Homuth G, Schipf S, Teumer A et al. Genome-wide analysis of BMI in adolescents and young adults reveals additional insight into the effects of genetic loci over the life course. Hum Mol Genet 2013; 22: 3597–3607.
  5. Berndt SI, Gustafsson S, Magi R, Ganna A, Wheeler E, Feitosa MF et al. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture. Nat Genet 2013; 45: 501–512.
  6. Loos RJ, Lindgren CM, Li S, Wheeler E, Zhao JH, Prokopenko I et al. Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet 2008; 40: 768–775.
  7. Monda KL, Chen GK, Taylor KC, Palmer C, Edwards TL, Lange LA et al. A meta-analysis identifies new loci associated with body mass index in individuals of African ancestry. Nat Genet 2013; 45: 690–696.
  8. Okada Y, Kubo M, Ohmiya H, Takahashi A, Kumasaka N, Hosono N et al. Common variants at CDKAL1 and KLF9 are associated with body mass index in east Asian populations. Nat Genet 2012; 44: 302–306.
  9. Pei YF, Zhang L, Liu Y, Li J, Shen H, Liu YZ et al. Meta-analysis of genome-wide association data identifies novel susceptibility loci for obesity. Hum Mol Genet 2014; 23: 820–830.
  10. Speliotes EK, Willer CJ, Berndt SI, Monda KL, Thorleifsson G, Jackson AU et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 2010; 42: 937–948.
  11. Thorleifsson G, Walters GB, Gudbjartsson DF, Steinthorsdottir V, Sulem P, Helgadottir A et al. Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet 2009; 41: 18–24.
  12. Willer CJ, Speliotes EK, Loos RJ, Li S, Lindgren CM, Heid IM et al. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet 2009; 41: 25–34.
  13. Tan LJ, Zhu H, He H, Wu KH, Li J, Chen XD et al. Replication of 6 obesity genes in a meta-analysis of genome-wide association studies from diverse ancestries. PLoS One 2014; 9: e96149.
  14. Reich D, Thangaraj K, Patterson N, Price AL, Singh L. Reconstructing Indian population history. Nature 2009; 461: 489–494.
  15. Moorjani P, Thangaraj K, Patterson N, Lipson M, Loh PR, Govindaraj P et al. Genetic evidence for recent population mixture in India. Am J Hum Genet 2013; 93: 422–438.
  16. IASO. The Asia-Pacific Perspective: Redefining Obesity and its Treatment. World Health Organization Western Pacific Region, 2000.
  17. Korn JM, Kuruvilla FG, McCarroll SA, Wysoker A, Nemesh J, Cawley S et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet 2008; 40: 1253–1260.
  18. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS Genet 2006; 2: e190.
  19. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
  20. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 2011; 88: 76–82.
  21. Natarajan R, Turnbull BW, Slate EH, Clark LC. A computer program for sample size and power calculations in the design of multi-arm and factorial clinical trials with survival time endpoints. Comput Methods Programs Biomed 1996; 49: 137–147.
  22. Wang K, Li WD, Zhang CK, Wang Z, Glessner JT, Grant SF et al. A genome-wide association study on obesity and obesity-related traits. PLoS One 2011; 6: e18939.
  23. Been LF, Nath SK, Ralhan SK, Wander GS, Mehra NK, Singh J et al. Replication of association between a common variant near melanocortin-4 receptor gene and obesity-related traits in Asian Sikhs. Obesity (Silver Spring) 2010; 18: 425–429.
  24. Vasan SK, Fall T, Neville MJ, Antonisamy B, Fall CH, Geethanjali FS et al. Associations of variants in FTO and near MC4R with obesity traits in South Asian Indians. Obesity (Silver Spring) 2012; 20: 2268–2277.
  25. Kuo MW, Wang CH, Wu HC, Chang SJ, Chuang YJ. Soluble THSD7A is an N-glycoprotein that promotes endothelial cell migration and tube formation in angiogenesis. PLoS One 2011; 6: e29000.
  26. Wang CH, Chen IH, Kuo MW, Su PT, Lai ZY, Wang CH et al. Zebrafish Thsd7a is a neural protein required for angiogenic patterning during development. Dev Dyn 2011; 240: 1412–1421.
  27. Wang CH, Su PT, Du XY, Kuo MW, Lin CY, Yang CC et al. Thrombospondin type I domain containing 7A (THSD7A) mediates endothelial cell migration and tube formation. J Cell Physiol 2010; 222: 685–694.
  28. Cao Y. Angiogenesis modulates adipogenesis and obesity. J Clin Invest 2007; 117: 2362–2368.


Acknowledgements

This work was supported by the Office of the Principal Scientific Advisor to Government of India; Department of Science and Technology (DST), Government of India (PRNSA/ADV/AYURVEDA/4/2007). KT was also supported by the CSIR Network project GENESIS (BSC0121), Government of India. We thank Dr Ketaki Bapat for her constant support throughout the project tenure.

Author information



Corresponding author

Correspondence to K Thangaraj.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies this paper on International Journal of Obesity website


Cite this article

Nizamuddin, S., Govindaraj, P., Saxena, S. et al. A novel gene THSD7A is associated with obesity. Int J Obes 39, 1662–1665 (2015).
