A thrifty variant in CREBRF strongly influences body mass index in Samoans


Samoans are a unique founder population with a high prevalence of obesity (refs. 1–3), making them well suited for identifying new genetic contributors to obesity (ref. 4). We conducted a genome-wide association study (GWAS) in 3,072 Samoans, discovered a variant, rs12513649, strongly associated with body mass index (BMI) (P = 5.3 × 10⁻¹⁴), and replicated the association in 2,102 additional Samoans (P = 1.2 × 10⁻⁹). Targeted sequencing identified a strongly associated missense variant, rs373863828 (p.Arg457Gln), in CREBRF (meta P = 1.4 × 10⁻²⁰). Although this variant is extremely rare in other populations, it is common in Samoans (frequency of 0.259), with an effect size much larger than that of any other known common BMI risk variant (1.36–1.45 kg/m² per copy of the risk-associated allele). In comparison to wild-type CREBRF, the Arg457Gln variant, when overexpressed, selectively decreased energy use and increased fat storage in an adipocyte cell model. These data, in combination with evidence of positive selection of the allele encoding p.Arg457Gln, support a 'thrifty' variant hypothesis as a factor in human obesity.
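As a rough illustration of what the reported allele frequency (0.259) and per-allele effect (1.36–1.45 kg/m²) would imply, the sketch below computes expected genotype frequencies and BMI shifts. It assumes Hardy–Weinberg proportions and a strictly additive effect, neither of which is stated above, and the function names are illustrative only.

```python
# Illustrative arithmetic only. Hardy-Weinberg proportions and a strictly
# additive per-allele effect are ASSUMPTIONS, not claims from the study.
RISK_ALLELE_FREQ = 0.259        # reported frequency in Samoans
EFFECT_PER_COPY = (1.36, 1.45)  # reported range, kg/m^2 per risk allele


def genotype_frequencies(q):
    """Expected frequencies of carrying 0, 1, or 2 risk alleles under HWE."""
    p = 1.0 - q
    return {0: p * p, 1: 2.0 * p * q, 2: q * q}


def additive_shift(copies, effect_range=EFFECT_PER_COPY):
    """Expected BMI shift (low, high) in kg/m^2 for a given allele count."""
    low, high = effect_range
    return (copies * low, copies * high)


for copies, freq in genotype_frequencies(RISK_ALLELE_FREQ).items():
    low, high = additive_shift(copies)
    print(f"{copies} copies: frequency {freq:.3f}, shift {low:.2f}-{high:.2f} kg/m^2")
```

Under these assumptions, roughly 38% of individuals would carry one copy of the risk-associated allele and about 7% would carry two.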


Figure 1: Association results from genome-wide and targeted sequencing and beanplots of BMI versus genotype in men and women from the discovery sample.
Figure 2: CREBRF variants, adipogenic differentiation, lipid accumulation, and energy homeostasis.
Figure 3: Induction of Crebrf expression by nutritional stress and protection against starvation.
Figure 4: Evidence of positive selection centered on the missense variant rs373863828.

Accession codes


NCBI Reference Sequence


1. Åberg, K. et al. Susceptibility loci for adiposity phenotypes on 8p, 9p, and 16q in American Samoa and Samoa. Obesity (Silver Spring) 17, 518–524 (2009).
2. McGarvey, S.T. Obesity in Samoans and a perspective on its etiology in Polynesians. Am. J. Clin. Nutr. 53 (Suppl. 6), 1586S–1594S (1991).
3. Hawley, N.L. et al. Prevalence of adiposity and associated cardiometabolic risk factors in the Samoan genome-wide association study. Am. J. Hum. Biol. 26, 491–501 (2014).
4. Tishkoff, S. Strength in small numbers. Science 349, 1282–1283 (2015).
5. McGarvey, S.T., Bindon, J.R., Crews, D.E. & Schendel, D.E. in Human Population Biology: A Transdisciplinary Science (eds. Little, M.A. & Haas, J.D.) 263–279 (Academic Press, 1989).
6. McGarvey, S.T. The thrifty gene concept and adiposity studies in biological anthropology. J. Polyn. Soc. 103, 29–42 (1994).
7. Zimmet, P., Dowse, G., Finch, C., Serjeantson, S. & King, H. The epidemiology and natural history of NIDDM—lessons from the South Pacific. Diabetes Metab. Rev. 6, 91–124 (1990).
8. Kirch, P.V. & Rallu, J.-L. in The Growth and Collapse of Pacific Island Societies (eds. Kirch, P.V. & Rallu, J.-L.) 1–14 (University of Hawaii Press, 2007).
9. Friedlaender, J.S. et al. The genetic structure of Pacific Islanders. PLoS Genet. 4, e19 (2008).
10. Tsai, H.-J. et al. Distribution of genome-wide linkage disequilibrium based on microsatellite loci in the Samoan population. Hum. Genomics 1, 327–334 (2004).
11. Green, R.C. in The Growth and Collapse of Pacific Island Societies (eds. Kirch, P.V. & Rallu, J.-L.) 203–231 (University of Hawaii Press, 2007).
12. Exome Aggregation Consortium. Analysis of protein-coding genetic variation in 60,706 humans. Preprint at bioRxiv http://dx.doi.org/10.1101/030338 (2016).
13. Kichaev, G. et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 10, e1004722 (2014).
14. Loos, R.J. & Yeo, G.S. The bigger picture of FTO: the first GWAS-identified obesity gene. Nat. Rev. Endocrinol. 10, 51–61 (2014).
15. Speliotes, E.K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat. Genet. 42, 937–948 (2010).
16. Eicher, J.D. et al. GRASP v2.0: an update on the Genome-Wide Repository of Associations between SNPs and phenotypes. Nucleic Acids Res. 43, D799–D804 (2015).
17. Leslie, R., O'Donnell, C.J. & Johnson, A.D. GRASP: analysis of genotype–phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics 30, i185–i194 (2014).
18. Locke, A.E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).
19. Pearce, L.R. et al. KSR2 mutations are associated with obesity, insulin resistance, and impaired cellular fuel oxidation. Cell 155, 765–777 (2013).
20. Vankoningsloo, S. et al. CREB activation induced by mitochondrial dysfunction triggers triglyceride accumulation in 3T3-L1 preadipocytes. J. Cell Sci. 119, 1266–1282 (2006).
21. Reusch, J.E., Colton, L.A. & Klemm, D.J. CREB activation induces adipogenesis in 3T3-L1 cells. Mol. Cell. Biol. 20, 1008–1020 (2000).
22. Ma, X. et al. CREBL2, interacting with CREB, induces adipogenesis in 3T3-L1 adipocytes. Biochem. J. 439, 27–38 (2011).
23. Kim, T.H. et al. Identification of Creb3l4 as an essential negative regulator of adipogenesis. Cell Death Dis. 5, e1527 (2014).
24. Wilson-Fritch, L. et al. Mitochondrial biogenesis and remodeling during adipogenesis and in response to the insulin sensitizer rosiglitazone. Mol. Cell. Biol. 23, 1085–1094 (2003).
25. Keuper, M. et al. Spare mitochondrial respiratory capacity permits human adipocytes to maintain ATP homeostasis under hypoglycemic conditions. FASEB J. 28, 761–770 (2014).
26. Tiebe, M. et al. REPTOR and REPTOR-BP regulate organismal metabolism and transcription downstream of TORC1. Dev. Cell 33, 272–284 (2015).
27. Stocker, H. Stress relief downstream of TOR. Dev. Cell 33, 245–246 (2015).
28. Chen, R., Mallelwar, R., Thosar, A., Venkatasubrahmanyam, S. & Butte, A.J. GeneChaser: identifying all biological and clinical conditions in which genes of interest are differentially expressed. BMC Bioinformatics 9, 548 (2008).
29. Dengjel, J. et al. Autophagy promotes MHC class II presentation of peptides from intracellular source proteins. Proc. Natl. Acad. Sci. USA 102, 7922–7927 (2005).
30. Martyn, A.C. et al. Luman/CREB3 recruitment factor regulates glucocorticoid receptor activity and is essential for prolactin-mediated maternal instinct. Mol. Cell. Biol. 32, 5140–5150 (2012).
31. Neel, J.V. Diabetes mellitus: a “thrifty” genotype rendered detrimental by “progress”? Am. J. Hum. Genet. 14, 353–362 (1962).
32. Pruim, R.J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).
33. Kampstra, P. Beanplot: a boxplot alternative for visual comparison of distributions. J. Stat. Softw. 28, 1–9 (2008).
34. Gauderman, W.J. Sample size requirements for association studies of gene–gene interaction. Am. J. Epidemiol. 155, 478–484 (2002).
35. Gauderman, W.J. Sample size requirements for matched case–control studies of gene–environment interaction. Stat. Med. 21, 35–50 (2002).
36. Scuteri, A. et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet. 3, e115 (2007).
37. McGarvey, S.T., Levinson, P.D., Bausserman, L., Galanis, D.J. & Hornick, C.A. Population-change in adult obesity and blood-lipids in American-Samoa from 1976–1978 to 1990. Am. J. Hum. Biol. 5, 17–30 (1993).
38. Keighley, E.D., McGarvey, S.T., Turituri, P. & Viali, S. Farming and adiposity in Samoan adults. Am. J. Hum. Biol. 18, 112–122 (2006).
39. Swinburn, B.A., Ley, S.J., Carmichael, H.E. & Plank, L.D. Body size and composition in Polynesians. Int. J. Obes. Relat. Metab. Disord. 23, 1178–1183 (1999).
40. Cole, T.J., Bellizzi, M.C., Flegal, K.M. & Dietz, W.H. Establishing a standard definition for child overweight and obesity worldwide: international survey. Br. Med. J. 320, 1240–1243 (2000).
41. American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care 35 (Suppl. 1), S64–S71 (2012).
42. Matthews, D.R. et al. Homeostasis model assessment: insulin resistance and beta-cell function from fasting plasma glucose and insulin concentrations in man. Diabetologia 28, 412–419 (1985).
43. Laurie, C.C. et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet. Epidemiol. 34, 591–602 (2010).
44. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
45. Aulchenko, Y.S., Ripke, S., Isaacs, A. & van Duijn, C.M. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294–1296 (2007).
46. Heath, S.C. et al. Investigation of the fine structure of European populations with applications to disease association studies. Eur. J. Hum. Genet. 16, 1413–1429 (2008).
47. Chen, W.M. & Abecasis, G.R. Family-based association tests for genomewide association scans. Am. J. Hum. Genet. 81, 913–926 (2007).
48. Willer, C.J., Li, Y. & Abecasis, G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
49. Delaneau, O., Marchini, J. & Zagury, J.F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2012).
50. Delaneau, O., Howie, B., Cox, A.J., Zagury, J.F. & Marchini, J. Haplotype estimation using sequencing reads. Am. J. Hum. Genet. 93, 687–696 (2013).
51. Delaneau, O., Zagury, J.F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
52. O'Connell, J. et al. A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet. 10, e1004234 (2014).
53. Delaneau, O. & Marchini, J.; 1000 Genomes Project Consortium. Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nat. Commun. 5, 3934 (2014).
54. Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).
55. Howie, B.N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
56. Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 11, 499–511 (2010).
57. Wang, X. et al. Evaluation of transethnic fine mapping with population-specific and cosmopolitan imputation reference panels in diverse Asian populations. Eur. J. Hum. Genet. 24, 592–599 (2016).
58. Aulchenko, Y.S., Struchalin, M.V. & van Duijn, C.M. ProbABEL package for genome-wide association analysis of imputed data. BMC Bioinformatics 11, 134 (2010).
59. Bradford, M.M. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72, 248–254 (1976).
60. R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2004).
61. Staples, J., Nickerson, D.A. & Below, J.E. Utilizing graph theory to select the largest set of unrelated individuals for genetic analysis. Genet. Epidemiol. 37, 136–141 (2013).
62. Staples, J. et al. PRIMUS: rapid reconstruction of pedigrees from genome-wide estimates of identity by descent. Am. J. Hum. Genet. 95, 553–564 (2014).
63. Cadzow, M. et al. A bioinformatics workflow for detecting signatures of selection in genomic data. Front. Genet. 5, 293 (2014).
64. Gautier, M. & Vitalis, R. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics 28, 1176–1177 (2012).
65. Sabeti, P.C. et al. Detecting recent positive selection in the human genome from haplotype structure. Nature 419, 832–837 (2002).
66. Szpiech, Z.A. & Hernandez, R.D. selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol. Biol. Evol. 31, 2824–2827 (2014).
67. Voight, B.F., Kudaravalli, S., Wen, X. & Pritchard, J.K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).
68. Ferrer-Admetlla, A., Liang, M., Korneliussen, T. & Nielsen, R. On detecting incomplete soft or hard selective sweeps using haplotype structure. Mol. Biol. Evol. 31, 1275–1291 (2014).



The authors would like to thank the Samoan participants of the study, the local village authorities, and the many Samoan and other field workers over the years. We acknowledge the Samoan Ministry of Health, the Samoan Bureau of Statistics, and the American Samoan Department of Health for their support of this research. We also acknowledge S.S. Shiva and C.G. Corey at the University of Pittsburgh Center for Metabolism and Mitochondrial Biology for assistance with cellular bioenergetic profiling. This work was funded by NIH grants R01-HL093093 (S.T.M.), R01-AG09375 (S.T.M.), R01-HL52611 (I. Kamboh), R01-DK59642 (S.T.M.), P30 ES006096 (S.M. Ho), R01-DK55406 (R.D.), R01-HL090648 (Z.U.), and R01-DK090166 (E.E.K.) and by Brown University student research funds. Genotyping was performed in the Core Genotyping Laboratory at the University of Cincinnati, funded by NIH grant P30 ES006096 (S.M. Ho). Illumina sequencing was conducted at the Genetic Resources Core Facility, Johns Hopkins Institute of Genetic Medicine (Baltimore).

Author information

R.L.M. performed the genotype quality control and association analyses, with guidance from D.E.W. and assistance from O.D.B. and J.L.; D.E.W. and R.L.M. wrote the relevant sections of the manuscript. N.L.H. led the field work data collection and phenotype analyses with guidance from S.T.M. G.S. led and directed genotyping experiments (using the Affymetrix 6.0 chip) and assay development for validation and replication (using the TaqMan platform) with guidance from R.D. H.C. participated extensively in DNA extraction, genotyping, and quality control of the data under the supervision of G.S. and R.D. Z.U. and C.-T.S. designed and performed the CREBRF overexpression, lipid accumulation, and adipocyte differentiation and starvation experiments, analyzed the data, and wrote the relevant sections of the manuscript. E.E.K. contributed mouse and human gene expression profiling data and contributed to the design and analysis of the functional studies. M.S.R., S.V., and J.T. facilitated fieldwork in Samoa and American Samoa. T.N. contributed to the discussion of the public health implications of the findings. All authors contributed to this work, discussed the results, and critically reviewed and revised the manuscript.

Correspondence to Stephen T McGarvey.

Ethics declarations

Competing interests

Some authors are listed as inventors on a provisional patent application covering aspects of this work that has been filed with the US Patent and Trademark Office (S.T.M., N.L.H., R.D., D.E.W., R.L.M., Z.U., C.-T.S., and E.E.K.).

Integrated supplementary information

Supplementary Figure 1 Principal-components analyses.

(a) Scatterplot of the first three principal components from the principal-components analysis of the Samoan and HapMap phase 3 populations. Continental population abbreviations: SAM, Samoans (n = 250); EUR, Europeans (n = 253); AFR, Africans (n = 511); EAS, East Asians (n = 255); SAS, South Asians (n = 88); AMR, admixed Americans (n = 77). Supplementary Video 1 shows a rotating animation of this figure. (b) Scatterplots of the first six principal components from the principal-components analysis of the Samoans alone (n = 3,094) plotted against each other.

Supplementary Figure 2 Quantile–quantile plot for the BMI GWAS.

A quantile–quantile (QQ) plot of the observed −log10(P values) from Figure 1a for association of BMI in the discovery sample versus the −log10(P values) expected under no association. The second most significant variant, rs3132141, lies between BNIP1 and NKX2-5 and is 184.5 kb from the most significant variant, rs12513649. n = 3,072 Samoans.
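The coordinates of a QQ plot like this one can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name is hypothetical, and the plotting position (i − 0.5)/n for the i-th smallest P value is one common convention assumed here.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, most significant first.

    Under no association, P values are uniform on (0, 1), so the i-th
    smallest of n is expected near (i - 0.5) / n (an assumed convention).
    """
    n = len(p_values)
    # Largest -log10(P) first, matching the smallest expected P value.
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))


# Hypothetical P values, for illustration only.
for exp_val, obs_val in qq_points([0.5, 0.05, 0.005, 0.9]):
    print(f"expected {exp_val:.3f}  observed {obs_val:.3f}")
```

Points that rise well above the diagonal (observed greater than expected) indicate stronger association than the null predicts.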

Supplementary Figure 3 Conditional associations of targeted sequencing genotypes with BMI.

(a–d) Associations between SNPs in the targeted sequencing regions and BMI conditioned on rs12513649 (a), rs150207780 (b), rs373863828 (c), and rs3095870 (d). The red line in each plot corresponds to a P value of 5 × 10⁻⁸. n = 3,072 Samoans.

Supplementary Figure 4 Beanplots of BMI in GWAS and replication samples stratified by missense variant rs373863828 genotype, sex, and nation.

Each bean consists of a mirrored density curve containing a one-dimensional scatterplot of the individual data. The heavy dark line shows the average within each group, and the dotted line indicates the overall average. Plots were drawn using the R beanplot package (ref. 33). Sample sizes are as indicated in Supplementary Table 1.

Supplementary Figure 5 Expression of CREBRF in human and mouse tissues.

(a) Human CREBRF mRNA expression was determined in multiple human tissues using Human cDNA Arrays from Origene (n = 1/tissue; nutritional status not known). (b) Mouse Crebrf mRNA expression was determined in mouse tissues obtained from 10-week-old, littermate-matched, ad libitum–fed, male C57BL/6J mice (n = 6/group). Expression was normalized to the endogenous control gene peptidylprolyl isomerase A/cyclophilin A (PPIA for human; Ppia for mouse). Values represent relative expression and are expressed as means plus s.e.m. No statistical comparisons were performed. pg, perigonadal; sc, inguinal subcutaneous; mes, mesenteric. These data support the presence/absence of CREBRF in specific tissues but should be used with caution when assessing relative expression, particularly in humans where precise conditions at the time of tissue collection are not known. Gene expression can be compared to additional in silico resources including the GTEx and BioGPS portals (see URLs).

Supplementary Figure 6 Expression of mouse Crebrf relative to key adipogenic genes during adipocyte differentiation.

3T3-L1 cells were treated with a hormonal differentiation cocktail at 2 d after confluence (day 0, D0), and RNA samples were collected at the indicated time points. mRNA expression relative to the β-actin (Actb) reference gene was determined using quantitative RT–PCR, with day 0 expression values set at 1. Values are given as means ± s.e.m. (n = 8). A representative of five independent experiments is shown.
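One common way to express quantitative RT–PCR data relative to a reference gene with a calibrator sample set at 1 is the 2^(−ΔΔCt) method. The caption does not state which calculation was used, so the sketch below is an assumption; it further assumes roughly 100% amplification efficiency, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_day0, ct_reference_day0):
    """Fold change versus day 0 by the 2^-ddCt method (ASSUMED, not from the caption).

    ct_* are threshold-cycle values; the reference gene would be Actb here.
    """
    delta_ct = ct_target - ct_reference                 # normalize to reference gene
    delta_ct_day0 = ct_target_day0 - ct_reference_day0  # same for the day-0 calibrator
    return 2.0 ** -(delta_ct - delta_ct_day0)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier than
# at day 0 while the reference gene is unchanged, giving a 4-fold increase.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

By construction, the day-0 sample compared with itself returns exactly 1.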

Supplementary Figure 7 Bioenergetic profile changes during adipocyte differentiation.

3T3-L1 cells were treated with a hormonal differentiation cocktail at 2 d after confluence (day 0, D0), and key bioenergetic variables were determined on the basis of oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) measurements normalized to protein content. Values are given as means ± s.e.m. (n = 6). *P < 0.01 compared to day 0 (two-tailed t test with unequal variances). As the results were consistent with previously published data (refs. 24,25), the experiment was performed once.
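A t test with unequal variances is usually Welch's t test. The sketch below computes the Welch statistic and its Welch–Satterthwaite degrees of freedom using only the standard library; the function name and sample values are hypothetical, and converting the statistic to a two-tailed P value would additionally need a t-distribution CDF from a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical measurements (n = 6 per group), for illustration only.
day0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
day8 = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_stat, dof = welch_t(day0, day8)
print(f"t = {t_stat:.4f}, df = {dof:.4f}")
```

When the two groups have equal variances and sizes, the degrees of freedom reduce to the familiar 2(n − 1).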

Supplementary Figure 8 iHS and nSL scores in an 800-kb region centered on the missense variant rs373863828 (n = 626 non-closely related Samoans).

(a) iHS scores versus physical position. (b) nSL scores versus physical position. In both a and b, the blue dot indicates the score at the missense variant rs373863828 and the yellow dot indicates the score at the discovery variant rs12513649; the dotted horizontal line indicates the score at the missense variant rs373863828.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1–3 and Supplementary Note. (PDF 1946 kb)

Supplementary Video 1 Principal-components analyses.

A rotating animation of a scatterplot of the first three principal components from the principal-components analysis of the Samoan and HapMap phase 3 populations. Continental population abbreviations: SAM, Samoans (n = 250); EUR, Europeans (n = 253); AFR, Africans (n = 511); EAS, East Asians (n = 255); SAS, South Asians (n = 88); AMR, admixed Americans (n = 77). (MOV 790 kb)

Cite this article

Minster, R., Hawley, N., Su, C. et al. A thrifty variant in CREBRF strongly influences body mass index in Samoans. Nat Genet 48, 1049–1054 (2016). https://doi.org/10.1038/ng.3620
