Introduction

Common variants in the FTO gene have repeatedly been shown to be associated with body mass index (BMI) and fat mass in populations of European descent1, 2, 3 and recently also in Asian populations.4, 5, 6, 7 FTO is therefore considered to be the first replicated signal at which common variants contribute to common obesity,8 although details of the molecular and physiological processes involved remain unclear.

On the basis of the available genome-wide and fine-mapping association data, the associated SNPs (and therefore, by implication, the still unidentified causal variant) map to a 47-kb region, bounded by flanking recombination hotspots, mainly in intron 1. It has been shown that the FTO protein shares sequence motifs with Fe(II)- and 2-oxoglutarate-dependent oxygenases.9 FTO mRNA is widely expressed in all tissues but particularly highly in hypothalamic nuclei generally implicated in energy balance and feeding behaviour.10 In mice, levels of FTO in the arcuate nucleus are regulated by feeding and fasting.9, 11 Despite this accumulating evidence, it remains to be established that FTO is the causal gene. For example, the similarity of expression profiles between FTO and the adjacent gene RPGRIP1L (KIAA1005, FTM) may indicate joint transcriptional regulation.12 The product of RPGRIP1L is involved in ciliary function.13, 14 Although this raises parallels with the Bardet–Biedl obesity syndrome12, 15, 16 (which is known to result from mutations in genes with similar roles), severe loss-of-function mutations in FTM result in Joubert syndrome type B and Meckel syndrome, but do not cause serious disturbances of body weight.13 FTO variants seem to influence body weight by modulation of food intake rather than energy expenditure.17, 18, 19 Effects on glucose metabolism independent of obesity have yet to be shown.20, 21, 22

Identification of a wider spectrum of susceptibility alleles can be a valuable tool in efforts to relate DNA sequence variation to clinical phenotypes,23, 24 and to this end, studies of populations with a divergent mutational and/or demographic history (including those from a different ethnic background or defined population isolates) offer particular advantages. Analysis of such samples provides an opportunity to detect novel allelic or haplotypic associations and may offer valuable clues to additional, independent causal variants as well as helping to map earlier described variants.

We performed a genome-wide scan for traits related to the metabolic syndrome in the Sorbs, a self-contained population of Slavonic origin resident in Germany. Here, we focus the analyses on the effects of FTO variants on BMI in this population and report findings that suggest possible allelic heterogeneity across populations.

Materials and methods

Subjects and phenotyping

All subjects are part of a sample from an extensively phenotyped self-contained population from Eastern Germany, the Sorbs. The Sorbs are of Slavonic origin and have lived in ethnic isolation among the Germanic majority for the past 1100 years. Today, the Sorbian-speaking, Catholic minority comprises 15 000 full-blooded Sorbs resident in about 10 villages in rural Upper Lusatia (Oberlausitz), Eastern Saxony. The historical evidence establishing the Sorbs as a genetic isolate is substantiated by extended homozygosity and differences in allele frequency distribution compared with an outbred population.

At present, about 1000 Sorbian individuals are enrolled in the study. Sampling comprised unrelated subjects as well as families. Mean IBD sharing in the pairwise comparison was 0.008, median <10⁻⁶ (25th percentile: <10⁻⁶, 75th percentile: 0.012). Extensive phenotyping included standardised questionnaires for past medical history and family history, collection of anthropometric data (weight, height, waist-to-hip ratio (WHR), body impedance analysis (BIA)) and a 75-g oral glucose tolerance test (OGTT). BIA was performed with BIA-2000-S (Data Input GmbH, Darmstadt, Germany) and evaluated with the software Nutri3 (Data Input GmbH). A total of 948 subjects were available for this study, which was approved by the ethics committee of the University of Leipzig. All subjects gave written informed consent before taking part in the study.

For comparison of allele and haplotype frequencies of the FTO region, we used the WTCCC T2D case set.25, 26 For the evaluation of linkage disequilibrium (LD) patterns, we selected 427 subjects with IBD sharing <0.15 from the Sorbs sample and compared them with 60 CEU HapMap samples.27

DNA extraction, genotyping and SNP selection

Genomic DNA was extracted using the QIAamp DNA Blood Midi Kit (Qiagen Inc., Valencia, CA, USA) according to the manufacturer's protocol. Genotyping was performed using the 500 K Affymetrix GeneChip and the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix Inc., Santa Clara, CA, USA) by the Microarray Core Facility of the Interdisciplinary Centre for Clinical Research, University of Leipzig, Germany, and by ATLAS Biolabs GmbH, Berlin, Germany. Genotypes were determined with GeneChip Genotyping Analysis Software (GTYPE, Affymetrix Inc.) using the BRLMM algorithm for the 500 K arrays and the Birdseed algorithm for the Genome-Wide Human SNP Array 6.0 (Affymetrix Inc.). Only arrays with call rates >93% were selected for further analyses, and only SNPs fulfilling the following criteria were included: missing rate per SNP <5%, Hardy–Weinberg equilibrium (HWE) P>0.0001 and minor allele frequency (MAF) >0.01. The average genotyping rate was 99%. In all, 390619 (380408 autosomal, 10211 X-chromosomal) markers overlapping between the 500 K Affymetrix GeneChip and the Affymetrix Genome-Wide Human SNP Array 6.0 were included in the analyses. All alleles were standardised to the forward strand.
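The per-SNP inclusion criteria above can be illustrated with a minimal sketch. It works on genotype counts for a single SNP and uses a 1-df chi-square test for HWE; this is an assumption made for brevity and may differ from the test implemented in PLINK. The function names and the count-based input format are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Approximate HWE P-value (chi-square, 1 df) from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              max_missing=0.05, min_hwe_p=1e-4, min_maf=0.01):
    """Apply the thresholds quoted in the text to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    missing_rate = n_missing / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    return (missing_rate < max_missing
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) > min_hwe_p
            and maf > min_maf)


# Illustrative counts only, not taken from the study
print(passes_qc(250, 500, 250, 0))
```

The thresholds are exposed as keyword arguments so that the same function can be reused with other cut-offs.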

Statistical methods and software

Call rates per sample and cluster-plots were checked using the GeneChip Genotyping Analysis Software (GTYPE). The calculation of minor allele frequencies, HWE, and missing rates per SNP was performed with PLINK.28 Pairwise IBD estimation was performed with the --genome command in PLINK. To perform multidimensional scaling analysis on the N × N matrix of genome-wide IBS pairwise distances, we used the --mds-plot option in conjunction with --cluster.28 BMI and fat mass were non-normally distributed and were therefore ln-transformed before analysis. Genome-wide association was assessed by linear regression in PLINK. We corrected for age and gender and, furthermore, for cryptic relatedness and possible population substructure by using genomic control (λ=1.16 for BMI and 1.18 for fat mass) and by using the first four vectors of multidimensional scaling as covariates. We examined the distribution of test statistics and the deviation from the expected distribution under the null hypothesis of no association in a quantile–quantile plot (Figure 1). Out of the 390619 SNPs that passed QC, 22 showed evidence of association with BMI and fat mass at a significance level of P<0.0001 for both traits simultaneously in the Sorbian population (Table 4). Furthermore, we assessed the associations of FTO-SNPs with BMI in a subset of subjects with IBD sharing <0.15 and in a subset of non-diabetic individuals (according to the current ADA criteria29).
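For illustration, the genomic control correction can be sketched as follows. The sketch assumes that each SNP's test statistic is available on the 1-df chi-square scale (for a regression t-statistic with many degrees of freedom, approximately its square) and uses the common median-based estimator of λ; whether this matches the exact PLINK computation is not claimed. Function names are hypothetical.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Median-based estimate of the genomic inflation factor lambda."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_corrected_pvalue(chi2, lam):
    """P-value after dividing the statistic by lambda (never by less than 1)."""
    lam = max(lam, 1.0)  # common convention: do not inflate when lambda < 1
    return math.erfc(math.sqrt(chi2 / lam / 2.0))


# With the lambda reported for BMI, a statistic of 10 becomes less significant
print(gc_corrected_pvalue(10.0, 1.0), gc_corrected_pvalue(10.0, 1.16))
```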

Figure 1

QQ-plots for association with BMI and fat mass in the Sorbian population. rs17818902 represents the position of the SNP with the strongest association with each trait.
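As a sketch of how the coordinates of such a quantile–quantile plot are usually obtained, the observed P-values are sorted and paired with the quantiles expected under the null hypothesis of uniformly distributed P-values. The plotting position (rank − 0.5)/n is an assumption of this example; other conventions exist. The function name is hypothetical.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) pairs on the -log10 scale, smallest P first."""
    observed = sorted(pvalues)
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected = (rank - 0.5) / n  # expected quantile under the null
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Illustrative P-values only, not taken from the study
for expected, observed in qq_points([0.5, 0.05, 0.005]):
    print(round(expected, 3), round(observed, 3))
```

Points lying above the diagonal indicate an excess of small P-values relative to the null expectation.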

To analyse haplotypic association with BMI, we applied the GENEBPM algorithm30 in a sliding window of five overlapping SNPs across the region flanking FTO. The method was originally developed to test for association between a binary disease phenotype and SNP haplotypes, but has been adapted here to model association with a quantitative trait in a linear regression framework. The strength of evidence in favour of the association of haplotypes within each window with BMI was assessed by means of Bayes factor. For a direct comparison of the Bayes factors of the single point and haplotype-based analyses, we also applied the GENEBPM algorithm to each SNP alone in a single-locus analysis of the region.

Conditional analyses were conducted in a similar manner with inclusion of the conditioned SNP as an additional covariate. Linkage disequilibrium metrics were calculated in Haploview 4.1.31 Power calculations were performed in Quanto 1.1.32, 33 Given the effect size of rs8050136 in the WTCCC T2D case set (β=0.012), we would have a power of 30% to detect an effect on BMI (additive model) at an α of 0.05 in the whole Sorbian sample set. Allowing for the more stringent genome-wide α of 5 × 10⁻⁸, we would need a sample size of about 17 800 subjects to achieve a power of 80% to detect an effect of rs8050136 on BMI under the additive model.
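The relationship between the two power figures can be checked approximately with a normal approximation. This is not the Quanto computation; it is a back-of-the-envelope sketch that assumes a 1-df test whose noncentrality grows linearly with sample size and that ignores the negligible contribution of the opposite tail. Function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def ncp_from_power(power, alpha):
    """Noncentrality (chi-square scale) that yields `power` at two-sided `alpha`."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) ** 2


def required_sample_size(n_known, power_known, alpha_known,
                         power_target, alpha_target):
    """Scale a known (n, power, alpha) design to a new power and alpha."""
    ncp_per_subject = ncp_from_power(power_known, alpha_known) / n_known
    return ncp_from_power(power_target, alpha_target) / ncp_per_subject


# 30% power at alpha = 0.05 in 948 subjects, rescaled to 80% power at 5e-8
n_needed = required_sample_size(948, 0.30, 0.05, 0.80, 5e-8)
print(round(n_needed))
```

Under these assumptions the rescaled sample size is of the same order as the approximately 17 800 subjects quoted above; small differences are expected because the approximation does not use the allele frequency or trait variance that a full power calculation requires.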

Results

Genome-wide association with BMI and fat mass in 948 subjects

There were 27 SNPs available in introns 1, 2 and 3 of the FTO region in our data set. Seven of these SNPs gave modest evidence (P<0.01) of association with BMI within the earlier reported 47-kb region of introns 1 and 2 (Table 1, eg, rs1861868: P-value=0.009; rs8050136: P-value=0.003; rs17817288: P-value=0.007). Furthermore, in contrast to the earlier reported data, another eight SNPs that map downstream of this region in introns 2 and 3 showed moderate effects on BMI (Figure 2, Table 1, eg, rs10852522: P-value=0.002; rs17818902: P-value=0.0006; rs8053367: P-value=0.001). In addition, the SNPs in the region between 52.41 and 52.43 Mb had consistent effects on fat mass (eg, rs10852522: P-value=0.01; rs17818902: P-value=0.002; rs8053367: P-value=0.007), whereas the presented SNPs in intron 1 showed a trend that did not reach the significance level of 0.05 (Table 1). The independent effect of both signals was confirmed by the conditional analyses (Table 2), in which adjustment for SNPs in the other cluster resulted in modest attenuation but not obliteration of the signal. The effects on BMI also remained after exclusion of closely related subjects. In a subset of 427 individuals with IBD sharing <0.15, rs8050136 (P-value=0.0029) and rs17818902 (P-value=0.0089) likewise showed modest evidence of association with BMI. Furthermore, we evaluated the effect of the FTO-SNPs in a subset of 802 non-diabetic subjects (λ=1.14). Again, the results were consistent with the effects in the whole Sorbian sample (rs8050136: P-value=0.014 and rs17818902: P-value=0.0029).

Table 1 Association of SNPs in FTO-introns 1–3 with BMI and fat mass in the Sorbs
Figure 2

Associations of SNPs in FTO with BMI in the Sorbs. Positions are based on NCBI Build 35. Estimated recombination rates (HapMap27) reflect the local LD structure around the associated SNPs and their correlated proxies. The SNP with the lowest P-value for association with BMI (rs17818902) is shown as a black diamond; correlated SNPs are shaded according to their LD with this SNP on a scale from grey (r²=1) to white (r²=0), based on the pairwise r² values estimated in the Sorbian population.

Table 2 Conditioned analyses for rs8050136 and rs17818902

Effect sizes and directions of effects of the published FTO-SNPs in the Sorbian sample were consistent with the reported data in the other European populations. The observed allele frequencies and haplotype frequencies in the Sorbs, HapMap CEU sample and WTCCC T2D cases were similar (Table 3, Supplementary Tables 1 and 2).

Table 3 Haplotype frequencies in the Sorbs and the WTCCC T2D case sample

The strongest haplotypic association with BMI in the FTO/RPGRIP1L region, after adjustment for the effects of age, gender and four vectors of multidimensional scaling as covariates, was obtained for a window containing SNPs rs9972717, rs10852522, rs10521308, rs17818902, rs17818920. These SNPs cover a region of about 21 kb across parts of intron 2, exon 3 and intron 3. Visual inspection of the clustering of haplotypes within this window suggests that carriers of the combination of alleles G and C at SNPs rs10521308 and rs17818920, respectively, have the lowest mean BMI, and occur with relative frequency of 18% in the population (Table 3 and Supplementary Figure 1).

None of the SNPs in the region between 52.41 and 52.43 Mb was associated with BMI in 1913 subjects from the WTCCC T2D case sample set (Supplementary Table 1), nor was any association evident using haplotype-based analyses. In contrast, rs8050136 is not significantly associated with BMI in 1479 subjects of the British 58 Birth Cohort (P=0.21), but SNPs in intron 2/3 show a moderate trend with the same effect direction as in the Sorbs (rs17818902: P-value=0.046, β=−0.4246; rs17818920: P-value=0.035181, β=−0.4488; http://www.b58cgene.sgul.ac.uk/index.php). In the Diabetes Genetics Initiative (DGI) sample, the cluster of SNPs in the 47-kb region in intron 1/2 does not show any significant associations with BMI either in the T2D cases or in the control set (Supplementary Table 1). The SNPs in intron 2/3 show modest effects on BMI in the DGI control set, but in the opposite direction to that in the Sorbian sample (Supplementary Table 1).

Based on a subset of 427 Sorbian subjects with IBD sharing <0.15, LD between rs8050136 and the three representative tagging SNPs that show moderate evidence of association with BMI in the intron 2/3 region in the Sorbian dataset was weak (rs10852522: D′=0.046, r²=0.001; rs17818902: D′=0.192, r²=0.006; rs8053367: D′=0.16, r²=0.012), suggesting that the intron 1 and intron 3 effects are independent.
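For reference, the two LD metrics reported here follow from standard definitions. The sketch below computes both for a pair of biallelic SNPs, assuming that the haplotype frequency is already known; in practice, programs such as Haploview estimate it from unphased genotypes, a step not shown here. The function name and argument names are hypothetical.

```python
def ld_metrics(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B (both strictly in (0, 1))
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r2


# Illustrative frequencies only: complete D' can coexist with a low r^2
print(ld_metrics(0.1, 0.5, 0.1))
```

Because r² also depends on how well the allele frequencies match, a pair of SNPs can have D′ close to 1 and still a small r²; when both metrics are low, as for the SNP pairs above, the variants carry little information about each other.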

Discussion

Recently, a genome-wide association study for type 2 diabetes showed an association of common FTO variants with weight, BMI, fat mass and risk of obesity.2, 26 This finding has been replicated in several independent samples of European origin.1, 3 In line with these findings, this study also identifies evidence of association with the same cluster of SNPs in intron 1.

Beyond this replication, we have identified a novel association mapping to the region between 52.41 and 52.43 Mb, which spans parts of intron 2 and intron 3. The two signals seem to be unrelated and statistically independent. One possible explanation is that the intron 3 signal represents a second mutational event segregating in the Sorbian population; that is, we may have detected a novel causal variant on an existing haplotype in the Sorbian population. An alternative explanation is that the BMI effect attributable to the intron 3 SNPs is the result of interaction with environmental factors and/or genetic variants outside the FTO region that are specific to the Sorbian isolate.

Our findings to some extent echo those of a recent study in the Amish, who, like the Sorbs, are an isolate of European descent.34 A low-density genome-wide association scan in this population also showed evidence for FTO association with BMI. The strongest signals, however, were seen at SNPs in intron 1 (rs1861869, rs1861868 and rs147796), which were not in high LD with the recently published SNPs, and the effect was modified by physical exercise.

Table 4 SNPs associated with BMI and fat mass in the Sorbian population at P<0.0001

As the molecular mechanisms relating the earlier reported association signal in FTO with the functional disruption of nearby genes remain uncertain and the causal gene has not yet been unequivocally identified, the detection of additional association signals and independent causal variants may prove extremely useful. The study of population isolates provides a convenient tool for seeking out independent events that may reflect the specific mutational and/or demographic history of that population. At the same time, the limited size of many isolates imposes a burden in terms of the capacity to generate findings that reach stringent levels of genome-wide significance, and in terms of the intrinsic limitations with respect to seeking replication in samples from other populations.

Certainly, the additional associations we have detected in FTO do not, despite the analysis of all Sorbian samples for which we currently have DNA and phenotypic data, attain levels of significance that make them entirely convincing on the genome-wide scale.

Furthermore, some limitations of the study remain despite all quality control and consistency checks. Although genomic control is a robust and widely applicable tool to adjust for population structure in association studies, the correction has restrictions under certain settings.35, 36, 37 However, in this situation, the existing evidence implicating FTO variation in the regulation of weight, as well as the resulting substantial increase in the prior expectation that additional variants in the gene that affect function and/or expression will have similar phenotypic consequences, should be taken into account, as the genome-wide significance threshold may be too conservative.

So far, it remains unclear which factors contribute to the association observed in the Sorbs. The signal in intron 2/3 cannot be replicated in the WTCCC T2D cases or the DGI T2D cases. rs17818902 shows a moderate trend of association with BMI in the DGI control set but with an opposite effect direction. However, it should be stated that the earlier published SNP rs8050136 is not significantly associated with BMI in about 1500 subjects of the British 58 Birth Cohort (P-value=0.21), whereas the FTO-SNP with strongest evidence for an effect on BMI in the Sorbs shows a moderate trend with a consistent effect direction in the British 58 Birth Cohort (P-value=0.046 for rs17818902, http://www.b58cgene.sgul.ac.uk/index.php).

Viewed in this light, our findings (allied to those in the Amish) provide qualified support for the notion that a detailed examination of this gene region in a range of populations will show additional independent alleles implicated in body weight regulation. As the causal variants underlying these are identified, these should provide valuable clues to the molecular mechanisms by which sequence variation in this region affects clinical phenotypes. Our study should act as a stimulus to similar efforts in other ethnic groups. Such efforts should be directed towards a search for the full allelic spectrum of FTO regional variation using complementary approaches (indirect LD-mapping, deep resequencing, structural variant detection) designed to capture variants of all frequencies and types.