Introduction

Body mass index (BMI) is the most commonly used anthropometric method to define human obesity. BMI is a complex trait affected by many environmental (eg, diet, age, physical activity) and genetic factors, with heritability estimates that vary from 40–80% in twin studies, 20–50% in family studies and 20–60% in adoption studies.1 Recent genome-wide association (GWA) studies have successfully identified numerous single-nucleotide polymorphisms (SNPs) that are robustly associated with obesity-related traits, including BMI.2, 3, 4 They shed light on the biological basis of obesity and suggest a role for neuronal influences on the regulation of appetite and/or energy balance. However, the identified genetic variants jointly explained only a small proportion of the trait variation and thus had limited predictive value for obesity risk.5 For example, in a recent meta-analysis (249 796 individuals), 32 identified and replicated SNPs together explained only 1.45% of the inter-individual variation in BMI, with the strongest SNP accounting for just 0.34% of the variance.3 The 32 BMI SNPs map to 32 different genes that are referred to as BMI loci hereafter.

Gene–gene interactions (epistasis) are thought to be potential sources of the unexplained genetic variation,6, 7, 8 but they remain largely unexplored in the GWA studies conducted for BMI so far. A major hurdle for analysing epistasis in GWA studies was the lack of fast methods to enumerate billions of interaction tests in a full pair-wise genome scan to map different types of epistasis (eg, with or without main effects) while keeping false-positive rates under control.9, 10 Another hurdle for studying epistasis is the relatively small sample size in many existing GWA cohorts, which may limit the power of detection and replication of epistasis signals unless the epistatic effects to be detected are large.11, 12 Simulation showed that more than 4000 case–control pairs were needed to achieve 80% power of detection of epistasis with an odds ratio of 3.0 in complex diseases.13 For quantitative traits, sample sizes need to be substantially (eg, 45%) larger than for case–control phenotypes to achieve a similar power.14

With the advances in computing technologies, the major hurdle is gradually easing and full pair-wise genome scans are beginning to be applied to GWA populations individually.15, 16 Meta-analysis of epistasis as applied in GWA studies3 could be a good way to overcome the sample size hurdle but requires new methods to accommodate imputed SNP genotype data. Various approaches in search space reduction (ie, less stringent significance thresholds as a result of the much reduced number of tests) can be applied to improve the power of detection of epistasis in individual GWA populations.11 Testing interactions involving genome-wide significant loci (of marginal effects) with a threshold corrected for the actual number of tests has been suggested10, 17, 18 and applied successfully in recent studies.16, 19, 20, 21 Another approach is to select SNPs based on existing biological knowledge (eg, protein–protein interactions) and test interactions among them only.22, 23 However, caution should be taken when making the SNP selection12 because biological knowledge may not be directly related to the trait studied and any biases in the pre-identified loci could lead to false-positive epistatic signals.

Here we demonstrate a different approach to exploit the value of genome-wide analysis of epistasis using multiple populations. First, we performed full pair-wise genome scans for epistasis in BMI in four GWA populations to which we had direct access: the Scottish ORCADES,24 the CROATIA-Vis25 and CROATIA-Korcula,26 and the Italian MICROS27 study cohorts. Each of these cohorts has a relatively small sample size, and the cohorts were sampled from distinct European regions with widely differing lifestyles and diets. Second, we identified common and potentially important gene–gene interactions using the epistasis signals uncovered in each cohort and their gene ontology (GO) enrichment across populations. In addition, we also identified a set of interactions involving the BMI loci (as prior knowledge) in different cohorts. Third, we tested the identified interactions in each cohort for replication and then tested the replicated signals in the Northern Finland Birth Cohort 1966 (NFBC1966).28 We aim to address the question of whether epistasis analysis is of value for the dissection of the genetic regulation of BMI in these study cohorts.

Materials and methods

Study cohorts and ethics statement

The four study cohorts have been described in detail elsewhere.24, 25, 26, 27, 29 Briefly, the Scottish ORCADES cohort was recruited from a subgroup of 10 islands of the archipelago of Orkney. This study was approved by the NHS Orkney Research Ethics Committee and the North of Scotland REC. The CROATIA-Vis and CROATIA-Korcula cohorts were recruited from the island of Vis and the island of Korcula, respectively. Both studies were approved by the Ethical Committee of the Medical School, University of Zagreb and the Multi-Centre Research Ethics Committee for Scotland. The Italian MICROS cohort was recruited from villages in an isolated highland area of the South Tyrol. The study was approved by the ethical committee of the Autonomous Province of Bolzano. All participants gave written informed consent and were measured for a number of traits, including weight and height from which BMI values were calculated.

DNA samples were genotyped with Illumina Infinium HumanHap300v1/v2 (for CROATIA-Vis by the Wellcome Trust Clinical Facility in Edinburgh, UK) or HumanCNV370v1 SNP bead microarrays (for CROATIA-Korcula, ORCADES and MICROS by the Helmholtz Zentrum München in Munich, Germany) and analysed using the BeadStudio software (Illumina). Quality control of the genotype data was performed for each cohort using the R/GenABEL package (Version 1.6-7)30 based on a common set of criteria: individual call rate at 97%, SNP call rate at 95%, P-value for deviation from Hardy–Weinberg equilibrium at 1.0e-10, minor allele frequency at 2%. The NFBC1966 data were provided by the database of Genotype and Phenotype (dbGaP) via specific Data Use Certification and used as the replication cohort. NFBC1966 includes nearly all individuals born in 1966 in the two northernmost Finnish provinces; they were genotyped with HumanCNV370v1 SNP bead microarrays28 and the data were put through the same quality control procedure as above. The summary information of each cohort after quality control and excluding individuals without BMI or age records or with extremely high BMI (ie, BMI>50 kg/m2) is given in Table 1.

Table 1 Summary information of uncorrected BMI (kg/m2)

Statistical analysis

The raw BMI data in each of the four study cohorts were corrected for age and sex and normalised using the rntransform function implemented in the GenABEL package, which performs quantile normalisation of residuals from a generalised linear model analysis. The normalised BMI residuals were then analysed using a linear mixed model to correct for polygenic effects due to relatedness using the polygenic function in the GenABEL package, and the resultant environmental residuals (ie, pgresidualY in GenABEL) were used as the trait to test for association.31 The polygenic heritability was estimated at the mixed-model step. Following the original GWA study,28 in the NFBC1966 cohort individuals with pregnancy and/or self-reported BMI measures were excluded, and the raw BMI values were corrected for the SexOCPG factor (recoded according to gender, status of taking oral contraception and pregnancy) and then normalised and corrected for relatedness as above.
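The normalisation step can be illustrated with a minimal sketch of a rank-based inverse normal transformation. This is not the GenABEL code: the (rank − 0.5)/n offset, the function name rank_inverse_normal and the example residuals are assumptions made for illustration, and the exact formula used by rntransform may differ.

```python
from statistics import NormalDist

def rank_inverse_normal(values):
    """Map values to standard-normal quantiles by rank (ties share the average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the tied run
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    normal = NormalDist()
    # Assumed offset: (rank - 0.5) / n keeps every probability strictly inside (0, 1).
    return [normal.inv_cdf((r - 0.5) / n) for r in ranks]

# Hypothetical age- and sex-adjusted BMI residuals (illustrative values only).
residuals = [1.8, -0.4, 0.3, -2.1, 0.9]
print(rank_inverse_normal(residuals))
```

The transformed values keep the rank order of the residuals but follow an approximately standard normal distribution, which is what the downstream mixed-model step assumes.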

A single-SNP based GWA scan was performed in each population using a score test method (based on the additive model) implemented in the mmscore function in the GenABEL package. The consensus GWA threshold of 7.3 (−log10(5.0E−08)) was applied to identify GWA significant SNPs.32 We also performed a full pair-wise genome scan using the regression models described below. Considering a pair of SNPs denoted as SNP1 and SNP2, the following genetic models were used to detect epistasis where genotypes of each SNP (ie, homozygote of the minor allele, homozygote of the major allele and heterozygote) were fitted as fixed factors:

Model 1: y = μ + SNP1 + SNP2 + SNP1*SNP2 + e
Model 2: y = μ + SNP1 + SNP2 + e
Model 3: y = μ + e

where y is the trait of interest, μ is the model constant, SNP1 (or SNP2) is a fixed factor with three levels (genotype classes), SNP1*SNP2 is the interaction term, and e is the random error term. The F ratio test of Model 1 against Model 3 evaluates the whole pair effect, including interaction (ie, Fpair, 8 degrees of freedom). The F ratio test of Model 1 against Model 2 evaluates the interaction between the two SNPs (ie, Fint, 4 degrees of freedom). P-values were calculated based on the F distribution with relevant degrees of freedom and transformed to the −log10 scale (ie, −log10Ppair for the Fpair test, −log10Pint for the Fint test). We were mainly concerned with the Fint tests in this study.
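The Fint test can be sketched as follows. The sketch takes Model 1 to be the full two-SNP model with interaction (nine genotype-combination means) and Model 2 to be the two-SNP model without interaction, as the stated degrees of freedom imply. It assumes genotypes coded 0/1/2 and all nine genotype combinations observed; the function names and the toy data are hypothetical, and this is not the software used in the study. The P-value would then be taken from the upper tail of the F distribution with the returned degrees of freedom.

```python
from collections import defaultdict
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def rss_additive(g1, g2, y):
    """Residual sum of squares of the no-interaction model (two 3-level factors)."""
    X = [[1.0, float(a == 1), float(a == 2), float(b == 1), float(b == 2)]
         for a, b in zip(g1, g2)]
    p = 5
    XtX = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
    Xty = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def rss_full(g1, g2, y):
    """Residual sum of squares of the full model: one mean per genotype combination."""
    cells = defaultdict(list)
    for a, b, yi in zip(g1, g2, y):
        cells[(a, b)].append(yi)
    rss = 0.0
    for vals in cells.values():
        m = sum(vals) / len(vals)
        rss += sum((v - m) ** 2 for v in vals)
    return rss

def f_interaction(g1, g2, y):
    """Return (Fint, numerator df, denominator df); assumes all 9 cells are observed."""
    rss1 = rss_full(g1, g2, y)
    rss2 = rss_additive(g1, g2, y)
    df_num, df_den = 4, len(y) - 9
    return ((rss2 - rss1) / df_num) / (rss1 / df_den), df_num, df_den

# Toy data: two individuals per genotype combination, with an extra effect in one cell.
g1, g2, y = [], [], []
for a, b in product(range(3), repeat=2):
    for noise in (-1.0, 1.0):
        g1.append(a)
        g2.append(b)
        y.append(a + 0.5 * b + (9.0 if (a, b) == (0, 0) else 0.0) + noise)
f_int, df1, df2 = f_interaction(g1, g2, y)
print(f_int, df1, df2)
```

Fitting genotypes as factors rather than allele counts lets the test pick up non-additive interaction patterns, at the cost of four degrees of freedom per pair.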

Genome-wide significance thresholds (all in the −log10 scale) were derived based on Bonferroni correction for multiple tests, that is, the 5% nominal P value corrected by the number of tests performed. Considering 300 000 SNPs, a full pair-wise genome scan performs 4.5E+10 association tests and thus the 5% genome-wide threshold is 11.95 (ie, −log10(0.05/4.5E+10)). After each pair-wise genome scan, results were evaluated using the predefined threshold to identify genome-wide significant interaction signals. Each SNP in the results was annotated to the nearest gene within a window of 20 kilobases flanking the SNP (based on the physical distance to either the start or end of transcription of a gene; the distance is considered as zero if the SNP is within a gene).
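The threshold arithmetic above can be checked with a few lines. The function name bonferroni_threshold is hypothetical; the SNP count and the 5% level are those given in the text.

```python
import math

def bonferroni_threshold(n_tests, alpha=0.05):
    """Bonferroni-corrected significance threshold on the -log10(P) scale."""
    return -math.log10(alpha / n_tests)

n_snps = 300_000
n_pairs = n_snps * (n_snps - 1) // 2  # number of unordered SNP pairs, about 4.5E+10

print(n_pairs)
print(round(bonferroni_threshold(n_pairs), 2))  # pair-wise genome scan threshold
print(round(-math.log10(5.0e-08), 1))           # consensus single-SNP GWA threshold
```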

A GO enrichment analysis was conducted for each study cohort using the running mode of ‘Two unranked lists of genes’ in GOrilla33 based on the standard hypergeometric statistics, where the annotated epistatic genes were used as the target with the full list of human genes as the background. For simplicity, we chose to use the same −log10P value as the consensus GWA threshold (ie, −log10Pint >7.3) to select SNP pairs of each cohort and used their gene annotations as the input for the GO enrichment analysis. The enriched GO terms (P<1.0E−03) were compared across study cohorts to identify first the common GO terms and then their member genes shared by the cohorts. The shared epistatic genes were examined further for biological functions via literature mining, and their associated interactions in the retained results of each cohort were examined to identify potentially important interactions for replication tests. SNP pairs involving the BMI loci (−log10Pint >7.3) in each study cohort were also identified as potentially important interaction signals for replication tests.
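The hypergeometric enrichment P-value for a single GO term can be sketched as the upper tail of a hypergeometric distribution. The symbols N, B, n and b and the function name are assumptions chosen for illustration, and the numbers in the example are invented; this is not the GOrilla implementation.

```python
from math import comb

def hypergeom_enrichment_p(N, B, n, b):
    """P(X >= b): chance of seeing at least b term genes in a target list of n genes,
    drawn from a background of N genes of which B carry the GO term."""
    upper = min(n, B)
    total = comb(N, n)
    favourable = sum(comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1))
    return favourable / total

# Invented example: 20 000 background genes, 150 annotated to the term,
# 300 epistatic genes in the target list, 9 of which are annotated to the term.
p = hypergeom_enrichment_p(20_000, 150, 300, 9)
print(p, p < 1.0e-03)
```

Exact integer binomial coefficients avoid the rounding problems that floating-point factorials would cause at genome scale.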

Genome-wide significant SNP pairs and those identified as potentially important interactions were tested for replication across the four study cohorts. The replicated SNP pairs were further tested for replication in the NFBC1966 cohort. Each replication test was done at both the SNP and region levels. At the SNP level, each replicated SNP was exactly the same as the corresponding epistatic SNP and thus the 5% nominal significance threshold (ie, −log10(0.05)=1.30) was used because only one replication test was needed. At the region level, interactions between each of 10 adjacent SNPs (ie, five upstream and five downstream) of the first epistatic SNP and each of those of the second were tested, to accommodate the situation where multiple SNPs may tag the same mutation of a gene. Permutation was used to derive significance thresholds for replication of each epistatic pair at the region level, where phenotypes were permuted and the highest −log10Pint value of 121 (ie, 11 × 11) interaction tests was recorded in each of 1000 iterations. The replicated SNP pairs were fitted together into the full model to calculate the proportion of phenotypic variance explained in each study cohort.
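The region-level permutation procedure can be sketched generically: permute the phenotype, record the highest statistic over the 121 tests, and take an upper quantile of the recorded maxima as the threshold. In this sketch stat_fn stands in for the −log10Pint computation, toy_stat is a placeholder statistic rather than an F test, the genotypes and phenotypes are simulated, and the demonstration uses 200 iterations for speed where the study used 1000. All names are hypothetical.

```python
import random
from itertools import product

def permutation_threshold(y, tests, stat_fn, n_perm=1000, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the per-permutation maximum statistic."""
    rng = random.Random(seed)
    y_perm = list(y)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break the genotype-phenotype link
        maxima.append(max(stat_fn(y_perm, t) for t in tests))
    maxima.sort()
    idx = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[idx]

# Simulated data: 11 SNPs around each epistatic SNP, 40 individuals.
rng = random.Random(1)
n_ind = 40
region1 = [[rng.randint(0, 2) for _ in range(n_ind)] for _ in range(11)]
region2 = [[rng.randint(0, 2) for _ in range(n_ind)] for _ in range(11)]
y = [rng.gauss(0.0, 1.0) for _ in range(n_ind)]
tests = list(product(range(11), repeat=2))  # 11 x 11 = 121 SNP pairs

def toy_stat(y_perm, pair):
    """Placeholder statistic: absolute covariance of the trait with the genotype product."""
    a, b = pair
    prod = [u * v for u, v in zip(region1[a], region2[b])]
    mean_p = sum(prod) / n_ind
    mean_y = sum(y_perm) / n_ind
    return abs(sum((p - mean_p) * (v - mean_y) for p, v in zip(prod, y_perm)))

threshold_5 = permutation_threshold(y, tests, toy_stat, n_perm=200, alpha=0.05)
print(len(tests), threshold_5)
```

Taking the maximum over all 121 tests within each permutation accounts for the correlation between neighbouring SNPs, which a Bonferroni correction for 121 tests would ignore. Setting alpha=0.2 gives the more lenient 20% threshold.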

Results

The mean BMI was similar across the CROATIA-Vis, CROATIA-Korcula and ORCADES cohorts but lower in MICROS (Table 1). The polygenic heritability estimates varied from 0.356 (CROATIA-Vis) to 0.514 (ORCADES). Conventional GWA scans found no genome-wide significant SNPs in any single cohort. The inflation factor lambda (computed by regression of observed association P-values against the expected) of each GWA scan was very close to 1 (Table 1), suggesting the family relatedness in each cohort was well accounted for. Only 8 out of the 32 BMI SNPs previously identified3 were genotyped in the four study cohorts and none of these showed a strong association with BMI (Supplementary Table S1).
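One common way to estimate the inflation factor by regression is to regress the sorted observed 1-df chi-square statistics on their expected quantiles through the origin and take the slope as lambda. The sketch below follows that idea; the plotting positions (i − 0.5)/n and the function names are assumptions, and it is not confirmed to match the exact procedure used for Table 1.

```python
from statistics import NormalDist

_normal = NormalDist()

def chi2_from_p(p):
    """1-df chi-square statistic for a two-sided P-value (requires 0 < p <= 1)."""
    return _normal.inv_cdf(p / 2.0) ** 2

def chi2_1df_quantile(q):
    """Quantile of the 1-df chi-square distribution at cumulative probability q."""
    return _normal.inv_cdf((1.0 + q) / 2.0) ** 2

def estimate_lambda(pvalues):
    """Slope of observed on expected chi-square quantiles, regression through the origin."""
    n = len(pvalues)
    observed = sorted(chi2_from_p(p) for p in pvalues)
    expected = [chi2_1df_quantile((i - 0.5) / n) for i in range(1, n + 1)]
    return sum(o * e for o, e in zip(observed, expected)) / sum(e * e for e in expected)

# P-values placed exactly on the uniform expectation give lambda close to 1.
n = 1000
uniform_p = [1.0 - (i - 0.5) / n for i in range(1, n + 1)]
print(estimate_lambda(uniform_p))
```

A lambda well above 1 would indicate residual inflation, for example from relatedness that the mixed model did not absorb.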

Full pair-wise genome scans found no SNP pairs that passed the genome-wide threshold (−log10Pint=11.95) in any of the four study cohorts (Figure 1). Considering interaction signals with −log10Pint >7.3, MICROS had the fewest SNP pairs and consequently the fewest annotated genes, whereas the remaining three cohorts had relatively similar numbers of SNP pairs and annotated genes (Table 2). Five out of the 32 BMI loci (but not the BMI SNPs) were involved in 7 epistatic pairs in CROATIA-Vis: FTO, KCTD15, LRP1B, NEGR1 and PRKD1. Similarly, three BMI loci (NEGR1, NRXN3 and PRKD1) were involved in CROATIA-Korcula, two (FTO and MTCH2) in ORCADES and two (FTO and LRP1B) in MICROS.

Figure 1 Pair-wise epistatic signals in each study cohort. (a) Pairwise epistatic signals in CROATIA-Vis. (b) Pairwise epistatic signals in CROATIA-Korcula. (c) Pairwise epistatic signals in ORCADES. (d) Pairwise epistatic signals in MICROS.

Table 2 Number of SNP pairs and genes annotated at −log10Pint>7.3 in each study cohort

GO terms enriched by epistatic genes (−log10Pint >7.3) in each cohort were compared (Supplementary Table S2), and 9 were found to be common to all four cohorts (Table 3), which might indicate common regulation mechanisms (eg, GO:0008038 – neuron recognition). Among the epistatic genes that enriched the 9 GO terms, we found 19 epistatic genes shared by the four cohorts, of which 15 are previously published GWA loci (mostly not genome-wide significant) associated with various phenotypes34 (Supplementary Table S3). Most of the 19 shared epistatic genes interacted with one another, although their interactions were relatively weak in general (−log10Pint<7.3, Supplementary Table S4). They include CDH13 (cadherin 13), associated with height,35 and SORCS2 (sortilin-related VPS10 domain containing receptor 2), associated with circulating insulin-like growth factor 1 and insulin-like growth factor-binding protein-3, which are important for anthropometric traits and risk of cancer and cardiovascular disease.36

Table 3 Common GO terms enriched by epistatic genes in CROATIA-Vis (Vis), CROATIA-Korcula (Korcula), ORCADES and MICROS

We further tested replication of the SNP pairs involving either BMI loci (19, Table 2) or two shared epistatic genes across the study cohorts (50, Supplementary Table S4). Although none of the 69 SNP pairs was genome-wide significant, eight of them replicated in one or more cohorts at the SNP level (ie, −log10Pint>1.30; Table 4). The best replicated pairs at the SNP level were rs2202167 (NRXN3) - rs11150880 (−log10Pint was 8.19, 1.68 and 1.43 in CROATIA-Korcula, CROATIA-Vis and ORCADES, respectively) and rs1474056 (MTCH2) - rs7250947 (PLIN4) (−log10Pint was 8.08 in ORCADES and 2.44 in CROATIA-Korcula). The rs11150880 SNP is near the RPH3AL gene, which is known to have a key role in insulin secretion by pancreatic cells.37 The PLIN4 gene may be important for intracellular and neutral lipid storage droplets.38 The eight replicated SNP pairs together explained 4, 4, 2 and 0.5% of the phenotypic variance of BMI in CROATIA-Vis, CROATIA-Korcula, ORCADES and MICROS, respectively. By testing replication at the region level, we found that the pair rs9858278 (NAALADL2) - rs7198915 (CDH13) replicated in CROATIA-Vis, CROATIA-Korcula and MICROS (exceeding the 5% thresholds, Table 4 and Supplementary Table S5). Further testing of the nine replicated SNP pairs in the NFBC1966 cohort found that none replicated at either the SNP or region level. However, seven out of the nine pairs had −log10Pint>2, of which three exceeded the 20% thresholds (Table 4 and Supplementary Table S5).

Table 4 Replicated interactions involving either the BMI loci or two shared epistatic genes across cohorts

Discussion

Gene–gene interactions have been suggested as sources of the hidden genetic variation in GWA studies,6, 7 but the extent of their role in this regard has yet to be demonstrated. One big challenge is that the sample sizes of many GWA data sets are relatively small (eg, less than 4000 individuals) and hence the power to detect epistasis could be low.8, 13 Therefore, studying epistasis in a single GWA population is unlikely to be fruitful. This is certainly true in our case, where exhaustive genome scans in the four study cohorts found no genome-wide significant epistasis associated with BMI. We suggest tackling the challenge by looking for common (thus potentially important) gene–gene interactions from sub genome-wide significant epistatic signals (−log10Pint>7.3) in multiple GWA populations. We showed that GO enrichment analysis could be used to identify common GO terms (ie, gene function groups) enriched by the epistatic signals in the four study cohorts, from which 19 shared epistatic genes were identified. Most of the 19 shared epistatic genes are previously identified GWA loci associated with phenotypes other than BMI, and they interacted with one another. Their interactions were considered potentially important because they belong to one or multiple commonly enriched GO terms. Interactions involving at least one of the 32 BMI loci with −log10Pint>7.3 were also considered potentially important, assuming the BMI loci are likely interactive.

Being aware of possible noise in those potentially important interactions, we used replication to identify the most reliable epistatic signals across the study cohorts. Eight epistatic pairs involving either the BMI loci or two shared epistatic genes showed replication at the SNP level in at least one cohort (Table 4). The eight epistatic pairs together could indeed explain a considerable proportion of the BMI variation in each individual cohort. Nevertheless, caution is recommended in light of the potential overestimation of the effects due to the ‘winner's curse’.39 Besides, none of the eight epistatic pairs was replicated in all four study cohorts, or in the replication cohort NFBC1966. Further replication tests in other populations and/or functional assays would be useful to confirm whether they are true signals.

Statistical replication has been used as the gold standard to prevent reporting false positives in GWA studies. This, however, appears to be far more difficult for epistatic signals than for single SNP signals for reasons including power, minor allele frequency change, and linkage disequilibrium between the epistatic SNP and the mutant at both loci.16 The moderate −log10Pint values of the epistatic pairs tested for replication suggest that the linkage disequilibrium between epistatic SNPs and mutants is not high, so replication of these pairs will be difficult. Furthermore, different environments may cause different phenotype distributions in the discovery and replication cohorts. The lack of replication in the NFBC1966 cohort could be due to two important environmental factors of BMI: age40 (ie, 31 vs a range between 18 and 90 in the study cohorts) and diets.29

The approach based on common gene–gene interactions in multiple GWA populations is an effective solution to the issue of limited power of detection of epistasis. It is only a partial solution, though, because some ignored interactions may be important as well. Comparison of sub genome-wide significant epistatic signals across multiple populations can be made at the SNP, gene or pathway level and seems more fruitful at the gene or pathway level than at the SNP level. The approach may become more useful if better annotation methods (currently considering only GWA signals without interactions)41 can be adapted to epistasis. For example, not all epistatic SNPs were annotated to genes in the study, and those that were not did not contribute to the enrichment analysis. The approach will likely remain important even once new tools for meta-analysis of epistasis in GWA data sets become available to increase power for detection of epistasis.