Introduction

Numerous studies have shown that elevated body mass index (BMI) is a risk factor for hypertension, heart diseases, diabetes, and cancers.1,2 Elevated BMI is also commonly used as a strong indicator of obesity. Since the metabolic pathways related to body mass are complex,3 one can expect several different loci to influence fatness and the risk for obesity.4 Indeed, a number of chromosomal regions linked to body mass-related traits have been suggested in several genome scans.5

Genomic imprinting is an epigenetic alteration of genes that leads to the expression or preferential expression of an allele originating from only one parent.6 Comparative genomic studies as well as observations of some rare genetic syndromes in humans have identified many chromosomal regions that show imprinting (reviewed in Bartolomei and Tilghman7). Since imprinted genes are involved in many aspects of development, including prenatal and postnatal growth,7,8 imprinting may affect quantitative traits related to postnatal body size,9 and its effect could probably be better seen in children. To date, chromosomal regions such as 7q, 11p15, 15q11–q13,7 and 20q13,10,11 harboring various genetic syndromes, have been shown to be imprinted. However, the complete extent of imprinting for the genome has not yet been determined.

One of the most popular hypotheses describing the evolution of imprinting is a sex-conflict theory. The theory states that conflict arises between the desire of the father to optimize his reproductive fitness by promoting the growth of the offspring even at the expense of the mother's future litters, and of the mother, who must conserve her own resources within any one litter to ensure that she does not have too many or too large offspring.7,12 The theory predicts that genes promoting prenatal and early postnatal growth are paternally expressed, whereas genes that suppress growth are maternally expressed.

Conventional linkage analyses of imprinted genes may fail to detect linkage,13,14 because parent-of-origin-dependent gene expression results in deviations from the laws of Mendelian genetics, so appropriate modeling of imprinted genes is needed. Recently, researchers started modeling potential effects of imprinting in the control of complex traits including obesity-related traits, both in adults13,15 and newborns.16 However, in newborns one may have to control for strong maternal effects that affect intrauterine growth. Here we present the results of linkage analysis for BMI using 372 markers across the genome while allowing for the effects of imprinting. Our sample population includes participants with a range of ages from childhood through young adulthood, an age period not previously extensively studied in genetic linkage analyses of BMI. Yet, because of lesser cumulative environmental (dietary) exposure, this age range, too, may be promising for genetic studies of body weight-related traits. We use age stratification, as we believe that different sets of genes affect body size at different ontogenetic stages. Genes involved in growth seem to be important in BMI control in children, which is probably not the case for adults. Age stratification may help reduce the heterogeneity and improve the chances of detecting linkage. Here we confirm several previously identified linkage regions and find several regions showing evidence of parent-of-origin effects.

Materials and methods

The sample that we studied comprised 1909 individuals who are members of 255 three-generation pedigrees who took part in the first phase of the Rochester Family Heart Study (RFHS).17 The RFHS is a population-based study initiated in 1984 to investigate the genetics of cardiovascular disease and its risk factors. Families participating in the study were ascertained on the basis of having at least two children enrolled in the schools of Rochester, Minnesota, and without regard to the health status of the family members. As many of their relatives as practicable were collected. The study population is over 99% Caucasian. Height and weight measures were obtained by direct observation of subjects following well-established protocols.17 Subjects were asked to remove their shoes and outdoor clothing. Height was measured with a wall stadiometer and weight was determined on a beam balance.

Before the analysis, the extended families were divided into nuclear families to facilitate identity-by-descent calculations for the imprinting tests.14 Since different genes may affect BMI at different stages of ontogenesis, we analyzed not only the whole sample (893 sib pairs) but also the age-stratified groups. The groups were created to separate out effects that may exist during childhood, adolescence and early adulthood. We were also concerned that effects during adolescence might not be easily modeled given the potential heterogeneity of gene expression during this age period. The pubescent group was defined in this study as ages 12–16 inclusive and called ‘adolescents’. Thus, the following sib-pair age groups were analyzed:

  • children: both sibs were 5–11 years old (101 pairs from 69 independent sibships, 53 sibships having one sib pair, 237 subjects);

  • adolescents: both sibs were 12–16 years old (91 pairs from 75 independent sibships, 67 sibships having one sib pair, 279 subjects);

  • young adults: both sibs were 17–30 years old (173 pairs from 65 independent sibships, 41 sibships having one sib pair, 244 subjects).

The number of sib pairs in which both sibs were older than 30 was too small to perform a reliable analysis. Since some sib pairs included individuals from different age groups, the total number of sib pairs is greater than the sum of the sib pairs across the three age groups.

Before the linkage analysis, we standardized BMI values by adjusting them for the effects of sex, age, and squared age (to account for possible nonlinearity of the dependency of BMI on age), separately for each age group. The age groups and corresponding covariate adjustments are shown in Web Table 1. For the children and adolescent age groups, however, the effects of the adjustments were not significant. We also explored the dependence of BMI on gender alone in the adolescent age group, bearing in mind that gender differences start to develop at those ages, but that effect was not significant either. When the sample of all sib pairs was analyzed, we tried two types of adjustment. One was based on the regression coefficients obtained for the whole sample, and the other was based on obtaining the residuals from regression analyses that were conducted within age strata. The results from the two analyses were practically identical, so we present only the findings from the residuals after covariate adjustment within age strata.
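The covariate adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: it fits an ordinary least-squares regression of BMI on sex, age and squared age within one age group and returns the residuals. The function names `solve` and `adjust_bmi`, the 0/1 coding of sex, and the centring of age (which leaves the residuals unchanged but improves numerical stability) are all assumptions made for the example.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for c in range(n):
        # partial pivoting: bring the largest remaining entry into the pivot row
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjust_bmi(bmi, sex, age):
    """Return residuals of BMI after OLS on sex, age and age^2 (hypothetical helper).

    Age is centred before squaring; this spans the same model space as
    (1, age, age^2), so the residuals are the same.
    """
    mean_age = sum(age) / len(age)
    rows = [(1.0, s, a - mean_age, (a - mean_age) ** 2) for s, a in zip(sex, age)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, bmi)) for i in range(k)]
    coef = solve(xtx, xty)
    return [y - sum(c * x for c, x in zip(coef, r)) for r, y in zip(rows, bmi)]
```

Calling `adjust_bmi` once per age stratum and pooling the residuals corresponds to the second of the two adjustment strategies mentioned in the text.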

Web Table 1 Covariate adjustments to BMI prior to analyses

DNA samples were evaluated using 372 minisatellite markers that are approximately evenly distributed along all the autosomes. Genotyping was performed by standard methods with an Applied Biosystems/Perkin-Elmer 377 automatic DNA sequencer. To minimize typing errors, all genotyping was performed in duplicate and any discrepancies were resolved by a senior laboratory supervisor. The proportions of alleles identical by descent with respect to each parent of origin were calculated by a modification of the IBD program from the ACT package.14,18 For loci that showed preliminary significant evidence for linkage and were not in Hardy–Weinberg equilibrium, we applied a further modification of the IBD program to use genotype rather than allele frequencies for missing parents when calculating identity by descent. We applied Haseman–Elston tests19 for linkage. The Haseman–Elston test evaluates the evidence in favor of linkage by regressing the squared pair differences for a trait onto the proportion of alleles identical by descent, π, for the marker locus. An alternative method for this study would have been the variance components method, but given the deviations from normality and kurtosis demonstrated by BMI (Web Table 2), this method can produce excess false-positive results.20 Thus, we chose the Haseman–Elston test over variance-components procedures as more robust in our situation. To accommodate parent-of-origin effects, we used the extension of the Haseman–Elston method suggested by Hanson et al.13

Web Table 2 Descriptive statistics of BMI in age groups

We tested for the effects of parental-specific expressed loci by including two separate terms in the Haseman–Elston regression equation and considering two separate β coefficients for the parental effects as follows:

$$(x_i - x_j)^2 = \alpha + \beta_{MO}\,\pi_{MOij} + \beta_{FA}\,\pi_{FAij} + \varepsilon_{ij}$$

where xi and xj are trait values for sibs, α is the intercept, βMO and βFA are the genetic effects due to parental factors, πMOij and πFAij are the estimated ibd sharing from mother and father (with range from 0 to 1), respectively, and εij is the residual.
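As a rough illustration of the parent-specific regression just described, the following Python sketch regresses squared sib-pair trait differences on maternal and paternal ibd sharing and returns the intercept and the two slopes. It is a minimal example under stated assumptions rather than the software used in the study: the function names `solve_linear` and `parental_haseman_elston` are hypothetical, the model includes an intercept, and standard errors and t-values are not computed.

```python
def solve_linear(a, b):
    """Solve a x = b for a small square system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def parental_haseman_elston(sq_diff, pi_mo, pi_fa):
    """OLS fit of sq_diff = alpha + b_mo * pi_mo + b_fa * pi_fa.

    sq_diff : squared trait differences, one per sib pair
    pi_mo   : maternal ibd sharing per pair (0 to 1)
    pi_fa   : paternal ibd sharing per pair (0 to 1)
    Returns (alpha, b_mo, b_fa).
    """
    rows = [(1.0, m, f) for m, f in zip(pi_mo, pi_fa)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, sq_diff)) for i in range(3)]
    return tuple(solve_linear(xtx, xty))
```

Under linkage the slope is expected to be negative, because sibs sharing more alleles ibd tend to have more similar trait values; a clearly negative slope for only one parent is what would suggest a parent-of-origin effect.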

Web Table 3 Obesity-related genes found in the regions identified in the present study

To evaluate the significance of total and parental-specific linkages, we used simulation. Unlinked marker data were generated using the SIMULATE program.21 This program assigns markers to individuals by sampling alleles for the founders in the pedigrees according to the specified allele frequencies, assuming Hardy–Weinberg equilibrium. Then, markers are segregated to offspring. We generated families with exactly the same structure and the same BMI data as we had observed for each age stratum. Then the regression was performed using the observed trait values. We performed 40 000 simulations for each P-value that is reported. The observed values of the total t-test for linkage as well as the parental-specific t-test values were used, separately, to derive total and parental-specific P-values. To obtain empirical P-values for the t-test values observed in our analysis, the t-test values were compared with the empirical distribution of t-values generated under the null hypothesis. The P-value was equal to the ratio of the number of simulated t-test values equal to or exceeding the observed one to the total number of tests performed.
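The final step, turning a set of null replicates into an empirical P-value, amounts to a simple count. The sketch below is illustrative only; the function name `empirical_p_value` is hypothetical, and it assumes the statistic has been oriented so that larger values mean stronger evidence for linkage (for example, by negating the regression t-value, since linkage corresponds to a negative slope).

```python
def empirical_p_value(observed, simulated):
    """Proportion of null replicates with a statistic >= the observed one.

    observed  : statistic from the real data
    simulated : statistics from replicates generated under no linkage
    Assumes larger values indicate stronger evidence (an orientation
    chosen for this example).
    """
    hits = sum(1 for s in simulated if s >= observed)
    return hits / len(simulated)
```

With 40 000 replicates, the smallest nonzero P-value this estimator can return is 1/40 000 = 0.000025.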

We applied a Bonferroni correction to the paternal- and maternal-specific P-values to allow for the two independent tests (maternally imprinted and paternally imprinted). This correction was not applied to the P-values for total linkage, as total linkage is dependent on both maternal and paternal linkage. The t-test values obtained from the linear regression of squared sib-pair trait differences on identity-by-descent sharing were transformed into LOD scores by squaring the t-test values for those regressions with negative slopes and dividing by 4.6.
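The two conversions in this paragraph are easy to state in code. The following Python sketch is a hedged illustration, not the authors' implementation: `t_to_lod` and `bonferroni` are hypothetical names, the divisor 4.6 is taken from the text (it approximates 2 ln 10 ≈ 4.605), and returning a LOD of 0 for non-negative t-values is an assumption, since the text only specifies the conversion for negative regressions.

```python
def t_to_lod(t, divisor=4.6):
    """Convert a Haseman-Elston regression t-value to a LOD score.

    Only negative t-values (negative slopes) count as evidence for linkage.
    Returning 0.0 otherwise is an assumption made for this sketch.
    """
    if t >= 0:
        return 0.0
    return t * t / divisor


def bonferroni(p, n_tests=2):
    """Bonferroni-adjust a P-value for n_tests independent tests, capped at 1."""
    return min(1.0, p * n_tests)
```

For example, a t-value of -3 gives a LOD score of 9/4.6, slightly below 2, and a parent-specific P-value of 0.004 becomes 0.008 after correction for the two parental tests.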

Significance of the test for imprinting (whether or not the two parental-specific regression coefficients are different) was also evaluated using a permutation procedure. Hanson et al.13 showed by simulation that type 1 error may be inflated for this test in the presence of a nonimprinted locus linked to the marker when non-independent sib pairs are used. However, the type 1 error did not exceed the nominal value if no trait locus was linked to the marker, regardless of the number of sib pairs per family. This means that the distribution of the test t-statistic

$$t = \frac{\hat{\beta}_{MO} - \hat{\beta}_{FA}}{\mathrm{SE}(\hat{\beta}_{MO} - \hat{\beta}_{FA})}$$

depends on whether or not there is a trait locus linked to the marker.13 In other words, the distribution is locus-specific. The permutation procedure was as follows: πMO and πFA were exchanged within blocks defined by all sib pairs belonging to the same family, with probability 0.5, independently for each family. Thus, we preserved the correlation in ibd within families while permuting the parental components and then derived an empirical distribution of the test for imprinting under the null hypothesis. The procedure was repeated 1000 times for each locus for which evidence for linkage and an indication of a parent-of-origin effect were present. The t-test mentioned above was used as the test statistic, and empirical critical values and P-values were derived separately for each locus.
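One reading of this permutation step is that, for each family, a fair coin decides whether the maternal and paternal ibd values are swapped for every sib pair in that family. The Python sketch below illustrates that reading only; the function name `permute_parental_ibd` and the `(family_id, pi_mo, pi_fa)` tuple layout are assumptions introduced for the example.

```python
import random


def permute_parental_ibd(pairs, rng):
    """Swap maternal/paternal ibd labels family by family with probability 0.5.

    pairs : list of (family_id, pi_mo, pi_fa) tuples, one per sib pair
    rng   : a random.Random instance (passed in so runs are reproducible)
    All sib pairs from one family are either swapped together or left
    alone, which preserves the within-family correlation in ibd sharing.
    """
    swap = {}  # family_id -> decision, drawn once per family
    permuted = []
    for fam, pi_mo, pi_fa in pairs:
        if fam not in swap:
            swap[fam] = rng.random() < 0.5
        if swap[fam]:
            permuted.append((fam, pi_fa, pi_mo))
        else:
            permuted.append((fam, pi_mo, pi_fa))
    return permuted
```

Repeating this 1000 times per locus and recomputing the imprinting t-statistic on each permuted data set would give the locus-specific null distribution described above.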

Multipoint linkage analysis was used to help choose between the following two possibilities: effects from an imprinted trait locus versus effects from a locus that is not imprinted but is located at a different genetic distance from a marker in the two genders. The latter is possible when there is a large sex difference in recombination rates and can lead to the false conclusion that the locus is imprinted if single-point analyses are conducted.13 Multipoint linkage analysis was done using the Fulker–Cardon approximation22 for total linkage as well as for sex-specific linkages. As is shown in the Appendix, the approximation follows the same procedure with the parent-of-origin-specific proportion of alleles shared IBD as with total IBD sharing. Sex-specific recombination distances were used. Each interval between markers was divided into a number of subintervals such that the distance between points is approximately 1 cM on the sex-average map, and the lengths of the subintervals on the sex-specific maps were chosen accordingly. Thus, all three maps were divided into the same number of intervals and the physical location of the corresponding points was close to being the same for both genders and the sex-average map. The results are presented graphically based on the sex-averaged scale.
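The interval subdivision can be illustrated with a short Python sketch. It assumes, as a simplification not stated in the text, that each marker interval is cut into equal steps on every map, with the number of steps set by the sex-average length; the function name `subdivide_interval` is hypothetical.

```python
def subdivide_interval(avg_cm, female_cm, male_cm):
    """Split one marker interval into the same number of steps on all three maps.

    avg_cm, female_cm, male_cm : interval length in cM on the sex-average,
    female and male maps. The step count is chosen so that steps are about
    1 cM on the sex-average map (equal spacing is an assumption).
    Returns (n_steps, avg_step, female_step, male_step).
    """
    n_steps = max(1, round(avg_cm))
    return n_steps, avg_cm / n_steps, female_cm / n_steps, male_cm / n_steps
```

For an interval of 10.2 cM on the sex-average map, 15 cM in females and 5 cM in males, this yields 10 steps of about 1.02, 1.5 and 0.5 cM, so the k-th point on each map refers to roughly the same physical location.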

To check whether the markers were in Hardy–Weinberg equilibrium, we used the hwe program available from John Brzustowski (ftp://gause.biology.ualberta.ca/pub/jbrzusto/cgi/hwe/), based on the method proposed in the paper by Guo and Thompson.23

Results

Of the 372 markers, 55 (14.8%) were not in Hardy–Weinberg equilibrium with significance beyond the 5% level. Among the 21 markers that showed linkage, only three (about 14%) showed departures from Hardy–Weinberg equilibrium, namely D8S1113, GATA49D12, and D20S851.

In the analysis of all sib pairs (without dividing them into age groups) we found strongly suggestive evidence for linkage near the marker D1S552 at 45.33 cM from pter (1p36 region), thus confirming earlier findings in this region.24,25 The LOD score for total linkage equals 2.03 (PLin=0.002). There was only weakly suggestive evidence for imprinting for this locus, implying preferentially maternal expression (LODFA=0.968, PFA=0.035; LODMO=1.538, PMO=0.008).

Linkage and imprinting results after stratifying by age

One might expect that the power of a linkage study would drop after a sample has been subdivided, because of the decrease in number of individuals. On the other hand, if genetically homogeneous subsamples can be isolated, evidence for linkage will increase. Here we present the results of analysis of age-stratified samples.

In the adolescent age group, no significant linkages were found. For both the children and the young adults there were several findings.

In children 5–11 years of age (Tables 1 and 2), 15 markers showed evidence for linkage to BMI, some of it strong, with LOD scores greater than 4. Of those 15, two showed only total linkage (one with a LOD score higher than 3) and three more showed both total and parent-specific linkage (two of them with a LODTotal score higher than 3). A total of 10 markers showed parent-specific linkage only (i.e. linkage only when the parent-of-origin effect is taken into account). Of these, seven markers showed only suggestive paternal expression (chromosomes 3, 10, 12, 16, 18) and three showed only suggestive maternal expression (chromosomes 2 and 4).

Table 1 Description of the markers showing linkage with BMI (ages 5–11 years)
Table 2 Characteristics of linkage found in age group 5–11 years

Chromosomes 2, 16, 18, and 20 each show more than one marker linked to BMI. Two loci on chromosome 16, D16S404 and D16S764, 12 cM apart from each other, possibly reflect the presence of one major gene in the corresponding region. They show total and paternal linkage, the LOD score being higher for total linkage for both markers. The strongest linkage was found on chromosome 20 with marker D20S851. The marker shows both total and maternal linkage, and again, the LOD score for total linkage is higher than for maternal linkage only. Another marker from chromosome 20 and a marker from chromosome 5 show significant total linkage only.

As nominal P-values for test for imprinting may be biased towards smaller values in case of total linkage, we used simulation-derived locus-specific P-values for imprinting tests (see Materials and methods). For children, we found four loci showing significant parent-of-origin effects. Three of them show paternal expression (D3S3038, D10S1423, and GATA49D12), and one shows maternal expression (D4S1629) (Table 2).

For young adults (ages 17–30), six markers showed linkage with BMI (Tables 3 and 4). The marker D4S2417 on chromosome 4 and the marker D8S277 on chromosome 8 showed paternal expression, both demonstrating significance for the empirical test for imprinting. The markers D9S922 and D15S1007 (chromosomes 9 and 15, respectively) showed suggestive maternal expression, although the test for imprinting was not significant. The marker D14S742 on chromosome 14 demonstrated suggestive total linkage without effects of imprinting. The marker D8S1113 on chromosome 8 showed both total linkage and suggestive paternal expression, although the test for imprinting was not significant.

Table 3 Description of the markers showing linkage with BMI (ages 17–30 years)
Table 4 Characteristics of linkage found in the age group 17–30 years

There were no markers that showed significant linkage to BMI simultaneously in both young adults and children. However, markers in the chromosomal region 4q31–q33 showed linkage in both the groups, but over a 24 cM region. Also, the imprinting effects were different in this region, with children showing maternal expression and young adults showing paternal expression.

The results for all chromosomes are presented in Web Figure 2, which shows the findings for each locus on each chromosome for the children and for the young adults.

Web Figure 2

Discussion

Published genome scans and other studies of BMI have identified numerous chromosomal regions showing linkage or association (reviewed in Rankinen et al5). Different studies tend to identify different regions. These ‘inconsistencies’ might, in part, result from the use of samples of different ethnic origins and with different disease statuses. Examples of such diversity include analyses of Mexican-Americans;26 Pima Indians predisposed to type II diabetes;13,27 Finnish type II diabetic families ascertained through an affected sib pair;28 Finnish families having obese subjects;29 Ashkenazi Jewish families ascertained for type II diabetes;30 Amish;31 and Dutch dyslipidemic families32 (see Rankinen et al5 for the complete list of the linkage studies). Different populations may be polymorphic for different loci involved in the control of BMI. In addition, adjustment for different covariates can also modify results.32

In this study, we found evidence for linkage with 22 markers that are located in 17 distinct regions. Of those, 11 regions, namely 1p36, 4q31–q32, 8p11–p12, 10p, 12p12–p13, 14p11.2, 15q11–q13, 16p, 18p11.2, 18q12, and 20p12, have been reported previously (see Obesity map update5) from linkage or association studies. Regions 15q11–q13 and 20p12 are known to harbor obesity-related Mendelian disorders (Angelman syndrome with obesity/Prader–Willi syndrome and Bardet–Biedl syndrome 6, respectively). The 3p24.1 region that appeared in our analysis was identified recently as suggestively linked to BMI in a combined analysis of genome scans for obesity.33 Several other regions found in the present study were previously shown to be associated or linked with other obesity-related phenotypes: 2p12 is associated with triceps skinfold, 9q22.1 is linked to abdominal subcutaneous fat, and 18q12 is linked to fat-free mass.5 The only region found here that has not been reported previously is 5q34–q35, which shows total linkage (no parent-of-origin effect).

We found several regions showing parent-of-origin effect. Most of them, namely 3p23–24, 4q31–q33, 8p, 12p12-pter (Tables 2 and 4) have not been previously reported as imprinted. Interestingly, for all those regions there were conventional linkage findings with LODs from 1.8 to 3.2 (see Table 4 in Rankinen et al5).

Hanson et al13 and Lindsay et al15 found tentative evidence for imprinting in the region 10p; however, their linkage (at 20 cM) is 26 cM away from the peak observed in the present study (at 46 cM). We were unable to identify the peak that Lindsay et al15 found on chromosome 5 (71 cM from pter) for maternal expression. In the study of BMI in newborns by that group,16 a peak on chromosome 11 at 85 cM (paternal expression) was found. We observed modest peaks on chromosome 11 in children: one at 85 cM from pter (LOD=0.85) and another at 100 cM from pter (LOD=1.6), both showing paternal expression. In fact, the location of the former, as well as the mode of imprinting, coincides with that of the peak reported by Lindsay et al.15 Interestingly, other previously known imprinted regions on chromosomes 7q22–q31 and 15q11–q138 produce clear, although in most cases modest, peaks and show only parent-specific linkage in our analysis (Figure 1). In adults, the peak at 15q12–22 reaches borderline significance (P=0.004) for maternal expression, although the test for imprinting is not significant. A comparison of the results of our study and a list of genes that may influence body mass5 allowed us to suggest candidate genes in many regions identified in our study (Web Table 3).

Figure 1 Results of multipoint linkage analysis (total and sex-specific linkages) for BMI in children 5–11 years old. (a) Chromosome 3; (b) Chromosome 18.

Our study emphasizes the importance of adequate sample grouping. This can be illustrated by the fact that in the analysis of all possible sib pairs we found only one region, 1p36, with suggestive linkage even though many more pairs were analyzed than in the other (stratified) analyses. Many more linkages were found after sample stratifying. The explanation may be as follows. There are different sets of genes responsible for body size in children and in adults. As BMI depends on both weight and height, in children it will be controlled in part by genes involved in growth. Owing to the fact that many imprinted genes participate in early growth control, one can expect that effects of imprinting in children should be stronger, which we indeed observed.

Finally, we would like to discuss the problem of false-positive results. In our study we used a threshold for LOD scores of 1.75, which corresponds to one false positive per genome scan with 400 markers.34 The LOD scores that we are reporting reflect three different tests. However, the test for total linkage is completely dependent on the two parent-specific tests. Therefore, when we applied a Bonferroni correction we adjusted for two independent tests. We detected no linkage in the adolescent group, which had approximately the same number of sib pairs as the group of children. This conforms to our expectations of greater genetic heterogeneity in this age range and may serve as an indirect argument that (some of) the linkages found in children and in adults are real. By applying permutation to generate empirical P-values, we addressed the issue of possible type 1 error inflation for the test for imprinting when there is a nonimprinted trait locus linked to a marker. We assessed the significance of linkage findings using a permutation approach that assumes the markers are in Hardy–Weinberg equilibrium. Of the markers in Hardy–Weinberg disequilibrium, only D8S1113 showed a significant excess of homozygotes (χ2=38.24, P<0.00001). Marker GATA49D12 showed an insignificant excess of heterozygotes, while marker D20S851 showed an insignificant excess of homozygotes. The excess frequency of homozygotes for D8S1113 would decrease the empirical power to detect linkage compared with the simulated power used to compute P-values. For the other two markers, departure from Hardy–Weinberg equilibrium (owing to variation in heterozygous genotypes from expected frequencies) would not affect the power to detect linkage and so should not affect the simulated P-values.

Sex differences in recombination frequency, in the presence of a nonimprinted major gene in the region, can lead to artefactual detection of parent-of-origin effects, although the sex difference must be very high (at least 10-fold).13 A weaker linkage in the sex with higher recombination in that region could be erroneously interpreted as imprinting. Such a phenomenon might explain our single-marker results for the markers D3S3038 and D18S478 (Table 2). To minimize this confounding, we used a modified Fulker–Cardon approach for multipoint linkage analysis with sex-specific recombination rates on Chromosomes 3 and 18. If there were a trait locus that is not imprinted and genetically located in females much farther from a marker than in males, we would see an increase in female linkage as the locus is approached. However, this is not the case (Figure 1). We thus conclude that the putative loci located near the markers D3S3038 and D18S478 are imprinted. However, a much denser set of markers would be desirable to evaluate imprinting versus sex-specific recombination. The wide peak found on Chromosome 18 at 29–54 cM is absent in the single-marker analysis. The difference is caused by the absence of the marker D18S542 in the multipoint analysis. The marker was not used because its location on the sex-specific maps remains unspecified.

In conclusion, several of the loci we identified in children showed strong evidence for linkage and their further investigation in homogeneously young datasets would be of interest.