Introduction

In linkage analysis for quantitative traits, extreme individuals provide a relatively large contribution to the test statistic. Indeed, this is the basis of selective genotyping to maximise power while minimising costs. For example, the EDAC design, in which extremely discordant pairs and concordant pairs of relatives are selected, has been proposed as a powerful design.1 If there are quantitative trait loci (QTL) of large effects segregating in pedigrees, then relatives that do not share alleles identical by descent (IBD) at the trait locus will tend to be phenotypically dissimilar, and relatives that share one or two alleles IBD at the trait locus will tend to be phenotypically similar. This relationship between IBD status and phenotypic similarity forms the basis of QTL mapping in human populations.2, 3

The corollary of the advantage of relatives with extreme phenotypes for QTL mapping is that large differences in phenotypes within families (ie, within-family outliers) can have a large effect on the test statistic for linkage, even if the individual phenotypes are not obvious outliers in the entire sample. Within-family outliers may obscure a real linkage signal or create a false positive one. However, in practice it is usually not known if a within-family outlier arises because of genetic or environmental (eg, measurement error or perinatal trauma) factors. Herein lies the dilemma of how to deal with within-family outliers: under the null hypothesis of no linkage they can create false positives, yet they would be expected under the alternative hypothesis of segregating alleles of large effects. The existence of rare alleles with large effects would require large sample sizes for detection and replication and would obscure the linkage signal at other loci contributing to normal variation. On the other hand, rare alleles of large effect size may be a major contributor to the caseload of clinical geneticists and may point to important pathways in the genetic regulation of a phenotype of interest.

Height is one of the most important anthropometric measures. It has been suggested as an indicator of childhood living conditions4 and has been associated with the risk of coronary heart diseases,5 intelligence,6 educational attainment,7 longevity8 and social mobility.9 Height is also an excellent model trait for studying the genetic architecture of complex quantitative traits.10 It is a normally distributed quantitative trait and highly heritable. A wide variety of genetic designs all converge on a heritability for height of about 0.8.10, 11

To date, numerous QTL for height have been mapped to almost all chromosomes (see Supplementary Table). With the exception of a few regions from a small number of studies, most of them have been mapped with only moderate statistical support (ie, LOD score <3), which is similar to that seen for many complex traits. This suggests that most of the genes influencing height have small to moderate effects. To be able to detect QTL with small to moderate effects, a large sample size is required.12 Therefore, using a large sample size of 8447 individuals from 2861 families of twins, our study aimed: (1) to detect chromosomal regions influencing height, including those with smaller effect sizes that may be missed by studies using smaller samples; (2) to replicate previously reported chromosomal regions affecting height; and (3) to empirically investigate the effect of within-family outliers on the evidence for linkage.

Subjects and methods

The study was approved by the Human Research Ethics Committee of the Queensland Institute of Medical Research and all participants gave their signed informed consent.

Subjects

The subjects were drawn from two cohorts, an adolescent and an adult cohort. The younger cohort comprised adolescent twins and their siblings who participated in various studies conducted at the Genetic Epidemiology Laboratory of the Queensland Institute of Medical Research, Australia. Height for these individuals was measured by a nurse using a stadiometer at the ages of 12 and 14 as part of a melanoma risk factors study13 and at the age of 16 for a study on cognition.14 For these adolescents, we chose the measurement at the age of 16, which was available for 1575 individuals.

The adult cohort constituted twins and their families (eg, parents and other family relationships) who had been drawn from the Australian Twin Registry for various studies, including asthma and allergy,15 anxiety and depression,16 alcoholism17 and factors for cardiovascular disease.18 From the adult cohort, clinical measurement and/or self-reported height were available from 36 427 individuals and more than one measurement was available for some individuals due to their participation in multiple studies. Following Cornes et al,19 rules were implemented to select the most accurate measurement for further analysis. Briefly, if a clinical measurement was available for an individual, it was used for the analysis. Out of 36 427 individuals, clinical measurements were available for 5129 individuals (14.1%). If there was more than one measurement, the most recent measurement was chosen subject to consistency checks (see Supplementary Figure).

Genotyping

Microsatellite marker genotypes were available from several genome-wide and fine mapping studies. A detailed description of genotyping, cleaning and merging of each of the smaller genome scans has been provided by Zhu et al13 for the adolescent cohort and by Cornes et al19 for the adult cohort. Further genotypic data from a subset of individuals previously described by Cornes et al19 and 4575 newly genotyped individuals from the adult cohort have been incorporated into the present study using the same procedures described by Cornes et al19 (M Luciano, G Zhu, KM Kirk, SD Gordon, AC Heath, GW Montgomery and NG Martin, personal communication).

Briefly, the procedure for cleaning and merging the smaller genome scans into a combined genome scan was as follows. Firstly, raw data from each genome scan were integrated into one combined genome scan. Some individuals were genotyped using the same markers in different genome scans. Collapsing these duplicate markers into one unique marker was not an option since the same markers from different genome scans were often genotyped using different primers, allele calling algorithms and measurement technologies. Thus, these markers were given unique identifiers and their map positions were separated by 0.001 cM to avoid zero-spacing. Secondly, if Mendelian inheritance errors were detected, all genotypes for that marker for a given family were removed by utilising the algorithm developed by O'Connell and Weeks.20 Finally, unlikely chromosomal recombination patterns were detected and removed using the --error option in the Merlin package.21 Based on the genotype data, the pedigree relationships between individuals were checked using standard relationship checking software, RELPAIR 2.0.122 and GRR.23
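The first merging step can be illustrated with a minimal sketch. It assumes a hypothetical record layout of (marker name, scan identifier, map position in cM) and a hypothetical naming scheme of marker name plus scan identifier; the marker names are invented, and this is not the procedure's actual code.

```python
from collections import defaultdict

def separate_duplicate_markers(records, offset_cm=0.001):
    """Give each (marker, scan) record a unique ID and nudge duplicates apart.

    records: list of (marker_name, scan_id, position_cm) tuples (hypothetical layout).
    Returns a list of (unique_id, position_cm) in input order.
    """
    seen = defaultdict(int)  # how many times each marker name has appeared so far
    merged = []
    for name, scan, pos in records:
        k = seen[name]
        seen[name] += 1
        unique_id = f"{name}_{scan}"  # hypothetical naming scheme
        # The k-th duplicate is shifted by k * offset_cm to avoid zero spacing.
        merged.append((unique_id, round(pos + k * offset_cm, 6)))
    return merged

# Invented example: MARKER_A was typed in two different genome scans.
records = [
    ("MARKER_A", "scan1", 10.0),
    ("MARKER_A", "scan2", 10.0),
    ("MARKER_B", "scan1", 15.0),
]
merged = separate_duplicate_markers(records)
```

Keeping the duplicates as separate, tightly linked markers lets the multipoint analysis use both sets of genotypes without assuming that allele calls are comparable across scans.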

Subjects with phenotypic and genotypic data

Data from the two cohorts were combined, giving a total of 38 002 individuals. Of these, only 9464 individuals had both phenotypic and genotypic data. Figure 1 shows that subjects younger than 16 years were mostly still in their adolescent growth stage. To minimise heterogeneity in the phenotypic definition while maximising the sample size (846 subjects were aged between 16 and 18 years), only subjects ≥16 years of age were included in the analyses. Subjects were also required to have been genotyped with more than 210 autosomal markers (the minimum number of autosomal markers from the smallest genome scan). In addition, 12 outliers (subjects whose standardised height, after adjustment for sex and age, was more than 4 SD above or below the mean) were removed. After these exclusions, the final sample of 8447 individuals was included in the linkage analyses; its descriptive statistics are presented in Table 1. This sample consisted of 2861 families and comprised 5815 possible pairs of siblings in sibships (980 male–male, 2065 female–female and 2770 opposite-sex pairs), 3913 parent–offspring, 63 half-sib, 8 cousin, 10 grandparent–grandchild and 146 avuncular relationships. In addition to DZ twin pairs and additional siblings, there were 236 MZ twin pairs. In the above count of informative relative pairs, only one individual from each MZ pair was included.
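The phrase 'possible pairs of siblings in sibships' means that a sibship of n genotyped and phenotyped siblings contributes n(n−1)/2 pairs. The short sketch below shows this count for some invented sibship sizes and checks that the reported breakdown by sex adds up to the reported total; the function name is an illustrative assumption.

```python
from math import comb

def possible_sib_pairs(sibship_sizes):
    """Total number of sibling pairs: n choose 2, summed over sibships."""
    return sum(comb(n, 2) for n in sibship_sizes)

# Invented sibship sizes: 2, 3 and 4 siblings give 1 + 3 + 6 pairs.
example = possible_sib_pairs([2, 3, 4])

# Pair counts by sex as reported in the text.
reported_by_sex = {"male-male": 980, "female-female": 2065, "opposite-sex": 2770}
total_pairs = sum(reported_by_sex.values())
```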

Figure 1 Height against age at measurement.

Table 1 Descriptive statistics of subjects with phenotypic and genotypic information

Definition and removal of within-family outliers

In a family consisting of a sibling pair, let y₁ and y₂ be the phenotypic (residual) values of sibling 1 and sibling 2, respectively. Their phenotypic difference is D = y₁ − y₂. Assuming that the phenotypic values follow a bivariate normal distribution with correlation coefficient r and common SD σ, the expected mean and variance of the squared difference (D²) follow from χ² theory: D² ∼ [2(1−r)σ²]χ²(1), where χ²(1) denotes a χ² variable with 1 degree of freedom. The mean of D² is 2(1−r)σ² and its variance is 8(1−r)²σ⁴ (see also Sham and Purcell24). Hence, the expected SD of D² is 2√2(1−r)σ².
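For readers who want the intermediate steps, the moments above can be derived as follows, under the stated bivariate normal assumption and with both residuals having mean zero:

```latex
\begin{align*}
D = y_1 - y_2 &\sim N\!\left(0,\; \sigma^2 + \sigma^2 - 2r\sigma^2\right) = N\!\left(0,\; 2(1-r)\sigma^2\right), \\
\frac{D^2}{2(1-r)\sigma^2} &\sim \chi^2_{(1)}, \qquad
E\!\left[\chi^2_{(1)}\right] = 1, \quad \operatorname{Var}\!\left[\chi^2_{(1)}\right] = 2, \\
E\!\left[D^2\right] &= 2(1-r)\sigma^2, \\
\operatorname{Var}\!\left[D^2\right] &= 2\left[2(1-r)\sigma^2\right]^2 = 8(1-r)^2\sigma^4, \\
\operatorname{SD}\!\left[D^2\right] &= 2\sqrt{2}\,(1-r)\sigma^2 .
\end{align*}
```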

Within-family outliers were defined as sibling pairs for which the squared phenotypic difference is much larger than expected from the population correlation. From our data, the sibling correlation and residual SD (after a general linear model correction for the effects of age and sex) were 0.43 and 6.7 cm, respectively, so that the expected mean and SD of D² are 51.2 and 72.4 cm², respectively. Thus, sibling pairs for which the squared phenotypic difference was more than 4 SD above the mean (ie, >341 cm²), which corresponds to an absolute difference of 18.5 cm or greater, were categorised as within-family outliers. There were 81 possible pairs of siblings in sibships with D² more than 4 SD above the mean. If within-family outliers were detected in families with more than two siblings, then not all individuals in the outlying pairs were removed, but only the individual most responsible for the within-family outliers, that is, the individual with the largest deviation from the population mean. In total, 59 individuals were removed (0.7% of the total sample), which resulted in a reduction of 142 possible pairs of siblings in sibships, or 2.4% of the total number of possible pairs of siblings in sibships.
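The threshold arithmetic and the removal rule can be sketched as follows. The function names, the sibling identifiers and the example residuals are illustrative assumptions, and the sketch simplifies the rule to at most one removed individual per sibship; it is not the code used in the study.

```python
from itertools import combinations
from math import sqrt

def d2_threshold(r, sigma, n_sd=4.0):
    """Expected mean and SD of D^2, and the cutoff at mean + n_sd * SD."""
    mean_d2 = 2.0 * (1.0 - r) * sigma**2
    sd_d2 = sqrt(2.0) * mean_d2          # equals 2*sqrt(2)*(1-r)*sigma^2
    return mean_d2, sd_d2, mean_d2 + n_sd * sd_d2

def flag_sibship(residuals, cutoff):
    """Find outlier pairs in one sibship and the individual to remove.

    residuals: dict mapping sibling ID -> residual height in cm
               (deviation from the population mean after age/sex adjustment).
    Returns (outlier_pairs, person_to_remove or None).
    """
    pairs = [(a, b) for a, b in combinations(residuals, 2)
             if (residuals[a] - residuals[b]) ** 2 > cutoff]
    if not pairs:
        return pairs, None
    involved = {person for pair in pairs for person in pair}
    # Remove the involved individual who deviates most from the population mean.
    worst = max(involved, key=lambda person: abs(residuals[person]))
    return pairs, worst

mean_d2, sd_d2, cutoff = d2_threshold(r=0.43, sigma=6.7)

# Invented sibship: sib3 is about 20 cm taller than both siblings.
pairs, worst = flag_sibship({"sib1": 1.0, "sib2": -2.0, "sib3": 21.0}, cutoff)
```

With r = 0.43 and σ = 6.7 cm the cutoff is about 341 cm², and its square root gives the absolute difference of about 18.5 cm quoted above.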

Power calculations

To assess the power of the present study to detect a QTL influencing the normal variation in height, a series of power calculations based on Sham et al25 was performed using a genetic power calculator26 under the following assumptions: (1) an additive QTL influences the variation in height; (2) the recombination rate between the marker and the QTL is 0; (3) a heritability of 0.8; (4) a type 1 error rate (α) of 4.8 × 10⁻⁵ (equivalent to a LOD score of 3.3). The expected LOD score was also calculated using the above assumptions as E(LOD)=(1+NCP)/4.605, with NCP the QTL non-centrality parameter.25
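The correspondence between a LOD score of 3.3 and α = 4.8 × 10⁻⁵ can be reproduced with a few lines. This is a sketch that assumes the pointwise P-value is taken as half the upper tail of a χ² distribution with 1 degree of freedom (the usual convention for a variance component tested on its boundary); the function names are illustrative.

```python
from math import erfc, log, sqrt

TWO_LN10 = 2.0 * log(10.0)  # about 4.605, converts LOD to a chi-square statistic

def lod_to_pvalue(lod):
    """Pointwise P-value for a LOD score (assumed 50:50 mixture of 0 and chi2_1)."""
    chi2 = lod * TWO_LN10
    # For 1 df, P(chi2_1 > x) = erfc(sqrt(x / 2)); halve it for the mixture.
    return 0.5 * erfc(sqrt(chi2 / 2.0))

def expected_lod(ncp):
    """E(LOD) = (1 + NCP) / 4.605, as given in the text."""
    return (1.0 + ncp) / TWO_LN10

alpha = lod_to_pvalue(3.3)  # roughly 4.8e-5
```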

Linkage analysis

To identify and map chromosomal regions (QTL) influencing variation in height, a multipoint variance component linkage analysis was performed as implemented in Merlin 1.0.1 (for autosomes) and MINX (for the X chromosome).21 As the framework for mapping the QTL, that is, for establishing the position and order of the markers on the chromosomes, a locally weighted linear regression map (http://www.qimr.edu.au/davidD) based on NCBI Build 35.1 physical map positions and the deCODE and Marshfield maps was used.27

Linkage analysis correlates the phenotypic similarity of relative pairs with their genotypic similarity, represented by the proportion of alleles shared identical by descent (IBD) at a specific position in the genome (eg, Almasy and Blangero28). In a variance component framework, the observed variance is partitioned into components due to QTL effects (shared when pairs are IBD at a marker), polygenic effects (shared according to their genetic relationship) and environmental (non-shared) effects, using the observed covariance between relatives and IBD sharing at marker loci. The presence of a QTL at a specific chromosomal location was tested by comparing the likelihood under the hypothesis that the QTL variance is zero (H0) with the likelihood under the hypothesis that the QTL variance is different from zero (H1). The LOD score is defined as the difference between the log10 likelihoods under H1 and H0, which equals the likelihood ratio test statistic (twice the difference in natural log likelihoods) divided by 2ln(10)≈4.605 (eg, Almasy and Blangero28). As suggested by Lander and Kruglyak,29 the present study considered LOD scores of 3.3 and 1.9 as significant and suggestive evidence of linkage between QTL and markers at the test position, respectively. The effects of within-family outliers on linkage results were empirically examined by conducting linkage analyses on the data before and after removing individuals who were classified as within-family outliers.
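As a minimal illustration of the variance component model for a single sibling pair, the sketch below builds the expected 2 × 2 covariance matrix from the three components and converts a pair of maximised log-likelihoods into a LOD score. It uses the conventional definition of the LOD score as the log10 likelihood ratio, which is the form consistent with E(LOD)=(1+NCP)/4.605 in the power calculations. The function names and the variance values are assumptions for illustration, not Merlin's implementation.

```python
from math import log

def sib_pair_covariance(pi_hat, var_qtl, var_poly, var_env):
    """Expected covariance matrix of a sibling pair under the VC model.

    pi_hat:   estimated proportion of alleles shared IBD at the test position.
    var_qtl:  QTL variance, shared in proportion to pi_hat.
    var_poly: polygenic variance, shared with coefficient 0.5 for full sibs.
    var_env:  non-shared environmental variance.
    """
    total = var_qtl + var_poly + var_env
    cov = pi_hat * var_qtl + 0.5 * var_poly
    return [[total, cov], [cov, total]]

def lod_score(loglik_h1, loglik_h0):
    """LOD from natural-log likelihoods: log10 of the likelihood ratio."""
    return (loglik_h1 - loglik_h0) / log(10.0)

# Invented variance components (cm^2) for a pair sharing one allele IBD.
sigma_pair = sib_pair_covariance(pi_hat=0.5, var_qtl=10.0, var_poly=20.0, var_env=15.0)
```

Under H0 the QTL variance is fixed at zero, so the covariance no longer depends on IBD sharing; the LOD score measures how much the IBD-dependent term improves the fit.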

Results

Descriptive statistics

The mean height of males was greater than that of females, with an estimated difference of 14.16 cm (SE 0.15). The regression slope of height on age was −0.05 (SE 0.01) cm/year, indicating gradual shortening with age in our sample, possibly confounded with a secular trend of increasing height in later birth cohorts.30 Skewness and kurtosis values of the standardised residuals of height after a general linear model adjustment (in the SPSS package) for the effects of age and sex were very close to zero (ie, skewness: 0.10 (SE 0.06); kurtosis: 0.03 (SE 0.06)). This is a good indication that our data are close to normally distributed, which is an assumption of variance component linkage analysis by maximum likelihood and justifies the use of asymptotic significance values for linkage.29 The average number of markers genotyped per individual was 542 (range, 211–1640). Overall, these numbers provided an average spacing of about 5 cM across the genome. The estimated sib correlations before and after removing within-family outliers were 0.43 and 0.48, respectively. The corresponding estimates of heritability were 0.83 and 0.84, respectively. The estimates of heritability were not identical to twice the sib correlations because they were estimated using all pedigree relationships, including parent–offspring and MZ pairs.

Detection of within-family outliers

In Figure 2 the distribution of the squared difference in phenotype between siblings (D²) is shown for all sibling pairs after removal of the effects of age and sex. The empirical mean and SD of D² are 51.2 and 81.4 cm², respectively, whereas their predictions from χ² theory (see Subjects and methods) are 51.2 and 72.4 cm². Clearly there are a number of outlier pairs (ie, 81 pairs). The largest squared difference was 1000 cm², corresponding to an absolute difference of 32 cm. For a trait such as height with a high heritability (and sibling correlation), such within-family deviations are not expected from genetic segregation of common variants of small effect, and removing these outliers appears justified.

Figure 2 Empirical distribution of the squared difference of residual height (D²) of 5815 possible pairs of siblings in sibships. Inset is the distribution of D² ≥ 341 cm², which represents the pairs excluded as within-family outliers.

A closer inspection of the outliers showed that there was no significant difference between the outlying subjects and the rest of the sample in the proportion of subjects with clinically measured versus self-reported height (P-value=0.95). The proportion of same-sex and opposite-sex pairs in the outlying subjects and the rest of the sample was also not significantly different (P-value=0.23). The outlying subjects appear to be evenly distributed across tall and short stature and across age, but with fewer outliers after the age of 50. Most of the outlier pairs were of similar age: the age correlation between sibling 1 and sibling 2 was 0.81. For three outlier pairs (six individuals) who were ∼16 years old at the time of measurement and between 21 and 28 years of age at the time of analysis, we checked the original recorded phenotype and telephoned those individuals about their current height. We found that the data in our database were consistent with both the initial records and the currently reported height differences.

Power calculations

The power of our sample to detect a QTL responsible for 5, 10 and 15% of the phenotypic variation at a type 1 error rate (α) of 4.8 × 10⁻⁵ is 1, 32 and 89%, respectively. For a power of 80%, the required sample size to detect a QTL explaining 10% of the phenotypic variation is about 10 910 sib pairs. The expected LOD scores for a QTL explaining 5, 10 and 15% of the phenotypic variance are 0.9, 2.8 and 6.0, respectively.
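The reported power values and expected LOD scores can be cross-checked against each other. The sketch below assumes that the test statistic follows a non-central χ² distribution with 1 degree of freedom, recovers the NCP from E(LOD)=(1+NCP)/4.605, and uses a LOD threshold of 3.3; this is an illustrative reconstruction and not necessarily the exact calculation performed by the genetic power calculator.

```python
from math import erfc, log, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2.0))

def power_from_expected_lod(e_lod, lod_threshold=3.3):
    """Approximate power implied by an expected LOD score.

    A non-central chi-square with 1 df equals (Z + sqrt(NCP))^2, so
    P(X > c) = P(Z > sqrt(c) - sqrt(NCP)) + P(Z < -sqrt(c) - sqrt(NCP)).
    """
    two_ln10 = 2.0 * log(10.0)
    ncp = max(e_lod * two_ln10 - 1.0, 0.0)  # invert E(LOD) = (1 + NCP) / 4.605
    crit = lod_threshold * two_ln10         # chi-square critical value
    root_c, root_ncp = sqrt(crit), sqrt(ncp)
    return upper_tail(root_c - root_ncp) + upper_tail(root_c + root_ncp)

power_10pct = power_from_expected_lod(2.8)  # QTL explaining 10% of the variance
power_15pct = power_from_expected_lod(6.0)  # QTL explaining 15% of the variance
```

Under these assumptions the two larger effect sizes give values close to the reported 32 and 89%; the result for the smallest effect is more sensitive to rounding of the expected LOD score.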

Linkage analyses

The multipoint LOD score profiles from linkage analyses including and excluding within-family outliers are presented in Figure 3, and the chromosomal regions that reached a LOD score of 1.5 or greater are presented in Table 2. Excluding the outliers increased the LOD scores for most regions, except for the peak on chromosome 15 (LOD score decreased from 2.3 to 1.2). When within-family outliers were excluded, three chromosomal regions (1q23.1 (LOD 2.0), 3q22.1 (1.9) and 5q32 (2.3)) were suggestive for linkage with height.

Figure 3 Comparison between LOD scores from linkage analyses including and excluding within-family outliers.

Table 2 Chromosomal regions showing multipoint LOD score ≥1.5

We also performed additional linkage analyses in which only subjects older than 18 or older than 20 years were included, and the results were very similar (results not shown). In addition, the application of a regression method (implemented in the computer program Merlin-Regress21), which uses a linear combination of squared sums and squared differences as the dependent variable,31 did not change the linkage results (results not shown).

Discussion

Genetic studies of normal variation in height are medically important and useful for understanding the genetic architecture of complex traits. Our genome-wide linkage analysis using a sample of 5815 possible pairs of siblings in sibships provides further evidence for the polygenic nature of height. Several chromosomal regions showed moderate linkage and three of them (1q23.1 (LOD 2.0), 3q22.1 (1.9) and 5q32 (2.3)) reached the asymptotic suggestive level of 1.9. However, despite our large sample, none of them reached the asymptotic threshold of 3.3 indicating significant linkage. Power calculations indicated that if there was a QTL explaining 15% or more of the variation in height segregating in our sample and the QTL was completely linked to the marker, it was unlikely to be missed since we had 89% power to detect it. Our results therefore suggest that normal variation in height is influenced by several or many genes with small-to-modest effect size.

Previous studies have identified several chromosomal regions showing significant linkage (ie, LOD score ≥3.3) with height (Supplementary Table). These included the regions on chromosome 1p21 (for males only),32 6q25.3,33 7q35,33 9q22,34 12q13.1,33 13q33.1,33 14q23.1,35 and the X chromosome (Xp22 and Xq24).34 Many other studies have also supported these regions as being associated with height, including the regions on chromosome 1;35 chromosome 6;36 chromosome 7;37 chromosome 9;36 chromosome 12;32, 36, 38 and chromosome 14.39 Among these regions, 12q13.1 was supported in this study with a LOD score of 1.4. The chromosomal regions 3q22.1 and 5q32, which showed suggestive linkage in our study, have previously been mapped to similar locations. Using a combined population of 6752 individuals, Wu et al35 reported that 3q23 and 5q31.1 were suggestive for linkage with height (LOD=2.0 for both). Deng et al36 have also suggested that the region on chromosome 5 was associated with height. Although these regions are not exactly the same, the one-LOD drop-off confidence intervals overlap.

Among the many genes that have been associated with height, the vitamin D receptor (VDR) has recently received much attention. As one of the intracellular hormone receptors, the main role of VDR is to bind the active form of vitamin D. This gene has been mapped to 12q12–q1440 and it has been associated with height (eg,38, 41). Furthermore, a meta-analysis of published linkage studies on chromosome 12 has shown that there is significant evidence for the presence of a QTL on this chromosome.38 In the present study, the LOD score for this region did not reach suggestive linkage, but a LOD score of 1.4 was observed. Power calculations suggest that the expected LOD scores for a QTL explaining 5 and 10% of the phenotypic variance are 0.9 and 2.8, respectively, so if the VDR has a moderate effect on height, the observed LOD score in this region is not inconsistent with VDR being a candidate gene for height. Other interesting genes that are located under or close to the linkage peaks were ADIPOR1 (1q32.1), PTHR1 (3p21.3), BCHE (3q26.1), CYP19 (15q21.2) and MARFAN (15q21.1) (Figure 3). Variants in these genes have been associated with height.42, 43, 44, 45, 46

Linkage analysis was also performed after excluding individuals who were categorised as within-family outliers. Within families, these individuals were regarded as discordant for height when the absolute difference between the height measures of siblings was more than 18.5 cm. Risch and Zhang1 have shown that selecting extremely discordant and concordant sib pairs is more powerful for linkage analysis than selecting random pairs. Thus, our strategy would seem to be counterproductive because we removed the most extremely discordant pairs, hence the most informative pairs. However, by removing these outliers, their disproportionate contribution to the test statistic for linkage, which could be due to measurement errors or environmental effects, can be avoided. Extremely discordant pairs of siblings who do not share any alleles IBD at a particular region in the genome contribute strong apparent evidence for linkage. It is questionable to rely strongly on such pairs, because it is the absence of sharing that suggests linkage. The influence of a single within-family outlier pair can be substantial. Across a number of studies we have noted that a single pair can change the LOD score by 0.5–1.0, and this occurs at many locations in the genome. On the other hand, one cannot discount the possibility of rare variants of large effect size producing ‘outlier’ effects. Replication of these in other studies might help to reveal variants of clinical significance in some rare families, even if unimportant at the population level. Ultimately it is the researcher who has to make a judgement about the validity of data to be used for analysis, but we urge investigators to be critical regarding within-family outliers.

When we followed up a number of individuals who were responsible for within-family outliers, we found no evidence for measurement or database entry errors. This leaves open the interesting question of whether the large observed differences within families are caused by unknown environmental factors of large effect (for example, in utero environmental insults, childhood disease and epigenetic effects) or by rare alleles of large effect. Linkage analysis on collections of small nuclear families, as employed in our study, or, in the case of disease, on collections of affected sibling pairs, is not suitable for detecting segregating rare variants of large effect that do not explain a substantial proportion (say, >5%) of genetic variance in the population. Moreover, the existence of such variants may obscure the detection of loci that do explain significant variation in the population because of the disproportionate contribution of within-family outliers to the LOD score. The identification of large pedigrees with multiple within-pedigree ‘outliers’ is the most efficient approach to map rare alleles of large effect.

We also note that, besides extremely discordant pairs, linkage information also comes from concordant pairs. The mean-corrected squared sum (S)47 was used to identify pairs that are phenotypically similar, that is, pairs of sibs who are both extremely short or both extremely tall. From our data, the mean and SD of S are 129.4 and 180.0 cm², respectively. These agree very well with the expected mean [2(1+r)σ²] and SD [2√2(1+r)σ²]. However, as shown by Visscher and Hopper,48 most of the information on linkage comes from the squared difference (rather than the squared sum) of the phenotypes of a pair of siblings. Therefore, we did not remove these extremely concordant pairs in our analysis.
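The expected moments of S can be evaluated with the sibling correlation (0.43) and residual SD (6.7 cm) reported in Subjects and methods. The short sketch below does this arithmetic; the function name is an illustrative assumption.

```python
from math import sqrt

def expected_s_moments(r, sigma):
    """Expected mean and SD of the mean-corrected squared sum S."""
    mean_s = 2.0 * (1.0 + r) * sigma**2
    sd_s = sqrt(2.0) * mean_s            # equals 2*sqrt(2)*(1+r)*sigma^2
    return mean_s, sd_s

mean_s, sd_s = expected_s_moments(r=0.43, sigma=6.7)
```

This gives roughly 128 and 182 cm², against the observed 129.4 and 180.0.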

In the linkage analyses, we also excluded 12 extreme individuals (more than 4 SD below or above the mean). These individual outliers contribute 34 possible pairs of siblings in sibships. Within families, some of these individual outliers are also likely to be extreme compared with their siblings, and can thus be categorised as within-family outliers. Therefore, it is difficult to disentangle the effect of within-family outliers from that of individual outliers per se. Individual outliers that are not informative for linkage are unlikely to have an effect on the test statistic for linkage, whereas individual outliers that are informative (eg, have siblings with phenotypes and genotypes) are likely to have an effect by creating within-family discordant pairs.

In conclusion, a genome-wide linkage analysis has revealed three chromosomal regions [1q23.1 (LOD 2.0), 3q22.1 (1.9) and 5q32 (2.3)] suggestive for linkage with height in a large sample of Australian twin families. Despite the large sample size, the moderate statistical support for most of the identified chromosomal regions suggests that height is influenced by several or many genes, each having a modest effect. We also confirmed the disproportionate influence of within-family outliers on the linkage results. While the precise effects of these outliers on linkage results cannot be quantified without theoretical or simulation studies, our findings showed that a small number of outlier pairs, which could be due to environmental effects or measurement errors, can substantially change linkage results. Therefore, we recommend that researchers (re)examine their linkage scans for such outliers as part of routine quality control and robustness analysis. Such outliers can distort the search for common variants of modest effect size, but may also help identify rare variants of large effect and clinical significance. We suggest that the effect of within-family outliers deserves further investigation via theoretical and simulation studies.