Introduction

Adult height (stature) has long been recognized as an important parameter of human physical development. However, the extent to which environmental and genetic factors influence the variation of this highly complex trait is not very clear. Decades of research have identified factors underlying adult height variation. Nutrition (Kusin et al. 1992; Ruel et al. 1995), energy consumption during adolescence (Post et al. 1997), disease (Malleson 1991), and even psychosocial factors (Skuse et al. 1996) can significantly contribute to the final height an adult can achieve. Meanwhile, genetic factors also play an important role in height determination. Many studies suggested a strong familial aggregation of adult height in which the height correlation between monozygotic twins was 0.9 (Ellis et al. 2001) and the heritability estimates ranged from 0.75 to 0.95 (Phillips and Matheny 1990; Carmichael and McGue 1995; Preece 1996; Silventoinen et al. 2000).

Complex segregation analysis (CSA), as a prelude to further parameter-dependent genetic analyses (Elston and Stewart 1971; Lalouel et al. 1983; Dizier et al. 1996), is a method evaluating the transmission of complex traits within pedigrees. It has been applied to characterize genetic determination of adult height in different populations (e.g., Province and Rao 1985; Ginsburg et al. 1998; Ginsburg and Livshits 1999; Xu et al. 2002). However, the results were largely inconsistent. Several studies supported the existence of major gene(s) (MG), but with different inheritance models and different magnitudes of height variation attributable to MGs (e.g., Ginsburg et al. 1998; Ginsburg and Livshits 1999; Xu et al. 2002). In contrast, others failed to detect an MG effect even though the heritability for height was very high in their studies (e.g., Province and Rao 1985). Therefore, whether adult height variation behaves as an oligogenic or a polygenic mode of inheritance, or a mixture of the two, is still under debate. Furthermore, geneticists are still not aware whether different populations are really of different genetic backgrounds for adult height or not, although some implications appear (e.g., Ellis et al. 2001; Hirschhorn et al. 2001). Therefore, extensive segregation analysis in different populations can contribute to our understanding of the genetic determination of adult height.

To the best of our knowledge, there are still no in-depth studies on the genetics of adult height in the Chinese population. This is especially true for the familial correlation analysis and CSA. Therefore, our main purposes in the present study are (1) to study the familial correlations and evaluate the heritability of adult height adjusted for several significant covariates such as age and sex, and (2) to explore the mode of inheritance in Chinese nuclear families by CSA.

Materials and methods

Subjects

The study was approved by the Hunan Normal University and the Research Administration Departments of Shanghai’s Sixth People’s Hospital. We recruited 401 nuclear families composed of both parents and at least one healthy female offspring, totaling 1,260 individuals. All offspring are daughters whose ages were generally between 20 and 45. The average family size is 3.14, with 349, 50, 2, and 1 families having 1, 2, 3, and 4 offspring, respectively.

All the subjects involved in the study came from a local population of Shanghai City, located on the mid-east coast of the People’s Republic of China, and belong to the Han ethnic group, the majority group in China, which makes up more than 90% of the total Chinese population of 1.30 billion people (Liu et al. 2002). The mother of each family was randomly recruited from visiting patients by the Clinic of the Center for Preventing and Treating Osteoporosis of Shanghai’s Sixth People’s Hospital. All subjects signed informed-consent documents before entering the project. For each study subject, we also collected information on age, sex, medical history, family history, female history (such as age at menophania, lactation, and menopause), physical activity (such as average frequency and intensity of exercise per week), alcohol use, diet habits, and smoking history. Such information was obtained by nurse-administered questionnaires and/or medical records. Only families with healthy offspring, as defined by the exclusion criteria of Deng et al. (2002a), were included in our analyses.

In the collected database, a total of 32 individuals (fathers or mothers) have no height data; their trait values are treated as missing in our analyses. Furthermore, based on our available molecular data (restriction fragment length polymorphism [RFLP] data for six markers), nine families (2.24% of all families) probably include some non-biological offspring, as judged by inconsistency with Mendelian inheritance. These offspring were excluded from our analysis.

We also investigated pedigree heterogeneity and ruled out some “outlier” families (Ginsburg et al. 1998). The basic principle is that a positive correlation should be observed in close relatives under the assumption of strong genetic inheritance. Accordingly, in our procedure, after height is adjusted for sex and age, those families with height discrepancy among the members (except for spouse pairs) greater than 2.5 standard deviations are excluded. As a result, 3.98% of the total families were excluded.

Measurement and data adjustments

Measurements were taken in the Research Administration Departments of Shanghai’s Sixth People’s Hospital. Height was measured twice in units of centimeters by the height bar of a standard hospital scale. Subjects stood straight on the scale without shoes and with the head positioned. Readings were recorded to the nearest 0.5 cm. Because we were interested in the genetic determination of adult height, any individual less than 20 years old was excluded from our analysis as not until that age can individuals reach the final adult height (Roche and Davila 1972).

Before the genetic analyses, the data were tested for dependence on sex and age. To this end, a multiple linear regression was performed with adult height as the dependent variable, and sex (dummy variable, 0 for male and 1 for female) and age as the independent variables. The regression shows that both sex and age are significantly associated with adult height. In simple linear regression models, sex accounts for 31.7% of the total adult height variation in our sample, and age accounts for 12.7% in the male group and 19.9% in the female group. Therefore, for familial correlation analyses and CSA, the data were adjusted for age and sex by a multiple stepwise regression using a significance level of 0.1 for both inclusion and retention in the model. The variance of the adjusted data in the pooled sample is 0.569 of the initial variance, indicating that sex and age jointly account for about 43.1% of the total variation of adult height in our sample. Because the adult height data fit a normal distribution (Shapiro-Wilk test), no data transformation was applied in any analysis.
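The covariate adjustment described above can be sketched as follows. This is a minimal illustration, not the procedure actually run in the study: it fits an ordinary least-squares regression of height on sex and age and keeps the residuals as the adjusted trait, whereas the study used a stepwise procedure. The function names and the six toy records are hypothetical.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjust_height(height, sex, age):
    """Regress height on sex (0 = male, 1 = female) and age.

    Returns (coefficients [intercept, sex, age], residuals).
    """
    rows = [[1.0, float(s), float(a)] for s, a in zip(sex, age)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * h for r, h in zip(rows, height)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [h - sum(b * x for b, x in zip(beta, r)) for r, h in zip(rows, height)]
    return beta, resid


def variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Hypothetical toy records (cm, dummy sex, years); not data from the study.
height = [170.0, 168.5, 172.0, 158.0, 156.5, 160.0]
sex = [0, 0, 0, 1, 1, 1]
age = [55, 62, 48, 52, 60, 25]

beta, adjusted = adjust_height(height, sex, age)
# Ratio of adjusted to raw variance; the study reports 0.569 for its own sample.
ratio = variance(adjusted) / variance(height)
```

One minus this ratio is the share of height variation attributed to the covariates, which is how the 43.1% figure follows from 0.569.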

Familial correlation analyses

Familial correlations (spousal, parent-offspring, and sibling) and their equivalent pair counts were calculated using the FCOR program in Statistical Analysis for Genetic Epidemiology (SAGE 2002). The 95% confidence intervals for these correlations were constructed using Fisher’s z-transformation (DeStefano et al. 1996). Heritability can then be estimated from the correlation coefficients among sibling pairs, parent-offspring pairs, and spouse pairs (Rice et al. 1997). In addition, after grouping subtypes into main types, chi-square statistics and P-values were calculated to test the homogeneity of correlations among the available subtypes within each main type. Under the null hypothesis of homogeneity, the test statistic has an approximate chi-square distribution with degrees of freedom equal to the number of subtypes minus one. Further details are given in the SAGE 4.2 user manual (SAGE 2002). For instance, in our study, the null hypothesis is that there is only one group, parent-offspring, and the alternative hypothesis is that there are two subgroups, father-daughter and mother-daughter.
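As a rough illustration of the two calculations named above, the sketch below builds a confidence interval with Fisher's z-transformation and computes a textbook homogeneity statistic for several correlations. It is not FCOR's implementation: FCOR works with equivalent pair counts and may weight the subtypes differently, so its statistic need not match this one. The function names and the pair counts in the usage lines are hypothetical.

```python
import math


def fisher_ci(r, n_pairs, z_crit=1.96):
    """Approximate 95% CI for a correlation via Fisher's z-transformation."""
    z = math.atanh(r)                    # z = 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / math.sqrt(n_pairs - 3)    # large-sample standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def homogeneity_test(rs, ns):
    """Chi-square statistic and df for equality of several correlations.

    Each correlation is z-transformed and weighted by (n - 3); the statistic
    is the weighted sum of squared deviations from the pooled z.
    """
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    stat = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return stat, len(rs) - 1


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Father-daughter and mother-daughter correlations reported in the Results;
# the pair counts (380 and 390) are placeholders, not values from the study.
low, high = fisher_ci(0.409, 390)
stat, df = homogeneity_test([0.377, 0.409], [380, 390])
p_value = chi2_sf_df1(stat)
```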

Segregation analyses

Segregation analysis was performed using the program SEGREG, as implemented in the SAGE 4.2 package (2002). The class D regressive model was employed for continuous traits, assuming that the sibling correlations were equal (Bonney 1984). The general model assuming the existence of two major alleles (A and B) at an autosomal locus affecting adult height estimates the following parameters: (1) PA is the population frequency of the first of two major gene alleles. (2) μg is the average trait value (genotypic value) in all individuals having the genotype g at the major gene locus, where g=1, 2, and 3 corresponds to genotypes AA, AB, and BB, respectively. (3) σg² is the trait variance in individuals having the same major gene genotype g, which estimates the trait variation due to the influence of all possible environmental factors and potential polygenes with relatively minor effects. (4) τg is the transmission probability parameter that estimates the probability that a parent of genotype g transmits allele A to the next generation. The general model does not assume a particular mode of transmission, so τ1, τ2, and τ3 are estimated together with all other parameters. (5) ρ, β, and ε are correlations between the trait residuals adjusted for the major gene effect in spouses, parents/offspring, and siblings, respectively. These correlations measure the magnitude of covariations between relatives, which are attributable to polygenic and/or environmental effects other than major gene effects. The parameters described above correspond to those in the program SAGE. For more details, interested readers can refer to the SAGE 4.2 user manual (SAGE 2002).

The hypothesis of an MG model (also called the “Mendelian” model) or a non-MG model (also called the “environmental” model) can be tested using standard transmission probability tests (Elston 1981), in which the difference in −2 ln L (negative two times the natural-log likelihood) between a restricted (nested) model and the unrestricted model approximately follows a χ² distribution. The degrees of freedom (df) for the test are equal to the difference in the number of parameters estimated in the two models. An MG model can be accepted provided the following conditions are simultaneously met: (1) when the general model (all parameters unrestricted) is compared with the Mendelian model (the transmission probability parameters are restricted to the expected 1.0, 0.5, and 0.0, respectively), the latter is accepted (P>0.05); and (2) when the general model is compared with the “environmental” model (assuming independence of offspring genotypes, i.e., equal transmission probability τ), the latter is rejected (P<0.05).
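The decision rule above can be sketched in a few lines. The code is an illustration under stated assumptions, not SEGREG output processing: the function names are hypothetical, the chi-square tail probability uses the closed form for integer df, and the usage lines plug in the test statistics reported in the Results (1.11 and 7.19, each with df = 3).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite Poisson-type sum.
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: normal tail plus a finite correction sum.
    extra = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * extra


def lrt_statistic(neg2lnl_restricted, neg2lnl_general):
    """Likelihood-ratio statistic: difference in -2 ln L between the models."""
    return neg2lnl_restricted - neg2lnl_general


def major_gene_accepted(chi2_mendelian, df_m, chi2_environmental, df_e, alpha=0.05):
    """Apply both conditions: Mendelian not rejected, environmental rejected."""
    p_m = chi2_sf(chi2_mendelian, df_m)
    p_e = chi2_sf(chi2_environmental, df_e)
    return (p_m > alpha and p_e < alpha), p_m, p_e


accepted, p_mendelian, p_environmental = major_gene_accepted(1.11, 3, 7.19, 3)
```

With these inputs the environmental model narrowly escapes rejection at the 0.05 level, so the strict two-part rule is not met and the choice between models has to rest on additional evidence such as AIC.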

Once an MG model is accepted under the transmission probability tests, further constraining some parameters may give rise to more parsimonious models (such as additive, dominant, or recessive). In addition, Akaike’s Information Criterion (AIC), defined as AIC=−2 ln L+2 (number of parameters estimated), can be used to compare non-nested models (Akaike 1974). The best-fitting and most parsimonious model is the one that produces the minimum AIC value. If the most parsimonious model is an MG model, the proportion of variation due to the MG can be calculated based on the given estimates of parameters (Ginsburg and Livshits 1999). Standard errors (SE) of the parameter estimates are obtained in the usual way from the inverse of the matrix of second derivatives of the log-likelihood function.
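The AIC comparison amounts to the short calculation below. The −2 ln L values and parameter counts are made-up placeholders chosen only to show the mechanics; they are not the values from Table 3, and the model labels are illustrative.

```python
def aic(neg2_ln_l, n_params):
    """Akaike's Information Criterion: -2 ln L + 2 * (number of parameters)."""
    return neg2_ln_l + 2 * n_params


# Hypothetical (-2 ln L, number of estimated parameters) per candidate model.
models = {
    "Mendelian": (100.0, 8),
    "recessive": (100.5, 7),
    "environmental": (106.0, 8),
}

scores = {name: aic(*fit) for name, fit in models.items()}
best = min(scores, key=scores.get)  # model with the smallest AIC
```

In this toy case the recessive model has a slightly worse likelihood than the Mendelian model but wins on AIC because it estimates one parameter fewer.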

The significance of the familial correlations can also be tested by CSA. A nested model can be constructed in which one or more familial correlation coefficients are fixed to zero while all other parameters are treated as in the general model. This nested model is then compared with the general model by the likelihood-ratio test described above. The results should be consistent with those of the familial correlation analysis.

Results

Descriptive statistics and familial correlations

Table 1 provides basic descriptive statistics for height, age, and weight in three different groups: fathers, mothers, and daughters. The mean height without adjustment is significantly different among the three groups based on one-way analysis of variance (P<0.01). As indicated in Table 1, the male group (fathers) is generally taller than the female groups (mothers and daughters), while the daughter group is taller than the mother group. This is a common phenomenon, which may be explained by the fact that sex and age have significant effects on adult height (Abassi 1998; Seeman 1999).

Table 1 Descriptive statistics for the 1,169 subjects in 385 Chinese families (mean±standard deviation)

The familial correlation coefficients, their 95% confidence intervals, and the number of pairs are provided in Table 2. The parent-offspring correlations are positive and highly significantly different from zero (P<0.01). Similar results are observed in each sub-relationship, father-daughter and mother-daughter pairs. Though the number of sibling pairs is not large (only 60) in our sample, the correlation in sibling-sibling pairs is still significantly different from zero (P=0.02). The correlation between spouses is also positive and significant, indicating positive assortative mating. The heritability estimate for adult height is 0.647 (±0.122). These results show that adult height in the Chinese is highly heritable, with a possibly strong involvement of assortative mating.

Table 2 Correlation coefficients (95% confidence interval) of adult height among various relative pairs (adjusted for age and sex, age>20)

Equivalence of correlations among the father-daughter and mother-daughter pairs is accepted through the homogeneity test (χ² = 0.0008, df = 1, P = 0.97). This may indicate that the difference between paternal and maternal effects on adult height in Chinese is not large, though the magnitude of correlation between the father-daughter pair is smaller than that between the mother-daughter pair, 0.377 versus 0.409. This is also the reason that mother-offspring and father-offspring correlations are assumed to be equal in our CSA.

Segregation analysis

The main results of CSA of adult height are presented in Table 3. Column 1 shows a general model in which all parameters are unrestricted, so that the likelihood function reaches its global maximum and provides the best fit to the data. Column 2 provides the results of an MG (Mendelian) model that is not significantly different from the general model (χ² = 1.11, df = 3, P = 0.775). Meanwhile, as indicated in column 6, the environmental model is also not significantly different from the general model at the 0.05 level (χ² = 7.19, df = 3, P = 0.066). This suggests that the environmental model is not strongly rejected in our sample. However, our results favor the MG model. First, the P-value of 0.066 for the test of the environmental model is very close to 0.05, i.e., it is marginally significant, whereas that for the test of the MG model is 0.775. Second, according to the AIC value, the MG model fits better than the environmental model.

Table 3 Complex segregation analysis of adult height (adjusted for sex and age). For entries shown in [ ], a parameter is fixed to the shown value. Entries shown in ( ) represent the standard error of a corresponding parameter. N d, number of estimated parameters; GMG, general major gene model, namely, only the τs are fixed to be 1.0, 0.5, and 0.0, respectively; No FC, all parameters are set as the general model except for parent-offspring and sibling-sibling correlation coefficients fixed to be 0; No SP, all parameters are set as the general model except for spouse correlation coefficients fixed to be 0

Subsequent tests are applied to decipher more parsimonious inheritance models (additive, dominant, or recessive) by constraining some corresponding parameters. It can be seen that both the MG models with additive and recessive effects are not significantly different from the general model, as shown in columns 3 and 5, respectively. According to the AIC value, the MG model with a recessive effect is the most parsimonious model, with the smallest AIC value. Therefore, we chose the recessive model as the underlying MG model that may determine the variation of adult height in the Chinese. For each estimated parameter, the corresponding SEs are also listed in column 5. Relatively small SEs of the parameter estimates indicate a narrow range of possible fluctuation of the corresponding estimates. Based on the given parameter estimates, the proportion of adult height variance attributable to the MG effect is 17.2% under Hardy-Weinberg equilibrium, after excluding effects of sex and age in our sample. The model, however, shows substantial residual familial correlations between parents and offspring (β=0.380±0.045), and in particular, between siblings (ε=0.485±0.117), which are, presumably, due to the effects of polygenes and/or unknown environmental factors.
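A common way to obtain such a proportion from the model parameters is sketched below. It is an assumption-laden illustration rather than the exact formula of Ginsburg and Livshits (1999): genotype frequencies follow Hardy-Weinberg proportions, the residual variance is taken to be the same for all three genotypes, and the parameter values in the usage lines are hypothetical because the Table 3 estimates are not reproduced here.

```python
def mg_variance_proportion(p_a, mu, sigma2):
    """Share of adjusted trait variance attributable to a diallelic major gene.

    p_a    : population frequency of allele A
    mu     : genotypic means (mu_AA, mu_AB, mu_BB)
    sigma2 : residual variance within genotype (assumed equal across genotypes)
    """
    q = 1.0 - p_a
    freqs = (p_a * p_a, 2.0 * p_a * q, q * q)   # Hardy-Weinberg proportions
    mean = sum(f * m for f, m in zip(freqs, mu))
    var_mg = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mu))
    return var_mg / (var_mg + sigma2)


# Hypothetical recessive-type example: AA and AB share a mean, BB differs.
share = mg_variance_proportion(0.5, (0.0, 0.0, 2.0), 1.0)
```

In a recessive model the two genotypes carrying the dominant allele have the same mean, so the major gene variance comes entirely from the contrast between them and the recessive homozygote.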

In addition, we tested the residual familial correlations by CSA. The model denying the effects of parents-offspring and sibling-sibling correlations was examined (i.e., β=0 and ε=0), and highly significantly rejected (in column 7) against the general model (χ² = 33.09, df = 3, P<0.001). In column 8, the spouse correlation coefficient, ρ, is fixed to zero (while holding the other parameters identical to those in column 2) to test the effect of spouse correlation. This model was also statistically rejected, indicating that assortative mating is notable in the Chinese. These results are in agreement with the results in the familial correlation analyses.

Discussion

Adult height has long been recognized as an important human complex trait. However, few genetic studies on adult height have been conducted in the Chinese, the largest population in the world. We therefore performed familial correlation analyses and CSA on adult height in a Chinese sample composed of 385 nuclear families with a total of 1,169 informative individuals. As the first CSA on height in the Chinese population, our study suggests the existence of an MG with a recessive effect and a significant familial correlation for height in the Chinese. The heritability of height in the Chinese is estimated to be around 0.647 (±0.122).

Our finding is generally consistent with some early studies performed in other populations (e.g., Ginsburg et al. 1998; Xu et al. 2002). Xu et al. (2002) conducted CSA in Dutch families. The study supported an MG model with a recessive effect plus considerable residual polygenic effects. The detected MG(s) could account for about 38.1% of the total variation of adult height. In the study of Ginsburg et al. (1998), pedigree samples were collected from five ethnically and geographically different populations: Kirghizians, Turkmenians, Chuvashians, Mexicans, and Israelis. Except for the last population, the most fitting and parsimonious models were additive, and the MG(s) implied by the models were responsible for 39.8, 34.6, 53.2, 41.6, and 48.3% of the total adult height variation in each population, respectively. In the most parsimonious models suggested by Ginsburg et al. (1998), the non-MG familial effects were close to zero. This may imply, unlike the results of Xu et al., that potential polygenic effects on adult height in their samples may not be significant. In our study, an MG model with a recessive effect for height is suggested. However, the proportion of height variation attributable to the MG is only 17.2%. Thus, apart from the detected MG, other potential genetic factors, such as polygenes, may also play an important role in regulating adult height in the Chinese.

Significant height correlation is found among parent-offspring and sibling-sibling pairs in our sample (Table 2), which was also shown by many other studies (e.g., Silventoinen et al. 2000; Xu et al. 2002). It may indicate a strong familial aggregation of adult height. Significant height correlation is also detected among spouses. However, the correlation between the MG genotypic values in spouses is fixed to zero in our MG model. This indicates that assortative mating may have no significant effects on the spousal similarity due to the MG in our study. Instead, the significant observed phenotypic spouse correlation of adult height may result from other unknown genetic, environmental, or socioeconomic factors (Spuhler 1982; Mascie-Taylor 1987; Sanchez-Andres and Mesa 1994).

Population heterogeneity of the study trait is one of the important concerns for CSA as well as genetic studies using other approaches. It may lead to inconsistent results among studies in different populations (Province and Rao 1985; Ginsburg et al. 1998; Hirschhorn et al. 2001; Xu et al. 2002). This is especially true for the study of height. Linkage and/or association studies (e.g., Thompson et al. 1995; Raivio et al. 1996; Garnero et al. 1998; Lorentzon et al. 1999; Miyake et al. 1999) in different populations have reported quite a few loci potentially responsible for adult height on different human chromosomes. Several whole-genome linkage scans in different populations (Hirschhorn et al. 2001; Deng et al. 2002b; Xu et al. 2002) revealed different regions suggestive of containing QTLs underlying height variation. For instance, in the study by Hirschhorn et al. (2001) on four different populations (Botnia region of Finland, other parts of Finland, Southern Sweden, and Saguenay-Lac-St.-Jean region of Quebec), chromosomal regions with the best evidence for linkage in a given population did not show strong evidence for linkage in other populations. As for CSA studies, different inheritance models and even the absence of MG were suggested for a variety of distinct ethnic populations (Province and Rao 1985; Ginsburg et al. 1998; Ginsburg and Livshits 1999; Xu et al. 2002). These facts reflect the population heterogeneity of height as a quantitative trait and the complexity in unraveling its genetic basis. They also underscore the importance and necessity of studying this trait in different ethnic populations, as only on this basis can we identify the genetic factors common to all humans as well as those particular to a certain ethnic group.

Our sample is from an expanding database being created for studies searching for candidate genes underlying the variation of bone mineral density (BMD), in which all offspring are female because females suffer more commonly from osteoporosis than males (Cooper et al. 1992). Though some researchers argued that males might have different patterns of genetic determination of adult height compared with females (e.g., Ellis et al. 2001), it seems unlikely that these genetic factors are completely different between males and females. Therefore, our results may shed some light on the genetic mechanisms shared between males and females. The conclusions obtained from the results should at least be applicable to Chinese females.

Our inference that an MG with a recessive effect influences adult height in Chinese was based on the overwhelming majority of the studied data. In our analyses, only 3.98% of the total families, those showing pedigree heterogeneity, were ruled out. Considering that covariates such as sex and age and other environmental factors can substantially affect adult height, the exclusion of a minor part of the data should not alter the validity of the inference (Ginsburg et al. 1998). Thus, the obtained results are robust and reliable. Furthermore, our conclusion, as stated above, is generally consistent with most previous studies in different populations.

In summary, our findings suggest the existence of an MG with a recessive effect contributing to the variation of adult height in the Chinese. They also, along with some previous ones, provide evidence for ethnic differentiation in genetic determination of height. Based on the study, marker-based genetic analyses are to be implemented to further dissect the genetic basis of height and other complex traits in the Chinese population.