The limits of normal approximation for adult height

Adult height inspired the first biometrical and quantitative genetic studies and is a test-case trait for understanding heritability. The studies of height led to the formulation of the classical polygenic model, which has had a profound influence on the way we view and analyse complex traits. An essential part of the classical model is the assumption of additivity of effects and normality of the distribution of the residuals. However, it may be expected that the normal approximation will become insufficient in bigger studies. Here, we demonstrate that when the height of hundreds of thousands of individuals is analysed, the model complexity needs to be increased to include non-additive interactions between sex, environment and genes. Alternatively, the use of a log-normal approximation allowed us to still use the additive effects model. These findings are important for future genetic and methodological studies that make use of adult height as an exemplar trait.

transformation gives a slightly poorer fit in the case of height than none at all" (4). Thus, adult height often serves as an empirical example of a normally distributed biological trait in textbooks on statistics (2-4), and, historically, it has been described with an additive effects model.
A notable exception to this is the way some classical studies treat the effects of sex.
Often, analyses are stratified by sex, such as in Pearson and Lee (14), while Galton (1) pre-adjusted the height for sex by multiplying female height by 1.08. Interestingly, Solomon et al. (15) formally demonstrate that for the effects of sex the "additive model is clearly rejected, but the multiplicative model provides an acceptable fit".
To characterise current practices in human height genetics, we performed a mini-review of a semi-random sample of literature. For that, we queried Google Scholar for "human height (genetic OR epidemiological) study" on Jan 28, 2020 using the Internet Explorer web browser. We then analysed the top 50 articles that were retrieved.
Results are presented in Supplementary Table 10.
Twenty-seven manuscripts published between 2003 and 2018 (median 2010) dealt with analysis of individual-level adult height data. The manuscripts were published in prestigious journals such as Nature, Nature Genetics and American Journal of Human Genetics, and were jointly cited more than 5,000 times (minimum 7, maximum 1202, median 54). Among these papers, only one work applied a non-linear transformation of height. Only 11 applied sex-stratified analysis or sex-specific standardization of height. The practice of sex-specific standardization followed by joint analysis or meta-analysis is, in fact, similar to accounting for the effects of sex in a multiplicative (Galton's (1) and Thompson's (15)) manner. Thus, the currently dominant practice in the field of human height genetics is to treat all effects, including sex, as additive, and to assume a normal distribution of the residuals.
We should, however, note that about 1/3 of the studies follow the classical tradition of special treatment of sex by performing sex-stratified Z-transformation.

Supplementary Note 2. Peculiarities in the current model of height.
From theory as well as from practice it is well understood that the normal distribution is only an approximation to the distribution of height in human populations. In fact, the scenario most favourable to a normal approximation would imply a compound-normal distribution of height, with the total distribution being a mixture of normal distributions having different means in different sub-populations defined by sex, age, socio-economic status, and so forth. Thus, in fact, we do not assume a normal distribution of height, but rather a normal distribution of height residuals.
Practically, we also notice several peculiarities in the "beautiful regularity in the statures of a population" (a quote from (1)), which may suggest that the compound-normal distribution may not be the best approximation either, especially when we study large, diverse populations.
The first peculiarity lies in an observation, coming from the anthropometric and socioeconomic literature, that the standard deviation of height tends to be greater in taller populations (16), while the coefficient of variation (CV) across populations is rather stable (see, e.g., Figure 2 from (17)). This low value and stability of the CV of adult height within human populations is rather noticeable when compared to the distribution of the CV of weight (16) or to that of body length in other species (see Figure 2 of (18)). These observations led some social scientists and economists to postulate a log-normal distribution of height (19), implying a model under which effects multiply.
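The link between a stable CV and a log-normal model can be made explicit with the standard moments of the log-normal distribution. The sketch below is a textbook identity rather than a derivation taken from the cited works; the symbols \mu, \sigma and \beta are introduced here for illustration only.

```latex
% Assume \log Y \sim \mathcal{N}(\mu, \sigma^2).
\begin{align}
\mathrm{E}[Y]  &= \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right), \\
\mathrm{SD}[Y] &= \mathrm{E}[Y]\,\sqrt{\exp(\sigma^2) - 1}, \\
\mathrm{CV}[Y] &= \frac{\mathrm{SD}[Y]}{\mathrm{E}[Y]} = \sqrt{\exp(\sigma^2) - 1}.
\end{align}
% A factor that shifts \mu by \beta multiplies both E[Y] and SD[Y] by e^{\beta},
% so the SD grows in proportion to the mean while the CV is unchanged.
```

For small \sigma the CV is approximately equal to \sigma, so under this model a CV of a few percent corresponds to a similarly small standard deviation on the log scale.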
The second peculiarity relates to the way in which the field of quantitative genetics sometimes handles the (nuisance) effects of sex (covered briefly in Supplementary Note 1). In reasonably sized analysis groups one can observe that not only the height, but also the standard deviation of height is larger in men than in women. Interestingly, the ratio between the standard deviations of male and female height is close to the ratio between average male and female height. For future reference, we will call this observation "an approximate equality of SD and mean ratios between sexes". Several works, starting with Galton (1), dealt with sex differences in height by multiplying female height by ~1.08; and some works even formally demonstrated that the multiplicative adjustment for sex is statistically better than the additive one (15).
In some contemporary genetic studies the adjustment for the effects of sex is made by centering and scaling height in males and females separately, so that the transformed height has the same mean and variance in both sexes (see Supplementary Note 1). The latter procedure is, in fact, very similar to the multiplicative adjustment, if we trust the approximate equality of SD and mean ratios between sexes.
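The similarity between the two adjustments can be illustrated with a small simulation. This is a minimal sketch, not an analysis of real data: the female mean (162 cm), the female SD (6 cm) and the exact equality of the mean and SD ratios (both 1.08) are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_sex = 50_000

# Hypothetical values, chosen only for illustration: female mean 162 cm, SD 6 cm,
# and male mean and SD both 1.08 times larger (equal mean and SD ratios).
ratio = 1.08
female = rng.normal(162.0, 6.0, n_per_sex)
male = rng.normal(162.0 * ratio, 6.0 * ratio, n_per_sex)


def zscore(x):
    """Centre and scale one group to mean 0 and SD 1."""
    return (x - x.mean()) / x.std(ddof=1)


# Adjustment 1: multiply female height by the ratio (Galton-style), then pool.
galton_adjusted = np.concatenate([female * ratio, male])

# Adjustment 2: sex-specific standardisation, then pool.
z_adjusted = np.concatenate([zscore(female), zscore(male)])

mean_ratio = male.mean() / female.mean()
sd_ratio = male.std(ddof=1) / female.std(ddof=1)
correlation = np.corrcoef(galton_adjusted, z_adjusted)[0, 1]

print(f"mean ratio (male/female): {mean_ratio:.3f}")
print(f"SD ratio (male/female):   {sd_ratio:.3f}")
print(f"correlation between the two adjusted traits: {correlation:.5f}")
```

When the two ratios coincide, both adjustments are, up to sampling noise, the same linear rescaling of height within each sex, which is why the pooled adjusted traits are almost perfectly correlated. If the SD ratio differed from the mean ratio, the two adjustments would no longer agree.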
Several hypotheses may explain the observations summarised above, each of them having its own advantages and disadvantages. It may be that an additive model, which leads to the normal approximation, is true. The greater variance in men's height may then be explained in a number of ways. For example, Geodakyan's theory (20) postulates a narrower norm of reaction in males and hence, for a trait under stabilising selection in an outbred population, a larger total variance. Another broad explanation may be that the environmental effects on stature are distributed differently for men and women, leading to a difference in the amount of environmentally determined variance. The apparent observation that the between-sex ratios in means and SDs are approximately equal, if indeed true, may be just a coincidence. Both these explanations, however, would also predict differences in the heritability of height between males and females, whereas this difference, if any, is very small (see, for example, (21,22)). We also feel that these explanations are unnecessarily complex and that a simpler explanation, when available, should be favoured.
Another explanation is that the multiplicative hypothesis, which would lead to a log-normal approximation, may be true. However, in that case, why is the distribution of height so well described by the normal distribution (2-4), and why hasn't the log-normal distribution of height been detected before? If an additive (normal) model is used to study a trait that follows a multiplicative (log-normal) model, the groups defined by factors affecting the mean (e.g. sex, socio-economic status, genotype) should exhibit different variances because, in a log-normal distribution, the standard deviation scales with the mean. Also, when a log-normally distributed trait is studied under a normal approximation, one would expect the inclusion of multiplicative effects (i.e. epistatic and gene-environment interactions) to improve the fit of a model to the data (23), but this is not what we have seen up until now, at least for the genetic effects (see, e.g., (7)). Finally, a "hybrid" hypothesis could be proposed, assuming that some effects multiply, while others add up. For example, the effects of sex and the factors that distinguish and generate height differences between different populations may multiply, consistent with the observations from anthropologic and socioeconomic studies and some practices of handling sex effects in human genetics. At the same time, the effects of other factors, such as genes, may add up.
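Both predictions of the multiplicative hypothesis (unequal group variances and apparent interactions on the untransformed scale) can be sketched with simulated data. All parameter values below are hypothetical; in particular, the per-allele effect `b1` is deliberately far larger than a realistic SNP effect on height, so that the pattern is visible at this sample size.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Hypothetical parameters on the log scale (b1 is unrealistically large on purpose).
mu, b1, b2, sd = np.log(165.0), 0.02, np.log(1.08), 0.04
sex = rng.integers(0, 2, n).astype(float)   # 0 = female, 1 = male
g = rng.binomial(2, 0.3, n).astype(float)   # allele count 0, 1 or 2
y = np.exp(mu + b1 * g + b2 * sex + rng.normal(0.0, sd, n))


def fit_with_interaction(response):
    """OLS of response on intercept, g, sex and g*sex; returns the coefficients."""
    X = np.column_stack([np.ones(n), g, sex, g * sex])
    beta, *_ = np.linalg.lstsq(X, response, rcond=None)
    return beta


def sd_by_sex(values):
    """Standard deviation of values in females and in males."""
    return values[sex == 0].std(ddof=1), values[sex == 1].std(ddof=1)


raw_interaction = fit_with_interaction(y)[3]
log_interaction = fit_with_interaction(np.log(y))[3]
raw_sd_f, raw_sd_m = sd_by_sex(y)
log_sd_f, log_sd_m = sd_by_sex(np.log(y))

print(f"g x sex interaction, raw scale: {raw_interaction:.3f}")
print(f"g x sex interaction, log scale: {log_interaction:.5f}")
print(f"male/female SD ratio, raw scale: {raw_sd_m / raw_sd_f:.3f}")
print(f"male/female SD ratio, log scale: {log_sd_m / log_sd_f:.3f}")
```

In this sketch the data contain no interaction on the log scale, yet the additive model fitted to untransformed height picks up a g x sex term and a larger SD in the taller group. Whether such terms are detectable in practice depends on the effect sizes and the sample size.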
Although the latter hypothesis can accommodate most observations and, albeit implicitly, reflects the de facto state of affairs, we believe that now may be the time to explicitly formulate and revise our assumptions about the distribution of height. An explanation involving only additivity or only multiplicativity would be more internally consistent and parsimonious; and if the hybrid hypothesis is true, it would be interesting to see what distinguishes factors that act additively from those acting multiplicatively, and to try to understand the underlying biology.
The evidence supporting a multiplicative model and log-normal approximation of height mostly comes from anthropological comparisons of the variation of height between different populations, while the additive model is used in genetic studies of height [...] on the multiplicative scale, using the log-normal distribution, is free from these artefacts. These conceptual results may have wider implications beyond the analysis of human height per se. We feel that a careful inspection of previous reports of (potential) interactions is warranted. When effect sizes and standard deviation correlate with the mean, as is the case for human height, a multiplicative model should be considered as a simple, parsimonious explanation.
In biology, the log-normal distribution is ubiquitous (29,30). While examples of traits that are distributed log-normally are abundant, it is actually hard to name a trait that is believed to be distributed normally, with height being one of the very few but prominent examples. Some authors (31) argued that statistical analysis and reporting standards should change to account for the fact that most biological traits are multiplicative, and the most common distribution is log-normal. While we agree that, as a field, we may want to change our mind-set to consider the log-normal as the a priori most likely distribution, we appreciate the difficulty of reporting results on the log scale, or, more generally, the fact that a "scale transformation obscures rather than illuminates the description" (23). Presenting the results for multiplicative, log-normally distributed traits requires special methods (30). Although specifically for height, with its small CV, the difference between reporting results on the arithmetic and geometric scales is often minimal, we explore different ways of presenting results obtained for (log)height in Supplementary Table 8.
To conclude, here we demonstrated that height is a trait for which many effects appear to multiply. While in a homogeneous population, after stratification by sex, the distribution of height is well approximated by a normal distribution, the log-normal approximation should be considered in the analysis of big data, the analysis of heterogeneous populations, and the analysis of the extremes of height. Other variance-stabilising transformations may be considered as well (see Supplementary Note 2).
Still, the question of the distribution of adult height is far from being solved. Although we [...] (hundreds and even tens of thousands), and often distributed neither normally nor log-normally. This led to the practice of applying a quantile transformation to normality (often also called inverse-normal transformation) that, in the absence of ties, leads to a perfectly normal distribution. One usual option is to first obtain residuals from a linear regression of the trait on a set of fixed covariates (usually sex and age; often also principal components of the genomic kinship matrix), perform a quantile transformation of the residuals to normality, and then run a GWAS that estimates the additive effect of a SNP on the transformed residuals. An alternative protocol performs the quantile transformation to normality first and then runs a GWAS model that jointly estimates the additive SNP effect and the effects of covariates. When the effects of covariates and genetic polymorphisms are small, the above procedures work satisfactorily in practical terms.
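The two protocols described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline of any particular GWAS package: the function names, the rank offset of 0.5 used in the transformation, and the simulated toy data (a skewed trait with hypothetical effects of one SNP, sex and age) are all choices made here for illustration, and a single SNP stands in for a genome-wide scan.

```python
from statistics import NormalDist

import numpy as np


def inverse_normal_transform(x):
    """Rank-based quantile transformation to normality (assumes no ties)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    order = np.argsort(x, kind="stable")
    inv_cdf = NormalDist().inv_cdf
    # The offset of 0.5 keeps all probabilities strictly inside (0, 1).
    quantiles = np.array([inv_cdf((i + 0.5) / n) for i in range(n)])
    out = np.empty(n)
    out[order] = quantiles  # smallest value receives the smallest quantile
    return out


def ols(response, *predictors):
    """Return OLS coefficients (intercept first) and residuals."""
    X = np.column_stack([np.ones(len(response)), *predictors])
    beta, *_ = np.linalg.lstsq(X, response, rcond=None)
    return beta, response - X @ beta


def residualise_then_transform(trait, snp, covariates):
    """Adjust for covariates, transform the residuals, then estimate the SNP effect."""
    _, res = ols(trait, *covariates)
    beta, _ = ols(inverse_normal_transform(res), snp)
    return beta[1]


def transform_then_joint_model(trait, snp, covariates):
    """Transform the raw trait, then fit the SNP and covariates jointly."""
    beta, _ = ols(inverse_normal_transform(trait), snp, *covariates)
    return beta[1]


# Toy data with hypothetical effect sizes (not taken from the source).
rng = np.random.default_rng(3)
n = 20_000
sex = rng.integers(0, 2, n).astype(float)
age = rng.uniform(40.0, 70.0, n)
snp = rng.binomial(2, 0.3, n).astype(float)
latent = 0.1 * snp + 0.5 * sex + 0.01 * (age - 55.0) + rng.normal(0.0, 1.0, n)
trait = np.exp(0.5 * latent)  # a skewed trait

transformed = inverse_normal_transform(trait)
effect_residualise_first = residualise_then_transform(trait, snp, [sex, age])
effect_transform_first = transform_then_joint_model(trait, snp, [sex, age])

print(f"SNP effect, residualise then transform: {effect_residualise_first:.3f}")
print(f"SNP effect, transform then joint model: {effect_transform_first:.3f}")
```

Both estimates are on the transformed (standard-normal) scale, so they are not directly comparable with effects expressed in the original units of the trait.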
However, neither procedure is theoretically satisfactory, and both provide unsatisfactory results when effects become bigger. This can be understood from the following example: consider a log-normally distributed trait y, so that log(y) = mu + B1*g + B2*sex + error, where "error" is a normally distributed error, B1 is the effect of a genotype and B2 is the effect of sex. It is quite obvious that if one takes log(y) and then runs a linear model adjusting for g and sex, the residuals from this model will be distributed normally.

If, however, one first performs quantile normalisation (inverse-normalisation) of y and then adjusts for g and sex, or pre-adjusts for sex, then performs quantile normalisation, and then accounts for the effect of g, the residuals from either procedure will not be distributed normally. That results in p=0.8, 4e-5 and 3e-5 for the logarithmic transformation and the inverse-normal transformations of type 1 and 2, respectively.
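The example can be explored with a small simulation. This is a sketch under assumptions that are not stated in the text: the sample size, the values of mu, B1, B2 and the error SD, the allele frequency, and the use of a Jarque-Bera statistic as the normality test are all choices made here. It therefore does not reproduce the p-values quoted above, only the qualitative comparison.

```python
import math
from statistics import NormalDist

import numpy as np


def inverse_normal_transform(x):
    """Rank-based quantile transformation to normality (assumes no ties)."""
    n = len(x)
    order = np.argsort(x, kind="stable")
    inv_cdf = NormalDist().inv_cdf
    out = np.empty(n)
    out[order] = [inv_cdf((i + 0.5) / n) for i in range(n)]
    return out


def residuals(response, *covariates):
    """Residuals from an OLS fit of response on an intercept and covariates."""
    X = np.column_stack([np.ones(len(response)), *covariates])
    beta, *_ = np.linalg.lstsq(X, response, rcond=None)
    return response - X @ beta


def jarque_bera(r):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    r = r - r.mean()
    m2 = np.mean(r**2)
    skew = np.mean(r**3) / m2**1.5
    excess_kurtosis = np.mean(r**4) / m2**2 - 3.0
    jb = r.size / 6.0 * (skew**2 + excess_kurtosis**2 / 4.0)
    return jb, math.exp(-jb / 2.0)  # chi-square(2) survival function


# Hypothetical parameters; B1 is much larger than a realistic SNP effect.
rng = np.random.default_rng(1)
n = 100_000
mu, b1, b2, sd = np.log(165.0), 0.02, np.log(1.08), 0.04
sex = rng.integers(0, 2, n).astype(float)
g = rng.binomial(2, 0.3, n).astype(float)
y = np.exp(mu + b1 * g + b2 * sex + rng.normal(0.0, sd, n))

# Log transformation, then adjustment for g and sex.
jb_log, p_log = jarque_bera(residuals(np.log(y), g, sex))
# Inverse-normal transformation of y, then adjustment for g and sex.
jb_int_first, p_int_first = jarque_bera(
    residuals(inverse_normal_transform(y), g, sex)
)
# Pre-adjustment for sex, inverse-normal transformation, then adjustment for g.
jb_adjust_first, p_adjust_first = jarque_bera(
    residuals(inverse_normal_transform(residuals(y, sex)), g)
)

for label, jb, p in [
    ("log, then adjust", jb_log, p_log),
    ("transform, then adjust", jb_int_first, p_int_first),
    ("adjust sex, transform, adjust g", jb_adjust_first, p_adjust_first),
]:
    print(f"{label}: JB = {jb:.1f}, p = {p:.2g}")
```

With these assumed values the log-scale residuals are normal by construction. How strongly the residuals of the two inverse-normal procedures depart from normality depends on the effect sizes relative to the error SD and on the sample size, so the printed p-values should be read as illustrative only.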