Introduction

Geneticists have been successful in mapping genes underlying rare, monogenic disorders with clear patterns of Mendelian inheritance.1,2,3 However, mapping genes underlying complex traits, such as common multifactorial diseases, has been more difficult.4,5 Genes can be mapped via linkage or association tests. Although both strategies exploit the cosegregation of markers with phenotypes, there are some striking differences between them. For example, in humans, genome-wide searches may require testing 30 000–500 000 single-nucleotide polymorphisms (SNPs) to detect significant associations, compared to 200–400 microsatellites in a linkage analysis.6,7,8 Moreover, theoretical work suggests that finding associations between markers and complex diseases is more powerful than searching for linkage, even if many SNPs have to be tested and significance thresholds are raised to compensate for multiple testing.9 However, not all association tests are robust to spurious associations.10 This explains why association tests using family controls, such as the transmission/disequilibrium test (TDT), have been favoured over association tests using random controls, for example, case-control, because the former are robust to spurious associations caused by population stratification, or recent admixture.11,12,13

Power studies of association tests will help researchers to design appropriate experiments, and to choose the most powerful test for the analysis of data. In this study, we investigate the power of association tests for quantitative traits, with and without family controls. Allison14 proposed five TDTs (TDTQ1–Q5) for analysing quantitative traits under different ascertainment conditions, and we have included the most powerful, TDTQ5, in this study. Long and Langley15 compared the power of five random controls association tests and the TDTQ5, and found that TDTQ5 was always the least powerful test. Nevertheless, these authors acknowledged that under population stratification, the type-I error rate of association tests using random controls can rise above the nominal level set for the experiment. Xiong et al16 proposed the TDTG, an extension of Allison's TDTQ1 that accounts for any number of sibs per family, families with one or two heterozygous parents, and any number of alleles at the marker locus. They found that TDTG is more powerful than TDTQ1, the Haseman–Elston linkage test,17 and an extreme discordant sib pair test.16 Lastly, Rabinowitz18 developed a TDT (referred to here as TDTR) to model explicitly the correlation between a quantitative trait and marker segregation.

In this study, several tests of association have been compared in terms of power, both empirically and deterministically. Our deterministic approximation for predicting power was based on the calculation of noncentrality parameters (NCPs).19 The accuracy of this methodology, which can be used to predict power of other association tests, was validated via simulation.

Materials and methods

Definition of the evaluated association tests

The power of five tests to detect linkage and/or association between a marker locus and a quantitative trait locus (QTL) was studied empirically (simulations) and deterministically (calculation of NCP). Table 1 shows all the tests used in this study: the one-way analysis of variance (one-way ANOVA), the nested analysis of variance (nested ANOVA), the TDTQ5, the TDTR, and the TDTG.14,16,18,20 A general deterministic method for predicting power at a linked marker is proposed in this study, and implementation examples are given for one-way ANOVA, TDTR, and TDTQ5. The one-way ANOVA test uses a sample of unrelated individuals who have been both genotyped and phenotyped, whereas the other four tests use the same data plus the genotypes of the parents. Recombination rate and linkage disequilibrium were denoted c and D, respectively.

Table 1 Main features of the tests compared in this study assuming family trios

One-way ANOVA

The one-way ANOVA contrasts marker genotype means among the progeny. This is the simplest and the most powerful test of association, although it is prone to high type-I error rates in the presence of spurious association, viz. disequilibrium without linkage.15 This is so because the null hypothesis (H0) being tested by one-way ANOVA is no association, regardless of linkage. Therefore, H0 could be rejected when testing unlinked marker loci (c = ½) if there was a sufficiently strong population-wide association (D≠0). This lack of robustness is common to tests that do not use family controls, for example, case–control studies.12,13 The test statistic follows an $F_{2,\,n'-3}$ distribution under H0, given a total sample size n′ and three different genotype groups.20 The sum of squares between genotype groups, after subtracting the overall mean effect, reflects differences between marker genotypes. Hence, a significant statistic suggests greater differences between genotypes than would be expected under the assumption of linkage equilibrium between the QTL and the marker. Under the alternative hypothesis (H1) of association, the distribution of the test statistic is a noncentral F with NCP equal to $\lambda_O$, or a noncentral $\chi^2_{2,\lambda_O}/2$ for large n.
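As an illustration of the contrast described above, the following minimal sketch computes the textbook one-way ANOVA F statistic from progeny phenotypes grouped by marker genotype. The function name `one_way_anova_f` and the toy phenotypes are hypothetical and are not taken from the study; the sketch only shows the between-to-within mean square ratio, not the authors' implementation.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for phenotypes grouped by marker genotype.

    `groups` is a list of lists, one per genotype class (e.g. MM, Mm, mm).
    """
    groups = [g for g in groups if g]            # ignore empty genotype classes
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(y for g in groups for y in g)
    # Sum of squares between genotype means, after removing the overall mean
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Residual sum of squares within genotype classes
    ss_within = sum((y - fmean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1                 # 2 for three genotype classes
    df_within = n_total - len(groups)            # n' - 3 for three genotype classes
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy phenotypes for MM, Mm and mm progeny (illustrative values only)
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

With three genotype classes the degrees of freedom are 2 and n′ − 3, matching the reference distribution quoted in the text.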

Nested ANOVA

A way of overcoming the lack of robustness of one-way ANOVA is to contrast marker genotype means of progeny within parental types, using a nested ANOVA design.20 Parental type represents a particular combination of parental marker genotypes, and family type a particular combination of marker genotypes across all family members (Table 2). Thus, the H0 being tested by nested ANOVA is no association within parental types. There must be at least two progeny with different marker genotypes within each parental type for there to be a contrast; therefore, only those families with at least one heterozygous parent, that is, informative families, are used. This type of family ascertainment increases the degrees of freedom (df) between groups and reduces the df within groups, resulting in a loss of power compared to one-way ANOVA. The appropriate F-test in nested ANOVA is a ratio of between to within genotype mean squares within parental types. The test follows an $F_{\alpha,\beta}$ distribution under H0, where $\alpha = \sum_{i=1}^{\gamma} (n_{g_i} - 1)$, $n_{g_i}$ is the observed number of genotypes within parental type i, γ is the observed number of parental types, and β = n − (α + γ), where n is the number of informative families.

Table 2 Probability of each family type given a biallelic marker and assuming Hardy–Weinberg equilibrium

TDTQ5

The original statistic for TDTQ5 is [(SSF − SSR)/2] / [(SST − SSF)/(n − 5)], where SST is the total sum of squares, SSR is the sum of squares explained by a reduced model that fits an overall mean and two (out of three) informative parental types as fixed factors, and SSF is the sum of squares explained by a full model that fits, in addition to the reduced model, two more fixed factors to estimate additive and dominant effects.14 The total number of informative families is n. TDTQ5 is testing whether a significant amount of phenotypic variation can be explained by marker genotypes in the progeny, over and above the variation already explained by parental type. TDTQ5 follows an $F_{2,\,n-5}$ distribution under H0 if residuals are normally distributed, or $\chi^2_2/2$ for large n. Under H1, TDTQ5 follows a noncentral $F_{2,\,n-5,\,\lambda_{Q5}}$, or a noncentral $\chi^2_{2,\lambda_{Q5}}/2$ for large n, with NCP λQ5. The TDTQ5 is equivalent to a two-way ANOVA with a cross-classified design where the factors are parental type and progeny marker genotype (Appendix B).

TDTR

Although Rabinowitz18 derived an NCP, he used parameters not included in his simulations, leading to some confusion in interpreting and calculating λR. Therefore, we developed a neater NCP for TDTR. The TDTR is calculated as $T/\sigma_T$ for a biallelic marker. T measures the strength of the covariance between the transmission of a marker allele, from heterozygous parents to progeny, and the phenotype of progeny, and σT is the standard deviation of T. We will next describe TDTR in detail, as this information will be needed for further statistical developments. The numerator is $T = \sum_{i=1}^{n} (y_i - \bar{y})\,w_i$, where yi is the phenotype of the ith child, ȳ is the overall mean (or the mean among informative families), and wi are weights given to each family type (Table 3). The sum is over n informative and unrelated family trios randomly drawn from a population. The variance of T is $\sigma_T^2 = \tfrac{1}{4} \sum_{i=1}^{n} (y_i - \bar{y})^2 h_i$, where hi is the number of heterozygous parents in the family (Table 3). Under H0, TDTR follows a $t_{n-1}$ distribution, so (TDTR)² follows an $F_{1,\,n-1}$ distribution. Under the alternative hypothesis, (TDTR)² follows a noncentral F with NCP λR, or a noncentral $\chi^2_{1,\lambda_R}$ for large n.

Table 3 Variables in TDTR

TDTG

The last test being considered is TDTG.16 For a biallelic marker, TDTG is built on the contrast ȳM − ȳm, where ȳM (ȳm) is the mean among progeny having inherited allele M (m) from heterozygous parents, and nM (nm) is the number of times allele M (m) is transmitted. The variance of (ȳM − ȳm) is

where

and yMk is the phenotype of the child of the kth parent. The latter sum is over the 2n parents in a sample of n family trios. If all family members are Mm heterozygous, then the same information is included in both allele categories. For a normally distributed trait and large n, TDTG follows a $\chi^2_1$ distribution under H0. The asymptotic distribution under H1 is a noncentral $\chi^2_{1,\lambda_G}$ with NCP λG.

Empirical power

Power was calculated empirically as the proportion of significant results out of 1000 analyses of independent data sets, simulated under specific combinations of parameter values. Each sample consisted of n=200 unrelated family trios (father, mother, and a single child). The frequencies of the positive allele (Q) from a biallelic QTL were pQ=[0.5, 0.3, 0.1], and the same frequencies were assigned to allele M from a biallelic marker linked to the QTL. The recombination rates between the marker and the QTL were c=[0, 0.1, 0.3, 0.4, 0.5]. QTL and marker genotypes were generated for all individuals. Phenotypes were generated only for the progeny by adding a normally distributed error with variance $\sigma_e^2 = 1$, plus −½, 0, or ½ for QTL genotypes qq, qQ, or QQ, respectively. Neither dominance nor polygenic effects were simulated. The level of association between allele Q at the QTL and allele M at the marker was given by the standardised linkage disequilibrium parameter D′ = [0, ½, 1].21
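The simulation scheme just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: parents are built by drawing two haplotypes at random from population haplotype frequencies (random mating), D′ is assumed non-negative, and all function names (`haplotype_freqs`, `draw_parent`, `gamete`, `simulate_trio`) and the random seed are hypothetical.

```python
import random

def haplotype_freqs(p_q, p_m, d_prime):
    """Frequencies of haplotypes QM, Qm, qM, qm (assumes D' >= 0)."""
    d = d_prime * min(p_q * (1 - p_m), (1 - p_q) * p_m)   # D = D' * Dmax
    return {"QM": p_q * p_m + d, "Qm": p_q * (1 - p_m) - d,
            "qM": (1 - p_q) * p_m - d, "qm": (1 - p_q) * (1 - p_m) + d}

def draw_parent(freqs, rng):
    """A parent is two haplotypes drawn independently from the population."""
    haps = list(freqs)
    return rng.choices(haps, weights=[freqs[h] for h in haps], k=2)

def gamete(parent, c, rng):
    """Transmit one haplotype; with probability c the marker allele recombines."""
    first, second = parent if rng.random() < 0.5 else parent[::-1]
    marker_source = second if rng.random() < c else first
    return first[0] + marker_source[1]          # QTL allele + marker allele

def marker_genotype(individual):
    return "".join(sorted(h[1] for h in individual))   # 'MM', 'Mm' or 'mm'

def simulate_trio(freqs, c, a, rng):
    """One family trio: marker genotypes for all, phenotype for the child only."""
    father, mother = draw_parent(freqs, rng), draw_parent(freqs, rng)
    child = [gamete(father, c, rng), gamete(mother, c, rng)]
    q_copies = sum(h[0] == "Q" for h in child)
    # Genotypic value -a, 0, +a for qq, Qq, QQ, plus N(0, 1) residual error
    y = a * (q_copies - 1) + rng.gauss(0.0, 1.0)
    return {"child": marker_genotype(child), "father": marker_genotype(father),
            "mother": marker_genotype(mother), "q_copies": q_copies, "y": y}

rng = random.Random(2024)                        # arbitrary seed for reproducibility
freqs = haplotype_freqs(p_q=0.3, p_m=0.3, d_prime=1.0)
trios = [simulate_trio(freqs, c=0.0, a=0.5, rng=rng) for _ in range(200)]
```

Repeating this for 1000 independent samples per parameter combination, and recording the proportion of significant test results, would give an empirical power estimate of the kind reported in the study.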

Deterministic power

We have developed a compound method with two parts for predicting power of association tests deterministically. The first part consisted in calculating the expected effect of marker genotypes as functions of underlying QTL genotypes, conditional on population parameters and family type. This part can be used to predict power in other association tests, in addition to the ones in this study. The second part consisted in calculating the NCP as a function of marker contrasts specific to each test.

Expected marker effects

Consider the 10 different family types at a biallelic marker (Table 2), and let Xj be a vector with the marker genotypes of child, father, and mother in a family of type j, for example, X1=[MM, MM, MM]. Let Gi denote the ith QTL genotype of the child, that is, G1=QQ, G2=Qq, and G3=qq. The expected phenotype (y) of a child given the ith family type, assuming no dominance, is

where a is the effect of substituting allele q for Q, assumed to be ½. The conditional probabilities P(G1|Xi) and P(G3|Xi) can be calculated using Tables W1, W2, W3, W4, available on the web (www.nature.com/ejhg/5201042).22,23 For example, the probability of QTL genotype QQ given X1 is

where P(G1 ∩ X1) is the joint probability of QTL genotype QQ in the child and marker genotype MM in all members of the family, P(X1) is the probability of family type 1 (Table 2), and h1 is the probability of drawing haplotype QM from the population which, assuming random mating and no segregation distortion, is h1 = hQM = pQ pM + DQM. (Note: DQM = D′Dmax, and if D′>0 then Dmax = min{pq pM, pQ pm}.)24 The joint probability P(G1 ∩ X1) can be obtained from Table W4 by multiplying the third and the sixth columns and summing the products. The conditional probabilities P(Gi|Xj), for i=1, 2, 3 and j=1…10, are all summarised in Table 4.
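The conditional probabilities described above can also be obtained numerically by brute-force enumeration, which may be useful as a check on tabulated formulae. The sketch below is an assumption-laden illustration rather than the authors' tables: it enumerates all parental haplotype pairs under random mating, transmits gametes with recombination rate c, and conditions on the family type. Family types are keyed as (child, parent, parent) marker genotypes with parents unordered, which is not necessarily the numbering used in Table 2; all function names are hypothetical, and `expected_phenotype` simply applies the additive coding −a, 0, a used in the simulations.

```python
from collections import defaultdict
from itertools import product

def haplotype_freqs(p_q, p_m, d_prime):
    """Frequencies of haplotypes QM, Qm, qM, qm (assumes D' >= 0)."""
    d = d_prime * min(p_q * (1 - p_m), (1 - p_q) * p_m)   # D = D' * Dmax
    return {"QM": p_q * p_m + d, "Qm": p_q * (1 - p_m) - d,
            "qM": (1 - p_q) * p_m - d, "qm": (1 - p_q) * (1 - p_m) + d}

def genotype(a, b):
    """Unordered genotype label, e.g. ('m', 'M') -> 'Mm'."""
    return "".join(sorted((a, b)))

def gamete_probs(h1, h2, c):
    """Gamete probabilities for a parent carrying haplotypes h1 and h2."""
    probs = defaultdict(float)
    for src, other in ((h1, h2), (h2, h1)):
        probs[src] += 0.5 * (1 - c)              # non-recombinant gamete
        probs[src[0] + other[1]] += 0.5 * c      # recombinant gamete
    return probs

def conditional_qtl_probs(p_q, p_m, d_prime, c):
    """P(child QTL genotype | family type) for every family type."""
    freqs = haplotype_freqs(p_q, p_m, d_prime)
    joint = defaultdict(float)                   # (family type, QTL genotype) -> prob
    for f1, f2, m1, m2 in product(freqs, repeat=4):
        p_parents = freqs[f1] * freqs[f2] * freqs[m1] * freqs[m2]
        parents = sorted((genotype(f1[1], f2[1]), genotype(m1[1], m2[1])))
        for (gf, pf), (gm, pm) in product(gamete_probs(f1, f2, c).items(),
                                          gamete_probs(m1, m2, c).items()):
            family = (genotype(gf[1], gm[1]), *parents)
            joint[family, genotype(gf[0], gm[0])] += p_parents * pf * pm
    totals = defaultdict(float)                  # marginal probability of each family type
    for (family, _), p in joint.items():
        totals[family] += p
    return {family: {g: joint[family, g] / total for g in ("QQ", "Qq", "qq")}
            for family, total in totals.items() if total > 0}

def expected_phenotype(probs, a=0.5):
    """E[y | family type] under additive coding -a, 0, +a for qq, Qq, QQ."""
    return a * (probs["QQ"] - probs["qq"])

table = conditional_qtl_probs(p_q=0.5, p_m=0.5, d_prime=0.5, c=0.1)
```

For a biallelic marker the enumeration yields the 10 family types mentioned in the text. When D′ = 0 every family type returns the population QTL genotype frequencies, as expected in the absence of association.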

Table 7 Probabilities of 4 parental haplotypes and expected frequency of QTL genotypes in progeny given Mm and Mm parents and MM, Mm or mm progeny
Table 8 Probabilities of 4 parental haplotypes and expected frequency of QTL genotypes in progeny given MM and Mm parents and MM or Mm progeny
Table 9 Probabilities of 4 parental haplotypes and expected frequency of QTL genotypes in progeny given mm and Mm parents and Mm or mm progeny
Table 10 Probabilities of 4 parental haplotypes and expected frequency of QTL genotypes in progeny given MM and MM, or mm and mm, or MM and mm parents and the marker genotype in their progeny
Table 4 Conditional QTL genotype probabilities in a child, given the family type (FT), and population parameters D, c, PM, Pm, PQ, and Pq

Noncentrality parameters (NCP)

The NCP for the one-way ANOVA (λO) can be obtained applying the formula25

The sum in Equation (2) is over all three marker genotype classes, the vector B′ contains the three marker genotype means [μMM, μMm, μmm], and X′X is a matrix with diagonal elements [nMM, nMm, nmm] and zeroes elsewhere, where ni is the sample size corresponding to marker genotype i. Equation (2) represents the sum of squares due to both the marker locus and the sample mean (μ). The appropriate λO can be obtained after subtracting from Eq. (2) the sum of squares due to the sample mean, that is, $n'\mu^2$, where n′ = nMM + nMm + nmm. When testing the QTL (ie conditioning on c=0, D′=1, and pQ=pM), and assuming no dominance, Eq. (2) simplifies to

where $\sigma^2_{QTL} = 2 p_Q p_q a^2$.26
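To make the scale of the simulated effects concrete, the short sketch below evaluates the additive QTL variance 2·pQ·pq·a² for the three allele frequencies used in the simulations, with a = ½ and residual variance 1. The heritability line assumes that the QTL and the residual are the only sources of variation, as in the simulated scenario; the function names are hypothetical.

```python
def qtl_variance(p_q, a=0.5):
    """Additive QTL variance 2 * pQ * pq * a^2 (no dominance)."""
    return 2.0 * p_q * (1.0 - p_q) * a ** 2

def qtl_heritability(p_q, a=0.5, var_e=1.0):
    """Share of phenotypic variance due to the QTL, assuming no polygenic term."""
    var_qtl = qtl_variance(p_q, a)
    return var_qtl / (var_qtl + var_e)

variances = {p: qtl_variance(p) for p in (0.5, 0.3, 0.1)}
heritabilities = {p: qtl_heritability(p) for p in (0.5, 0.3, 0.1)}
```

The QTL variance is 0.125, 0.105 and 0.045 for p = 0.5, 0.3 and 0.1, respectively, so the drop from 0.3 to 0.1 removes far more QTL variance than the drop from 0.5 to 0.3.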

In Appendix B, we have shown that TDTQ5 is equivalent to a two-way ANOVA analysis, where data are modelled fitting parental type and progeny genotype as fixed factors, in addition to μ. Taking this equivalence into account, the NCP λQ5, derived in Appendix A, is

where bi is the expected marker genotype effect in progeny of family type i (Table 3), ni is the number of type i families, Ii(j) is an indicator variable that takes the value 1 when the family is informative (viz. at least one heterozygous parent), and 0 otherwise, Fj is the mean value of the jth parental type, and fj the number of j parental types. Eq. (4) measures, in $\sigma_e^2$ units, the amount of total sum of squares explained by the marker, after subtracting the parental type effect. When testing the QTL, Eq. (4) reduces to

The NCP for TDTR (λR) is approximately

(Appendix C). When testing the QTL, Eq. (6) simplifies to

Finally, the NCP for TDTG (λG) is16

where n is the number of informative families. When testing the QTL in family trios, the appropriate NCP is16

The differences between the four NCPs λO, λQ5, λG, and λR are easily appreciated in Table 5, for both large and small sample sizes. In all cases, the QTL allele frequency and effect size only affect λ through the QTL variance.

Table 5 Noncentrality parameters (λ) given c=0 and D′=1, and distribution under H0 for small and large sample sizes

Results

Empirical versus deterministic power

We have developed formulae to calculate NCPs for one-way ANOVA (λO), TDTQ5 (λQ5), and TDTR (λR), assuming that the sample consists of family trios. Once these λ's are obtained, power can be calculated from the appropriate noncentral distributions. Xiong et al16 derived the equation for the NCP of TDTG (λG). Figure 1 shows that predictions of power using our deterministic method (lines) match the simulation results (points) very well. Power is shown as a function of c for three different allele frequencies denoted with circles (p=0.5), triangles (p=0.3), and squares (p=0.1), while averaging out D′. The NCP of nested ANOVA can also be calculated following this method; however, simulation results showed that nested ANOVA is the least powerful method by far, and therefore we concentrated on deriving the other NCPs. In addition to the close match between deterministic and empirical power, two other features in Figure 1 are worth mentioning. First, power decayed more when p dropped from 0.3 to 0.1, than when it dropped from 0.5 to 0.3. This is because the loss of information is relatively more important in the former than in the latter drop. Second, TDTQ5 was less powerful than TDTR, whereas the contrary was true in Table 6. This can be explained by the fact that, in Figure 1, TDTQ5 was implemented as described by Allison,14 that is, using only informative families, and estimating both additive and dominant effects. The NCP λQ5 was obtained assuming this model. However, the power of TDTQ5 increases when the dominant parameter need not be estimated.
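The step from an NCP to power can be sketched for the large-sample, one degree of freedom case that applies to TDTR and TDTG. A noncentral χ² variable with 1 df and NCP λ is distributed as (Z + √λ)², with Z standard normal, so power follows from the normal distribution alone. The function name `power_chi2_1df` and the default 5% level are assumptions for illustration; tests referred to an F distribution or to 2 df would need a noncentral F or χ² routine instead, which is not shown here.

```python
from math import sqrt
from statistics import NormalDist

def power_chi2_1df(ncp, alpha=0.05):
    """Power of a test whose statistic is noncentral chi-square(1 df, ncp) under H1.

    Uses the identity chi2(1, ncp) = (Z + sqrt(ncp))^2 with Z ~ N(0, 1).
    """
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)   # square root of the central chi-square(1) threshold
    shift = sqrt(ncp)
    # P(|Z + shift| > crit): upper tail plus (usually negligible) lower tail
    return (1.0 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

# Power as the NCP grows (illustrative NCP values, not results from the study)
powers = [power_chi2_1df(ncp) for ncp in (0.0, 2.0, 5.0, 10.0)]
```

At λ = 0 the function returns the nominal type-I error rate, and power rises monotonically with λ.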

Figure 1

Empirical (lines) versus deterministic (points) power for one-way ANOVA, TDTQ5, and TDTR, across c and three allele frequencies: 0.5 (circles), 0.3 (triangles), and 0.1 (squares). D′ was averaged out.

Table 6 Empirical power (%) of tests per single parameter

Power ranking with more powerful models via simulations

The power of TDTQ5 increases after removing the dominance parameter from the model when it is redundant, that is, the QTL has additive effects only. A further improvement in power, albeit slight, can be achieved by using all six parental types, whether informative or not. In doing so, TDTQ5 follows an F1,n′-4 distribution under H0, as opposed to an F2,n-5, where n′ (n) is the total number of (informative) families. Likewise, one-way ANOVA can become more powerful, fitting a simple regression line across genotypes to estimate additive QTL effects. Thus, one-way ANOVA will be distributed as F1,n′-2 under H0, as opposed to F2,n′-3. All other tests remained unchanged, and power was estimated for all via simulations.

Table 6 shows empirical power across tests, focusing on one parameter at a time (c, p or D′), averaging across the other two parameters. The ranking of the tests in terms of power was the same across scenarios: first the one-way ANOVA, followed by TDTQ5, TDTG, and TDTR (the last two with similar power), and lastly nested ANOVA. Table 6(a) shows power of the tests for a given c, averaging across values of D′ and p. The last row in Table 6(a) corresponds to the empirical type-I error for each test, ie, c = ½. The one-way ANOVA was the only test for which the empirical error exceeded the nominal 5%. This is because one-way ANOVA tests whether D′ is significantly different from zero, regardless of c.15 Power declined steadily as c increased, because the amount of $\sigma^2_{QTL}$ explained by the marker decreased as interloci distance increased.

Table 6(b) shows power for a given D′, averaging across values of p and c. The power of one-way ANOVA reached 72% when D′=1, being approximately twice as powerful as the TDTs. Undoubtedly, if spurious association is not an issue, significant extra power can be obtained by testing genotype differences directly, as opposed to using robust tests. All tests showed 5% type-I error when D′=0, even for c=0.

Finally, Table 6(c) shows power for a given p, averaging across values of c and D′. Power decays as allele frequency becomes more extreme because (1) there are fewer informative families, and (2) the proportion of informative families with two heterozygous parents decreases. The first point directly causes a reduction in sample size. The second point means that less $\sigma^2_{QTL}$ is available to TDTs. TDTs owe their robustness to the fact that they use only within-family genetic variation, which is greater in families with two heterozygous parents. These results contrast with those of Allison,14 who concluded that power increases as p decreases. However, Allison14 kept $\sigma^2_{QTL}$ constant, so as p became more extreme, the QTL effect, and the mean difference between marker genotypes, increased, resulting in more powerful contrasts.
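Both points can be quantified with a small calculation. Assuming Hardy–Weinberg equilibrium and random mating, as in Table 2, a parent is heterozygous with probability 2p(1 − p), and a trio is informative when at least one parent is heterozygous. The function name below is hypothetical and the sketch is only an arithmetic illustration.

```python
def informative_family_stats(p):
    """Return (P(informative trio), P(two heterozygous parents | informative))."""
    het = 2.0 * p * (1.0 - p)                    # HWE heterozygote frequency
    p_informative = 1.0 - (1.0 - het) ** 2       # at least one heterozygous parent
    p_two_het = het ** 2 / p_informative         # both heterozygous, given informative
    return p_informative, p_two_het

stats = {p: informative_family_stats(p) for p in (0.5, 0.3, 0.1)}
```

For p = 0.5, 0.3 and 0.1 the probability that a trio is informative is 0.75, about 0.66 and about 0.33, so out of 200 trios roughly 150, 133 and 66 are expected to be usable by the TDTs. The share of informative trios with two heterozygous parents falls from about 0.33 to 0.27 to 0.10.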

Discussion

A comprehensive review of methodology developed in the 1990s provided more than 60 references to association tests for monogenic diseases with Mendelian inheritance, and only about a dozen references to association tests for complex diseases.27 Nevertheless, complex diseases are by far the commonest human ailments; for example, infectious and parasitic diseases, psychiatric disorders, and cardiovascular diseases affect 44% of the world population, compared to just 0.05% of Caucasians being affected by cystic fibrosis, the commonest of the monogenic diseases.28,29

TDTs are increasingly used to identify QTLs underlying complex diseases because they can be more powerful than other tests, for example, linkage analysis, when markers are tightly linked to responsible QTLs, and because they are robust to spurious associations generated by common demographic events such as population stratification and/or admixture.8,10

We have developed and verified deterministic power calculations for a range of association tests for quantitative traits, that is, three TDTs and two ANOVAs, and shown how the power depends on the effect of a QTL, the recombination rate between a QTL and a marker, and the amount of linkage disequilibrium between marker and QTL. In this study, we have assumed that both loci were biallelic, and shared the same allele frequencies. Moreover, we considered a continuously distributed trait genetically determined by a single additive QTL, without polygenic component or dominance. This simplistic scenario was chosen to facilitate the derivation of NCPs for predicting power. Nonetheless, we recognise that a more comprehensive picture of the properties of these tests requires analyses of more realistic situations, for example, including dominance and polygenic effects, which is possible within the framework presented here.

The deterministic method proposed in this study consists in deriving NCPs (λ's) as functions of marker genotype contrasts specific to each test. These λ's can subsequently be used to obtain power. A common feature across all λ's was the use of expected marker genotype means, conditional on family information, under the assumptions of random mating and no segregation distortion. The marker effects were functions of the standardised linkage disequilibrium (D'), the recombination rate (c), the allele frequencies (pQ, pM), and the size of the QTL (a). Allison14 derived λQ5 for TDTQ5 when the marker is the trait locus, and we have obtained an alternative prediction of λQ5 for any recombination rate, and linkage disequilibrium in the parent population.

Power was also estimated empirically via stochastic simulations, and results confirmed the accuracy of our deterministic predictions. The advantages of deterministic over stochastic methods are (1) ease of implementation, (2) instant predictions, and (3) direct appreciation of the relationship between population parameters and power. However, deriving NCPs becomes cumbersome in complex scenarios. Thus, in these cases, empirical simulations are invaluable.

The tests ranked as follows in terms of power. The one-way ANOVA was the most powerful test of association across all scenarios, but also the only test not robust to spurious disequilibrium. The TDTs had similar, and intermediate, power. However, we showed how to increase the power of TDTQ5 compared to the original version, if there is no dominance. Lastly, the nested ANOVA was the least powerful test of association.

The power of TDTQ5 may have been previously overemphasised because complete linkage and linkage disequilibrium between marker and QTL were assumed, and family trios were sampled from a population of informative families.14 This sampling scheme means that the variance explained by the QTL is larger in the sample of informative trios than in the population at large, which would include both informative and noninformative families, and led to the counter-intuitive conclusion that the more extreme the allele frequency, the higher the power of TDTQ5 to detect associations. In addition, Allison's14 comparison between TDTQ5 and the Haseman–Elston linkage test17 favours TDTQ5 because this is a test for association, and a perfect association was assumed, whereas the Haseman-Elston test is for linkage.

In summary, a new and accurate deterministic method has been developed to predict the power of QTL detection for TDTs and ANOVAs, as a function of population parameters. We have obtained specific formulae for the NCPs of the tests, when the marker is the QTL, as functions of sample size and QTL heritability. The method contains a general part (Table 4) that can be used to calculate NCPs for other association tests. Moreover, our method can also model dominant QTL effects, and a polygenic component. Extensions to cope with multiallelic markers are theoretically possible, although future association studies in human populations are more likely to employ vast arrays of SNPs than multiallelic markers.30,31 Therefore, further developments of these approaches ought to be directed to coping with the problem of simultaneous testing of several loci, and the study of haplotypes.