Introduction

Mapping the genes underlying complex human disorders is one of the most important and challenging tasks of modern human genetics.1, 2, 3 Most common human diseases, such as diabetes, tuberculosis, hypertension, heart disease, cancer and neurodegenerative disorders, are complex traits influenced by multiple genetic and environmental factors. Quantitative endophenotypes related to a complex disease are expected to be controlled by multiple genes, each exerting a small effect. Searching for such genes requires powerful methods that use knowledge about the trait's mode of inheritance.

Genomic imprinting is a widespread genetic phenomenon that is known to shape mammalian embryonic development.4 To date, more than 70 imprinted genes affecting human and murine ontogenesis have been described. Many human diseases, such as cancer, obesity, diabetes, behavioral and cognitive disorders, may be related to imprinted genes.5, 6, 7, 8

When searching for genes related to complex disorders, the power of Genome-Wide Association (GWA) analysis can be improved through the introduction of a parent-of-origin effect (POE) in the analysis model. For quantitative traits, this approach was implemented in an extension of the family-based transmission-disequilibrium test (TDT).9 TDT analysis assesses linkage in the presence of association10 and is not confounded by population substructure as it ignores between-family variation.11 These features made the TDT especially attractive for candidate gene studies. This robustness in the face of population stratification, however, comes at the cost of reduced power, as the ignored between-family variation component contains considerable association information. Given data appropriately corrected for population substructure, a more powerful group of methods based on the so-called measured genotype (MG) approach can be implemented.12 These methods use mixed models and exploit both between- and within-family variation to achieve a higher power.

The MG mixed model implies the estimation of a polygenic component. Fitting this model using traditional Maximum Likelihood (ML) or Restricted ML (REML) methods may be very time consuming when the model is estimated from a large kinship matrix. To solve this problem, we proposed a fast two-step approximation to MG analysis called GRAMMAR (Genome-wide Rapid Association using Mixed Model And Regression).13 During the first step of GRAMMAR, an additive polygenic model is fitted and environmental residuals are estimated. These residuals are then used in the second step in association tests performed by a least squares or score method. GRAMMAR is fast enough to be suitable for GWA scans. Its power approaches that of MG implemented using REML, especially when a genomic control procedure is applied and genomic kinship is used to estimate the polygenic component.14

In this study, we aimed at implementing POE analysis within a measured genotype framework and exploring its power and efficiency in comparison with analogous TDT-based methods.

Methods

Proposed method

To perform MG-based POE analysis, we suggest introducing the parental origin of alleles as a covariate in a regression model during the second step of the fast GRAMMAR approximation to the MG approach.13 In the first step of GRAMMAR, the following mixed model is fitted:

$$y_i = \mu + \sum_j \beta_j c_{ij} + G_i + e_i$$

where yi is the quantitative trait value for the ith individual, μ is the mean, βj is the effect of the jth covariate, cij is the value of the jth covariate for individual i and Gi and ei are random polygenic and environmental effects. Environmental residuals are calculated as

$$y_i^* = y_i - \left(\hat{\mu} + \sum_j \hat{\beta}_j c_{ij} + \hat{G}_i\right)$$

where μ̂, β̂j and Ĝi are the estimates of μ, βj and Gi. In the second step, the markers are tested for association with these residuals using simple linear regression

$$y_i^* = \mu + k_g g_i + e_i \qquad (1)$$

where kg is the effect of the marker genotype and gi is a genotype code for the ith individual. Assuming the analysis of a biallelic marker with alleles A and B, the additive model can be formulated by coding the vector g with 0 for the genotype AA, 1 for a heterozygous genotype and 2 for the genotype BB. To introduce POE, however, we need to distinguish the two heterozygous genotype possibilities. Let the heterozygous genotype be AB if the A allele is maternally derived and BA if the A allele is paternally derived. In the presence of a POE, E(yi∣AB)≠E(yi∣BA). Let kp be the value of such an effect. The corresponding mean trait values given to the AB and BA genotypes are μ+kp and μ−kp, where μ is an overall mean. The model accounting for both parent-of-origin and a simple allelic effect is

$$y_i^* = \mu + k_g g_i + k_p p_i + e_i \qquad (2)$$

where pi is calculated as [(number of maternally derived A alleles)−(number of paternally derived A alleles)]. The vector p, therefore, contains the values 0, 1, −1 and 0 for genotypes AA, AB, BA and BB, respectively. Dominance is easily included in our model with the addition of one parameter.
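As an illustration of the coding described above, the following minimal sketch derives the additive code g and the parent-of-origin code p from a phased genotype. The function name `code_genotype` and the representation of a genotype as a (maternal allele, paternal allele) pair are assumptions made for this example only.

```python
def code_genotype(maternal_allele, paternal_allele, ref="A"):
    """Return (g, p) for one phased biallelic genotype.

    g: additive code (AA=0, heterozygote=1, BB=2).
    p: (maternally derived A alleles) - (paternally derived A alleles).
    """
    mat_a = 1 if maternal_allele == ref else 0
    pat_a = 1 if paternal_allele == ref else 0
    g = 2 - (mat_a + pat_a)
    p = mat_a - pat_a
    return g, p


# Phased genotypes written as (maternal allele, paternal allele):
# AA, AB (A maternal), BA (A paternal), BB.
phased = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")]
codes = [code_genotype(m, f) for m, f in phased]
print(codes)  # [(0, 0), (1, 1), (1, -1), (2, 0)]
```

The two heterozygotes share the same additive code and differ only in the sign of p, which is what allows kg and kp to be estimated jointly.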

The parental origin of the alleles of an offspring can be determined from parental genotypes. Under certain configurations, this can be unequivocally carried out from the genotypes of a biallelic marker. However, when an offspring and both parents are heterozygous at a given biallelic locus, it is impossible to determine the parental origin of the alleles from the genotypes of that single locus only. This problem can be solved when genotypes at many loci are available and considered simultaneously. In this case, maternally and paternally transmitted haplotypes can often be established. We therefore suggest using haplotype reconstruction to use the information on flanking loci when determining the parental origin of alleles.
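The single-locus case can be sketched as follows. The function `parental_origin` is a hypothetical helper, not part of any package used in this study; it enumerates the transmissions compatible with the trio genotypes and reports the origin only when exactly one remains.

```python
def parental_origin(child, mother, father):
    """Infer (maternal allele, paternal allele) of a child at one locus.

    Genotypes are unordered two-character strings such as "AB".
    Returns the phased pair, or None when the origin is ambiguous.
    Raises ValueError on a Mendelian inconsistency.
    """
    configs = {
        (m, f)
        for m in set(mother)
        for f in set(father)
        if sorted((m, f)) == sorted(child)
    }
    if not configs:
        raise ValueError("genotypes are not consistent with Mendelian transmission")
    return configs.pop() if len(configs) == 1 else None


print(parental_origin("AB", "AA", "AB"))  # ('A', 'B'): A came from the mother
print(parental_origin("AB", "AB", "AB"))  # None: all three heterozygous
```

The ambiguous case is the one that flanking markers are meant to resolve.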

The proposed method consists of two stages:

  a) Determination of the parental origin of alleles in arbitrarily complex pedigrees. For biallelic markers, this can be carried out by multipoint haplotype reconstruction according to the most likely pattern of gene flow.

  b) Using the parental origin of alleles in MG analysis as a covariate in a linear mixed model or during the second step of GRAMMAR analysis in a regression model.

Here, we compare our procedure with TDT-based POE analysis as implemented in the QTDT package.15 The basic idea of QTDT is an orthogonal decomposition of genotype scores into two vectors expressing expected genotypes and corresponding deviations. The expected phenotype μij for the jth individual in the ith family is modeled as

$$\mu_{ij} = \mu + \beta_b b_{ij} + \beta_w w_{ij} \qquad (3)$$

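where μ is a mean, bij and wij are orthogonal between- and within-family components of a genotype score and βb and βw are the corresponding regression coefficients. The null hypothesis is formalized as βw=0. To explore POE, the following regression model is fitted:

$$\mu_{ij} = \mu + \beta_b b_{ij} + \beta_w w_{ij} + \beta_{b\,\mathrm{mat}} b_{ij\,\mathrm{mat}} + \beta_{w\,\mathrm{mat}} w_{ij\,\mathrm{mat}} \qquad (4)$$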

where bijmat and wijmat are values analogous to bij and wij, but based on maternal transmission only. Under the null hypothesis, the model is fitted with βwmat=0.
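The orthogonal decomposition can be illustrated for the simplest case of a nuclear family with both parents genotyped, where the between-family component is taken as the mean of the parental genotype scores and the within-family component as the offspring's deviation from it. This is a sketch under that assumption; the function name is hypothetical and QTDT's handling of missing parents is not reproduced.

```python
def orthogonal_components(child_scores, mother_score, father_score):
    """Split offspring genotype scores into (b, w) pairs.

    b: expected genotype score given the parents (between-family part).
    w: deviation of the offspring score from b (within-family part).
    Assumes both parental genotype scores are observed.
    """
    b = (mother_score + father_score) / 2.0
    return [(b, g - b) for g in child_scores]


# Two heterozygous parents (score 1 each) and three offspring.
print(orthogonal_components([0, 1, 2], 1, 1))
# [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
```

Only w carries the transmission information tested by the TDT, which is why the between-family signal in b does not contribute to its power.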

To examine the power and efficiency of our procedure, we simulated a quantitative trait and used GRAMMAR and TDT to test the significance of the additive effect of the genotype (GR-A and TDT-A), parent-of-origin effect only (GR-P and TDT-P) and additive effect and POE together (GR-G and TDT-G) (Table 1). The proposed POE testing strategy was implemented in the GR-P and GR-G tests. The haplotype reconstruction to restore the parental origin of alleles was applied only for these tests. Comparison of these two tests to TDT-P and TDT-G allows us to assess the performance of our procedure. Comparison of GR-G to GR-A allows us to explore the power changes when introducing POE into a GRAMMAR regression model.
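To make the three GRAMMAR-based tests concrete, the sketch below regresses first-step residuals on g, on p, and on both, and forms a score-type statistic n(RSS0−RSS1)/RSS0 for each comparison against the intercept-only model. The statistic, the helper names and the toy data are assumptions for illustration; they are not taken from the software used in this study.

```python
def ols_rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def score_chi2(full_columns, y):
    """n * (RSS_null - RSS_full) / RSS_null against an intercept-only null."""
    rss0 = ols_rss([], y)
    rss1 = ols_rss(full_columns, y)
    return len(y) * (rss0 - rss1) / rss0


# Toy residuals with additive (g) and parent-of-origin (p) codes.
g = [0, 1, 1, 2, 0, 1, 1, 2]
p = [0, 1, -1, 0, 0, 1, -1, 0]
y_star = [0.9, 1.9, 1.2, 2.1, 1.1, 1.7, 1.3, 1.9]

chi2_a = score_chi2([g], y_star)     # additive only, 1 d.f.
chi2_p = score_chi2([p], y_star)     # POE only, 1 d.f.
chi2_g = score_chi2([g, p], y_star)  # additive + POE, 2 d.f.
print(chi2_a, chi2_p, chi2_g)
```

In this balanced toy design g and p are orthogonal, so the 2 d.f. statistic equals the sum of the two 1 d.f. statistics.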

Table 1 Description of tests compared by simulation study

All TDT-based tests were performed with QTDT software.15 Environmental residuals from polygenic models for GRAMMAR were estimated using ASReml.16

Haplotypes were reconstructed using MERLIN,17 without taking the LD structure into account. For haplotyping, the extended pedigrees were cut into pieces with a bit-size of 16 or less using PedSTR software18 (available on http://mga.bionet.nsc.ru/soft/index.html). All other computations were carried out using freely available R software (http://www.r-project.org).
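For orientation, the bit-size of a pedigree in Lander-Green-type algorithms is commonly computed as 2n−f, where n is the number of non-founders and f the number of founders. The short sketch below applies this formula to the simulated family structures; the function name is an assumption made for the example.

```python
def bit_size(n_nonfounders, n_founders):
    """Pedigree complexity as twice the non-founders minus the founders."""
    return 2 * n_nonfounders - n_founders


# Nuclear family: 2 founders, 3 offspring.
print(bit_size(3, 2))    # 4
# One sire with 10 dams and 9 offspring per dam: 11 founders, 90 offspring.
print(bit_size(90, 11))  # 169
```

A nuclear family is therefore well below the limit of 16 bits, whereas a complete sire family has to be split before haplotyping.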

Simulations

We used three pedigree structures in our simulations:

  • NP: 202 nuclear families with three offspring each (1010 pheno- and genotyped individuals in total).

  • IPP: 10 pedigrees, each consisting of one sire mated to 10 dams; each dam has nine offspring (1010 pheno- and genotyped individuals in total).

  • ERF: a pedigree with 9817 members, including 1010 pheno- and genotyped individuals. The phenotyped individuals are a part of the Erasmus Rucphen Family (ERF) study, performed in a young genetically isolated Dutch population.19

We simulated 51 linked biallelic SNP markers. One SNP with a minor-allele frequency of 0.1 controlled the quantitative trait, 25 flanking markers on either side of the ‘causative’ SNP were used for haplotyping. Flanking marker positions corresponded to those observed for the SNPs in the first 40 cM of chromosome 6 in the Illumina 6 k human linkage chip. Minor-allele frequencies corresponded to those observed in the ERF study. The markers were assumed to be in linkage equilibrium in founders. Under the null hypothesis, no allelic effect was assigned and only polygenic and environmental effects were simulated. Under the alternative, the trait was simulated as a sum of the main allelic effect kg, parent-of-origin effect kp, polygenic additive variance and a normally distributed environmental effect with a variance of 0.70, 0.50 and 0.20 corresponding to a total heritability (sum of variances explained by the allelic, POE and polygenic effects) of 0.3, 0.5 and 0.8. To study power, the main allelic additive effect explained 1, 2 or 3% of the total trait variation. POE explained 0.5, 1 or 1.5% of the total trait variance for additive effects explaining 1, 2 and 3%, respectively. In each simulation, 2% of genotypes were randomly deleted to simulate missing genotypic data. One thousand replicates were used to study type I error and 100 replicates to study power under each scenario considered.
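The relationship between the variance proportions and the simulated effect sizes can be sketched as follows. The sketch assumes a total trait variance of 1, Hardy-Weinberg proportions and equal frequencies of AB and BA heterozygotes, so that Var(g)=Var(p)=2q(1−q) for minor-allele frequency q; the function names are hypothetical.

```python
import math


def variance_components(total_h2, var_additive, var_poe):
    """Return (polygenic, environmental) variances for a unit-variance trait."""
    polygenic = total_h2 - var_additive - var_poe
    environmental = 1.0 - total_h2
    return polygenic, environmental


def effect_sizes(maf, var_additive, var_poe):
    """Return (kg, kp) that explain the requested variance proportions."""
    code_var = 2.0 * maf * (1.0 - maf)  # Var(g) = Var(p) under the assumptions
    return math.sqrt(var_additive / code_var), math.sqrt(var_poe / code_var)


# Heritability 0.3, additive effect 1% and POE 0.5% of the trait variance.
polygenic, environmental = variance_components(0.3, 0.01, 0.005)
kg, kp = effect_sizes(0.1, 0.01, 0.005)
print(polygenic, environmental, kg, kp)
```

Under these assumptions, halving the variance proportion reduces the effect by a factor of √2 rather than 2.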

Results

Determination of the parental origin of alleles using haplotype reconstruction

We estimated the parental origin of alleles by comparing children's and parents' haplotypes. The simplest way to use haplotype reconstruction is to consider the most likely haplotype for each individual (e.g., the ‘--best’ option of MERLIN). The efficiency of detecting the parental origin of alleles with and without haplotype reconstruction is shown in Table 2. For all pedigree structures, haplotyping proved to be a much more effective tool than the use of biallelic genotypes at the single locus. Using haplotype reconstruction allowed us to determine the parental origin of alleles in almost all heterozygous offspring in which such a determination was theoretically possible (Table 2). The probability of incorrect detection with the ‘--best’ option was very low (<0.003%).

Table 2 Efficiency of detecting the parental origin of alleles with and without haplotype reconstruction

The advantages of implementing a haplotype reconstruction procedure are especially clear in the ERF pedigree. Using single locus genotypes for ERF, the parental origins of alleles could be determined in as few as 66% of heterozygous individuals. With the haplotype reconstruction procedure, this number increased to 98%, which is very close to the upper limit.

We also attempted to derive probabilities of the parental origin of alleles weighted for haplotype likelihood using the MERLIN ‘--sample n’ option, with n=100. This procedure proved to be rather time consuming. In the ERF pedigree, it took 5–8 s to generate each additional random realization from the haplotype probability distribution. The yield of correctly restored haplotypes, however, did not exceed that obtained with the ‘--best’ option.

Type I errors and the power of GRAMMAR- and TDT-based tests to detect POE

Type I error rates at a significance level of 0.05 for the evaluated methods are shown in Figure 1. The type I errors of TDT-based tests were in good agreement with the nominal 5% level, whereas those of the GRAMMAR-based tests tended to be lower than expected. In accordance with our previous results,13 the GRAMMAR additive-only effect test was most conservative for the IPP pedigree consisting of many large sibships, and there was a weak tendency of type I errors to decrease with increasing heritability. Similar trends were also observed for the POE-only and the combined 2 d.f. test. The 95% quantiles of the test statistic's distribution and the exact values of the type I error estimates at significance levels α=0.05 and α=0.01 can be found in Supplementary Table S1.

Figure 1

Type I error for TDT-based (light bars) and GRAMMAR-based (dark bars) tests applied in the study. The three columns correspond to different pedigree structures, namely, idealized pig population pedigree (IPP), nuclear pedigrees (NP) and the Erasmus Rucphen Family (ERF). The three rows show type I errors for the three kinds of tests. The y axis shows the proportion of P-values that were less than 0.05 in simulations under the null hypothesis. The x axis indicates heritability value (30, 50 and 80%). The estimates are based on α=0.05. Error bars show 95% confidence intervals.

Figure 2 illustrates the power to detect an imprinted QTL using different methods. The mean χ2 statistic and the proportion of simulations resulting in a χ2 ⩾ the tabular value are presented in Supplementary Table S2. The GRAMMAR-based method using information on both effects (GR-G, black solid curve in Figure 2) always showed the highest power to detect an imprinted QTL; this is especially evident for the NP and ERF pedigree structures. Although GR-G and TDT-G performed similarly for the IPP pedigree with 50 and 80% trait heritability, the power of the GR-G test is underestimated here because of its conservativeness, as the same fixed thresholds were used for all methods.

Figure 2

Power to detect association by GR-A (dashed black curves, open black circles), GR-G (solid black curves and circles), TDT-A (dashed gray curves, open gray circles) and TDT-G (solid gray curves and circles). The curves are approximations calculated from linear dependency between the noncentrality parameter and the QTL effect size. Three columns show power for different pedigree structures, namely, idealized pig population pedigree (IPP), nuclear pedigrees (NP) and the Erasmus Rucphen Family (ERF). The power achieved under three heritability models (30, 50 and 80%) is shown in rows. The y axis of each panel shows power, whereas the x axis shows the proportion of total variation explained by the allelic effect of the QTL under study. POE size was set to half the allelic effect size. Circles indicate the empirical power estimates at α=0.01.

The mean GR-G test statistic was close to the sum of the corresponding GR-A and GR-P statistics (Supplementary Table S2). The TDT-G mean statistic was often lower than the sum of the TDT-A and TDT-P χ2 values, especially for the ERF pedigree.

There was a substantial gain in power for GR-G compared with GR-A (solid and dashed black curves in Figure 2). This gain increased with both heritability and larger numbers of close relatives in the pedigrees (IPP>NP>ERF). Introducing POE to TDT analysis did not result in the same trend. For the NP and IPP pedigree structures, the power of TDT-G to detect an imprinted QTL was very close to that of the conventional allelic TDT-A (gray curves in Figure 2). However, in the ERF pedigree, the power of TDT-G decreased dramatically and, in fact, was even lower than shown in Figure 2, as TDT-G and TDT-P failed to perform the test for 16–29% of all replicates because of a lack of informative data (by default, QTDT skips analysis if there are fewer than 30 informative individuals). The mean number of informative meioses was 26.3 among unsuccessful TDT-G and TDT-P realizations and 40.3 among successful ones.

Performance of GRAMMAR-based POE analysis under different trait models

Until now, we have considered a completely additive model in which the QTL had both main allelic and parent-of-origin effects on the trait. To explore the properties of our procedure when the locus under study is not imprinted, we carried out the same simulations with no POE. Under this scenario, the probability of ‘detecting’ POE with the GR-P test equals the type I error rate. The GR-A test performed slightly better than GR-G (Supplementary Figure S1), as its test statistic has one, instead of two, degrees of freedom.

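The GR-P test statistic was not inflated when analyzing a nonimprinted locus with a dominant allelic effect (data not shown). Indeed, with respect to the linear regression (2) in the absence of POE, it is easy to show that

$$E(p_i) = \frac{N_{AB} - N_{BA}}{N}, \qquad E(p_i y_i^*) = \frac{N_{AB}\,E(y_i^* \mid AB) - N_{BA}\,E(y_i^* \mid BA)}{N}$$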

where pi and yi* are the same as in equation (2); NAB, NBA and N are the counts of AB and BA heterozygotes and the total sample size, respectively. The presence of dominance will change the values of both E(yi*∣AB) and E(yi*∣BA) relative to the expectations under homozygous genotypes. However, even in the case of dominance, E(yi*∣AB)=E(yi*∣BA) when POE is absent, and given NAB≈NBA, both E(pi) and E(pi yi*) approach 0 at large N. Under these conditions, E(pi yi*)=E(pi)E(yi*)=0, that is, the covariance between p and y* is zero. The two vectors are, therefore, uncorrelated in the absence of POE, irrespective of the presence or absence of a dominance effect, and the GR-P test statistic will not be inflated.

GRAMMAR is much faster than MG and, therefore, allows for rapid GWA analysis.13 The only additional step needed to introduce POE into GRAMMAR is haplotype reconstruction. In our study, this step took 60, 20 and 2 s per replicate in the ERF, IPP and NP pedigree structures, respectively. We also performed haplotyping for real genotypic data on 5249 SNP markers that were typed genome wide in the ERF pedigree. Haplotyping was run on a single Intel Celeron 2.8 GHz processor and took 130 min. We, therefore, expect the haplotyping of 500 000 SNPs to be completed in approximately 9 days, or many times faster if run simultaneously on many processors. Intermediate steps, such as recoding haplotypes into the parameter of interest, are much less time consuming and took only several minutes. Genome-wide haplotype reconstruction can, therefore, be completed in an acceptable time frame, even in large pedigrees. It is to be noted that computational time can be much shorter if the pedigrees in question have a smaller bit-size. As haplotyping needs to be performed just once for all analyzed traits, the proposed method is feasible for GWA scans.
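The extrapolation of haplotyping time can be checked with a short calculation. It assumes that running time scales linearly with the number of SNPs and divides evenly across processors, both of which are simplifications; the function name is hypothetical.

```python
def extrapolated_days(minutes_observed, snps_observed, snps_target, processors=1):
    """Scale an observed haplotyping time linearly to a larger marker set."""
    minutes = minutes_observed * snps_target / snps_observed / processors
    return minutes / (60 * 24)


days = extrapolated_days(130, 5249, 500_000)
print(round(days, 1))  # 8.6
```

With 10 processors the same assumptions give a little under one day.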

Discussion

In this article, we proposed a new and powerful MG-based method to detect QTLs exhibiting POE; the method is sufficiently fast to be feasible on a genome-wide level. We determined the parental origin of alleles through haplotype reconstruction and introduced parental origin as a covariate into a regression model during the second step of GRAMMAR, a fast approximation of the measured genotype (MG) approach.13 The idea of using parental origin of alleles as a parameter to detect POE was already suggested in the context of qualitative TDT analysis by Weinberg.20 Here, we introduced such a parameter to model POE for quantitative traits in the MG framework. We showed that our procedure is more powerful than a traditional TDT-based approach and makes it possible to analyze data that are noninformative for analogous TDT-based methods.

The gain in power has two sources. One is that the haplotyping procedure allows the determination of the parental origin of alleles in situations that cannot be resolved from single marker genotypes. In our study, haplotype reconstruction proved to be an effective and reliable tool: the parental origin of alleles was correctly resolved in >97% of heterozygotes (Table 2). This increases data informativeness and, therefore, the power of all subsequent analyses.

The informativeness of data for GR-P analysis, as measured by the mean noncentrality parameter per measured individual, was maximal for IPP (0.0066±0.0002), intermediate for NP (0.0054±0.0002) and lowest for ERF (0.0030±0.0001). This is probably a reflection of the number of individuals for which the parental origin of alleles could be resolved. This is also likely to explain the fact that the difference in power between GR-G and GR-A tended to decrease in the same order.

The lack of informative heterozygotes could be the factor that hindered the TDT-P and TDT-G tests in ∼22% of replicates when using the ERF pedigree. QTDT only attempts to analyze a sample with at least 30 informative individuals. When testing for POE by QTDT, informative individuals are those with both parents genotyped and one parent homozygous, as only for these individuals can the parental origin of alleles be restored from genotypes at a single locus.

Another source of power gain is that the MG approach by itself is known to have a higher power compared with TDT, because it makes use of both the variation between families and information on allele transmission. Thus, even without haplotype reconstruction, GR-P and GR-G proved to be more powerful than TDT-P and TDT-G. The mean noncentrality parameter (NCP) of GR-P and GR-G without haplotyping was ∼2.4 and ∼3.1 times higher than the NCP of TDT-P and TDT-G, respectively. This means that GRAMMAR-based tests with regard to POE can achieve the same power as analogous TDT-based tests while using samples of less than half the size. This can be easily explained, as POE covariates are also presented as between- and within-family components in QTDT (bij mat and wij mat in equation (4)), and only the within-family component is tested for significance. With haplotype reconstruction, the NCP of GR-P and GR-G increased even more, and outperformed TDT-P and TDT-G by a factor of ∼3.3 and ∼3.4, respectively. GR-P and GR-G are also applicable for candidate gene analysis and for trio study designs. Although for trios the determination of the parental origin of alleles of a biallelic marker cannot be improved by linkage-based haplotyping, the haplotypes may be reconstructed on the basis of linkage disequilibrium. Given that GR-P and GR-G are more powerful than TDT-P and TDT-G even without haplotyping, they are also expected to have a higher power for trios or in cases where no information on flanking markers is available.
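The link between NCP ratios, sample size and power can be sketched as follows, assuming that the NCP grows linearly with the number of measured individuals. The power function applies to 1 d.f. tests only (such as GR-P and TDT-P), for which the noncentral χ2 statistic is the square of a shifted standard normal variable. The function names and the NCP values in the example are illustrative assumptions.

```python
from statistics import NormalDist


def power_1df(ncp, alpha=0.05):
    """Power of a 1 d.f. chi-square test with noncentrality parameter ncp."""
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = ncp ** 0.5
    return (1.0 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


def relative_sample_size(ncp_ratio):
    """Fraction of the sample needed to match the NCP of the weaker test."""
    return 1.0 / ncp_ratio


print(relative_sample_size(2.4))  # about 0.42 of the TDT sample size
print(power_1df(2.0), power_1df(2.0 * 2.4))
```

At an NCP of zero the function returns α, as expected under the null hypothesis.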

The conservativeness of the GRAMMAR-based test is consistent with our previous studies.13, 14 We may speculate that conservativeness emerges as the residuals used in the second step of GRAMMAR are derived by subtracting the estimated polygenic value from the trait value. The polygenic value also partly includes the effect of a genotype under consideration, and this leads to underestimation of the effect of the genotype and the corresponding statistics under any scenario. Owing to conservativeness, the power of GRAMMAR-based tests is underestimated, as the same tabular thresholds were used for both GRAMMAR- and TDT-based tests. It has been shown that the power of GRAMMAR tests may be further improved when empirical thresholds are used instead of tabular ones.13 It is especially relevant to IPP-like pedigree structures having high average kinship values. To obtain empirical thresholds, however, multiple simulations under the null hypothesis are required.

The other way to deal with conservativeness is to apply the genomic control procedure21 to correct the GRAMMAR test statistic (GRAMMAR-GC).14 As only a minor proportion of markers is usually expected to be truly associated with a trait, the majority of markers can be used to characterize the null distribution of the test statistic. The appropriate correction of the test statistic using this genomic control procedure allows the power of GRAMMAR to be restored to that of MG while keeping type I error at the declared level.14
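A minimal sketch of the genomic control correction is given below. It estimates the deflation (or inflation) factor λ as the ratio of the observed median test statistic to the median of the χ2 distribution under the null, and divides all statistics by it. As an assumption of this sketch, λ is not truncated at 1, so that a conservative test is corrected upwards; the function name and the input values are illustrative.

```python
from statistics import median

# Medians of the central chi-square distribution with 1 and 2 d.f.
CHI2_MEDIAN = {1: 0.4549364, 2: 1.3862944}


def genomic_control(chi2_values, df=1):
    """Return (lambda, corrected statistics) for a set of test statistics."""
    lam = median(chi2_values) / CHI2_MEDIAN[df]
    return lam, [x / lam for x in chi2_values]


lam, corrected = genomic_control([0.1, 0.2274682, 3.0])
print(lam)        # close to 0.5: the statistics are deflated
print(corrected)  # each statistic is roughly doubled
```

In a genome-wide scan the median would be taken over all tested markers rather than over three values.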

The GRAMMAR-based tests presented here outperform their TDT-based analogs even when tabular thresholds are used. The tests do not provide false-positive results for nonimprinted QTLs even in the presence of a dominant allelic effect. When the analyzed locus shows a parent-of-origin effect, the GR-G test provides a substantial gain in power compared with GR-A, which uses a main allelic effect only. The gain in power increases with heritability and with the number of informative individuals.

When POE is absent, some power may be lost with the second degree of freedom. To avoid this, we suggest applying both GR-A and GR-G for traits that are expected to show POE. To check for the significance of POE alone, the GR-P test can be used.

The power of GRAMMAR was shown to be very close to, or even the same as, that of MG.13, 14 Nevertheless, GRAMMAR-based tests can also be used to select potentially significant associations for subsequent reanalysis by MG, as was initially suggested.13

All MG-based methods may, unfortunately, generate false positives in the presence of population stratification. There are many situations, however, when population stratification bias can be ignored or corrected. For instance, an MG-based analysis may be acceptable for large pedigrees collected from isolated populations, such as the ERF pedigree used in this study. In this case, population stratification bias is expected to be minimal, whereas power gain may be substantial. Moreover, if population substructure is well defined, it can be accounted for by performing a structured association analysis. In that case, stratified analysis can be performed and the joint results could be synthesized using standard meta-analysis methods.

Finally, it should be noted that the MG test (and, consequently, GRAMMAR) is a test for co-segregation of a particular marker allele and the phenotype, that is, it is a test for linkage and/or association. If a single family of close relatives is studied, a specific large region may be co-segregating with the trait in this particular family (linkage). If the alleles in the associated region are specific to this family, but different from other (not studied) families from the same population, the MG test would be detecting linkage. On the other hand, if the data used for analysis are representative of the general study population (e.g., a large collection of families of more or less balanced size, or a large random sample from a genetically isolated population), the test will detect linkage/association with short genomic regions co-segregating with the trait in this population. Association found by either method used here, including the TDT-based methods, may be attributed to either linkage within pedigrees or linkage disequilibrium in the study population or both. Although the orthogonal model in QTDT was initially developed to test for association in the presence of linkage, it should be used in a variance components framework to allow such an analysis,15 which was not carried out here. Searching for associations not attributable to linkage is of value when aiming to refine mapping results in a region of strong linkage. In the absence of previous mapping information, the main advantage of a family-based GWA study is that linkage information is used and not ignored.

Conflict of interest

The authors declare no conflict of interest.