Bivariate association analysis in selected samples: application to a GWAS of two bone mineral density phenotypes in males with high or low BMD

Saint-Pierre, Aude; Kaufman, Jean-Marc; Ostertag, Agnes; Cohen-Solal, Martine; Boland, Anne; Toye, Kaatje; Zelenika, Diana; Lathrop, Mark; de Vernejoul, Marie-Christine; Martinez, Maria

doi:10.1038/ejhg.2011.22


Article
Published: 23 March 2011

Aude Saint-Pierre¹,
Jean-Marc Kaufman²,
Agnes Ostertag³,
Martine Cohen-Solal³,
Anne Boland⁴,
Kaatje Toye²,
Diana Zelenika⁴,
Mark Lathrop⁴,
Marie-Christine de Vernejoul³ &
…
Maria Martinez¹

European Journal of Human Genetics volume 19, pages 710–716 (2011)

Abstract

Our specific aims were to evaluate the power of bivariate analysis and to compare its performance with traditional univariate analysis in samples of unrelated subjects under varying sampling selection designs. Bivariate association analysis was based on the seemingly unrelated regression (SUR) model that allows different genetic models for different traits. We conducted extensive simulations for the case of two correlated quantitative phenotypes, with the quantitative trait locus making equal or unequal contributions to each phenotype. Our simulation results confirmed that the power of bivariate analysis is affected by the size, direction and source of the phenotypic correlations between traits. They also showed that the optimal sampling scheme depends on the size and direction of the induced genetic correlation. In addition, we demonstrated the efficacy of the SUR-based bivariate test by applying it to a real Genome-Wide Association Study (GWAS) of Bone Mineral Density (BMD) values measured at the lumbar spine (LS) and at the femoral neck (FN) in a sample of unrelated males with low BMD (LS Z-scores ≤−2) and with high BMD (LS and FN Z-scores >0.5). A substantial number of top hits in bivariate analysis did not reach nominal significance in either of the two single-trait analyses. Altogether, our studies suggest that bivariate analysis is of practical significance for GWAS of correlated phenotypes.

Introduction

With the availability of high-density maps of single-nucleotide polymorphisms (SNPs), association studies have become popular tools for identifying genes underlying complex human traits and diseases. For most current population-based genome-wide association studies (GWAS), statistical power is often limited because of the complex interplay among factors that influence the etiology of diseases.¹ Increasing sample size and multilocus or multivariate statistical analyses can improve the power for detecting association. Sample size is often restricted because of genotyping costs and limited sample resources. Several studies have demonstrated that analyzing samples selected with extreme values can be more powerful than analyzing samples randomly selected from the population.^{2, 3, 4} In addition to using selected samples, another approach to increasing association test power is to perform joint analysis of multiple correlated phenotypes. For many common multifactorial traits, several correlated phenotypes are usually recorded for each individual during sample collection, but most often, the phenotypes are analyzed separately in a univariate framework. 
Joint analysis of correlated phenotypes can theoretically provide greater power than that provided by analysis of individual phenotypes.^{3, 5, 6, 7} Multivariate analysis can also alleviate the multiple testing problem, caused by testing different traits separately, and thereby improve the ability to detect genetic variants whose effects are too small to be detected in univariate analysis.⁸ Several multivariate approaches have been applied to linkage studies of correlated complex phenotypes, such as osteoporosis and bone-related phenotypes.^{9, 10, 11, 12} Similarly, various methods, often based on generalized estimating equations (GEEs), have been proposed for performing multivariate association tests on population- or family-based data.^{13, 14, 15, 16, 17, 18, 19, 20} Of the two studies that have investigated the power of bivariate association tests in population-based data, one applied the restricted bivariate association test that assumes the same quantitative trait locus (QTL) effects on each trait.^{16, 18} Under such constraints in the model, the relative performance of bivariate over univariate analysis may have been overestimated or underestimated. Finally, GWAS using multivariate analysis are rare, especially in samples of subjects selected through their phenotype values, and further investigations using this approach are warranted.⁴

To this aim, we evaluated the statistical properties of joint association analysis of two correlated quantitative traits in samples of unrelated subjects through simulation studies using the seemingly unrelated regression (SUR) bivariate model that allows for different QTL effects on traits. The evaluation was conducted under different situations according to the sample selection design, genetic effects and residual correlation between the traits. We demonstrate the efficacy of the SUR-based bivariate test by applying it to simultaneous GWAS analysis of two correlated bone phenotypes, bone mineral density (BMD) at the lumbar spine (LS) and at the femoral neck (FN), which are major risk factors of osteoporosis.

Materials and methods

SUR-based bivariate model

The SUR model²¹ is a generalization of the classical linear regression model that consists of several regression equations with potentially different sets of explanatory variables. It thus allows for a differential effect of explanatory variables on phenotypes as well as the possibility that some variables might be associated with only one trait. Let N be the total number of unrelated subjects (i=1, …, N), each having observations on two phenotypes y_ji (j=1, 2). Consider a system of two equations, where the jth equation is of the form y_j=X_j × β_j+e_j; y_j is an N × 1 vector of the phenotypic values, X_j is an N × (K_j+1) matrix of explanatory variables, with K_j representing the number of explanatory variables in the model for phenotype j excluding the intercept; β_j is the (K_j+1) × 1 vector of coefficients and e_j is an N × 1 vector of residual errors. The system of SUR can be written as:

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} \qquad (1)$$

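The SUR model allows for cross-equation correlation of the residual terms. The stacked residuals are assumed to be normally distributed with mean 0 and covariance matrix Ω, where I_N is the N × N identity matrix and Σ is a 2 × 2 matrix with the following form:

$$\Omega = \Sigma \otimes I_N, \qquad \Sigma = \begin{pmatrix} \sigma_1^2 & r_E\,\sigma_1\sigma_2 \\ r_E\,\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} \qquad (2)$$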

σ₁² and σ₂² are the residual variances of Y₁ and Y₂, respectively, and rE is the residual correlation between Y₁ and Y₂.

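The SUR model is estimated using the generalized least squares method, where the covariance matrix Ω is first estimated from the residuals of ordinary least squares regression in system (1). Linear restrictions on coefficients can be tested by an F test. The F statistic for systems of equations is:

$$F = \frac{(C\hat\beta)^{t}\,\left[C\,(X^{t}\hat\Omega^{-1}X)^{-1}\,C^{t}\right]^{-1}(C\hat\beta)\,/\,2}{\hat e^{t}\,\hat\Omega^{-1}\,\hat e\,/\,(2N-K)}$$

where X is the block-diagonal design matrix of the system, β̂ is the stacked vector of estimated coefficients, ê is the stacked vector of residuals and C is the matrix of restrictions on coefficients (here, two restrictions of the form Cβ=0). Under the null hypothesis, the F statistic has a central Fisher distribution with 2 and 2 × N−K degrees of freedom, where K is the total number of estimated coefficients (K=K₁+K₂+2). The goodness of fit of the whole system can be measured by McElroy's R-square (R²). R² is the proportion of (co)variance explained by X, taking into account the residual covariance matrix Ω.²²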

Here, we applied the SUR model to test association between two continuous phenotypes in unrelated subjects genotyped at one SNP marker, and X_j contains, besides the intercept, the N × 1 vector of genotypes at the SNP. Under an additive model, the genotype for each individual i, noted g_i, is coded as the number of minor alleles, that is, 0, 1 or 2. We computed the SUR model free of constraints on the regression coefficients, that is, β₁ and β₂ were freely estimated. Under the null hypothesis of no association to either one or both phenotypes, the F statistic has a central Fisher distribution with 2 and 2 × (N−2) degrees of freedom. Separate association analyses of Y₁ and Y₂ can be conducted using a traditional univariate linear regression model: y_j=g × β_j+e_j, where y_j, g and β_j are as described above but now e_j is assumed to follow a normal distribution N (0, σ_j²). The null hypothesis of no association (β_j=0) can be tested against the alternative (β_j≠0) with a Student's t statistic (t-test) with N−2 degrees of freedom.
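The single-SNP case described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' systemfit code. It relies on two assumptions that are not stated in the text: (i) because both equations share the same regressors (intercept and genotype), the generalized least squares estimates coincide with the equation-by-equation ordinary least squares estimates, and (ii) the residual covariance matrix is estimated with divisor N−2, in which case the denominator of the system F statistic equals 1. The function names `simple_ols` and `sur_snp_test` and the toy data are hypothetical.

```python
import math

def simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns slope, residuals and Sxx."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return slope, resid, sxx

def sur_snp_test(g, y1, y2):
    """Bivariate F test of b1 = b2 = 0 and the two univariate t statistics."""
    n = len(g)
    b1, e1, sxx = simple_ols(g, y1)
    b2, e2, _ = simple_ols(g, y2)
    dof = n - 2  # assumed divisor for the residual covariance estimate
    s11 = sum(e * e for e in e1) / dof
    s22 = sum(e * e for e in e2) / dof
    s12 = sum(u * v for u, v in zip(e1, e2)) / dof
    det = s11 * s22 - s12 ** 2
    # Wald quadratic form Sxx * b' Sigma^{-1} b for the two slopes
    wald = sxx * (s22 * b1 ** 2 - 2 * s12 * b1 * b2 + s11 * b2 ** 2) / det
    f_stat = wald / 2.0  # two restrictions; denominator is 1 under assumption (ii)
    d2 = 2 * n - 4       # denominator degrees of freedom, 2 x (N - 2)
    # Closed-form upper tail of an F(2, d2) distribution
    p_value = (1.0 + 2.0 * f_stat / d2) ** (-d2 / 2.0)
    t1 = b1 / math.sqrt(s11 / sxx)
    t2 = b2 / math.sqrt(s22 / sxx)
    r_e = s12 / math.sqrt(s11 * s22)
    return {"F": f_stat, "p": p_value, "t1": t1, "t2": t2, "rE": r_e}

# Toy data (hypothetical): genotypes coded 0/1/2 and two phenotypes
g = [0, 1, 2, 0, 1, 2, 1, 0, 1, 2]
y1 = [0.1, 0.5, 1.2, -0.3, 0.4, 0.9, 0.7, 0.2, 0.3, 1.1]
y2 = [0.3, -0.1, 0.2, 0.5, 0.0, -0.4, 0.1, 0.6, -0.2, 0.1]
result = sur_snp_test(g, y1, y2)
```

Under these assumptions the statistic can be rewritten as F = (t₁² + t₂² − 2·rE·t₁·t₂) / (2·(1 − rE²)), where rE is the estimated residual correlation. With a positive residual correlation, F is therefore larger when t₁ and t₂ have opposite signs than when they have the same sign.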

Simulation study

We considered genetic models of complex traits and specifically tried to generate correlated data, mimicking as much as possible our real BMD GWAS data (see below). As a strong (∼0.5) and positive phenotypic covariation exists for BMD values at the LS and at the FN,²³ we generated data for two positively correlated quantitative phenotypes. Further, in real data sets, as causal loci usually contribute a small proportion to the total phenotypic correlation, residual correlation approximates phenotypic correlation between traits. It is also more realistic to assume that the investigator has a priori knowledge on the magnitude and sign of the covariation of the studied phenotypes than on the magnitude and sign of the QTL effect on each phenotype. Therefore, in all our scenarios, the sign of the residual correlation (rE) was positive, but the sign of the induced QTL correlation (rG) was either positive or negative. Also, our BMD GWAS study used a sampling design, with extreme truncate selection of unrelated males, aiming to improve power. Therefore, we also generated samples of subjects drawn from the extremes of the phenotype(s) population distribution.

The main scenarios and parameter settings are shown in Table 1. The different settings allowed us to generate data for a QTL having the same or different effects on the two positively correlated phenotypes, with the two sources of covariation (QTL and residual) having the same or opposite signs. Briefly, we assumed a biallelic QTL having additive effects (a_j) on Y_j (j=1, 2), with minor and major allele frequencies q and p, respectively. The QTL contribution to Y_j is the trait-specific QTL heritability, h_j². Here, we focused our power investigation on QTLs explaining a relatively small part of the trait variance, that is, from 0.5 to 3%, which seemed to us more realistic for complex traits. The genotypic means (m_jk) of Y_j are equal to 2q × a_j, (q−p) × a_j and −2p × a_j when k, the number of minor alleles, is equal to 0, 1 and 2, respectively, and with $h_j^2 = 2pq\,a_j^2/\mathrm{Var}(Y_j)$. We varied the sign of a_j: both were of same or opposite sign and the QTL correlation (rG) was, thus, equal to +1 or −1, respectively. We first generated samples of subjects unselected for their trait values (denoted as Su). Second, we generated subjects selected from the 2.5% (ie, trait value ≤−2) and 30% (ie, trait value >0.5) left and right tails of the population distribution of Y₁ (denoted as S1), respectively. Third, we included Y₂ in the selection design, that is, we selected subjects from the 2.5 and 30% left and right tails of the population distribution of Y₁ and Y₂, respectively (denoted as S2). These truncate selection criteria (trait value ≤−2 or >0.5) are the values that we have used in our real BMD GWAS. Under S1 and S2, we generated samples with equal numbers of subjects drawn from the left (N/2) and the right (N/2) side of the phenotype distributions.
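As a quick check of this parameterization, the sketch below computes the additive effect and the three genotypic means from q and the QTL heritability, and verifies that they average to zero and carry a variance equal to the heritability. It assumes traits with unit total variance, which is consistent with the −2 and 0.5 thresholds cutting off about 2.5% and 30% of a standard normal distribution but is not stated explicitly in the text. The function name `genotypic_means` is hypothetical.

```python
import math

def genotypic_means(q, h2, sign=1, total_var=1.0):
    """Additive effect a and genotypic means for k = 0, 1, 2 minor alleles.

    Assumes a total trait variance of `total_var` (1.0 by default), so that
    the QTL variance 2*p*q*a^2 equals h2 * total_var.
    """
    p = 1.0 - q
    a = sign * math.sqrt(h2 * total_var / (2.0 * p * q))
    means = {0: 2 * q * a, 1: (q - p) * a, 2: -2 * p * a}
    freqs = {0: p * p, 1: 2 * p * q, 2: q * q}  # Hardy-Weinberg proportions
    return a, means, freqs

a, means, freqs = genotypic_means(q=0.4, h2=0.01)
qtl_mean = sum(freqs[k] * means[k] for k in means)
qtl_var = sum(freqs[k] * means[k] ** 2 for k in means)
print(f"a = {a:.4f}, QTL mean = {qtl_mean:.2e}, QTL variance = {qtl_var:.4f}")
```

Passing `sign=-1` for one of the two traits gives additive effects of opposite sign, that is, the rG=−1 scenarios.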

Table 1 Outline of the main scenarios and varying parameter values in the bivariate data simulations


Trait values of N (300 or 1000) unrelated subjects were generated as follows. For a given combination of parameter values (rE, h²₁, h²₂, rG), we first drew QTL alleles from a binomial distribution with parameter q, and built genotypes under Hardy–Weinberg equilibrium. Second, conditionally on the generated genotype, g_k (k=0, 1, 2), we jointly drew the values of Y₁ and Y₂ from a bivariate normal distribution with mean (m_1k, m_2k)^t and variance matrix Ω, given in equation (2). Third, under sampling S1 or S2, we applied the corresponding truncate selection, that is, individuals not fulfilling the selection criteria were withdrawn from the sample. Steps 1–3 were repeated until reaching the required left and right truncated sample sizes of (N/2) subjects.
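The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: total trait variances are set to 1, residual variances to 1−h²_j, and under S2 the low group is defined by Y₁≤−2 only while the high group requires both Y₁>0.5 and Y₂>0.5, mirroring the selection rule reported for the real BMD sample. The function name `simulate_sample` and its arguments are hypothetical.

```python
import math
import random

def simulate_sample(n, q, h2_1, h2_2, rg_sign, r_e, design="Su", seed=0):
    """Return a list of (genotype, y1, y2) tuples under design Su, S1 or S2."""
    rng = random.Random(seed)
    p = 1.0 - q
    a1 = math.sqrt(h2_1 / (2 * p * q))
    a2 = rg_sign * math.sqrt(h2_2 / (2 * p * q))
    centred = {0: 2 * q, 1: q - p, 2: -2 * p}   # genotypic means divided by a_j
    s1, s2 = math.sqrt(1 - h2_1), math.sqrt(1 - h2_2)  # assumed residual SDs

    def draw():
        # Step 1: two independent alleles, minor allele with probability q
        k = int(rng.random() < q) + int(rng.random() < q)
        # Step 2: correlated residuals built from two independent normals
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y1 = centred[k] * a1 + s1 * z1
        y2 = centred[k] * a2 + s2 * (r_e * z1 + math.sqrt(1 - r_e ** 2) * z2)
        return k, y1, y2

    if design == "Su":
        return [draw() for _ in range(n)]

    # Step 3: truncate selection, repeated until both tails hold n/2 subjects
    low, high, half = [], [], n // 2
    while len(low) < half or len(high) < half:
        k, y1, y2 = draw()
        is_low = y1 <= -2.0
        is_high = y1 > 0.5 and (design == "S1" or y2 > 0.5)
        if is_low and len(low) < half:
            low.append((k, y1, y2))
        elif is_high and len(high) < half:
            high.append((k, y1, y2))
    return low + high

sample = simulate_sample(20, q=0.4, h2_1=0.01, h2_2=0.01, rg_sign=-1, r_e=0.6, design="S1", seed=1)
```

Because only about 2.5% of draws fall in the left tail, roughly 40 subjects are generated for each low-tail subject retained.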

Each replicate was analyzed with SUR-based bivariate and with two separate univariate analyses using the systemfit package of R software (http://www.r-project.org/) using the genotypes at the QTL, that is, the SNP is the causal variant. The mean and standard deviations of each association statistic (F test and t₁, t₂-tests) were derived from K replicates. Power and type I error rates of each association test were calculated as the proportion of replicates with a test statistic exceeding a given theoretical threshold (Rα) value, at nominal significance levels, α=5, 1, 0.1 and 10⁻³%. Type I error rates were estimated in the settings where h²₁=h²₂=0 with K=20 000 replicates. Power rates were derived with K=1000 replicates. To compare the performance of bivariate and that of univariate association analysis, we computed the proportion of replicates where t₁ and t₂ were both lower than Rα. One minus this proportion estimated the probability to detect association to either one of the two phenotypes. To adjust for the two univariate association tests, we applied the Bonferroni correction, that is, we used the theoretical thresholds Rα/2.
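The rate calculations described above can be sketched as below. For simplicity the sketch works with per-replicate P-values rather than with statistic thresholds, which is equivalent for a given α. The function name `empirical_rates` and the toy P-values are hypothetical.

```python
def empirical_rates(p_f, p_t1, p_t2, alpha):
    """Proportion of replicates declared significant at level alpha.

    p_f        : P-values of the bivariate F test, one per replicate
    p_t1, p_t2 : P-values of the two univariate t-tests
    """
    k = len(p_f)
    # Bivariate test
    sur = sum(p < alpha for p in p_f) / k
    # One minus the proportion of replicates where neither univariate test rejects
    neither = sum(a >= alpha and b >= alpha for a, b in zip(p_t1, p_t2)) / k
    univariate_either = 1.0 - neither
    # Bonferroni correction for two tests: each tested at alpha / 2
    univariate_bonferroni = sum(min(a, b) < alpha / 2 for a, b in zip(p_t1, p_t2)) / k
    return {"SUR": sur, "U": univariate_either, "U_b": univariate_bonferroni}

# Four toy replicates (hypothetical values)
rates = empirical_rates(
    p_f=[0.001, 0.2, 0.04, 0.5],
    p_t1=[0.03, 0.3, 0.5, 0.6],
    p_t2=[0.2, 0.04, 0.02, 0.7],
    alpha=0.05,
)
print(rates)
```

The same function gives type I error rates when the replicates are generated under h²₁=h²₂=0 and power rates otherwise.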

Results

Simulation study

Tables 2 and 3 present the mean (and SD) association statistic of the SUR-based bivariate (F test) and of the traditional univariate tests (t-test), respectively, when N=1000 for 66 scenarios under the alternative hypothesis and when q=0.4. For a given QTL heritability value, the results did not vary, as expected, with q.

Table 2 Mean and SD (μF (SD)) of the SUR-based bivariate association statistic (F test) in samples of N=1000 subjects for various parameter settings: QTL heritability (h²₁/h²₂), sign of the induced genetic correlation (rG), residual correlation (rE) and sampling selection design


Table 3 Mean and SD (μt (SD)) of the traditional univariate association statistic (t-test) in samples of N=1000 subjects for various parameter settings: QTL heritability (h²₁/h²₂), sign of the induced genetic correlation (rG), residual correlation (rE) and sampling selection design


Bivariate association statistics

In randomly selected samples, the results in Table 2 show several well-established power figures. First, mean F statistics of bivariate association analysis increase with the size of the trait-specific QTL heritability (h²₁ and/or h²₂) irrespective of rG and rE. Second, the power is higher in the presence (rG≠0) than in the absence (rG=0) of pleiotropic effects: the highest power is achieved when rG=−1, that is, when the correlation induced by the QTL effect and the residual correlation are opposite in sign. Third, the results also confirm that the power of the bivariate association test varies with the size of the residual correlation: when rG=0 or rG=−1, the power increases with rE; conversely, when rG=+1, it decreases with rE. These general trends are observed irrespective of the sampling selection designs. Applying extreme truncate selection increases the power of bivariate association analysis, but the optimal selection design depends on the true genetic model. When rG=0 or rG=−1, extreme selection on one trait (S1) is more efficient than extreme selection on both traits (S2). Conversely, when rG=+1, S2 is more efficient than S1. Overall, under Su or S1, the highest mean F statistics are obtained when rG=−1, irrespective of rE. Under S2, the highest power is achieved when rG=+1 or when rG=−1, depending on the size of rE. Interestingly, when the traits are moderately (rE=0.20) correlated, mean F statistics have greater values when rG=+1 than when rG=−1.

Univariate association statistics

Table 3 shows again several well-established power figures. In randomly selected samples, the power of univariate analysis increases with the QTL heritability (h²₁/h²₂) and varies little with the size of the residual correlation, rE. For phenotype Y₁, under a given QTL heritability (h²₁) value, the mean statistic values of all models are similar in the randomly selected samples. Applying extreme truncate selection increases the power of univariate association analysis of Y₁. Under S1, the power remains similar whatever may be the rG value. Under S2, the power is the highest and the lowest for the pleiotropic models rG=+1 and rG=−1, respectively. When rG=−1 or rG=0, the power of univariate association analysis is greater under S1 than under S2. The reverse trend is obtained when rG=+1. For phenotype Y₂, the power of univariate analysis depends on rG and rE. Further, applying extreme selection does not always lead to a gain in power. Indeed, when rG=−1, the power of univariate analysis is the greatest in the unselected samples (Su). When rG=0, the mean t-statistic values in the selected samples are biased and inflated. The magnitude of the bias is greater under S2 than under S1. Under S1, the bias increases with rE.

Overall, applying selection criteria on one or both traits is an optimal sampling design when rG=+1: the power of each separate univariate analysis is improved over that in randomly selected samples. When rG=−1, applying extreme truncate selection leads to a substantial gain in power for Y₁ and a substantial decrease in power for Y₂. For the situations in which the QTL does not exert pleiotropic effects (rG=0), the highest power of univariate analysis of Y₁ is obtained in the selected samples. However, the mean t-statistic values for Y₂, the trait not associated to the QTL, are also increased. Type I error rates of separate univariate analyses may thus be inflated, especially in selected samples and when the residual correlation is high.

Type I error rates

When the QTL/SNP has no effect on Y₁ and Y₂, the values of the mean and standard deviation of both bivariate and univariate association tests are close to the theoretical values, regardless of the residual correlation, the minor allele frequency of the studied SNP and the selection sampling design (Supplementary Table 1A). Indeed, the SUR-based bivariate test and each separate univariate association test have correct type I error rates (Supplementary Table 1B). However, the false positive rates of univariate association analyses for detecting association to either or both of the two traits are, as expected, inflated: the estimated rates are roughly two times higher than the theoretical rates. Applying a Bonferroni correction (denoted as U_b) leads to slightly conservative significance levels, especially when the residual correlation between the traits is strong.
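The "roughly two times" inflation and the conservativeness of the Bonferroni correction can be illustrated with a short calculation. The sketch assumes two independent tests, which is a simplification: for correlated traits the two statistics are correlated and the rate of at least one false positive typically lies between α and the value computed here. The function name `familywise_rate` is hypothetical.

```python
def familywise_rate(alpha, n_tests=2):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

for alpha in (0.05, 0.01, 0.001):
    uncorrected = familywise_rate(alpha)      # each test at level alpha
    corrected = familywise_rate(alpha / 2.0)  # Bonferroni: each test at alpha / 2
    print(f"alpha={alpha}: uncorrected={uncorrected:.6f}, Bonferroni={corrected:.6f}")
```

Under independence the corrected rate is already slightly below α, and positive correlation between the two tests lowers it further, in line with the conservativeness noted above for strongly correlated traits.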

Power comparisons

The power to detect association to either or both of the two traits using SUR-based bivariate analysis was compared with the power of separate univariate analysis of Y₁ and Y₂ adjusted for multiple testing by the Bonferroni correction (denoted as U_b). Figure 1a shows the power curves (at significance of 10⁻⁵) against the QTL heritability (h²₁, h²₂) when N=1000 for moderately (rE=0.2) or strongly (rE=0.6) correlated traits. Power curves under S1 and S2 are shown in Figure 1b, when h²₁=h²₂=0.005, N=1000 and rE=0.2 or 0.6.

In randomly selected samples (Figure 1a), the relative advantage of SUR-based bivariate over univariate association analysis is more obvious not only when rG=−1 and/or the traits are strongly correlated (rE=0.6) but also when rG=+1 and the traits are moderately correlated (rE=0.2). Under S1 (Figure 1b), SUR-based bivariate is slightly less powerful than univariate analysis when rG=+1 and rE=0.6 or when rG=0 and rE=0.2. For strongly correlated traits, the power rates are equal to 94.5% (SUR) versus 29.3% (U_b) when rG=−1; 44.0% (SUR) versus 32.3% (U_b) when rG=0; and 36.8% (SUR) versus 39.9% (U_b) when rG=+1. For moderately correlated traits, the power rates are equal to 64.6% (SUR) versus 31.7% (U_b) when rG=−1; 32.9% (SUR) versus 34.9% (U_b) when rG=0; and 43.7% (SUR) versus 32.6% (U_b) when rG=+1. Under S2 (Figure 1b), SUR-based bivariate shows same or slightly lower power than univariate analysis, except when rG=−1 or when rG=0 and rE=0.6 where it outperforms univariate test. As already noted above, selecting on Y₁ (S1) is the most efficient sampling design when rG=−1 or when rG=0 and the traits are strongly correlated (rE=0.6). Selecting on both traits (S2) is the most efficient design when rG=+1. Overall, when rE=0.6, the power of SUR is the greatest (94.5%) when rG=−1 and under S1, whereas the power of univariate analysis is the greatest (56.8%) when rG=+1 and under S2. When rE=0.2, the power of SUR and univariate analysis are both the greatest (72.5 and 72.9%) when rG=+1 and under S2. As shown in Supplementary Table 2, all these trends are confirmed under various parameter settings.

Analyses of empirical BMD genome-wide association data

BMD GWAS data

Subjects were recruited from the Network in Europe on Male Osteoporosis Study.^{24, 25} Subjects selected from this cohort were unrelated males >18 and <68 years of age. In addition, the subjects were selected by bone densitometry (measured at the LS and FN) criteria, having either low BMD (LS Z-scores ≤−2, n=175) or high BMD (both LS and FN Z-scores >0.50, n=155). Further details of the study sample are provided in Supplementary Table 3. Genotyping was carried out at the Centre National de Génotypage (Evry, France) using the Illumina 370K platform (Illumina, San Diego, CA, USA). SNPs and DNA data were subjected to standard quality control analyses with PLINK²⁶ (details are provided in Supplementary Methods).

Association analysis

Our primary analysis was the joint association analysis of LS Z-scores and FN Z-scores by means of the SUR-based bivariate test. For comparison, we also applied separate univariate association analyses of LS and FN Z-scores. We used single-marker analysis assuming additive genetic effects. The mean F statistic of our SUR-based genome-wide association (GWA) analysis was equal to 1.018 (SD=1.022, median=0.70). The mean t-statistics for LS and FN were −0.0167 (SD=1.011, median=−0.0165) and −0.0129 (SD=1.006, median=0.0104), respectively. These results indicated that there was no meaningful inflation of either the univariate or the bivariate association statistics.
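For reference, the null expectations against which the observed mean, SD and median of the F statistic can be compared follow directly from the F distribution with 2 and 2 × (N−2) degrees of freedom. The sketch below assumes that every SNP was analyzed on all N=175+155=330 subjects, that is, that there were no missing genotypes. This is an assumption, and the function name `f2_null_summary` is hypothetical.

```python
import math

def f2_null_summary(n_subjects):
    """Mean, SD and median of an F(2, d) variable with d = 2 * (n_subjects - 2)."""
    d = 2 * (n_subjects - 2)
    mean = d / (d - 2)
    # Variance of F(2, d) simplifies to d^3 / ((d - 2)^2 * (d - 4)) for d > 4
    sd = math.sqrt(d ** 3 / ((d - 2) ** 2 * (d - 4)))
    # Upper tail of F(2, d) is (1 + 2f/d)^(-d/2); solving for tail = 0.5
    median = (d / 2.0) * (2.0 ** (2.0 / d) - 1.0)
    return mean, sd, median

mean, sd, median = f2_null_summary(330)
print(f"expected mean={mean:.3f}, SD={sd:.3f}, median={median:.3f}")
```

Under this assumption the expected values are approximately 1.003, 1.006 and 0.694, close to the observed 1.018, 1.022 and 0.70.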

Results

SUR-based bivariate analyses identified a substantial number (35) of SNPs with strong evidence of association (P-value <10⁻⁴). Interestingly, several of the identified SNPs failed to reach nominal (P-value <5%) significance under separate univariate analyses for either one or the two BMD phenotypes. Genome-wide bivariate and univariate association results were compared in terms of statistical significance and ranks of the SNPs identified in either one of the two approaches. For each SNP, we kept the lowest P-value (denoted as Best_U) of LS or FN univariate association analysis. Univariate P-values were not corrected for multiple testing. We ranked the Best_U P-values from the lowest to the highest. We similarly ranked the P-values from SUR-based bivariate analysis of LS and FN. Figure 2 plots the significance levels in each procedure for the top 100 most associated SNPs identified from SUR-based (Figure 2a) or from univariate (Figure 2b) analyses. We found that a majority (52) of the top SNPs in SUR-based bivariate analysis also show strong (P<3 × 10⁻⁴) association signal in univariate analyses. For a substantial number (16) of the remaining SNPs, univariate analyses fail to reach nominal (P<5%) significance (Figure 2a). On the other hand, all of the top 100 SNPs in univariate analyses (Figure 2b) are also highly significant (P<8 × 10⁻⁴) in bivariate analysis. Table 4 shows details of the association results for the top 10 SNPs in SUR-based and in each separate univariate analysis. The table also shows P-values and ranks found in each of the two other procedures. The genetic contributions (R² values) of the 10 top SNPs are not great, as expected for any relatively common polymorphic locus. In all, 3 of the top 10 SNPs from bivariate analysis also rank well (ie, are in the set of top 300 SNPs) in univariate analyses of LS and/or FN.
They are located on 6q25: rank=2, P=1.3 × 10⁻⁵ (LS) and rank=1, P=1.2 × 10⁻⁵ (FN); on 15q14-q15: rank=2635, P=8.4 × 10⁻³ (LS) and rank=3, P=1.7 × 10⁻⁵ (FN); and on 22q13: rank=1, P=3.5 × 10⁻⁶ (LS) and rank=8, P=3 × 10⁻⁵ (FN). All the remaining seven SNPs show a much stronger association signal in bivariate than in univariate analyses, including two of the three best SUR-based association signals. For the most significant result, on 22q11.2 (P=5.44 × 10⁻⁶), the QTL explains 3.85% of the joint (co)variance of LS and FN. This value likely overestimates the contribution in unselected populations. Nonetheless, univariate analyses failed to detect association (P>0.07) with this SNP. Conversely, all the top 20 SNPs identified from univariate analysis of either LS or FN belong to the set of top 42 SNPs from SUR-based bivariate analysis. Overall, our analyses showed that univariate analysis did not identify new strongly associated SNPs as compared with those detected in bivariate analysis. Conversely, SUR-based analysis identified strongly associated SNPs that were not detected in univariate analysis.

Table 4 Association results: top 10 most associated SNPs from SUR-based bivariate or from separate univariate analysis of LS and FN BMD


Our study used a design, with extreme truncate selection of unrelated males, aiming to improve power. The approach of studying samples drawn from the extremes of the population distribution of BMD has been used in several linkage studies of BMD variation,^{25, 27} but rarely in association studies,²⁸ and to our knowledge, never in samples drawn from the population of males. Owing to our relatively small GWA sample size, no SNP showed evidence of association to either one or both BMD phenotypes at genome-wide significance threshold of 1.7 × 10⁻⁷ (0.05/298 783 SNPs). However, we used an extreme truncate selection design that, as shown by our simulation studies, has increased power over unselected samples. Our SUR-based bivariate association analyses identified strong association (P<8.4 × 10⁻⁶) with three genomic regions (6q22.1, 15q14 and 22q11). These SNPs have not yet been reported to be associated with bone density in previous GWAS.^{29, 30, 31} Two of them, on 15q14-15 and 22q11, are located in genes that are known to be expressed in skeletal muscle:^{32, 33} GLUT 11 encoded by SLC2A11 on 22q11 and RYR3 on 15q14-15. Because muscle contraction has a major impact on bone density, this might represent an indirect role of these genes on bone density. These genetic variants, whether they are site specific or possibly shared (pleiotropic), may warrant further follow-up genetic studies on BMD and other bone-related phenotypes.
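The genome-wide significance threshold quoted above is a Bonferroni correction over the number of SNPs tested, which can be checked with a one-line calculation (variable names are illustrative only).

```python
# Bonferroni genome-wide threshold: nominal level divided by the number of SNPs
nominal_alpha = 0.05
n_snps = 298783
genomewide_threshold = nominal_alpha / n_snps
print(f"{genomewide_threshold:.2e}")
```

The result, about 1.67 × 10⁻⁷, rounds to the 1.7 × 10⁻⁷ reported in the text.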

Discussion

We have evaluated the performance of bivariate association analysis based on the SUR model, which allows different genetic models for different traits. To our knowledge, this is the first study to specifically derive the power and the relative performance of bivariate association analysis in selected samples of unrelated subjects. Our main results coincide with well-known power figures,^{6, 7, 8} and confirmed that bivariate association analysis outperforms univariate analysis when the QTL exerts pleiotropic effects and that the relative increase in power is the greatest when the correlation induced by the QTL is opposite in sign to the residual correlation. The most powerful sampling selection design varied with the genetic model, specifically with the size and the direction of the induced QTL correlation. Applying truncate selection on one trait was found to be the most efficient sampling design when the genetic and the residual correlations are opposite in sign. The same design was found to be the most efficient when the QTL does not exert pleiotropic effects: the power of the SUR-based bivariate association test was found to be as good as or better than that of the univariate association test, depending on the size of the residual correlation. Finally, when the QTL exerts pleiotropic effects and both sources (QTL and residual) of covariation are of the same sign, applying selection criteria on both traits was found to be the optimal sampling selection design. Under this sampling design, the performance of the SUR-based bivariate test relative to univariate analysis decreases with the size of the residual correlation.

So far, two studies have investigated the power of bivariate association in unselected population-based data, and they both applied bivariate association tests based on GEEs.^{16, 18} The former applied a general GEE-based model that allows, as the SUR model does, for different QTL effects on the two traits. The second study used a GEE-based bivariate model that assumed the same QTL effects on the phenotypes. Our results are congruent with those reported by the first study. The restricted bivariate test estimates, as the univariate test does, a single parameter (ie, the SNP regression coefficients on each trait are all set as equal). Under the restricted bivariate model, the gain in power of bivariate analysis is enhanced when the QTL has similar effects on both traits and reduced when it affects one trait only. Clearly, such detailed knowledge about a complex trait is rarely available a priori. Thus, we do not recommend using restricted bivariate models even in unselected data.

Our bivariate GWA analysis of LS and FN BMD values, conducted in a sample of unrelated males with low BMD (LS Z-scores ≤−2) and high BMD (LS and FN Z-scores >0.5), consistently demonstrated the advantage of the SUR-based bivariate test over separate univariate analysis. All the top hits in univariate analysis also showed strong evidence of association in bivariate analysis. Conversely, additional SNP associations were detected with the bivariate method that did not reach nominal significance in single-trait analyses: this was achieved without adjusting significance of univariate analyses for multiple testing.

In conclusion, our results showed that SUR-based models are useful to detect association for correlated phenotypes. However, our results also showed that similar power levels can be achieved whether or not the QTL exerts pleiotropic effects. Thus, disentangling pure pleiotropic from residual covariation remains a challenge even in bivariate association analysis.

References

Hirschhorn JN, Daly MJ : Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005; 6: 95–108.
Article CAS PubMed Google Scholar
Allison DB : Transmission-disequilibrium tests for quantitative traits. Am J Hum Genet 1997; 60: 676–690.
CAS PubMed PubMed Central Google Scholar
Allison DB, Thiel B, St Jean P, Elston RC, Infante MC, Schork NJ : Multiple phenotype modeling in gene-mapping studies of quantitative traits: power advantages. Am J Hum Genet 1998; 63: 1190–1201.
Article CAS PubMed PubMed Central Google Scholar
Abecasis GR, Cookson WO, Cardon LR : The power to detect linkage disequilibrium with quantitative traits in selected samples. Am J Hum Genet 2001; 68: 1463–1474.
Article CAS PubMed PubMed Central Google Scholar
Amos C, de Andrade M, Zhu D : Comparison of multivariate tests for genetic linkage. Hum Hered 2001; 51: 133–144.
Almasy L, Dyer TD, Blangero J : Bivariate quantitative trait linkage analysis: pleiotropy versus co-incident linkages. Genet Epidemiol 1997; 14: 953–958.
Amos CI, Laing AE : A comparison of univariate and multivariate tests for genetic linkage. Genet Epidemiol 1993; 10: 671–676.
Jiang C, Zeng ZB : Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 1995; 140: 1111–1127.
Wang L, Liu YJ, Xiao P et al: Chromosome 2q32 may harbor a QTL affecting BMD variation at different skeletal sites. J Bone Miner Res 2007; 22: 1672–1678.
Pan F, Xiao P, Guo Y et al: Chromosomal regions 22q13 and 3p25 may harbor quantitative trait loci influencing both age at menarche and bone mineral density. Hum Genet 2008; 123: 419–427.
Wang XL, Deng FY, Tan LJ et al: Bivariate whole genome linkage analyses for total body lean mass and BMD. J Bone Miner Res 2008; 23: 447–452.
Liu XG, Liu YJ, Liu J et al: A bivariate whole genome linkage study identified genomic regions influencing both BMD and bone structure. J Bone Miner Res 2008; 23: 1806–1814.
Lange C, Silverman EK, Xu X, Weiss ST, Laird NM : A multivariate family-based association test using generalized estimating equations: FBAT-GEE. Biostatistics 2003; 4: 195–206.
Lange C, van Steen K, Andrew T et al: A family-based association test for repeatedly measured quantitative traits adjusting for unknown environmental and/or polygenic effects. Stat Appl Genet Mol Biol 2004; 3: Article17.
Jung J, Zhong M, Liu L, Fan R : Bivariate combined linkage and association mapping of quantitative trait loci. Genet Epidemiol 2008; 32: 396–412.
Liu J, Pei Y, Papasian CJ, Deng HW : Bivariate association analyses for the mixture of continuous and binary traits with the use of extended generalized estimating equations. Genet Epidemiol 2009; 33: 217–227.
Pei YF, Zhang L, Liu J, Deng HW : Multivariate association test using haplotype trend regression. Ann Hum Genet 2009; 73: 456–464.
Yang F, Tang Z, Deng H : Bivariate association analysis for quantitative traits using generalized estimation equation. J Genet Genomics 2009; 36: 733–743.
Zhang L, Bonham AJ, Li J et al: Family-based bivariate association tests for quantitative traits. PLoS One 2009; 4: e8133.
Zhang L, Pei YF, Li J, Papasian CJ, Deng HW : Univariate/multivariate genome-wide association scans using data from families and unrelated samples. PLoS One 2009; 4: e6502.
Zellner A : An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias. J Am Stat Assoc 1962; 57: 348–368.
McElroy MB : Goodness of Fit for Seemingly Unrelated Regressions. J Econometrics 1977; 6: 381–387.
Livshits G, Deng HW, Nguyen TV, Yakovenko K, Recker RR, Eisman JA : Genetics of bone mineral density: evidence for a major pleiotropic effect from an intercontinental study. J Bone Miner Res 2004; 19: 914–923.
Pelat C, Van Pottelbergh I, Cohen-Solal M et al: Complex segregation analysis accounting for GxE of bone mineral density in European pedigrees selected through a male proband with low BMD. Ann Hum Genet 2007; 71: 29–42.
Kaufman JM, Ostertag A, Saint-Pierre A et al: Genome-wide linkage screen of bone mineral density (BMD) in European pedigrees ascertained through a male relative with low BMD values: evidence for quantitative trait loci on 17q21-23, 11q12-13, 13q12-14, and 22q11. J Clin Endocrinol Metab 2008; 93: 3755–3762.
Purcell S, Neale B, Todd-Brown K et al: PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
Sims AM, Shephard N, Carter K et al: Genetic analyses in a sample of individuals with high or low BMD shows association with multiple Wnt pathway genes. J Bone Miner Res 2008; 23: 499–506.
Kung AW, Xiao SM, Cherny S et al: Association of JAG1 with bone mineral density and osteoporotic fractures: a genome-wide association study and follow-up replication studies. Am J Hum Genet 2010; 86: 229–239.
Richards JB, Kavvoura FK, Rivadeneira F et al: Collaborative meta-analysis: associations of 150 candidate genes with osteoporosis and osteoporotic fracture. Ann Intern Med 2009; 151: 528–537.
Rivadeneira F, Styrkarsdottir U, Estrada K et al: Twenty bone-mineral-density loci identified by large-scale meta-analysis of genome-wide association studies. Nat Genet 2009; 41: 1199–1206.
Styrkarsdottir U, Halldorsson BV, Gretarsdottir S et al: Multiple genetic loci for bone mineral density and fractures. N Engl J Med 2008; 358: 2355–2365.
Doege H, Bocianski A, Scheepers A et al: Characterization of human glucose transporter (GLUT) 11 (encoded by SLC2A11), a novel sugar-transport facilitator specifically expressed in heart and skeletal muscle. Biochem J 2001; 359: 443–449.
Bertocchini F, Ovitt CE, Conti A et al: Requirement for the ryanodine receptor type 3 for efficient contraction in neonatal skeletal muscles. EMBO J 1997; 16: 6956–6963.

Acknowledgements

Part of this work was supported by the Network in Europe on Male Osteoporosis (European Commission grant QL6-CT-2002-00491), by the Flemish Fund for Scientific Research (FWO Vlaanderen grants G.0331.02 and G.0662.07), by the Société Française de Rhumatologie (SFR) and by the French National Agency of Research (ANR).

Author information

Authors and Affiliations

INSERM U563, Toulouse, France
Aude Saint-Pierre & Maria Martinez
Department of Endocrinology, Unit for Osteoporosis and Metabolic Bone Diseases, Ghent University Hospital, Ghent, Belgium
Jean-Marc Kaufman & Kaatje Toye
INSERM U606, Paris, France
Agnes Ostertag, Martine Cohen-Solal & Marie-Christine de Vernejoul
Centre National de Génotypage, Evry, France
Anne Boland, Diana Zelenika & Mark Lathrop

Corresponding author

Correspondence to Aude Saint-Pierre.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies the paper on European Journal of Human Genetics website

About this article

Cite this article

Saint-Pierre, A., Kaufman, JM., Ostertag, A. et al. Bivariate association analysis in selected samples: application to a GWAS of two bone mineral density phenotypes in males with high or low BMD. Eur J Hum Genet 19, 710–716 (2011). https://doi.org/10.1038/ejhg.2011.22

Received: 06 October 2010
Revised: 14 December 2010
Accepted: 06 January 2011
Published: 23 March 2011
Issue Date: June 2011
DOI: https://doi.org/10.1038/ejhg.2011.22
