Introduction

Genome-wide association study (GWAS) is a primary tool to identify association of genetic variants with phenotypes [1, 2]. GWAS has been successfully applied to a variety of complex diseases and identified genetic factors underlying complex diseases [3, 4]. However, there is still a considerable heritability of complex diseases that could not be explained by conventional GWAS [5, 6]. One plausible reason for unexplained heritability is due to the genetic architecture of complex diseases, which are affected by many common variants with low penetrance (i.e., small effect) [5]. Gene-based analysis, which considers the aggregate effect of multiple genic variants in a single test, is an alternative approach to overcome the genetic heterogeneity problem [7, 8]. Conventional GWAS may also be limited by phenotypic heterogeneity [5, 6, 9]. Most GWASs consider a univariate clinical outcome (e.g., disease diagnosis or a composite score of several disease-related traits). It is well understood that some variants may influence multiple traits associated with a single complex disease, but association of those variants may not be detected in a model with a broadly defined outcome [10]. Thus multiphenotype analysis, which simultaneously considers more than one phenotype pathologically or clinically related with the disease, may help identify additional disease-related genetic associations.

Several gene-based association methods [11, 12] and multivariate association methods [9, 13, 14] have been developed and successfully applied to GWASs of complex diseases. Recently, van der Sluis et al. developed a multivariate gene-based test (MGAS) [15] that combines the TATES [9] method for multivariate single-nucleotide polymorphism (SNP) association testing and the GATES method [8] for gene-based univariate association testing. In this study, we evaluated the statistical performance of combinations of multivariate association methods and gene-based association methods in various simulation models. The gene-based methods tested in this study include VEGAS [16] and GATES [8], which have been frequently used for analyzing common SNPs. The tested multivariate association methods were O’Brien [17], TATES [9], and MultiPhen [13]. These methods have been implemented in freely available standalone software or in an R library that accepts as input files produced by commonly used GWAS tools. The goal of this study is to provide guidance on how to optimally select multivariate gene-based association method for analyzing common variants given the correlation of phenotypes and genetic background (e.g., linkage disequilibrium [LD]).

Methods

The information in detail about the multivariate and gene-based association methods are described in Table 1.

Table 1 Description of the methods for multivariate and gene-based association testing

Approaches for multiphenotype association testing

The O’Brien method combines univariate test statistics (i.e. Z scores or β) of all SNPs from GWAS of multiple phenotypes to compute a test statistic for pleiotropic effect [17, 18] and is implemented in an R library, CUMP [19]. This method calculates a statistic assumed to follow a multivariate normal distribution with mean (combined Z scores of all SNPs) and covariance matrix of the multiple phenotypes. The covariance matrix among phenotypes can be approximated by the sample covariance matrix of the Z scores of all SNPs [17, 19].

The Trait-based Association Test that uses the Extended Simes procedure (TATES) was developed to detect effects across correlated traits measured in the same individuals using summary association statistics in the form of P value for each trait [9]. For each variant, the approach takes the minimum P value across a set of univariate tests carried out on each phenotype and then applies a weight to the P value to account for the number of phenotypes tested and their correlation. TATES requires univariate test statistics (i.e., P values) and a correlation matrix of the multiple phenotypes.

MultiPhen performs ordinal regression using an inverted model whereby the genotype or imputed SNP allele dosage is the outcome variable and the phenotypes are the predictors [13]. This program uses individual-level data (genotypes and phenotypes) for computing regression models, whereas the O’Brien method and TATES use summary statistics (β and SE or P values) and tests association between a SNP and a set of phenotypes by conducting likelihood ratio test for model fit, testing whether all regression coefficients in the model are jointly significantly different from zero. MultiPhen is an R package available from CRAN.

Approaches for gene-based testing

The Gene-based Association Test using the Extended Simes procedure (GATES) computes a gene-based P value using SNP-based P values and correlations between SNPs (or pairwise LD information) in a gene [8]. The individual SNP P values are combined in a manner that appropriately controls for the effective number of independent SNPs in a gene. The effective number of independent SNPs is estimated from the eigenvalues of the square root of LD matrix.

The Versatile gene-based test for Genome-wide Association Studies (VEGAS) allows the SNP-based chi-square test statistics in a gene to be combined in a gene-based test statistic [16]. An empirical null distribution for this gene-based test statistic is obtained through a simulation of multivariate standard normal random vectors (Z statistics) with mean 0 and the correlations (or LD) between SNPs in a gene. The simulated gene-based test statistic is the sum of the squared Z statistics (with a chi-square distribution). The observed gene-based test statistic is the sum of chi-squares (converted from P values). The empirical gene-based P value is the proportion of simulated gene-based test statistics that surpass the observed gene-based test statistics. To compute the empirical gene-based P value, we performed 106 simulations.

Genotype simulation settings

Simulation studies under a range of scenarios were performed to assess and compare the performance (type I error and power) of the three multivariate association methods (O’Brien, TATES, and MultiPhen) each paired with one of the gene-based association methods (GATES and TATES). For all scenarios, we generated sequence genotypes of 22 autosomal chromosomes. HAPGEN2 software [20] was applied for generating the sequence genotypes for 2000 samples, and the European ancestry populations in the 1000 human genome reference panel (GRCh37; Mar 2012) [21] was used as reference to incorporate realistic genetic background. Only common SNPs with minor allele frequency (MAF) ≥1% were evaluated in the simulation tests for this study. For each simulation replicate, a 10-kb region containing at least 20 common SNPs was randomly selected. EIGENSTRAT [22] was used to generate principal components of the simulated genotypes to adjust for population structures in the simulation tests.

Correlated multiphenotype simulation settings

Van der Sluis et al. suggested various genotype–phenotype models for genetic architecture of complex disease [9]. Among these models, we used the primary model, “single common factor model,” which implies that individual phenotypes related with a complex disease result from one shared latent factor influenced by genetic factors. Three correlated phenotypes of 2000 sample were simulated for each simulation test. The covariance matrix of three correlated phenotypes with the common factor were simulated according to this model:

$$\Sigma = \Lambda \ast \Lambda ^{\mathrm{T}} + \Theta ,$$

where Σ is the 3 × 3 covariance matrix among the three phenotypes, Λ is the 3 × 3 matrix of factor loadings, T is matrix transpose, and Θ is the 3 × 3 diagonal matrix of residual variances (see details elsewhere [9]). The factor loading is the proportion of variance of a complex disease explained by a factor, and the residual variance is the proportion of variance of phenotypes that is not explained by the factor. In general, when a factor loading value increases, the correlations among phenotypes simulated by the model above increase.

Type I error simulation

We simulated three phenotypes that were multivariate normally distributed with mean 0 and covariance Σ for 2000 samples. Five million simulation tests were performed to evaluate whether the combined methods for multiphenotype gene-based test maintained the correct type I error rate. Type I error rate was evaluated through four scenarios by different factor loadings (Λ = 0.15, 0.35, 0.55, or 0.75, see Supplementary Table 1).

Power simulation

To simulate multiphenotype data for the power evaluation, we randomly chose causal SNPs in the selected 10-kb region. We simulated three continuous phenotypes for 2000 samples using the formula

$$Y = \beta _1G_1 + \beta _2G_2 + \cdots + \beta _nG_n + \varepsilon ,$$

where βi is the effect size of the causal SNP i, Gi is the genotype of the causal SNP i, and ε is error term that follows a multivariate normal distribution with means 0 and covariance matrix Σ. The effect size, β, was generated by

$$\beta _i = \sqrt {\frac{{h^2_q}}{{2 \times MAF_i \times (1 - MAF_i)}}} ,$$

where h2q, the proportion of variance explained by each causal SNP, was fixed at 1% for all scenarios and MAFi is the MAF of the causal SNP i. We considered various scenarios in terms of different factor loadings (Λ = 0.15, 0.35, 0.55, and 0.75), i.e., different correlation between phenotypes and either 5% or 15% of causal SNPs as shown in Supplementary Table 1.

Simulation test procedure

For each simulation replicate, multiphenotype association tests of SNPs were first conducted using the O’Brien, TATES, and MultiPhen approaches, and the association P values of SNPs from the three multivariate tests were combined into single gene-based P values using GATES and VEGAS. The O’Brien method requires genome-wide association statistics to compute a null distribution. To reduce the computation time, we generated a pruned set of uncorrelated SNPs on all chromosomes. The pruned SNP set was used for computing genome-wide association statistics, which were applied to compute the covariance matrix for the O’Brien approach. The full set of unpruned SNPs were used for the rest of simulation replicates. We used linear regression to compute univariate associations (βs and SEs or P values) between SNPs and each phenotype after adjusting the first three PCs, and the univariate SNP associations results were then used as input for analyses using the O’Brien and TATES multivariate association methods. MultiPhen computes multivariate associations with individual-level simulated data (SNPs and three phenotypes) after adjusting the three PCs. Because we simulated genotypes in the European ancestry population from the 1000 Genomes reference panel [21], the European LD structure from this panel (GRCh37; Mar 2012) was used for GATES and VEGAS to correct for the correlation between SNPs.

Scenario setting

We investigated statistical performance, both type I error and empirical power, of each pair of multivariate gene-based association methods in various scenarios for four different factor loadings (Λ = 0.15, 0.35, 0.55, and 0.75) and proportions of independent SNPs in a gene. The effective number of independent SNPs in a gene was estimated in the manner applied in GATES [8], and proportions of independent SNPs out of the total number of SNPs in a gene were classified into three groups (low: <40%, moderate: 40–60%, and high: >60%).

We further considered additional scenarios for assessing the empirical power by varying (1) the percentage of causal variants (5% and 15%) in a gene, (2) phenotype direction (i.e., correlations of phenotypes are all in same direction or not), and (3) the number of phenotypes affected by the causal variants. We randomly selected 5% or 15% of causal variants among the total number of SNPs in a gene. We compared the empirical power of the paired multivariate gene-based association methods when all three phenotypes were positively correlated, with the sign of the pairwise correlation = (+++) or one phenotype was inversely correlated with the others (+−−). Also, we investigated the empirical power for the scenarios where the causal variants in a gene affect one or two phenotypes only, rather all three phenotypes. The simulation studies for the scenarios with varying phenotype correlation direction and the number phenotypes affected by causal variants were conducted with the factor loading fixed to 0.55. The complete range of simulation scenarios is summarized in Supplementary Table 1.

Application to neuropathological traits related to Alzheimer disease (AD)

Genetic and phenotypic data for 3135 AD cases and 463 clinically and pathologically confirmed controls of European ancestry from 12 datasets (participant characteristics shown in Supplementary Table 2) who have genotypes and AD-related neuropathological phenotypes including neuritic plaque (NP), neurofibrillary tangles (NFT), and cerebral amyloid angiopathy (CAA) were obtained from the Alzheimer’s Disease Genetics Consortium. A total of 3598 subjects have both NP and NFT phenotype data, but only 2403 subjects have CAA data. GWAS of each of these three phenotypes has been conducted previously [23], and GWAS summary statistics of the three traits were obtained from NIAGADS (https://www.niagads.org) [24].

Neuropathological traits are correlated, but the correlations among traits vary, which is not covered by our simulations scenarios. Although our simulation study may provide sufficient information to infer statistical performance, especially type I error, of the proposed methods, we performed one more simulation study to assess the type I error for phenotypes with correlation structure similar to the neuropathological traits. An R multivariate normal generating (“mvrnorm”) function of the MASS library was applied to generate sets of phenotypes to mimic the neuropathological traits. The rest of the simulation setting were identical to the main simulation study.

Summary statistics of the three neuropathological phenotypes were used to evaluate the O’Brien and TATES multivariate association methods. The MultiPhen program was applied to each dataset and the SNP association results were combined across datasets using METAL with the weighted Z-score method based on sample sizes [25]. It should be noted that the sample size used for MultiPhen (# actual sample size for computation = 2403) is much smaller than the other multivariate association methods (O’Brien and TATES; # = 3135) because MultiPhen requires individual-level genotype and phenotype data (NP, NFT, and CAA). For the gene-based tests (VEGAS and GATES), we used SNPs within 10 kb of both ends of transcripts after removing low-frequency SNPs (MAF ≤ 1%) and SNPs with low imputation quality (R2 < 0.4). To evaluate the performance of the multivariate association methods in a consistent condition (i.e., the same subjects), we also performed a sensitivity analysis using a subset of individuals who had all genotype and no missing phenotype data (n = 2403).

The genome-wide significance level for gene-based tests was set as 2.7 × 10−6, which was calculated as the nominal significant level divided by the total number of genes tested (n = 18,500). We defined a gene having a pleiotropic effect on the three phenotypes when the multivariate gene-based P value was at least one order of magnitude lower than the univariate gene-based P value of each phenotype.

Results

On average, 60 SNPs were observed in each gene region (10-kbp length). On average, correlation estimates (Pearson correlation, r) between phenotypes that were simulated by factor loadings (Λ) in scenarios were 0.01, 0.21, 0.57, and 0.86 for Λ = 0.15, 0.35, 0.55, and 0.75, respectively. Of note, simulated phenotypes were not significantly correlated (P > 0.05) when the factor loading was equal to 0.15.

Type I error simulations

The empirical type I errors of multivariate association methods with the gene-based tests are shown in Table 2 (VEGAS) and Table 3 (GATES) at different α levels based on the proportion of independent SNPs in a gene (low, moderate, and high) and factor loading (Λ = 0.15, 0.35, 0.55, and 0.75). Applying VEGAS to the multivariate output by O’Brien (VEGAS-O’Brien) yielded slightly inflated type I errors at all α levels in the three scenarios by factor loadings (Λ = 0.15, 0.35, and 0.55) regardless of proportion of independent SNPs but not in the scenario of higher factor loading (Λ = 0.75). Applying VEGAS to rest of the multivariate association methods has inflated type I errors at α = 0.0001 for TATES and at all α levels for MultiPhen for all scenarios irrespective of factor loadings or independent SNP proportions. When GATES was applied, type I errors for all three multivariate association methods were deflated at all α levels in all scenarios.

Table 2 Type I error rate of multivariate association methods with a gene-based association method, VEGAS
Table 3 Type I error rate of multivariate association methods with a gene-based association method, GATES

Power simulations

Power simulation results of multivariate association methods with gene-based association methods are shown in Fig. 1 and Supplementary Tables 3 and 4. As the proportion of independent SNPs in a gene increased, the power of all multivariate association methods decreased regardless of gene-based association method. Most of the multivariate association methods, except for MultiPhen, using VEGAS or GATES showed low power (<10%) in scenarios when the proportion of causal SNPs in a gene is equal to 5%. In the scenarios for the proportion of causal SNPs fixed at 15%, the power of O’Brien and TATES, regardless of the gene-based association methods, remained almost constant in the scenarios when factor loading (Λ) increases from 0.15 to 0.35, 0.55, and 0.75. In contrast, the power of MultiPhen combined with VEGAS or GATES was highest in the scenario including the lowest factor loading (Λ = 0.15), and it was reduced by about 20% when the factor loading increased from 0.15 to 0.75. All pairs of multivariate and gene-based association methods had similar power in the scenarios of the highest factor loading (Λ = 0.75). Applying VEGAS to the three multivariate association methods showed slightly greater power than GATES in all scenarios. However, this might be related to the inflated type I errors observed in most of the multivariate association methods using VEGAS.

Fig. 1
figure 1

Power comparisons of multivariate association methods (O’Brien, TATES, and MultiPhen) with gene-based association methods (VEGAS and GATES) in various scenarios by varying the proportion of independent SNPs in a gene. a Causal variant percentage = 15% and b causal variant percentage = 5%. Empirical power calculated at α level of 0.0001

In the scenario with phenotypes positively and negatively correlated (+−−), O’Brien combined with both GATES and VEGAS yielded low or zero power (Table 4a), while the empirical power of TATES and MultiPhen, regardless of the gene-based association methods, remained almost constant compared to the scenario with positively correlated phenotypes (+++; Table 4b). We also observed that, as the number of phenotypes not affected by causal variants increased, the power of all multivariate association methods decreased regardless of gene-based association methods (Table 5). This can be seen most clearly with the results of the O’Brien method paired with both gene-based association methods, where power is very low when the effect of causal variants is on only one phenotype among the three phenotypes.

Table 4 Power of multivariate association methods (O’Brien, TATES, and MultiPhen) with gene-based association methods (VEGAS and GATES) for phenotypes in the same or different directions
Table 5 Power of multivariate association methods (O’Brien, TATES, and MultiPhen) with gene-based association methods ((a) VEGAS and (b) GATES)

Among the pairs of multivariate gene-based association methods that correctly controlled type I error, MultiPhen and GATES outperformed other combinations in most scenarios, suggesting that the pairing GATES with MultiPhen is an omnibus method for multivariate gene-based association testing.

Application to AD-related neuropathological traits

The correlation estimates (r) between the three neuropathological phenotypes were 0.68 between NP and NFT, 0.56 between NP and CAA, and 0.40 between NFT and CAA. For the investigation of pleiotropic effects on the three neuropathological phenotypes, we planned to apply multivariate gene-based association methods that properly controlled the type I error in the simulation tests. From the extra simulation study for the phenotypes with correlation most similar to the neuropathological phenotypes, we found that the pairing of GATES with all three multivariate association methods maintained correct type I error. The method combined by O’Brien and VEGAS exhibited slightly inflated type I errors (0.0001–0.0003 at α = 0.0001), while the pairing of VEGAS with the other multivariate association methods (TATES and MultiPhen) showed substantial inflated type I error (0.0002–0.0008 at α = 0.0001; see Supplementary Table 5). Therefore, the pairing of GATES with all three multivariate association methods (O’Brien, TATES, and MultiPhen) plus the combination of VEGAS and O’Brien were used for analysis of the neuropathological phenotypes.

Association findings from multivariate gene-based methods are shown in Table 6 for previously reported AD genes [26,27,28,29] and Table 7 for new genes identified in this study with at least suggestive association (P < 10−4). Only 8 of the 27 previously known AD genes attained at least a nominally significant level of association with at least 1 of the phenotypes. Five of the previously reported AD genes—BIN1, PICALM, TSPOAP1, CASS4, and APOE—were at least nominally associated in the multivariate gene-based analyses at a significance level of at least one order of magnitude smaller than the results from the univariate analyses of three neuropathological phenotypes. APOE was detected as a gene with pleiotropy effect on the three phenotypes in the pairs of methods: GATES with O’Brien (P = 1.6 × 10−68) and TATES (P = 2.1 × 10−44). O’Brien with VEGAS also found the significant association for APOE (P < 1.0 × 10−6) in the multivariate gene-based analysis. However, O’Brien with VEGAS could not differentiate whether or not the multivariate association in APOE is more significant than its associations from each of univariate analyses because VEGAS generated a gene-based P value from a permutation approach that was not precise enough to detect a change of one order of magnitude. Nominally significant multivariate gene-based associations (P < 10−3) in BIN1, PICALM, and CASS4 were observed from O’Brien with VEGAS, and nominal association in TSPOAP1 was detected from MultiPhen with GATES (Table 6).

Table 6 Associations (P values) of known AD genes from the analysis of neuropathological phenotypes using multivariate gene-based association methods
Table 7 Novel associations (P values) from the analysis of neuropathological phenotypes using multivariate gene-based association methods

Three neighboring genes including TRAPPC12, TRAPPC12-AS1, and ADI1 on chromosome 2p25.3 were identified at a gene-wide significant level (P < 2.7 × 10−6) in the MGAS for the three phenotypes (NP, NFT, and CAA) by O’Brien with VEGAS (Table 7). It should be noted that suggestively significant association with SNPs (best SNP: chr2:g:3474085C>T [rs35067331]; P = 5.5 × 10−7 in TRAPCC12) in the multivariate model (NP, NFT, and CAA) by the O’Brien method was observed in this region (Supplementary Fig. S1). A genome-wide sensitivity analysis showed that only APOE attained gene-wide significant association in all the pairings of multivariate gene-based association methods, and the P values of APOE from the multivariate association tests paired with GATES were similar for all approaches (Supplementary Table 6). In addition, we observed that the pairing of MultiPhen with GATES yielded similar association strengths (i.e., P values) for the new genes, except for KRT2, compared with the other combinations of multivariate gene-based association methods (Supplementary Table 6).

Discussion

In this study, we proposed a multivariate gene-based association test as post-GWAS analysis by combining a multivariate association method (O’Brien, TATES, or MultiPhen) with a gene-based association method (GATES or TATES) to identify genes with pleiotropic effects on multiple phenotypes related to a complex disease. We limited the multivariate gene-based association tests of common SNPs (MAF ≥ 1%) because those methods were originally designed for common variants. We performed numerous simulations to depict the genetic (proportion of causal SNPs and independent SNPs) and phenotypic (various correlations between phenotypes in same or opposite direction) architecture of a complex disease to assess the performance of multivariate gene-based methods. Continuous phenotypes, which are normally distributed, were used for the simulation, but other types of phenotypes such as binary or survival outcomes could be analyzed with the three multivariate association methods.

Compared with other combinations of multivariate gene-based methods, selecting GATES for gene-based test and MultiPhen for multivariate test is robust for type I error and advantageous for power when the correlation between phenotypes is relatively low (r ≤ 0.57). However, MultiPhen requires individual-level data (genotypes and phenotypes), which is not available in most cases. This also means that MultiPhen omits samples with any missing values in any of the phenotypes, which will reduce the study power. For these cases in which MultiPhen has limited power, our simulation study suggests O’Brien with VEGAS or TATES with GATES as the second optimal multivariate gene-based method. When analyzing phenotypes with high correlations (r ≥ 0.86), we did not see noticeable difference in statistical power among the three multivariate association methods with GATES. O’Brien with VEGAS maintain the acceptable type I error in most scenarios for all factor loadings (or correlation between phenotypes) except for the highest factor loading (Λ = 0.75, r = 0.86). This suggests that applying VEGAS to the multivariate associations of SNPs from O’Brien method is appropriate when the phenotypic correlation is relatively low (r ≤ 0.57).

We also gained additional knowledge from the simulation studies with scenarios for phenotype in different directions and inclusion of phenotypes that were affected by causal variants. The O’Brien lost substantial power when the phenotypes are not correlated in same direction. Also, O’Brien yielded low power when phenotypes not affected by causal variants are included into the association tests. This may indicate that O’Brien may have better specificity compared to other multivariate methods (TATES and MultiPhen) to identify genetic loci with pleiotropy effects across the entire set of phenotypes included in a test.

In the multivariate gene-based analyses of the three neuropathological phenotypes (NP, NFT, and CAA), five known AD genes—BIN1, PICALM, TSPOAP1, CASS4, and APOE—reached significant association with a P value at least one order of magnitude smaller than each of the univariate association P values. The improved association in APOE was found in most of multivariate gene-based methods except for the method by MultiPhen with GATES. We noticed that the sample analyzed through MultiPhen was 33.2% smaller than the sample analyzed by O’Brien and TATES. This is because MultiPhen requires individual-level data (genotypes and phenotypes), while other multivariate association methods use summary statistics for each phenotype. Multiphen tests including a smaller number of subjects, for whom genotype data and information for all three phenotypes were available, yielded relatively weak associations in APOE. Our sensitivity analysis confirmed that the decrease significance of the APOE association in the pairing of MultiPhen with GATES compared to other combined methods was due to the smaller sample size. The improved association in TSPOAP1 compared to univariate associations for the individual traits was observed only in the multivariate gene-based method by combining MultiPhen with GATES, and associations with BIN1, PICALM, and CASS4 were detected only by O’Brien with VEGAS.

We identified gene-level significant (P < 2.7 × 10−6) associations with contiguous genes TRAPPC12, TRAPPC12-AS1, and ADI1 using the multivariate gene-based approach based on O’Brien paired with VEGAS. The association findings for these three genes using other methods (O’Brien with GATES and TATES with GATES) are also moderately significant (P < 7.0 × 10−5), except for MultiPhen with GATES (P < 0.01). Recently, we identified associations with these same genes in a bivariate analysis of NFT and CAA using the O’Brien method with VEGAS [30]. Association of rs35067331 from the bivariate model (NFT+CAA; P = 5.8 × 10−8) was more significant than the association of rs35067331 from the trivariate model (NP+NFT+ CAA; P = 5.5 × 10−7). However, we obtained gene-wide significant evidence that these three genes have pleiotropy effects on the three neuropathological phenotypes.

In general, genes that yielded at least a moderately significant (P < 1.0 × 10−4) association using any of the multivariate gene-based methods (Table 4) contained an effective number of independent SNPs that accounted for <40% of the total number of SNPs in that region, a finding which is consistent with our observation from the simulation tests of statistical power.

It should be noted that our simulation scenarios do not represent all possible genotype–phenotype models for complex diseases. Therefore, the statistical performance (type I errors and powers) for the tested methods in this study cannot be assumed in all genome-wide multivariate gene-based studies. However, our simulation results based on diverse scenarios may indicate which multivariate association tests are most appropriate based on the phenotypic correlations.

Taken together, our comparison of multivariate gene-based association methods for detecting pleiotropy effects at the gene-level showed noticeable differences in type I error and power among the tested methods. This comparison study also provides practical and useful information for choosing a multivariate gene-based method, which can maximize power for gene-level pleiotropy analysis. For studies where individual-level data are available, MultiPhen with GATES can be the best option since this pair of methods performed best for the tested scenarios. When the proportion of missing data is high, however, we observed from the pleiotropy analysis using AD-related neuropathological traits that MultiPhen with GATES loses substantial power. This was especially true for the study of phenotypes with high correlation (r ≥ 0.86), when we observed no substantial difference in power among the tested methods. Therefore, we suggest O’Brien with VEGAS or TATES with GATES as alternative approaches when the proportion of missing data is high or the individual-level data are not available.