Introduction

For effects on the phenotypic mean of quantitative traits (that is, differences in the mean of a trait according to genotype), a sample size on the order of tens of thousands is required to associate single-nucleotide polymorphisms (SNPs) at genome-wide significance levels (P<5E-08).1 The effects on the phenotypic variance of quantitative traits are likely to be equally modest and thus to require large sample sizes to identify. Among the statistical tests designed to detect variance heterogeneity, Levene’s test2 has been shown to be robust to violation of the normality assumption and adequately powered under other irregularities.3 Furthermore, Levene’s test, by design, is not influenced by the main effects of SNPs and compares the pairwise differences in variance between genotype groups, which encompass both linear and non-linear trends. For instance, an analysis of 21 799 individuals from the Women’s Genome Health Study first identified SNPs with a genome-wide significant Levene’s test P-value for C-reactive protein (rs12753193, P=8.0E-11) and soluble ICAM-1 (rs738409, P=1.9E-10; rs1799969, P=2.1E-09).4 Although it is feasible to analyze the heterogeneity of variance in individually large studies, sufficient sample sizes for the detection of variants with small effects can only be practically reached through meta-analysis. Indeed, a recent report has associated an FTO variant (rs7202116) with the phenotypic variability of body mass index (BMI) (P=2.4E-10; N=131 233) in a meta-analysis using the squared residual as the response variable.5

In addition to finding genetic variants influencing phenotypic variance, a meta-analysis of variance heterogeneity can also be used to prioritize potentially interacting variants to test for gene–environment and gene–gene interactions. The high-dimensional nature of genome-wide data inevitably poses computational and statistical challenges, such as the multiple testing burden. Consequently, individual genome-wide association studies have been largely underpowered to detect interactions.6 Despite these challenges, there is a pressing need to understand how genetic interactions contribute to the ‘missing heritability’.7, 8 The discovery of novel genetic interactions through meta-analysis presents a promising strategy, as large international consortia provide adequate sample sizes and methodologies for meta-analyzing interactions are well developed.9, 10 We have previously proposed a prioritization scheme, variance prioritization, for quantitative traits, based on the observation that the trait variance conditional on genotypes will vary when an interaction is present,4 an active area of methodological research.11, 12, 13 Prioritization is achieved by comparing the variances of a quantitative trait conditional on the genotypes using Levene’s test. As only SNPs with Levene’s test P-values lower than a pre-determined threshold (typically a nominal significance level of 0.05) are tested for interaction effects, the multiple testing burden is greatly reduced and overall statistical power is increased accordingly, compared with an exhaustive search for gene–gene or gene–environment interactions.
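The prioritization step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the SNP identifiers, and the use of a Bonferroni correction over the prioritized SNPs are all assumptions made for the example.

```python
def variance_prioritization(levene_p, interaction_p,
                            levene_threshold=0.05, alpha=0.05):
    """Test for interaction only the SNPs that pass the Levene filter.

    levene_p maps SNP id -> Levene's test P-value.
    interaction_p maps SNP id -> interaction-test P-value; it only needs
    entries for SNPs that pass the filter.
    Returns (significant SNPs, per-test significance threshold).
    """
    prioritized = [snp for snp, p in levene_p.items() if p < levene_threshold]
    if not prioritized:
        return [], None
    # Assumption: Bonferroni correction over the prioritized SNPs only.
    per_test_alpha = alpha / len(prioritized)
    hits = [snp for snp in prioritized if interaction_p[snp] < per_test_alpha]
    return hits, per_test_alpha


# Hypothetical P-values for four SNPs (illustrative only).
levene_p = {"snpA": 0.01, "snpB": 0.50, "snpC": 0.03, "snpD": 0.90}
interaction_p = {"snpA": 0.02, "snpC": 0.20}
hits, per_test_alpha = variance_prioritization(levene_p, interaction_p)
print(hits, per_test_alpha)
```

In this toy case two of four SNPs pass the filter, so each interaction test is judged at 0.05/2 rather than 0.05/4. The interaction P-value of 0.02 for the hypothetical `snpA` would be missed by an exhaustive search at 0.0125.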

In this paper, we provide a framework for combining summary statistics from multiple genome-wide studies to calculate the meta-analyzed Levene’s test P-values for individual SNPs without needing to exchange individual-level data. We then perform a genome-wide search for SNPs involved in the heterogeneity of variance using log-transformed BMI and height.

Materials and methods

Consider a quantitative trait Y measured in N individuals, and let $Y_i$ denote the trait values of individuals carrying genotype i (i=0, 1, or 2) of a biallelic SNP. To obtain an equivalent of the exact Levene’s test statistic without exchanging individual-level data, each study s (s=1, 2, ..., S) reports the following statistics for each SNP:

$(n_{0s}, n_{1s}, n_{2s})$: genotype counts, summing to $N_s$

$(\bar{Z}_{0s}, \bar{Z}_{1s}, \bar{Z}_{2s})$: within-genotype means of $Z_0$, $Z_1$, $Z_2$

$(\sigma^2_{0s}, \sigma^2_{1s}, \sigma^2_{2s})$: within-genotype variances of $Z_0$, $Z_1$, $Z_2$

where $Z_{ij} = |Y_{ij} - \bar{Y}_i|$ is the absolute deviation of individual j with genotype i, and $\bar{Y}_i$ is the group mean of $Y_i$. Calculating Levene’s test statistic by simply combining samples corresponds to natural weights, under which each study contributes in proportion to its sample size. The meta-analyzed Levene’s test statistic $L_+$ is computed using only these summary statistics and weights (detailed derivation in S1).

Under the null hypothesis of variance homogeneity, L+ follows an F-distribution with df1=2 and df2=N−3. Caution should be observed regarding rare variants (minor allele frequency (MAF)<1%); a minimum of two individuals is needed to estimate the variance in any observed genotype group.
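As an illustration of how the reported summary statistics suffice, the following Python sketch pools per-study genotype counts, within-genotype means of Z, and within-genotype variances of Z into a single Levene-type F statistic. It is not the expression derived in S1. It assumes natural weights only, that each study centres Z on its own genotype-group mean, and that the reported variances use the n−1 denominator. The function names `study_summary` and `meta_levene` are hypothetical. The P-value uses the closed form of the F-distribution upper tail, which holds for df1=2.

```python
from statistics import mean, variance


def study_summary(trait_by_genotype):
    """Summarise one SNP in one study.

    trait_by_genotype maps a genotype (0, 1 or 2) to the list of trait values
    of the individuals carrying it. Returns {genotype: (n, mean_Z, var_Z)},
    where Z = |Y - mean of Y within that genotype group of this study|.
    """
    summary = {}
    for genotype, values in trait_by_genotype.items():
        centre = mean(values)
        z = [abs(y - centre) for y in values]
        # variance() is the sample variance (n - 1 denominator) and raises
        # StatisticsError when a group has fewer than two individuals.
        summary[genotype] = (len(z), mean(z), variance(z))
    return summary


def meta_levene(summaries):
    """Combine per-study summaries into a pooled Levene statistic and P-value.

    Every genotype must be observed in at least one study.
    """
    genotypes = (0, 1, 2)
    n_i, mean_i = {}, {}
    ss_within = 0.0
    for g in genotypes:
        parts = [s[g] for s in summaries if g in s]
        n_i[g] = sum(n for n, _, _ in parts)
        # Pooled group mean: study means weighted by n_is / n_i.
        mean_i[g] = sum(n * m for n, m, _ in parts) / n_i[g]
        # Within-group sum of squares = within-study part + between-study part.
        ss_within += sum((n - 1) * v + n * (m - mean_i[g]) ** 2
                         for n, m, v in parts)
    total_n = sum(n_i.values())
    grand_mean = sum(n_i[g] * mean_i[g] for g in genotypes) / total_n
    ss_between = sum(n_i[g] * (mean_i[g] - grand_mean) ** 2 for g in genotypes)
    df1, df2 = 2, total_n - 3
    statistic = (ss_between / df1) / (ss_within / df2)
    # Upper tail of F(2, df2) has the closed form (1 + 2F/df2) ** (-df2/2).
    p_value = (1.0 + 2.0 * statistic / df2) ** (-df2 / 2.0)
    return statistic, p_value


# Toy data for two hypothetical studies (values are illustrative only).
study_a = {0: [1.0, 2.0, 3.5, 4.0], 1: [2.0, 2.5, 6.0], 2: [1.0, 9.0]}
study_b = {0: [0.5, 1.5, 2.0], 1: [3.0, 3.5, 8.0, 1.0], 2: [2.0, 2.2, 7.0]}
statistic, p_value = meta_levene([study_summary(study_a), study_summary(study_b)])
print(f"L+ = {statistic:.4f}, P = {p_value:.4f}")
```

Because the within-group sum of squares decomposes exactly into within-study and between-study parts, the pooled statistic equals a one-way analysis of variance run on the combined Z values. No individual-level data are exchanged.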

It is common practice in meta-analysis to apply study-specific weights so that the combined estimate reflects the differing contributions of the individual studies. An adjusted weight can be obtained by multiplying the natural weights by the desired adjustment $\eta_{is}$.

The corresponding P-value is then calculated from the test statistic, with the adjusted weights replacing the natural weights. In other words, the natural weights are re-weighted by the adjustment $\eta_{is} \in [0,1]$, where 1 corresponds to complete representation of the ith genotype in the sth study and 0 corresponds to no representation in the meta-analysis.

We conducted a genome-wide meta-analysis of the variance heterogeneity for log(BMI) and log(height) using three publicly available genome-wide data sets from dbGaP:14 MESA (Study accession: phs000209.v10.p2, http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000209.v10.p2) and GENEVA, including data from the NHS and HPFS (Study accession: phs000091.v2.p1, http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000091.v2.p1). For each data set, we performed quality control of SNPs based on MAF (>1%), Hardy–Weinberg equilibrium (P>1E-08) and the genotype call rate (>95%), and filtered individuals based on ethnicity (only European Caucasians were retained) and relatedness (individuals with a kinship coefficient>0.025 were excluded). In addition, individuals with diabetes were excluded from the analysis of BMI to avoid reverse causation.
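The SNP-level quality-control thresholds above can be expressed as a simple predicate. The sketch below is illustrative only: the function name and its inputs are hypothetical, the Hardy–Weinberg P-value is assumed to be computed elsewhere, and the actual filtering was presumably carried out with standard genetics software rather than code like this.

```python
def snp_passes_qc(n0, n1, n2, n_missing, hwe_p,
                  min_maf=0.01, min_hwe_p=1e-8, min_call_rate=0.95):
    """Apply the MAF, Hardy-Weinberg and call-rate filters to one SNP.

    n0, n1, n2: counts of individuals with 0, 1 and 2 copies of one allele.
    n_missing: individuals with no genotype call at this SNP.
    hwe_p: Hardy-Weinberg equilibrium P-value, computed elsewhere.
    """
    called = n0 + n1 + n2
    call_rate = called / (called + n_missing)
    allele_freq = (n1 + 2 * n2) / (2 * called)
    maf = min(allele_freq, 1.0 - allele_freq)  # minor allele frequency
    return maf > min_maf and hwe_p > min_hwe_p and call_rate > min_call_rate


# A common, well-called SNP in equilibrium is retained.
print(snp_passes_qc(800, 180, 20, n_missing=0, hwe_p=0.5))
```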

Results

The quantile–quantile plots of the Levene’s test P-values suggested no noticeable inflation of type I error rate in either the individual studies or the meta-analysis (Supplementary Figure S2). We did not detect SNPs with a meta-analyzed Levene’s test P-value (as shown in Figures 1a and e) lower than 5E-08 (ref. 15), which we attribute to the small sample size, even with all the studies combined (Supplementary Table S1). For SNPs with P-values lower than 1E-05 (Table 1), we systematically searched for neighboring SNPs associated at genome-wide significance levels with any trait or disease in the catalog of published GWAS (http://www.genome.gov/gwastudies/), filtering associations based on a maximum distance of 500 kb and r2>0.8 or D'>0.8. Among the 16 top hits for log(BMI), rs12132044 in the intronic region of the NEGR1 gene had a highly suggestive meta-analyzed Levene’s test P-value of 4.28E-06 (Supplementary Figure S3). Notably, rs2815752 near the NEGR1 gene, which is known to be associated with BMI16 and severe obesity in a pediatric cohort,5 was in weak LD (r2=0.325; D'=0.888; Distance=231.44 kb) with rs12132044 and also nominally significant for variance heterogeneity (P=0.0076). None of the other top hits for log(BMI) or log(height) were correlated with variants associated with other traits or diseases in their neighboring regions (Supplementary Table S2). For illustrative purposes, we also performed a meta-analysis using arbitrary adjustments. Similar conclusions were reached when meta-analysis was performed with study-specific weights (Figures 1b–d and f–h). Additional simulation results on variance prioritization are provided (Supplementary Table S3; Supplementary Figures S1 and S4).
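The neighbor-search criterion can be written as a short predicate. The sketch below assumes that the distance limit applies jointly with either linkage disequilibrium measure, that is, distance ≤ 500 kb and (r2>0.8 or D'>0.8). This reading is an assumption, as is the function name. It is consistent with rs2815752 being reported as a neighbor of rs12132044.

```python
def is_ld_neighbor(distance_kb, r2, d_prime,
                   max_distance_kb=500.0, min_r2=0.8, min_d_prime=0.8):
    """Return True if a catalogued SNP counts as a neighbor of a top hit.

    Assumed precedence: within the distance limit AND (r2 OR D') above cutoff.
    """
    within_range = distance_kb <= max_distance_kb
    in_ld = r2 > min_r2 or d_prime > min_d_prime
    return within_range and in_ld


# Values reported for rs2815752 relative to rs12132044.
print(is_ld_neighbor(distance_kb=231.44, r2=0.325, d_prime=0.888))
```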

Figure 1

Distribution of meta-analyzed Levene’s test P-values according to study weights for log(height) and log(BMI). Illustrated in (a–d) is the quantile–quantile plot of meta-analyzed Levene’s test P-values for log(height) with adjusted weights applied to the three studies (8114 individuals combined). Illustrated in (e–h) is the quantile–quantile plot of meta-analyzed Levene’s test P-values for log(BMI) with adjusted weights applied to the three studies (5892 individuals combined). Panels a and e assumed natural weights, that is, an adjustment of one for all three studies (equivalent to using all the samples from the three studies). For illustrative purposes, we also performed meta-analysis using arbitrary adjustments. Panels b and f assumed an adjustment of 1, 1, and 0 for MESA, NHS, and HPFS, respectively (equivalent to meta-analysis of only MESA and NHS). Panels c and g assumed an adjustment of 1, 1, and 0.8 for MESA, NHS, and HPFS, respectively. Panels d and h assumed an adjustment of 1, 0.9, and 1 for MESA, NHS, and HPFS, respectively.

Table 1 SNPs with Levene’s test P-value lower than 1.0E-05 from a meta-analysis of height and body mass index

Discussion

Analysis of the genetic basis of quantitative trait variance has recently gained increasing interest. Differences in the variance of a quantitative trait between genotypes of a SNP can be due to environmental sensitivity, underlying gene–gene or gene–environment interactions, or linkage disequilibrium with causal variants. Levene’s test can be applied to meta-analysis of environmental sensitivity, which largely rests on analysis of phenotypic variation. Notably, meta-Levene identified a NEGR1 variant (rs12132044) for which the variance of log(BMI) stratified by genotype was related to the number of minor alleles in a non-linear fashion, a pattern that a linear model would be underpowered to detect.

Even with the large sample sizes available from modern consortia, statistical power to detect interactions remains modest, and thus there is a need for prioritization methods. The computation of Levene’s test P-values using only summary-level data facilitates the use of variance prioritization in meta-analysis when individual-level data cannot be obtained. In variance prioritization, SNPs with significant Levene’s test P-values are prioritized and then directly tested for interaction effects using the preferred interaction meta-analysis methods. Our simulations (Supplementary Table S3) showed improvements in power when using the optimal Levene’s test P-value thresholds. Increased power was consistent with the reductions in the genetic interaction search space resulting from the prioritization of SNPs. Under most circumstances, the absolute power to detect an interaction is low a priori, so the relative increase in power is substantial. This can be highly advantageous if hundreds or thousands of interactions of small effect sizes underlie the genetics of complex traits. On the other hand, the need for prioritization diminishes when the interaction effect sizes are large and exhaustive search alone provides satisfactory power. However, even in this scenario, the performance of prioritization is either better than or at least equivalent to the conventional exhaustive search. Finally, the strength of association between the environmental covariate and the quantitative trait is the main determinant of the gain in power from prioritization, so situations where variance prioritization is particularly favorable can be readily identified. Our simulations (Supplementary Figure S4) also concurred with the theoretical framework of variance prioritization,4, 11 according to which the sample size, number of SNPs, MAF, and the proportion of variance explained by the interactions and the covariate influenced the statistical power of variance prioritization.

Beyond allowing the implementation of variance prioritization to select SNPs for a meta-analysis of genetic interactions or environmental sensitivity, there are many other potential applications of the meta-analysis of Levene’s test. For example, the homogeneity of variance assumption underlying many statistical models is usually examined using Levene’s test. The meta-analysis of the main effects often relies on the assumption of a common variance among the different levels of a factor, and meta-analysis of Levene’s test can be conveniently adopted as a quality control step prior to main effects analysis. Meta-analysis of Levene’s test is not limited to stratification by genotype; it can also be used to investigate the heterogeneity of phenotypic variance across a wide range of environmental factors.

A few limitations are worth considering. First, the required summary statistics are not typically reported in existing GWAS meta-analyses, and the generation of such statistics entails further analytic efforts among individual research centers. However, calculation of summary statistics can be simply executed at research centers using our PLINK R plug-in17 scripts (PLINK v.1.07; http://pngu.mgh.harvard.edu/purcell/plink/). Second, meta-analysis frequently uses imputation methods to produce a common set of SNPs among studies genotyped on different platforms. Imputed SNPs are usually assigned a probability score based on the expected number of minor alleles, in which case individuals cannot be stratified into discrete genotypes. To address this concern, we suggest using a best-guess model whereby participants may be classified according to the most likely genotype. However, further statistical methodologies to incorporate probabilistic genotypes under the current framework are required. Finally, population stratification presents a major challenge to the meta-analysis of population-based GWAS. We observed that meta-analysis of Levene’s test in a multi-ethnic population can lead to false positive results when no precaution is taken (data not shown). Although our method does not explicitly address this problem, one solution would be to compute the required summary statistics from the principal-component-adjusted traits.
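The best-guess model suggested above for imputed SNPs amounts to assigning each participant the genotype with the highest posterior probability. A minimal Python sketch follows. The function name is hypothetical, and the optional `min_probability` cutoff, which leaves uncertain calls missing, is an added assumption that is not part of the method as described.

```python
def best_guess_genotype(probabilities, min_probability=0.0):
    """Convert imputed genotype probabilities into a discrete genotype.

    probabilities: (P(g=0), P(g=1), P(g=2)) for one individual at one SNP.
    min_probability: hypothetical cutoff; if the most likely genotype falls
    below it, the call is treated as missing and None is returned.
    """
    best = max(range(3), key=lambda g: probabilities[g])
    if probabilities[best] < min_probability:
        return None
    return best


# A confidently imputed heterozygote and an ambiguous call.
print(best_guess_genotype((0.05, 0.90, 0.05)))
print(best_guess_genotype((0.40, 0.35, 0.25), min_probability=0.9))
```

Individuals classified in this way can then be stratified into the three genotype groups required for the summary statistics. The uncertainty of the imputation is ignored in doing so.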

In conclusion, we have presented a mathematical framework for meta-analysis of Levene’s test that can be used for environmental sensitivity or variance prioritization in meta-analysis. The use of Levene’s test is advantageous as it is robust to departures from the normality assumption, is not influenced by the main effects of SNPs, and does not assume an additive genetic model. Finally, meta-analysis of Levene’s test can be adapted to more general contexts of variance analysis and has utility beyond the field of genetics.