Semi-parametric empirical Bayes factor for genome-wide association studies

Bayes factor analysis has the attractive property of accommodating the risks of both false negatives and false positives when identifying susceptibility gene variants in genome-wide association studies (GWASs). For a particular SNP, the critical aspect of this analysis is that it incorporates the probability of obtaining the observed value of a statistic on disease association under the alternative hypotheses of non-null association. An approximate Bayes factor (ABF) was proposed by Wakefield (Genetic Epidemiology 2009;33:79–86) based on a normal prior for the underlying effect-size distribution. However, misspecification of the prior can lead to failure in incorporating the probability under the alternative hypothesis. In this paper, we propose a semi-parametric, empirical Bayes factor (SP-EBF) based on a nonparametric effect-size distribution estimated from the data. Analysis of several GWAS datasets revealed the presence of substantial numbers of SNPs with small effect sizes, and the SP-EBF attributed much greater significance to such SNPs than the ABF. Overall, the SP-EBF incorporates an effect-size distribution that is estimated from the data, and it has the potential to improve the accuracy of Bayes factor analysis in GWASs.


Introduction
Genome-wide association studies (GWASs) are comprehensive studies on the relationship between disease traits and single nucleotide polymorphisms (SNPs), throughout the genome, and have identified susceptibility gene variants for many complicated diseases [1,2]. The data-analysis approach commonly used for identifying susceptibility gene variants in GWASs is statistical hypothesis testing based on the P value. Many authors, however, have pointed out that the P value has fundamental limitations [3]. A critical limitation is that the P value only conveys information about dissociation from the null hypothesis (null association), and it controls the probability of yielding a false positive based on the probability distribution of a test statistic under the null hypothesis, but not the probability of yielding a false negative. In GWASs, the lack of power due to the use of the extremely strict, genomewide significance level [4,5], 5 × 10 −8 , has also been criticized [6][7][8], as several studies have shown that many SNPs not reaching genome-wide significance are associated with various traits [9][10][11].
Thus far, increasing numbers of studies have used the Bayes factor (BF), in addition to the P value [12][13][14]. Typically, the BF is based on a sufficient statistic regarding the association between a disease and a particular SNP, and it compares the probability of observing a value of the statistic under the null hypothesis and the corresponding probability of observing this value under the alternative hypothesis. Thus, the BF conveys more information than the P value, since it takes into account not only the false positive, but also the false negative. This is particularly true in the Bayesian decision-theoretic testing [15]; the rejection of the null hypothesis based on the posterior odds ratio across the two hypotheses is transformed to a comparison of the BF and the threshold, expressed as the product of a prior odds ratio and the cost of the false negative relative to the false positive (see also "Hypothesis testing and the BF").
When calculating the denominator of the BF, which represents the probability that the observed statistic value is under the alternative hypothesis, it is necessary to specify a prior distribution for an association parameter or effect size of a SNP (such as a coefficient of the log odds ratio in a logistic model), as well as nuisance parameters (such as an intercept coefficient in a logistic model), under the alternative hypothesis. Wakefield [16] sidestepped the specification of the prior distribution for the nuisance parameters and derived an explicit form of an approximate BF (ABF) for the association parameter of interest based on two approximations, (1) asymptotic normality of the estimated effect size and (2) a normal prior N(0, W) with the variance W for the effect-size distribution [16].
However, the difficulty in specifying the prior distribution (as seen in many Bayesian analyses) also applies to the BF analysis. For the ABF, Wakefield proposed some specifications of the prior variance W [16]. For example, W can be specified as W = 0.21 2 with a 95% belief that the effect size in terms of the odds ratio is within 1/1.5-1.5. More complex specifications incorporating dependence of the effect on the minor allele frequency (MAF) are also possible [16]. However, even with these arguments, there is always a risk of mis-specifying the prior distribution, especially in exploratory GWASs with limited prior information. To address misspecification of the variance W, some authors proposed to introduce a prior distribution for W [17] or to perform an empirical Bayes estimation of W [18]. However, normality of the effect-size distribution is a conventional assumption as there is no guarantee that it is reasonable. Some authors suggest the use of other parametric priors, such as Laplace priors [19]. Actually, the effect-size distribution is expected to have various distributional forms, reflecting complicated biological mechanisms between genetic factors and disease (see Figs. 1 and S4).
In this paper, we propose an empirical Bayes method with a flexible, nonparametric prior for the effect-size distribution to address the issue of misspecification. Our model is semiparametric because of a combination of the nonparametric prior with the theoretically reasonable, asymptotic normality for the sampling distribution of the estimated effect size [8,11,20] (as done in the ABF). Even with the nonparametric prior, we can accurately estimate the effect-size distribution from high-dimensional genomic data, plausibly involving a large quantity of parallel data structures. See Nishino et al. [11] and Otani et al. [8] for the effectiveness of our estimation approach in the context of GWAS.
As such, our semi-parametric empirical BF (SP-EBF) method intends to improve the current BF analysis, possibly with inappropriate prior distributions. In other words, with the use of appropriate (nonparametric) prior distributions, our method aims to realize the inherent effectiveness of the BF analysis, potentially rendering it superior to traditional GWAS analysis based only on the P value.

Hypothesis testing and the BF
In a GWAS, each SNP is tested individually for its association with disease. Typically, the following univariate logistic regression model is assumed for the jth SNP, where η ij is the probability of disease for the ith subject with genotype x ij (x ij = 0, 1, or 2), and α j and β j are intercept and effect-size (log odds ratio) parameters [21] (i = 1, …, n; j = 1, …, m). When performing a test of the null hypothesis, H 0 :β j = 0, a Wald Z value is expressed as follows, z j 1 ; whereβ j is a maximum likelihood estimate of β j , and V j is an estimated variance ofβ j . Note that this test is a usual univariate test on single SNPs for common variants. Typical quality control processes remove low-frequency variants (e.g., MAF < 1%), so that the following BF analyses that are based on summary statistics ðβ j ; V j Þ would not cover such rare variants.
The BF is defined by the ratio of the probability of observingβ j under H 0 and the corresponding probability under an alternative hypothesis H 1 :β j ≠ 0, For example, when the BF = 0.01, the obtained valueβ j can be interpreted as being 100 times more likely to occur under the alternative hypothesis than under the null hypothesis. For a particular SNP, a formal Bayesian decisiontheoretic testing is to reject H 0 if the posterior odds of H 0 , i.e., Pr H 0 jβ j À Á = Pr H 1 jβ j À Á , is smaller than the ratio of costs, where c FP and c FN are the costs of the false positive (type I error) and false negative (type II error), respectively [15]. As the posterior odds of H 0 are expressed as a product of the BF and the prior odds of H 0 , that is, the aforementioned Bayesian decision-theoretic testing can be transformed to a decision rule to compare the BF with a relative cost R divided by the prior odds, i.e., if BFðβ j Þ<R=ðPr H 0 ð Þ=PrðH 1 ÞÞ; then H 0 is rejected.

The ABF
The ABF was proposed by Wakefield [16]. In this analysis, for the jth SNP an approximation of asymptotic normality is employed for the estimate of β j , i.e.,β j . Thus, under H 0 : β j = 0, the distribution ofβ j is specified as N (0, V j ).
Similarly, under the alternative hypothesis, H 1 :β j ≠ 0, the distribution ofβ j (given β j ) is specified as N (β j , V j ), but a normal prior N (0, W) is specified for the distribution of β j . Here, the variance W can be specified as W = 0.21 2 with a 95% belief that the odds ratio is within 1/1.5 to 1.5. Specifications of W incorporating effect-MAF dependence are also possible [16]. Accordingly, the ABF is expressed as a ratio of the probability density f 0; ABFβj À Á under H 0 and the probability density where φ μ; σ 2 is the density of the normal distribution with mean μ and variance σ 2 .

The SP-EBF
For the jth SNP, we assume the following two-component mixture model for the marginal distribution ofβ j , where π is the prior probability of H 1 , and f 0 and f 1 are the density distributions ofβ j under H 0 and H 1 , respectively. As with the ABF, we employ asymptotic normality for the sampling distribution ofβ j . Accordingly, we specify N (0, V j ) for f 0 under H 0 . In forming f 1 under H 1 , we also specify the sampling distribution ofβ j as N (β j , V j ) for a given value of β j , but specify a nonparametric distribution, g, for the prior distribution of β j . We then obtain the following BF, We estimate the priors π and g based on the data, i.e., the empirical Bayes approach. To this end, we apply the smoothing-and-roughening algorithm [22], a form of the expectation-maximization algorithm [8,11,20]. We discretize the effect-size distribution g into mass point probabilities p ¼ This discretized prior distribution excludes the probability mass at the zero point of the null hypothesis [23] (see Section S1 of Supplementary Materials for the details of the algorithms).
Another approach to flexible modeling of the effect-size distribution g is to specify parametric finite mixture normal distributions whose components have mean zero, but distinct variances [24,25]. However, this model could not capture components with non-zero mean (small peaks with relatively large effects) as seen in actual effect-size distributions, e.g., those in schizophrenia and coronary artery disease (see "Applications"). Furthermore, as indicated by these distributions, there is no guarantee that actual effectsize distributions are symmetric. In contrast, our method utilizes a nonparametric distribution for g to flexibly capture any forms of the effect-size distribution, including asymmetric multimodal distributions.
We then obtain an estimated BF, i.e., the SP-EBF expressed as R code to implement the estimation of the SP-EBF (including the estimation of the hierarchical mixture model in Eq. (2)) is provided in Section S6 of Supplementary Materials. We ascertained superiority of the SP-EBF over the ABF for various forms of the effect-size distribution by simulation experiments (see Section S4 of Supplementary Materials).

Applications
We investigated the characteristics of the SP-EBF in comparison with the ABF through their applications to a metaanalysis of seven GWAS studies in bipolar disorder [26], consisting of 7482 cases and 9250 controls (see Section S2 of Supplementary Materials). We utilized summary statistic data and MAF available at the Psychiatric Genomics Consortium website (https://www.med.unc.edu/pgc) on 2135,534 SNPs, after excluding those with no information about allele frequency based on the HapMap CEU sample. In Section S3 of Supplementary Materials, we briefly summarize similar results of our analyses of other two GWAS datasets, in schizophrenia and coronary artery disease.
Estimation of the effect-size distribution Figure 1 shows an estimate of the nonparametric effect-size distribution (g) for the bipolar disorder dataset. The estimated effect-size distribution was greatly different in dispersion from the ABF normal prior N (0, W) with W = 0.21 2 . This result indicates that the ABF prior missed substantial numbers of small effects, while assuming the presence of substantial numbers of large effects that might not actually be present. The estimation also indicated that the form of the effect-size distribution was not normal (note: this was particularly apparent in the other datasets, specifically in schizophrenia and coronary artery disease (see Fig. S4 in Supplementary Materials); the estimated effectsize distributions had very complex forms with multiple peaks). Figure 2 shows plots of the P value, ABF, and SP-EBF across all the SNPs in the bipolar disorder dataset. Note that the scales in the P value plot and those in the BF plots are different, reflecting that the P value and BF are different measures of association. In the following, for the sake of convenience, we shall use the term "significance" when the P value or BFs suggest the alternative hypothesis. At first glance of the SP-EBF plot, the SP-EBF seems to downweigh the associations consistently for all SNPs, like a zoomed out version of the ABF plot, but actually it is not.

Comparison of the SP-EBF and ABF
The SP-EBF generally down-weighs (or greatly shrinks) the associations for SNPs with very small P values, but could up-weigh for those with very small effect-size estimates with large P values (with larger supports by the estimated effect-size distribution), according to the shape of the estimated prior distribution shown in Fig. 1. In other words, compared with the SP-EBF, the ABF attributed greater degrees of significance to significant SNPs. Figure 3 shows scatter plots of the P value versus the ABF or SP-EBF for all the SNPs in the bipolar disorder dataset, color-coded by the absolute value of the estimated effect sizeβ j . In the scatter plots of the P value versus the ABF, the points form almost a straight line (this is particularly the case for SNPs with high significance), indicating that the ranking of SNPs using the ABF is almost the same that using the P value. In contrast, in the scatter plot for the P value versus the SP-EBF, the points are relatively more scattered, indicating a greater difference in SNP ranking between the SP-EBF and the P value. In Fig. 3, as expected by a large difference in the shape of the prior distribution between the ABF and SP-EBF, the ABF and SP-EBF show an opposite tendency in that for a given P value, there is a larger -log 10 ABF (greater significance) for largerβ j but a larger -log 10 SP-EBF for smallerβ j . In other words, the SP-EBF ascribed greater significance to SNPs with smallerβ j (for a given P value).
We also observed similar results in figures colored based on the variance or MAF (see Figs. S1 and S2 in Supplementary Materials). We observed that for a given P value, the SP-EBF attributed greater significance to SNPs with smaller variances or larger MAF. These results are essentially the same as those in Fig. 3, since a smallβ j corresponds to a small standard error or large MAF for a given P value (or z value). Figure 4 shows a plot of the SP-EBF versus the ABF for the 100 SNPs with the smallest P values. These SNPs could be roughly divided into six regions that were in linkage disequilibrium. We observed that SNPs in each region had similar estimated effect sizes. Table 1 presents the SNPs with the smallest P values in each of the six regions, and shows their rankings among the top 100 SNPs (without regard to region) based on the ABF, SP-EBF, and P value. Again, the rankings based on the ABF and P value are almost the same. In comparison, the SP-EBF resulted in a lower ranking of representative SNPs with relatively largê β j (such as rs10994415 (NC_000010.10:g.62322034T>C) withβ j = 0.271 and rs17138230 (NC_000011.9: g.79075852A>T) withβ j = 0.163), and a higher ranking of SNPs with relatively smallβ j . Of note, similar results were obtained when dividing SNPs into LD clumps and then comparing the rankings of the associated regions (see Fig. S3 and Table S1). It is interesting to observe that the first-ranked 1 SNP based on the ABF and P value, rs10994415, is ranked sixth based on the SP-EBF, while the fourth-ranked SNP based on the ABF and P value, rs9371601 (NC_000006.11:g.152790573G>T), is ranked first based on the SP-EBF.

Discussion
The applications to real-life GWAS datasets indicated that the ABF prior was excessively dispersed compared to the effect-size distribution estimated by our method (see Figs. 1  and S4). In such situations, it is expected that smaller BFs (more evidence for the alternative hypothesis) to SNPs with smaller effect sizes than those with larger effect sizes, because the estimated effect-size distribution may put more weight on smaller effect sizes. In particular, compared with the SP-EBF, in the ABF a SNP with a large absolute value of the estimated effect sizeβ j , which is of greater interest in GWASs, may have a larger denominator of the BF (the probability of observingβ under the alternative hypothesis), leading to a smaller value of the BF (or larger value of -log 10 BF) (see Fig. 2). That is, the ABF tends to attribute a higher degree of significance to a significant SNP. On the other hand, the ABF may attribute less significance to a SNP with a smallβ j because of the relatively small prior probability assigned to the small absolute value of the estimated effect size. Another observation in the applications to real-life GWAS datasets was that the SNP rankings were similar between the ABF and P value. One reason for this is that the ranking based on P value and that based on PrβjH 0 À Á are generally close (perfectly equal if the estimated variances of Fig. 2 Plots of the P value, ABF, and SBF (-log 10 P, -log 10 ABF, and -log 10 SP-EBF) for all SNPs, ordered according to the position on the chromosome in the bipolar disorder dataset. Note that the scales in the P value plot and those in the BF plots are different, reflecting that P value and BF are different measures of association. The red horizontal line in -log 10 P represents the genome-wide significance level. Fig. 3 The -log 10 P versus the ABF and SP-EBF (-log 10 ABF and -log 10 SP-EBF), color-coded by the absolute value of the estimated effect sizeβ j in the bipolar disorder dataset; red: small (0-90 percentile), yellow: medium (90-99 percentile), green: large (99-99.9 percentile), blue: very large (99.9-100 percentile). The red horizontal lines in -log 10 P represent the genome-wide significance level. Note that the scales of x-axis are different between the ABF and SP-EBF to incorporate the difference in magnitude between them as noted in "Comparison of the SP-EBF and ABF".
β are the same across SNPs). Moreover, if the support by a prior effect-size distribution is almost constant (due to its flat form) over an actual range of non-null effect sizes (as indicated by Fig. 1), PrβjH A À Á will be almost constant regardless of the absolute value of the estimated effect sizê β j . Therefore, the ABF prior with a large variance W essentially functions as a non-informative prior distribution.
In other words, it can be said that such a prior distribution may fail to incorporate the information about the alternative hypothesis, although this is the main motivation of using the BF.
On the other hand, the SP-EBF could resolve the aforementioned issues in the ABF by utilizing an actual effectsize distribution estimated under a flexible, semi-parametric hierarchical mixture model. In the applications to real-life GWAS datasets, the estimated effect-size distributions indicated the presence of large numbers of SNPs with small effect sizes. Accordingly, for a SNP with a small absolute value of the estimated effect sizeβ j , PrβjH A À Á may become larger (due to relatively greater support by the effect-size distribution), leading to a smaller (more significant) BF in the SP-EBF. In contrast, a SNP with a very large absolute value of the estimated effect size would become less significant by using the SP-EBF because of less support by the effect-size distribution. As such, the SP-EBF could successfully incorporate the information about the alternative hypothesis by being based on an actual effectsize distribution.
Based on the arguments above, with the SP-EBF we can expect that SNPs with small effect sizes, where the P values are not strongly significant, become more significant and worthy of further investigation in subsequent studies. In the bipolar example, rs6746896 (NC_000002.11:g.97410949A>G) and rs736408 (NC_000003.11:g.52835354C>T) had small effect sizes that did not exceed the genome-wide significance level but that were slightly more significant in the SP-EBF than in the ABF. However, other GWASs [27,28] reported that bipolar disorder was associated with the gene LMAN2L, encoded near rs6746896 (Chr2). Meanwhile, in a pooled population of bipolar and schizophrenia patients, an association was demonstrated [26] with rs736408 (Chr3) in the intron region of ITIH3. Of note, for rs10994415, rs9371601, and rs7296288 (NC_000012.11:g.49479968A>C) that exceeded the genome-wide significant level, several studies [29,30] investigated biological mechanism. The rank improved for rs9371601 for the SP-EBF, although its values were substantially larger than the ABF owing to a less support by the estimated effect-size distribution. The SP-EBF analysis is expected to be particularly useful for detecting novel SNPs with small effect sizes that cannot be detected by standard analysis based on the P value, and could therefore address the so called "missing heritability" problem in many complex diseases (see also Nishino et al. [11] and Otani et al. [8]).
Last, as a further extension of our BF analysis, a byproduct of obtaining an estimate, sayπ, of the prior probability of null association π in Eq. (2), is that it may allow for more accurate Bayesian decision-theoretic testing based on the rule given in "Hypothesis testing and the BF," utilizing an estimate,π=ð1 ÀπÞ; for the prior odds, Fig. 4 Plot of the ranking in SP-EBF versus that in the ABF for the top 100 SNPs with the smallest P values. SNPs in the same linkage disequilibrium region, that had r 2 > 0.2 (according to Haploleg v4.1) or that shared the same GENCODE gene, are plotted using the same color. Representative SNPs (SNPs with the smallest P value in each region) are plotted using large dark-colored triangles. Pr (H 0 )/Pr(H 1 ). In practice, we may also consider accommodation of stratification factors, permitting possible varying effect-size distributions as well as possible dependence of the probabilities of null association on the stratification factors. See Nishino et al. [11] for stratified analyses based on the derived allele frequency and the status of eQTL. Our method can be easily applied to continuous traits in which a least-square estimate of the slope parameter, rather than the log-odd ratioβ. Compared with empirical Bayes methods under parametric effect-size distributions in the context of human or animal genetic studies [31][32][33], we can incorporate a nonparametric effect-size distribution into our hierarchical mixture model in Eq.
Funding Grant-in-Aid for Scientific Research (16H06299) and CREST (JPMJCR1412) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/.