Abstract
The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy–Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy–Weinberg equilibrium with biallelic markers on the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male to female allele frequency ratio are computed, and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test, and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used.
Introduction
The number of genetic markers identified for the human genome has increased tremendously over the past decades. The 1000 Genomes project currently includes more than 88 million genetic variants (The 1000 Genomes Project Consortium, 2015). Most of the variants reside on the autosomes, which are ordered according to their size. The X chromosome is a large chromosome with a size of about 155 Mb; it is almost as large as chromosome 7 (Hein et al., 2005), and is estimated to contain about 5% of the genes in the human genome (Wise et al., 2013). Currently, ~3.5 million variants on the X chromosome have been reported. Much of the statistical analysis of X-chromosomal data is complicated by the fact that males have only one copy, whereas females have two. The pseudoautosomal regions (Graves et al., 1998) of the X chromosome behave as autosomes, and for these regions autosomal statistical methodology applies.
A simple way to deal with X-chromosomal data is to ignore males, and apply the usual autosomal procedures to females only. This has often been done in studies of Hardy–Weinberg (HW) equilibrium, linkage disequilibrium, genetic association (Wise et al., 2013) and others. The HW law is a well-known elementary genetic principle typically explained in detail in genetic textbooks (Crow and Kimura, 1970; Li, 1976; Hartl, 1980; Hamilton, 2009). For a biallelic marker with alleles A and B with relative frequencies p and q, the law states that the frequencies of the genotypes AA, AB and BB will reach the stable proportions (p^{2}, 2pq, q^{2}) in one generation of random mating. From this point on, genotype and allele frequencies will remain unaltered through time, as long as disturbing forces like differential mortality, migration and others remain absent.
The dynamics of X-chromosomal markers are quite different. If male and female allele frequencies initially differ, then it will take more than one generation before equilibrium is achieved. Because males inherit their single X chromosome from their mother, the male A allele frequency always equals the female A allele frequency of the previous generation. Because females inherit one allele from each parent, the female A allele frequency is the mean of the male and female A allele frequencies of the previous generation. This ‘lagging and averaging’ continues until the difference between male and female allele frequencies becomes vanishingly small. At that point, the female genotype frequencies will have stabilized as well, reaching the HW proportions. In each generation, the absolute difference between male and female allele frequencies is halved. If D_{t} represents the absolute difference in male and female allele frequency in generation t, then we have D_{t}=(1/2)^{t−1}D_{1}, with D_{1} the difference in the initial generation. In a worst case scenario with D_{1}=1, it will take eight generations in total before the difference drops below 0.01, because D_{8}=1/128≈0.008. Figure 1 illustrates the faster attainment of HW equilibrium for smaller values of D_{1}.
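The ‘lagging and averaging’ recursion can be illustrated with a minimal sketch. The function names and the starting frequencies are illustrative assumptions, not part of the original analysis; generations are counted starting from the initial one.

```python
def trajectory(p_m, p_f, n_generations):
    """Male and female A allele frequencies over n_generations (initial one included)."""
    path = [(p_m, p_f)]
    for _ in range(n_generations - 1):
        # Males copy the previous female frequency; females average both sexes.
        p_m, p_f = p_f, (p_m + p_f) / 2
        path.append((p_m, p_f))
    return path


def generations_until(d1, threshold):
    """Number of generations, counting the initial one, until |p_m - p_f| < threshold."""
    p_m, p_f = d1, 0.0  # hypothetical starting point with initial difference d1
    generation = 1
    while abs(p_m - p_f) >= threshold:
        p_m, p_f = p_f, (p_m + p_f) / 2
        generation += 1
    return generation


if __name__ == "__main__":
    for generation, (p_m, p_f) in enumerate(trajectory(1.0, 0.0, 8), start=1):
        print(generation, round(p_m, 4), round(p_f, 4), round(abs(p_m - p_f), 4))
```

The weighted mean (p_m + 2p_f)/3 is unchanged by the recursion, so both frequencies converge to that value.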
Statistical tests for HW equilibrium should reflect the special characteristics of the X chromosome. In recent work, Graffelman and Weir (2016) have proposed χ^{2}, exact and permutation tests for HW equilibrium for markers on the X chromosome that take both males and females into account. You et al. (2015) have developed a likelihood ratio test for X-chromosomal markers that also uses males and females. These frequentist procedures jointly test HW proportions for females and equality of allele frequencies in males and females.
There is a considerable number of contributions to Bayesian testing for HW equilibrium of autosomal markers, starting with Pereira and Rogatko (1984) and Lindley (1988), and including Shoemaker et al. (1998), Ayres and Balding (1998) and Wakefield (2009, 2010). Bayesian methods have also been used to deal with variants of unknown location, and to classify them as autosomal or X-chromosomal under the assumption of HW equilibrium (Gautier, 2014). For testing autosomal variants for HW equilibrium, Ayres and Balding (1998) proposed a Markov chain Monte Carlo method to obtain the posterior distributions of inbreeding coefficients for markers with multiple alleles. Shoemaker et al. (1998) obtained explicit expressions for the joint posteriors of various disequilibrium coefficients and allele frequencies in the biallelic case. Wakefield (2010) advocates the use of the Bayes factor in Bayesian inference on HW equilibrium and addresses Bayesian testing in a genome-wide context. However, all these Bayesian studies address autosomal markers, and to date Bayesian procedures for X-chromosomal markers have apparently not been developed. This paper therefore proposes, for the first time, Bayesian methods for a HW analysis of X-chromosomal markers that take both males and females into account, by using an extra parameter allowing for different allele frequencies in the sexes. We concentrate on the most commonly used single nucleotide polymorphisms (SNPs) and consider markers with multiple alleles, such as microsatellites, beyond the scope of the current paper.
The structure of the paper is as follows. In the ‘Background and notation’ section, we provide some background and establish notation. In the ‘Bayesian tests for X-chromosomal markers’ section, we develop a Bayesian approach to the problem of testing X-chromosomal markers for HW equilibrium, and we assess the method through simulation. The ‘Examples’ section illustrates the use of the Bayesian approach with empirical data taken from the Japanese population of the 1000 Genomes project (The 1000 Genomes Project Consortium, 2010), both for single SNPs and for sets of multiple X-chromosomal SNPs. The approach adopted in that implementation to deal with multiple testing through posterior predictive checks is novel. A discussion section completes the paper.
Background and notation
We consider a biallelic genetic polymorphism on the X chromosome with alleles A and B having allele frequencies p_{Am} and p_{Bm} in males and p_{Af} and p_{Bf} in females, with p_{Am}+p_{Bm}=p_{Af}+p_{Bf}=1. There are five genotypes consisting of hemizygous males, with genotypes A and B, and diploid females, with genotypes AA, AB and BB. We denote the observed genotype counts in males by n_{Am} and n_{Bm}, and in females by n_{AAf}, n_{ABf} and n_{BBf}. The total sample size is n=n_{m}+n_{f}, where n_{m}=n_{Am}+n_{Bm} is the total number of males, and where n_{f}=n_{AAf}+n_{ABf}+n_{BBf} is the total number of females.
The male A genotype (or allele) count, n_{Am}, is assumed to follow a Binomial(n_{m}, p_{Am}) distribution, and the vector of female genotype counts, (n_{AAf}, n_{ABf}, n_{BBf}), is assumed to follow a Multinomial(n_{f}, (p_{AAf}, p_{ABf}, p_{BBf})) distribution, where p_{AAf}+p_{ABf}+p_{BBf}=1.
Equilibrium in X-chromosomal markers
For X-chromosomal markers, it can take more generations to achieve equilibrium, depending on the initial difference in allele frequency between the sexes (Crow and Kimura, 1970). In fact, under disequilibrium, allele and genotype frequencies for the X chromosome will always be changing from generation to generation. All the frequencies considered next correspond to the current generation.
HW equilibrium holds for the SNPs of the X chromosome if and only if:
1. There is equality of male and female allele frequencies, p_{Am}=p_{Af}.
2. The female genotype counts, (n_{AAf}, n_{ABf}, n_{BBf}), are multinomially distributed with HW proportions, (p_{AAf}, p_{ABf}, p_{BBf})=(p_{Af}^{2}, 2p_{Af}p_{Bf}, p_{Bf}^{2}).
When both these conditions hold in one generation then, under random mating, the allele frequencies in males and the genotype frequencies in females are constant from one generation to the next (see, for example, Li (1976) and Zheng et al. (2007)). In the case of X-chromosomal markers, disequilibrium can be present under three different scenarios.
In the first scenario, p_{Am}=p_{Af} holds, but the female genotype proportions fail to match the HW proportions, a case which is typically parametrized in terms of a female inbreeding coefficient, f, such that:
p_{AAf}=p_{Af}^{2}+f p_{Af}p_{Bf}, p_{ABf}=2p_{Af}p_{Bf}(1−f), p_{BBf}=p_{Bf}^{2}+f p_{Af}p_{Bf}.
When p_{Am}=p_{Af}, a value of f=0 corresponds to HW equilibrium, a positive f indicates a lack of female heterozygotes, and a negative f indicates an excess of female heterozygotes. Hence, for studying this first kind of disequilibrium on the X chromosome, we will use this female inbreeding coefficient, f, as a measure of the deviation of female genotype frequencies from HW proportions in the current generation, which can be posed as:
f=1−p_{ABf}/(2p_{Af}p_{Bf}).
Note that the value of f can range between −MAF/(1−MAF) and 1, where MAF=min(p_{Af}, 1−p_{Af}). Under this first disequilibrium scenario and random mating, one will have HW equilibrium in the next generation, as for autosomal markers.
Under a second disequilibrium scenario, female genotype probabilities satisfy HW proportions and therefore f=0, but the allele frequencies of males and females are different and therefore condition 1 does not hold, in which case we use as a measure for disequilibrium the ratio of male to female allele frequencies,
d=p_{Am}/p_{Af}.
Under this second disequilibrium scenario, with d≠1, allele frequencies of males and genotype frequencies of females converge to equilibrium only when the number of generations goes to infinity; even though in this setting in the current generation f is 0, in the previous and in the following generations f is different from 0.
Under a third disequilibrium scenario, X-chromosomal markers might not be in equilibrium because both f≠0 and d≠1.
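The two disequilibrium measures can be made concrete with a short sketch. It assumes the standard inbreeding parametrization of the female genotype frequencies, f=1−p_{ABf}/(2p_{Af}p_{Bf}) and d=p_{Am}/p_{Af}; the function names are illustrative. The last function applies one generation of random mating, which shows that a population with f=0 but d≠1 produces daughters with f≠0.

```python
def female_genotype_freqs(p_af, f):
    """(p_AAf, p_ABf, p_BBf) for female allele frequency p_af and inbreeding f."""
    q = 1.0 - p_af
    return (p_af ** 2 + f * p_af * q, 2 * p_af * q * (1 - f), q ** 2 + f * p_af * q)


def inbreeding_coefficient(p_aa, p_ab, p_bb):
    """Female inbreeding coefficient recovered from genotype frequencies."""
    p_a = p_aa + p_ab / 2
    p_b = p_bb + p_ab / 2
    return 1 - p_ab / (2 * p_a * p_b)


def f_bounds(p_af):
    """Admissible range of f: from -MAF/(1-MAF) up to 1."""
    maf = min(p_af, 1 - p_af)
    return -maf / (1 - maf), 1.0


def allele_ratio(p_am, p_af):
    """Ratio d of male to female A allele frequencies."""
    return p_am / p_af


def next_generation(p_am, p_af):
    """One generation of random mating.

    Sons copy the maternal allele frequency; daughters receive one allele
    from each parent. Returns the new male frequency and the new female
    genotype frequencies.
    """
    p_aa = p_am * p_af
    p_ab = p_am * (1 - p_af) + (1 - p_am) * p_af
    p_bb = (1 - p_am) * (1 - p_af)
    return p_af, (p_aa, p_ab, p_bb)


if __name__ == "__main__":
    # Second scenario: f = 0 in the current generation but d = 0.5 / 0.3.
    _, daughters = next_generation(0.5, 0.3)
    print(inbreeding_coefficient(*daughters))
```

In this example the daughters show an excess of heterozygotes (negative f), whereas equal allele frequencies in the parents give f=0 in the next generation.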
Models for equilibrium and for disequilibrium
In practice one will face either the HW equilibrium scenario, or one of three disequilibrium scenarios, which leads to the choice between four models. Under HW equilibrium, female genotype counts have the Multinomial(n_{f}, (p_{Af}^{2}, 2p_{Af}p_{Bf}, p_{Bf}^{2})) distribution, and the male A genotype count has the Binomial(n_{m}, p_{Am}) distribution with p_{Am}=p_{Af}. In this case, the value of p_{Af} determines the value of all the remaining probabilities. The model under HW equilibrium will be labeled as Model 0, (M_{0}).
In the first disequilibrium scenario described above, f≠0 and d=1, and female genotype counts have the Multinomial(n_{f}, (p_{AAf}, p_{ABf}, p_{BBf})) distribution, while male A genotype counts follow the Binomial(n_{m}, p_{Am}) distribution with:
p_{Am}=p_{Af}=(2p_{AAf}+p_{ABf})/2.
In this case the value of (p_{AAf},p_{ABf}) determines the value of all the remaining probabilities. The model for this disequilibrium scenario is labeled as Model 1, (M_{1}).
In the second disequilibrium scenario described in the ‘Equilibrium in X-chromosomal markers’ section, the inbreeding coefficient for females, f, is equal to 0, but d is not equal to 1. In that case, female genotype counts have the Multinomial(n_{f}, (p_{Af}^{2}, 2p_{Af}p_{Bf}, p_{Bf}^{2})) distribution, while the male A genotype count follows the Binomial(n_{m}, p_{Am}) distribution with probability p_{Am} functionally unrelated to p_{Af}. In this case, the value of (p_{Af}, p_{Am}) determines the value of all the remaining probabilities. The model for this second disequilibrium scenario is labeled as Model 2, (M_{2}).
In the third and last disequilibrium scenario, f≠0 and d≠1. In that case, female genotype counts are multinomial with unrestricted probabilities, as in Model 1, while the male A genotype count is binomially distributed with probability p_{Am}, as in Model 2. In this case, the parameter space is the largest possible, and the model is labeled as Model 3, (M_{3}), or the saturated model.
Bayesian tests for X-chromosomal markers
In the frequentist approach to testing for HW equilibrium for X-chromosomal markers presented in Graffelman and Weir (2016), one chooses between HW equilibrium (that is, Model 0) and disequilibrium (that is, Models 1, 2 and 3), but this approach does not allow one to distinguish between the three different disequilibrium scenarios. In the frequentist approach, additional statistical tests for equality of allele frequencies and/or HW proportions in females would be needed to finally pinpoint the scenario.
Instead, in the Bayesian setting it is more natural to test for HW equilibrium by choosing one scenario among the four alternative scenarios described above, which is equivalent to selecting one model among M_{0}, M_{1}, M_{2} and M_{3}. That is done by choosing a prior distribution for the parameters of the models that captures what one knows about them before observing the data, and a prior distribution on the model space, and then computing the posterior probability of each one of the four models (scenarios). Then, one selects the model (scenario) with largest posterior probability.
Choice of a prior distribution
Different parametrizations allow for different ways of capturing what one knows about the parameters of the model in terms of a prior distribution for them. Here we will adopt the parametrization of Models 0, 1, 2 and 3 in terms of male and female genotype frequencies, because that allows for a choice of priors that leads to simple expressions for the posterior probabilities of the four models considered, and because they are the most convenient ones when one has little information.
Under the HW equilibrium scenario, leading to Model 0, male and female allele frequencies, p_{Am} and p_{Af}, are equal, and they will be assumed to be Beta(b_{1,0}, b_{2,0}) distributed, where the second subindex, 0, refers to M_{0}. Under this scenario, this prior distribution univocally determines the prior distribution of all female genotype frequencies.
Under the first disequilibrium scenario, leading to Model 1, the female genotype frequencies, (p_{AAf}, p_{ABf}, p_{BBf}), are assumed to be Dirichlet(a_{1,1,f}, a_{2,1,f}, a_{3,1,f}) distributed, where the second subindex, 1, refers to M_{1}. The distribution of the female genotype frequencies determines the distribution of the female and male allele frequencies.
Under the second disequilibrium scenario, leading to Model 2, male and female allele frequencies are assumed to be independently distributed as a Beta(b_{1,2,m}, b_{2,2,m}) and Beta(b_{1,2,f}, b_{2,2,f}), respectively, and that determines the distribution of the female genotype frequencies. Finally, under the last disequilibrium scenario, leading to Model 3, female genotype frequencies are assumed to be Dirichlet(a_{1,3,f}, a_{2,3,f}, a_{3,3,f}) distributed, independent of the male allele frequency, which is assumed to be Beta(b_{1,3,m}, b_{2,3,m}) distributed.
Depending on the values chosen for (a_{1,i,f}, a_{2,i,f}, a_{3,i,f}), the Dirichlet(a_{1,i,f}, a_{2,i,f}, a_{3,i,f}) distribution will be more or less informative, and it will capture different information about female genotype frequencies. In particular, its expected value is (a_{1,i,f}, a_{2,i,f}, a_{3,i,f})/(a_{1,i,f}+a_{2,i,f}+a_{3,i,f}), and one can choose the a_{j,i,f}’s to reflect the fact that one expects some genotypes to have larger probabilities than others. Also, the larger a_{1,i,f}+a_{2,i,f}+a_{3,i,f}, the smaller the variances of the components of the Dirichlet random variable, and the more informative that prior distribution. When one is not willing to use subjective information about the female genotype frequencies, Berger et al. (2015) recommend using a Dirichlet with a_{1,i,f}=a_{2,i,f}=a_{3,i,f}=1/3, which is also recommended by Bernardo and Tomazella (2010) as a good approximation to a prior distribution tailored for a reference analysis of HW equilibrium. We will use this reference prior, which is like assuming an effective sample size of only one to start with (see, for example, Morita et al. (2008)). Given that the actual sample sizes in our setting will typically be a lot larger than that, the impact of this prior on the posterior distribution for female genotype frequencies will be negligible.
An analogous argument can be made for choosing the parameters of the Beta(b_{1,i}, b_{2,i}) to model the prior information about allele frequencies. In that case, in the absence of subjective information one often chooses Beta(b_{1,i}, b_{2,i}) with b_{1,i}=b_{2,i}=1/2, which corresponds to a relatively uninformative prior that assumes an effective sample size of only one to start with. Moreover, this prior captures the fact that low MAF markers are more frequent.
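The role of the prior parameters can be checked numerically with the standard Dirichlet moment formulas: the mean of component j is a_{j}/A and its variance is a_{j}(A−a_{j})/(A^{2}(A+1)), where A is the sum of the parameters, read here as the effective sample size. The Beta distribution is the two-component case. The sketch below is illustrative only, and the function name is an assumption.

```python
def dirichlet_moments(alpha):
    """Means, variances and parameter sum of a Dirichlet(alpha) distribution."""
    total = sum(alpha)
    means = [a / total for a in alpha]
    variances = [a * (total - a) / (total ** 2 * (total + 1)) for a in alpha]
    return means, variances, total


if __name__ == "__main__":
    priors = {
        "reference Dirichlet(1/3, 1/3, 1/3)": (1 / 3, 1 / 3, 1 / 3),
        "uniform Dirichlet(1, 1, 1)": (1.0, 1.0, 1.0),
        "Beta(1/2, 1/2)": (0.5, 0.5),
    }
    for label, alpha in priors.items():
        means, variances, total = dirichlet_moments(alpha)
        print(label, [round(m, 3) for m in means],
              [round(v, 4) for v in variances], round(total, 3))
```

Both reference choices have a parameter sum of one, whereas the uniform Dirichlet has a parameter sum of three and smaller component variances.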
An alternative way of eliciting prior information under Models 1 and 3 is to choose a specific distribution for the inbreeding coefficient, f, and for the female allele frequency, p_{Af}, instead of resorting to the Dirichlet distribution for the genotype frequencies. However, this complicates the computation of the posterior probabilities, and it does not make much difference when carrying out a reference analysis that uses little prior information. The inbreeding coefficient is related to the female genotype frequencies through:
and one can explore the prior distribution of f that is induced by assuming a Dirichlet distribution on (p_{AAf}, p_{ABf}, p_{BBf}). When one does that for our reference choice, with a_{1,i,f}=a_{2,i,f}=a_{3,i,f}=1/3, one finds that the prior distribution for f is not symmetric on its support. That prior is in fact trimodal, with two modes at the two extremes of the range of values taken by f, and a third mode at 0, which are features that one considers desirable for a reference prior for a parameter such as f, with finite range and a null hypothesis at 0. Furthermore, under our choice of parameters for the Dirichlet prior, one can check through Monte Carlo simulation that the probability that f is larger than 0 is 0.548.
Instead, when one assumes a Dirichlet(a_{1,i,f}, a_{2,i,f}, a_{3,i,f}) distribution with a_{1,i,f}=a_{2,i,f}=a_{3,i,f}=1 as the prior distribution for the female genotype frequencies, which corresponds to assuming a uniform distribution on them, one finds that the prior distribution for f concentrates on values larger than 0, with a prior probability that f is larger than 0 equal to 0.667. The problem of this upward bias introduced when using a uniform distribution has already been reported by Foll and Gaggiotti (2008). Another shortcoming of assuming a uniform prior for (p_{AAf}, p_{ABf}, p_{BBf}) is that the prior induced on the female allele probability through p_{Af}=(2p_{AAf}+p_{ABf})/2 becomes strongly unimodal with mode at 0.5. Instead, our choice of a_{1,i,f}=a_{2,i,f}=a_{3,i,f}=1/3 leads to a prior distribution for p_{Af} that is a lot closer to the Beta(0.5, 0.5) that is assumed for p_{Am}.
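The prior probabilities that f is larger than 0 quoted above can be checked with a small Monte Carlo sketch. It relies on the fact that f>0 is equivalent to 4p_{AAf}p_{BBf}>p_{ABf}^{2}, a comparison that is unaffected by normalization, so that independent Gamma draws can be used directly in place of Dirichlet draws. The function name, the seed and the number of draws are illustrative assumptions.

```python
import random


def prob_f_positive(alpha, n_draws, rng):
    """Monte Carlo estimate of P(f > 0) under a Dirichlet(alpha) genotype prior."""
    hits = 0
    for _ in range(n_draws):
        # Unnormalized Dirichlet draw: independent Gamma(alpha_j, 1) variables.
        g_aa, g_ab, g_bb = (rng.gammavariate(a, 1.0) for a in alpha)
        # f > 0 exactly when heterozygotes are deficient: 4 p_AA p_BB > p_AB^2.
        if 4 * g_aa * g_bb > g_ab ** 2:
            hits += 1
    return hits / n_draws


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for reproducibility only
    print("reference prior:", prob_f_positive((1 / 3, 1 / 3, 1 / 3), 50000, rng))
    print("uniform prior:  ", prob_f_positive((1.0, 1.0, 1.0), 50000, rng))
```

For the uniform prior the estimate should be close to 2/3, which is the fraction of the genotype simplex lying below the HW parabola.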
Note though that either one of these two choices of values for (a_{1,i,f}, a_{2,i,f}, a_{3,i,f}) leads to posterior distributions that are very similar, because they are both a lot less informative than the data that one typically obtains in these settings.
Alternative ways of choosing prior distributions for HW equilibrium under the usual autosomal data can be found in Lindley (1988), Shoemaker et al. (1998), Consonni et al. (2008) and Wakefield (2010). All their proposals could be adapted to our X-chromosomal marker setting, but if one chose these priors to have a small effective sample size, they would make a small difference at a considerable extra computational cost, because they do not lead to closed form expressions for the posterior probabilities described next.
Bayesian model selection
The Bayesian way to select a model is through the posterior probability of each model, P(M_{i}|y), which is the probability that the M_{i} model is the one generating the data, y=(n_{AAf}, n_{ABf}, n_{BBf}, n_{Am}, n_{Bm}), assessed after the data have been observed. It can be computed by using Bayes theorem:
P(M_{i}|y)=P(y|M_{i})P(M_{i})/Σ_{j=0}^{3}P(y|M_{j})P(M_{j}), (3.2)
where P(M_{i}) is the prior probability assigned to M_{i} (that is, the probability that this model is correct, assessed before the data are available), and where P(y|M_{i}) is the marginal likelihood of M_{i}. If all models are considered equally likely a priori, as will be assumed in the ‘Examples’ section, the larger P(y|M_{i}), the more attractive M_{i} will be.
Most often, computing P(y|M_{i}) exactly is too complicated, and the marginal likelihoods need to be estimated through the Markov chain Monte Carlo simulations used to update the model. In our Binomial/Multinomial setting with Beta/Dirichlet priors though, there are closed form expressions for P(y|M_{i}), which allow one to either compute these marginal likelihoods exactly, in the case of Models 0, 2 and 3, or to evaluate them numerically in the case of Model 1. The expressions for the marginal likelihoods, P(y|M_{i}), under our choice of prior distribution can be found in Appendix 1; they allow one to compute the posterior probabilities on the model space, P(M_{i}|y), exactly through equation (3.2).
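The computation can be sketched as follows. The expressions of Appendix 1 are not reproduced here; instead, this sketch derives the marginal likelihoods directly from the Binomial/Multinomial sampling model and the Beta/Dirichlet priors described above, using the standard conjugate integrals. For Model 1 it assumes a finite binomial expansion of the male likelihood term, which is one possible way of evaluating that integral and not necessarily the one used in the paper. All function names are illustrative.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_mv_beta(alpha):
    """Log of the Dirichlet normalizing constant."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def log_choose(n, k):
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_sum_exp(values):
    top = max(values)
    return top + log(sum(exp(v - top) for v in values))


def log_marginals(y, a=(1 / 3, 1 / 3, 1 / 3), b=(0.5, 0.5)):
    """Log marginal likelihoods of M0..M3 for y = (nAAf, nABf, nBBf, nAm, nBm)."""
    n_aa, n_ab, n_bb, n_am, n_bm = y
    # Combinatorial constant shared by the four models.
    const = (lgamma(n_aa + n_ab + n_bb + 1) - lgamma(n_aa + 1)
             - lgamma(n_ab + 1) - lgamma(n_bb + 1) + log_choose(n_am + n_bm, n_am))
    het = n_ab * log(2)
    males = log_beta(b[0] + n_am, b[1] + n_bm) - log_beta(*b)
    females_hw = (het + log_beta(b[0] + 2 * n_aa + n_ab, b[1] + 2 * n_bb + n_ab)
                  - log_beta(*b))
    females_free = log_mv_beta((a[0] + n_aa, a[1] + n_ab, a[2] + n_bb)) - log_mv_beta(a)
    # M0: one allele frequency shared by females (HW proportions) and males.
    m0 = (het + log_beta(b[0] + 2 * n_aa + n_ab + n_am, b[1] + 2 * n_bb + n_ab + n_bm)
          - log_beta(*b))
    # M1: expand (pAA + pAB/2)^nAm (pBB + pAB/2)^nBm into Dirichlet integrals.
    terms = []
    for k in range(n_am + 1):
        for m in range(n_bm + 1):
            extra = (n_am - k) + (n_bm - m)
            terms.append(log_choose(n_am, k) + log_choose(n_bm, m) - extra * log(2)
                         + log_mv_beta((a[0] + n_aa + k, a[1] + n_ab + extra,
                                        a[2] + n_bb + m)))
    m1 = log_sum_exp(terms) - log_mv_beta(a)
    # M2: HW females, unrelated male frequency. M3: saturated model.
    return [const + v for v in (m0, m1, females_hw + males, females_free + males)]


def posterior_model_probs(y, prior=(0.25, 0.25, 0.25, 0.25), **kwargs):
    """Posterior probabilities of M0..M3 through Bayes theorem."""
    logs = [lm + log(p) for lm, p in zip(log_marginals(y, **kwargs), prior)]
    norm = log_sum_exp(logs)
    return [exp(v - norm) for v in logs]
```

For example, `posterior_model_probs((12, 24, 12, 28, 28))` evaluates hypothetical counts with females in exact HW proportions and equal allele frequencies in the two sexes. Working on the log scale avoids underflow for large samples.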
To assess the strength of evidence in favor of or against a given model, M_{i}, one sometimes resorts to the corresponding Bayes factor, BF_{i}, which is the ratio of the posterior odds and the prior odds for that model. When all four models are considered equally likely a priori, BF_{i}=3P(M_{i}|y)/(1−P(M_{i}|y)). One usually considers that a log_{10}(BF_{i}) that takes a value between 0.5 and 1 indicates that the strength of evidence in favor of M_{i} is substantial, when its value is between 1 and 1.5 it is strong, when it is between 1.5 and 2 it is very strong, and when it is larger than 2 the evidence in favor of M_{i} is considered to be decisive.
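As a small worked illustration, the Bayes factor and its verbal interpretation can be computed from a posterior model probability as follows. The function names are illustrative, and the label used below the 0.5 threshold is an assumption, since the scale above only names the categories from 0.5 upwards.

```python
from math import log10


def bayes_factor(posterior_prob, n_models=4):
    """Posterior odds over prior odds, with equal prior model probabilities."""
    prior = 1.0 / n_models
    return (posterior_prob / (1 - posterior_prob)) / (prior / (1 - prior))


def evidence_label(bf):
    """Verbal scale for log10(BF) as described in the text."""
    x = log10(bf)
    if x > 2:
        return "decisive"
    if x > 1.5:
        return "very strong"
    if x > 1:
        return "strong"
    if x > 0.5:
        return "substantial"
    return "not substantial"  # label assumed; not named in the text


if __name__ == "__main__":
    for p in (0.748, 0.803):
        bf = bayes_factor(p)
        print(p, round(bf, 2), round(log10(bf), 2), evidence_label(bf))
```

With four equally likely models, a posterior probability of 0.25 gives a Bayes factor of 1, that is, no change with respect to the prior.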
Simulation assessment of the Bayesian test
To assess the performance of this Bayesian test for HW equilibrium, we apply it to a wide set of known scenarios in an extensive simulation study. In particular, the test is tried on SNPs from populations with inbreeding coefficients, f, taking values over their whole range, and with a ratio of male to female allele frequencies, d, ranging between 0.5 and 2. In total, we have considered 625 different pairs of values for (f, d), and for each pair we have checked the performance of the test on populations with p_{Af}=0.2 and 0.4, assuming samples with n_{f}=n_{m}=500 and with n_{f}=n_{m}=2000.
For each one of the 2500 values of (f, d, p_{Af}, n_{f}) considered, we have simulated 1000 independent SNPs with sample size n=n_{f}+n_{m} from a population with the corresponding values of (f, d, p_{Af}), and we have computed P(M_{i}|y) for i=0,1,2,3 for each one of the samples.
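The data-generating step of such a simulation can be sketched as below, assuming the inbreeding parametrization of the female genotype frequencies and p_{Am}=d·p_{Af}. The function name, the rounding tolerance and the seed are illustrative assumptions; the grid of (f, d) values used in the study is not reproduced.

```python
import random


def simulate_snp(f, d, p_af, n_f, n_m, rng):
    """Draw one sample y = (nAAf, nABf, nBBf, nAm, nBm) for given (f, d, p_af)."""
    q = 1.0 - p_af
    probs = [p_af ** 2 + f * p_af * q, 2 * p_af * q * (1 - f), q ** 2 + f * p_af * q]
    p_am = d * p_af
    if min(probs) < -1e-12 or not 0.0 <= p_am <= 1.0:
        raise ValueError("(f, d, p_af) is outside the admissible parameter space")
    probs = [max(p, 0.0) for p in probs]  # remove tiny negative rounding errors
    # Female genotypes: multinomial draw. Male alleles: binomial draw.
    females = rng.choices((0, 1, 2), weights=probs, k=n_f)
    n_am = sum(rng.random() < p_am for _ in range(n_m))
    return (females.count(0), females.count(1), females.count(2), n_am, n_m - n_am)


if __name__ == "__main__":
    rng = random.Random(11)  # arbitrary seed
    print(simulate_snp(f=0.1, d=1.5, p_af=0.2, n_f=500, n_m=500, rng=rng))
```

Repeating the draw for each (f, d, p_{Af}, n_{f}) combination and averaging the resulting posterior model probabilities gives estimates of the kind summarized in Figure 2.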
Figure 2 presents the contour plots for the average of all the values of P(M_{i}|y) obtained, as a function of (f, d) for the four combinations of (p_{Af}, n_{f}) considered. This average estimates the expected value of P(M_{i}|y) for each given (f, d, p_{Af}, n_{f}). As desirable, the expected value of P(M_{i}|y) peaks on the region of the (f, d) space where the corresponding M_{i} model holds true. One also observes that the larger the sample size n, and/or the larger p_{Af}, the more peaked the expected value of P(M_{i}|y) is as a function of (f, d), and hence the better this Bayesian test works.
Examples
To present applications of the Bayesian approach to testing for HW equilibrium advocated in this paper, we analyze individual markers (see the ‘Test on four individual SNPs’ section) and groups of markers (see the ‘Simultaneous analysis of multiple X-chromosomal SNPs’ section) of the Japanese population of the 1000 Genomes project, consisting of n_{m}=56 males and n_{f}=48 females. We also explain how one can take into account the multiple testing effect through posterior predictive checks when assessing the HW equilibrium hypothesis based on the simultaneous analysis of a large number of SNPs (see the ‘Multiple testing and the assessment of HW equilibrium’ section).
Test on four individual SNPs
In order to compare the Bayesian test proposed here for HW equilibrium at biallelic genetic markers on the X chromosome with the tests proposed in the context of a frequentist approach, we report the posterior probabilities of the four possible scenarios together with the P-values of the exact tests for four example SNPs in Table 1. Exact tests were performed with and without the data on males using the methods proposed by Graffelman and Weir (2016). The posterior probabilities are computed through equation (3.2), assuming equal prior probabilities for the four models, and hence P(M_{i})=1/4, and using the expressions for the marginal likelihoods, P(y|M_{i}), in Appendix 1 with a_{j,i,f}=1/3 for the Dirichlet prior and b_{j,i}=1/2 for the Beta priors. Given that each one of these priors corresponds to an effective sample size of only one and the data involve a sample size of n=104, the role played by the prior distribution is negligible. Sample sizes will most often be larger than in this example, and hence in practice the choice of a prior will most often be even less relevant.
The first marker in Table 1, rs13440889, has a posterior probability of 0.748 of being in HW equilibrium, and hence one rejects the three disequilibrium scenarios, with posterior probabilities of 0.126 or smaller. The corresponding Bayes factor indicates that the evidence in favor of being in HW equilibrium here is substantial. This is consistent with the nonsignificant exact test for HWE (P=0.954). The second marker in Table 1 has a posterior probability of only 0.072 of being in HW equilibrium, but it has instead a posterior probability of 0.803 of being in the first disequilibrium scenario, with d=1 and f≠0, and hence one settles on M_{1} for that marker. Here the Bayes factor indicates that the evidence in favor of M_{1} is strong. In this case, choosing M_{1} is in agreement with the frequentist exact test rejecting HW proportions in females (P=0.004).
For the third marker in Table 1, HW equilibrium is also rejected, because it has a posterior probability of only 0.092, and one settles on the second disequilibrium scenario, with d≠1 and with f=0. Note that in this case, the frequentist tests do reject HWE overall, but do not reject HW proportions for females (P=0.760). For the last marker in Table 1, the most probable scenario is clearly the third disequilibrium scenario, with d≠1 and f≠0. Here BF_{3} indicates that the evidence in favor of M_{3} is decisive. The frequentist tests reject equilibrium (P<0.0005), but the difference in allele frequencies goes unnoticed.
In Figure 3, one has the set of marginal posterior distributions for the marker in the second row in Table 1, SNP rs2301322. These marginal posteriors are computed assuming the full Model 3, in the way described in Appendix 2. The first row presents the marginal posterior for female genotype frequencies, the second row presents the marginal posterior for male and female allele frequencies and for their ratio, while the third row presents the marginal posterior for the inbreeding coefficient as well as the joint posterior for allele frequencies and for (f, d).
Figure 3 also presents 90% highest posterior density (hpd) credible intervals/regions for all these parameter values or pairs of parameter values. The marginal posterior for f in Figure 3, for example, places almost all its probability mass away from f=0, with the 90% hpd posterior credible interval being (0.199, 0.683). Instead, the marginal posterior for d places d=1 well inside its 90% hpd posterior credible interval, which is (0.853, 1.172). The results clearly show that females are out of HW proportions, but that equality of male and female allele frequencies is a tenable supposition.
Note that, unlike confidence regions, Bayesian credible regions are statements about the probability that the actual parameter value for that given SNP falls in a given region, and not about the probability that the region captures the true parameter value under repeated use of these regions on different samples.
Figure 4 presents the marginal posterior distributions for (f, d) for the four SNPs in Table 1, together with their 90% hpd posterior credible regions. The fact that, for example, for the SNP rs13440889, the (0, 1) point falls well inside the 90% posterior credible region is a clear indication that in that case HW equilibrium holds. In the other three examples, the (0, 1) point falls outside the corresponding 90% credible region in three different ways, which are representative of the three different ways in which equilibrium might be broken.
Simultaneous analysis of multiple X-chromosomal SNPs
In this section, we illustrate the Bayesian approach to testing for HW equilibrium of X-chromosomal markers by carrying out the Bayesian test based on the simultaneous analysis of a large set of SNPs selected from the Japanese population of the 1000 Genomes project. The 1000 Genomes project provides genotype information for ~3.5 million variants on the X chromosome. SNPs without rs identifier, SNPs in the pseudoautosomal regions and SNPs with missing values were excluded. X-chromosomal SNPs were linkage disequilibrium pruned with Plink (Purcell et al., 2007) using the independent pairwise option with a sliding window of 50 SNPs and a threshold of R^{2}=0.50, using the Plink instruction plink --bfile JPTChrX --indep-pairwise 50 5 0.50 --ld-xchr 1. The SNPs with small MAF have not been filtered out. This leaves a sample of 162225 SNPs from the whole X chromosome, which is the set used in this subsection.
Figure 5 presents the model with the largest posterior probability for each one of these SNPs, presented in the order in which these SNPs appear on the X chromosome. The white band without SNPs between 58.1 Mb and 63.0 Mb corresponds to the centromere. The presence of consecutive sequences of markers being systematically classified to the same disequilibrium scenario, or to one of the three disequilibrium scenarios, might be an indication of quality control problems in the SNP measurements, or might arise if a pseudoautosomal region is erroneously included in the analysis. Too few SNPs being classified as being in HW equilibrium would also be an indication of either a problem in the measurements or of the fact that the population under scrutiny is actually in disequilibrium.
Model 0, representing HW equilibrium, is the one with the largest posterior probability in 95.27% of all the 162225 SNPs considered, Model 1 is the one with the largest probability in 1.89% of the cases, Model 2 is the one with the largest probability in 2.13% of the cases and Model 3 is the one with the largest probability in 0.71% of the cases. It is known that for SNPs with low MAF, which are abundant in this set of 162225 SNPs, power to detect disequilibrium is low. When one filters out the SNPs with MAF<0.05, one is left with only 52008 SNPs, and the proportion classified as being in equilibrium falls to 89.08%.
The next subsection illustrates how one can assess whether the overall proportions obtained for all 162225 SNPs are compatible with the HW equilibrium model, M_{0}, holding true, in a way that takes into account the multiple testing effect involved.
Multiple testing and the assessment of HW equilibrium
Carrying out the test for HW equilibrium based on the simultaneous analysis of multiple SNPs involves dealing with the multiple testing effect, which requires one to account for the experiment-wise error rate. In the Bayesian context, one approach to that problem is through the use of the false discovery rate and q-values, as described in Storey (2002, 2003), Muller et al. (2006), de Villemereuil et al. (2014) and de Villemereuil and Gaggiotti (2015).
Instead of using the false discovery rate, here a novel approach to address multiple testing in the Bayesian setting is used. The alternative method uses posterior predictive checks to assess whether the proportion of SNPs in the sample of 162225 of the previous section classified to each one of the four different scenarios is consistent with the proportions that would be obtained if the HW equilibrium was actually in place for the population. To estimate the proportions classified under each scenario in a population of SNPs actually in HW equilibrium, simulation from the posterior predictive distribution under M_{0} is used. For a description of the use of posterior predictive checks as a tool to validate models in general, see Chapter 6 of Gelman et al. (2014) or Puig and Ginebra (2014).
To do this simulation exercise, one needs to resort to a sample of SNPs that is smaller than the one used in the previous section, because we need to assume approximate independence between SNPs and because, at this point, the simulation exercise that would need to be done with the larger set of SNPs would take too long. That is why a random subsample of only 1622 SNPs is obtained from the 162225 SNPs used in the previous section to carry out the posterior predictive checks.
It turns out that for the subsample with only 1% of all the SNPs previously used, Model 0 is the one with the largest posterior probability for 95.25% of the SNPs, Model 1 for 2.03%, Model 2 for 2.03% and Model 3 for 0.68%. The second panel in Figure 5 presents the model with the largest probability for each one of these 1622 SNPs.
To assess whether these observed proportions of SNPs being classified as following each one of the models are compatible with the assumption that HW equilibrium is in place, we estimate the posterior predictive distribution and the posterior predictive credible intervals for these four proportions, assuming that the HW equilibrium holds and that the SNPs are independent. This last assumption will be satisfied due to the way in which the smaller subset of only 1622 SNPs was selected from the whole set of SNPs of the X chromosome.
The posterior predictive distribution can be estimated by repeatedly simulating 1622 × 5 tables of ‘data like the one from the Japanese study’ used to test for HW equilibrium, by using the posterior predictive distribution of the data under Model 0. For each one of the simulated tables, one then classifies the 1622 simulated SNPs to one of the four scenarios, based on their P(M_{i}|y), and finds the proportion of SNPs classified into each scenario.
Each one of the tables can be simulated from the posterior predictive distribution by:

1) Simulating 1622 values for p_{Af}, one value for each row of the table, using its posterior distribution under Model 0 which, assuming Beta(b_{1,0}, b_{2,0}) to be the prior, is:
p_{Af} | y ~ Beta(b_{1,0} + 2n_{AAf} + n_{ABf} + n_{Am}, b_{2,0} + 2n_{BBf} + n_{ABf} + n_{Bm}),
which is also the posterior for p_{Am} because under M_{0}, p_{Am}=p_{Af}.

2) For each value of p_{Am}, simulating n_{Am} for that row from a Binomial(n_{m}, p_{Am}), computing n_{Bm}=n_{m}−n_{Am}, and simulating (n_{AAf}, n_{ABf}, n_{BBf}) for that row from a Multinomial(n_{f}, (p_{Af}^{2}, 2p_{Af}(1−p_{Af}), (1−p_{Af})^{2})).

3) For each row of each table, computing P(M_{i}|y) for i=0,1,2,3, and for each simulated table obtaining the proportions of SNPs classified as following each one of the four M_{i}’s based on the largest P(M_{i}|y) for that row.
By repeating this exercise as many times as tables of data one intends to simulate, one obtains the posterior predictive distribution for the proportions of SNPs being classified as M_{i} for i=0,1,2,3, conditioned on the HW equilibrium model, M_{0}, being the correct one. The rows of the new tables are simulated to be independent, which is a realistic assumption when one is analyzing a subset of approximately independent SNPs the way it is done here.
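The simulation of one replicate table can be sketched in Python as follows. This is only an illustrative sketch, not the implementation in the R package: the function names are hypothetical, the prior parameters b1 = b2 = 0.5 are placeholder assumptions, and the classifier that returns the index of the model with the largest posterior probability is left as a user-supplied function because it depends on the marginal likelihoods of Appendix 1.

```python
import random

def simulate_row(y, b1=0.5, b2=0.5, rng=random):
    """Simulate one replicate SNP row under Model 0 (HW equilibrium).

    y = (n_AAf, n_ABf, n_BBf, n_Am, n_Bm) are the observed counts.
    b1, b2 are the Beta prior parameters (0.5 is an assumed placeholder).
    """
    n_aa, n_ab, n_bb, n_am, n_bm = y
    n_f, n_m = n_aa + n_ab + n_bb, n_am + n_bm
    # Step 1: draw the common allele frequency from its Beta posterior.
    p = rng.betavariate(b1 + 2 * n_aa + n_ab + n_am,
                        b2 + 2 * n_bb + n_ab + n_bm)
    # Step 2: male counts are Binomial(n_m, p), drawn as a sum of Bernoullis.
    new_am = sum(rng.random() < p for _ in range(n_m))
    # Step 2: female genotypes are Multinomial with HW probabilities.
    hw_probs = (p * p, 2 * p * (1 - p), (1 - p) ** 2)
    females = rng.choices(range(3), weights=hw_probs, k=n_f)
    new_aa, new_ab, new_bb = (females.count(g) for g in range(3))
    return (new_aa, new_ab, new_bb, new_am, n_m - new_am)

def simulate_table(rows, b1=0.5, b2=0.5, rng=random):
    """Simulate a whole table, one independent replicate per observed row."""
    return [simulate_row(y, b1, b2, rng) for y in rows]

def classified_proportions(table, classify):
    """Step 3: proportion of rows assigned to each of the models 0..3.

    classify(y) is a hypothetical user-supplied function returning the
    index of the model with the largest posterior probability for row y.
    """
    counts = [0, 0, 0, 0]
    for y in table:
        counts[classify(y)] += 1
    return [c / len(table) for c in counts]
```

Calling simulate_table repeatedly and collecting the output of classified_proportions for each replicate gives a sample from the posterior predictive distribution of the four proportions.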
We have carried out this simulation exercise for the subset of 1622 SNPs selected from the Japanese population study by simulating 1000 tables from its posterior predictive distribution. It turns out that if the HW equilibrium is in place, the 90% central posterior predictive credible interval for the percentage of SNPs classified as following M_{0} because their P(M_{0}|y) is the largest is (94.6, 96.4), the 90% credible interval for the percentage of SNPs classified as following M_{1} because P(M_{1}|y) is the largest is (1.4, 2.3), the one for the percentage of SNPs classified as following M_{2} is (1.8, 3.0) and the one for the percentage of SNPs classified as following M_{3} is (0.1, 0.5).
Note that the proportions of SNPs classified into each one of these four models for the subset of markers from the Japanese population of the 1000 Genomes project, which are 95.25, 2.03, 2.03 and 0.68% for M_{0}, M_{1}, M_{2} and M_{3}, fall either well within the corresponding posterior predictive credible intervals or, in the case of M_{3}, very close to the interval. The fact that the observed percentages fall within the posterior predictive intervals generated under the HWE assumption suggests that the X-chromosomal markers without missing values of the LD-pruned database are in equilibrium, with only a slight excess of markers in scenario M_{3}. By using posterior predictive checks involving all 1622 SNPs at once, instead of doing it one SNP at a time, one already takes into account the experimentwise error rate, and one does not have to correct for the fact that one carries out multiple tests.
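A central 90% interval of this kind can be read off the simulated proportions as the 5th and 95th empirical percentiles. The following minimal Python sketch illustrates that step; the function name is hypothetical and the interpolation rule (the 'inclusive' method of the standard library) is an assumption, since the text does not state how the quantiles were computed.

```python
import statistics

def central_90_interval(draws):
    """Central 90% interval from simulated draws of one proportion.

    Uses the 5th and 95th empirical percentiles; the interpolation
    rule is an assumption, not taken from the paper.
    """
    cuts = statistics.quantiles(draws, n=20, method="inclusive")
    return cuts[0], cuts[-1]
```

Applied to the 1000 simulated values of the proportion classified as M_{i}, this returns one interval per model.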
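The comparison of the observed percentages with the posterior predictive intervals can be written out explicitly. The short Python sketch below uses only the figures reported in this section; the variable and function names are illustrative.

```python
# Observed percentages of SNPs classified to each model (subsample of 1622 SNPs).
observed = {"M0": 95.25, "M1": 2.03, "M2": 2.03, "M3": 0.68}

# 90% central posterior predictive intervals under HW equilibrium (in %).
intervals = {"M0": (94.6, 96.4), "M1": (1.4, 2.3),
             "M2": (1.8, 3.0), "M3": (0.1, 0.5)}

def inside_interval(observed, intervals):
    """Return, for each model, whether the observed value lies in its interval."""
    return {m: intervals[m][0] <= observed[m] <= intervals[m][1]
            for m in observed}

check = inside_interval(observed, intervals)
```

With these figures the observed percentages for M_{0}, M_{1} and M_{2} lie inside their intervals, whereas the 0.68% for M_{3} lies slightly above the upper limit of 0.5%.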
When one does the same exercise on data from populations that are not in HW equilibrium, the proportion of SNPs that are classified as following M_{0} falls, and some of the other three proportions increase, and they would fall outside of the posterior predictive intervals for these four proportions obtained assuming that the HW equilibrium model was in place. Given that the sample size here is a lot larger than the effective sample size assumed by the priors, carrying out a sensitivity analysis that considers alternative priors of similar effective sample size leads to results which are almost identical to the ones reported here.
Discussion
We have developed a Bayesian method for inference on HW equilibrium for biallelic markers at the X chromosome. Disequilibrium at the X chromosome may be due to a difference in allele frequencies between the sexes, to female genotype frequencies deviating from HW proportions, or to both factors simultaneously. By computing the posterior probability for each scenario, geneticists can immediately infer the most likely scenario. A similar approach can also be used for the Bayesian analysis of autosomal variants.
The X-chromosomal exact test chooses between HW equilibrium (that is, Model 0) and disequilibrium (that is, Models 1, 2 and 3). In order to precisely determine the disequilibrium scenario with a frequentist approach, several statistical tests are necessary: an exact test with and without males and eventually an exact test for equality of male and female allele frequencies. Instead, by assigning a posterior probability to each one of the four scenarios, with the four probabilities adding up to one, our Bayesian approach provides a simple way of selecting the most probable scenario in the light of the data.
One of the advantages of the Bayesian approach to HW equilibrium testing of X-chromosomal markers is that, on top of yielding posterior probabilities for each one of the four scenarios, it also provides the posterior distribution of the parameters of interest. In Appendix 2, one can find details on that distribution.
Among all the marginal posterior distributions, the one for (f,d) is particularly useful because it helps one assess the degree of departure from HW equilibrium beyond computing the corresponding four posterior probabilities.
For our Bayesian analysis, we have found it convenient to parametrize disequilibrium by using the inbreeding coefficient and the ratio of male to female allele frequencies, using a Dirichlet prior on the genotype frequencies. Alternatively, other disequilibrium measures with priors specified directly on the disequilibrium measures might also be considered.
One side contribution of this manuscript is the suggestion to use posterior predictive checks to deal with multiple testing in the Bayesian framework, as described in the previous section.
From a computational point of view, the χ^{2} test for HWE of X-chromosomal markers is very fast, and it is feasible to do this for a complete X chromosome with 3.5 million markers. An exact test is computationally more demanding due to the presence of factorial calculations and enumeration of possible outcomes. The Bayesian procedures outlined in this paper do not require a Markov chain Monte Carlo implementation, as is usual in most Bayesian applications these days, and that simplifies the computation a lot. If the integration required for the computation of the posterior probability of M_{1} is carried out efficiently, there should not be any problem in using the proposed method for a whole X chromosome.
Further computational savings could be attained by using the fact that many of the 3.5 million markers on the X chromosome are rare variants with a low minor allele frequency, and therefore the set of genotype counts will be identical for many SNPs. For markers with identical counts, the HW tests only have to be computed once.
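The saving from identical genotype counts amounts to caching results by count pattern. A minimal Python sketch, with hypothetical function names and with the test itself passed in as an arbitrary function, could look like this:

```python
def evaluate_once_per_pattern(rows, hw_test):
    """Apply hw_test once per distinct count pattern and reuse the result.

    rows is a sequence of (n_AAf, n_ABf, n_BBf, n_Am, n_Bm) counts and
    hw_test is any function of such a tuple (hypothetical placeholder).
    Returns the per-row results and the number of distinct patterns.
    """
    cache = {}
    results = []
    for y in rows:
        key = tuple(y)
        if key not in cache:
            cache[key] = hw_test(key)  # computed only for new patterns
        results.append(cache[key])
    return results, len(cache)
```

The number of calls to hw_test then equals the number of distinct count patterns rather than the number of markers.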
Software
The Bayesian X-chromosomal procedures described in this paper have been programmed in R (R Core Team, 2017) by Xavi Puig, and are made available in version 1.5.8 of the HardyWeinberg package (Graffelman, 2015).
References
Ayres KL, Balding DJ . (1998). Measuring departures from Hardy–Weinberg: a Markov chain Monte Carlo method for estimating the inbreeding coefficient. Heredity 80: 769–777.
Berger JO, Bernardo JM, Sun D . (2015). Overall objective priors. Bayesian Anal 10: 189–221.
Bernardo J, Tomazella V . (2010). Bayesian reference analysis of the Hardy–Weinberg equilibrium. In Chen MH, Dey DK, Muller P, Sun D, Ye K (eds). Frontiers of Statistical Decision Making and Bayesian Analysis, In Honor of James O. Berger. Springer Verlag: New York, NY, USA, pp 31–43.
Consonni G, Gutierrez-Pena E, Veronese P . (2008). Compatible priors for Bayesian model comparison with an application to the Hardy–Weinberg equilibrium model. Test 17: 585–605.
Crow JF, Kimura M . (1970) An Introduction to Population Genetics Theory. Harper & Row Publishers: New York, NY, USA.
Foll M, Gaggiotti O . (2008). A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180: 977–993.
Gautier M . (2014). Using genotyping data to assign markers to their chromosome type and to infer the sex of individuals: a Bayesian model-based classifier. Mol Ecol Resour 14: 1141–1159.
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB . (2014) Bayesian Data Analysis. 3rd Ed. Chapman and Hall: Boca Raton.
Graffelman J . (2015). Exploring diallelic genetic markers: the HardyWeinberg package. J Stat Softw 64: 1–23.
Graffelman J, Weir BS . (2016). Testing for Hardy–Weinberg equilibrium at biallelic genetic markers on the X chromosome. Heredity 116: 558–568.
Graves JA, Wakefield MJ, Toder R . (1998). The origin and evolution of the pseudoautosomal regions of human sex chromosomes. Hum Mol Genet 7: 1991–1996.
Hamilton MB . (2009) Population Genetics. Wiley-Blackwell: Chichester, UK; Hoboken, NJ, USA.
Hartl DL . (1980) Principles of Population Genetics. Sinauer Associates: Sunderland, MA, USA.
Hein J, Schierup MK, Wiuf C . (2005) Gene Genealogies, Variation and Evolution. Oxford University Press: Oxford, UK; New York, NY, USA.
Li CC . (1976) The First Course in Population Genetics. The Boxwood Press: Pacific Grove, CA, USA.
Lindley D . (1988). Statistical inference concerning Hardy–Weinberg equilibrium. In Bernardo J, DeGroot M, Lindley D, Smith A (eds). Bayesian Statistics 3. Oxford University Press: Oxford, pp 307–320.
Morita S, Thall PF, Muller P . (2008). Determining the effective sample size of a parametric prior. Biometrics 64: 595–602.
Muller P, Parmigiani P, Rice K . (2006). FDR and Bayesian multiple comparison rules. Johns Hopkins University, Department of Statistics Working Papers 115.
Pereira C, Rogatko A . (1984). The Hardy–Weinberg equilibrium under a Bayesian perspective. Rev Bras Genet 4: 689–707.
Puig X, Ginebra J . (2014). A Bayesian cluster analysis of election results. J Appl Stat 41: 73–94.
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D et al. (2007). PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet 81: 559–575.
R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria. Available at: https://www.R-project.org/.
Shoemaker J, Painter I, Weir B . (1998). A Bayesian characterization of Hardy–Weinberg disequilibrium. Genetics 149: 2079–2088.
Storey JD . (2002). A direct approach to false discovery rates. J R Stat Soc B 64: 479–498.
Storey JD . (2003). The positive false discovery rate. A Bayesian interpretation and the q-value. Ann Stat 31: 2013–2035.
The 1000 Genomes Project Consortium, Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM et al. (2010). A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073.
The 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM et al. (2015). A global reference for human genetic variation. Nature 526: 68–74.
de Villemereuil P, Frichot E, Bazin E, François O, Gaggiotti OE . (2014). Genome scan methods against more complex models: when and how much should we trust them? Mol Ecol 23: 2006–2019.
de Villemereuil P, Gaggiotti OE . (2015). A new F_{ST}-based method to uncover local adaptation using environmental variables. Methods Ecol Evol 6: 1248–1258.
Wakefield J . (2009). Bayes factors for genome-wide association studies: Comparison with p-values. Genet Epidemiol 33: 79–86.
Wakefield J . (2010). Bayesian methods for examining Hardy–Weinberg equilibrium. Biometrics 66: 257–265.
Wise AL, Gyi L, Manolio TA . (2013). eXclusion: toward integrating the X chromosome in genome-wide association analyses. Am J Hum Genet 92: 643–647.
You XP, Zou QL, Li JL, Zhou JY . (2015). Likelihood ratio test for excess homozygosity at marker loci on X chromosome. PLoS One 10: e0145032.
Zheng G, Joo J, Zhang C, Geller NL . (2007). Testing association for markers on the X chromosome. Genet Epidemiol 31: 834–843.
Acknowledgements
This work was partially supported by Grant 2014SGR551 from the Agència de Gestió d'Ajuts Universitaris i de Recerca (AGAUR) of the Generalitat de Catalunya, by Grants MTM201565016C22R (MINECO/FEDER) and MTM201343992R of the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund, and by Grant R01 GM075091 from the United States National Institutes of Health. The authors are extremely grateful for the comments and suggestions for improvement of the associate editor and two referees.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Appendices
Appendix 1
Marginal likelihoods
Here we present the marginal likelihoods, P(y|M_{i}) for i=0,…,3, needed to compute the posterior probabilities, P(M_{i}|y), through equation (3.2). The priors assumed are the ones described in the ‘Choice of a prior distribution’ section, and y=(n_{AAf}, n_{ABf}, n_{BBf}, n_{Am}, n_{Bm}).
The marginal likelihood under Model 0, under HW equilibrium, is:
The marginal likelihood under Model 1, with d=1 and f≠0, can be computed through:
The marginal likelihood under Model 2, with d≠1 and f=0, is:
Finally, the marginal likelihood under the saturated Model 3 is:
Note that the only model that requires integration is Model 1. However, it can be carried out numerically without any problem because the integration region is compact, and grid size can be set to be as small as needed for the precision required.
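For the two models whose priors are fully specified in this text, the closed-form marginal likelihoods follow from standard Beta-Binomial and Dirichlet-Multinomial conjugacy, and can be sketched in Python on the log scale. This is an illustrative derivation under stated assumptions, not a transcription of the expressions above: Model 0 is taken to have a Beta(b_{1,0}, b_{2,0}) prior on the common allele frequency, Model 3 the Dirichlet and Beta priors of Appendix 2, and the combination into posterior model probabilities is assumed to be Bayes' theorem with prior model probabilities. Models 1 and 2 are not covered, and all function names are hypothetical.

```python
from math import exp, lgamma, log

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_multinomial_coef(counts):
    """Logarithm of the multinomial coefficient n! / (c_1! ... c_k!)."""
    return lgamma(sum(counts) + 1) - sum(lgamma(c + 1) for c in counts)

def log_dirichlet_norm(alpha):
    """Logarithm of the Dirichlet normalising constant."""
    return sum(lgamma(x) for x in alpha) - lgamma(sum(alpha))

def log_ml_m0(y, b1, b2):
    """log P(y | M0): HW proportions in females, common allele frequency."""
    n_aa, n_ab, n_bb, n_am, n_bm = y
    return (log_multinomial_coef((n_aa, n_ab, n_bb))
            + log_multinomial_coef((n_am, n_bm))
            + n_ab * log(2)  # from the heterozygote probability 2p(1-p)
            + log_beta(b1 + 2 * n_aa + n_ab + n_am,
                       b2 + 2 * n_bb + n_ab + n_bm)
            - log_beta(b1, b2))

def log_ml_m3(y, a, b):
    """log P(y | M3): saturated model, Dirichlet prior a, Beta prior b."""
    n_aa, n_ab, n_bb, n_am, n_bm = y
    posterior_a = (a[0] + n_aa, a[1] + n_ab, a[2] + n_bb)
    return (log_multinomial_coef((n_aa, n_ab, n_bb))
            + log_multinomial_coef((n_am, n_bm))
            + log_dirichlet_norm(posterior_a) - log_dirichlet_norm(a)
            + log_beta(b[0] + n_am, b[1] + n_bm) - log_beta(b[0], b[1]))

def posterior_model_probs(log_mls, prior_probs):
    """Normalise marginal likelihoods times prior model probabilities."""
    weights = [l + log(p) for l, p in zip(log_mls, prior_probs)]
    top = max(weights)  # subtract the maximum for numerical stability
    scaled = [exp(w - top) for w in weights]
    total = sum(scaled)
    return [s / total for s in scaled]
```

Working on the log scale avoids overflow in the factorials and Gamma functions for the sample sizes typical of genotype data.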
Appendix 2
Posterior distribution under Model 3
Under the saturated Model 3, (n_{AAf}, n_{ABf}, n_{BBf}) has the Multinomial(n_{f},(p_{AAf}, p_{ABf}, p_{BBf})) distribution and n_{Am} has the Binomial(n_{m}, p_{Am}) distribution. Under the assumption that a priori (p_{AAf}, p_{ABf}, p_{BBf}) is Dirichlet(a_{1,3,f}, a_{2,3,f}, a_{3,3,f}), and p_{Am} is Beta(b_{1,3,m}, b_{2,3,m}), the posterior distribution for (p_{AAf}, p_{ABf}, p_{BBf}) is:
Dirichlet(a_{1,3,f}+n_{AAf}, a_{2,3,f}+n_{ABf}, a_{3,3,f}+n_{BBf}),
independent of the posterior distribution for p_{Am}, which is:
Beta(b_{1,3,m}+n_{Am}, b_{2,3,m}+n_{Bm}).
The marginal posterior distributions for p_{Af}, f and d follow from the ones for (p_{AAf},p_{ABf},p_{BBf}) and for p_{Am}, and they can be easily estimated by simulating large samples of (p_{AAf},p_{ABf},p_{BBf}) and of (p_{Am},p_{Bm}), and computing for each value in the sample the corresponding value of p_{Af}, of f and of d, using equations (2.6), (2.4) and (2.5), respectively.
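The simulation described here can be sketched in Python. Equations (2.4) to (2.6) are not reproduced in this appendix, so the sketch assumes the usual definitions: p_{Af} = p_{AAf} + p_{ABf}/2, f = 1 − p_{ABf}/(2p_{Af}(1−p_{Af})) and d = p_{Am}/p_{Af}. The function name and the default prior parameters are illustrative placeholders, not the values used in the paper.

```python
import random

def sample_posterior_m3(y, a=(1.0, 1.0, 1.0), b=(1.0, 1.0),
                        size=1000, rng=random):
    """Draw (p_Af, f, d) from the posterior under the saturated Model 3.

    y = (n_AAf, n_ABf, n_BBf, n_Am, n_Bm); a and b are the Dirichlet and
    Beta prior parameters (the defaults are assumed placeholders).
    """
    n_aa, n_ab, n_bb, n_am, n_bm = y
    alpha = (a[0] + n_aa, a[1] + n_ab, a[2] + n_bb)
    draws = []
    for _ in range(size):
        # Dirichlet draw obtained by normalising independent Gamma variates.
        g = [rng.gammavariate(x, 1.0) for x in alpha]
        total = sum(g)
        p_aa, p_ab, p_bb = (x / total for x in g)
        # Independent Beta posterior draw for the male allele frequency.
        p_am = rng.betavariate(b[0] + n_am, b[1] + n_bm)
        p_af = p_aa + p_ab / 2                    # female allele frequency
        f = 1 - p_ab / (2 * p_af * (1 - p_af))    # inbreeding coefficient
        d = p_am / p_af                           # male to female ratio
        draws.append((p_af, f, d))
    return draws
```

Histograms or quantiles of the second and third components of the draws then approximate the marginal posteriors of f and d, and the pairs (f, d) approximate their joint posterior.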
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/
About this article
Cite this article
Puig, X., Ginebra, J. & Graffelman, J. A Bayesian test for Hardy–Weinberg equilibrium of biallelic X-chromosomal markers. Heredity 119, 226–236 (2017). https://doi.org/10.1038/hdy.2017.30