Introduction

Recent studies show that the large number of disease-associated variants identified through genome-wide association studies account for only a small portion of the presumed phenotypic variation.1 One of the potential sources of missing heritability is the contribution of rare variants.2, 3, 4, 5, 6, 7 Recent advances in sequencing technology have made it possible to test rare variants directly.8, 9 Therefore, there is increasing interest in detecting associations between rare variants and complex traits.

Recently, several statistical methods to detect associations between rare variants and complex traits have been developed for unrelated individuals. These methods can be roughly divided into three groups: burden tests, quadratic tests, and combined tests. Burden tests include the cohort allelic sums test,10 the combined multivariate and collapsing method,11 the weighted sum statistic (WSS),12 the variable minor allele frequency (MAF) threshold method,13 and the cumulative minor-allele test,14 among others. Burden tests implicitly assume that all the rare variants are causal and that the directions of the effects are all the same. If these assumptions are true, burden tests can be powerful; otherwise, they can perform poorly.15, 16, 17, 18 Quadratic tests include the C-alpha test,19 the sequence kernel association test,15 and the test for Testing the effects of the Optimally Weighted combination of variants (TOW).17 Quadratic tests also include adaptive weighting methods20, 21, 22, 23, 24 since, as pointed out by Derkach et al,18 adaptive weighting methods are operationally similar to quadratic tests. Quadratic tests are robust to the directions of the effects of causal variants and are less affected by neutral variants than burden tests are. If most of the rare variants are causal and the directions of the effects of causal variants are all the same, then burden tests can outperform quadratic tests; otherwise, quadratic tests perform better. To increase the robustness of a test, Derkach et al and Lee et al proposed combined tests that combine information from burden and quadratic tests, aiming to retain the advantages of both.16, 18

All of the aforementioned methods are for unrelated individuals. For any type of study design, the statistical power will be improved if rare variants can be enriched in the samples. If one parent has a copy of a rare allele, half of the offspring are expected to carry it, and hence, variants that are rare in the general population could be very common in certain families.25 Therefore, family-based designs may have an important role in rare variant association studies. More recently, a couple of family-based rare variant association methods for quantitative traits26, 27 and for qualitative traits28, 29 have been developed.

In this article, based on affected sib-pair data, we propose a test for Testing the effects of the Optimally Weighted combination of variants (TOW-sib). TOW-sib is based on the score test for testing the optimally weighted combination of variants derived from the retrospective likelihood of affected sib-pairs, unrelated controls, and possible unrelated cases. The optimal weights are analytically derived and can be calculated from sampled genotypes and phenotypes. Based on the optimal weights, TOW-sib is robust to the directions of the effects of causal variants and is less affected by neutral variants than existing tests are. We use extensive simulation studies to compare the performance of the proposed method with that of existing methods based on unrelated individuals12, 17 and existing methods based on affected sib-pairs.28 Our simulation results show that, in all the cases, the proposed method is substantially more powerful than existing methods based on either unrelated individuals or affected sib-pairs.

Materials and Methods

Consider a sample of ns affected sib-pairs, na unrelated cases, and nc unrelated controls. Each individual has been genotyped at M variants in a genomic region. Denote gji=(gji1,...,gjiM)T, gai=(gai1,…,gaiM)T, and gci=(gci1,…,gciM)T as the genotypes of the jth individual in the ith sib-pair, the ith case, and the ith control, respectively, where gjim, gaim, gcim ∈ {0,1,2} are the numbers of minor alleles. Let xji=Σm wm·gjim, xai=Σm wm·gaim, and xci=Σm wm·gcim (with each sum taken over m=1,…,M) denote the combinations of genotypic scores at the M variants of the jth individual in the ith sib-pair, the ith case, and the ith control, respectively, where w=(w1,...,wM) are weights whose values will be decided later. Denote the disease status of an individual by D, with D=0 indicating an unaffected individual and D=1 indicating an affected individual.

The retrospective likelihood is given by

where g1* and g2* represent all possible genotype pairs for a sib-pair and g* represents all possible genotypes for an individual. Choose g0=(0,…,0) as a baseline genotype. Let r(g) be the relative risk of genotype g to the baseline genotype. Following Schaid,30 we use a log-linear model to model the relative risk, ie, r(g)=exp(βx), with x representing the combination of genotypic scores of the genotype g. Denote the risk of an individual with the baseline genotype as Pr(D=1|g0)=exp(α). Then, the retrospective likelihood is given by

where x1* and x2* represent the combinations of genotypic scores of the genotypes g1* and g2*, respectively, and x* represents the combination of genotypic scores of the genotype g*.

In Appendix A, we have shown that, under the assumption that the M variants are independent (our proposed test is still valid if this assumption is not true), the score test statistic to test the null hypothesis H0:β=0 is given by

where , , and â are the maximum likelihood estimates (MLEs) of pm and under the null hypothesis, pm is the MAF at the mth variant, and . Under the null hypothesis, the likelihood function becomes

Based on L0, the MLE of pm has no explicit expression. Using the joint distribution of genotypes of a sib-pair given by Table 1, we can construct an expectation-maximization algorithm to calculate it (see Appendix B). We cannot estimate α based on L0, because L0 does not contain α. We propose to estimate α based on the full likelihood function

Table 1 The joint distribution of genotypes of a sib-pair

Based on Lfull, the MLE of α under the null hypothesis is . Using this estimate of α, U can be written as . Let , N=6ns+2na+2ncâ2, , and w=(w1,…,wM)T. Then,

T(w1,…,wM) reaches its maximum when w=v−1u. We define the statistic of the test for Testing the effect of an Optimally Weighted combination of variants for sib-pair data (TOW-sib) as
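The maximization step can be illustrated numerically. The sketch below is not the statistic from this article, whose explicit form is not reproduced here. It assumes, for illustration only, that the statistic is a ratio of the form T(w) = (w'u)^2 / (w'Vw) with V symmetric and positive definite. Under that assumption, the Cauchy-Schwarz inequality gives the maximizer w = V^{-1}u and the maximum u'V^{-1}u. The names `score_ratio` and `optimal_weights` are hypothetical.

```python
import numpy as np

def score_ratio(w, u, V):
    """Assumed form T(w) = (w'u)^2 / (w'Vw); an illustration, not the article's formula."""
    return float(w @ u) ** 2 / float(w @ V @ w)

def optimal_weights(u, V):
    """Maximizer of the assumed ratio: w = V^{-1} u (any nonzero multiple also works)."""
    return np.linalg.solve(V, u)

rng = np.random.default_rng(0)
M = 5                                  # number of variants in the toy example
u = rng.normal(size=M)                 # toy score vector
A = rng.normal(size=(M, M))
V = A @ A.T + M * np.eye(M)            # arbitrary symmetric positive-definite matrix

w_opt = optimal_weights(u, V)
t_max = score_ratio(w_opt, u, V)       # equals u' V^{-1} u under the assumed form
print(t_max, float(u @ np.linalg.solve(V, u)))
```

Because the assumed ratio is invariant to rescaling w, only the direction of the weight vector matters; when V is diagonal the weights reduce to um/vm for each variant.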

We use a special permutation test to evaluate P-values of TOW-sib. Each permutation consists of the following steps: (1) Permute the multi-variant genotypes and get the permuted genotypes . (2) In the ith sib-pair, given , we generate variant by variant according to the conditional distribution Pr(g2|g1) from Table 1. (3) Calculate , the value of TTOW−sib based on the permuted genotypes , , and . We generate under the assumption that the M variants are independent. When the M variants are in linkage disequilibrium (LD), TTOW−sib and may have different variances, although they have the same mean. To make TTOW−sib and have the same mean and the same variance, we standardize TTOW−sib as TTOW−sib−ST=(TTOW−sib − μTOW−sib)/σTOW−sib, where μTOW−sib and are the estimates of the mean and variance of TTOW−sib (see Appendix C on how to calculate μTOW−sib and ). Suppose we perform B permutations. Let denote the value of TTOW−sib−ST based on the data of the bth permutation (b=0 denotes the original data). Then, the P-value of the test is given by .
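Step (2) can be sketched in code. Table 1 is not reproduced here, so the sketch derives the joint distribution of sib-pair genotypes at one variant from first principles. It assumes Hardy-Weinberg equilibrium and random mating in the parents and Mendelian transmission to each child, which may or may not match the exact parameterization of Table 1. The function names are hypothetical.

```python
from itertools import product
import random

def sib_pair_joint(p):
    """Pr(g1, g2) for two full sibs at one variant with minor allele frequency p.

    Assumes the four parental alleles are independent draws (Hardy-Weinberg,
    random mating) and each child receives one random allele from each parent.
    """
    joint = {(a, b): 0.0 for a in range(3) for b in range(3)}
    for f1, f2, m1, m2 in product((0, 1), repeat=4):       # parental alleles
        n_minor = f1 + f2 + m1 + m2
        pr_parents = p ** n_minor * (1 - p) ** (4 - n_minor)
        father, mother = (f1, f2), (m1, m2)
        for i, j, k, l in product((0, 1), repeat=4):       # transmitted allele indices
            g1 = father[i] + mother[j]
            g2 = father[k] + mother[l]
            joint[(g1, g2)] += pr_parents / 16.0
    return joint

def conditional_given_first(joint, g1):
    """Pr(g2 | g1) as a list indexed by g2 = 0, 1, 2."""
    row = [joint[(g1, g2)] for g2 in range(3)]
    total = sum(row)
    return [x / total for x in row]

def sample_second_sib(g1_vector, mafs, rng):
    """Draw the second sib's genotype variant by variant, treating variants as independent."""
    out = []
    for g1, p in zip(g1_vector, mafs):
        probs = conditional_given_first(sib_pair_joint(p), g1)
        out.append(rng.choices((0, 1, 2), weights=probs)[0])
    return out

rng = random.Random(0)
print(sample_second_sib([0, 1, 2], [0.01, 0.05, 0.3], rng))
```

Under these assumptions the probability that both sibs carry two minor alleles is p^2(1+p)^2/4, which is larger than the p^4 expected for two unrelated individuals and reflects the enrichment of shared alleles within families.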

For a simulation study with R replicates, the above procedure will be rather computationally expensive. In our simulation studies, we use the pooling permutation method proposed by Guo and Lin to evaluate P-values.31 In the pooling permutation method, permuted samples from all the replicates are pooled together to form a joint sample from the null distribution. Suppose that we have R replicates and we perform B permutations for each replicate. Let TTOW−sib−ST(b,r) denote the value of TTOW−sib−ST based on data of the bth permutation of the rth replicate (b=0 denotes the original data). Then, the P-value of the test in the rth replicate is given by

As the permutation samples are pooled across all replicates to form a sample from the null distribution, B can be set much smaller than when only one sample is analyzed.
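The pooling idea can be sketched as follows. The exact P-value formula is not reproduced here, so the sketch assumes one common convention: the P-value for replicate r is the fraction of all B×R pooled permuted statistics that are at least as large as the observed statistic of replicate r. The function name `pooled_permutation_pvalues` and the array layout are hypothetical.

```python
import numpy as np

def pooled_permutation_pvalues(stats):
    """P-values from a pooled null sample.

    stats has shape (B + 1, R): row 0 holds the observed standardized statistic
    of each replicate, and rows 1..B hold the permuted statistics.
    Assumed convention: p_r = #{pooled permuted values >= observed_r} / (B * R).
    """
    stats = np.asarray(stats, dtype=float)
    observed = stats[0]
    pooled_null = np.sort(stats[1:].ravel())
    # searchsorted(side="left") counts pooled values strictly below each observed value
    n_below = np.searchsorted(pooled_null, observed, side="left")
    return (pooled_null.size - n_below) / pooled_null.size

# Toy input: R = 2 replicates, B = 2 permutations each
toy = [[2.0, 0.0],
       [1.0, 3.0],
       [0.5, -1.0]]
print(pooled_permutation_pvalues(toy))
```

With R replicates the null sample has B×R values, so the resolution of the P-value is 1/(B×R) instead of 1/B, which is why a small B per replicate can still resolve small significance levels.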

We compare the performance of the proposed method with three existing methods: WSS,12 sibpair-based weighted sum statistic (SPWSS),28 and TOW.17 WSS and TOW are based on unrelated cases and controls, whereas SPWSS is based on affected sib-pairs, unrelated cases, and unrelated controls.

Simulation

The empirical Mini-Exome genotype data provided by the Genetic Analysis Workshop 17 are used for simulation studies. This data set contains genotypes of 697 unrelated individuals on 3205 genes. The genotypes of the Genetic Analysis Workshop 17 data set are extracted from the sequence alignment files provided by the 1000 Genomes Project for their pilot 3 study (http://www.1000genomes.org). We choose four genes: ELAVL4 (gene1), MSH4 (gene2), PDE4B (gene3), and ADAMTS4 (gene4) with 10, 20, 30, and 40 variants, respectively. We merge the four genes to form a super gene (Sgene) with 100 variants, of which 86 are rare (MAF<0.01) and 14 are common (MAF≥0.01). We choose Sgene because the distributions of MAFs in the 100 variants in Sgene and in the 24 487 variants in all the 3205 genes are very similar.17 In our simulation studies, we generate genotypes based on the genotypes of the 697 individuals in Sgene. We use the program fastPHASE to infer haplotypic phase for the 697 individuals and calculate haplotype frequencies.32 To generate the genotype of an individual, we generate two haplotypes according to the haplotype frequencies. To obtain the genotypes of a family, we first generate genotypes of the parents. Then the genotypes of the children are generated from parental haplotypes by random transmission. To generate a qualitative disease affection status, we use a liability threshold model based on a continuous phenotype (quantitative trait). An individual is defined to be affected if the individual’s phenotype is at least one standard deviation larger than the phenotypic mean. This yields a prevalence of 16% for the simulated disease in the general population. In the following, we describe how to generate a quantitative trait.
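The genotype-generation steps and the stated prevalence can be sketched as follows. This is a minimal illustration, not the simulation code used in the study. The toy haplotypes and frequencies are invented, transmission is assumed to pass one whole parental haplotype without recombination, and all function names are hypothetical. The prevalence check uses the upper tail of a standard normal liability above one standard deviation.

```python
import math
import numpy as np

def prevalence_above(threshold_sd):
    """P(Z > threshold_sd) for a standard normal liability."""
    return 0.5 * math.erfc(threshold_sd / math.sqrt(2.0))

def draw_haplotype_pair(haplotypes, freqs, rng):
    """Draw two haplotypes independently according to the haplotype frequencies."""
    idx = rng.choice(len(haplotypes), size=2, p=freqs)
    return haplotypes[idx]                      # array of shape (2, M)

def transmit(parent_pair, rng):
    """Pass on one of the two parental haplotypes at random (no recombination assumed)."""
    return parent_pair[rng.integers(2)]

def child_genotype(father_pair, mother_pair, rng):
    """Minor-allele counts of a child: one transmitted haplotype from each parent."""
    return transmit(father_pair, rng) + transmit(mother_pair, rng)

# Toy haplotype pool over M = 3 variants (invented values, for illustration only)
haplotypes = np.array([[0, 0, 0],
                       [1, 0, 0],
                       [0, 1, 1]])
freqs = np.array([0.90, 0.06, 0.04])

rng = np.random.default_rng(0)
father = draw_haplotype_pair(haplotypes, freqs, rng)
mother = draw_haplotype_pair(haplotypes, freqs, rng)
sibs = [child_genotype(father, mother, rng) for _ in range(2)]
print(sibs, round(prevalence_above(1.0), 4))
```

The tail probability above one standard deviation is about 0.159, which agrees with the 16% prevalence quoted in the text.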

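Under the null hypothesis, we generate trait values for unrelated individuals according to the standard normal distribution. For a family with m children, let Y1=(yF,yM) and Y2=(y1,y2,…,ym) denote the trait values of the parents and the m children in a family, respectively. Assume that (Y1,Y2) follows a multivariate normal distribution with a mean vector of zero and variance-covariance matrix

$$\Sigma=\begin{pmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{12}^{T} & \Sigma_{22}\end{pmatrix},$$

where $\Sigma_{11}=I_{2}$ is the 2×2 identity matrix, $\Sigma_{12}=\rho\,\mathbf{1}_{2\times m}$ is the 2×m matrix with every entry equal to ρ, and $\Sigma_{22}=(1-\rho)I_{m}+\rho\,\mathbf{1}_{m\times m}$ is the m×m matrix with ones on the diagonal and ρ elsewhere.

This variance-covariance matrix indicates that the parents in each family are independent, and that the correlation coefficient between a parent and a child or between two children is a constant, ρ (in this study, ρ=0.2). To generate trait values of all members in each family, we first generate the trait values of the parents from a standard normal distribution. Then, the trait values of the children are generated from a multivariate normal distribution with mean vector $\Sigma_{12}^{T}\Sigma_{11}^{-1}Y_{1}^{T}$ and variance-covariance matrix $\Sigma_{22}-\Sigma_{12}^{T}\Sigma_{11}^{-1}\Sigma_{12}$.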
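A minimal sketch of this two-stage generation is given below. It assumes only the covariance structure described in words above: unit variances, independent parents, and a common correlation ρ between a parent and a child and between two children. With that structure, the standard conditional-normal formulas give each child a conditional mean of ρ(yF+yM), a conditional variance of 1−2ρ², and a conditional covariance of ρ−2ρ² between two children. The function names are hypothetical.

```python
import numpy as np

def conditional_child_cov(rho, m):
    """Covariance of the m children's traits given both parents' traits.

    Derived from unit variances, independent parents, and correlation rho for
    every parent-child and child-child pair: (1 - rho) I + (rho - 2 rho^2) J.
    """
    return (1.0 - rho) * np.eye(m) + (rho - 2.0 * rho ** 2) * np.ones((m, m))

def simulate_family_traits(rho, m, rng):
    """Draw parental traits first, then the children's traits conditionally."""
    parents = rng.standard_normal(2)                   # (yF, yM), independent N(0, 1)
    cond_mean = np.full(m, rho * parents.sum())        # rho * (yF + yM) for each child
    children = rng.multivariate_normal(cond_mean, conditional_child_cov(rho, m))
    return parents, children

rng = np.random.default_rng(0)
parents, children = simulate_family_traits(rho=0.2, m=2, rng=rng)
print(conditional_child_cov(0.2, 2))
print(parents, children)
```

For ρ=0.2 and two children the conditional covariance has 0.92 on the diagonal and 0.12 off the diagonal, so it is positive definite and the conditional draw is well defined.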

Under the alternative hypothesis, we choose ncau rare variants (MAF<1%) as causal variants. The value of ncau is determined by pcau, the percentage of causal variants in rare variants. Let pp denote the percentage of protective variants in causal variants, then the number of protective variants and the number of risk variants are np=ncau·pp and nr=ncau·(1−pp), respectively. For the jth member in the ith family, let and denote the genotypic scores of the risk variant and the protective variant, respectively. Assume that all causal variants have the same heritability. Then the disease model is given by , where and are coefficients and their values depend on the total heritability, and ɛij is the trait value under the null hypothesis.

To generate affected sib-pairs, we generate families with two children. We keep generating families with two children until we have generated enough families with two affected children.

Results

In simulation studies, P-values are estimated using a pooling permutation method in which permuted samples from all the replicates are pooled together to form a joint sample from the null distribution.31 In each replicate, we perform 20 permutations. Type I error rates are evaluated using 10 000 replicated samples, whereas powers are evaluated using 500 replicated samples.

For type I error evaluation, we consider different haplotype structures (different genes), different sample sizes, different designs, and different significance levels. For 10 000 replicated samples, the 95% confidence intervals for type I error rates of nominal levels 0.05, 0.01, and 0.001 are (0.046, 0.054), (0.008, 0.012), and (0.0004, 0.0016), respectively. The estimated type I error rates of the proposed test are summarized in Tables 2 and 3. As shown by these tables, all the estimated type I error rates are within the 95% confidence intervals, which indicates that the proposed test is valid.
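The quoted intervals can be checked with a short calculation. The sketch below assumes they come from the usual normal approximation to the binomial proportion, that is, nominal level ± 1.96·sqrt(level·(1−level)/n) with n=10 000 replicates. The function name is hypothetical.

```python
import math

def type1_error_interval(level, n_replicates, z=1.96):
    """Normal-approximation 95% interval for an estimated type I error rate."""
    half_width = z * math.sqrt(level * (1.0 - level) / n_replicates)
    return level - half_width, level + half_width

for level in (0.05, 0.01, 0.001):
    low, high = type1_error_interval(level, 10_000)
    print(level, round(low, 4), round(high, 4))
```

Rounded to the precision used in the text, this gives (0.046, 0.054), (0.008, 0.012), and (0.0004, 0.0016), in agreement with the intervals stated above.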

Table 2 Estimated type I error rates of TOW-sib for the design of affected sib-pairs and unrelated controls based on 10 000 replicated samples
Table 3 Estimated type I error rates of TOW-sib for the design of affected sib-pairs, unrelated cases, and unrelated controls based on 10 000 replicated samples

Figure 1 shows power as a function of the number of affected sib-pairs for a fixed total number of cases and a fixed total number of individuals. The power of TOW-sib increases with the number of affected sib-pairs. As the number of affected sib-pairs increases, the power of SPWSS increases if the number of affected sib-pairs is less than 20% of the total number of cases and decreases otherwise. Therefore, in the following discussion, the number of affected sib-pairs equals half of the total number of cases in the design for TOW-sib and 20% of the total number of cases in the design for SPWSS. The powers of TOW and WSS do not depend on the number of affected sib-pairs. In almost all the cases, TOW-sib is the most powerful test. When the percentage of causal variants is small (10%), SPWSS is more powerful than TOW and WSS if the number of affected sib-pairs is between 10 and 45% of the total number of cases. When the percentage of causal variants is large (50%), SPWSS is the least powerful test.

Figure 1

Power of the four tests as a function of the number of affected sib-pairs. TOW and WSS are based on 1000 unrelated cases and 1000 unrelated controls. For TOW-sib and SPWSS, the sample size is 2000, where the number of unrelated controls is 1000 and the number of unrelated cases plus twice the number of affected sib-pairs is 1000. Total heritability is 0.03. pcau denotes the percentage of causal variants in rare variants; pp denotes the percentage of protective variants in causal variants. The power is evaluated at a significance level of 0.001.

Figures 2 and 3 show power as a function of heritability and as a function of the percentage of protective variants, respectively. TOW-sib is the most powerful test in all the cases. When the percentage of causal variants is small (10%), SPWSS is more powerful than TOW and WSS. When the percentage of causal variants is large (50%), SPWSS and TOW have similar power; both are less powerful than WSS if the percentage of protective variants is small and more powerful than WSS if the percentage of protective variants is large.

Figure 2

Powers as a function of heritability. TOW and WSS are based on 1000 unrelated cases and 1000 unrelated controls. SPWSS is based on 1000 unrelated controls, 600 unrelated cases, and 200 affected sib-pairs. TOW-sib is based on 1000 unrelated controls and 500 affected sib-pairs. pcau denotes the percentage of causal variants in rare variants; pp denotes the percentage of protective variants in causal variants. The power is evaluated at a significance level of 0.001.

Figure 3

Powers as a function of percentage of protective variants. TOW and WSS are based on 1000 unrelated cases and 1000 unrelated controls. SPWSS is based on 1000 unrelated controls, 600 unrelated cases, and 200 affected sib-pairs. TOW-sib is based on 1000 unrelated controls and 500 affected sib-pairs. pcau denotes the percentage of causal variants in rare variants; herit denotes the total heritability. The power is evaluated at a significance level of 0.001.

Figure 4 shows power as a function of the percentage of causal variants. TOW-sib is the most powerful test in all the cases, and its power is not affected much by the percentage of causal variants. As the percentage of causal variants increases, the powers of WSS and TOW increase, whereas the power of SPWSS decreases. The increase for WSS and TOW is expected, because a larger percentage of causal variants (equivalently, a smaller percentage of neutral variants) means a lower noise level. The decrease in the power of SPWSS is probably because it is easier to estimate weights when the percentage of causal variants is smaller. We also conducted a set of simulations to compare the powers for different values of ρ. The results (Supplementary Figure 1) show that the power comparisons have similar patterns for different values of ρ.

Figure 4

Powers as a function of percentage of causal variants. TOW and WSS are based on 1000 unrelated cases and 1000 unrelated controls. SPWSS is based on 1000 unrelated controls, 600 unrelated cases, and 200 affected sib-pairs. TOW-sib is based on 1000 unrelated controls and 500 affected sib-pairs. pp denotes the percentage of protective variants in causal variants; herit denotes the total heritability. The power is evaluated at a significance level of 0.001.

In summary, TOW-sib is the most powerful test in all the cases. Among the other three tests (WSS, SPWSS, and TOW), none is consistently more powerful than the other two.

Discussion

There is increasing interest in detecting associations between rare variants and complex traits. Recently, several statistical methods for detecting rare variant associations by jointly considering multiple variants in a genomic region have been developed for unrelated individuals. However, statistical methods for detecting rare variant associations under family-based designs have not received as much attention, although family-based designs have been shown to improve power to detect rare variants.28, 29 Motivated by the facts that rare disease variants will be enriched in family data33 and that a large number of affected sib-pairs for a variety of diseases have been collected by traditional linkage studies, we develop TOW-sib to detect associations between the optimal combination of rare variants in a genomic region and complex traits based on affected sib-pairs and unrelated individuals. TOW-sib is robust to the directions of the effects of causal variants and is also relatively robust to the number of neutral variants. The proposed method does not require a MAF filtering threshold and can be applied to genomic regions that contain both rare and common variants. Our simulations demonstrated that TOW-sib using affected sib-pairs can be dramatically more powerful than the methods based on unrelated individuals and the existing methods based on affected sib-pairs.

Although TOW-sib is derived under the assumption that variants are independent, our simulation results show that TOW-sib is still a valid test when variants are in LD. Our simulations for type I error evaluation are based on the LD structures of genes 1–4 and, in each gene, there are variants in strong LD (Supplementary Tables 1–4). The correct type I error rates of TOW-sib in our simulations (Tables 2 and 3) indicate that this test is valid even if variants are in LD.

The current version of TOW-sib cannot adjust for covariates, but it is possible to extend it to do so. Denote zji, zai, and zci as the covariates of the jth individual in the ith sib-pair, the ith case, and the ith control, respectively. With covariates, the retrospective likelihood can be written as

Let , where x represents the combination of genotypic scores of the genotype g and z denotes covariates. Based on this likelihood, we can derive a score test statistic. However, the details of adjusting for covariates in TOW-sib need further investigation.

TOW-sib uses optimal data-driven weights. TOW-sib is a quadratic test and thus is robust to the directions of the effects of causal variants. Other weights can also be used. For example, in the score test statistic T(w1,…,wM), we can use the weights suggested by Madsen and Browning,12 that is, , where pm is the estimated MAF with pseudo-counts at the mth variant. We call the score test T(w1,…,wM) with these weights WSS-sib. WSS-sib is a burden test. When most of the rare variants are causal and the directions of the effects of causal variants are all the same, WSS-sib can outperform TOW-sib; otherwise, TOW-sib should outperform WSS-sib. To increase the robustness of the tests, we can also construct combined tests by combining information from TOW-sib and WSS-sib. One thing we want to make clear is the term ‘optimal weight’. In this paper, the optimal weight is the weight that maximizes the score test statistic; it does not necessarily give the score test maximum power.

In this study, we estimate α based on the full likelihood. Other estimates of α can also be used. Different estimates do not affect the type I error rate, but do affect power. Our simulations (results not shown) show that the MLE of α based on the full likelihood is a good choice. Our main purpose in comparing the proposed method with two methods based on the case/control design is to see whether the affected sib-pair design is more powerful than the case/control design. We also compare our proposed method with one of the existing methods that are applicable to the affected sib-pair design. Although several recently developed methods28, 29 are applicable to the affected sib-pair design, we choose only SPWSS28 for comparison because SPWSS is most relevant to our proposed method.