Introduction

In recent years, genome-wide association studies of clinically defined case phenotypes against controls have transformed our understanding of the common genetic causes of many diseases. Hundreds of common genetic variants have been identified that confer small but significant proportions of disease risk.1, 2, 3, 4, 5, 6, 7 The use of clinical phenotypes to define case sets has simplified the collection of sets of diseased cases and has enabled easy interpretation of the impact of disease loci. However, there are limitations to such a simple approach to phenotyping,8, 9, 10 particularly in the presence of heterogeneity.11

First, such phenotype definitions depend on adequate diagnostic sensitivity and specificity, which is challenging in some diseases. For example, in Alzheimer’s disease, where a number of common variants have been shown to confer risk of the disease,2 the majority of cases are diagnosed based on clinical criteria (eg, DSM-IV criteria). Post-mortem data show that clinical diagnoses are imperfect, with specificity and sensitivity of <80%.12 This leads to underestimation of the effects of associated SNPs and, worse, could lead to false-positive results, in which the associations are in reality with diseases misdiagnosed as Alzheimer’s disease. Second, such clinical diagnoses ignore underlying heterogeneity in disease pathogenesis where case subtyping might be more appropriate. Examples of diseases with genetically distinct subgroups include ischaemic stroke, where at least three distinct pathologies (cardioembolic, small vessel and large vessel) lead to stroke events;13,14 migraine, where cases with or without aura have distinct genetic susceptibility factors;15 and rheumatoid arthritis, where anti-citrullinated peptide antibody-negative individuals show distinct genetic associations, particularly in the HLA region.16,17 Third, heterogeneity in genetic susceptibility to disease may exist. For example, cases with later disease onset have more exposure to environmental risk factors, and therefore under a liability threshold model will have a weaker genetic susceptibility.18,19 Similarly, individuals with type 2 diabetes with higher body mass index may have decreased genetic susceptibility to the disease.18,20

Analysis of subgroups of cases in GWAS data may therefore be valuable to identify further associations. Such analyses have been performed,13,15,20,21 but this has generally been carried out without consideration of the relative power of these analyses, and the conditions under which such analyses are advantageous are not well understood. To resolve this, we seek to answer two questions. First, what increase in genotypic relative risk in a disease subgroup is required to achieve equivalent power to a full analysis? Second, what is the relationship between power in a full analysis and a subgroup analysis, and how is this affected by the size of the genetic effect and its allele frequency? We first derive formulae for the relative power of subgroup analyses to a full analysis and use these to study the power of subgroup analyses for scenarios relevant to GWAS. We then derive expressions for the genotypic relative risk required in a subgroup analysis to achieve equivalent power to a full sample and evaluate this relationship for plausible scenarios. Finally, by interrogating the power relationship between a full and subgroup analysis for fixed proportional increases in genotypic relative risk for the subgroup, we show that subgroup analyses are advantageous in identifying genetic variants with increasingly smaller effects.

Materials and methods

Relative power of subgroup analyses

To better understand the relationship in power between a full and subgroup analysis, formulae for the ratio of the non-centrality parameter (NCP) from analysis of a subgroup of cases to that from a full analysis were derived. These formulae were then used to study properties of the NCP ratio for appropriate risk allele frequencies (RAF), index genotypic relative risks (GRRs) with respect to the causal variant (λ), and proportional increases in GRR in the subset analysis (κ).

Expressions for the ratio in power of a subset analysis to a full analysis were first derived using the framework from Yang et al22 as follows. In the context of a case–control study of a complex disease with prevalence K, consider a variant with two alleles (A, a) with frequencies p and 1−p. Assuming a multiplicative model of allele effects, the NCP of a χ2 test for association can be expressed as follows:

where N is the total sample size and v denotes the proportion of the overall sample that are cases.22 NCP, used to calculate power analytically, is the value that determines the degree of noncentrality of the chi-squared distribution under the alternative hypothesis. The power of a χ2 test for association can be found by integrating under the noncentral χ2 distribution for given NCP and degrees of freedom of the test. Although NCP does not directly equate to power, it is a suitable proxy measure and is used here to represent power, as in previous studies.22
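The step from NCP to power described above can be sketched for the simplest case. The snippet below is a minimal illustration, assuming a 1-degree-of-freedom test (in which case the noncentral χ2 tail probability reduces exactly to two normal tail probabilities) and a default significance threshold of 5×10−8; the function name and the default threshold are illustrative choices, not taken from the study.

```python
from statistics import NormalDist


def power_from_ncp(ncp: float, alpha: float = 5e-8) -> float:
    """Power of a 1-df chi-squared association test with noncentrality `ncp`.

    Assumption: 1 degree of freedom. A 1-df noncentral chi-squared variable
    is (Z + sqrt(ncp))**2 with Z standard normal, so its upper tail is the
    sum of two normal tail probabilities.
    """
    if ncp < 0:
        raise ValueError("ncp must be non-negative")
    std_normal = NormalDist()
    # Two-sided critical value; its square is the chi-squared critical value.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = ncp ** 0.5
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
```

Because power increases monotonically with NCP for a fixed threshold and degrees of freedom, comparing two analyses by their NCP ratio preserves their ordering in power.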

Of interest is the ratio of the NCP from analysis of a subset of the cases (N2 individuals, prevalence K2, proportion of cases v2, and GRR λ2) to the NCP of a full study with N1 individuals, prevalence K1, proportion of cases v1, and GRR λ1, where N2 is a subset of N1 (retaining all controls) in which the variant has a stronger effect,

ie, N1>N2 and λ1<λ2. The ratio of NCP between two such analyses can be expressed as:

p1 and p2 are defined in the union of cases and controls in each situation. In the context of GWAS, where genetic effects are small, and particularly when the number of controls exceeds that of cases, it can be assumed that 2p1(1−p1)≈2p2(1−p2) in the two scenarios, as changes in allele frequency between the full and subgroup analysis will be minimal.
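As a hedged reference sketch, the NCP and its ratio can be written in the form below, which uses only the symbols defined above and reproduces the worked values reported in the Results. The placement of the prevalence term (1−K)² and the use of a common allele frequency p in the bracketed term are assumptions based on the cited framework, not a transcription of the published expressions.

```latex
\mathrm{NCP} \approx
  \frac{2p(1-p)\,(\lambda-1)^{2}\,N\,v(1-v)}
       {\left[1+p(\lambda-1)\right]^{2}(1-K)^{2}}

\frac{\mathrm{NCP}_2}{\mathrm{NCP}_1} \approx
  \frac{(\lambda_2-1)^{2}\,N_2\,v_2(1-v_2)\,\left[1+p(\lambda_1-1)\right]^{2}(1-K_1)^{2}}
       {(\lambda_1-1)^{2}\,N_1\,v_1(1-v_1)\,\left[1+p(\lambda_2-1)\right]^{2}(1-K_2)^{2}}
```

The second line applies the approximation 2p1(1−p1)≈2p2(1−p2), so the allele-frequency factor cancels and p enters only through the bracketed terms.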

Thus we obtain the following expression (formula (1)) for the ratio in power of a subgroup analysis to power of a full analysis:

where

To evaluate the derived formula, we calculated NCP ratios for a case–control study of 2000 cases and 2000 controls compared with a subgroup of 1000 cases and 2000 controls for a RAF=0.25 and for index GRR (λ1) of 1.1, 1.2, and 1.3, each for five subgroup GRR (λ2) using formula (1). We then compared these with the equivalent NCP ratios from Genetic Power Calculator, calculating each NCP separately and determining the appropriate ratio, to confirm that our estimates matched those obtained from an alternative approach.23
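The evaluation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scripts used in the study: the NCP is taken to be proportional to (λ−1)²/[1+p(λ−1)]² × N v(1−v), the allele-frequency factor 2p(1−p) is treated as equal in both analyses, and the prevalence term is assumed to cancel. All function and parameter names are hypothetical.

```python
def ncp_term(grr: float, raf: float, n_cases: int, n_controls: int) -> float:
    """Part of the NCP that differs between the full and subgroup analyses.

    Assumed form: ((grr - 1) / (1 + raf * (grr - 1)))**2 * N * v * (1 - v).
    Factors assumed common to both analyses (2p(1-p), prevalence) are omitted.
    """
    n_total = n_cases + n_controls
    v = n_cases / n_total  # proportion of the sample that are cases
    effect = (grr - 1.0) / (1.0 + raf * (grr - 1.0))
    return effect ** 2 * n_total * v * (1.0 - v)


def ncp_ratio(grr_full: float, kappa: float, raf: float,
              cases_full: int, cases_subset: int, n_controls: int) -> float:
    """NCP of the subgroup analysis divided by NCP of the full analysis.

    The subgroup GRR is kappa * grr_full; all controls are retained.
    """
    grr_subset = kappa * grr_full
    subset = ncp_term(grr_subset, raf, cases_subset, n_controls)
    full = ncp_term(grr_full, raf, cases_full, n_controls)
    return subset / full
```

For the design described above (2000 cases and 2000 controls versus 1000 cases and 2000 controls, RAF=0.25), `ncp_ratio(1.3, 1.05, 0.25, 2000, 1000, 2000)` evaluates to roughly 0.96, so a 5% proportional increase in GRR does not compensate for discarding half the cases at this index GRR.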

Genotypic relative risk required for equivalent power in the subgroup analysis

Having derived expressions for the relative power of subgroup analyses, we then derived an expression for the GRR (λ2) required in the subgroup to achieve power equivalent to that of the full analysis. From formula (1) above, we set the full expression for the NCP ratio equal to one and solve for λ2:

If we make the substitutions, γ=λ1−1 and δ=λ2−1, we have:

This simplifies to a quadratic equation for δ:

Using the quadratic formula to solve for δ gives:

Discarding the implausible negative solution and expressing in terms of GRR in the subgroup analysis λ2, we obtain the following expression (formula (2)) for the GRR in the case subgroup λ2 required to obtain equivalent power in a subgroup analysis:

We then used the above formula to calculate the GRR (λ2) required in the subgroup analysis to achieve equivalent power to the full analysis for four scenarios in which a genetic variant was assumed to have a given GRR (1.05, 1.1, 1.2, or 1.3) in the full sample, and for different proportions of discarded cases in the subgroup analysis for three risk allele frequencies (RAF=0.01, 0.25, 0.5), assuming an equal number of cases and controls in the full sample.
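The calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the published formula (2): the NCP is assumed proportional to (λ−1)²/[1+p(λ−1)]² × N v(1−v), with the allele-frequency and prevalence factors cancelling between analyses. Under that assumption, setting the NCP ratio to one gives δ²/(1+pδ)²=A, a quadratic in δ whose positive root is δ=√A/(1−p√A). Function and parameter names are hypothetical.

```python
import math


def required_subgroup_grr(grr_full: float, raf: float,
                          retained_fraction: float,
                          controls_per_case: float = 1.0) -> float:
    """GRR needed in the retained cases for the same NCP as the full analysis.

    Assumptions: all controls are kept; NCP is proportional to
    ((grr - 1) / (1 + raf * (grr - 1)))**2 * N * v * (1 - v).
    """
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in (0, 1]")
    # N * v * (1 - v) equals cases * controls / (cases + controls);
    # both terms are expressed per full-sample case.
    size_full = controls_per_case / (1.0 + controls_per_case)
    size_subset = (retained_fraction * controls_per_case
                   / (retained_fraction + controls_per_case))
    gamma = grr_full - 1.0
    # sqrt(A): the value that delta / (1 + raf * delta) must reach.
    root_a = math.sqrt(size_full / size_subset) * gamma / (1.0 + raf * gamma)
    if raf * root_a >= 1.0:
        raise ValueError("no finite GRR gives equivalent power")
    delta = root_a / (1.0 - raf * root_a)  # positive root of the quadratic
    return 1.0 + delta
```

With equal numbers of cases and controls in the full sample, RAF=0.10 and 25% of cases retained, `required_subgroup_grr(1.05, 0.10, 0.25)` evaluates to approximately 1.079.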

Relative power for fixed proportional increase in genotypic relative risk

Additionally, we sought to study how the NCP ratio between the full and subgroup analyses was affected by index GRR (λ1) in the full sample. We generated results using formula (1) for intervals of GRR (1.05≤λ1≤1.35) in the full sample and for a fixed proportional increase in GRR in the subgroup, λ2=κ × λ1, where κ=1.05, 1.10, 1.20, 1.30, using three risk allele frequencies (RAF=0.01, 0.25, 0.5).
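A sweep of this kind can be sketched as below. The code is illustrative only and rests on assumptions: equal numbers of cases and controls in the full sample, all controls retained, an NCP proportional to (λ−1)²/[1+p(λ−1)]² × N v(1−v), and a grid step of 0.05 chosen for brevity. The names are hypothetical.

```python
def ncp_ratio_fixed_kappa(grr_full: float, kappa: float, raf: float,
                          retained_fraction: float = 0.5) -> float:
    """NCP ratio (subgroup / full) when the subgroup GRR is kappa * grr_full.

    Assumes equal cases and controls in the full sample, so the ratio of the
    sample-size terms N * v * (1 - v) is 2f / (1 + f) for retained fraction f.
    """
    def effect(grr: float) -> float:
        return (grr - 1.0) / (1.0 + raf * (grr - 1.0))

    size_ratio = 2.0 * retained_fraction / (1.0 + retained_fraction)
    return size_ratio * (effect(kappa * grr_full) / effect(grr_full)) ** 2


# Index GRRs from 1.05 to 1.35 in steps of 0.05 (step size is an assumption).
index_grrs = [round(1.05 + 0.05 * i, 2) for i in range(7)]
curve = [(grr, ncp_ratio_fixed_kappa(grr, kappa=1.05, raf=0.25))
         for grr in index_grrs]
```

Each entry of `curve` pairs an index GRR with its NCP ratio, which can then be plotted or tabulated for each combination of κ and RAF.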

Results

Relative power of subgroup analyses

We identified properties of the relative power of a subgroup analysis, using formula (1). We set disease prevalence at 1% throughout, but we note that the results were almost completely insensitive to this value. We used three index GRRs (λ1=1.1, 1.2, 1.3) and, for simplicity, assumed that λ2 is a fixed multiple of λ1, that is, λ2=κ × λ1.

Index GRRs were in the range of those found in previous GWAS studies1,2,5,6,13 and were chosen in order to represent SNPs that might be identified in future studies. Results were analysed by κ, where λ2=κ × λ1, rather than specific GRRs to enable comparison across different index GRRs and different proportions of discarded cases. κ values (range=1.00–1.10) were chosen to reflect modest changes in GRR in case subsets and are similar to those observed in our previous analysis of the effect of age-at-onset in ischaemic stroke.19 The NCP ratio was calculated for proportional increases in GRR (κ) at three minor allele frequencies, assuming either 25 or 50% of cases were discarded (Figure 1).

Figure 1

Ratio in power between analyses of all cases and case-subset, for different minor allele frequency and subset size. MAF, minor allele frequency; kappa (κ), proportional increase in GRR. The horizontal line at NCP ratio=1 marks the threshold above which power in the subset analysis exceeds that of the full analysis.

As expected, increasing values of κ increased the relative power (NCP ratio) of the subset analysis. When 50% of cases were discarded, the threshold of κ at which relative study power became greater in the subset analysis was higher than when 25% of cases were discarded. This indicates that, as expected, the required proportional increase in GRR rises with the proportion of cases discarded: a higher proportion of discarded cases, and therefore a smaller retained case sample, requires a higher GRR to achieve the same power (NCP).

We compared the NCP ratios from our formula with those calculated from comparing two sets of results generated from Genetic Power Calculator.23 The concordance between the results was nearly exact (r=0.999, r2=0.999), showing that our simple formulae reproduce the results obtained when performing the calculations using alternative approaches.

Genotypic relative risk required for equivalent power in subgroup analysis

We calculated the λ2 value required to achieve equivalent power in the subgroup for intervals of proportions of cases discarded in the subgroup using formula (2) (Figure 2). The results showed that relatively small increases in GRR in the subgroup were required to achieve power equivalent to that of a full analysis. This was particularly notable for lower index GRRs. For example, for a GRR of 1.05 in the full sample, a GRR of 1.079 in a subgroup comprising 25% of the cases achieves equivalent power (Table 1). Similarly, for a GRR of 1.10, a GRR of 1.16 in a subgroup comprising 25% of the cases achieves equivalent power. This shows that if stronger genetic effects exist in subgroups of data sets, a large proportion of cases can be discarded without loss of power.

Figure 2

Genotypic relative risk in subgroup required to achieve equivalent power as a full analysis for intervals of proportions of discarded cases and three minor allele frequencies. MAF, minor allele frequency.

Table 1 λ2 values required in subgroup analysis to achieve equivalent power as full sample for given index genotypic relative risk λ1 and proportions of discarded cases assuming RAF=0.10

The results also showed that the relative power of a subset analysis is greatest for rare genetic variants at fixed values of λ1 and κ. This result was consistent across all scenarios studied (Figures 1 and 2). For example, for a genetic variant with index GRR λ1=1.3 and RAF=0.01, analysis of a subset of 25% of cases has more power if the proportional increase in the GRR is κ>1.14 (λ2>1.48). The proportionate increase in index GRR required for equal power increases with RAF: for the same scenario, but assuming a variant with a RAF=0.5, the analysis has more power for κ>1.17 (λ2>1.52). These results show that subgroup analyses have comparatively more power to identify rare as opposed to common variants.

Relative power for fixed proportional increase in genotypic relative risk

Finally, we interrogated the relationship of the NCP ratio for different index GRRs (λ1) in the full sample, fixing κ values in each case. For four κ values (1.05, 1.1, 1.2, 1.3) and across three minor allele frequencies (0.01, 0.25, 0.5), the NCP ratio monotonically increased as index GRRs decreased (Figure 3). This effect was particularly strong for index GRR<1.15, below which the curve increased dramatically. For example, for a variant with RAF=0.25 and a GRR of 1.3 in the full sample, if the variant has an effect 1.05 times stronger in half the cases, then analysis of this subgroup does not have as much study power as the full analysis (NCP ratio=0.96). Conversely, for the same variant with the same proportional increase in half the cases, but an index GRR=1.05, analysis of the subgroup has considerably more study power (NCP ratio=2.73). This effect shows that, for smaller genetic effects, the proportional increase in power for analysis of a homogeneous subgroup of the cases increases greatly, and emphasizes that homogeneous disease subgroups in which genetic effects may be larger are better suited for detection of small genetic effects.
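The steep rise at small index GRR follows from simple arithmetic on the definitions. Writing γ=λ1−1 and δ=λ2−1 with λ2=κ × λ1, the ratio of excess risks is given in the first line below. The second line additionally assumes an NCP proportional to (λ−1)²/[1+p(λ−1)]², a form that reproduces the values quoted above but is not a transcription of formula (1); S is a hypothetical symbol for the ratio of the sample-size terms.

```latex
\frac{\delta}{\gamma}
  = \frac{\kappa\lambda_1 - 1}{\lambda_1 - 1}
  = \kappa + \frac{\kappa - 1}{\lambda_1 - 1}

\frac{\mathrm{NCP}_2}{\mathrm{NCP}_1} \approx
  S\left(\kappa + \frac{\kappa - 1}{\lambda_1 - 1}\right)^{2}
  \left(\frac{1 + p\gamma}{1 + p\delta}\right)^{2},
\qquad
S = \frac{N_2\,v_2(1-v_2)}{N_1\,v_1(1-v_1)}
```

For κ=1.05, the first bracket equals 2.05 at λ1=1.05 but only about 1.22 at λ1=1.3, so the (κ−1)/(λ1−1) term dominates as λ1 approaches 1 and the NCP ratio grows sharply.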

Figure 3

Ratio in power between an analysis considering all cases to an analysis considering a subset of cases for different minor allele frequency with fixed proportional increase in genotypic relative risk (kappa (κ)) in the subset. MAF, minor allele frequency; kappa (κ), proportional increase in genotypic relative risk.

All statistical analysis was performed using the R statistical software. All formulae and scripts used to generate plots are available from https://sites.google.com/site/mtraylor263/software/case-subgroup-power-analysis.

Discussion

We have developed a framework that elucidates the power relationship between a full GWAS analysis, and analysis of a subgroup of the cases in which genetic effects were stronger, while retaining all controls. We derived an expression for the ratio in power between a subgroup analysis and a full analysis and used this to study the power properties of subgroup analyses. A simplifying assumption regarding the frequency of genetic variants in the two analyses enabled the broad properties of the power ratio to be studied. This assumption is valid for GWAS, particularly where controls normally exceed cases by at least twofold. This enabled identification of two important results.

First, it was shown that, as GWAS sample sizes increase, and the detectable genetic effects of SNP variants become smaller, the power of a subset analysis in which variants have stronger effects becomes proportionally greater than that of a full analysis. Calculating NCP ratios for ratios of GRR (κ) in the full and subset analyses supported this observation: at lower index GRR, NCP ratios increased dramatically. This clearly shows that, when attempting to identify genetic variants with smaller effects, improvements in power become more substantial for subgroup analyses, particularly for index GRR<1.15. In recent GWAS meta-analyses of rheumatoid arthritis,24 schizophrenia,6 multiple sclerosis,25 Alzheimer’s disease,2 coronary artery disease,3 and breast cancer,26 only 39 of the 339 associated variants showed overall odds ratios >1.15, suggesting that the majority of variants with effects greater than this threshold have now been identified. Our results show that analysing homogeneous disease subgroups forms a powerful strategy to identify further variants with effects in this range, as stronger effects may be present. Second, we also showed that the relative increase in power for a subset analysis is consistently greater for rare as opposed to common variants, although this effect was more modest. These results imply that searching for rare variants will particularly be aided by subtyping of disease cases into genetically distinct groups. Importantly, both of these results also hold for a lower case:control ratio (Supplementary Figure S1).

Several methods can be used to identify genetically distinct disease subgroups in which a stratified GWAS analysis could be performed. Genetic correlations between disease subgroups can be calculated using the GREML methods,27 which use linear mixed models to obtain estimates of the genetic correlation between the groups. This approach showed shared genetic susceptibility to psychiatric disorders28 and that Tourette syndrome and obsessive-compulsive disorder are genetically distinct.29 The approach can easily be adapted to interrogate disease subgroups, where a low genetic correlation in a well-powered sample would imply distinct genetic architecture. Similarly, genetic risk profile scoring, in which the cumulative effect of genome-wide SNPs is used to test for differences between sets of cases and controls, can be used for the same purpose.30 This has been used in analyses comparing multiple sclerosis with amyotrophic lateral sclerosis31 and Parkinson’s disease with Alzheimer’s disease.32 Genome-wide genetic correlations between diseases can be calculated in combination with these methods using the framework created by Dudbridge.33 Finally, polygenic rare variant analysis approaches can be used to identify disease groups that have a polygenic contribution from rare variants and can be used to identify disease subgroups.34 These methods will be valuable for identifying the genetically distinct groups that would benefit from further association analysis.

A compelling example of the benefits of subgroup analyses can be found in ischaemic stroke, where subtyping of cases based on clinical and radiological criteria has enabled identification of the first common variants associated with the disease.13,14,35, 36, 37 Importantly, in the largest meta-analysis to date, all associations were with ischaemic stroke subtypes, and these showed much stronger association than in an analysis with all ischaemic stroke (κ=1.24, 1.23, 1.19, and 1.11 for rs2107595 (HDAC9), rs6843082 (PITX2), rs879324 (ZFHX3), and rs2383207 (9p21), respectively).13 Further to this, analyses of early onset cases suggest even stronger associations with young onset cases at these loci.19 This example clearly shows that with careful subtyping, genetic studies can provide new information on heterogeneous diseases, such as stroke.

Several other considerations should be made when interpreting our conclusions. First, our results are expressed in terms of genotypic relative risk. For rarer diseases, these values are almost equivalent to odds ratios, the preferred measure in most GWAS studies. This approach may therefore be more intuitive to researchers. However, an alternative approach would have been to benchmark our comparisons in terms of the variance explained by a locus. In particular, this may affect comparisons across the allele frequency spectrum. Second, our approach does not take into account the multiple testing correction, which it might be appropriate to make when performing multiple analyses on a single dataset. Factoring this correction into the analyses would have the effect of increasing the GRR required to achieve equivalent power in the subset analysis. Third, it should be noted that, for complex diseases, the underlying genetic architecture is unknown. Therefore the optimal approach to splitting the data into homogeneous groups remains elusive. Indeed, in some diseases, splitting the cases into groups may not prove beneficial. Interrogation of GWAS data sets with GREML methods,27 polygenic scoring,30 and polygenic rare variant association methods,34 as discussed above, may help to shed some light on this.

Many have been critical of GWAS for not identifying a large proportion of disease variance and for only identifying risk variants with small effects.38,39 However, GWAS have been very successful, particularly in auto-immune and metabolic disorders, where hundreds of associated genetic variants have been identified.4 Our results strongly advocate a renewed effort to identify genetically distinct disease groups with increased phenotypic homogeneity within existing data sets, in which power to detect genetic variants with small effects will be greater. This will be particularly important as the focus of genetic studies turns from common to rare variation.