Introduction

Longevity is a complex trait which is affected by both genetic and environmental factors, with potential interactions among them. The earliest confirmation of the influence of genes on longevity came from studies on related individuals such as twins. According to a well-known Danish twin study, the genetic variance component to life span was estimated to account for about 25% of total life span variation.1,2,3 These results indicate how important a role genes play in modulating individual survival. In addition, these studies also found evidence for non-additive genetic effects on life span resulting from interactions among genes either at a single locus or at different loci.1,2 Such results support the works of Wright4 and might suggest that non-additivity is an important mechanism in maintaining individual survival. However, the molecular evidence of non-additivity in human longevity is far from clear.

There has been an ever-increasing interest in studying the association between genes and longevity due to rapid developments in the field of molecular biology.5 Important genes such as the ApoE have been reported to exhibit a significant influence on human longevity.6,7,8,9,10 Unfortunately, there has been very little research focusing on the coordinating effects among different genes, which constitutes the basic nature of non-additivity. One of the difficulties in this approach can be ascribed to the involvement of multiple genes or loci, especially when the genes of interest are highly polymorphic, which requires a vast amount of work in genotyping. While the genotyping problem is being solved by advanced new technology, an even more difficult task facing researchers is the search for genes that manifest coherent rather than independent effects. This issue is gaining in importance since there is soon to be an upsurge in the amount of individual genetic information available. When there are many potential gene interactions, the number of individuals for each possible gene combination will be very small. Hence the power for statistical testing will be low. In order to deal with this situation, powerful statistical methods are called for. In this paper we introduce a new approach to detecting gene–gene interactions in human longevity, the case-only design. This approach was originally derived for analysing gene-environment and gene-gene interactions in the etiology of complex diseases.11,12,13 Since we treat centenarians as cases, we refer to our method as the centenarian-only approach. After introducing the method we show, on the basis of examples, how this simple approach can be implemented to screen for gene–gene interactions in human longevity. We also discuss the basic assumptions of the model in the context of longevity studies. We highlight the advantages of our approach over other models and discuss problems that arise in the application.

Methods

In Table 1, we display the conventional case-control approach, which is aimed at assessing the influence on longevity of two gene alleles (M and N) observed from two chromosomes, with an additional interest in detecting the effect of interaction between them. In Table 1, we use the subscripts co for controls and ce for centenarians. The symbols a and b stand for the absolute numbers of observations in the centenarian and the control groups, respectively; p00 is the frequency of subjects carrying none of the alleles, p10 is the frequency of subjects carrying allele M but not N, p01 is the frequency of subjects carrying allele N but not M, and p11 is the frequency of subjects carrying both alleles. In accordance with the above notation, OR10 is the odds ratio for the effect of allele M alone, OR01 is the odds ratio for the effect of allele N alone, and OR11 is the odds ratio for the joint effect of the two gene alleles. The cross-product for the centenarian group (Table 2) can be calculated as

$$OR_{ce} = \frac{p_{11,ce}\,p_{00,ce}}{p_{10,ce}\,p_{01,ce}} \tag{1}$$

Table 1 A 2×4 table for assessing gene-gene interaction based on case-control design (a, b: numbers of subjects; p: genotype frequency; OR: odds ratio)
Table 2 A 2×2 table for assessing gene-gene interaction based on centenarian-only design (a: number of subjects; p: genotype frequency; n: sum of subjects)

where p11,ce+p10,ce+p01,ce+p00,ce=1. Using the relationships between the frequencies and odds ratios in Table 1 and substituting them into (1) yields

$$OR_{ce} = \frac{OR_{11}}{OR_{10}\,OR_{01}}\,OR_{co} \tag{2}$$

In equation (2), ORco is the control-only odds ratio, which becomes approximately one under two assumptions. First, the two genes are independent; second, the event (being a centenarian) related to the interaction is rare. Here independence means the two genes are not in linkage disequilibrium. Before discussing the relevant issues concerning the assumptions, we first demonstrate how the control-only odds ratio becomes one when the two assumptions are satisfied. Similarly to equation (1), the control-only odds ratio can be written as

$$OR_{co} = \frac{p_{11,co}\,p_{00,co}}{p_{10,co}\,p_{01,co}} \tag{3}$$

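Again, we have in equation (3) p11,co+p10,co+p01,co+p00,co=1. If the frequencies of the four allele combinations of Table 1 in the total population (centenarians and controls together) are p(M=1|N=1), p(M=1|N=0), p(M=0|N=1) and p(M=0|N=0), the independence of the two gene alleles means that

$$\frac{p(M=1|N=1)\,p(M=0|N=0)}{p(M=1|N=0)\,p(M=0|N=1)} = 1 \tag{4}$$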

In equation (4), p(M=1|N=1) = p11,co·p(co|N=1) + p11,ce·p(ce|N=1). Since longevity is rare, we have p(ce|N=1)≈0 and p(co|N=1)≈1. This leads to p(M=1|N=1)≈p11,co. In the same manner, we have p(M=1|N=0)≈p10,co, p(M=0|N=1)≈p01,co and p(M=0|N=0)≈p00,co. Substituting them into equation (4) and rearranging, we get

$$OR_{co} = \frac{p_{11,co}\,p_{00,co}}{p_{10,co}\,p_{01,co}} \approx 1$$

Now we can rewrite equation (1) as

$$OR_{ce} \approx \frac{OR_{11}}{OR_{10}\,OR_{01}} \tag{5}$$

Equation (5) means that the centenarian-only odds ratio measures the departure from multiplicative joint effect of the two genes. In other words, it reflects the effect of gene–gene interaction. The null hypothesis for this approach is H0:ORce=1. Any statistically significant deviation of ORce from one indicates that there is a gene–gene interaction that contributes to and modifies the probability of achieving longevity. By comparing variances of the MLE of the logarithm of equation (1) and equation (5), Piegorsch et al.11 concluded that the case-only approach increases precision in estimating interactions since the variance corresponding to equation (1) involves an extra component for ORco, the control-only odds ratio.
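The relationship underlying this argument can be checked numerically: the centenarian-only odds ratio equals OR11/(OR10·OR01) multiplied by the control-only odds ratio, and the latter is one when the two alleles are independent among controls. The following is a minimal sketch, not part of the original analysis; the carrier frequencies and the three odds ratios are hypothetical values chosen only for illustration, and the function names are our own.

```python
def odds_ratio(p11, p10, p01, p00):
    """Cross-product odds ratio of a 2x2 table of genotype frequencies."""
    return (p11 * p00) / (p10 * p01)


def case_frequencies(controls, or10, or01, or11):
    """Genotype frequencies among centenarians implied by the control
    frequencies and the odds ratios of Table 1 (non-carriers as reference)."""
    p11, p10, p01, p00 = controls
    weights = (p11 * or11, p10 * or10, p01 * or01, p00)
    total = sum(weights)
    return tuple(w / total for w in weights)


# Hypothetical carrier frequencies for alleles M and N (assumed values).
f_m, f_n = 0.2, 0.4
# Independence among controls: joint frequencies are products of the margins.
controls = (f_m * f_n, f_m * (1 - f_n), (1 - f_m) * f_n, (1 - f_m) * (1 - f_n))

# Hypothetical odds ratios for M alone, N alone, and both together.
or10, or01, or11 = 1.2, 0.9, 3.0

cases = case_frequencies(controls, or10, or01, or11)
or_co = odds_ratio(*controls)   # control-only odds ratio
or_ce = odds_ratio(*cases)      # centenarian-only odds ratio

print("OR_co =", or_co)
print("OR_ce =", or_ce)
print("OR11 / (OR10 * OR01) =", or11 / (or10 * or01))
```

In this sketch the interaction term OR11/(OR10·OR01) is recovered from the cases alone, without using the controls, which is the property the centenarian-only design relies on.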

A statistical test for the null hypothesis can be conducted by employing the log likelihood ratio (LLR) test. One calculates twice the difference between the log likelihoods evaluated at ORce=1 and at the maximum likelihood estimate of ORce. When the sample size is large, the LLR is approximately distributed as χ2 on 1 degree of freedom. Alternatively, we can apply the standard χ2 statistic to test the null hypothesis, ie

$$\chi^2 = \frac{n\,(a_{11}a_{00}-a_{10}a_{01})^2}{n_{1.}\,n_{2.}\,n_{.1}\,n_{.2}}$$

on one degree of freedom. Here, a11, a10, a01 and a00 are the numbers of centenarians in the four cells of Table 2, n1., n2., n.1, n.2 are marginal sums of the observations by genotype, and n is the total sum of the observations (centenarians). The χ2 statistic with continuity correction can also be applied, but this is not recommended.14 For a significant ORce, confidence intervals can be constructed. The procedure is first to construct a confidence interval for the natural log of ORce and then exponentiate the boundaries to get the confidence interval for ORce. The standard error of ln(ORce) is given by

$$SE\left[\ln(OR_{ce})\right] = \sqrt{\frac{1}{a_{11}}+\frac{1}{a_{10}}+\frac{1}{a_{01}}+\frac{1}{a_{00}}}$$

Since the natural log of the odds ratio is more nearly normally distributed than the odds ratio itself, we can use the critical values of the standard normal distribution for calculating the intervals.
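The test and confidence interval described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used in the study: the cell labels a11, a10, a01, a00 (both alleles, M only, N only, neither) are our own naming, the counts in the usage example are hypothetical, and all four cells are assumed to be non-zero.

```python
import math


def centenarian_only_test(a11, a10, a01, a00, z=1.959964):
    """Odds ratio, uncorrected chi-square test (1 df) and normal-theory
    confidence interval for a 2x2 table of centenarian counts.

    z is the standard normal critical value (default: two-sided 95%).
    """
    n = a11 + a10 + a01 + a00
    or_ce = (a11 * a00) / (a10 * a01)

    # Marginal sums by genotype.
    row1, row2 = a11 + a10, a01 + a00   # carriers / non-carriers of M
    col1, col2 = a11 + a01, a10 + a00   # carriers / non-carriers of N

    # Standard chi-square statistic without continuity correction.
    chi2 = n * (a11 * a00 - a10 * a01) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    # Interval on the log scale, then exponentiate the boundaries.
    se_log_or = math.sqrt(1 / a11 + 1 / a10 + 1 / a01 + 1 / a00)
    lower = math.exp(math.log(or_ce) - z * se_log_or)
    upper = math.exp(math.log(or_ce) + z * se_log_or)

    return {"or": or_ce, "chi2": chi2, "p": p_value, "ci": (lower, upper)}


# Hypothetical counts summing to 157 (not the study data).
result = centenarian_only_test(12, 18, 30, 97)
print(result)
```

The interval is symmetric on the log scale and therefore asymmetric around the odds ratio itself, as is the interval reported in the Application section.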

Application

In a multicentric longevity study conducted in 1995 in Italy,15 a sample consisting of 157 centenarians (38 males, 119 females) was collected. Participants were healthy and free from major pathologies or disabilities.16 Each individual was checked for ethnicity and geographical origin. Blood samples were genotyped for genes suspected of influencing longevity. Important genes significantly associated with longevity have been reported in previous publications.15,17,18,19 Among the genes tested, the REN (renin) gene failed to show a significant contribution to individual survival. The REN gene encodes renin, an enzyme produced in the kidney. Renin is a key element in the renin-angiotensin system (RAS) in the circulation. It is responsible for regulating blood pressure and stimulating aldosterone production. In recent years, researchers have reported the existence of renin in mitochondria.20,21 A possible role of intramitochondrial renin in the regulation of steroid biosynthesis has been suggested.20,22 Here, we apply the centenarian-only approach described above to check whether there is any functional interaction between REN and mtDNA variations, on the hypothesis that some polymorphisms of the two genes could have enhanced functions as a result of gene-gene interaction. Five polymorphic alleles of the REN gene (alleles 7, 8, 10, 11, 12), designated by the number of short tandem repeats, were detected at the locus.15 An earlier study showed that the genotype-allele frequencies are in Hardy-Weinberg equilibrium.15 However, the rare alleles 7 and 12 were not found in the centenarians, leaving the rest with allele frequency 0.793 for REN 8 allele, 0.092 for REN 10 allele and 0.105 for allele 11. In Tables 3, 4, 5, we show the effects of gene-gene interaction for REN allele 8, 10 and 11 with the mitochondrial haplotypes H, J, K, T and U.
Given the small sample size, we combine the remaining mitochondrial haplotypes to form one group,23 which we refer to as ‘the others’. The P values are calculated from the two-tailed Fisher's exact test. Since we are drawing conclusions on the interaction between each REN allele and the mitochondrial haplotypes, multiple comparisons have to be taken into account, and the significance level has to be adjusted. In Tables 3, 4, 5, there are five mitochondrial variations tested for each REN allele, so we adjust our significance level to 1-(1-0.05)^(1/5)≈0.010 if the original type I error is set to 0.05. This expression is the Šidák form of the adjustment; the Bonferroni correction, 0.05/5=0.010, gives the same threshold to three decimal places. While no significant result is found in Tables 3 and 5, in Table 4 the REN 10 allele shows a remarkable interaction with the mtDNAhapl-H with a P-value of 0.006. The estimated odds ratio is 3.274 and the 95% CI is from 1.405 to 7.630. The result indicates that the presence of both genes may substantially increase the probability of achieving longevity. No additional gene–gene interaction was detected as significant in Table 4.
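The two ingredients of this step, a two-tailed Fisher's exact test and an adjusted significance level for five comparisons, can be sketched as follows. This is an illustrative implementation under our own assumptions, not the software used for Tables 3 to 5: the function name is hypothetical, the two-tailed P value is taken as the sum of all table probabilities no larger than that of the observed table (one common convention), and the example table is a small textbook-style table rather than study data.

```python
from math import comb


def fisher_exact_two_sided(a11, a10, a01, a00):
    """Two-tailed Fisher's exact test for a 2x2 table with fixed margins."""
    row1, row2 = a11 + a10, a01 + a00
    col1 = a11 + a01
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x subjects in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a11)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one; the small tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


alpha, n_tests = 0.05, 5
sidak_level = 1 - (1 - alpha) ** (1 / n_tests)   # form used in the text
bonferroni_level = alpha / n_tests               # Bonferroni correction

p_value = fisher_exact_two_sided(8, 2, 1, 5)     # illustrative table only
print("P =", p_value)
print("Sidak level =", sidak_level, "Bonferroni level =", bonferroni_level)
print("significant after adjustment:", p_value < sidak_level)
```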

Table 3 REN8 and mtDNAhaplotype interactions from Italian Centenarian data
Table 4 REN10 and mtDNAhaplotype interactions from Italian Centenarian data
Table 5 REN11 and mtDNAhaplotype interactions from Italian Centenarian data

Simulation

By investigating the distribution of the odds ratio, we can determine the threshold for a significant odds ratio for the given sample size. The simulation was done by applying the survival model introduced by Yashin et al18 and Tan et al19 with proportional hazard for the interaction. Since the allele frequency for REN10 is 0.092, the frequency of REN10 carriers (both homozygotes and heterozygotes) under Hardy-Weinberg equilibrium is 1-(1-0.092)^2≈0.18, ie roughly 0.2. Our preliminary analysis showed that the frequency for mtDNAhapl-H is about 0.4. Using these frequency parameters and assuming no interaction between the two genes, we simulated 10 000 data sets, each with a sample size of 157 individuals who survived beyond age 100. From the simulated data, we calculated 10 000 odds ratios by applying the centenarian-only approach; their median is 0.999. In Figure 1, we show the simulated distribution of the odds ratio, which peaks at 1 with a long right tail. From this distribution curve, we obtain a very low probability of observing an odds ratio bigger than that for REN10 and mtDNAhapl-H in Table 4 (P=0.0025). This means that, if there is no interaction between the two genes, the probability of having an odds ratio bigger than 3.274 is extremely low.

Figure 1

Simulated distribution of the centenarian-only odds ratio when assuming no gene–gene interaction. 10 000 samples were simulated, each with a sample size of 157 individuals. The median is 0.999; the probability of observing an odds ratio larger than 3.274 is 0.0025.
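The idea behind the simulation can be illustrated with a simplified sketch. Note the assumption: instead of the survival model of Yashin et al and Tan et al used in the paper, the code below draws the four genotype combinations directly from a multinomial distribution under independence, with carrier frequencies 0.2 and 0.4. It is therefore not expected to reproduce the reported figures exactly; the function name, seed and handling of undefined ratios are our own choices.

```python
import random
import statistics


def simulate_null_odds_ratios(n_subjects=157, f_m=0.2, f_n=0.4,
                              n_sets=10_000, seed=1):
    """Centenarian-only odds ratios from data sets simulated with no
    gene-gene interaction (independent carrier status for the two genes)."""
    rng = random.Random(seed)
    # Joint probabilities for (both, M only, N only, neither).
    weights = [f_m * f_n, f_m * (1 - f_n), (1 - f_m) * f_n,
               (1 - f_m) * (1 - f_n)]
    odds_ratios = []
    for _ in range(n_sets):
        counts = [0, 0, 0, 0]
        for genotype in rng.choices(range(4), weights=weights, k=n_subjects):
            counts[genotype] += 1
        a11, a10, a01, a00 = counts
        if a10 == 0 or a01 == 0:
            continue  # odds ratio undefined; skip this (rare) data set
        odds_ratios.append((a11 * a00) / (a10 * a01))
    return odds_ratios


null_ors = simulate_null_odds_ratios()
observed_or = 3.274  # odds ratio for REN10 and mtDNAhapl-H in Table 4
tail_probability = sum(o > observed_or for o in null_ors) / len(null_ors)

print("median of simulated odds ratios:", statistics.median(null_ors))
print("P(OR >", observed_or, ") =", tail_probability)
```

The empirical tail probability plays the role of a one-sided P value for the observed odds ratio under the no-interaction hypothesis.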

Discussion

The application of the centenarian-only approach has detected a significant interaction between REN gene allele 10 and the mitochondrial haplotype H, which could favour longevity. Both genes had been reported to exhibit no effect on survival alone.17,19 However, carriers of the two genes have a significantly increased probability of achieving a long life. As specified in the Methods section, the model makes sense only when our primary interest is in assessing gene–gene interaction. No effect of either gene alone can be estimated by this approach. Although it is limited to interaction only, this non-traditional method has certain advantages that are unmatched by conventional case-control studies. Since the centenarian-only approach does not require controls, crucial issues in the choice of an appropriate control group, which have complicated case-control studies, are avoided.12 This is important since an improper choice of controls could lead to spurious conclusions that distort the study. In addition, this approach attains greater precision in estimating interactions than the traditional case-control design.11,24

However, there are important assumptions that underlie the application of the model. First, before applying this method, one has to make sure that the genes in question are not in linkage disequilibrium. Second, the event associated with gene–gene interaction should be rare. The study of longevity fulfils this assumption since longevity is by definition always a rare event. It was estimated that there were only about 44 centenarians per million population in the developed countries in the year 1990.25 Although the number is increasing very rapidly,26 according to the United Nations' prediction, in the year 2050 centenarians will make up only about 1% of the total population of Japan, which has the highest percentage of centenarians in the world.

Since the model only captures gene-gene interactions, the additive effects of the genes cannot be measured using this approach. This means that the traditional case-control study cannot be replaced by this method when one needs to assess both the additive and non-additive effects on longevity for the genes concerned. However, this simple approach can be used as a powerful screening tool for finding important genes that work together interactively in modulating individual survival.

We must point out that, as in other association studies, the case-only approach also runs into difficulties in a situation where linkage disequilibrium exists. The detected interaction could be due to the fact that one or both of the markers are in linkage disequilibrium with the causal genes. In addition, the choice of subjects in the centenarian-only study should follow the usual rules of case selection for any case-control study.27 Despite these potential problems, the association approach can help to complement further work aimed at localising the specific gene loci. Since centenarians constitute the special part of the population representing successful ageing, the application of the centenarian-only approach in genetic studies on ageing could help to screen important genes that contribute to human health and longevity.