Composite measure of linkage disequilibrium for testing interaction between unlinked loci

Wu, Xuesen; Jin, L; Xiong, Momiao

doi:10.1038/sj.ejhg.5202004

Article
Published: 23 January 2008

Xuesen Wu^1,2,
L Jin^1,3 &
Momiao Xiong^1,4

European Journal of Human Genetics volume 16, pages 644–651 (2008)

An Erratum to this article was published on 21 August 2008

Abstract

Widely used statistical interaction models essentially treat the interaction effect as a residual term and hence are likely to limit the power to detect interaction. Alternatively, interactions between two loci can be understood as irreducible dependencies between loci causing disease or viewed as the linkage disequilibrium (LD) between them. This motivated the development of LD-based statistics for the detection of interaction between two loci. Although LD-based statistics have demonstrated high power to detect interaction between two loci, linkage phase information of marker loci for unrelated individuals is in general unknown. To overcome this limitation, we classify the interaction between two loci into intragametic interaction, which characterizes the interaction of two alleles from different loci on the same haplotype, and intergametic interaction, which characterizes the interaction of two alleles from different loci on different haplotypes. We then show that intragametic and intergametic interaction lead to the corresponding intragametic and intergametic LD. This motivates the use of a composite measure of LD for developing statistics to detect interaction between two unlinked loci. To study the validity of the composite LD-based statistic for testing interaction, we estimate its type 1 error rates by simulation. To evaluate the performance of the composite LD-based statistic for detection of interaction between two loci, we compare its power with that of logistic regression and apply it to two real examples. The preliminary results demonstrate that the composite LD-based statistic is a strong alternative to logistic regression and to the intragametic LD-based statistic for the detection of interaction between two unlinked loci.

Introduction

It is increasingly recognized that common diseases are not consequences of independent actions of the genes, but are caused by complex joint actions of multiple genetic and environmental risk factors. Gene–gene interactions play an essential role in the initiation and development of disease.¹

Despite current enthusiasm for investigation of interactions between genes, the essential issue of how to define and detect gene–gene interaction remains unresolved.² In the past, statistical and biological interactions have often been defined separately. As Rothman et al³ pointed out, ‘The term statistical interaction is intended to denote the interdependence between the effects of two or more factors within the confines of a given model of risk’, and ‘Biological interaction may be defined as the interdependent operation of two or more causes to produce disease’. A core part of statistical interaction is the specification of a statistical model. The most popular models for statistical interaction between genes are additive models, which define the effect of gene interaction as a statistical deviation from the additive effects of single genes in linear models (or logistic regression for qualitative traits). These models were originally proposed by Fisher⁴ and further developed into their modern representations by Cockerham⁵ and Kempthorne.⁶ Statistical interaction models essentially treat the interaction effect as a residual term in genetic analysis, and hence are likely to limit the power to detect interaction.

As an alternative to statistical interaction models, interactions between two loci (or genes) can be understood as irreducible dependencies between loci causing disease.⁷ The purpose of a new definition of interaction is to develop a mathematical representation that is close to the true biological interaction. We use the penetrance of the risk factors to measure the degree to which they cause disease. In a broad sense, interaction corresponds to the situation in which the effect of one locus (gene) is affected by the presence or absence of the other.^{8, 9, 10} The presence of interaction between two loci implies that the two loci share something in common in causing the disease (or phenotype). These shared features lead to an association between the two loci in the disease population, that is, high dependency or correlation between them. In the language of population genetics, the dependency between two loci corresponds to the linkage disequilibrium (LD) between them. In other words, although LD between two loci is not itself the interaction of the effects of their alleles on a disease, LD can be used to detect interaction. If we assume that the controls are sampled from a single isolated population, two unlinked loci are in linkage equilibrium in the controls. However, interaction between the two loci will generate LD in the disease population.¹¹ Therefore, we can use the difference in LD between controls and cases to assess whether interaction between two unlinked loci is present. The level of LD due to interaction in the disease population depends on the magnitude of the interaction between the two loci. This motivated the development of statistics based on deviations from linkage equilibrium in the cases for the detection of interaction between two loci.

Although LD-based statistics have demonstrated high power to detect interaction between two loci, linkage phase information of marker loci for unrelated individuals is in general unknown; only genotype data are available. Experiments for generating haplotype data are expensive and time consuming. Estimation of haplotypes from genotype data inevitably incurs errors, which in turn will increase the number of false-positive findings in the detection of interaction between two loci.¹² The main purpose of this paper is to use unphased genotypes directly to develop statistics for the detection of interaction between two unlinked loci. Hardy–Weinberg disequilibrium at marker loci can be used to develop an association test.¹³ Similarly, the composite measure of LD^{14, 15, 16} uses genotype data to estimate the nonrandom association of alleles from different loci on chromosomes that come from the same parent (intragametic LD) and on chromosomes that come from different parents (intergametic LD), and it has been used to design association tests that allow for unknown linkage phase.^{17, 18, 19} We extend the composite measure of LD to test the interaction between two unlinked loci when only genotype data are available. To achieve this, we first develop a general theory to study intragametic and intergametic LD patterns under two-locus disease models. Then we develop a novel definition and measure of intragametic interaction, which is caused by two interacting alleles from unlinked loci on the same haplotype, and of intergametic interaction, which is caused by two interacting alleles from unlinked loci on different haplotypes. The pattern of intragametic and intergametic LD between two unlinked loci due to gene–gene interaction provides a foundation for developing statistics for the detection of interaction between two loci using genotype data. This motivates us to develop the composite LD-based statistics for testing interactions between two unlinked loci. To study the validity of the composite LD-based statistic for testing interaction, we estimate type 1 error rates of the test statistic using simulation. To evaluate the performance of the composite LD-based statistic for detection of interaction between two loci, we compare its power with that of logistic regression and apply it to two real examples.

Methods

Measure of interaction between two loci

Let D₁ and d₁ be the two alleles at the first disease locus with frequencies , respectively. Let D₂ and d₂ be the two alleles at the second disease locus with frequencies , respectively. Alleles D₁ and d₁ can be indexed by 1 and 2, respectively. At the first disease locus, let D₁D₁ be genotype 11, D₁d₁ be genotype 12 (or d₁D₁ be genotype 21) and d₁d₁ be genotype 22. Thus, the genotypes at the first disease locus can be indexed by ij. The genotypes at the second disease locus are similarly defined and can be indexed by kl. Two-locus genotypes are simply denoted by ijkl for individuals carrying the genotype ij at the first disease locus and kl at the second disease locus. Let f_ijkl be the penetrance of individuals with genotype ijkl. Let P₁₁, P₁₂, P₂₁, and P₂₂ be the frequencies of the haplotypes D₁D₂, D₁d₂, d₁D₂, and d₁d₂ in the general population, respectively. Let P₁₁^A, P₁₂^A, P₂₁^A, and P₂₂^A be the corresponding haplotype frequencies in the disease population. Let P_1/1, P_1/2, P_2/1 and P_2/2 be the frequencies of the allele pairs D₁/D₂, D₁/d₂, d₁/D₂, and d₁/d₂, respectively, where the slash indicates that the two alleles lie on the two chromosomes of an individual that come from different parents. Let P_1/1^A, P_1/2^A, P_2/1^A, and P_2/2^A be the corresponding frequencies in the disease population. Let be the frequencies of the alleles D₁, d₁, D₂, and d₂ in the disease population, respectively.

In general, it is genotypes that have penetrances. For ease of discussion, we introduce the concept of haplotype penetrance. Consider a haplotype with two alleles at different loci on the same chromosome. Then, the penetrance h₁₁ of haplotype D₁D₂ is defined as

In other words, the penetrance of haplotype D₁D₂ is defined as the probability that an individual carrying the haplotype is affected. It is a weighted sum of the penetrances of the genotypes that contain haplotype D₁D₂. The penetrances h₁₂, h₂₁, and h₂₂ are similarly defined.

The penetrance of two alleles at different loci on different chromosomes can be defined as

It is a weighted sum of genotypic penetrances. Similarly, we can define the penetrances h_1/2, h_2/1, and h_2/2. If we assume Hardy–Weinberg equilibrium and genotypic equilibrium in the general population, then we have h₁₁=h_1/1, h₁₂=h_1/2, h₂₁=h_2/1, and h₂₂=h_2/2. Let be the measure of intragametic LD that measures the association of alleles from different loci on the same haplotype¹⁷ and be the measure of intergametic LD that measures the association of two alleles from different loci on different haplotypes¹⁷ in the general population. We can show that haplotype frequencies in the disease population can be expressed as

and

where P_A denotes disease prevalence.
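The displayed expressions did not survive in this copy of the text. As a hedged sketch of where such expressions come from, and not a transcription of the paper's equations, Bayes' rule applied to the haplotype penetrances defined above gives the following form. It assumes that h_ij is the probability of being affected given that an individual carries the haplotype indexed by ij, and likewise for h_i/j.

```latex
\begin{align*}
P_{11}^{A} &= \Pr(D_1D_2 \mid \text{affected}) = \frac{P_{11}\,h_{11}}{P_A}, &
P_{1/1}^{A} &= \frac{P_{1/1}\,h_{1/1}}{P_A},\\
P_A &= \sum_{i=1}^{2}\sum_{j=1}^{2} P_{ij}\,h_{ij}
     = \sum_{i=1}^{2}\sum_{j=1}^{2} P_{i/j}\,h_{i/j}.
\end{align*}
```

The remaining frequencies follow by changing the indices. The second line is the law of total probability and fixes the normalization so that the four frequencies in the disease population sum to one.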

Now we calculate the measures of intragametic and intergametic LD in the disease population under a general two-locus disease model. The measures of intragametic and intergametic LD in the disease population are denoted by respectively. We can show that they are given by

and

where and . We define a measure of intragametic interaction that measures the interaction of two alleles from different loci on the same haplotype as and a measure of intergametic interaction that measures the interaction of two alleles from different loci on different haplotypes as . Then a measure of total interaction between two loci, which consists of intragametic and intergametic interaction, is given by

Equation (5) clearly shows that the interaction between two loci is defined by the penetrance of the two loci. Although the penetrance of the risks is not directly related to the biological process, it is related to the causes of the disease. Therefore, the above definition of interaction may be related to biological interaction. It follows from equations (3), (4) and (5) that the composite measure of LD, (Weir 1996) in the disease population is given by

Absence of interaction between two loci is then defined as

Equation (7) indicates that absence of interaction is analogous to linkage equilibrium. Under linkage equilibrium, the frequency of a haplotype is equal to the product of the frequencies of its component alleles. Similarly, if we assume that the disease is caused only by the two investigated disease loci, absence of interaction between two loci implies that the proportion of individuals carrying two alleles (either on the same chromosome or on different chromosomes) in the disease population is equal to the product of the proportions of individuals carrying each single allele in the disease population. In other words, interaction between two disease susceptibility loci occurs when the contribution of one locus to the disease depends on the other locus. In contrast to the additive model for interaction, which was introduced by Fisher⁴, the interaction model defined by equations (5) and (7) is referred to as a multiplicative interaction model.
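The multiplicative reading of "no interaction" can be illustrated numerically. The following is a minimal sketch at the haplotype (intragametic) level only, assuming that the general population is in linkage equilibrium and that case haplotype frequencies follow from Bayes' rule applied to the haplotype penetrances. The function name `case_haplotype_ld` and all numerical values are hypothetical and do not come from the paper.

```python
def case_haplotype_ld(p1, q1, h):
    """LD between two loci among haplotypes carried by affected individuals.

    p1, q1: population frequencies of alleles D1 and D2 (illustrative values).
    h: dict mapping (i, j), with i, j in {1, 2}, to the haplotype penetrance h_ij.
    Assumes linkage equilibrium in the general population.
    """
    p = {1: p1, 2: 1.0 - p1}
    q = {1: q1, 2: 1.0 - q1}
    # Population haplotype frequencies under linkage equilibrium.
    pop = {(i, j): p[i] * q[j] for i in (1, 2) for j in (1, 2)}
    # Disease prevalence by the law of total probability.
    prevalence = sum(pop[k] * h[k] for k in pop)
    # Haplotype frequencies among cases by Bayes' rule.
    case = {k: pop[k] * h[k] / prevalence for k in pop}
    # Two-locus LD coefficient written as a cross-product difference.
    return case[(1, 1)] * case[(2, 2)] - case[(1, 2)] * case[(2, 1)]


# Penetrances that factor as h_ij = a_i * b_j (multiplicative, no interaction).
multiplicative = {(1, 1): 0.04, (1, 2): 0.02, (2, 1): 0.02, (2, 2): 0.01}
# Same margins except that the D1D2 haplotype carries extra risk.
interacting = {(1, 1): 0.08, (1, 2): 0.02, (2, 1): 0.02, (2, 2): 0.01}

ld_null = case_haplotype_ld(0.3, 0.2, multiplicative)
ld_alt = case_haplotype_ld(0.3, 0.2, interacting)
print(ld_null, ld_alt)
```

Under these assumptions the LD among cases is proportional to h₁₁h₂₂ − h₁₂h₂₁, so it vanishes when the penetrances factor into per-locus terms and is nonzero otherwise.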

Indirect interaction between two unlinked marker loci

In the previous section, we studied interaction between two unlinked disease loci. Now we consider two marker loci, each of which is in LD with either of the two interacting loci. Assume marker M₁ is in LD with disease locus D₁ and marker M₂ is in LD with disease locus D₂. Furthermore, we assume that two disease loci D₁ and D₂ are unlinked. Let be the measures of intragametic and intergametic LD between two marker loci in the disease population, respectively. We denote the composite measure of LD between two marker loci by . Let δ_i be the LD measure between marker M_i and disease locus D_i (i=1,2) in the general population. Then, we can show that (Appendix A)

It is clear that when the marker loci are the disease loci themselves, are reduced to . Equation (8) can also be written in terms of the measure of interaction between two unlinked loci

Since , the absolute value of the LD measure between two unlinked marker loci in the disease population, for example, the composite measure of LD between two marker loci will be less than or equal to the absolute value of the composite measure of LD between two unlinked disease loci in the disease population.

Test statistic

In the previous section, we showed that under the multiplicative disease model, interaction between unlinked loci will create LD. Intuitively, we can test interaction by comparing the difference in the composite genotypic disequilibrium between two unlinked loci between cases and controls. Precisely, if we denote the estimators of the composite LD measures in cases and controls by , respectively, then the test statistic can be defined as

where

are similarly defined for controls, the formula for calculations of the composite measure of LD in cases and controls is given in Weir (1996),¹⁵ are defined as before, are their estimators, the quantities n_A and n_G denote the number of sampled individuals in cases and controls, respectively; the variance of the composite LD measure was the large-sample variance.¹⁵ Under the null hypothesis and assumption of the Hardy–Weinberg equilibrium, the variance of the composite measure of LD in cases and controls becomes . When sample size is large enough to ensure application of large sample theory, test statistic T_I is asymptotically distributed as a central χ₍₁₎² distribution under the null hypothesis of no interaction (both intragametic and intergametic interactions) between two unlinked loci and assumption of the Hardy–Weinberg equilibrium.
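The formulas for T_I did not survive in this copy, so the following is only a sketch of one statistic consistent with the description above, not the authors' implementation. It assumes the standard composite LD estimator computed from genotype dosages (half the covariance between allele counts at the two loci) and the large-sample variance p_A(1−p_A)p_B(1−p_B)/n that applies under Hardy–Weinberg equilibrium and no LD. The names `composite_ld` and `interaction_test` and the toy data are hypothetical.

```python
import math


def composite_ld(genotypes):
    """Composite LD estimate from unphased two-locus genotypes.

    genotypes: list of (x, y) pairs, where x and y are the numbers of copies
    (0, 1 or 2) of the reference allele at locus 1 and locus 2.
    Returns (delta, p_a, p_b, n).
    """
    n = len(genotypes)
    mean_x = sum(x for x, _ in genotypes) / n
    mean_y = sum(y for _, y in genotypes) / n
    cov = sum(x * y for x, y in genotypes) / n - mean_x * mean_y
    # Half the dosage covariance equals P_AB + P_A/B - 2 * p_A * p_B.
    return cov / 2.0, mean_x / 2.0, mean_y / 2.0, n


def interaction_test(cases, controls):
    """Squared difference in composite LD, scaled by its null variance."""
    d_a, pa_a, pb_a, n_a = composite_ld(cases)
    d_g, pa_g, pb_g, n_g = composite_ld(controls)
    # Assumed null variance under HWE and no LD in each group.
    var_a = pa_a * (1 - pa_a) * pb_a * (1 - pb_a) / n_a
    var_g = pa_g * (1 - pa_g) * pb_g * (1 - pb_g) / n_g
    t = (d_a - d_g) ** 2 / (var_a + var_g)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(t / 2.0))
    return t, p_value


# Toy data: the two dosages are uncorrelated in controls, correlated in cases.
controls = [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1), (1, 1)]
cases = [(0, 0), (2, 2), (0, 0), (2, 2), (1, 1), (1, 1)]
t_stat, p_val = interaction_test(cases, controls)
print(t_stat, p_val)
```

The toy sample is far too small for the asymptotic approximation to be trusted; it only shows the mechanics. Working from dosages avoids any phase inference, which is the point of the composite measure.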

In theory, we can use a case-only design to study interaction between two loci. However, in practice, background LD between two unlinked loci may exist in the population due to many unknown factors. Therefore, the test statistic based on the case–control design is more robust than the statistic based on the case-only design.

Results

Type 1 error rates of test statistics

To examine the validity of the statistic for testing interaction, we performed a series of simulation studies. The computer program SNaP²⁰ was used to generate two-locus genotype data for the sampled individuals. A total of 20 000 individuals, equally divided into cases and controls, were generated in the general population, assuming genotypic equilibrium (both intragametic and intergametic equilibria) between the two loci. We randomly sampled 100–400 individuals from each of the cases and controls for the calculation of the type I error rates. A total of 10 000 simulations were performed. Table 1 shows that the estimated type I error rates of the statistic T_I for testing the interaction between two unlinked loci were not appreciably different from the nominal levels α=0.05, α=0.01 and α=0.001.
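A much simplified version of this kind of check can be sketched as follows. It does not use SNaP or the paper's sampling scheme: genotypes for both groups are drawn directly under Hardy–Weinberg and genotypic equilibrium, and the statistic is the assumed dosage-covariance form of the composite LD test with its null variance. The allele frequencies, sample size, number of replicates and seed are arbitrary illustrative choices.

```python
import random


def t_statistic(cases, controls):
    """Assumed composite LD test statistic (1 df) from genotype dosages."""
    parts = []
    for group in (cases, controls):
        n = len(group)
        mean_x = sum(x for x, _ in group) / n
        mean_y = sum(y for _, y in group) / n
        cov = sum(x * y for x, y in group) / n - mean_x * mean_y
        p_a, p_b = mean_x / 2.0, mean_y / 2.0
        # Composite LD estimate and its null variance under HWE.
        parts.append((cov / 2.0, p_a * (1 - p_a) * p_b * (1 - p_b) / n))
    (d_a, v_a), (d_g, v_g) = parts
    return (d_a - d_g) ** 2 / (v_a + v_g)


def draw_genotypes(rng, n, p_a, p_b):
    """Draw n two-locus genotypes under HWE and genotypic equilibrium."""
    def dosage(p):
        # Two independent allele draws give a Binomial(2, p) dosage.
        return int(rng.random() < p) + int(rng.random() < p)
    return [(dosage(p_a), dosage(p_b)) for _ in range(n)]


def type1_error(n_per_group=200, reps=2000, p_a=0.3, p_b=0.4, seed=1):
    """Fraction of null replicates rejected at the nominal 0.05 level."""
    rng = random.Random(seed)
    critical = 3.841458820694124  # 0.95 quantile of chi-square with 1 df
    rejections = 0
    for _ in range(reps):
        cases = draw_genotypes(rng, n_per_group, p_a, p_b)
        controls = draw_genotypes(rng, n_per_group, p_a, p_b)
        if t_statistic(cases, controls) > critical:
            rejections += 1
    return rejections / reps


rate = type1_error()
print(rate)
```

With the statistic well calibrated, the printed rate should be close to 0.05, up to Monte Carlo error of roughly 0.005 for 2000 replicates.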

Table 1 Type 1 error rates of the test statistic T_I to test interaction between two unlinked loci in a homogeneous population

Power evaluation

To evaluate the performance of the composite LD-based statistic in testing gene–gene interaction, we compared the power of the statistic employing composite measure of the LD to that of the logistic model. We use the genotype coding scheme in QUANTO²¹ for power calculations. Specifically, we considered two types of genotype coding (genetic covariate variables). For a dominant model, homozygous wild type, heterozygous, and homozygous mutant genotypes were coded as 0, 1, and 1, respectively. For an additive model, they were coded as 0, 1, and 2, respectively. We considered two loci, denoted as G and H, respectively. We assume the following logistic model:

where is the baseline probability of disease in the population,

are the odds ratios for G when H=0, H when G=0 and interaction G × H, respectively.²¹ Power for both the composite LD-based statistic and logistic regression²² was calculated by simulation. The computer program SNaP²⁰ was used to generate 10 000 cases and 10 000 controls with unlinked two-locus genotype data. Two-locus interaction effects were simulated for two-locus dominant and additive models with penetrance functions as given in Gauderman (2002).²¹ Five hundred individuals were randomly sampled from each of the cases and controls. A total of 10 000 simulations were performed. Figures 1a and 1b present the power comparisons between the logistic regression model and the composite LD-based statistic under the following two genetic interaction models: dominance × dominance and additive × additive. Figures 1a and 1b show that the power of both logistic regression and the composite LD-based statistic in detecting gene–gene interaction was a monotonically increasing function of the interaction odds ratio, a widely used measure for quantifying the strength of interaction between two loci. This implies that the proposed new interaction measure and test statistic are closely related to the traditional interaction measure. We can also see that the power of the composite LD-based statistic T_I is higher than that of the logistic regression model.
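The logistic model referred to at the start of this subsection lost its display and symbols in this copy. A standard two-locus logistic interaction model that matches the verbal description (a baseline probability, an odds ratio for G when H=0, one for H when G=0, and one for the G × H interaction) is sketched below. The symbols β₀, β_G, β_H, β_GH, p₀, R_G, R_H and R_GH are assumed notation, not necessarily the paper's.

```latex
\begin{align*}
\operatorname{logit}\Pr(\text{disease}\mid G,H)
  &= \beta_0 + \beta_G\,G + \beta_H\,H + \beta_{GH}\,G H,\\
p_0 &= \frac{e^{\beta_0}}{1+e^{\beta_0}},\qquad
R_G = e^{\beta_G},\quad R_H = e^{\beta_H},\quad R_{GH} = e^{\beta_{GH}}.
\end{align*}
```

Here G and H take the coded values described above (0, 1, 1 for the dominant model; 0, 1, 2 for the additive model), p₀ is the disease probability when G = H = 0, and testing for interaction in this model amounts to testing R_GH = 1.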

Application to real data examples

To further evaluate its performance for detecting interaction between two unlinked loci, the proposed test statistic T_I was applied to two real examples. The first example was a breast cancer case–control study. A total of 398 Caucasian breast cancer cases and 372 matched controls were sampled from the Ontario Familial Breast Cancer Registry (OFBCR).²³ Nineteen SNPs from 18 key genes in DNA repair, cell cycle, carcinogen/estrogen metabolism, and the immune system were typed. All SNPs were in Hardy–Weinberg equilibrium. Using multivariate logistic analysis under codominant models, four pairs of genes showed significant interactions: XPD and IL10, GSTP1 and COMT, COMT and CCND1, and BARD1 and XPD.²³ We used the statistic T_I to test interactions between these four pairs of genes. The test results are summarized in Table 2, where the crude P-values are taken from Table 4 of Onay et al (2006).²³ The crude P-values were obtained from a multivariate logistic regression analysis that includes all main effects and only the interaction of interest under the codominant models. As shown in Table 2, logistic regression analysis identified interactions between XPD-(Lys751Gln) and IL10-(G(-1082)A), BARD1-(Pro24Ser) and XPD-(Lys751Gln), COMT-(Met108/158Val) and CCND1-(Pro241Pro), and GSTP1-(Ile241Val) and COMT-(Met108/158Val). After the more conservative Bonferroni adjustment, however, none of these interactions was significant.²³ Table 2 shows that the P-values based on the test statistic T_I were smaller than those based on traditional logistic regression analysis for XPD-(Lys751Gln) and IL10-(G(-1082)A), but larger for BARD1-(Pro24Ser) and XPD-(Lys751Gln), and for COMT-(Met108/158Val) and CCND1-(Pro241Pro).

Table 2 Comparison of P-values for testing gene-gene interactions (Example 1)

A popular point of view is that statistics using haplotype data usually have smaller P-values than statistics using genotype data. To examine this statement, the second example is a coronary heart disease study in Shanghai, China, in which 812 SNPs in 176 genes were typed for 1320 cases and 1129 controls. Atherosclerosis is the primary cause of coronary heart disease.^{24, 25} Although a majority of the results demonstrate that the P-values of the intragametic LD-based statistic are smaller than those of the composite LD-based statistic (data not shown), we can still find many pairs of SNPs for which the P-values of the composite LD-based statistic are smaller than those of the intragametic LD-based statistic and logistic regression. Here, we report the results for 10 such pairs of SNPs in Table 3. For all 10 pairs, the P-values of the composite LD-based statistic are smaller than those of the intragametic LD-based statistic. This indirectly suggests that both intragametic and intergametic interactions may exist, generating intragametic and intergametic LD, respectively. The composite LD is the sum of the intragametic and intergametic LD. When the intragametic and intergametic LD have the same sign, the absolute value of the composite LD is larger than that of either component. Therefore, in this case the P-values of the composite LD-based statistic will be smaller than those of the intragametic LD-based statistic.

Table 3 P-values for testing interaction between unlinked loci in CAD study (example 2)

Discussion

For almost a century, interaction between loci has been defined as a deviation from the sum of the genetic main effects of the individual loci. As an alternative to the additive model of interaction, we have shown that the interaction between loci can be interpreted as irreducible dependencies between them. In genetics, dependencies between loci can be understood as LD. If two loci in the general population are in linkage equilibrium (that is, independent), their departure from equilibrium in the disease population is often attributed to the interaction between them. Therefore, the LD due to interaction between two loci can be used to measure the magnitude of the interaction.

The most popular measure of LD is the intragametic LD measure that quantifies nonrandom association of two alleles from different loci on the same haplotype. The major limitation of using the intragametic LD measure to test for interaction is that in practice, haplotype data are often unavailable. Although a number of algorithms for estimation of haplotypes have been developed, the errors of haplotype estimation are inevitable. This will lead to inaccuracy in the detection of interactions between loci. To overcome this limitation, we proposed to use the composite measure of LD based on genotype data for detection of interactions between loci.

To gain a deep understanding of intragametic and intergametic interactions, we first developed the general theory to study composite LD patterns in the disease population under two-locus disease models. We introduced a new concept of intragametic and intergametic penetrance and developed a measure of interaction between two unlinked loci, including both intragametic and intergametic interactions. The theoretic analysis of the intragametic and intergametic LD motivated us to use a composite measure of LD for developing statistics to test interactions.

We examined the distribution of the composite LD-based statistic under the null hypothesis of no interaction and calculated type 1 error rates of the proposed statistic by simulation. Our results showed that type 1 error rates were close to nominal significance levels. The composite LD-based statistic has two remarkable features. First, the calculation of the composite LD-based statistic does not require linkage phase information. Therefore, the results of the composite LD-based statistic are more reliable than those of the intragametic LD-based statistic. Second, the power of the composite LD-based statistic may not always be less than that of the intragametic LD-based statistic. Although we showed by simulation that, in general, the composite LD-based statistic has higher power than logistic regression under the dominant and additive two-locus disease models, the critical question is whether there are situations in which the composite LD-based statistic has higher power than the traditional (intragametic) LD-based statistic. The preliminary results of the real data analysis showed that in some cases, P-values of the composite LD-based statistic may be smaller than those of the intragametic LD-based statistic. Equations (6) and (10) show that, in theory, the composite LD-based statistic varies from half of the intragametic LD-based statistic to twice the intragametic LD-based statistic, depending on the ratio of the intergametic LD to the intragametic LD. Therefore, when the intergametic LD is comparable with the intragametic LD, the composite LD-based statistic may have higher power than the intragametic LD-based statistic.

Although the composite LD-based statistic has merit, it also has potential limitations. First, in addition to interaction, Hardy–Weinberg disequilibrium (HWD) may also increase the composite LD. Small P-values of the composite LD-based statistic may therefore be caused by HWD rather than by interaction. Although this is not a problem for association studies of two loci with the disease, it is a problem for gene–gene interaction analysis. Second, as with other population-based methods, population substructure may generate LD and hence create spurious interactions. Third, the methods presented in this report require that the two loci are unlinked.

In summary, our results suggest that the composite LD-based statistic is an alternative to the traditional logistic regression or the haplotype-based LD statistics.

References

1. Ay N: Locality of global stochastic interaction in directed acyclic networks. Neural Comput 2002; 14: 2959–2980.
2. Clayton D, McKeigue PM: Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet 2001; 358: 1356–1360.
3. Rothman KJ, Greenland S, Walker AM: Concepts of interaction. Am J Epidemiol 1980; 112: 467–470.
4. Fisher RA: The correlation between relatives on the supposition of Mendelian inheritance. Trans R Soc Edinburgh 1918; 3: 399–433.
5. Cockerham CC: An extension of the concept of partitioning hereditary variance for analysis of covariance among relatives when epistasis is present. Genetics 1954; 39: 859–882.
6. Kempthorne O: The correlation between relatives in a random mating population. Proc R Soc Lond B 1954; 143: 103–113.
7. Jakulin A, Bratko I: Analyzing attribute dependencies; in Lavrač N, Gamberger D, Blockeel H, Todorovski L (eds): Proceedings of Principles of Knowledge Discovery in Data (PKDD); LNAI, 2003; 2838: 229–240.
8. Cordell HJ: Epistasis: what it means, what it doesn’t mean, and statistical methods to detect it in humans. Hum Mol Genet 2002; 11: 2463–2468.
9. Hansen TF, Wagner GP: Modeling genetic architecture: a multilinear theory of gene interaction. Theor Popul Biol 2001; 59: 61–86.
10. Wagner GP, Laubichler MD, Bagheri-Chaichian H: Genetic measurement theory of epistatic effects. Genetica 1998; 102/103: 569–580.
11. Zhao J, Jin L, Xiong MM: Test for interaction between two unlinked loci. Am J Hum Genet 2006; 79: 831–845.
12. Fallin D, Schork NJ: Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data. Am J Hum Genet 2000; 67: 947–959.
13. Nielsen DM, Ehm MG, Weir BS: Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am J Hum Genet 1998; 63: 1531–1540.
14. Weir B: Inferences about linkage disequilibrium. Biometrics 1979; 35: 235–254.
15. Weir B: Genetic Data Analysis II. Sunderland, MA: Sinauer Associates, 1996.
16. Weir BS, Cockerham CC: Complete characterization of disequilibrium at two loci; in Feldman MW (ed): Mathematical Evolutionary Theory. Princeton, NJ: Princeton University Press, 1989, pp 86–110.
17. Schaid DJ: Linkage disequilibrium testing when linkage phase is unknown. Genetics 2004; 166: 505–512.
18. Zaykin DV, Meng Z, Ehm MG: Contrasting linkage-disequilibrium patterns between cases and controls as a novel association-mapping method. Am J Hum Genet 2006; 78: 737–746.
19. Nielsen DM, Ehm MG, Zaykin DV, Weir BS: Effect of two- and three-locus linkage disequilibrium on the power to detect marker/phenotype associations. Genetics 2004; 168: 1029–1040.
20. Nothnagel M: Simulation of LD block-structured SNP haplotype data and its use for the analysis of case–control data by supervised learning methods. Am J Hum Genet 2002; 4 (Suppl.): A2363.
21. Gauderman WJ: Sample size requirements for matched case–control studies of gene-gene interaction. Am J Epidemiol 2002; 155: 478–484.
22. Millstein J, Conti DV, Gilliland FD, Gauderman WJ: A testing framework for identifying susceptibility genes in the presence of epistasis. Am J Hum Genet 2006; 78: 15–27.
23. Onay VU, Briollais L, Knight JA et al: SNP-SNP interactions in breast cancer susceptibility. BMC Cancer 2006; 6: 114.
24. Lusis AJ, Mar R, Pajukanta P: Genetics of atherosclerosis. Annu Rev Genomics Hum Genet 2004; 5: 189–218.
25. Libby P: Inflammation in atherosclerosis. Nature 2002; 420: 868–874.

Acknowledgements

M Xiong is supported by NIH-NIAMS Grant P01 AR052915-01A1, NIH Grant HL74735, and ES09912 in the US and Shanghai Commission of Science and Technology Grant 04dz14003 in China. L Jin is supported by Shanghai Commission of Science and Technology Grant 04dz14003, China. X Wu is supported by Shanghai Commission of Science and Technology Grant 04dz14003 and postdoctoral fund 05R214115 from Shanghai, China.

Author information

Authors and Affiliations

School of Life Science, Fudan University, Shanghai, China
Xuesen Wu, L Jin & Momiao Xiong
Department of Epidemiology and Statistics, Bengbu Medical College at Bengbu, Anhui, China
Xuesen Wu
CAS-MPG Partner Institute of Computational Biology, SIBS, CAS, Shanghai, China
L Jin
Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX, USA
Momiao Xiong

Corresponding author

Correspondence to Momiao Xiong.

Appendix A

Assume that marker locus M₁ has two alleles M₁ and m₁, and the marker locus M₂ has two alleles M₂ and m₂. Let be the frequencies of the marker alleles M₁ and M₂ in the disease population, respectively. Let the frequencies of the haplotypes D₁M₁, D₁m₁, d₁M₁ and d₁m₁ be , respectively. The frequencies of the haplotypes D₂M₂, D₂m₂, d₂M₂, and d₂m₂ can be similarly defined. Let the frequencies of the haplotypes M₁M₂, M₁m₂, m₁M₂, and m₁m₂ in the disease population be q₁₁^A, q₁₂^A, q₂₁^A, and q₂₂^A, respectively. Then, we have

Similarly, we have

Note that

It follows from equation (A3) that

Similarly, we have

From equations (A2), (A3) and (A4) we obtain that

Thus,

Similarly, we have

Combining equations (A6) and (A7) yields

Wu, X., Jin, L. & Xiong, M. Composite measure of linkage disequilibrium for testing interaction between unlinked loci. Eur J Hum Genet 16, 644–651 (2008). https://doi.org/10.1038/sj.ejhg.5202004

Received: 03 April 2007
Revised: 10 December 2007
Accepted: 13 December 2007
Published: 23 January 2008
Issue Date: May 2008
DOI: https://doi.org/10.1038/sj.ejhg.5202004
