Introduction

Twenty years ago, variations in human DNA were recognized as genetic markers in linkage study.1 After that, the advances in molecular biology and computational technology have led to mapping several human inherited disease genes. Using restriction fragment length polymorphism (RFLP) markers and polymorphic micro-satellite loci, linkage analysis and positional cloning have been used successfully in mapping the chromosome locations of Mendelian disease genes. The success mainly depends on one premise that the disease genes of Mendelian traits have a large effect on the phenotypes.2 In fact, there is usually a one-to-one correspondence between disease gene genotypes and the disease phenotype for Mendelian traits. Moreover, the correlations between genotypes and phenotype of Mendelian traits are strong. Given sufficient family data, Mendelian traits can be mapped with high probability by linkage analysis.

With the encouragement of successful mapping Mendelian trait genes, there has been growing interests and endeavors in the study of complex traits such as asthma and diabetes. For complex diseases, the inheritance patterns and phenotype definitions as with genetic etiology are much more complex. The trait/affection status is usually a continuous variable.3 The mapping of complex disease genes is much harder. Novel statistical methods such as both linkage analysis and linkage disequilibrium (LD) mapping or association study are needed in dissecting complex traits. As very dense marker maps such as single nucleotide polymorphism (SNP) are available,4 both linkage analysis and association study are utilized simultaneously for mapping complex disease loci.5,6 Almasy et al7 and Fulker et al8 proposed to use combined linkage and association analysis for quantitative trait loci (QTL). Sham et al9 studied the power of linkage versus association analysis of quantitative traits by analytically calculating non-centrality parameters of test statistics. Abecasis et al10,11,12 proposed test statistics of association studies for quantitative traits in nuclear families, general pedigrees, and selected samples. Cardon13 studied a sib-pair regression model of LD for quantitative traits. All these researches concentrated on family data which include sib-pairs, and used only one marker in analysis. In Fan and Xiong,14 we proposed a linear regression method of high resolution mapping of quantitative trait loci by LD mapping analysis. The method is based on population data. Using two flanking markers, the regression models have higher power than that of models using only one marker.14

It is well-known that family pedigree data can be used in both linkage analysis and association study, and population data can be used in association study. Hence, it is necessary to consider a method to combine both population data and family pedigree data in the analysis. In this paper, we propose to perform both linkage analysis and high resolution LD mapping for QTL based on combined family and population data. Linkage interval mapping is based on family data, and LD mapping is based on both family pedigree and population data. Based on variance component models, we construct likelihood to analyse family and population data in Section of Models. Then, we discuss the parameter estimations and regression coefficients. The linkage information, i.e., recombination fractions, is contained in the variance-covariance matrix, and the association information, i.e., the LD coefficients, is contained in the mean parameters or the regression coefficients. We calculate the non-centrality parameters for association study and linkage analysis, respectively. Using the non-centrality parameters, we perform power calculations and comparisons. The technical details to calculate the regression coefficients, parameters, non-centrality parameters are left in the Appendixes.

Models

Consider a quantitative trait locus Q which has two alleles Q1 and Q2. Suppose that the allele frequencies of Q1 and Q2 are q1 and q2, respectively. Assume that two markers A and B flank the trait locus Q in an order of AQB. Marker A has two alleles A and a with frequencies PA and Pa, respectively. Marker B has two alleles B and b with frequencies PB and Pb, respectively. For a nuclear family of k children and two parents, let us denote their quantitative traits by a vector y=(yf, ym, y1, · · · , yk)τ, genotypes at marker A by a vector (Af,Am,A1, · · ·, Ak)τ, and genotypes at marker B by a vector (Bf,Bm,B1,· · ·, Bk)τ. Here yf is the trait value of the father, Af is the genotype of the father at marker A, and Bf is the genotype of the father at marker B. Other notations are defined, similarly, for the mother with subscript m and for the i-th child with subscript i. The log-likelihood is defined by . The notations of the log-likelihood are defined as follows. For the mean component Xμ, we consider the following regression equation such as model (1) in Fan and Xiong14

where β is overall mean, wi is a row vector of covariates such as gender and age, γ is a column vector of regression coefficients for the covariates wi, Gi is polygenic effect, ei is error term. Assume that Gi is normal , and ei is normal . Moreover, Gi and ei are independent. xAi, xBi, zAi and zBi are dummy variables defined by

αA, αB, δA, and δB are regression coefficients of the dummy variables xAi, xBi, zAi and zBi.

and μ=(β, γτ, αA, αB, δA, δB)τ is a vector of regression coefficients. Σ is a (k+2)×(k+2)

variance-covariance matrix defined as

is variance explained by the putative QTL Q, is polygenic variance, and is error variance. The genetic variances are decomposed into additive and dominant components. is correlation between parents and children, is correlation between the i-th child and the j-th child, πijQ is the proportion of alleles shared identical by descent (IBD) at QTL Q by the i-th child and the j-th child, and ΔijQ is the probability that both alleles at QTL Q shared by the i-th child and the j-th child are IBD.

For population data, an intuitive rationale of regression model (1) is given in Fan and Xiong14. In general, one can construct a variance-covariance matrix for any type of pedigree in a similar way as above. Assume that there are two independent sub-samples of data: (1) population data: n independent individuals; (2) family data: In (I>n) independent families. Let us list the log-likelihood of the n independent individuals by L1, · · · , Ln, and the likelihood of the In families by Ln+1, · · · , LI. Then the overall log-likelihood is . The unknown parameters are μ= . Using the likelihood ratio tests, one may test statistical significance of the parameters of interest.

Parameter estimations and regression coefficients

Regression coefficients

Let μij be the effect of genotype QiQj, i, j=1, 2, μ1221. Denote the population effect mean by and define αQ=q1μ11 +(q2−q112−q2μ22, δQ=2μ12μ11μ22. If μ11=a, μ12=d, and μ22=–a as in the traditional quantitative genetics,15 αQ= a+(q2q1)d is the average allele substitution effect, and δQ =2d characterizes the dominant effect. In general, one may define a111122)/2 and d121122)/2. It is well known that the additive variance and the dominant variance . A true random effect model describing the trait value is yi=β+wiγ+gi+Gi+ei, where

Denote LD coefficient between trait locus Q and marker A by DAQ=P(AQ1)–q1PA, LD coefficient between trait locus Q and marker B by DQB=P(BQ1)–q1PB, and LD coefficient between marker A and marker B by DAB=P(AB)–PAPB. Let the additive and dominant variance–covariance matrices be

Moreover, let us denote three ratios . As in Appendix B,14 we can show that the coefficients of regression equation (1) are given by

Parameters of variance–covariances

Denote the recombination fraction between trait locus Q and marker A by θAQ, the recombination fraction between trait locus Q and marker B by θQB, and the recombination fraction between marker A and marker B by θAB. Fulker and Cardon16 proposed to estimate the proportion πijQ of allele IBD at putative QTL Q for a sib-pair i and j by where πijA and πijB are the IBD proportions of alleles shared at the marker A and marker B, respectively. The coefficients απ, βπA and βπB are given by

Let ΔijA, ΔijB be the probability of sharing two alleles IBD at markers A and B for a pair of sibs, respectively. In Fan,17 we proposed to estimate ΔijQ by equation . Under the assumption of no interference, the coefficients are as follows (Fan17):

where Assuming that the positions of marker A and marker B are known, θAB can be calculated through Haldane's map function. Then only one of θAQ and θQB is unknown since the other can be calculated through Trow's formula.18 For general relatives i and j, Almasy and Blangero19 proposed an algorithm to calculate the proportion πijQ of allele IBD at putative QTL Q, and the expected probability ΔijQ that both alleles at QTL Q are IBD. In Fan,17 we derived formulas to calculate the covariances of trait values for a few types of relatives directly without performing matrix operations.

Association and linkage studies

From equations (3) and (4), we can see that the coefficients of LD (i.e., DAQ and DQB) and gene effects (i.e., αQ and δQ) are contained in the regression coefficients. Moreover, we show in the above paragraph that the linkage parameters (i.e., recombination fractions θAQ, θQB and θAB) are contained in the variance-covariance matrix. Assume that markers A and B are in LD with the trait locus Q, i.e., DAQ≠0,DQB≠0. We may simultaneously test LD of marker A and marker B with trait locus Q, the gene substitution and dominant effects by testing αABAB=0. From equation (3), we may test LD of markers A and B with the trait locus Q and the gene substitution effect αQ by testing αAB=0. From equation (4), we may test LD of markers A and B with the trait locus Q and the dominant effect by testing δAB=0.

To test linkage, one may use the likelihood ratio test of the log-likelihood L. Under the null hypothesis of no linkage between the major trait locus Q and the markers, θAQ= θQB=1/2. Under the alternative hypothesis of linkage, θAQ≠1/2 or θQB≠1/2. By comparing the difference of maximum log-likelihoods under the alternative and null hypotheses, we may use χ2 statistic to test the linkage. We will derive analytical formulas to explore the linkage interval mapping by the nuclear families in a similar way to Sham et al9 according to statistical theory of likelihood ratio tests.20

Non-centrality parameters of association study

Assume that there are no covariates. Then μ=(β, αA, αB, δA, δB)τ. Consider the overall log-likelihood , where Li is the log-likelihood of trait values yi of the i-th family or individual. Let Σi be the variance-covariance matrix of yi, and Xi be its model matrix. Denote the total trait values , the total variance–covariance matrix by Σ=diag1, · · · ,ΣI), and the model matrix . Let be the maximum likelihood estimators of β, αA, αB, δA, δBi, Σ. The estimate of μ is . Let H be a q×5 test matrix of rank q. Suppose that the total number of individuals is N. By Graybill,21 Chapter 6, the test statistic of a hypothesis Hμ=0 is non-central F(q, N–5) defined by

The non-centrality parameter of the test statistic F can be calculated by . If the data are composed of n individuals of a population, Fan and Xiong14 worked out the non-centrality parameters to test if there are allele substitution and/or dominant effects and LDs between the markers and the major gene locus. In the following, we discuss a situation that the data are composed of both individual population data and family data.

Suppose that there are n individuals of a population, and n is sufficiently large. For each yi of the n individuals, Σi2 and Xi=(1 xAi xBi zAi zBi), i=1, 2,· · ·, n. From formulas in Fan and Xiong,14 Appendix A and Appendix B, we may show that

where VA and VD are additive and dominant variance-covariance matrices given in (2).

Secondly, suppose that there are m trio families, and m is sufficiently large. A trio family is composed of both parents and a single child. Notice that the means of xAi, xBi, zAi and zBi are 0. Let Kf =(xAf xBf zAf zBf) and Km=(xAm xBm zAm zBm). We show in Appendix A that the covariance matrix between parents and their offspring is

where K1=(xA1 xB1 zA1 zB1) and O2 is zero 2×2 matrix. For each of the trio families, the variance–covariance Σi is a 3×3 matrix whose inverse is

Using equations (5), (6), and (7), we show in Appendix B

Thirdly, suppose that there are k nuclear families each of them has both parents and two offspring, and the correlation of the two offspring is ρ12. Assume that k is sufficiently large. For each family, the variance–covariance Σi is a 4×4 matrix whose inverse is

where . In Appendix C, we show that the covariance matrix between two offspring is

Using equations (5), (6), (9) and (10), we show in Appendix D that

where the constants are given by Combine the n individuals, m trio families, and k families with two offspring. Define . Then equations (5), (8) and (11) lead to

To test if there are additive and dominant effects, we may test the hypothesis HAB,ad: αABAB=0. Then the test matrix H is defined by

Let us denote the corresponding F-test statistic by FAB,ad, and the non-centrality parameter by λAB,ad. Then we have from (3), (4), and (12) that

Assume that the two markers A and B are in linkage equilibrium, then DAB=0. Moreover, assume that the trait locus Q is in LD with marker A but not with marker B, then DQB=0 and DAQ≠0. Then , which only involves marker A and can be written as λA,ad. Correspondingly, we denote the F-test statistic by FA,ad. Similarly, is the non-centrality parameter of a test statistic FA,a. To test the other hypotheses, we may get the non-centrality parameters in a similar way by taking appropriate test matrices H. To test if there is dominant effect, we may test the hypothesis HAB,d : δAB=0. The non-centrality parameter is . To test if there is an additive or substitution effect, we may test the hypothesis . The non-centrality parameter is . The corresponding F-test statistic is denoted by FAB,a.

Non-centrality parameters of linkage studies

Consider a nuclear family with k children and both parents. Under the null hypothesis of no linkage between the trait locus and markers, the correlation of each sib-pair is

The expected log-likelihood is Under the alternative hypothesis of linkage between the trait locus and marker A, the correlation between a sib-pair is . From Haseman and Elston,22 Table IV, we have

The expected log-likelihood under the alternative hypothesis of linkage is

where PijA=0)=PijA=1)=1/4 and PijA=1/2)=1/2. From Stuart and Ord,20 the non-centrality parameter for linkage of the nuclear family is equal to λlinkage,A=E(2Lrandom,A)–E(2LNull). If k=2, it can be shown that .

Under the alternative hypothesis of linkage between the trait locus and markers A and B, the correlation between a sib-pair is given by for i, j=0, 1, 2

To calculate Cij, we need to calculate the joint distribution of π12A, π12Q and π12B of a sib-pair under the alternative hypothesis of linkage. Assume that there is no interference for disjoint regions of the chromosome. Then we have

From Haseman and Elston,22 Table IV, we may construct the joint distribution of π12Q, π12A and π12B by relation (15), and the results are presented in Table 3 of Fan.17 Based on the results, we can calculate Cij, i, j=0, 1, 2, which are given in Appendix D of Fan.17 The expected log-likelihood under the alternative hypothesis of linkage is E(2 L r a n d o m , A B )  = - ( k + 2 ) [ log ( 2 π σ 2 ) + 1 ]  -  Σ π 12 A Σ π 12 B Σ π k -1, k A Σ π k -1, k B P ( π 12 A ) P ( π 12 B ) P ( π k - 1 , k A ) P ( π k -1, k B )

where PijB=0)=PijB=1)=1/4 and PijB=1/2)=1/2 such as those for marker A. From Stuart and Ord20, the non-centrality parameter for linkage of the nuclear family is equal to λlinkage,AB=E(2Lrandom,AB)–E(2LNull). If k=2, it can be shown that .

Power calculation and comparison

Let us denote heritability by h2 which is defined by . In the power calculations, we take the additive polygenic variance , polygenic dominant variance , the equal allele frequencies PA=q1=PB=0.5 at the two markers A and B, and the QTL Q. Moreover, suppose that μ11=a, μ1221=d and μ22=–a.

Suppose that the map distance λAB between marker A and marker B is known. Under the assumption of no interference, we may calculate the recombination fraction by Haldane's map function θAB=[1–exp(–2λAB)]/2. Similarly, we may calculate the recombination fractions θAQ and θQB by the map distances λAQ and λQB. Assume that marker A and marker B are in linkage equilibrium, i.e., DAB=0, the genetic distances λAB=5 cM, λAQQB=2.5 cM, and the heritability h2=0.25. Suppose we have a sample with n=100 individuals, m=30 trio families, and k=20 nuclear families with two offspring. Assume that the IBD proportions shared by the two offspring in the k=20 families at both markers A and B are πA= πB=0.5, and the probability of sharing two alleles IBD at markers A and B are ΔAB=0.5. Figure 1 shows the power of the test statistics FAB,ad, FAB,a, FA,ad, and FA,a against the disequilibrium coefficient DAQ when DQB=0.15 for a mode of dominant inheritance with a=d=1.0, and a mode of recessive inheritance with a=1.0, d=–0.5, respectively. Several features are interesting in the two graphs of Figure 1. First, the power of FAB,ad and FAB,a are higher than that of FA,ad and FA,a. Hence, the regression mapping which uses two markers A and B has its advantage over the one marker mapping which only uses one marker A or B. Second, the statistic FAB,ad has higher power than that of FAB,a, and the statistic FA,ad has higher power than that of FA,a. Thus, excluding the dominant variance from the analysis when it does exist would lose power. Third, as expected, when DAQ=0 the power to detect LD using only marker A is minimal. More interestingly, when DAQ=0.15 the power is still higher using the flanking two markers than using only marker A.

Figure 1
figure 1

Power of test statistics FAB,ad, FAB,a, FA,ad, and FA,a against disequilibrium coefficient DAQ at 0.01 significant level, when q1=PA=PB=0.50, DAB=0.0, DQB=0.15, h2=0.25, n=100, m=30, k=20, πABAB=0.5, λAB=5 cM, λAQQB=2.5 cM, for a mode of dominant inheritance a=d=1.0, and a mode of recessive inheritance a=1.0, d=–0.5, respectively.

Figure 2 shows the power of the test statistics FAB,ad, FAB,a, FA,ad, and FA,a against the heritability h2 when DAB=0.10 and DAQ=DQB=0.15 for a mode of dominant inheritance with a=d=1.0, and a mode of recessive inheritance with a=1.0, d=–0.5, respectively. The other parameters are the same as those of Figure 1. Among the features observed in Figure 1, the power is reasonably high when the heritability h2 is bigger than 0.15. To compare the power of population based and family based methods, Figure 3 shows the power of the test statistics FAB,ad and FAB,a for a mode of dominant inheritance with a=d=1.0, and a mode of recessive inheritance with a=1.0, d=–0.5, respectively. For Figure 3, population data contain n=252 individuals, but no family data (m=k=0). For dominant inheritance of Figure 3, the data contain m=84 trio families (n=k=0). For recessive inheritance of Figure 3, the data contain k=63 nuclear families each has two offspring (n=m=0). Notice that m=84 or k=63 family data contain 252 individuals, and thus the number of individuals is the same as that of the population data. We can see that population based method is more powerful than the family based method for the same number of individuals.

Figure 2
figure 2

Power of test statistics FAB,ad, FAB,a, FA,ad, and FA,a against heritability h2 at 0.01 significant level, when q1=PA=PB= 0.50, DAB=0.10, DAQ=DQB=0.15, n=100, m=30, k=20, πAB= ΔAB=0.5, λAB=5 cM, λAQQB=2.5 cM, for a mode of dominant inheritance a=d=1.0, and a mode of recessive inheritance a=1.0, d=–0.5, respectively.

Figure 3
figure 3

Power of test statistics FAB,ad and FAB,a against heritability h2 at 0.01 significant level for a mode of dominant inheritance a=d=1.0, and a mode of recessive inheritance a=1.0, d=–0.5, respectively. For population data n=252, m=k=0; for dominant family data n=k=0, m=84; for recessive family data n=m=0, k=63. Other parameters are the same as those of Figure 2.

In a population, the LD can exist due to mutations at the trait locus. In the absence of tight linkage between the trait locus and a marker, the recombination between the marker locus and the trait locus can rapidly dissipate the disequilibrium from generation to generation. Denote the frequency of haplotype AQ at the generation when the mutations occur by P(AQ)(0). Then LD coefficient is DAQ(0)= P(AQ)(0)–q1PA for the generation when the mutations occur. For the following generations, the disequilibrium coefficient is reduced by a factor 1θAQ in each generation.23 Suppose that the mutation is already T generation old. Then the LD coefficient is DAQ(T)=DAQ(0)(1–θAQ)T. Similarly, the other LD coefficients are DAB(T)=DAB(0)(1– θAB)T and DQB(T)=DQB(0)(1–θQB)T.

Assume that the map distance between marker A and marker B is λAB=5 cM, and the other parameters are given by DAB(0)=0.20,DAQ(0)=DQB(0)=0.25, h2=0.25, λAB=5 cM, n=100, m=30, k=20, T=30, πA= πB=0.5, ΔAB=0.25. Figure 4 shows the power of the test statistics FAB,ad, FAB.a, FA.ad, and FA,a against the recombination fraction θAQ for a mode of dominant inheritance with a=d=1.0, and a mode of recessive inheritance with a=1.0, d=–0.5, respectively. We can see that the power curves of FAB,ad and FAB,a are very high, although the power curves of FA.ad and FA,a decrease very rapidly as the recombination fraction θAQ increases. Hence, high resolution LD mappings have advantage to do fine gene mappings, and appropriate for the dense marker maps such as single nucleotide polymorphisms on human genome. To investigate the effect of the age of the mutation on the power, Figure 5 shows the power curves against the position of markers. In the Figure, the QTL locates at 15 cM which is flanked by two markers A and B. One marker is one the right-hand side of the QTL, and the other is on the left-hand side with equal distance to the QTL. The power decreases quickly when the age of the mutation increases. For a mutation which is 30 generations old, one should expect very low power if the markers locate 5 cM away from the QTL.

Figure 4
figure 4

Power curves of the test statistics FAB,ad, FAB,a, FA,ad, and FA,a against the recombination fraction θAQ at 0.01 significant level, when q1=PA=PB=0.50, DAB(0)=0.20, DAQ(0)=DQB(0)= 0.25, h2=0.25, λAB=5 cM, T=30, n=100, m=30, k=20, πAB=0.5, ΔAB=0.25, for a mode of dominant inheritance a=d=1.0, and a mode of recessive inheritance a=1.0, d=–0.5, respectively.

Figure 5
figure 5

Power curves of the test statistics FAB,ad against the position of markers at 0.01 significant level for a mode of dominant inheritance a=d=1.0, and a mode of recessive inheritance a=1.0, d=–0.5, respectively. The QTL locates at 15 cM which is flanked by two markers A and B. Here the mutation age T=20, 30, 40, 60, and the other parameters are the same as those in Figure 4.

To explore the linkage interval mapping, we take a sample of k=250 nuclear families each has two offspring. Multiplying λlinkage,A and λlinkage,AB by k, we may calculate the non-centrality parameters for the linkage mapping using marker A and the linkage interval mapping using markers A and B. Moreover, assume that the genetic distances are λAB=30 cM, and λAQQB=15 cM, i.e., the QTL Q is right in the middle between markers A and B. Figure 6 gives power curves of linkage interval mapping by markers A and B, and linkage mapping by marker A against heritability h2 for a mode of dominant inheritance with a=d=1.0, and a mode of recessive inheritance with a=1.0, d=–0.5, respectively. It is clear that the power of interval linkage mapping using both markers A and B is higher than that of linkage mapping using only one marker A.

Figure 6
figure 6

Power curves of the linkage interval mapping by markers A and B, and linkage mapping by marker A against the heritability h2, when q1=PA=PB=0.50, λAB=30 cM, λAQQB= 15 cM, k=250, , at 0.05 significant level for a mode of dominant inheritance a=d=1.0, and a mode of recessive inheritance a=1.0, d=–0.5, respectively.

Discussion

In this paper, we investigate variance component models of both high resolution LD mapping and linkage analysis for QTL. The models are based on family pedigree and population data. We consider likelihoods which utilizes flanking marker information. The likelihoods jointly include recombination fractions, LD coefficients, the average allele substitution effect and allele dominant effect as parameters. The linkage parameters are contained in the variance-covariance matrix. The parameters of LD and gene effects are contained in the regression coefficients.8,9,11,12 The model simultaneously takes care of the linkage, LD and the effects of the putative trait locus Q, and hence clearly demonstrates that linkage analysis and LD mapping are complimentary, not exclusive, methods for QTL mapping. The family data which have at least two offspring contain information for both linkage and association, and population data and trio family data which have two parents and only one offspring contain information for association. By combining the family and population data in the analysis, one may expect to get better results than that by analysing them separately.

Linkage analysis can localize genetic trait loci in broad chromosome regions of a few cM (<10 cM), and is less sensitive to population admixture than LD mapping. In practice, one may carry out linkage analysis as a first step to obtain prior suggestive linkage based on a sparse marker map. By performing linkage interval mappings, one may get higher power than that of using only one marker. With prior linkage in hand, LD mapping can be used to get high resolution of the genetic trait loci based on a dense marker map. We have shown that models of high resolution LD mappings using two flanking markers have higher power than that of models of using only one marker. Hence, high resolution LD mappings have the advantages to do fine gene mappings, and appropriate for the dense marker maps such as SNPs on human genome. Performing both LD mapping and linkage analysis has potential to avoid false positives due to population history or environmental effects. In the meantime, it takes the advantage of high resolution of LD mapping.

The power of association study depends on the existence of LD between trait locus and markers. In the absence of LD, the power of LD mappings is very low. To increase the probability of detecting LD, one may need to carry out suitable design for a genetic study.24 It is well known that the level of LD is heavily affected by population stratification. On the one hand, the family based methods are less likely influenced by population stratification than those of population data based methods. On the other hand, a family based association study is less powerful than that of population based study for the same number of individuals. Combining the family and population data, one may expect more information, and take the advantage of population data and family data. More investigation is needed to explore the population stratification effect on high resolution LD mapping of QTL, and to develop robust methods to identify association between multiple markers and QTL in the presence of population stratification.

To our knowledge, there is not much research on statistical analysis about high resolution LD mapping of QTL. Using only one bi-allelic marker, the statistical analysis of LD mapping has been studied by a few colleagues.8,9,10,11,12,13 Relatively, multipoint linkage mapping has been studied more intensively.16,19,25 It is our hope that the current research may shed more light on the high resolution association study, and stimulate more interests to utilize the advantage of LD mapping in fine resolution of genetic studies. In the Section of power calculation and comparison, we mainly explore a set of scenarios of LD mapping. For several sets of parameters, we compare the power of four test statistics for LD mapping. Moreover, we compare the power of LD mapping of using population data and family data. We also investigate the effect of mutation age on the power. For linkage mapping, we only include one figure to make power comparison of linkage interval mapping using two markers with linkage mapping using only one marker.9 This reflects the need for more research on high resolution LD mapping of QTL, since the research on linkage interval/multipoint mapping is more mature.

In this paper, we treat LD as a fixed effect since only two markers are considered. In general, inference about the LD structure in the population are desirable, and LD should be modeled as a random effect when multiple markers/haplotypes are used in analysis, which would need more investigation. We assume that the data of all family members are available. For some late-onset diseases, the data for the parents or former family members may no longer be available. In principle, one can use similar methods as the ones proposed in this paper to perform high resolution LD mapping for sib-pair data of late-onset diseases. This is an area which is of importance and needs more research. Due to the length of this paper, we do not pursue these issues in depth, and they will be explored in other projects.