Abstract
The in silico mapping (ISM) technique and its extension represent major advances for novel gene discovery in germplasm resources of inbred lines. However, the techniques suffer from a relatively high false-positive rate (FPR) and they do not consider the effect of linkage disequilibrium (LD) markers around the identified quantitative trait locus (QTL). In addition, it has not yet been established whether it is optimal to use absolute trait differences as the response variable. To address these problems, this article presents the multiple loci ISM (MLISM) approach, which uses all markers on the entire genome, along with a penalized maximum likelihood. The method proposed here was verified by a series of simulation experiments with a maize pedigree population of inbred lines of known ancestry. Results from the simulation studies show that the best response variable is the trait product. The MLISM FPR is substantially decreased and the proportion of the number of false QTL to the number of LD markers around the identified QTL is adequately reduced. The MLISM method, with the trait product as the response variable, is an improvement on the existing methods for novel QTL mapping in germplasm resources of inbred lines.
Introduction
Novel genes in crop cultivars or animal inbred strains are a critical resource for plant and animal improvement. Therefore it is relevant to ask whether it is also possible to map a quantitative trait locus (QTL) within the crop cultivars or animal inbred strains. Grupe et al. (2001) proposed the use of the mean phenotypic values of inbred strains to map the likely genomic locations of QTL ‘in silico,’ which theoretically would represent a major advance. Chesler et al. (2001) and Darvasi (2001) have questioned the validity of this so-called in silico mapping (ISM) on the basis that it is associated with a relatively high false-positive rate (FPR). Recently, Peltz's group further extended the computational method for mapping phenotypic traits that vary among inbred strains onto haplotypic blocks, known as the haplotype-based ISM (HISM) approach (Liao et al., 2004). This method predicted the genetic basis for strain-specific differences in several biologically important traits of mice (Liao et al., 2004; Guo et al., 2006, 2007; Liang et al., 2006a, 2006b), but it only performs a single-locus analysis. Nevertheless, the advantages of detecting QTL from inbred strains without the need for conventional QTL analysis are strong enough to justify seeking improvements to the ISM methodology.
The ISM and its extension assume the presence of a single QTL per linkage group. This assumption is problematic (Kao et al., 1999; Zhang, 2006), primarily because only the effects of the putative QTL at any particular marker position can be included in the model and all other QTL effects have to be ignored. As a result, estimates of the effects and the positions of QTL will be biased whenever there is more than one QTL present on a given linkage group. On the other hand, when several markers are in strong linkage disequilibrium (LD), the method can fail to reject any of these markers because of strong linkage. This inevitably leads to a relatively high FPR. Several approaches have been proposed to ameliorate this situation. The early approaches to this problem applied composite interval mapping on a large sample size (Jansen, 1993; Zeng, 1993). More recently, multi-QTL mapping has been developed (Kao et al., 1999; Xu, 2003, 2007; Zhang and Xu, 2005; Zhang, 2006; Xu and Jia, 2007). However, all these approaches are focused on segregating populations from controlled crosses rather than from the germplasm resource of inbred lines. Therefore, a current priority is to incorporate the notion of multi-QTL mapping into the extended ISM, which can be used in a natural population.
Another major concern is whether the choice of absolute trait differences as the response variable is optimal. In Haseman–Elston (H–E) regression (Haseman and Elston, 1972), the earliest suggestion was to apply the squared trait difference as the phenotypic difference. As pointed out by Wright (1997), this approach discards some useful information, and some benefit has been seen in using the trait values of both members of a sib pair. In effect, the squared difference and the trait sum together contain exactly the same information as the original two trait values. Drigalenko (1998) developed the idea further and suggested using the trait product as the response variable. The result has been further confirmed in multi-QTL H–E regression (Zhang et al., 2008), and these methods were exploited as the basis for the extended ISM described in this article.
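The link between the squared difference, the squared sum and the trait product can be made explicit with a short identity. The symbols a and b below are introduced here only for illustration (they stand for the two mean-centered trait values of a pair) and are not notation from the cited papers:

```latex
% Assumption: a = z_1 - \bar{z}, \; b = z_2 - \bar{z} (mean-centered trait values of one pair)
\begin{aligned}
(a+b)^2 &= a^2 + 2ab + b^2, \\
(a-b)^2 &= a^2 - 2ab + b^2, \\
ab &= \tfrac{1}{4}\left[(a+b)^2 - (a-b)^2\right].
\end{aligned}
```

Read this way, the trait product is a linear combination of the squared sum and the squared difference, which is one way to see why it can retain information that the squared difference alone discards.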
In this paper, we show that the ISM can be extended to multi-loci ISM (MLISM) using all markers of a given genome. In the current version of MLISM, the parameters were estimated by the penalized maximum likelihood (PML) method; several response variables for phenotypic difference were compared with each other in order to optimize the procedure. The effect of LD markers around the identified QTL was also investigated.
The new method proposed in this paper was tested by simulation. The purposes of the simulation were threefold: (1) to select the best response variable, (2) to identify whether the MLISM FPR was substantially decreased and (3) to understand the effect of LD markers around detected QTL.
Materials and methods
Simulation study design
We conducted seven simulation experiments in this paper. In the first, the simulated pedigree was the maize pedigree described by Zhang et al. (2008) (Figure 1). The number of inbred lines within the maize pedigree was n=404. Of these, n_{0}=103 were base (founder) lines, which were in linkage equilibrium so that the genotypes for markers and QTL with four alleles could be simulated. Nonfounders (n_{1}=301) were bred by repeated self-pollination of a hybrid between two inbred lines. Thus, each nonfounder line represents a recombinant inbred line with respect to a pair of known parents. The genotype of all nonfounders could be generated from the genotypes of their corresponding parents, analogous to simulating the genotypes of recombinant inbred lines from their two parents. All of the nonfounder lines can be used to detect QTL. Sixty-one equally spaced markers were simulated on a single-chromosome segment 600 cM long. A single QTL was located at position 200 cM and overlapped with markers. The environmental variance was calculated as
where σ_{g}^{2} is the genetic variance and h^{2} is heritability. Allelic effects were calculated by relating the genetic variance of the QTL to both the allelic frequencies and the allelic number. The phenotypic value of each line was the sum of corresponding QTL genotypic values and the residual error, with an assumed N(0,σ_{e}^{2}) distribution. Each simulation run consisted of 200 replicates. The other simulation experiments were carried out similarly. All simulated parameters are given in Table 1.
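The phenotype simulation step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual single-trait definition h^{2}=σ_{g}^{2}/(σ_{g}^{2}+σ_{e}^{2}), so that σ_{e}^{2}=σ_{g}^{2}(1−h^{2})/h^{2}, and it takes the genetic variance as the sample variance of the simulated genotypic values. The function names `environmental_variance` and `simulate_phenotypes` are hypothetical.

```python
import numpy as np


def environmental_variance(genetic_var: float, h2: float) -> float:
    """Residual variance implied by h2 = genetic_var / (genetic_var + env_var).

    Assumption: the standard heritability definition; the article's own
    displayed formula is not reproduced here.
    """
    if not 0.0 < h2 < 1.0:
        raise ValueError("h2 must lie strictly between 0 and 1")
    return genetic_var * (1.0 - h2) / h2


def simulate_phenotypes(genotypic_values, h2, rng):
    """Phenotype = genotypic value + N(0, sigma_e^2) residual error."""
    g = np.asarray(genotypic_values, dtype=float)
    # Assumption: genetic variance estimated from the simulated genotypic values.
    sigma_e2 = environmental_variance(g.var(), h2)
    return g + rng.normal(0.0, np.sqrt(sigma_e2), size=g.shape)
```

For example, with a genetic variance of 1 and a heritability of 0.20, as in the first simulation experiment, this relation implies an environmental variance of 4.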
Genetic model
Let the kth inbred-line pair among all nonfounders have trait phenotypic values (z_{k,1}, z_{k,2}), and the average value of z_{k,1} and z_{k,2} over all pairs be z̄. The absolute trait difference is y_{k}^{A}=∣z_{k,1}−z_{k,2}∣, and the identity-by-state (IBS) sharing at a marker locus for the two lines is x_{k}. On the basis of the ISM of Grupe et al. (2001), the regression of y^{A} on x is described by:
y_{k}^{A} = b_{0} + b x_{k} + e_{k}    (1)
where b_{0} is the regression intercept and b is the regression slope (Grupe et al., 2001). Provided that each marker locus on the entire genome can be linked to putative QTL, the model (1) can be extended to the MLISM:
y_{k} = b_{0} + Σ_{i=1}^{p} b_{i} x_{ki} + e_{k}    (2)
where b_{i} is the regression coefficient related to the ith putative QTL; p is the number of all markers on the entire genome; x_{ki} is the IBS of the kth inbred-line pair at the ith marker locus; and e_{k} is the residual error with an assumed N(0,σ^{2}) distribution. The squared trait difference is given by y_{k}^{D}=(z_{k,1}−z_{k,2})^{2}, the trait product by y_{k}^{P}=(z_{k,1}−z̄)(z_{k,2}−z̄), and the trait sum by y_{k}^{S}=[(z_{k,1}−z̄)+(z_{k,2}−z̄)]^{2}. The regressions of y_{k}^{D}, y_{k}^{P} and y_{k}^{S} on the IBS can be established similarly. Response variables are denoted by y.
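The construction of the four response variables and of the IBS predictors for all line pairs can be sketched as below. This is an illustrative sketch under two assumptions that are not stated in the text: each inbred line is treated as homozygous and represented by one allele label per marker, and IBS is coded as 1 when the two lines carry the same allele and 0 otherwise. The function names `pair_responses` and `ibs_matrix` are hypothetical.

```python
import numpy as np


def pair_responses(z):
    """Response variables for every unordered pair of lines.

    Returns a dict with the absolute difference (yA), squared difference (yD),
    trait product (yP) and squared centered sum (yS), following the
    definitions in the text.
    """
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)   # all pairs with i < j
    zbar = z.mean()                       # each line occurs in n-1 pairs, so this
                                          # equals the mean over all pair members
    a, b = z[i] - zbar, z[j] - zbar
    return {
        "yA": np.abs(z[i] - z[j]),
        "yD": (z[i] - z[j]) ** 2,
        "yP": a * b,
        "yS": (a + b) ** 2,
    }


def ibs_matrix(genotypes):
    """IBS indicators, one row per line pair and one column per marker.

    Assumption: genotypes has shape (n_lines, p_markers) and holds one
    allele label per homozygous line; IBS is coded 1 (same) or 0 (different).
    """
    g = np.asarray(genotypes)
    i, j = np.triu_indices(g.shape[0], k=1)
    return (g[i] == g[j]).astype(float)
```

With n_{1} lines this yields n_{1}(n_{1}−1)/2 rows, so the 301 nonfounders give 45150 pairs. The outputs also satisfy y^{S}−y^{D}=4y^{P} for every pair, which is a convenient sanity check.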
Parameter estimation
There are several methods to estimate the parameters in model (2), for example, the PML (Zhang and Xu, 2005) and the Bayesian LASSO (Park and Casella, 2008; Yi and Xu, 2008). If the residual variance is heterogeneous, the method proposed by Yi and Banerjee (2009) is available as well. We here adopt the PML method. Briefly, in the PML method, the penalized likelihood function is the product of a likelihood function L(θ∣Y, M) and penalty function P(θ, ξ). The likelihood function is calculated by
where θ=(b_{0}, b_{1}, …, b_{p}, c, σ^{2}), m=n_{1}(n_{1}−1)/2 is the number of inbred-line pairs, Y=(y_{1}, y_{2}, …, y_{m})^{T}, M represents marker information, and ϕ(y;α,σ^{2}) is a normal density function with mean α and variance σ^{2}. The penalty function is:
where ξ=(μ_{1}, …, μ_{p}, σ_{1}^{2}, …, σ_{p}^{2}) is the vector of hyperparameters, and η>0 is the prior sample size for assessing μ_{i}. Note that p(σ_{i}^{2})∝1 for the response variable y_{k}^{P}, and p(σ_{i}^{2})∼invχ^{2}(ν,s_{i}^{2}), with s_{i}^{2}=0 (i=1,2,…, p), for the other response variables (Zhang et al., 2008). The penalized likelihood function is
Thus, the PML estimates for both model parameters and hyperparameters are
The iterative steps for parameter estimation are identical to those given by Zhang and Xu (2005) and He and Zhang (2008). The convergence criterion was Σ∣θ_{i}^{(t+1)}−θ_{i}^{(t)}∣<10^{−6}. In equation (10), the value of ν depends on the response variable when s_{i}^{2}=0 (i=1,…, p). From a wide range of values, we have empirically determined that ν should be set to 6 for y_{k}^{A} and 7 for y_{k}^{D} (data not shown).
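The general idea of fitting all markers jointly under a penalty, and of iterating until the summed absolute change in the estimates falls below 10^{−6}, can be sketched with a deliberately simpler estimator. The code below is not the PML of Zhang and Xu (2005): a plain ridge penalty with a fixed, hypothetical tuning value `lam` is swapped in for the hierarchical penalty, and the update is ordinary coordinate descent. Only the stopping rule mirrors the convergence criterion quoted above.

```python
import numpy as np


def ridge_coordinate_descent(X, y, lam=1.0, tol=1e-6, max_iter=10_000):
    """Ridge-penalized regression of y on all columns of X at once.

    Stand-in for a penalized multi-marker fit (NOT the authors' PML).
    Stops when sum_i |b_i(t+1) - b_i(t)| < tol.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean        # centering removes the intercept
    col_ss = (Xc ** 2).sum(axis=0)         # per-marker sum of squares
    b = np.zeros(Xc.shape[1])
    resid = yc.copy()                      # residual for the current b (all zeros)
    for _ in range(max_iter):
        b_old = b.copy()
        for j in range(Xc.shape[1]):
            resid += Xc[:, j] * b[j]       # take marker j out of the fit
            b[j] = Xc[:, j] @ resid / (col_ss[j] + lam)  # shrunken update
            resid -= Xc[:, j] * b[j]       # put marker j back in
        if np.abs(b - b_old).sum() < tol:  # convergence criterion
            break
    b0 = y_mean - x_mean @ b               # intercept on the original scale
    return b0, b
```

In use, X would be the pair-by-marker IBS matrix and y the chosen response variable. A ridge penalty shrinks all coefficients by a similar amount, whereas the PML is described in the text as giving extremely small estimates to loci that are only loosely linked to a QTL, so this sketch should not be expected to reproduce the reported FPR behavior.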
Likelihood ratio test
As stated by Zhang and Xu (2005) and Zhang et al. (2008), it is now possible to test the null hypothesis H_{0}:b_{i}=0 that there is no QTL linked to the ith marker locus by using the LR test statistic:
LR_{i} = −2[L(θ_{−i}) − L(θ′)]
where θ′={b_{0}, b_{(1)}, …, b_{(q)}, σ_{−i}^{2}} with ∣b_{(k)}∣>10^{−6}, k=1, 2, …, q, θ_{−i}={b_{0}, b_{(1)}, …, b_{(i−1)}, b_{(i+1)}, …, b_{(q)}, σ_{−i}^{2}} is the vector of parameters that excludes b_{(i)}, σ_{−i}^{2} is the residual variance of the reduced model under H_{0}, and L(θ) is the log-likelihood function. For simplicity, the usual QTL significance criterion (LOD_{i}⩾3) was applied. For each QTL simulated, the samples for which LOD or −lgP(H_{0}) exceeded the threshold of 3.0 were counted. The ratio of the number of such samples to the total number of replicates (200) represents the empirical power of the method. The FPR was given by the ratio of the number of false-positive effects to the total number of zero effects. Note that linked false positives were only counted once.
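The bookkeeping for the significance threshold, the empirical power and the FPR can be sketched as follows. The conversion LOD=LR/(2 ln 10) is the standard relation between a likelihood ratio statistic and a LOD score and is assumed here; the text does not state how LOD_{i} was obtained. The function names are hypothetical.

```python
import math


def lr_to_lod(lr: float) -> float:
    """Convert a likelihood ratio statistic to a LOD score (assumed LR / (2 ln 10))."""
    return lr / (2.0 * math.log(10.0))


def empirical_power(lod_at_qtl, threshold: float = 3.0) -> float:
    """Fraction of replicates in which the simulated QTL reaches the threshold."""
    hits = sum(lod >= threshold for lod in lod_at_qtl)
    return hits / len(lod_at_qtl)


def false_positive_rate(n_false_positives: int, n_zero_effects: int) -> float:
    """False positives divided by the number of true zero effects.

    The caller is expected to have counted linked false positives only once.
    """
    return n_false_positives / n_zero_effects
```

Under this conversion, a LOD threshold of 3 corresponds to an LR of about 13.8.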
Monte Carlo simulation studies
The choice of the best response variable for phenotypic difference
To demonstrate the first objective, four response variables, namely the absolute trait difference y_{k}^{A}, the squared trait difference y_{k}^{D}, the trait product y_{k}^{P} and the trait sum y_{k}^{S}, were compared in the first and second simulation experiments with the maize pedigree in Figure 1. In the first simulation experiment, the number of nonfounders in the maize pedigree was 301. Sixty-one equally spaced markers, each with four alleles, were simulated on a single-chromosome segment of length 600 cM, and a single QTL with 0.20 heritability and four alleles was located at 200 cM. Each of the 200 simulated data sets was analyzed four times by MLISM, once with each response variable (Table 2). This analysis showed that the minimum FPR is achieved using y_{k}^{P}, and that the QTL detection power and the standard deviation of the estimates of QTL position were highest and lowest, respectively, using y_{k}^{P}. In the second simulation experiment, three QTL, each with two alleles, were simulated with heritabilities of 0.05, 0.10 and 0.15 and located at marker positions 100, 300 and 500 cM, respectively. The number of alleles for markers was set at 2, and other parameters were the same as those in the first simulation experiment. The results were consistent with the above trend (Table 3). This implies that the trait product is the optimal response variable in the MLISM. It should be noted that the estimate of σ^{2} for the MLISM is different from the simulated value. This is because the MLISM analyzes pairwise distances between lines rather than the original trait values. In addition, under low heritability, the power for QTL detection is less for the HISM than for the MLISM. In the following simulation experiments, only outcomes using y_{k}^{P} are reported.
The evaluation of FPR for multiple loci ISM
To demonstrate whether the FPR for the MLISM was significantly reduced, the FPR for five simulation experiments was calculated. Each data set was analyzed by the ISM, the HISM and the MLISM, for a total of three analyses.
In the third simulation experiment, we pruned the maize pedigree to retain 100, 200 and 300 nonfounders, respectively. Eleven equally spaced markers, each with three alleles, were placed on each of two 100 cM chromosome segments; two QTL, each with three alleles, were simulated with heritabilities of 0.10 and 0.20 and located at marker positions 20 and 80 cM, respectively. The results are shown in Figure 2a (Supplementary Table S1). As expected, the FPRs in the MLISM were substantially decreased under various sample sizes as compared with those in the ISM. The power of QTL detection is less for the HISM than for the MLISM with small sample sizes. In addition, the FPR in the MLISM increased as sample size increased because the QTL detection power increases as sample size increases.
The fourth simulation experiment was designed to investigate the effects of the number of alleles (both marker and QTL) on the FPR of the new method, so the number of alleles was set at 2, 3 and 4. The number of nonfounders in the maize pedigree was 301, and other parameters were the same as those in the third simulation experiment. The results are given in Figure 2b (Supplementary Table S2). Compared with the FPR obtained by using the ISM, the FPR from the MLISM was substantially decreased for every number of alleles considered. The QTL detection power decreased as the number of alleles increased. In simulations with a small number of alleles, the power for QTL detection is slightly less for the HISM than for the MLISM.
In the fifth simulation experiment, the effect of the allelic frequency on the FPR of the new method was assessed by letting the frequency ratio of the three alleles (for the two simulated QTL and all markers) be set as either 1:1:1 (uniform distribution) or 1:2:3 (skewed distribution). Other parameters were the same as those in the fourth simulation experiment. The results are given in Figure 2c (Supplementary Table S3). The trend for the FPR obtained by both distributions was the same. In addition, the skewed distribution decreased the statistical power.
The sixth simulation experiment was designed to mimic actual data, in which markers are not equally spaced. The corrected genetic linkage map of Zhu et al. (2007) was adopted, and this genome consisted of 12 chromosomes. We genotyped 103 founders and 301 nonfounders from the maize pedigree for 167 markers, each with two alleles. These markers covered ∼2121.15 cM of the genome with an average marker interval of 13.68 cM. Six QTL with heritabilities of 0.05, 0.05, 0.025, 0.15, 0.075 and 0.05 and locations at marker positions 0.00, 60.32, 123.52, 942.78, 1672.05 and 1723.00 cM, respectively, were simulated. The results are shown in Figure 2d (Supplementary Table S4) and the FPR for the MLISM was significantly reduced as well.
The seventh simulation experiment was performed to evaluate the effect of a large genome with a high marker density on the FPR of the new method. The simulated genome consisted of 10 chromosomes, each with 101 evenly spaced markers at a 2 cM per marker interval. For 1010 markers (two alleles) along the genome, we genotyped 10 founders and 100 nonfounders from a simulated pedigree. Ten QTL with heritabilities of 0.025, 0.05, 0.025, 0.05, 0.15, 0.075, 0.025, 0.10, 0.05 and 0.05 were placed at marker positions 220, 270, 320, 370, 650, 1050, 1150, 1450, 1850 and 1950 cM, respectively. The simulation was repeated 100 times. The results are shown in Figure 2e (Supplementary Table S5) and the same trend as in Figure 2d was observed.
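The reported average marker interval of the maize map can be checked with a short calculation. The only assumption is that the number of intervals equals the number of markers minus the number of chromosomes, because no interval spans two chromosomes:

```latex
% Assumption: intervals = markers - chromosomes = 167 - 12 = 155
\frac{2121.15\ \text{cM}}{167 - 12} = \frac{2121.15\ \text{cM}}{155} \approx 13.68\ \text{cM}
```

This agrees with the stated average interval of 13.68 cM.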
The relationship between QTL detection and LD
To demonstrate the last objective, the last simulation experiment was used to investigate the relationship between QTL detection and LD. When one QTL at a marker position was identified, markers could be observed around the identified QTL. Of these markers, we needed to analyze not only whether there was LD between the identified QTL and adjacent markers, but also whether there were false QTL residing on these LD markers. Thus, the proportion of the number of false QTL to the number of LD markers could be calculated. The results are listed in Table 4. The results showed that the false QTL proportion for the new method was generally lower than those for the ISM and the HISM. This implies that the new method has almost controlled for the effect of strong LD markers mentioned above.
Discussion
We used a maize pedigree population of inbred lines of known ancestry as an example to demonstrate the new method of MLISM. This method can be directly applied to inbred strains of mice. When haplotype data are available, the new method can further be extended to haplotype-based MLISM (Liao et al., 2004; Guo et al., 2006, 2007; Liang et al., 2006a, 2006b). When single-nucleotide polymorphism molecular data are available, the new method can be used to improve the precision of QTL mapping and to select the candidate genes. When microarray gene expression marker data are available, it will be possible to elucidate the functional relationships among genes. This study indicates that the MLISM method, with the trait product as the response variable, represents an improvement on the existing methods for QTL mapping in a population of inbred lines.
This method was validated using simulated data. When applied to real data, most of the favorable properties will remain; however, some minor modifications are required. First, population structure generates spurious genotype/phenotype associations (Pritchard and Rosenberg, 1999; Yu et al., 2006). As reported by Yu et al. (2006), false positives can be reduced by incorporating a Q matrix {p_{(k,1)h}} obtained from the STRUCTURE software into model (2). The model then becomes
where H is the number of subpopulations; p_{(k,1)h} and p_{(k,2)h} are the posterior probabilities of z_{k,1} and z_{k,2}, respectively, conditional on the hth subpopulation; and s_{h} is the regression coefficient related to the hth subpopulation. In the simulation studies, therefore, we have focused on evaluating the performance of the proposed method rather than addressing the effect of population structure on the new method. Second, the polygenic effect is generally included in mixed genetic models, such as those of Yu et al. (2006) and Zhang et al. (2005). However, there are no polygenic effects in the MLISM. The reason for this is that all QTL across the genome have been included in the present genetic model. For ISM with unrelated individuals, one would expect that a method with all markers fitted in the model would be a better choice. Thus, there would be no need to use the mixed model with a combined polygene control, which requires testing single loci one by one. Third, in reality, markers may not be evenly distributed along the genome. Although the MLISM does not depend on the uniformity of marker distribution, tightly linked markers may cause poor estimates of the marker effects because of a high degree of multicollinearity. Therefore, it is recommended to use only one marker from a cluster of markers.
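The recommendation to keep one marker per cluster can be sketched with a simple greedy thinning rule. The text does not say how a cluster should be defined, so the distance cutoff `min_gap_cm` and the function name `thin_markers` are hypothetical choices made only for illustration; the rule is meant to be applied one chromosome at a time.

```python
def thin_markers(positions_cm, min_gap_cm=1.0):
    """Return indices of markers kept after greedy distance-based thinning.

    Markers are visited in map order on a single chromosome; a marker is kept
    only if it lies at least min_gap_cm (assumed cutoff) beyond the last kept one.
    """
    order = sorted(range(len(positions_cm)), key=positions_cm.__getitem__)
    kept, last_pos = [], None
    for idx in order:
        pos = positions_cm[idx]
        if last_pos is None or pos - last_pos >= min_gap_cm:
            kept.append(idx)      # first marker of a new cluster
            last_pos = pos
    return kept
```

An alternative design would define clusters by pairwise LD rather than by map distance; the distance rule is used here only because it needs no genotype data.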
The proposed method differs from the ISM and the HISM in several ways. First, it extends the analysis from a single QTL to a multi-QTL model, which significantly reduces the FPR. Second, the response variable used in ISM is replaced by the trait product. Third, the parameters are estimated by PML rather than by the least squares method or analysis of variance; this substantially decreases the FPR because PML assigns extremely small estimates to loci that are only loosely linked to a QTL. In particular, the PML approach is able to estimate the parameters for an oversaturated genetic model (Zhang and Xu, 2005; He and Zhang, 2008). It should also be noted that this property of the PML method results in a lower power of the new method compared with the ISM, but this shortcoming can be overcome by increasing the sample size. In addition, when several markers are in strong LD, the LR test in the single-QTL analysis could fail to reject any of these markers because of strong linkage. Thus, the relationship between QTL detection and LD was analyzed in the last simulation experiment. Results in Table 4 showed that the ratio of the number of false QTL to the number of the LD markers for the new method is less than those for the ISM and the HISM. This indicates that the new method almost controls for the effect of strong LD, which is a significant advance. In the cases of small sample sizes, low QTL heritability, and a small number of alleles, the power for QTL detection is less for the HISM than for the MLISM. Therefore, the new method is better than the ISM and its extension.
The new method differs from many QTL mapping methods proposed to date. The MLISM is designed for whole genome data sets collected from unrelated or very distantly related inbred lines, whereas other mapping methods are designed mostly for segregating populations from controlled crosses. This new method differs from genome-wide association mapping as well. The MLISM analyzes pairwise distances, whereas genome-wide mapping directly fits the value of the trait with a linear model where the independent variables are simply the marker genotypes. Results from the Monte Carlo simulation showed that the former technique is better than the latter under the situations of both multiple alleles and a small number of nonfounders (data not shown), because the sample size is much larger for the former than for the latter. On the other hand, the large sample size can result in a slightly higher FPR, although the FPRs of both techniques were controlled at a low level.
Although the MLISM is similar to multi-QTL Haseman–Elston regression (Zhang et al., 2008), it differs in several respects. The first difference is that the multi-loci genetic model of the MLISM uses IBS, whereas that of multi-QTL Haseman–Elston regression uses identity-by-descent. Second, if the pedigree is not complete, that is, if pedigree records are missing or erroneous, then the MLISM is a better choice than multi-QTL Haseman–Elston regression. Finally, the power of the detection of QTL is relatively higher for the MLISM than for multi-QTL Haseman–Elston regression (Zhang et al., 2008).
In coarse QTL mapping, mapping precision is relatively low because of a combination of too few generations, limited marker density and too few recombinants. The availability of plentiful single-nucleotide polymorphism markers or haplotypes across a population of inbred lines can address this problem. Single-nucleotide polymorphism marker density can, in principle, be increased to as high a level as necessary. In addition, there are a number of advantages to using inbred lines instead of backcross and F_{2} populations for QTL mapping. These include a greater allele number and a broader reference population, a much higher mapping resolution and a lesser investment in time (Buckler and Thornsberry, 2002; Flint-Garcia et al., 2003; Zhang et al., 2005; Yu and Buckler, 2006; Iwata et al., 2007). In particular, where single-nucleotide polymorphism genotypes are available, the fine mapping of QTL in a population of inbred lines becomes possible without the earlier need to generate mapping populations.
References
Buckler ES, Thornsberry JM (2002). Plant molecular diversity and application to genomics. Curr Opin Plant Biol 5: 107–111.
Chesler EJ, Rodriguez SL, Mogil JS (2001). In silico mapping of mouse quantitative trait loci. Science 294: 2423.
Darvasi A (2001). In silico mapping of mouse quantitative trait loci. Science 294: 2423.
Drigalenko E (1998). How sib pairs reveal linkage. Am J Hum Genet 63: 1242–1245.
Flint-Garcia SA, Thornsberry JM, Buckler ES (2003). Structure of linkage disequilibrium in plants. Ann Rev Plant Biol 54: 357–374.
Guo Y, Lu P, Farrell E, Zhang X, Weller P, Monshouwer M et al. (2007). In silico and in vitro pharmacogenetic analysis in mice. Proc Natl Acad Sci USA 104: 17735–17740.
Guo Y, Weller P, Farrell E, Cheung P, Fitch B, Clark D et al. (2006). In silico pharmacogenetics of warfarin metabolism. Nat Biotech 24: 531–536.
Grupe A, Germer S, Usuka J, Aud D, Belknap JK, Klein RF et al. (2001). In silico mapping of complex disease-related traits in mice. Science 292: 1915–1918.
Haseman JK, Elston RC (1972). The investigation of linkage between a quantitative trait and a marker locus. Behav Genet 2: 3–19.
He XH, Zhang YM (2008). Mapping epistatic QTL underlying endosperm trait loci in random hybridization design. Heredity 101: 39–47.
Iwata H, Uga Y, Yoshioka Y, Ebana K, Hayashi T (2007). Bayesian association mapping of multiple quantitative trait loci and its application to the analysis of genetic variation among Oryza sativa L. germplasms. Theor Appl Genet 114: 1437–1449.
Jansen RC (1993). Interval mapping of multiple quantitative trait loci. Genetics 135: 205–211.
Kao CH, Zeng ZB, Teasdale RD (1999). Multiple interval mapping for quantitative trait loci. Genetics 152: 1203–1216.
Liao G, Wang J, Guo J, Allard J, Cheng J, Ng A et al. (2004). In silico genetics: identification of a functional element regulating H2-Eα gene expression. Science 306: 690–695.
Liang DY, Liao G, Lighthall GK, Peltz G, Clark DJ (2006a). Genetic variants of the P-glycoprotein gene Abcb1b modulate opioid-induced hyperalgesia, tolerance and dependence. Pharmacogenet Genomics 16: 825–835.
Liang DY, Liao G, Wang J, Usuka J, Guo Y, Peltz G et al. (2006b). A genetic analysis of opioid-induced hyperalgesia in mice. Anesthesiology 104: 1054–1062.
Park T, Casella G (2008). The Bayesian Lasso. J Am Stat Assoc 103: 681–686.
Pritchard JK, Rosenberg NA (1999). Use of unlinked genetic markers to detect population stratification in association studies. Am J Hum Genet 65: 220–228.
Wright FA (1997). The phenotypic difference discards sib-pair QTL linkage information. Am J Hum Genet 60: 740–742.
Xu S (2003). Estimating polygenic effects using markers of the entire genome. Genetics 163: 789–801.
Xu S (2007). An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics 63: 513–521.
Xu S, Jia Z (2007). Genome-wide analysis of epistatic effects for quantitative traits in barley. Genetics 175: 1955–1963.
Yi N, Banerjee S (2009). Hierarchical generalized linear models for QTL mapping. Genetics 181: 1101–1113.
Yi N, Xu S (2008). Bayesian Lasso for quantitative trait loci mapping. Genetics 179: 1045–1055.
Yu J, Buckler ES (2006). Genetic association mapping and genome organization of maize. Curr Opin Biotech 17: 155–160.
Yu JM, Pressoir G, Briggs WH, Bi IV, Yamasaki M, Doebley JF et al. (2006). A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38: 203–208.
Zeng ZB (1993). Theoretical basis for separation of multiple linked gene effects in mapping of quantitative trait loci. Proc Natl Acad Sci USA 90: 10972–10976.
Zhang YM (2006). Advances on methods for mapping QTL in plant. Chin Sci Bull 51: 2809–2818.
Zhang YM, Lü HY, Yao LL (2008). Multiple quantitative trait loci Haseman–Elston regression using all markers on the entire genome. Theor Appl Genet 117: 683–690.
Zhang YM, Mao YC, Xie CQ, Smith H, Luo L, Xu S (2005). Mapping QTL using naturally occurring genetic variance among commercial inbred lines of maize (Zea mays L). Genetics 169: 2267–2275.
Zhang YM, Xu S (2005). A penalized maximum likelihood method for estimating epistatic effects of QTL. Heredity 95: 96–104.
Zhu C, Wang F, Wang J, Li G, Zhang HS, Zhang YM (2007). Reconstruction of linkage maps in the distorted segregation populations of backcross, doubled haploid and recombinant inbred lines. Chin Sci Bull 52: 1648–1653.
Acknowledgements
We are grateful to the Subject Editor and two anonymous referees for their constructive comments and suggestions that significantly improved the presentation of the manuscript. This study was supported by the National Basic Research Program of China Grant 2006CB101708; the National Natural Science Foundation of China Grant 30671333; Jiangsu Natural Science Foundation Grant BK2008335, NCET Grant NCET050489; 863 program Grant 2006AA10Z1E5; and the 111 Project Grant B08025 to YMZ.
Additional information
Supplementary Information accompanies the paper on Heredity website (http://www.nature.com/hdy)
Keywords
 in silico mapping
 linkage disequilibrium
 penalized maximum likelihood
 identity-by-state
 false-positive rate