Introduction

Quantitative trait gene or locus (QTL) mapping has become a routine approach for genetic studies of complex traits in plants, animals and humans because of the availability of high-throughput molecular markers. In comparison with association mapping, QTL linkage mapping in animals and humans is normally based on pedigree data, but in plants it is more often based on biparental genetic populations. Statistical methods for QTL linkage mapping have been extensively studied (Lander and Botstein, 1989; Darvasi et al., 1993; Zeng, 1994; Whittaker et al., 1996; Piepho, 2000; Sen and Churchill, 2001; Xu, 2003; Bogdan and Doerge, 2005; Li et al., 2007; Wang, 2009), and composite interval mapping (CIM) proposed by Zeng (1994) represents one of the most commonly used methods.

Recently, Li et al. (2007) found that CIM resulted in biased mapping results because of the simultaneous estimation of QTL and background effects in the implementation algorithm. Inclusive composite interval mapping (ICIM) was then proposed (Li et al., 2007; Wang, 2009) to deal with this problem while retaining the other advantages of CIM. The major advantages of ICIM were summarized as follows: (1) ICIM controls the sampling variance better; (2) it makes the background marker selection process easier and simpler; (3) it gives clearly high logarithm of the odds (LOD) scores at chromosomal regions with QTL but rather low LOD scores (that is, close to 0) where no QTLs are located, thereby increasing mapping power and decreasing the false discovery rate (FDR); (4) it is robust to the choice of mapping parameters; (5) it can be extended to map digenic epistatic QTLs regardless of whether the two interacting QTLs have significant additive effects or not; and (6) the expectation–maximization (EM) algorithm used in ICIM has a high convergence speed and is therefore less computing intensive (Li et al., 2007, 2008; Zhang et al., 2008).

Available mapping methods have their own statistical properties and power for detecting QTL. Factors influencing the statistical power of each method include mapping population size (PS), marker density, significance level in declaring the existence of QTL, contribution of the segregating QTL to the observed phenotypic variance and genetic distances of QTL to markers. There are several simulation studies on how these factors affect the detection power of interval mapping (IM). Darvasi et al. (1993) investigated the effect of marker density in a backcross population, and concluded that reducing marker spacing below 10 or 20 cM does not provide additional gains, regardless of PS and gene effect. At 20 cM marker density and assuming QTLs have equal effects with all positive alleles from one parent, Beavis (1994) showed that the estimated effects with correctly identified QTLs were greatly overestimated if only 100 progeny were evaluated, slightly overestimated if 500 progeny were evaluated and fairly close to the actual magnitude when 1000 progeny were evaluated; this was statistically explained by Xu (2003). Using an analytical method, Piepho (2000) showed that the power of QTL detection and the standard errors of effect estimates are little affected by an increase in marker density beyond 10 cM. The bias of estimators of QTL effects and locations from IM was discussed by Bogdan and Doerge (2005). On the basis of multiple interval mapping, Mayer et al. (2004) studied the accuracy of position and effect estimates of linked QTLs in F2 populations by simulation. Some theoretical and simulation studies have also been conducted on the confidence interval of IM (Visscher et al., 1996; Dupuis and Siegmund, 1999). Recently, Bogdan et al. (2008) showed the influence of marker density on the detection power of small- or medium-sized QTLs by a modified version of the Bayesian information criterion.

ICIM has superior genetic and statistical properties, which may represent an important improvement in QTL linkage mapping. It may be misleading to assume that experimental parameters such as PS, QTL effect and marker density influence ICIM in the same way as has been found for IM. Our objectives in this study were (1) to investigate how QTL effect size, PS and marker density affect the statistical power and the position and effect estimates of ICIM and (2) to provide practical statistical tables of probabilities and confidence intervals so that a QTL can be identified in mapping populations of various sizes.

Materials and methods

Genetic models used in simulation

In this paper, we considered a hypothetical genome consisting of 10 chromosomes. Each chromosome was 160 cM in length, similar to the maize genome. Four marker densities (MD) were used (that is, MD=40, 20, 10 and 5 cM) from sparse to dense, which corresponded to 5, 9, 17 and 33 evenly distributed markers on each chromosome. Two genetic models (Tables 1 and 2) were simulated.
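The marker counts quoted above follow directly from the chromosome length and the marker spacing. The short sketch below reproduces them, assuming (as the counts imply) that one marker sits at each chromosome end; the function name is illustrative and not part of any mapping software.

```python
def markers_per_chromosome(chrom_length_cm: float, spacing_cm: float) -> int:
    """Number of evenly spaced markers, assuming a marker at each chromosome end."""
    n_intervals = chrom_length_cm / spacing_cm
    if n_intervals != int(n_intervals):
        raise ValueError("spacing must divide the chromosome length evenly")
    return int(n_intervals) + 1


CHROM_LENGTH_CM = 160.0   # from the text
N_CHROMOSOMES = 10        # from the text

for md in (40, 20, 10, 5):
    per_chrom = markers_per_chromosome(CHROM_LENGTH_CM, md)
    total = per_chrom * N_CHROMOSOMES
    print(f"MD={md:>2} cM: {per_chrom} markers per chromosome, {total} in total")
```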

Table 1 One genetic model consisting of eight independent QTLs
Table 2 One genetic model of two linked QTLs

In the first genetic model (Table 1), there were eight independent QTLs, that is, IQ1–IQ8, with different levels of additive effects on a quantitative trait of interest. IQ1 had the smallest genetic effect, explaining only 1% of phenotypic variation, that is, phenotypic variance explained (PVE)=1%, whereas IQ8 had the largest effect, explaining 30% of phenotypic variation, that is, PVE=30%. The eight QTLs were distributed on different chromosomes, and no interactions between QTLs were considered. The error variance was set at 0.25, giving a total phenotypic variance of one. Thus, the additive effect of a QTL was equal to the square root of the corresponding PVE (Table 1). Broad-sense heritability of this quantitative trait was therefore 0.75, which equals the sum of PVE because none of the QTLs were linked.
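The relationship between PVE, additive effect and heritability in this model can be checked numerically. The sketch below is only an illustration: it uses the eight PVE levels listed in the figure captions (1, 2, 3, 4, 5, 10, 20 and 30%) and the stated error variance of 0.25, and it relies on the fact that, in a DH population, an unlinked QTL with additive effect a contributes a² to the genetic variance. Variable and function names are illustrative.

```python
import math

PVE_PERCENT = [1, 2, 3, 4, 5, 10, 20, 30]  # IQ1..IQ8
ERROR_VARIANCE = 0.25


def additive_effect(pve: float, phenotypic_variance: float = 1.0) -> float:
    """Additive effect a of an unlinked QTL in a DH population.

    The two homozygous genotypes (+a and -a) are equally frequent, so the QTL
    contributes a**2 to the variance; hence a = sqrt(PVE * V_P).
    """
    return math.sqrt(pve * phenotypic_variance)


effects = [additive_effect(p / 100.0) for p in PVE_PERCENT]
genetic_variance = sum(a * a for a in effects)        # unlinked QTLs: variances add
phenotypic_variance = genetic_variance + ERROR_VARIANCE
heritability = genetic_variance / phenotypic_variance

for name, a in zip([f"IQ{i}" for i in range(1, 9)], effects):
    print(f"{name}: a = {a:.4f}")
print(f"V_G = {genetic_variance:.2f}, V_P = {phenotypic_variance:.2f}, H2 = {heritability:.2f}")
```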

In the second genetic model, two QTLs, that is, LQ1 and LQ2, were linked on chromosome 1. Two linkage phases, that is, coupling and repulsion, and three linkage distances, that is, 10, 20 and 30 cM, were considered (Table 2). No QTLs were located on the other nine chromosomes. The genetic effects of LQ1 and LQ2 were the same, but in different directions to simulate the two linkage phases. Error variance was fixed at 0.80 for the two linkage phases and three linkage distances. For the coupling linkage, the total genetic variance in a doubled haploid (DH) genetic population was 0.3637, 0.3340 and 0.3097, and each QTL explained 8.59, 8.82 and 9.01% of phenotypic variance for linkage distances 10, 20 and 30 cM, respectively. For the repulsive linkage, the total genetic variance in a DH population was 0.0362, 0.0659 and 0.0902, and each QTL explained 11.96, 11.55 and 11.23% of phenotypic variance for linkage distances 10, 20 and 30 cM, respectively (Table 2). If there was no linkage between LQ1 and LQ2, the total genetic variance in a DH population was 0.2, and heritability was 20%. Hence, the coupling linkage increases genetic variance and, therefore, increases heritability. On the contrary, the repulsive linkage decreases genetic variance and, therefore, decreases heritability. It is worth noting that the sum of PVE is not equal to heritability in the presence of linkage.
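The variances quoted above can be reproduced with a short calculation. The sketch below rests on two assumptions that the text does not state explicitly: map distances are converted to recombination frequencies with the Haldane mapping function, and each linked QTL has a² = 0.1 (inferred from the unlinked genetic variance of 0.2 shared equally by two QTLs). In a DH population the covariance between two loci coded +1/-1 is 1 - 2r, so the total genetic variance is a1² + a2² + 2(1 - 2r)a1a2. Under these assumptions the results agree with the reported values to within rounding of the fourth decimal place.

```python
import math


def haldane_recombination(distance_cm: float) -> float:
    """Recombination frequency for a map distance in cM (Haldane, assumed here)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def linked_pair_variances(a1: float, a2: float, distance_cm: float,
                          error_variance: float):
    """Return (total genetic variance, PVE of one QTL, heritability) in a DH population."""
    r = haldane_recombination(distance_cm)
    # Covariance of two loci coded +1/-1 in a DH population is (1 - 2r).
    v_g = a1 ** 2 + a2 ** 2 + 2.0 * (1.0 - 2.0 * r) * a1 * a2
    v_p = v_g + error_variance
    return v_g, a1 ** 2 / v_p, v_g / v_p


A = math.sqrt(0.1)        # inferred effect size, see lead-in
ERROR_VARIANCE = 0.80     # from the text

for phase, sign in (("coupling", 1.0), ("repulsion", -1.0)):
    for d in (10, 20, 30):
        v_g, pve, h2 = linked_pair_variances(A, sign * A, d, ERROR_VARIANCE)
        print(f"{phase:9s} {d} cM: V_G = {v_g:.4f}, PVE per QTL = {100 * pve:.2f}%, H2 = {h2:.3f}")
```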

Simulated populations and QTL mapping of ICIM

DH lines were simulated by crossing two inbred parental lines. ICIM was used to conduct QTL mapping, which consists of two steps (Li et al., 2007; Zhang et al., 2008). In the first step of ICIM, marker selection is conducted through stepwise regression by considering all marker information simultaneously. Phenotypic values are then adjusted by all markers retained in the regression equation, except the two markers flanking the current mapping interval. In the second step, the adjusted phenotypic values are used in one-dimensional scanning. In this study, the two probabilities for entering and removing variables in the first step were set at 0.01 and 0.02, respectively. The empirical LOD threshold was set at 2.5 in the second step. Mapping populations were generated, and QTL mapping was completed by a software package called QTL IciMapping, available from http://www.isbreeding.net.
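To make the simulated material concrete, the following sketch generates DH genotypes along one chromosome and a phenotype controlled by a single QTL. It is not the QTL IciMapping implementation: it assumes no crossover interference (Haldane mapping function), codes the two parental homozygotes as +1 and -1, and uses a hypothetical QTL position (25 cM) and effect (0.3) chosen only for illustration.

```python
import numpy as np


def haldane_recombination(distance_cm):
    """Recombination frequency for map distances in cM (Haldane, assumed here)."""
    return 0.5 * (1.0 - np.exp(-2.0 * np.asarray(distance_cm, dtype=float) / 100.0))


def simulate_dh_chromosome(n_lines, loci_cm, rng):
    """Return an (n_lines, n_loci) array of DH genotypes coded +1 / -1.

    Each DH line is a doubled F1 gamete: the first locus is drawn at random and
    the parental origin switches between adjacent loci with probability r.
    """
    positions = np.asarray(loci_cm, dtype=float)
    r = haldane_recombination(np.diff(positions))
    geno = np.empty((n_lines, positions.size), dtype=int)
    geno[:, 0] = rng.choice([-1, 1], size=n_lines)
    crossover = rng.random((n_lines, positions.size - 1)) < r
    for j in range(1, positions.size):
        geno[:, j] = np.where(crossover[:, j - 1], -geno[:, j - 1], geno[:, j - 1])
    return geno


rng = np.random.default_rng(1)
markers_cm = np.arange(0, 161, 10)     # MD = 10 cM on a 160 cM chromosome
qtl_cm = 25.0                          # hypothetical QTL position
effect = 0.3                           # hypothetical additive effect
error_variance = 0.25

# Simulate markers and the QTL together, then hide the QTL column from the map.
loci_cm = np.sort(np.append(markers_cm, qtl_cm))
qtl_col = int(np.searchsorted(loci_cm, qtl_cm))
genotypes = simulate_dh_chromosome(200, loci_cm, rng)
marker_genotypes = np.delete(genotypes, qtl_col, axis=1)
phenotype = effect * genotypes[:, qtl_col] + rng.normal(0.0, np.sqrt(error_variance), size=200)
```

Only `marker_genotypes` and `phenotype` would be passed to a mapping method; the QTL genotype itself is unobserved.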

Calculation of statistical power, FDR, LOD score and position and effect estimations

ICIM is based on the interval test, which is not a point estimation procedure (Li et al., 2007). One QTL is unlikely to be located exactly at the predefined position in each simulated mapping population. In this sense, a confidence interval (CI) has to be used to indicate which significant QTLs belong to which predefined QTLs in simulation. As an example, a CI of 10 cM was used by Li et al. (2007) and Zhang et al. (2008) to compare the detection power of various QTL mapping methods.

IQ1–IQ8 were not linked in our simulation study (Table 1), and hence in this case there is no ambiguity regarding which detected QTL belongs to which putative QTL. For instance, a QTL identified on chromosome 1 will be IQ1, one identified on chromosome 2 will be IQ2 and so on. Therefore, the CI of each QTL was first assigned as the whole chromosome. This provided the opportunity to properly calculate the variations of estimated location and effect. Thus, the power of each simulated QTL was calculated as the frequency, out of 1000 simulation runs, with which the putative QTL was correctly identified on the predefined chromosome, and the variation of the estimated QTL position was determined accordingly. When more than one QTL was identified on a chromosome, the one with the highest LOD score peak was counted. QTL identified on the last two chromosomes, that is, chromosomes 9 and 10, were assumed to be false positives, from which FDR was calculated as the proportion of false positives to the total number of significant discoveries (Benjamini and Hochberg, 1995).

Second, we were also interested in the statistical power to identify QTL in a CI of predefined length, say 10 or 20 cM, around the true QTL location (Li et al., 2007). In this case, power is obtained by counting in how many of the 1000 simulation runs the estimated QTL locations fall in the fixed CI. QTLs identified in other intervals and chromosomes were counted as false positives. FDR can then be calculated (Benjamini and Hochberg, 1995). In this study, power and FDR from CI=10 cM were given for both genetic models, that is, for IQ1–IQ8 and LQ1 and LQ2.
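The two counting rules described above (whole-chromosome CI and fixed-length CI) can be sketched as a single function. This is an illustrative reading of the procedure, not the authors' code: the data layout (each run as a list of (chromosome, position) peaks), the QTL names and the toy data are hypothetical, and it is assumed that additional peaks falling inside the same CI are neither true nor false positives.

```python
from typing import Dict, List, Optional, Tuple

Peak = Tuple[int, float]   # (chromosome, position in cM) of a significant LOD peak


def power_and_fdr(runs: List[List[Peak]],
                  true_qtls: Dict[str, Tuple[int, float]],
                  ci_half_width_cm: Optional[float] = None):
    """Power per QTL and overall FDR across simulation runs.

    ci_half_width_cm=None treats the whole chromosome as the CI; otherwise a
    peak must lie within that distance of the true position (5.0 for a 10 cM CI).
    """
    hits = {name: 0 for name in true_qtls}
    true_pos = 0
    false_pos = 0
    for peaks in runs:
        claimed = set()                       # indices of peaks inside some CI
        for name, (chrom, pos) in true_qtls.items():
            inside = [i for i, (c, p) in enumerate(peaks)
                      if c == chrom
                      and (ci_half_width_cm is None or abs(p - pos) <= ci_half_width_cm)]
            if inside:
                hits[name] += 1               # the QTL counts once per run
                true_pos += 1
                claimed.update(inside)
        false_pos += len(peaks) - len(claimed)
    power = {name: hits[name] / len(runs) for name in true_qtls}
    discoveries = true_pos + false_pos
    fdr = false_pos / discoveries if discoveries else 0.0
    return power, fdr


# Toy data: two hypothetical QTLs and four hypothetical simulation runs.
true_qtls = {"QTL_A": (1, 30.0), "QTL_B": (2, 50.0)}
runs = [
    [(1, 28.0), (2, 61.0)],
    [(1, 33.0), (9, 100.0)],
    [(2, 49.0)],
    [],
]
power_chrom, fdr_chrom = power_and_fdr(runs, true_qtls)
power_ci10, fdr_ci10 = power_and_fdr(runs, true_qtls, ci_half_width_cm=5.0)
```

Narrowing the CI moves the peak at 61 cM on chromosome 2 from a detection of QTL_B to a false positive, which lowers power and raises FDR, the same direction of change as reported for the simulations.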

For IQ1–IQ8 when CI was the whole chromosome in which each QTL was located, the LOD scores and position and effect estimates were calculated from peaks in the CI having the LOD score over the predefined threshold. The length of a 95% CI for each IQ1 and IQ2 position is equal to 2 × U_{1−α/2} × SE, where SE is the standard error of the estimated QTL position, U_{1−α/2} is the 1−α/2 quantile of the standard normal distribution and α is the type I error, set at 0.05 when 95% is the probability level of the CI. Thus U_{1−α/2} is equal to 1.96 in this study. For LQ1 and LQ2, when CI was set as 10 cM in length centered at the putative QTL position, to monitor the performance of linked QTL dissection, the LOD score and QTL effect were calculated for each scanned chromosomal position by averaging the 1000 simulation runs. It is expected that the QTL effect is underestimated, as estimates from nonsignificant LOD scores are also counted.
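As a worked illustration of the CI formula, the sketch below computes the length 2 × U × SE from a set of estimated positions. It assumes that SE is taken as the sample standard deviation of the position estimates across simulation runs, which the text does not spell out; the four positions used are made up for the example.

```python
from statistics import NormalDist, stdev


def ci_length(position_estimates_cm, alpha: float = 0.05) -> float:
    """Length of the (1 - alpha) CI: 2 * U_{1-alpha/2} * SE.

    SE is taken here as the standard deviation of the estimates across runs
    (an assumption of this sketch).
    """
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # about 1.96 for alpha = 0.05
    return 2.0 * u * stdev(position_estimates_cm)


# Hypothetical position estimates (cM) from four simulation runs.
example_positions = [28.0, 30.0, 32.0, 30.0]
print(f"U = {NormalDist().inv_cdf(0.975):.2f}")
print(f"95% CI length = {ci_length(example_positions):.2f} cM")
```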

Results

Effect of PS and marker density on detection power

LOD score is the test statistic used in ICIM and IM to declare the existence of QTL. The higher the LOD score, the more likely there is a QTL. It is clear that higher PVE and larger PS resulted in higher LOD scores, regardless of the mapping method (Figure 1 and Supplementary Figure S1). But ICIM can produce much higher LOD scores than IM, which empirically indicates the high mapping power of ICIM compared with IM. In addition, for both ICIM and IM, QTLs with larger genetic effects have clearly higher LOD scores than QTLs with smaller effects. For a specific marker density and genetic effect, LOD score increases linearly with PS (Figure 1 and Supplementary Figure S1), indicating the great importance of PS in QTL mapping. For ICIM, more densely distributed markers can increase the LOD score as well, but the advantage of using denser markers diminishes as PS decreases. For IM, the LOD score did not change much as markers became denser. Therefore, ICIM responds more strongly to MD than IM, indicating that ICIM can take more advantage of dense markers in QTL mapping.

Figure 1

Average LOD scores from 1000 simulation runs of ICIM (a, b) and IM (c, d) across a range of population sizes for IQ1–IQ8 corresponding to eight levels of explained phenotypic variance (PVE; PVE=1, 2, 3, 4, 5, 10, 20 and 30%) and two marker densities (MD; MD=5 and 40 cM). The confidence interval (CI) was assumed to be the whole chromosome. LOD scores were calculated from peaks in the CI having the LOD score over the predefined threshold of 2.5.

Generally speaking, QTL detection power increases and FDR decreases as PS rises, especially for ICIM and larger-effect QTLs (those explaining ≥3% of phenotypic variation), regardless of the length of the predefined CI (Figure 2). The power of MD=40 cM was consistently lower than that of other MD, especially for QTL with medium-sized genetic effects, that is, QTL explaining 3–10% of phenotypic variation. This result suggests that MD=40 cM would be too sparse in QTL mapping when the target is to identify QTL with medium-to-large genetic effects (Figure 2).

Figure 2

QTL detection power and FDR from 1000 simulation runs of ICIM (a, c) and IM (b, d) across a range of population sizes for IQ1–IQ8 corresponding to eight levels of explained phenotypic variance (PVE; PVE=1, 2, 3, 4, 5, 10, 20 and 30%) and four marker densities (MD; MD=5, 10, 20 and 40 cM). The confidence interval (CI) was assumed to be either the whole chromosome in which each QTL is located (a, b), or 10 cM in length centered at the putative QTL position (c, d).

When CI is the whole chromosome in which each QTL is located, for both ICIM and IM the difference in detection power was marginal among MD=5, 10 and 20 cM, especially considering that a total of 330 markers were needed for MD=5 cM, whereas a total of 90 markers were needed for MD=20 cM. This is consistent with what we have observed from the LOD score shown in Figure 1 and Supplementary Figure S1. For QTL with PVE >10% and PS over 100, the power of ICIM and IM were similar, all >90% (Figures 2a and b), but the average LOD scores from MD=5, 10 and 20 cM were all >6 under ICIM (Figure 1a and Supplementary Figures S1A and S1B), and >4 under IM (Figure 1c and Supplementary Figures S1C and S1D). When ICIM was used for QTLs with low PVE, that is, IQ1 and IQ2 (Table 1), the increase in MD was useful for increasing QTL detection power, especially for greater PS. In the meantime, denser markers resulted in higher FDR (Figure 2). For example, for MD=5 cM, detection power is the highest, but FDR is also the highest among the four marker densities.

When CI was the whole chromosome and PS was >200, FDR from ICIM and IM was close to 0; when PS was <200, FDR of ICIM was a bit higher than that of IM, but the detection power of ICIM was much higher (Figures 2a and b). As expected, the use of a narrower CI, that is, 10 cM in length centered at the putative QTL position, resulted in lower detection power and higher FDR, as more significant QTLs were counted as false positives (Figures 2c and d).

Estimation of QTL position and effect

As PS increases, the estimated QTL locations of ICIM approach their real values (Table 1), regardless of the QTL effect and MD (Figure 3 and Supplementary Figure S2). The convergence speed depends mostly on the QTL effect. For PVE ≥5%, the average estimated position approaches the true position when PS ≥200, whereas a larger PS is needed for the estimated position of QTL with PVE <5% to converge with the true position (Figures 3a and b and Supplementary Figures S2A and S2B). Compared with ICIM, a much larger PS is needed for the estimated position of IM to converge with the true position (Figures 3c and d and Supplementary Figures S2C and S2D).

Figure 3

Deviations from true positions from 1000 simulation runs of ICIM (a, b) and IM (c, d) across a range of population sizes for IQ1–IQ8 corresponding to eight levels of phenotypic variance explained (PVE; PVE=1, 2, 3, 4, 5, 10, 20 and 30%) and two marker densities (MD; MD=5 and 40 cM). The confidence interval (CI) was assumed to be the whole chromosome. Positions were estimated from peaks in the CI having the LOD score over the predefined threshold of 2.5.

Before the estimated location converged with its actual value or when the sample size was small, there was a tendency for the identified QTL to be located toward the middle of the chromosome, that is, at 80 cM (Figure 3 and Supplementary Figure S2). The eight QTLs defined in Table 1 are all located on the left side of their corresponding chromosomes. When QTL location was changed to the right side of the chromosome, a similar trend was observed (Supplementary Figure S3). This is consistent with the statement of Bogdan and Doerge (2005) that when the QTL is close to one end of the chromosome, the distribution of estimators of QTL location is skewed toward the opposing end of the chromosome. The phenomenon observed here may have an implication for QTL fine mapping and map-based cloning. If a QTL is estimated to be on the left side of a chromosome, the true QTL position is likely located to the left of the estimated position. On the other hand, if a QTL is estimated to be on the right side of a chromosome, the true QTL position is likely located to the right of the estimated position.

The increase in MD and PS reduced the 95% CI of the estimated QTL location of ICIM and IM (Table 3) because of the reduced standard error (results not shown). But in some cases, IQ6 had a narrower CI than IQ7, reflecting the influence of marker distributions on QTL detection. For MD=10 and 20 cM, IQ6 coincided with a marker on chromosome 6, representing the easiest mapping scenario (Darvasi et al., 1993) and resulting in narrower estimated CI. In all cases, the 95% CIs of IM were wider than those of ICIM, indicating the improved mapping efficiency of ICIM.

Table 3 Length of 95% empirical CIs of QTL positions using two population sizes (i.e., PS=200 and 400) and four marker densities (that is, MD=5, 10, 20 and 40 cM)

As PS increased, the estimated QTL effects of ICIM asymptotically approached their actual values (Table 1) regardless of the QTL effect and MD (Figures 4a and b and Supplementary Figures S4A and S4B), but the genetic effects were overestimated by IM even when PS=600 for QTL with PVE <3% (Figures 4c and d and Supplementary Figures S4C and S4D). Similar to the estimation of QTL position, the convergence speed depends mostly on the QTL effect, but ICIM has a much faster convergence speed. The QTL effect tends to be overestimated for ICIM and IM, especially for QTL with small effects (Figure 4 and Supplementary Figure S4). This has been noted before (Beavis, 1994; Zeng, 1994; Li et al., 2007), and can be explained properly. In a limited-size mapping population, detection power is low, especially when PVE is <5% (Figure 2). In simulation, only peaks higher than the threshold LOD score are counted. Those with smaller estimated effects may result in peaks lower than the LOD threshold, and are therefore not counted when calculating the average estimated effect. When all peaks are counted, an approximately unbiased estimation of the QTL effect can be achieved, regardless of PS (Li et al., 2007).
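The selection effect described above can be illustrated with a small simulation. The sketch below deliberately swaps in a much simpler analysis than ICIM or IM: a single marker located exactly at the QTL is tested by linear regression, with LOD = (n/2) log10(RSS0/RSS1). The population size (100), the effect (PVE=3% with unit phenotypic variance), the number of runs and the random seed are illustrative choices; the LOD threshold of 2.5 follows the figure captions.

```python
import numpy as np


def single_marker_runs(n_lines, effect, error_sd, n_runs, rng):
    """Effect estimates and LOD scores from regression on a marker at the QTL."""
    estimates = np.empty(n_runs)
    lods = np.empty(n_runs)
    for k in range(n_runs):
        x = rng.choice([-1.0, 1.0], size=n_lines)           # DH genotypes at the QTL
        y = effect * x + rng.normal(0.0, error_sd, n_lines)
        xc, yc = x - x.mean(), y - y.mean()
        b = (xc @ yc) / (xc @ xc)                            # least-squares additive effect
        rss0 = yc @ yc                                       # residual SS without the QTL
        rss1 = rss0 - b * (xc @ yc)                          # residual SS with the QTL
        estimates[k] = b
        lods[k] = 0.5 * n_lines * np.log10(rss0 / rss1)
    return estimates, lods


rng = np.random.default_rng(2010)
effect = np.sqrt(0.03)                  # PVE = 3% when phenotypic variance is 1
estimates, lods = single_marker_runs(
    n_lines=100, effect=effect, error_sd=np.sqrt(1.0 - 0.03), n_runs=2000, rng=rng)

significant = lods >= 2.5
mean_all = estimates.mean()                     # every run counted
mean_significant = estimates[significant].mean()  # only peaks above the threshold
print(f"true effect            : {effect:.3f}")
print(f"mean of all estimates  : {mean_all:.3f}")
print(f"mean of significant    : {mean_significant:.3f}")
print(f"power                  : {significant.mean():.3f}")
```

In this toy setting the average over all runs stays close to the true effect, whereas the average over significant runs is inflated, because a run can only pass the threshold when its estimate happens to be large.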

Figure 4

Deviations from true additive effects from 1000 simulation runs of ICIM (a, b) and IM (c, d) across a range of population sizes for IQ1–IQ8 corresponding to eight levels of phenotypic variance explained (PVE; PVE=1, 2, 3, 4, 5, 10, 20 and 30%) and two marker densities (MD; MD=5 and 40 cM). The confidence interval (CI) was assumed to be the whole chromosome. Effects were estimated from peaks in the CI having a LOD score over the predefined threshold of 2.5.

PS required to detect QTL with a certain power

As previously mentioned, IQ1–IQ8 were not linked in the simulation study. With the CI of each QTL set as the whole chromosome, as used for calculating detection power and estimating QTL location and effect in Figures 2, 3 and 4 and Supplementary Figures S2–S4, the required PS to detect QTL at four probability levels are given in Table 4. In general, larger populations are needed to map QTL with smaller genetic effects, whereas small populations can only have high power for detecting QTL with larger genetic effects. For ICIM in populations with a size of approximately 200, the probability of detecting QTL with PVE <3% is low, no matter how many markers are screened (Table 4). For QTL with PVE of <5%, a slightly smaller population is required to achieve similar detection power when more markers are used. As shown in Figure 1 and Supplementary Figures S1 and S2, the power of QTL with PVE of <5% can be improved by increasing MD when ICIM is used. However, increasing MD had little effect on QTLs with larger genetic effects (Table 4). For the two largest QTLs, that is, IQ7 and IQ8, similar PS is needed for ICIM and IM to reach similar detection power. But IM needs much larger PS for other QTLs. For instance, when MD=10 cM, at least 70 DH lines are needed for ICIM to detect IQ6 with 80% probability, whereas more than twice as many DH lines are needed if IM is used.

Table 4 Population sizes required to identify QTLs on predefined chromosomes using four marker densities (i.e., MD=5, 10, 20 and 40 cM) and four levels of detection power (that is, 0.6, 0.7, 0.8 and 0.9)

When the CI of each QTL was designated as a 10 cM interval with the QTL sitting in the center, the required PS for ICIM and IM to detect the QTL at four probability levels are shown in Table 5. For example, if IQ2 is located at 32 cM on chromosome 2 (Table 1), the interval from 27 to 37 cM will be the CI. Power simulated in this way represents the probability that the QTL will be identified in the interval between 27 and 37 cM. As expected, a larger PS is needed to map QTL with similar power (Table 5), compared with the PS needed when CI was set as the whole chromosome (Table 4). A larger difference between ICIM and IM was observed for QTL with small effects and when lower marker density was used.

Table 5 Population sizes required to identify QTLs within the 10-cM predefined CI using four marker densities (i.e., MD=5, 10, 20 and 40 cM) and four levels of detection power (that is, 0.6, 0.7, 0.8 and 0.9)

Effect of PS and marker density on the detection of coupling linkage

The dissection of linked QTL depends much on empty marker intervals isolating the linked QTL, the distance between linked QTL and PS. Two QTLs located at one marker interval are less likely to be separated. Hence, the first requirement for dissection of linked QTLs is that there has to be at least one empty marker interval, that is, the linked QTLs are isolated (Whittaker et al., 1996; Li et al., 2007). When the genetic distance between LQ1 and LQ2 was 10 cM, there were no empty marker intervals between LQ1 and LQ2 for MD=40, 20 and 10 cM. In this case, ICIM identified one ‘ghost’ QTL between LQ1 and LQ2, and the genetic effect was estimated as the sum of two QTLs (Figure 5 and Supplementary Figures S5A–S5C).
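The isolation condition can be checked mechanically for any marker map. The sketch below counts the empty marker intervals lying between the intervals that contain two QTLs. The QTL positions used (27 and 37 cM, 10 cM apart) are hypothetical, chosen only so that the outcome mirrors the situation described for a 10 cM linkage; the actual positions of LQ1 and LQ2 are defined in Table 2.

```python
import bisect


def empty_intervals_between(marker_positions_cm, qtl1_cm, qtl2_cm) -> int:
    """Number of marker intervals, containing no QTL, that separate two QTLs.

    Each QTL is assigned to the interval [m_i, m_(i+1)) that contains it; both
    QTLs are assumed to lie strictly inside the chromosome.
    """
    markers = sorted(marker_positions_cm)
    left, right = sorted((qtl1_cm, qtl2_cm))
    i_left = bisect.bisect_right(markers, left) - 1
    i_right = bisect.bisect_right(markers, right) - 1
    return max(0, i_right - i_left - 1)


CHROM_LENGTH_CM = 160
QTL1_CM, QTL2_CM = 27.0, 37.0          # hypothetical positions, 10 cM apart

isolation = {}
for md in (40, 20, 10, 5):
    markers = list(range(0, CHROM_LENGTH_CM + 1, md))
    isolation[md] = empty_intervals_between(markers, QTL1_CM, QTL2_CM)
    print(f"MD={md:>2} cM: {isolation[md]} empty interval(s) between the two QTLs")
```

With these illustrative positions the two QTLs share an interval or sit in adjacent intervals at MD=40, 20 and 10 cM, and are separated by one empty interval only at MD=5 cM.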

Figure 5

Average LOD scores from 1000 simulation runs of ICIM (a–c, g–i and m–o) and IM (d–f, j–l and p–r) for two linked QTLs (LQ1 and LQ2) in the coupling phase, four marker densities (MD; MD=5, 10, 20 and 40 cM) and three population sizes (PS; PS=100, 300 and 500). The confidence interval (CI) was assumed to be 10 cM in length centered at the putative QTL position. LOD scores were calculated for each scanned chromosomal position by averaging the 1000 simulation runs. Three linkage distances were considered, that is, 10 cM for a–f, 20 cM for g–l and 30 cM for m–r.

When the genetic distance between LQ1 and LQ2 was 20 or 30 cM, LQ1 and LQ2 were not isolated by empty marker intervals for MD=40 and 20 cM, and ICIM could not separate LQ1 and LQ2 properly either (Figure 5g–i). For MD=10 and 5 cM, LQ1 and LQ2 were isolated by at least one empty marker interval. Two clear peaks can be observed on the mean LOD profile of ICIM for PS=300 and PS=500 (Figures 5h, i, n and o). However, the LOD score for PS=100 was low, and thus LQ1 and LQ2 may not be separated precisely. On the other hand, when the distance between LQ1 and LQ2 was 10 cM, only one peak could be observed on the mean LOD profile of ICIM for PS=500 (Figure 5c), although LQ1 and LQ2 were isolated by one empty marker interval for MD=5 cM. For the 10 cM linkage, mapping populations with sizes >500 are needed to separate the linked QTLs properly. Therefore, isolation of the QTLs is a necessary condition for ICIM to dissect relatively close linkages, but at the same time, a larger PS is also needed.

IM was proposed based on the assumption that at most one QTL was located on each chromosome or linkage group (Lander and Botstein, 1989). Our simulation results showed that IM cannot dissect LQ1 and LQ2 for any PS, MD and linkage distance in the coupling phase (Figures 5d–f, j–l and p–r), which was the major reason for proposing CIM and ICIM. When two linked QTLs were not separated properly, the LOD score around the linked QTL region was affected by both QTLs. In fact, only one ‘ghost’ QTL between the two linked QTLs was observed; its estimated effect was equal to the sum of the two QTLs. In comparison, when two linked QTLs were separated properly, the LOD score was affected only by the QTL around the testing region. In other words, only one QTL contributed to the LOD score as the testing position moved along the chromosome. Therefore, for the coupling linkage, higher LOD score and power were observed when LQ1 and LQ2 were not separated by either ICIM or IM (Figure 5 and Supplementary Figure S6). The higher power observed for IM than for ICIM under low MD and small PS (Supplementary Figures S5 and S6) does not indicate that IM can separate linked QTLs better than ICIM.

Effect of PS and marker density on the detection of repulsive linkage

Most findings for coupling linkage also apply to repulsive linkage. When LQ1 and LQ2 were 10 cM apart, there were no empty marker intervals between them for MD=40, 20 and 10 cM, and ICIM identified no QTLs between them (Figures 6a–c and Supplementary Figure S7) because of their opposite genetic effects. When LQ1 and LQ2 were 20 or 30 cM apart, there were empty marker intervals between them for MD=10 and 5 cM, and ICIM separated them properly (Figures 6g–i and m–o). As with the coupling linkage, in a mapping population of small PS, it is impossible for ICIM to precisely dissect LQ1 and LQ2 linked in the repulsive phase even when MD=5 cM (Figures 6g and m). When LQ1 and LQ2 were 30 cM apart, two peaks were observed on the average LOD score of IM (Figures 6q and r), but the genetic effect was greatly underestimated (Supplementary Figure S8). In comparison, ICIM still achieved an asymptotically unbiased estimation of genetic effect, regardless of the linkage phase (Supplementary Figures S5 and S8).

Figure 6

Average LOD scores from 1000 simulation runs of ICIM (a–c, g–i and m–o) and IM (d–f, j–l and p–r) for two linked QTLs (LQ1 and LQ2) in the repulsive phase, four marker densities (MD; MD=5, 10, 20 and 40 cM) and three population sizes (PS; PS=100, 300 and 500). The confidence interval (CI) was assumed to be 10 cM in length centered at the putative QTL position. LOD scores were calculated for each scanned chromosomal position by averaging the 1000 simulation runs. Three linkage distances were considered, that is, 10 cM for a–f, 20 cM for g–l and 30 cM for m–r.

Discussion

ICIM is a useful step forward that highlights the importance of model selection and interval testing in QTL mapping. While conducting the interval test in ICIM, the genetic variations in other marker intervals and chromosomes are completely controlled (Li et al., 2007), resulting in much higher LOD scores than IM at chromosomal regions with QTL but much lower LOD scores where no QTL is located. QTL detection power and FDR were calculated from LOD score; therefore, ICIM has much higher power but lower FDR than IM, indicating the great improvement over IM. ICIM has also been shown to offer improvements on CIM and some Bayesian models (Li et al., 2007, 2008; Zhang et al., 2008). In practice, ICIM has been successfully applied to identify flowering time QTL in the maize nested association mapping population (Buckler et al., 2009).

ICIM is based on the interval test, involving a large number of statistical tests along a genome. The complicated nature of the QTL mapping method makes it less likely that its statistical properties can be explicitly derived. However, these properties can be properly investigated through computer simulation. The large-scale simulations conducted in this study showed that as PS increases, the estimated QTL position and the effect from ICIM asymptotically approach their true values, regardless of the QTL effect and marker density. Larger-effect QTLs reach unbiased position estimation faster, and at the same time, the increase in marker density and PS is useful for achieving a more accurate estimation of QTL position.

The development and use of single-nucleotide polymorphism markers and increasing capacity of the high-throughput genotyping platform enable the use of denser markers than had been previously used. However, the question does arise as to whether the increase in marker density can improve QTL mapping efficiency significantly enough to warrant the investment in more marker data points in existing genetic populations. From the simulations in this study, increased marker density did not improve the QTL mapping efficiency of IM, which is consistent with what Darvasi et al. (1993) showed. But for ICIM, with better control of background noise, simulation results indicate that the use of dense markers can improve the detection power of QTL with medium-to-small genetic effects.

Dissection of linked QTLs depends mainly on their linkage distance, genetic effects, PS and marker density. The genetic effects of non-isolated QTLs are difficult to separate. For linked QTLs, we need at least one empty interval. It is therefore expected that higher marker density will provide greater potential to resolve QTLs that are more closely linked. Simulations in this study showed that denser markers helped ICIM narrow down the QTL positions and dissect linked QTLs. However, the dissection of linked QTLs is also dependent on the mapping PS. In a small population, linked QTLs cannot be separated even if they are isolated by dense markers. Therefore, only large populations can take advantage of densely distributed markers. It is thus more advisable to increase marker density accompanied by an increase in PS.

Simulation results in this paper extend the studies by Darvasi et al. (1993) and Beavis (1994), and provide relatively simple approximations of statistical power for detecting QTLs with a series of effect sizes for a predefined CI. If the purpose of a genetic study is to detect QTL with PVE ≥5% by ICIM within a CI of 10 cM with 90% probability, at least 300, 320, 580 and >600 individuals are needed for MD=5, 10, 20 and 40 cM, respectively (Table 5). On the other hand, if a mapping population has been built, we can roughly evaluate QTL mapping efficiency. For instance, with intermediate marker density and PS, say MD=10 cM and PS=200, we have a more than 90% chance of mapping QTL with PVE >10%, a more than 70% chance of mapping QTL with PVE >5% and a more than 60% chance of mapping QTL with PVE >3% within 10 cM of their true positions (Table 5).

We focused on additive genetic models in this study because they are the most important and consistent component of the genetic architecture of most species, and thus the most useful in molecular design breeding, especially for selecting inbred lines. In the maize nested association mapping population, a simple additive model accurately predicts flowering time for a range of related germplasm (Buckler et al., 2009). In addition to additive effects, epistasis also has an important role in the genetic control of complex traits, although it is more difficult to detect because of the complexity of its pattern. To apply it, it is essential for breeders to know the mapping efficiency of epistasis for their current data set or for future experimental design. Further simulations are needed to explore the statistical power of ICIM for mapping epistatic QTL and thus facilitate the detection of reliable and consistent epistasis.