Introduction

Many traits of biological and economic importance in plants, animals and human populations are measured in a discrete manner. For example, most disease resistance traits in plants, such as sheath blight resistance in rice (Zou et al, 2000), clubroot resistance in Brassica napus (Manzanares-Dauleux et al, 2000) and cucumber mosaic virus resistance in pepper (Caranta et al, 2002), are scored in several ordered categories based on the severity of disease symptoms. Similarly, there are many characters in animals and humans, such as scores for calving difficulty, expression of congenital malformations, numbers of reproductive events and so on, which are expressed as binary or ordinal traits. Although the expression of some discrete traits is a consequence of the expression of a single segregating factor, multiple loci are often involved (Lynch and Walsh, 1998). Naturally, we may postulate that a number of different genes, along with a number of environmental variables, act jointly as risk and protective factors for the development of the trait. When enough risk factors accumulate and greatly outweigh the protective factors, the trait phenotype develops. As many factors contribute to the trait variation, the liability or predisposition towards the trait is really a continuous, quantitative trait. Once the liability passes a certain critical point or threshold, the trait phenotype emerges. Attributes that are categorical on an outward (observed) scale but believed to be continuous on an underlying (unobserved) scale are called threshold or quasi-continuous characters (Lynch and Walsh, 1998).

Rice sheath blight, caused by Rhizoctonia solani Kühn, is one of the three major diseases of rice and severely impairs both rice yield and quality. Resistance to sheath blight in rice is quantitative in nature, that is, different rice varieties show different degrees of resistance and the disease phenotypes usually overlap (Zou et al, 2000). The phenotypic value of sheath blight resistance is measured in grades ranging from 0 (completely resistant) to 9 (completely susceptible) (Rush et al, 1976). However, the distribution of the grades deviates severely from normality. Therefore, classical quantitative genetic analysis for normally distributed traits is not optimal for this type of ordinal trait. Binary trait analysis techniques are not suitable either, because they cannot handle multiple categories. New statistical methods are therefore required to map such quantitative resistance loci (QRL).

A number of statistical methods are now available to map quantitative trait loci (QTL) for continuous traits (Lander and Botstein, 1989; Haley and Knott, 1992; Jansen, 1993; Zeng, 1994; Kao et al, 1999), but relatively little work has been carried out on mapping ordinal traits (Xu and Atchley, 1996; Visscher et al, 1996; Galecki et al, 2001; Xu et al, 2003), especially for multiple ordinal traits (Hackett and Weller, 1995; Rao and Xu, 1998; Rao and Li, 2001). Genetic analysis of ordinal categorical traits is difficult because the observed phenotype (category) cannot be described by a straightforward linear model. Hackett and Weller (1995) developed an approximate logistic regression method using the threshold model to map QTL for such traits in a backcross (BC) population. Rao and Xu (1998) proposed a similar method for QTL mapping in four-way crosses. Rao and Li (2001) further extended the methods to map QTL using multiple independent families. All the above methods were implemented using either a statistical model different from that of traditional QTL mapping or an optimization algorithm different from the commonly used EM algorithm. We found that the models and the optimization algorithms for mapping ordinal traits and quantitative traits can be formulated in the same framework, so that the same linear model and EM algorithm can be used for both types of traits. We also found that the multicycle expectation-conditional-maximization (ECM) algorithm developed by Meng and Rubin (1993), an extension of the EM algorithm, is more intuitive and easier for the QTL mapping community to understand. In addition, the multicycle ECM algorithm is easy to program. Therefore, the objective of this study is to introduce such an ECM algorithm for mapping ordinal traits.

Mapping populations that can be handled by most statistical methods involve only two inbred lines. The drawback of these designs is that the statistical inference space is quite narrow (within the two inbred lines), and thus results from one cross cannot be generalized to other crosses derived from different inbred lines. Xu (1996, 1998) proposed the four-way cross design of QTL mapping, intended to increase the statistical inference space and the opportunity for detecting more QTL. We found that the different line cross designs can be incorporated into a unified QTL mapping strategy that is designed to handle a four-way cross family and treats commonly used mapping populations, such as F2 and BC, as special cases. In this study, we discuss how to implement this strategy.

Theory and methods

Statistical model for ordinal traits

Consider n individuals in the mapping population and denote the observed ordered category of individual j by wj, where j=1, 2, …, n. For C categories, the phenotype of an individual is defined as wj=c if individual j belongs to class c, for c=1,…, C. A set of fixed thresholds, t1, t2,…, tC−1, on the underlying scale defines the observed categories on the ordinal scale 1, 2, …, C. Further define yj as the underlying variable (liability) for individual j. We thus have the model

$$w_j = c \quad \text{if} \quad t_{c-1} < y_j \le t_c, \qquad c = 1, \ldots, C, \tag{1}$$

where t0=−∞ and tC=+∞. Here we have C+1 thresholds in total, but only the C−1 interior thresholds, t={t1, t2,…, tC−1}, are parameters subject to estimation.
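As an illustration of the threshold rule, the following minimal Python sketch maps a liability value to its ordered category. It assumes the convention that category c covers the half-open interval from t(c−1) (exclusive) to tc (inclusive), with t0=−∞ and tC=+∞; the function name and the numerical thresholds are hypothetical and are not taken from the original program.

```python
from bisect import bisect_left

def liability_to_category(y, thresholds):
    """Return the category c in 1..C such that t_{c-1} < y <= t_c.

    `thresholds` holds the C-1 finite cut points t_1 < ... < t_{C-1};
    t_0 = -inf and t_C = +inf are implied.
    """
    # bisect_left counts the thresholds strictly below y, so a liability
    # exactly equal to t_c is still assigned to category c.
    return bisect_left(thresholds, y) + 1

thresholds = [-1.0, 0.0, 1.0]  # hypothetical cut points, so C = 4
categories = [liability_to_category(y, thresholds)
              for y in (-2.3, -1.0, 0.4, 1.7)]
print(categories)  # [1, 1, 3, 4]
```

Because the liability is continuous, the choice of which side of the interval is closed has no effect on the category probabilities.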

Although the natural choice for the distribution of y would be the normal distribution, Hackett and Weller (1995), Rao and Xu (1998) and Rao and Li (2001) all used the logistic distribution to approximate the normal distribution for computational simplicity. In contrast to the above methods, we use the normal distribution directly. The underlying variable y is assumed to be a continuous variable similar to the phenotypic value of a common quantitative trait. The only difference is that yj is not observable but is inferred from the observed phenotype of individual j. As a quantitative trait, yj can be described by the linear model

$$y_j = X_j b + Z_j u + e_j, \tag{2}$$

where b is a vector of nongenetic effects, for example, block and year effects in plants or sex and age effects in animals, Xj is a known design matrix for the nongenetic effects, u is a vector of genetic effects, Zj is the design matrix for the genetic (QTL) effects, and ej is a random environmental effect defined as a standard normal variable.

Under this assumption, the probability that individual j is classified into the cth category is

$$\Pr(w_j = c) = \Phi(t_c - X_j b - Z_j u) - \Phi(t_{c-1} - X_j b - Z_j u), \tag{3}$$

where Φ(·) is the standard normal cumulative distribution function. The above multiple-threshold model for ordinal traits provides a link between wj and yj. If the thresholds were known, mapping QTL for an ordinal categorical trait could be formulated as a problem of mapping QTL for a regular quantitative trait. The difficulty, however, is that these thresholds are unknown and must be estimated simultaneously with the QTL effects.
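The category probabilities of the probit threshold model can be computed without special libraries. The sketch below assumes the standard form in which the probability of category c is the difference between the normal distribution function evaluated at two adjacent thresholds, each centred on the liability mean Xjb+Zju, with unit residual variance. The function names and example values are illustrative only.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_probs(thresholds, mean):
    """Return [Pr(w = 1), ..., Pr(w = C)] for a liability with the given
    mean (X_j b + Z_j u) and unit residual variance."""
    # Pad the C-1 estimable thresholds with t_0 = -inf and t_C = +inf.
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    return [norm_cdf(cuts[c] - mean) - norm_cdf(cuts[c - 1] - mean)
            for c in range(1, len(cuts))]

probs = category_probs([-1.0, 0.0, 1.0], mean=0.3)  # hypothetical values
print([round(p, 4) for p in probs])
```

The probabilities sum to one for any mean, and shifting the mean upwards moves probability mass towards the higher categories, which is how a QTL effect alters the category frequencies.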

Genetic model of a four-way cross

The genetic model is developed based on a four-way cross design because the backcross and F2 designs can be shown to be special cases of this general design. The genetic model for a four-way cross was proposed by Xu (1996, 1998). In order for the paper to be self-contained, the model is summarized here. Let L1 and L2 be the two inbred lines initiating the first cross and L3 and L4 be the inbred lines initiating the second cross. Denote the QTL genotypes of L1 and L2 by Q1mQ1m and Q2mQ2m, respectively, and the genotypes of L3 and L4 by Q1fQ1f and Q2fQ2f, respectively. The four-way cross population consists of four genotypes, Q1mQ1f, Q1mQ2f, Q2mQ1f and Q2mQ2f, with equal frequency. Let Gab be the value of genotype QamQbf, where a,b=1, 2; it can be expressed by the following linear model:

where

The three elements of vector u are defined as the additive effect for the maternal parent, the additive effect for the paternal parent and the dominance effect of the QTL, respectively. Let Hg be the gth row of matrix H, then G11=H1u, G12=H2u, G21=H3u and G22=H4u.
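The relationship Gg=Hgu can be illustrated numerically. The coding matrix H used below is an assumption: it is the ±1 coding implied by the statement in the Discussion that each genetic coefficient takes the values 1 and −1 with equal probability, and by the contrasts used later for the BC and F2 restrictions. The effect values are hypothetical.

```python
# Assumed +1/-1 coding; rows correspond to the four genotypes,
# columns to the effects (am, af, d).
H = [
    [ 1,  1,  1],   # Q1mQ1f -> G11
    [ 1, -1, -1],   # Q1mQ2f -> G12
    [-1,  1, -1],   # Q2mQ1f -> G21
    [-1, -1,  1],   # Q2mQ2f -> G22
]

def genotypic_values(u):
    """Return [G11, G12, G21, G22] as the products H_g u."""
    return [sum(h * e for h, e in zip(row, u)) for row in H]

u = [0.5, 0.2, -0.1]  # hypothetical (am, af, d)
G11, G12, G21, G22 = genotypic_values(u)

# Under this coding the three contrasts recover the effects
# up to a constant factor of 4.
am_contrast = G11 + G12 - G21 - G22
af_contrast = G11 - G12 + G21 - G22
d_contrast = G11 - G12 - G21 + G22
print(am_contrast / 4, af_contrast / 4, d_contrast / 4)
```

The columns of this H are mutually orthogonal, so each contrast isolates one effect; setting G11=G12 and G21=G22 makes the second and third contrasts vanish.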

We now connect the threshold model and the genetic model in the four-way cross. In model (2), Zj=H1 if individual j takes the first genotype Q1mQ1f, Zj=H2 if j takes the second genotype Q1mQ2f, and so on. Model (2) is a general linear model (GLM) with missing values in Zj because the genotype of j is not observable.

The next step of the GLM analysis with missing values is to infer the probabilities of the QTL genotypes conditional on marker information, denoted by pjg(0)=Pr(Zj=Hg∣IM) for g=1,…, 4, where IM represents marker information. The multipoint method (Rao and Xu, 1998) can be used to infer the conditional probabilities of QTL genotypes. This method is the same as that of Jiang and Zeng (1997) in dealing with missing or partially informative markers and can be implemented in a simple way.

Maximum likelihood estimation (MLE)

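Let us denote the parameters by a vector θ={t, b, u}. The probability of the phenotype of the jth individual conditional on Zj=Hg is

$$\Pr(w_j \mid Z_j = H_g, \theta) = \Phi(t_{w_j} - X_j b - H_g u) - \Phi(t_{w_j - 1} - X_j b - H_g u). \tag{5}$$

Since Zj is missing and only pjg(0) can be calculated, the actual likelihood function for the jth individual is

$$\Pr(w_j \mid \theta) = \sum_{g=1}^{4} p_{jg}^{(0)} \Pr(w_j \mid Z_j = H_g, \theta). \tag{6}$$

The overall log likelihood for the entire mapping population is

$$L(\theta) = \sum_{j=1}^{n} \ln \sum_{g=1}^{4} p_{jg}^{(0)} \Pr(w_j \mid Z_j = H_g, \theta). \tag{7}$$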
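The log likelihood is that of a finite mixture: for each individual, the category probability is averaged over the possible QTL genotypes, weighted by the marker-based probabilities pjg(0), and the logarithms are summed over individuals. A minimal sketch under these assumptions follows; the function names, the two-genotype coding and the toy data are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_category(c, thresholds, mean):
    """Pr(w = c) for c in 1..C, liability mean `mean`, unit variance."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    return norm_cdf(cuts[c] - mean) - norm_cdf(cuts[c - 1] - mean)

def log_likelihood(w, xb, prior, H, u, thresholds):
    """Sum over individuals of ln sum_g p_jg(0) * Pr(w_j | Z_j = H_g).

    w     : observed categories, one per individual
    xb    : nongenetic means X_j b, one per individual
    prior : p_jg(0), one list of genotype probabilities per individual
    H     : genotype coding rows H_g
    u     : genetic effects
    """
    total = 0.0
    for wj, xbj, pj in zip(w, xb, prior):
        mix = 0.0
        for pjg, Hg in zip(pj, H):
            mean = xbj + sum(h * e for h, e in zip(Hg, u))
            mix += pjg * prob_category(wj, thresholds, mean)
        total += math.log(mix)
    return total

# Hypothetical toy data: two genotypes coded +1 / -1, three categories.
H = [[1.0], [-1.0]]
w = [1, 3, 2]
xb = [0.0, 0.0, 0.0]
prior = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
ll_null = log_likelihood(w, xb, prior, H, [0.0], [-0.5, 0.5])
ll_alt = log_likelihood(w, xb, prior, H, [0.4], [-0.5, 0.5])
print(ll_null, ll_alt)
```

When the genetic effect is zero the mixture collapses to a single component, so the prior probabilities no longer influence the likelihood.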

Maximizing the above log-likelihood function directly is tedious. We now introduce a multicycle ECM algorithm (Meng and Rubin, 1993) to find the solution. The multicycle ECM algorithm performs one E step before each CM step or before a few selected CM steps. A cycle is defined as one E step followed by one CM step. The proposed multicycle ECM solution takes advantage of the simplicity of the original linear model, with both yj and Zj being treated as missing values.

If Zj and yj were observed for every individual, the estimates of the parameters b and u at the (k+1)th iteration could be found explicitly using the following iterative equations in two conditional maximization (CM) steps:

In QTL mapping for continuous traits, Zj is missing but its distribution is given, so the ECM algorithm can be adopted to take advantage of the above equations. The ECM equations simply replace all the terms related to Zj by their expectations, that is,

The expectations are obtained conditional on both marker information and the value of the liability yj. The connection between the phenotype and the QTL genotype is through the parameter values, but the parameters are what we are trying to find. Therefore, we need to iterate on equation (9), providing some initial values of the parameters to start the iteration. This is the ECM algorithm. The E step is to find the expectations and the CM step is to invoke equation (9) for the iterations.

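Recall that the probability of Zj conditional on marker information is denoted by pjg(0). This probability may be called the prior probability. After incorporating the phenotypic value, we obtain the posterior probability at the (k+1)th iteration, denoted by

$$p_{jg}^{(k+1)} = \frac{p_{jg}^{(0)}\,\phi\left(y_j - X_j b^{(k)} - H_g u^{(k)}\right)}{\sum_{g'=1}^{4} p_{jg'}^{(0)}\,\phi\left(y_j - X_j b^{(k)} - H_{g'} u^{(k)}\right)}, \tag{10}$$

where

$$\phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$$

is the standardized normal density. Note that the prior probability pjg(0) in equation (10) is used for all iterations to calculate the posterior probability.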
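The posterior probability for a continuous trait is a direct application of Bayes' theorem: the prior pjg(0) is multiplied by the standard normal density of the residual under genotype g and the products are renormalized. The following sketch assumes this form; the names and the two-genotype example are hypothetical.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def posterior_genotype_probs(y, xb, prior, H, u):
    """Posterior Pr(Z = H_g | y, markers) from the prior Pr(Z = H_g | markers)."""
    weights = []
    for pg, Hg in zip(prior, H):
        mean = xb + sum(h * e for h, e in zip(Hg, u))
        weights.append(pg * norm_pdf(y - mean))
    total = sum(weights)
    return [wt / total for wt in weights]

# Hypothetical example: two genotypes coded +1 / -1 with effect 1.0.
H = [[1.0], [-1.0]]
post = posterior_genotype_probs(y=1.0, xb=0.0, prior=[0.5, 0.5], H=H, u=[1.0])
print(post)
```

With equal priors, a phenotype lying at the mean of the first genotype shifts the posterior towards that genotype; if the genetic effect is zero, the posterior equals the prior.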

The expectations are actually obtained using the posterior probabilities rather than the prior probabilities. Therefore,

The problem here is that yj is also missing for ordinal traits. Thus, we need to use ŷ, the expectation of y conditional on w, Zj and θ, in place of y for the estimation of θ before each CM step and this becomes the multicycle ECM.

As t contains a set of parameters different from u and b, we now build the equations as follows. The solution for b and u conditional on t at the (k+1)th iteration is

Before taking the above CM steps, we first need to calculate the corresponding expectations by

where

The quantity I(wj=c) in (14) is an indicator variable, defined as one for wj=c and zero otherwise. Formulae (12), (13) and (14) constitute the first cycle of our multicycle ECM algorithm.
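The expectation of the liability given the observed category and a QTL genotype is the mean of a normal variable truncated to the interval between two adjacent thresholds. The sketch below uses the standard truncated-normal mean with unit variance; it is offered as an illustration of this E-step quantity and is not necessarily written in the same form as equation (14). The function names are hypothetical.

```python
import math

def norm_pdf(x):
    """Standard normal density; returns 0.0 at +/- infinity."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_liability(c, thresholds, mean):
    """E(y | t_{c-1} < y <= t_c) for y ~ N(mean, 1), with c in 1..C."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    lo = cuts[c - 1] - mean   # standardized lower bound
    hi = cuts[c] - mean       # standardized upper bound
    return mean + (norm_pdf(lo) - norm_pdf(hi)) / (norm_cdf(hi) - norm_cdf(lo))

# Binary trait with a single threshold at zero and zero liability mean.
upper = expected_liability(2, [0.0], 0.0)
lower = expected_liability(1, [0.0], 0.0)
print(upper, lower)
```

For categories far in the tail of the distribution, the denominator becomes very small, so a production implementation would need to guard against numerical underflow.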

A closed form for the exact solution of t is hard to define. However, an explicit solution can be approximated. The solution of the cth element of t, for c=1, …, (C−1), conditional on b and u at the (k+1)th iteration is

where

and

The quantity I(wj≤c) in (17) is also an indicator variable, defined as one for wj≤c and zero otherwise. Here we have (C−1) thresholds to estimate and thus have (C−1) ECM cycles. In each cycle, we first calculate the expectation using equations (16) and (17) and then estimate the threshold using equation (15). Therefore, we have a total of C ECM cycles in one iteration for the estimation of all the parameters.

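Note that the pjg(k+1) used in equations (13) and (16) for ordinal traits is different from that used in equation (10) for quantitative traits, because the liability is not observed and the normal density is replaced by the probability of the observed category. The pjg(k+1) used here for ordinal traits is

$$p_{jg}^{(k+1)} = \frac{p_{jg}^{(0)} \Pr\left(w_j \mid Z_j = H_g, \theta^{(k)}\right)}{\sum_{g'=1}^{4} p_{jg'}^{(0)} \Pr\left(w_j \mid Z_j = H_{g'}, \theta^{(k)}\right)}. \tag{18}$$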

The calculation begins with some starting values for b(0), u(0), t(0) and pjg(0). Iterations are then made between (18), (14), (13), (12), (17), (16) and (15) and continue until a predetermined convergence criterion is satisfied. The MLEs of the parameters are denoted by b̂, û and t̂, which are then used to calculate the maximum likelihood value for hypothesis testing.

Likelihood ratio test statistic

Define the log-likelihood value evaluated at the MLE of parameters as

where

and

This is also called the likelihood value under the full model. We need the likelihood values under various restricted models to test various hypotheses.

The overall null hypothesis is that there is no QTL effect at the locus of interest, denoted by H0: am=af=d=0 or H0: Lu=0, where L is the 3×3 identity matrix.

If we solve the MLE of the parameters under the restriction of Lu=0 and evaluate the likelihood value at the solutions with this restriction, we have

The likelihood ratio test statistic is

Various other test statistics can be defined by redefining the L matrix. To test the hypothesis of H1: am=0, we define L1=[1 0 0] and calculate the likelihood ratio test statistic from the MLE obtained under the restriction L1u=0. To test the hypothesis of H2: af=0, we define L2=[0 1 0] and proceed in the same way. Similarly, we use L3=[0 0 1] to test the hypothesis of H3: d=0.

Extension to F2 and BC populations

The four-way cross model is a general model from which the F2 and BC models are obtained as special cases. Let us first consider a BC population. The genotypes of the two parents of the BC family are defined as Q1mQ2f × Q1mQ1m or Q1mQ2f × Q2fQ2f, depending on which inbred line is used as the tester. The constitution of genotypes of the mating pair may be called the mating type. Let us assume that Q1mQ2f × Q2fQ2f is the parental mating type for the BC family. A progeny from this mating type can take one of the four possible genotypes: Q1mQ2f, Q1mQ2f, Q2fQ2f and Q2fQ2f. Note that the first and the second genotypes are not distinguishable, and neither are the third and the fourth. If we use the same notation as that of the four-way cross for the four genotypic values, we have G11=G12 and G21=G22. The genetic effects defined in the notation of a four-way cross are am=G11+G12−G21−G22, af=G11−G12+G21−G22=0 and d=G11−G12−G21+G22=0. Therefore, we can use the same four-way cross model for BC mapping with the restriction of af=d=0. This can be accomplished by searching for the MLE of the four-way cross model with the restriction of Lu=0, where

$$L = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

All marker genotypes are considered as either partially informative (when typed) or noninformative (when missing), and thus the same multipoint method can be used to infer the QTL genotype of a putative position using all markers.

Let us now consider an F2 population. The genotypes of the two parents of the F2 family can be defined as Q1mQ2f × Q1mQ2f. A progeny from this parental mating type can take one of the four possible genotypes: Q1mQ1m, Q1mQ2f, Q2fQ1m and Q2fQ2f. Note that the second and the third genotypes are not distinguishable. If we use the same notation as that of the four-way cross for the four genotypic values, we have G12=G21. The genetic effects defined in the four-way cross are am=G11+G12−G21−G22, af=G11−G12+G21−G22 and d=G11−G12−G21+G22. As G12=G21, we have am=af. Therefore, we can use the same four-way cross model for F2 mapping with the restriction of am=af. This can be accomplished by searching for the MLE of the four-way cross model with Lu=0, where L=[1 −1 0]. A marker genotype is considered fully informative if it is homozygous. A heterozygous genotype is considered partially informative because we cannot tell the difference between the second and the third genotypes. The same multipoint method can be used to infer the QTL genotype at a putative position.

Simulation studies

We designed a series of simulation experiments to verify the proposed multicycle ECM algorithm and the computer program. Since F2 and BC populations are special cases of the four-way cross design, for simplicity we simulated only a BC population. We assumed that the liability of a BC population has zero mean and unit residual variance. A single QTL was placed at position 25 cM (between markers 3 and 4) of a 100 cM chromosome covered by 11 evenly spaced markers. For the single QTL model, the QTL variance is defined as a², where a is the QTL effect (the difference of the allelic values of the segregating parent of the BC progeny). If the segregating parent is the female parent, a=af; otherwise, a=am (see the notation in the previous paragraph). The QTL variance in the traditional BC analysis is a²/4, which is different from what we defined here. This is because we defined the genotype indicator variable as 1 and −1 for the two alternative genotypes, whereas the genotype indicator variable is defined as 1 and 0 for the two alternative genotypes in the traditional BC analysis (Lynch and Walsh, 1998). The total variance of the liability is σy²=a²+1 because the environmental variance of the liability is defined as 1. The proportion of the liability variance explained by the QTL is called the QTL heritability and is denoted by h²=a²/(a²+1).
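The relationship between the QTL heritability and the QTL effect can be inverted: if the heritability equals a squared divided by (a squared plus one), then a is the square root of h2/(1−h2). The short sketch below applies this inversion to the four heritability levels used in the simulations; the function name is hypothetical.

```python
import math

def qtl_effect_from_heritability(h2):
    """Invert h2 = a^2 / (a^2 + 1) for a >= 0, given 0 <= h2 < 1."""
    return math.sqrt(h2 / (1.0 - h2))

effects = [round(qtl_effect_from_heritability(h2), 4)
           for h2 in (0.05, 0.10, 0.20, 0.40)]
print(effects)  # [0.2294, 0.3333, 0.5, 0.8165]
```

These values agree with the QTL effects a=0.2294, 0.3333, 0.5000 and 0.8165 used in the simulation experiments.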

Comparison with logistic regression

In the first simulation experiment, we simulated five ordered categories (C=5) with four threshold values. The four thresholds were chosen by trial and error so that the frequencies of the five categories occurring in the BC population had a ratio of 1:2:4:2:1. These threshold values depend on the genetic effects of the simulated QTL. The QTL effect was set at four levels, that is, a=0.2294, 0.3333, 0.5000, 0.8165, so that the corresponding heritabilities at the four levels are h²=0.05, 0.10, 0.20, 0.40, respectively. The simulated thresholds and the genetic effects are given in Table 1. The sample size of the BC population was n=300. The simulation was replicated 100 times so that we could compare the empirical statistical powers, the mean estimated parameters and the standard errors of the estimates for different levels of heritability. The critical values of the test statistic used to declare statistical significance at the 5% experiment-wise type I error rate were calculated using the approximate method of Piepho (2001). The empirical statistical power was calculated as the proportion of the 100 replicates in which the highest value of the test statistic along the genome exceeded the approximate critical value.
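Thresholds that produce a target frequency ratio can also be found numerically instead of by trial and error. The sketch below is an alternative to the procedure used in the simulations, not a description of it: it assumes that the BC liability is an equal mixture of two unit-variance normal distributions with means a and −a, and solves for each threshold by bisection on the mixture distribution function. All names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bc_liability_cdf(t, a):
    """CDF of the BC liability: equal mixture of N(a, 1) and N(-a, 1)."""
    return 0.5 * norm_cdf(t - a) + 0.5 * norm_cdf(t + a)

def thresholds_for_ratio(ratio, a, lo=-10.0, hi=10.0, tol=1e-10):
    """Return the C-1 thresholds giving category frequencies proportional
    to `ratio` in a BC population with QTL effect `a`."""
    total = float(sum(ratio))
    cum = 0.0
    cuts = []
    for r in ratio[:-1]:
        cum += r / total          # target cumulative frequency
        left, right = lo, hi
        while right - left > tol: # bisection on the monotone CDF
            mid = 0.5 * (left + right)
            if bc_liability_cdf(mid, a) < cum:
                left = mid
            else:
                right = mid
        cuts.append(0.5 * (left + right))
    return cuts

cuts = thresholds_for_ratio([1, 2, 4, 2, 1], a=0.3333)
print([round(t, 4) for t in cuts])
```

Because the mixture is symmetric about zero and the target ratio is symmetric, the resulting thresholds are symmetric about zero. As noted in the text, the thresholds change with the QTL effect.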

Table 1 Comparison of the new method of QTL mapping with the logistic regression analysis

Logistic regression analysis was previously the only method available for ordinal trait QTL mapping (Hackett and Weller, 1995; Rao and Xu, 1998). For each simulated sample, we also analyzed the data with the method of Rao and Xu (1998), which was implemented via the simplex algorithm (Nelder and Mead, 1965) for direct maximization of the likelihood function. In order to compare the estimated parameters of the logistic analysis with those of the probit model proposed here, the estimated QTL effect obtained from the logistic regression was multiplied by a constant (Hackett and Weller, 1995). Results of both analyses are given in Table 1. The estimated parameters are close to the simulated true values for both methods. However, the estimated QTL effects and the threshold values from the logistic regression are slightly biased downwards because this constant factor is only an approximation. The statistical powers of the two methods are comparable and both follow the expected trend that a larger QTL tends to be detected with higher power. The estimated QTL positions for both methods are slightly biased and have a large estimation error when the QTL is small, which follows the usual expectation of QTL mapping studies.

Effect of the number of categories on QTL mapping

In the second simulation experiment, we evaluated the effect of the number of categories on the result of QTL mapping. The design of the simulation was similar to that described in the first paragraph of the section on simulation studies. We set the QTL effect at a=0.3333 so that h²=0.10. We simulated three levels for the number of categories, 2, 5 and 8, corresponding to 1, 4 and 7 different threshold values (see Table 2 for the simulated threshold values). The frequency ratios of the categories in the three sets of simulations were 1:1, 1:2:4:2:1 and 1:2:3:4:4:3:2:1, respectively, for the three sets of threshold values. The sample size for the BC population was fixed at n=200. The simulations were replicated 100 times for each setting. The results are given in Table 2, which shows that the number of phenotypic categories does not have a dramatic effect on the estimates of the QTL effect and position, but it does affect the statistical power. Increasing the number of categories tends to increase the statistical power. This result may be explained by the fact that increasing the number of categories increases the information available for predicting the liability from the observed categorical phenotype. If the number of categories were increased to infinity, we would observe the liability itself, and the power would reach that of QTL mapping for continuous traits. In reality, however, it is impossible to handle a large number of categories because we may encounter a problem of overparameterization due to the large number of thresholds to be estimated. For a large number of categories, the phenotype should be treated as a continuous trait and analyzed using a classical QTL mapping procedure.

Table 2 Mean and standard deviation (STD) of the estimated threshold values and QTL effect for various number of phenotypic categories (C)

Effect of the size of QTL on the result of QTL mapping

This simulation experiment was intended to evaluate the effect of QTL size on the result of QTL mapping under a sample size of n=200, which is typical of QTL mapping experiments. The parameters simulated in this experiment are identical to those reported in the section entitled 'Comparison with logistic regression', except that n=200. The results are given in Table 3. Again, a general trend of higher statistical power for higher heritability was observed. In addition, the QTL position is estimated more precisely for higher heritability than for lower heritability.

Table 3 Mean and standard deviation (STD) of the estimated threshold values and QTL effect under various levels of QTL size

Effect of phenotypic distribution on QTL mapping

In this simulation experiment, we investigated the effect of the shape of the phenotypic distribution on the result of QTL mapping under a fixed sample size (n=200), a given number of categories (C=5) and a given QTL size (a=0.3333, that is, h²=0.10). We chose the sets of threshold values by trial and error so that the phenotypic frequency ratios of the five categories were 1:1:1:1:1 for the first set (uniform distribution), 1:2:4:2:1 for the second set (symmetrical, bell-shaped distribution) and 6:4:3:1:1 for the third set (highly skewed distribution). The simulated threshold values as well as the estimated parameters from 100 replicated simulations are given in Table 4. We found that a skewed distribution decreased the statistical power. The highest power occurred when the phenotypic distribution was bell-shaped.

Table 4 Mean and standard deviation (STD) of the estimated threshold values and QTL effect for various shapes of phenotypic distribution

Effect of sample size on QTL mapping

Finally, we investigated the effect of sample size on the result of QTL mapping when the QTL size was fixed at a=0.3333 (h²=0.10), the number of categories was C=5 and the phenotypic frequency ratio was 1:2:4:2:1. We evaluated four sample sizes: 100, 200, 300 and 500. Results of 100 replicated simulations are summarized in Table 5. We observed the expected trend of increasing power with increasing sample size. The accuracy and precision of the estimates also increased with sample size. Note that the statistical power was 90% when the sample size was 200. This situation has been simulated several times in the previous subsections (Tables 2, 3 and 4). The empirical statistical powers ranged from 86 to 90%, which reflects the stochastic error due to the limited number of replicates. The main purpose of the paper was to develop a new method rather than to conduct exhaustive simulations for an exact comparison of statistical power. Therefore, 100 replicates appear to suffice for demonstrating the efficiency of the new method of QTL mapping.

Table 5 Mean and standard deviation (STD) of the estimated threshold values and QTL effect for various sample sizes (n)

Discussion

We introduced the multicycle ECM algorithm for mapping ordinal traits using a four-way cross model, not because the four-way cross model is more common than the simple line cross models (BC and F2) but because the former is a general model that covers the simple line crosses as special cases. Note that when we extend the four-way cross model to BC and F2 families, the estimated genetic effects need to be rescaled in order to be comparable with the results obtained using the traditional BC and F2 models. Recall that the design matrix for the linear model in the four-way cross is denoted by Zj=[Z1j Z2j Z3j] for the jth individual. The coefficient of each genetic effect takes one of two possible values, 1 and −1, with equal probability. Therefore, the coefficients all have zero expectation and unit variance, and are orthogonal to each other. When the model is extended to the BC family, Z2j and Z3j vanish from the model. The only coefficient left in the model is Z1j, which takes value 1 for a heterozygote and −1 for a homozygote. In the traditional BC model, however, the coefficient is defined as 1 for a heterozygote and 0 for a homozygote, which leads to an expectation of 1/2 and a variance of 1/4. Therefore, when the traditional BC model is compared with our extended BC model, the scale difference should be taken into account: the estimated effect of the extended BC model would be half the effect of the traditional BC model. When the model is extended to the F2 family, Z1j and Z2j are combined because am=af=a. Therefore, the coefficient of the additive effect is Z1j+Z2j, with zero expectation and a variance of 2. This means that the coefficient of the additive effect is defined as −2 for one homozygote, 0 for the heterozygote and 2 for the other homozygote. In the traditional F2 model, however, the coefficient of the additive effect is defined as 0 for one homozygote, 1 for the heterozygote and 2 for the other homozygote. On such a scale, the expectation of the additive coefficient is 1 and the variance is 1/2. Therefore, when the traditional F2 model is compared with our extended F2 model, the scale difference should again be taken into account: the estimated additive effect of the traditional F2 model would be twice the effect of the extended F2 model. The coefficient of the dominance effect in the extended F2 model is defined as 1 for the homozygote and −1 for the heterozygote, whereas in the traditional F2 model this coefficient is defined as 1 for the heterozygote and 0 for the homozygote. Therefore, the estimated dominance effect in the extended F2 model should be half the effect of the traditional F2 model, with the opposite sign.

The ECM algorithm developed in this study depends on the probit model rather than the logistic model (Hackett and Weller, 1995; Rao and Xu, 1998). The probit model uses a normal link function, which is more natural than the logistic link function because the residual error is assumed to be normally distributed in the probit model. With the normal link function, the QTL effect is estimated directly on the original scale, rather than being estimated on a logit scale and then converted into the probit scale using a constant that is only an approximate factor. The probit model thus serves as an alternative to, and a slightly better model than, the logistic analysis because of the normal distribution of the residual error. The two models take different approaches, in that the probit model tries to take advantage of existing QTL mapping theory for regular quantitative traits, whereas the logistic regression tries to take advantage of the simple form of the link function. The logistic link function can be easily calculated without numerical integration, whereas the probit link function may require numerical integration because there is no closed form for the normal distribution function. This disadvantage is less relevant now that most modern computer programs, such as SAS (SAS Institute, 1999), provide functions for the normal distribution function and its inverse. Binary data QTL mapping is a special case of ordinal data QTL mapping in which there are only two categories in the phenotype. In binary trait QTL mapping, Rebai (1997) and Kadarmideen et al (2000) compared the threshold model with a simple regression analysis in which the binary phenotype, coded as 0 or 1, was simply analyzed as if it were continuous. They showed that the power loss in the simple regression analysis was almost negligible compared with the threshold model. We present the threshold model to give users an alternative but statistically more rigorous method for ordinal data analysis. Users may choose either method for their data analysis. If users prefer rapid results, then simple regression is the choice; otherwise, the threshold model implemented via the EM algorithm should be the choice, because the EM method is at least as efficient as the regression method.

We have written a computer program implementing the above data analyses. The program, called QTL-By-SAS, is written in SAS 8.2 and runs on both Windows and Unix platforms. The program code and a user manual can be downloaded from our website at www.statgen.ucr.edu.