Mapping small-effect and linked quantitative trait loci for complex traits in backcross or DH populations via a multi-locus GWAS methodology

Wang, Shi-Bo; Wen, Yang-Jun; Ren, Wen-Long; Ni, Yuan-Li; Zhang, Jin; Feng, Jian-Ying; Zhang, Yuan-Ming

doi:10.1038/srep29951

Download PDF

Article
Open access
Published: 20 July 2016

Mapping small-effect and linked quantitative trait loci for complex traits in backcross or DH populations via a multi-locus GWAS methodology

Shi-Bo Wang^1,2,
Yang-Jun Wen²,
Wen-Long Ren²,
Yuan-Li Ni²,
Jin Zhang²,
Jian-Ying Feng² &
…
Yuan-Ming Zhang¹

Scientific Reports volume 6, Article number: 29951 (2016) Cite this article

3494 Accesses
40 Citations
2 Altmetric
Metrics details

Subjects

Abstract

Composite interval mapping (CIM) is the most widely-used method in linkage analysis. Its main feature is the ability to control genomic background effects via inclusion of co-factors in its genetic model. However, the result often depends on how the co-factors are selected, especially for small-effect and linked quantitative trait loci (QTL). To address this issue, here we proposed a new method under the framework of genome-wide association studies (GWAS). First, a single-locus random-SNP-effect mixed linear model method for GWAS was used to scan each putative QTL on the genome in backcross or doubled haploid populations. Here, controlling background via selecting markers in the CIM was replaced by estimating polygenic variance. Then, all the peaks in the negative logarithm P-value curve were selected as the positions of multiple putative QTL to be included in a multi-locus genetic model and true QTL were automatically identified by empirical Bayes. This called genome-wide CIM (GCIM). A series of simulated and real datasets was used to validate the new method. As a result, the new method had higher power in QTL detection, greater accuracy in QTL effect estimation and stronger robustness under various backgrounds as compared with the CIM and empirical Bayes methods.

Population size in QTL detection using quantile regression in genome-wide association studies

Article Open access 13 June 2023

Leveraging information between multiple population groups and traits improves fine-mapping resolution

Article Open access 10 November 2023

The flashfm approach for fine-mapping multiple quantitative traits

Article Open access 22 October 2021

Introduction

Numerous methods and mating designs have been proposed since the first interval mapping procedure was developed by Lander & Botstein¹. The composite interval mapping (CIM) procedure^2,3 remains one of the most popular methods for quantitative trait locus (QTL) mapping due to its simplicity of single locus scanning and ability to control genetic background information. The inclusive composite interval mapping (ICIM) developed by Li et al.⁴ modified the CIM method by fitting a multiple regression model to estimate the co-factor effects and then adjusting the phenotypic value by the estimated effects of the co-factors before performing the usual interval mapping. The ICIM method is more robust than the original CIM because a more efficient co-factor selection process has been implemented.

Up to now, numerous QTL have been reported for complex traits in animals, plants and humans. Among these QTL, most have small effects on complex traits⁵ and some are closely linked⁶. Although QTL mapping has proven to be useful for detecting major QTL with relatively large effects, it may lack power in accurately modeling small-effect QTL⁷. Meanwhile, closely linked QTL might be mistakenly estimated as a single QTL with a larger effect at the wrong position if they have the same direction in effects, or they might be missed if their effects are in opposite directions^1,8,9,10. Due to the difficulty in detection of small-effect and closely-linked QTL, the genetic foundations of most complex traits are not well understood. To address this issue, it is necessary to reconsider the model and improve the way that polygenic background is controlled.

Genome-wide association study (GWAS) has been widely used in human, animal and plant genetics^{11,12,13,14,15,16,17,18,19}. The GWAS data often includes a large number of markers, making co-factor selection infeasible. Thus, polygenic effects are often fitted to a mixed linear model to capture the genomic background information^11,12. This treatment can help us improve the methods of QTL mapping and overcome the subjectivity nature of the CIM in co-factor selection^20,21. However, it is still difficult to detect small-effect and closely-linked QTL.

In the usual GWAS, one marker is tested at a time and the entire genome is then scanned. In QTL mapping, each pseudo marker is tested at a time until all putative positions are scanned. The effect of the current marker has been treated as a fixed effect. However, treating QTL effect as random may have some advantages over treating it as fixed effect^21,22. For example, the random effect treatment places a prior variance to shrink the estimated effect and such a shrinkage estimate of the QTL effects more stable than the least squares estimate when the genotypes are skewly distributed¹⁹.

Previous studies showed higher power from multi-locus QTL detection as compared with single-locus linkage analysis^22,23,24 and the single-marker GWAS analysis^19,25,26. To improve the power and accuracy in mapping small-effect and closely-linked QTL, the multi-locus model approach should be considered. In this study, therefore, a genome-wide composite interval mapping (GCIM) in backcross or DH was proposed under the framework of multi-locus GWAS of Wang et al.¹⁹. Not only can the new method solve the problem in co-factor selection, caused by a large number of markers, but also can overcome the shortcomings of the single-marker analysis. More importantly, GCIM can improve the power and accuracy in detection of small-effect and closely-linked QTL.

Results

Comparison of the GCIM under various QTL-effect models and K matrices

To optimize the GCIM, random- and fixed-effect models and the K matrices calculated respectively from whole and part markers were considered in this study. In the part-marker case, only the markers flanked by QTL under study were deleted. Thus, the performances for the above four GCIM methods were investigated. In other words, each sample in all the simulation experiments was analyzed by all the four methods. Results from the comparison of K matrices from the whole and part markers under random effect model showed that the higher power in QTL detection and greater accuracy in QTL effect estimation were observed from the whole-marker-derived K matrix than from the part-marker-derived K matrix (Tables 1 and S1 to S6). The above differences for power and accuracy under random effect model were slightly better than those under fixed effect model, which was further confirmed from the comparison between random- and fixed-QTL-effect models under K matrix from whole markers, although most of them were not significant (Table 1). Therefore, in the new GCIM method, QTL effect was viewed as random and K matrix was calculated from the whole markers.

Table 1 Differences and their paired-t-test probabilities for average power and mean absolute deviation (MAD) obtained from genome-wide composite interval mapping in Monte Carlo simulation studies^*.

Full size table

Power in QTL detection and accuracy in QTL-effect estimation for all the simulated QTL

To validate the new GCIM, a series of Monte Carlo simulation studies was carried out. In the first simulation experiment, only 20 QTL were simulated. Each sample was analyzed by the GCIM, CIM and empirical Bayes methods. As a result, the new method and empirical Bayes had 25.20% and 25.80% higher average power in the detection of QTL than the CIM, respectively (P-values: 2.09E-4 and 1.60E-4, respectively), while there was no significant difference between the new method and empirical Bayes (P-value = 0.4790) (Tables S4 and S8). When polygenic background was added to the first simulation experiment and the polygenic background was changed into epistatic background , similar trends were observed (Tables S5, S6 and S8), although various P-values were observed.

We also used mean absolute deviation (MAD) to validate the new method. The new method and empirical Bayes had 0.44 and 0.49 lower average MAD in the estimation of QTL effect than the CIM, respectively (P-values: 0.0253 and 0.0162, respectively). When polygenic background was added to the first simulation experiment and the polygenic background was changed into epistatic background , similar trends were observed (Tables S5, S6 and S8), although various P-values were observed.

As shown above, the contribution to increase the statistical power and to decrease the MAD varies from QTL to QTL. It is natural to make clear which kind of QTL has the maximum contribution to the increase of power and to the decrease of the MAD.

Small-effect QTL

To make clear the reasons that result in significant difference in statistical power across various methods, we summarized the results from small-effect QTL. In this study, the 9th, 14th, 19th and 20th QTL were viewed as small, because their r² were less than 1%. For these four small-effect QTL, the average powers for QTL detection from the new method and empirical Bayes had 31.38% and 31.75% higher than that from the CIM, respectively (P-values: 0.0178 and 0.0173, respectively), while no significant difference between the new method and empirical Bayes was observed (P-value = 0.7177) (Fig. 1, Tables S4 and S8). In the second and third simulation experiments, the same trends were found as well (Fig. 1; Tables S5, S6 and S8), although their P-values varied.

The accuracy for the QTL-effect estimate is also important for a new method. All the estimates from the three methods in the three simulation experiments were compared with their corresponding true values. MAD, MSE and SD were used to measure the accuracy. As a result, the new method and the empirical Bayes method had 0.23 and 0.24 significantly smaller average MAD for QTL effect than the CIM (P-values = 0.0115 and 0.0102) (Fig. 1, Tables S4 and S8). For other indicators and simulation experiments, the trends were similar except that the new and empirical Bayes methods were not statistically significant (P-values = 0.5456 and 0.5707) (Fig. 1; Tables S5, S6 and S8).

Closely-linked QTL

To make clear the reasons that result in significant difference in statistical power across various methods, we summarized the results from closely linked QTL. If the map distance between two adjacent QTL is not larger than 20 cM, these QTL were viewed as linked QTL. For example, the 5th and 6th QTL, the 7th and 8th QTL, the 10th to 12th QTL and the 16th to 18th QTL. For these ten closely-linked QTL, the statistical powers in the detection of QTL using the new and empirical Bayes methods were 34.25% and 35.20% higher than that using the CIM method, respectively (P-values: 4.92E-3 and 3.85E-3, respectively), while no significant difference between the new and empirical Bayes methods was observed (P-value = 0.5813) (Fig. 2, Tables S4 and S8). In the second and third simulation experiments, similar trends were found as well (Tables S5, S6 and S8), although their P-values varied.

False positive rate

In the fourth simulation experiment, no QTL was simulated. At this case, all the QTL identified were false. If the number of QTL detected is large, the false positive rate is high. As a result, the number of QTL found in the fourth simulation experiment was 8, 160 and zero from the new method, CIM and empirical Bayes, indicating relatively low false positive rate from the new method.

Real data analysis in triticale

To validate the new method, each of four crosses in Würschum et al.²⁷ were analyzed for DS1, DS2 and DS3 by the CIM method and all the four crosses were jointly analyzed for DS1, DS2 and DS3 by the new method, the empirical Bayes method and the method of Würschum et al.²⁷. Würschum et al.²⁷ detected 29, 14 and 14 QTL, respectively, for DS1, DS2 and DS3. All the results were listed in Tables 2 and S7.

Table 2 Main-effect DS3 QTL identified by genome-wide composite interval mapping (GCIM), composite interval mapping (CIM), empirical Bayes and joint multi-population analysis of Würschum et al. ²⁷.

Full size table

The new method detected 27, 16 and 18 QTL, respectively, for DS1, DS2 and DS3. The corresponding numbers significantly associated with markers are 18, 10 and 12, respectively, from empirical Bayes. These significantly associated markers or QTL for each trait were used to conduct a multiple linear regression analysis and the corresponding Bayesian information criteria (BIC) were calculated. The new method shows the lowest BIC values for all the three traits (Table 3), indicating the best model fit from the new method.

Table 3 Bayesian information criterion (BIC) for the regression of each trait on all the associated SNPs using genome-wide CIM (GCIM), empirical Bayes and joint analysis of Würschum et al. ²⁷.

Full size table

In this study, two QTL detected by various methods are viewed to be similar if their positions are within 5 cM. As a result, there were 19, 11 and 10 similar QTL, respectively, for the above three traits, between the new method and the method of Würschum et al.²⁷, 15, 14 and 11 similar QTL between the new method and the CIM method and 14, 8 and 10 similar QTL between the new method and the empirical Bayes method. Clearly, these results validated the new method.

Discussion

In the random model method of Wei & Xu²⁸, the part-marker-derived K matrix method has higher power in QTL detection than the whole-marker-derived K matrix method. Similar results have been found as well in GWAS. The results are not seemingly consistent with our results in this study. Actually, they are identical, because smaller probabilities in the genome-wide scan (the first step) of the GCIM are found in the part-marker-derived K matrix case (data not shown). In the GCIM, only all the peaks in the genome-wide scan curve are selected as the positions of multiple putative QTL. The position changes for the putative QTL under the part-marker-derived K matrix may decrease the power and accuracy.

Although Xu²¹ used the random-QTL-effect mixed linear model framework of GWAS to identify QTL in backcross, his approach was not significantly better than the CIM in the Monte Carlo simulation experiments. The integration of multi-locus genetic model with Xu’s method²¹ in this study has significantly improved the statistical power of QTL detection. Although we adopted the GWAS methodology in this study, the new method is different from the GWAS methodology, because QTL mapping in backcross or DH populations can evaluate each possible genome position without marker. In this case, pseudo markers in every d cM need to be inserted in order to cover the entire genome, which makes the new method estimate QTL positions more accurately. This is another reason why the new method is called as GCIM, although the most important reason is that the new method may control polygenic background on a genome-wide level. Although the new method was proposed in backcross or DH, it is suitable for the mapping any populations with two genotypes, for example, recombinant inbred lines. The new method is also used to map QTL in chromosome segment substitution lines, but we can scan only marker positions, because conditional probabilities at the positions of pseudo markers can not be calculated. If the number of genotypes in a mapping population is more than two, for example, F₂, the current method requires some modifications and further investigation will be conducted in the near future.

Here we compared the new method with the CIM, which is a widely-used QTL mapping method. The results from the new method showed higher power in QTL detection, higher accuracy in QTL effect estimation and better model fit under various genetic backgrounds in the first to third simulation experiments, especially for small-effect and closely-linked QTL. The reasons are as follows. We scanned and selected markers with a low criterion of significance test. Potential QTL especially with small-effect or linkage cannot be excluded and can be easily included in the last model. In addition, we also compared the new method with inclusive CIM (ICIM) of Li et al.⁴. As a result, the new method has higher average power than the ICIM, especially for small-effect and closely-linked QTL (Table S9).

Although empirical Bayes is slightly better than the new method but no significant difference is observed in this study. Note that in this study empirical Bayes was implemented with multi-marker analysis. If pseudo markers were inserted between two adjacent markers and the effects for all the true and pseudo markers were simultaneously estimated by the empirical Bayes, the empirical Bayes was significantly worse than the new method (results not shown). If we increase marker density, the collinearity among closely linked markers will make parameter estimation of the empirical Bayes method difficult. If the number of the true and pseudo markers is many times larger than sample size, the empirical Bayes will fail. Thus, the new method is better than the empirical Bayes method.

Conclusion

The mrMLM methodology for GWAS was used to conduct linkage analysis in backcross and DH populations. Genomic background can be effectively controlled by estimating polygenic variance. All the peaks of the negative logarithm P-value curve in genome-wide single-point scanning were selected as the positions of multiple putative QTL to be included in a multi-locus genetic model and true QTL were automatically identified by empirical Bayes. The new method had higher power in QTL detection, greater accuracy in QTL effect estimation and stronger robustness under various backgrounds as compared with the CIM and empirical Bayes methods, especially for small and closely linked QTL.

Methods

The method is adopted from the mixed model genome-wide association studies^11,12 where population structure and other non-genetic variables are treated as fixed effects and polygenes are fitted to the model as a random effect. The computational algorithm follows the eigen-decomposition approach used in the efficient mixed model association (EMMA) proposed by Kang et al.¹⁴ and in the improved genome-wide efficient mixed-model association (GEMMA) developed by Zhou & Stephens¹⁸. We propose to scan the genome by one marker at a time. The effect of the current marker is treated as a random effect¹⁹. In this study, when the marker effect is treated as random effect, the model is called the random effect model. For the paper to be self-contained, we briefly introduce both models in the next paragraphs.

Genetic model

Let y be an n × 1 vector of phenotypic values for n individuals in a backcross or DH population. Let Z_k be a genotype indictor variable for marker k, where the jth element of Z_k is defined as

We now write the model by

where X is a design matrix for (non-genetic) fixed effects α, γ_k is the effect of marker k. We now treat γ_k as a random effect with a distribution. When treated as random, the estimated γ_k is a shrinkage estimator and also called empirical Bayes estimate because ϕ² is also estimated from the data²⁹. ξ is an n × 1 vector of polygenic effects and ε is the residual error. Assume that ε ∼ N(0, I_nσ²) and ξ ∼ N(0, Kϕ²), where σ² is the residual error variance and ϕ² is the polygenic variance. The covariance structure K is calculated using genome-wide marker information²¹.

QTL mapping differs from GWAS in that chromosome regions without markers also need to be evaluated. Essentially, we insert one pseudo marker in every d cM to cover the entire genome evenly so that every position of the genome will be evaluated. When a pseudo marker is located between two consecutive markers, we will use the multipoint method of Jiang & Zeng³⁰ to calculate the genotype probabilities, denoted by p_jk(AA) and p_jk(Aa), respectively, for the two genotypes AA and Aa in backcross. The genotype indicator variable, Z_jk, is then defined as the expected value conditional on flanking marker genotypes³¹. Therefore, Z_jk is defined as

The expectation of y is E(y) = Xα and the variance is

where and H = Kλ+I where

is a marker-inferred kinship matrix²¹.

Parameter estimation

After absorbing α and σ² we have two variance ratios to estimate for each marker, which are λ_k and λ. The fast algorithm of Zhou & Stephens¹⁸ does not apply to more than two variance components in its original form. The Newton-Raphson iteration algorithm must be used to search for the solution of two variance components. Here, we adopted the approximate approach implemented in EMMA¹⁵ and P3D¹⁶ to assume that the polygenic variance ratio is constant across loci and thus is replaced by the estimated value under the pure polygenic model () where no markers are fitted to the model. Once is fixed, we can still take advantage of the eigen-decomposition to speed up the genome scan process¹⁹.

Let y^* = U^Ty, X^* = U^TX and be transformed variables so that

The variance-covariance matrix of y^* is

where is a known diagonal matrix. Let be the general covariance structure. After absorbing α and σ², we have the following profiled restricted log likelihood function,

where

This likelihood function contains only one unknown parameter, λ_k. The Newton algorithm for λ_k is

Once the iteration process converges, the solution is the REML estimate of λ_k, denoted by . Given , the estimates of α and σ² are

The best linear unbiased prediction (BLUP) of γ_k is also the conditional expectation of γ_k given y^* and has the following expression,

The conditional variance is

Under the random model approach, we first estimate the polygenic variance. We then estimate λ_k and test λ_k = 0 for each marker using the same estimated polygenic variance.

Wald test for marker effect

We use the Wald test to test H₀ : γ_k = 0 or H₀ : λ_k = 0 in the random effect model approach, The Wald test statistic is

where and var(γ_k) are obtained from equations (12) and (13), respectively³².

Multi-locus random-QTL-effect mixed linear model method

The single-marker random-QTL-effect mixed linear model (rMLM) method described above is considered as an initial scanning step for a new multi-locus random-QTL-effect mixed linear model (mrMLM) approach that is described here. In the rMLM step, the negative logarithm P-value curve is obtained for the whole genome. In the curve, all the peaks were selected as the positions of putative QTL in a multi-locus QTL mapping model. Thus, multi-locus genetic model under consideration is as follows

where K is the number of peaks in the negative logarithm P-value curve, γ = (γ₁ γ₂…γ_K)^T and the others are same as those in model (2). Note that and other prior distributions are same as those in Xu³². In the above model, polygenic background is not included, because that all the potential QTL have been included in the model (15)¹⁹.

All the effects of QTL in the multi-locus model were estimated by empirical Bayes of Xu³². The procedure for parameter estimation is as follows.

(1) Setting initial values: , α=(X^TX)⁻¹X^Ty and ;

(2) E-step: QTL effect can be predicted by

where ;

(3) M-step: update parameters , α and σ²

where and . Repeat E-step and M-step until convergence is satisfied.

If the P-value for a random effect apart from zero was less than 0.01 in the F-test, the putative QTL were picked up to perform the likelihood ratio test (LRT). All the putative QTL with the ≥2.5 LOD score were viewed as true. Considering that all potential QTL were selected in the first stage, we decided to place a slightly more stringent criterion of 0.000691, which is converted from LOD score 2.50 of the test statistics using .

In this study, the mrMLM method was used to detect QTL for the trait. The identified QTL were used to correct original phenotypes and the corrected ones were analyzed again in order to increase the power.

Real dataset analyzed

A large mapping population of triticale with 647 doubled haploid lines derived from four partially connected crosses (HeTi117-06 × Pawo, HeTi117-06 × TIW671, Modus × Saka3006 and Modus × Saka3008) was used for the demonstration²⁷. All the plants were scored visually for their developmental stages at three time points termed DS1, DS2 and DS3 and scanned for genotypes. Only one marker was kept if several markers are located at the same position. All the 1549 markers with different positions covered 2,306.7 cM of the entire genome. The average marker distance was 1.5 cM. All the data was downloaded from http://www.g3journal.org/content/4/9/1585/suppl/DC1. We inserted one or more pseudo markers in intervals larger than 1 cM to make sure that the entire genome is evenly covered by pseudo or true markers with no intervals larger than 1 cM. The number of pseudo markers inserted was 1691, resulting in a total of 3240 markers. For the pseudo markers, the genotype indicator variable is missing for every individual. In this case, the missing variable was replaced by their conditional expectation.

Monte Carlo simulation experiments

Each backcross population with 400 individuals was simulated. We placed one marker in every 5 cM and the entire chromosome was then evenly covered by 481 co-dominant markers. Twenty simulated QTL were located on a single large chromosome of 2400 cM in length. The effects and locations of the 20 QTL were listed in Table S1. These QTL vary in size with the largest QTL explaining 20% of the phenotypic variance and with the smallest QTL explaining 0.5% of the phenotypic variance. The population mean is 100 and the residual variance is 10. Each of the 200 simulated samples was analyzed by the new method, CIM and empirical Bayes. All the true and pseudo markers were scanned for the new and CIM methods but only the true markers were included in the full model for empirical Bayes. For each simulated QTL, we counted the samples in which the LOD statistic exceeded 2.5. A detected QTL within 5 cM of the simulated QTL was considered a true QTL. The ratio of the number of such samples to the total number of replicates (200) represented the empirical power of this QTL. The false positive rate (FPR) was calculated as the ratio of the number of false positive effects to the total number of zero effects considered in the full model. To measure the bias of QTL effect estimates, mean squared error (MSE) and mean absolute deviation (MAD),

were calculated, where is the estimate of γ_k in the ith sample.

To investigate the effect of polygenic (small effect genes) background on the new method, polygenic effect was simulated by multivariate normal distribution, where is polygenic variance and K is kinship coefficient matrix between a pair of individuals. Here , so . Other setups are identical to the first simulation experiment (Table S2).

To investigate the effect of epistatic background on the new method, three epistatic QTL pairs each with and were simulated. The first one was placed between 800 cM and 1800 cM; the second one between 1210 cM and 1860 cM; and the last one between 275 cM and 740 cM. Other setups are identical to the first simulation experiment (Table S3).

To investigate the type I error for the new method, no QTL was simulated. We just simulated residual error in this simulation study. Other setups are identical to the first simulation experiment.

We developed our own software to implement all the analyses in this paper and would upload it to the R website (https://cran.r-project.org/web/packages/mrMLM/index.html).

Additional Information

How to cite this article: Wang, S.-B. et al. Mapping small-effect and linked quantitative trait loci for complex traits in backcross or DH populations via a multi-locus GWAS methodology. Sci. Rep. 6, 29951; doi: 10.1038/srep29951 (2016).

References

Lander, E. S. & Botstein, D. Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199 (1989).
CAS PubMed PubMed Central Google Scholar
Jansen, R. C. Interval mapping of multiple quantitative trait loci. Genetics 135, 205–211 (1993).
CAS PubMed PubMed Central Google Scholar
Zeng, Z.-B. Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci. Proc. Natl. Acad. Sci. USA 90, 10972–10976 (1993).
Article ADS CAS Google Scholar
Li, H. H., Ye, G. Y. & Wang, J. K. A modified algorithm for the improvement of composite interval mapping. Genetics 175, 361–374 (2007).
Article Google Scholar
Mackay, T. F. The genetic architecture of quantitative traits. Annual Review of Genetics 35, 303–339 (2001).
Article CAS Google Scholar
Hawthorne, D. J. & Via, S. Genetic linkage of ecological specialization and reproductive isolation in pea aphids. Nature 412, 904–907 (2001).
Article ADS CAS Google Scholar
Heffner, E. L., Sorrells, M. E. & Jannink, J. L. Genomic selection for crop improvement. Crop Sci. 49, 1–12 (2009).
Article CAS Google Scholar
Kao, C. K. & Zeng, Z. B. General formulas for obtaining the MLE and the asymptotic variance-covariance matrix in mapping quantitative trait loci when using the EM algorithm. Biometrics 53, 653–655 (1997).
Article CAS Google Scholar
Ronin, Y. I., Korol, A. B. & Nevo, E. Single- and multiple-trait mapping analysis of linked quantitative trait loci: Some asymptotic analytical approximation. Genetics 151, 387–396 (1999).
CAS PubMed PubMed Central Google Scholar
Zhang, Y. M. & Xu, S. Advanced statistical methods for detecting multiple quantitative trait loci. Recent Research Developments in Genetics and Breeding 2, 1–23 (2005).
Google Scholar
Zhang, Y. M. et al. Mapping quantitative trait loci using naturally occurring genetic variance among commercial inbred lines of maize (Zea mays L.). Genetics 169, 2267–2275 (2005).
Article CAS Google Scholar
Yu, J. et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38, 203–208 (2006).
Article CAS Google Scholar
Aulchenko, Y. S., de Koning, D. J. & Haley, C. Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics 177, 577–585 (2007).
Article CAS Google Scholar
Kang, H. M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008).
Article Google Scholar
Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
Article CAS Google Scholar
Zhang, Z. et al. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 42, 355–360 (2010).
Article CAS Google Scholar
Lippert, C. et al. FaST linear mixed models for genome-wide association studies. Nat. Methods 8, 833–835 (2011).
Article CAS Google Scholar
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Article CAS Google Scholar
Wang, S. B. et al. Improving power and accuracy of genome-wide association studies via a multi-locus mixed linear model methodology. Sci. Rep. 6, 19444 (2016).
Article ADS CAS Google Scholar
Bernardo, R. Genomewide markers as cofactors for precision mapping of quantitative trait loci. Theor. Appl. Genet. 126, 999–1009 (2013).
Article CAS Google Scholar
Xu, S. Mapping quantitative trait loci by controlling polygenic background effects. Genetics 195, 1209–1222 (2013).
Article CAS Google Scholar
Xu, S. Estimating polygenic effects using markers of the entire genome. Genetics 163, 789–801 (2003).
CAS PubMed PubMed Central Google Scholar
Kao, C. H., Zeng, Z. B. & Teasdale, R. D. Multiple interval mapping for quantitative trait loci. Genetics 152, 1203–1216 (1999).
CAS PubMed PubMed Central Google Scholar
Wang, H. et al. Bayesian shrinkage estimation of quantitative trait loci parameters. Genetics 170, 465–480 (2005).
Article CAS Google Scholar
Lü, H. Y., Liu, X. F., Wei, S. P. & Zhang, Y. M. Epistatic association mapping in homozygous crop cultivars. PLoS ONE 6, e17773 (2011).
Article ADS Google Scholar
Segura, V. et al. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat. Genet. 44, 825–830 (2012).
Article CAS Google Scholar
Würschum, T. et al. Adult plant development in Triticale (x Triticosecale Wittmack) is controlled by dynamic genetic patterns of regulation. Genes Genomes Genetics 4, 1585–1591 (2014).
PubMed Google Scholar
Wei, J. & Xu, S. A random model approach to QTL mapping in multi-parent advanced generation inter-cross (MAGIC) populations. Genetics 202, 471–486 (2016).
Article CAS Google Scholar
Xu, S. An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics 63, 513–521 (2007).
Article MathSciNet CAS Google Scholar
Jiang, C. & Zeng, Z. B. Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101, 47–58 (1997).
Article CAS Google Scholar
Haley, C. S. & Knott, S. A. A simple regression method for mapping quantitative trait loci in the line crosses using flanking markers. Heredity 69, 315–324 (1992).
Article CAS Google Scholar
Xu, S. An expectation-maximization algorithm for the lasso estimation of quantitative trait locus effects. Heredity 105, 483–494 (2010).
Article CAS Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (31571268) and Huazhong Agricultural University Scientific & Technological Self-innovation Foundation (Program No. 2014RC020).

Author information

Authors and Affiliations

Statistical Genomics Lab, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
Shi-Bo Wang & Yuan-Ming Zhang
State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
Shi-Bo Wang, Yang-Jun Wen, Wen-Long Ren, Yuan-Li Ni, Jin Zhang & Jian-Ying Feng

Authors

Shi-Bo Wang
View author publications
You can also search for this author in PubMed Google Scholar
Yang-Jun Wen
View author publications
You can also search for this author in PubMed Google Scholar
Wen-Long Ren
View author publications
You can also search for this author in PubMed Google Scholar
Yuan-Li Ni
View author publications
You can also search for this author in PubMed Google Scholar
Jin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Jian-Ying Feng
View author publications
You can also search for this author in PubMed Google Scholar
Yuan-Ming Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Y.-M.Z. conceived and supervised the study and wrote the manuscript. S.-B.W., Y.-J.W., J.Z. and J.-Y.F. performed the experiments and analyzed the data. Y.-L.N. and W.-L.R. wrote the R software. All authors reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Supplementary Dataset 1

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Wang, SB., Wen, YJ., Ren, WL. et al. Mapping small-effect and linked quantitative trait loci for complex traits in backcross or DH populations via a multi-locus GWAS methodology. Sci Rep 6, 29951 (2016). https://doi.org/10.1038/srep29951

Download citation

Received: 29 February 2016
Accepted: 24 June 2016
Published: 20 July 2016
DOI: https://doi.org/10.1038/srep29951

This article is cited by

Finding stable and closely linked QTLs against spot blotch in different planting dates during the adult stage in barley
- Fakhtak Taliei
- Hossein Sabouri
- Shahram Ghasemi
Scientific Reports (2024)
4D genetic networks reveal the genetic basis of metabolites and seed oil-related traits in 398 soybean RILs
- Xu Han
- Ya-Wen Zhang
- Yuan-Ming Zhang
Biotechnology for Biofuels and Bioproducts (2022)
postQTL: a QTL mapping R workflow to improve the accuracy of true positive loci identification
- Prashant Bhandari
- Tong Geon Lee
BMC Research Notes (2022)
Identification and analysis of oil candidate genes reveals the molecular basis of cottonseed oil accumulation in Gossypium hirsutum L.
- Zhibin Zhang
- Juwu Gong
- Haihong Shang
Theoretical and Applied Genetics (2022)
Detection of QTNs for kernel moisture concentration and kernel dehydration rate before physiological maturity in maize using multi-locus GWAS
- Shufang Li
- Chunxiao Zhang
- Xiaohui Li
Scientific Reports (2021)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Comparison of the GCIM under various QTL-effect models and K matrices

Power in QTL detection and accuracy in QTL-effect estimation for all the simulated QTL

Small-effect QTL

Closely-linked QTL

False positive rate

Real data analysis in triticale

Discussion

Conclusion

Methods

Genetic model

Parameter estimation

Wald test for marker effect

Multi-locus random-QTL-effect mixed linear model method

Real dataset analyzed

Monte Carlo simulation experiments

Additional Information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Electronic supplementary material

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links