Metadata analysis indicates biased estimation of genetic parameters and gains using conventional pedigree information instead of genomic-based approaches in tree breeding

Beaulieu, Jean; Lenz, Patrick; Bousquet, Jean

doi:10.1038/s41598-022-06681-y

Article
Open access
Published: 10 March 2022

Jean Beaulieu¹,
Patrick Lenz¹,² &
Jean Bousquet¹

Scientific Reports volume 12, Article number: 3933 (2022)

Abstract

Forest tree improvement helps provide adapted planting stock to ensure growth productivity, fibre quality and carbon sequestration through reforestation and afforestation activities. However, there is increasing doubt that conventional pedigree provides the most accurate estimates for selection and prediction of performance of improved planting stock. When the additive genetic relationships among relatives are estimated using pedigree information, it is not possible to take account of Mendelian sampling due to the random segregation of parental alleles. The use of DNA markers distributed genome-wide (multi-locus genotypes) makes it possible to estimate the realized additive genomic relationships, which take account of Mendelian sampling and possible pedigree errors. We reviewed a series of papers on conifer and broadleaf tree species in which both pedigree-based and marker-based estimates of genetic parameters have been reported. Using metadata analyses, we show that for heritability and genetic gains, the estimates obtained using only the pedigree information are generally biased upward compared to those obtained using DNA markers distributed genome-wide, and that genotype-by-environment (GxE) interaction can be underestimated for low to moderate heritability traits. As high-throughput genotyping becomes economically affordable, we recommend expanding the use of genomic selection to obtain more accurate estimates of genetic parameters and gains.

Introduction

Forest tree breeding traditionally aimed to increase volume production and improve both adaptive traits and fiber attributes of forest tree plantations. Ongoing environmental and market changes are currently shifting the selection focus towards seedstock production for enhanced carbon sequestration capacities and resistance to biotic and abiotic stressors. Those multiple breeding goals and the long-lived perennial nature of trees demand the most precise estimation of genetic parameters and the exact selection of individuals that best combine desired properties of various kinds (e.g.¹,²). Compared with crop or animal breeders, forest tree breeders initiated their research activities quite recently, i.e., in the late 1950s³. Since then, genetic improvement programs have been initiated for a large number of forest tree species from around the world⁴. The breeding process in forest tree species is often slow due to the limited resources available and to its inherent complexities. Among these complexities are: (1) the necessity to set up more or less long-term progeny tests to assess the traits of interest, (2) the fact that many of these traits are mature traits, thus requiring many years or even decades of testing before they can be assessed appropriately, and (3) the fact that some of the traits are difficult and expensive to assess, such as microfibril angle and cell wall thickness.

Traditionally, the selection of trees with desirable attributes for breeding or propagation has been based on both pedigree and phenotypic information. The genetic merit of candidate trees in a breeding population is now estimated using an individual-tree model, the so-called animal model, in a mixed model framework⁵,⁶,⁷. Hence, the performance of a candidate tree and all known pedigree relationships with other members of its population are used to estimate its breeding value (EBV). The model is characterized by the fitting of a random component for the breeding value of each individual, which can be obtained using Best Linear Unbiased Prediction (BLUP)⁸. The animal model can also account for other environmental and genetic effects⁹. The additive genetic relationships between individuals are of crucial importance in the prediction of breeding values. The probability of identical genes by descent occurring in any pair of individuals is called the coancestry¹⁰ and the additive genetic relationship is twice the coancestry. The genetic evaluation using BLUP is heavily dependent on the genetic covariance among individuals, both for accuracy and unbiased results⁸, and genetic covariance among individuals includes the additive genetic variance, the dominance variance and the epistatic variance. In the present article, we will focus on additive genetic variance only.

The additive genetic relationships among individuals are usually represented by a matrix called the Numerator Relationship Matrix or A-matrix. The inverse of this matrix is needed for solving the mixed model equations and for obtaining the best linear unbiased predictions of breeding values. Henderson¹¹ proposed a simple method to compute this inverse directly from the pedigree. BLUP based on the A-matrix is known as ABLUP. The A-matrix is symmetric. Its diagonal elements are equal to 1 + F_i, where F_i is the inbreeding coefficient of individual i, and its off-diagonal elements equal the numerator of the coefficient of relationship between individuals i and j¹². The covariance among breeding values can be obtained by multiplying the additive genetic variance by the A-matrix. As the additive genetic relationship among pairs of relatives is estimated using registered pedigree information, all members of a family receive the same expected relationship (for instance, 0.25 for half-sibs, and 0.50 for full-sibs). Hence, it is not possible to take account of Mendelian sampling, which is due to the random segregation of the alleles of the parents¹³ and may cause the actual relationship to deviate from the expected one, or of potential pedigree identification errors or contamination (e.g.¹⁴,¹⁵,¹⁶,¹⁷). Moreover, Askew and El-Kassaby¹⁸ reported that for relatively undomesticated forest tree species, the average relationship does not allow detecting unknown population structure and/or inbreeding, as often shown for tree species (e.g.¹⁴,¹⁹). Thus, forest tree breeding values as well as heritability and genetic gain estimates obtained only with pedigree information could be biased.
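To make the structure of the A-matrix concrete, the following minimal sketch builds it with the classical tabular (recursive) rules: the diagonal is 1 + F_i, with F_i equal to half the relationship between the parents of i, and each off-diagonal element is the average of the relationships with the two parents. It is an illustration only, not the code used in any of the studies reviewed; the function name and the small pedigree are hypothetical, and parents are assumed to be listed before their offspring.

```python
def numerator_relationship_matrix(pedigree):
    """Build the A-matrix with the tabular method.

    pedigree: list of (parent1, parent2) index tuples, one per individual,
    with parents listed before their offspring and None for unknown parents.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        # Diagonal: 1 + F_i, where F_i is half the relationship between the parents.
        f_i = 0.5 * A[s][d] if s is not None and d is not None else 0.0
        A[i][i] = 1.0 + f_i
        # Off-diagonal: average of the relationships of j with the parents of i.
        for j in range(i):
            a_js = A[j][s] if s is not None else 0.0
            a_jd = A[j][d] if d is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
    return A


# Hypothetical pedigree: 0, 1 and 2 are unrelated founders;
# 3 and 4 are full-sibs (0 x 1); 5 is their half-sib (0 x 2);
# 6 comes from a cross between the full-sibs 3 and 4, so it is inbred.
pedigree = [(None, None), (None, None), (None, None),
            (0, 1), (0, 1), (0, 2), (3, 4)]
A = numerator_relationship_matrix(pedigree)

print("full-sibs:", A[3][4])       # expected relationship of full-sibs
print("half-sibs:", A[3][5])       # expected relationship of half-sibs
print("inbred diagonal:", A[6][6])  # 1 + F for the offspring of full-sibs
```

Every full-sib pair receives exactly the same value in this matrix, which is the limitation discussed above: the pedigree cannot express the variation around that expectation caused by Mendelian sampling.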

In the early 2000s, Meuwissen et al.²⁰, in their seminal paper, proposed using genome-wide distributed markers to model the entire complement of QTL effects across the genome, whether these effects are significant or not, and to obtain genomic-estimated breeding values (GEBVs). This method is called genomic selection (GS). Since then, it has been tested for the selection of complex traits in numerous species, such as maize and wheat²¹, trees²²,²³, and cattle²⁴. In the last two decades, there has been a rapid development in DNA marker technologies. The availability of large numbers of markers distributed genome-wide and relatively inexpensive high-throughput genotyping technologies offers the possibility to improve the efficiency of tree breeding at a reasonable cost²⁵,²⁶,²⁷,²⁸,²⁹. Hence, depending on the availability of markers for a given species, a large number of individuals can be genotyped for up to many thousands of markers. It is then possible to obtain genotypes from many different loci well distributed across all the chromosomes of a species (e.g.¹⁷,³⁰).

Various statistical methods have been developed for GS and they can be classified in two main groups³¹. The methods of the first group are based on the concept that it is possible to predict the genetic value of individuals by using a regression model that relates phenotypes to all available markers. Because the number of available markers generally exceeds the number of individuals in the population used to solve the regression, and the predictors (markers) are highly correlated, variable selection or shrinkage estimation procedures are required. Hoerl and Kennard³² proposed a method called ridge regression, which introduces a small bias in exchange for a substantial reduction in variance, leading to a lower overall mean squared error. Tibshirani³³ introduced the least absolute shrinkage and selection operator (LASSO) as an alternative to ridge regression. Since then, Bayesian versions of these shrinkage estimation methods have also been proposed to address the problem of multi-collinearity. With these regression methods, the genetic effect of each marker can be estimated and the summation of these marker effects for a given individual corresponds to its GEBV. The second group of methods uses the genomic relationships between individuals of a population (G-matrix), which are derived from their multi-locus genotypes, in a linear mixed or animal model framework to directly estimate the GEBV of any individual. This approach is usually referred to as Genomic Best Linear Unbiased Prediction or simply GBLUP. It can be used in the context of an additive infinitesimal model where the standard pedigree-based numerator relationship matrix (A-matrix) is replaced by the marker-based or realized genomic relationship matrix (G-matrix)³¹,³⁴. Thus, this second marker-based method is more familiar to tree breeders and quantitative geneticists, and its results are easier for them to interpret.
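As a rough illustration of how a realized genomic relationship matrix can be derived from multi-locus genotypes, the sketch below uses one widely used construction (centring allele counts by twice the allele frequency and scaling by 2Σp(1 − p)). The text does not specify which G-matrix formula each reviewed study used, so this choice is an assumption, and the function name and toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Compute a realized relationship matrix from marker genotypes.

    genotypes: list of rows (one per individual) of allele counts coded
    0, 1 or 2 at each biallelic marker. At least one marker must be
    polymorphic, otherwise the scaling factor is zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each marker.
    p = [sum(row[k] for row in genotypes) / (2.0 * n) for k in range(m)]
    # Centre each genotype by twice the allele frequency.
    Z = [[row[k] - 2.0 * p[k] for k in range(m)] for row in genotypes]
    # Scale so that the matrix is comparable to the pedigree-based A-matrix.
    scale = 2.0 * sum(pk * (1.0 - pk) for pk in p)
    return [[sum(Z[i][k] * Z[j][k] for k in range(m)) / scale
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: 4 individuals genotyped at 5 markers.
genotypes = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0],
    [2, 1, 0, 1, 2],
    [1, 2, 1, 0, 1],
]
G = genomic_relationship_matrix(genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

Unlike the A-matrix, the entries of such a matrix vary among pairs that share the same pedigree relationship, because they depend on the alleles each individual actually carries.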

More recently, a new method has been proposed that makes it possible to combine, in a single GS analysis, trees that were genotyped and trees for which only the pedigree information is known³⁵. This method is referred to as single-step GBLUP. A relationship matrix called the H-matrix is generated using information from both the A-matrix and the G-matrix, and then, as for GBLUP, it is possible to work in the context of an additive infinitesimal model by simply replacing the A-matrix with the H-matrix.

Forest geneticists and tree breeders have used all these methods to analyze their data over the last decade. Depending on the genetic structure of the populations studied and the experimental design used, they reported estimates of heritability and genotype-by-environment (GxE) interaction for several quantitative traits based on additive effects only, or on both additive and non-additive effects (e.g.²,²⁸,³⁶,³⁷,³⁸,³⁹). It is thus possible to make comparisons between results obtained with the pedigree-based and the marker-based approaches. Using a meta-analytical approach, our objective was to verify, for a large number of conifer and broadleaf tree species and traits, whether the estimates of narrow-sense heritability of quantitative traits, as well as the expected genetic gains derived from selection, were biased upward or downward when using registered pedigree information only, as compared to using marker-based information. We also wanted to compare the estimates of GxE interaction obtained with both approaches to determine if they provided the same results.

Results

Most of the published studies reported results for both growth and wood quality traits (Table 1 and Supplementary material 1). In a few of the studies, heritability estimates were also presented for insect resistance traits¹,². Narrow-sense heritability estimates were available for 87 study-trait pairs of pedigree-based and marker-based estimates. In total, this metadata sample represents estimates obtained from 16 distinct species or taxa, corresponding to 23 distinct tree breeding populations. Using a Wilcoxon signed-rank test, we found significant differences between the pedigree-based and the marker-based narrow-sense heritability estimates (p-value = 3.43e−09) when all the traits in all studies were considered, with pedigree-based estimates being higher than marker-based ones (Fig. 1a). The same trend was also observed for growth traits (p-value = 2.87e−05) and wood quality traits (p-value = 7.92e−05), when analyzed separately. When considering only conifers (57 study-trait pairs), the pedigree-based narrow-sense heritability estimates were again significantly higher than those obtained with DNA marker information (p-value = 4.23e−08; Fig. 1c). This trend was also observed for broadleaf species studies, although the statistical significance was weaker (p-value = 0.0175; Fig. 1d). Genetic gains after selection of the top 5% of trees for a variety of growth, wood quality and insect resistance traits were also provided in seven studies on spruces. A comparison of the pedigree-based and the marker-based estimated genetic gains using a Wilcoxon signed-rank test again showed that, globally, the gains estimated using the pedigree information only were higher than those estimated using DNA marker information (p-value = 0.0021) (Fig. 1b).

Table 1 Pedigree-based and marker-based estimates of narrow-sense heritability for a variety of traits of conifer and broadleaved tree species.

In a few studies and for several traits, results of analyses carried out on more than one site were reported. In such cases, estimates of GxE interaction were presented as type-B genetic correlations, which are inversely related to the magnitude of GxE interaction. We found such estimates in six of the studies reported in the literature (Table 2). Although we could observe variation in the estimates between pedigree-based and marker-based methods, we could not find an overall significant difference (p-value = 0.0644) using a Wilcoxon signed-rank test. However, this result clearly depended on the type of traits analyzed. Indeed, when the two approaches were compared for wood quality traits only, there was no significant difference between them (p-value = 0.9645) (Fig. 2a). On the other hand, statistical significance was reached for the other traits (growth and resistance to insects pooled together, p-value = 0.0107) (Fig. 2b), with lower type-B genetic correlations and thus higher GxE interaction estimates obtained with the information based on DNA markers. Only two studies considered insect resistance traits, so no formal testing could be conducted for these traits only. When considering only growth traits, the statistical difference between pedigree-based and marker-based methods was of the same order of magnitude (p-value = 0.0206) (Fig. 2c) as that obtained by pooling growth and insect resistance traits (Fig. 2b). This contrasting result between wood quality traits and the other traits may not be surprising, given that wood quality traits are generally under stronger genetic control than the other traits, and thus less influenced by the environment⁴⁰,⁴¹,⁴². Therefore, the absence of a significant difference between the two approaches when all traits were considered together arose because wood quality traits, which represented 58% of the study-trait pairs analyzed, are clearly less influenced by GxE interaction.
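The inverse relation between the type-B genetic correlation and GxE interaction can be illustrated with a small sketch. It assumes one commonly used definition, the ratio of the additive genetic variance to the sum of the additive and additive-by-environment variances; the individual studies may have used other estimators, and the variance components below are hypothetical values chosen only for illustration.

```python
def type_b_correlation(var_additive, var_additive_by_env):
    """Type-B genetic correlation under a commonly used definition.

    Values close to 1 indicate little GxE interaction; lower values
    indicate that a larger share of the genetic variance is site-specific.
    """
    return var_additive / (var_additive + var_additive_by_env)


# Hypothetical variance components for the same trait.
r_b_weak_gxe = type_b_correlation(4.0, 1.0)    # small GxE variance
r_b_strong_gxe = type_b_correlation(4.0, 4.0)  # larger GxE variance

print(r_b_weak_gxe, r_b_strong_gxe)
```

Under this definition, a method that attributes less variance to the additive-by-environment term returns a higher type-B correlation, which is the direction of the difference reported for pedigree-based estimates of growth traits.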

Table 2 Pedigree-based and marker-based estimates of genotype-by-environment (GxE) interaction for a variety of traits of conifer tree species.

However, even for growth traits, the importance of GxE interaction appears to depend on the species and the populations considered. For instance, substantial GxE interactions were reported for interior spruce from western Canada⁴³, for loblolly pine from southeastern United States⁴⁴, and for Norway spruce in Sweden⁴⁵. Moreover, these authors indicated that GS models ideally need to be calibrated for each breeding zone. In contrast to these results, much lower GxE interactions were reported for black spruce³⁰,⁴⁶, even for growth traits. Similarly, GxE interaction was shown to be limited in large white spruce genetic trials established in eastern Canada²⁸,⁴⁷. Nevertheless, the sole use of pedigree-based information appeared to underestimate the importance of GxE interaction in most studies, especially for lower heritability traits, as compared to using marker-based methods (Table 2).

Discussion

Results of the present meta-analytical study provide clear evidence that quantitative genetic analyses based on registered pedigree information only resulted in upwardly biased estimates of narrow-sense heritability for growth, insect resistance, and wood quality traits in forest tree species, as compared to estimates obtained with realized genomic relationships based on DNA markers sampled genome-wide. For open-pollinated families, additive genetic variance estimates can be biased upwardly as the assumption of true half-siblings cannot be confirmed⁴⁸,⁴⁹. Several authors have indeed reported such bias¹,⁴³. However, this is the first time that a metadata analysis provides evidence of this bias even with a full-sib family pedigree structure, although it has already been suspected in a few previous studies (e.g.³⁶,⁵⁰). Indeed, when we removed results of studies on open-pollinated families (22 study-trait pairs) from our metadata analysis, the Wilcoxon signed-rank test remained highly significant (p-value = 5.64e−06), thus confirming the general bias that arises when registered pedigree information is used to represent the genetic relationships between the individuals making up the breeding population analyzed. The same upwardly biased trend was also observed for genetic gain estimates derived from pedigree information only, with important implications for conventional tree breeding programs.

The same significant trend was observed in both conifers and broadleaf species, although it was weaker in the latter. Several reasons may explain this difference. For conifers, 48 out of the 57 species-trait pairs compared (84%) showed higher pedigree-based heritability estimates than marker-based estimates. For broadleaf species, 22 of the 30 species-trait pairs (73%) showed such results. Thus, the proportion of pairs with higher pedigree-based estimates was somewhat lower in broadleaf species. The smaller number of species-trait pairs available for broadleaf species as compared to conifers may also partly explain the reduction in the significance of the statistical test. Indeed, the Wilcoxon matched-pairs signed-ranks test gives more weight to a pair that shows a large difference between the two conditions compared than to a pair that shows a small difference. For many of the species-trait pairs of the broadleaf species, the differences between the pedigree-based and the marker-based heritability estimates were quite small, but the trend remained statistically significant, as was observed for conifers. In addition, one can observe that the heritability estimates obtained with the pedigree information only were lower than those from marker information for all species-trait pairs in a single study on 6-year-old E. grandis x E. urophylla full-sibs⁵¹. After a thorough analysis, these authors concluded that putative pedigree errors (pollen contamination and mislabeling) negatively affected their ability to obtain accurate heritability estimates based on pedigree information only. This might be the best explanation, knowing that in another study on the same hybrid species⁴⁴, such lower pedigree-based heritability estimates were not observed. As this study provided half of the 8 species-trait pairs that did not follow the general trend of higher pedigree-based heritability estimates, it had some influence on the lower significance of the statistical test.

While pedigree records and performance data for dairy cattle date back to the late 1800s, together with widespread collection of performance data shortly thereafter⁵², forest tree breeding programs are generally only in their second or third generation at best⁴, and the available pedigree records and performance data are scarce. Thus, the presence of any preexisting inbreeding or relatedness among the ancestors of the current generation in typical forest genetic experiments is most often unknown unless marker-based assessment is used^18,19. Consequently, using the current generation average (theoretical) additive genetic relationships between individuals does not allow taking account of ancestral effects. This may partly explain the systematic overestimation of the real additive genetic relationships observed in this study when using the pedigree-based approach to estimate breeding values and associated genetic parameters in forest trees.

The DNA markers used to estimate identity-by-state relatedness between individuals represent the observed (realized genomic relationships) rather than the average (theoretical) relationship values and thus, make it possible to potentially capture distant relationships and the variation around close relationships due to Mendelian sampling³⁴,³⁵. Hence, more accurate additive genetic variance and breeding values can be obtained. As indicated by several authors²⁸,⁵³, GS can have a substantial impact on the rate of genetic gains, especially because the use of realized genomic relationships is associated with increased accuracy in estimating the additive genetic variance and the breeding values.

One could wonder if the combination of both the pedigree and marker information would help obtain more accurate genetic parameter estimates. Single-step GBLUP analysis or the use of blended relationship matrices (H-matrix), which makes it possible to carry out an analysis combining both genotyped individuals and those for which only the registered pedigree is available, was proposed to take advantage of all the information available³⁵. This approach would likely be useful when the number of genotyped individuals is limited. In such non-optimal conditions, the marker-based estimates would likely not be more accurate than those obtained with the registered pedigree information, and combining both types of data might be somewhat advantageous, but we would not recommend it outright. When the marker density and genome coverage are inadequate or the number of genotyped individuals is small relative to the non-genotyped individuals, estimates obtained would likely be closer to the upwardly biased pedigree-based genetic parameters, because the information derived from genomic data would be insufficient to counterbalance the bias from using pedigree information. Thus, the best option would still be the use of genomic-based approaches applied to most or all of the population even in a situation where all pedigree errors could be recovered by pedigree reconstruction, given that genomic-based approaches also take into account the effects of Mendelian sampling.

One interesting finding of the current metadata analysis is that the pedigree-based approach appears to underestimate the importance of GxE interaction as compared with marker-based methods for traits that respond more strongly to variation in environmental conditions. To delineate breeding zones and select superior trees to assemble their breeding and production populations, tree breeders have traditionally based their decisions mainly on growth and adaptive traits. These traits are generally under low to moderate genetic control and thus, are more influenced by the environment. Hence, the use of a marker-based approach to estimate GxE interaction more precisely would be beneficial, especially to tree breeders who have to address the reforestation and plantation needs of more heterogeneous lands or territories. The accurate prediction of trait value and genetic merit in specific environments is becoming even more important for some tree breeders given the context of deploying efficient climate change mitigation measures such as seed transfer and assisted migration.

In addition to obtaining more precise estimates of genetic parameters and gains, GS offers other significant advantages over conventional breeding based on registered pedigree information. It indeed makes it possible to practice higher selection intensities or facilitate multi-trait selection by screening large numbers of candidates without phenotyping all of them, or by phenotyping only a fraction of them². It also allows forward selection of superior individuals at an earlier stage and thus hastens breeding cycles¹⁷,⁵⁴,⁵⁵. This increased flexibility and efficiency will also prove particularly important in the context of climate change, to allow tree breeders to adjust their selection goals more rapidly. Consequently, in addition to being estimated more accurately, the genetic gains per unit of time provided by the use of GS are considerably increased, as are the benefits and profitability of tree breeding programs⁵⁶. Although our study was specifically on forest tree species, we believe that our results might also have similar implications for breeders working with other plant species that are in their first steps of domestication.

Overall, tree breeders should take advantage of the shortening of breeding cycles and the increase in accuracy of genetic parameter and genetic gain estimates resulting from the use of GS approaches. Over the last decade, the cost of genotyping offered by commercial high-throughput genotyping platforms has also been steadily decreasing, so that such an investment should now be viewed as affordable and essential for the sound management and renewed progress of tree breeding programs.

Material and methods

We conducted a meta-analytical review of scientific papers on tree genomic selection in which both marker-based and pedigree-based estimates of narrow-sense heritability were reported. We found 22 studies that were carried out over the last 10 years (Table 1 and Supplementary material 1). Among these, 14 were carried out on conifer breeding populations, whereas the eight remaining ones were mainly proof-of-concept studies for eucalypts. Results of several additional GS studies on forest tree species have also been reported. However, as pedigree-based genetic parameters were not presented along with marker-based estimates, it was not possible to include these additional studies in the current metadata analysis. The studies included also relied on various marker-based methods and in most cases, whatever the method used, the results were similar. Thus, when results were reported for GBLUP as well as for other marker-based methods, we preferentially presented estimates obtained with the GBLUP method in Table 1. When GBLUP results were not available, we indicated the marker-based method used. We also listed in Supplementary material 1 all marker-based methods tested in the various studies. GBLUP was also the marker-based method that we used for a new study that we conducted on an open-pollinated family test of black spruce (Picea mariana (Mill.) B.S.P.) and incorporated in the present analysis (see Table 1, and Supplementary material 2). To determine whether the genetic parameters estimated with the ABLUP method were different from those obtained with a marker-based method, we carried out non-parametric Wilcoxon matched-pairs signed-ranks tests using the wilcox.test function in the R v.3.6.1 environment⁵⁷.

We opted for a non-parametric test procedure because the genetic parameters compared were obtained using different genetic material as well as experimental designs and marker types. Use of a parametric test would have required that the data meet some assumptions, such as that differences in the matched pairs follow a normal distribution and that the sample of pairs is a random sample from its population. In the context of the present metadata analysis, these assumptions could not be met adequately. The Wilcoxon matched-pairs signed-ranks test⁵⁸ is a non-parametric test procedure that gives more weight to a pair that shows a large difference between the two conditions compared than to a pair that shows a small difference. This test makes it possible to tell which member of a pair is greater and to rank the differences in order of absolute size. With such a test, we could identify for each pair which member is greater, and we could make that judgment globally for the entire sample of matched pairs as well.
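The ranking logic described above can be sketched in a few lines. The analysis itself was run with wilcox.test in R; the version below is only an illustrative re-implementation in Python that uses the large-sample normal approximation without continuity correction, so its p-values will differ somewhat from exact ones for small samples. The function name and the paired heritability values are hypothetical and are not taken from Table 1.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon matched-pairs signed-ranks test.

    Returns (W+, W-, p-value), with the p-value taken from the normal
    approximation. Pairs with a zero difference are dropped and tied
    absolute differences receive their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied absolute differences starting at position i.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    # Larger absolute differences get larger ranks, hence more weight.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, w_minus, p_value


# Hypothetical paired heritability estimates for ten study-trait pairs.
h2_pedigree = [0.40, 0.35, 0.52, 0.28, 0.61, 0.45, 0.33, 0.50, 0.22, 0.38]
h2_markers = [0.31, 0.30, 0.40, 0.30, 0.45, 0.38, 0.32, 0.36, 0.25, 0.27]

w_plus, w_minus, p_value = wilcoxon_signed_rank(h2_pedigree, h2_markers)
print(w_plus, w_minus, round(p_value, 4))
```

In this toy sample, the two pairs in which the marker-based estimate is higher have small absolute differences and therefore low ranks, so they weigh little against the eight pairs in which the pedigree-based estimate is higher.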

References

Beaulieu, J. et al. Genomic selection for resistance to spruce budworm in white spruce and relationships with growth and wood quality traits. Evol. Appl. 13, 2704–2722 (2020).
Article CAS PubMed PubMed Central Google Scholar
Lenz, P. et al. Multi-trait genomic selection for weevil resistance, growth and wood quality in Norway spruce. Evol. Appl. 13, 76–94 (2020).
Article CAS PubMed Google Scholar
Lebedev, V. G., Lebedeva, T. N., Chernodubov, A. I. & Shestibratov, K. A. Genomic selection for forest tree improvement: Methods, achievements and perspectives. Forests 11, 1190 (2020).
Article Google Scholar
Mullin, T. J. et al. Economic importance, breeding objectives and achievements. In Genetics, Genomics and Breeding of Conifers (eds Plomion, C. et al.) (Science Publishers & CRC Press, 2011).
Google Scholar
Zhang, J., Peter, G. F., Powell, G. L., White, T. L. & Gezan, S. A. Comparison of breeding values estimated between single-tree and multiple-tree plots for a slash pine population. Tree Genet. Genomes 11, 48 (2015).
Article CAS Google Scholar
Martínez-García, P. J. et al. Predicting breeding values and genetic components using generalized linear mixed models for categorical and continuous traits in walnut (Juglans regia). Tree Genet. Genomes 13, 109 (2017).
Article Google Scholar
Weng, Y., Ford, R., Tong, Z. & Krasowski, M. Genetic parameters for bole straightness and branch angle in Jack pine estimated using linear and generalized linear mixed models. For. Sci. 63, 111–117 (2017).
Article Google Scholar
Mrode, R. A. Linear Models for the Prediction of Animal Breeding Values 2nd edn. (CAB International, 2005).
Book Google Scholar
Henderson, C. R. Theoretical bias and computational methods for a number of different animal models. J. Dairy Sci. 71, 1–16 (1988).
Article Google Scholar
Falconer, D. S. & Mackay, T. F. C. Introduction to Quantitative Genetics 4th edn. (Longman Publishing Group, 1996).
Google Scholar
Henderson, C. R. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32, 69–83 (1976).
Article MATH Google Scholar
Wright, S. Coefficients of inbreeding and relationship. Am. Nat. 56, 330–338 (1922).
Article Google Scholar
Hill, W. G. & Weir, B. S. Variation in actual relationship as a consequence of Mendelian sampling and linkage. Genet. Res. 93, 47–64 (2011).
Article CAS Google Scholar
Doerksen, T. K. & Herbinger, C. M. Male reproductive success and pedigree error in red spruce open-pollinated and polycross mating systems. Can. J. For. Res. 38, 1742–1749 (2008).
Article Google Scholar
Godbout, J. et al. Development of a traceability system based on SNP array for the large-scale production of high-value white spruce (Picea glauca). Front. Plant Sci. 8, 1264 (2017).
Article PubMed PubMed Central Google Scholar
Galeano, E., Bousquet, J. & Thomas, B. R. SNP-based analysis reveals unexpected features of genetic diversity, parental contributions and pollen contamination in a white spruce breeding program. Sci. Rep. 11, 4990 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Lenz, P. et al. Genomic prediction for hastening and improving efficiency of forward selection in conifer polycross mating designs: An example from white spruce. Heredity 124, 562–578 (2020).
Askew, G. R. & El-Kassaby, Y. A. Estimation of relationship coefficients among progeny derived from wind-pollinated orchard seeds. Theor. Appl. Genet. 88, 267–272 (1994).
Doerksen, T. K., Bousquet, J. & Beaulieu, J. Inbreeding depression in intra-provenance crosses driven by founder relatedness in white spruce. Tree Genet. Genomes 10, 203–212 (2014).
Meuwissen, T. H. E., Hayes, B. J. & Goddard, M. E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829 (2001).
Heffner, E. L., Lorenz, A. J., Jannink, J.-L. & Sorrells, M. E. Plant breeding with genomic selection: Gain per unit time and cost. Crop Sci. 50, 1681–1690 (2010).
Grattapaglia, D. & Resende, M. D. V. Genomic selection in forest tree breeding. Tree Genet. Genomes 7, 241–255 (2011).
Beaulieu, J., Doerksen, T., Clément, S., MacKay, J. & Bousquet, J. Accuracy of genomic selection models in a large population of open-pollinated families in white spruce. Heredity 113, 342–352 (2014).
Habier, D., Tetens, J., Seefried, F.-R., Lichtner, P. & Thaller, G. The impact of genetic relationship information on genomic breeding values in German Holstein cattle. Genet. Sel. Evol. 42, 5 (2010).
Perkel, J. SNP genotyping: six technologies that keyed a revolution. Nat. Methods 5, 447–454 (2008).
Pavy, N. et al. Development of high-density SNP genotyping arrays for white spruce (Picea glauca) and transferability to subtropical and nordic congeners. Mol. Ecol. Resour. 13, 324–336 (2013).
Thomson, M. J. High-throughput genotyping to accelerate crop improvement. Plant Breed. Biotechnol. 2, 195–212 (2014).
Beaulieu, J., Doerksen, T., MacKay, J., Rainville, A. & Bousquet, J. Genomic selection accuracies within and between environments and small breeding groups in white spruce. BMC Genomics 15, 1048 (2014).
Liu, L., Chen, R., Fugina, C. J., Siegel, B. & Jackson, D. High-throughput and low-cost genotyping method for plant genome editing. Curr. Protoc. 1, e100 (2021).
Lenz, P. et al. Factors affecting the accuracy of genomic selection for growth and wood quality traits in an advanced-breeding population of black spruce (Picea mariana). BMC Genomics 18, 335 (2017).
de los Campos, G., Hickey, J. M., Pong-Wong, R., Daetwyler, H. D. & Calus, M. P. L. Whole-genome regression and prediction models applied to plant and animal breeding. Genetics 193, 327–345 (2013).
Hoerl, A. E. & Kennard, R. W. Ridge regression: biased estimation for non-orthogonal problems. Technometrics 12, 55–67 (1970).
Tibshirani, R. Regression shrinkage and selection via the LASSO. J. R. Stat. Soc. Ser. B 58, 267–288 (1996).
VanRaden, P. M. Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423 (2008).
Legarra, A., Aguilar, I. & Misztal, I. A relationship matrix including full pedigree and genomic information. J. Dairy Sci. 92, 4656–4663 (2009).
Zapata-Valenzuela, J., Whetten, R. W., Neale, D., McKeand, S. & Isik, F. Genomic estimated breeding values using genomic relationship matrices in a cloned population of loblolly pine. Genes Genomes Genet. 3, 909–916 (2013).
Muñoz, P. R. et al. Unraveling additive from non-additive effects using genomic relationship matrices. Genetics 198, 1759–1768 (2014).
Ratcliffe, B. et al. Single-step BLUP with varying genotyping effort in open-pollinated Picea glauca. Genes Genomes Genet. 7, 935–942 (2017).
Gamal El-Dien, O. et al. Multienvironment genomic variance decomposition analysis of open-pollinated Interior spruce (Picea glauca x engelmannii). Mol. Breed. 38, 26 (2018).
Zobel, B. J. & Sprague, J. R. Juvenile Wood in Forest Trees (Springer, 1988).
Osorio, L. F., White, T. L. & Huber, D. A. Age trends of heritabilities and genotype-by-environment interactions for growth traits and wood density from clonal trials of Eucalyptus grandis Hill ex Maiden. Silv. Genet. 50, 108–117 (2000).
Baltunis, B. S., Gapare, W. J. & Wu, H. X. Genetic parameters and genotype by environment interaction in radiata pine for growth and wood quality traits in Australia. Silv. Genet. 59, 113–124 (2010).
Gamal El-Dien, O. et al. Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing. BMC Genomics 16, 370 (2015).
Resende, M. D. V. et al. Genomic selection for growth and wood quality in Eucalyptus: Capturing the missing heritability and accelerating breeding for complex traits in forest trees. New Phytol. 194, 116–128 (2012).
Chen, Z.-Q. et al. Accuracy of genomic selection for growth and wood quality traits in two control-pollinated progeny trials using exome capture as the genotyping platform in Norway spruce. BMC Genomics 19, 946 (2018).
Beaulieu, J., Perron, M. & Bousquet, J. Multivariate patterns of adaptive genetic variation and seed source transfer in Picea mariana. Can. J. For. Res. 34, 531–545 (2004).
Li, P., Beaulieu, J. & Bousquet, J. Genetic structure and patterns of genetic variation among populations in eastern white spruce (Picea glauca). Can. J. For. Res. 27, 189–198 (1997).
Namkoong, G. Inbreeding effects on estimation of genetic additive variance. For. Sci. 12, 8–13 (1966).
Squillace, A. E. Average genetic correlations among offspring from open-pollinated forest trees. Silv. Genet. 23, 149–156 (1974).
Muñoz, P. R. et al. Genomic relationship matrix for correcting pedigree errors in breeding populations: impact on genetic parameters and genomic selection accuracy. Crop Sci. 53, 1115–1123 (2014).
Tan, B. et al. Evaluating the accuracy of genomic prediction of growth and wood traits in two Eucalyptus species and their F1 hybrids. BMC Plant Biol. 17, 110 (2017).
Weigel, K. A., VanRaden, P. M., Norman, H. D. & Grosu, H. A 100-year review: Methods and impact of genetic selection in dairy cattle—From daughter-dam comparisons to deep learning algorithms. J. Dairy Sci. 100, 10234–10250 (2017).
Grattapaglia, D. et al. Quantitative genetics and genomics converge to accelerate forest tree breeding. Front. Plant Sci. 9, 1693 (2018).
Park, Y.-S., Beaulieu, J. & Bousquet, J. Multi-varietal forestry integrating genomic selection and somatic embryogenesis. In Vegetative Propagation of Forest Trees (eds Park, Y.-S. et al.) 302–322 (National Institute of Forest Science, 2016).
Bousquet, J. et al. Spruce population genomics. In Population Genomics: Forest Trees (ed. Rajora, O. P.) (Springer Nature, 2021).
Chamberland, V. et al. Conventional versus genomic selection for white spruce improvement: A comparison of costs and benefits of plantations on Quebec public lands. Tree Genet. Genomes 16, 17 (2020).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
MacFarland, T. W. & Yates, J. M. Wilcoxon matched-pairs signed-ranks test. In Introduction to Nonparametric Statistics for the Biological Sciences Using R 133–175 (Springer, 2016) https://doi.org/10.1007/978-3-319-30634-6_5.
Li, Y. et al. Genomic selection for non-key traits in radiata pine when the documented pedigree is corrected using DNA marker information. BMC Genomics 20, 1026 (2019).
Calleja-Rodriguez, A. et al. Evaluation of the efficiency of genomic versus pedigree predictions for growth and wood quality traits in Scots pine. BMC Genomics 21, 796 (2020).
Ukrainetz, N. K. & Mansfield, S. D. Assessing the sensitivities of genomic selection for growth and wood quality traits in lodgepole pine using Bayesian models. Tree Genet. Genomes 16, 14 (2020).
Ukrainetz, N. K. & Mansfield, S. D. Prediction accuracy of single-step BLUP for growth and wood quality traits in the lodgepole pine breeding program in British Columbia. Tree Genet. Genomes 16, 64 (2020).
Thistlethwaite, F. R. et al. Genomic prediction accuracies in space and time for height and wood density of Douglas-fir using exome capture as the genotyping platform. BMC Genomics 18, 930 (2017).
Suontama, M. et al. Efficiency of genomic prediction across two Eucalyptus nitens seed orchards with different selection histories. Heredity 122, 370–379 (2019).
Müller, B. S. F. et al. Genomic prediction in contrast to a genome-wide association study in explaining heritable variation of complex growth traits in breeding populations of Eucalyptus. BMC Genomics 18, 524 (2017).
Thavamanikumar, S., Arnold, R. J., Luo, J. & Thumma, B. R. Genomic studies reveal substantial dominant effects and improved genomic predictions in an open-pollinated breeding population of Eucalyptus pellita. Genes Genomes Genet. 10, 3751–3763 (2020).
Resende, R. T. et al. Assessing the expected response to genomic selection of individuals and families in Eucalyptus breeding with an additive-dominant model. Heredity 119, 245–255 (2017).
Marco de Lima, B. et al. Quantitative genetic parameters for growth and wood properties in Eucalyptus “urograndis” hybrid using near-infrared phenotyping and genome-wide SNP-based relationships. PLoS ONE 14, e0218747 (2019).
Bouvet, J.-M., Makouanzi, G., Cros, D. & Vigneron, Ph. Modeling additive and non-additive effects in a hybrid population using genome-wide genotyping: Prediction accuracy implications. Heredity 116, 146–157 (2016).
Pégard, M. et al. Favorable conditions for genomic evaluation to outperform classical pedigree evaluation highlighted by a proof-of-concept study in poplar. Front. Plant Sci. 11, 581954 (2020).


Acknowledgements

We are grateful to Simon Nadeau (Canadian Wood Fibre Centre, Natural Resources Canada), who shared some of his analytical pipelines. We are grateful to Dr. Ronald Sederoff, Emeritus Professor at North Carolina State University, for his constructive comments on a previous version of this manuscript, as well as to two anonymous reviewers for their useful suggestions for improvement. We thank Mrs. Kathy Tosh of the New Brunswick Tree Improvement Council for sharing samples and phenotypic data of the black spruce open-pollinated family test for which new results were presented in this study. We also thank the Canadian Wood Fibre Centre of Natural Resources Canada for sharing genotypic data of this latter material, and we acknowledge financial support from the Canada Research Chair in Forest Genomics (Université Laval) to J. Bousquet, the FastTRAC tree genomics project led by J. Bousquet and G. Smith and funded by Genome Canada and Génome Québec, and the Spruce-Up tree genomics project (234FOR) led by J. Bohlmann and J. Bousquet and funded by Genome Canada, Génome Québec and Genome British Columbia.

Author information

Authors and Affiliations

Canada Research Chair in Forest Genomics, Institute of Systems and Integrative Biology and Centre for Forest Research, Université Laval, 1030 Avenue de la Médecine, Quebec, QC, G1V 0A6, Canada
Jean Beaulieu, Patrick Lenz & Jean Bousquet
Natural Resources Canada, Canadian Wood Fibre Centre, Quebec, QC, G1V 4C7, Canada
Patrick Lenz

Authors

Jean Beaulieu
Patrick Lenz
Jean Bousquet

Contributions

J.Be. carried out the literature review, performed data analyses and drafted the manuscript. P.L. made available the unpublished genotypic data on black spruce and prepared the figures, and J.Bo. designed and oversaw the study. All authors reviewed the manuscript.

Corresponding author

Correspondence to Jean Beaulieu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Beaulieu, J., Lenz, P. & Bousquet, J. Metadata analysis indicates biased estimation of genetic parameters and gains using conventional pedigree information instead of genomic-based approaches in tree breeding. Sci Rep 12, 3933 (2022). https://doi.org/10.1038/s41598-022-06681-y


Received: 01 September 2021
Accepted: 31 January 2022
Published: 10 March 2022
DOI: https://doi.org/10.1038/s41598-022-06681-y

