Understanding the combining ability of nutritional, agronomic and industrial traits in soybean F2 progenies

Obtaining soybean genotypes that combine better nutrient uptake, higher oil and protein levels in the grains, and high grain yield is one of the major challenges for current breeding programs. To avoid developing unpromising populations, selecting parents for crossing is a crucial step in the breeding pipeline. Therefore, our objective was to estimate the combining ability of soybean cultivars based on the F2 generation, aiming to identify superior parents and segregating populations for agronomic, nutritional and industrial traits. Field experiments were carried out at two locations in the 2020/2021 crop season. Leaf contents of the following nutrients were evaluated: phosphorus, potassium, calcium, magnesium, sulfur, copper, iron, manganese, and zinc. The agronomic traits assessed were days to maturity (DM) and grain yield (GY), and the industrial traits protein, oil, fiber and ash contents were also measured in the populations studied. There was a significant genotype × environment (G × E) interaction for all nutritional traits except P content, as well as for DM and all industrial traits. The parent G3 and the segregating populations P20 and P27 can be used to obtain higher nutritional efficiency in new soybean cultivars. The segregating populations P11 and P26 show higher potential for selecting soybean genotypes that combine earliness and higher grain yield. The parent G5 and the segregant population P6 are promising for selection aimed at improving industrial traits in soybean.


Joint analysis of variance and grouping of means for agronomic traits
There were significant effects of genotypes and SCA (p-value < 0.05) for grain yield (Table S2). Environmental effects (E) were significant for DM and GY. G × E, GCA × E and SCA × E interactions were significant for DM.

Analysis of variance and grouping of means for industrial traits
There were significant effects of genotypes (G) and environments (E) (p-value < 0.05) for the industrial traits PC and OC (Table S3). There were significant GCA and SCA effects for OC. The G × E and SCA × E interactions were significant for all the evaluated traits. The GCA × E interaction was significant for all traits except AC.
Table 4 contains the grouping of means for the industrial traits. The segregant population P16 obtained the highest means for PC in Chapadão do Sul and for OC and FC at both sites. The parents G2 and G5 and the segregant populations P10, P15 and P20 obtained higher means for PC at both sites. The segregant population P24 obtained higher means for FC in both environments. Populations P13 and P17 obtained higher AC in both environments evaluated.

Combining ability for nutritional traits
The G3 parent stood out for contributing positive values to the contents of K, Mg, Ca, Fe, Mn, and Zn in both evaluated environments (Table 5), except for Fe levels in Aquidauana and Mn in Chapadão do Sul, whose values were negative. Similarly, G8 presented positive estimates for K, Mg, Ca, Fe, Mn, and Zn in both locations, except for K levels in Chapadão do Sul and Ca in Aquidauana, which showed negative values.
SCA estimates for macronutrient contents evaluated in the F2 segregating populations of soybean are shown in Table 6. Population P7 stood out for presenting positive estimates for the contents of P, K, Ca, Mg and S at both locations, except for the K levels in Chapadão do Sul. Similarly, P20 also presented positive estimates for these traits, except for S contents in Chapadão do Sul.
Table 7 contains SCA estimates for micronutrient levels evaluated in F2 segregating populations. Population P7 stood out for presenting positive estimates for Cu, Mn, and Zn levels at both locations. Population P13 presented positive estimates for all micronutrients except for Fe in Aquidauana and Zn in Chapadão do Sul, which were negative. Population P15 presented positive estimates for all micronutrients at both locations except for Mn.

Combining ability for agronomic traits
For the GCA of the DM trait (Table 8), the parent G3 stood out for presenting negative values at both evaluated locations. Conversely, the parents G6 and G8 presented positive GCA estimates for this trait in Aquidauana and Chapadão do Sul.
The segregating populations P1 and P15 stood out by presenting positive SCA estimates for the traits days to maturity and grain yield (Table 9). Other populations that deserve to be highlighted are P11, P12 and P26, which presented negative estimates for DM in both locations as well as positive SCA estimates for grain yield.

Combining ability for industrial traits
The parent G5 stood out by contributing positive values for all industrial traits in both locations (Table 10).
Segregant populations P1, P24, P25, P26 and P27 stood out by showing positive estimates of protein content for both environments (Table 11). The segregant population P6 showed positive estimates for protein and oil content in Aquidauana and Chapadão do Sul. The population P11 obtained positive estimates for oil and fiber in both locations and for ash in Aquidauana. The segregant population P19 obtained positive estimates for oil and ash contents in both locations evaluated.

Discussion
Diallel crosses allow estimation of the general combining ability (GCA), which is associated with predominantly additive gene effects, and the specific combining ability (SCA), which is related to non-additive gene effects 13. GCA was defined by Sprague and Tatum (1942) as the mean behavior of a parent line across a series of hybrid combinations, and this behavior results from the additive effects of the alleles. These authors defined SCA as the vigor of a cross compared to that expected from the estimated GCA of the parents used in hybridization, which is determined by dominance (complete or partial) and/or epistasis. SCA is interpreted as an additional effect on hybrid expression relative to the parental GCA effects, and can be positive or negative. SCA results from the interaction of parental GCA effects and can improve or worsen hybrid expression relative to the expected effect based on GCA alone 14. SCA effects, estimated as deviations from the behavior expected based on GCA, are measures of non-additive gene effects. Hybrid combinations with more favorable SCA estimates that involve at least one of the parents with the most favorable GCA effect are desirable 15.
The parents showed a significant genotype × environment interaction for most of the nutritional contents, which is one of the main challenges in choosing and recommending superior cultivars. This interaction allows the identification of stable genotypes for specific environments or of genotypes with general behavior adapted to a wide range of environments 16. In this research, besides the distinct climatic factors (Fig. 2) of each location, the physical-chemical properties of the soil are important factors for the occurrence of a significant genotype × environment interaction. Beyond the need for more productive cultivars, growers also face higher spending on fertilizers. Adopting nutrient-efficient genotypes is a strategy, especially in Cerrado soils, to reduce costs and prevent environmental impacts. When evaluating the nutrient contents, we found significant SCA effects for P content, for which the parents G1, G5, and G7 and the segregating populations P4, P5, P11, P14, P18, P19, P24, and P27 stood out. Thus, the difference between the means of the nutrients evaluated does not reflect only the individual behavior of the parents 11,17, but can also be attributed to environmental growing conditions. The soils of the Cerrado are highly weathered, with low levels of plant-available P, and retain nutrients in their colloids 18. Under P-limiting conditions, several metabolic problems can occur in plants, leading to delayed maturation and decreased yield 19.
Obtaining information on the uptake and metabolization of P in plants allows the selection of lines that develop well in soils with low P contents and makes it possible to use less phosphate fertilizer, which is essential for the sustainability of agricultural production and for avoiding environmental problems caused by the incorrect use of fertilizers 20. A study of P efficiency and responsiveness in soybean genotypes 21 classified the cultivars into four groups: efficient and responsive, efficient and non-responsive, non-efficient and responsive, and non-efficient and non-responsive. The authors found that selecting P-use efficient cultivars in an environment with low availability of this nutrient favored the selection of cultivars responsive to the nutrient. Using genotypes with a better capacity to accumulate potassium (K) results in improved carbohydrate and protein metabolism and starch translocation, which are used in grain formation 22. Similarly, the selection of parents and segregating populations with higher Ca content provides enhanced structural metabolism, since the element acts in cell wall synthesis, pollen tube growth, and pollen grain germination 23. Selecting genotypes more efficient in the uptake and metabolization of sulfur (S), which has structural and metabolic functions in plants 24, may be a promising strategy to develop cultivars better adapted to degraded soils, especially in the Brazilian Cerrado.
Table 3. Grouping of means for the agronomic traits days to maturity and grain yield (kg ha−1) evaluated in parents and F2 segregating populations of soybean in Aquidauana (E1) and Chapadão do Sul (E2). Means followed by different letters in the same column differ by the Scott Knott test at 5% probability.
Mg content was significant for GCA, for which the genotypes G1, G3 and G7 and the segregating populations P1, P2, P20, P21, P24 and P27 stood out. Thus, at least one of the parents used in the crosses differed from the others regarding the concentration of alleles favorable for higher Mg expression 17. The advantage of obtaining lines with higher Mg contents in the plant is related to the role of this nutrient in activating enzymatic reactions and its presence in the structural part of plants (chlorophyll molecule) 25.
In this context, the increasingly intensive use of soil, due to successive cropping, may result in degradation that leads to nutritional disorders in plants. Given this scenario, selecting genotypes containing higher levels of micronutrients is crucial for breeding programs. The parents G3 and G8 and the populations P7, P13, and P15 stood out for their micronutrient estimates. Thus, selecting genotypes that are efficient in micronutrient uptake and metabolization allows plants to require a reduced amount of nutrients while performing as well as other genotypes in areas with nutrient limitations.
Besides improving plants' nutritional efficiency, soybean breeding programs have also aimed to develop cultivars combining earliness and high grain yield 26,27. According to Almeida et al., soybean cultivars can be classified as early (up to 111 days), semi-early (112 to 124 days), and late (above 125 days). Soybean is a crop strongly influenced by weather conditions. By using early genotypes, farmers can minimize losses from end-of-cycle diseases 26,28 and can also grow second-season maize 29.
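The maturity classes above can be written as a simple lookup. The following is a minimal Python sketch; the function name `maturity_class` is hypothetical, and the handling of the boundaries (cycles of 111 days or fewer counted as early, and exactly 125 days counted as late) is an assumption, since the text does not assign those values explicitly.

```python
def maturity_class(days_to_maturity):
    """Classify a cultivar by cycle length in days (boundary handling is assumed)."""
    if days_to_maturity <= 111:
        return "early"
    if days_to_maturity <= 124:
        return "semi-early"
    # Assumption: 125 days is grouped with "late" (the text says "above 125 days").
    return "late"


# Usage with illustrative cycle lengths (not data from this study).
for days in (105, 118, 130):
    print(days, maturity_class(days))
```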
In the search for genotypes adapted to specific environments, the duration of the vegetative phase is an essential attribute to be considered. Plants that do not have juvenility genes will flower early, thus reducing the plant's size and leading to losses in grain yield 30. Selecting early-flowering genotypes may lead to lower grain yield since these plants have a reduced height and number of nodes 31.
Table 6. Estimates of specific combining ability (sij effects) for macronutrient contents (P, K, Ca, Mg, and S), evaluated in F2 segregating populations of soybean in Aquidauana (E1) and Chapadão do Sul (E2).
Regarding DM, it is desirable that the genotypes present lower means (i.e., higher earliness) and that at least one of the parents has negative GCA estimates 32. Parent G3 showed high negative GCA estimates for DM, suggesting a high concentration of alleles favorable to shortening the cycle of these soybean lines. The significant GCA effects indicate that some parents will contribute a higher number of favorable alleles to the offspring 33.
Considering the segregating populations evaluated in this experiment, P2, P9, P11, P12, P14, P23, P25, and P26 stood out for presenting negative specific combining ability estimates for DM in both locations evaluated. These findings reveal the possibility of obtaining earlier genotypes from these crosses after a few inbreeding generations 34. However, among these populations, only P11, P12, P25, and P26 showed positive values for grain yield, with P11 and P26 showing the highest means.
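The screening logic described above (negative SCA estimates for DM combined with positive estimates for GY) can be sketched as follows. This is an illustration only: the function name, the dictionary layout, and the population labels and values in the usage example are hypothetical placeholders rather than estimates from Table 9, and requiring positive GY estimates at both locations is an assumption.

```python
def select_early_and_productive(sca_dm, sca_gy):
    """Return populations with negative SCA for DM and positive SCA for GY.

    Both arguments map a population label to a tuple of SCA estimates,
    one value per location, e.g. (estimate at E1, estimate at E2).
    """
    early = {pop for pop, est in sca_dm.items() if all(v < 0 for v in est)}
    productive = {pop for pop, est in sca_gy.items() if all(v > 0 for v in est)}
    return sorted(early & productive)


# Hypothetical placeholder values, NOT estimates from this study.
sca_dm = {"popA": (-1.2, -0.4), "popB": (-0.8, 0.3), "popC": (-2.0, -1.1)}
sca_gy = {"popA": (150.0, 80.0), "popB": (200.0, 120.0), "popC": (-60.0, 40.0)}
print(select_early_and_productive(sca_dm, sca_gy))
```

Only `popA` passes both filters in this toy input, because `popB` is not earlier at the second location and `popC` has a negative yield estimate at the first.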
Besides combining favorable traits such as nutritional efficiency, earliness and high yields, current soybean cultivars must have improved levels of traits of industrial interest. With the increasing consumption of animal protein, the demand for soybean meal for poultry, cattle and confined swine feed has been rising. To supply this sector, the industry needs soybean with high grain protein and oil contents, at least 40% protein and 20% oil, while the national average protein content is around 37% 35. Early and indeterminate cultivars tend to have higher protein contents, which may be a response to increased exposure to solar radiation and heat during the grain-filling phase 36. An evaluation of four soybean cultivars developed and improved in Brazil 37 found protein contents ranging between 33.4% and 35.1%, values below those found here. However, increasing ash, fiber, oil and protein contents in soybean grains is a complex task for breeders because of the strong environmental influence on the genes 38 and the negative relationships among these traits and between them and grain yield. For example, fiber levels have decreased over the years as oil content has increased, since the two variables are negatively correlated. Likewise, the oil content tends to decrease when genotypes are selected for higher protein content. Thus, the genotypes that stood out for the industrial variables should be used further in the breeding process, seeking to improve such traits simultaneously with the other characteristics of interest in the breeding pipeline.
Table 9. Specific combining ability estimates (sij effects) for the agronomic traits days to maturity (DM) and grain yield (GY) evaluated in segregating F2 populations of soybean in Aquidauana (E1) and Chapadão do Sul (E2).
The parents and segregating populations responded differently to selection for nutritional, earliness, yield, and industrial traits. These genotypes should be monitored throughout the breeding process because they guide breeders toward the traits they want to improve and toward genotypes that combine one or more traits of interest in a soybean cultivar.
To identify soybean parents and segregating populations with better characteristics regarding the uptake and metabolism of nutrients, earliness, yield, and contents of ash, fiber, oil and protein, our study assessed the genetic variability among parents and populations through a diallel analysis. Our findings reveal that the parent G3 and the segregating populations P20 and P27 can be used for improved nutritional efficiency in new soybean cultivars. The segregating populations P11 and P26 show higher potential for selecting genotypes combining early maturity and high grain yield. The parent G5 and the segregant population P6 are promising for selection seeking to improve industrial traits in soybean.

Obtaining the progenies in the F1 generation
Hybrids were obtained in the greenhouse, using commercial cultivars as parents (Table 12). Divergence based on the relative maturity group (RMG) was used as the selection criterion for the parents. Twenty-eight crosses were performed to obtain the F1 generation, as described in Table 13. All methods were carried out in accordance with relevant institutional, national, and international guidelines and legislation.
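With eight parents crossed in all pairwise combinations, and assuming a half diallel without selfs or reciprocals, the number of crosses is 8 × 7 / 2 = 28, which matches the number of populations reported. A minimal Python sketch of this enumeration follows; the parent labels G1 to G8 are those used in the text, but the pairing order is an assumption and need not match the numbering P1 to P28 in Table 13.

```python
from itertools import combinations

# Parent labels as used in the text.
parents = [f"G{i}" for i in range(1, 9)]

# All unordered pairs of distinct parents (no selfs, no reciprocals).
crosses = list(combinations(parents, 2))

print(len(crosses))  # 28
print(crosses[:3])   # first pairs in enumeration order (order is assumed)
```

Each parent takes part in seven crosses under this scheme, which is what makes the estimation of a general combining ability per parent possible.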

Experimental design and treatments
A randomized block design was used with two replications, eight parents (Table 1) and 28 F2 populations (Table 2).
The plots consisted of one three-meter row, with 0.45 m spacing between rows and a density of 15 plants m−1. This size was adopted because of the limited quantity of seeds from the crosses, so that the genotypes could be evaluated in two locations.
For the nutritional analysis, we used the third fully developed leaf from the plant's apex, considered diagnostic for soybean nutritional analysis, where most metabolic processes responsible for energy acquisition occur. Twenty-five leaves with petioles were collected from each experimental unit. The contents of macronutrients were expressed in g kg−1, while micronutrients were expressed in mg kg−1.
The agronomic traits evaluated were days to maturity (DM) and grain yield (GY, kg ha−1). DM corresponded to the days between emergence and maturation of more than 50% of plants in each experimental unit. GY was evaluated by harvesting the central 2 m of each plot and correcting for 13% moisture.
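The yield calculation described above can be sketched in Python as below. This is a sketch under stated assumptions, not the authors' procedure: it assumes the harvested area equals the 2 m of row multiplied by the 0.45 m row spacing, that grain mass (in grams) and moisture (wet basis, in %) are measured per plot, and it uses the usual wet-basis correction, mass × (100 − M) / (100 − 13). The function and parameter names are hypothetical.

```python
def grain_yield_kg_ha(plot_mass_g, moisture_pct, row_length_m=2.0,
                      row_spacing_m=0.45, target_moisture_pct=13.0):
    """Convert plot grain mass to kg/ha at a standard moisture content."""
    # Assumed harvested area: harvested row length times row spacing.
    area_m2 = row_length_m * row_spacing_m
    # Wet-basis moisture correction to the target moisture.
    corrected_g = plot_mass_g * (100.0 - moisture_pct) / (100.0 - target_moisture_pct)
    # g -> kg, then scale from the plot area to one hectare (10,000 m2).
    return corrected_g / 1000.0 / area_m2 * 10000.0


# Illustrative input (not data from this study): 270 g at 13% moisture.
print(round(grain_yield_kg_ha(270.0, 13.0), 1))  # 3000.0
```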
Protein (PC, %), oil (OC, %), fiber (FC, %) and ash (AC, %) contents in the F2 populations were measured by near-infrared spectroscopy (NIRS) (Metrohm, DS2500 spectrometer, Herisau, Switzerland) with high optical precision. Grain samples were homogenized and placed in a sampling dish. The analysis was based on illuminating a sample with a specific radiation wavelength in the near-infrared region and then measuring the difference between the amount of energy emitted by the spectrometer and the amount reflected by the sample to the detector (AOAC, 2000). This difference was measured in several bands, creating a spectrum for each sample. The output was compared with a calibration set and expressed as a percentage.

Statistical analyses
Initially, a joint analysis of variance was performed in the Genes software according to the statistical model described below:
$$Y_{ijk} = \mu + B/E_{jk} + G_i + A_j + GA_{ij} + e_{ijk}$$
wherein: $Y_{ijk}$ is the observation in the k-th block, evaluated in the i-th genotype and j-th environment; $\mu$ is the overall mean of the experiments; $B/E_{jk}$ is the effect of block k within environment j; $G_i$ is the effect of the i-th genotype, considered as fixed; $A_j$ is the effect of the j-th environment, taken as random; $GA_{ij}$ is the random effect of the interaction between genotype i and environment j; and $e_{ijk}$ is the random error associated with $Y_{ijk}$.
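For a balanced data set, the sums of squares of a joint analysis of this kind can be computed directly from marginal means. The sketch below is not the Genes implementation; it is a minimal NumPy illustration in which the array layout `y[i, j, k]` (genotype, environment, block), the function name and the dictionary keys are assumptions. With genotypes fixed and environments random, the genotype mean square is tested against the interaction mean square, and the interaction against the error.

```python
import numpy as np


def joint_anova(y):
    """Balanced joint ANOVA; y[i, j, k] = genotype i, environment j, block k."""
    y = np.asarray(y, dtype=float)
    g, e, r = y.shape
    grand = y.mean()
    m_gen = y.mean(axis=(1, 2))   # genotype means
    m_env = y.mean(axis=(0, 2))   # environment means
    m_ge = y.mean(axis=2)         # genotype x environment cell means
    m_be = y.mean(axis=0)         # block-within-environment means, shape (e, r)

    ss = {
        "env": g * r * np.sum((m_env - grand) ** 2),
        "block/env": g * np.sum((m_be - m_env[:, None]) ** 2),
        "gen": e * r * np.sum((m_gen - grand) ** 2),
        "gen x env": r * np.sum(
            (m_ge - m_gen[:, None] - m_env[None, :] + grand) ** 2
        ),
    }
    # Error sum of squares obtained by difference from the total.
    ss["error"] = np.sum((y - grand) ** 2) - sum(ss.values())

    df = {
        "env": e - 1,
        "block/env": e * (r - 1),
        "gen": g - 1,
        "gen x env": (g - 1) * (e - 1),
        "error": e * (g - 1) * (r - 1),
    }
    ms = {k: ss[k] / df[k] for k in ss}
    # Mixed model: fixed genotypes tested against the interaction.
    f = {"gen": ms["gen"] / ms["gen x env"],
         "gen x env": ms["gen x env"] / ms["error"]}
    return ss, df, ms, f


# Simulated (not real) data: 36 genotypes, 2 environments, 2 blocks.
rng = np.random.default_rng(1)
y_sim = rng.normal(loc=3000.0, scale=300.0, size=(36, 2, 2))
ss, df, ms, f = joint_anova(y_sim)
print(df)
```

The dimensions in the usage example mirror the design described here (36 genotypes, two locations, two blocks), but the values are random draws used only to exercise the function.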
Afterward, the genotype and G × A interaction effects were partitioned at each location according to the partial diallel structure based on the F2 progenies to obtain additive ($g_i$) and dominance ($s_{ij}$) effects. Diallel analysis followed method 4 proposed by Griffing 40 to estimate general and specific combining abilities, as described below:
$$Y_{ij} = \mu + g_i + g_j + s_{ij} + e_{ij}$$
wherein: $Y_{ij}$ is the mean of the cross between the i-th line from group 1 and the j-th line from group 2; $\mu$ is the overall mean of the diallel; $g_i$ is the general combining ability of the i-th line from group 1; $g_j$ is the general combining ability of the j-th line from group 2; $s_{ij}$ is the specific combining ability between the lines from groups 1 and 2; and $e_{ij}$ is the mean experimental error.
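Assuming a half diallel of p parents in which only the p(p − 1)/2 crosses are evaluated (no parents and no reciprocals), the least-squares estimates of a model of this form have closed expressions: with Y_i. the total of the crosses involving parent i and Y.. the grand total, µ = 2Y.. / (p(p − 1)), g_i = (p·Y_i. − 2·Y..) / (p(p − 2)) and s_ij = Y_ij − µ − g_i − g_j. The Python sketch below only illustrates these formulas for a single environment; the function name and the input layout (a symmetric p × p matrix of cross means whose diagonal is ignored) are assumptions, and it is not the procedure implemented in Genes.

```python
import numpy as np


def griffing_method4(y):
    """GCA and SCA estimates from a symmetric p x p matrix of cross means.

    The diagonal (parents) is ignored; p must be greater than 2.
    """
    y = np.asarray(y, dtype=float)
    p = y.shape[0]
    off = ~np.eye(p, dtype=bool)            # mask of off-diagonal cells
    crosses = np.where(off, y, 0.0)         # zero out the diagonal
    row_tot = crosses.sum(axis=1)           # Y_i. : total of crosses with parent i
    grand = crosses.sum() / 2.0             # Y..  : each cross counted once
    mu = 2.0 * grand / (p * (p - 1))
    gca = (p * row_tot - 2.0 * grand) / (p * (p - 2))
    sca = crosses - mu - gca[:, None] - gca[None, :]
    sca[~off] = np.nan                      # SCA is undefined on the diagonal
    return mu, gca, sca


# Toy check with purely additive, made-up values (not data from this study):
g_true = np.array([1.0, -1.0, 2.0, -2.0])
y_toy = 10.0 + g_true[:, None] + g_true[None, :]
mu, gca, sca = griffing_method4(y_toy)
print(mu, gca)
```

Because the toy matrix is built from additive effects only, the function recovers the input GCA values and returns SCA estimates of zero, which is a quick way to check the formulas before applying them to real cross means.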
Subsequently, grouping of means (overall across environments) of the 36 genotypes (eight parents and 28 F2 populations) was performed by the Scott and Knott test at the 5% probability level. All analyses were performed using the software Genes 41, following the procedures recommended by Cruz et al.

Table 1. Grouping of means for the nutritional contents of macronutrients (P, K, Ca, Mg and S) evaluated in parents and F2 segregating populations of soybean in Aquidauana (E1) and Chapadão do Sul (E2). Means followed by different letters in the same column differ by the Scott Knott test at 5% probability.

Table 2. Grouping of means for the nutritional contents of micronutrients (Cu, Fe, Mn and Zn) evaluated in parents and F2 segregating populations of soybean in Aquidauana (E1) and Chapadão do Sul (E2). Means followed by different letters in the same column differ by the Scott Knott test at 5% probability.

Table 4. Grouping of means for the industrial traits protein (PC), oil (OC), fiber (FC) and ash (AC) contents assessed in grain samples from parents and F2 segregating populations of soybean in Aquidauana (E1) and Chapadão do Sul (E2). Means followed by different letters in the same column differ by the Scott Knott test at 5% probability.

Table 8. Estimates of general combining ability (gi effects) for the days to maturity (DM) trait in soybean parents in Aquidauana (E1) and Chapadão do Sul (E2).

Table 10. General combining ability estimates (gi effects) for the industrial traits protein (PC), oil (OC) and fiber (FC) contents evaluated in soybean parents in Aquidauana (E1) and Chapadão do Sul (E2).

Table 13. List of the twenty-eight F1 soybean populations obtained.