Historical trend on seed amino acid concentration does not follow protein changes in soybeans

Soybean [Glycine max (L.) Merr.] is the most important oilseed crop for the animal industry due to its high protein concentration and high relative abundance of essential and non-essential amino acids (AAs). However, selection for high-yielding genotypes has reduced seed protein concentration over time, and little is known about its impact on AAs. The aim of this research was to determine the genetic shifts in seed composition for 18 AAs in 13 soybean genotypes released between 1980 and 2014. Additionally, we tested the effect of nitrogen (N) fertilization on protein and AAs trends. Soybean genotypes were grown in field conditions during two seasons under a control (0 N) and an N-fertilized treatment receiving 670 kg N ha−1. Comparing the oldest and newest genotypes, seed yield increased 50% and protein decreased 1.2%. The application of N fertilizer did not significantly affect protein and AAs concentrations. Leucine, proline, cysteine, and tryptophan concentrations were not influenced by genotype. The concentrations of the other AAs decreased linearly over time at rates ranging from −0.021 to −0.001 g kg−1 year−1. Eleven AAs (including essential AAs such as lysine, tryptophan, and threonine) increased in concentration relative to protein. These results provide a quantitative assessment of the trade-off between yield improvement and seed AAs concentrations and will enable future genetic yield gain without overlooking seed nutritional value.

N uptake during the season 17, being mostly remobilized from vegetative organs 18 and concurrently assimilated from soil mineral N and biological N fixation. Hence, N accumulated prior to seed formation will be a predominant source for protein synthesis in seeds 19. The effect of N fertilization on seed protein concentration has been studied, but the response seems to be erratic 16, and historical trends in protein dilution could not be reversed even in heavily N-fertilized environments 20. However, the degree to which individual AAs respond to N application across historical genotypes remains unknown.
Therefore, considering a historical set of soybean genotypes released between 1980 and 2014 in the United States, the aims of this research were as follows: (i) determine the genetic gain of 18 AAs concentrations in soybean seeds, (ii) evaluate the response in seed AA profile to N fertilization, and (iii) compare the AAs genetic gain by clustering those AAs presenting similar shifts (in both absolute and relative terms) relative to seed weight and protein. This information contributes to our ability to understand the magnitude and potential determinants of seed composition changes, therefore enabling future investigations for a more effective selection of seed nutritional value in soybeans.

Results
Seed yield and protein genetic gain in historical genotypes. The absolute genetic gain was given by the relationship between the crop trait and the genotype year of release. The mean seed yield genetic gain was 0.04 Mg ha−1 year−1 (Fig. 1a) regardless of the N fertilization treatment (Table 1). Seed yield ranged between 2.7 and 4.1 Mg ha−1, with an estimated yield increase of approximately 50% from 1980 to 2014. Seed protein concentration decreased at a rate of 0.122 g kg−1 year−1 (Fig. 1b). Similar to yield, seed protein concentration was affected neither by N fertilization nor by its interaction with year of release (Table 1). The average protein concentration was 349 g kg−1, with an overall reduction of 1.18% over the entire evaluation period, corresponding to a total decrease of 4.15 g kg−1 (Fig. 1b).
The concentrations of most AAs decreased with year of release (Fig. 2), with the exception of leucine, proline, cysteine, and tryptophan (Fig. 2c, i, p, r). Similar to the yield and protein results, N fertilization did not affect the AA trends over time. Glutamic acid displayed a rate of −0.021 g kg−1 year−1, with a decrease of 1.22% over the 1980 to 2014 timeframe (Fig. 2a). The concentration of aspartic acid decreased by 1.07% at a rate of −0.012 g kg−1 year−1 (Fig. 2b). The other nonessential AAs (arginine, alanine, serine, glycine, and tyrosine) followed the same overall decreasing trend. Arginine concentration decreased by 1.43% in the selected time period, at a rate of −0.01 g kg−1 year−1 (Fig. 2d), whereas alanine concentration decreased by 0.8% at a rate of −0.003 g kg−1 year−1 (Fig. 2j). Serine and glycine concentrations changed at rates of −0.005 and −0.004 g kg−1 year−1, respectively (Fig. 2k, l), with overall reductions of 1.21% and 0.88%, respectively, over the evaluated period.
In the essential AAs group, lysine concentration decreased by 0.92% over the evaluated period at a rate of −0.006 g kg−1 year−1 (Fig. 2e). Phenylalanine decreased at a rate of −0.005 g kg−1 year−1, with concentration ranging from 17.54 to 17.37 g kg−1 (Fig. 2f). Valine decreased by 0.81% at a rate of −0.004 g kg−1 year−1 (Fig. 2g). Isoleucine, threonine, and histidine showed rates of −0.004, −0.003, and −0.003 g kg−1 year−1, with average decreases of 0.85%, 1.03%, and 1.15%, respectively (Fig. 2h, m, o). Both leucine and tryptophan concentrations remained steady across genotypes, with averages of 25.5 and 3.8 g kg−1, respectively (Fig. 2c, r). For the sulfur amino acids, only methionine concentration was influenced by the genotype's year of release. Genetic gain for methionine was −0.001 g kg−1 year−1 in seed concentration, with the most recent genotype containing 0.71% less methionine than the oldest genotype evaluated in this study (Fig. 2q). Cysteine was constant in soybean seeds across the years of release, with a mean concentration of 5.6 g kg−1 (Fig. 2p).
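The totals reported for yield and protein follow from the fitted slopes and the 34-year release window (1980 to 2014). The short Python sketch below reproduces that arithmetic from the values given in the text. The function names are hypothetical, and the reference values used for the percentages (the mean protein concentration and the lowest reported yield) are assumptions, since the text does not state which reference underlies each reported percentage.

```python
def total_change(slope_per_year, first_year, last_year):
    """Total change implied by a linear trend between two release years."""
    return slope_per_year * (last_year - first_year)


def percent_change(change, reference):
    """Change expressed as a percentage of a reference value."""
    return 100.0 * change / reference


# Slopes reported in the text; 1980 and 2014 bound the release window.
protein_change = total_change(-0.122, 1980, 2014)  # g kg-1
yield_change = total_change(0.04, 1980, 2014)      # Mg ha-1

# Assumed references: mean protein (349 g kg-1) and lowest yield (2.7 Mg ha-1).
protein_pct = percent_change(protein_change, 349.0)
yield_pct = percent_change(yield_change, 2.7)

print(f"protein: {protein_change:.2f} g kg-1 ({protein_pct:.2f}%)")
print(f"yield: {yield_change:+.2f} Mg ha-1 ({yield_pct:+.1f}%)")
```

Under these assumptions the protein decrease comes out near the reported 4.15 g kg−1 and roughly 1.2%, and the yield increase comes out near the reported 50%.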
Not all AAs declined by the same magnitude as seed protein, as portrayed by the relative-to-protein genetic gain (Fig. 3b). This relationship is expressed as the concentration of each individual AA relative to protein (%) as a function of genotype year of release. Some AAs showed a less than proportionate reduction relative to protein and were therefore clustered as less negatively affected by the overall decrease in protein (black solid points). This cluster comprised phenylalanine, lysine, glycine, isoleucine, valine, leucine, tyrosine, alanine, threonine, proline, and tryptophan. A second cluster comprised AAs with a decreasing relative-to-protein concentration whose trend was not significantly different from zero: arginine, glutamic acid, serine, aspartic acid, histidine, methionine, and cysteine (empty points) (Fig. 3b).

Discussion
Our results highlight the historical trends in seed AAs concentrations. Yield improvement and protein reduction were within the range reported in the literature 5,21–25. In addition, as previously documented for soybeans in maturity group III 20, the rate of protein reduction over years was unchanged by the application of N fertilizer (670 kg N ha−1). Fourteen of the 18 AAs analyzed were present in lower concentrations in more recently released genotypes (Figs. 2, 3a). The shifts of the most abundant AAs in soybean (glutamic acid and aspartic acid) were in the same range as the protein reduction rate (Fig. 3a). In contrast, the concentrations of the essential AAs lysine and threonine increased relative to protein, which may affect the nutritional quality of soybean meal 4. Although a majority of the AAs decreased in absolute values, 11 AAs increased in concentration relative to protein (Fig. 3b), including leucine, isoleucine, histidine, phenylalanine, and valine, which are essential AAs for animal nutrition 12,26. Therefore, breeding efforts to develop high-protein genotypes should consider the underlying impact on AAs and the potential impact on the nutritional value of the seeds 6,27.
Addition of N did not affect protein or AA shifts over time (Table 1). A significant effect of N fertilization on protein is more likely in environments with poor N supply, such as greenhouse conditions 16 or low activity of biological nitrogen fixation 28. In field studies, biological N fixation resulting from seed inoculation or infection by indigenous soil rhizobia may provide sufficient N to support high crop performance 29–31. For the tested yield levels, our results indicate the inability of N fertilization to reverse the decline in soybean seed protein and AAs concentrations. A similar outcome was presented by Wilson et al. (2013), who documented a protein decline of −0.25 g kg−1 year−1 for soybean genotypes released between 1923 and 2008 under contrasting N rates (zero vs. 560 kg N ha−1). Regarding the AA profile, controlled-condition studies support a positive relationship between supra-optimal N and essential 32 or storage AAs 33. For non-leguminous crops, field studies have validated the concept of N application enhancing AAs concentrations 34–36. For soybeans, however, only a few field studies have included N fertilization, and the results showed an increase in AAs concentrations only under low N availability 28 or associated with a reduction in sulfur AAs 37. These responses were not observed in our results, highlighting the lack of effect of a non-limiting N supply in offsetting protein and AAs depression in historical soybean genotypes.
Additionally, protein concentration was shown to be a better predictor of changes in AAs over time, but only for some AAs, such as glutamic acid, arginine, lysine, valine, proline, alanine, glycine, threonine, tyrosine, and tryptophan (Fig. 4). Using protein as a predictor of AA changes over time was previously reported 9, but considering only modern genotypes rather than a historical set as in the current study. Our findings show similar relationships for some AA changes, such as the relative-to-protein genetic gain of glutamic acid and arginine (Fig. 3b), but other AAs did not exactly follow the trend of protein, e.g., aspartic acid, leucine, phenylalanine, isoleucine, serine, histidine, cysteine, and methionine (Fig. 4). To date, there are no reported predictive models describing the entire AA profile as a function of protein concentration (as a reference seed composition fraction). Therefore, establishing foundational prediction models for AAs will assist breeders and ultimately growers in delivering soybean genotypes that target specific market demands.

Conclusions
This research explored the shifts in protein and AAs due to the genetic improvement of soybean genotypes from 1980 to 2014. These shifts were not driven by an increased N supply via inorganic N, as the N fertilization treatment did not change any trends for AAs concentrations. Similar negative rates, in absolute concentrations, were observed for some AAs such as arginine and glutamic acid, but not for the rest of the AA profile relative to protein. Therefore, using the genetic gain in seed protein concentration as an indicator of potential changes in AAs is not a valid rationale. Emerging areas of research focusing on the genetic control of amino acid synthesis and its interaction with the environment will provide the foundation for improving seed traits while either maintaining or improving the nutritional value of soybean.

Methods
The experimental design was a randomized complete block in a split-plot arrangement with four replications in both seasons. The main plot consisted of the N factor with two levels, and the subplot was the genotype factor with 13 levels. The N treatments were N fertilization at a rate of 670 kg N ha−1 and a control without N (0 N). The N fertilizer was split equally among sowing and the R1 and R3 phenological stages 39 as a side-dressed application of liquid urea ammonium nitrate (N-P-K, 28-0-0). Seed inoculation was performed shortly before sowing with the application of 3 × 10⁹ colony units of Bradyrhizobium japonicum per 1 kg of seeds.
Seed yield, protein and amino acids determination. At harvest maturity (R8), the two center rows in each plot were harvested with a plot combine, and the seed yield was adjusted to a 130 g kg−1 water content basis. Approximately one kilogram of seed was sampled from each plot to measure seed composition. After seeds were dried to constant weight, the samples were ground to 0.1 mm final particle size. Protein and AAs concentrations were estimated with near-infrared spectroscopy (NIR) using the Perten DA7200 Feed Analyzer (Perten Instruments, Stockholm, Sweden). Briefly, the raw ground material was scanned between 1000 and 2500 nm wavelength and the reflectance was normalized to a reference ceramic plate. The readings were screened to remove errors caused by uneven cup filling or sample size heterogeneity. The calibration between normalized reflectance and AA concentration was cross-validated using standard samples analyzed by wet chromatography following the AOAC 982.30 method 40. The calibration curves were evaluated by root-mean-square error (RMSE). This method estimates the protein and 18 AAs concentrations (g kg−1) corrected to water content. However, this method does not distinguish between asparagine and aspartate, or between glutamine and glutamate. Therefore, glutamic acid and aspartic acid were reported as the sum of their respective components.
Scientific Reports | (2020) 10:17707 | https://doi.org/10.1038/s41598-020-74734-1 | www.nature.com/scientificreports/
The absolute genetic gain (g kg−1 year−1) was estimated by the regression of yield, protein, or each individual AA on genotype year of release. The relative genetic gain (% year−1) was calculated to allow the comparison between amino acid and protein concentrations. Thus, the slope of the absolute genetic gain for each AA was divided by the most recent estimated concentration 41 (Eq. 1).
The relative-to-protein genetic gain (% year−1) was determined from the relationship between the AA-to-protein concentration ratio (Eq. 2) and genotype year of release.
Finally, to investigate the correlation between AAs and protein regardless of the year of release, AA concentrations relative to protein (Eq. 2) were tested against protein concentration.
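Eq. 1 and Eq. 2 are cited above but their expressions do not appear in this text. Based only on the verbal definitions given, they can plausibly be written as shown below. The symbols are assumptions introduced here for illustration: $b_{AA}$ is the slope of the absolute genetic gain for a given AA, $\hat{C}_{AA,\mathrm{recent}}$ is the concentration estimated for the most recent year of release, and $C_{AA}$ and $C_{\mathrm{protein}}$ are seed concentrations in g kg−1. The factor of 100 is inferred from the stated percentage units.

```latex
\begin{align}
\text{Relative genetic gain}\ (\%\ \text{year}^{-1})
  &= \frac{b_{AA}}{\hat{C}_{AA,\mathrm{recent}}} \times 100 \tag{1} \\
\text{AA relative to protein}\ (\%)
  &= \frac{C_{AA}}{C_{\mathrm{protein}}} \times 100 \tag{2}
\end{align}
```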
Data analysis. We first tested the effect of N treatment by fitting two linear mixed models for each variable. The first model included the year of release, N treatment, and their interaction as fixed-effect factors, and the second model included only the year of release as the fixed effect. Because N treatment was not significant, the model with the lowest Akaike Information Criterion (AIC) score was selected. The random factors included year, block nested in year, N treatment nested in the interaction of block with year, and genotype nested in the interaction of N treatment, block, and year. Models were fitted using the package "lme4" 42 within the R software 43. Assumptions of normality and homogeneity of the residuals were checked, and no transformation was required.
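The AIC-based choice between the two candidate models can be illustrated with a minimal Python sketch. It only shows the selection rule (AIC = 2k − 2 ln L, lowest score wins). The log-likelihoods and parameter counts below are hypothetical placeholders, not values from this study, and the models themselves were fitted with lme4 in R rather than in Python.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical placeholders for the two candidate models (not study results).
candidates = {
    "year + N + year:N": {"loglik": -210.4, "k": 9},
    "year only": {"loglik": -211.0, "k": 7},
}

scores = {name: aic(m["loglik"], m["k"]) for name, m in candidates.items()}
best = min(scores, key=scores.get)  # the lowest AIC is preferred

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("selected:", best)
```

With these placeholder values, the small gain in likelihood from the extra N terms does not offset the penalty for two additional parameters, so the year-only model is selected.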
A resampling-with-replacement bootstrap was used to estimate the slope coefficient and the empirical distribution of the model estimators 44. A total of 5000 iterations were performed. All distributions were summarized by the median, and the 2.5th and 97.5th percentiles were used as the boundaries of the 95% confidence intervals (CI), allowing statistical inference on the parameters 45. The Pearson correlation coefficient (r) was estimated from the variable median estimation for each year of release (means distribution). The relative-to-protein genetic gain and relative genetic gain were empirically clustered using confidence intervals different from zero to separate increasing, neutral, or decreasing trends over the years. Standardized major axis (SMA) regression was fitted 46 to test the relationship between AAs concentrations relative to protein and protein concentration. Data visualization (Figs. 1, 2, 3, 4) was performed using the package "ggplot2" 47 within the R software 43.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
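The percentile bootstrap used in the data analysis (5000 resamples with replacement, median as the point estimate, 2.5th and 97.5th percentiles as the 95% CI) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: an ordinary least-squares slope stands in for the mixed-model slope, the 13 release years are hypothetical, and the protein values are synthetic numbers generated around the reported trend, not study data.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope(x, y, n_iter=5000, seed=42):
    """Median and 95% percentile CI of the slope from case resampling."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # slope undefined if all x are identical
            continue
        ys = [y[i] for i in idx]
        slopes.append(ols_slope(xs, ys))
    cuts = statistics.quantiles(slopes, n=40)  # cut points every 2.5%
    return statistics.median(slopes), cuts[0], cuts[-1]


# Hypothetical release years and synthetic protein values (g kg-1).
years = [1980, 1983, 1986, 1989, 1992, 1995, 1998,
         2001, 2004, 2007, 2010, 2012, 2014]
noise = random.Random(0)
protein = [351.0 - 0.122 * (yr - 1980) + noise.gauss(0.0, 0.5) for yr in years]

median, ci_low, ci_high = bootstrap_slope(years, protein)
print(f"slope median = {median:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

A trend would be classed as decreasing when the whole interval lies below zero, as increasing when it lies above zero, and as neutral when the interval includes zero, which mirrors the empirical clustering rule described above.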