Selection of nitrogen responsive root architectural traits in spinach using machine learning and genetic correlations

Awika, Henry O.; Mishra, Amit K.; Gill, Haramrit; DiPiazza, James; Avila, Carlos A.; Joshi, Vijay

doi:10.1038/s41598-021-87870-z

Article
Open access
Published: 05 May 2021

Scientific Reports volume 11, Article number: 9536 (2021)

Abstract

The efficient acquisition and transport of nutrients by plants largely depend on the root architecture. Due to the absence of complex microbial network interactions and soil heterogeneity in a restricted soilless medium, the architecture of roots is a function of genetics defined by the soilless matrix and exogenously supplied nutrients such as nitrogen (N). The knowledge of root trait combinations that offer the optimal nitrogen use efficiency (NUE) is far from being conclusive. The objective of this study was to define the root trait(s) that best predict and correlate with vegetative biomass under different N treatments. We used eight image-derived root architectural traits of 202 diverse spinach lines grown in two N concentrations (high N, HN, and low N, LN) in a randomized complete block design. Supervised random forest (RF) machine learning augmented by ranger hyperparameter grid search was used to predict the variable importance of the root traits. We also determined the broad-sense heritability (H) and genetic (r_g) and phenotypic (r_p) correlations between root traits and the vegetative biomass (shoot weight, SWt). Each root trait was assigned a predicted importance rank based on the trait’s contribution to the cumulative reduction in the mean square error (MSE) in the RF tree regression models for SWt. The root traits were further prioritized for potential selection based on the r_g and SWt correlated response (CR). The predicted importance of the eight root traits showed that the number of root tips (Tips) and root length (RLength) under HN and crossings (Xsings) and root average diameter (RAvdiam) under LN were the most relevant. SWt had a highly antagonistic r_g (−0.83) to RAvdiam, but a high predicted indirect selection efficiency (−112.8%) with RAvdiam under LN; RAvdiam showed no significant r_g or r_p to SWt under HN. 
In limited N availability, we suggest that selecting against larger RAvdiam as a secondary trait might improve biomass and, hence, NUE with no apparent yield penalty under HN.

Introduction

As the first tissues to intercept nutrients and water, roots play an essential role in plant growth and development. Root architecture varies widely in response to different nutrient deficiencies¹ and adapts to continually changing growth conditions through structural plasticity². Most plants can utilize only half of the applied N, losing the remainder in the form of nitrates (NO₃⁻), which can cause environmental hazards³. Spinach requires high rates of N fertilizer to produce high biomass and quality. It is estimated that about 60% of N applied to spinach in commercial production is lost through leaching⁴ due to its shallow root system⁵ and short production cycle. Spinach is also relatively poor in its NO₃⁻ reducing capacity^6,7. About 80% of the total root length of spinach settles in the upper 0–15 cm soil layer, and the root distribution is not affected even by the addition of usable nitrates below 15–30 cm⁵. Thus, the rooting structure and the growth patterns that define roots are essential considerations to delineate the differences in nutrient absorption and efficiency. This is even more important in spinach due to its high affinity for N⁸. The morphological changes in the root system are regulated by the plant's nutritional status and its interaction with the surrounding environment, as detected by roots through localized signals. Several studies have discussed the N-dependent (NO₃⁻, ammonium and glutamate) changes in the root architecture across species^9,10,11,12,13. Therefore, investigating root development is important for understanding plant responses to low-N stresses.

Unlike in soil, where the complex root-microbial association may naturally facilitate the plant absorptive capacity of hitherto unavailable N^14,15, root development in soilless media is entirely reliant on an exogenous supply of nutrients. Due to their inert nature, soilless systems minimize the changes that could occur due to gradients in temperature, oxygen status, water availability, pH, bulk density, or seasonal variations. In these cases, the root structure and appearance are a function of genetics as modified by the soilless matrix and the concentration of the nutrients applied¹⁶. Different studies have profiled root systems and their association with the environment in field conditions, non-soil media and microbes^17,18,19. Modeling techniques for root feature diversity, structure, and activity have been attempted, including multivariate and machine-learning techniques^20,21,22. However, much remains to be deciphered about the genetic basis of the measured root traits and how they should be prioritized by their potential importance in influencing the above-ground biomass.

Root structure and its functioning are associated with N uptake, which influences plant performance and yield. Although improving root performance is relevant to all crops, it is particularly relevant to short-cycle vegetable crops like spinach that would benefit from early below-ground vigor²³. However, the shorter crop duration of spinach allows only a short time for root development^24,25. To date, much research on developing breeding strategies to improve N uptake or utilization is focused on modifying the root architecture of main crops like maize, rice, and wheat. Although several root architectural, topological, and developmental traits such as deeper and longer roots, rapid growth, and higher root density associated with higher N use efficiency have been identified^26,27 in cereal crops, such efforts to identify root traits that respond to N stress to capture the N available at depths in vegetables are limited.

Soilless indoor farming has become increasingly popular in recent years^28,29. Since a soilless indoor system relies on the artificial supply of essential nutrients^28,29,30, nutrient management is critical for the harvestable quality and quantity of a crop^31,32. The differences in the 'value for money' between varieties of the same species may sometimes come down to how efficiently the crop can take up nutrients and convert them into harvestable products. Rooting system and architecture are important determinants of efficiency that maintain a favorable balance between resource investment (photosynthates) and resource acquisition (raw materials)^33,34. Varieties with favorable root architecture that enhance nutrient uptake and photosynthate use will reduce operating costs while balancing nutritional content and yield. This study investigates root traits and their relationship to the harvestable 'above-ground' biomass of spinach grown in two contrasting N supplies in a uniformly controlled indoor environment. Our assumption here is that in a uniform soilless matrix, genetics and the N management are the two primary sources of variation defining root architecture. We used a supervised random forest machine learning algorithm, optimized with 'ranger'³⁵, to predict the importance of eight root traits for the spinach's harvestable biomass at the 'baby-spinach' stage. We also applied META-R³⁶ to determine the genetic and phenotypic correlations and heritability to prioritize the root traits in influencing the harvestable shoot biomass. Finally, we compare the machine learning classification and the prioritization based on genetic correlations and present the top trait(s) with the highest selection potential.

Methods

Plants, plant material, experimental setup, and evaluation environment

A collection of 202 spinach (Spinacia oleracea) accessions maintained and provided by the USDA-National Plant Germplasm System (NPGS) (https://npgsweb.ars-grin.gov/) at Ames, Iowa, U.S.A. was used in the present study. The plants were grown in a growth chamber under controlled conditions of 12/12 h (light/dark), 22 °C, and 75% relative humidity. The seeds of each accession were sown in triplicate in turface (Turface Athletics MVP, PROFILE Products LLC, Buffalo Grove, Illinois, USA) in small pots (10.2 cm × 10.2 cm and 8.9 cm deep). Each set of replicates was completely randomized across separate shelves. After seedling emergence, plants were fertilized with Peters professional ready mix (5-11-26, hydroponics special water-soluble fertilizer, Everris NA Inc., Ohio, USA) every four days. Two concentrations of nitrogen (N), low N (50 ppm) and high N (200 ppm), were used for low and high N management, respectively. Additional N for the high N-management was provided using calcium nitrate, and the equivalent calcium (3.85 mM) was supplied as calcium chloride in the low N-management. The concentrations of the macro/micro-nutrients present in the fertilizer solution are provided in Supplementary Dataset Table S1.

The experimental research in the lab facilities for this study was performed as required by Texas A&M System Regulation (15.99.06 Use of Biohazards in Research, Teaching, and Testing) and the University’s Rule for the use of Biohazards and Dual Use Research of Concern (15.99.06.M1 Use of Biohazards, toxins and rDNA and DURC), approved by the Texas A&M Institutional Biosafety Committee (IBC).

Plant material processing, root imaging, and data processing

The plants were harvested at the physiological maturity of baby spinach (5–6 leaves), 41 days after sowing. Each plant was carefully pulled from the turface and washed with running water to clear any debris off the roots. The roots were separated from the shoot at the cotyledonary nodes and floated in water. The lateral roots were separated gently using a fine tip paintbrush to minimize the overlapping of roots. The images were taken by digitally scanning roots of individual plants (Supplementary Dataset Figure S1) using a high-resolution scanner (Calibrated Color Optical scanner STD4800 with Special Lighting System), and the scanned images were analyzed using WinRHIZO Pro software (Regent Instruments Inc. Canada). Categorization of the traits was adapted from the Fine-Root Ecology Database³⁷ classification and included: (1) morphology for root length (RLength) and average root diameter (RAvdiam); (2) an indication of the complexity of the root system architecture measured using number of tips (Tips), forks (Forks; number of root bifurcations), and crossings (Xsings; overlapping parts); (3) root system of the standing crop (RSSC) for root volume (RVol), root surface area (RSarea) and root weight (RWt). The harvestable above-ground biomass was determined as fresh shoot weight (SWt) (WR P-series balance, Model 500P, VWR International, U.S.A.) after removing the excess surface moisture by gently paper-blotting the wet roots followed by a two-minute air drying.

Data analysis

The analysis pipeline was designed to define the phenotypic, genotypic, and predictive relationship between the root traits and between the root traits and the SWt of spinach plants grown in a soilless system. We determined the r_g and r_p (defined below in the section 'Determining the genotypic and phenotypic correlation between traits') between root traits and r_g and r_p between the root traits and the SWt within and between the two N managements. Parallel to the correlation analyses, we used a supervised random forest machine learning^38,39 technique to estimate the variable importance of each root phenotype in predicting the above-ground shoot biomass. The details of the parallel procedures are described below.

Individual trait and combined management variance analysis and mean separation

Linear mixed models were implemented in lmer from package lme4 of R using REML via Multi Environment Trial Analysis with R, META-R³⁶, to calculate the adjusted means (best linear unbiased estimates, BLUEs, and predictors, BLUPs) for each root and shoot variable, under the two N managements. For individual analyses, the model was

$$Y_{ik} = \mu + Rep_{i} + Gen_{k} + \varepsilon_{ik}$$

(1)

where Y_ik is the trait of interest, µ is the mean effect, Rep_i is the effect of the ith replicate represented by the complete blocks, Gen_k is the effect of the kth genotype, and ε_ik is the error associated with the ith replication and the kth genotype, which is assumed to be normally and independently distributed, with mean zero and homoscedastic variance σ². For the combined analyses across the two N-managements, the model was adjusted to

$$Y_{ijkl} = \mu + Mgt_{i} + Rep_{j} \left( {Mgt_{i} } \right) + Gen_{l} + Mgt_{i} \times Gen_{l} + \varepsilon_{ijkl},$$

(2)

where the new terms Mgt_i and Mgt_i × Gen_l are the effects of the ith N-management and the N-management by genotype interaction, respectively. Genotype and N-management were both treated as random effects, and BLUPs were used to estimate random effects and BLUEs to estimate the fixed effect. Grand means were separated based on Fisher’s Least Significant Difference (LSD) at α = 5%. We also determined the coefficients of variation (CV) for all traits.

Heritability

We estimated the average broad-sense heritability (repeatability, H) of three replicates in each N-management, which is also an estimate of correlation expected between line means from the three replicate trials conducted in the two N-managements. H was determined⁴⁰ on a line mean basis as

$$H = \frac{\sigma^{2}_{g}}{\sigma^{2}_{g} + \sigma^{2}_{e} /nReps},$$

(3)

and combined for the two N-managements as

$$H = \frac{\sigma^{2}_{g}}{\sigma^{2}_{g} + \sigma^{2}_{ge}/nMgt + \sigma^{2}_{e} /\left( {nMgt \times nReps} \right)},$$

(4)

where σ²_g and σ²_e are the genotype and the residual error variance components, respectively; nReps is the number of replicates, σ²_ge is the variance component of genotype by N-management interaction, and nMgt is the number of N-managements in the analysis.
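As a worked illustration, the two heritability expressions can be evaluated directly from variance components, reading Eqs. (3) and (4) as the ratio of genotypic variance to line-mean phenotypic variance. The sketch below is in Python rather than the R/META-R tooling used in the study; the function names and the numeric variance components are hypothetical and chosen only for illustration.

```python
def heritability_single(var_g, var_e, n_reps):
    """Line-mean broad-sense heritability within one N-management (Eq. 3)."""
    return var_g / (var_g + var_e / n_reps)


def heritability_combined(var_g, var_ge, var_e, n_mgt, n_reps):
    """Line-mean heritability combined across N-managements (Eq. 4)."""
    return var_g / (var_g + var_ge / n_mgt + var_e / (n_mgt * n_reps))


# Hypothetical variance components (not values from the study).
h_within = heritability_single(var_g=2.0, var_e=3.0, n_reps=3)
h_across = heritability_combined(var_g=2.0, var_ge=2.0, var_e=3.0,
                                 n_mgt=2, n_reps=3)
print(round(h_within, 3), round(h_across, 3))
```

With these inputs, a genotype-by-management variance as large as the genotypic variance lowers the combined estimate relative to the within-management estimate, which is the pattern one would expect when the interaction term is substantial.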

Determining the genotypic and phenotypic correlation between traits

Genetic and phenotypic correlations were calculated for each trait pair, within and across the N-managements. The r_g were also determined in META-R, which applies the equations from Cooper et al.⁴¹. Between the N-managements, r_g was estimated as

$$r_{g} \left( {jj_{0} } \right) = \frac{r_{p} \left( {jj_{0} } \right)}{H_{j} H_{j_{0}}} ,$$

(5)

and between traits within a single N-management,

$$r_{g} = \frac{\sigma_{g} \left( {jj_{0} } \right)}{\sigma_{g} \left( j \right)\sigma_{g} \left( {j_{0} } \right)},$$

(6)

where r_p(jj₀) is the phenotypic correlation between N-managements j and j₀; H_j and H_j₀ are the heritabilities of N-managements j and j₀, respectively; σ_g(jj₀) is the arithmetic mean of all pairwise genotypic covariances between traits j and j₀; and σ_g(j)σ_g(j₀) is the arithmetic average of all pairwise geometric means among the genotypic variance components of the traits.

For graphical illustrations, cluster analysis based on the environment distance matrix (1 − genetic correlation matrix) was also performed using the ‘Ward’ method⁴², creating a dendrogram. In each case, a minimum heritability threshold was set at 0.1; any trait whose heritability within or between the two N-managements was lower than 0.1 was excluded from the analysis and was not plotted. For phenotypic correlations, simple Pearson correlations between different pairs of N-managements or traits were used.

Predicting correlated response

Correlated response (CR) was predicted for SWt to determine if direct or indirect selection resulting from selecting a root trait would be superior under similar N-managements. We used the formula:

$$CR_{swt} = i\,r_{g} \sqrt{H_{RT} V_{g(swt)}} ,$$

(7)

where CR_swt is the correlated response of SWt, r_g is the genetic correlation, H_RT is the repeatability of root traits, V_g(swt) is the genetic variance of SWt, and i is the selection intensity whose estimate we assumed would be similar between traits. Thus, CR_swt was compared to direct response (R),

$$R_{swt} = i\sqrt{H_{swt} V_{g(swt)}} ,$$

(8)

by CR_swt/R_swt = r_g√H_RT/√ H_swt. That is, if r_g × H_RT > H_swt, then indirect selection would be superior^43,44,45,46.
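The comparison of indirect and direct response can be sketched numerically by following Eqs. (7) and (8) as written, with the square root taken over the product of repeatability and genetic variance. This is only an illustrative Python sketch: the function names and all input values are hypothetical, and the selection intensity is assumed equal for both traits so that it cancels in the ratio.

```python
import math


def correlated_response(i, r_g, h_rt, vg_swt):
    """Correlated response of SWt to selection on a root trait (Eq. 7)."""
    return i * r_g * math.sqrt(h_rt * vg_swt)


def direct_response(i, h_swt, vg_swt):
    """Response of SWt to direct selection on SWt (Eq. 8)."""
    return i * math.sqrt(h_swt * vg_swt)


def relative_efficiency(r_g, h_rt, h_swt):
    """Ratio CR_swt / R_swt; i and V_g(swt) cancel."""
    return r_g * math.sqrt(h_rt) / math.sqrt(h_swt)


# Hypothetical inputs (not values from the study).
i, r_g, h_rt, h_swt, vg_swt = 1.76, -0.5, 0.64, 0.49, 4.0
cr = correlated_response(i, r_g, h_rt, vg_swt)
r = direct_response(i, h_swt, vg_swt)
print(round(cr, 3), round(r, 3), round(relative_efficiency(r_g, h_rt, h_swt), 3))
```

A negative ratio, as in this toy case, indicates that selecting for lower values of the root trait is what would be expected to raise SWt; its magnitude relative to 1 indicates whether the indirect route would match direct selection.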

Summary of data preparation and evaluation by machine learning

To rank the root traits by importance in the prediction models for SWt, we used random forest (RF) modeling in R. RF is a powerful ensemble machine learning tool that combines the outputs of numerous decision tree models. We applied the regression type in the randomForest⁴⁷ package and first ran the regression on default tuning parameters. We also invoked a user-defined hyperparameter tuning in the ranger³⁵ package to optimize our models; ranger is a C++ implementation of Breiman's FORTRAN-based random forest algorithm³⁹. Finally, we compared the accuracy of the model from the randomForest default tuning and that from the ranger hyperparameter search tuning. The function missForest⁴⁸ was used to impute missing data. Outliers were normalized by an internally derived proximity matrix procedure built into the RF. In this procedure, whenever an outlier case i and a case j end up in the same tree node, the proximity prox(i, j) between i and j is increased by 1; these counts are accumulated over all trees in the RF and then normalized by twice the number of trees in the RF. This creates a square proximity matrix in which observations that are ‘similar or alike’ in value have proximities close to 1 and dissimilar observations have proximities closer to 0.

Default tuning and model evaluation

The default data split (63.25% as the training dataset and the remainder as the validation set) was applied to train each N-management. The 63.25% is the proportion expected of unique observations in a bootstrap sample^39,49. The typical range is ~60 to 85%, where smaller sample sizes can reduce the training time but may introduce more bias than necessary, while too large a sample size can increase performance but at the risk of overfitting because it introduces more variance³⁹. A k-fold cross-validation feature in RF evaluates model performance by training the model on a number of different smaller datasets and evaluating it over the other smaller testing sets. By default, randomForest randomly splits the dataset into k folds of almost the same size, and each of the fold models is evaluated over the folds and tested on the remaining test set^39,47. This process is repeated until all the subsets have been evaluated. The regression tree parameters are tuned further by choosing the number of independent variables (m) using the default m = p/3, where p is the total number of root traits in our analysis. This helps generalize the data best to return the least out-of-bag (OOB) error rate and provides a built-in validation set. Further, it identifies more efficiently the number of trees (ntree) required to stabilize the error rate during tuning^39,49. OOB error is an internal error estimate of a random forest as it is being constructed³⁹. It is estimated by testing each tree built from the bootstrap aggregation (bagging samples) of the training set on the remaining samples (the validation set as defined by the default data split) not used in building that tree; randomForest chooses a random subset of features and builds many regression trees, and the model averages the predictions of all the decision trees.
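The bootstrap proportion quoted above follows from a short calculation: when n observations are drawn with replacement n times, the expected fraction of distinct observations is 1 − (1 − 1/n)ⁿ, which approaches 1 − 1/e (about 63.2%) as n grows. The Python sketch below checks this analytically and by simulation for n = 202, the number of accessions in this study; the function names, number of simulated draws, and seed are illustrative assumptions.

```python
import math
import random


def expected_unique_fraction(n):
    """Expected share of distinct cases in a bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def simulated_unique_fraction(n, n_draws=200, seed=0):
    """Monte Carlo estimate of the same quantity (seed fixed for repeatability)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        total += len(set(sample)) / n
    return total / n_draws


n_lines = 202  # number of spinach accessions in the study
analytic = expected_unique_fraction(n_lines)
simulated = simulated_unique_fraction(n_lines)
limit = 1.0 - 1.0 / math.e
print(round(analytic, 4), round(simulated, 4), round(limit, 4))
```

The cases left out of each bootstrap sample (roughly the remaining 37%) are the out-of-bag cases on which each tree can be tested.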

Setting hyperparametric tuning and evaluation parameters

We first determined the optimal number of trees (ntree), producing the least OOB error rate. The term ‘optimal’ refers to the number of trees that was just enough to stabilize the OOB error and improve efficiency by avoiding unnecessary runs, as determined from the ntree function and the which.min argument. The optimal number of trees was delineated first by running 500 trees with the default 63.25:36.75 split for each N-management. A hypergrid search was then constructed across several hyperparameter combinations and looped through each combination (details are in Supplementary Dataset Table S2). The model was evaluated over all the combinations we passed in the search space function using the grid search. The hyperparameter searches applied (values in parentheses) were: 1) mtry (4 values from 2 to 48), the number of random root trait variables to include in each tree, where the primary concern was to tune the number of candidate variables (features) sampled randomly at each tree node split; 2) sampsize (sample fractions 0.55, 0.60, 0.65, 0.70, 0.75, 0.80), denoting the fraction of samples used for training; and 3) nodesize (8 values from 1 to 48), which determines the minimum number of samples within the terminal nodes and thus controls the complexity of the trees. This was necessary to set a bias-variance tradeoff, where a smaller node size allows for deeper, more complex trees with the risk of introducing more variance (risk of overfitting) and a larger node size results in shallower trees, which may introduce more bias (risk of not fully capturing unique patterns and relationships in the data)³⁹. The minimum OOB root mean square error (OOB_RMSE) was set at zero (0). For ntree, we used 500 because the OOB_RMSE from hypergrid searches stabilized with fewer than 500 trees (Fig. 1). 
The resulting hyperparameter combination producing a model with the least prediction variance and OOB_RMSE was selected and tested with the training set and an independent, smaller test sample (not used in each N-management training). The independent test sample was obtained from the optimal sampsize split, without bootstrap replacement.
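The grid search described above amounts to enumerating every combination of candidate settings, scoring each one, and keeping the combination with the lowest OOB RMSE. The Python sketch below shows only that selection logic; it is not the ranger workflow used in the study. The grid values are hypothetical, and `toy_oob_rmse` is a made-up stand-in for fitting a forest and reading its OOB error.

```python
import itertools


def build_grid(mtry_values, node_sizes, sample_fractions):
    """Cartesian product of candidate hyperparameter values."""
    return [
        {"mtry": m, "min_node_size": n, "sample_fraction": s}
        for m, n, s in itertools.product(mtry_values, node_sizes, sample_fractions)
    ]


def select_best(grid, score_fn):
    """Return (score, params) for the combination with the lowest score."""
    scored = [(score_fn(params), params) for params in grid]
    return min(scored, key=lambda pair: pair[0])


def toy_oob_rmse(params):
    """Hypothetical stand-in for 'fit a forest, return its OOB RMSE'."""
    return (abs(params["mtry"] - 3)
            + abs(params["min_node_size"] - 4) / 10
            + abs(params["sample_fraction"] - 0.7))


# Hypothetical grid (not the values in Supplementary Table S2).
grid = build_grid([2, 3, 4], [2, 4, 6], [0.6, 0.7, 0.8])
best_score, best_params = select_best(grid, toy_oob_rmse)
print(len(grid), best_score, best_params)
```

In a real run, the scoring function would train a forest with the given settings and a fixed number of trees, so the cost of the search grows with the product of the grid dimensions.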

Constructing accuracy function and evaluating the models

We applied all the above models to the same independent test (validation set) dataset to evaluate the accuracy of the grid-tuned model compared to the default model. The better of the two (lower mean error rate and greater mean model R² of regression trees) was used as our prediction model in a new regression run in randomForest to predict the new test set. The validation set was used as the independent test set since the sampsize split was done before bootstrapping and before sampsize split-variable randomization of the predictor root traits. Furthermore, we set importance = 'impurity' in the above modeling, which allows us to assess the variable importance of the root traits. Variable importance is measured by recording the decrease in mean square error (MSE) each time a variable is used as a node split in a tree³⁹. The error remaining in predictive accuracy after a node split is the ‘node impurity,’ and a variable that reduces this impurity is considered more important than those that do not. Consequently, the root variable with the greatest accumulated reduction in MSE was considered the most impactful³⁹.
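To make the impurity idea concrete, the sketch below computes the reduction in squared error achieved by the best single split on each of two predictors. It is a simplified, single-node Python illustration, not the internals of randomForest or ranger, which accumulate such reductions over every node and tree; the trait names and toy values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (node impurity for regression)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def impurity_decrease(x, y, threshold):
    """Parent impurity minus the summed impurity of the two child nodes."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return sse(y) - (sse(left) + sse(right))


def best_split_decrease(x, y):
    """Largest impurity decrease over all candidate thresholds of x."""
    thresholds = sorted(set(x))[:-1]  # splitting at the maximum leaves one child empty
    return max((impurity_decrease(x, y, t) for t in thresholds), default=0.0)


# Hypothetical toy data: 'tips' tracks shoot weight, 'diam' does not.
tips = [1, 2, 3, 4, 5, 6]
diam = [1, 2, 3, 1, 2, 3]
swt = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]

tips_gain = best_split_decrease(tips, swt)
diam_gain = best_split_decrease(diam, swt)
print(round(tips_gain, 3), round(diam_gain, 3))
```

The predictor whose split removes more squared error receives the larger contribution; summing these contributions across a forest gives the impurity-based ranking described above.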

Results

Model tuning and accuracy

Optimized hyperparameters used in cross-validation with both the training and test samples in the two N-management datasets were variable size, node size, sample size, and the number of trees. All the optimized settings resulting from the hyperparameter grid search are in Table 1, and the stabilizing ntree and OOB-RMSE are shown by arrows in Fig. 1a–f, respectively. For each N-management, hyperparameters were constructed (tested) across a total of 196 models (combinations: 8 predictor parameters [all the eight root traits], 4 node sizes [2, 4, 6, 8], across 6 sample sizes [0.56, 0.60, 0.632, 0.70, 0.74, 0.80], and 1 predetermined optimal number of trees [within 500], Supplementary Table S2). To assess the performance of our tuned hyperparameters, we compared the mean OOB prediction error and the mean OOB variance explained (R²_OOB, Table 2) between the tuned OOB regression model (training) and the test model, and between the training model and the RF default models. The OOB prediction errors of 0.210 g of SWt under LN and 2.712 g of SWt under HN for the trained model (Table 1) were marginally smaller (the smaller, the better) than those for the RF default cross-validation (LN, 0.227 g; HN, 2.794 g) and the test model (LN, 0.239 g; HN, 2.799 g). Here, we define a 'marginal' difference as separation by at least 1%, but not large enough to be statistically significant by the conventional (non-machine learning) mean separation methods. The prediction variance explained (R²_OOB) by our tuned model (57.2% of SWt) was similar to that of the internally cross-validated (default RF) model (56.8% of SWt) under the LN management but marginally larger (61.3% of SWt) than the default (60.2% of SWt) under the HN management. The hyperparametric tuning performed marginally better in the test model, with 61.2% and 64.6% variance predicted for SWt under the LN and HN managements, respectively. 
Overall, the tuned model performed as expected (with no large penalty even with varying sample size) on the independent test data.

Table 1 Summary of model evaluation and test (validation) by random forest machine learning.

Table 2 The genetic correlations and phenotypic correlations of root traits.

Prediction by machine learning is a close approximation of both the genetic and phenotypic correlations

By machine learning (ML), we ranked root traits based on the predicted importance of each in the models in describing its relationship with SWt. The traits with the greatest variable importance (Tips under HN and Xsings under LN) identified by ML also had the largest r_g and r_p to SWt in the corresponding N-managements (Fig. 2). The traits with the least variable importance (RAvdiam under HN and RVol under LN) were correctly identified in three out of four cases by the r_g and r_p ranking methods; the exception was r_g under LN, where RVol followed RSarea as the least correlated to SWt. Overall, 6 out of 8 traits were correctly matched between ML and r_g_HN, with a two-trait r_g position switch, e.g., RVol then Xsings, instead of Xsings then RVol. Only 2 out of 8 traits were correctly matched between ML and r_g_LN, with two-trait position switches among six traits, e.g., RAvdiam then Tips, RWt then Forks, and RVol then RSarea, instead of vice versa (Fig. 2). These two-trait switches, in our opinion, are minor alterations if we consider the fact that the four root traits predicted by ML as the most important and the four predicted as the least important under LN were also the same root traits with the largest and smallest r_g and r_p under LN. The four traits predicted by ML as the most important and the least important under HN were the same for r_p_LN except for RWt instead of RAvdiam for r_p_LN. It seems that as the r_g and r_p decrease, so does the ability of our ML variable importance predictions to correctly identify the ranking of root trait-SWt genotypic and phenotypic correlations, and vice versa.

Pairwise genetic and phenotypic correlations are affected by N management

The r_g between all traits within and between N-managements are summarized in Tables 2, 3 and 4, while the structure of these correlations is shown in Fig. 3. The main reason for estimating r_g is to determine if a greater response in SWt would result from selecting a root trait as a secondary trait. The pairwise r_g and r_p between root traits and SWt were nearly identical under HN except for correlations between RAvdiam and SWt, where r_g = 0.045 and r_p = 0.168. Under LN, the similarity was also high except for correlations between RAvdiam and SWt, where r_g = −0.83 and r_p = −0.48, and between RSarea and SWt, where r_g = 0.05 and r_p = 0.28. With the exception of RAvdiam and Xsings, the r_g and r_p between the other root traits and SWt were generally larger under HN than the corresponding r_g and r_p between root traits and SWt under LN (Fig. 2; Tables 2 and 3). The close similarities between r_g and r_p within an N-management show that in our experimental growth environment, the r_g and r_p between the root traits and SWt were close approximations of each other within an N-management, likely due to low effects of the environment external to the growth facility on genotypes. Across the N-managements, the r_g between the root traits and SWt were greater than the corresponding r_p, most likely due to the between N-management treatment noise confounding the phenotypic variance in r_p. As mentioned earlier, H here was less than the threshold we had set at 0.1; therefore, H values for Xsings between the N-managements and the r_g associated with it were not included in the across-management output (Table 4).

Table 3 The genetic correlations (lower triangle) and phenotypic correlations (upper triangle) of eight root traits and one shoot trait under low N-management.

Table 4 The genetic correlations (lower triangle) and phenotypic correlations (upper triangle) between traits across the two N managements.

Variation among traits and between N managements

Variation among the genotypes and in the genotype × management interaction had a significant effect for all the traits in the two N managements, but RAvdiam had the least variation among the genotypes, with CV ~10.3% in the HN and ~7.1% in the LN, compared to CV ranging from ~47% to ~80% among the rest of the traits in the two N managements (Table 6). Because CV is highly dependent on the grand mean of a trial⁵⁰, we exercised caution in using CV to interpret the comparative variability between traits under the two N managements. The trait CV between the two N managements did not show any specific pattern to suggest a trait variance inflation due to the low nitrogen treatment or the differences in the grand means.

Comparing heritability and correlated response in LN and HN among root traits

The mean H of only two traits, SWt and Xsings, was substantially greater under HN than under LN, while the mean H of RAvdiam was substantially greater under LN than under HN. Trait heritability showed varying degrees of H ‘instability’ between the two N-managements (Table 5). Some traits showed higher H under one N management than the other, with RAvdiam being the most heritable (under LN) and having the largest H difference between the N-managements (53.6% in HN and 95.7% in LN). RWt had the least difference (50.3% in HN and 50.6% in LN). H between the N-managements was very low, yet the genetic correlation was very high, suggesting that H is affected by the environmental noise between the two N-managements.

Table 5 Variance, heritability and means separation for shoot weight and root traits.

RAvdiam had a large, significant negative correlation (−0.83) with SWt under LN, while the correlations were positive but not significant under HN. It was ranked among the highest in the RF regression under LN but ranked bottom under HN. The heritability was very high (0.957) under LN compared to 0.536 under HN. On the other hand, Xsings was ranked highest by the RF regressions under LN, but lower under HN (r_g and r_p were positive and highly significant in both managements, while H was 'medium': in the LN, 0.588; in the HN, 0.658). These observations suggest that selecting for small root diameter may be desirable for improving shoot weight of baby spinach in low N. In fact, of all the root traits in this report, only RAvdiam had a predicted high indirect selection efficiency (113%) for SWt (r_{g_RTswt} × H_RT > H_swt, Table 6). Stronger correlated response efficiency of SWt was predicted for morphological (RLength and RAvdiam) and architectural traits (Tips and Xsings) compared to the standing crop root traits (RSarea, RVol, and RWt) (Table 6).

Table 6 Predicting the correlated response and indirect selection efficiency for shoot weight using root traits as the secondary traits under high N-management and low N-management.

Discussion

The analysis pipeline was designed to define the phenotypic, genotypic, and predictive relationship between root architecture traits (Forks, Tips, and Xsings), root morphological traits (RLength, and RAvdiam), the root system of the standing crop (RSarea, RVol, and RWt) and between the root traits and the SWt of spinach grown in a soilless system. The objective was to determine root traits that have the greatest effect on the harvestable shoot under low N and thus can be used as a secondary trait to select for high NUE germplasm. We determined the phenotypic and genetic correlations (r_p and r_g, respectively) between root traits and between the root traits and the SWt within and between the N-managements. Parallel to the correlation analyses, we used the predictive random forest machine learning technique to rank the root phenotypes according to their strength to predict SWt.

Selecting the root traits with predicted potential as secondary traits for shoot biomass

We predicted the correlated response (CR) in SWt resulting from selecting any of the eight root traits to speculate on its suitability as a secondary trait. An important component in defining CR is H, which integrates information on genetic variation and environmental noise into one statistic and thus is useful in planning breeding programs⁵¹. One condition that must be met for indirect selection to be effective is that H and r_g must be high in both the selection and target environments^43,44, even though H and r_g are environment- and population-specific^43,46. Fortunately, H has been strikingly similar in many environments^34,52, and H variations in indoor growth environments are expected to be low^52,53. In this context, since our H estimates were the average repeatability of 3-replicate trials in each of the N-managements, they may also be used to estimate the correlation expected between line means obtained from trials conducted in different indoor systems. Selecting a root trait in one N-management where H is high may predict the performance in the other, but we think the actual phenotypic quantity may vary substantially. However, if heritability values are high for both traits, then the correlation in breeding values dominates the phenotypic correlation^43,45,46. Since the H of SWt (~ 70%) was lower than the H of RAvdiam (~ 96%), and with a genetic correlation of − 0.827 between them, the correlation in environmental values within N-management, which dominated the phenotypic correlation (− 0.481) between RAvdiam and SWt, may have been mainly due to the LN-management effect on SWt. Thus, an LN environment that minimizes the devaluation of the breeding values between RAvdiam and SWt must be maintained, and we believe indoor environments may provide this condition.
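
The statement that the correlation in breeding values dominates when both heritabilities are high follows from the textbook decomposition of a phenotypic correlation. As a reminder only, and in narrow-sense notation that this paper does not itself use (the symbols h, e, r_A, and r_E are assumptions here; the paper reports broad-sense H and r_g on line means), the decomposition can be written as:

```latex
r_P = h_X h_Y \, r_A + e_X e_Y \, r_E,
\qquad e_X = \sqrt{1 - h_X^2}, \quad e_Y = \sqrt{1 - h_Y^2}
```

Here h_X and h_Y are the square roots of the heritabilities of the two traits, r_A is the genetic correlation, and r_E is the environmental correlation. When both heritabilities approach 1, the e terms shrink toward 0 and r_P approaches r_A; when either heritability is modest, the environmental term can pull r_P away from the genetic correlation.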

In this context, selecting for a root trait as a secondary trait should produce a correlated response in spinach shoot biomass, and the ratio CR_swt/R_swt (Table 5) provides such an indirect selection criterion. These ratios indicate that direct selection of shoot biomass (SWt) is predicted to be superior to selecting for most of the eight root traits as a proxy in both N-managements. The exception is RAvdiam, which was predicted to result in superior indirect selection efficiency (112.8%) for SWt, i.e., a gain of ~ 12.8% in SWt by selecting against large RAvdiam. The other traits had predicted efficiencies below 100%; under LN, the next most efficient was Xsings (74.6%), while the rest were less than 45% efficient. Under HN, RLength (77.2%) and Tips (82.6%) were the most efficient but not enough for an indirect selection advantage.
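
The efficiency ratio can be sketched numerically. The ratio form used below, r_g × H_RT / H_swt, is an assumption inferred from the inequality r_{g_RTswt} × H_RT > H_swt quoted in the Results; the Methods may define it differently (for example, with selection intensities or square roots of heritability). The function name is hypothetical, and the H of SWt is given in the text only as ~ 70%, so the result approximates rather than reproduces the reported 112.8%.

```python
def indirect_selection_efficiency(r_g, h_secondary, h_target):
    """Return CR/R under the assumed form r_g * H_secondary / H_target.

    Equal selection intensity on both traits is assumed.
    """
    if h_target <= 0:
        raise ValueError("target heritability must be positive")
    return r_g * h_secondary / h_target


# Values quoted in the text for RAvdiam and SWt under LN.
r_g = -0.827       # genetic correlation between RAvdiam and SWt
h_ravdiam = 0.957  # H of the secondary trait (RAvdiam)
h_swt = 0.70       # H of the target trait (SWt), approximate

eff = indirect_selection_efficiency(r_g, h_ravdiam, h_swt)

# The sign gives the direction of selection; the magnitude gives the efficiency.
favours_indirect = abs(eff) > 1.0
print(f"CR/R = {eff:.3f} ({abs(eff) * 100:.0f}% of the direct response)")
print("Indirect selection predicted to be superior:", favours_indirect)
```

The negative sign means the gain in SWt comes from selecting against the secondary trait, which matches the recommendation to select against large RAvdiam.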

The case for selecting against large average root diameter in baby spinach

We noticed that under HN, RAvdiam did not have a significant r_g with any of the other root traits or with SWt, and only had a significant positive r_p with RWt and RVol (Table 2). Meanwhile, under LN, RAvdiam had a significant positive r_p only with RSarea and RVol, a non-significant r_p with RWt (0.104) and RLength, but a significant negative r_p with Tips (− 0.238) and Xsings (− 0.310) (Table 3). Spinach requires high N supply⁸, and under such conditions, it seems RAvdiam is not likely to substantially influence the yield differences observed in shoot biomass among the genotypes. However, the significant negative r_p (− 0.481) and the highly negative r_g (− 0.827) with SWt under LN management (Table 3) suggest that larger mean root diameter is associated with smaller mean shoot weight and vice versa. The reduction in root diameter-related phenes in the youngest maize nodes under N stress suggested that root diameter might play a role in adaptive stress responses^54,55. Whether or not the greater mean RAvdiam was due to root girth expansion in a negative feedback response to low N or to competing resource allocation^8,34 was not investigated in this study. Based on the pattern of diameter change in response to nutrient concentration in different species, it has been suggested that altering root diameter may be another way to save C costs in root growth during nutrient stresses⁵⁶. Although it is unclear how root anatomical changes influence spinach root diameter, maize roots showed reduced cell diameter and vessel area but an increased amount of aerenchyma during LN stress⁵⁴. It is plausible that N is preferentially allocated to the roots rather than to the shoots to sustain root growth under LN, and that the reduced N concentration acts as the internal signal regulating the response of axile root growth. Given the robust correlated response efficiency of SWt predicted for RAvdiam and lateral root traits (Tips and Xsings) in our study, the RAvdiam measure can be used to indicate the ratio of axial to lateral roots.

The genotypic correlations between RAvdiam and RSarea (0.533) and between RAvdiam and RVol (0.794) were also highly significant and positive. This implies that selecting against large root diameter may also select against RSarea and RVol under low N-management. Since there was a non-significant r_g but a significant r_p between RSarea and SWt, our data suggest that only limited genetic linkage drag or pleiotropy⁴⁵ on shoot yield might result from selecting against large RAvdiam. Moreover, RSarea and RVol are standing crop traits³⁷ for ‘root bulk’, which are a product of morphological traits (RAvdiam and RLength) and root architectural traits (e.g., Tips, Forks, and Xsings). The small but significant positive r_p between RSarea and SWt may be a combined artifact of these other morphological and architectural components, given that a significant negative r_p also existed between RAvdiam and SWt in the LN management. This position is also supported by the fact that RVol had a negative r_g (− 0.296) and a non-significant r_p with SWt, and yet RVol had a positive r_g (0.439) and r_p (0.931) with RLength; RLength, on the other hand, had a significant r_g (0.539) and r_p (0.529) with SWt. RAvdiam was also highly heritable (H ~ 96% under LN, Table 6). We propose that the root average diameter is the only trait in this study that can successfully be selected against to improve the yield of shoot biomass under low N. Further studies to validate these findings might benefit from testing in multiple growth conditions (temperature, humidity, and graduated N concentrations) to define a broader range of root trait-shoot yield relationships and N-responsiveness.

Resolving the conundrum around the antagonistic relationship between RAvdiam and SWt

Compared to SWt, RAvdiam showed greater H (Table 6), and the r_g between SWt and RAvdiam can be high (Table 3). Indirect selection for a secondary trait will be superior if the heritability of that trait is high and the correlation between the traits is close to 1^45,46. In this study, RAvdiam met these two critical criteria. However, for practical use in a breeding program, the secondary trait must also be inexpensive and easy to measure in large trials^43,45. In that case, a shoot biomass estimate could be used to select for root bulk traits in production systems that target spinach roots as the end product. Because precision in imaging techniques for roots and shoots is rapidly evolving^57,58, we believe that soilless systems can be designed to facilitate robust root metrics characterization to match the ease with which above-ground biomass can be phenotyped. It would also be worthwhile to determine if further selection for or against other root traits would eventually result in a superior secondary selection for shoot biomass. How these relationships would play out as the plants mature under different indoor growth conditions or in field conditions requires further study.

Although a soilless system reduces the complexities associated with soils, we believe that the selection of a root trait needs to be understood in the context of the possible complex interplay among root traits^45,51. The possible complex interactions between the root traits that may have influenced SWt were not explicitly considered in our interpretations. However, we have alluded to such complexity by describing the trait-to-trait correlations, which we hope will serve as an impetus for further inquiry. Although the expected genetic correlations between estimates of cultivar means are best obtained from independent sets of trials^43,59, we hypothesize that under similar N treatments, manipulation of other growth conditions in independent indoor growth environments may lead to some deviations from the response to selection predicted here. With the advent of image processing techniques and deep learning⁶⁰ frameworks that use advanced optimization and features from data, such prediction accuracy is likely to improve continually^61,62,63. Nonetheless, root traits recognized by machine learning would continue to rely on vigorous calibrations and field-based validation in systems of interest.

In conclusion, we report on the genetic and phenotypic correlations of eight root traits with fresh shoot biomass of spinach grown in a soilless system in a controlled indoor environment. The plants were harvested at 41 d after sowing, a stage corresponding to marketable baby spinach. We used genotype-by-management and other conventional breeder statistics together with a machine learning predictive technique to define candidate root traits with the potential for indirectly selecting for spinach shoot yield. The experiments were set up under two separate and contrasting N-managements. Of the eight root traits, the root average diameter emerged as the only candidate with a predicted indirect selection efficiency good enough to improve shoot biomass. However, it had a robust negative genetic correlation with shoot yield, leading us to believe that selecting against large root diameter may improve the fresh shoot yield of baby spinach. We have exercised caution in this interpretation by recommending further studies into the possible complex interactions among the root traits considered in improving shoot biomass yield in baby spinach.

Data availability

All data generated or analyzed during this study are included in this published article (and Supplementary Information files).

References

Gruber, B. D., Giehl, R. F., Friedel, S. & von Wirén, N. Plasticity of the Arabidopsis root system under nutrient deficiencies. Plant Physiol. 163, 161–179 (2013).
Sun, C.-H., Yu, J.-Q. & Hu, D.-G. Nitrate: a crucial signal during lateral roots development. Front. Plant. Sci. 8, 485 (2017).
Socolow, R. H. Nitrogen management and the future of food: lessons from the management of energy and carbon. Proc. Natl. Acad. Sci. 96, 6001–6008 (1999).
Marvi, M. S. P. Effect of nitrogen and phosphorous rates on fertilizer use efficiency in lettuce and spinach. J. Hortic. For. 1, 140–147 (2009).
Schenk, M., Heins, B. & Steingrobe, B. The significance of root development of spinach and kohlrabi for N fertilization. Plant Soil 135, 197–203 (1991).
Stagnari, F., Di Bitetto, V. & Pisante, M. Effects of N fertilizers and rates on yield, safety and nutrients in processing spinach genotypes. Sci. Hortic. 114, 225–233 (2007).
Biemond, H., Vos, J. & Struik, P. Effects of nitrogen on accumulation and partitioning of dry matter and nitrogen of vegetables. 3. Spinach. NJAS Wageningen J. Life Sci. 44, 227–239 (1996).
Smolders, E. & Merckx, R. Growth and shoot:root partitioning of spinach plants as affected by nitrogen supply. Plant Cell Environ. 15, 795–807. https://doi.org/10.1111/j.1365-3040.1992.tb02147.x (1992).
Walch-Liu, P. & Forde, B. G. Nitrate signalling mediated by the NRT1. 1 nitrate transporter antagonises l-glutamate-induced changes in root architecture. Plant J. 54, 820–828 (2008).
Lima, J. E., Kojima, S., Takahashi, H. & von Wirén, N. Ammonium triggers lateral root branching in Arabidopsis in an AMMONIUM TRANSPORTER1; 3-dependent manner. Plant Cell 22, 3621–3633 (2010).
Forde, B. G. Nitrogen signalling pathways shaping root system architecture: An update. Curr. Opin. Plant Biol. 21, 30–36 (2014).
Giehl, R. F., Gruber, B. D. & von Wirén, N. It’s time to make changes: Modulation of root system architecture by nutrient signals. J. Exp. Bot. 65, 769–778 (2014).
Razaq, M., Zhang, P., Shen, H.-L. & Salahuddin, A. Influence of nitrogen and phosphorous on the growth and root morphology of Acer mono. PLOS ONE 12, e0171321. https://doi.org/10.1371/journal.pone.0171321 (2017).
Lee, S. & Lee, J. Beneficial bacteria and fungi in hydroponic systems: Types and characteristics of hydroponic food production methods. Sci. Hortic. 195, 206–215. https://doi.org/10.1016/j.scienta.2015.09.011 (2015).
Parniske, M. Arbuscular mycorrhiza: the mother of plant root endosymbioses. Nat. Rev. Microbiol. 6, 763–775. https://doi.org/10.1038/nrmicro1987 (2008).
Eldridge, B. M. et al. Getting to the roots of aeroponic indoor farming. New Phytol. 228, 1183–1192. https://doi.org/10.1111/nph.16780 (2020).
Liese, R., Alings, K. & Meier, I. C. Root branching is a leading root trait of the plant economics spectrum in temperate trees. Front. Plant. Sci. 8, 315–315. https://doi.org/10.3389/fpls.2017.00315 (2017).
Gopinath, P., Vethamoni, I. & Gomathi, M. Aeroponics soilless cultivation system for vegetable crops. Chem. Sci. Rev. Lett. 6, 838–849 (2017).
Koohakan, P. et al. Evaluation of the indigenous microorganisms in soilless culture: Occurrence and quantitative characteristics in the different growing systems. Sci. Hortic. 101, 179–188. https://doi.org/10.1016/j.scienta.2003.09.012 (2004).
Zhao, J., Bodner, G. & Rewald, B. Phenotyping: Using machine learning for improved pairwise genotype classification based on root traits. Front. Plant Sci. https://doi.org/10.3389/fpls.2016.01864 (2016).
Bodner, G. et al. A statistical approach to root system classification. Front. Plant. Sci. https://doi.org/10.3389/fpls.2013.00292 (2013).
Moon, T., Ahn, T. I. & Son, J. E. Forecasting root-zone electrical conductivity of nutrient solutions in closed-loop soilless cultures via a recurrent neural network using environmental and cultivation information. Front. Plant. Sci. 9, 66. https://doi.org/10.3389/fpls.2018.00859 (2018).
Lammerts van Bueren, E. T. & Struik, P. C. Diverse concepts of breeding for nitrogen use efficiency. A review. Agron. Sustain. Dev. 37, 50. https://doi.org/10.1007/s13593-017-0457-3 (2017).
Chan-Navarrete, R., Dolstra, O., van Kaauwen, M., van Bueren, E. T. L. & van der Linden, C. G. Genetic map construction and QTL analysis of nitrogen use efficiency in spinach (Spinacia oleracea L.). Euphytica 208, 621–636 (2016).
Chan-Navarrete, R., Kawai, A., Dolstra, O., van Bueren, E. T. L. & van der Linden, C. G. Genetic diversity for nitrogen use efficiency in spinach (Spinacia oleracea L.) cultivars using the Ingestad model on hydroponics. Euphytica 199, 155–166 (2014).
Ju, C. et al. Root and shoot traits for rice varieties with higher grain yield and higher nitrogen use efficiency at lower nitrogen rates application. Field Crop Res. 175, 47–55 (2015).
Mu, X. et al. Genetic improvement of root growth increases maize yield via enhanced post-silking nitrogen uptake. Eur. J. Agron. 63, 55–61 (2015).
SharathKumar, M., Heuvelink, E. & Marcelis, L. F. M. Vertical farming: Moving from genetic to environmental modification. Trends Plant. Sci. 25, 724–727. https://doi.org/10.1016/j.tplants.2020.05.012 (2020).
Despommier, D. The vertical farm: Controlled environment agriculture carried out in tall buildings would create greater food safety and security for large urban populations. J. Verbr. Lebensm. 6, 233–236. https://doi.org/10.1007/s00003-010-0654-3 (2011).
Meinen, E., Dueck, T., Kempkes, F. & Stanghellini, C. Growing fresh food on future space missions: Environmental conditions and crop management. Sci. Hortic. 235, 270–278. https://doi.org/10.1016/j.scienta.2018.03.002 (2018).
Eppendorfer, W. H. & Bille, S. W. Free and Total Amino Acid Composition of Edible Parts of Beans, Kale, Spinach, Cauliflower and Potatoes as Influenced by Nitrogen Fertilisation and Phosphorus and Potassium Deficiency. J. Sci. Food Agric. 71, 449–458. https://doi.org/10.1002/(SICI)1097-0010(199608)71:4%3c449::AID-JSFA601%3e3.0.CO;2-N (1996).
Maneejantra, N. et al. A quantitative analysis of nutrient requirements for hydroponics Spinach (Spinacia oleracea L.) production under artificial light in a plant factory. J. Fertil. Pest. 7, 170–174 (2016).
Lynch, J. Root architecture and plant productivity. Plant. Physiol. 109, 7–13. https://doi.org/10.1104/pp.109.1.7 (1995).
Lynch, J. P. in Nutrient Acquisition by Plants Vol. 181 Ecological Studies (ed BassiriRad H.) Ch. 7, 147–183 (Springer, 2005).
Wright, M. N. & Ziegler, A. ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Softw. 77, 1–17. https://doi.org/10.18637/jss.v077.i01 (2017).
Alvarado, G. et al. (eds Maize International & Center Wheat Improvement) (CIMMYT Research Data & Software Repository Network, 2015).
Iversen, C. M., McCormack, M. L., Blackwood, C. B., Freschet, G. T., Kattge, J., Roumet, C., Stover, D. B., Soudzilovskaia, N.A., Valverde-Barrantes, O. J., van Bodegom, P. M., Violle, C. Version 2 (Department of Energy, Oak Ridge National Laboratory TES SFA, U.S., Oak Ridge, Tennessee, USA, 2018).
Breiman, L. in Manual On Setting Up, Using, And Understanding Random Forests V3.1 (University of California at Berkeley, Berkeley, CA) (2002).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Falconer, D. S., Mackay, T. F. & Frankham, R. Introduction to quantitative genetics (4th edn). Trends in Genetics, Vol. 12, p. 280 (1996).
Cooper, M. & DeLacy, I. Relationships among analytical methods used to study genotypic variation and genotype-by-environment interaction in plant breeding multi-environment experiments. Theor. Appl. Genet. 88, 561–572 (1994).
Ward, J. H. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58, 236–244. https://doi.org/10.1080/01621459.1963.10500845 (1963).
Falconer, D. S. Introduction to Quantitative Genetics. 365 (Ronald Press, 1961).
Searle, S. R. The value of indirect selection: I. Mass selection. Biometrics 21, 682–707. https://doi.org/10.2307/2528550 (1965).
Gallais, A. in Efficiency in Plant Breeding. (ed W. Lange, Zeven, A.C., Hogenboom, N.G. ) 45–60 (Pudoc, 1984).
Hansel, H. in Efficiency in Plant Breeding. (ed A.C. Zeven and N.G. Hogenboom W. Lange) 61–64 (Pudoc, 1984).
Liaw, A. & Wiener, M. Classification and regression by randomForest. R News 2, 18–22 (2002).
Stekhoven, D. J. & Bühlmann, P. MissForest—non-parametric missing value imputation for mixed-type data. Bioinformatics 28, 112–118 (2012).
Ljumović, M. & Klar, M. in 2015 4th Mediterranean Conference on Embedded Computing (MECO). 212–215 (IEEE).
Brown, C. E. in Applied multivariate statistics in geohydrology and related sciences 155–157 (Springer, 1998).
Wray, N. V. P. Estimating trait heritability. Nat. Educ. 1, 29 (2008).
Gitonga, V. W. et al. Genetic variation, heritability and genotype by environment interaction of morphological traits in a tetraploid rose population. BMC Genet 15, 146–146. https://doi.org/10.1186/s12863-014-0146-z (2014).
Folta, K. M. Breeding new varieties for controlled environments. Plant. Biol. 21(Suppl 1), 6–12. https://doi.org/10.1111/plb.12914 (2019).
Gao, K., Chen, F., Yuan, L., Zhang, F. & Mi, G. A comprehensive analysis of root morphological changes and nitrogen allocation in maize in response to low nitrogen stress. Plant. Cell Environ. 38, 740–750. https://doi.org/10.1111/pce.12439 (2015).
Yang, J. T., Schneider, H. M., Brown, K. M. & Lynch, J. P. Genotypic variation and nitrogen stress effects on root anatomy in maize are node specific. J. Exp. Bot. 70, 5311–5325. https://doi.org/10.1093/jxb/erz293 (2019).
Zobel, R. W., Kinraide, T. B. & Baligar, V. C. Fine root diameters can change in response to changes in nutrient concentrations. Plant. Soil 297, 243–254. https://doi.org/10.1007/s11104-007-9341-2 (2007).
Bodner, G., Nakhforoosh, A., Arnold, T. & Leitner, D. Hyperspectral imaging: A novel approach for plant root phenotyping. Plant. Methods 14, 84. https://doi.org/10.1186/s13007-018-0352-1 (2018).
Atkinson, J. A., Pound, M. P., Bennett, M. J. & Wells, D. M. Uncovering the hidden half of plants using new advances in root phenotyping. Curr. Opin. Biotechnol. 55, 1–8. https://doi.org/10.1016/j.copbio.2018.06.002 (2019).
Holland, J. W., Nyquist, W.E., Cervantes-Martinez, T.C. in Plant Breeding Reviews Vol. 22 (ed J. Janick) Ch. 2, 29–39 (Wiley, 2003).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444. https://doi.org/10.1038/nature14539 (2015).
Khaki, S., Wang, L. & Archontoulis, S. A CNN-RNN Framework for Crop Yield Prediction. (2019).
van Dijk, A. D. J., Kootstra, G., Kruijer, W. & de Ridder, D. Machine learning in plant science and plant breeding. iScience 24, 101890. https://doi.org/10.1016/j.isci.2020.101890 (2021).
Shahhosseini, M., Hu, G., Huber, I. & Archontoulis, S. V. Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt. Sci. Rep. 11, 1606. https://doi.org/10.1038/s41598-020-80820-1 (2021).

Acknowledgements

This study was supported in part by funds from USDA-SCMP Grant #. TX-SCM-17-04 to V.J. and C.A.A, Texas A&M AgriLife Vegetable seed grant FY16-FY17 to C.A.A. and V.J.; and USDA-National Institute of Food and Agriculture Specialty Crops Research Initiative 2017-51181-26830 to C.A.A.

Author information

Authors and Affiliations

Texas A&M AgriLife Research and Extension Center, Weslaco, TX, 78596, USA
Henry O. Awika & Carlos A. Avila
Texas A&M AgriLife Research and Extension Center, Uvalde, TX, 78801, USA
Amit K. Mishra, James DiPiazza & Vijay Joshi
Department of Horticultural Sciences, Texas A&M University, College Station, TX, 77843, USA
Haramrit Gill, Carlos A. Avila & Vijay Joshi

Authors

Henry O. Awika
Amit K. Mishra
Haramrit Gill
James DiPiazza
Carlos A. Avila
Vijay Joshi

Contributions

V.J. designed and supervised experiments; H.O.A., methodology; A.K.M, H.G, and J.D., data collection and extraction; H.O.A, formal analysis; H.O.A, writing—original draft preparation; H.O.A, C.A.A, and VJ, writing—review and editing; V.J. and C.A.A, funding acquisition. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Vijay Joshi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Awika, H.O., Mishra, A.K., Gill, H. et al. Selection of nitrogen responsive root architectural traits in spinach using machine learning and genetic correlations. Sci Rep 11, 9536 (2021). https://doi.org/10.1038/s41598-021-87870-z

Received: 22 December 2020
Accepted: 06 April 2021
Published: 05 May 2021
DOI: https://doi.org/10.1038/s41598-021-87870-z
