Selection of nitrogen responsive root architectural traits in spinach using machine learning and genetic correlations

The efficient acquisition and transport of nutrients by plants largely depend on the root architecture. Due to the absence of complex microbial network interactions and soil heterogeneity in a restricted soilless medium, the architecture of roots is a function of genetics defined by the soilless matrix and exogenously supplied nutrients such as nitrogen (N). The knowledge of root trait combinations that offer the optimal nitrogen use efficiency (NUE) is far from being conclusive. The objective of this study was to define the root trait(s) that best predicts and correlates with vegetative biomass under differed N treatments. We used eight image-derived root architectural traits of 202 diverse spinach lines grown in two N concentrations (high N, HN, and low N, LN) in randomized complete blocks design. Supervised random forest (RF) machine learning augmented by ranger hyperparameter grid search was used to predict the variable importance of the root traits. We also determined the broad-sense heritability (H) and genetic (rg) and phenotypic (rp) correlations between root traits and the vegetative biomass (shoot weight, SWt). Each root trait was assigned a predicted importance rank based on the trait’s contribution to the cumulative reduction in the mean square error (MSE) in the RF tree regression models for SWt. The root traits were further prioritized for potential selection based on the rg and SWt correlated response (CR). The predicted importance of the eight root traits showed that the number of root tips (Tips) and root length (RLength) under HN and crossings (Xsings) and root average diameter (RAvdiam) under LN were the most relevant. SWt had a highly antagonistic rg (− 0.83) to RAvdiam, but a high predicted indirect selection efficiency (− 112.8%) with RAvdiam under LN; RAvdiam showed no significant rg or rp to SWt under HN. In limited N availability, we suggest that selecting against larger RAvdiam as a secondary trait might improve biomass and, hence, NUE with no apparent yield penalty under HN.


Methods
Plants, plant material, experimental setup, and evaluation environment. A collection of 202 spinach (S. oleracea) accessions maintained and provided by the USDA-National Plant Germplasm System (NPGS) (https:// npgsw eb. ars-grin. gov/) at Ames, Iowa, U.S.A was used in the present study. The plants were grown in a growth chamber under controlled conditions of 12/12 h (light/dark), 22 °C, and 75% relative humidity. The seeds of each accession were sown in triplicate in turface (Turface Athletics MVP, PROFILE Products LLC, Buffalo Grove, Illinois, USA) in small pots (10.2 cm × 10.2 cm and 8.9 cm deep). Each set of replicates was completely randomized across separate shelves. After the seedling emergence, plants were fertilized with Peters professional ready mix (5-11-26, hydroponics special water-soluble fertilizer, Everris NA Inc., Ohio, USA) every after four days. Two concentrations of nitrogen (N), low N (50 ppm), and high N (200 ppm) were used for low and high N management, respectively. An additional N for the high N-management was provided using calcium nitrate, and equivalent calcium (3.85 mM) was replaced by calcium chloride in the low N-management. The concentrations of the macro/micro-nutrients present in the fertilizer solution are provided in Supplementary Dataset Table S1.
The experimental research in the lab facilities for this study was performed as required by Texas A&M System Regulation (15.99.06 Use of Biohazards in Research, Teaching, and Testing) and the University's Rule for the use of Biohazards and Dual Use Research of Concern (15.99.06.M1 Use of Biohazards, toxins and rDNA and DURC), approved by the Texas A&M Institutional Biosafety Committee (IBC).
Plant material processing, root imaging, and data processing. The plants were harvested at the physiological maturity of baby spinach (5-6 leaves) after 41 days of sowing. Each plant was carefully pulled from the turface and washed with running water to clear any debris off the roots. The roots were separated from the shoot at the cotyledonary nodes and floated in water. The lateral roots were separated gently using a fine tip paintbrush to minimize the overlapping of roots. The images were taken by digitally scanning roots of individual plants (Supplementary Dataset Figure S1) using a high-resolution scanner (Calibrated Color Optical scanner STD4800 with Special Lighting System) and scanned images analyzed using WinRHIZO Pro software (Regent Instruments Inc. Canada). Categorization of the traits was adapted from the Fine-Root Ecology Database 37 classification and included: (1) morphology for root length (RLength) and average root diameter (RAvdiam); (2) an indication of the complexity of the root system architecture measured using number of tips, forks (number of root bifurcations), and crossings (Xsings; overlapping parts); (3) root system of the standing crop (RSSC) for root volume (RVol), root surface area (RSarea) and root weight (RWt). The harvestable above-ground biomass was determined as fresh shoot weight (SWt) (WR P-series balance, Model 500P, VWR International, U.S.A.) Scientific Reports | (2021) 11:9536 | https://doi.org/10.1038/s41598-021-87870-z www.nature.com/scientificreports/ after removing the excess surface moisture by gently paper-bloating the wet roots followed by a two-minute air drying.
Data analysis. The analysis pipeline was designed to define the phenotypic, genotypic, and predictive relationship between the root traits and between the root traits and the SWt of spinach plants grown in a soilless system. We determined the r g and r p (defined below in the section -Determining the genotypic and phenotypic correlation between traits) between root traits and r g and r p between the root traits and the SWt within and between the two N managements. Parallel to the correlation analyses, we used a supervised random forest machine learning 38,39 technique to estimate the variable importance of each root phenotype in predicting the above-ground shoot biomass. The details of the parallel procedures are described below.
Individual trait and combined management variance analysis and mean separation. Linear mixed models were implemented in lmer from package lme4 of R using REML via Multi Environment Trial Analysis with R, META-R 36 , to calculate the adjusted means (best linear unbiased estimates, BLUEs, and predictors, BLUPs) for each root and shoot variable, under the two N managements. For individual analyses, we used the model is where Y ijk is the trait of interest, µ is the mean effect, Rep i is the effect of the ith replicate represented by the complete blocks, Gen k is the effect of the kth genotype, εijk is the error associated with the ith replication, jth incomplete block and the kth genotype, which is assumed to be normally and independently distributed, with mean zero and homoscedastic variance σ 2 . For the combined analyses across the two N-managements, the model was adjusted to where the new terms Mgt i and Mgt i × Gen l are the effects of the ith N-management and the N-management by genotype interaction, respectively. Genotype and N-management were both treated as random effects, and BLUPs were used to estimate random effects and BLUEs to estimate the fixed effect. Grand means were separated based on Fisher's Least Significance Difference (LSD) at α = 5%. We also determined the coefficients of variation (CV) for all traits.
Heritability. We estimated the average broad-sense heritability (repeatability, H) of three replicates in each N-management, which is also an estimate of correlation expected between line means from the three replicate trials conducted in the two N-managements. H was determined 40 on a line mean basis as and combined for the two N-managements as where σ 2 g and σ 2 e are the genotype and the residual error variance components, respectively; nReps is the number of replicates, σ 2 ge is the variance component of genotype by N-management interaction, and nMgt is the number of N-managements in the analysis.
Determining the genotypic and phenotypic correlation between traits. Genetic and phenotypic correlations were calculated for each trait pair, within and across the N-managements. The r g were also determined in META-R, which applies the equations from Cooper et al. 41 . Between the N-managements, r g was estimated as and between traits within a single N-management, where r p (jj 0 ) is the phenotypic correlation between N-managements j and j 0 ; and H j and Hj 0 are the heritability of N-managements j and j 0 respectively, σ g (jj 0 ) is the arithmetic mean of all pairwise genotypic covariances between trait j and j 0 , and σ g (j)σ g (j 0 ) is the arithmetic average of all pairwise geometric means among the genotypic variance components of the traits.
For graphical illustrations, cluster analysis based on the environment distance matrix (1-Genetic Correlation matrix) was also performed using the 'Ward' method 42 , creating a dendrogram. In each case, a minimum heritability threshold was set at 0.1; any trait whose heritability within or between the two N-managements was lower than 0.1 was excluded from the analysis and was not plotted. For phenotypic correlations, simple Pearson correlations between different pairs of N-managements or traits were used. (1) r g = σ g jj 0 /σ g j σ g j 0 , where CR swt is the correlated response of SWt, r g is the genetic correlation, H RT is the repeatability of root traits, V g(swt) is the genetic variance of SWt, and i is the selection intensity whose estimate we assumed would be similar between traits. Thus, CR swt was compared to direct response (R), by CR swt /R swt = r g √H RT /√ H swt. That is, if r g × H RT > H swt , then indirect selection would be superior [43][44][45][46] .

Summary of data preparation and evaluation by machine learning
To rank the root traits by importance in the prediction models for SWt, we used the random forest (RF) modeling in R. RF is a powerful ensemble machine learning tool that combines the outputs of numerous decision tree classification models. We applied the regression type to the randomForest 47 package and first ran the regression on default tuning parameters. We also invoked a user-defined hyperparameter tuning in the ranger 35 package to optimize our models; ranger is a C++ implementation of Breiman's FORTRAN-based random forest algorithm 39 . Finally, we compared the accuracy of the model from the randomForest default tuning and that from ranger hyperparametric search tuning. The function missForest 48 was used to impute missing data. Outliers were normalized by an internally derived proximity matrix procedure built into the RF. In the normalization, if an outlier case i and case j both end up in the same tree node, increase proximity prox(ij) between i and j by 1 and accumulate over all trees in RF, the outliers are normalized by twice the number of trees in RF. This creates a proximity square matrix where observations that are 'similar or alike' in value have proximities close to 1 and the dissimilar proximity closer to 0.

Default tuning and model evaluation.
The default data split (into 63.25% as training dataset and the remainder as the validation set) were applied to train each N-management. The 63.25% is the proportion expected of unique observations in a bootstrap sample 39,49 . The typical range is ~ 60 to 85%, where smaller sample sizes can reduce the training time but may introduce more bias than necessary, while too large a sample size can increase performance but at the risk of overfitting because it introduces more variance 39 . An F-fold crossvalidation feature in RF invokes the evaluation of model performance by training it on a number of different smaller datasets and evaluating them over the other smaller testing sets. By default, randomForest randomly splits the number of datasets of almost the same sized k-folds, and each of the folder models is evaluated over the number of folders and tested on the remaining test set 39,47 . This process is repeated until all the subsets have been evaluated. The regression tree parameters are tuned further by choosing the number of independent variables (m) using the default as m = p/3, where p is the total number of root traits in our analysis. This helps generalize the data best to return the least out-of-bag (OOB) error rate and provides a built-in validation set. Further, it identifies the number of trees (ntree), required to stabilize the error rate during tuning more efficiently 39,49 . OOB error is an internal error estimate of a random forest as it is being constructed 39 . It is estimated by testing each tree built from the bootstrap aggregation (bagging samples) from the training set on the remaining (validation set as defined by the default data split) of the samples not used in building that tree; randomForest chooses a random subset of features and builds many regression trees, and the model averages out all the predictions of the decision trees.
Setting hyperparametric tuning and evaluation parameters. We first determined the optimal number of trees (ntree), producing the least OOB error rate. The term 'Optimal' refers to the number of trees that were just enough to stabilize the OOB error and improve efficiency by avoiding unnecessary runs, as determined from the ntree function and which.min argument. The optimal number of trees was delineated first by running 500 trees with the default 63.25:36.75 split for each N-management. A hypergrid search was then constructed across several hyperparameter combinations and looped through each combination (details are in Supplementary Dataset Table S2). The model was evaluated over all the combinations we passed in the search space function using the grid search. The hyperparameter searches applied (values in parenthesis) were: mtry (4 variables from 2 to 48), for the number of random root trait variables to include in each tree. The primary concern was to tune the number of candidate variables (features) to sample at each tree node split randomly; 2) sampsize (sample fractions 0.55, 0.60, 0.65, 0.70, 0.75, 0.80) denoting the number of samples to train, 3) model nodesize (8 variables from 1 to 48), which determines the minimum number of samples within the terminal nodes and thus controls the complexity of the trees. This was necessary to set a bias-variance tradeoff where smaller node size allows for deeper, more complex trees with the risk of introducing more variance (risk of overfitting) and larger node results in shallower trees which may introduce more bias (risk of not fully capturing unique patterns and relationships in the data) 39 . The minimum OOB root mean square error (OOB_RMSE) was set at zero (0). For ntree, we used 500 because the OOB_RMSE from hypergrid searches stabilized with less than 500 trees (Fig. 1). The resulting hyperparameter combination producing a model with the least prediction variance and OOB_RMSE was selected and tested with the training set and an independent, smaller test sample data (not used in each N-management training). The independent test sample was obtained from the optimal sampsize split, without bootstrap replacement. www.nature.com/scientificreports/ Constructing accuracy function and evaluating the models. We applied all the above models on the same independent test (validation set) dataset to evaluate the accuracy of the grid-tuned model compared to the default model. The best of the two (lower mean error rate and greater mean model R 2 of regression trees) was used as our prediction model in a new regression run in randomForest to predict the new test set. The validation set was used as the independent test set since the sampsize split was done before bootstrapping and before sampsize split-variable randomization of the predictor root traits. Furthermore, we set importance as equals impurity' in the above modeling, which allows us to assess the variable importance of the root traits. Variable importance is measured by recording the decrease in mean square error (MSE) each time a variable is used as a node split in a tree 39 . The remaining error left in predictive accuracy after a node split is the 'node impurity, ' and a variable that reduces this impurity is considered more important than those that do not. Consequently, the root variable with the greatest accumulated reduction in MSE was considered the more impactful 39 .

Results
Model tuning and accuracy. Optimized hyperparameters used in cross-validation with both the training and test samples in the two N-management datasets were variable size, node size, sample size, and the number of trees. All the optimized settings resulting from hyperparameter grid search are in Table 1, and the stabilizing ntree and OOB-RMSE are shown by arrows in Fig. 1a-   Prediction by machine learning is a close approximation of both the genetic and phenotypic correlations. By machine learning (ML), we ranked root traits based on the predicted importance of each in the models in describing its relationship with SWt. The traits with the greatest variable importance (Tips under HN and Xsings under LN) identified by ML also had the largest r g and r p to SWt in the corresponding N-managements (Fig. 2). The traits with the least variable importance (RAvdiam under HN and RVol under LN) were correctly identified in three out of four cases by the r g and r p ranking methods; the exception was r g in the LN, where RVol followed RSarea as the least correlated to SWt. Overall, 6 out of 8 traits were correctly matched between ML and r g _HN, with a two-trait r g position switch, e.g., RVol then Xsings, instead of Xsings then RVol. Only 2 out of 8 traits were correctly matched between ML and r g _LN with a two-trait position switch between six traits, e.g., RAvdiam then Tips, RWt then Forks and RVol then RSrea, instead of vice versa (Fig. 2). These twotrait switches, in our opinion, are minor alterations if we consider the fact that the four root traits predicted by ML as the most important and the four predicted as the least important under LN were also the same root traits Table 1. Summary of model evaluation and test (validation) by random forest machine learning. a Trees maintained at 500 since all optimal trees (in Fig. 2) fell within this limit. b Number of predictor root traits sampled at each tree node; HN, high N; LN, low N. ϕ The validation sample was kept independent of the training sample, and the same hyperparameters were applied as for the training model. Pairwise genetic and phenotypic correlations are affected by N management. The r g between all traits within and between N-managements are summarized in Tables 2, 3 and 4, while the structure of these correlations is shown in Fig. 3. The main reason for estimating r g is to determine if a greater response on SWt  www.nature.com/scientificreports/ would result by selecting a root trait as a secondary trait. The pairwise r g and r p between root traits and SWt were nearly identical under HN except for correlations between RAvdiam andSWt where r g = 0.045 and r p = 0.168. Under LN, the similarity was also high except for correlations between RAvdiam and SWt where r g = −0.83 and r p = −0.48, and between RSarea and SWt where r g = 0.05 and r p = 0.28. With the exception of RAvdiam and Xsings, the r g and r p between the other root traits and SWt were generally larger under HN compared to the corresponding r g and r p between root traits and SWt under LN ( Fig. 2; Tables 2 and 3). The close similarities between r g and r p between within an N-management show that in our experimental growth environment, the r g and r p between the root traits and SWt were close approximations of each other within an N-management, likely due to low effects of the environment external to the growth facility on genotypes. Across the N-managements, the r g between the root traits and SWt were greater than the corresponding r p , most likely due to the between N-management treatment noise confounding the phenotypic variance on r p . As mentioned earlier, here, H was less than a threshold we had set at 0.1; therefore, H values for Xsings between the N-managements and r g associated with it were not included in the across-management output (Table 4).  Figure 3. The structure of genetic correlations between traits within and between the two N-managements. Ward method was used to construct the dendrograms. In the correlations between N-Managements, Xsings is missing because its heritability was < 0.1 threshold we had set when calculating the genetic correlation coefficients.

Variation among traits and between N managements. Variation among the genotypes and in the
genotype × management had a significant effect for all the traits in the two N managements, but RAvdiam had the least variation among the genotypes with CV ~ 10.3% in the HN and ~ 7.1% in the LN, compared to CV ranging between ~ 47% to ~ 80% among the rest of the traits in the two N managements (Table 6). Because CV is highly dependent on the grand mean of a trial 50 , we exercised caution in using a CV to interpret the comparative variability between traits under the two N managements. The trait CV between the two N management did not show any specific pattern to suggest a trait variance inflation due to the low nitrogen treatment or the differences in the grand means. LN and HN LN). H between the N-managements was very low, yet the genetic correlation was very high, suggesting that H is affected by the environmental noise between the two N-managements. RAvdiam had significant negative and large correlations (− 0.83) to SWt under LN, while the correlations were positive but not significant under HN. It was ranked among the highest in the RF regression under LN but ranked bottom in the HN. The heritability was very high (0.957) under LN compared to 0.536 under HN. On the other hand, Xsing, was ranked highest ranked by the RF regressions under LN, but lower under HN (r g and r p were positive and highly significant in both managements, while H was medium' in the LN, 0.588; in the HN, 0.658). These observations suggest that selecting for small root diameter may be desirable for improving shoot weight of baby spinach in low N. In fact, of all the root traits in this report, only RAvdiam had a predicted high indirect selection efficiency (113%) for SWt (r g_RTswt × H RT > H swt , Table 6). Stronger correlated response efficiency of SWt was predicted for morphological (RLength and RAvdiam) and architectural traits (Tips and Xsings) compared to the standing crop root traits (RSarea, RVol, and RWt) ( Table 6).

Discussion
The analysis pipeline was designed to define the phenotypic, genotypic, and predictive relationship between root architecture traits (Forks, Tips, and Xsings), root morphological traits (RLength, and RAvdiam), the root system of the standing crop (RSarea, RVol, and RWt) and between the root traits and the SWt of spinach grown in a soilless system. The objective was to determine root traits that have the greatest effect on the harvestable shoot under low N and thus can be used as a secondary trait to select for high NUE germplasm. We determined the phenotypic and genetic correlations (r p and r g , respectively) between root traits and between the root traits and the SWt within and between the N-managements. Parallel to the correlation analyses, we used the predictive random forest machine learning technique to rank the root phenotypes according to their strength to predict SWt. Table 5. Variance, heritability and means separation for shoot weight and root traits. LN, low N; HN, high N; LSD, least significant difference; CV, coefficient of variations; *, **, ***significant at p = 0.05, 0.05 > p ≥ 0.01, p < 0.01, respectively. www.nature.com/scientificreports/ Selecting the root traits with predicted potential as secondary traits for shoot biomass. We predicted the correlated response (CR) in SWt resulting from selecting any of the eight root traits to speculate its suitability as a secondary trait. An important component in defining CR is the H, which integrates information on genetic variation and environmental noise into one statistic and thus is useful in planning breeding programs 51 . One condition that must be met for indirect selection to be effective is H and r g must be high in both the selection and target environments 43,44 even though H and r g are environment-and population-specific 43,46 .
Fortunately, H has been strikingly similar in many environments 34,52 , and H variations in indoor growth environments are expected to be low 52,53 . In this context, since our H estimates were the average repeatability of 3-replicate trials in each of the N-managements, it may also be used to estimate the correlation expected between line means obtained from trials conducted at different indoor systems. Selecting a root trait in one N-management where H is high may predict the performance in the other, but we think the actual phenotypic quantity may vary substantially. However, if heritability values are high for both traits, then the correlation in breeding values dominates the phenotypic correlation 43,45,46 . Since the H values in SWt (H ~ 70%) was not as high compared to H of RAvdiam (~ 96%), and with a genetic correlation of − 0.827 between them, the correlation in environmental values within N-management which dominated the phenotypic correlation (− 0.481) between RAdiam and SWt may have been mainly due to LN-management effect on SWt. Thus, an LN environment that minimizes the devaluation of the breeding values between the RAvdiam and SWt must be maintained, and we believe indoor environments may provide this condition. In this context, selecting for a root trait as a secondary trait should produce a correlated response in spinach shoot biomass, and the ratio CR swt /R swt (Table 5) provides such an indirect selection criterion. It is clear (from these ratios) that direct selection of shoot biomass (SWt) is predicted to be superior to selecting for most of the eight root traits as a proxy in both N-managements. The exception is RAvdiam, which was predicted to result in superior indirect selection efficiency (112.8%) for SWt, i.e., a gain of ~ 12.8% in SWt by selecting against large RAvdiam. Other traits resulted in lower than 100% predicted efficiency; for instance, under LN was Xsings (74.6%), while the rest were less than 45% efficient. In the HN management, RLength (77.2%) and Tips (82.6%) were the most efficient but not enough for an indirect selection advantage.
The case for selecting against large average root diameter in baby spinach. We have noticed that under HN, RAvdiam did not have significant r g to any of the other root traits and with SWt, and only had significant positive r p with RWt and RVol (Table 2). Meanwhile, under LN, RAvdiam had significant positive r p only to RSarea and RVol, non-significant r p with RWt (0.104) and RLength, but significant negative r p with Tips (− 0.238) and Xsings (− 0.310) ( Table 3). Spinach requires high N supply 8 , and under such conditions, it seems RAdiam is not likely to substantially influence the yield differences observed in shoot biomass among the genotypes. However, the significant negative r p (− 0.481) and the highly negative r g (− 0.827) with SWt under LN management (Table 3) suggest that larger mean root diameter is associated with smaller mean shoot weight and vice versa. The reduction in root diameter-related phenes in the youngest maize nodes under N stress suggested that root diameter might play a role in adaptive stress responses 54,55 . Whether or not the greater mean RAvdiam was due to root girth expansion in a negative feedback response to low N or competing resource allocation 8,34  www.nature.com/scientificreports/ was not investigated in this study. Based on the pattern of diameter change in response to nutrient concentration in different species, it is suggested that altering root diameter may be another way to save C costs in root growth during nutrient stresses 56 . Although it is unclear how the root anatomical changes influence spinach root diameter, maize roots showed reduced cell diameter and area of vessels but an increased amount of aerenchyma during LN stress 54 . It is plausible to assume that N is preferentially allocated to the roots to sustain their growth under LN than shoots, and the reduced N concentration act as the internal signal in regulating the response of axile root growth. Given the robust correlated response efficiency of SWt predicted for RAvdiam and lateral root traits (Tips and Xsings) in our study, the RAvdiam measure can be used to indicate the ratio of axial: lateral roots. The genotypic correlations between RAvdiam and RSarea (0.533) and between RAvdiam and RVol (0.794) were also highly significant and positive. This implies that selecting against large root diameter may also select against RSarea and RVol under low N-management. Since there was non-significant r g but significant r p between RSarea and SWt, our data suggest that only a limited genetic linkage drag or pleiotropy 45 on shoot yield might result from selecting against large RAvdiam. Moreover, RSarea and RVol are standing crop traits 37 for 'root bulk' , which are a product of morphological traits (RAvdiam and RLength) and root architectural traits (e.g., Tips, Forks, and Xsings). The significant small positive r p between RSarea and SWt may be a combined artifact of these other morphological and architectural components when we also consider a significant negative r p existed between RAvdiam and SWt in the LN management. This position is also supported by the fact that RVol had a negative r g (− 0.296) and an insignificant r p to SWt, and yet RVol had positive r g (0.439) and r p (0.931) to RLength; RLength, on the other hand, had significant r g (0.539) and r p (0.529) to SWt. RAvdiam was also highly heritable with H ~ 96% under LN, Table 6). We propose that the root average diameter is the only trait in this study that can successfully be selected against to improve the yield of shoot biomass low N. Further studies to validate these findings might benefit from testing in multiple growth conditions (temperature, humidity, and graduated N concentrations) to define a broader range of root trait-shoot yield relationships and N-responsiveness.

Resolving the conundrum around the antagonistic relationship between RAvdiam and
SWt. Compared to SWt, RAvdiam has shown greater H ( Table 6). The r g between SWt and RAvdiam can be high (Table 3). In other words, indirect selection for a secondary trait will be superior if the heritability of that trait is high, and the correlation between the traits is close to 1 45,46 . In this study, the RAvdiam met these two critical criteria. However, for practical use in a breeding program, the secondary trait must also be inexpensive and easy to measure in large trials 43,45 . In that case, shoot biomass estimate could be used to select for roots bulk traits in production systems that target spinach roots as the end product. Because precision in imaging techniques for roots and shoot is rapidly evolving 57,58 , we believe that soilless systems can be designed to facilitate robust root metrics characterization to match the ease with which above-ground biomass can be phenotyped. It would also be worthwhile to determine if further selection for/ against other root traits would eventually result in a superior secondary selection for shoot biomass. How these relationships would play along as the plants mature under different indoor growth conditions or in the field conditions require further studies. Although a soilless system reduces the complexities associated with soils, we believe that the selection of a root trait needs to be understood in the context of the possible complex interplay among root traits 45,51 . The possible complex interactions between the root traits that may have influenced the SWt were not explicitly considered in our interpretations. However, we have alluded to such complexity by describing the trait to trait correlations, which we hope should serve as an impetus for further inquiry. Although the expected genetic correlation between estimates of cultivar means are best obtained from independent sets of trials 43,59 , we hypothesize that under similar N treatments, manipulation of other growth conditions in independent indoor growth environments may lead to some deviations from the response to selection predicted here. With the advent of techniques in processing images and the deep learning 60 frameworks that use advanced optimization and features from data, such prediction accuracy is likely to improve continually [61][62][63] . Nonetheless, machine learning recognized root traits would continue to rely on vigorous calibrations and field-based validation in systems of interest.
In conclusion, we report on the investigation of eight root traits genetic and phenotypic correlations with fresh shoot biomass of spinach grown in a soilless system in a controlled indoor environment. The plants were harvested at 41 d after sowing, a stage corresponding to the marketable baby spinach. We have used both genotypes by management and other conventional breeder statistics and a machine learning predictive technique to define candidate root traits with the potential for indirectly selecting for spinach shoot yield. The experiments were set up under two separate and contrasting N-managements. Of the eight root traits, the root average diameter emerged as the only candidate with a predicted indirect selection efficiency good enough to improve shoot biomass. However, it had a robust negative genetic correlation with shoot yield, making us believe that selecting against large root diameter may improve the fresh shoot yield of baby spinach. We have exercised caution in this interpretation by recommending further studies into the possible complex interactions among the root traits considered in improving shoot biomass yield in baby spinach.

Data availability
All data generated or analyzed during this study are included in this published article (and Supplementary  Information files).