Abstract
Regular water management is crucial for the cultivation of tomato (Solanum lycopersicum L.). Inadequate irrigation leads to water stress and a reduction in tomato yield and quality. Therefore, it is important to develop an efficient method for classifying the drought status of tomato so that irrigation can be applied in a timely manner. In this study, a simple classification and regression tree (CART) model that includes air temperature, vapor pressure deficit, and leaf–air temperature difference was established to classify the drought status of three tomato genotypes (i.e., cherry type ‘Tainan ASVEG No. 19’, large-fruit breeding line ‘108290’, and wild accession ‘LA2093’). The results indicate that the proposed CART model exhibited higher predictive sensitivity, specificity, geometric mean, and accuracy than the logistic model. In addition, the CART model was applicable not only to the three tomato genotypes but also across vegetative and reproductive stages. Furthermore, when the drought status was divided into low, medium, and high, the CART model provided a higher predictive performance than the logistic model. The results suggest that the drought status of tomato can be accurately classified by the proposed CART model, which provides a useful tool for regular water management in tomato cultivation.
Introduction
Tomato (Solanum lycopersicum L.) is a popular vegetable worldwide. To bridge the seasonal gap of production and prevent the rainfall and unfavorable temperature from reducing the yield and quality, most tomatoes are cultivated in greenhouses to stabilize and mitigate these adverse environmental impacts1. However, water management remains the main issue for farmers even when cultivated under greenhouse conditions2,3.
The shortage of water resources is a major limiting factor for agricultural production in many regions. Additionally, the quality and yield of tomato are affected not only by genotype but also by water management. To improve crop quality and water use efficiency, water stress is deliberately induced by applying deficit irrigation or increasing the salinity of the nutrient solution during cultivation4,5. Unfortunately, water stress induced by underirrigation during the vegetative and reproductive periods of tomato leads to abnormal growth, aborted flowering, and fruit setting failure, which cause a significant reduction in yield and quality6,7. Under moderate water deficit, photosynthesis is limited, but it can recover shortly after irrigation. Conversely, if the water shortage continues, irrigation can no longer restore photosynthesis8. Therefore, it is very important to apply water stress at an optimum level. Different genotypes or growth stages may respond differently to water stress9,10. Most crops are drought sensitive at various growth stages. The flowering and fruit setting stages are the most sensitive to water deficits in drip-irrigated tomatoes11. To provide breeding material that resists drought stress, various species of tomato have been studied12,13,14. Wild tomato is more resilient against abiotic and biotic stress than domesticated tomato12,13. Tapia et al.14 found better morpho-physiological responses, such as tolerance to drought stress, in wild tomato than in domesticated tomato.
In order to achieve adequate water management, it is important to decide on a suitable irrigation strategy, which relies on an accurate, reliable, and timely classification of the drought status of tomato15,16,17. The drought status of plants is mainly determined from the soil water content, morphological alteration, physiological responses, and gene expression7,18,19,20. Monitoring changes in soil moisture through sensors has been criticized because the spatial heterogeneity of soils can make the measurements unrepresentative21,22,23,24. Gene expression profiling cannot reflect the instantaneous drought status in the greenhouse, and other kinds of stress may contribute to the same expression variation. In addition, these methods are time-consuming and labor-intensive, which limits the number of plants and the scale of the experiment19,25. In contrast, evaluating the drought status by examining variations in physiological responses such as stomatal conductance, transpiration rate, and leaf temperature with suitable instruments is relatively efficient and effortless8,26,27,28.
When a plant is under drought stress, stomatal closure responds more sensitively and rapidly than the water potential and leaf water content29. Stomatal closure is one of the major factors limiting plant photosynthesis under mild or moderate water stress30. Medrano et al.8 found that light-saturated stomatal conductance (gsw) can be used as a reference parameter to reflect the degree of drought for C3 plants. In addition, the species effect of gsw on photosynthesis seems to be lower than that of the leaf water potential and relative water content8. However, although gsw provides information about the water status of plants, current methods of gsw measurement require physical contact with leaves, which is suitable for manually measuring individual plants but not for large-scale and field-scale scenarios. Other common indicators of the water status of plants are leaf temperature and leaf-to-air vapor pressure deficit (VPD). When plants suffer from drought stress, stomatal closure reduces the heat emission and air efflux from leaves, leading to an increase in leaf temperature and VPD. Therefore, plant temperature and leaf–air temperature difference (Tdiff) can be used to indirectly assess the plant gsw31,32,33,34. The reported Tdiff-based indicators for evaluating plant drought status are the stress degree day (SDD)35 and the crop water stress index (CWSI)36. However, temperature-based indicators are strongly influenced by the VPD and air temperature (Tair)37,38,39. Therefore, in the subsequent establishment of the drought status model in this study, Tair and VPD are considered as independent variables in addition to Tdiff.
Logistic regression is a statistical method that can establish the relationship between predictive variables and binary (dichotomous) and/or ordinal dependent variables40. Logistic regression has been used in the analysis of plant disease risk factors and implemented in disease predictive models to classify the presence or absence of disease in wheat, oilseed rape, pyrethrum, and peanut plants40,41. In addition to logistic regression, classification and regression tree (CART) analysis is a non-parametric procedure developed by Breiman et al.42 in 1984. CART supports non-linear classification and is capable of handling collinearity between predictive variables43. Due to its flexibility, interpretability, and broad applicability, CART has been widely used as a classification algorithm for multiclass issues in agriculture, environmental protection, biomedicine, and computer science44,45,46,47.
For automated irrigation management in greenhouses, most farmers use a timer to trigger irrigation at regular intervals or irrigate to maintain a specific soil water content. However, these methods neglect the plant response15. Soil moisture may not accurately represent the plant water status, and different genotypes have different drought tolerance responses. If a traditional irrigation system is adopted, irrigation deficiencies or excesses often become unavoidable. Therefore, conducting water management on the basis of plant response is more appropriate and accurate18. In our previous studies, plant temperature was utilized to classify the drought status of greenhouse tomato to improve the irrigation system15,48. However, these studies did not consider the influences of different genotypes and growth stages.
To facilitate proper water management, the goal of this study is to develop a simple discriminant model that instantly determines the drought status of tomatoes and serves as a rule for irrigation decision-making based on plant responses. The seedlings of cherry type tomato ‘Tainan ASVEG No. 19’ were subjected to drought and regular irrigation treatments, and the net CO2 assimilation rate (An), gsw, VPD, Tair, and Tdiff parameters were collected each day after treatment. The drought status was divided into binary (WW: well-watered; WS: water deficit stress) and ordinal (L: low stress; M: medium stress; H: high stress) variables according to the value of gsw. The Tair, VPD, and Tdiff were used as explanatory variables to build the CART and logistic regression models for predicting the drought status of the tomato. In addition to the data collected from ‘Tainan ASVEG No. 19’, data of the wild accession ‘LA2093’ (Lycopersicon pimpinellifolium) and the large-fruit breeding line ‘108290’ were used to evaluate the model applicability for different genotypes and growth stages of the tomato.
Results and discussion
Relationship between gsw and An
The relationship between gsw and An was described by a logarithmic function, and the coefficients of determination (R2) were 0.79–0.94 (Fig. 1). Thus, approximately 79–94% of the variation in An can be explained by gsw. In addition, a strong correlation was observed between An and gsw, irrespective of the growth stage and genotype from which the data were collected; the Spearman correlation coefficients (ρ) were all above 0.77 (Table 1). Under drought stress, plants close their stomata to avoid excessive water loss, and this closure results in a lack of the CO2 required for photosynthesis. Moreover, the lack of water dehydrates the tissues that conduct photosynthesis and eventually impedes the photosynthetic efficiency of plants49. A high degree of correlation between gsw and An has also been observed in field- and pot-grown plants8,50. These results support our subsequent establishment of the drought status model based on gsw.
Relationship of gsw with VPD, Tdiff, and Tair
VPD is one of the factors that induce stomatal changes in many plants51. Stomata close as the leaf-to-air VPD increases regardless of soil water conditions8,51. In this study, the ρ between gsw and VPD for all genotypes was −0.77 to −0.82 (Table 1). In addition to reducing the efficiency of photosynthesis, stomatal closure also hinders heat loss through leaf transpiration, thereby increasing the plant temperature27,52. Therefore, Tdiff should be negatively correlated with gsw. The results of this study indicated that the ρ between gsw and Tdiff for all genotypes was −0.68 to −0.89 (Table 1). Although gsw had a slight tendency to increase as Tair rose, the correlation between gsw and Tair was very weak (ρ = 0.05–0.26) (Table 1). Raschke53 summarized the stomatal feedback mechanism in which changes in temperature affect CO2 assimilation, and the opening or closure of stomata in response to temperature changes is influenced by the CO2 feedback. Urban et al.54 found that gsw increases with rising Tair. However, VPD is more important than Tair in the change in gsw55, which is consistent with the weak correlation between gsw and Tair. Nevertheless, when Tair, Tdiff, and VPD were entered together into the CART or logistic model, Tair was retained in the final model (Table 2; Figs. 2, 3). Thus, Tair may still have some influence on the prediction of gsw.
Classification ability of binary response models
In the study, 70% of the data of the Tainan ASVEG No. 19 (2018–2019) dataset were used to build the model. Next, the Tainan ASVEG No. 19 (2020), breeding line 108290, LA2093, and the remaining 30% data of Tainan ASVEG No. 19 (2018–2019) dataset were used as the testing sets for model validation.
The parameters of the logistic model in the model-building stage are shown in Table 2. When the remaining 30% of the Tainan ASVEG No. 19 (2018–2019) data were used as the testing dataset, the logistic model achieved a sensitivity of 0.82, specificity of 0.96, geometric mean of 0.89, and accuracy of 93.10% (Table 3). For the other testing datasets, the logistic model also had an acceptable performance, with a sensitivity of 0.80–1.00, specificity of 0.79–0.85, geometric mean of 0.86–0.92, and accuracy of 80.23–89.74% (Table 3).
The structure of the binary CART model is illustrated in Fig. 2. When the remaining 30% of the Tainan ASVEG No. 19 (2018–2019) data were used for validation, the CART model achieved a sensitivity of 0.75, specificity of 0.97, geometric mean of 0.85, and accuracy of 92.18% (Table 4). For the other testing datasets, the CART model showed a comparable or better performance than that of the logistic model, with a sensitivity of 0.87–1.00, specificity of 0.89–0.93, geometric mean of 0.90–0.94, and accuracy of 90.82–92.11% (Table 4).
Classification ability of ordinal response models
In the ordinal response models, the training and testing datasets were the same as those of the binary response models. The classification performance of the ordinal logistic model is shown in Table 5. When using 30% of the data of Tainan ASVEG No. 19 (2018–2019) as the testing dataset, the correctly classified percentages of L, M, and H statuses were 98.98%, 59.02%, and 54.00%, respectively, and the overall accuracy was 87.38%. For the Tainan ASVEG No. 19 (2020) testing dataset, the correctly classified percentages of L, M, and H statuses were 81.48%, 23.81%, and 100.00%, respectively, and the overall accuracy was 72.81%. The performance in classifying the drought status for the breeding line 108290 and LA2093 datasets fell between that for Tainan ASVEG No. 19 (2018–2019) and Tainan ASVEG No. 19 (2020). The overall accuracies were 77.55% and 83.72%, respectively, for predicting the drought status under different genotypes and growth stages (Table 5).
The results and the structure of the multi-class CART model are presented in Table 6 and Fig. 3. When the CART model was applied to the 30% testing dataset of Tainan ASVEG No. 19 (2018–2019), the correctly classified percentages of L, M, and H statuses were 96.59%, 63.93%, and 58.00%, respectively, and the overall accuracy was 86.88%. For the Tainan ASVEG No. 19 (2020) testing dataset, the correctly classified percentages of L, M, and H statuses were 87.65%, 71.43%, and 75.00%, respectively, and the overall accuracy was 83.33%. For the breeding line 108290 and LA2093 datasets, the overall accuracies were 79.59% and 84.88%, respectively (Table 6).
Models comparison and evaluation
The binary response models performed well for the four datasets (Tables 3, 4). It is worth noting that the data used to build the models were taken from ‘Tainan ASVEG No. 19’ at the seedling/vegetative growth stage, while the testing data included ‘Tainan ASVEG No. 19’, breeding line ‘108290’, and ‘LA2093’ at the flowering stage. The acceptable performance of the binary response models indicates that the logistic and CART models have the potential to classify the binary drought status of tomatoes across different genotypes and growth stages.
As for the multi-class models, both logistic regression and CART showed good classification ability for the L class but poor performance in classifying the M and H statuses (Tables 5, 6). The reason may be the class-imbalanced datasets used in this study, as the numbers of cases in the M and H categories were much lower than that in the L class (Table 7). Class imbalance may lead to poor recognition of the M and H categories by the models and contribute to the reduced classification capability56,57. The performance of the multi-class models could be further improved if the class distribution of the dataset were rebalanced using resampling methods58,59.
When comparing the performances of the logistic and CART models, the latter generally outperformed the former (Tables 3, 4, 5, 6). In the case of highly class-imbalanced data, unsatisfactory performance of the logistic model has often been observed60. Even when the logistic model performs well, its classification process is difficult to interpret and visualize, in contrast to that of CART analysis60. In addition, the CART model makes fewer assumptions than logistic regression and can deal with complex interactions and nonlinearities61. These properties contribute to the capability of the CART model to handle class-imbalanced datasets42,60,62 and to outperform the logistic model60,63. After comprehensively considering the classification performance and the convenience of application, the CART models are recommended for predicting the drought status of tomatoes. In application, only air temperature, relative humidity, and plant temperature sensors need to be installed to obtain the values of the input variables required by the model, and the decision rules of CART are then set in the control system. Taking the rightmost decision rule in Fig. 2 as an example, when Tdiff > 0.64 °C and VPD > 1.7 kPa, the tomato has a high probability (0.95) of being in a state of water shortage, indicating that it should be irrigated at this time.
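The example rule quoted above can be sketched in code. The following is a minimal illustration in Python, not the fitted tree: it encodes only the rightmost rule of Fig. 2 as stated in the text, and the function name and the choice to return `None` for every other branch (whose thresholds are not given here) are assumptions made for illustration.

```python
from typing import Optional


def rightmost_rule_ws_probability(t_diff_c: float, vpd_kpa: float) -> Optional[float]:
    """Apply the single decision rule quoted from the rightmost branch of Fig. 2.

    Returns the reported probability of water shortage (0.95) when the rule
    fires. The other branches of the tree are not described in the text, so
    this sketch returns None instead of guessing their outcomes.
    """
    if t_diff_c > 0.64 and vpd_kpa > 1.7:
        return 0.95
    return None


def should_irrigate(t_diff_c: float, vpd_kpa: float) -> bool:
    """Hypothetical controller hook: irrigate only when the quoted rule fires."""
    return rightmost_rule_ws_probability(t_diff_c, vpd_kpa) is not None
```

A controller would call `should_irrigate` with the current sensor-derived Tdiff and VPD; a complete deployment would need all leaves of the tree from Fig. 2.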
Conclusions
The proposed CART model with Tair, VPD, and Tdiff as independent variables performed well in predicting tomato drought status. The performance of the CART model was generally better than that of the logistic model for both binary and ordinal responses. In addition, the results indicated that the CART model can classify the WW and WS statuses as well as the L, M, and H statuses for domesticated and wild tomato genotypes at different growth stages. Given the convenient measurement of its input variables, its good classification performance, and its intuitive visualization, the proposed CART model can be utilized as a simple and practical method to classify the drought status of diverse tomato genotypes at vegetative and reproductive stages. In practice, the proposed method only requires installing air temperature, relative humidity, and plant temperature sensors in the greenhouse and setting the CART decision rules in the control system of the water supply. In the future, data on water shortage in the fruiting stage can be taken into consideration to further verify the reliability of the model. The performance of the proposed model could be further improved if the class imbalance problem were solved.
Methods
Experimental layout
In order to develop a drought stress detection method across different growth stages (vegetative and reproductive stages) and genotypes, two experiments were conducted in the 1# and 2# solar greenhouses at the Taiwan Agricultural Research Institute (TARI) located in Taichung City, Taiwan (latitude 24° 01′ N, longitude 120° 41′ E). In the 1# greenhouse, the cherry tomato cultivar ‘Tainan ASVEG No. 19’ was used between 2018 and 2019. Eight young seedlings with 6–8 fully expanded leaves were planted in baskets (50 cm × 40 cm × 30 cm) with 6D soil substrate (BVB, De Lier, The Netherlands). The experiment comprised two irrigation treatments: regular watering and drought. In the regular watering treatment, tomato was irrigated daily until the field capacity was reached. For the drought treatment, no irrigation was applied after transplanting to mimic a progressive drought condition. The substrate volumetric water content (SVWC) was determined by WaterScout SM100 sensors (Spectrum Technologies, Aurora, IL, USA). Four digital sensors were inserted evenly into the substrate at a depth of 10 cm in each plastic basket. The SVWC was recorded every 30 min after the regular irrigation and drought treatments were applied to the tomato seedlings. In total, the experiment in the 1# greenhouse was performed seven times at different time points.
In the 2# greenhouse, in addition to ‘Tainan ASVEG No. 19’, wild accession ‘LA2093’ and large-fruit breeding line ‘108290’ were planted in peat moss during the summer of 2020. Tomatoes were planted at a density of approximately 27,900 plants/ha. Unlike in the 1# greenhouse, the irrigation treatments (regular watering and drought) in the 2# greenhouse were applied from the flowering stage to the fruit setting stage, because this period is the most sensitive to water deficits in drip-irrigated tomatoes11. The study complies with relevant institutional, national, and international guidelines and legislation.
Environmental parameters and physiological data collection
For each tomato plant, 3–5 fully expanded leaves from the top of the plant were continuously measured. The leaf temperature, Tair, An, gsw, transpiration rate (E), and leaf-to-air VPD were measured using a LI-6800 portable photosynthesis system (LICOR Biosciences, Lincoln, NE, USA) at ambient air temperature (28.0–32.0 °C), air humidity (RH = 60%), reference CO2 concentration (400 μmol mol−1), and stable light intensity of 1200 μmol photons m−2 s−1 using an internal LED light source (red:blue = 9:1). Measurements were taken between 10:00 and 14:00. Data collection started when the tomato plants under drought treatment showed clear signs of water shortage (Fig. S1), which was judged visually. The clear symptoms of water shortage appeared about 2 to 3 weeks after the start of the drought treatment, when the SVWC was 7–12%. In this study, the numbers of observations for ‘Tainan ASVEG No. 19’ in 2018–2019, ‘Tainan ASVEG No. 19’ in 2020, breeding line ‘108290’, and ‘LA2093’ were 1238, 114, 86, and 98, respectively.
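In this study the model inputs were read directly from the LI-6800. For a greenhouse deployment that relies only on air temperature, relative humidity, and plant temperature sensors, the three inputs could be derived roughly as in the following Python sketch. The Tetens approximation for saturation vapor pressure, the sign convention Tdiff = leaf temperature minus air temperature, and all function names are assumptions introduced for illustration, not the authors' procedure.

```python
import math


def saturation_vapor_pressure_kpa(temp_c: float) -> float:
    """Saturation vapor pressure in kPa (Tetens approximation, assumed here)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def model_inputs(t_air_c: float, t_leaf_c: float, rh_percent: float) -> dict:
    """Derive Tair, leaf-to-air VPD, and Tdiff from three sensor readings.

    Leaf-to-air VPD is taken as the saturation pressure at leaf temperature
    minus the actual vapor pressure of the air (an assumed definition).
    """
    air_vapor_pressure = saturation_vapor_pressure_kpa(t_air_c) * rh_percent / 100.0
    vpd_leaf = saturation_vapor_pressure_kpa(t_leaf_c) - air_vapor_pressure
    t_diff = t_leaf_c - t_air_c  # assumed sign: positive when the leaf is warmer
    return {"Tair": t_air_c, "VPD": vpd_leaf, "Tdiff": t_diff}


# Illustrative readings only; these are not measurements from the study.
inputs = model_inputs(t_air_c=30.0, t_leaf_c=32.0, rh_percent=60.0)
```

Values produced this way may differ from the LI-6800 readings, so thresholds learned from instrument data would need to be checked against sensor-derived inputs before use.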
Relating the light-saturated stomatal conductance to environmental and physiological parameters
In this study, the relationship between the light-saturated gsw of the tomatoes and An was first established. Next, the parameters VPD, Tair, and Tdiff, which can affect or reflect stomatal closure, were related to the light-saturated gsw. When building the relationship between light-saturated gsw and An, several models, i.e., linear regression, logarithmic curve, exponential curve, and polynomial regression, were considered to find the best model using Excel 2016 software. Additionally, the Spearman correlation coefficients between light-saturated gsw and the parameters An, VPD, Tair, and Tdiff were calculated.
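The logarithmic fit described above was done in Excel 2016. As a rough equivalent, the following Python sketch fits An = a·ln(gsw) + b by ordinary least squares on ln(gsw) and reports R2. The function name and the synthetic numbers are hypothetical and serve only to show the calculation.

```python
import math


def fit_log_curve(x, y):
    """Least-squares fit of y = slope * ln(x) + intercept; returns (slope, intercept, r2)."""
    log_x = [math.log(v) for v in x]
    n = len(log_x)
    mean_lx = sum(log_x) / n
    mean_y = sum(y) / n
    sxx = sum((u - mean_lx) ** 2 for u in log_x)
    sxy = sum((u - mean_lx) * (v - mean_y) for u, v in zip(log_x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_lx
    # R2 = 1 - residual sum of squares / total sum of squares
    ss_res = sum((v - (intercept + slope * u)) ** 2 for u, v in zip(log_x, y))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic gsw values and An values generated from a known curve (not study data).
gsw_values = [0.02, 0.05, 0.10, 0.20, 0.40]
an_values = [5.0 * math.log(g) + 20.0 for g in gsw_values]
slope, intercept, r2 = fit_log_curve(gsw_values, an_values)
```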
Data labeling
The tomato drought status was assessed using the thresholds for gsw defined by Medrano et al.8. The thresholds were defined as follows: for the binary response, WW, with gsw ≥ 0.15 mol H2O m−2 s−1, and WS, with gsw < 0.15 mol H2O m−2 s−1. For the ordinal response, L, with gsw ≥ 0.15 mol H2O m−2 s−1; M, with 0.05 ≤ gsw < 0.15 mol H2O m−2 s−1; and H, with gsw < 0.05 mol H2O m−2 s−1. The complete data were divided into four datasets according to the data sources: (1) Tainan ASVEG No. 19 (2018–2019), (2) Tainan ASVEG No. 19 (2020), (3) breeding line 108290, and (4) LA2093. The description of the four datasets is provided in Table 7. After labeling the drought statuses, the differences in physiological parameter values between drought statuses were examined. Because the Shapiro–Wilk test showed that the assumption of normality was violated, nonparametric methods were used to compare the drought statuses. Mann–Whitney U and Kruskal–Wallis tests were used to examine the differences in E, An, and Tdiff values between drought statuses for binary and ordinal responses, respectively.
For the binary response, all physiological parameter values of the three genotypes differed between the WW and WS statuses. The values of E and An of WW plants were significantly higher than those of WS plants. Conversely, the values of Tdiff of WW plants were significantly lower than those of WS plants (Table S1). For the ordinal response, the values of E and An decreased with increasing drought level, and the values of Tdiff increased with increasing drought level. However, the physiological parameter values of the M and H statuses showed no significant difference, except for LA2093 (Table S2).
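The gsw thresholds used for labeling translate directly into code. The following Python sketch applies the binary and ordinal cut-offs stated in this section; the function names are hypothetical.

```python
def label_binary(gsw: float) -> str:
    """Binary drought status from gsw (mol H2O m-2 s-1): WW if gsw >= 0.15, else WS."""
    return "WW" if gsw >= 0.15 else "WS"


def label_ordinal(gsw: float) -> str:
    """Ordinal drought status: L (gsw >= 0.15), M (0.05 <= gsw < 0.15), H (gsw < 0.05)."""
    if gsw >= 0.15:
        return "L"
    if gsw >= 0.05:
        return "M"
    return "H"


# Illustrative gsw readings only; these are not measurements from the study.
readings = [0.30, 0.15, 0.10, 0.05, 0.01]
binary_labels = [label_binary(g) for g in readings]
ordinal_labels = [label_ordinal(g) for g in readings]
```

With these definitions the WW class coincides with the L class, and the WS class is the union of M and H.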
Models building and validation
Logistic regression is a modeling approach that can be used to describe the relationship between predictor variables and a dichotomous or multi-category response variable. For the tomato drought status defined in the previous section, a logistic model for p − 1 independent variables was defined as follows:
\[ \ln\left[\frac{P(Y = 1)}{1 - P(Y = 1)}\right] = a + b_1 X_1 + b_2 X_2 + \cdots + b_{p-1} X_{p-1} \qquad (1) \]
where P(Y = 1) is the probability of WS status, given the values of X1, …, Xp−1; a is an intercept; and b1, …, bp−1 are regression coefficients. Additionally, the probability P(Y = 1) is 1/[1 + exp(−a − b1X1 − b2X2 − ··· − bp−1Xp−1)] in the multiple logistic regression model. Thus, the logistic model expressed in logit form is a linear function of the independent variables. For the final model, the threshold probability, i.e., the probability value that classifies an observation to the WS status with the most accurate prediction result, was used as the classification criterion41.
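To make the classification step concrete, the following Python sketch evaluates the probability expression given above and selects a threshold probability by training accuracy. The coefficient values in the usage lines are placeholders rather than the fitted values of Table 2, and the candidate-grid search is an assumed way of finding "the most accurate" threshold.

```python
import math


def ws_probability(intercept, coefs, x):
    """P(Y = 1) = 1 / (1 + exp(-(a + b1*X1 + ... + b_{p-1}*X_{p-1})))."""
    linear = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear))


def classify(prob, threshold):
    """Assign WS when the predicted probability reaches the threshold."""
    return "WS" if prob >= threshold else "WW"


def best_threshold(probs, labels, candidates):
    """Pick the candidate threshold with the highest accuracy on labeled data."""
    def accuracy(th):
        hits = sum(classify(p, th) == y for p, y in zip(probs, labels))
        return hits / len(labels)
    return max(candidates, key=accuracy)


# Placeholder coefficients for (Tair, VPD, Tdiff); NOT the values in Table 2.
p = ws_probability(-3.0, [0.0, 1.0, 1.0], [30.0, 2.0, 1.0])
chosen = best_threshold([0.2, 0.4, 0.6, 0.8], ["WW", "WW", "WS", "WS"], [0.3, 0.5, 0.7])
```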
For the k-class ordinal response data, one of the underlying assumptions of ordinal logistic regression is that the regression coefficient of each independent variable is identical across the k − 1 cumulative logits, whereas the intercepts differ (Eq. 2)41.
The probability function of predicting each category of drought statuses (L, M, and H) can be defined as per Eq. (3). The probability function given the highest probability value was the predicted drought status41.
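For orientation, one common parameterization of such a proportional-odds model for the three ordered statuses is sketched below. This is an assumed form, not necessarily the exact notation of Eqs. (2) and (3): the ordering L < M < H (coded 1, 2, 3), the direction of the cumulative probability, and the sign of the linear predictor are conventions chosen here for illustration.

```latex
% Assumed proportional-odds form: shared slopes b_1..b_{p-1}, separate intercepts a_1, a_2
\ln\left[\frac{P(Y \le j)}{1 - P(Y \le j)}\right]
  = a_j + b_1 X_1 + b_2 X_2 + \cdots + b_{p-1} X_{p-1}, \qquad j = 1, 2

% Category probabilities recovered from the cumulative probabilities
P(Y = \mathrm{L}) = P(Y \le 1), \qquad
P(Y = \mathrm{M}) = P(Y \le 2) - P(Y \le 1), \qquad
P(Y = \mathrm{H}) = 1 - P(Y \le 2)
```

Under this form, the predicted status is the category with the largest of the three probabilities, as described above.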
The CART model is a common tree-based classifier that takes continuous or categorical predictor variables to predict a categorical or continuous dependent variable, requires no distributional assumptions, and is simple to interpret43. It employs recursive partitioning, using all predictor variables to repeatedly split subsets of the dataset into two child nodes62. Starting with the entire dataset, i.e., the root node, the CART approach explores all possible values of the predictor variables to find the best predictor variable and split value for the node. The best partition is the one that minimizes the average impurity of the two child nodes. In this study, the Gini index of diversity was used to choose the best predictor at each node. The Gini index at node t, g(t), is expressed as follows:
\[ g(t) = \sum_{i \ne j} p(i \mid t)\, p(j \mid t) \qquad (4) \]
where i and j are the different categories of the dependent variable, and p(i | t) is the proportion of observations at node t that belong to category i.
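The split search described above can be illustrated for a single predictor. The following Python sketch computes the Gini index of a node (using the equivalent form 1 − Σ p², which equals the sum over i ≠ j of p(i|t)p(j|t)) and scans candidate thresholds for the one that minimizes the size-weighted impurity of the two child nodes. It is a simplified illustration, not the rpart implementation; the function names and data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_child_impurity) of the best split on one predictor.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns None when all values are identical and no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best


# Illustrative Tdiff values and labels only; these are not study data.
split = best_split([0.1, 0.3, 0.9, 1.2], ["WW", "WW", "WS", "WS"])
```

A full tree repeats this search over every predictor at every node and then recurses into the two children.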
Regardless of the method used to build the classification model, 70% of the data of the Tainan ASVEG No. 19 (2018–2019) dataset were randomly selected as the training set, and Tair, VPD, and Tdiff were used as independent variables to build the model. The remaining 30% of the data of the Tainan ASVEG No. 19 (2018–2019) dataset were used as the testing set for model validation. In addition, the Tainan ASVEG No. 19 (2020), breeding line 108290, and LA2093 datasets were used to validate the applicability of the models to different growth substrates, genotypes, and growth stages of the tomatoes. Since stress responses vary under different conditions64, this validation can clarify the generalizability of the proposed model65.
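The random 70/30 partition can be sketched as follows in Python. The fixed seed and the function name are assumptions added for reproducibility of the illustration; the text does not state how the random selection was seeded or whether it was stratified by class.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # local generator; global state untouched
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Ten placeholder record indices stand in for the observations.
train_set, test_set = split_train_test(range(10))
```

The external datasets (Tainan ASVEG No. 19 in 2020, breeding line 108290, and LA2093) are not split; they are used whole for validation.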
Discriminant ability of the models
For the binary class model, sensitivity, specificity, geometric mean, and accuracy were used to evaluate the model performance. These metrics are defined by Eqs. (5)–(8), respectively:
\[ \text{Sensitivity} = \frac{TP}{TP + FN} \qquad (5) \]
\[ \text{Specificity} = \frac{TN}{TN + FP} \qquad (6) \]
\[ \text{Geometric mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}} \qquad (7) \]
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (8) \]
where TN is true negative (when the true drought status of the tomato was “WW,” and the model also classified it as “WW”); FP is false positive (when the true drought status of the tomato was “WW,” but the model classified it as “WS”); FN is false negative (when the true drought status of the tomato was “WS,” but the model classified it as “WW”); and TP is true positive (when the true drought status of the tomato was “WS,” and the model also classified it as “WS”).
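Using the TP, FN, FP, and TN definitions above, the four binary metrics can be computed as in this short Python sketch, which uses the standard definitions of sensitivity, specificity, geometric mean, and accuracy. The counts in the usage lines are made up for illustration and are not taken from Tables 3 or 4.

```python
import math


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, geometric mean, and accuracy, with WS as the positive class."""
    sensitivity = tp / (tp + fn)   # share of true WS cases classified as WS
    specificity = tn / (tn + fp)   # share of true WW cases classified as WW
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "geometric_mean": math.sqrt(sensitivity * specificity),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical confusion counts, chosen only to show the arithmetic.
metrics = binary_metrics(tp=8, fn=2, fp=1, tn=9)
```

As a consistency check on the reported results, a sensitivity of 0.82 and a specificity of 0.96 give a geometric mean of about 0.89, in line with the value reported for the logistic model.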
The performance of the multi-class model was evaluated with the correctly classified percentage for each class and the overall accuracy. Assume that a dataset of size N includes k classes, each class has ni instances (i = 1, 2, …, k), and cij are the elements of the k × k confusion matrix, where i, j = 1, 2, …, k. The rows and columns of the matrix show the true and predicted values of each class, respectively. The correctly classified percentage for each class and the overall accuracy can then be defined using Eqs. (9)–(10):
\[ \text{Correctly classified percentage of class } i = \frac{c_{ii}}{n_i} \qquad (9) \]
\[ \text{Overall accuracy} = \frac{\sum_{i=1}^{k} c_{ii}}{N} \qquad (10) \]
The range of the metrics described here is from 0 to 1. The closer the values of these metrics are to 1, the better the classification ability of the model. Model performance is considered acceptable if the sensitivity > 0.85, specificity > 0.85, geometric mean > 0.75, and accuracy > 90.00% for a binary response. The acceptable standard for the ordinal response model is that the overall accuracy is > 80.00%. These criteria are set based on the median (or close to the median) of previous water status classification studies7,15,19,48,66,67.
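The multi-class metrics follow directly from the confusion matrix cij described above (rows are true classes, columns are predicted classes). The following Python sketch computes the correctly classified fraction for each class and the overall accuracy; the matrix in the usage lines is invented for illustration and does not reproduce Tables 5 or 6.

```python
def per_class_and_overall(confusion):
    """Return ([c_ii / n_i for each class], sum of c_ii / N) for a square confusion matrix."""
    k = len(confusion)
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(k)]
    total = sum(sum(row) for row in confusion)
    overall = sum(confusion[i][i] for i in range(k)) / total
    return per_class, overall


# Hypothetical counts with rows and columns ordered L, M, H.
example = [
    [90, 8, 2],
    [10, 25, 5],
    [2, 6, 12],
]
class_rates, overall_accuracy = per_class_and_overall(example)
```

With an imbalanced matrix like this one, the overall accuracy is dominated by the largest class, which is why the per-class rates are reported alongside it.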
Statistical analysis
The statistical analyses were implemented using the R software (version 4.1.3). Spearman correlation coefficients were calculated using the cor function. Binary logistic models were constructed using the glm function. The ordinal logistic model was built using the vglm function in VGAM package (version 1.1–7). The CART model was implemented using the rpart function in the rpart package (version 4.1.16).
Data availability
Data generated or analyzed during this study are included in this published article.
References
Li, Y. et al. Comparison of drip fertigation and negative pressure fertigation on soil water dynamics and water use efficiency of greenhouse tomato grown in the north China plain. Agric. Water Manag. 184, 1–8 (2017).
Klunklin, W. & Savage, G. Effect on quality characteristics of tomatoes grown under well-watered and drought stress conditions. Foods 6, 56. https://doi.org/10.3390/foods6080056 (2017).
Yuan, X. K., Yang, Z. Q., Li, Y. X., Liu, Q. & Han, W. Effects of different levels of water stress on leaf photosynthetic characteristics and antioxidant enzyme activities of greenhouse tomato. Photosynthetica 54, 28–39 (2016).
Khapte, P. S., Kumar, P., Burman, U. & Kumar, P. Deficit irrigation in tomato: Agronomical and physio-biochemical implications. Sci. Hortic. 248, 256–264 (2019).
Suhandy, D., Khuriyati, N. & Matsuoka, T. Determination of leaf water potential in tomato plants using NIR spectroscopy for water stress management. Environ. Control Biol. 44, 279–284 (2006).
Jangid, K. K. & Dwivedi, P. Physiological responses of drought stress in tomato: A review. Int. J. Environ. Agric. Biotech. 9, 53. https://doi.org/10.5958/2230-732X.2016.00009.7 (2016).
Tu, Y.-K. et al. A 1D-SP-Net to determine early drought stress status of tomato (Solanum lycopersicum) with imbalanced Vis/NIR spectroscopy data. Agriculture 12, 259. https://doi.org/10.3390/agriculture12020259 (2022).
Medrano, H., Escalona, J. M., Bota, J., Gulías, J. & Flexas, J. Regulation of photosynthesis of C3 plants in response to progressive drought: Stomatal conductance as a reference parameter. Ann. Bot. 89, 895–905 (2002).
Nuruddin, M. M., Madramootoo, C. A. & Dodds, G. T. Effects of water stress at different growth stages on greenhouse tomato yield and quality. HortScience 38(7), 1389–1393 (2003).
Sharma, S. P., Leskovar, D. I., Volder, A., Crosby, K. M. & Ibrahim, A. M. H. Root distribution patterns of reticulatus and inodorus melon (Cucumis melo L.) under subsurface deficit irrigation. Irrig. Sci. 36, 301–317 (2018).
Harmanto, K., Salokhe, V. M., Babel, M. S. & Tantau, H. J. Water requirement of drip irrigated tomatoes grown in greenhouse in tropical environment. Agric. Water Manag. 71(3), 225–242 (2005).
Kissoudis, C. et al. Combined biotic and abiotic stress resistance in tomato. Euphytica 202, 317–332 (2015).
Razali, R. et al. The genome sequence of the wild tomato Solanum pimpinellifolium provides insights into salinity tolerance. Front. Plant Sci. 9, 1402. https://doi.org/10.3389/fpls.2018.01402 (2018).
Tapia, G., Méndez, J. & Inostroza, L. Different combinations of morpho-physiological traits are responsible for tolerance to drought in wild tomatoes Solanum chilense and Solanum peruvianum. Plant Biol. 18, 406–416 (2016).
Fang, S.-L. et al. Plant-response-based control strategy for irrigation and environmental controls for greenhouse tomato seedling cultivation. Agriculture 12, 633. https://doi.org/10.3390/agriculture12050633 (2022).
Liu, H. et al. Drip irrigation scheduling for tomato grown in solar greenhouse based on pan evaporation in north China plain. J. Integr. Agric. 12, 520–531 (2013).
Liu, H. et al. Optimizing irrigation frequency and amount to balance yield, fruit quality and water use efficiency of greenhouse tomato. Agric. Water Manag. 226, 105787. https://doi.org/10.1016/j.agwat.2019.105787 (2019).
Fernández, J. E. Plant-based sensing to monitor water stress: Applicability to commercial orchards. Agric. Water Manag. 142, 99–109 (2014).
Fernández-Novales, J., Tardaguila, J., Gutiérrez, S., Marañón, M. & Diago, M. P. In field quantification and discrimination of different vineyard water regimes by on-the-go NIR spectroscopy. Biosyst. Eng. 165, 47–58 (2018).
Solankey, S. S., Singh, R. K., Baranwal, D. K. & Singh, D. K. Genetic expression of tomato for heat and drought stress tolerance: An overview. Int. J. Veg. Sci. 21, 496–515 (2015).
Jones, H. G. Irrigation scheduling: Advantages and pitfalls of plant-based methods. J. Exp. Bot. 55, 2427–2436 (2004).
Qiu, R. et al. Response of evapotranspiration and yield to planting density of solar greenhouse grown tomato in northwest China. Agric. Water Manag. 130, 44–51 (2013).
Wan, S. & Kang, Y. Effect of drip irrigation frequency on radish (Raphanus sativus L.) growth and water use. Irrig. Sci. 24, 161–174 (2006).
Yan, H. et al. Energy partitioning of greenhouse cucumber based on the application of Penman–Monteith and bulk transfer models. Agric. Water Manag. 217, 201–211 (2019).
Alchanatis, V. et al. Evaluation of different approaches for estimating and mapping crop water status in cotton with thermal imaging. Precis. Agric. 11, 27–41 (2010).
Bellvert, J. et al. Vineyard irrigation scheduling based on airborne thermal imagery and water potential thresholds. Aust. J. Grape Wine Res. 22, 307–315 (2016).
Maes, W. H. & Steppe, K. Estimating evapotranspiration and drought stress with ground-based thermal remote sensing in agriculture: A review. J. Exp. Bot. 63, 4671–4712 (2012).
Quebrajo, L., Perez-Ruiz, M., Pérez-Urrestarazu, L., Martínez, G. & Egea, G. Linking thermal imaging and soil remote sensing to enhance irrigation management of sugar beet. Biosyst. Eng. 165, 77–87 (2018).
Socías, F. X., Correia, M. J., Chaves, M. M. & Medrano, H. The role of abscisic acid and water relations in drought responses of subterranean clover. J. Exp. Bot. 48, 1281–1288 (1997).
Chaves, M. M. Effects of water deficits on carbon assimilation. J. Exp. Bot. 42, 1–16 (1991).
Costa, J. M., Grant, O. M. & Chaves, M. M. Thermography to explore plant-environment interactions. J. Exp. Bot. 64, 3937–3949 (2013).
Iseki, K. & Olaleye, O. A new indicator of leaf stomatal conductance based on thermal imaging for field grown cowpea. Plant Prod. Sci. 23, 136–147 (2020).
Jones, H. G. Use of thermography for quantitative studies of spatial and temporal variation of stomatal conductance over leaf surfaces. Plant Cell Environ. 22, 1043–1055 (1999).
Leinonen, I., Grant, O. M., Tagliavia, C. P. P., Chaves, M. M. & Jones, H. G. Estimating stomatal conductance with thermal imagery. Plant Cell Environ. 29, 1508–1518 (2006).
Idso, S. B., Jackson, R. D. & Reginato, R. J. Remote sensing of crop yields. Science 196, 19–25 (1977).
Jackson, R. D., Idso, S. B., Reginato, R. J. & Pinter, P. J. Canopy temperature as a crop water stress indicator. Water Resour. Res. 17, 1133–1138 (1981).
Idso, S. B., Reginato, R. J., Jackson, R. D. & Pinter, P. J. Jr. Foliage and air temperatures: Evidence for a dynamic equivalence point. Agric. Meteorol. 24, 223–226 (1981).
Kacira, M., Sase, S., Okushima, L. & Ling, P. P. Plant response-based sensing for control strategies in sustainable greenhouse production. J. Agric. Meteorol. 61, 15–22 (2005).
Sepulcre-Cantó, G. et al. Detection of water stress in an olive orchard with thermal remote sensing imagery. Agric. For. Meteorol. 136, 31–44 (2006).
Hughes, G. The evidential basis of decision making in plant disease management. Annu. Rev. Phytopathol. 55, 41–59 (2017).
Sancho, A. M., Moschini, R. C., Filippini, S., Rojas, D. & Ricca, A. Weather-based logistic models to estimate total fumonisin levels in maize kernels at export terminals in Argentina. Trop. Plant Pathol. 43, 99–108 (2018).
Breiman, L., Friedman, J., Olshen, R. & Stone, C. Classification and Regression Trees (Chapman and Hall, New York, 1984).
Razi, M. A. & Athappilly, K. A comparative predictive analysis of neural networks (NNs), nonlinear regression and classification and regression tree (CART) models. Expert Syst. Appl. 29, 65–74 (2005).
Anubha Pearline, S., Sathiesh Kumar, V. & Harini, S. A study on plant recognition using conventional image processing and deep learning approaches. J. Intell. Fuzzy Syst. 36, 1997–2004 (2019).
Cheng, Z. et al. Evaluation of classification and regression tree (CART) model in weight loss prediction following head and neck cancer radiation therapy. Adv. Radiat. Oncol. 3, 346–355 (2017).
Naghibi, S. A., Pourghasemi, H. R. & Dixon, B. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran. Environ. Monit. Assess. 188, 1–27 (2016).
Sánchez-Ortiz, A., Mateo-Sanz, J. M., Nadal, M. & Lampreave, M. Water stress assessment on grapevines by using classification and regression trees. Plant Direct 5, e00319 (2021).
Tu, Y.-K. et al. Establishing of early discrimination methods for drought stress of tomato by using environmental parameters and NIR spectroscopy in greenhouse. Acta Hortic. 1311, 501–512 (2021).
Yu, F. Y. & Xu, X. Z. A review on plant stress physiology. World For. Res. 16, 6–11 (2003).
Farquhar, G. D., von Caemmerer, S. & Berry, J. A. Models of photosynthesis. Plant Physiol. 125, 42–45 (2001).
McAdam, S. A. & Brodribb, T. J. The evolution of mechanisms driving the stomatal response to vapor pressure deficit. Plant Physiol. 167, 833–843 (2015).
Tan, C. S. Tomato yield-evapotranspiration relationships, seasonal canopy temperature and stomatal conductance as affected by irrigation. Can. J. Plant Sci. 73, 257–264 (1993).
Raschke, K. Stomatal action. Annu. Rev. Plant Physiol. 26, 309–340 (1975).
Urban, J., Ingwers, M., McGuire, M. A. & Teskey, R. O. Stomatal conductance increases with rising temperature. Plant Signal. Behav. 12, e1356534. https://doi.org/10.1080/15592324.2017.1356534 (2017).
Pataki, D. E., Oren, R., Katul, G. & Sigmon, J. Canopy conductance of Pinus taeda, Liquidambar styraciflua and Quercus phellos under varying atmospheric and soil water conditions. Tree Physiol. 18, 307–315 (1998).
Garcia, V., Sanchez, J. S. & Mollineda, R. A. On the effectiveness of preprocessing methods when dealing with different levels of class imbalance. Knowl. Based Syst. 25, 13–21 (2012).
Lin, W.-C., Tsai, C.-F., Hu, Y.-H. & Jhang, J.-S. Clustering-based undersampling in class-imbalanced data. Inf. Sci. 409–410, 17–26 (2017).
Douzas, G., Bacao, F., Fonseca, J. & Khudinyan, M. Imbalanced learning in land cover classification: Improving minority classes’ prediction accuracy using the Geometric SMOTE algorithm. Remote Sens. 11, 3040. https://doi.org/10.3390/rs11243040 (2019).
Fonseca, J., Douzas, G. & Bacao, F. Improving imbalanced land cover classification with k-means SMOTE: Detecting and oversampling distinctive minority spectral signatures. Information 12, 266. https://doi.org/10.3390/info12070266 (2021).
Henrard, S., Speybroeck, N. & Hermans, C. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia. Haemophilia 21, 715–722 (2015).
Westreich, D., Lessler, J. & Funk, M. J. Propensity score estimation: Neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J. Clin. Epidemiol. 63, 826–833 (2010).
Kurt, I., Ture, M. & Kurum, A. T. Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease. Expert Syst. Appl. 34, 366–374 (2008).
Irimia-Dieguez, A. I., Blanco-Oliver, A. & Vazquez-Cueto, M. J. A comparison of classification/regression trees and logistic regression in failure models. Procedia Econ. Financ. 23, 9–14 (2015).
Carter, G. A. & Knapp, A. K. Leaf optical properties in higher plants: Linking spectral characteristics to stress and chlorophyll concentration. Am. J. Bot. 88, 677–684 (2001).
Wang, S., Azzari, G. & Lobell, D. B. Crop type mapping without field-level labels: Random forest transfer and unsupervised clustering techniques. Remote Sens. Environ. 222, 303–317 (2019).
Elvanidi, A., Katsoulas, N., Ferentinos, K. P., Bartzanas, T. & Kittas, C. Hyperspectral machine vision as a tool for water stress severity assessment in soilless tomato crop. Biosyst. Eng. 165, 25–35 (2018).
Xia, J. et al. A cloud computing-based approach using the visible near-infrared spectrum to classify greenhouse tomato plants under water stress. Comput. Electron. Agric. 181, 105966. https://doi.org/10.1016/j.compag.2020.105966 (2021).
Funding
Funding was provided by the Ministry of Science and Technology (Grant Nos. 110-2321-B-055-001 and 110-2634-F-005-006).
Author information
Contributions
B.-J.K., S.-L.F., and Y.-K.T. conceptualized the study; S.-L.F. and Y.-K.T. designed the experiment; L.K. and T.-J.C. carried out the experiments; S.-L.F. performed the analyses; S.-L.F., Y.-K.T., and H.-W.C. prepared the original draft; B.-J.K. revised the manuscript; M.-H.Y. and B.-J.K. acquired funding; B.-J.K. supervised the work. All authors reviewed, contributed to writing, and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fang, SL., Tu, YK., Kang, L. et al. CART model to classify the drought status of diverse tomato genotypes by VPD, air temperature, and leaf–air temperature difference. Sci Rep 13, 602 (2023). https://doi.org/10.1038/s41598-023-27798-8