Introduction

Evaporation (Ep) is a highly non-linear physical process that is strongly affected by meteorological parameters, including temperature, wind speed, precipitation, and solar radiation1,2. As a main component of the water balance, it plays an extremely important role in the global hydrological cycle3,4,5. Accurate estimation of evaporation is a significant issue in ecological management6,7,8,9,10,11, especially in arid sandy land, where the stability and sustainability of artificially re-vegetated belts depend on the effective use of the limited available water resources12,13.

In general, direct measurement methods (e.g., Class A pan, lysimeter group) are largely restricted by the experimental conditions in dryland14,15,16. Physically-based methods (e.g., the Dalton model, the FAO-56 Penman–Monteith method) have the drawback that the estimated results are very sensitive to parameter errors17,18, and the key meteorological factors (e.g., relative humidity, latent heat of evaporation, radiation) are sometimes difficult to measure in arid sandy land19,20. Therefore, it is necessary to construct data-driven models that estimate Ep with less meteorological information.

Recently, various data-driven shallow machine learning (ML) models, e.g., artificial neural networks (ANN)11,20, radial basis function neural networks (RBFNN)21, multilayer artificial neural networks (MLNN)2,22, extreme learning machine (ELM)2,15, random forest (RF)7, and support vector machine (SVM)5,12,13,23, have been widely used to simulate Ep with incomplete meteorological variables. These models are highly capable of simulating the non-linear relationships between Ep and meteorological variables24,25. As the hyper-parameters of ML models determine the estimated results and accuracy, meta-heuristic algorithms, including the genetic algorithm (GA)6,26,27, particle swarm optimization (PSO)1,28, the whale optimization algorithm (WOA)2,12,29, the flower pollination algorithm (FPA)2, and the grey wolf optimizer (GWO)12,13, have been employed to obtain the optimal hyper-parameters of ML models. In addition, data preprocessing techniques, including the Kendall-τ correlation coefficient29,30 and entropy weight31, have been used to find effective input combinations for ML models. The literature shows that shallow ML models hybridized with meta-heuristic algorithms and data preprocessing techniques, referred to here as hybrid models, have higher estimation accuracy than standalone shallow ML models or physically-based methods2,7,32. Such models are recommended as the best choice for estimating Ep with limited meteorological information in different climate zones8,12,13,33,34,35.

Although shallow ML models hybridized with appropriate meta-heuristic algorithms and data preprocessing techniques have proven capable of estimating Ep in different regions2,6,7,32,33,34,35, the outputs of those hybrid models still contain large errors because the structure of shallow ML models cannot fully capture the non-linear relationships between the meteorological parameters and Ep11,13,19,36,37. To improve the estimation accuracy, deep learning models, e.g., recurrent neural network (RNN)36, deep neural network (DNN)37, temporal convolution neural network (TCNN)37, and long short-term memory (LSTM)12,38, have been employed to estimate Ep. The literature shows that LSTM in particular performs better than the other deep learning models and shallow ML models, and that deep learning models are an effective method for estimating Ep in different regions12,36,37,38. However, the hyper-parameters of LSTM are usually set subjectively or from experience, which inevitably leads to large estimation errors. The hyper-parameters of LSTM, including the number of hidden layers (NHL), the number of hidden units (NHU), epochs (E), the mini-batch size (MBS), and the learning rate (LR), directly determine the estimated results, yet few studies have used meta-heuristic algorithms to optimize them for more precise estimation of Ep.

In this paper, two typical ML models, i.e., LSTM and SVM, were selected as the main estimating modules; two new meta-heuristic algorithms, GWO and WOA, were employed to obtain the optimal hyper-parameters of the ML models; and the Kendall-τ correlation coefficient was employed to determine their input combinations. The proposed hybrid models, including Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, Kendall-τ-GWO-LSTM, and Kendall-τ-WOA-LSTM, were employed to estimate the monthly pan Ep with limited meteorological information, and their performance was tested using standard evaluation metrics. The aims of this study were (1) to provide a novel approach for monthly pan Ep estimation with limited meteorological variables; (2) to obtain more robust and precise estimates by coupling LSTM with heuristic algorithms and a data preprocessing technique; and (3) to find the optimal and minimum set of meteorological parameters to be observed in the study area. Compared to previous studies14,15,16,17,18,19,20,21,22,36,37,38, the proposed models simultaneously account for data preprocessing and hyper-parameter optimization of deep learning models, and can be recommended as an effective method to estimate Ep with limited meteorological information in dryland.

Materials and methods

Case study

This study was conducted in the Shapotou area (37°32′ N, 105°02′ E), Ningxia Hui Autonomous Region, China. Figure 1 shows the location map of the study area. This area is characterized by densely distributed trellis dunes and has a typical arid climate with scarce precipitation and high evaporation: the annual average precipitation is 180 mm and the annual average evaporation is 2520.4 mm (ref. 3). To prevent damage from sand erosion and promote regional ecological restoration, artificial sand-binding vegetation belts were established in 1956 and extended in subsequent years (1964, 1981 and 1987)4,12,13. Although revegetation has proved to be an effective approach for rehabilitation in arid sandy land39, ensuring the sustainability of artificial sand-binding vegetation under scarce precipitation and high Ep is challenging for ecologists and land managers. Therefore, accurate estimation of Ep is of great theoretical and practical significance for understanding regional drought, managing and applying limited water resources, and determining the composition, structure, spatial distribution, and scale of artificial sand-binding vegetation.

Figure 1

The location map of the study area (Using ArcGIS v. 10.8 software; Powered by ESRI “Environmental Systems Research Institute”, www.esri.com).

Data collection and analysis

The monthly meteorological variables needed for this study, including the monthly average temperature (T), the minimum air temperature (Tmin), the maximum air temperature (Tmax), the monthly precipitation (P), and the monthly average wind speed (WS), were compiled from the Shapotou Desert Research and Experiment Station for 1991 to 2018. The data from 1991–2010 were used as the training set, and the data from 2011–2018 were used as the validation set. Table 1 shows the minimum, maximum, mean, variance, skewness, and kurtosis of the measured meteorological parameters. As shown in Table 1, the average annual temperature in Shapotou during 1991–2018 was 10.8 ℃, with low- and high-temperature extremes of −26.2 ℃ and 40 ℃. The average monthly precipitation was 15.1 mm and the maximum monthly precipitation was 117.3 mm. The average monthly Ep was 210 mm and the average monthly wind speed was 2.8 m/s. The probability distributions of all meteorological parameters are skewed.

Table 1 The statistical characteristics of the collected meteorological parameters.

Kendall-τ correlation coefficient

The Kendall-τ correlation coefficient is generally used to measure the correlation between two random variables without any assumption of population distribution. The definition of the Kendall-τ correlation coefficient is

$$\tau = \left( \begin{gathered} n \hfill \\ 2 \hfill \\ \end{gathered} \right)^{ - 1} \sum\limits_{1 \le i < j \le n} {{\text{sgn}} ((a_{i} - a_{j} )(b_{i} - b_{j} ))}$$
(1)

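where \(a_{i}\) and \(b_{i}\) are the paired observations of the two variables, \(n\) is the sample size, and the sign function is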

$${\text{sgn}} (\alpha ) = \left\{ \begin{gathered} - 1,\alpha { < 0,} \hfill \\ 0,\alpha { = 0,} \hfill \\ 1,\alpha { > 0}{.} \hfill \\ \end{gathered} \right.$$
(2)
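As an illustration, the coefficient can be computed directly from the definition. The sketch below is a minimal pure-Python version that assumes the usual tau-a form, in which the sum of signs is divided by the number of pairs n(n − 1)/2; the function names and the toy data are hypothetical and are not the station records.

```python
from itertools import combinations


def sign(v):
    """Sign function of Eq. (2): -1, 0 or 1."""
    return (v > 0) - (v < 0)


def kendall_tau(a, b):
    """Kendall tau-a: sum of pairwise signs divided by the number of pairs."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    total = sum(
        sign((a[i] - a[j]) * (b[i] - b[j]))
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1) / 2)


# Hypothetical toy series (illustration only).
temperature = [2.1, 8.4, 15.0, 21.3, 24.8]
evaporation = [60.0, 150.0, 240.0, 310.0, 290.0]
tau = kendall_tau(temperature, evaporation)
print(round(tau, 3))
```

This direct form costs O(n²) pair comparisons, which is unproblematic for a monthly series of a few hundred values.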

Machine learning models

Long short-term memory (LSTM)

LSTM was designed to solve the vanishing-gradient problem in RNN40. The significant difference between LSTM and RNN is that LSTM addresses the long-term dependency problem by adding repeating modules (cells) that store information from previous nodes41. Thus, LSTM was employed to estimate the evaporation in the study area. Figure 2 shows the internal structure of the LSTM cell. Each memory cell consists of a forget gate \(f_{t}\), an input gate \(i_{t}\), and an output gate \(o_{t}\), which are updated in the iterative process with

$$f_{t} = \sigma \left( {w_{hf} \cdot h_{t - 1} + w_{xf} \cdot x_{t} + b_{f} } \right)$$
(3)
$$i_{t} = \sigma \left( {w_{hi} \cdot h_{t - 1} + w_{xi} \cdot x_{t} + b_{i} } \right)$$
(4)
$$\tilde{c}_{t} = \tanh \left( {w_{hc} \cdot h_{t - 1} + w_{xc} \cdot x_{t} + b_{c} } \right)$$
(5)
$$c_{t} = i_{t} * \tilde{c}_{t} + f_{t} * c_{t - 1}$$
(6)
$$o_{t} = \sigma \left( {w_{ho} \cdot h_{t - 1} + w_{xo} \cdot x_{t} + b_{o} } \right)$$
(7)
$$h_{t} = o_{t} * \tanh \left( {c_{t} } \right)$$
(8)
$$y_{t} = \sigma (w_{hy} \cdot h_{t} + b_{y} )$$
(9)
$$\sigma \left( x \right) = \left( {1 + e^{ - x} } \right)^{ - 1}$$
(10)
$$\tanh \left( x \right) = \frac{{e^{x} - e^{ - x} }}{{e^{x} + e^{ - x} }}$$
(11)
Figure 2

The internal structure of LSTM cell.
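To make the gate updates concrete, the following sketch performs one forward step of a single LSTM cell according to Eqs. (3)–(8), reading the inputs of every gate as \(h_{t-1}\) and \(x_{t}\). It is a plain-Python illustration with hypothetical names (`lstm_step`, `zero_params`) and list-of-lists weights, not the implementation used in the study, and it omits the output layer of Eq. (9).

```python
import math


def sigmoid(v):
    """Logistic function of Eq. (10)."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(w_h, h, w_x, x, b):
    """Row-wise w_h . h + w_x . x + b for list-of-lists weight matrices."""
    return [
        sum(wh * hj for wh, hj in zip(row_h, h))
        + sum(wx * xj for wx, xj in zip(row_x, x))
        + bias
        for row_h, row_x, bias in zip(w_h, w_x, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One cell update: returns the new hidden state h_t and cell state c_t."""
    f = [sigmoid(v) for v in affine(p["w_hf"], h_prev, p["w_xf"], x_t, p["b_f"])]
    i = [sigmoid(v) for v in affine(p["w_hi"], h_prev, p["w_xi"], x_t, p["b_i"])]
    c_tilde = [math.tanh(v)
               for v in affine(p["w_hc"], h_prev, p["w_xc"], x_t, p["b_c"])]
    # Eq. (6): keep part of the old memory and add part of the candidate.
    c = [ik * gk + fk * ck for ik, gk, fk, ck in zip(i, c_tilde, f, c_prev)]
    o = [sigmoid(v) for v in affine(p["w_ho"], h_prev, p["w_xo"], x_t, p["b_o"])]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]  # Eq. (8)
    return h, c


def zero_params(n_hidden, n_input):
    """Hypothetical all-zero parameter set, handy for checking shapes."""
    p = {}
    for gate in "fico":
        p["w_h" + gate] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["w_x" + gate] = [[0.0] * n_input for _ in range(n_hidden)]
        p["b_" + gate] = [0.0] * n_hidden
    return p


# Three normalized inputs (e.g. T, Tmax, Tmin) and two hidden units.
h_t, c_t = lstm_step([0.2, 0.5, 0.1], [0.0, 0.0], [1.0, -1.0], zero_params(2, 3))
```

With all-zero weights every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved, which gives a quick sanity check of the update order.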

Support vector machine (SVM)

SVM is a typical shallow ML model that has exhibited better performance than other ML models in solving nonlinear fitting problems by using the kernel trick and Vapnik–Chervonenkis theory23,42. Thus, SVM has been widely used to estimate Ep with limited meteorological variables in the field of hydrology5,12,13,23,29.

The regression coefficients are determined by solving the following problem

$$\begin{aligned} & \min \;\frac{1}{2}\left\| w \right\|^{2} + C\sum\limits_{i = 1}^{m} {(\xi_{i} + \eta_{i} )} \\ & s.t.\left\{ \begin{gathered} \left\langle {w \cdot x_{i} } \right\rangle + b - y_{i} \le \varepsilon + \xi_{i} \hfill \\ y_{i} - \left\langle {w \cdot x_{i} } \right\rangle - b \le \varepsilon + \eta_{i} \hfill \\ \xi_{i} ,\eta_{i} \ge 0,\;i = 1,2, \ldots ,m. \hfill \\ \end{gathered} \right. \\ \end{aligned}$$
(12)

The regression function \(R(x)\) can be obtained by using Karush–Kuhn–Tucker’s method, which is

$$R(x,\alpha ,\alpha^{*} ) = \sum\limits_{i = 1}^{m} {\left( {\alpha_{i} { - }\alpha_{i}^{*} } \right)} k(x,x_{i} ) + b,$$
(13)

where \(C > 0\) denotes the penalty coefficient, \(\xi_{i}\) and \(\eta_{i}\) are the slack variables, \(\varepsilon\) is the tolerance of the insensitive loss, and \(\alpha_{i}\) and \(\alpha_{i}^{*}\) are the Lagrange multipliers. The kernel function is

$$k(x,x_{i} ) = \exp \left( { - G\parallel x - x_{i} \parallel^{2} } \right),$$
(14)

where \(G = 0.5\sigma^{ - 2}\) controls the radius (width) of \(k(x,x_{i} )\).
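For illustration, the regression function of Eq. (13) can be evaluated as follows once the multipliers are known. This is a minimal sketch that assumes the usual Gaussian kernel with a squared Euclidean distance; the names `rbf_kernel` and `svr_predict`, the support vectors, and the coefficient values are hypothetical.

```python
import math


def rbf_kernel(x, x_i, G):
    """Gaussian kernel exp(-G * ||x - x_i||^2) with G = 1 / (2 sigma^2)."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(x, x_i))
    return math.exp(-G * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, b, G):
    """Eq. (13): sum_i (alpha_i - alpha_i*) k(x, x_i) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    return sum(
        coef * rbf_kernel(x, sv, G)
        for coef, sv in zip(dual_coefs, support_vectors)
    ) + b


# Hypothetical fitted quantities (illustration only).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [2.0, -1.0]
prediction = svr_predict([0.0, 0.0], support_vectors, dual_coefs, b=0.5, G=0.5)
```

A larger G narrows the kernel so that each support vector influences only nearby inputs, which is why C and G are the two quantities tuned later by the meta-heuristic algorithms.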

Meta-heuristic algorithms

Grey wolf optimizer (GWO) algorithm

The GWO algorithm is a new meta-heuristic algorithm whose search process is inspired by the population hierarchy and predation behavior of grey wolves43. Figure 3 shows the population hierarchy of grey wolves and the position updating process of GWO, where \(\alpha ,\beta ,\delta\) and \(\omega\) represent the grey wolves at the different levels of the hierarchy, with dominance decreasing in that order. In the simulation process, the distance and position vectors of the different hierarchies are updated as

$$\overrightarrow {{{\mathbf{D}}_{\alpha } }} = \left| {\overrightarrow {{{\mathbf{C}}_{1} }} \overrightarrow {{{\mathbf{X}}_{\alpha }^{{^{p} }} (t)}} - \overrightarrow {{{\mathbf{X}}(t)}} } \right|,\overrightarrow {{{\mathbf{D}}_{\beta } }} = \left| {\overrightarrow {{{\mathbf{C}}_{2} }} \overrightarrow {{{\mathbf{X}}_{\beta }^{{^{p} }} (t)}} - \overrightarrow {{{\mathbf{X}}(t)}} } \right|,\overrightarrow {{{\mathbf{D}}_{\delta } }} = \left| {\overrightarrow {{{\mathbf{C}}_{3} }} \overrightarrow {{{\mathbf{X}}_{\delta }^{{^{p} }} (t)}} - \overrightarrow {{{\mathbf{X}}(t)}} } \right|,$$
(15)
$$\overrightarrow {{{\mathbf{X}}_{1} (t + 1)}} = \overrightarrow {{{\mathbf{X}}_{\alpha }^{p} (t)}} - \overrightarrow {{{\mathbf{A}}_{1} }} \overrightarrow {{{\mathbf{D}}_{\alpha } }} ,\overrightarrow {{{\mathbf{X}}_{2} (t + 1)}} = \overrightarrow {{{\mathbf{X}}_{\beta }^{p} (t)}} - \overrightarrow {{{\mathbf{A}}_{2} }} \overrightarrow {{{\mathbf{D}}_{\beta } }} ,\overrightarrow {{{\mathbf{X}}_{3} (t + 1)}} = \overrightarrow {{{\mathbf{X}}_{\delta }^{p} (t)}} - \overrightarrow {{{\mathbf{A}}_{3} }} \overrightarrow {{{\mathbf{D}}_{\delta } }} ,$$
(16)
$$\overrightarrow {{{\mathbf{X}}(t + 1)}} = \frac{{\left[ {\overrightarrow {{{\mathbf{X}}_{1} (t + 1)}} + \overrightarrow {{{\mathbf{X}}_{2} (t + 1)}} + \overrightarrow {{{\mathbf{X}}_{3} (t + 1)}} } \right]}}{3},$$
(17)

where the coefficient vectors are \(\overrightarrow {{\mathbf{A}}} = \overrightarrow {{{\varvec{\upalpha}}}} (2\overrightarrow {{{\mathbf{r}}_{1} }} - 1)\) and \(\overrightarrow {{\mathbf{C}}} = 2\overrightarrow {{{\mathbf{r}}_{2} }}\), the random vectors satisfy \(\overrightarrow {{{\mathbf{r}}_{1} }} ,\overrightarrow {{{\mathbf{r}}_{2} }} \in \left[ {0,1} \right]\), and the attenuation factor \(\overrightarrow {{{\varvec{\upalpha}}}}\) decreases from 2 to 0. For a more detailed description of GWO, we refer to Mirjalili et al.43 (Fig. 3).

Figure 3

The population hierarchy of grey wolves and the position updating process of GWO.
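The update rules of Eqs. (15)–(17) can be sketched as below. This is an illustrative pure-Python version under stated assumptions: the attenuation factor decays linearly, positions are clipped to the search bounds, and the objective is a hypothetical stand-in (`toy_objective`) for the validation error of a model; none of these details is confirmed by the text.

```python
import random


def gwo_step(wolves, fitness, a, rng):
    """One GWO iteration; lower fitness is better."""
    alpha, beta, delta = sorted(wolves, key=fitness)[:3]
    new_wolves = []
    for x in wolves:
        new_x = []
        for d in range(len(x)):
            candidates = []
            for leader in (alpha, beta, delta):
                A = a * (2.0 * rng.random() - 1.0)  # A = a (2 r1 - 1)
                C = 2.0 * rng.random()              # C = 2 r2
                D = abs(C * leader[d] - x[d])       # Eq. (15)
                candidates.append(leader[d] - A * D)  # Eq. (16)
            new_x.append(sum(candidates) / 3.0)     # Eq. (17)
        new_wolves.append(new_x)
    return new_wolves


def gwo_minimize(fitness, bounds, n_wolves=10, n_iter=50, seed=0):
    """Minimize `fitness` inside box `bounds`; needs at least three wolves."""
    if n_wolves < 3:
        raise ValueError("GWO needs at least three wolves")
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best = min(wolves, key=fitness)
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # assumed linear decay from 2 towards 0
        wolves = gwo_step(wolves, fitness, a, rng)
        # Assumed boundary handling: clip each coordinate to its interval.
        wolves = [[min(max(v, lo), hi) for v, (lo, hi) in zip(w, bounds)]
                  for w in wolves]
        best = min(wolves + [best], key=fitness)
    return best


def toy_objective(v):
    """Hypothetical objective standing in for a model's validation RMSE."""
    return sum(c * c for c in v)


search_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_position = gwo_minimize(toy_objective, search_bounds, n_iter=30, seed=1)
```

In a hyper-parameter search each coordinate would encode one hyper-parameter (for example C and G of the SVM), and integer-valued ones would need rounding before the model is trained.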

Figure 4

The flow chart of the estimating processes.

Whale optimization algorithm (WOA)

The WOA is inspired by the bubble-net feeding behavior of the humpback whale44. In the iterative process, the location vector of the prey \(\overrightarrow {{{\mathbf{X}}^{ * } (t)}}\) is regarded as the current best candidate solution, and the humpback whale updates its position vector \(\overrightarrow {{{\mathbf{X}}(t)}}\) by either encircling the prey or moving along a spiral-shaped path, namely

$$\overrightarrow {{{\mathbf{X}}(t + 1)}} = \left\{ \begin{gathered} \overrightarrow {{{\mathbf{X}}^{ * } (t)}} - \overrightarrow {{\mathbf{A}}} \left| {C\overrightarrow {{{\mathbf{X}}^{ * } (t)}} - \overrightarrow {{{\mathbf{X}}(t)}} } \right|,\quad {\text{if}}\;p < 0.5, \hfill \\ \left| {C\overrightarrow {{{\mathbf{X}}^{ * } (t)}} - \overrightarrow {{{\mathbf{X}}(t)}} } \right|e^{bl} \cos (2\pi l) + \overrightarrow {{{\mathbf{X}}^{ * } (t)}} ,\quad {\text{if}}\;p \ge 0.5, \hfill \\ \end{gathered} \right.$$
(18)

the attenuation coefficient vectors

$$\overrightarrow {{\mathbf{A}}} = (2r - 1)\overrightarrow {{\mathbf{\alpha }}} ,$$
(19)

and

$$C = 2r ,$$
(20)

where b is a constant, r is a random variable in [0,1], the vector \(\overrightarrow {{\mathbf{\alpha }}}\) decreases from 2 to 0, and l varies from 0 to 1. The random number p determines whether the search process enters the bubble-net attack stage or performs the global search mechanism44.
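A single position update following the two branches of Eq. (18) might look as follows. The sketch makes several assumptions that the text does not settle: the spiral term is taken as \(e^{bl}\cos(2\pi l)\) with l drawn from [0, 1], the random numbers in Eqs. (19) and (20) are drawn independently, and b defaults to 1. The function name `woa_update` is hypothetical.

```python
import math
import random


def woa_update(x, x_best, a, rng, b=1.0):
    """Move one whale `x` relative to the current best solution `x_best`."""
    p = rng.random()
    A = (2.0 * rng.random() - 1.0) * a  # Eq. (19), scalar form
    C = 2.0 * rng.random()              # Eq. (20), independent draw (assumed)
    if p < 0.5:
        # Encircling branch: step towards (or around) the best solution.
        return [xb - A * abs(C * xb - xi) for xb, xi in zip(x_best, x)]
    # Spiral branch: distance to the best solution scaled by a log spiral.
    l = rng.random()
    spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    return [abs(C * xb - xi) * spiral + xb for xb, xi in zip(x_best, x)]


rng = random.Random(0)
new_position = woa_update([0.0, 4.0], [1.0, 2.0], a=2.0, rng=rng)
```

Because a shrinks over the iterations, the encircling steps contract around the best solution, while the spiral branch keeps some variation in the step size.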

Hybrid models

In this study, the hybrid models, including Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, Kendall-τ-GWO-LSTM, and Kendall-τ-WOA-LSTM, were proposed and employed to estimate the monthly pan Ep in the study area with incomplete meteorological information. Here, Kendall-τ-WOA-LSTM denotes the LSTM coupled with the WOA algorithm and the Kendall-τ correlation coefficient; the names Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, and Kendall-τ-GWO-LSTM are formed in the same way. Figure 4 schematically illustrates the estimating process in this study. As shown in Fig. 4, the estimating process includes three modules: the data pre-processing module, the parameter optimization module, and the model evaluation module. The main steps are as follows:

Step 1. The Kendall-τ correlation coefficient was employed to recognize the effective input variables of each ML model, and the training and testing data were normalized by using the min-max normalization method.

Step 2. SVM and LSTM were selected as the main estimating modules to accurately estimate the evaporation in the study area.

Step 3. WOA and GWO were used to find the best penalty coefficient (C) and radius (G) of the SVM, and determine the optimal hyper-parameters of LSTM, including NHL, NHU, E, MBS, and LR, respectively.

Step 4. The root mean squared error (RMSE) was used to choose the best hybrid models with optimal hyper-parameters from 5 replications for each fixed meteorological parameter, and the optimal input meteorological parameters were determined according to the model performance.

Step 5. The estimation performance of the proposed models was compared by using the standard statistical metrics.

Step 6. The optimal estimating model was determined based on the evaluation results.
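The data handling in Step 1 can be sketched as follows. The example splits the records chronologically (1991–2010 for training, 2011–2018 for testing, as described in the data section) and applies min-max normalization. Fitting the scaling bounds on the training period only is an assumption made here to avoid leaking test information; the function names and sample values are hypothetical.

```python
def split_by_year(records, last_training_year=2010):
    """Split (year, feature_row) records into training and testing rows."""
    train = [row for year, row in records if year <= last_training_year]
    test = [row for year, row in records if year > last_training_year]
    return train, test


def fit_min_max(rows):
    """Column-wise minima and maxima of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each column to [0, 1] relative to the fitted bounds."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


# Hypothetical monthly records: (year, [T, P]).
records = [
    (2009, [0.0, 10.0]),
    (2010, [10.0, 30.0]),
    (2011, [5.0, 20.0]),
]
train_rows, test_rows = split_by_year(records)
mins, maxs = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, mins, maxs)
test_scaled = apply_min_max(test_rows, mins, maxs)
```

With bounds taken from the training period, test values outside the training range fall outside [0, 1], which is expected behavior rather than an error.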

Evaluation metrics

In this paper, the evaluation metrics, including RMSE22, the normalized mean squared error (NMSE)12, the mean absolute error (MAE)9,13,22, the mean absolute percentage error (MAPE)14,22, and the Nash–Sutcliffe coefficient of efficiency (NSCE)12,13, were employed to assess the model performance. The definitions of these evaluation indexes are as follows:

$$RMSE = \sqrt {\frac{{1}}{n}\sum\limits_{i = 1}^{n} {\left( {Ep_{i} - \widehat{Ep}_{i} } \right)^{2} } }$$
(21)
$$NMSE = \frac{{1}}{n}\sum\limits_{i = 1}^{n} {\left( {\frac{{Ep_{i} - \widehat{Ep}_{i} }}{{Ep_{i} }}} \right)^{2} }$$
(22)
$$MAE = \frac{{1}}{n}\sum\limits_{i = 1}^{n} {\left| {Ep_{i} - \widehat{Ep}_{i} } \right|}$$
(23)
$$MAPE = \frac{{1}}{n}\sum\limits_{i = 1}^{n} {\left| {\frac{{Ep_{i} - \widehat{Ep}_{i} }}{{Ep_{i} }}} \right|} \times 100\%$$
(24)
$$NSCE = 1 - \sum\limits_{i = 1}^{n} {\left( {Ep_{i} - \widehat{Ep}_{i} } \right)^{2} } /\sum\limits_{i = 1}^{n} {\left( {Ep_{i} - \overline{Ep} } \right)^{2} }$$
(25)

where \(Ep_{i}\) and \(\widehat{Ep}_{i}\) denote the desired and actual outputs, \(\overline{Ep}\) is the mean of the desired outputs, and \(n\) is the number of samples. RMSE, NMSE, MAE, and MAPE describe the error of the estimated results: the closer these metrics are to 0, the closer the outputs of the proposed models are to the desired results. Thus, RMSE, NMSE, MAE, and MAPE are regarded as negative statistical metrics12,13. NSCE describes the model efficiency and measures the goodness of fit: an NSCE close to 1 indicates a good fit. Thus, NSCE is regarded as a positive evaluation metric12,13. The list of abbreviations used in this manuscript is shown in Table 2.
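The five metrics of Eqs. (21)–(25) translate directly into code. The following is a minimal sketch in which the NSCE denominator uses the mean of the observed series; the function name `evaluation_metrics` and the sample values are hypothetical.

```python
import math


def evaluation_metrics(observed, estimated):
    """Return RMSE, NMSE, MAE, MAPE (%) and NSCE for two equal-length series."""
    n = len(observed)
    errors = [o - e for o, e in zip(observed, estimated)]
    mean_obs = sum(observed) / n
    sq_err = sum(err ** 2 for err in errors)
    return {
        "RMSE": math.sqrt(sq_err / n),
        "NMSE": sum((err / o) ** 2 for err, o in zip(errors, observed)) / n,
        "MAE": sum(abs(err) for err in errors) / n,
        "MAPE": 100.0 * sum(abs(err / o) for err, o in zip(errors, observed)) / n,
        "NSCE": 1.0 - sq_err / sum((o - mean_obs) ** 2 for o in observed),
    }


# Hypothetical monthly Ep values in mm (illustration only).
metrics = evaluation_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```

NMSE and MAPE divide by the observed value, so they are undefined for months with zero observed Ep and weight errors in low-evaporation months more heavily.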

Table 2 List of abbreviations.

Results

As mentioned above, SVM and LSTM were used as the main modules to compute the monthly evaporation. To determine the input combinations of the ML models, the Kendall correlation coefficients between Ep and the meteorological variables T, Tmax, Tmin, P, and WS were calculated and are shown in Table 3.

Table 3 The Kendall correlation coefficient between meteorological variables and Ep.

Table 3 shows that T, Tmax, and Tmin have the highest correlation with evaporation, followed by WS and P; the Kendall correlation coefficients are 0.731, 0.725, 0.636, 0.418, and 0.386, respectively. With a Kendall correlation coefficient greater than 0.5 as the threshold, T, Tmax, and Tmin were selected as the fixed input variables of all ML models. The resulting input combinations are C1 (T, Tmax, Tmin, P, WS), C2 (T, Tmax, Tmin, WS), C3 (T, Tmax, Tmin, P), and C4 (T, Tmax, Tmin). Each of the combinations C1, C2, C3, and C4 was fed into the SVM and the LSTM to estimate the monthly Ep. The input dimension of each ML model equals the number of input variables.
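The construction of the input combinations can be reproduced with a few lines. The sketch below uses the coefficients quoted from Table 3 and the 0.5 threshold; the helper name `input_combinations` is hypothetical, and the order of the variables inside a combination is not meaningful.

```python
from itertools import combinations

# Kendall coefficients with Ep as quoted in the text for Table 3.
tau_with_ep = {"T": 0.731, "Tmax": 0.725, "Tmin": 0.636, "WS": 0.418, "P": 0.386}


def input_combinations(tau, threshold=0.5):
    """Fixed inputs exceed the threshold; the rest are added in every subset."""
    fixed = [name for name, value in tau.items() if value > threshold]
    optional = [name for name, value in tau.items() if value <= threshold]
    combos = []
    for k in range(len(optional), -1, -1):
        for extra in combinations(optional, k):
            combos.append(fixed + list(extra))
    return combos


combos = input_combinations(tau_with_ep)
for label, combo in zip(("C1", "C2", "C3", "C4"), combos):
    print(label, combo)
```

With two optional variables this yields the four combinations C1 to C4; with k optional variables the number of candidate input sets would grow as 2^k.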

GWO and WOA are new, efficient meta-heuristic optimization techniques inspired by the predation behavior of grey wolves and humpback whales, respectively43,44. At present, these two algorithms are widely used to optimize the hyper-parameters of shallow ML models and show better ergodicity and global optimization capacity than other heuristic algorithms12,13,29. However, few studies have used GWO or WOA to optimize deep learning models, in particular to find the optimal hyper-parameters of LSTM in the hydrological field. In this study, to overcome the sensitivity of ML models to parameter selection, the heuristic algorithms (GWO and WOA) were employed to find the optimal hyper-parameters of SVM and LSTM. Table 4 shows the parameter settings of the proposed models.

Table 4 The parameters setting of the proposed models.

Owing to the randomness of some parameters in the heuristic algorithms, the outputs of the hybrid models were not identical between runs. Thus, the relevant hyper-parameters and estimation accuracy of each hybrid model were recorded from five replications. Tables 5, 6, 7, 8 show the optimal parameters of each proposed model obtained by the heuristic algorithms in the training stage, together with the evaluation indexes (the estimating results of the proposed models with different input combinations and optimal hyper-parameters are shown in the Supplementary File). It should be noted that the optimal hyper-parameters and evaluation metrics of the hybrid models with different input combinations are marked in bold. For example, Table 5 shows that the optimal hyper-parameters of the hybrid Kendall-τ-GWO-SVM model in the training stage are C1 (C = 214.76, G = 0.001), C2 (C = 700.49, G = 0.014), C3 (C = 339.44, G = 0.013), and C4 (C = 434.08, G = 0.063); the minimum MAPE values with the input combinations C1, C2, C3, and C4 in the testing stage are 30.71%, 30.34%, 26.97%, and 32.32%, and the maximum NSCE values are 0.74, 0.72, 0.77, and 0.76, respectively (the results of the other evaluation metrics are omitted here). Table 7 shows that the optimal hyper-parameters of the hybrid Kendall-τ-GWO-LSTM model with input combinations C1–C4 in the training stage are C1 (NHL = 6, NHU = 15, E = 96, MBS = 24, LR = 0.003), C2 (NHL = 10, NHU = 51, E = 39, MBS = 44, LR = 0.008), C3 (NHL = 52, NHU = 89, E = 68, MBS = 16, LR = 0.007), and C4 (NHL = 47, NHU = 93, E = 57, MBS = 20, LR = 0.005); the minimum MAPE values with the input combinations C1, C2, C3, and C4 in the testing stage are 26.17%, 27.97%, 23.03%, and 19.96%, and the maximum NSCE values are 0.81, 0.80, 0.86, and 0.89, respectively. Tables 6 and 8 are read in the same way as Tables 5 and 7.

Table 5 The optimal hyper-parameters and model performance of Kendall-τ-GWO-SVM.
Table 6 The optimal hyper-parameters and model performance of Kendall-τ-WOA-SVM.
Table 7 The optimal hyper-parameters and model performance of Kendall-τ-GWO-LSTM.
Table 8 The optimal hyper-parameters and model performance of Kendall-τ-WOA-LSTM.

The scatter plots of the desired and actual outputs of each model with optimal hyper-parameters and input combinations are shown in Fig. 5. As shown in Fig. 5, the hybrid Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, Kendall-τ-GWO-LSTM, and Kendall-τ-WOA-LSTM models can be used to compute the monthly Ep and achieve high accuracy with limited meteorological information. The coefficients of the regression lines are all greater than 1 except for that of the hybrid Kendall-τ-GWO-LSTM model, suggesting that the hybrid Kendall-τ-WOA-SVM, Kendall-τ-GWO-SVM, and Kendall-τ-WOA-LSTM models overestimated the monthly Ep, whereas the hybrid Kendall-τ-GWO-LSTM model underestimated it to a certain extent. To further compare the performance of the hybrid Kendall-τ-WOA-SVM, Kendall-τ-GWO-SVM, Kendall-τ-WOA-LSTM, and Kendall-τ-GWO-LSTM models, a Taylor diagram is shown in Fig. 6. A Taylor diagram displays the standard deviation, RMSE, and Pearson correlation coefficient on a two-dimensional chart, which provides an intuitive way to compare model performance and reflects the simulation capability of the proposed models10,11,18,35. On the whole, Fig. 6 shows that the hybrid Kendall-τ-GWO-LSTM model has a higher Pearson correlation coefficient and a smaller standard deviation and RMSE than the hybrid Kendall-τ-WOA-SVM, Kendall-τ-GWO-SVM, and Kendall-τ-WOA-LSTM models, indicating that the hybrid Kendall-τ-GWO-LSTM model performs better than the other hybrid models.

Figure 5

The scatter plots of the observed and estimated results of the proposed models. The blue line inside each panel denotes the fitted line between the observed and estimated results with the coefficient of determination (R2). (A) The results of the hybrid Kendall-τ-WOA-SVM model with C = 339.44 and G = 0.013. (B) The results of the hybrid Kendall-τ-GWO-SVM model with C = 145.35 and G = 0.013. (C) The results of the hybrid Kendall-τ-WOA-LSTM model with NHL = 63, NHU = 76, E = 46, MBS = 29, and LR = 0.005. (D) The results of the hybrid Kendall-τ-GWO-LSTM model with NHL = 47, NHU = 93, E = 57, MBS = 20, and LR = 0.005.

Figure 6

Taylor diagrams of the hybrid Kendall-τ-WOA-SVM, Kendall-τ-GWO-SVM, and Kendall-τ-WOA-LSTM models with optimal hyper-parameters and input combinations.

Discussion

The accuracy of the proposed models depends on the input combination of meteorological variables, so finding the optimal input combination can effectively improve the estimation accuracy. As shown in Tables 5, 6, 7, 8, the accuracy follows different trends with different input meteorological variables. Taking the hybrid Kendall-τ-GWO-SVM model as an example, when the input meteorological variables are T, Tmax, Tmin, P, and WS, the ranges of MAE, MAPE, RMSE, NMSE, and NSCE are [43.27, 45.70], [30.71%, 33.90%], [55.18, 55.68], [0.17, 0.19], and [0.73, 0.74], respectively; when the input meteorological variables are T, Tmax, Tmin, and P, the ranges of MAE, MAPE, RMSE, and NMSE are [39.79, 39.84], [26.97%, 27.11%], [51.84, 51.86], and [0.14, 0.14], respectively, and the maximum NSCE is 0.77. Thus, the accuracy of Kendall-τ-GWO-SVM improved when the input meteorological variables were optimized.

The statistical metrics in Tables 5, 6, 7, 8 show that RMSE, NMSE, MAE, and MAPE are not necessarily consistent with each other, which can cause confusion when different evaluation indexes are chosen as the main benchmark to evaluate the model performance or to find the optimal parameters of the proposed models. Since MAPE and NSCE are dimensionless quantities, their results are relatively more stable than those of the other evaluation indexes12,13. Thus, MAPE and NSCE were employed to determine the optimal input combination in this study (the discussion of the other evaluation metrics is similar). As shown in Tables 5 and 6, the optimal and minimum input meteorological parameters of the hybrid Kendall-τ-GWO-SVM and Kendall-τ-WOA-SVM models are T, Tmax, Tmin, and P; the minimum MAPE is 26.97% and the maximum NSCE is 0.77 from five replications. Tables 7 and 8 show that the optimal input meteorological parameters of the hybrid Kendall-τ-GWO-LSTM and Kendall-τ-WOA-LSTM models are T, Tmax, and Tmin. The minimum MAPE and the maximum NSCE of the hybrid Kendall-τ-GWO-LSTM model are 19.96% and 0.89; for the hybrid Kendall-τ-WOA-LSTM model, they are 21.30% and 0.88. On the whole, the hybrid Kendall-τ-GWO-LSTM and Kendall-τ-WOA-LSTM models outperformed the hybrid Kendall-τ-GWO-SVM and Kendall-τ-WOA-SVM models and need fewer meteorological parameters to be observed.

To test whether there is a significant difference in the estimation accuracy of the proposed models under the same input combination, the Kruskal–Wallis (K–W) test was performed on MAE, MAPE, NMSE, RMSE, and NSCE in the validation stage. The K–W test is a non-parametric test that does not require the tested variables to follow a normal distribution45. Its null hypothesis is that there is no significant difference between the tested variables, and the level of significance was set to \(\alpha = 0.05\). The results of the K–W test are shown in Table 9.
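For a pairwise comparison of two models, the K–W statistic and its p-value can be sketched in pure Python as below. This illustration assumes two groups, so that the statistic is referred to a chi-square distribution with one degree of freedom, and it applies the standard tie correction. The function name and the sample values are hypothetical, and with only five replications per model the chi-square approximation is rough.

```python
import math


def kruskal_wallis_two_groups(x, y):
    """Tie-corrected H statistic and chi-square (df = 1) p-value."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2.0 + 1.0  # average rank of the tied block
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all observations are identical")
    rank_sums = [0.0, 0.0]
    for (_, g), r in zip(pooled, ranks):
        rank_sums[g] += r
    h = 12.0 / (n * (n + 1)) * sum(
        rs ** 2 / m for rs, m in zip(rank_sums, (len(x), len(y)))
    ) - 3.0 * (n + 1)
    h = max(h / correction, 0.0)
    # Survival function of chi-square with one degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Hypothetical MAPE values (%) from five replications of two models.
h_stat, p_val = kruskal_wallis_two_groups(
    [27.0, 27.1, 27.3, 27.5, 27.8], [21.9, 22.4, 22.8, 23.1, 23.5]
)
```

A p-value below 0.05 would reject the null hypothesis of no difference between the two sets of replications.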

Table 9 The p-values of the K-W test.

Table 9 shows that the p-values of the K–W test between the hybrid Kendall-τ-GWO-SVM and Kendall-τ-WOA-SVM models are all greater than 0.05, which means that there is no significant difference in the estimation accuracy of these two models with the same input combination. The p-values of the K–W test between the shallow ML models and the deep learning models are all less than 0.05, suggesting that there is a significant difference in the estimation accuracy under the same input combination. For the hybrid Kendall-τ-GWO-LSTM and Kendall-τ-WOA-LSTM models, the p-values of the K–W test are all greater than 0.05, suggesting that these two models differ little in their estimation of Ep with limited meteorological parameters.

To compare the performance of the hybrid Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, Kendall-τ-GWO-LSTM, and Kendall-τ-WOA-LSTM models, the averages of the performance indexes in the testing stage were calculated and are shown in Table 10. It should be noted that the minimum averages of MAE, RMSE, and MAPE and the maximum average of NSCE are marked in bold. Table 10 shows that, for the hybrid Kendall-τ-WOA-SVM model, the minimum average MAPE is 28.10% and the maximum average NSCE is 0.77 when the input meteorological parameters are T, Tmax, Tmin, and P. With the same input combination, the hybrid Kendall-τ-GWO-SVM model performed slightly better: the minimum average MAPE decreased from 28.10% to 27.03%, while the maximum average NSCE remained 0.77.

Table 10 The evaluation metrics average of the proposed models with different input combinations in testing stage.

Although both the hybrid Kendall-τ-WOA-SVM and Kendall-τ-GWO-SVM models can be used to simulate Ep with limited meteorological parameters, the estimation accuracy of these two models needs to be further improved, since shallow ML models cannot fully extract the nonlinear and dynamic features linking the meteorological parameters and Ep. As shown in Table 10, the minimum average MAPE values of the hybrid Kendall-τ-GWO-LSTM and Kendall-τ-WOA-LSTM models are 21.91% and 23.51%, and the maximum average NSCE values are 0.87 and 0.84, implying that the estimation accuracy is significantly improved. Compared with Kendall-τ-WOA-SVM, the minimum average of MAPE decreased from 28.10 to 21.91%, and the maximum average of NSCE increased from 0.77 to 0.88, which means that the deep learning models significantly improved the estimation accuracy. In addition, the optimal and minimum input meteorological parameters of the hybrid Kendall-τ-GWO-LSTM and Kendall-τ-WOA-LSTM models are T, Tmax, and Tmin, suggesting that deep learning models need fewer meteorological parameters to be observed than shallow ML models.

Figure 7 shows the averages of the performance indexes of the proposed models with different input combinations. As shown in Fig. 7, the statistical metrics of the hybrid Kendall-τ-GWO-LSTM and Kendall-τ-WOA-LSTM models were similar to each other in the testing stage, suggesting that both models can be employed to estimate Ep in dryland. However, the negative evaluation indexes of Kendall-τ-GWO-LSTM are all smaller than those of Kendall-τ-WOA-LSTM, and NSCE shows the opposite trend, which means that the hybrid Kendall-τ-GWO-LSTM model performed better than the hybrid Kendall-τ-WOA-LSTM model and that GWO can obtain the optimal hyper-parameters of LSTM more effectively than WOA. Therefore, the hybrid Kendall-τ-GWO-LSTM model is strongly recommended for estimating Ep with limited meteorological parameters in dryland.

Figure 7

The performance indexes average of the proposed models with different input combinations. (A) RMSE. (B) NMSE. (C) MAE. (D) MAPE. (E) NSCE.

Conclusion

In this study, four novel data-driven models, including the hybrid Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, Kendall-τ-GWO-LSTM, and Kendall-τ-WOA-LSTM models, were proposed to estimate the monthly Ep with limited meteorological parameters. The proposed models simultaneously optimize the input meteorological variables and the hyper-parameters. The results illustrate that the optimal input meteorological parameters of the hybrid Kendall-τ-GWO-SVM (with C = 145.35 and G = 0.013) and Kendall-τ-WOA-SVM (with C = 339.44 and G = 0.013) models are T, Tmax, Tmin, and P; the minimum MAPE for both models is 26.97%, and the maximum NSCE is 0.77. The optimal input meteorological parameters of the hybrid Kendall-τ-GWO-LSTM (with NHL = 47, NHU = 93, E = 57, MBS = 20, and LR = 0.005) and Kendall-τ-WOA-LSTM (with NHL = 63, NHU = 76, E = 46, MBS = 29, and LR = 0.005) models are T, Tmax, and Tmin; the minimum MAPE values are 19.96% and 21.30%, and the maximum NSCE values are 0.89 and 0.88, respectively. These results suggest that Kendall-τ-GWO-LSTM outperformed the Kendall-τ-GWO-SVM, Kendall-τ-WOA-SVM, and Kendall-τ-WOA-LSTM models and needs fewer meteorological parameters to be observed. Therefore, the hybrid Kendall-τ-GWO-LSTM model can be highly recommended for estimating Ep in dryland when adequate meteorological parameters are unavailable.

Although the deep learning models coupled with heuristic algorithms and data preprocessing techniques perform considerably better than the shallow ML models, the transferability of the proposed models to other locations needs to be further tested. In addition, the estimation modules rely on only one or two ML models, so the estimation results inevitably contain systematic overestimation or underestimation, which introduces a model selection risk. Further work will focus on constructing a combination model that integrates multiple ML models to obtain more robust estimates in different bioclimatic zones.