Introduction

One of the main concerns of hydraulic engineers in Iran is the optimal management of limited water resources. The ever-increasing national investment in water projects calls for optimizing water control and management projects in order to save national capital1,2,3. In recent years, hydraulic engineers have tried to measure discharge with adequate accuracy by building and installing measuring structures in channels. Labyrinth weirs are among the common structures in many dams and water transfer channels, where they are used for draining, measuring, and controlling the water level4, 5. These weirs are among the most practical surface structures and have recently attracted the attention of various researchers. Compared with other weirs, pseudo-cosine labyrinth weirs (PCLW), with their long crest, perform well in regulating the water level. Numerous parameters affect the discharge coefficient (Cd) of labyrinth weirs with different plans. These include the upstream total hydraulic head (Hu), downstream hydraulic head (Hd), weir height (W), radius (R), number of cycles (N), shape of the weir crest (CR), nappe collision (Na), the approach flow conditions (AF), etc.4. Nowadays, rising costs, long turnaround times, and the risk of human error have led to the use of 3D and computer models6, 7. Since manual calculations may involve human error, novel intelligent methods such as meta-heuristic algorithms, artificial neural networks, and fuzzy logic are needed. Several studies have investigated the Cd of labyrinth weirs8,9,10,11,12,13,14,15. Taking into account structural limitations (such as structure dimensions and weir angle), researchers have determined the Cd of weirs using classical calculation methods such as linear and non-linear regression.

Azamathulla and Wu16 used the support vector machine (SVM) to accurately estimate the longitudinal dispersion coefficients in natural rivers. In tests on real-world datasets, the SVM algorithm generated encouraging results. In another work, Azamathulla et al.17 proposed SVM to estimate the Cd of side weirs. The experimental results showed the superiority of the SVM over adaptive neuro-fuzzy inference systems (ANFIS) and artificial neural networks (ANNs). Bilhan et al.18 estimated the Cd of labyrinth weirs using support vector regression (SVR) and an outlier-robust extreme learning machine. The results showed that machine learning methods estimated the Cd values more accurately. Safarrazavizadeh et al.19 performed a laboratory investigation of the flow over labyrinth weirs with semicircular and sinusoidal plans. Observations showed that the discharge coefficient of these weirs, unlike that of linear weirs, rises at low water loads (HT/P < 0.35) and decreases after reaching its maximum value. Bonakdari et al.20 investigated the effectiveness of the gene expression programming (GEP) method for estimating Cd. Results show that the GEP method provides better results in predicting Cd. Shafiei et al.21 used the ANFIS-firefly algorithm (ANFIS-FFA) method to estimate the Cd of triangular labyrinth weirs. Results showed that the ANFIS-FFA model is more accurate in predicting the Cd of triangular labyrinth weirs. Emami et al.8 estimated the Cd of W-planform labyrinth weirs using the improved self-adaptive differential evolutionary algorithm and support vector regression (ISaDE-SVR) method and found it highly effective for W-planform weirs. Norouzi et al.22 simulated Cd using a self-adaptive extreme learning machine (SAELM) model. The results showed that the SAELM model estimated the Cd with high accuracy.
Wang et al.23 investigated the application of the genetic algorithm (GA), particle swarm optimization (PSO), and the traditional back-propagation (BP) neural network in predicting the Cd of triangular labyrinth weirs. The results showed that the GA-BPNN and PSO-BPNN methods are highly efficient in predicting Cd. Chen et al.24 used SVM, random forest (RF), linear regression, k-nearest neighbor (KNN), and decision tree (DT) models to predict the Cd of streamlined weirs. Ahmad et al.25 used the ANN model to predict the Cd of an arced labyrinth side weir. The results indicated that the Cd calculated by ANN is more accurate. Emami et al.26 used the Walnut algorithm and the SVR method to predict the Cd of triangular labyrinth weirs. Safari et al.27 evaluated ANN, GEP, and regression models for estimating the Cd of the broad-crested weir. The results showed that ANN estimates the Cd better than the GEP and regression models.

Although previous studies have investigated many geometrical models, the Cd of PCLW has not been investigated. Therefore, in the present study, the Cd of the PCLW was estimated using an intelligent model that combines a differential evolution algorithm (LSHADE) with the extreme gradient boosting (XGB) approach. The proposed approach was tested with different combinations of features to identify the best-performing combination.

The contributions of this paper are as follows:

  (a) Introducing the LXGB algorithm, which integrates LSHADE with XGB to tune the XGB parameters and further enhance its estimation performance.

  (b) Using the LXGB algorithm to estimate the Cd of PCLW. The proposed algorithm models the

  (c) Evaluating the proposed model on a real-world dataset and comparing it with state-of-the-art algorithms. The experimental results show the superiority of the proposed method over its counterparts in terms of performance measures.

The remaining sections of this study are organized as follows. Section "Material and methods" describes the experimental materials and the presented hybrid approach. Section "Results and discussion" presents the results and discussion. Section "Conclusion" summarizes the paper and offers recommendations for future work.

Material and methods

Dimensional analysis

The one-dimensional equation of the flow over the PCLW is as follows28:

$$Q = \frac{2}{3}\,C_{d} \,L\,\sqrt {2g} \,H_{T}^{1.5}$$
(1)

where Q is the discharge, Cd is the discharge coefficient, g is the acceleration of gravity, L is the length of the weir, and HT is the total hydraulic head (\(h + V^{2} /2g\)). The Cd of labyrinth weirs in free flow conditions depends on geometric and hydraulic parameters as follows:

$$C_{d} = \,f_{1} (B,\,L,\,H_{T} ,\,H_{d} ,\,V,W,\,R,\,S,\,t,\,\alpha ,\,N,\,g,\,\rho ,\,\mu ,\,\sigma ,\,CS,\,JS,\,SW)$$
(2)

where B is the channel width, Hd is the total hydraulic head downstream of the weir, V is the flow velocity, W is the height of the weir, R is the radius of weir curvature, S is the length of the straight part between the curves of the weir, t is the thickness of the weir, α is the angle between the straight section of the weir and the direction of the channel, N is the number of cycles, ρ is the fluid density, μ is the dynamic viscosity, σ is the surface tension, CS is the shape of the weir crest, JS is the shape of the nappe, and SW represents the approaching flow and the sidewall effect.

Equation (2) can be written as follows:

$$C_{d} = \,f_{2} \,({\text{Re}} ,\,We,\,Fr,\,\frac{{H_{T} }}{W},\,\frac{{H_{d} }}{W},\,\frac{L}{W},\,\frac{B}{W},\,\frac{R}{W},\,\frac{S}{W},\frac{t}{W},\,\alpha ,\,N,\,CS,\,JS,\,SW)$$
(3)

where Re is the Reynolds number, We is the Weber number, and Fr is the Froude number. Henderson29 concluded that if Re > 2000, the effect of viscosity can be neglected. Novak et al.30 concluded that if the water height over the weir exceeds 3 to 4 cm, the effect of surface tension can be ignored. Because the flow was turbulent and the minimum water height over the weir was 5 cm, the effects of Re and We were removed. All weirs used were sharp-crested, so the effect of CS was ignored. Because the weirs were installed perpendicular to the main flow with no local contraction at the installation location, the approach flow conditions (SW) were considered the same for all experiments.

Equation (3) is simplified to the following equation:

$$C_{d} = \,f_{3} \,(\frac{{H_{T} }}{W},\,\frac{L}{W},\,\frac{B}{W},\,\frac{R}{W},\,\frac{S}{W},\,\frac{t}{W},\,\alpha ,\,N)$$
(4)
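
Equation (1) can be rearranged to back-calculate Cd from a measured discharge. The following is a minimal Python sketch of that arithmetic; the function names and the numerical values in the usage note are illustrative assumptions, not values from the experiments.

```python
import math

G = 9.81  # acceleration of gravity in m/s^2 (assumed constant)


def total_head(h: float, v: float, g: float = G) -> float:
    """Total hydraulic head H_T = h + V^2 / (2 g)."""
    return h + v ** 2 / (2.0 * g)


def discharge(cd: float, length: float, h_t: float, g: float = G) -> float:
    """Eq. (1): Q = (2/3) * Cd * L * sqrt(2 g) * H_T^1.5."""
    return (2.0 / 3.0) * cd * length * math.sqrt(2.0 * g) * h_t ** 1.5


def discharge_coefficient(q: float, length: float, h_t: float, g: float = G) -> float:
    """Eq. (1) solved for Cd, given a measured discharge Q."""
    return q / ((2.0 / 3.0) * length * math.sqrt(2.0 * g) * h_t ** 1.5)
```

For example, `discharge_coefficient(q, length=1.2, h_t=total_head(0.08, 0.3))` would return the Cd implied by a hypothetical measured discharge `q` over a weir of length 1.2 m.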

Experimental models

The simulation of the flow around the PCLW was carried out in a channel with a width, length, and height of 0.49 m to 1.115 m, 3.2 m, and 0.5 m, respectively. In Fig. 1, the PCLW models and their geometric features are shown.

Figure 1

A big picture of PCLW1 and PCLW2.

The geometric features and the range of experimental parameters of the PCLW are presented in Table 1.

Table 1 The geometric features of PCLW used in simulations.

Extreme gradient boosting (XGB)

XGB31,32,33 is a robust supervised learning method that solves regression, classification, and ranking problems quickly and accurately. XGB is a more generalized form of gradient-boosting decision trees. It utilizes parallel processing, handles missing values efficiently, helps prevent overfitting, and performs well on datasets of different sizes.

For a given dataset with n examples and m features \(D = \{ (x_{i} , y_{i} )\} \; (\left| D \right| = n, \; x_{i} \in R^{m} , \; y_{i} \in R)\), XGB consists of an ensemble of K classification and regression trees (CARTs). The final prediction is formulated as follows31:

$$\hat{y}_{i} = \sum\limits_{k = 1}^{K} {f_{k} (x_{i} ),} \, f_{k} \in F$$
(5)

where \(\hat{y}_{i}\) is the final predicted value, F is the set of CARTs, and \(f_{k} (x_{i} )\) is the output of the k-th decision tree for input \(x_{i}\). In XGB, the objective function consists of two components, the training error and the regularization term, and is defined as follows31:

$$X_{obj} = \sum\limits_{i = 1}^{n} {l(y_{i} ,\hat{y}_{i} )} { + }\sum\limits_{k = 1}^{K} {\Omega (f_{k} )}$$
(6)

where \(\sum\limits_{i = 1}^{n} {l(y_{i} ,\hat{y}_{i} )}\) is the training loss, which measures the difference between the predicted and observed values, and \(\sum\limits_{k = 1}^{K} {\Omega (f_{k} )}\) is the regularization component, with:

$$\Omega (f_{k} ) = \gamma T + \frac{1}{2}\lambda \left\| w \right\|^{2}$$
(7)

where \(\gamma\) is the leaf penalty coefficient, T is the total number of leaf nodes, \(\lambda\) keeps the leaf scores from becoming too large, and w is the vector of leaf scores. XGB employs the gradient boosting strategy: it appends one new tree at each iteration and corrects the previous prediction by fitting its residuals:

$$\hat{y}_{i}^{(K)} = \sum\limits_{k = 1}^{K} {f_{k} (x_{i} )} = \hat{y}_{i}^{(K - 1)} + f_{K} (x_{i} )$$
(8)

Combining Eqs. (6) and (8), the objective function for the K-th tree can be written as31:

$$L^{(K)} = \sum\limits_{i = 1}^{n} {l(y_{i} ,\hat{y}_{i}^{(K - 1)} + f_{K} (x_{i} ))} + \Omega (f_{K} )$$
(9)

Taking the Taylor expansion of the loss function up to the second order, Eq. (9) can be approximated as follows:

$$L^{(K)} \approx \sum\limits_{i = 1}^{n} {\left[ {l(y_{i} ,\hat{y}_{i}^{(K - 1)} ) + g_{i} f_{K} (x_{i} ) + \frac{1}{2}h_{i} f_{K}^{2} (x_{i} )} \right]} + \Omega (f_{K} )$$
(10)

Dropping the constant term \(l(y_{i} ,\hat{y}_{i}^{(K - 1)} )\) and substituting Eq. (7) gives:

$$\begin{aligned} X_{obj} = & \sum\limits_{i = 1}^{n} {\left[ {g_{i} f_{K} (x_{i} ) + \frac{1}{2}h_{i} f_{K}^{2} (x_{i} )} \right]} + \Omega (f_{K} ) \\ = & \sum\limits_{i = 1}^{n} {\left[ {g_{i} w_{q(x_{i} )} + \frac{1}{2}h_{i} w_{q(x_{i} )}^{2} } \right]} + \gamma T + \frac{1}{2}\lambda \sum\limits_{j = 1}^{T} {w_{j}^{2} } \\ = & \sum\limits_{j = 1}^{T} {\left[ {\left( {\sum\nolimits_{i \in I_{j} } {g_{i} } } \right)w_{j} + \frac{1}{2}\left( {\sum\nolimits_{i \in I_{j} } {h_{i} } + \lambda } \right)w_{j}^{2} } \right]} + \gamma T \\ \end{aligned}$$
(11)

where \(g_{i} = \partial_{\hat{y}^{(K - 1)} } l(y_{i} ,\hat{y}^{(K - 1)} )\) and \(h_{i} = \partial_{\hat{y}^{(K - 1)} }^{2} l(y_{i} ,\hat{y}^{(K - 1)} )\) are the first and second-order gradient statistics of the loss function, \(q(x_{i} )\) is the index of the leaf to which \(x_{i}\) is assigned, and \(I_{j} = \{ i \mid q(x_{i} ) = j\}\) is the set of samples in leaf j.

The optimal weight \(w_{j}^{*}\) of leaf j and the corresponding optimal value of the objective function can be written as follows:

$$w_{j}^{*} = - \frac{{G_{j} }}{{H_{j} + \lambda }}$$
(12)

where \(G_{j} = \sum\nolimits_{{i \in I_{j} }} {g_{i} }\) and \(H_{j} = \sum\nolimits_{{i \in I_{j} }} {h_{i} }\).

$$X_{obj}^{*} = - \frac{1}{2}\sum\limits_{j = 1}^{T} {\frac{{G_{j}^{2} }}{{H_{j} + \lambda }}} + \gamma T$$
(13)
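
As a worked illustration of the leaf-weight step, the sketch below uses the standard XGBoost closed forms: the optimal weight of leaf j is \(-G_j/(H_j+\lambda)\) and the tree score is \(-\frac{1}{2}\sum_j G_j^2/(H_j+\lambda) + \gamma T\), where \(G_j\) and \(H_j\) are the sums of \(g_i\) and \(h_i\) over the samples in leaf j. It assumes a squared-error loss \(l = \frac{1}{2}(y - \hat{y})^2\), for which \(g_i = \hat{y}_i - y_i\) and \(h_i = 1\); the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def squared_error_gradients(y: Sequence[float], y_hat: Sequence[float]) -> Tuple[List[float], List[float]]:
    """First- and second-order gradients of l = 0.5 * (y - y_hat)^2 w.r.t. y_hat."""
    g = [p - t for t, p in zip(y, y_hat)]
    h = [1.0] * len(y)
    return g, h


def leaf_weight(g: Sequence[float], h: Sequence[float], lam: float) -> float:
    """Optimal weight of one leaf: -G / (H + lambda)."""
    return -sum(g) / (sum(h) + lam)


def structure_score(leaves: Sequence[Tuple[Sequence[float], Sequence[float]]],
                    lam: float, gamma: float) -> float:
    """Optimal objective of a tree: -0.5 * sum_j G_j^2 / (H_j + lambda) + gamma * T."""
    total = sum(sum(g) ** 2 / (sum(h) + lam) for g, h in leaves)
    return -0.5 * total + gamma * len(leaves)
```

With two samples whose residuals are 0.1 and 0.3 in a single leaf and \(\lambda = 1\), the weight is 0.4/3, i.e. the mean residual shrunk toward zero by the regularization term.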

The weak fitting model is then updated as follows:

$$f_{k} (x_{i} ) = f_{k - 1} (x_{i} ) + \sum\limits_{j = 1}^{T} {w_{j} .\eta }$$
(14)

where \(\eta\) is the learning rate. XGB appends new trees at each iteration by repeatedly splitting on features. Appending a new tree to the model amounts to learning a new function \(f_{k} (X,\theta_{k} )\) that fits the residual of the previous prediction. Once K trees are learned, the strong fitting model \(F(x_{i} )\) is used for prediction:

$$F(x_{i} ) = f_{0} (x_{i} ) + \sum\limits_{k = 1}^{K} {\sum\limits_{j = 1}^{T} {w_{j} .\eta } }$$
(15)

where \(F(x_{i} )\) is the strong-fitting model.

Figure 2 shows the working principle of XGB.

Figure 2

A big picture of the XGB method.

Since the hyper-parameters of XGB are often set empirically, tuning them properly is essential for building a robust XGB model. In this paper, we used the LSHADE algorithm to tune the following XGB parameters: the number of decision trees (K), learning rate (\(\eta\)), maximum depth (md), minimum child weight (mcw), gamma value (\(\gamma\)), and sub-sample ratio (ss). Table 2 lists the XGB parameters and the ranges used in the implementation.
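
To make the tuning setup concrete, the sketch below shows one possible way to map a real-valued candidate vector, as produced by an evolutionary optimizer, onto the six XGB parameters. The bounds are hypothetical placeholders rather than the ranges of Table 2, and `decode` is an illustrative helper; the dictionary keys follow the parameter names of the XGBoost scikit-learn interface.

```python
from typing import Dict, Sequence, Union

# Hypothetical search ranges (lower bound, upper bound, type).
# These are placeholders, not the ranges reported in Table 2.
SEARCH_SPACE = {
    "n_estimators": (50, 500, int),       # number of decision trees, K
    "learning_rate": (0.01, 0.3, float),  # eta
    "max_depth": (2, 10, int),            # md
    "min_child_weight": (1, 10, int),     # mcw
    "gamma": (0.0, 1.0, float),           # leaf penalty coefficient
    "subsample": (0.5, 1.0, float),       # ss
}


def decode(vector: Sequence[float]) -> Dict[str, Union[int, float]]:
    """Map a candidate in [0, 1]^6 to a dictionary of XGB parameters."""
    if len(vector) != len(SEARCH_SPACE):
        raise ValueError("vector length must match the number of parameters")
    params: Dict[str, Union[int, float]] = {}
    for u, (name, (low, high, kind)) in zip(vector, SEARCH_SPACE.items()):
        u = min(max(u, 0.0), 1.0)          # clip to the unit interval
        value = low + u * (high - low)     # linear rescaling to [low, high]
        params[name] = int(round(value)) if kind is int else value
    return params
```

Searching in the unit hypercube and decoding afterwards keeps the optimizer agnostic of parameter scales and lets integer parameters be handled by rounding.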

Table 2 The parameters of the XGB algorithm that need to be tuned.

LSHADE

Success-history-based parameter adaptation for differential evolution (SHADE)34 is an adaptive evolutionary optimization strategy. LSHADE35 enhances SHADE with a linear population size reduction technique, which gradually reduces the size of the population using a linear function. LSHADE starts its optimization process with a randomly generated population of real parameter vectors. The algorithm repeats a process of trial vector generation and selection until a termination condition is satisfied.
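
The linear population size reduction can be sketched as follows. In the published LSHADE description, the population size after NFE fitness evaluations is the rounded linear interpolation between the initial size and a minimum size reached at the evaluation budget MAX_NFE; the function name and the default minimum of 4 used here are assumptions for illustration.

```python
def lpsr_population_size(nfe: int, max_nfe: int, n_init: int, n_min: int = 4) -> int:
    """Population size after `nfe` of `max_nfe` fitness evaluations.

    The size decreases linearly from n_init (at nfe = 0) to n_min
    (at nfe = max_nfe) and is rounded to the nearest integer.
    """
    size = round((n_min - n_init) / max_nfe * nfe + n_init)
    return max(n_min, size)  # never drop below the minimum size
```

After each generation, the worst-ranked individuals would be removed until the population matches this size, so that exploration dominates early and exploitation dominates late in the run.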

LSHADE-XGB (LXGB)

The motivation behind LXGB is to improve the estimation performance of XGB by using the LSHADE optimization algorithm to tune its parameters. Figure 3 shows the working principle of the LXGB algorithm.

Figure 3

Principle of the LXGB algorithm.
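
The overall loop of an evolutionary hyper-parameter search can be sketched as below. This is not LSHADE: it is a plain DE/rand/1/bin loop with fixed F and CR and a fixed population size, used only to illustrate how candidates are generated, evaluated, and selected. The `stand_in_fitness` function is a hypothetical placeholder; in an LXGB-style setup it would be replaced by the validation error of an XGB model trained with the decoded parameters.

```python
import random
from typing import Callable, List, Sequence, Tuple


def differential_evolution(fitness: Callable[[Sequence[float]], float], dim: int,
                           pop_size: int = 20, generations: int = 50,
                           f: float = 0.5, cr: float = 0.9,
                           seed: int = 0) -> Tuple[List[float], float]:
    """Minimise `fitness` over [0, 1]^dim with basic DE/rand/1/bin."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        new_pop, new_scores = list(pop), list(scores)
        for i in range(pop_size):
            # Pick three distinct individuals different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial.append(min(max(v, 0.0), 1.0))  # keep inside bounds
                else:
                    trial.append(pop[i][d])
            s = fitness(trial)
            if s <= scores[i]:  # greedy selection: keep the better vector
                new_pop[i], new_scores[i] = trial, s
        pop, scores = new_pop, new_scores
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def stand_in_fitness(x: Sequence[float]) -> float:
    """Placeholder objective (squared distance to an arbitrary point)."""
    target = [0.3, 0.1, 0.5, 0.2, 0.0, 0.8]
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))
```

Because selection is greedy, the best fitness in the population can never get worse from one generation to the next, which is the property the success-history adaptation of LSHADE builds on.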

Assessment metrics

The RMSE, RRMSE, and NSE metrics were used to evaluate the performance of the LXGB approach (Eqs. 16–18).

$$RMSE = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(Y_{i} - X_{i} )^{2} } }$$
(16)
$$NSE = 1 - \frac{{\sum\limits_{i = 1}^{n} {(Y_{i} - X_{i} )^{2} } }}{{\sum\limits_{i = 1}^{n} {(Y_{i} - \overline{Y} )^{2} } }}$$
(17)
$$RRMSE = \frac{RMSE}{{\sum\limits_{i = 1}^{n} {Y_{i} } }}$$
(18)

where RMSE is the root mean square error, NSE is the Nash–Sutcliffe model efficiency coefficient, RRMSE is the relative root mean square error, \(X_{i}\) are the predicted values, \(Y_{i}\) are the observed values, \(\overline{Y}\) is the mean of the observed values, and n is the number of samples.
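
A minimal Python sketch of RMSE and NSE is given below, with NSE using the mean of the observed values in the denominator as in the standard Nash–Sutcliffe definition. The function names are illustrative, and the inputs are assumed to be equal-length sequences of observed and predicted Cd values.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed (Y) and predicted (X) values."""
    n = len(observed)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(observed, predicted)) / n)


def nse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of the observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((y - x) ** 2 for y, x in zip(observed, predicted))
    spread = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - sse / spread
```

An NSE of 1 corresponds to a perfect fit, while an NSE of 0 means the model is no better than always predicting the mean of the observations.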

Results and discussion

The Cd of the PCLW1 and PCLW2 weirs was estimated using the hybrid LXGB approach. First, all available data were normalized to the range [0, 1] using min-max scaling36:

$$X_{n} = \frac{{X - X_{\min } }}{{X_{\max } - X_{\min } }}$$
(19)

where X is the raw value, Xmin and Xmax are the minimum and maximum values of the data, and Xn is the normalized value.
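
Equation (19) can be sketched in Python as follows; the function name is illustrative, and the scaling is assumed to be applied to each variable separately.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1] using Eq. (19)."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot normalize a constant series")
    return [(v - low) / (high - low) for v in values]
```

In a train/test setting, the minimum and maximum would typically be taken from the training data only and reused for the test data, so that no information from the test set leaks into the model.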

The ratio of the weir length to the weir height (L/W), the ratio of the channel width to the weir height (B/W), the ratio of the weir thickness to the weir height (t/W), the number of cycles (N), the ratio of the radius to the weir height (R/W), the ratio of the length of the straight section between the weir curves to the weir height (S/W), and the ratio of the hydraulic head to the weir height (H/W) were considered as input parameters of the LXGB approach. A total of 132 data records, including geometric and hydraulic parameters, were selected. The data were randomly divided into two parts: 80% (106 records) for training the model and 20% (26 records) for testing it.
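
The random 80/20 split can be reproduced with a short sketch such as the following. The function name and the seed are assumptions; the seed only makes the shuffle repeatable and is not a value from the study.

```python
import random
from typing import List, Tuple


def train_test_split_indices(n_samples: int, train_fraction: float = 0.8,
                             seed: int = 42) -> Tuple[List[int], List[int]]:
    """Randomly split sample indices into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(132)
```

With 132 records, rounding 0.8 × 132 = 105.6 gives 106 training and 26 test records, matching the counts stated above.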

Seven models with different input variables were examined to identify the most influential input parameters for estimating the Cd of the PCLW1 and PCLW2 weirs. Tables 3 and 4 and Figs. 4 and 5 present the combinations of input variables.

Table 3 Combinations of the variables of weir in the PCLW1 plan.
Table 4 Combinations of the variables of weir in the PCLW2 plan.
Figure 4

Combination of input variables (PCLW1 model).

Figure 5

Combination of input variables (PCLW2 model).

In Tables 5 and 6, the evaluation criteria for different input variables to estimate the Cd are presented. A part of the modeling process by the LXGB approach is presented in Fig. 6.

Table 5 Performance of LXGB on the PCLW1 plan with different combination models.
Table 6 Performance of LXGB on the PCLW2 plan with different combination models.
Figure 6

Structure of the tree generated by LXGB algorithm.

The results show the accuracy of the presented LXGB approach in estimating the Cd of the PCLW1 and PCLW2 models. Mahmoud et al.37 concluded that the ANFIS-PSO and MLP-FA (multi-layer perceptron and firefly optimization algorithm) methods are, in that order, the most accurate in estimating the Cd of triangular labyrinth weirs. In a similar study, Majediasl and Fuladipanah38 concluded that the SVM model produces the most accurate results in predicting the Cd of labyrinth weirs, with RMSE = 0.0118. Shafiei et al.21 reported that the ANFIS-FFA model is quite accurate in estimating the Cd of the labyrinth weir. Karami et al.10 showed that the ELM method, with RMSE = 0.006, has acceptable efficiency in estimating the Cd of the labyrinth weir. In a similar study, the least-squares support vector machine-bat algorithm (LSSVM-BA) method was used to investigate the discharge of a curved labyrinth weir39. Another study40 showed that an SVM-based model gave accurate results in estimating the Cd of the arched labyrinth weir, with RMSE = 0.013 and R2 = 0.970. A multi-layer perceptron neural network (MLPNN)41 estimated the discharge over triangular arced labyrinth weirs with RMSE = 0.00385 and R2 = 0.999.

The estimated and observed Cd values of the PCLW1 and PCLW2 models are compared in Figs. 7 and 8. According to the results, the K6 model, with the input variables R/W, L/W, and H/W, had the best values of the statistical indicators. The Cd of the PCLW1 and PCLW2 weirs increases with increasing weir height. A similar study concluded that the Cd of the triangular duckbill labyrinth weir increases with weir height, which agrees with the results of the present study7. For a specified width, increasing the radius of the PCLW1 and PCLW2 weirs increases the effective length of the labyrinth and hence the Cd. Previous studies showed that increasing the radius reduces eddy flows, turbulence, and the sudden rise in water height over the weir39, 40, 42. Other investigations showed that the Cd of the arched labyrinth weir increases with R/W, which is consistent with the results of the present study41. The K2 model (H/W, L/W, R/W, N) ranks second, which shows that the length, weir height, radius, and number of cycles have a more significant impact on the Cd of the PCLW1 and PCLW2 weirs. Previous studies found that increasing the number of labyrinth weir cycles increases the discharge and Cd, which is consistent with the results of the present study40, 43. Figure 9 shows the importance of the influential input parameters in estimating the Cd of PCLW.

Figure 7

Cd changes of the PCLW1 (a) training phase, (b) test phase (model K6).

Figure 8

Cd changes of the PCLW2 (a) training phase, (b) test phase (model K6).

Figure 9

Importance of features for the Cd estimation on plan (a) PCLW1 and (b) PCLW2; f0: H/W, f1: R/W, f2: L/W, f3: N.

Emami et al.44 predicted the Cd of curved-plan labyrinth weirs using the WOA-ANFIS method and identified H/W and θ (weir arc angle) as the most effective input parameters for estimating the Cd. Majediasl and Fuladipanah38 investigated the support vector machine (SVM) method for the Cd of sharp-crested triangular labyrinth weirs and concluded that the input combination of geometric parameters (θ, h/w, L/B) gives the best results. Mohammadi et al.45 reported that the input variables Ht/P, W/P (the ratio of the weir width to the height), R/W, and W/LC (the ratio of the weir width to the effective length) give the highest accuracy and efficiency in estimating the Cd of U-shaped labyrinth weirs. Haghiabi et al.46 estimated the Cd of triangular labyrinth weirs using ANFIS and concluded that ANFIS performs well in Cd estimation. Studies have shown that H/W is the most influential parameter on the Cd of labyrinth and arced labyrinth weirs47.

Table 7 compares the performance of XGB and LXGB on the test dataset. The results show the superiority of LXGB over the XGB algorithm in terms of performance measures. This indicates that combining LSHADE with XGB improves the estimation performance.

Table 7 Performance evaluation of the XGB and LXGB algorithms on test dataset.

In Table 8, the values of the evaluation criteria for estimating the Cd of labyrinth weirs with different plans have been compared with the results of other studies. The results for LXGB are generated with the PCLW1 plan. The comparisons show the appropriate accuracy of the LXGB approach in estimating the Cd of labyrinth weirs with R2 = 0.97 and RMSE = 0.014.

Table 8 Performance evaluation of the LXGB approach and similar methods.

Conclusion

This study introduces a novel design for labyrinth weirs called pseudo-cosine labyrinth weirs (PCLW). The LXGB approach was used to estimate the Cd of the PCLW. Seven models with different combinations of input parameters were introduced, and the best model was identified by analyzing the estimation results. The superior model estimates Cd from the input parameters H/W, R/W, and L/W. LXGB estimated the Cd of PCLW with R2 = 0.971, RMSE = 0.014, and NSE = 0.97. The results demonstrated that the proposed LXGB algorithm generated better results than previous studies in estimating the Cd of labyrinth weirs. Such a prediction model may have significant practical application, as it can be an economical alternative to costly and time-consuming laboratory experiments. The proposed model is useful for improving the design of water transfer systems.
