Data-driven prediction of the shear capacity of ETS-FRP-strengthened beams in the hybrid 2PKT–ML approach

A new approach that combines the analytical two-parameter kinematic theory (2PKT) with machine learning (ML) models for estimating the shear capacity of embedded through-section (ETS)-strengthened reinforced concrete (RC) beams is proposed. The 2PKT was first developed for ETS-strengthened beams, and its representativeness and reliability were validated against the available experimental data of ETS-retrofitted RC beams. Given the scarcity of test data, the developed 2PKT was then used to generate a large data pool of 2643 samples. This data pool was used to train and optimize four ML algorithms, namely, random forest, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and artificial neural network (ANN). The optimized ANN model exhibited the highest accuracy in predicting both the total shear strength of ETS-strengthened beams and the ETS shear contribution. For the total shear strength, the ANN model achieved R² values of 0.99, 0.98, and 0.96 for the training, validation, and testing data, respectively. For the ETS shear contribution, it achieved R² values of 0.99, 0.99, and 0.97 for the training, validation, and testing data, respectively. Then, the effects of all design variables on the shear capacity of the ETS-strengthened beams were investigated using the hybrid 2PKT–ML approach. The obtained trends support the reasonableness of the proposed approach.

For example, given a set of specific circumstances and demands, the combined model can assist data scientists with expertise in ML in performing a shear resistance assessment of ETS-strengthened RC beams even if they do not have a strong background in civil engineering.
The present study is organized in five parts. First, the 2PKT approach is developed to model the shear behavior of RC beams strengthened with ETS-FRP bars. Second, the reliability of the developed 2PKT model is validated against the experimental data of ETS-strengthened RC beams reported in the literature. Third, a data pool regarding the shear strength of ETS-strengthened RC beams is generated using the developed 2PKT approach. Fourth, a variety of ML models are trained and tested with the simulated 2PKT data to determine the most suitable approach for the ML-2PKT combination. Finally, parametric studies on the effects of all design variables for ETS-strengthened beams are implemented using the hybrid 2PKT-ML approach.

Original 2PKT approach
The formulations of the 2PKT approach for the analysis of RC beams are described with two degrees of freedom (DOFs): the average strain in the steel tension reinforcement (ε_t,avg) and the displacement of the critical loading zone (CLZ), Δ_c. Figure 1 shows the DOFs ε_t,avg and Δ_c. The average strain in the longitudinal bars induces a critical shear crack that divides the shear zone of the beam into two parts: a rigid block located above the critical crack and a crack fan located below it. Meanwhile, the DOF Δ_c induces the vertical displacement of the beam without curvature. On the basis of the geometrical relations and kinematic characteristics, the effective width of the loading plate (l_b1e), crack angles (α, α_1), cracked length along the bottom reinforcement (l_t), distance between kinks in the bottom reinforcement (l_k, l_0), and crack spacing (s_cr) are determined, with the formulations given by Eqs. (1)-(4c). Then, the crack width (w) and concrete slip at the critical crack (s) are expressed by Eqs. (5) and (6), respectively. The strain in the steel stirrups (ε_v) is determined using the elongation along the crack, as shown in Eq. (7). In addition, the deflection of the beam subjected to three-point bending is calculated by Eq. (8) as the superposition of the displacements caused by the curvature and by the deformation of the critical loading zone.
According to previous studies 57–59, the shear strength of a conventional RC beam consists of four shear components: critical loading zone (V_clz), aggregate interlock (V_ci), steel stirrups (V_s), and dowel action of the bottom reinforcement (V_d). A number of studies 54–59 have detailed the mechanisms of the four shear components; therefore, in this section, only the equations for deriving the shear components are summarized. The shear strength attributable to the critical loading zone (V_clz) is written as follows:
Figure 1. Summary of the 2PKT approach 54,57.
The shear resistance caused by the aggregate interlock (V_ci) is determined via the shear stress of the aggregate interlock (ν_ci) as follows:
where w is the critical crack width (mm); s is the concrete slip (mm); d is the effective depth (mm); φ is the aggregate interlocking angle (radian); and a_g is the maximum aggregate size (mm).
The shear strength provided by the stirrups is expressed as follows:
where E_sw is the elastic modulus of the steel stirrups (GPa); ε_yv is the yield strain of the steel stirrups; and f_yv is the yield strength of the steel stirrups (MPa).
The shear resistance caused by the dowel action of the steel tension bars is expressed as follows:
where n_b is the number of longitudinal tension bars; E_s is the elastic modulus of the steel tension reinforcement (GPa); d_b is the bar diameter of the steel tension bars (mm); f_ys is the yield strength of the bottom reinforcement (MPa); and ε_ys is the yield strain of the bottom reinforcement.

Shear contribution of ETS-FRP strengthening system
For the externally bonded (EB) and near-surface-mounted methods, a number of shear models have been developed to estimate the shear resisting force of the FRP strengthening in the strengthened beam. For beams with ETS-FRP bars, these models attempt to predict the ETS-FRP shear contribution, but they underestimate the measured values 26,30. Bui and Stitmannaithum 44 proposed the bonding-based approach to simulate the shear resisting mechanism between ETS strengthening bars and concrete in ETS-retrofitted beams. Bui et al. 46 and Bui and Nguyen 48 further developed the bonding-based approach to analyze strengthened beams with rectangular and T-shaped sections. The comparisons with the experimental database in their studies have demonstrated the accuracy and effectiveness of the bonding-based approach for assessing the shear capacity of ETS-strengthened beams. In this section, the formulations established in the previous literature 46,48 for the bonding-based approach are summarized for the convenience of users.
In the bonding-based approach, the shear resisting mechanism of the ETS strengthening is considered by assessing its bond performance with the concrete. A crack plane occurs in the ETS-strengthened beam, passing through the existing steel stirrups and ETS strengthening bars and dividing the beam into two parts. The shear reinforcement restrains the opening of the shear crack. The anchorage hook and closed shape govern the shear resistance of the steel stirrups. Meanwhile, the contribution of the ETS strengthening system to the beam shear resistance is governed by the interfacial profile of the adhered joint between the ETS bar and concrete.
Figure 2a presents the ETS technique, in which FRP bars are inserted into prepared holes through the beam section and bonded to the concrete with adhesive resin. The conceptual scheme of the bonding-based approach for the ETS-strengthened beams is also illustrated in the figure. The number of influenced ETS bars that are crossed by the crack line is calculated as follows:
where x_fi = i·s_f is the distance from the end of the main crack plane to the end of the i-th single bar crossing the critical crack plane (mm); s_f is the ETS spacing (mm); h is the beam height (mm); and α_1 and β represent the crack angle and the ETS system inclination (°), respectively.
The average bond length of the influenced ETS bars is given by Eq. (14).
Figure 2. Bonding-based approach for ETS-strengthened beams: (a) conceptual scheme (Bui et al. 46) and (b) geometrical formulation for slip between ETS bar and concrete.
Bui et al. 46 proposed a nonlinear bond model to describe the bond profile between ETS bars and concrete and developed the ETS bond model via regression and mathematical analyses. Equation (17a) represents the regression fitting of the ε–s_ETS curve from the pullout tests; Eq. (17b) represents the governing equation of the ETS bond profile; and Eq. (17c) shows the ETS bond fracture energy (G_f). The reliability of their proposed bond law has been validated via several pullout tests of ETS bars bonded to concrete joints 46,77.
where ε is the strain in the ETS bar; A is the bond factor representing the maximum strain in the ETS bar; B = ln(2)/s_m is the bond ductility index (1/mm); E_f is the elastic modulus of the ETS bar (GPa); s_m is the slip at the peak bond stress of the ETS bar-concrete interface (mm), which is simply taken as 0.05 mm when E_f > 50 GPa and 0.12 mm when E_f ≤ 50 GPa; A_f is the cross-sectional area of the ETS bar (mm²); p_f is the perimeter of the ETS bar (mm); s_ETS is the slip between the ETS bar and concrete (mm); and G_f is the bond fracture energy (N/mm).
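The rule quoted above for s_m and the definition B = ln(2)/s_m can be evaluated directly. The following Python lines are a minimal sketch of that step only; the function names and the sample moduli are illustrative assumptions and are not taken from the cited bond model.

```python
import math


def max_slip_mm(e_f_gpa: float) -> float:
    """Slip at peak bond stress s_m (mm), using the threshold stated in the text."""
    # 0.05 mm for E_f > 50 GPa, otherwise 0.12 mm
    return 0.05 if e_f_gpa > 50.0 else 0.12


def bond_ductility_index(e_f_gpa: float) -> float:
    """Bond ductility index B = ln(2) / s_m, in 1/mm."""
    return math.log(2.0) / max_slip_mm(e_f_gpa)


# Hypothetical moduli chosen only to exercise both branches of the rule.
for e_f in (40.0, 130.0):
    print(f"E_f = {e_f:5.1f} GPa -> B = {bond_ductility_index(e_f):.3f} 1/mm")
```

A stiffer bar (E_f > 50 GPa) is assigned the smaller s_m and therefore the larger B.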
Figure 2b shows the geometrical relations between the concrete slip (s) and crack width (w) at the intersection of the crack line and ETS bar. Two scenarios are considered when determining the slip of the ETS bar relative to the concrete (s_ETS). The dependency of the ETS bar slip on the concrete slip and crack width can be described as follows:
The bonding-based approach requires information about the maximum bond stress between the ETS strengthening and concrete in the strengthened beam. Bui et al. 46 and Bui and Nguyen 48 provided the following expressions for the maximum bond stress between ETS bars and concrete based on the average anchorage length categorization:
By taking the equilibrium of the bond force according to the free body diagram (Fig. 2a), the effective (average) strain of the ETS system in the beam can be derived as follows:
where d_f is the ETS bar diameter (mm).
The debonding of the ETS-FRP bars from the concrete without FRP rupture represents the failure criterion of the ETS-FRP strengthening system in the ETS-FRP-retrofitted beam. Debonding occurs due to crack initiation, opening, and propagation in the beam. According to international specifications (ACI PCR-440.2-17 78; fib 2019 79), the strain in FRP reinforcement in RC beams is limited to 0.004 (concrete integrity) and 0.75f_fu/E_f (FRP rupture). According to Eq. (20) and the debonding limit concept, the maximum strain in the ETS-FRP strengthening system in an ETS-strengthened beam can be rewritten as follows:
Therefore, the shear resisting force of the ETS-FRP strengthening (V_f) in the retrofitted beam can be expressed via the bond force of the equivalent pullout scheme.
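One way to read the two strain limits quoted above is as caps applied to the bond-derived strain. The sketch below illustrates that reading only; it is an assumption and not the paper's Eq. (21), and the function name, argument names, and input values are hypothetical.

```python
def capped_ets_strain(eps_bond: float, f_fu_mpa: float, e_f_gpa: float) -> float:
    """Cap a bond-derived ETS strain by the two code-type limits quoted in the text.

    eps_bond  : effective strain from the bond equilibrium (dimensionless), assumed given
    f_fu_mpa  : FRP tensile strength f_fu (MPa)
    e_f_gpa   : FRP elastic modulus E_f (GPa)
    """
    eps_integrity = 0.004                                # concrete integrity limit
    eps_rupture = 0.75 * f_fu_mpa / (e_f_gpa * 1000.0)   # 0.75 f_fu / E_f, GPa -> MPa
    return min(eps_bond, eps_integrity, eps_rupture)


# Hypothetical inputs: the 0.004 limit governs in the first case.
print(capped_ets_strain(0.006, 1000.0, 50.0))
print(capped_ets_strain(0.002, 1000.0, 50.0))
```

Taking the minimum keeps the governing limit explicit, which makes it easy to report which criterion controlled for a given beam.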

Calculation procedure
On the basis of the formulations established in the "Original 2PKT approach" and "Shear contribution of ETS-FRP strengthening system" sections, the total shear strength (V_total) of an ETS-FRP-strengthened RC beam can be described by considering the following five shear components:
Apart from the strength caused by the shear components, the total shear capacity can be derived from the moment equilibrium of the tensile force of the bottom flexural reinforcement (T), which is expressed as follows:
where a is the shear span length (mm); A_s is the total area of the steel tension reinforcement (mm²); and A_c,eff is the effective area of concrete for the tension stiffening of the longitudinal reinforcement (mm²).
The shear strength attributable to the shear components (V_total in Eq. (23)) and to section equilibrium (V_T in Eq. (24a)) must be equated. At each step of the displacement of the critical loading zone (Δ_c), the shear resisting forces of the shear components and the total shear strength of the ETS-retrofitted beam depend on the average strain in the bottom reinforcement (ε_t,avg). The intersection of the V_T versus ε_t,avg curve with the V_total versus ε_t,avg curve is an equilibrium point in the shear load-deflection curve of the ETS-strengthened RC beam (Fig. 3). The bisection method is applied to the variable ε_t,avg to find the intersection between V_T and V_total. In this manner, the complete shear force-deflection response of the ETS-strengthened beam can be obtained. Iterations over Δ_c and ε_t,avg are needed to determine the successive load-displacement steps.
At the point of peak shear force, V_clz, V_s, V_ci, and V_f can reach their own peaks (Fig. 3). Beam failure is attributable to concrete crushing in the compression zone, which can lead to diagonal shear cracking, transverse steel yielding, and FRP debonding.

Experimental database
Some experimental programs for investigating the shear behavior of concrete beams strengthened with ETS-FRP bars have been implemented 26–32. Among these studies, the engineering information for the test beams provided by Mofidi et al. 26, Breveglieri et al. 29, Bui et al. 30, and Bui et al. 32 is sufficient for performing the 2PKT prediction. In the experiments of Mofidi et al. 26, Breveglieri et al. 29, and Bui et al. 30, the beams had T-shaped sections, whereas the beams in the experiments by Bui et al. 32 had rectangular sections. The shear span-to-effective depth (a/d) ratios for all specimens in the experimental programs of Breveglieri et al. 29 and Bui et al. 30 and for specimen B1 in the study by Bui et al. 32 were 2.6, 2.5, and 2.4, respectively, which are representative of deep beams. For the remaining strengthened beams, a/d ≥ 3.0, which might represent the behavior of slender beams.
The beams studied by Mofidi et al. 26 and Breveglieri et al. 29 adopted CFRP bars for the ETS strengthening systems, whereas the beams in the later works 30,32 were strengthened with ETS-GFRP bars. All tested beams were designed to be dominated by shear failure in the shear zones containing ETS-FRP bars. Therefore, the beams were over-reinforced with a large amount of longitudinal steel reinforcement (Table 1), inducing shear cracks followed by the yielding of the steel stirrups and the debonding of the ETS-FRP bars from the concrete. The rupture of FRP bars was not detected in those works.
The effects of the presence of existing stirrups, ETS material types and percentages, and concrete compressive strength on the strengthening efficiency of the ETS-retrofitted beams were also examined in the abovementioned studies.

Verification
The developed 2PKT approach was verified against the shear capacities of the ETS-strengthened beams reported in the literature 26,29,30,32; the results are presented in Fig. 4 and Table 1. The analytical model in this study was developed to predict the shear responses of ETS-FRP-strengthened RC beams. Therefore, only specimens strengthened with ETS-FRP bars from the literature 26,29,30,32 were used in the model verification. The comparisons between the model calculations and experimental results focus on the total shear strength of the beam (V_total). As shown in Table 1, the average of V_total-ana./V_total-exp. is 0.98, and the corresponding coefficient of variation (CoV) is 16.2%. The prominent effects of the variables on the beam shear strength are well captured by the 2PKT model. In the studies of Breveglieri et al. 29 and Bui et al. 30, the strengthened beams with ETS-FRP bars inclined at 45° or with more shear reinforcement had a much greater shear capacity than those with vertical ETS-FRP bars or less transverse reinforcement. In addition, for the specimens in Bui et al. 32, both the 2PKT computation and the experiments, comparing specimen B4 with the other beams, showed that the higher the concrete compressive strength was, the larger the total shear strength. These findings demonstrate the good agreement between the developed 2PKT model and the beam shear strength tests. The computation via the developed 2PKT approach for ETS-FRP-strengthened RC beams can also be rapidly implemented. The 2PKT model can trace both the prepeak and postpeak regimes, which may be necessary for evaluating the ductility of ETS-strengthened RC beams. The 2PKT validation for test specimens B1, B2, and B3 in the work of Bui et al. 32 is shown in Fig. 5. Beams B1, B2, and B3 had a/d ratios of 2.4, 3.6, and 4.8, respectively. Good agreement was established between the experimental and analytical load-displacement curves. Furthermore, the analytical and experimental results consistently showed reductions in the shear resistance and stiffness of the ETS-strengthened RC beams with increasing a/d ratios. These findings can be explained by the behavior of beams with large a/d ratios, in which beam action dominates and the shear actions decrease or even become negligible. Therefore, the shear resisting forces provided by the concrete are significantly reduced with increasing a/d ratios, a relationship that governs the capacity of the entire strengthened beam. This result is confirmed by the 2PKT analyses of the specimens shown in Fig. 6, in which the reduction in concrete shear strength with increasing a/d ratio is evident from the reductions in V_clz and V_ci. At a/d = 2.4 (usually categorized as a deep beam), V_clz primarily governs the failure of beam B1, which agrees well with the test observations. Increasing the a/d ratio of the ETS-strengthened beams shifted the onsets of stirrup yielding and ETS-FRP debonding to larger deflections. Although no clear experimental evidence for these observations was found, a beam with a high a/d ratio would trigger the bending mechanism early and activate the shear resisting mechanism later. By contrast, the maximum shear contribution of each component (i.e., stirrups and ETS) would remain unchanged by the a/d ratio. Here, the 2PKT analyses used a set of failure criteria for the transverse steel and ETS-FRP strengthening system in an ETS-strengthened beam, representing the yielding of all stirrups and the debonding of the ETS-FRP elements.
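The agreement statistics quoted above (the mean of the analytical-to-experimental ratio and its CoV) could be computed as in the following Python sketch. The use of the sample standard deviation is an assumption, since the text does not state which estimator was used, and the shear values are hypothetical placeholders rather than entries of Table 1.

```python
import statistics


def ratio_stats(v_ana, v_exp):
    """Return (mean, CoV) of the ratio V_ana / V_exp.

    CoV is taken here as sample standard deviation divided by mean (assumption).
    """
    ratios = [a / e for a, e in zip(v_ana, v_exp)]
    mean = statistics.mean(ratios)
    cov = statistics.stdev(ratios) / mean
    return mean, cov


# Hypothetical shear strengths in kN, for illustration only (not from Table 1).
v_ana_demo = [100.0, 220.0, 330.0]
v_exp_demo = [100.0, 200.0, 300.0]
mean_ratio, cov_ratio = ratio_stats(v_ana_demo, v_exp_demo)
print(f"mean = {mean_ratio:.3f}, CoV = {100.0 * cov_ratio:.1f}%")
```

A mean below 1.0 indicates predictions that are conservative on average, and the CoV expresses the scatter around that mean.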
The reliability of the developed 2PKT model for predicting the shear capacity of ETS-FRP-strengthened RC beams is demonstrated by the above comparisons. Another benefit of the developed 2PKT model is that it produces the calculation output within a few seconds. This feature can help users generate a large range of data for ETS-strengthened beams without using high-performance and expensive computational tools. The "Machine learning approach" section focuses on the implementation of the ML models based on the data generated by the 2PKT formulations to predict the shear capacity of ETS-FRP-strengthened beams.

Machine learning approach
The ML domain focuses on the utilization of data to enhance performance in various tasks, such as clustering, classification, and regression. In regression tasks, the algorithm predicts outcomes based on the relationship between features and a target variable. Several ML algorithms are currently used for regression, such as linear regression, lasso, decision tree, random forest, support vector regression, gradient boosting, and artificial neural networks (ANNs). In this study, four widely used algorithms, which are known for their efficiency and high accuracy, are utilized: random forest, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and ANN. In this manner, the shear capacity of ETS-FRP-strengthened beams can be predicted from the data generated by the 2PKT approach.

Machine learning models
Random forest 80 is a popular ensemble technique that uses bootstrapping and random feature selection to create several decision trees. The trees are decorrelated, and their predictions are merged, by voting for classification or averaging for regression, to obtain the result (Fig. 7a). XGBoost 81 is built on a gradient-boosting framework that sequentially trains multiple decision trees, with each tree trained to correct the errors of the previous trees. The architecture of the XGBoost algorithm is shown in Fig. 7b. LightGBM 82 shares many similarities with XGBoost but grows trees leaf-wise instead of depth-wise (Fig. 7d), resulting in much faster and more memory-efficient training. XGBoost and LightGBM are highly effective and widely used in industry and academia.
ANNs are composed of three fundamental types of layers: the input, hidden, and output layers. The input layer receives raw data and passes these data to the hidden layer. The hidden layer is the computational core of the network, and its neurons perform computations on the input data. The output layer combines the information from the preceding layers to generate an output value from which predictions for the response variables are provided. ANNs are a powerful tool for data analysis, and understanding the fundamental layers is necessary for unlocking the full potential of this algorithm 83.
Furthermore, the ANN model includes feedforward and backward propagation. Feedforwarding is the process of inputting data into a neural network and processing them layer by layer through the network until a prediction is generated. In the feedforward process, each neuron in the network receives input from the neurons in the previous layer, performs a computation, and passes the output to the neurons in the next layer. This process continues until the output layer is reached and the final prediction is generated. Backward propagation is the process of calculating the gradient of the loss function with respect to the weights and biases of the neural network. The gradient is used to update the weights and biases during training, aiming to minimize the loss function and improve the network's accuracy. Figure 7c shows an example of ANN architectures. The selected studies 80–83 can be referenced for additional details about the aforementioned algorithms.
Figure 6. 2PKT analyses for specimens B1, B2, and B3 in the study of Bui et al. 32.
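The feedforward and backward passes described above can be illustrated with the smallest possible network. The Python sketch below assumes one input, one tanh hidden neuron, one linear output, and a squared-error loss; these choices, the parameter values, and the learning rate are illustrative assumptions and do not reflect the architecture used in this study.

```python
import math


def forward(params, x):
    """Feedforward pass: input -> tanh hidden neuron -> linear output."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def loss(params, x, t):
    """Squared-error loss for a single sample with target t."""
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2


def backward(params, x, t):
    """Backward propagation: gradient of the loss w.r.t. [w1, b1, w2, b2] by the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - t                        # dL/dy
    dw2, db2 = dy * h, dy             # output-layer gradients
    dz = dy * w2 * (1.0 - h * h)      # through tanh: d tanh(z)/dz = 1 - h^2
    dw1, db1 = dz * x, dz             # hidden-layer gradients
    return [dw1, db1, dw2, db2]


def sgd_step(params, grads, lr=0.1):
    """One gradient-descent update of all weights and biases."""
    return [p - lr * g for p, g in zip(params, grads)]


params = [0.5, 0.0, -0.3, 0.1]        # arbitrary starting weights (assumed)
x, t = 1.2, 0.7                       # one hypothetical normalized sample
before = loss(params, x, t)
params = sgd_step(params, backward(params, x, t))
after = loss(params, x, t)
print(f"loss before = {before:.5f}, after one update = {after:.5f}")
```

In practice the same two passes are repeated over batches of samples and many epochs, with the gradients computed by an automatic differentiation library rather than by hand.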

Data collection
After confirming the performance of the developed 2PKT via experimental validation, this method was used to simulate 2643 data points covering feasible and realistic variable scenarios. Then, this dataset was utilized to implement ML models to predict the shear strength of ETS-strengthened RC beams. During the simulation, the technical constraints regarding the mechanics and detailing of the reinforcement were considered to ensure realistic and meaningful data. The constraints include the following conditions:
The condition in Eq. (25) represents the force equilibrium between the shear forces derived from the shear components (V_total) and from the flexural moment (V_T); it is the termination condition of the computation. The condition in Eq. (26), 0.75f_fu/E_f > 0.004, applies the debonding failure criterion of the ETS-FRP bars bonded to concrete based on the strain limit of 0.004 (ε); that is, no rupture of FRP is examined in the model computation. This condition of FRP debonding from concrete, which was observed in past experimental tests of ETS-FRP-strengthened RC beams, is safe and common for design practice. The conditions in Eqs. (27) and (28) involve the allowable spacings of the existing steel stirrups (s_sw) and ETS-FRP strengthening bars (s_f), which must be smaller than half of the shear span (a). Equations (29) and (30) specify the detailing conditions of the steel and ETS-FRP reinforcement located in the beam section (width and height directions).
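The constraints that are stated explicitly in the text (the rupture-versus-debonding check and the two spacing limits) can be written as a simple admissibility filter for candidate samples. The Python sketch below covers only those three conditions; the equilibrium condition of Eq. (25) and the detailing conditions of Eqs. (29) and (30) are not reproduced, and the class, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BeamSample:
    a: float      # shear span (mm)
    s_sw: float   # spacing of existing steel stirrups (mm)
    s_f: float    # spacing of ETS-FRP bars (mm)
    f_fu: float   # FRP tensile strength (MPa)
    e_f: float    # FRP elastic modulus (GPa)


def is_admissible(s: BeamSample) -> bool:
    """Check the rupture-strain and spacing conditions stated in the text."""
    # Debonding governs: 0.75 f_fu / E_f must exceed the 0.004 strain limit.
    debonding_governs = 0.75 * s.f_fu / (s.e_f * 1000.0) > 0.004
    # Both reinforcement spacings must be smaller than half of the shear span.
    spacing_ok = s.s_sw < 0.5 * s.a and s.s_f < 0.5 * s.a
    return debonding_governs and spacing_ok


# Hypothetical candidates: the second violates the ETS spacing limit.
candidates = [
    BeamSample(a=1000.0, s_sw=200.0, s_f=300.0, f_fu=1000.0, e_f=50.0),
    BeamSample(a=1000.0, s_sw=200.0, s_f=600.0, f_fu=1000.0, e_f=50.0),
]
kept = [c for c in candidates if is_admissible(c)]
print(len(kept), "of", len(candidates), "candidates kept")
```

Filtering candidates before running the analytical model avoids spending computation on combinations that would be rejected anyway.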

Data description
After collecting the simulation data, the relationship between the independent and target variables needed to be determined before experimenting with any ML algorithm. Correlation can be employed for bivariate analysis, i.e., measuring the relationship between two variables. The measure of correlation is called the correlation coefficient. Pearson's correlation coefficient measures the strength of the linear association between two variables. The correlation coefficient value (R) ranges from −1 to +1, where −1 indicates a perfect negative correlation, +1 signifies a perfect positive correlation, and 0 denotes the absence of a linear correlation between the two variables. Equation (31) shows the formulation of Pearson's correlation coefficient, in which the covariance of the two variables (S_xy) is divided by the product of their standard deviations (S_x, S_y).
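The definition above, R = S_xy/(S_x S_y), can be sketched in a few lines of Python. The function name and the sample vectors are illustrative assumptions; the normalizing factors of the covariance and standard deviations cancel, so plain sums of deviations are used.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient R = S_xy / (S_x * S_y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return s_xy / (s_x * s_y)


# Hypothetical vectors: perfectly proportional, then perfectly inverse.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

Because R captures only linear association, a value near zero does not rule out a nonlinear dependence between a design variable and the shear strength.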
Figure 8 illustrates the pairwise correlation coefficients between the variables used for analysis. The variables f'_c, ρ_f, d_f, E_f, s_f, and β have considerable effects on the shear contribution of the ETS-FRP strengthening system (V_f). These variables govern the amounts and properties of the ETS-FRP strengthening bars or the bond shear stress between the ETS-FRP bars and concrete. Meanwhile, the variables d_b, n_b, and ρ_l affect the shear strength caused by bending (V_T), consequently modifying V_f via force equilibrium. Most variables influence the total shear strength of ETS-strengthened RC beams (V_total) owing to the interrelationships among the parameters in the formulations of the shear components.
According to the data presented in Table 2, the original input parameters differ in scale. When the input variables have different scales, ML algorithms tend to give more weight to the variables with larger scales, which leads to biased predictions as well as slower convergence and poorer model performance 84. Therefore, the data need to be normalized to overcome these issues. In this study, the min-max normalization method was adopted to normalize all variables in the dataset to the range from 0 to 1 by using Eq. (32). Figure 9 shows the data distribution for each variable.
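The normalization to the range from 0 to 1 described above corresponds to the usual min-max form, x' = (x − x_min)/(x_max − x_min), which is assumed here to match Eq. (32). The Python sketch below applies it to one variable; the function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical beam heights in mm.
print(min_max_normalize([200.0, 300.0, 400.0]))
```

The same x_min and x_max would need to be stored and reused when new beams are passed to a trained model, so that predictions are made on the scale seen during training.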
Figure 10 shows the feature importance corresponding to V_f and V_total. The method of calculating feature importance varies depending on the ML algorithm, and the resulting feature importance differs between algorithms. In general, the analysis reveals that p_f, E_f, s_f, b, a, and h are the most important features for predicting V_f and V_total. This outcome is reasonable from an engineering perspective.

Evaluation metrics
In this study, four metrics, namely, mean absolute error (MAE), mean squared error (MSE), root mean square error (RMSE), and coefficient of determination (R²), were used to evaluate the performance of the ML models. The mathematical expressions and descriptions of these metrics are presented in Table 3.
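The four metrics named above can be computed as in the following Python sketch, which uses their standard definitions; these are assumed to coincide with the expressions in Table 3. The function name and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 using their standard definitions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical shear strengths in kN, for illustration only.
metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
print(metrics)
```

MAE and RMSE are expressed in the units of the target, whereas R² is dimensionless, which is why it is convenient for comparing models across V_f and V_total.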

Implementing the ML algorithms
Several hyperparameters can be used in ML algorithm operations. However, the meaning of these hyperparameters and how they can be optimized must be understood to effectively design and train ML models. In random forest, XGBoost, and LightGBM, the hyperparameter "n_estimators" determines the number of decision trees in each ensemble. Increasing the number of estimators can improve performance up to a certain point, but too many trees may lead to overfitting and increased computational cost. The "max_depth" hyperparameter controls the maximum depth of each decision tree in the ensemble, and limiting it can prevent overfitting and improve model generalization. In a random forest algorithm, other hyperparameters can be tuned to control overfitting and improve the generalization ability of models. The "min_samples_split" hyperparameter specifies the minimum number of samples required to split a node in a decision tree, which helps generalization by making the trees less prone to overfitting. Similarly, "min_samples_leaf" sets the minimum number of samples required in a leaf node of a decision tree, thus controlling overfitting. The "max_features" hyperparameter controls the maximum number of features considered when splitting a node and plays a crucial role in introducing randomness and diversity into the trees. The "max_samples" hyperparameter specifies the maximum number of samples used for training each decision tree, which can be set as a fixed number or as a fraction of the total number of samples. In XGBoost, critical hyperparameters shape the modeling behavior, such as the learning rate (eta), min_child_weight, gamma, subsample, colsample_bytree, reg_alpha, and reg_lambda. The learning rate dictates the training step size, while min_child_weight sets the minimum sum of instance weights required in a child node for a split, thus guarding against overfitting. Gamma contributes to regularization by setting the minimum loss reduction required for a split, and subsample introduces randomness by selecting a subset of the data. Colsample_bytree randomly picks a fraction of the features for each tree. Reg_alpha and reg_lambda provide L1 and L2 regularization, respectively, enhancing model stability. LightGBM shares some hyperparameters with XGBoost, such as the learning rate and colsample_bytree. In LightGBM, bagging_fraction (subsample) improves generalization by sampling data in boosting rounds, and feature_fraction diversifies models by selecting random feature subsets. The num_leaves parameter controls tree complexity and interpretability. Properly configuring these hyperparameters is pivotal for maximizing the performance of both XGBoost and LightGBM across a broad spectrum of ML tasks.
Several critical hyperparameters in an ANN affect its architecture and training process. The number of layers (n_layers) defines the network depth and complexity, with the input, hidden, and output layers contributing to the network structure. The learning rate controls the step size during training, influencing the convergence speed and stability. Activation functions introduce nonlinearity, enabling the network to learn complex patterns. The batch size determines the number of examples processed in each training step, affecting memory usage and efficiency. The number of neurons (units) in each layer determines the capacity of the network to represent data. The number of epochs specifies the number of passes through the entire training dataset. Dropout is a regularization technique that randomly deactivates neurons during training, mitigating overfitting. The dropout rate sets the probability of neuron deactivation, fine-tuning the regularization effect.
As mentioned above, hyperparameters play a key role in ensuring the good performance of each ML model. ML algorithms often require the fine-tuning of various hyperparameters, the best values of which are unique to each problem. The wide range of hyperparameters, coupled with the need to find the best possible combination, makes an exhaustive search impractical. Thus, the present study employed HyperOpt, a tool designed to automate the search for optimal hyperparameter configurations. Bayesian optimization, based on the sequential model-based global optimization methodology 85, was utilized.
The hyperparameter optimization process involves the following steps. First, a surrogate model of the objective function is constructed using data from past evaluations. Second, this model is used to identify the hyperparameters that are expected to yield the best performance. Third, the selected hyperparameters are tested on the actual objective function by training the model and evaluating its performance metric. Fourth, the results obtained in the third step are used to update the surrogate model. Steps 2 to 4 are repeated iteratively, often with a maximum number of iterations or a time constraint. Finally, the best-performing hyperparameters across all trials are selected as the optimal configuration 85.
Additionally, k-fold cross-validation was implemented to prevent overfitting and ensure that the trained model is reliable for real-world applications 86. Figure 11 illustrates the process of implementing the ML algorithms; four different models were trained to compare their performance in predicting the shear strength of ETS-FRP-strengthened beams. The dataset was split into two subsets, with 80% of the data used for training and the remaining 20% for testing. In the training process, the training data were further split into 80% for training and 20% for validation by using fivefold cross-validation. Subsequently, the optimal hyperparameters for each ML algorithm were obtained (Table 4).
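The data partition described above (an 80/20 train-test split followed by fivefold cross-validation on the training part) can be sketched with index lists in Python. The shuffling seed, the rounding of the test size, and the function name are assumptions made for illustration; only the sample count of 2643 and the split fractions come from the text.

```python
import random


def train_test_kfold(n_samples, test_fraction=0.2, k=5, seed=0):
    """Return test indices and k (train, validation) index pairs for the rest."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle (assumed seed)
    n_test = round(n_samples * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    folds = [train_idx[i::k] for i in range(k)]   # k nearly equal, disjoint folds
    splits = []
    for i in range(k):
        val = folds[i]                            # fold i is held out for validation
        trn = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((trn, val))
    return test_idx, splits


test_idx, splits = train_test_kfold(2643)
print("test samples:", len(test_idx))
for fold_no, (trn, val) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train = {len(trn)}, validation = {len(val)}")
```

Each training sample is used for validation exactly once across the five folds, while the test indices stay untouched until the final evaluation.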

Results and analysis
The performance of the proposed approach was comprehensively evaluated in the present study. In particular, the results derived by the hybrid 2PKT-ML model were compared with those calculated by the shear design

Table 3. Performance evaluation metrics for ML models.

The robustness and applicability of the proposed methodology were further investigated. Shapley additive explanations were used to assess the influence of design variables on both the shear strength and ETS shear contribution of the beam. The evaluations were conducted using the most proficiently trained ANN models (Fig. 14). The total shear strength of the ETS-strengthened beams (V_total) increased with b, b_f, d_b, E_f, f_ys, h, f'_c, n_b, and ρ_f but decreased with a, β, and s_f. These trends are consistent with the underlying mechanics and with existing shear models. For instance, larger values of b, b_f, and h correspond to a larger beam cross-section, resulting in greater shear resistance. Meanwhile, a large s_f enhances the beam action and corresponds to a small amount of ETS-FRP strengthening, reducing the beam shear strength. Additionally, the shear contribution of the ETS strengthening bars (V_f) clearly increased with a, b, E_f, h, and ρ_f but decreased with β and s_f. These trends can be explained by the formulations used for V_f in the 2PKT approach, which reflect the shear-resisting mechanism of ETS-strengthened beams. For example, V_f depends on the properties and percentage of the ETS strengthening system. The parametric investigation indicates that the trained ML models learned the shear mechanism of ETS-strengthened beams and can predict V_f and V_total accurately.

Conclusions
The primary objective of the present study was to propose a new approach for combining the developed 2PKT with ML methods to predict the shear behavior of RC beams strengthened with ETS-FRP bars. The significance of the study lies in the contribution of a new computation method to the strengthening field. Consequently, researchers and engineers can utilize the code established for the proposed approach in the design practice of ETS-FRP-strengthened beams without the use of complicated numerical tools. The main conclusions drawn from this work can be summarized as follows:
1. The 2PKT developed in this study can rationally predict the shear behavior of ETS-strengthened RC beams in terms of the beam shear strength, full load-deflection curve, and failure mode. The mean ratio V_total-ana/V_total-exp was 0.98, with a coefficient of variation (CoV) of 16.2%. The developed 2PKT approach also entails rapid and simple calculation and the possibility of data generation under various design scenarios.
2. On the basis of the analyses of the data derived from the various ML models, random forest failed to predict the shear resistance of ETS-strengthened RC beams. In contrast, XGBoost, LightGBM, and ANN demonstrated great accuracy in predicting the shear contribution of the ETS-FRP strengthening system. The ANN model outperformed the other models in estimating the total shear strength of ETS-strengthened RC beams, achieving R² values of 0.99, 0.98, and 0.96 for the training, validation, and testing data, respectively. These findings suggest that the ANN model is a stable and reliable model for predicting the shear strength of ETS-FRP-retrofitted beams.
3. The studied ML algorithms involve numerous hyperparameters and a wide range of potential values. Bayesian optimization techniques can be used to optimize the hyperparameters of ML models and improve their accuracy. Furthermore, optimizing each ML algorithm allows a fair performance comparison of the models.
4. The parametric investigation provides insights into the effects of design variables on the shear capacity of ETS-strengthened RC beams. For example, the total shear strength increased with beam geometry and material strength, while the ETS shear contribution was dependent on the properties and configurations of the FRP. These findings agree with those reported in the literature and predicted by the existing shear models.
5. The application of the developed 2PKT-ML approach requires a solid theoretical background in both beam mechanics and ML. Practitioners can apply the developed model by using computational code for predicting the shear behavior of ETS-FRP-strengthened RC beams. A practical tool based on the developed 2PKT-ML approach will be established in future studies so that practitioners can conveniently estimate the shear capacity of ETS-FRP-retrofitted beams.
6. The experimental data of the beams strengthened with ETS techniques are still deficient with respect to important design variables, such as ETS material types, ETS configurations, and anchorage systems. Future experimental studies on the ETS strengthening method for RC beams are needed to provide a broad database for evaluating the proposed 2PKT-ML model.

https://doi.org/10.1038/s41598-023-47064-1

Figure 3. 2PKT solution for the full shear load-deflection curve.

Figure 10. Feature importance corresponding to (a) V_f and (b) V_total.

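Metric definitions for Table 3 ($x_i$: actual value, $\hat{x}_i$: predicted value, $N$: number of samples):
MAE: $\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| x_i - \hat{x}_i \right|$. Average of the absolute difference between the actual values and the predicted values.
MSE: $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2$. Used to measure the average squared difference between the actual values and the predicted values.
RMSE: $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( x_i - \hat{x}_i \right)^2}$. It measures the average magnitude of the error in the predicted values.
R²: $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$. Presents the proportion of the variance in the dependent variable explained by the independent variables.

of the other three algorithms. Overall, the findings demonstrate that the ANN model is stable in calculating the beam shear strength and ETS-FRP shear contribution.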
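The evaluation metrics of Table 3 (MAE, MSE, RMSE, and R²) can be computed directly from paired actual and predicted values. The sketch below uses the standard definitions; the function name and the four sample values are made up for illustration and are not results from the study.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and R2 for two equally long sequences of numbers."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up shear strengths (kN), for illustration only.
v_actual = [100.0, 150.0, 200.0, 250.0]
v_predicted = [110.0, 140.0, 210.0, 240.0]
print(regression_metrics(v_actual, v_predicted))
```

MAE and RMSE carry the unit of the predicted quantity, whereas R² is dimensionless, which makes it convenient for comparing the predictions of V_f and V_total.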

Figure 12. Comparisons of the four ML algorithms in predicting (a) V_f and (b) V_total.

Figure 13. Error of ML algorithms in predicting (a) V_f and (b) V_total.

Figure 14. Parametric investigation on (a) V_f and (b) V_total.

Table 1. Beam specifications in available experimental studies.
Scientific Reports | (2023) 13:19871 | https://doi.org/10.1038/s41598-023-47064-1
works. The necessary information for those tested beams is summarized in Table 1. The beam shear strength and ETS-FRP shear contribution are also shown in the table. In this study, all beams mentioned in Table 1 were simulated to evaluate the representativeness and accuracy of the 2PKT approach. Then, the simulated 2PKT results were compared with the experimental data.

Table 5. Comparison of the performance of ML algorithms in predicting the shear strengths of ETS-strengthened beams. Bold values indicate the best metric values of the four ML networks.