Soft computing techniques for predicting the properties of raw rice husk concrete bricks using regression-based machine learning approaches

In this study, the replacement of fine aggregate and cement with raw rice husk, fly ash, and hydrated lime was evaluated in making raw rice husk-concrete brick. The study optimizes the compressive strength, water absorption, and dry density of concrete brick containing recycled aggregates via Response Surface Methodology (RSM). The optimized model's accuracy is validated through an Artificial Neural Network and Multiple Linear Regression. The Artificial Neural Network model captured the variability of the 100 data points from RSM optimization, as indicated by high correlation coefficients (R > 0.9997, R > 0.99993, and R > 0.99997). The Multiple Linear Regression model captured the data's variability with acceptable coefficients of determination (R² > 0.9855, R² > 0.9768, and R² > 0.9155). Predictions of the 28-day compressive strength, water absorption, and density of raw rice husk-concrete brick were more accurate with Response Surface Methodology and the Artificial Neural Network than with Multiple Linear Regression. Lower MAE and RMSE, coupled with higher R² values, indicate the models' superior performance. Additionally, the influence of the six input parameters on the outcomes was assessed through sensitivity analysis. Machine learning aids efficient prediction of concrete's mechanical properties, conserving time, labor, and resources in civil engineering.


Materials and methods
Agricultural wastes such as sawdust, coconut shell, and peanut shell have been used in concrete bricks with a mix design of 1:6 (cement:sand) 58. Similarly, Table 2 shows the mix design of RHCB in this article. Hydrated lime, fly ash, and cement were all used as binders in the RHCB mix. Figure 2 summarizes the sieve analysis of the fine aggregate and raw RH used in this experiment. The compressive strength, water absorption, and density obtained from testing are listed in Table 3. Machine learning algorithms require multiple input variables to generate the targeted output variable. Within the realm of concrete construction and analysis, paramount significance is attributed to parameters such as compressive strength, water absorption, and dry density.

Methods. The utilization of response surface methodology (RSM) enables the optimization of concrete brick properties, such as compressive strength, water absorption, and dry density, by systematically exploring the relationship between input variables. Meanwhile, the incorporation of artificial neural network (ANN) validation ensures the reliability and generalization of the developed model, validating its predictive capabilities against unseen data.
Response surface methodology (RSM). RSM is a collection of procedures for examining and modelling the functional relationships between the input variables (x) and the desired response (y) 59.
RSM is a method for analyzing the results of experiments. The model's significance level was determined by calculating the R², adjusted R², and predicted R². The calculated F-value is used to determine the influence of the factors on the measured results: the greater the F-value for a parameter, the more strongly that parameter affects the experimental results. The p-value indicates whether the model's results are statistically significant; a p-value of less than 0.05 identifies a significant model or parameter. The RSM model is a polynomial in which x and y are the input and output variables, respectively, as expressed in Eqs. (1) and (2). Design-Expert v10.0.1 software was used to perform RSM modelling on historical data. The advantage of the historical data design (HDD) is that it removes the need to adhere to a specific experimental design and accepts data sets of any size 60,61.
where Y represents the predicted response function; β_0 (b_0) represents the intercept; b_i, b_ii, and b_ij represent the linear, quadratic, and interaction effect coefficients, respectively; x_i and x_j represent the input variables; n is the number of data; and K is the total number of data.
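As a hedged illustration of the second-order polynomial that RSM fits, the sketch below assumes the common form Y = β_0 + Σ b_i x_i + Σ b_ii x_i² + Σ b_ij x_i x_j for two coded factors and estimates the coefficients by ordinary least squares. The factor grid, the coefficient values, and all function names are hypothetical and are not taken from the study, which used Design-Expert rather than Python.

```python
# Minimal sketch (assumption): least-squares fit of a quadratic response
# surface in two coded factors. All data and names are hypothetical.

def design_row(x1, x2):
    # Intercept, linear, quadratic, and interaction terms
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_quadratic(points, responses):
    """Estimate coefficients from the normal equations X'X b = X'y."""
    rows = [design_row(x1, x2) for x1, x2 in points]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(p)]
    return solve(xtx, xty)

def predict(coeffs, x1, x2):
    return sum(c * t for c, t in zip(coeffs, design_row(x1, x2)))

# Hypothetical 3 x 3 grid of coded factor levels (-1, 0, +1)
points = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
# Synthetic, noise-free responses generated from known coefficients
true_coeffs = [5.0, -1.2, 0.8, 0.5, -0.3, 0.4]
responses = [predict(true_coeffs, a, b) for a, b in points]

coeffs = fit_quadratic(points, responses)
```

Because the synthetic responses contain no noise, the fitted coefficients should reproduce `true_coeffs`; with real measurements the residuals would feed the ANOVA statistics described in the text.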

Artificial neural networks (ANN).
The human brain's functional neural features can be used to generate mathematical and numerical models. An ANN is a statistical model for processing and representing data in which inputs and outputs are linked through interconnected structures containing multiple neurons capable of large computations 62,63. An ANN model can be trained to predict the desired output from a given input. The output of each layer passes through a transfer function before serving as the input to the next layer; a regular ANN is shown in Fig. 3. The transfer function can be either linear or nonlinear. Most of the time, the hidden layers are nonlinear, and the predicted output is then processed by the output layer 64,65. Equation (3) can be used to summarize all of the ANN's processes. A single output is created by adding a bias (b) to the sum of the individual outputs. The ANN model in this research was implemented using MATLAB R2020b software.
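The process summarized by Eq. (3) can be sketched as a forward pass through a network with one hidden layer. The tanh hidden activation, the linear output neuron, and all names and weights below are illustrative assumptions; the study's MATLAB configuration is not reproduced here.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh hidden layer followed by one linear output neuron.

    x        : list of input values
    w_hidden : one weight list per hidden neuron (same length as x)
    b_hidden : one bias per hidden neuron
    w_out    : one weight per hidden neuron
    b_out    : bias added to the weighted sum at the output
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    # A single output: weighted sum of the hidden outputs plus the bias
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical weights for a network with 2 inputs and 2 hidden neurons
example = forward(
    x=[1.0, 2.0],
    w_hidden=[[0.5, -0.25], [0.1, 0.3]],
    b_hidden=[0.0, -0.2],
    w_out=[1.5, -0.7],
    b_out=0.3,
)
```

Training would adjust the weights and biases to reduce the error between `forward(...)` and the measured response; that step is omitted from this sketch.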

Multiple linear regression (MLR). Multiple engineering disciplines have relied on this model's ability to establish linear relationships between variables. MLR models the relationship between the dependent variable and more than one independent variable 66,67. An important principle behind MLR is minimizing the difference between the observed values of the dependent variable and those predicted from the independent variables. The regular mathematical form of the MLR model is as follows.
As shown in Eq. (4), Y is the dependent variable, c_0 is the intercept, c_1 to c_n are the coefficients associated with the independent variables, X_1 to X_n are the independent variables, and ε is the error related to the predictor.
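From the definitions above, the regular MLR form referred to as Eq. (4) can be written as follows. This is the standard textbook expression, reconstructed from those definitions rather than copied from the original typeset equation, with \varepsilon denoting the error term.

```latex
Y = c_0 + c_1 X_1 + c_2 X_2 + \dots + c_n X_n + \varepsilon
```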

Results
Compressive strength. RSM was used to predict the 28-day compressive strength from the results of 14 different experimental combinations. A quadratic model was employed. It was chosen over other RSM models because it can account for nonlinearity, curvature, and complex interactions among variables, giving a more accurate representation of the underlying system. This choice is pivotal for precise optimization and prediction. A summary of the RSM design can be found in Table 4.
Table 5 shows the ANOVA results: the sum of squares, the degrees of freedom (df), the mean square, the F-value, and the p-value at the 5% level of significance. All three responses had R² ≥ 0.9923. The adjusted R² corrects R² for the number of terms in the model, so that adding unnecessary terms does not inflate it. An R² that is within 0.1 of the adjusted R² is preferable 68, which was the case in this research. In addition, if the p-value is less than 0.05, the model is statistically significant 69,70. The research implications of the ANOVA outcomes lie in the p-value and F-value results. The p-value indicates statistical significance, helping to accept or reject hypotheses. The F-value signifies the variance explained by the model, guiding the understanding of relationships among variables and thus influencing subsequent analyses and conclusions.
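The relationship between R² and its adjusted version can be illustrated with the usual textbook formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of model terms excluding the intercept. The sketch below is an illustration only: the function names are hypothetical, and the choice of p = 5 in the example is an assumption, not a value reported by the study.

```python
def adjusted_r_squared(r2, n_obs, n_terms):
    """Adjusted R^2 for a model with n_terms predictors (intercept excluded)."""
    if n_obs - n_terms - 1 <= 0:
        raise ValueError("not enough observations for this many terms")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_terms - 1)

def within_tolerance(r2, adj_r2, tolerance=0.1):
    """Check that R^2 and adjusted R^2 differ by less than the tolerance."""
    return abs(r2 - adj_r2) < tolerance

# Hypothetical example: R^2 of 0.9923 from 14 runs and an assumed 5 model terms
adj = adjusted_r_squared(0.9923, n_obs=14, n_terms=5)
agreement = within_tolerance(0.9923, adj)
```

The penalty grows as terms are added without a matching gain in R², which is why removing insignificant terms can raise the adjusted value.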
Parity plots and three-dimensional RSM plots of the dependent variables, together with scatter plots of all the data collected, are shown in Fig. 4. The parity and 3D plots of compressive strength in Fig. 4 reveal that factors A and B interact strongly. The sand, raw rice husk, and cement formed a strong bond. The plotted predictions lie very close to the diagonal, with a balanced distribution of data points on either side of it. The RSM model therefore predicted the outcome accurately, without bias toward over- or under-prediction. A major drawback of RSM is that the estimate of one factor influences the analysis of another. The actual and predicted response values are very close together. The residual plots showed no significant deviation from normality, indicating that the chosen model effectively predicted the material's strength and interaction 71.
The RSM model equation for CS is given in Eq. (5).
In Eq. (5), C stands for fly ash, D for hydrated lime, and E for raw rice husk, unless otherwise stated. RSM has previously been applied to the CS of RHCB in which RHA replaced part of the cement. In this research, the maximum CS was calculated to be 7.34759 MPa, the minimum 2.47793 MPa, and the target 4.17615 MPa, according to this diagram (Fig. 3). This indicates that the estimated value increases. The actual and predicted response values are close to each other. The residual plots showed no significant deviation from normality, and the plots clearly show that the model accurately predicted the strength and interaction of the materials. The architecture of the ANN, based on the data provided, is shown in Fig. 5; three different models were used. The proposed ANN regression plots are shown in Fig. 6. This study employed MATLAB to train an artificial neural network (ANN) with an input layer of 5 nodes, 1 hidden layer featuring 10 neurons, and 1 output layer. For the MATLAB analysis, 75% (75 data) were allocated for training, 15% (15 data) for testing, and 15% (15 data) for validation. The R values for training, validation, and testing are all above 0.99997 and can be approximated to unity, that is, one hundred percent (Fig. 6). An R value greater than 0.9 indicates that the model can be relied upon to provide an accurate prediction.
Again, the predictions of both models match the results observed in the experiments, as seen in Figs. 7 and 8. The radar chart in Fig. 7 compares the models' results with the test results; the estimated values are very close to the actual values. ANOVA was also used to test the significance of the compressive strength models and model terms at a 95% confidence level. After analyzing the ANN and ANOVA methods for the strength values, it was found that the RSM and ANN methods were more appropriate for estimating the results. Table 6 shows the results of the experiments and of the models used to estimate the CS.
A comparison between the MLR model and the laboratory results shows that the predicted compressive strength values differ greatly from the measured ones. Figure 8 indicates that the lower RH led to the higher compressive strength.

Water absorption. According to the model's F-value of 2660.97, the model is significant. The probability that an F-value this large could occur due to noise is less than 0.01%. A model term with a p-value of less than 0.0500 is significant; the significant model terms here are B, C, BC, B², and C². Model terms with p-values greater than 0.1000 are insignificant. Removing insignificant model terms (excluding those necessary to support hierarchy) can improve the model. Because the predicted R² of 0.9972 and the adjusted R² of 0.9994 are within 0.2 of each other, the two are in reasonable agreement. Adequate precision measures the signal-to-noise ratio (SNR); a ratio greater than 4 is preferred. The signal-to-noise ratio of 163.887 indicates an adequate signal. This model can be used to navigate the design space (Table 7). The maximum water absorption was calculated to be 232.111, the minimum 112.956, and the target 8.03163, according to Fig. 9.
The closeness of the actual and predicted responses can be seen in the response values. The residual plots showed no significant deviation from normality; the plots in Fig. 9d clearly show that the chosen model accurately predicted the water absorption and the interaction of the materials used in the experiment. Equation (6) is the model equation derived using the RSM method for water absorption.
Figure 10 shows the R correlation coefficient for water absorption resulting from the ANN analysis. The R correlation coefficient for water absorption was 0.99993. R correlation coefficients above 0.9 show that the ANN method produces suitable models (Fig. 10). Table 8 displays the outcomes of the experiments and the models used to estimate the water absorption. Again, both models' predictions agree with the results observed during the research, as seen in Fig. 11.
The radar chart in Fig. 11 compares the models' results with the test results; the estimated values are very close to the experimental outcomes. The significance of the water absorption models and model terms was determined by ANOVA at a 95% confidence level. Figure 12 shows the MLR for water absorption.
A comparison between the MLR model and the laboratory results shows that the predicted water absorption values differ greatly from the actual ones (Figs. 11, 12). In Fig. 12, higher RH corresponds to higher water absorption in RHCB. The data points are shown with fitted line plots in Fig. 11. According to the MLR, the R² of 0.915 means that the model accounts for 91.5% of the variation in water absorption.

Density.
The 28-day density model was developed from 14 experiments using RSM. A quadratic model was used. The density results of RHCB are shown in Table 9.
The model F-value of 86.46 indicates that the model is significant; the probability that an F-value this large could occur due to noise is only 0.01%. A model term is significant if its p-value is less than 0.0500 and insignificant if its p-value is greater than 0.1000. Removing insignificant model terms (excluding those required to support hierarchy) can help the model perform better. The predicted R² of 0.9429 and the adjusted R² of 0.9813 are within 0.2 of each other. Adequate precision is the ratio of signal to noise; a ratio greater than 4 is preferable. The ratio of 28.085 indicates an adequate signal. The design space can be navigated with the help of this model (Table 9). The maximum density was calculated to be 2.37481, the minimum 1.76493, and the target 1.6834, according to Fig. 13. The closeness of the actual and predicted responses can be seen in the response values. The residual plots showed no significant deviation from normality. Equation (7) is the model equation derived using the RSM approach for density. Figure 14 shows the R correlation coefficient for density.
Table 10 shows the actual and predicted values from ANOVA and RSM for the density of RHCB. Figures 15 and 16 show that the selected model accurately predicted the density and the interactions of the materials used.
A comparison between the MLR model and the laboratory results shows that the predicted density values differ greatly from the actual ones. In Fig. 14, higher RH corresponds to lower density. The data points are shown with fitted line plots in Fig. 15. According to the MLR, the R² of 0.977 means that the model accounts for 97.7% of the variation in the density of RHCB.

Performance analysis
The calculated compressive strength, water absorption, and density of RHCB are compared against those obtained through laboratory testing as part of the performance analysis. Four indicators are used to evaluate the accuracy of the equations used in the study: the MAE 72, the RMSE 73, the VAF 74, and R² 75,76. The MAE measures the average absolute difference between the predicted and the true values. The regular equation is shown in Eq. (8), where y_i is the prediction, x_i is the true value, and n is the total number of data points from the laboratory experiments and the model based on those results. The RMSE allows the accuracy of different models to be compared on a dataset; it measures how far the estimated values stray from the actual measurements 77,78. The RMSE is calculated using Eq. (9), where i is the index variable, N is the number of non-missing data points, x_i is the actual observations time series, and û is the estimated time series. This statistic is always non-negative, and a value of zero indicates that the model fits the data perfectly. A high VAF indicates better forecasting performance for a given dataset, since VAF is an indicator of the precision of a prediction method 79. The VAF is calculated using Eq. (10).
Var denotes an estimate of the variance of a set of data. VAF is a common verification technique that checks the model's correctness by comparing observed or measured values with predicted or estimated values. A model's prediction performance improves as its MAE and RMSE decrease, and worsens as they increase. Higher R² and VAF values, on the other hand, indicate more accurate predictions. A property's R² is typically calculated by plotting its measured data against its estimated data. MAE and RMSE are error measures for parameter estimations, whereas R² and VAF gauge how close the predicted and measured values are to each other 80.
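The four indicators can be sketched in a few lines using their standard definitions: MAE as the mean absolute error, RMSE as the root of the mean squared error, VAF as [1 − Var(actual − predicted)/Var(actual)] × 100, and R² as 1 − SS_res/SS_tot. These definitions are assumed to match Eqs. (8) to (10); the sample values and function names are hypothetical and are not data from the study.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def vaf(actual, predicted):
    """Variance accounted for, in percent."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return (1.0 - variance(residuals) / variance(actual)) * 100.0

def r_squared(actual, predicted):
    """Coefficient of determination."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted values (not from the study)
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

metrics = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "VAF": vaf(actual, predicted),
    "R2": r_squared(actual, predicted),
}
```

Lower MAE and RMSE and higher VAF and R² point to the better model when the same data set is used for every comparison.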
Sensitivity analysis. Sensitivity analysis (SA) was used to ascertain the individual influence of the input parameters on the outcomes 81. SA is carried out through Eqs. (11) and (12), which quantify the relative contribution of each parameter. As depicted in Fig. 17, every parameter holds substantial importance in predicting compressive strength, water absorption, and dry density. The sensitivity analysis underscores the pronounced influence of rice husk on compressive strength, water absorption, and dry density, with a contribution exceeding 30%. This finding accentuates the significance of rice husk as a pivotal factor in determining these material properties within the context of the study 82.
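Eqs. (11) and (12) are not reproduced in this text. One formulation that is common in the concrete-modelling literature, assumed here purely for illustration, takes N_i = f_max(x_i) − f_min(x_i), the range of the model output when only input i is varied across its interval, and S_i = N_i / Σ_j N_j × 100 as that input's relative contribution. The function name, the toy model, and the input ranges below are all hypothetical.

```python
def sensitivity(model, lows, highs, baseline, steps=20):
    """Relative contribution (%) of each input under the assumed range-based SA.

    model    : callable taking a list of inputs and returning a number
    lows     : lower bound of each input
    highs    : upper bound of each input
    baseline : values at which the other inputs are held (e.g. their means)
    """
    ranges = []
    for i in range(len(baseline)):
        outputs = []
        for k in range(steps + 1):
            x = list(baseline)
            # Sweep input i across its interval; keep the others fixed
            x[i] = lows[i] + (highs[i] - lows[i]) * k / steps
            outputs.append(model(x))
        ranges.append(max(outputs) - min(outputs))
    total = sum(ranges)
    return [100.0 * r / total for r in ranges]

# Toy linear model standing in for a fitted predictor (hypothetical)
def toy_model(x):
    return 3.0 * x[0] + 1.0 * x[1]

shares = sensitivity(toy_model, lows=[0.0, 0.0], highs=[1.0, 1.0], baseline=[0.5, 0.5])
```

For the toy model the first input dominates because its coefficient is three times larger; with a fitted RSM or ANN predictor in place of `toy_model`, the same sweep would yield the percentage shares plotted in a chart such as Fig. 17.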

Discussion
The results stemming from this research can be attributed to the comprehensive utilization of advanced techniques for optimizing raw rice husk-concrete brick (RHCB) properties. The systematic replacement of raw rice husk, fly ash, and hydrated lime for fine aggregate and cement was judiciously evaluated, forming the foundation for robust material compositions. The employment of response surface methodology (RSM) facilitated the exploration of intricate relationships between variables, leading to the precise optimization of compressive strength, water absorption, and dry density. The model's subsequent validation through artificial neural network (ANN) and Multiple Linear Regression showcased its efficacy across varied scenarios, as indicated by the high R and R² thresholds achieved 83.
The exceptional accuracy attained in predicting 28-day compressive strength, water absorption, and density can be attributed to the synergistic interplay between RSM and ANN, outperforming Multiple Linear Regression. The lower MAE and RMSE values, along with elevated R² metrics, definitively underscore the model's exceptional predictive capabilities. The validation of outcomes via sensitivity analysis further bolsters the results, enabling a granular understanding of the input parameter influences. The culmination of these techniques, encapsulating machine learning, optimization, and rigorous analysis, presents a sound scientific approach to enhancing concrete's mechanical properties, while streamlining resource utilization in the domain of civil engineering.
ANN, RSM, and MLR models were used to predict the CS, density, and water absorption values. For compressive strength, the performance indicators compared are the MAE, RMSE, VAF, and R². The ANN model gave an MAE of 0.031, an RMSE of 0.361, an R² of 0.99997, and a VAF of 99.32%. The RSM model gave an MAE of 0.097, an RMSE of 0.126, an R² of 0.9923, and a VAF of 99.23%, while the MLR model gave an MAE of 0.038, an RMSE of 0.392, an R² of 0.9855, and a VAF of 98.55%. The MLR model's lower performance may be due to its inability to fully depict and account for the uncertainties in the dataset's inputs. The training algorithms of the ANN and RSM models quantify the inherent uncertainties in the input data, resulting in better models. The MLR predictions of compressive strength, water absorption, and density were slightly overestimated compared to the actual measurements. This signifies that, while MLR can be used to estimate RHCB's compressive strength, density, and water absorption, the ANN and RSM models predict these properties more accurately (Table 11).

Practical applications.
In hot climates, ANN models can assist construction site engineers in selecting the most appropriate concrete parameters, resulting in more durable and long-lasting construction materials 84. The results have numerous useful applications for engineers in the fields of civil and environmental construction. Design models of this type are important for estimating, against a predetermined objective, the performance of products containing cement substitutes made from other types of solid waste 85. Given the environmental benefits of reduced cement consumption, the decision to use a partial replacement must be considered carefully, as must maintaining the minimum level of mechanical performance for concrete mandated by regulation 86. Predictive models are applicable and helpful in such circumstances. In addition, cost and budget considerations and early-stage cost estimates for built-environment engineering projects can benefit from such prediction models. In comparison to theoretical and experimental methods, RSM, ANN, and MLR predictions are rapid and less costly 87.

Conclusion
The compressive strength, water absorption, and density of RHCB were predicted using ANN, RSM, and MLR models in this work. The models took as input three quantitative parameters: raw rice husk (RH), fly ash (FA), and hydrated lime (HL). Compressive strength, water absorption, and density were studied in relation to RH, FA, and HL using the ANN, RSM, and MLR methods. The results are as follows:
• As the percentage of raw rice husk replacement increased, a decrease in strength was observed in the RHCB. With rice husk replacing sand at 1:3:3, significant changes in water absorption and density were noted, in line with the literature.
• R² values for compressive strength, water absorption, and density were 0.9923, 0.9998, and 0.9998, respectively, after RSM analysis. The models therefore account for 99.23% of the variation in compressive strength, 99.98% in water absorption, and 99.98% in density.
• The R correlation coefficients for compressive strength, water absorption, and density were 0.9939, 0.99839, and 0.99875, respectively. The ANN model was appropriate because all correlation coefficients are close to 1.
• R² values for CS, water absorption, and density were 0.9855, 0.9155, and 0.9768, respectively, after MLR analysis. The models therefore account for 98.55% of the variation in compressive strength, 91.55% in water absorption, and 97.68% in density.
• A comparison of parity plots showed that both the ANN and RSM models had no bias in their predictions. The RSM and ANN models are superior to the MLR model because they are more accurate and better suited to the dataset.

Figure 1. Main stages of the procedure.

Figure 3. Regular neural model of ANN.

Figure 5. Architecture of ANN utilized for prediction.

Figure 11. Actual and predicted value of water absorption.

Figure 13. Maximum density, optimum density, and minimum density, and actual and predicted values from RSM.

Table 2. Mix ratio and material combination.

Figure 2. Sieve analysis of fine aggregate and RH (axis: size of sieve, mm; series: sand, RH).

Table 3. Compressive strength, water absorption, and density.

Leveraging the input factors and data instances for parameters such as compressive strength, water absorption, and dry density, a robust and efficacious model is formulated to encompass the incorporation of raw agricultural waste within concrete compositions.

Table 4. Design of RSM.

Table 6. Predicted value of ANN and ANOVA.

Fourteen separate experiments were used to develop the 28-day water absorption model. A quadratic model was used. The RSM-ANOVA results for RHCB water absorption are given in Table 7.

Table 8. Actual and predicted value of ANOVA and RSM for water absorption of RHCB.

Table 10. Actual and predicted value of ANOVA and RSM for density of RHCB.

Table 11. ANN, MLR, and RSM comparison model.

For compressive strength, the VAF of the MLR model was 98.55%. For density, the indicators compared are the MAE, RMSE, R², and VAF. In the ANN model, the MAE was 0.039, the RMSE was 0.054, R² was 0.99993, and the VAF was 99.85%; in the RSM model, the MAE was 0.06, the RMSE was 0.080, R² was 0.9998, and the VAF was 99.99%; and in the MLR model, the MAE was 0.026, the RMSE was 1.283, R² was 0.9768, and the VAF was 96.78%. For water absorption, the same indicators are compared. The ANN model yielded an MAE of 5.526, an RMSE of 7.033, an R² of 0.99997, and a VAF of 99.85. The MAE of the RSM model was 1.035, the RMSE was 1.597, R² was 0.9998, and the VAF was 99.98. The MAE of the MLR model was 0.934, the RMSE was 9.753, R² was 0.9155, and the VAF was 91.55. Table 11 shows that, based on the performance evaluation, the ANN and RSM models are better algorithms for predicting CS, density, and water absorption than the MLR model.