Liquid temperature prediction in bubbly flow using ant colony optimization algorithm in the fuzzy inference system as a trainer

In the current research paper a novel hybrid model combining first-principle and artificial intelligence (AI) was developed for simulation of a chemical reactor. We study a 2-dimensional reactor with heating sources inside it by using computational fluid dynamics (CFD). The type of considered reactor is bubble column reactor (BCR) in which a two-phase system is created. Results from CFD were analyzed in two different stages. The first stage, which is the learning stage, takes advantage of the swarm intelligence of the ant colony. The second stage results from the first stage, and in this stage, the predictions are according to the previous stage. This stage is related to the fuzzy logic system, and the ant colony optimization learning framework is build-up this part of the model. Ants movements or swarm intelligence of ants lead to the optimization of physical, chemical, or any kind of processes in nature. From point to point optimization, we can access a kind of group optimization, meaning that a group of data is studied and optimized. In the current study, the swarm intelligence of ants was used to learn the data from CFD in different parts of the BCR. The learning was also used to map the input and output data and find out the complex connection between the parameters. The results from mapping the input and output data show the full learning framework. By using the AI framework, the learning process was transferred into the fuzzy logic process through membership function specifications; therefore, the fuzzy logic system could predict a group of data. The results from the swarm intelligence of ants and fuzzy logic suitably adapt to CFD results. Also, the ant colony optimization fuzzy inference system (ACOFIS) model is employed to predict the temperature distribution in the reactor based on the CFD results. The results indicated that instead of solving Navier–Stokes equations and complex solving procedures, the swarm intelligence could be used to predict a process. For better comparisons and assessment of the ACOFIS model, this model is compared with the genetic algorithm fuzzy inference system (GAFIS) and Particle swarm optimization fuzzy inference system (PSOFIS) method with regards to model accuracy, pattern recognition, and prediction capability. All models are at a similar level of accuracy and prediction ability, and the prediction time for all models is less than one second. The results show that the model’s accuracy with low computational learning time can be achieved with the high number of CIR (0.5) when the number of inputs ≥ 4. However, this finding is vice versa, when the number of inputs < 4. In this case, the CIR number should be 0.2 to achieve the best accuracy of the model. This finding could also highlight the importance of sensitivity analysis of tuning parameters to achieve an accurate model with a cost-effective computational run.

In addition, CFD computing methods can generate large datasets with several inputs and outputs (big data). This type of dataset can be a piece of interesting information for soft computing models for training. In this regard, soft computing models can easily find a connection between input and output parameters compared to discrete or global datasets that are popular in experimental observation. On the other hand, the generation of local node information enables soft computing models to describe the distribution of flow characteristics or heat and mass transfer in the reactor. Among different soft computing models, Ant Colony Optimization (ACO) can be a new option to learn non-discrete datasets as a family of swarm intelligence algorithms. This method, based on the concept of ants behavior to search an optimal path between their colony and a source of food, can train large datasets and present the distribution of datasets as a function of several input parameters. Alternatively, the ACOFIS can be presented in different technologies and applications as a problem solver or predictive tools to optimize challenging engineering processes with several operational parameters. Bubbly flow in the bubble column reactor can be an excellent candidate to present very complicated engineering processes due to flow behavior, turbulence properties, and heat characteristics which are complex to be simulated using mechanistic models.
In the current research paper, a 2-dimensional bubble column reactor with CFD was modeled and Artificial Intelligence (AI) was applied to create an artificial framework for process simulation and understanding. ACO was used in the learning stage of the AI model, and the fuzzy interface system was applied for the prediction stage based on the ant colony optimization method training. X and Y computing nodes, simulation time, liquid velocity in the Y direction, and gas hold-up are input parameters, and the temperature distribution is the output of training datasets. Different tuning parameters from ACOFIS were studied, such as cluster influence range, CIR in the training of the system, and their effects on better prediction of fluid temperature were investigated. ACOFIS is also compared with the genetic algorithm fuzzy inference system (GAFIS) and Particle swarm optimization fuzzy inference system (PSOFIS) method regarding model accuracy and ability of temperature prediction in the bubble column reactor. The training and prediction time in predicting temperature are also calculated for all models. This evaluation and analysis for different training models to build up the FIS structure in predicting reactor properties, particularly heat transfer characteristics, have been proposed for the first time. In addition to the prediction of reactor specifications with different AI models, the importance of tuning parameters in the ACOFIS is also highlighted for future investigations.

Methodology
CFD approach. Numerical methods and algorithms in computational fluid dynamics are employed for understanding the problems which involve fluid flows. In this study, the Euler-Euler multi-phase technique was applied to solve the average mass, flow, and energy equations and the volume fraction equation separately for each stage. The continuity equation can be defined as 37,38 : Momentum conservation equation 37,38 : Scientific Reports | (2020) 10:21884 | https://doi.org/10.1038/s41598-020-78751-y www.nature.com/scientificreports/ The stress term of gas bubbles and liquid stage can be presented by 37 : In which µ eff ,k represents the effective viscosity. Bubble induced can be described as follows 37 : Sato et al. 39 indicated viscosity of turbulence which was induced by the movement of multi-bubbles. Different studies applied this model for predicting the BCs 23,37,[40][41][42][43] . Formulating viscosity because of the induced turbulence can be defined by the equation below 37 : With a model constant of C µ,BI = 0.6 being mentioned in previous studies 37,41,[44][45][46][47][48][49] .
The drag forces between gas and liquid are shown 37 : The Schiller-Naumann drag model can be used. In general, it is acceptable for all multi-stage measurements 50 . The drag coefficient CD is defined by Schiller and Naumann as follows 51 : The turbulent dispersion force, which was introduced by Lopez de Bertodano is defined by the following equation 52 : Energy conservation equation: In which H q represents the enthalpy of stage q, − → q q represents the heat flux, S q implies the heat source, Q pq represents the heat exchange between the pth and qth stages, ṁ pq shows the mass transfer from pth to the qth stage. The heat flux, − → q q is defined as 38 : In which K q represents the conductivity of each stage. The energy transfer rate among the stages ( Q pq ) can be assumed to be a function of the temperature difference among the stages 38 : In which h pq implies the heat transfer coefficient between the pth and qth stages. The heat transfer coefficient can be associated to the pth stage Nusselt number, Nu p , by 38 : where K q represents the thermal conductivity of the qth stage. In regard with the heat transfer between the liquid and the gas, this study utilized Ranz-Marshal equation 53 which for sphere is: Re p implies the Reynolds number according to the diameter of the pth stage bubbles or particles as well as the relative velocity and Pr p implies the Prandtl number of the qth stage.
The turbulent eddy viscosity, the turbulent kinetic energy ( k ) and its energy dissipation rate ( ε ) are indicated in the equations below 37 : Scientific Reports | (2020) 10:21884 | https://doi.org/10.1038/s41598-020-78751-y www.nature.com/scientificreports/ Ant colony optimization (ACO) algorithm in the fuzzy inference system (FIS) training prosses. The FIS refers to a fuzzy inference system that predicts the behavior of complicated and nonlinear systems accurately [54][55][56][57] . ACO algorithm refers to AI method to solve complex problems, that can be reduced for discovering appropriate paths via the graphs being used in the fuzzy inference system training process [58][59][60][61][62][63][64][65][66][67][68][69][70][71] . In the present study X and Y computing elements, simulation time, liquid velocity distributions, and gas hold-up (gas fraction in the reactor) are input parameters, and the liquid temperature distribution is the output of training datasets. The function of the ith rule is as follows: where w i represents the output signal of the second layer's node and μ Ai , μ Bi , μ Ci , μ Di and μ Ei represent the input signals of the implemented MFs on the inputs, X-direction (X), Y-direction (Y), time, velocity in the Y direction (V) and a gas fraction ( ǫ g ), to the second layer's node. The third layer can be described as the following 72,73 : where w i represents normalized firing strengths. The node function is described as follows 73 : In which m i , n i , o i , p i , q i, and r i represent the if-then rules' parameters named as consequent parameters. The training datasets can be defined in the structure of a fuzzy system by the distribution of membership functions. In this regard, several membership functions can be defined in each input parameter to fully translate numbers and train datasets into the function shape with different function distribution across input parameters. All the input signals from the fourth layer were aggregated to achieve the model output representing the result of estimation. The FIS structure is designed to implement the final decision part of the model, which represent the conceptual understanding of human [74][75][76] . This structure has been previously described by Takagi, Sugeno, and Kang. The main training part of FIS can also be defined with different algorithms for a better understanding of physical processes. These learning methods contain several model parameters or tuning options to improve the model's accuracy, prediction capability and even reduce the computational learning time. The difference between the FIS structure is based on the weighted average of the rule output parameters rather than the max operator mechanism. Moreover, the rule output with defuzzification analysis can generate a model with inexpensive computational requirements. This cost-effective model is called the Sugeno model 77 .

Model comparison. For better evaluation of the ACOFIS model, this prediction algorithm is compared
with GAFIS and PSOFIS models in similar modeling conditions (see Table 1). Five inputs are selected in all models, while 75% of all datasets have participated in the training models. All models are trained with 70 number of epoch or iteration number. The Sugeno type is the FIS structure for all models. Additionally, subtractive clustering is used as a clustering type in the model with a cluster influence range of 0.5 for all models 36 . The reject ratio as a subtractive clustering parameter is similar for GAFIS, PSOFIS, and ACOFIS models. In the next step of the selection of model parameters, the pheromone effect and number of ants for ACO are 0.2 and 20, respectively. For the GA method, other parameters are selected such as swarm size = 20, crossover percentage = 0.7, and mutation percentage = 0.5. Finally, for the PSO model population size, the inertia, weight damping ratio, personal learning coefficient, and global learning coefficient are 20, 0.99, 1, and 2, respectively.

Results and discussion
We used 75% of the data for training processes. The remained 25% of the data plus the 75% in the training were studied in the evaluation or testing step of simulations. For clustering the data, we used subtractive clustering with the gaussmf membership function. As mentioned earlier, we used five inputs for the training process. In the domain of ant colony optimization method, we used 20 number of ants numbers so that we could reach the best intelligence in the system, and for the general evaluation of the system, we used the regression system. As shown in Fig. 1, we have five inputs and one output, which is the fluid temperature in the training process. After the training process, the trained datasets are translated into the fuzzy structure based on membership specifications. The membership functions can translate all learning processes in the fuzzy structure and train the main FIS structure. The ant colony optimization method is selected as the main learning algorithm. For the selection of membership function, in general, in the subtractive clustering method, when the CIR number is defined in the method number of the membership function is dictated for each input parameter. For example, as Fig. 1 shows, by selecting CIR number 0.5, several membership functions (21 membership functions) are distributed in each input parameter. However, decreasing the number of CIR in the method number of membership functions rises in each input parameter, representing a direct correlation between the number of CIR and membership  Fig. 2, we put 3000 points in the training process, and used 20 numbers of ants, as shown R reaches 0.71. The data used in the training process was also used in the testing process, and our evaluation R reaches 0.72, which is very close to the evaluation level in the training process.
Despite the previous data, we increase CIR, and therefore, R decreases significantly, and the system does not show an intelligent signal (prediction capability) in Fig. 2. By increasing CIR, as shown in Fig. 2, the intelligence of the system is decreasing. With CIR number 0.4, the intelligence of the system reachers R = 0.4 , and the testing and training processes are very close to each other. Finally, with CIR number 0.5, the intelligence of the system reaches its lowest amount that is R = 0.3. Figure 3 shows the three inputs in the training process with the CIR number 0.2. In this regard, the system reaches an average intelligence which is 0.83.
In this domain, by increasing the system's CIR to 0.5, the system's intelligence starts decreasing; however, comparing it to the previous Figure, the amount of decrease is lower. As shown in Fig. 4, the system's intelligence increases significantly in both training and testing processes. In this part, we studied four inputs for the AI system, and from the beginning with CIR number 0.2, the intelligence of the system reaches 0.99 in both training and testing. The signal relating to the intelligence of the system reveals that by increasing the number of inputs in the system, we can significantly increase its intelligence. As such, the number of inputs plays a key role in the intelligence of the system, so increasing the number of inputs can increase the intelligence of the system. In continuation of explaining Fig. 4, despite increasing the CIR number to 0.5, the intelligence of the system Table 1. Model parameters in the generation of FIS and clustering classification for ACO, GAFIS, PSOFIS algorithms.

Number of inputs 5 5 5
Maximum of iteration 70 70 70 Percentage of data in training process 75 75  Global learning coefficient as PSO parameter --2 Figure 1. FIS system based on ACO algorithm in the learning process, using subtractive clustering in the best of ACOFIS intelligence when CIR is 0.5, number of ants is 20, and the pheromone effect is 0.2.
Scientific Reports | (2020) 10:21884 | https://doi.org/10.1038/s41598-020-78751-y www.nature.com/scientificreports/ does not tend to decrease, emphasizing the point that the system reaches its full amount of intelligence, and the parameters that previously affect the intelligence of the system negatively seems to become neutral. Figure 5 shows interesting results. By enhancing the number of inputs, the network becomes greatly intelligent, and the R reaches unity. By increasing the CIR of the system, the number stays the same. The results also show that when the number of input is smaller than four, CIR number should be small. In general small CIR, number requires expensive computational runs and calculations. However, when the input number is four or five, we can observe different behavior in the model. In this case, the model can accurately predict datasets with a high number of CIR (0.5) with lower computational time.
For better understanding the intelligence of the system, we study an error distribution diagram. As shown in Fig. 6, the error relating to zero has the highest amount of aggregation, and when we have such a diagram in the middle of a graph, by aggregation of the points in zero, it shows an intelligent system. This visualization was  www.nature.com/scientificreports/ done for CIR number 0.5 and 5 inputs in the system. Also, we plotted a general error for each of the data sets, so that we could observe the amount of error in the system. We face an oscillation of errors, but it does not decrease the swarm intelligence of the system (see Fig. 7). Figure 8a-j show the inputs comparing to each other. As shown in the Figures, the CFD results are completely in agreement with the numerical results of swarm intelligence and fuzzy logic system. Results from the ACOFIS system can accurately predict the CFD results which are shown in Fig. 8.
To understand the main algorithm and all assessment criteria and check the model accuracy and prediction capability the designed flow chart is illustrated in Fig. 9. In the initial steps of running ACOFIS, the input parameters and output parameters are selected in the model. The subtractive clustering is defined to build up the main FIS structure. Then, all parameters of subtractive clustering and ACO are described in the algorithm. In the next step, the FIS structure is generated based on subtractive clustering. To design (train) the FIS structure, the ant colony optimization method is used. However, to increase the ability of method in learning datasets, errors  www.nature.com/scientificreports/ are recorded in the method, and if the error is not sufficient to achieve the high level of accuracy or prediction capability, model parameters are changed for another training process. In the case of having a small error in the model, the coefficient of determination is evaluated for the model, and the trained model is validated based on "non-trained" datasets. In the final stage of the method, the method's prediction section is evaluated to show the liquid temperature distribution in the domain of calculation. For better evaluation of the current training model (ACO) this model is compared with other training algorithms, such as GA and PSO with respect to model accuracy, coefficient of determination (R 2 ). Figure 10 shows the comparison between models for training and testing processes for different training models generating FIS structure. R 2 for all models are very high R 2 > 0.999 for both training and testing. However, for better evaluation of prediction capability, temperature distribution as a function of the number of data can be considered. Figure 11 shows the prediction capability for ACOFIS, GAFIS, and PSOFIS models. The liquid temperature in the reactor is predicted with different models. The results show that all models are very accurate in predicting the liquid temperature of the reactor. However, there is a small discrepancy in predicting local points for all machine learning models. This difference does not have an impact on the overall prediction of temperature distribution in the reactor. To thoroughly analyze the difference between machine learning methods, various assessment methods are considered such as MSE, RMSE, mean error, StD, R and R 2 for the training and testing processes. In addition to that, training time and prediction time are computed for all methods and compared together (see Table 2). The results show that all methods are similar with regards to model accuracy and evaluation criteria. For example, R and R 2 for ACOFIS, PSOFIS, and GAFIS in the training and testing processes are more than 0.999. A  www.nature.com/scientificreports/ similar finding is achieved for other error evaluation parameters, such as MSE and RSME, and STD. Table 2 also shows that the PSOFIS model can train datasets in half a time compared by ACOFIS. However, the prediction time for all models is less than one second. These results show that all ACOFIS, PSOFIS, and GAFIS models can be a good replacement prediction model instead of CFD computing models in estimating the temperature in the bubble column reactor. Table 1 shows all model parameters for different machine learning models, ACOFIS, GAFIS, and PSOFIS.

Conclusion
Generally, the swarm intelligence and the fuzzy logic system can be a prediction model for physical and chemical processes. In the current study, we make an AI model from a multi-phase flow in the 2-dimensional bubble column reactor data generated by CFD. In the BCR, we used the heating sources and created them in the CFD   www.nature.com/scientificreports/ the fuzzy interface system in terms of training processes. We used five inputs and one output in the training process to see this capability in the system. The output was the temperature of the fluid in the reactor. In the domain of sensitivity of the study, we take advantage of the effects of the cluster range in the model. After the training and prediction, we came to this conclusion that by increasing the number of inputs in the system, the system reaches a high swarm intelligence. Like the ants that they find their path to their food and optimize their path, the system can optimize the prediction of the fluid flow very fast. This is a transformation concept from ant behavior to bubbly flow behavior in the reactor. Therefore, increasing the number of inputs in the system can increase the swarm intelligence of the ants. For designing a machine learning system, we need to add the inputs as much as we can so that a system can reach a high intelligence. Sometimes, the inputs are not completely related to the system, but the AI system can find a meaningful relationship between the inputs and the outputs. The sensitivity analysis (tuning model) parameters can impact the accuracy and the overall prediction capability.
Alternatively, the tuning model can make the model cost-effective with regards to learning computational time.
The result shows that the small number of CIR (such as 0.2) can improve the model's accuracy up to three input parameters. However, as the number of inputs increases, such as four or five, the small CIR number (0.2) makes the inaccurate model with an expensive computational run (high learning time). In this case, CIR number 0.5 can generate a cost-effective model with a high level of accuracy. For better evaluation of the ACOFIS model, this machine learning algorithm is compared with GAFIS and PSOFIS models in terms of model accuracy, prediction capability, training time, and prediction time. It was found that the ACOFIS model is very accurate, along with GAFIS and PSOFIS models. Additionally, all machine learning models can predict the processes in less than one second after the reactor's temperature distribution training. However, the PSOFIS model can train the datasets in a shorter time than ACOFIS and GAFIS models. As far as this approach is a meshless method, except for the 100% data which were studied, we can create a meshless domain except for the inputs so that the prediction process can be studied in this domain. For example, x and y-direction positions make a square or a rectangle in a 2-dimensional domain, and the meshes could show thousands of points in themselves. We can create millions of data via AI, but the point worth mentioning here is that the data could be created in a particular domain, a square or a rectangle. On the other hand, millions of data that are created in the domain could be different from previous points.