Prediction model of spontaneous combustion risk of extraction borehole based on PSO-BPNN and its application

To improve the feasibility and accuracy of risk prediction for spontaneous combustion in gas extraction boreholes, and thereby help prevent such fires, a prediction model (the PSO-BPNN model) coupling the particle swarm optimization (PSO) algorithm with a BP neural network is established, in which the PSO algorithm optimizes the connection weights and thresholds of the BP neural network. The prediction results of the PSO-BPNN model are compared with those of the BP neural network model (BPNN model), GA-BPNN model, SSA-BPNN model and MPA-BPNN model. The results are as follows: the average relative error of the PSO-BPNN model was 4.38%; the average absolute error was 0.0678; the root mean square error was 0.0934; and the determination coefficient was 0.9874. Compared with the BPNN model, the average relative error, average absolute error and root mean square error decreased by 9.35%, 0.1707 and 0.2056, respectively, and the determination coefficient increased by 0.1169. Compared with the GA-BPNN model, they decreased by 3.19%, 0.0602 and 0.0821, respectively, and the determination coefficient increased by 0.0320. Compared with the SSA-BPNN model, they decreased by 5.70%, 0.0820 and 0.1100, respectively, and the determination coefficient increased by 0.0474. Compared with the MPA-BPNN model, they decreased by 3.50%, 0.0861 and 0.1125, respectively, and the determination coefficient increased by 0.0488. These results indicate that the PSO-BPNN model predicts more accurately than the BPNN, GA-BPNN, SSA-BPNN and MPA-BPNN models.
When the PSO-BPNN model was applied to three extraction boreholes A, B and C in a coal mine in Shanxi, its prediction results were better than those of the BPNN, GA-BPNN, SSA-BPNN and MPA-BPNN models, supporting the accuracy and stability of the PSO-BPNN model for predicting the risk of borehole spontaneous combustion in other mines.

Spontaneous combustion in gas extraction boreholes is an internally caused fire in coal mines that is influenced by multiple factors and seriously restricts the production efficiency and safety of mines 1,2. The initial temperature of the coal seam rises as mining depth and intensity increase, bringing new challenges to the prevention and control of coal spontaneous combustion disasters 3,4. The risk of spontaneous combustion in extraction boreholes is therefore becoming increasingly serious, especially in high-gas coal seams prone to spontaneous combustion. Spontaneous combustion often occurs at some depth behind the exposed face of the coal seam, so the location of the fire source is difficult to determine 5,6. Once spontaneous combustion occurs in an extraction borehole, the borehole has to be suspended or scrapped, and the extraction pipeline may even explode. Studying the prevention and control of spontaneous combustion in extraction boreholes is therefore an urgent and significant issue. Determining the risk of spontaneous combustion in a borehole is an essential basis for taking fire prevention measures, and scientific and reasonable methods that improve prediction accuracy are of important guiding significance for its control.
With the development of computer science and technology, meta-heuristic algorithms have become widely used in engineering practice 7-9. In recent years, many scholars have begun to combine the early warning indicators of coal spontaneous combustion with machine learning and intelligent algorithms, improving the accuracy of prediction results 10,11. WEN Tingxin et al. 12 proposed a coal spontaneous combustion prediction model based on KPCA-Fisher discriminant analysis. They used kernel principal component analysis to extract nonlinear features from highly correlated characteristic indicators, and then took the extracted principal components as discriminators in the Fisher discriminant model. ZAN Juncai et al. 13 used gas composition analysis and a BP neural network to establish a prediction model, taking the concentrations of coal spontaneous combustion index gases as the input layer of the neural network and coal temperature as the output layer. The results were basically consistent with the actual situation. QI Yun et al. 14 established a comprehensive evaluation method based on set-value statistics-Entropy, simulating the human decision-making process and mathematically processing the multi-factor data that cause the risk of spontaneous combustion, thereby avoiding the bias of the classical fuzzy evaluation method in quantitative assessment. XING Yuanyuan et al. 15 used the inverse entropy weighting method to determine the weights of evaluation indexes based on the principle of minimum information identification, and then constructed a coal spontaneous combustion risk evaluation model based on the TOPSIS method. Wang Wei et al. 16 proposed a dynamic weighting method, and then established a dynamic prediction model for the risk of coal spontaneous combustion according to the characteristics of dynamic changes in the goaf. Shuang et al. 17 proposed an improved grey wolf optimized support vector regression model for predicting coal spontaneous combustion temperature, based on nonlinear parameter control, dynamic inertia weights and the grey wolf social hierarchy. The effectiveness of the improved grey wolf optimizer was verified by numerical experiments. Jun D et al. 18 proposed an SA-SVM prediction model to reflect the complex nonlinear mapping between characteristic gases and coal temperature. The risk degree of coal spontaneous combustion was estimated in the time domain, and the model was verified using in situ data from an actual working face. CHANG Xuhua et al. 19 used the improved G1 method, the entropy weight method and improved game theory to calculate the comprehensive weights of evaluation indexes. A topizable evaluation model of coal spontaneous combustion hazard with improved game theory empowerment was proposed based on the hazard level and ranking of evaluation elements identified by the comprehensive correlation. WANG Minhua et al. 20 expanded a training dataset for coal spontaneous combustion prediction by generating virtual samples with a WGAN-GP model, and then trained an AI model on the augmented dataset to establish a prediction model for coal spontaneous combustion in the goaf.

The literature on spontaneous combustion in gas extraction boreholes is relatively sparse. Some scholars have adopted numerical simulation and model prediction to study it. QI Yun et al. 21 applied Comsol Multiphysics software to study two indicators, sealing depth and sealing length, for the spontaneous combustion of gas extraction boreholes in Pingmei No. 10 Mine. The sealing parameters were optimized to effectively prevent spontaneous combustion of the coal around the borehole. WANG Wei et al. 22 were the first to use a mathematical prediction model to study the risk of spontaneous combustion in gas extraction boreholes; they proposed an improved CRITIC method to modify the G2 weighting model, and combined it with the TOPSIS method to establish a G2-TOPSIS prediction model for judging the risk of spontaneous combustion in boreholes. QI Qingjie et al. 23 optimized the sealing depth and negative pressure of gas extraction boreholes by numerical simulation, and obtained the best sealing parameters by verification against field engineering test results.
The coal spontaneous combustion prediction models proposed in the above research have played a role in promoting the prevention and control of spontaneous combustion fires in mines. However, some of these methods tend to fall into local optimal solutions, generalize poorly and converge slowly, owing to the difficulty of determining the weights of some evaluation indicators in practical application, incomplete consideration of the model indicator factors, and a lack of clarity about the primary and secondary factors affecting coal spontaneous combustion. In addition, borehole spontaneous combustion occurs frequently because its influencing factors and prevention have been little studied. In a coal mine in Shanxi, the coal seam is deep and the gas content is high. Some of the boreholes exhibited spontaneous combustion due to poor sealing during gas extraction, which led to the suspension of extraction and even the scrapping of the boreholes.
In view of this, taking the problem of spontaneous combustion in gas extraction boreholes in a coal mine in Shanxi as the research background, the authors introduce the PSO algorithm and the BP neural network into the prediction of borehole spontaneous combustion risk. The PSO algorithm is used to optimize the connection weights and thresholds of the BP neural network, thereby overcoming the tendency of the BP neural network to fall into local optima. A prediction model of spontaneous combustion risk in boreholes is established based on PSO-BPNN. The approach is expected to improve the accuracy of spontaneous combustion prediction in gas extraction boreholes, laying the foundation for scientific and reasonable prevention and control measures. The research results can also provide theoretical support for other mines facing the problem of spontaneous combustion in extraction boreholes.

BP neural network model
The BP neural network is a multilayer feed-forward neural network whose main feature is that the signal passes forward while the error propagates backwards 24. First, the original sample data are imported into the BPNN prediction model and the actual output is calculated. If the relative error between the actual output and the desired output does not satisfy the required error accuracy, the error is propagated backwards so that the weights and thresholds in the BPNN model are adjusted. The sample data are then imported and calculated again, gradually reducing the relative error between the actual output and the desired output until the required error accuracy is met. The calculation process of the BPNN model is shown in Fig. 1, and the main steps are as follows:
(1) Initialize the network: determine the number of nodes in the input layer, the output layer and the hidden layer of the model, the weights between the layers, and the hidden layer thresholds;
(2) Calculate the output of the hidden layer;
(3) Calculate the output variables;
(4) Calculate the error between the predicted output value and the actual output value;
(5) Continuously adjust the weights and thresholds of the network by error back-propagation training, and then carry out feed-forward training repeatedly;
(6) When the training error is less than the set error, the training ends.
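Steps (1) to (6) can be illustrated with a minimal NumPy sketch. It is not the authors' Matlab implementation: the sigmoid hidden activation, the linear output node, the mean-squared-error loss, the learning rate and the synthetic data are all assumptions chosen only to make the loop concrete, and `train_bpnn` is a hypothetical name.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpnn(X, y, n_hidden=19, lr=0.05, max_epochs=1000,
               target_mse=1e-5, seed=0):
    """One-hidden-layer network trained by batch back-propagation (sketch)."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # (1) initialise weights and thresholds (biases); the scale 0.5 is an assumption
    W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    history = []
    n = len(y)
    for _ in range(max_epochs):
        H = sigmoid(X @ W1 + b1)            # (2) hidden-layer output
        y_hat = (H @ W2 + b2).ravel()       # (3) network output
        err = y_hat - y                     # (4) error against the target
        mse = float(np.mean(err ** 2))
        history.append(mse)
        if mse < target_mse:                # (6) stop once the error target is met
            break
        # (5) propagate the error backwards and adjust weights and thresholds
        d_out = (2.0 / n) * err[:, None]
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), history

# Synthetic stand-in data: 40 samples, 9 inputs, target in [1, 3]
rng = np.random.default_rng(1)
X = rng.random((40, 9))
y = 1.0 + 2.0 * X[:, 0]
(W1, b1, W2, b2), history = train_bpnn(X, y)
print(f"MSE: {history[0]:.4f} -> {history[-1]:.4f}")
```

The loop makes one forward pass and one backward pass per epoch, so the stopping rule in step (6) is checked before each weight update.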
The BPNN model consists of three parts: input layer, hidden layer, and output layer. Its mathematical expression is as follows: where h_j is the output value; f(x) is the activation function; x_j is the input value; l_j is the connection weight of the hidden nodes; b_j is the threshold between the hidden nodes; ε is the threshold of the hidden nodes; p is the number of hidden nodes.
The input layer takes the influencing factors, and the output layer gives the result to be predicted. The number of nodes in the hidden layer is the core of the neural network topology 25. Too few hidden nodes reduce the learning ability of the network and hence the prediction accuracy, while too many increase the training time and can cause overfitting. The formulas for calculating the number of neurons in the hidden layer include the traditional formula $l = \sqrt{mn}$ and the empirical formula $l = 2n + 1$ 26,27, where l is the number of nodes in the hidden layer, n is the number of nodes in the input layer, and m is the number of nodes in the output layer. The results obtained from the two formulas are each substituted into the BPNN prediction model. The results show that the BP neural network converges well and has high prediction accuracy when the number of neurons in the hidden layer is 19. The topology of the BPNN model is therefore determined to be 9-19-1. Fifty sets of sample data that meet the conditions are selected from references 28,29. After the data are randomly shuffled, the first 40 sets are taken as training samples and the remaining 10 sets as prediction samples. The input layer of the BP neural network consists of 9 parameters, namely O2, N2, CO, CH4, CO2, C2H4, C2H6, C2H4/C2H6 and CO2/CO, and the output layer is the hazard level. Of the samples, 22 are hazard level 1, 16 are hazard level 2, and 12 are hazard level 3.
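As a quick check of the two candidate hidden-layer sizes for the 9-input, 1-output network, and of the 40/10 split, the arithmetic can be written out as below. The shuffle seed and variable names are illustrative assumptions; the original shuffling procedure is not specified.

```python
import math
import random

n_inputs, n_outputs = 9, 1

# Traditional formula l = sqrt(m * n) and empirical formula l = 2n + 1
l_traditional = math.sqrt(n_inputs * n_outputs)
l_empirical = 2 * n_inputs + 1
print(l_traditional, l_empirical)

# Shuffle the 50 sample indices, then take the first 40 for training
# and the remaining 10 for prediction (seed chosen arbitrarily)
indices = list(range(50))
random.Random(0).shuffle(indices)
train_idx, test_idx = indices[:40], indices[40:]
print(len(train_idx), len(test_idx))
```

The two formulas give 3 and 19 hidden nodes respectively; the text reports that 19 gave good convergence and accuracy.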
Hazard level 1 is a relatively safe situation. Level 2 is a relatively dangerous situation, in which the development trend of the borehole spontaneous combustion risk needs to be watched. When a level 2 warning lasts for three days or more, it is necessary to raise an alert and take preliminary fire prevention measures. Level 3 is a dangerous situation. When a level 3 warning lasts for three days or more, the borehole is in an extremely dangerous state with respect to spontaneous combustion, and intensive fire prevention measures are required. The specific sample data are shown in Table 1.
The BPNN prediction model is constructed in Matlab. The number of neurons in the input layer is set to 9, the number of neurons in the hidden layer to 19, the number of neurons in the output layer to 1, the maximum number of training iterations to 1000, the training target error to 0.00001, and the learning rate to 0.001. The results of the training and prediction of the BPNN model are shown in Figs. 2, 3 and Table 2. According to Table 2, the relative errors between the predicted and real values of the BPNN model range from 1.90% to 30.98%, a spread of 29.08%, with an average relative error of 13.73%, while the absolute errors range from 0.0242 to 0.6195, a spread of 0.5953, with an average absolute error of 0.2385. According to Fig. 2, the goodness of fit between the output values and target values of the training set, validation set and test set is high, with correlation coefficients above 0.97, indicating that the training results are valid. Figure 3 shows that the predicted values of the BPNN model follow roughly the same trend as the real data, but the errors of some predicted values are still large. Therefore, the prediction accuracy needs to be improved.
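The four performance indicators used throughout the paper can be computed as in the sketch below. The paper does not state its exact definitions, so two assumptions are made here: the relative error is taken as |predicted - real| / real, and the determination coefficient as 1 - SS_res / SS_tot. The sample values are made up for illustration and are not taken from Table 2.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Return (mean relative error in %, mean absolute error, RMSE, R^2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_pred - y_true)
    mre = float(np.mean(abs_err / np.abs(y_true)) * 100.0)
    mae = float(np.mean(abs_err))
    rmse = float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = float(1.0 - ss_res / ss_tot)
    return mre, mae, rmse, r2

# Hypothetical hazard levels and predictions (illustrative only)
real = [1, 2, 3, 2]
pred = [1.1, 1.8, 3.3, 2.0]
mre, mae, rmse, r2 = error_metrics(real, pred)
print(f"MRE={mre:.2f}%  MAE={mae:.4f}  RMSE={rmse:.4f}  R2={r2:.4f}")
```

Because the hazard levels are never zero, the relative error is always defined for this dataset.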

Particle swarm optimization
The particle swarm optimization algorithm is a stochastic search algorithm that originated in the study of the food-seeking behavior of bird flocks 30. It uses the concepts of "swarm" and "evolution", and is characterized by information sharing and co-evolution among individuals. PSO is a metaheuristic algorithm, that is, a global search optimization algorithm constructed from experience and intuitive observation. The family of metaheuristics is large and still growing; it includes genetic algorithms, ant colony algorithms, wolf pack optimization algorithms, artificial bee colony optimization algorithms, simulated annealing algorithms and others, all of which are designed to search for the global optimal solution. Among the metaheuristic algorithms, PSO is one of the more effective and widely used: on the one hand it has strong optimization ability and gives accurate results, and on the other hand its structure is simple and easy to code, so it is widely used by researchers and engineers.
The basic idea of the PSO algorithm is that each candidate solution of the problem is regarded as the position of a particle. The swarm composed of all the particles searches in a D-dimensional space 31. The direction and distance of each particle's movement are determined by its velocity, and every particle has a fitness value. The search direction and distance of a particle therefore change constantly with its velocity and fitness value. As the particle moves in the preset search space, it constantly updates its position according to the individual and global extremes obtained so far, thereby searching for an optimum in that space. The individual extreme is the best solution found by the particle itself, while the global extreme is the best solution found by the whole population. Compared with other algorithms, the BP neural network has simple initial parameter selection, strong learning ability and nonlinear mapping ability; its network structure based on error back-propagation can substantially improve prediction accuracy, and it is fault-tolerant and adaptable to the sample data. Using a particle swarm algorithm to optimize the BP neural network helps it avoid falling into local optima and improves its convergence speed. Therefore, this paper uses a particle swarm algorithm to optimize the BP neural network to predict the degree of spontaneous combustion hazard of coal in the mining area.
In the PSO, the position of particle i is $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ and its velocity is $v_i = (v_{i1}, v_{i2}, \ldots, v_{id})$, where v is the velocity of each particle and 1 ≤ d ≤ n. If the individual and global extremes at the t-th iteration are $p_{Best,i}$ and $g_{Best}$, the particle velocity and position update equations, in the standard inertia-weight form, are as follows:

$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 \left(p_{Best,id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(g_{Best,d}^{t} - x_{id}^{t}\right)$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}$$

where t is the current iteration number; r_1 and r_2 are random numbers on the interval [0, 1]; c_1 and c_2 are learning factors; and ω is the inertia weight, a parameter that balances the global and local search abilities of the population.
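The velocity and position updates can be sketched in NumPy as follows, using the standard inertia-weight form of PSO. The swarm size, iteration count, the values ω = 0.7 and c_1 = c_2 = 1.5, the search bounds and the sphere test function are illustrative assumptions, not the settings used in the paper, and `pso_minimize` is a hypothetical name.

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimise `fitness` with a basic inertia-weight particle swarm."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    p_best = x.copy()                             # individual extremes
    p_best_val = np.array([fitness(p) for p in x])
    g_idx = int(np.argmin(p_best_val))
    g_best = p_best[g_idx].copy()                 # global extreme
    g_best_val = float(p_best_val[g_idx])
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # velocity update: inertia + cognitive term + social term
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        # position update, kept inside the search bounds
        x = np.clip(x + v, lo, hi)
        vals = np.array([fitness(p) for p in x])
        improved = vals < p_best_val
        p_best[improved] = x[improved]
        p_best_val[improved] = vals[improved]
        g_idx = int(np.argmin(p_best_val))
        if p_best_val[g_idx] < g_best_val:
            g_best = p_best[g_idx].copy()
            g_best_val = float(p_best_val[g_idx])
    return g_best, g_best_val

# Usage: minimise the sphere function sum(x^2) in three dimensions
best_pos, best_val = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=3)
print(best_pos, best_val)
```

Here a lower fitness value is treated as better; a problem stated as maximization can be handled by negating the objective.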

PSO-BPNN model prediction steps
The calculation process of the PSO-BPNN model is shown in Fig. 4, and the main steps are as follows:
(1) The parameters of the PSO algorithm are initialized according to the BPNN model in the previous section. The particle swarm size, velocity, particle dimension and number of iterations are determined, and the PSO-BPNN model is established.
(2) The sample data in Table 1 are divided into two parts following the grouping used for the BPNN model, that is, the training group and the test group, and then imported into the PSO-BPNN model.
(3) The PSO-BPNN model is trained, and the fitness value of each particle is calculated. The current individual optimal position p_Best and the global optimal position g_Best are obtained from the calculated fitness values.
(4) While the global optimal position is outside the set convergence accuracy range, the fitness calculation continues, updating the individual optimal position and the global optimal position. When the global optimal position enters the convergence accuracy range, the calculation terminates.
(5) The solution with the best fitness value is assigned to the BP neural network as its weights and thresholds. The optimal solution is then obtained after the model is trained.

Table 1. Sample data 28,29.

The iterative changes of each of these models are shown in Fig. 7. The PSO-BPNN model has the best convergence performance and convergence speed, and most of the PSO-BPNN curve lies below those of GA-BPNN, SSA-BPNN and MPA-BPNN, which indicates that the model has higher solution accuracy and better global optimization ability.
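The link between a particle and the network in steps (3) and (5) can be made concrete with a short sketch. It assumes a common encoding in which each particle is a flat vector holding all weights and thresholds of the 9-19-1 network, and a fitness defined as the mean squared error on the training data (lower is better). The paper does not state its encoding or fitness function, so both are assumptions, as are the names `decode`, `predict` and `fitness`.

```python
import numpy as np

N_IN, N_HID, N_OUT = 9, 19, 1
# 9*19 input weights + 19 hidden thresholds + 19*1 output weights + 1 output threshold
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT

def decode(particle):
    """Split a flat particle vector into the network's weights and thresholds."""
    p = np.asarray(particle, dtype=float)
    i = 0
    W1 = p[i:i + N_IN * N_HID].reshape(N_IN, N_HID)
    i += N_IN * N_HID
    b1 = p[i:i + N_HID]
    i += N_HID
    W2 = p[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT)
    i += N_HID * N_OUT
    b2 = p[i:i + N_OUT]
    return W1, b1, W2, b2

def predict(particle, X):
    """Forward pass with sigmoid hidden units and a linear output (assumed)."""
    W1, b1, W2, b2 = decode(particle)
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return (H @ W2 + b2).ravel()

def fitness(particle, X, y):
    """Mean squared error of the network encoded by `particle`."""
    return float(np.mean((predict(particle, X) - y) ** 2))

# Synthetic stand-in for the 40 training samples
rng = np.random.default_rng(0)
X_train = rng.random((40, 9))
y_train = np.ones(40)
zero_particle = np.zeros(DIM)
print(DIM, fitness(zero_particle, X_train, y_train))
```

Under this encoding each particle has 210 dimensions, and the best particle found by the swarm would be decoded and used as the starting weights and thresholds for back-propagation training.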
The analysis duration of each model was recorded and is given in Table 4. Among the four models PSO-BPNN, GA-BPNN, SSA-BPNN and MPA-BPNN, the analysis duration of PSO-BPNN is the shortest, which indicates that its computation speed is better than that of the other models.
The absolute and relative errors calculated by each model are compared in Figs. 8 and 9, respectively. The comparison shows that both the maximum and the minimum relative errors of the PSO-BPNN model predictions are smaller than those of the other models. The performance indexes of each model are compared in Table 5. The average relative error of the BPNN model is 13.73%, the average absolute error is 0.2385, the root mean square error is 0.2990, and the coefficient of determination is 0.8705; the average relative error of the GA-BPNN model is 7.57%, the average absolute error is 0.1280, the root mean square error is 0.1755, and the coefficient of determination is 0.9554; the SSA-BPNN model has an average relative error of 10.08%, an average absolute error of 0.1498, a root mean square error of 0.2034, and a coefficient of determination of 0.9400; the MPA-BPNN model has an average relative error of 7.88%, an average absolute error of 0.1539, a root mean square error of 0.2059, and a coefficient of determination of 0.9386; while the PSO-BPNN model has an average relative error of 4.38%, an average absolute error of 0.0678, a root mean square error of 0.0934, and a coefficient of determination of 0.9874. Compared with the BPNN model, the average relative error, average absolute error and root mean square error of the PSO-BPNN model are reduced by 9.35%, 0.1707 and 0.2056, respectively, and the coefficient of determination is increased by 0.1169; compared with the GA-BPNN model, the average relative error, average absolute error and root mean square error of the PSO-BPNN model decreased by 3.19%, 0.0602 and 0.0821, respectively, and the

Conclusions
The PSO-BPNN model was constructed to predict the spontaneous combustion risk of gas extraction boreholes. The main conclusions, obtained by comparison with the prediction results of the BPNN model, GA-BPNN model, SSA-BPNN model and MPA-BPNN model, are as follows:

https://doi.org/10.1038/s41598-023-45806-9

Figure 2. Training regression state curve of BPNN model.

Figure 7. Changes in fitness of each model.

Figure 10. Comparison of average relative errors in predicted results of each extraction borehole.

Table 2. Comparison between real values and predicted values of BPNN model.

Table 3. Comparison between predicted values and real values of PSO-BPNN model.

Table 4. Duration of analysis for each model.

Table 5. Comparison of performance indicators of different models.