Identification of mine water sources using a multi-dimensional ion-causative nonlinear algorithmic model

Zheng, Qiushuang; Wang, Changfeng; Yang, Yang; Liu, Weitao; Zhu, Ye

doi:10.1038/s41598-024-53877-5

Article
Open access
Published: 08 February 2024

Scientific Reports volume 14, Article number: 3305 (2024)

Abstract

Based on nonlinear algorithmic theory, an R-SVM water source discrimination model and prediction method were established. A Piper diagram was first used to compare qualitatively the differences in ionic composition, and R-type factor analysis was then used to reduce the dimension of the input indicators. Taking mine water samples from the Zhaogezhuang Coal Mine as an example, six indexes (Na⁺, Ca²⁺, Mg²⁺, Cl^–, SO₄^2– and HCO₃^–) were selected as discrimination factors on the basis of chemical composition analyses of water samples from different monitoring points. According to the water characteristics of each aquifer and the practical needs of discrimination, the water inrush sources in the mining area were divided into four categories: goaf water (class I), Ordovician carbonate (class II), sandstone fracture water from the 13 coal system (class III), and sandstone fracture water from the 12 coal system (class IV). Of the typical water inrush samples, 56 were used as training samples and 11 as prediction samples, with typical ion contents as the input and water source type as the output. The R-SVM discriminant analysis model was implemented with SPSS Statistics and MATLAB; it automatically establishes the mapping between water quality indexes and evaluation standards, allowing rapid and accurate discrimination of water sample data. In the verification example for the Zhaogezhuang mine, the classification accuracy of the R-SVM model was 90.90%. The coupled model therefore shows high accuracy, good applicability and discriminant ability, and offers guidance for the prevention and control of water damage and related field work.

Introduction

With economic and social development, the demand for mineral resources is steadily rising. Mineral resources are the indispensable material foundation of human production activities^1,2,3. Over the years, their development and utilization have shifted the focus of mining, moving coal mines toward the extraction of complex, hard-to-mine bodies such as deep orebodies, broken soft orebodies, alpine orebodies, low-grade orebodies, and “three lower and one upper” orebodies⁴. As mining intensity and depth increase, extraction within complex geological structures becomes more challenging and engineering problems multiply. Among these challenges, mine water disasters are a prominent threat to mining operations. Timely and precise identification of the water source category is therefore an essential prerequisite for averting water-related disasters and provides a scientific basis for swift rescue and management^5,6.

Water chemistry data plays a crucial role in understanding the fundamental characteristics of aquifers and is vital for discriminating water sources⁷. Qualitative and quantitative methods are commonly employed to analyze water chemistry information for this purpose. Qualitative analysis, combined with water level dynamics, provides a rough determination of the syncline level. Piper's trilinear water chemistry analysis, on the other hand, is a convenient and visual tool for water quality classification and ion distribution⁸. The modified D-Piper trilinear diagram provides a solution for the challenge of visualizing ion distribution in large data sets⁹, leading to improved visualization and interpretation with an increase in data points. In addition, it is crucial to consider physicochemical information such as isotopes and radioactive elements in water bodies to reflect the essential characteristics and historical evolution of hydrogeology. The hydrogeochemical distribution, recharge sources, indicator tests, influencing factors, and evolutionary laws are analyzed based on conventional water chemistry, trace elements, and isotopes of the aquifer¹⁰. Gibbs’ semi-qualitative model¹¹ is employed to analyze the hydration types of surface water and shallow groundwater, providing insights into the controlling factors, formation mechanisms, and recharge sources of isotopes in various aquifers. This analysis reveals the distinct weathering and hydration characteristics of different water bodies. However, qualitative methods alone face limitations in similar aquifers due to the ambiguous relationship between indicators, overlapping water quality characteristics, and unclear distribution boundaries¹². 
To overcome these limitations, quantitative analysis¹³ is used to uncover the inherent laws of water chemistry data, establish mathematical models for determining water source types, clarify the connection between water quality indicators and determination criteria, and reduce the errors associated with qualitative methods. Fisher discriminant functions based on fuzzy clustering and factor analysis^14,15 and Bayes classification of water sources^16,17 have been used to determine the sources of water inrush in mining areas, with improved discrimination accuracy. Because mine geological structures vary, hydrogeological characteristics are complex, and mining conditions are diverse, groundwater is subject to multiple coupled factors, which produces fuzzy connections and complex nonlinear relationships between water quality indicators and discrimination criteria. However, studies of models that simplify the indexes through dimensionality reduction are limited, and redundant information among water chemical components reduces discrimination accuracy, so the discrimination model requires further optimization.

This study addresses water quality assessment with an approach that combines qualitative and quantitative analysis. A key contribution is the use of the Piper trilinear diagram to analyze, through point mapping, the variation pattern of ionic composition in the aquifers and their water chemistry characteristics. By comparing the differences in ionic composition among aquifers and evaluating the proximity to the target water body, an initial classification of water quality is established. This fills a gap in existing research on mining the internal information of risk factors with machine learning, and provides a foundation for subsequent quantitative water source discrimination. To this end, a coupled discrimination model integrating the R-factor and the Support Vector Machine is developed to uncover inherent characteristics within water chemistry data and automatically establish the mapping between water quality indices and evaluation criteria. This approach enables precise identification of water source types and provides guidance for water damage control in practical engineering applications.

Theoretical basis

Principle of R-factor dimensionality reduction

Suppose there are m test variables $Z_{i} \;(i = 1,2,3, \cdots ,m)$, which may be correlated. Each $Z_{i}$ depends on p independent common factors $f_{j} \;(j = 1,2, \cdots ,p)$, with $p \le m$, and on one unique factor. The m unique factors $u_{1} ,u_{2} ,u_{3} , \cdots ,u_{m}$ are mutually uncorrelated, and u and f are mutually uncorrelated. Each Z can be expressed linearly in terms of f and u as¹⁸:

$$\left\{ {\begin{array}{*{20}l} {Z_{1} = a_{11} f_{1} + a_{12} f_{2} + \cdots + a_{1p} f_{p} + c_{1} u_{1} } \hfill \\ {Z_{2} = a_{21} f_{1} + a_{22} f_{2} + \cdots + a_{2p} f_{p} + c_{2} u_{2} } \hfill \\ \vdots \hfill \\ {Z_{m} = a_{m1} f_{1} + a_{m2} f_{2} + \cdots + a_{mp} f_{p} + c_{m} u_{m} } \hfill \\ \end{array} } \right..$$

(1)

Expressed as matrix:

$$\left( {\begin{array}{*{20}c} {Z_{1} } \\ {Z_{2} } \\ \vdots \\ {Z_{m} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {a_{11} } & {a_{12} } & \cdots & {a_{1p} } \\ {a_{21} } & {a_{22} } & \cdots & {a_{2p} } \\ \vdots & \vdots & \ddots & \vdots \\ {a_{m1} } & {a_{m2} } & \cdots & {a_{mp} } \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {f_{1} } \\ {f_{2} } \\ \vdots \\ {f_{p} } \\ \end{array} } \right) + \left( {\begin{array}{*{20}c} {c_{1} u_{1} } \\ {c_{2} u_{2} } \\ \vdots \\ {c_{m} u_{m} } \\ \end{array} } \right).$$

(2)

Abbreviated as:

$$Z = A \cdot F + C \cdot U.$$

(3)

The factor analysis method lies in replacing Z by F through Eqs. (2) and (3), conditioned on $p < m$, which can streamline the number of dimensions to reduce redundancy. The specific steps are¹⁹:

(1) Construct the sample matrix and perform the correlation test.

Collect the p-dimensional random variable $X = (x_{1} ,x_{2} , \cdots x_{p} )^{T}$ and construct the sample matrix:

$$X = \left[ {\begin{array}{*{20}c} {x_{1}^{T} } \\ {x_{2}^{T} } \\ \vdots \\ {x_{n}^{T} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x_{11} } & {x_{12} } & \cdots & {x_{1p} } \\ {x_{21} } & {x_{22} } & \cdots & {x_{2p} } \\ \vdots & \vdots & \vdots & \vdots \\ {x_{n1} } & {x_{n2} } & \cdots & {x_{np} } \\ \end{array} } \right].$$

(4)

The KMO or Bartlett test is used to check the correlation among the variables. If the correlation coefficients are below 0.3, dimensionality reduction is not meaningful; if the correlation is strong, the common part of the variables can be extracted and the data are suitable for factor analysis.

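(2) Standardize the data to obtain the standardized matrix.

Each entry is centered by the mean $\bar{x}_{j}$ of its variable and scaled by the standard deviation $s_{j}$ of that variable:

$$z_{ij} = \frac{{x_{ij} - \bar{x}_{j} }}{{s_{j} }}\quad (i = 1,2, \cdots ,n;\;j = 1,2, \cdots ,p).$$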

(5)

The standardized matrix is obtained:

$$Z = \left[ {\begin{array}{*{20}c} {z_{1}^{T} } \\ {z_{2}^{T} } \\ \vdots \\ {z_{n}^{T} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {z_{11} } & {z_{12} } & \cdots & {z_{1p} } \\ {z_{21} } & {z_{22} } & \cdots & {z_{2p} } \\ \vdots & \vdots & {} & \vdots \\ {z_{n1} } & {z_{n2} } & \cdots & {z_{np} } \\ \end{array} } \right].$$

(6)

(3) Calculate the correlation matrix.

The correlation coefficient matrix is obtained as follows:

$$R = \left[ {r_{ij} } \right]_{p \times p} = \frac{{Z^{T} Z}}{n - 1}.$$

(7)

In addition,

$$r_{j}^{2} = \frac{{\sum\limits_{i = 1}^{n} {(z_{ij} - \bar{z}_{j} )^{2} } }}{n - 1}\quad (j = 1,2, \cdots ,p).$$

(8)

The correlation calculation is performed on the standardized matrix Z. The eigenvalues and eigenvectors of R are obtained from the characteristic equation $|R - \lambda I_{p} | = 0$, and the common factors are then extracted so that their cumulative contribution covers more than 85% of the information.

(4) Calculate the factor load matrix, rotate the load matrix, and obtain the matrix U.

$$U = \left[ {\begin{array}{*{20}c} {u_{1}^{T} } \\ {u_{2}^{T} } \\ \vdots \\ {u_{n}^{T} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {u_{11} } & {u_{12} } & \cdots & {u_{1p} } \\ {u_{21} } & {u_{22} } & \cdots & {u_{2p} } \\ \vdots & \vdots & {} & \vdots \\ {u_{n1} } & {u_{n2} } & \cdots & {u_{np} } \\ \end{array} } \right].$$

(9)

Here $u_{i}$ is the principal component vector of the i-th sample, and $u_{ij}$ is the projection of that vector on the unit eigenvector.
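Steps (2) and (3) and the 85% retention rule can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the SPSS procedure used in the paper: the function names are hypothetical, the sample standard deviation (divisor n − 1) is assumed so that it matches Eq. (7), and the eigenvalues passed to `n_factors` are assumed to come from a separate linear algebra routine.

```python
import math

def standardize(X):
    """Z-score each column of X (n rows, p columns), in the spirit of Eq. (5)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Assumption: sample standard deviation (divisor n - 1), matching Eq. (7).
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
            for j in range(p)]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in X]

def correlation_matrix(Z):
    """R = Z^T Z / (n - 1) for an already standardized matrix Z, as in Eq. (7)."""
    n, p = len(Z), len(Z[0])
    return [[sum(Z[i][a] * Z[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

def n_factors(eigenvalues, threshold=0.85):
    """Smallest number of leading eigenvalues whose cumulative share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= threshold:
            return k
    return len(ordered)
```

For example, `n_factors` applied to hypothetical eigenvalues 3.0, 2.0, 0.8 and 0.2 keeps three factors, because the first two explain only about 83% of the total.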

Support vector machine principle

The Support Vector Machine (SVM) simplifies complex problems by establishing nonlinear mapping relationships and is well suited to nonlinear complex systems. By performing inner product operations in the transformed space, it automatically establishes the mapping between water quality indicators and evaluation criteria, and thereby assigns the predicted samples to their categories. The principle is shown in Fig. 1.

The support vector machine consists of three parts: an input layer, an intermediate inner product kernel function layer, and an output layer. The water source discriminants $X_{1} ,X_{2} ,X_{3} , \cdots ,X_{n}$, which represent the sample feature information, are input into the Support Vector Machine model, and the intermediate inner product kernel function layer maps them into a high-dimensional space to seek the optimal solution. The specific form of the mapping need not be considered during this transformation, and after the nonlinear transformation the discriminated water source type is output in the output layer²⁰.

The procedure of SVM classification operation is as follows^21,22:

① Determine the input sample variables $\{ x_{i} \} \subset X = R^{n}$ and the output variable $y_{i} \in Y = \{ 1, - 1\}$.
② Select the optimal combination of parameters, where the kernel function is $K\left( {x_{i} ,x} \right) = \varphi \left( {x_{i} } \right) \cdot \varphi \left( x \right)$.
③ Solve $\min \frac{1}{2}\sum\limits_{i = 1}^{L} {\sum\limits_{j = 1}^{L} {\alpha_{i} } } \alpha_{j} y_{i} y_{j} K\left( {x_{i} ,x_{j} } \right) - \sum\limits_{i = 1}^{L} {\alpha_{i} }$ subject to the constraints.
④ Obtain the optimal solution $\alpha^{*} = (\alpha_{1} ,\alpha_{2} ,\alpha_{3} , \cdots ,\alpha_{L} )$ from the above calculation.

After the dimension is raised by an assumed nonlinear mapping $\varphi :R^{d} \to H$, the optimization problem can be transformed into:

$$\begin{gathered} \mathop {\min }\limits_{w,b} \frac{{\left\| w \right\|^{2} }}{2} \hfill \\ {\text{s.t.}}\;y_{i} (w \cdot \varphi (x_{i} ) + b) \ge 1,\;i = 1,2, \cdots ,l. \hfill \\ \end{gathered}$$

(10)

Introducing Lagrange multipliers yields:

$$L(w,b,\alpha ) = \frac{1}{2}\left\| w \right\|^{2} - \sum\limits_{i = 1}^{l} {\alpha_{i} } \left[ {y_{i} \left( {w \cdot \varphi (x_{i} ) + b} \right) - 1} \right].$$

(11)

The dual objective function is:

$$\left\{ {\begin{array}{*{20}l} {\mathop {\max }\limits_{\alpha } \sum\limits_{i = 1}^{l} {\alpha_{i} } - \frac{1}{2}\sum\limits_{i = 1}^{l} {\sum\limits_{j = 1}^{l} {\alpha_{i} } } \alpha_{j} y_{i} y_{j} K(x_{i} ,x_{j} )} \hfill \\ {{\text{s.t.}}\;\sum\limits_{i = 1}^{l} {\alpha_{i} } y_{i} = 0} \hfill \\ {\alpha_{i} \ge 0,\;i = 1,2, \cdots ,l} \hfill \\ \end{array} } \right..$$

(12)

$K(x_{i} ,x) = \varphi (x_{i} ) \cdot \varphi (x)$ is a kernel function that maps the data implicitly before learning. The classification decision function is then obtained as:

$$f\left( x \right) = {\text{sgn}} \left( {\sum\limits_{i = 1}^{l} {y_{i} } \alpha_{i} K\left( {x,x_{i} } \right) + b} \right).$$

(13)

With the penalty factor C and the slack variables $\xi_{i} \;(\xi_{i} \ge 0)$ introduced, the soft-margin problem is optimized as:

$$\begin{gathered} \min \frac{1}{2}\sum\limits_{i = 1}^{l} {\sum\limits_{j = 1}^{l} {\alpha_{i} } } \alpha_{j} y_{i} y_{j} K(x_{i} ,x_{j} ) - \sum\limits_{i = 1}^{l} {\alpha_{i} } \hfill \\ {\text{s.t.}}\;\sum\limits_{i = 1}^{l} {\alpha_{i} } y_{i} = 0,\quad 0 \le \alpha_{i} \le C,\;i = 1,2, \cdots ,l. \hfill \\ \end{gathered}$$

(14)

The optimal decision function can be obtained as:

$$f(x) = {\text{sgn}} \left( {\sum\limits_{i = 1}^{l} {\alpha_{i} } y_{i} K(x_{i} ,x) + b} \right).$$

(15)
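Once the multipliers and the bias are known, the decision function in Eq. (15) is a short weighted sum over the support vectors. The sketch below is illustrative only: the Gaussian (RBF) kernel is an assumption, since the kernel is not specified at this point in the text, and the function names and toy values are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2). Assumed kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decide(x, support_vectors, labels, alphas, b, gamma):
    """Evaluate sgn(sum_i alpha_i * y_i * K(x_i, x) + b), as in Eq. (15)."""
    score = sum(a * y * rbf_kernel(sv, x, gamma)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1

# Toy usage with two hypothetical support vectors, one per class.
svs = [(0.0, 0.0), (2.0, 0.0)]
ys = [-1, 1]
alphas = [1.0, 1.0]
label_near_second = decide((1.9, 0.1), svs, ys, alphas, b=0.0, gamma=0.5)
label_near_first = decide((0.1, 0.0), svs, ys, alphas, b=0.0, gamma=0.5)
```

A point close to the positive support vector receives the label +1 and a point close to the negative one receives −1, because the kernel weight decays with squared distance. The four-class water source problem would need several such binary classifiers combined, which this sketch does not cover.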

Optimal parameter solving

In this paper, the grid search method is used to find the optimal parameters. A fixed-step grid search²³ is adopted: a brute-force search that combines coarse and fine stages, starts with a large step size in the search space, and cyclically enumerates every combination of candidate points. The value ranges of c and g are set to [2–10]. The process and principle of the optimization search are shown in Fig. 2.

The support vector machine steps for the optimization of the grid search method are as follows^24,25:

(1) Create a coordinate grid. Set $X = \left[ {\begin{array}{*{20}c} {X_{1} ,X_{2} } \\ \end{array} } \right]$, $Y = \left[ {\begin{array}{*{20}c} {Y_{1} ,Y_{2} } \\ \end{array} } \right]$. Set up the training learner, choose the step size L, enter the parameter search range, and take the grid parameter nodes ${\text{c}} = 2X$, $g = 2Y$.
(2) Use K-fold cross-validation to find the classification accuracy. The samples are divided into N subsets; in each round 1 subset serves as the test set and the remaining N-1 subsets serve as the training set, which is used for model building. The accuracy evaluation method then gives the classification accuracy corresponding to each set of parameters.
(3) Traverse the coordinate grid. Among all the traversed parameter combinations, the one with the smallest mean square error, that is, the (c, g) combination with the highest classification accuracy, is selected to obtain the optimal trainer, and the accuracy of the optimal trainer is output.
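The three steps above can be sketched as a generic grid search with K-fold cross-validation. This is a minimal sketch under stated assumptions, not the MATLAB routine used in the paper: `fit_predict` is a hypothetical callable that trains a classifier with parameters (c, g) and returns predicted labels, and the base-2 exponent grid (c = 2**a, g = 2**b) is an assumed convention rather than something confirmed by the text.

```python
from itertools import product

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k disjoint folds of near-equal size."""
    return [list(range(i, n_samples, k)) for i in range(k)]

def cross_val_accuracy(fit_predict, X, y, k, c, g):
    """Mean accuracy over k folds; each fold serves as the test set once."""
    accuracies = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predictions = fit_predict([X[i] for i in train_idx],
                                  [y[i] for i in train_idx],
                                  [X[i] for i in test_idx], c, g)
        hits = sum(p == y[i] for p, i in zip(predictions, test_idx))
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k

def grid_search(fit_predict, X, y, exponents, k=5):
    """Return (best_accuracy, c, g) over the assumed grid c = 2**a, g = 2**b."""
    best = None
    for a, b in product(exponents, repeat=2):
        c, g = 2.0 ** a, 2.0 ** b
        acc = cross_val_accuracy(fit_predict, X, y, k, c, g)
        if best is None or acc > best[0]:
            best = (acc, c, g)
    return best
```

Ties are resolved in favour of the first grid node visited, and the number of samples must be at least k so that no fold is empty. A coarse pass with widely spaced exponents followed by a finer pass around the best node would mirror the coarse-and-fine strategy described in the text.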

Analysis of water information

Hydrogeologic conditions in the study area

The coal seams in the Zhaogezhuang Coal mine are predominantly distributed within the Upper Taiyuan Formation (Zhaoge Formation) of the Shanxi Formation (Da Miaozhuang Formation). The presence of faults on the eastern, western, southern, and northern boundaries has resulted in the uplift and exposure of the Ordovician limestone due to tectonic activity. This faulting has led to the development of intense structural karst. Consequently, the gently inclined limestone has formed troughs, and a robust karst development zone has emerged along the eastern boundary fault of the Kaiping block. The overlying Quaternary loose layers exhibit coarse particle size, exceptional permeability, and high water content, serving as a prominent conduit for groundwater movement and constituting the primary strong runoff zone in the regional groundwater system. The hydrodynamic forces are notably strong, displaying characteristics of concentrated conduit flow. Furthermore, a portion of the groundwater in the eastern part of the Shahe River basin in the Zhaogezhuang mine infiltrates the field's interior through the Leizhuang fault, with groundwater flowing from the northeast to the southwest.

The Zhaogezhuang Coal Mine has five major aquifer systems ranging from the Cambrian to the Quaternary: the Cambrian aquifer, the Ordovician limestone aquifer, the coal-bearing formation sandstone aquifer, the Tangshan limestone aquifer, and the Quaternary alluvial aquifer. The Quaternary alluvial aquifer in the study area is relatively thin and has minimal impact on coal mining operations. In contrast, the Cambrian aquifer predominantly interacts with the Ordovician aquifer. Consequently, the Ordovician aquifer plays a pivotal role in water influx incidents within the study area, particularly in cases of deep water influx. The principal contributors to these occurrences are the Ordovician limestone aquifer and the coal-bearing sandstone aquifers within the coal-bearing rock series. To maximize the differentiation of water source types, the study selected the six most widely distributed ions in groundwater as discriminative indexes^26,27: Na⁺, Ca²⁺, Mg²⁺, Cl^–, SO₄^2– and HCO₃^–. K⁺ was combined with Na⁺ due to their low variation range.

Data index extraction and collection

For data selection, deep mining at the Zhaogezhuang mine was threatened primarily by Ordovician carbonate from the Ordovician aquifer, followed by goaf water damage and sandstone water damage. As a result, four water sample types were chosen: goaf water (from the I aquifer), Ordovician carbonate (from the II aquifer), sandstone fracture water from the 13 coal system (from the III aquifer), and sandstone fracture water from the 12 coal seam (from the IV aquifer section A). To screen the typical water sample data, 67 groups were selected from 19 boreholes based on the anion and cation balance test and the hydrogeological data of Zhaogezhuang. Among these groups, 18 were goaf water, 13 Ordovician carbonate, 17 sandstone fracture water from the 13 coal seam, and 19 sandstone fracture water from the 12 coal seam. The four water sample sources are denoted I, II, III, and IV, respectively. The water samples were submitted to the Testing and Analysis Center of Hebei Coalfield Geology Bureau for chemical analysis. The water quality testing report provided the main ions and the total hardness (TH), determined by ion chromatography. The bicarbonate ion (HCO₃^–) and total alkalinity (TA) were determined by titration with dilute sulfuric acid and methyl orange, and the pH value was measured with a pH tester. The data on the nine discriminant indices of the mine water were then organized and are presented in Table 1 (attached).

Table 1 67 groups of water chemistry data.

Of the 67 sets of typical water sample data collected from the Zhaogezhuang mining area, 56 were used as training samples for the learning machine, as shown in Table 2 (attached), while the remaining 11 sets were reserved as test samples, labeled G1 to G11, as presented in Table 3. The distribution of anion and cation content is illustrated with three-dimensional diagrams: the cation content distribution is depicted in Fig. 3 and the anion content distribution in Fig. 4.

Table 2 Training sample data.

Table 3 Forecast sample data.

Water chemistry characterization

Analysis of statistical characteristic values

The statistical characteristic values of water chemistry were calculated and analyzed from the water chemistry data of the 67 groups of water samples from the Zhaogezhuang mine. In the study area, the goaf water differs markedly from the other three types of water samples in ionic composition. The most abundant anion in the goaf water is SO₄^2–, at 78.022 mmol·L^–1, whereas in the other water samples it is HCO₃^–. Goaf water is therefore easier to identify than the other three sources: a sample whose dominant anion is SO₄^2– can be initially classified as goaf water. Among the cations, Ca²⁺ has the highest content in all four types of water samples. In addition, across all the water sample data, the contents of Ca²⁺ and HCO₃^– are higher than those of the other ions, which indicates that Ca²⁺ and HCO₃^– have strong recognition ability.

The goaf water

The hydrochemical indexes of goaf water are shown in Table 4. The water chemical compositions of the four water sample types from Zhaogezhuang differ significantly, and their concentrations are related to the water source cycle. In the goaf water, SO₄^2– has the highest concentration among the anions, ranging from 60.47 mmol·L^–1 to 85.55 mmol·L^–1 and accounting for 78% of the anions, followed by HCO₃^–; Cl^– has the lowest concentration. The cations are mainly Ca²⁺ and Mg²⁺, with Na⁺ the lowest. The coefficient of variation, the ratio of the standard deviation to the mean, indicates the degree of dispersion of the data. The coefficient of variation of Cl^– is the largest at 0.9, followed by that of Na⁺ at 0.41, and the rest are smaller, indicating poor uniformity of ion concentration in the water.
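The coefficient of variation described above is a one-line computation. The helper below is a small sketch with a hypothetical function name; whether the paper used the sample or the population standard deviation is not stated, so both are offered and the default is an assumption.

```python
import statistics

def coefficient_of_variation(values, sample=True):
    """Standard deviation divided by the mean of the values."""
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation by default; the source does not say which.
    spread = statistics.stdev(values) if sample else statistics.pstdev(values)
    return spread / mean
```

A value near 0 means the concentrations of an ion are tightly clustered across samples, while a value near 1, such as the 0.9 reported for Cl^–, means the spread is comparable to the mean itself.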

Table 4 Hydrochemical index of goaf water.

Ordovician carbonate

The hydrochemical indexes of Ordovician carbonate are shown in Table 5. The pH of Ordovician carbonate is 7.30–7.94, which is weakly alkaline. The anions are mainly HCO₃^– and SO₄^2–, which together account for 86.6% of the anions. The cation concentrations follow the order Ca²⁺ > Mg²⁺ > Na⁺, with Ca²⁺ and Mg²⁺ accounting for 92.88%, and the water chemistry type is Ca-Mg-HCO₃. The coefficients of variation of Ordovician carbonate follow the order SO₄^2– > Cl^– > Na⁺ > Mg²⁺ > HCO₃^– > Ca²⁺, and all six are less than 0.5. Overall, the coefficients of variation of the anions Cl^–, SO₄^2– and HCO₃^– are greater than those of the cations Na⁺, Mg²⁺ and Ca²⁺.

Table 5 Hydrochemical index of Ordovician carbonate.

Sandstone fracture water from the 13 coal system

The hydrochemical indexes of sandstone fracture water from the 13 coal system are shown in Table 6. Among the anions in the fracture water of the 13 coal sandstone, HCO₃^– has the highest concentration, up to 79.58 mmol·L^–1, while the contents of SO₄^2– and Cl^– are lower; among the cations, Ca²⁺ has the highest concentration, followed by Mg²⁺. The coefficients of variation of the 13 coal system sandstone fracture water differ little except for Na⁺, which is less than 0.1, and the ion concentrations are distributed fairly uniformly.

Table 6 Hydrochemical index of sandstone fracture water from the 13 coal system.

Sandstone fracture water from the 12 coal system

The anions in the fracture water of the 12 coal seam sandstone are mainly HCO₃^–, with a mean concentration of 71.79 mmol·L^–1. The cations are dominated by Ca²⁺, up to 64.36 mmol·L^–1, followed by Mg²⁺ with a mean concentration of 32.57 and finally Na⁺. The variation coefficients of sandstone fracture water in the 12 coal seam follow the order Mg²⁺ > Ca²⁺ > Na⁺ > Cl^– > HCO₃^–, and the variation coefficient of Mg²⁺ is as high as 0.69.

The hydrochemical indexes of sandstone fracture water from the 12 coal system are shown in Table 7. To study the hydraulic connection between individual aquifers, the degree of connection K between them can be calculated quantitatively^28,29. Because the Cl^– concentration is minimally disturbed by other factors and is mainly influenced by the formation itself, the degree of hydraulic connection between two aquifers can be obtained from the difference between their average Cl^– concentrations. If K is less than 0.2, the two aquifers have a strong hydraulic connection; if K is between 0.2 and 0.4, the connection is moderate; and if K is greater than 0.4, the connection is weak^30,31.

$$K = 0.5 \times \frac{{Cl_{1} - Cl_{2} }}{{(Cl_{1} + Cl_{2} )}}.$$

(16)

Table 7 Hydrochemical index of sandstone fracture water from the 12 coal system.

In Eq. (16), Cl₁ is the average Cl^– concentration in aquifer 1 and Cl₂ is the average Cl^– concentration in aquifer 2.

According to Eq. (16), the K values between Ordovician carbonate and each of the other three sources (goaf water, sandstone fracture water of the 13 coal system, and sandstone fracture water of the 12 coal system) are all 0.25, so the degree of hydraulic connection is moderate. The K value between the goaf water and the sandstone fracture water of the 13 coal system is 0.025, and that between the goaf water and the 12 coal seam sandstone fracture water is 0.03; both are below 0.2 and, by the criterion above, indicate a strong hydraulic connection. The K value between the 13 coal system and the 12 coal seam sandstone fracture waters is 0.001, which indicates a very strong hydraulic connection. In summary, the goaf water is hydraulically connected with the other aquifers, which increases the difficulty of discrimination.
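Eq. (16) and the thresholds quoted from the literature translate directly into two small functions. This is an illustrative sketch with hypothetical function names; taking the absolute value of the difference is an assumption, made so that the result does not depend on which aquifer is labelled 1.

```python
def connection_degree(cl_1, cl_2):
    """K = 0.5 * |Cl1 - Cl2| / (Cl1 + Cl2), following Eq. (16).

    The absolute value is an assumption; Eq. (16) is printed without it.
    """
    return 0.5 * abs(cl_1 - cl_2) / (cl_1 + cl_2)

def connection_strength(k):
    """Map K to the qualitative classes stated in the text."""
    if k < 0.2:
        return "strong"
    if k > 0.4:
        return "weak"
    return "moderate"
```

With hypothetical average Cl^– concentrations of 3.0 and 1.0, K is 0.25, which falls in the moderate band. Under this definition K cannot exceed 0.5, a value reached only when one of the two concentrations is zero.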

Piper trilinear diagram analysis

The hydrogeological conditions in Zhaogezhuang Coal Mine are characterized by complexity and variability. As demonstrated by the previous analysis of the goaf water composition and other water sources, they exhibit distinguishable differences. To further investigate the distribution patterns of aquifer water samples, the Piper trilinear diagram method was employed for analysis. The ion contents were represented as points on the diagram, allowing for inference of the water chemistry type and quality pattern of the aquifer based on the scatter position of the water samples.

The water samples of the study area were plotted on a Piper trilinear diagram for hydrochemical analysis, as shown in Fig. 5. The goaf water plots in the upper right corner, near Ca²⁺, Mg²⁺ and SO₄^2–, Cl^–; it is mainly of the Ca·Mg-Cl·SO₄ type, with individual samples of the Ca·Mg-SO₄ type. The Ordovician carbonate water samples plot on the left of the diamond-shaped area, and the water quality type is Ca·Mg-HCO₃. The left triangle shows that the cations in the Ordovician carbonate samples are mainly Mg²⁺ and Ca²⁺, and the right triangle shows that the anions are mainly HCO₃^– and SO₄^2–. Sandstone fracture water from the 13 coal system plots in the middle-left position; the cations plot mainly near Ca²⁺ and Mg²⁺, the anions are scattered among the end members with high proportions of HCO₃^– and SO₄^2–, and the water quality type is Ca·Mg-HCO₃. Sandstone fracture water samples from the 12 coal system are highly similar to those from the 13 coal system in the trilinear diagram: the water chemistry type is Ca·Mg-HCO₃, the cations are mainly Ca²⁺ and Mg²⁺, and the anions are mainly HCO₃^– and CO₃^2–. In summary, the water quality types of Ordovician carbonate and of sandstone fissure water from the 13 and 12 coal seams are the same, with overlapping characteristics and indistinct distribution boundaries, so further quantitative discrimination is needed.

Model building and application

Dimensionality reduction based on R-factor

Before the operation, the data are normalized to the interval [0, 1] to make the indicators comparable and to ensure the stability of the calculation. The normalized water sample data are shown in Table 8 (attached).
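One common way to map an indicator into [0, 1] is min-max scaling. The text does not state which formula was used, so the sketch below is an assumption, and the function name is hypothetical.

```python
def min_max_normalize(column):
    """Scale one indicator column linearly so its minimum is 0 and its maximum is 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros (a convention).
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]
```

Each of the six ion indicators would be scaled separately, so that an ion with large absolute concentrations does not dominate one with small concentrations.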

Table 8 Normalization of water sample data.

There is a non-linear association between the indicators. To reduce the correlation between the data, the optimal number of common factors for the six indicators (sodium, calcium, magnesium, chloride, sulfate, and bicarbonate ions) was determined to be 3, denoted Y1, Y2, and Y3. SPSS software was used to analyze the 67 groups of samples and 6 evaluation indicators of Zhaogezhuang following the correlation calculation steps of R-type factor analysis. The eigenvalues and contribution rates of the main factors are listed in Table 9.

Table 9 Characteristic values and contribution rates of main factors.

The cumulative contribution rate of the first three principal factors reaches 96.660%, which indicates that the factors extracted by dimensionality reduction contain 96.660% of the information of the original index data. When the cumulative contribution rate reaches 80%, it shows that the extracted principal factors are reasonable and effective, which indicates that these three principal factors cover most of the water chemistry information and can effectively replace the original indexes.

The correlation matrix of the six indicators is as follows:

$$A = \left[ {\begin{array}{*{20}c} {1.000} & { - 0.416} & { - 0.167} & {0.231} & { - 0.104} & {0.080} \\ { - 0.416} & {1.000} & { - 0.799} & {0.393} & { - 0.362} & {0.286} \\ { - 0.167} & { - 0.799} & {1.000} & { - 0.589} & {0.480} & { - 0.379} \\ {0.231} & {0.393} & { - 0.589} & {1.000} & { - 0.866} & {0.791} \\ { - 0.104} & { - 0.362} & {0.480} & { - 0.899} & {1.000} & { - 0.987} \\ {0.080} & {0.286} & { - 0.379} & {0.791} & { - 0.987} & {1.000} \\ \end{array} } \right].$$

(17)

An absolute correlation coefficient above 0.8 indicates a strong correlation, one between 0.3 and 0.8 a moderate correlation, and one below 0.3 no correlation. The correlation coefficient between Na⁺ and Ca²⁺ is −0.416, a moderate correlation, while those between Na⁺ and Mg²⁺ (−0.167), Cl^– (0.231), SO₄^2– (−0.104), and HCO₃^– (0.080) all indicate no correlation. The correlation coefficient between Ca²⁺ and Mg²⁺ is −0.799, and the correlations between Ca²⁺ and the remaining ions are weaker. Similarly, Mg²⁺ is not correlated with Na⁺ and is moderately correlated with the other ions, while Cl^– and SO₄^2– are strongly correlated, as are SO₄^2– and HCO₃^–.

Using the varimax (maximum variance) orthogonal rotation method in SPSS, the rotated component matrix was obtained. The factor loading matrix (left) and the rotated component matrix (right) are:

$$Z_{(6 \times 3)} = \left[ {\begin{array}{*{20}c} {0.101} & {0.788} & {0.600} \\ {0.622} & { - 0.754} & {0.180} \\ { - 0.755} & {0.333} & { - 0.364} \\ {0.922} & {0.215} & { - 0.010} \\ { - 0.929} & { - 0.221} & {0.182} \\ {0.872} & {0.250} & { - 0.384} \\ \end{array} } \right]\quad Z^{\prime}_{(6 \times 3)} = \left[ {\begin{array}{*{20}c} {0.096} & { - 0.053} & {0.990} \\ { - 0.286} & { - 0.892} & { - 0.398} \\ {0.279} & { - 0.933} & { - 0.393} \\ { - 0.838} & { - 0.267} & {0.251} \\ { - 0.873} & { - 0.211} & { - 0.022} \\ {0.978} & {0.263} & { - 0.060} \\ \end{array} } \right].$$

The component conversion matrix is:

$$Z^{\prime\prime}_{(3 \times 3)} = \left[ {\begin{array}{*{20}c} {0.835} & {0.548} & {0.055} \\ {0.338} & { - 0.589} & {0.734} \\ { - 0.434} & {0.594} & {0.677} \\ \end{array} } \right].$$
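An orthogonal rotation leaves the total explained variance unchanged, which requires the component conversion matrix to satisfy TᵀT = I. The sketch below checks this for the printed 3 × 3 matrix and, as one illustration only, applies it to the Na⁺ row of the factor loading matrix. The helper functions are hypothetical, and the tolerances allow for three-decimal rounding.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


# Component conversion matrix as printed.
T = [[0.835, 0.548, 0.055],
     [0.338, -0.589, 0.734],
     [-0.434, 0.594, 0.677]]

# Orthogonality check: T^T T should be close to the identity matrix.
gram = matmul(transpose(T), T)
max_deviation = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
                    for i in range(3) for j in range(3))

# Na+ row of the factor loading matrix, rotated by T.
na_unrotated = [[0.101, 0.788, 0.600]]
na_rotated = matmul(na_unrotated, T)[0]
na_printed = [0.096, -0.053, 0.990]  # Na+ row of the rotated component matrix
```

For the Na⁺ row, the product agrees with the printed rotated loadings to within rounding.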

Three new common factors Y₁, Y₂, and Y₃ were extracted, and the factor score coefficient matrix computed by SPSS is:

$$U = \left[ {\begin{array}{*{20}c} { - 0.072} & {0.079} & {0.838} \\ { - 0.108} & {0.521} & { - 0.241} \\ {0.152} & { - 0.608} & { - 0.264} \\ {0.277} & {0.052} & {0.116} \\ { - 0.410} & { - 0.204} & { - 0.128} \\ \end{array} } \right].$$

According to the factor score coefficient matrix, the expressions of the main factors Y₁, Y₂, and Y₃ are:

$$\left\{ {\begin{array}{*{20}l} {Y_{1} = - 0.072X_{1} - 0.108X_{2} + 0.152X_{3} + 0.277X_{4} - 0.410X_{5} } \hfill \\ {Y_{2} = 0.079X_{1} + 0.521X_{2} - 0.608X_{3} + 0.052X_{4} - 0.204X_{5} } \hfill \\ {Y_{3} = 0.838X_{1} - 0.241X_{2} - 0.264X_{3} + 0.116X_{4} - 0.128X_{5} } \hfill \\ \end{array} } \right..$$
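The expressions above amount to a matrix-vector product between a sample and the factor score coefficient matrix. The sketch below uses the coefficients exactly as printed (five rows, for X₁ to X₅). That the inputs are standardized indicator values is an assumption, and the sample vector is hypothetical.

```python
# Factor score coefficients as printed: rows are X1..X5, columns are Y1, Y2, Y3.
U = [
    [-0.072, 0.079, 0.838],
    [-0.108, 0.521, -0.241],
    [0.152, -0.608, -0.264],
    [0.277, 0.052, 0.116],
    [-0.410, -0.204, -0.128],
]


def factor_scores(x_std, coefficients=U):
    """Return [Y1, Y2, Y3] for one sample of (assumed standardized) indicator values."""
    if len(x_std) != len(coefficients):
        raise ValueError("sample length must match the number of coefficient rows")
    n_factors = len(coefficients[0])
    return [sum(x * row[k] for x, row in zip(x_std, coefficients))
            for k in range(n_factors)]


# Hypothetical standardized sample, for illustration only.
sample = [0.5, -1.0, 0.2, 1.5, -0.3]
y1, y2, y3 = factor_scores(sample)

# A unit input on X1 simply returns the first row of U.
unit_response = factor_scores([1.0, 0.0, 0.0, 0.0, 0.0])
```

Each column of U therefore acts as the weight vector of one factor.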

The original data of the class I, II, III, and IV water samples from Zhaogezhuang mine were substituted into the expressions of the three main factors Y₁, Y₂, and Y₃. The resulting factor score matrices, listed in that order, are:

$$\left( \mu \right)_{18 \times 3} = \left[ \begin{array}{rrr} -2.427 & 2.841 & 0.952 \\ -1.011 & -0.465 & -1.304 \\ -0.933 & -0.635 & 1.059 \\ -1.374 & -1.276 & -0.770 \\ -0.621 & -0.066 & 0.284 \\ -1.476 & -0.164 & -0.689 \\ -1.511 & -0.660 & -0.801 \\ -1.636 & -1.159 & 2.954 \\ -1.389 & -1.881 & -0.448 \\ -1.349 & -1.170 & -1.730 \\ -1.441 & -1.052 & -0.711 \\ -1.506 & -1.012 & -0.163 \\ -1.518 & -0.835 & -0.234 \\ -1.377 & -0.355 & 1.104 \\ -1.516 & 0.228 & 1.266 \\ -1.609 & -0.066 & 1.085 \\ -1.699 & -0.414 & 0.667 \\ -1.623 & -0.492 & 0.284 \\ \end{array} \right],$$

$$\left( \mu \right)_{13 \times 3} = \left[ \begin{array}{rrr} -0.607 & -0.156 & -0.731 \\ -0.497 & -0.107 & -0.901 \\ -0.556 & 0.111 & -1.160 \\ 0.280 & 1.199 & -0.906 \\ -0.464 & -0.140 & -0.932 \\ -0.693 & 0.626 & -1.268 \\ -0.672 & 0.281 & -0.833 \\ 0.373 & 1.885 & 0.727 \\ -0.049 & 2.636 & 0.921 \\ -0.021 & 2.541 & 1.268 \\ 0.257 & 1.968 & 1.211 \\ 0.085 & 1.734 & 1.449 \\ 0.837 & -0.732 & 0.547 \\ \end{array} \right],$$

$$\left( \mu \right)_{17 \times 3} = \left[ \begin{array}{rrr} 1.074 & -0.886 & -0.960 \\ 0.947 & -0.2529 & -0.283 \\ 0.944 & -0.906 & 0.179 \\ 1.065 & -0.943 & -0.228 \\ 0.581 & 1.069 & -2.249 \\ 0.812 & -0.581 & 0.250 \\ 0.916 & -0.863 & 0.364 \\ 0.963 & -0.729 & 0.105 \\ 0.910 & -0.410 & -0.428 \\ 0.963 & -1.004 & 0.100 \\ 0.927 & -0.694 & 1.162 \\ 0.852 & -0.918 & 1.804 \\ 1.005 & -1.054 & 1.200 \\ 1.068 & -0.700 & -0.182 \\ 0.993 & -1.254 & 1.103 \\ 0.982 & -1.108 & 1.684 \\ 1.030 & -0.890 & 1.251 \\ \end{array} \right],$$

$$\left( \mu \right)_{19 \times 3} = \left[ \begin{array}{rrr} 0.538 & 0.679 & -0.904 \\ 0.633 & 0.212 & -0.196 \\ 0.439 & 0.826 & 0.211 \\ 0.560 & 0.512 & -0.055 \\ 0.521 & 0.499 & 0.232 \\ 0.526 & 0.707 & -0.272 \\ 0.549 & 0.595 & -0.146 \\ 0.669 & 0.177 & -0.255 \\ 0.980 & -0.512 & 1.176 \\ 0.796 & -0.055 & -0.383 \\ 0.714 & 0.188 & -0.011 \\ 0.933 & -0.242 & -1.267 \\ 0.458 & 0.282 & -1.702 \\ 0.556 & 0.662 & 0.174 \\ 0.805 & 0.128 & -1.836 \\ 0.540 & 0.519 & -0.495 \\ 0.564 & 0.480 & 0.925 \\ 0.436 & 0.952 & -0.626 \\ 0.466 & 1.077 & -1.002 \\ \end{array} \right].$$
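A quick look at the first column of the printed score matrices shows how much a single factor already separates some classes. The sketch below copies the Y₁ column of the first matrix (18 rows) and of the third matrix (17 rows). Assigning these to class I and class III follows the order in which the classes are named in the text and is an assumption.

```python
# Y1 scores copied from the first printed matrix (18 samples, assumed class I).
y1_first = [-2.427, -1.011, -0.933, -1.374, -0.621, -1.476, -1.511, -1.636, -1.389,
            -1.349, -1.441, -1.506, -1.518, -1.377, -1.516, -1.609, -1.699, -1.623]

# Y1 scores copied from the third printed matrix (17 samples, assumed class III).
y1_third = [1.074, 0.947, 0.944, 1.065, 0.581, 0.812, 0.916, 0.963, 0.910,
            0.963, 0.927, 0.852, 1.005, 1.068, 0.993, 0.982, 1.030]


def separated_by_zero(negatives, positives):
    """True if every value in the first group is below 0 and every value in the second is above 0."""
    return max(negatives) < 0.0 < min(positives)


y1_separates = separated_by_zero(y1_first, y1_third)
```

For these two groups the sign of Y₁ alone is enough to tell them apart; the remaining classes overlap on Y₁ and need the other factors.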

R-SVM model establishment

The R-SVM model is shown in Fig. 6. First, R-type factor analysis is used to reduce the dimensionality of the data. The three common factors Y₁, Y₂, and Y₃ serve as the input variables of the model and the four water source types H as its output, establishing the mapping $F(Y_{1}, Y_{2}, Y_{3}) \to H$, through which the model learns the complex relationships between the input variables and the water source types. The grid search method is used to find the optimal combination of parameters for the support vector machine. The model is then trained on the training set and used to predict the water sample types of the testing set. The predicted types are compared with the actual types to correct any deviations, and the process is repeated until the model predicts the water sample types with satisfactory accuracy.
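The workflow just described (factor scores in, grid search over the SVM parameters, training, then prediction) can be sketched with scikit-learn. This is not the authors' SPSS and MATLAB implementation. The data are synthetic clusters standing in for factor scores, the class sizes and the 56/11 split only mirror the sample counts, and the coarse grid and three-fold cross-validation are assumptions chosen to keep the sketch fast.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the factor scores (Y1, Y2, Y3): four well-separated
# clusters with class sizes 18, 13, 17 and 19 (67 samples in total).
labels = np.repeat([1, 2, 3, 4], [18, 13, 17, 19])
centers = np.array([[-4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [4.0, 0.0, 0.0], [0.0, -4.0, 4.0]])
scores = centers[labels - 1] + 0.3 * rng.standard_normal((labels.size, 3))

# 56 training samples and 11 prediction samples, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    scores, labels, test_size=11, stratify=labels, random_state=0
)

# Coarse power-of-two grid (exponent step 2, an assumption made for speed).
exponents = np.arange(-10, 11, 2)
param_grid = {"C": 2.0 ** exponents, "gamma": 2.0 ** exponents}

# Cross-validated grid search over the RBF-kernel SVM parameters.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X_train, y_train)

best_c = search.best_params_["C"]
best_g = search.best_params_["gamma"]
test_accuracy = search.score(X_test, y_test)
```

Here `C` and `gamma` play the roles of the parameters c and g, and `search.score` reports the fraction of prediction samples classified correctly.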

Parameter search and model application

The six indicators (sodium, calcium, magnesium, chloride, sulfate, and bicarbonate ions) are used as the input variables of the SVM, and the four water source types (goaf water, Ordovician carbonate, sandstone fracture water from the 13 coal system, and sandstone fracture water from the 12 coal system) are used as the outputs of the model, so that the SVM learns the nonlinear mapping between the two. First, the 56 sets of training samples and 11 sets of prediction samples are passed to the grid search method to search for the parameters. Following the operation process of SVM, the search ranges of the parameters c and g are set to ${\text{c}} \in \left[ {2^{ - 10} ,2^{10} } \right]$ and ${\text{g}} \in \left[ {2^{ - 10} ,2^{10} } \right]$, with a step size L = 0.2.
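The search space defined above can be enumerated directly. The sketch assumes that the step size applies to the exponent of 2, which is the common convention for this kind of grid; the text does not state this explicitly, and the function name is hypothetical.

```python
def power_of_two_grid(low_exp=-10.0, high_exp=10.0, step=0.2):
    """Candidate values 2**e for e = low_exp, low_exp + step, ..., high_exp.

    Applying the step to the exponent (rather than to the value) is an assumption.
    """
    count = int(round((high_exp - low_exp) / step)) + 1
    return [2.0 ** (low_exp + i * step) for i in range(count)]


c_grid = power_of_two_grid()
g_grid = power_of_two_grid()

# Every (c, g) pair is one candidate model to be trained and evaluated.
n_candidates = len(c_grid) * len(g_grid)
```

Under this reading, each parameter takes 101 values, so an exhaustive search evaluates 101 × 101 = 10,201 parameter pairs.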

The three common factors of Zhaogezhuang obtained by dimensionality reduction were then used as the input variables of the model, and the four water source types of Zhaogezhuang mine (goaf water, Ordovician carbonate, sandstone fracture water from the 13 coal system, and sandstone fracture water from the 12 coal system) were used as its outputs, to establish the mapping between the common factors and the water source types. The factor scores of the 67 sets of sample data after dimensionality reduction were substituted into the grid-search SVM model to find the best model for training, and the best parameter combination, c = 1 and g = 2.8284, was obtained. The result of the optimization search is shown in Fig. 7.

Substituting c = 1 and g = 2.8284 into the SVM model, the type attributes of the 11 sets of data to be discriminated were predicted; the results are shown in Fig. 8 and Table 10. The model misjudged one Type II Ordovician carbonate sample as Type III sandstone fracture water from the 13 coal system, so 10 of the 11 samples (90.90%) were classified correctly. This indicates that the model is suitable for water source discrimination in Zhaogezhuang Coal Mine and can distinguish the water sources effectively.
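The comparison of predicted and actual types reduces to a simple count. In the sketch below the label lists are hypothetical: only the total of 11 samples and the single Type II sample predicted as Type III follow the text, while the distribution of the other labels is invented for illustration.

```python
def accuracy(actual, predicted):
    """Fraction of samples whose predicted type equals the actual type."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


def misjudged(actual, predicted):
    """List of (sample number, actual type, predicted type) for every error."""
    return [(i, a, p)
            for i, (a, p) in enumerate(zip(actual, predicted), start=1)
            if a != p]


# Hypothetical labels for the 11 prediction samples (not the values of Table 10).
actual = ["I", "I", "I", "II", "II", "II", "III", "III", "III", "IV", "IV"]
predicted = ["I", "I", "I", "II", "III", "II", "III", "III", "III", "IV", "IV"]

acc = accuracy(actual, predicted)
errors = misjudged(actual, predicted)
```

With one error in 11 samples the accuracy is 10/11 ≈ 0.909, which corresponds to the reported 90.90%.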

Table 10 Comparison of model operation results.


Table 11 presents a comparative analysis of model performance across different optimization types. The accuracy and precision metrics were employed to evaluate the models' efficacy. The Fisher optimization type exhibits the lowest performance in terms of accuracy and precision. The Grid optimization type shows a significant improvement in both accuracy and precision compared to the Fisher type. Notably, the R-type grid optimization type demonstrates the highest level of performance, surpassing both the Fisher and Grid types in terms of accuracy and precision.
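Precision, unlike accuracy, is computed per predicted class. The text does not define the precision metric behind Table 11, so the macro-averaged definition below is an assumption, and the labels are the same kind of hypothetical 11-sample example with a single Type II sample predicted as Type III.

```python
from collections import Counter


def per_class_precision(actual, predicted):
    """Precision per predicted class: correct predictions / all predictions of that class."""
    predicted_counts = Counter(predicted)
    correct_counts = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: correct_counts[label] / count
            for label, count in predicted_counts.items()}


def macro_precision(actual, predicted):
    """Unweighted mean of the per-class precisions (assumed definition)."""
    values = list(per_class_precision(actual, predicted).values())
    return sum(values) / len(values)


# Hypothetical labels, for illustration only.
actual = ["I", "I", "I", "II", "II", "II", "III", "III", "III", "IV", "IV"]
predicted = ["I", "I", "I", "II", "III", "II", "III", "III", "III", "IV", "IV"]

precision = per_class_precision(actual, predicted)
macro = macro_precision(actual, predicted)
```

A single misjudged sample lowers the precision only of the class it was wrongly assigned to, which is why accuracy and precision can rank models differently.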

Table 11 Comparison of model performance.


The results in Table 11 show that the coupled R-SVM discriminant model characterizes the water sources in a more targeted and effective way than the other models. The factors obtained by R-type simplification serve as new discriminant variables, which reduces the correlation among the model inputs. The coupled R-SVM discriminant model can also complement the qualitative analysis of water chemistry and provide rapid identification of water sources.

Conclusion

For Zhaogezhuang Coal Mine, a coal mine with submarine mining, identifying and predicting the source of mine water inrush is of great significance to the safety and efficiency of production, and effective, accurate identification of the mine water source is essential for preventing and controlling water inrush. Through analysis of the water source data from different parts of the mine, an effective water source discrimination model was established and its effectiveness and practicability were verified. The conclusions of the study are as follows:

(1) The chemical composition data of 67 water samples from Zhaogezhuang Coal Mine were collected. According to the chemical composition analysis of the selected mine water sources, the main ions identified were Na⁺, Ca²⁺, Mg²⁺, Cl^–, SO₄^2– and HCO₃^–. The water inrush sources in the mining area were divided into four categories: goaf water (type I), Ordovician carbonate (type II), sandstone fracture water from the 13 coal system (type III), and sandstone fracture water from the 12 coal system (type IV). The analysis and comparison of the water source information support the establishment of the water source discrimination model.
(2) R-type factor analysis was used to reduce the dimensionality of the original data, yielding three common factors (Y₁, Y₂, and Y₃) and the factor scores of the water source data. This approximation of the indicator attributes filtered out redundant features and improved efficiency.
(3) The coupled R-SVM model achieved a classification accuracy of 90.90% in water source discrimination for the Zhaogezhuang mine. Compared with traditional qualitative approaches, the model explores the internal laws of the data and provides accurate discrimination, improving on both the Fisher discriminant function and the SVM model alone.

Data availability

The data used to support the findings of this research are included within the paper.


Acknowledgements

This research was funded by the National Emergency Management System Construction Project (grant 20VYJ061), the Construction and Empirical Research on Early Warning Index System of Major Engineering Safety Risks Based on Optimal Control Theory, National Natural Science Foundation of China (grant 71271031), the Innovation Fund for Doctoral Students of Beijing University of Posts and Telecommunications (grant CX2023102), and the Graduate Innovation and Entrepreneurship Project (2024-YC-A180).

Author information

Authors and Affiliations

School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Qiushuang Zheng, Changfeng Wang & Yang Yang
College of Energy and Mining Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
Weitao Liu & Ye Zhu
State Key Laboratory of Mine Disaster Prevention and Control, Shandong University of Science and Technology, Qingdao, 266590, China
Weitao Liu & Ye Zhu


Contributions

Q.Z. performed the data analyses and wrote the manuscript; C.W. provided research funding support; Y.Y. contributed significantly to analysis and manuscript preparation; W.L. performed the experiment and data analyses; Y.Z. helped perform part of the finite element analysis. All authors reviewed the manuscript.

Corresponding author

Correspondence to Qiushuang Zheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Zheng, Q., Wang, C., Yang, Y. et al. Identification of mine water sources using a multi-dimensional ion-causative nonlinear algorithmic model. Sci Rep 14, 3305 (2024). https://doi.org/10.1038/s41598-024-53877-5


Received: 12 May 2023
Accepted: 06 February 2024
Published: 08 February 2024
DOI: https://doi.org/10.1038/s41598-024-53877-5
