Introduction

Rock failure parameters

In oil fields, many downhole problems such as borehole instability and sand production are directly related to the rock’s mechanical properties1. Hence, good knowledge of the rock's mechanical characteristics can help to minimize problems during drilling, optimize drilling performance, and enhance the economic gain from the reservoir1,2. Furthermore, determining the geomechanical properties of the reservoir and near-reservoir rocks is important for hydraulic fracturing design and reservoir/geomechanical modeling3.

Friction angle and cohesion are important geomechanical properties that reflect the shearing strength, the angle of rupture and the stability condition of the materials4,5. These parameters are essential when conducting the stability analysis6,7. Cohesion and friction angle are affected by many factors such as particle arrangement, material physical properties, and loading conditions. Cohesion reflects the internal force that bonds the material’s particles together while friction angle reflects the frictional resistance within the material8.

The Mohr–Coulomb criterion is frequently used for rock failure characterization; it assumes that the shear stress (τ) has a linear relationship with the effective normal stress (σ′), as per Eq. (1)9. The intercept is the cohesion (C), in stress units, also called the inherent shear strength, and the slope is the tangent of the angle of internal friction (φ, in degrees), also called the friction angle.

$$ \uptau = \text{C} + \tan \left( \upvarphi \right)\upsigma^{^{\prime}} $$
(1)

Estimation of the rock failure parameters usually requires several compression tests to draw multiple Mohr’s circles, after which the Mohr–Coulomb failure envelope can be drawn as a tangent to those circles10,11. The angle between the failure envelope and the normal stress axis is the friction angle, and the intercept on the shear stress axis is the cohesion, as in Fig. 1. In this way, C and φ describe how rock fails under different horizontal stresses.

Figure 1

Estimation of failure parameters from Mohr’s circles.
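As a numerical counterpart to the graphical construction in Fig. 1, the sketch below estimates C and φ from several compression tests. It is not the tangent-drawing procedure itself: it swaps in a least-squares fit of the Mohr–Coulomb criterion written in principal stresses, σ1 = UCS + q·σ3, with sin φ = (q − 1)/(q + 1) and C = UCS/(2√q). The function name and the test data are hypothetical; the data were generated from C = 10 MPa and φ = 30° purely for illustration.

```python
import math
import numpy as np

def mohr_coulomb_from_triaxial(sigma3, sigma1):
    """Estimate cohesion and friction angle from peak stresses of several tests.

    Fits sigma1 = ucs + q * sigma3 (Mohr-Coulomb in principal stresses) by
    least squares, then converts the intercept and slope to (C, phi).
    """
    s3 = np.asarray(sigma3, dtype=float)
    s1 = np.asarray(sigma1, dtype=float)
    q, ucs = np.polyfit(s3, s1, 1)            # slope first, then intercept
    phi_rad = math.asin((q - 1.0) / (q + 1.0))
    cohesion = ucs / (2.0 * math.sqrt(q))     # same stress units as the inputs
    return cohesion, math.degrees(phi_rad)

# Hypothetical confining stresses and peak axial stresses (MPa),
# synthesized from C = 10 MPa and phi = 30 degrees (q = 3, UCS = 34.641 MPa).
sigma3 = [0.0, 5.0, 10.0, 20.0]
sigma1 = [34.641016 + 3.0 * s for s in sigma3]

cohesion, phi_deg = mohr_coulomb_from_triaxial(sigma3, sigma1)
print(f"C = {cohesion:.2f} MPa, phi = {phi_deg:.2f} deg")
```

With noisy laboratory data the fitted line will not pass through every point, so the recovered C and φ should be read together with the scatter of the fit.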

This process of estimating the failure parameters is costly and time-consuming. Furthermore, the availability and representativeness of core samples are major concerns, and the limited number of samples makes it difficult to provide continuous information. Holt et al.12,13 pointed out that the mechanical properties of extracted core samples change due to the release of stresses. Many researchers have tried to overcome these concerns by correlating the failure parameters with other physical rock properties that are easier to measure.

Cohesion and friction angle correlations

Several attempts have been made to correlate φ and C with porosity (Ø). Both parameters have been reported to decrease with porosity; however, the accuracies of these correlations are low, with R² values that did not exceed 0.76 (refs. 1, 14, 15). The correlations are expressed as linear relations in Eqs. (2) to (5):

Weingarten & Perkins:

$$ \upvarphi = 57.8 - 1.05\emptyset $$
(2)

Edimann et al.:

$$ \upvarphi = 41.929 - 0.7779\emptyset $$
(3)
$$ \text{C} = 37.715 - 0.8757\emptyset $$
(4)

Abbas et al.:

$$ \varphi = 64.369 - 99.238\emptyset $$
(5)

Plumb16 and Chang et al.17 incorporated the effect of shale content in φ estimation using the gamma-ray (GR) log (Eq. 6 and Eq. 8); the former reported an increase of φ with clay content. Almalikee18 also reported a correlation between GR and φ, expressed by Eq. (9).

Plumb:

$$ \upvarphi = 26.5 - 37.4\left( {1 - \emptyset - V_{\text{shale}} } \right) + 62.1\left( {1 - \emptyset - V_{\text{shale}} } \right)^{2} $$
(6)

where Vshale is calculated by (Eq. 7):

$$ V_{\text{shale}} = \frac{\text{GR} - \text{GR}_{\min}}{\text{GR}_{\max} - \text{GR}_{\min}} $$
(7)

Chang et al.:

$$ \upvarphi = \tan^{-1} \left( \frac{\left( \text{GR} - \text{GR}_{\text{sand}} \right)\mu_{\text{shale}} + \left( \text{GR}_{\text{shale}} - \text{GR} \right)\mu_{\text{sand}}}{\text{GR}_{\text{shale}} - \text{GR}_{\text{sand}}} \right) $$
(8)

where GRsand and GRshale are the gamma-ray readings of pure sand and pure shale, reported by the original authors as 60 API and 120 API, respectively. μshale and μsand are the internal friction coefficients (tan φ) of pure shale and pure sand, reported by the authors as 0.5 and 0.9, respectively.

Almalikee:

$$ \upvarphi = 39.25 - 0.1166\,\text{GR} $$
(9)
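The gamma-ray correlations in Eqs. (6) to (9) can be sketched as small functions. This is an illustrative transcription only: the function names are hypothetical, porosity and Vshale are assumed to be fractions in Eq. (6), and the default end-member values in `phi_chang` are the ones quoted above from Chang et al. The sample log reading and the GRmin/GRmax values are made up for the example.

```python
import math

def v_shale(gr, gr_min, gr_max):
    """Shale volume from the gamma-ray index, Eq. (7)."""
    return (gr - gr_min) / (gr_max - gr_min)

def phi_plumb(porosity, vshale):
    """Friction angle (degrees) from Eq. (6); inputs assumed to be fractions."""
    x = 1.0 - porosity - vshale
    return 26.5 - 37.4 * x + 62.1 * x ** 2

def phi_chang(gr, gr_sand=60.0, gr_shale=120.0, mu_sand=0.9, mu_shale=0.5):
    """Friction angle (degrees) from Eq. (8), using the quoted end members."""
    mu = ((gr - gr_sand) * mu_shale + (gr_shale - gr) * mu_sand) / (gr_shale - gr_sand)
    return math.degrees(math.atan(mu))

def phi_almalikee(gr):
    """Friction angle (degrees) from Eq. (9)."""
    return 39.25 - 0.1166 * gr

# Hypothetical log reading: GR = 90 API, porosity = 0.20,
# with GR_min = 60 API and GR_max = 120 API assumed for Eq. (7).
gr = 90.0
vsh = v_shale(gr, gr_min=60.0, gr_max=120.0)
print(phi_plumb(0.20, vsh), phi_chang(gr), phi_almalikee(gr))
```

The three correlations return noticeably different angles for the same reading, which illustrates why each should only be used for the lithology and data range it was fitted to.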

In addition to porosity and GR, φ and C have been correlated with the compressional wave velocity (Vp); both parameters increase with Vp, as in Eqs. (10) to (12)19,20.

Lal:

$$ \upvarphi = \sin ^{{ - 1}} \left( {\frac{{{\text{V}}_{{\text{p}}} - 1}}{{{\text{V}}_{{\text{p}}} + 1}}} \right) $$
(10)
$$ C = \frac{{5\left( {\text{V}_{\text{p}} - 1} \right)}}{{\sqrt {\text{V}_{\text{p}} } }} $$
(11)

Abbas et al.:

$$ \varphi = 17.134e^{{0.239 \text{V}_{\text{p}} }} $$
(12)
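A minimal transcription of the velocity-based correlations in Eqs. (10) to (12) is given below. The function names are hypothetical, and the units are an assumption not stated in the text: Vp is taken in km/s, so that Vp − 1 stays positive for typical rock velocities.

```python
import math

def phi_lal(vp):
    """Friction angle (degrees) from Eq. (10); vp assumed to be in km/s."""
    return math.degrees(math.asin((vp - 1.0) / (vp + 1.0)))

def cohesion_lal(vp):
    """Cohesion from Eq. (11); vp assumed to be in km/s."""
    return 5.0 * (vp - 1.0) / math.sqrt(vp)

def phi_abbas(vp):
    """Friction angle (degrees) from Eq. (12); vp assumed to be in km/s."""
    return 17.134 * math.exp(0.239 * vp)

# Hypothetical velocities, showing that both parameters increase with Vp.
for vp in (2.0, 3.0, 4.0):
    print(vp, round(phi_lal(vp), 2), round(cohesion_lal(vp), 2), round(phi_abbas(vp), 2))
```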

The efforts toward obtaining empirical correlations for the failure parameters were not limited to the above equations; several authors employed machine learning (ML) techniques for the same objective. The applications of ML in estimating rocks’ physical and mechanical properties are growing due to its high accuracy. These applications include, but are not limited to, the estimation of porosity21,22, permeability23,24, bulk density25, compressive strength26, sonic velocities27 and elastic properties28. Cohesion and friction angle are no exception: different authors have presented ML-based estimations for them. In addition to porosity, Vp, and GR, the models’ inputs include shear wave velocity (Vs) and bulk density (ρbulk), as summarized in Table 1.

Table 1 Summary of machine learning models for cohesion and friction angle.

Utilization of drilling data

All the models in Table 1 require, at least, knowledge of sonic wave velocities, bulk density and porosity. Therefore, the failure parameters cannot be estimated without these inputs, which require a well logging operation. As an alternative, this work proposes using drilling data instead of well logs. The advantages of drilling parameters over well logging outcomes are that the former require no additional cost, are easier to obtain, and are available at an earlier stage in the life of the well.

In the oil industry, one of the oldest exploitations of the drilling operational data is in the estimation of the formation pressure. Recently, employing machine learning, the drilling data were utilized in the prediction of rock properties such as bulk density25, wave velocities32, static and dynamic Young’s modulus33, static and dynamic Poisson’s ratio34,35.

The objective of this paper is to present an investigation on the use of drilling data in the prediction of rock failure parameters utilizing the artificial neural network as a machine learning tool. The advantage of drilling data over the tests on core plugs is that the drilling data are available at an earlier stage, more frequent, and require no additional cost. Therefore, this approach will help in having an instantaneous and complete profile for those parameters which should be very beneficial for the optimization of drilling and fracturing operations.

Methodology

The following procedure, illustrated in Fig. 2, was employed to predict failure properties from the drilling parameters. Drilling operation records and experimental test data were compiled, preprocessed, and divided into three groups. In the preprocessing step, each parameter was normalized with the min–max method, (parameter value − minimum value)/(maximum value − minimum value), so that it varies between 0 and 1. The equations used to normalize the parameters and to recalculate the output data from the normalized values are listed in Appendix A. According to their accuracies, the models were updated and optimized to provide the best possible performance.
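The min–max normalization described above, and its inverse, can be sketched as follows. The helper names and the sample ROP values are hypothetical; the actual minimum and maximum of each parameter are those listed in the paper's Appendix A and Table 2, which are not reproduced here.

```python
import numpy as np

def minmax_fit(x):
    """Return the column-wise minimum and maximum of the data."""
    x = np.asarray(x, dtype=float)
    return x.min(axis=0), x.max(axis=0)

def minmax_scale(x, lo, hi):
    """Scale values to [0, 1]: (x - min) / (max - min)."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def minmax_invert(x_n, lo, hi):
    """Recover physical values from normalized ones: x_n * (max - min) + min."""
    return np.asarray(x_n, dtype=float) * (hi - lo) + lo

# Hypothetical rate-of-penetration readings (units arbitrary for the example).
rop = np.array([10.0, 20.0, 40.0])
lo, hi = minmax_fit(rop)
rop_n = minmax_scale(rop, lo, hi)
print(rop_n)                      # normalized values between 0 and 1
print(minmax_invert(rop_n, lo, hi))
```

In practice the minimum and maximum would be taken once and then reused for any new data, so that new inputs are scaled consistently with the data the model was built on.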

Figure 2

Research methodology flowchart.

Data

The dataset contains over 2200 data points. It was divided into three groups, and different division percentages were tested (from 50:25:25 to 80:10:10); the best outcomes were noticed with a 60:20:20 division. The groups are defined as follows: 60% of the data were used to train the model (training dataset), and 20% were used to test the model accuracy within the algorithm in order to update the model’s parameters (testing dataset). The last 20% of the dataset was hidden from the machine learning tool and used to validate the built models (validation dataset). The definitions of these terms (training, testing and validation) may differ in other publications. For instance, in some literature the validation dataset refers to the dataset provided along with the training data, and the testing dataset is used for the final evaluation. In this paper, however, the former definitions are maintained.
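A 60:20:20 division of this kind can be sketched as below. The text does not state how points were assigned to the groups, so the random shuffling, the seed, and the function name are assumptions made for illustration; 2200 is used as a stand-in for the dataset size.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle sample indices and split them into train/test/validation groups."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    test = order[n_train:n_train + n_test]
    validation = order[n_train + n_test:]   # remainder, hidden during model building
    return train, test, validation

train_idx, test_idx, val_idx = split_indices(2200)
print(len(train_idx), len(test_idx), len(val_idx))
```

Other ratios, such as the 50:25:25 and 80:10:10 divisions mentioned above, can be tried by changing `fractions`.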

Each measurement contains five drilling history recordings as inputs, as well as values for cohesion and friction angle that are established as the targeted outputs. This model was built using the following drilling parameters acquired from field data:

  • Drilling rate of penetration ROP

  • Weight on bit WOB

  • Drill pipe pressure SPP

  • Torque T

  • Drilling fluid pumping rate Q

The data were cleaned of noise and abnormalities using Matlab before being entered into the ANN. Table 2 presents the statistical analysis of the three datasets, which cover slightly different ranges of input and output parameters, with an average change in the mean values of 11%. The lowest relative standard deviations were noticed for Q, SPP and T (between 0.04 and 0.14), while the highest were for ROP (between 0.5 and 0.57). The linear correlation coefficients between the five inputs and the two outputs were all less than 0.58, which indicates that there is no direct linear relationship between any individual input and either output. However, the data show a high correlation coefficient between the two failure parameters, as seen in Fig. 3. The models presented in the results section of this paper are limited to the ranges presented in Table 2.

Table 2 Statistical parameters for the training data.
Figure 3

The correlation between cohesion and friction angle.
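The two screening statistics discussed for Table 2, the relative standard deviation of each parameter and the linear correlation coefficient between each input and each output, can be computed as in the sketch below. The function names and the synthetic demonstration data are hypothetical, and the use of the sample standard deviation (ddof = 1) is an assumption, since the text does not say which definition was used.

```python
import numpy as np

def relative_std(x):
    """Column-wise relative standard deviation (standard deviation / |mean|)."""
    x = np.asarray(x, dtype=float)
    return x.std(axis=0, ddof=1) / np.abs(x.mean(axis=0))

def input_output_correlation(inputs, outputs):
    """Pearson correlation of every input column with every output column."""
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    n_in = inputs.shape[1]
    corr = np.corrcoef(np.hstack([inputs, outputs]), rowvar=False)
    return corr[:n_in, n_in:]            # rows: inputs, columns: outputs

# Synthetic stand-in for five inputs and two outputs (not the paper's data).
rng = np.random.default_rng(0)
demo_inputs = rng.random((50, 5)) + 1.0
demo_outputs = np.column_stack([2.0 * demo_inputs[:, 0] + 1.0, -demo_inputs[:, 1]])

demo_rsd = relative_std(demo_inputs)
demo_corr = input_output_correlation(demo_inputs, demo_outputs)
print(demo_rsd)
print(demo_corr)
```

A low linear correlation between a single input and an output does not rule out a nonlinear or multi-input relationship, which is the motivation for using a neural network here.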

Machine learning

Artificial neural networks (ANN) were used to build empirical correlations between cohesion/friction angle and drilling parameters. ANN is a popular machine-learning method that simulates brain neurons36. In classification, regression, and clustering tasks, ANN could be used as an unsupervised or supervised machine learning tool37. As shown in Fig. 4, an ANN is made up of several elements such as neurons, training functions, and transfer functions in different layers38. Many effective applications of ANN in the oil and gas industry have been reported in the literature24,39. For instance, ANN has been utilized successfully in developing correlations for porosity40, permeability41, drilling fluid rheology42, rate of penetration43, and hydrocarbon properties44.

Figure 4

Structure of the artificial neural network.

In this work, the Bayesian regularization backpropagation method, which updates the weight and bias values based on Levenberg–Marquardt optimization, was utilized for training the network. A logistic sigmoid was used as the activation function to calculate the required outputs. Increasing numbers of neurons were tested until no further significant improvement was noticed; the 30 neurons in Fig. 4 are given as an example.

Model evaluation

Different runs were performed in the ANN to determine the optimum tuning elements within the algorithms. The number of neurons and the types of employed training/network/transfer functions were all evaluated. All of these models' trials were evaluated using two statistical measures: the correlation coefficient (R) and the average absolute percentage error (AAPE) which have been calculated using (Eqs. 13 and 14), respectively:

$$ R = \frac{{\left[ {N\mathop \sum \nolimits_{i = 1}^{N} \left( {X_{given i} \times X_{{{\text{Predicted}} i}} } \right)} \right] - \left[ {\mathop \sum \nolimits_{i = 1}^{N} X_{given i} \times \mathop \sum \nolimits_{i = 1}^{N} X_{{{\text{Predicted}} i}} } \right]}}{{\sqrt {\left[ {N\mathop \sum \nolimits_{i = 1}^{N} \left( {X_{given i} } \right)^{2} - \left( {\mathop \sum \nolimits_{i = 1}^{N} X_{given i} } \right)^{2} } \right]\left[ {N\mathop \sum \nolimits_{i = 1}^{N} \left( {X_{{{\text{Predicted}} i}} } \right)^{2} - \left( {\mathop \sum \nolimits_{i = 1}^{N} X_{{{\text{Predicted}} i}} } \right)^{2} } \right]} }} $$
(13)
$$ AAPE = \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left| {\frac{{X_{given i} - X_{Predicted i} }}{{X_{given i} }}} \right| \times 100\% }}{N} $$
(14)

where N is the size of the dataset, \(X_{given}\) and \(X_{Predicted} \) are respectively the measured and the ANN-estimated failure parameter values.
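The two evaluation measures can be sketched directly from Eqs. (13) and (14). The function names and sample values are hypothetical, and the absolute value of each relative error is taken in AAPE, as the name of the metric implies.

```python
import numpy as np

def correlation_coefficient(given, predicted):
    """Correlation coefficient R, written term by term as in Eq. (13)."""
    x = np.asarray(given, dtype=float)
    y = np.asarray(predicted, dtype=float)
    n = x.size
    numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    denominator = np.sqrt(
        (n * np.sum(x ** 2) - np.sum(x) ** 2) * (n * np.sum(y ** 2) - np.sum(y) ** 2)
    )
    return numerator / denominator

def aape(given, predicted):
    """Average absolute percentage error, Eq. (14), in percent."""
    x = np.asarray(given, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((x - y) / x)) * 100.0

# Hypothetical measured and estimated values.
given = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 18.0, 33.0, 40.0]
print(correlation_coefficient(given, predicted), aape(given, predicted))
```

Note that AAPE is undefined when a measured value is zero, which is not a concern for the strictly positive cohesion and friction angle values considered here.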

Results and discussion

Training and testing

The models were optimized to yield the best possible fitting accuracy, in terms of a higher R and a lower AAPE. The best performance was found with the Bayesian regularization backpropagation training function and the log-sigmoid transfer function. The maximum number of epochs was set at 2000; however, the optimum performance was found at epochs 836 and 1412 for the cohesion and friction angle models, respectively.

Figures 5 and 6 show the cross plots between the actual and estimated failure parameters for the training and testing datasets. The closer the points are to the 45-degree line, the better the prediction. For the friction angle, the model resulted in a correlation coefficient of 0.86 for both training and testing, while the AAPE values were around 4% ± 0.2%. Similarly, the resulting R values for cohesion ranged between 0.88 and 0.89, and the AAPE values were between 5.8 and 6.4%. A similar performance in predicting the two parameters was expected, since they have a high correlation coefficient as shown in Table 2 and Fig. 3.

Figure 5

Actual versus predicted friction angle cross plots for (a) training and (b) testing datasets.

Figure 6

Actual versus predicted cohesion cross plots for (a) training and (b) testing datasets.

Models’ validation

Figure 7 shows a visual comparison between the actual and estimated values using the constructed models on the validation dataset. The performance of the models in validation was very similar to that in training and testing. For instance, the validation R values were 0.85 and 0.89 for friction angle and cohesion, respectively, compared to 0.86 and 0.89 in the same order for training. Similarly, the validation AAPE values were 4% and 5.8% for φ and C, respectively, compared to 3.8% and 5.8% in the same order for training. These validation results confirm a good generalization of the models for the investigated data range.

Figure 7

Comparison between the actual and predicted profiles for validation dataset for (a) friction angle and (b) cohesion.

The current method yielded correlation coefficients between 0.86 and 0.89, as in Figs. 5, 6 and 7, whereas the previous machine-learning attempts listed in Table 1 yielded correlation coefficients between 0.81 and 0.99; the results are therefore comparable, although the inputs differ. Deducing the rock failure properties from drilling data has advantages over using well log data as in previous models: drilling data are always available, and earlier than any other data in the well; they require no additional cost; and they provide continuous information because they are recorded frequently and in real time. It is worth mentioning that the network training performance is shown in Fig. 8, in which the MSE was used as the loss function to monitor the model performance. Figure 8 indicates that overfitting did not occur while running the model.

Figure 8

Training performance in terms of MSE showing the best performance at (a) epoch 836 for cohesion model, and (b) epoch 1412 for friction angle model.

Models’ equations

The best models were achieved using the log-sigmoid transfer function and 30 neurons in the ANN. Equations (15) and (17) present the models for cohesion and friction angle, respectively, while Tables 3 and 4 present the model parameters needed for Eq. (15) and Eq. (17), respectively. In these equations, N is the number of hidden-layer neurons (30) and the subscript n denotes a normalized value. Using these equations with the tabulated parameters allows the models to be tested on different datasets to generate synthetic failure parameters, or to be compared with any model built in the future with similar parameters.

$$ C_{n} = \left[ \sum\limits_{i = 1}^{N} W_{2,i} \left( \frac{1}{1 + e^{ - \left( W_{11,i} Q_{n} + W_{12,i} \text{SPP}_{n} + W_{13,i} T_{n} + W_{14,i} \text{WOB}_{n} + W_{15,i} \text{ROP}_{n} + b_{1,i} \right)}} \right) \right] + b_{2} $$
(15)
Table 3 The parameters in Eq. (15) for cohesion estimation.
Table 4 The parameters in Eq. (17) for friction angle estimation.

The normalized cohesion value can be back transformed to the actual value using the following equation.

$$ C = 976 C_{n} + 221 $$
(16)
$$ \varphi_{n} = \left[ \sum\limits_{i = 1}^{N} W_{2,i} \left( \frac{1}{1 + e^{ - \left( W_{11,i} Q_{n} + W_{12,i} \text{SPP}_{n} + W_{13,i} T_{n} + W_{14,i} \text{WOB}_{n} + W_{15,i} \text{ROP}_{n} + b_{1,i} \right)}} \right) \right] + b_{2} $$
(17)

The normalized friction angle value can be back transformed to the actual value using the following equation.

$$ \varphi = 35.05 \varphi_{n} + 18.91 $$
(18)
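The structure of the model equations (a log-sigmoid hidden layer followed by a linear output, then the back-transformation) can be sketched as follows. The weights and biases below are random placeholders, not the values of Tables 3 and 4, so the printed numbers have no physical meaning; the function names are hypothetical. Only the back-transformation constants are taken from the text.

```python
import numpy as np

def logsig(z):
    """Logistic sigmoid transfer function, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def ann_normalized_output(x_n, w1, b1, w2, b2):
    """Single-hidden-layer output for one sample.

    x_n : normalized inputs in the order [Q, SPP, T, WOB, ROP], shape (5,)
    w1  : hidden-layer weights, shape (N, 5); b1: hidden biases, shape (N,)
    w2  : output-layer weights, shape (N,);   b2: output bias (scalar)
    """
    hidden = logsig(w1 @ x_n + b1)
    return float(w2 @ hidden + b2)

def denormalize_cohesion(c_n):
    """Back-transform of the normalized cohesion: C = 976 * C_n + 221."""
    return 976.0 * c_n + 221.0

def denormalize_friction_angle(phi_n):
    """Back-transform of the normalized friction angle: phi = 35.05 * phi_n + 18.91."""
    return 35.05 * phi_n + 18.91

# Placeholder parameters for N = 30 neurons (NOT the published table values).
rng = np.random.default_rng(0)
n_neurons = 30
w1 = rng.normal(size=(n_neurons, 5))
b1 = rng.normal(size=n_neurons)
w2 = rng.normal(size=n_neurons) / n_neurons
b2 = 0.5

x_n = np.array([0.5, 0.5, 0.5, 0.5, 0.5])   # hypothetical normalized inputs
c_n = ann_normalized_output(x_n, w1, b1, w2, b2)
print(c_n, denormalize_cohesion(c_n))
```

To reproduce the published models, `w1`, `b1`, `w2` and `b2` would be replaced by the tabulated parameters, and the inputs would first be normalized with the ranges of the training data.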

It should be highlighted that the correlations developed in Eqs. (15) and (17) to predict cohesion and friction angle are recommended mainly for carbonate formations, from which the data used in developing the models were obtained. Therefore, some errors might be expected when they are applied to a different formation lithology. Moreover, it is recommended to use inputs within the ranges, and in the same units, listed in Table 2 to ensure reliable results.

Conclusions

Rock mechanical parameters are vital in drilling optimization, fracturing design, and avoiding borehole problems. Conventionally, rock failure parameters are estimated using the Mohr–Coulomb failure envelope, which requires drawing multiple Mohr's circles and hence performing several compression tests on rock samples. In this paper, an alternative technique based on drilling data and an artificial neural network is investigated and presented, with the following concluding remarks:

  • The proposed approach has an advantage over experimental testing and over the previous ML-based models that require well logging data, because drilling data are available earlier than well logs and their acquisition does not require additional operational cost. In addition, in contrast to core samples, which are practically limited in the number that can be obtained, drilling data can provide continuous information.

  • The models for the two parameters yielded close performance in all datasets, training, testing, and validation, even though the last one was not introduced during the models’ building.

  • For friction angle, the R values were between 0.85 and 0.86, while the AAPE values were between 3.8 and 4.2% for the three datasets.

  • For cohesion, the model resulted in R values between 0.88 and 0.89 and AAPE values ranged between 5.8 and 6.4%.

  • The comparable matching accuracy in the two parameters could be attributed to the observed high correlation coefficient between the two failure parameters.

  • In previous works, rock bulk density and elastic properties have been predicted from drilling data, in addition to the failure properties presented in this paper. For future work, the same approach could be applied to estimate other properties such as petrophysical properties.