Modeling liquid rate through wellhead chokes using machine learning techniques

Dabiri, Mohammad-Saber; Hadavimoghaddam, Fahimeh; Ashoorian, Sefatallah; Schaffie, Mahin; Hemmati-Sarapardeh, Abdolhossein

doi:10.1038/s41598-024-54010-2


Article
Open access
Published: 23 March 2024

Mohammad-Saber Dabiri¹,
Fahimeh Hadavimoghaddam²,
Sefatallah Ashoorian³,
Mahin Schaffie¹ &
…
Abdolhossein Hemmati-Sarapardeh^1,4

Scientific Reports volume 14, Article number: 6945 (2024)

Abstract

Precise measurement and prediction of fluid flow rates in production wells are crucial for anticipating production volume and hydrocarbon recovery and for creating a steady and controllable flow regime in such wells. This study proposes two approaches to predict the flow rate through wellhead chokes. The first is a data-driven approach using several methods, namely adaptive boosting support vector regression (Adaboost-SVR), multivariate adaptive regression spline (MARS), radial basis function (RBF), and multilayer perceptron (MLP) trained with three algorithms: Levenberg–Marquardt (LM), Bayesian regularization (BR), and scaled conjugate gradient (SCG). The second is a newly developed correlation that depends on wellhead pressure (P_wh), gas-to-liquid ratio (GLR), and choke size (D_c). A dataset of 565 data points is used for model development. The performance of the two approaches is compared with earlier correlations. The results reveal that the proposed models outperform the existing ones, with the Adaboost-SVR model showing the best performance, with an average absolute percent relative error (AAPRE) of 5.15% and a correlation coefficient of 0.9784. The developed correlation also gives better predictions than the earlier ones. Furthermore, a sensitivity analysis of the input variables revealed that choke size had the most significant effect on the liquid rate, while P_wh and GLR showed only a slight effect. Finally, the leverage approach showed that only 2.1% of the data points were in the suspicious range.

Introduction

The importance of the wellhead choke in oil and gas production cannot be overemphasized, as it restricts flow to regulate the production rate. The production rate is mainly adjusted by wellhead chokes, and proper management of the production rate minimizes formation damage and prevents problems such as water and gas coning and sand production¹. Wellhead chokes can be either fixed (positive) or adjustable, depending on the bean settings. The bean size is fixed in a positive choke, while an adjustable choke is analogous to a variable valve. Because of the pressure drop in the production pipeline, the pressure falls below the bubble point and a two-phase flow is created in the chokes. This two-phase flow is divided into two categories, critical and subcritical. Critical flow occurs when the fluid velocity reaches the velocity of sound, and the flow rate becomes independent of the downstream pressure². Conversely, in subcritical flow, the flow rate depends on the pressure difference across the choke, and changes in the downstream pressure affect the upstream pressure³. Numerous techniques exist for predicting choke flow in these regimes, and it is equally important to predict the boundary between critical and subcritical flow. For instance, at critical flow, the pressure downstream of the choke can be as low as 50% or 5% of the pressure upstream of the choke⁴. The major problem posed by two-phase flow through chokes is calculating the flow rate from measurable parameters such as GLR, bean size, and pressure. The methods offered for multiphase flow through chokes fall into two categories, analytical and empirical⁵. In 1949, Tangren et al. made the first theoretical study of two-phase flow through restrictions. They assumed polytropic expansion of a gas uniformly distributed in a continuous liquid phase⁶.
Since then, several approaches have been proposed to predict multiphase flow through chokes. These techniques can be classified into several groups. One group involves simple empirical equations similar to that of Gilbert. In 1954, Gilbert proposed an empirical equation for determining the liquid flow rate, in which the flow is linearly proportional to P_wh⁷. Later, this equation was modified by Ros⁸, Achong⁹, Baxendell¹⁰, Pilehvari¹¹, Mirzaei-Paiaman and Salavati¹², and Beiranvand et al. The general form of the Gilbert equation is as follows:

$${Q}_{liq}=a1\frac{{P}_{wh}^{a2}{D}_{64}^{a3}}{{GLR}^{a4}}$$

(1)

where Q_liq is the liquid rate (STB/D), D₆₄ is the choke diameter (1/64 in), and P_wh and GLR are the wellhead pressure (psi) and gas-to-liquid ratio (SCF/STB), respectively. a1, a2, a3, a4, a5, and a6 are the empirical coefficients of this equation presented in Table 1.
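As an illustration of how a Gilbert-type correlation of the form of Eq. (1) is evaluated, the following minimal Python sketch may help. The function name and the coefficient values are placeholders chosen for illustration only; they are not the values listed in Table 1.

```python
def gilbert_type_rate(p_wh, d64, glr, a1, a2, a3, a4):
    """Liquid rate (STB/D) from the general Gilbert-type form of Eq. (1).

    p_wh: wellhead pressure (psi), d64: choke size (1/64 in),
    glr: gas-to-liquid ratio (SCF/STB).
    """
    return a1 * (p_wh ** a2) * (d64 ** a3) / (glr ** a4)


# Placeholder coefficients for illustration only (NOT taken from Table 1).
COEFFS = dict(a1=0.1, a2=1.0, a3=1.9, a4=0.5)

q_base = gilbert_type_rate(p_wh=1000.0, d64=32.0, glr=400.0, **COEFFS)
q_double_p = gilbert_type_rate(p_wh=2000.0, d64=32.0, glr=400.0, **COEFFS)

# With a2 = 1 the rate is linear in wellhead pressure, as in Gilbert's form.
print(q_double_p / q_base)
```

With a positive exponent a4, a higher GLR lowers the predicted liquid rate, which is the qualitative behavior the Gilbert-type form encodes.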

Table 1 Specific empirical coefficient correlations proposed for liquid flow through oilfield chokes.


Following Tangren, Ros conducted studies based on a continuous gas phase and extended the Tangren equation⁸. Poettmann and Beck improved the Ros equation using 108 production data points. They compiled charts for crude oils of different API gravities, with choke diameters ranging from 4/64 to 28/64 inches and oil flow rates ranging from 10 to 1300 STBD¹⁵. Al-Attar and Abdul-Majid evaluated and compared the available correlations used to assess the performance of multiphase fluid flow through a wellhead choke. They used 155 well-test production data sets from the East Baghdad oilfield¹⁶. In another study, Abdul-Majid examined correlations developed for predicting the liquid rate in oilfield chokes. A dataset of 210 well-test data points was used to assess the accuracy of eight correlation models. Additionally, a regression analysis was employed to find the coefficients that best matched the measured data, and as a consequence, four new correlations were developed. Based on the statistical results, the new correlations were more robust than the previous ones¹⁷. Fortunati presented an empirical equation for both critical and subcritical flow. Additionally, he included a graphical representation and established the demarcation line between critical and subcritical flow¹⁸. Ashford¹⁹ and Pilehvari¹¹ studied subcritical flow in wellhead chokes. They determined the boundary between critical and subcritical flow as a function of fluid properties and GLR. In another study, Al-Attar investigated critical flow through the choke. He used 40 field data points based on choke size adjustment and presented a more accurate empirical equation than the previous ones⁵. Beiranvand and Babaei Khorzoughi presented a new correlation for multiphase flow through surface chokes that integrates recently introduced parameters. Their research was based on 182 production data points from one of the Iranian oil fields. They added temperature, sediment, and water to the Gilbert equation and obtained more reliable results than the previous correlations²⁰.

Rashid et al. used a collected set of 276 data points and a radial basis function-genetic algorithm (RBF-GA) neural network to estimate the flow rate through wellhead chokes. In that study, the R² values for the training and test data were 0.9885 and 0.9795, respectively^21,22. Mirzaei-Paiaman and Salavati, using 102 production test data points and adding the specific gravities of oil and gas to the general Gilbert equation, obtained the following equation¹²:

$${Q}_{L}=\frac{A\,{P}_{wh}\,{D}_{64}^{B}\,{\gamma }_{g}^{D}\,{\gamma }_{o}^{E}}{{GLR}^{C}}$$

(2)

Q_L, liquid flow rate (STB/D); D₆₄, choke size (1/64 inches); P_wh, wellhead pressure (Psia); Ɣ_o, oil specific gravity; Ɣ_g, gas specific gravity; GLR, gas to liquid ratio (Scf/STB); and, A, B, C, D, and E are constants.

According to the literature, most of the empirical relationships presented for calculating the flow rate through the choke can be classified into two categories, linear and non-linear, and both typically yield a high error. The literature therefore still lacks a comprehensive and accurate model for predicting oil flow through wellhead chokes. Hence, we attempt to develop a new correlation with a lower percentage of error than the empirical relationships presented in the literature. Additionally, we use robust machine learning algorithms to accurately predict the liquid rate through oilfield chokes. To the best of our knowledge, there has been no prior endeavor to undertake this type of modeling.

In this study, the liquid rate in wellhead chokes is modeled using machine learning approaches. To this end, 565 real data points are collected from the literature. Then, for a precise and reliable prediction of the liquid rate through oilfield chokes, several ML models are applied: MLP trained with three algorithms, RBF, MARS, and Adaboost-SVR. Furthermore, statistical evaluation and graphical error criteria are used to investigate the validity and reliability of the intelligent models and the other correlations. In addition, the relative impact of the inputs on the liquid rate in wellhead chokes is examined by applying the relevancy factor definition. Finally, the leverage approach is utilized to investigate the validity and applicability domain of the best proposed model. The key contributions of this study can be summarized as follows:

Gathering a comprehensive dataset of wellhead choke liquid rates, encompassing crucial variables like D_c, P_wh, and GLR.
The development of precise models with minimal errors by employing Adaboost-SVR machine-learning algorithms.
Developing a new empirical relationship that outperforms the previously developed relationships.
Conducting sensitivity analysis to identify the relative impact of pressure, choke size, and gas–liquid ratio on the liquid rate in oil field chokes.
Applying the leverage method to detect anomalous and outlier data associated with liquid rate as reported in the literature.

Data collection

First, for accurate prediction of the liquid rate of two-phase flow through wellhead chokes, a comprehensive database of 565 liquid rate data points was collected^12,20,23,24,25,26,27,28. Based on the literature, the most critical parameters that affect the choke liquid flow rate are P_wh, D₆₄, and GLR. As a result, in this study, the liquid flow rate is modeled as a function of these parameters. The ranges of the input and output parameters are reported in Table 2. Additionally, the data were characterized by their mean, minimum, maximum, and other statistics, as shown in Table 3. The liquid rate ranges from a minimum of 205 STB/Day to a maximum of 25,878 STB/Day, with an arithmetic mean of 8146.613 STB/Day. P_wh ranges between 50 and 4045 psi with an arithmetic mean of 1549.699 psi. The statistical dispersion of the liquid rate through chokes was determined by calculating the kurtosis, skewness, and standard deviation, with values of 1.006, 0.760, and 4383.228, respectively, which indicates that the data points are spread over a broad range of values. Skewness measures the asymmetry of the distribution of a dataset: it is zero for a symmetric distribution such as the normal curve and can be positive, negative, or undefined. Kurtosis measures the tailedness of the probability distribution of a random variable. Positive (excess) kurtosis means that more data points lie in the tails than for a normal distribution, while negative kurtosis means that fewer data points lie in the tails.
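The descriptive statistics discussed above can be computed as in the following sketch. It assumes the moment-based definitions of skewness and excess kurtosis; the source does not state which definition or software was used, so the function `describe` and its conventions are assumptions made for illustration.

```python
import math


def describe(values):
    """Mean, sample standard deviation, skewness, and excess kurtosis.

    Skewness and kurtosis use population central moments (an assumption);
    kurtosis is reported as excess kurtosis, which is 0 for a normal curve.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# A symmetric sample has zero skewness; a long right tail gives a positive value.
symmetric = describe([1.0, 2.0, 3.0, 4.0, 5.0])
right_tailed = describe([1.0, 1.0, 1.0, 10.0])
print(symmetric["skewness"], right_tailed["skewness"])
```

A positive skewness, such as the 0.760 reported for the liquid rate, indicates a longer tail toward high rates.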

Table 2 The range of databases used in the developed model.


Table 3 Statistical description of the data set used for modeling.


Model development

Multilayer perceptron neural network (MLPNN)

A neural network acquires knowledge through a learning process, stores it, and makes it available for use. Synaptic weights, the connection strengths between neurons, are used to store this knowledge²⁹. Neural networks are a powerful and general framework for representing non-linear mappings from several input variables to several output variables, where the form of the mapping is governed by several adjustable parameters. Before the emergence of the MLP neural network, Frank Rosenblatt invented a neural network called the perceptron in 1958³⁰, which consisted of a single layer of neurons. However, Rosenblatt's perceptron had many problems. For instance, it could only solve problems that were linearly separable³¹. In 1969, Minsky and Papert wrote a book called Perceptrons, in which they explored the perceptron's capabilities and problems and proved that it could only solve linearly separable problems^32,33. The MLP model is conceptually more appealing^34,35. In its most basic form, this model consists of several successive layers, each made up of a small number of units called neurons^36,37. The units of each layer are connected to those of the next layer through links, or synapses. A multilayer perceptron (MLP) comprises a minimum of three layers of nodes: an input layer, a hidden layer, and an output layer. The MLP employs a supervised learning technique called backpropagation for training. Its multiple layers and nonlinear activation distinguish the MLP from a linear perceptron. If a multilayer perceptron has a linear activation function in all neurons, that is, a linear function that maps the weighted inputs to the output of each neuron, then linear algebra shows that any number of layers can be reduced to a two-layer input–output model. The activation functions usually include "Tanh", "Sigmoid", and "Linear". A linear function is typically used for the output layer. These functions are described below³⁸:

$$Tansig=tanh: h(x)=\frac{{e}^{x}-{e}^{-x}}{{e}^{x}+{e}^{-x}}=\frac{2}{1+{e}^{-2x}}-1$$

(3)

$$linear=purelin=h(x)=x$$

(4)

$$sigmoid=logsig: h(x)=\frac{{e}^{x}}{{e}^{x}+1}$$

(5)
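The three activation functions of Eqs. (3) to (5) can be written directly in code. The sketch below uses the names tansig, purelin, and logsig from the equations; it is an illustration rather than the implementation used in the study.

```python
import math


def tansig(x):
    """Hyperbolic tangent sigmoid, Eq. (3)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def purelin(x):
    """Linear (identity) activation, Eq. (4)."""
    return x


def logsig(x):
    """Logistic sigmoid, Eq. (5); algebraically equal to e^x / (e^x + 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# tansig maps to (-1, 1), logsig maps to (0, 1), purelin leaves x unchanged.
for x in (-2.0, 0.0, 2.0):
    print(x, tansig(x), logsig(x), purelin(x))
```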

Consider an MLP with two hidden layers, with tansig and logsig activation functions for the first and second hidden layers, respectively, and purelin for the output layer. The output of the model can be calculated by the following formula:

$$output=purelin\left({w}_{3}\times logsig\left({w}_{2}\times tansig\left({w}_{1}\times x+{b}_{1}\right)+{b}_{2}\right)+{b}_{3}\right)$$

(6)

where the bias terms for the 1st and 2nd hidden layers are ${b}_{1}$ and ${b}_{2}$, respectively, and ${b}_{3}$ is the bias of the output layer. In addition, ${w}_{1}$, ${w}_{2}$, and ${w}_{3}$ are the weight matrices of the 1st hidden layer, the 2nd hidden layer, and the output layer, respectively. When two hidden layers are used, the activation functions for the first and second hidden layers are usually tansig and logsig, respectively³⁸.
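The forward pass of such a two-hidden-layer MLP can be sketched as follows, with tansig in the first hidden layer, logsig in the second, and a linear output. The layer sizes and the weight values are arbitrary illustrative numbers, not the trained parameters of this study, and the helper names `affine` and `mlp_forward` are hypothetical.

```python
import math


def affine(weights, inputs, biases):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def mlp_forward(x, w1, b1, w2, b2, w3, b3):
    """Two hidden layers (tansig, then logsig) and a purelin output layer."""
    h1 = [math.tanh(v) for v in affine(w1, x, b1)]                 # tansig
    h2 = [1.0 / (1.0 + math.exp(-v)) for v in affine(w2, h1, b2)]  # logsig
    return affine(w3, h2, b3)                                      # purelin


# Illustrative network: 3 inputs (e.g. scaled P_wh, D_c, GLR), 2 + 2 hidden
# neurons, 1 output. All numbers below are made up for the example.
w1 = [[0.2, -0.1, 0.4], [0.5, 0.3, -0.2]]
b1 = [0.1, -0.1]
w2 = [[0.3, -0.6], [0.7, 0.2]]
b2 = [0.0, 0.05]
w3 = [[1.5, -0.8]]
b3 = [0.2]

print(mlp_forward([0.5, 0.25, 0.75], w1, b1, w2, b2, w3, b3))
```

In practice the inputs are usually scaled before being fed to the network, and the weights and biases are found by the training algorithm (LM, BR, or SCG in this study).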

Figure 1 shows the structure of an MLP model with two hidden layers. In this study, to develop the MLP model, three algorithms including Bayesian Regularization (BR), Scaled Conjugate Gradient (SCG), and Levenberg–Marquardt (LM), were used. The type of activation function, the number of neurons, and the number of layers used for the MLP model are reported in Table 4.

Table 4 Control parameters for MLP and RBF model used in this study.


Radial basis function neural network (RBFNN)

Similar to the MLP neural network, there is another type of neural network in which each processing unit responds to the distance of its input from a specific center. In overall structure, RBF networks are not significantly different from MLP networks; the only difference is the type of processing the neurons perform on their inputs. However, RBF networks often have faster learning and training processes. Since the neurons are concentrated in specific functional areas, they are easier to tune. Generally, the radial basis function (RBF) network has a three-layer structure: the first and last layers serve as the input and output layers, and the single intermediate (hidden) layer identifies the relationship between input and output data^39,40. Figure 2 shows an example of an RBF network. The output of this model is given by the following formula:

$${y}_{k}=\sum_{i=1}^{N}{w}_{i}\,\varnothing \left(\left\|{x}_{k}-{c}_{i}\right\|\right)+{w}_{0} ,\quad k=1,2,...,M ;\; i=1,2,...,N$$

(7)

where ${w}_{i}$, ${w}_{0}$, ${y}_{k}$, $N$, ${c}_{i}$, and $M$ are the weights of the network, the bias coefficient, the model's output, the number of clusters, the cluster centers, and the number of data points, respectively, and $\varnothing$ is the radial basis function. The maximum number of neurons and the spread coefficient are the main parameters that can be changed in this model. It should be noted that these factors are usually determined by trial and error.
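The RBF network output, a weighted sum of basis functions of the distance between the input and each cluster center plus a bias, can be sketched as follows. A Gaussian basis function with a single spread parameter is assumed here; the source does not specify the basis function, and the centers and weights below are made up for illustration.

```python
import math


def gaussian_rbf(distance, spread):
    """Gaussian basis function (an assumed choice of radial function)."""
    return math.exp(-(distance / spread) ** 2)


def rbf_output(x, centers, weights, bias, spread=1.0):
    """Weighted sum of basis-function responses plus a bias term."""
    total = bias
    for center, weight in zip(centers, weights):
        distance = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, center)))
        total += weight * gaussian_rbf(distance, spread)
    return total


# Two illustrative centers in a 2-D input space.
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [2.0, -1.0]
print(rbf_output([0.0, 0.0], centers, weights, bias=0.5))
```

A neuron responds most strongly when the input coincides with its center and its response decays with distance, which is why the spread coefficient is one of the main tuning parameters.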

Adaptive boosting support vector regression (AdaBoost-SVR)

AdaBoost is an ensemble learning method and a well-known member of the family of boosting algorithms, presented by Freund and Schapire⁴¹. In ensemble learning algorithms, one case is classified by several different classifiers, and their results are combined intelligently to determine the final result for that case. Typically, the accuracy of the ensemble is higher than that of the individual classifiers in its structure. In AdaBoost ensemble learning, each classifier is trained with a different bootstrap sample. In bootstrap sampling, training samples are randomly selected from the training data set with replacement, so the same sample can be selected multiple times. The algorithm has several steps, which are listed here⁴²:

1.
First, a weight is assigned to every data point. Initially, all the weights are equal. The sample weights are determined by the following formula:
$$w({x}_{i},{y}_{i})=\frac{1}{N} , i=1,2,3,...,N$$
(8)
where N is the total number of data points.
2.
For m = 1 to M:
(a)
  Fit a classifier G_m(x) to the learning data using weights w_i.
(b)
  Determine
  $${err}_{m}=\frac{\sum_{i=1}^{N}{w}_{i}I({y}_{i}\ne {G}_{m}({x}_{i}))}{\sum_{i=1}^{N}{w}_{i}}$$
  (9)
(c)
  Compute
  $${\alpha }_{m}=log((1-{err}_{m})/{err}_{m}).$$
  (10)
(d)
  Set
  $${w}_{i}\leftarrow {w}_{i}\cdot exp\left[{\alpha }_{m}\cdot I\left({y}_{i}\ne {G}_{m}\left({x}_{i}\right)\right)\right],\quad i=1,2,...,N$$
  (11)
3.
Output
$$G(x)=sign[\sum_{m=1}^{M}{\alpha }_{m}{G}_{m}(x)]$$
(12)

where M, ${err}_{m}$, and ${\alpha }_{m}$ are the number of learners, the weighted error rate of the m-th learner, and the weight given to the prediction of the m-th learner, respectively.
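The steps above can be sketched in a few lines of Python. The sketch follows the classification form listed in the text and uses one-dimensional threshold classifiers (decision stumps) as weak learners purely for illustration; in the Adaboost-SVR model of this study the weak learners are SVR models, which are not reproduced here. All function names and the toy data are hypothetical.

```python
import math


def fit_stump(xs, ys, weights):
    """Pick the threshold/polarity with the lowest weighted error (1-D inputs)."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x <= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best[1], best[2]


def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x <= threshold else -polarity


def adaboost_fit(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n                      # step 1: equal weights
    ensemble = []
    for _ in range(n_rounds):                    # step 2: m = 1..M
        stump = fit_stump(xs, ys, weights)       # (a) fit weak learner
        miss = [stump_predict(stump, x) != y for x, y in zip(xs, ys)]
        err = sum(w for w, m in zip(weights, miss) if m) / sum(weights)  # (b)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = math.log((1.0 - err) / err)      # (c) learner weight
        weights = [w * math.exp(alpha) if m else w
                   for w, m in zip(weights, miss)]  # (d) up-weight mistakes
        ensemble.append((alpha, stump))
    return ensemble


def adaboost_predict(ensemble, x):
    """Step 3: sign of the alpha-weighted vote."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, 1]
model = adaboost_fit(xs, ys, n_rounds=3)
print([adaboost_predict(model, x) for x in xs])
```

On this toy set a single stump misclassifies three points, while the weighted vote of three stumps reproduces all labels, which illustrates why the ensemble is usually more accurate than its individual members.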

Support vector regression (SVR)

SVR is the regression form of the support vector machine, which was first proposed in 1995 by Vapnik for classification problems. Recently, the SVR model has become one of the most common models in the field of petroleum engineering due to its acceptable performance in forecasting^43,44,45. For a simple case, input data x ∈ R^d are regressed by the hyperplane g(x):

$$g(x)=w.\varnothing (x)+b$$

(13)

The weight vector and the bias are w and b, respectively, and g(x) is the regression function of the input vector x. To compute w and b, a minimization problem is formulated in which model complexity and the associated empirical error are combined in the so-called regularized risk function⁴⁶. The error is measured with the ε-insensitive loss:

$$\xi =\left|{y}_{i}-g(w,{x}_{i})\right|$$

(14)

$${\left|\xi \right|}_{\varepsilon }= \left\{\begin{array}{ll}0 & \text{if } \left|\xi \right|<\varepsilon \\ \left|\xi \right|-\varepsilon & \text{otherwise} \end{array}\right.$$

(15)

By introducing the positive slack variables ($\xi$, $\xi$*), the optimization problem is formulated as:

$$\text{Minimize }\frac{1}{2}{\| w\| }^{2}+C\sum_{i=1}^{n}({\xi }_{i}+{\xi }_{i}^{*})$$

(16)

where $\sum_{i=1}^{n}({\xi }_{i}+{\xi }_{i}^{*})$ represents the empirical error and ${\| w\| }^{2}$ measures the flatness of the function. C is a penalty factor applied to data points whose deviation from g is larger than ε⁴⁷.
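To make the ε-insensitive loss and the regularized objective concrete, the following sketch evaluates both for a linear model. A linear feature map (the mapping of x is x itself) is assumed for simplicity, whereas SVR in practice typically uses a kernel; the function names and the toy numbers are illustrative assumptions.

```python
def eps_insensitive(residual, eps):
    """Zero inside the epsilon tube, linear outside it (Eq. 15)."""
    return max(0.0, abs(residual) - eps)


def svr_objective(w, b, xs, ys, c, eps):
    """0.5 * ||w||^2 + C * sum of epsilon-insensitive losses (cf. Eq. 16).

    Assumes a linear model g(x) = w . x + b.
    """
    flatness = 0.5 * sum(wj * wj for wj in w)
    empirical = 0.0
    for x, y in zip(xs, ys):
        g = sum(wj * xj for wj, xj in zip(w, x)) + b
        empirical += eps_insensitive(y - g, eps)
    return flatness + c * empirical


# Toy data lying close to y = 2x + 1.
xs = [[0.0], [1.0], [2.0]]
ys = [1.05, 2.9, 5.4]
print(svr_objective([2.0], 1.0, xs, ys, c=10.0, eps=0.1))
```

Only the third point lies outside the tube of half-width ε = 0.1, so it alone contributes to the empirical error term; a larger C penalizes such deviations more strongly relative to the flatness term.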

Multivariate adaptive regression spline (MARS)

MARS is an algorithm designed for multivariate non-linear regression problems⁴⁸. The MARS algorithm divides the space of each input parameter into separate subregions, each of which is assigned a spline function known as a basis function (BF). MARS models non-linear relationships between input and response variables with more flexibility, which is why it differs from linear regression techniques. Additionally, MARS checks all degrees of interaction in order to discover all possible interactions between variables. Because this strategy takes into account all functional forms and interactions between input parameters, it can effectively capture hidden relationships in high-dimensional datasets as well as complex structures in the data points⁴⁹. The general formula of this algorithm is as follows:

$$f(x)={\beta }_{0}+\sum_{m=1}^{M}{\beta }_{m}{\lambda }_{m}(x)$$

(17)

where ${\beta }_{0}$ and ${\beta }_{m}$ are the parameters that give the best fit to the data points, f(x) is the response, and M is the number of BFs in the model. In this algorithm, a basis function can be a univariate spline function or a product of several such functions, depending on the predictive inputs. The spline BF ${\lambda }_{m}$(x) can be written as:

$${\lambda }_{m}(x)=\prod_{k=1}^{{K}_{m}}\left[{S}_{km}\left({X}_{v(k,m)}-{t}_{(k,m)}\right)\right]$$

(18)

where ${S}_{km}$ selects the right or left region of the corresponding step function, taking the value 1 or − 1, ${t}_{(k,m)}$ is the knot location, K_m is the number of knots, and $v(k,m)$ is the label of the predictor input. The MARS model builds BFs using a stepwise technique. In the forward step, MARS over-fits the data by examining a large number of BFs. In the backward step, redundant BFs are removed from the equation to prevent overfitting. To select the BFs to remove, MARS uses the generalized cross-validation (GCV) criterion, which is expressed as:

$$GCV=\frac{\frac{1}{N}\sum_{i=1}^{N}{\left[{y}_{i}-\widehat{f}({x}_{i})\right]}^{2}}{{\left[1-\frac{C(B)}{N}\right]}^{2}}$$

(19)

Here N is the total number of data points, and C(B) is a complexity penalty defined as⁵⁰:

$${\text{C}}\left( {\text{B}} \right) = \left( {{\text{B}} + {1}} \right) + {\text{d}}({\text{B}})$$

(20)
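A small sketch may clarify how a MARS-type model and its GCV score are evaluated. Two assumptions are made that go beyond the text: the basis functions are taken to be truncated (hinge) functions, max(0, ±(x − t)), as in Friedman's formulation of MARS, and the penalty of Eq. (20) is read as C(B) = (B + 1) + d·B with a per-basis-function penalty d (the value d = 3 below is only a commonly quoted choice). The function names are hypothetical.

```python
def hinge(x, knot, sign):
    """Truncated linear basis function: max(0, sign * (x - knot))."""
    return max(0.0, sign * (x - knot))


def mars_predict(x, beta0, terms):
    """f(x) = beta0 + sum of beta_m * lambda_m(x) for univariate hinge terms.

    terms is a list of (beta, knot, sign) tuples.
    """
    return beta0 + sum(beta * hinge(x, knot, sign) for beta, knot, sign in terms)


def gcv(ys, preds, n_basis, d=3.0):
    """Generalized cross-validation score, assuming C(B) = (B + 1) + d * B."""
    n = len(ys)
    mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / n
    c_b = (n_basis + 1) + d * n_basis
    return mse / (1.0 - c_b / n) ** 2


# Piecewise-linear toy model with one knot at x = 3: flat below, rising above.
terms = [(2.0, 3.0, 1)]
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
preds = [mars_predict(x, 1.0, terms) for x in xs]
print(preds)
```

Because the denominator of the GCV shrinks as basis functions are added, a term is kept in the backward step only if it reduces the squared error enough to offset the penalty.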

Generalized reduced gradient (GRG)

The generalized reduced gradient (GRG) approach is frequently applied as a solver for multivariable problems. Based on the concept of reduced gradients, this technique is designed to solve both linear and non-linear problems. The variables are adjusted so that the active constraints remain satisfied as the process moves from one point to another. The GRG uses a linear approximation of the gradient at a given point x. The constraint and objective gradients are solved simultaneously, so that the gradient of the objective function can be expressed in terms of the constraint gradients. By moving in a feasible direction, the search region is reduced. The following notation represents an objective function f(z) subject to the constraint h(z)⁵¹.

$${\text{Minimizes}}:{\text{ f}}\left( {\text{z}} \right) = {\text{ z}}$$

(21)

$${\text{Subjected to}}:{\text{ h}}_{{\text{k}}} \left( {\text{z}} \right) = \, 0$$

(22)

The reduced gradient can be written in the following form:

$$\frac{df}{d{z}_{k}}={\nabla }_{{z}_{k}}^{T}f-{\nabla }_{{z}_{i}}^{T}f{\left(\frac{\partial h}{\partial {z}_{i}}\right)}^{-1}\frac{\partial h}{\partial {z}_{k}}$$

(23)

A necessary condition for f(z) to be a minimum is df(z) = 0, or equivalently $\frac{df}{{dz}_{k}}=0$⁵².
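The idea of the reduced gradient can be illustrated on a small problem that is not from the study: minimize f(z₁, z₂) = z₁² + z₂² subject to h(z₁, z₂) = z₁ + z₂ − 2 = 0. Here z₁ is treated as the dependent variable that keeps the constraint satisfied, and z₂ as the independent variable. The sketch below is only a fixed-step descent on the reduced gradient; a full GRG solver additionally handles bounds, inequality constraints, and line searches.

```python
def reduced_gradient(z1, z2):
    """df/dz2 along the constraint h = z1 + z2 - 2 = 0.

    df/dz2 = (df/dz2) - (df/dz1) * (dh/dz1)^-1 * (dh/dz2)
    """
    df_dz1, df_dz2 = 2.0 * z1, 2.0 * z2
    dh_dz1, dh_dz2 = 1.0, 1.0
    return df_dz2 - df_dz1 * (1.0 / dh_dz1) * dh_dz2


def minimize_on_constraint(z2_start, step=0.1, n_iter=200):
    z2 = z2_start
    for _ in range(n_iter):
        z1 = 2.0 - z2                      # restore feasibility: h(z1, z2) = 0
        z2 -= step * reduced_gradient(z1, z2)
    return 2.0 - z2, z2


z1_opt, z2_opt = minimize_on_constraint(z2_start=0.0)
print(z1_opt, z2_opt)
```

The iteration stops changing when the reduced gradient vanishes, which for this example happens at z₁ = z₂ = 1.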

Evaluation of the model

Evaluation of the performance of the proposed models is ordinarily done by comparison of the model prediction with the real values by calculating the various statistical parameters, including average percent relative error (APRE), average absolute percent relative error (AAPRE), standard deviation (SD), root mean square error (RMSE), and coefficient of determination. These statistical parameters are obtained from the following Equations:

$$APRE=\frac{1}{N}\sum_{i=1}^{N}{E}_{i}$$

(24)

where E_i is the percent relative error, defined as⁵³:

$${E}_{i}=\left[\frac{{\left({Q}_{liq,i}\right)}_{real}-{\left({Q}_{liq,i}\right)}_{pred}}{{\left({Q}_{liq,i}\right)}_{real}}\right]\times 100$$

(25)

$$AAPRE=\frac{1}{N}\sum_{i=1}^{N}\left|{E}_{i}\right|$$

(26)

$$SD=\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}{[\frac{{\left({Q}_{liq,i}\right)}_{real}-{\left({Q}_{liq,i}\right)}_{pred}}{{\left({Q}_{liq,i}\right)}_{real}}]}^{2}}$$

(27)

$$RMSE=\sqrt{\frac{\sum_{i=1}^{N}{\left[{\left({Q}_{liq,i}\right)}_{real}-{\left({Q}_{liq,i}\right)}_{pred}\right]}^{2}}{N}}$$

(28)

$${R}^{2}=1-\frac{\sum_{i=1}^{N}{\left[{\left({Q}_{liq,i}\right)}_{real}-{\left({Q}_{liq,i}\right)}_{pred}\right]}^{2}}{\sum_{i=1}^{N}{\left[{\left({Q}_{liq,i}\right)}_{real}-\frac{\sum_{i=1}^{N}{\left({Q}_{liq,i}\right)}_{real}}{N}\right]}^{2}}$$

(29)

Here ${\left({Q}_{liq,i}\right)}_{real}$ is the real liquid flow rate measured in the field test, ${\left({Q}_{liq,i}\right)}_{pred}$ is the predicted liquid flow rate, and N is the total number of data points used for the analysis.
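The statistical parameters of Eqs. (24) to (29) can be computed as in the following sketch. The function name `evaluate` and the three-point example are illustrative and are not data from the study.

```python
import math


def evaluate(real, pred):
    """APRE (%), AAPRE (%), SD, RMSE, and R^2 as defined in Eqs. (24)-(29)."""
    n = len(real)
    rel = [(r - p) / r for r, p in zip(real, pred)]   # fractional relative error
    ss_res = sum((r - p) ** 2 for r, p in zip(real, pred))
    mean_real = sum(real) / n
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    return {
        "APRE": 100.0 * sum(rel) / n,
        "AAPRE": 100.0 * sum(abs(e) for e in rel) / n,
        "SD": math.sqrt(sum(e * e for e in rel) / (n - 1)),
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Illustrative liquid rates in STB/D (not from the paper's dataset).
real = [100.0, 200.0, 400.0]
pred = [110.0, 190.0, 400.0]
metrics = evaluate(real, pred)
print(metrics)
```

A negative APRE indicates that the model overestimates on average, whereas AAPRE measures the size of the error regardless of its sign.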

At the same time, the performance of the machine learning model was assessed using the following graphical tools, which are described further below:

Cross plot: In this most widely used graphical analysis, the predicted values are plotted against the measured values, and the model's accuracy is judged by how closely the data points align with a line of unit slope.

Cumulative frequency plot: This comparative chart allows several models to be compared with each other and shows which model predicts more data with lower error. The closer a model's curve is to the vertical axis, the higher the percentage of data it predicts with a low error, and therefore the more accurate it is than the other models.

Trend plot: This diagram plots both the real data and the model's estimates against a given feature or an index to determine whether the model is valid.

Error distribution plot: The difference between the measured and predicted values is plotted against the actual data to assess the dispersion of the data around the zero-error line and to reveal any patterns in the errors.

Results and discussion

In the present work, models were developed based on 565 production data points collected from different sources in the literature. For all models, 80% of the data points were randomly selected as the training set, and the remaining 20% were used to test and validate the models.

Development of the correlation

In this work, the GRG algorithm is used to develop a correlation for the liquid rate through wellhead chokes. The correlation has four coefficients, which were optimized with respect to AAPRE and RMSE, and is presented below:

$${Q}_{liq}={a}_{1}\times {P}_{wh}^{{a}_{2}}\times {D}_{c}^{{a}_{3}}\times {GLR}^{{a}_{4}}$$

where Q_liq is the liquid flow rate (STB/Day), P_wh is the upstream pressure (psi), D_c is the choke size (1/64 in), and GLR is the gas-to-liquid ratio (SCF/STB).

The equation coefficients a1, a2, a3, and a4 are reported in Table 5.

Table 5 Coefficients of the developed correlation optimized for AAPRE and RMSE.
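To show how the four coefficients of such a power-law correlation can be estimated from data, the sketch below uses a substitute technique, ordinary least squares on the logarithms of the variables, rather than the GRG solver used in this study. It therefore minimizes the squared error in ln(Q_liq), not AAPRE or RMSE, and would generally give somewhat different coefficients. The data are synthetic, generated from made-up coefficients that are not those of Table 5, and the function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_power_law(p_wh, d_c, glr, q_liq):
    """Fit ln Q = ln a1 + a2 ln P + a3 ln D + a4 ln GLR via normal equations."""
    rows = [[1.0, math.log(p), math.log(d), math.log(g)]
            for p, d, g in zip(p_wh, d_c, glr)]
    targets = [math.log(q) for q in q_liq]
    k = 4
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    ln_a1, a2, a3, a4 = solve_linear(ata, atb)
    return math.exp(ln_a1), a2, a3, a4


# Synthetic, noise-free data from made-up coefficients (illustration only).
TRUE = (0.5, 0.9, 1.8, -0.5)
p_data, d_data, g_data, q_data = [], [], [], []
for p in (500.0, 1000.0, 2000.0):
    for d in (16.0, 32.0, 48.0):
        for g in (300.0, 600.0, 900.0):
            p_data.append(p)
            d_data.append(d)
            g_data.append(g)
            q_data.append(TRUE[0] * p ** TRUE[1] * d ** TRUE[2] * g ** TRUE[3])

print(fit_power_law(p_data, d_data, g_data, q_data))
```

A log-linear fit of this kind is often used as a starting point for a non-linear solver such as GRG, which can then refine the coefficients against the chosen error measure.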


Statistical analyses of models

First, the intelligent models and the correlations are compared based on statistical parameters (R², APRE%, AAPRE%, RMSE, and SD) to find the most accurate and efficient models. Table 6 shows the statistical evaluation of the model development, validation, and total sets for the liquid rate through oilfield chokes by the Adaboost-SVR, MARS, MLP-LM, MLP-BR, MLP-SCG, and RBF models. Furthermore, Table 7 reports the statistical assessment of the correlations proposed by Gilbert, Ros, Achong, Baxendell, Pilehvari, and Beiranvand, and of the correlation developed here and optimized for AAPRE and RMSE.

Table 6 Statistical evaluation of the developed models.


Table 7 Statistical error analysis of the earlier correlations used in this study and of the developed correlation.

As seen in Table 6, the Adaboost-SVR model gives the lowest AAPRE for predicting the liquid rate of two-phase flow through wellhead chokes. The total APRE, AAPRE, RMSE, SD, and R² for Adaboost-SVR are − 1.5%, 5.15%, 643.38, 0.086, and 0.9784, respectively. After Adaboost-SVR, the MARS model gives the lowest overall AAPRE. As shown in this table, the total AAPRE for MLP-SCG is 11.44%, which indicates the lowest precision.

Furthermore, according to the results presented in Table 7, the correlation proposed by Pilehvari has the lowest accuracy compared to the other correlations for estimating the liquid rate, while the Beiranvand correlation gives the lowest total AAPRE, 19.03%. After Beiranvand, the Achong correlation gives the lowest overall AAPRE. Comparing the statistical error analyses in Tables 6 and 7, it can be concluded that all the proposed intelligent models had a much higher accuracy than the correlations studied in this research for predicting the liquid rate through the choke.

To further evaluate the validity and reliability of the Adaboost-SVR model, an external validation dataset containing 28 liquid rates in oilfield chokes over a range of operating choke sizes (14–48, in units of 1/64 in), pressures (250–1697.9 psia), and GLRs (600.1–800 SCF/STB) was collected from the literature¹⁷. These data fall entirely outside the training and testing sets utilized for modeling in this paper. As a result, they enable an assessment of the model's performance beyond the data sets used for modeling. The values predicted by Adaboost-SVR are reported in Table 8. The experimental and predicted values presented in this table show that the Adaboost-SVR model demonstrates reliable predictive accuracy even for new liquid rates beyond the choke data used during the modeling process.

Table 8 The experimental and predicted values for evaluation of the Adaboost-SVR model.


Graphical error analysis

Another way to assess model performance and compare it with other models and proposed correlations is graphical error analysis. This graphical strategy is especially helpful when the performance of several models has to be compared. To assess the precision of the intelligent models (Adaboost-SVR, MARS, MLP-LM, MLP-BR, MLP-SCG, and RBF), the predicted liquid rates are plotted against the real values in Fig. 3. It can be concluded that all intelligent models show relatively good accuracy. The Adaboost-SVR model gives the highest accuracy compared to the other models. It can also be concluded from the figure that the MLP trained with the SCG algorithm shows lower accuracy than MLP-LM and MLP-BR.

Furthermore, Fig. 4 is plotted to evaluate the performance of the different correlations. As seen in Fig. 4, all correlations proposed for estimating the liquid rate through wellhead chokes showed weak performance. The Gilbert correlation underestimates the flow rate. Under these conditions, the relative error is positive, and the predictions deviate from the true values. In contrast, the Pilehvari model overestimates the real data points. In other words, this model tends to predict values larger than the real values. In this situation, the relative error is negative, and the predictions again deviate from the true values. The Ros, Achong, Beiranvand, and Baxendell models suffer from random error in predicting the real values and show poor performance in estimating the liquid rate. It can also be concluded from the figure that Gilbert and Pilehvari are the least accurate models, with the largest AAPRE values among all the correlations proposed for estimating the liquid flow through oilfield chokes.

Figures 5 and 6 show the distribution of the percent relative error versus the real flow rate for the AI models and the correlations, in order to reveal any error trend of the predictive models as the flow rate increases. Figures 5 and 6 make clear that the AI models are much more accurate than the presented correlations.

The data points of the developed models lie close to the zero-error line regardless of the flow-rate value. Moreover, these Figures show no error trend as the liquid rate increases, which means that the developed models are suitable over the whole range of the data. It should be noted that these models were trained on a sufficient amount of data.

Furthermore, the cumulative relative frequency of the data is plotted against the absolute relative error (ARE%) to quantify the share of data points that a model predicts within a given error. To obtain the cumulative frequencies, the absolute relative errors are first sorted in ascending order. The cumulative relative frequency of each row is then its row number divided by the total number of data points. Finally, the cumulative frequency is plotted against the absolute relative error⁵⁴.
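The procedure described above can be sketched in a few lines. This is a minimal illustration under the assumption that ARE% = |actual − predicted| / actual × 100. The function names and the sample values are hypothetical and do not come from the paper.

```python
import numpy as np

def cumulative_frequency(actual, predicted):
    """Return sorted ARE% values and the cumulative relative frequency of each row."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    are = np.abs((actual - predicted) / actual) * 100.0
    are_sorted = np.sort(are)                           # step 1: ascending order
    rows = np.arange(1, are_sorted.size + 1)            # row number of each sorted error
    cum_freq = rows / are_sorted.size                   # step 2: row number / total count
    return are_sorted, cum_freq

def fraction_within(actual, predicted, threshold_pct):
    """Share of data points whose ARE% does not exceed the threshold."""
    are_sorted, _ = cumulative_frequency(actual, predicted)
    return float(np.mean(are_sorted <= threshold_pct))

# Hypothetical rates, for illustration only (ARE% of 5, 12, 10 and 50).
actual = np.array([100.0, 200.0, 300.0, 400.0])
predicted = np.array([95.0, 224.0, 330.0, 200.0])

are_sorted, cum_freq = cumulative_frequency(actual, predicted)
print(are_sorted, cum_freq)
print("share within 15% ARE:", fraction_within(actual, predicted, 15.0))
```

Plotting `cum_freq` against `are_sorted` gives the kind of curve described in the text, and reading the curve at a chosen ARE% gives the share of data predicted within that error.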

Figure 7 illustrates the cumulative frequency versus ARE% for the AI models (Adaboost-SVR, MARS, RBF, and MLP-LM) and for the correlations (Gilbert and the correlation developed in this study). As seen in the Figure, the developed AI models estimate the liquid flow better than the correlations studied in this research. The Adaboost-SVR model is the most accurate of the developed artificial intelligence models, predicting 91% of the full data set within 15% ARE. Fig. 7 also shows that the correlation developed in this study, with four coefficients, estimates approximately 60% of the data within 15% ARE. Among the correlations, the Gilbert correlation performs poorly.

Furthermore, Fig. 8 shows trend plots of the liquid rate through oilfield chokes at different choke sizes as predicted by the Adaboost-SVR model. As seen in this Figure, the real and predicted values match very well.

The comparison of AAPRE and RMSE between the proposed AI models and the other correlations is shown in Fig. 9. As seen in this Figure, the Adaboost-SVR model has the lowest AAPRE and RMSE values.
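A comparison of this kind can be reproduced with a short sketch that computes AAPRE and RMSE for several sets of predictions and orders them by AAPRE. The definitions used here are the common ones and are an assumption, since the paper's own equations are not part of this section. The model names and values are placeholders, not results from the study.

```python
import numpy as np

def aapre(actual, predicted):
    """Average absolute percent relative error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def rmse(actual, predicted):
    """Root mean square error, in the units of the liquid rate."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def rank_models(actual, predictions):
    """Return (name, AAPRE, RMSE) tuples sorted from lowest to highest AAPRE."""
    scores = [(name, aapre(actual, p), rmse(actual, p)) for name, p in predictions.items()]
    return sorted(scores, key=lambda item: item[1])

# Placeholder data: two hypothetical models evaluated on the same two points.
actual = np.array([100.0, 200.0])
predictions = {
    "model_a": np.array([110.0, 190.0]),
    "model_b": np.array([130.0, 150.0]),
}

for name, a, r in rank_models(actual, predictions):
    print(f"{name}: AAPRE = {a:.2f}%, RMSE = {r:.2f}")
```

AAPRE weights every point by its own magnitude, whereas RMSE is dominated by the largest absolute deviations, so the two rankings need not always agree.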

Sensitivity analysis

A sensitivity analysis of the input parameters was performed using Eq. (30), based on the input data points and the real liquid flow rate data. It quantifies the effect of each input on the liquid flow rate through the choke by means of the Pearson correlation coefficient, which is defined as follows^26,55:

$$r=\frac{\sum_{i=1}^{n}({I}_{ki}-{\overline{I}}_{k})({O}_{i}-\overline{O})}{\sqrt{\sum_{i=1}^{n}{({I}_{ki}-{\overline{I}}_{k})}^{2}\sum_{i=1}^{n}{({O}_{i}-\overline{O})}^{2}}}$$

(30)

where ${I}_{ki}$ denotes the i-th value of the k-th variable of the model (P_wh, D (1/64), GLR, and Q_L), ${\overline{I}}_{k}$ denotes the average value of that variable, and n is the number of data points. ${O}_{i}$ and $\overline{O}$ denote the i-th predicted liquid flow rate and the average predicted liquid flow rate, respectively²⁵. Figure 10 illustrates the relative effect of the input parameters on the liquid flow rate. The choke size has a positive influence on the target value, whereas both P_wh and GLR affect it negatively. This implies that a rise in P_wh or GLR is associated with a reduction in the liquid flow rate through chokes. As the Figure shows, the choke size has the largest effect on the liquid flow rate. Furthermore, the r-value of smallest magnitude among the input variables is −0.045, for the gas–liquid ratio, which suggests that GLR has the least impact on the flow rate.
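The relevancy factor of Eq. (30) can be computed as in the following minimal sketch. The data are synthetic and generated from an arbitrary toy relation chosen only so that the example runs. They are not the paper's dataset and the toy relation is not a physical choke model. The function and variable names are illustrative.

```python
import numpy as np

def relevancy_factor(x, y):
    """Pearson correlation coefficient between one input column x and the output y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    # The square root covers the product of both sums of squares.
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Synthetic inputs (assumed ranges, for illustration only).
rng = np.random.default_rng(0)
n = 200
inputs = {
    "D_c": rng.uniform(16.0, 64.0, n),
    "P_wh": rng.uniform(250.0, 1700.0, n),
    "GLR": rng.uniform(100.0, 800.0, n),
}
# Arbitrary toy output, not a physical correlation.
rate = (50.0 * inputs["D_c"] - 0.2 * inputs["P_wh"]
        - 0.05 * inputs["GLR"] + rng.normal(0.0, 20.0, n))

r_values = {name: relevancy_factor(col, rate) for name, col in inputs.items()}
# List the inputs from most to least influential by |r|.
for name, r in sorted(r_values.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name}: r = {r:+.3f}")
```

The sign of r gives the direction of the association and its magnitude gives the strength, so inputs are ranked by |r| rather than by r itself. Because r measures only linear association, a small |r| does not rule out a nonlinear dependence.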

Outlier diagnostics and model reliability assessment

To find suspicious and out-of-domain data, a Williams plot is drawn using the leverage technique⁵⁶. Such data are not necessarily erroneous; their P_wh, D_c, and GLR values may simply differ from those of the other data while still lying in a valid range. Data with a hat value between 0 and Hat* and a standardized residual (SR) between −3 and 3 are valid. Data with SR values greater than 3 or lower than −3 are experimentally suspect (regardless of their hat value), and data with a hat value higher than Hat* and SR between −3 and 3 are outside the model scope^57,58. SR, Hat*, and the hat matrix H are given by⁵⁹:

$$SR=\frac{(outputs-targets)}{({(1-h)}^{0.5}\times RMSE)}$$

(31)

$${Hat}^{*}=\frac{3(\text{number of input variables}+1)}{\text{number of data points}}$$

(32)

$$H=X{{(X}^{t}X)}^{-1}{X}^{t}$$

(33)

X is a (k × j) matrix, in which k and j are the total number of data points and the number of model parameters, respectively, and t denotes the transpose. The diagonal elements of H are the hat (leverage) values of the individual data points, from which the suspicious data are identified. Figure 11 shows the Williams plot for the Adaboost-SVR model⁶⁰. According to the plot, the number of data points outside the leverage limits is insignificant and does not affect the model accuracy considerably, and most of the data lie in the valid zone of the Williams plot. As depicted in Fig. 11, most of the data points are situated within the ranges 0 ≤ H ≤ H* and −3 ≤ R ≤ 3, where R is the standardized residual. Data points with lower values of R and H are more reliable. The data points outside the model's intended scope amount to only 2.1%, which is insignificant given the substantial number of data points used during the model's development. These findings indicate that the proposed Adaboost-SVR model is highly reliable.
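The leverage calculation of Eqs. (31) to (33) can be sketched as follows. This is a minimal illustration on synthetic data, assuming that X holds one row per data point and one column per input variable with no intercept column added. The function names, the random data, and the stand-in predictions are all hypothetical. As a worked value, three inputs and 565 data points would give Hat* = 3(3 + 1)/565 ≈ 0.0212, assuming all 565 points enter the Williams plot.

```python
import numpy as np

def hat_values(X):
    """Diagonal of H = X (X^t X)^-1 X^t, computed without forming the full k x k matrix."""
    X = np.asarray(X, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    return np.einsum("ij,jk,ik->i", X, xtx_inv, X)

def leverage_threshold(n_inputs, n_points):
    """Warning leverage Hat* = 3 (number of inputs + 1) / number of data points."""
    return 3.0 * (n_inputs + 1) / n_points

def standardized_residuals(outputs, targets, h):
    """SR = (outputs - targets) / (sqrt(1 - h) * RMSE), as in Eq. (31)."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    rmse = np.sqrt(np.mean((outputs - targets) ** 2))
    return (outputs - targets) / (np.sqrt(1.0 - h) * rmse)

def classify(h, sr, h_star):
    """Split points into valid, suspected, and out-of-scope groups."""
    valid = (h <= h_star) & (np.abs(sr) <= 3.0)
    suspected = np.abs(sr) > 3.0                      # regardless of hat value
    out_of_scope = (h > h_star) & (np.abs(sr) <= 3.0)
    return valid, suspected, out_of_scope

# Synthetic stand-in data (not the paper's dataset).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
targets = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)
outputs = targets + rng.normal(scale=0.2, size=100)   # stand-in for model predictions

h = hat_values(X)
h_star = leverage_threshold(n_inputs=3, n_points=100)
sr = standardized_residuals(outputs, targets, h)
valid, suspected, out_of_scope = classify(h, sr, h_star)
print("valid:", valid.sum(), "suspected:", suspected.sum(), "out of scope:", out_of_scope.sum())
```

A Williams plot is then a scatter of `sr` against `h` with a vertical line at `h_star` and horizontal lines at ±3.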

Conclusions

In this study, the liquid rate through chokes was modeled using 565 data points comprising P_wh, GLR, and D_c. Six intelligent models were developed for forecasting the liquid rate. Additionally, an empirical equation with four coefficients based on P_wh, GLR, and D_c was developed. Statistical analysis confirms that all the models developed in this study can properly estimate the liquid rate through oilfield chokes. Nevertheless, the accuracy of the different models can be ranked as follows:

Adaboost-SVR > MARS > RBF > MLP-LM > MLP-BR > MLP-SCG.

The Adaboost-SVR model is the most precise of the intelligent models. Its statistical parameters are an R² of 0.9784, an RMSE of 643.38, an APRE of −1.5%, and an AAPRE of 5.15%. The correlation developed in this work with four coefficients outperformed the earlier correlations (Supplementary file). Furthermore, the sensitivity analysis indicated that D_c has a positive effect and the strongest influence on the liquid rate through chokes, while GLR and P_wh have a negative effect. Finally, outlier detection with the leverage approach revealed that only 2.1% of the real data points are doubtful.

Data availability

The datasets used during the current study are available as a Supplementary file.

Abbreviations

NN:: Neural network
MARS:: Multivariate adaptive regression spline
Adaboost-SVR:: Adaptive boosting support vector regression
R² :: Coefficient of determination
SCG:: Scaled conjugate gradient
GLR:: Gas to liquid ratio
RMSE:: Root mean square error
LM:: Levenberg–Marquardt
AI:: Artificial intelligence
RBF-GM:: Radial basis function-genetic model
MLP:: Multilayer perceptron
ANN:: Artificial neural network
BR:: Bayesian-regularization
APRE%:: Average percent relative error
RBF:: Radial basis function
P_wh :: Wellhead pressure
SD:: Standard deviation
SVM:: Support vector machine
D_c:: Choke size
SR:: Standardized residual
γo:: Oil specific gravity
AAPRE%:: Average absolute percent relative error
γg:: Gas specific gravity
T:: Temperature
W.C:: Water cut

References

Sanni, K., Longe, P. & Okotie, S. New production rate model of wellhead choke for Niger delta oil wells. J. Pet. Sci. Technol. 10, 41–49 (2020).
Guo, B. Petroleum Production Engineering: A Computer-Assisted Approach (Elsevier, 2011).
Elgibaly, A. & Nashawi, I. New correlations for critical and subcritical two-phase flow through wellhead chokes. J. Canad. Pet. Technol. https://doi.org/10.2118/98-06-04 (1998).
Sachdeva, R., Schmidt, Z., Brill, J. & Blais, R. SPE Annual Technical Conference and Exhibition. (OnePetro).
Al-Attar, H. H. Latin American and Caribbean Petroleum Engineering Conference. (OnePetro).
Tangren, R., Dodge, C. & Seifert, H. Compressibility effects in two-phase flow. J. Appl. Phys. 20, 637–645 (1949).
Gilbert, W. Drilling and Production Practice. (OnePetro).
Ros, N. An analysis of critical simultaneous gas/liquid flow through a restriction and its application to flowmetering. Appl. Sci. Res. 9, 374–388 (1960).
Achong, I. Revised Bean Performance Formula for Lake Maracaibo Wells (Shell Oil Co., 1961).
Baxendell, P. Bean performance-lake wells. Shell Internal Rep (1957).
Pilehvari, A. A. Experimental Study of Critical Two-Phase Flow Through Wellhead Chokes (University of Tulsa, 1981).
Mirzaei-Paiaman, A. & Salavati, S. A new empirical correlation for sonic simultaneous flow of oil and gas through wellhead chokes for Persian oil fields. Energy Sour. Part A Recov. Util. Environ. Effects 35, 817–825 (2013).
Baxendell, P. Producing wells on casing flow-an analysis of flowing pressure gradients. Trans. AIME 213, 202–206 (1958).
Safar Beiranvand, M., Mohammadmoradi, P., Aminshahidy, B., Fazelabdolabadi, B. & Aghahoseini, S. New multiphase choke correlations for a high flow rate Iranian oil field. Mech. Sci. 3, 43–47 (2012).
Poettmann, F. & Beck, R. New charts developed to predict gas-liquid flow through chokes. World Oil 184, 95–100 (1963).
Al-Attar, H. & Abdul-Majeed, G. Revised bean performance equation for East Baghdad oil wells. SPE Prod. Eng. 3, 127–131 (1988).
Abdul-Majeed, G. H. & Maha, R.A.-A. Correlations developed to predict two-phase flow through wellhead chokes. J. Canad. Pet. Technol. https://doi.org/10.2118/91-06-05 (1991).
Fortunati, F. SPE European Spring Meeting. (OnePetro).
Ashford, F. An evaluation of critical multiphase flow performance through wellhead chokes. J. Pet. Technol. 26, 843–850 (1974).
Safar Beiranvand, M. & Babaei Khorzoughi, M. Introducing a new correlation for multiphase flow through surface chokes with newly incorporated parameters. SPE Prod. Oper. 27, 422–428 (2012).
Shams, R., Esmaili, S., Rashid, S. & Suleymani, M. An intelligent modeling approach for prediction of thermal conductivity of CO₂. J. Nat. Gas Sci. Eng. 27, 138–150 (2015).
Rashid, S., Ghamartale, A., Abbasi, J., Darvish, H. & Tatar, A. Prediction of critical multiphase flow through chokes by using a rigorous artificial neural network method. Flow Meas. Instrum. 69, 101579 (2019).
Gorjaei, R. G., Songolzadeh, R., Torkaman, M., Safari, M. & Zargar, G. A novel PSO-LSSVM model for predicting liquid rate of two phase flow through wellhead chokes. J. Nat. Gas Sci. Eng. 24, 228–237 (2015).
Choubineh, A. et al. Improved predictions of wellhead choke liquid critical-flow rates: modelling based on hybrid neural network training learning based optimization. Fuel 207, 547–560 (2017).
Ganat, T. A. & Hrairi, M. A new choke correlation to predict flow rate of artificially flowing wells. J. Pet. Sci. Eng. 171, 1378–1389 (2018).
Ghorbani, H. et al. Adaptive neuro-fuzzy algorithm applied to predict and control multi-phase flow rates through wellhead chokes. Flow Meas. Instrum. 76, 101849 (2020).
Al-Attar, H. H. SPE Latin America and Caribbean Petroleum Engineering Conference. SPE-120788-MS (SPE).
Mirzaei-Paiaman, A. & Salavati, S. The application of artificial neural networks for the prediction of oil production flow rate. Energy Sourc. Part A Recov. Util. Environ. Effects 34, 1834–1843 (2012).
Pinkus, A. Approximation theory of the MLP model in neural networks. Acta Numer. 8, 143–195 (1999).
Kanal, L. N. Encyclopedia of Computer Science 1383–1385 (2003).
Rosenblatt, F. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms (Cornell Aeronautical Lab Inc Buffalo NY, 1961).
Mikelsten, D., Teigens, V. & Skalfist, P. Umjetna inteligencija: četvrta industrijska revolucija. (Cambridge Stanford Books).
Teigens, V. Umjetna opća inteligencija. Vol. 1 (Cambridge Stanford Books).
Driss, S. B., Soua, M., Kachouri, R. & Akil, M. Real-Time Image and Video Processing 2017. 32–42 (SPIE).
Kashaninejad, M., Dehghani, A. & Kashiri, M. Modeling of wheat soaking using two artificial neural networks (MLP and RBF). J. Food Eng. 91, 602–607 (2009).
Mia, M. M. A., Biswas, S. K., Urmi, M. C. & Siddique, A. An algorithm for training multilayer perceptron (MLP) for Image reconstruction using neural network without overfitting. Int. J. Sci. Technol. Res. 4, 271–275 (2015).
Camacho Olmedo, M. T., Paegelow, M., Mas, J.-F. & Escobar, F. Geomatic Approaches for Modeling land Change Scenarios. An introduction (Springer, 2018).
Hemmati-Sarapardeh, A., Ghazanfari, M. H., Ayatollahi, S. & Masihi, M. Accurate determination of the CO₂-crude oil minimum miscibility pressure of pure and impure CO₂ streams: A robust modelling approach. Canad. J. Chem. Eng. 94, 253–261 (2016).
Najafi-Marghmaleki, A. et al. On the prediction of interfacial tension (IFT) for water-hydrocarbon gas system. J. Mol. Liq. 224, 976–990 (2016).
Najafi-Marghmaleki, A., Barati-Harooni, A., Tatar, A., Mohebbi, A. & Mohammadi, A. H. On the prediction of Watson characterization factor of hydrocarbons. J. Mol. Liq. 231, 419–429 (2017).
Freund, Y. & Schapire, R. E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139 (1997).
Mohammadi, M.-R. et al. Modeling hydrogen solubility in hydrocarbons using extreme gradient boosting and equations of state. Sci. Rep. 11, 17911 (2021).
Vapnik, V. Pattern recognition using generalized portrait method. Autom. Remote Control 24, 774–780 (1963).
Esfahani, S., Baselizadeh, S. & Hemmati-Sarapardeh, A. On determination of natural gas density: Least square support vector machine modeling approach. J. Nat. Gas Sci. Eng. 22, 348–358 (2015).
Nejatian, I., Kanani, M., Arabloo, M., Bahadori, A. & Zendehboudi, S. Prediction of natural gas flow through chokes using support vector machine algorithm. J. Nat. Gas Sci. Eng. 18, 155–163 (2014).
Mohammadi, M.-R. et al. Application of robust machine learning methods to modeling hydrogen solubility in hydrocarbon fuels. Int. J. Hydrog. Energy 47, 320–338 (2022).
Cherkassky, V. & Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 17, 113–126 (2004).
Nakhaei-Kohani, R. et al. Modeling solubility of oxygen in ionic liquids: Chemical structure-based machine learning systems compared to equations of state. Fluid Phase Equilib. 566, 113630 (2023).
Mohammadi, M.-R. et al. Modeling hydrogen solubility in alcohols using machine learning models and equations of state. J. Mol. Liq. 346, 117807 (2022).
Naser, A. H., Badr, A. H., Henedy, S. N., Ostrowski, K. A. & Imran, H. Application of multivariate adaptive regression splines (MARS) approach in prediction of compressive strength of eco-friendly concrete. Case Stud. Constr. Mater. 17, e01262 (2022).
Ameli, F., Hemmati-Sarapardeh, A., Dabir, B. & Mohammadi, A. H. Determination of asphaltene precipitation conditions during natural depletion of oil reservoirs: A robust compositional approach. Fluid Phase Equilib. 412, 235–248 (2016).
Mousavi, S. P. et al. Viscosity of ionic liquids: Application of the Eyring’s theory and a committee machine intelligent system. Molecules 26, 156 (2020).
Hu, S., Wang, H., Liu, Z. & Wang, Y. Design of a three-dimensional current sensor with measuring upwelling. Flow Meas. Instrum. 69, 101606 (2019).
Shateri, M. et al. Comparative analysis of machine learning models for nanofluids viscosity assessment. Nanomaterials 10, 1767 (2020).
Rezaei, F., Jafari, S., Hemmati-Sarapardeh, A. & Mohammadi, A. H. Modeling of gas viscosity at high pressure-high temperature conditions: Integrating radial basis function neural network with evolutionary algorithms. J. Pet. Sci. Eng. 208, 109328 (2022).
Rousseeuw, P. J. & Leroy, A. M. Robust Regression and Outlier Detection (Wiley, 2005).
Gramatica, P. Principles of QSAR models validation: Internal and external. QSAR Comb. Sci. 26, 694–701 (2007).
Gharagheizi, F. et al. Evaluation of thermal conductivity of gases at atmospheric pressure through a corresponding states method. Ind. Eng. Chem. Res. 51, 3844–3849 (2012).
Mohammadi, M.-R. et al. Modeling the solubility of light hydrocarbon gases and their mixture in brine with machine learning and equations of state. Sci. Rep. 12, 14943 (2022).
Sarapardeh, A. H., Larestani, A., Menad, N. A. & Hajirezaie, S. Applications of Artificial Intelligence Techniques in the Petroleum Industry (Gulf Professional Publishing, 2020).

Author information

Authors and Affiliations

Department of Petroleum Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
Mohammad-Saber Dabiri, Mahin Schaffie & Abdolhossein Hemmati-Sarapardeh
Ufa State Petroleum Technological University, Ufa, Russia, 450064
Fahimeh Hadavimoghaddam
Institute of Petroleum Engineering, School of Chemical Engineering, University of Tehran, P.O. Box: 11155-4563, Tehran, Iran
Sefatallah Ashoorian
State Key Laboratory of Petroleum Resources and Prospecting, China University of Petroleum (Beijing), Beijing, China
Abdolhossein Hemmati-Sarapardeh

Contributions

M.-S.D.: Writing-Original Draft, Data curation, Formal analysis. F.H.: Writing-Review & Editing, Validation, Methodology. S.A.: Writing-Review & Editing, Validation. M.S.: Writing-Review & Editing, Validation. A.H.-S.: Writing-Review & Editing, Methodology, Validation, Supervision.

Corresponding authors

Correspondence to Mohammad-Saber Dabiri or Abdolhossein Hemmati-Sarapardeh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Dabiri, MS., Hadavimoghaddam, F., Ashoorian, S. et al. Modeling liquid rate through wellhead chokes using machine learning techniques. Sci Rep 14, 6945 (2024). https://doi.org/10.1038/s41598-024-54010-2

Received: 13 April 2023
Accepted: 07 February 2024
Published: 23 March 2024
DOI: https://doi.org/10.1038/s41598-024-54010-2
