Abstract
This paper introduces a novel family of probability distributions built with the well-known T–X family approach. The proposed family is called the “Novel Generalized Exponent Power X” (NGEP-X) family of distributions. A three-parameter special sub-model of the family is derived and named the “Novel Generalized Exponent Power Weibull” distribution (NGEP-Wei for short). For the proposed family, several statistical properties are derived, including the hazard rate function, moments, moment generating function, order statistics, residual life, and reverse residual life. The model parameters are estimated by maximum likelihood, and a comprehensive Monte Carlo simulation study is conducted to assess the performance of this estimation method. Finally, the Akaike information criterion (AINC), the corrected Akaike information criterion (CAINC), the Bayesian information criterion (BINC), the Hannan–Quinn information criterion (HQINC), the Cramér–von Mises (CRMS) statistic, and the Anderson–Darling (ANDR) statistic are used for model comparison. The NGEP-Wei distribution is compared with rival distributions on two COVID-19 data sets. On both data sets, the NGEP-Wei distribution attains the lowest values of these criteria among the competing distributions included in this study.
Introduction
In the literature on distribution theory, researchers have proposed many probability distributions for analyzing and predicting real-world phenomena. Because these phenomena are complex, no single probability distribution has yet been proposed that can handle every one of them. The exponential and Rayleigh distributions are among the most popular and well-known distributions and are widely used in lifetime analysis. However, when real-life phenomena are complex, these distributions do not represent the data accurately. For example, the exponential distribution can only describe data with a constant failure (or hazard) rate function, whereas the Rayleigh distribution can only model data with an increasing failure rate function. The Weibull distribution is also considered one of the most important lifetime distributions; it includes the exponential and Rayleigh distributions as special cases and can model data with increasing, decreasing, and constant failure rate functions. However, in many applied fields, especially in biomedical and engineering areas, the hazard function changes non-monotonically with time. In such cases, the Weibull distribution is not a suitable choice; see Almalki and Yuan1 for more reading. To deal with such difficulties, generalized versions of these classical models are needed. To this end, researchers are developing new methods (new families of distributions) to obtain generalized versions of the classical distributions with greater distributional flexibility.
Most of the new methods in the literature are developed by adding one or more parameters to a baseline (or existing) distribution to obtain an updated version that is more flexible in practical modeling; see Usman et al.2 for more reading. In the recent past, several families of probability distributions have been proposed in the literature on distribution theory. For example, Mudholkar and Srivastava3 proposed a very simple approach called the exponentiated family of distributions, which inserts only one additional parameter into the baseline distribution. The CDF (cumulative distribution function) of the exponentiated family is given by
where \(\phi > 0\) is an additional shape parameter and \(A\left( {x;\mu } \right)\) is the CDF of any baseline random variable depending on the parameter vector \(\mu\). Marshall and Olkin4 introduced a new method for obtaining modified versions of existing distributions, called the Marshall–Olkin family of distributions. The CDF of the Marshall–Olkin family is defined by
Using Eq. (2), Marshall and Olkin4 derived two special sub-models, namely the Marshall–Olkin exponential and Marshall–Olkin Weibull distributions. Subsequently, several probability distributions based on Eq. (2) have been proposed in the literature; see Ghitany et al.5, Gui et al.6, and Saboor et al.7 for more reading.
Similarly, Mahdavi and Kundu8 proposed a new family of distributions by incorporating one additional parameter into the baseline distribution. They named their method the Alpha Power transformation (APTra) family of distributions. The CDF of the APTra family is defined by
Using Eq. (3), Mahdavi and Kundu8 modified the exponential distribution and named the result the alpha power transformed exponential (APTra-Expo) distribution. Building on Eq. (3), various further contributions have been made in the literature on distribution theory; see Dey et al.9, Ihtisham et al.10, and Hassan et al.11.
More recently, Shah et al.12 proposed a new family of probability distributions by incorporating one additional parameter into the baseline distribution. Their method is called the new generalized logarithmic–X (NGLog–X) family of distributions. The CDF of the NGLog-X family is given by
Using Eq. (4), Shah et al.12 modified the Weibull distribution and named the result the new generalized logarithmic Weibull (NGLog-Wei) distribution. For recent developments in distributional approaches, we refer to a superior extension of the Lomax distribution with application to COVID-19, proposed by Alsuhabi et al.13; a novel logarithmic approach to generate new probability distributions, proposed by Zhao et al.14; a novel updated-W family of distributions, proposed by Alnssyan et al.15; a new Type 1 Alpha Power family of distributions, proposed by Tekle et al.16; the Type II-Topp-Leone-Gompertz-G family of distributions with applications to COVID-19 data, proposed by Chamunorwa et al.17; a Weighted Cosine-G family of distributions, proposed by Odhah et al.18; some inferences on the three-parameter Birnbaum–Saunders distribution, developed by Shakil et al.19; a statistical analysis of excess mortality mean at COVID-19 in 2020–202, proposed by Raihen et al.20; the exponentiated generalized Weibull exponential distribution, proposed by Abonongo et al.21; a case study of Kuwait mortality during the consecutive waves of COVID-19, by BuHamra et al.22; and a new inverse Rayleigh distribution with applications to COVID-19 data, developed by El-Sherpieny et al.23.
In this paper, motivated by the above discussion, we propose a new method for obtaining more flexible probability distributions. The method is obtained by implementing the T–X family approach proposed by Alzaatreh et al.24 and is named the Novel Generalized Exponent Power-X (NGEP-X) family of distributions. Based on the NGEP-X method, an improved version of the Weibull distribution is introduced that offers flexible shapes of the PDF (probability density function) and HF (hazard function). Using COVID-19 data sets, the fitting power of the proposed model is compared with the Alpha Power Transformed Weibull, New Reduced Logarithmic Weibull, Kumaraswamy Weibull, Weibull, Marshall–Olkin Nadarajah–Haghighi, and Gull Alpha Power Weibull distributions; see Table 3 for references to these competing distributions.
The rest of the paper is organized into seven sections. In section "NGEP-X family", the newly proposed family of distributions is derived. Section "NGEP-Wei distribution" gives a special sub-model of the proposed family, named the Novel Generalized Exponent Power Weibull distribution, and graphically illustrates the shapes of its CDF, survival function, PDF, and HF. The mathematical properties of the NGEP-X family are derived in section "Basic mathematical properties". Maximum likelihood estimation of the model parameters and a simulation study are presented in section "Estimation, experiment and simulation". Practical applications to two COVID-19 data sets (describing mortality in Canada and Mexico) are discussed in Section 6. Finally, Section 7 gives the "Concluding remarks" based on the analyses in this paper.
NGEP-X family
In this section, the CDF, PDF, SF (survival function), HF, and CHF (cumulative hazard function) of the NGEP-X family of distributions are computed.
Definition
Let \(m\left( t \right) = \alpha e^{ - \alpha t}\) be the PDF of an exponential random variable, say T, where \(T \in \left[ {b_{1} ,b_{2} } \right]\) for \(- \infty \le b_{1} < b_{2} \le \infty\), and let \(K\left[ {A\left( {x;\mu } \right)} \right]\) be a function of the CDF \(A\left( {x;\mu } \right)\) of a random variable, say X, satisfying the following three conditions.
I. \(K\left[ {A\left( {x;\mu } \right)} \right] \in \left[ {b_{1} ,b_{2} } \right]\),
II. \(K\left[ {A\left( {x;\mu } \right)} \right]\) is differentiable and monotonically increasing,
III. \(K\left[ {A\left( {x;\mu } \right)} \right] \to b_{1}\) as \(x \to - \infty\) and \(K\left[ {A\left( {x;\mu } \right)} \right] \to b_{2}\) as \(x \to \infty\).
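According to Alzaatreh et al.24, the CDF of the T–X family is defined by
$$Y\left( {x;\mu } \right) = \int_{{b_{1} }}^{{K\left[ {A\left( {x;\mu } \right)} \right]}} {m\left( t \right)dt} , \tag{5}$$
where \(K\left[ {A(x;\mu )} \right]\) satisfies conditions (I)–(III); see Alzaatreh et al.24. The PDF \(y\left( {x;\mu } \right)\) of the T–X distribution associated with Eq. (5) is as follows:
$$y\left( {x;\mu } \right) = m\left\{ {K\left[ {A\left( {x;\mu } \right)} \right]} \right\}\frac{d}{dx}K\left[ {A\left( {x;\mu } \right)} \right]. \tag{6}$$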
Now, taking \(m\left( t \right) = e^{ - t}\), the PDF of the exponential distribution with rate parameter \(\alpha = 1\), and setting the upper limit \(K\left[ {A(x;\mu )} \right] = - \log \left( {e^{{\phi A\left( {x;\mu } \right)^{2} }} - e^{\phi } A(x;\mu )^{2} } \right)\) and the lower limit \(b_{1} = 0\) in Eq. (5), we get the CDF \(Y\left( {x;\phi ,\mu } \right)\) of the NGEP-X family. Since \(\int_{0}^{K} {e^{ - t} dt} = 1 - e^{ - K}\), it is given by
$$Y\left( {x;\phi ,\mu } \right) = 1 - e^{{\phi A\left( {x;\mu } \right)^{2} }} + e^{\phi } A\left( {x;\mu } \right)^{2} , \tag{7}$$
where \(A\left( {x;\mu } \right)\) is the CDF of any baseline model, which may depend on \(\mu \in {\mathbb{R}}\). To verify that \(Y\left( {x;\phi ,\mu } \right)\) is a valid CDF, we have the following two propositions.
Proposition 1
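For the CDF \(Y\left( {x;\phi ,\mu } \right)\) derived in Eq. (7), we need to prove that \(\mathop {\lim }\limits_{x \to - \infty } Y\left( {x;\phi ,\mu } \right) = 0\) and \(\mathop {\lim }\limits_{x \to \infty } Y\left( {x;\phi ,\mu } \right) = 1\).
Proof
Since \(A\left( {x;\mu } \right) \to 0\) as \(x \to - \infty\) and \(A\left( {x;\mu } \right) \to 1\) as \(x \to \infty\), we have
$$\mathop {\lim }\limits_{x \to - \infty } Y\left( {x;\phi ,\mu } \right) = 1 - e^{0} + e^{\phi } \cdot 0 = 0,$$
and
$$\mathop {\lim }\limits_{x \to \infty } Y\left( {x;\phi ,\mu } \right) = 1 - e^{\phi } + e^{\phi } = 1.$$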
Proposition 2
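The CDF \(Y\left( {x;\phi ,\mu } \right)\) derived in Eq. (7) is right-continuous (RC) and differentiable.
Hence, from Propositions 1 and 2, we conclude that the CDF \(Y\left( {x;\phi ,\mu } \right)\) in Eq. (7) is a valid CDF. Corresponding to Eq. (7), the PDF \(y\left( {x;\phi ,\mu } \right)\) of the NGEP-X family is given by
$$y\left( {x;\phi ,\mu } \right) = 2a\left( {x;\mu } \right)A\left( {x;\mu } \right)\left[ {e^{\phi } - \phi e^{{\phi A\left( {x;\mu } \right)^{2} }} } \right], \tag{8}$$
where \(\frac{d}{dx}A\left( {x;\mu } \right) = a\left( {x;\mu } \right)\). The SF \(S\left( {x;\phi ,\mu } \right)\), HF \(h\left( {x;\phi ,\mu } \right)\), and CHF \(H\left( {x;\phi ,\mu } \right)\) of the NGEP-X family are respectively given by
$$S\left( {x;\phi ,\mu } \right) = e^{{\phi A\left( {x;\mu } \right)^{2} }} - e^{\phi } A\left( {x;\mu } \right)^{2} , \tag{9}$$
$$h\left( {x;\phi ,\mu } \right) = \frac{{2a\left( {x;\mu } \right)A\left( {x;\mu } \right)\left[ {e^{\phi } - \phi e^{{\phi A\left( {x;\mu } \right)^{2} }} } \right]}}{{e^{{\phi A\left( {x;\mu } \right)^{2} }} - e^{\phi } A\left( {x;\mu } \right)^{2} }}, \tag{10}$$
and
$$H\left( {x;\phi ,\mu } \right) = - \log \left( {e^{{\phi A\left( {x;\mu } \right)^{2} }} - e^{\phi } A\left( {x;\mu } \right)^{2} } \right). \tag{11}$$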
NGEP-Wei distribution
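This section presents a three-parameter special sub-model of the NGEP-X family of distributions, called the Novel Generalized Exponent Power Weibull distribution (NGEP-Wei for short). Let \(A(x;\mu )\) and \(a(x;\mu )\) be the CDF and PDF of the classical Weibull distribution, given by \(A(x;\mu ) = 1 - e^{{ - \alpha x^{\delta } }}\) and \(a(x;\mu ) = \alpha \delta x^{\delta - 1} e^{{ - \alpha x^{\delta } }}\), respectively, where \(\alpha ,\delta ,x \in {\mathbb{R}}^{ + }\) and \(\mu = \left( {\alpha ,\delta } \right)\). Using \(A(x;\mu ) = 1 - e^{{ - \alpha x^{\delta } }}\) in Eq. (7), we obtain the updated version of the Weibull distribution. The CDF \(Y(x;\phi ,\mu )\) and SF \(S(x;\phi ,\mu )\) of the NGEP-Wei distribution are given, respectively, by
$$Y(x;\phi ,\mu ) = 1 - \exp \left\{ {\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right\} + e^{\phi } \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} , \tag{12}$$
and
$$S(x;\phi ,\mu ) = \exp \left\{ {\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right\} - e^{\phi } \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} . \tag{13}$$
Plots of \(Y(x;\phi ,\mu )\) and \(S(x;\phi ,\mu )\) are shown in Fig. 1. They are obtained with the parameter values (i) \(\phi =\) 0.1, \(\alpha =\) 2.3, and \(\delta =\) 1.3 (red line), (ii) \(\phi =\) 0.5, \(\alpha =\) 0.4, and \(\delta =\) 2.9 (green line), (iii) \(\phi =\) 0.2, \(\alpha =\) 1.2, and \(\delta =\) 1.9 (black line), and (iv) \(\phi =\) 1.1, \(\alpha =\) 0.1, and \(\delta =\) 3.4 (blue line). Figure 1 visually supports that the proposed model has a valid CDF for these parameter values.
Furthermore, the PDF \(y(x;\phi ,\mu )\), HF \(h(x;\phi ,\mu )\), and CHF \(H(x;\phi ,\mu )\) corresponding to Eq. (12) are given in Eqs. (14)–(16), respectively, by
$$y(x;\phi ,\mu ) = 2\alpha \delta x^{\delta - 1} e^{{ - \alpha x^{\delta } }} \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)\left[ {e^{\phi } - \phi \exp \left\{ {\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right\}} \right], \tag{14}$$
$$h(x;\phi ,\mu ) = \frac{{2\alpha \delta x^{\delta - 1} e^{{ - \alpha x^{\delta } }} \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)\left[ {e^{\phi } - \phi \exp \left\{ {\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right\}} \right]}}{{\exp \left\{ {\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right\} - e^{\phi } \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} }}, \tag{15}$$
and
$$H(x;\phi ,\mu ) = - \log \left[ {\exp \left\{ {\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right\} - e^{\phi } \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{2} } \right]. \tag{16}$$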
Right-skewed, left-skewed, and symmetrical plots of the PDF \(y(x;\phi ,\mu )\) are shown in Fig. 2. The plots in the left panel of Fig. 2 are obtained with the parameter values (i) \(\phi =\) 0.1, \(\alpha =\) 0.4, and \(\delta =\) 5.0 (red line), (ii) \(\phi =\) 1.0, \(\alpha =\) 1.3, and \(\delta =\) 2.5 (green line), (iii) \(\phi =\) 0.1, \(\alpha =\) 1.0, and \(\delta =\) 4.0 (black line), and (iv) \(\phi =\) 0.3, \(\alpha =\) 2.3, and \(\delta =\) 1.5 (blue line). The plots in the right panel of Fig. 2 are obtained with the parameter values (i) \(\phi =\) 0.1, \(\alpha =\) 1.1, and \(\delta =\) 3.7 (red line), (ii) \(\phi =\) 0.1, \(\alpha =\) 2.3, and \(\delta =\) 1.3 (green line), (iii) \(\phi =\) 1.0, \(\alpha =\) 0.2, and \(\delta =\) 4.5 (black line), and (iv) \(\phi =\) 1.1, \(\alpha =\) 0.1, and \(\delta =\) 3.6 (blue line).
Similarly, increasing, decreasing, unimodal, and bathtub-shaped plots of the HF \(h(x;\phi ,\mu )\) are shown in Fig. 3. The plots in the left panel of Fig. 3 are obtained with the parameter values (i) \(\phi =\) 1.6, \(\alpha =\) 0.36, and \(\delta =\) 0.3 (red line), (ii) \(\phi =\) 1.4, \(\alpha =\) 0.5, and \(\delta =\) 0.46 (green line), (iii) \(\phi =\) 0.6, \(\alpha =\) 0.8 and \(\delta =\) 0.9 (black line), and (iv) \(\phi =\) 0.5, \(\alpha =\) 0.64, and \(\delta =\) 0.8 (blue line). The plots of \(h(x;\phi ,\mu )\) in the right panel of Fig. 3 are obtained with the parameter values (i) \(\phi =\) 2.1, \(\alpha =\) 0.09, and \(\delta =\) 1.75 (red line), (ii) \(\phi =\) 0.7, \(\alpha =\) 1.20, and \(\delta =\) 0.85 (green line), (iii) \(\phi =\) 0.6, \(\alpha =\) 1.70, and \(\delta =\) 0.70 (black line), and (iv) \(\phi =\) 0.8, \(\alpha =\) 1.90, and \(\delta =\) 0.70 (blue line).
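The NGEP-Wei functions of this section can be evaluated numerically with a short sketch. It assumes the forms that follow from the T–X construction with \(m(t) = e^{-t}\) and the stated \(K\), namely \(S = e^{\phi A^{2}} - e^{\phi }A^{2}\) and \(y = 2aA\left[ e^{\phi } - \phi e^{\phi A^{2}} \right]\) with the Weibull baseline \(A = 1 - e^{-\alpha x^{\delta }}\). All function names are illustrative, not part of any package. Under these forms the bracket \(e^{\phi } - \phi e^{\phi A^{2}}\) stays non-negative on the whole support only when \(\phi \le 1\), so the example checks use such a value.

```python
import math


def weibull_cdf(x, alpha, delta):
    """Baseline Weibull CDF A(x) = 1 - exp(-alpha * x**delta) for x > 0."""
    return -math.expm1(-alpha * x ** delta) if x > 0 else 0.0


def weibull_pdf(x, alpha, delta):
    """Baseline Weibull PDF a(x) = alpha*delta*x**(delta-1)*exp(-alpha*x**delta)."""
    if x <= 0:
        return 0.0
    return alpha * delta * x ** (delta - 1) * math.exp(-alpha * x ** delta)


def ngep_wei_sf(x, phi, alpha, delta):
    """Survival function S = exp(phi*A^2) - exp(phi)*A^2 (assumed form)."""
    A = weibull_cdf(x, alpha, delta)
    return math.exp(phi * A * A) - math.exp(phi) * A * A


def ngep_wei_cdf(x, phi, alpha, delta):
    """CDF Y = 1 - S."""
    return 1.0 - ngep_wei_sf(x, phi, alpha, delta)


def ngep_wei_pdf(x, phi, alpha, delta):
    """PDF y = 2*a*A*(exp(phi) - phi*exp(phi*A^2)) (assumed form)."""
    A = weibull_cdf(x, alpha, delta)
    a = weibull_pdf(x, alpha, delta)
    return 2.0 * a * A * (math.exp(phi) - phi * math.exp(phi * A * A))


def ngep_wei_hf(x, phi, alpha, delta):
    """Hazard function h = y / S; only meaningful where S > 0."""
    return ngep_wei_pdf(x, phi, alpha, delta) / ngep_wei_sf(x, phi, alpha, delta)


if __name__ == "__main__":
    # Parameter set I of the simulation study: phi = 0.9, alpha = 1.0, delta = 1.9
    for x in (0.5, 1.0, 1.5):
        print(x, ngep_wei_cdf(x, 0.9, 1.0, 1.9), ngep_wei_hf(x, 0.9, 1.0, 1.9))
```

Using `math.expm1` for the baseline CDF avoids cancellation for small \(x\); the hazard function is left undefined where the survival function underflows to zero.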
Basic mathematical properties
In this section, we derive some mathematical properties of the NGEP-X family of distributions. These properties include the QF (quantile function), moments, MGF (moment-generating function), order statistics, and residual and reverse residual life.
Quantile function
The QF is the inverse of the CDF and is generally used for generating RNs (random numbers) from a distribution. The RNs are typically used in simulation studies to evaluate the performance of estimators (or of an estimation method). Later, in section "Simulation", we use this method (i.e., the inverse distribution function) to generate RNs from the NGEP-Wei distribution. For the NGEP-Wei distributions, the QF is given by
where, \(0 < u < 1\) and \(u\) is the solution of the above expression. The expression can be used to generate random samples or random numbers from any special model of the NGEP-X family of distributions.
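As a numerical illustration of inverse-CDF sampling, the sketch below inverts the NGEP-Wei CDF by bisection rather than through a closed-form expression. It assumes the CDF \(Y = 1 - e^{\phi A^{2}} + e^{\phi }A^{2}\) with Weibull baseline \(A = 1 - e^{-\alpha x^{\delta }}\), solves for \(z = A^{2}\) on \([0, 1]\), and then inverts the Weibull CDF. The function names are hypothetical, and the bisection relies on \(Y\) being increasing in \(z\), which holds for \(0 < \phi \le 1\).

```python
import math
import random


def ngep_wei_cdf(x, phi, alpha, delta):
    """Assumed NGEP-Wei CDF: 1 - exp(phi*A^2) + exp(phi)*A^2, A = Weibull CDF."""
    if x <= 0:
        return 0.0
    A = -math.expm1(-alpha * x ** delta)
    return 1.0 - math.exp(phi * A * A) + math.exp(phi) * A * A


def ngep_wei_quantile(u, phi, alpha, delta, iters=100):
    """Solve Y(x) = u for x by bisection on z = A(x)^2 in [0, 1]."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - math.exp(phi * mid) + math.exp(phi) * mid < u:
            lo = mid
        else:
            hi = mid
    # Keep A strictly below 1 so that the logarithm below is defined.
    A = min(math.sqrt(0.5 * (lo + hi)), 1.0 - 2.0 ** -53)
    # Invert the Weibull baseline: x = (-log(1 - A) / alpha)^(1/delta)
    return (-math.log1p(-A) / alpha) ** (1.0 / delta)


def ngep_wei_sample(n, phi, alpha, delta, seed=0):
    """Draw n pseudo-random values by plugging uniforms into the quantile function."""
    rng = random.Random(seed)
    return [ngep_wei_quantile(rng.random(), phi, alpha, delta) for _ in range(n)]


if __name__ == "__main__":
    print(ngep_wei_quantile(0.5, 0.9, 1.0, 1.9))   # median for parameter set I
    print(ngep_wei_sample(5, 0.9, 1.0, 1.9))
```

Bisection is chosen here for robustness; a Newton step on the same equation would converge faster but needs safeguarding near \(z = 1\) when \(\phi\) is close to 1.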
rth moment
Moments are important statistical tools for obtaining characteristics and features of a model. These characteristics are (i) CT (central tendency), which deals with the mean of a distribution; (ii) MD (measure of dispersion), which describes the variance of a model (or the dispersion among the data); (iii) skewness, which describes the asymmetry and tail behaviour of the model; and (iv) kurtosis, which describes the peakedness of the distribution. For the proposed family, the rth moment, denoted by \(\mu_{r}^{\prime }\), is derived as
$$\mu_{r}^{\prime } = \int_{ - \infty }^{\infty } {x^{r} y\left( {x;\phi ,\mu } \right)dx} . \tag{17}$$
Using Eq. (8) in Eq. (17), we have
$$\mu_{r}^{\prime } = 2\int_{ - \infty }^{\infty } {x^{r} a\left( {x;\mu } \right)A\left( {x;\mu } \right)\left[ {e^{\phi } - \phi e^{{\phi A\left( {x;\mu } \right)^{2} }} } \right]dx} . \tag{18}$$
Next, using the exponential series \(e^{{\phi A^{2} }} = \sum\nolimits_{m = 0}^{\infty } {\phi^{m} A^{2m} /m!}\) in Eq. (18), we get
$$\mu_{r}^{\prime } = 2e^{\phi } \kappa_{r,1} - 2\sum\limits_{m = 0}^{\infty } {\frac{{\phi^{m + 1} }}{m!}\kappa_{r,2m + 1} } , \tag{19}$$
where \(\kappa_{r,1} = \int_{ - \infty }^{\infty } {x^{r} a\left( {x;\mu } \right)A\left( {x;\mu } \right)dx}\) and \(\kappa_{r,2m + 1} = \int_{ - \infty }^{\infty } {x^{r} a\left( {x;\mu } \right)A\left( {x;\mu } \right)^{2m + 1} dx}\).
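The series form of the rth moment can be checked numerically. The sketch below assumes the PDF \(y = 2aA\left[ e^{\phi } - \phi e^{\phi A^{2}} \right]\), for which the moment equals \(2e^{\phi }\kappa_{r,1} - 2\sum_{m \ge 0} \phi^{m+1}\kappa_{r,2m+1}/m!\), and a Weibull baseline. Each \(\kappa_{r,j}\) is computed with the substitution \(p = A(x)\), so that \(\kappa_{r,j} = \int_{0}^{1} Q(p)^{r} p^{j}\,dp\), where \(Q\) is the Weibull quantile function. The midpoint rule, the truncation at 25 terms, and all function names are illustrative choices.

```python
import math


def weibull_quantile(p, alpha, delta):
    """Inverse of the Weibull CDF A(x) = 1 - exp(-alpha * x**delta)."""
    return (-math.log1p(-p) / alpha) ** (1.0 / delta)


def kappa(r, j, alpha, delta, n=4000):
    """kappa_{r,j} = int x^r a(x) A(x)^j dx, via p = A(x) and the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        p = (i + 0.5) * h
        total += weibull_quantile(p, alpha, delta) ** r * p ** j
    return h * total


def moment_series(r, phi, alpha, delta, terms=25, n=4000):
    """Truncated series 2*e^phi*kappa_{r,1} - 2*sum phi^(m+1)/m! * kappa_{r,2m+1}."""
    total = 2.0 * math.exp(phi) * kappa(r, 1, alpha, delta, n)
    for m in range(terms):
        coeff = 2.0 * phi ** (m + 1) / math.factorial(m)
        total -= coeff * kappa(r, 2 * m + 1, alpha, delta, n)
    return total


def moment_direct(r, phi, alpha, delta, n=4000):
    """Direct midpoint-rule integral of x^r * y(x) after the same substitution."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        p = (i + 0.5) * h
        weight = 2.0 * p * (math.exp(phi) - phi * math.exp(phi * p * p))
        total += weibull_quantile(p, alpha, delta) ** r * weight
    return h * total


if __name__ == "__main__":
    for r in (1, 2):
        print(r, moment_series(r, 0.9, 1.0, 1.9), moment_direct(r, 0.9, 1.0, 1.9))
```

Both routines use the same quadrature nodes, so their agreement tests the series expansion itself rather than the accuracy of the integration.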
Furthermore, a simple general expression for the MGF of the NGEP-X random variable X, say \(M_{x} \left( t \right)\) , is derived as
Order statistics
In distribution theory, order statistics (OS) are of crucial importance. They appear in reliability analysis, estimation theory, and life testing in a number of ways, and they can characterize the lifetimes of the elements of a reliability system.
Let \(X_{1} ,X_{2} , \ldots ,X_{k}\) be a random sample of size k from the NGEP-X family with CDF \(Y\left( {x;\phi ,\mu } \right)\) and PDF \(y\left( {x;\phi ,\mu } \right)\) given by Eqs. (7) and (8), respectively. Then the DF (density function) \(y_{r:k} \left( x \right)\) of the rth order statistic is given by
$$y_{r:k} \left( x \right) = \frac{k!}{{\left( {r - 1} \right)!\left( {k - r} \right)!}}y\left( x \right)\left[ {Y\left( x \right)} \right]^{r - 1} \left[ {1 - Y\left( x \right)} \right]^{k - r} . \tag{20}$$
We express the 1st order statistic as \(X_{1:k} = \min \left( {X_{1} ,X_{2} , \ldots ,X_{k} } \right)\) and the kth order statistic as \(X_{k:k} = \max \left( {X_{1} ,X_{2} , \ldots ,X_{k} } \right)\). Since \(0 < Y\left( x \right) < 1\) for \(x > 0\), we may use the binomial expansion of \(\left[ {1 - Y\left( x \right)} \right]^{k - r}\) as follows:
$$\left[ {1 - Y\left( x \right)} \right]^{k - r} = \sum\limits_{i = 0}^{k - r} {\left( { - 1} \right)^{i} \binom{k - r}{i}\left[ {Y\left( x \right)} \right]^{i} } . \tag{21}$$
Using Eq. (21) in Eq. (20), we get
$$y_{r:k} \left( x \right) = \frac{k!}{{\left( {r - 1} \right)!\left( {k - r} \right)!}}\sum\limits_{i = 0}^{k - r} {\left( { - 1} \right)^{i} \binom{k - r}{i}y\left( x \right)\left[ {Y\left( x \right)} \right]^{r + i - 1} } . \tag{22}$$
Using Eqs. (7) and (8) in Eq. (22), we obtain the DF \(y_{r:k} \left( x \right)\) for the NGEP-X family.
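The density of the rth order statistic can be written either in its standard closed form, \(\frac{k!}{(r-1)!(k-r)!}\,y\,Y^{r-1}(1-Y)^{k-r}\), or after the binomial expansion of \((1-Y)^{k-r}\). The following sketch evaluates both for given numerical values of the parent density \(y\) and CDF \(Y\) at a point and confirms that they agree. It is independent of the particular parent distribution, and the function names are illustrative.

```python
import math


def os_density_closed(y, Y, r, k):
    """Closed form: k!/((r-1)!(k-r)!) * y * Y^(r-1) * (1-Y)^(k-r)."""
    c = math.factorial(k) / (math.factorial(r - 1) * math.factorial(k - r))
    return c * y * Y ** (r - 1) * (1.0 - Y) ** (k - r)


def os_density_expanded(y, Y, r, k):
    """Same density after expanding (1-Y)^(k-r) with the binomial theorem."""
    c = math.factorial(k) / (math.factorial(r - 1) * math.factorial(k - r))
    return c * sum(
        (-1) ** i * math.comb(k - r, i) * y * Y ** (r + i - 1)
        for i in range(k - r + 1)
    )


if __name__ == "__main__":
    # y and Y are the parent PDF and CDF evaluated at the same point x.
    print(os_density_closed(0.7, 0.3, 2, 5))
    print(os_density_expanded(0.7, 0.3, 2, 5))
```

For large \(k - r\) the alternating sum in the expanded form can lose precision, so the closed form is preferable for computation; the expansion is mainly useful for analytic work such as deriving moments of order statistics.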
Residual and reverse residual lifetime
The RL (residual lifetime) and RRL (reverse residual lifetime) have broad applications in risk management, actuarial measures, biometry, and survival analysis. The RL of a random variable X from the NGEP-X family, say \(R_{\left( X \right)} \left( t \right)\), is defined as
In addition to the RL, we also obtain the RRL of the NGEP-X family of distributions. The RRL, say \(\overline{R}_{\left( X \right)} \left( t \right)\) is given by
Estimation, experiment and simulation
This section provides a detailed description of the maximum likelihood estimation implemented for obtaining the parameter estimates of the proposed family of distributions. Furthermore, we also conduct a comprehensive Monte Carlo simulation study to assess the performance of the estimators (or the estimation method).
Maximum likelihood estimation
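Several estimation methods have been proposed in the literature for obtaining parameter estimates. MLE (maximum likelihood estimation) is one of the most frequently used. It yields estimators with several important properties and can be used to construct confidence intervals as well as tests of statistical significance. For further details about MLEs, see Shakil et al.19. This sub-section discusses the MLE approach for obtaining the parameter estimates of the NGEP-Wei distributions.
Suppose \(x_{1} ,x_{2} ,...,x_{n}\) are the observed values from the PDF \(y\left( {x;\phi ,\mu } \right)\) given in Eq. (8). Then the likelihood function \({\mathcal{L}}\left( \Xi \right)\) corresponding to \(y\left( {x;\phi ,\mu } \right)\) is expressed by
$${\mathcal{L}}\left( \Xi \right) = \prod\limits_{i = 1}^{n} {y\left( {x_{i} ;\phi ,\mu } \right)} . \tag{23}$$
Now, substituting Eq. (8) into Eq. (23) and taking the logarithm gives the log-likelihood function \(L\left( \Xi \right) = \log {\mathcal{L}}\left( \Xi \right)\):
$$L\left( \Xi \right) = n\log 2 + \sum\limits_{i = 1}^{n} {\log a\left( {x_{i} ;\mu } \right)} + \sum\limits_{i = 1}^{n} {\log A\left( {x_{i} ;\mu } \right)} + \sum\limits_{i = 1}^{n} {\log \left[ {e^{\phi } - \phi e^{{\phi A\left( {x_{i} ;\mu } \right)^{2} }} } \right]} , \tag{24}$$
where \(\Xi = \left( {\phi ,\alpha ,\delta } \right)^{T}\). The log-likelihood function can be maximized either directly, using the R package AdequacyModel, the Ox program (subroutine MaxBFGS), or SAS (PROC NLMIXED) (see Doornik25 for more reading), or by solving the nonlinear likelihood equations obtained by differentiating Eq. (24). Taking the partial derivatives of Eq. (24), we get
$$\frac{\partial L\left( \Xi \right)}{{\partial \phi }} = \sum\limits_{i = 1}^{n} {\frac{{e^{\phi } - \left[ {1 + \phi A\left( {x_{i} ;\mu } \right)^{2} } \right]e^{{\phi A\left( {x_{i} ;\mu } \right)^{2} }} }}{{e^{\phi } - \phi e^{{\phi A\left( {x_{i} ;\mu } \right)^{2} }} }}} , \tag{25}$$
and
$$\frac{\partial L\left( \Xi \right)}{{\partial \mu }} = \sum\limits_{i = 1}^{n} {\frac{{\partial a\left( {x_{i} ;\mu } \right)/\partial \mu }}{{a\left( {x_{i} ;\mu } \right)}}} + \sum\limits_{i = 1}^{n} {\frac{{\partial A\left( {x_{i} ;\mu } \right)/\partial \mu }}{{A\left( {x_{i} ;\mu } \right)}}} - 2\phi^{2} \sum\limits_{i = 1}^{n} {\frac{{A\left( {x_{i} ;\mu } \right)e^{{\phi A\left( {x_{i} ;\mu } \right)^{2} }} \,\partial A\left( {x_{i} ;\mu } \right)/\partial \mu }}{{e^{\phi } - \phi e^{{\phi A\left( {x_{i} ;\mu } \right)^{2} }} }}} . \tag{26}$$
Setting \(\frac{\partial L\left( \Xi \right)}{{\partial \phi }}\) in Eq. (25) and \(\frac{\partial L\left( \Xi \right)}{{\partial \mu }}\) in Eq. (26) equal to zero and solving them simultaneously yields the MLEs of \(\phi\) and \(\mu\).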
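Direct maximization of the log-likelihood can be sketched in a few lines. The example below is not the routine used in the paper (which relies on R): it assumes the NGEP-Wei density \(y = 2aA\left[ e^{\phi } - \phi e^{\phi A^{2}} \right]\) with Weibull baseline, restricts \(\phi\) to \((0, 1]\) so that the density stays non-negative under that form, and uses a simple derivative-free compass search as a stand-in for Nelder-Mead or BFGS. The function names and the starting point are arbitrary illustrative choices; the data are the 36 observations of the first COVID-19 data set analysed later in the paper.

```python
import math


def ngep_wei_loglik(params, data):
    """Log-likelihood of (phi, alpha, delta); -inf outside the assumed parameter space."""
    phi, alpha, delta = params
    if not (0.0 < phi <= 1.0 and alpha > 0.0 and delta > 0.0):
        return -math.inf
    total = 0.0
    try:
        for x in data:
            xd = x ** delta
            A = -math.expm1(-alpha * xd)                  # Weibull CDF
            core = math.exp(phi) - phi * math.exp(phi * A * A)
            if A <= 0.0 or core <= 0.0:
                return -math.inf
            log_a = math.log(alpha) + math.log(delta) + (delta - 1.0) * math.log(x) - alpha * xd
            total += math.log(2.0) + log_a + math.log(A) + math.log(core)
    except OverflowError:
        return -math.inf
    return total


def compass_search(f, start, step=0.5, tol=1e-6, max_iter=10000):
    """Maximize f by trying +/- step along each coordinate, halving step when stuck."""
    best, best_val = list(start), f(start)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(len(best)):
            for sign in (1.0, -1.0):
                cand = list(best)
                cand[i] += sign * step
                val = f(cand)
                if val > best_val:
                    best, best_val, improved = cand, val, True
        if not improved:
            step *= 0.5
    return best, best_val


DAST1 = [3.1091, 3.3825, 2.8636, 3.2218, 4.2781, 4.2202, 2.1901, 2.4141, 1.9048,
         2.9078, 3.6426, 3.2110, 3.6346, 2.7957, 1.5157, 2.6029, 3.3592, 2.8349,
         3.1348, 2.5261, 1.5806, 2.7704, 3.8594, 4.0480, 4.1685, 3.1444, 3.2135,
         2.4946, 3.5146, 4.9274, 3.3769, 6.8686, 3.0914, 4.9378, 3.1091, 3.2823]

if __name__ == "__main__":
    start = [0.5, 0.1, 2.0]                               # arbitrary starting point
    est, val = compass_search(lambda p: ngep_wei_loglik(p, DAST1), start)
    print("estimates (phi, alpha, delta):", est)
    print("maximized log-likelihood:", val)
```

A compass search only finds a local maximum, so in practice several starting points should be tried and the result compared with a standard optimizer.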
Simulation
To cover the second aim of this section, the performance of the MLEs \(\left( {\phi_{MLE} ,\alpha_{MLE} ,\delta_{MLE} } \right)\) of \(\left( {\phi ,\alpha ,\delta } \right)\) is assessed by conducting an MCSS (Monte Carlo simulation study). We consider different sample sizes (i.e., n = 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, and 1000) with different parameter values \(\phi =\)(0.9, 0.7, 0.8, 0.7), \(\alpha =\)(1.0, 1.5, 0.7, 1.8), and \(\delta =\)(1.9, 0.5, 1.2, 0.8). As already mentioned, \(\phi \in {\mathbb{R}}^{ + }\), \(\alpha \in {\mathbb{R}}^{ + }\), and \(\delta \in {\mathbb{R}}^{ + }\), so any predefined values of \(\phi\), \(\alpha\), and \(\delta\) within these ranges can be chosen to conduct the simulation study. For each combination of parameter values, the MCSS is repeated 1000 times and the AMLEs (averages of the MLEs), ABs (average biases), and MSEs (mean square errors) are obtained. The ABs and MSEs are calculated using the following expressions:
$$AB\left( {\hat{\Phi }} \right) = \frac{1}{1000}\sum\limits_{i = 1}^{1000} {\left( {\hat{\Phi }_{i} - \Phi } \right)} ,$$
and
$$MSE\left( {\hat{\Phi }} \right) = \frac{1}{1000}\sum\limits_{i = 1}^{1000} {\left( {\hat{\Phi }_{i} - \Phi } \right)^{2} } ,$$
where \(\Phi = \left( {\phi ,\alpha ,\delta } \right)\). The numerical results for Set I = (\(\phi = 0.9,\,\alpha = 1.0,\,\delta = 1.9\)) and Set II = (\(\phi = 0.7,\,\alpha = 1.5,\,\delta = 0.5\)) are recorded in Table 1, while the results for Set III = (\(\phi = 0.8,\,\alpha = 0.7,\,\delta = 1.2\)) and Set IV = (\(\phi = 0.7,\,\alpha = 1.8,\,\delta = 0.8\)) are presented in Table 2. The simulation results in Tables 1 and 2 are obtained using an R script with the L-BFGS-B method. Based on the numerical results in Tables 1 and 2, we observe that as the sample size \(n\) increases (i.e., \(n \to \infty\)), the
-
MSE of \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), and \(\hat{\delta }_{MLE}\) decay to zero.
-
MLEs of \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), and \(\hat{\delta }_{MLE}\) become closer to the true values.
-
Biases of \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), and \(\hat{\delta }_{MLE}\) also decrease.
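The bookkeeping of such a Monte Carlo study (average estimate, average bias, and mean square error over replications) can be sketched generically. To keep the example short and fast, it uses the MLE of an exponential rate as a stand-in estimator; this is an assumption for illustration only and not the NGEP-Wei fit reported in Tables 1 and 2. The helper names are hypothetical.

```python
import random


def average_bias(estimates, true_value):
    """AB = (1/R) * sum(estimate_i - true_value)."""
    return sum(e - true_value for e in estimates) / len(estimates)


def mean_squared_error(estimates, true_value):
    """MSE = (1/R) * sum((estimate_i - true_value)^2)."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def monte_carlo(sampler, estimator, true_value, n, reps=1000, seed=1):
    """Repeat sample -> estimate `reps` times; return (average estimate, AB, MSE)."""
    rng = random.Random(seed)
    estimates = [estimator(sampler(rng, n)) for _ in range(reps)]
    return (sum(estimates) / reps,
            average_bias(estimates, true_value),
            mean_squared_error(estimates, true_value))


# Stand-in model (illustration only): exponential data with rate 2.0.
def exp_sampler(rng, n, rate=2.0):
    return [rng.expovariate(rate) for _ in range(n)]


def exp_rate_mle(sample):
    return len(sample) / sum(sample)


if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, monte_carlo(exp_sampler, exp_rate_mle, 2.0, n, reps=500))
```

To reproduce the design of the study, `exp_sampler` would be replaced by an inverse-CDF sampler for the NGEP-Wei distribution and `exp_rate_mle` by a numerical maximizer of its log-likelihood, applied to each of \(\phi\), \(\alpha\), and \(\delta\) in turn.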
Real life application to COVID-19 data sets
Here, we consider two COVID-19 data sets to illustrate the fitting power of the NGEP-Wei distribution. We fit the NGEP-Wei distribution to both data sets and compare its flexibility (or fitting power) with that of the rival distributions, which are listed in Table 3.
The SFs of these rival distributions are the following:
- APTra-Wei distribution
$$S\left( {x;a,\alpha ,\delta } \right) = 1 - \left( {\frac{{a^{{\left( {1 - e^{{ - \alpha x^{\delta } }} } \right)}} - 1}}{a - 1}} \right),\;\;x \in {\mathbb{R}}^{ + },$$
- NRLog-Wei distribution
$$S\left( {x;\phi ,\alpha ,\delta } \right) = \frac{{\log \left( {\phi + 1 - \phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)} \right)}}{{\log \left( {1 + \phi } \right)}},\;\;x \in {\mathbb{R}}^{ + },$$
- Kumar–Wei distribution
$$S\left( {x;a,b,\alpha ,\delta } \right) = \left( {1 - \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)^{a} } \right)^{b} ,\;\;x \in {\mathbb{R}}^{ + },$$
- Wei distribution
$$S\left( {x;\alpha ,\delta } \right) = e^{{ - \alpha x^{\delta } }} ,\;\;x \in {\mathbb{R}}^{ + },$$
- MO-NH distribution
$$S\left( {x;\phi ,\alpha ,\delta } \right) = 1 - \left( {\frac{{1 - e^{{[1 - (1 + \alpha x)^{\delta } ]}} }}{{1 - (1 - \phi )e^{{[1 - (1 + \alpha x)^{\delta } ]}} }}} \right),\;\;x \in {\mathbb{R}}^{ + },$$
- GAP-Wei distribution
$$S\left( {x;\phi ,\alpha ,\delta } \right) = 1 - \left( {\frac{{\phi \left( {1 - e^{{ - \alpha x^{\delta } }} } \right)}}{{\phi^{{\left( {1 - e^{{ - \alpha x^{\delta } }} } \right)}} }}} \right),\;\;x \in {\mathbb{R}}^{ + }.$$
Next, after selecting the rival distributions, we consider several statistical criteria (i.e., analytical goodness-of-fit measures) to identify the best-suited distribution for the considered COVID-19 data sets. In the formulas below, \(A\left( \cdot \right)\) denotes the fitted CDF and \(x_{1} \le x_{2} \le \cdots \le x_{n}\) are the ordered observations. The goodness-of-fit measures are given by
- The CRMS (Cramér–von Mises)
$$CRMS = \sum\limits_{i = 1}^{n} {\left[ {A\left( {x_{i} } \right) - \frac{2i - 1}{{2n}}} \right]}^{2} + \frac{1}{12n},$$
- The ANDR (Anderson–Darling)
$$ANDR = - n - \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {2i - 1} \right)} \times \left[ {\log A\left( {x_{i} } \right) + \log \left( {1 - A\left( {x_{n - i + 1} } \right)} \right)} \right],$$
- The KS (Kolmogorov–Smirnov)
$$KS = \sup_{x} \left| {A_{n} \left( x \right) - \hat{A}\left( x \right)} \right|,$$
- The AINC (Akaike information criterion)
$$AINC = 2p - 2L\left( \Xi \right),$$
- The BINC (Bayesian information criterion)
$$BINC = p\log (n) - 2L\left( \Xi \right),$$
- The CAINC (corrected AINC)
$$CAINC = \frac{2np}{{n - p - 1}} - 2L\left( \Xi \right),$$
- The HQINC (Hannan–Quinn information criterion)
$$HQINC = 2p\log \left( {\log (n)} \right) - 2L\left( \Xi \right).$$
In the above expressions, \(L\left( \Xi \right)\) is the maximized log-likelihood function (i.e., the log-likelihood evaluated at the MLEs), \(A_{n} \left( x \right)\) is the empirical CDF, n is the sample size, and p is the number of parameters to be estimated in the model. For the NGEP-Wei distribution and the rival distributions, the MLEs and the above decision tools (i.e., CRMS, ANDR, KS, AINC, BINC, CAINC, and HQINC) are computed in R with the “method = Nelder-Mead” algorithm. In general, among the distributions fitted to each data set, the one with the lowest values of these goodness-of-fit measures is considered the best-suited distribution for the data.
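The decision tools listed above are straightforward to compute once a model has been fitted. The sketch below takes the maximized log-likelihood, the number of parameters \(p\), the sample size \(n\), and the fitted CDF values at the ordered observations. It uses the standard \(x_{n-i+1}\) index in the Anderson–Darling sum and the usual two-sided form of the Kolmogorov–Smirnov distance. Function names are illustrative, and the numbers in the usage lines are made up for demonstration.

```python
import math


def information_criteria(loglik, p, n):
    """AINC, BINC, CAINC and HQINC from the maximized log-likelihood."""
    return {
        "AINC": 2 * p - 2 * loglik,
        "BINC": p * math.log(n) - 2 * loglik,
        "CAINC": 2 * n * p / (n - p - 1) - 2 * loglik,
        "HQINC": 2 * p * math.log(math.log(n)) - 2 * loglik,
    }


def cramer_von_mises(u):
    """u: fitted CDF values at the observations; sorted internally."""
    u = sorted(u)
    n = len(u)
    return sum((u[i] - (2 * (i + 1) - 1) / (2 * n)) ** 2 for i in range(n)) + 1 / (12 * n)


def anderson_darling(u):
    """-n - (1/n) * sum (2i-1) [log u_i + log(1 - u_{n-i+1})], i = 1..n."""
    u = sorted(u)
    n = len(u)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i])) for i in range(n))
    return -n - s / n


def kolmogorov_smirnov(u):
    """sup |empirical CDF - fitted CDF|, evaluated at the jump points."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))


if __name__ == "__main__":
    print(information_criteria(loglik=-10.0, p=3, n=36))   # hypothetical fit
    fitted = [0.12, 0.35, 0.48, 0.71, 0.93]                 # hypothetical CDF values
    print(cramer_von_mises(fitted), anderson_darling(fitted), kolmogorov_smirnov(fitted))
```

All seven measures are "smaller is better", so candidate distributions fitted to the same data can be ranked by comparing these outputs directly.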
Analysis of first COVID-19 data sets
The first data set (hereafter DAST 1) consists of 36 observations on daily new death cases (due to COVID-19) recorded in Canada from 10 April to 15 May 2020. The data are also accessible via the link [https://covid19.who.int/]. For more details about DAST 1, we refer to Almetwally et al.31 and Xin et al.32. DAST 1 = {3.1091, 3.3825, 2.8636, 3.2218, 4.2781, 4.2202, 2.1901, 2.4141, 1.9048, 2.9078, 3.6426, 3.2110, 3.6346, 2.7957, 1.5157, 2.6029, 3.3592, 2.8349, 3.1348, 2.5261, 1.5806, 2.7704, 3.8594, 4.0480, 4.1685, 3.1444, 3.2135, 2.4946, 3.5146, 4.9274, 3.3769, 6.8686, 3.0914, 4.9378, 3.1091, 3.2823}.
The main descriptive statistics of DAST 1 are: Min. = 1.516, Max. = 6.869, Mean = 3.282, Q1 (1st quartile) = 2.789, Q2 (2nd quartile or median) = 3.178, Q3 (3rd quartile) = 3.637, Range = 5.3529, Variance = 0.9970656, Skewness = 1.213916, and Kurtosis = 6.151625. Additionally, the histogram plot (HP), kernel density plot (KDP), total-time-on-test plot (TTT-P), violin plot, and box plot (BP) of DAST 1 are presented in Fig. 4.
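The basic descriptive statistics of the first data set can be recomputed with the Python standard library, as sketched below. Quartiles, skewness, and kurtosis depend on the convention used by the software, so only the size, extremes, mean, range, median, and sample variance are computed here. The variable and function names are illustrative.

```python
import statistics

DAST1 = [3.1091, 3.3825, 2.8636, 3.2218, 4.2781, 4.2202, 2.1901, 2.4141, 1.9048,
         2.9078, 3.6426, 3.2110, 3.6346, 2.7957, 1.5157, 2.6029, 3.3592, 2.8349,
         3.1348, 2.5261, 1.5806, 2.7704, 3.8594, 4.0480, 4.1685, 3.1444, 3.2135,
         2.4946, 3.5146, 4.9274, 3.3769, 6.8686, 3.0914, 4.9378, 3.1091, 3.2823]


def describe(data):
    """Return sample size, extremes, range, mean, median and sample variance."""
    return {
        "n": len(data),
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "variance": statistics.variance(data),   # sample variance, divisor n - 1
    }


if __name__ == "__main__":
    for name, value in describe(DAST1).items():
        print(f"{name}: {value}")
```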
For DAST 1, the MLEs (with standard errors in parentheses) of the parameters of the NGEP-Wei and rival distributions (i.e., \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), \(\hat{\delta }_{MLE}\), \(\hat{a}_{MLE}\), \(\hat{b}_{MLE}\)) are recorded in Table 4. The goodness-of-fit measures of the NGEP-Wei and rival distributions are recorded in Table 5. According to the model selection criteria in Table 5, the NGEP-Wei distribution provides the best fit to the Canada COVID-19 data set (DAST 1), with the minimum values of CRMS, ANDR, KS, AINC, BINC, CAINC, and HQINC among the distributions considered. In other words, based on these criteria, the NGEP-Wei distribution fits DAST 1 more closely than the rival distributions (i.e., APTra-Wei, NRLog-Wei, Kumar-Wei, Wei, MO-NH, and GAP-Wei). Besides the numerical comparison of the NGEP-Wei distribution with the rival distributions, we also present a visual illustration of the NGEP-Wei fit. The profiles of the log-likelihood function for \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), and \(\hat{\delta }_{MLE}\) are plotted in Fig. 5, which shows that the point estimates of the NGEP-Wei parameters are located at the maxima. Similarly, the fitted PDF, CDF, SF, PP-plot and QQ-plot for the NGEP-Wei distribution are shown in Fig. 6. From Fig. 6, we can also see that the red fitted curves of the NGEP-Wei distribution closely follow the corresponding empirical counterparts.
Analysis of second COVID-19 data sets
The second data set (hereafter DAST 2) consists of 108 observations. DAST 2 represents COVID-19 mortality rates in Mexico over 108 days. It was recently used by Almongy et al.33, who suggested a new extended Rayleigh distribution. The mortality rates were recorded from March 4, 2020 to July 20, 2020. DAST 2 = {8.826, 6.105, 9.391, 14.962, 10.383, 7.267, 13.220, 16.498, 11.665, 6.015, 10.855, 6.122, 6.656, 3.440, 5.854, 10.685, 10.035, 5.242, 4.344, 5.143, 7.630, 14.604, 7.903, 6.370, 3.537, 6.327, 4.730, 3.215, 9.284, 12.878, 8.813, 10.043, 7.260, 5.985, 6.412, 3.395, 4.424, 9.935, 7.840, 9.550, 3.499, 3.751, 6.968, 3.286, 10.158, 8.108, 6.697, 7.151, 6.560, 2.077, 3.778, 2.988, 3.336, 6.814, 8.325, 7.854, 8.551, 3.228, 7.486, 6.625, 6.140, 4.909, 4.661, 5.392, 12.042, 8.696, 1.815, 3.327, 5.406, 6.182, 1.041, 1.800, 4.949, 4.089, 3.359, 2.070, 3.298, 5.317, 5.442, 4.557, 4.292, 2.500, 6.535, 4.648, 4.697, 5.459, 4.120, 3.922, 3.219, 1.402, 2.438, 3.257, 3.632, 3.233, 3.027, 2.352, 1.205, 3.218, 2.926, 2.601, 2.065, 3.029, 2.058, 2.326, 2.506, 1.923}.
The main descriptive statistics of DAST 2 are: Min. = 1.041, Max. = 16.498, Mean = 5.822, Q1 (1st quartile) = 3.289, Q2 (2nd quartile or median) = 5.279, Q3 (3rd quartile) = 7.594, Range = 15.457, Variance = 10.56173, Skewness = 0.9732453, and Kurtosis = 3.666136. Additionally, the histogram plot (HP), kernel density plot (KDP), total-time-on-test plot (TTT-P), violin plot, and box plot (BP) of DAST 2 are presented in Fig. 7.
For DAST 2, the MLEs (with standard errors in parentheses) of the parameters of the NGEP-Wei and rival distributions (i.e., \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), \(\hat{\delta }_{MLE}\), \(\hat{a}_{MLE}\), \(\hat{b}_{MLE}\)) are recorded in Table 6. The goodness-of-fit measures of the NGEP-Wei and rival distributions are recorded in Table 7. According to the model selection criteria in Table 7, the NGEP-Wei distribution provides the best fit to the Mexico COVID-19 data set (DAST 2), with the minimum values of CRMS, ANDR, KS, AINC, BINC, CAINC, and HQINC among the distributions considered. In other words, based on these criteria, the NGEP-Wei distribution fits DAST 2 more closely than the rival distributions (i.e., APTra-Wei, NRLog-Wei, Kumar-Wei, Wei, MO-NH, and GAP-Wei). Besides the numerical comparison of the NGEP-Wei distribution with the rival distributions, we also present a visual illustration of the NGEP-Wei fit. The profiles of the log-likelihood function for \(\hat{\phi }_{MLE}\), \(\hat{\alpha }_{MLE}\), and \(\hat{\delta }_{MLE}\) are plotted in Fig. 8, which shows that the point estimates of the NGEP-Wei parameters are located at the maxima. Similarly, the empirical PDF, CDF, SF, PP-plot and QQ-plot are shown for the NGEP-Wei distribution in Fig. 9. From Fig. 9, we can again see that the red fitted curves of the NGEP-Wei distribution closely follow the corresponding empirical counterparts.
Concluding remarks
In this article, we presented a novel generator called the “Novel Generalized Exponent Power-X” family of distributions, or NGEP-X family for short. A special sub-case of the proposed class was derived by employing the Weibull distribution as the baseline distribution. This sub-case is named the Novel Generalized Exponent Power Weibull distribution (NGEP-Wei for short). Depending on the parameter values, the density function of the derived model can be positively skewed, negatively skewed, or symmetrical. Moreover, the hazard function can be monotonically increasing, monotonically decreasing, unimodal, or bathtub-shaped. General expressions for several statistical properties of the proposed class (NGEP-X) have been derived, including the quantile function, moments, moment generating function, order statistics, residual life, and reverse residual life. The maximum likelihood estimation method has been used for estimating the model parameters. In addition, a comprehensive Monte Carlo simulation study (MCSS) was carried out to assess the performance of the estimators of the proposed model. To demonstrate the efficacy (i.e., the fitting power relative to other classical distributions) of the proposed family of distributions (NGEP-X) through the NGEP-Wei distribution, we considered two data sets of COVID-19 mortality rates from Mexico and Canada. Based on the numerical illustration, the proposed model outperforms the other widely used existing distributions considered in this study. In future work, researchers can use the proposed method to develop new extensions of existing distributions, such as a Novel Generalized Exponent Power Lomax, a Novel Generalized Exponent Power Inverse Lomax, a Novel Generalized Exponent Power Pareto, and a Novel Generalized Exponent Power Lindley distribution, for representing and predicting real-world phenomena.
Data availability
The corresponding author can provide the datasets utilized and/or examined during the present study upon a reasonable request.
References
Almalki, S. J. & Yuan, J. A new modified Weibull distribution. Reliab. Eng. Syst. Saf. 111, 164–170 (2013).
Usman, R. M., Haq, M. & Talib, J. Kumaraswamy half-logistic distribution: Properties and applications. J. Stat. Appl. Probab. 6, 597–609 (2017).
Mudholkar, G. S. & Srivastava, D. K. Exponentiated Weibull family for analysing bathtub failure-rate data. IEEE Trans. Reliab. 42(2), 299–302 (1993).
Marshall, A. W. & Olkin, I. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 84(3), 641–652 (1997).
Ghitany, M. E., Al-Hussaini, E. K. & Al-Jarallah, R. A. Marshall–Olkin extended Weibull distribution and its application to censored data. J. Appl. Stat. 32(10), 1025–1034 (2005).
Gui, W. Marshall–Olkin extended log-logistic distribution and its application in minification processes. Appl. Math. Sci. 7(80), 3947–3961 (2013).
Saboor, A. & Pogány, T. K. Marshall–Olkin gamma–Weibull distribution with applications. Commun. Stat.-Theory Methods 45(5), 1550–1563 (2016).
Mahdavi, A. & Kundu, D. A new method for generating distributions with an application to exponential distribution. Commun. Stat.-Theory Methods 46(13), 6543–6557 (2017).
Dey, S., Sharma, V. K. & Mesfioui, M. A new extension of Weibull distribution with application to lifetime data. Ann. Data Sci. 4(1), 31–61 (2017).
Ihtisham, S., Khalil, A., Manzoor, S., Khan, S. A. & Ali, A. Alpha-Power Pareto distribution: Its properties and applications. PLoS ONE 14(6), e0218027 (2019).
Hassan, A. S., Elgarhy, M., Mohamd, R. E. & Alrajhi, S. On the alpha power transformed power Lindley distribution. J. Probab. Stat. (2019).
Shah, Z., Khan, D. M., Khan, Z., Faiz, N., Hussain, S., Anwar, A. & Kim, K. I. A new generalized logarithmic–X family of distributions with biomedical data analysis. Appl. Sci. 13(6), 3668 (2023).
Alsuhabi, H., Alkhairy, I., Almetwally, E. M., Almongy, H. M., Gemeay, A. M., Hafez, E. H. & Sabry, M. A superior extension for the Lomax distribution with application to Covid-19 infections real data. Alexandria Eng. J. 61(12), 11077–11090 (2022).
Zhao, Y. et al. A novel logarithmic approach to generate new probability distributions for data modeling in the engineering sector. Alexandria Eng. J. 62, 313–325 (2023).
Alnssyan, B., Ahmad, Z., Malela-Majika, J. C., Seong, J. T. & Shafik, W. On the identifiability and statistical features of a new distributional approach with reliability applications. AIP Adv. 13(12) (2023).
Tekle, G., Roozegar, R. & Ahmad, Z. A new type 1 alpha power family of distributions and modeling data with correlation, overdispersion, and zero-inflation in the health data sets. J. Probab. Stat. 2023 (2023).
Chamunorwa, S., Oluyede, B., Chipepa, F. & Rannona, K. The type II-Topp-Leone-Gompertz-G family of distributions with applications to COVID-19 data. Eurasian Bull. Math. 5(1), 14–38 (2023).
Odhah, O. H., Alshanbari, H. M., Ahmad, Z. & Rao, G. S. A weighted cosine-G family of distributions: Properties and illustration using time-to-event data. Axioms 12(9), 849 (2023).
Shakil, M., Munir, M., Kausar, N., Ahsanullah, M., Khadim, A., Sirajo, M. & Kibria, B. M. G. Some inferences on three parameters Birnbaum–Saunders distribution: Statistical properties, characterizations and applications. Comput. J. Math. Stat. Sci. 2(2), 197–222 (2023).
Raihen, M. N., Akter, S., Tabassum, F., Jahan, F. & Begum, S. A statistical analysis of excess mortality mean at Covid-19 in 2020–2021. Comput. J. Math. Stat. Sci. 2(2), 223–239 (2023).
Abonongo, A. I. L. & Abonongo, J. Exponentiated generalized Weibull exponential distribution: Properties, estimation and applications. Comput. J. Math. Stat. Sci. 3(1), 57–84 (2024).
BuHamra, S. S., Al-Kandari, N. M., Hussam, E., Almetwally, E. M. & Gemeay, A. M. A case study for Kuwait mortality during the consequent waves of COVID-19. Heliyon (2024).
El-Sherpieny, E. S. A., Muhammed, H. Z. & Almetwally, E. M. A new inverse Rayleigh distribution with applications of COVID-19 data: Properties, estimation methods and censored sample. Electron. J. Appl. Stat. Anal. 16(2), 449–472 (2023).
Alzaatreh, A., Lee, C. & Famoye, F. A new method for generating families of continuous distributions. Metron 71(1), 63–79 (2013).
Doornik, J. A. An object-oriented matrix programming language Ox 6 (2009).
Liu, Y., Ilyas, M., Khosa, S. K., Muhmoudi, E., Ahmad, Z., Khan, D. M. & Hamedani, G. G. A flexible reduced logarithmic-X family of distributions with biomedical analysis. Computat. Math. Methods Med. (2020).
Cordeiro, G. M., Ortega, E. M. & Nadarajah, S. The Kumaraswamy Weibull distribution with application to failure data. J. Frankl. Inst. 347(8), 1399–1429 (2010).
Weibull, W. A statistical distribution function of wide applicability. J. Appl. Mech. (1951).
Muhammad, M. & Liu, L. Characterization of Marshall–Olkin–G family of distributions by truncated moments. J. Math. Comput. Sci. 19(3), 192–202 (2019).
Ijaz, M., Asim, S. M., Farooq, M., Khan, S. A. & Manzoor, S. A Gull Alpha Power Weibull distribution with applications to real and simulated data. PLoS ONE 15(6), e0233080 (2020).
Almetwally, E. M., Alharbi, R., Alnagar, D. & Hafez, E. H. A new inverted topp-leone distribution: Applications to the COVID-19 mortality rate in two different countries. Axioms 10(1), 25 (2021).
Xin, Y., Zhou, Y. & Mekiso, G. T. A new generalized-family for analyzing the COVID-19 data set: A case study. Math. Problems Eng. (2022).
Almongy, H. M., Almetwally, E. M., Aljohani, H. M., Alghamdi, A. S. & Hafez, E. H. A new extended Rayleigh distribution with applications of COVID-19 data. Results Phys. 23, 104012 (2021).
Acknowledgements
The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University, Saudi Arabia for funding this work through Large Groups Project under Grant Number (RGP.2/176/44).
Author information
Contributions
Z.S., I.K., and D.M.K. enhanced the manuscript through mathematical analyses and numerical simulations. M.J. and S.A.M. initiated the primary concept, analysed data, and contributed to manuscript restructuring. B.A. and M.J. meticulously validated findings, revised the manuscript, and secured funding. Additionally, I.K. and S.A.M. refined the manuscript's language and performed additional numerical simulations. The final version, prepared for submission, represents a consensus reached by all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Shah, Z., Khan, D.M., Khan, I. et al. A novel flexible exponent power-X family of distributions with applications to COVID-19 mortality rate in Mexico and Canada. Sci Rep 14, 8992 (2024). https://doi.org/10.1038/s41598-024-59720-1
DOI: https://doi.org/10.1038/s41598-024-59720-1