A new (T-X $$^\theta$$ ) family of distributions: properties, discretization and estimation with applications

Mandouh, Rasha M.; Mahmoud, Mahmoud R.; Abdelatty, Rasha E.

doi:10.1038/s41598-023-49425-2

Download PDF

Article
Open access
Published: 18 January 2024

A new (T-X$^\theta$) family of distributions: properties, discretization and estimation with applications

Rasha M. Mandouh¹,
Mahmoud R. Mahmoud¹^na1 &
Rasha E. Abdelatty¹^na1

Scientific Reports volume 14, Article number: 1613 (2024) Cite this article

437 Accesses
Metrics details

Subjects

Abstract

In this paper, a new class of distributions called the T-X$^\theta$ family of distributions for bounded—(0,1)—and unbounded—$(0,\infty )$—supported random variables is suggested. Some special sub-models of the proposed family are utilized. A new sub-model is selected to be studied in details. The statistical properties of the suggested family including quantile function, moments, moment generating function, order statistics and Rényi entropy are discussed. The maximum likelihood method is provided to estimate the parameters of the distribution and a Monte Carlo simulation study is used. The discretized T-X$^\theta$ family provided many sub-families and sub-models. In addition, eight real data sets are utilized to demonstrate the flexibility of the proposed continuous and discrete family’s multiple sub models.

Truncated Weibull–exponential distribution: methods and applications

Article Open access 27 November 2023

A novel generalized Weibull Poisson G class of continuous probabilistic distributions with some copulas, properties and applications to real-life datasets

Article Open access 19 January 2024

A novel extension of half-logistic distribution with statistical inference, estimation and applications

Article Open access 21 February 2024

Introduction

Data and their distributions are the consequences of generating models. One can imagine that real generating methods are unlimited and therefore we are met with quite a few of models. Statisticians try to choose a mathematical model to fit a given data set. The more data we have, the larger our repertoire of models should be. The existence of a huge variety of data in many problems, requires numerous changes in the classical distributions producing new families of distributions. The newly found distributions may have new properties beside inheriting some advantages of the classical ones which often are special cases of the new generated distribution. This adds additional flexibility to the produced distribution, which it is hoped that it will be of benefit and provides accurate prediction. As a result, comparisons with the classical distributions were frequently in favor of the newly produced ones. Several approaches have been suggested on how to find new flexible distributions. Many alternatives have been employed; sometimes they realized there purpose through adding one or more parameters to a baseline distribution in different ways, some used transformation forms while others merged more than one distribution to obtain a new one. Reference¹ proposed the transformed transformer’s family (T-X family). This method is derived from three functions (R, F, and W), where R and F are the cdfs of two random variables, T and X, respectively, and the W(.) function is used to connect T’s support to the range of X. The cdf and the pdf of the T-X family are given by

$$\begin{aligned} G(x)& = \int \limits _{a}^{W(F(x))} r(t) dt\quad = R(W(F(x))), \end{aligned}$$

and

$$\begin{aligned} g(x)& = r\{W(F(x))\}\frac{d}{d(x)}W(F(x)), \end{aligned}$$

where r(t) and R(t) are the pdf and cdf of the random variable T$\in [a,b]$, where $-\infty \le a<b\le \infty .$ The W(.) is monotonically and non decreasing function with rang of W = [a, b].

1.
$W(F(x))\in [a,b]$.
2.
W(F(x)) is differentiable and monotonically non decreasing.
3.
$W(F(x))\longrightarrow$ a, as $x\longrightarrow -\infty \;$ and $\;W(F(x))\longrightarrow$ b, as $x\longrightarrow \infty .$

Several new types of distributions using various W(.) functions were introduced in the literature. Table 1 includes some members of the T-X family depends on different examples of the W(.) functions for non negative continuous random variable T.

Table 1 Some members of the T-X family.

Full size table

The aim of this research is to introduce a new family of distributions called the T-X$^{\theta }$ family of distributions. The cdf, pdf, Survival and hazard functions of the new family are proposed, also some T-X$^{\theta }$ families based on different T distributions are introduced. Using different T and X functions some new models are introduced, the statistical properties of the E-X$^\theta$E distribution including Survival and hazard rate functions, the quantile function, moments, order statistics and Rényi entropy are derived. The maximum likelihood method is used for estimating the parameters and a simulation study on a variety of sample sizes for the E-X$^\theta$E distributions is investigated. The performance of several members of the discretized and continuous T-X$^\theta$ families of distributions are performed using eight different types of lifetime data sets. Finally the conclusion is provided.

The new family

Here a new way of defining the W(x) function in the T-X family of distributions is proposed. Let T and X be non-negative random variables with cdf R(t) and F(x) and assumed the random variable T$\in (a,b)$ and X$\in (a,b)$, for $0\le a< b\le \infty$ then the cdf and the pdf of the new T-X$^{\theta }$ family of distributions can be defined as follows

$$\begin{aligned} G(x)& = R\{x^{\theta }F(x)\}, \end{aligned}$$

(1)

and

$$\begin{aligned} g(x)& = \int \limits _{0}^{x^{\theta }F(x)}r(t)dt = \left( x^{\theta }f(x)+\theta x^{\theta -1}F(x)\right) r\{{x^{\theta }F(x)}\}\nonumber \\& = \left( xf(x)+\theta F(x)\right) x^{\theta -1} r\{{x^{\theta }F(x)}\}, \end{aligned}$$

(2)

where $\theta >0$ is an additional shape parameter.

The new W(.) function is a generalization that can be defined for both of ranges (0,1) and $(0,\infty )$. One can see that the random variables T, and X should have the same range.

The Survival and hazard functions of a random variable X with T-X$^{\theta }$ family are, respectively given, by

$$\begin{aligned} s(x)& = 1-R\{x^{\theta }F(x)\}, \end{aligned}$$

and

$$\begin{aligned} h(x)& = h_{T}\{x^{\theta }F(x)\}. \end{aligned}$$

When $\theta$ = 0, and T is distributed as beta or Kumaraswamy distribution, then this family is the beta-generated or the Kumaraswamy-generated family of distributions as defined in Table 1.

Based on different T distributions Table 2 displays some members of T-X$^{\theta }$ family with the same X function.

Table 2 Some cdf’s of the new family based on different T distributions.

Full size table

Special models

This section introduces various new T-X$^\theta$ family models based on different X random variables. The Exponential-X$^\theta$Exponential distribution is studied in details and Some of its features are provided. In Table 3, several new models derived from the T-X$^{\theta }$ family of distributions are provided.

Table 3 Some members of the T-X$^{\theta }$ family.

Full size table

The Uniform-X$^\theta$Beta distribution in Table 3 is the McDonald distribution defined by Ref.⁸ with $\theta +1 = c$

The Exponential-X$^{\theta }$Exponential

Let the random variable T follows the Exponential distribution with parameter $\lambda = 1$, and F(x) is the cdf of the Exponential distribution with parameter $\alpha$, then the cdf and the pdf of the Exponential-$X^{\theta }$Exponential(E-$X^{\theta }$E) distribution are defined as

$$\begin{aligned} G(x)& = 1- e^{- x^{\theta }(1- e^{-\alpha x})}, \end{aligned}$$

(3)

and

$$\begin{aligned} g(x)& = \left( \alpha x^{\theta }e^{-\alpha x}+\theta x^{\theta -1}(1- e^{-\alpha x})\right) e^{-x^{\theta }(1- e^{-\alpha x})}, x>0, \end{aligned}$$

(4)

where the survival and the hazard functions are

$$\begin{aligned} s(x)& = e^{-x^{\theta }(1- e^{-\alpha x})}, \end{aligned}$$

and

$$\begin{aligned} h(x)& = \alpha x^{\theta }e^{-\alpha x}+\theta x^{\theta -1}(1- e^{-\alpha x}). \end{aligned}$$

For different values of $\alpha$ and $\theta$ the pdf and the hazard function of the E-$X^{\theta }E$ distribution are plotted in Fig. 1.

Properties of the E-X$^{\theta }$E distribution

In this subsection the E-X$^{\theta }$E distribution is studied in details. Some properties of the E-X$^{\theta }$E distribution including quantile function, moments, moment generating function, order statistics and entropy are provided.

The quantile function

The quantile function (qf) of the E-X$^{\theta }$E distribution can be obtained by equating the cdf (G(x)) in (3) to u, $0<u<1,$ and solve it for x, then

$$\begin{aligned} 1- e^{-x^{\theta }(1- e^{-\alpha x})}& = u,\nonumber \\ x^{\theta }(1- e^{-\alpha x})& = -\ln (1-u). \end{aligned}$$

(5)

There is no closed form for the quantile function and it can be obtained numerically. Butting u = 0.5, in (5), the median (M) of the E-$X^{\theta }E$ distribution can be calculated.

pdf expansion

Here we need to expand the pdf which will be useful in calculating some properties of the E-X$^{\theta }$E distribution such as moments and moment generating function. To obtain the pdf expansion one needs to use the following expansion

$$\begin{aligned} e^{-x^{\theta }(1- e^{-\alpha x})}& = e^{-x^{\theta }}e^{x^{\theta } e^{-\alpha x}} = \sum _{j = 0}^{\infty }\sum _{i = 0}^{\infty } \dfrac{(-1)^{j}}{j!i!} {x^{(j+i)\theta } e^{-i\alpha x}}, \end{aligned}$$

then the pdf in (4) can be expressed as

$$\begin{aligned} g(x)& = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!}\left[ \alpha x^{(i+ j+1)\theta } e^{ -( i+1)\alpha x} +\theta x^{(i+ j+1)\theta -1}e^{-\alpha i x} -\theta x^{(i+ j+1)\theta -1} e^{ -( i+1)\alpha x} \right] . \end{aligned}$$

(6)

The moments

If X is a random variable distributed as E-X$^{\theta }E (\alpha ,\theta )$, then the nth ordinary moment of E-X$^{\theta }E$ distribution using the pdf in (6) will be

$$\begin{aligned} E(X^{n})& = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!}\left[ \int _{0}^{\infty }\alpha x^{(i+ j+1) \theta +n} e^{ -( i+1)\alpha x} dx +\int _{0}^{\infty }\theta x^{( i+ j+1)\theta +n-1}e^{-\alpha i x}dx \right. \\&\quad \left. - \int _{0}^{\infty } \theta x^{( i+j+1)\theta +n-1}e^{ -(i+1)\alpha x} dx \right] ,\\ E(X^{n})& = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!}\left[ \frac{\alpha \Gamma ( (i+ j+1)\theta +n +1)}{(( i+1)\alpha ) ^{ (i+ j+1)\theta +n+1}}+\frac{\theta \Gamma ( (i+ j+1)\theta +n )}{(\alpha i)^{ (i+ j+1)\theta +n}} -\frac{\theta \Gamma ( (i+ j+1)\theta +n )}{( (i+1)\alpha )^{ (i+ j+1)\theta +n}} \right] . \end{aligned}$$

The central moments can be obtained from the moments by

$$\begin{aligned} E(X-\mu )^{n} = \sum _{m = 0}^{n}\left( {\begin{array}{c}n\\ m\end{array}}\right) (-1)^{m}\mu ^{n}\mu ^{'}_{n-m}. \end{aligned}$$

Therefore, the mean and the variance of the E-$x^{\theta }E$ distribution are given by

$$\begin{aligned} \mu ^{}_{}& = E(X) = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!}\left[ \frac{\alpha \Gamma ((i+ j+1) \theta +2)}{((i+1)\alpha )^{(i+ j+1) \theta +2}}+\frac{\theta \Gamma ((i+ j+1) \theta +1 )}{(\alpha i)^{(i+ j+1) \theta +1}} -\frac{\theta \Gamma ((i+ j+1) \theta +1 )}{((i+1)\alpha )^{(i+ j+1) \theta +1}} \right] , \end{aligned}$$

and

$$\begin{aligned} \mu ^{}_{2}& = \sigma ^{2} = E(X-\mu _{})^{2} = E(X^{2})-\mu _{}^{2}\\& = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!}\left[ \frac{\alpha \Gamma ((i+j+1) \theta +3)}{((i+1)\alpha )^{(i+ j+1) \theta +3}}+\frac{\theta \Gamma ((i+j+1) \theta +2 )}{(\alpha i)^{(i+ j+1) \theta +2}} -\frac{\theta \Gamma ((i+ j+1) \theta +2 )}{((i+1)\alpha )^{(i+ j+1) \theta +2}} \right] \\&\quad -\mu _{}^{2}. \end{aligned}$$

The numerical values of mean ($\mu$) and variance ($\sigma ^2$) of the E-X$^{\theta }$E distribution are listed in Table 4 for selected values of $\alpha$ and $\theta$.

Table 4 Mean and variance of the E-X$^{\theta }$E distribution for some values of $\alpha$ and $\theta$.

Full size table

From Table 4 one can notes that; the mean and the variance of the E-X$^{\theta }E$ distribution decreases as $\theta$ or $\alpha$ increases. For fixed value of $\alpha$ or $\theta$ the mean and variance decreases as the other parameter increases.

The skewness and kurtosis based on moments

The measure of skewness (Sk) describes the degree of symmetry of the distribution while the kurtosis (Ku) is the peakedness of the distribution. They associated with the E-X$^{\theta }$E distribution using the moments by

$$\begin{aligned} Sk& = \dfrac{\mu ^{}_{3}}{\sigma ^{3}} = E\left[ \left( \dfrac{X-\mu ^{}_{}}{\sigma ^{}}\right) ^{3}\right] = \dfrac{E(X^{3})-3\mu E(X^{2})+3\mu ^{2} E(X^{})-\mu ^{3} }{\sigma ^{3}}, \end{aligned}$$

and

$$\begin{aligned} Ku& = \dfrac{\mu ^{}_{4}}{(\mu ^{}_{2})^{2}} = E\left[ \left( \dfrac{X-\mu ^{}_{}}{\sigma ^{}}\right) ^{4}\right] = \dfrac{E(X^{4})-4 E(X^{3})\mu +6E(X^{2})\mu ^{2}-3\mu ^{4} }{\sigma ^{4}}. \end{aligned}$$

Table 5 shows numerical values for the skewness (Sk), and kurtosis (Ku) of the E-$X^{\theta }E$ distribution for some values of $\alpha$ and $\theta$.

Table 5 Skewness and kurtosis of E-X$^{\theta }$E distribution for some values of $\alpha$, and $\theta$.

Full size table

From the results provided in Table 5, it’s observed that the E-X$^{\theta }$E distribution covered different shapes of pdf.

For different values of $\alpha$, one can notes that; the distribution is right skew $(Sk>0)$ for $\theta <3$. For $\theta = 3$, the distribution is approximately symmetric $(Sk = 0)$, and for $\theta >3$ the distribution is left skew $(Sk<0)$.

For fixed value of $\alpha$, the kurtosis of the E-X$^{\theta }$E distribution is high peak (leptokurtic) for values of $3<\theta <5$, but the distribution is neither too high peak nor too flat topped for $3<\theta <10$ (mesokurtic).

The moment generating function

The moment generating function of the E-X$^{\theta }$E distribution can be obtained using the expansion in (6), as follows

$$\begin{aligned} M_{X}(t) = E(e^{tX})& = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!} \left[ \int _{0}^{\infty }\alpha x^{(i+ j+1) \theta } e^{ -((i+1)\alpha -t)x} dx +\int _{0}^{\infty }\theta x^{(i+ j+1) \theta -1}e^{-(\alpha i-t) x}dx -\right. \nonumber \\&\left. \qquad \qquad \qquad \qquad \int _{0}^{\infty } \theta x^{(i+ j+1) \theta -1}e^{ -((i+1)\alpha -t)x} dx \right] \nonumber \\& = \sum _{i = 0}^{\infty } \sum _{j = 0}^{\infty } \frac{(-1)^j}{i! j!}\left[ \dfrac{\alpha \Gamma ((i+ j+1) \theta +1)}{((i+1)\alpha -t)^{(i+ j+1) \theta +1}} +\dfrac{\theta \Gamma ((i+ j+1) \theta )}{(\alpha i-t)^{(i+ j+1) \theta }} - \dfrac{\Gamma ((i+ j+1) \theta )}{( (i+1)\alpha -t)^{(i+ j+1) \theta }} \right] . \nonumber \end{aligned}$$

Thus we can find the nth moment by differentiating the $M_{X}(t)$ n times, and then setting t = 0 in the result; that is,

$$\begin{aligned} \dfrac{\partial }{\partial t^{n}}M_{X}(t)|_{t = 0} = E(x^{n}) = \mu '_{n}. \end{aligned}$$

Order statistics

Let $X_{1}, \dots , X_{k}$ be k independent random variables from distribution with pdf g(x) and cdf G(x). According to Ref.⁹, the pdf of rth order statistics, is given by

$$\begin{aligned} g_{r:k}(x)& = \dfrac{k!}{(r-1)!(k-r)!}\left[ G(x)\right] ^{r-1}\left[ 1-G(x)\}\right] ^{k-r}g(x). \end{aligned}$$

(7)

Using the binomial expansion for te quantity

$$\begin{aligned} \left[ G(x)\right] ^{r-1} = \left[ 1-(1-G(x))\right] ^{r-1} = \sum _{i = 0}^{\infty }\left( {\begin{array}{c}r-1\\ i\end{array}}\right) (1-G(x))^{i}. \end{aligned}$$

(8)

Inserting (8), (3) and (4) in (7), then the pdf of the rth order statistic from the T-X$^{\theta }$ family of distributions is given by

$$\begin{aligned} g_{r:k}(x)& = \sum _{i = 0}^{\infty }\dfrac{k!}{(r-1-i)!(k-r)!i!}\left[ 1-R\{{x^{\theta }F(x)}\}\right] ^{i+k-r}\left( x^{\theta }f(x)+\theta x^{\theta -1}F(x)\right) r\{{x^{\theta }F(x)}\}. \end{aligned}$$

Let $X_{1}, \ldots , X_{k}$ be a random sample from the E-X$^{\theta }$E distribution. The pdf of rth order statistic, $X_{r:k}$, for the E-X$^{\theta }$E distribution is defined by

$$\begin{aligned} g_{r:k}(x) = &\sum _{l = i = j = 0}^{\infty }\omega _{i,j,l} \left[ \alpha x^{\theta +\theta i+\theta j} e^{ -(\alpha +\alpha i)x} +\theta x^{\theta +\theta i+\theta j-1}e^{-\alpha i x} -\theta x^{\theta +\theta i+\theta j-1}e^{ -(\alpha +\alpha i)x} \right] , \end{aligned}$$

(9)

where $\omega _{i,j,l} = \dfrac{k!(-1)^j(l+k-r+1)^{i+j}}{(r-1-l)!(k-r)!l!i! j!}$.

Proof

The pdf of rth order statistic, $X_{r:k}$, for the E-X$^{\theta }$E distribution can be written as

$$\begin{aligned} g_{r:k}(x)& = \sum _{l = 0}^{\infty }\dfrac{k!}{(r-1-l)!(k-r)!l!}\left[ e^{-x^{\theta }(1- e^{-\alpha x})}\right] ^{l+k-r}\left( \alpha x^{\theta }e^{-\alpha x}+\theta x^{\theta -1}(1- e^{-\alpha x})\right) \\&\quad \times e^{-x^{\theta }(1- e^{-\alpha x})}\\& = \sum _{l = 0}^{\infty }\dfrac{k!}{(r-1-l)!(k-r)!l!}\left[ e^{-x^{\theta }(1- e^{-\alpha x})}\right] ^{v}\left( \alpha x^{\theta }e^{-\alpha x}+\theta x^{\theta -1}(1- e^{-\alpha x})\right) , \end{aligned}$$

where $v = l+k-r+1$. Using the Exponential expansion for $e^{-vx^{\theta }+vx^{\theta } e^{-\alpha x}}$, then the Form in (9) is obtained. $\square$

Entropy

In information theory entropy can be regarded as a measure of a system’s degree of uncertainty. It has a widely applications in economics, physics, weather science and sociology. in this section the Rényi entropy measurements is evaluated for the E-X$^{\theta }$E distribution.

Rényi entropy

¹⁰ defined the Rényi entropy which considered as a generalization of the Hartley, Shannon, collision and min entropy. The Rényi entropy of a random variable X is defined by

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \underset{0 }{\overset{\infty }{\int }} f^{\rho }(x)dx,\rho >0,\rho \ne 1. \end{aligned}$$

The Rényi entropy of the T-$X^\theta$ family of distributions is given by

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \int _{0}^{\infty } \sum _{m = 0}^{\rho }\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \theta ^m x^{\theta \rho -m}f(x)^{\rho -m}( F(x))^{m} r^{\rho }\{ x^{\theta }F(x)\} dx. \end{aligned}$$

(10)

This form is easy to be shown by applying the following binomial expansion

$$\begin{aligned} \left( x^{\theta }f(x)+\theta x^{\theta -1}F(x)\right) ^{\rho }& = \sum _{m = 0}^{\rho }\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \theta ^m x^{\theta \rho -m}f(x)^{\rho -m}( F(x))^{m}. \end{aligned}$$

The Rényi entropy of of a random variable X following the E-$X^{\theta }$E distribution is

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \left[ \sum _{k = 0}^{\infty }\sum _{m = 0}^{\rho }\sum _{r = 0}^{m+k}\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \left( {\begin{array}{c}m+k\\ r\end{array}}\right) \dfrac{\left( -1\right) ^{k+r}\alpha ^{(\rho -m)\Gamma \left( \theta \rho +\theta k-m+1\right) }\rho ^{k}\theta ^{m}}{k!r!\left( \alpha (\rho +r-m)\right) ^{\left( \theta \rho +\theta k-m+1\right) }} \right] . \end{aligned}$$

(11)

Proof

Applying result in (10), then the Rényi entropy of the E-X$^{\theta }$E can be written as

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \int _{0}^{\infty } \sum _{m = 0}^{\rho }\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \theta ^m x^{\theta \rho -m}(\alpha e^{-\alpha x})^{\rho -m}( 1- e^{-\alpha x})^{m} e^{-\rho x^{\theta }(1- e^{-\alpha x})} dx. \end{aligned}$$

Since $e^{-\rho x^{\theta }(1- e^{-\alpha x})} = \sum _{k = 0}^{\infty }\dfrac{\left( -\rho x^{\theta }(1- e^{-\alpha x})\right) ^{k}}{k!}$ , then the Rényi entropy will be

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \left[ \int _{0}^{\infty } \sum _{k = 0}^{\infty }\sum _{m = 0}^{\rho }\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \dfrac{\left( -1\right) ^{k}\alpha ^{(\rho -m)}\rho ^{k}\theta ^{m}}{k!} x^{\theta \rho +\theta k-m} \times e^{-\alpha (\rho -m) x}\left( 1- e^{-\alpha x}\right) ^{m+k} dx\right] . \end{aligned}$$

Since $\left( 1- e^{-\alpha x}\right) ^{m+k} = \sum _{r = 0}^{m+k}\left( {\begin{array}{c}m+k\\ r\end{array}}\right) (-1 )^{r}e^{-r\alpha x}.$ Then

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \left[ \int _{0}^{\infty } \sum _{k = 0}^{\infty }\sum _{m = 0}^{\rho }\sum _{r = 0}^{m+k}\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \left( {\begin{array}{c}m+k\\ r\end{array}}\right) \dfrac{\left( -1\right) ^{k+r}\alpha ^{(\rho -m)}\rho ^{k}\theta ^{m}}{k!r!} \times x^{\theta \rho +\theta k-m}e^{-\alpha (\rho +r-m) x} dx\right] . \end{aligned}$$

Assuming the exchange between summation and integration is possible, then the last form will be

$$\begin{aligned} I_{R}(\rho )& = \dfrac{1}{1-\rho }\log \left[ \sum _{k = 0}^{\infty }\sum _{m = 0}^{\rho }\sum _{r = 0}^{m+k}\left( {\begin{array}{c}\rho \\ m\end{array}}\right) \left( {\begin{array}{c}m+k\\ r\end{array}}\right) \dfrac{\left( -1\right) ^{k+r}\alpha ^{(\rho -m)}\rho ^{k}\theta ^{m}}{k!r!} \int _{0}^{\infty } x^{\theta \rho +\theta k-m}e^{-\alpha (\rho +r-m) x} dx\right] . \end{aligned}$$

The result in Eq. (11) obtained from the last form by using the gamma function as

$$\begin{aligned} \int _{0}^{\infty } x^{\theta \rho +\theta k-m}e^{-\alpha (\rho +r-m) x} d x& = \dfrac{\Gamma \left( \theta \rho +\theta k-m+1\right) }{\left( \alpha (\rho +r-m)\right) ^{\left( \theta \rho +\theta k-m+1\right) }}. \end{aligned}$$

$\square$

Parameter estimation and simulation study

In the first subsection, the maximum likelihood estimation (MLEs) of the parameters of the E-X$^{\theta }$E distribution is discussed. In the second subsection, a simulation study is obtained.

Maximum likelihood (MLE)

Let $X_{1}, X_{2},..., X_{n}$ be a random sample from E-X$^{\theta }$E distribution. The log-likelihood function corresponding to (4) is

$$\begin{aligned} LL = \log L(x_{i},\alpha , \theta )& = \sum _{i = 1}^{n} \log \left( \alpha x_{i}^{\theta }e^{-\alpha x_{i}}+\theta x^{\theta -1}(1- e^{-\alpha x_{i}})\right) -\sum _{i = 1}^{n}x_{i}^{\theta }(1- e^{-\alpha x_{i}}). \end{aligned}$$

(12)

The partial derivatives of $\alpha$ and $\theta$ corresponding to (12), are given by

$$\begin{aligned} \dfrac{\partial }{\partial \alpha }LL& = \sum _{i = 1}^n \frac{ x_i^{\theta }e^{-\alpha x_i} - \alpha x_i^{\theta +1} e^{-\alpha x_i}+\theta x_i^{\theta } e^{-\alpha x_i} }{\alpha x_{i}^{\theta }e^{-\alpha x_{i}}+\theta x^{\theta -1}(1- e^{-\alpha x_{i}})} -\sum _{i = 1}^n x_i^{\theta +1}e^{-\alpha x_i}, \end{aligned}$$

(13)

and

$$\begin{aligned} \dfrac{\partial }{\partial \theta }LL& = \sum _{i = 1}^n \frac{\alpha e^{-\alpha x_i} x_i^{\theta } \log \left( x_i\right) +x_i^{\theta -1}\left( 1-e^{-\alpha x_i}\right) +\theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1} \log \left( x_i\right) }{\alpha x_{i}^{\theta }e^{-\alpha x_{i}}+\theta x^{\theta -1}(1- e^{-\alpha x_{i}})}\nonumber \\&\quad -\sum _{i = 1}^nx_i^{\theta } \log \left( x_i\right) \left( 1-e^{-\alpha x_i}\right) . \end{aligned}$$

(14)

Therefore, the MLE for ${\hat{\alpha }}$ and ${\hat{\theta }}$ are achieved by setting (13) and (14) to zero and then numerically solving them using a simulation technique such as Newton Rahbson.

Interval estimation

the second derivatives of the log likelihood function for $\alpha$ and $\theta$ are

$$\begin{aligned} \dfrac{\partial LL}{\partial \alpha ^{2}}& = \sum _{i = 1}^n e^{-\alpha x_i} x_i^{\theta +2}+ \sum _{i = 1}^n \left[ \frac{-\theta x^2 e^{-\alpha x_i} x_i^{\theta -1}-2 e^{-\alpha x_i} x_i^{\theta +1}+\alpha e^{-\alpha x_i} x_i^{\theta +2}}{\theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }}+\left( \theta x e^{-\alpha x_i} x_i^{\theta -1}+e^{-\alpha x_i} x_i^{\theta }-\alpha e^{-\alpha x_i} x_i^{\theta +1}\right) \times \right. \nonumber \\&\quad \left. \left( \frac{\alpha e^{-\alpha x_i} x_i^{\theta +1}}{\left( \theta \left( 1-e^{\alpha (-x)}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}-\frac{\theta x e^{-\alpha x} x_i^{\theta -1}}{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}- \frac{x_i^{\theta }e^{-\alpha x_i} }{\left( \theta x_i^{\theta -1} \left( 1-e^{-\alpha x_i}\right) +\alpha x_i^{\theta } e^{-\alpha x_i}\right) {}^2}\right) \right] , \end{aligned}$$

(15)

$$\begin{aligned} \dfrac{\partial ^{2}LL}{\partial \alpha \partial \theta }& = -\sum _{i = 1}^n x_i^{\theta +1} \log \left( x_i\right) e^{-\alpha x_i}+\sum _{i = 1}^n \left[ \frac{x e^{-\alpha x_i} x_i^{\theta -1}+\theta x e^{-\alpha x_i} x_i^{\theta -1} \log \left( x_i\right) +e^{-\alpha x_i} x_i^{\theta } \log \left( x_i\right) -\alpha e^{-\alpha x_i} x_i^{\theta +1} \log \left( x_i\right) }{\theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }}+\right. \nonumber \\&\quad \left. \left( \theta x_i e^{-\alpha x_i} x_i^{\theta -1}+e^{-\alpha x_i} x_i^{\theta }-\alpha e^{-\alpha x_i} x_i^{\theta +1}\right) \left( \frac{e^{-\alpha x_i} x_i^{\theta -1}}{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}-\frac{x_i^{\theta -1}}{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}\right. \right. \nonumber \\&\quad \left. \left. +\frac{\theta e^{-\alpha x_i} x_i^{\theta -1} \log \left( x_i\right) }{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}- \frac{\theta x_i^{\theta -1} \log \left( x_i\right) }{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}-\frac{\alpha e^{-\alpha x_i} x_i^{\theta } \log \left( x_i\right) }{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}\right) \right] , \end{aligned}$$

(16)

and

$$\begin{aligned} \dfrac{\partial LL}{\partial \theta ^{2}}& = -\sum _{i = 1}^n \left( 1-e^{-\alpha x_i}\right) x_i^{\theta } \log ^2\left( x_i\right) + \sum _{i = 1}^n \left[ \frac{\theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1} \log ^2\left( x_i\right) +\alpha e^{-\alpha x_i} x_i^{\theta } \log ^2\left( x_i\right) +2 \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1} \log \left( x_i\right) }{\theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }}\right. \nonumber \\&\qquad \left. +\left( \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1} \log \left( x_i\right) +\alpha e^{-\alpha x_i} x_i^{\theta } \log \left( x_i\right) \right) \times \left( \frac{e^{-\alpha x_i} x_i^{\theta -1}}{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}\right. \right. \nonumber \\&\qquad \left. \left. -\frac{x_i^{\theta -1}}{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}+\frac{\theta e^{-\alpha x_i} x_i^{\theta -1} \log \left( x_i\right) }{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}-\frac{\theta x_i^{\theta -1} \log \left( x_i\right) }{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}-\right. \right. \nonumber \\&\qquad \left. \left. \frac{\alpha e^{-\alpha x_i} x_i^{\theta } \log \left( x_i\right) }{\left( \theta \left( 1-e^{-\alpha x_i}\right) x_i^{\theta -1}+\alpha e^{-\alpha x_i} x_i^{\theta }\right) {}^2}\right) \right] . \end{aligned}$$

(17)

The hessian matrix H can be obtained as follows by using Eqs. (13), (16) and (17) .

$$\begin{aligned} H(\Theta |x) = \begin{pmatrix} \dfrac{\partial ^{2} LL}{\partial \alpha ^{2}}&{}\dfrac{\partial ^{2} LL}{\partial \alpha \partial \theta }\\ \dfrac{\partial ^{2} LL}{\partial \theta \partial \alpha }&{}\dfrac{\partial ^{2} LL}{\partial \theta ^{2}} \end{pmatrix}, \end{aligned}$$

where, ${\Theta } = (\alpha , \theta )$, $\hat{{\Theta }}\sim N_{2}({\Theta },-H(\hat{{\Theta }}|x)^{-1})$ and the information matrix $I( {\Theta }|x) = -E(H({\Theta }|x))$ , The EX$^{\theta }$E distribution’s MLEs for parameters $\alpha$ and $\theta$ only exist if the Hessian matrix is negative definite then the likelihood has a unique root. Consequently, the matrix -H($\hat{{\Theta }}$) can be used to calculate asymptotic confidence intervals for the parameters $\alpha$ and $\theta$.

Simulation study

In this subsection a Monte Carlo simulation study is presented to demonstrate the effectiveness of the ML approach for estimating the E-X$^{\theta }E$ distribution parameters ($\alpha$ and $\theta$). The following are the steps of the simulation procedure:

1.
Set the parameter values for ($\alpha$ and $\theta$) as (2,0.1), (0.1,2), (0.5,1.5), (1.2,1.5), (1,0.2), (0.3,0.2), (0.6,0.5), (0.1,0.2), (1,2), and (0.2,1).
2.
Using the E-$X^{\theta }$E distribution’s quantile function, which is defined in (5), to generate a random sample of size n, where n = 20, 45, 60, 90, 120, 150, and 200.
3.
Using the generated data obtained in step 2, the MLE of the parameters $\alpha$ and $\theta$ is calculated.
4.
The biases and the root mean squared errors (RMSE) were determined using the provided formulas,
$$\begin{aligned} \text {bias}(\gamma ) = \dfrac{1}{1000}\sum _{i = 1}^{1000}({\hat{\gamma }}_{i}-\gamma )\quad \text {and} \quad \text {RMSE}(\gamma ) = \sqrt{\dfrac{1}{1000}\sum _{i = 1}^{1000}({\hat{\gamma }}_{i}-\gamma )^{2}}. \end{aligned}$$

The biases, root mean squared errors and variances of ${\hat{\alpha }}$ and ${\hat{\theta }}$ are reported in Table 6. In general, as predicted, the results in this table showed a decrease in the values of the biases RMSE and variance as sample size increases, indicating that the MLE is a reliable approach for estimating the E-X$^{\theta }$E parameters as it is unbiased, the variance is minimum and it realizes the consistence.

Table 6 Summary of MLE’s results simulation of the E-X$^{\theta }$E for some values of $\alpha$ and $\theta$.

Full size table

Discrete T-X$^{\theta }$ family

Statistical literature contains many techniques that can be used to discretize the continuous family of distributions. One of these techniques is the one that depends on the survival function. Following¹¹, the survival function for a discrete life time distribution is defined as $S(x) = P(X\ge x), x = 1,2,...$ and $S(0) = 1$, then the probability mass function (pmf) is:

$$\begin{aligned} P(X = x)& = P(x\le X\le x+1)\nonumber \\& = S(x)-S(x+1), x = 0,1,2,\ldots . \end{aligned}$$

(18)

The new family is generated by discretizing the continuous cdf function in (1) using the form in (18). The pmf of the T-X$^{\theta }$ family is given by

$$\begin{aligned} p(x)& = R\{(x+1)^{\theta }F(x+1)\}-R\{x^{\theta }F(x)\},x = 0,1,2,\ldots . \end{aligned}$$

(19)

Based on this pmf, with different T distributions, Table 7 contains some new discrete sub-families of the discrete T-X$^{\theta }$ family.

Table 7 Some pmfs of the new family based on different T distributions.

Full size table

From Table 7 for the DW-X$^{\theta }$ family, we notes that;

When the shape parameter $\lambda = 1,$ the DW-X$^{\theta }$ family reduces to the discrete Exponential family of distributions (DE-X$^{\theta }$) with parameters $1/\beta$ and $\theta$.
When $\lambda = 2,$ the DW-X$^{\theta }$ family reduces to the discrete Rayleigh family of distributions (DR-X$^{\theta }$) with parameters $\beta ^2$ and $2\theta$.
The DW-X$^{\theta }$ family can be considered as The DE-X$^{\theta }$ family with exponentiation F(x), ie $\theta ^{\lambda } = \theta ^{\star },$ ${\beta }^{\lambda } = {\beta }^{\star }$and $\lambda$ is the exponentiation parameter.

The discrete exponetial-X$^{\theta }$, DE-X$^{\theta }$ family

The cumulative distribution function (cdf), survival function and probability mass function (pmf) of the DE-X$^{\theta }$ family of distributions are

$$\begin{aligned} G(x)& = 1- e^{- \alpha {x^{\theta }F(x)}}, x = 0,1,2,\ldots , \end{aligned}$$

(20)

$$\begin{aligned} p(x)& = e^{- \alpha {x^{\theta }F(x)}}- e^{- \alpha {(x+1)^{\theta }F(x+1)}} , \alpha , \theta >0. \end{aligned}$$

(21)

The survival and the hazard rate functions of the DE-X$^{\theta }$ family of distributions are

$$\begin{aligned} S(x)& = p(X\ge x) = e^{- \alpha {x^{\theta }F(x)}},\\ h(x)& = P(X = x|X\ge x) = 1-e^{-\alpha (x+1)^{\theta }F(x+1)-(x)^{\theta }F(x)}. \end{aligned}$$

With different X random variables, many distributions can be generated as members of the DE-X$^{\theta }$ family as shown in Table 8.

Table 8 Some pmfs of the DE-X$^{\theta }$ family based on different X distributions.

Full size table

The DE-Exponential$^{\varvec{\theta }}$(DEE) distribution

The DEE distribution is derived here as an example from Table 8. Substituting F(x) in (20) and (21) by the Exponential distribution with parameter $\beta$, then The pmf of the DEE distribution with three parameters are given by;

$$\begin{aligned} G(x)& = 1- e^{- \alpha {x^{\theta }(1-e^{-\beta x})}}, x = 0,1,2,\ldots ,\nonumber \\ p(x)& = e^{- \alpha {x^{\theta }(1-e^{-\beta x})}}- e^{- \alpha {(x+1)^{\theta }(1-e^{-\beta (x+1)})}} , \alpha , \theta , \beta >0, \end{aligned}$$

(22)

while the survival and the hazard functions are given by

$$\begin{aligned} S(x)& = p(X\ge x) = e^{- \alpha {x^{\theta }(1-e^{-\beta x})}}\nonumber \\ h(x)& = 1-e^{-\alpha (x+1)^{\theta }(1-e^{-\beta (x+1)})-(x)^{\theta }(1-e^{-\beta x})}. \end{aligned}$$

Figure 2 displays some possible pmf shapes of the DEE distribution. The hazard rate function may has an increasing or decreasing shape as shown in Fig. 3.

Applications to real data

The applications in this section are derived into two subsections, the first subsection contains applications of the continuous distributions displayed in Table 3. While, the second subsection deals with count data that matched the discrete distributions presented in Table 8.

Using continuous data

In this subsection multiple models of the T-X$^\theta$ family are fitted to four different data sets. These examples are provided to demonstrate the flexibility of the new family members when compared against a variety of distributions. The estimation of each model parameters is obtained using the maximum likelihood method. To compare the distributions, three criteria are calculated, including: the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the corrected Akaike information criterion (AICc).

$$\begin{array}{*{20}l} \text {AIC} = 2k -2L(\varvec{\theta }; x), \\ \text {BIC} = \ln (nk) -2L(\theta ; x), \\ \text {AICc} = \text {AIC}+\frac{2 k(k+1) }{n-k-1}, \end{array}$$

where $L(\theta ; x)$ denotes the log likelihood for the model, k is the number of parameters and n is the sample size. In general the model that best fits the data is the one with the highest log L and p values and the lowest AIC, AICc, BIC, and Ks values. The Mathematica package was used to assess all of the required computations and figures. Table 9 shows the distributions that have been fitted to the data for comparison considerations.

Table 9 Competitive distributions.

Full size table

Data set I

The data set provides the wait times (in min) before service for 100 Bank clients, which were evaluated and assessed by¹⁶ for fitting the Lindley distribution. The data are: 0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1,3.2, 3.3, 3.5, 3.6, 4.0, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9 ,4.9, 5.0, 5.3, 5.5, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.3, 6.7 ,6.9 ,7.1, 7.1, 7.1, 7.1, 7.4, 7.6, 7.7, 8.0, 8.2, 8.6, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.6,9.7, 9.8, 10.7, 10.9, 11, 11, 11.1, 11.2, 11.2, 11.5, 11.9, 12.4,12.5, 12.9, 13, 13.1, 13.3, 13.6, 13.7, 13.9, 14.1, 15.4, 15.4, 17.3,17.3, 18.1, 18.2, 18.4, 18.9, 19, 19.9, 20.6, 21.3, 21.4, 21.9, 23,27, 31.6, 33.1, 38.5.

The estimated parameter values and the goodness of fit measures for this data are presented in Table 10. According to the results in this table, the G-X$^\theta$L, E-X$^\theta$E, G-X$^\theta$E, E-X$^\theta$G, and L-G{F} distributions are fitted to this data. The G-X$^\theta$L distribution is the best option among other competitive models, as shown by the findings in Table 10, as it has the greatest p-value, and the smallest other goodness of fit statistics. These findings are also supported by Fig. 4 which represents the empirical cdf and the observed density (histogram) for data set I together with the competitive models.

Table 10 Goodness-of fit statistics for data set I.

Full size table

Data set II

The data set comes from Ref.¹⁷ and represents the time between failures for repairable components. The data are provided as shown below:

1.43, 0.11, 0.71, 0.77, 2.63, 1.49, 3.46, 2.46, 0.59, 0.74, 1.23, 0.94, 4.36, 0.40, 1.74, 4.73, 2.23, 0.45, 0.70, 1.06, 1.46, 0.30, 1.82, 2.37, 0.63, 1.23, 1.24, 1.97, 1.86, 1.17.

The parameter estimates and the goodness-of-fit statistics for E-X$^{\theta }$E, G-X$^{\theta }$L,G-X$^{\theta }$E, Bu-X$^{\theta }$E, CIR and PIL distributions are listed in Table 11. All competitive distributions are fitted and perform well when examining these data with a p-value greater than 0.05. However, the optimal model to acquire the best assessment of the data is the E-X$^{\theta }$E model, which has the smallest values of -ll, AIC, BIC, AICc and k-s statistics, as well as the highest p-value of all the examined models. Figure 5 supports the results in Table 11.

Table 11 Goodness-of fit statistics for data set II.

Full size table

Data set III

The data set was acquired from Ref.¹⁸and consists of 30 successive values of March precipitation (in inches) in Minneapolis/St Paul. The data is as follows:

0.77, 1.74, 0.81, 1.2, 1.95, 1.2, 0.47, 1.43, 3.37, 2.2, 3, 3.09, 1.51, 2.1, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81, 2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.9, 2.05.

Figure 6 depicts the empirical cdf and observed density (histogram) for Data set III, compared with the cdf’s and pdf’s of E-X$^{\theta }$E, E-X$^{\theta }$G, G-X$^{\theta }$E, Bu-X$^{\theta }$E and CIR distributions. Table 12 displays the calculated parameters as well as the goodness-of-fit values. As seen in Table 12 and Fig. 6, the G-X$\theta$E distribution was chosen as the best model for this data because it had the lowest goodness-of-fit statistics and the greatest p-value, which = 1, of all the competitive distributions.

Table 12 Parameter estimate and goodness-of fit statistics data set III.

Full size table

Data set IV

The data shows the remission times (in months) of 128 bladder cancer patients. Reference¹⁵ have made use of this data. The following are the data set values:

0.08, 2.09, 3.48, 4.87, 6.94, 8.66, 13.11, 23.63, 0.20, 2.23, 3.52, 4.98, 6.97, 9.02, 13.29, 0.40, 2.26, 3.57, 5.06, 7.09, 9.22, 13.80, 25.74, 0.50, 2.46, 3.64, 5.09, 7.26, 9.47, 14.24, 25.82, 0.51, 2.54, 3.70, 5.17, 7.28, 9.74, 14.76, 26.31, 0.81, 2.62, 3.82, 5.32, 7.32, 10.06, 14.77, 32.15, 2.64, 3.88, 5.32, 7.39, 10.34, 14.83, 34.26, 0.90, 2.69, 4.18, 5.34, 7.59, 10.66, 15.96, 36.66, 1.05, 2.69, 4.23, 5.41, 7.62, 10.75, 16.62, 43.01, 1.19, 2.75, 4.26, 5.41, 7.63, 17.12, 46.12, 1.26, 2.83, 4.33, 5.49, 7.66, 11.25, 17.14, 79.05, 1.35, 2.87, 5.62, 7.87, 11.64, 17.36, 1.40, 3.02, 4.34, 5.71, 7.93, 11.79, 18.10, 1.46, 4.40, 5.85, 8.26, 11.98, 19.13, 1.76, 3.25, 4.50, 6.25, 8.37, 12.02, 2.02, 3.31, 4.51, 6.54, 8.53, 12.03, 20.28, 2.02, 3.36, 6.76, 12.07, 21.73, 2.07, 3.36, 6.93, 8.65, 12.63, 22.69.

Plots of the fitted cdf’s and pdf’s for this data set are shown in Fig. 7. The calculated parameter values and goodness of fit measures are shown in Table 13. This table shows that, E-X$^{\theta }$E, G-X$^{\theta }$E, G-X$^{\theta }$L distributions are Fitted and perform well when examining this data set with a p-value greater than 0.05. While the NW-L distribution isn’t fitted to this data. However, the optimal model to acquire the best assessment of the data is the E-X$^{\theta }$E model, which has the smallest values of -ll, AIC, BIC, AICc and K-S statistics, as well as the highest p-value of all the examined models. Figure 7 supports the results in Table 13

Table 13 Parameter estimate and goodness-of fit statistics data set IV.

Full size table

Using discrete data

Applications to count data sets from different fields are examined to demonstrate the practical applicability of the proposed models presented in Table 8. The R software is used to acquire the statistical results. MLE’s, Akaike Information Criteria (AIC), Bayesian Information Criteria(BIC), $\chi ^{2}$ test with associated p-values are performed to compare the flexibility of distributions. For competitive considerations, the data were also fitted to the Gamma distribution and some other discrete competitive distribution shown in Table 14.

Table 14 The discrete competitive distributions.

Full size table

Data set V

This data set represents the survival times of 44 patients suffering from head and neck cancer who retreated using a combination of radiotherapy. This data is taken from Ref.²⁴. The data are:

12 32 37 24 24 74 81 26 41 58 63 68 78 47 55 84 155 159 92 94 110 127 130 133 140 112 119 146 173 179 194 195 339 432 209 249 281 319 469 725 817 519 633 1776

Figure 8 illustrates the empirical and estimated cdf of this data. Table 15 displays the calculated parameters as well as the goodness-of-fit values. From the outcomes of this Table, we conclude that the DEE, DEW, DEB, DEL, DEFr, DEG, DEETE, INH, DEOWE, DBiEXII, PMiD and Geometric distributions are Fitted and perform quite well for evaluating this data. The DEE distribution is selected as the best model for the data because it has the smallest values of the goodness-of-fit statistics and the highest p-value among all the competitive distributions.

Table 15 Parameter estimate and goodness-of fit statistics data set V.

Full size table

Data set VI

These data show the time intervals in days between coal mine explosions from March 15, 1851 to March 22, 1962, inclusive. This time range is 40,550 days long. So the data consists of 190 numbers totaling 40,549. The 0 occurs because two incidents occurred on June 6, 1875. These findings are presented in Ref.²⁵. The data set is; 0 1 1 2 2 3 4 4 4 6 7 10 11 12 12 12 13 15 15 16 16 16 17 17 18 19 19 19 20 20 22 23 24 25 27 28 29 29 29 31 31 32 34 34 36 36 37 40 41 41 42 43 45 47 48 49 50 53 54 54 55 56 59 59 61 61 65 66 66 70 72 75 78 78 78 80 80 81 88 91 92 93 93 95 95 96 96 97 99 101 108 110 112 113 114 120 120 123 123 124 124 125 127 129 131 134 137 139 143 144 145 151 154 156 176 182 186 187 188 189 190 193 194 197 202 203 208 215 216 217 217 217 218 224 225 228 232 233 250 255 275 275 275 276 286 292 307 307 312 312 315 324 326 326 329 330 336 345 348 354 361 364 368 378 388 420 431 456 462 467 498 517 536 538 566 632 644 745 806 826 871 952 1205 1312 1358 1630 1643 2366

The summary statistics for this data is shown in Table 16. Figure 9 depicts the fitted cdf plots of the proposed models compared with the DEETE, INH, DEOWE, DBiEXII, PMiD and Geometric distributions for the data set VI. The estimated parameter values and the goodness of fit measures are presented in Table 16. According to the results in this table, the DEE, DEW, DEB, DEL, DEFr, DEG and DEOWE distributions are fitted to the data. The DEE distribution is the best option among other competitive models as it has the greatest p-value, while the DEG has the smallest other goodness of fit statistics, as shown by the findings in Table 16.

Table 16 Parameter estimate and goodness-of fit statistics for data set VI.

Full size table

Data set VII

This data derived from a study performed in the lab on male mice who were given a 300 roentgen radiation exposure and were 5–6 weeks old. This information describes additional causes of death than the two primary causes: Thymic lymphoma and reticulum cell sarcoma. This data were examined by Ref.²⁶. The data are:

40 42 51 62 163 179 206 222 228 252 249 282 324 333 341 366 385 407 420 431 441 461 462 482 517 517 524 564 567 586 619 620 621 622 647 651 686 761 763

Figure 10 shows the fitted cdf plots of the suggested models compared with the DEETE, INH, DWMOE, DBiEXII, PMiD and Geometric distributions to this data set. Table 17, shows the estimated values of parameters as well as the goodness of fit statistics.

This table shows that, with the exception of the INH and Geometric distributions, all distributions are fitted and performed well when examining these data with p-values greater than 0.05. However, The PMiD distribution is the best option among other competitive models as it has the greatest p-value, as shown by the findings in Table 17.

Table 17 Parameter estimates and goodness-of fit statistics for data set VII.

Full size table

Data set VIII

These data are the yields from 70 consecutive runs of a batch chemical process (see²⁵). The data are 23 23 25 34 35 35 36 37 38 38 38 39 40 40 41 41 43 44 44 45 45 45 45 45 48 48 49 50 50 50 50 50 50 51 51 51 51 51 52 53 54 54 54 54 55 55 55 55 56 57 58 58 59 59 59 59 60 60 62 64 64 64 68 71 71 71 74 74 80.

The empirical and estimated cdf for the this data is shown in Fig. 11. The estimated parameters and goodness-of-fit measurements are also included in Table 18.

This Table shows that, the DEE, DEW, DEB, DEL, DEFr, DEG, DEETE, and DEOWE distributions are Fitted and perform quite well for evaluating this data with p-values larger than 0.05. The DEOWE distribution is selected as the best model for these data because it has the smallest -ll, AIC, BIC, and $\chi 2$ values and the highest p-value.

Table 18 Parameter estimates and goodness-of fit statistics for data set VIII.

Full size table

Summary and conclusion

In this research, a new way for generating distributions is developed with high degree of flexibility that would be very useful in modeling real data in various fields. The new family of distributions is called the T-X$^{\theta }$ family suggested with extra shape parameters $\theta$ for bounded and positive unbounded random variables. Several specific sub-families and sub-models of the proposed family are presented including the Exponential-X$^{\theta }$ Exponential distribution which was selected and studied in details. The parameters of this distribution were estimated using the MLE method. A simulation study was also conducted to investigate the efficiency as well as behavior of estimates. The discretized T-X$^{\theta }$ family of distributions has been proposed and some discrete models of the family were defined. As an example, the discrete Exponential Exponential, DEE, distribution, a three-parameter discrete distribution derived, various alternative graphs of the DEE’s pmf and hazard functions were shown. Eight different actual data sets were used to demonstrate the effectiveness of some members of the suggested continuous and discrete members of the family vs some other distributions. In general, the results indicate that the proposed distributions are highly flexible, provide accurate results and can fit various types of data. The new models achieved a closer match for all data sets.

Data availability

The datasets analyzed in this study were a re-analysis of existing data, which are are included in Refs.^24,25,26 and available at locations cited in the reference section.

References

Alzaatreh, A., Lee, C. & Famoye, F. A new method for generating families of continuous distributions. Metron 71, 63–79. https://doi.org/10.1007/s40300-013-0007-y (2013).
Article MathSciNet Google Scholar
Eugene, N., Lee, C. & Famoye, F. Beta normal distribution and its applications. Commun. Stat. Theory Methods 31, 497–512 (2002).
Article MathSciNet Google Scholar
Cordeiro, G. M. & de Castro, M. A new family of generalized distributions. J. Stat. Comput. Simul. 81, 883–898 (2011).
Article MathSciNet Google Scholar
Alzaghal, A., Famoye, F. & Lee, C. Exponentiated tx family of distributions with some applications. Int. J. Stat. Probab. 2, 31 (2013).
Article Google Scholar
Ahmad, Z., Elgarhy, M. & Hamedani, G. A new Weibull-x family of distributions: Properties, characterizations and applications. J. Stat. Distrib. Appl. 5, 1–18 (2018).
Article CAS Google Scholar
Jamal, F. & Nasir, M. A. Some new members of the tx family of distributions. In 17th International Conference on Statistical Sciences, Vol. 33 (2019).
Ahmad, Z., Mahmoudi, E., Dey, S. & Khosa, S. K. Modeling vehicle insurance loss data using a new member of tx family of distributions. J. Stat. Theory Appl. 19, 133–147 (2020).
Article Google Scholar
McDonald, J. B. et al. Some generalized functions for the size distribution of income. Econometrica 52, 647–663 (1984).
Article Google Scholar
David, H. A. & Nagaraja, H. N. Order Statistics (Wiley, 2004).
Google Scholar
Rényi, A. On measures of entropy and information. In Proc. Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, Vol. 4, 547–562 (University of California Press, 1961).
Kemp, A. W. Classes of discrete lifetime distributions. Commun. Stat. Theory Methods 33, 3069–3093 (2004).
Article MathSciNet Google Scholar
Mahmoud, M. R., Mandouh, R. M. & Abdelatty, R. E. Lomax–Gumbel Frechet a new distribution. J. Adv. Math. Comput. Sci. 31, 1–19 (2019).
Article Google Scholar
Mahdy, M., Ahmed, B. & Ahmad, M. Elicitation inverse Rayleigh distribution and its properties. J. ISOSS 5, 30–49 (2019).
Google Scholar
Hassan, A. S., Al-Omar, A. I., Ismail, D. M. & Al-Anzi, A. A new generalization of the inverse Lomax distribution with statistical properties and applications. Int. J. Adv. Appl. Sci. 8, 89–97 (2021).
Article Google Scholar
Alshanbari, H. M., Ijaz, M., Asim, S. M., Hosni El-Bagoury, A.A.-A. & Dar, J. G. New weighted Lomax (NWL) distribution with applications to real and simulated data. Math. Probl. Eng. 2021, 1–12 (2021).
Article Google Scholar
Ghitany, M. E., Atieh, B. & Nadarajah, S. Lindley distribution and its application. Math. Comput. Simul. 78, 493–506 (2008).
Article MathSciNet Google Scholar
Murthy, D. P., Xie, M. & Jiang, R. Weibull Models (Wiley, 2004).
Google Scholar
Hinkley, D. On quick choice of power transformation. J. R. Stat. Soc. Ser. C (Appl. Stat.) 26, 67–69 (1977).
Google Scholar
Alaa, R. & Eledum, H. Discrete extended erlang-truncated Exponential distribution and its applications. Appl. Math. 16, 127–138 (2022).
MathSciNet Google Scholar
Singh, B., Singh, R. P., Nayal, A. S. & Tyagi, A. Discrete inverted Nadarajah–Haghighi distribution: Properties and classical estimation with application to complete and censored data. Stat. Optim. Inf. Comput. 10, 1293–1313 (2022).
Article MathSciNet Google Scholar
Nagy, M. et al. The new novel discrete distribution with application on covid-19 mortality numbers in Kingdom of Saudi Arabia and Latvia. Complexity 2021, 1–20 (2021).
CAS Google Scholar
Eliwa, M. S., Tyagi, A., Almohaimeed, B. & El-Morshedy, M. Modelling coronavirus and larvae pyrausta data: A discrete binomial Exponential II distribution with properties, classical and bayesian estimation. Axioms 11, 646 (2022).
Article Google Scholar
Maya, R., Irshad, M. R., Chesneau, C., Nitin, S. L. & Shibu, D. S. On discrete Poisson–Mirra distribution: Regression, inar (1) process and applications. Axioms 11, 193 (2022).
Article Google Scholar
Afify, A. Z., Elmorshedy, M. & Eliwa, M. A new skewed discrete model: Properties, inference, and applications. Pak. J. Stat. Oper. Res. 17, 799–816 (2021).
Article Google Scholar
Hand, D. J., Daly, F., McConway, K., Lunn, D. & Ostrowski, E. A Handbook of Small Data Sets (CRC Press, 1993).
Book Google Scholar
Hoel, D. G. A representation of mortality data by competing risks. Biometrics 28, 475–488 (1972).
Article CAS PubMed Google Scholar

Download references

Funding

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Author information

These authors contributed equally: Mahmoud R. Mahmoud and Rasha E. Abdelatty.

Authors and Affiliations

Department of Mathematical Statistics, Faculty of Graduate Studies for Statistical Research, Cairo University, Cairo, 12613, Egypt
Rasha M. Mandouh, Mahmoud R. Mahmoud & Rasha E. Abdelatty

Authors

Rasha M. Mandouh
View author publications
You can also search for this author in PubMed Google Scholar
Mahmoud R. Mahmoud
View author publications
You can also search for this author in PubMed Google Scholar
Rasha E. Abdelatty
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors contributed to this work equally. All authors reviewed the manuscript.

Corresponding author

Correspondence to Rasha M. Mandouh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Mandouh, R.M., Mahmoud, M.R. & Abdelatty, R.E. A new (T-X$^\theta$) family of distributions: properties, discretization and estimation with applications. Sci Rep 14, 1613 (2024). https://doi.org/10.1038/s41598-023-49425-2

Download citation

Received: 07 October 2023
Accepted: 07 December 2023
Published: 18 January 2024
DOI: https://doi.org/10.1038/s41598-023-49425-2

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Truncated Weibull–exponential distribution: methods and applications

A novel generalized Weibull Poisson G class of continuous probabilistic distributions with some copulas, properties and applications to real-life datasets

A novel extension of half-logistic distribution with statistical inference, estimation and applications

Introduction

The new family

Special models

The Exponential-X\(^{\theta }\)Exponential

Properties of the E-X\(^{\theta }\)E distribution

The quantile function

pdf expansion

The moments

The skewness and kurtosis based on moments

The moment generating function

Order statistics

Proof

Entropy

Rényi entropy

Proof

Parameter estimation and simulation study

Maximum likelihood (MLE)

Interval estimation

Simulation study

Discrete T-X\(^{\theta }\) family

The discrete exponetial-X\(^{\theta }\), DE-X\(^{\theta }\) family

The DE-Exponential\(^{\varvec{\theta }}\)(DEE) distribution

Applications to real data

Using continuous data

Data set I

Data set II

Data set III

Data set IV

Using discrete data

Data set V

Data set VI

Data set VII

Data set VIII

Summary and conclusion

Data availability

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links