A new unit distribution: properties, estimation, and regression analysis

Karakaya, Kadir; Rajitha, C. S.; Sağlam, Şule; Tashkandy, Yusra A.; Bakr, M. E.; Muse, Abdisalam Hassan; Kumar, Anoop; Hussam, Eslam; Gemeay, Ahmed M.

doi:10.1038/s41598-024-57390-7

Article
Open access
Published: 27 March 2024

Kadir Karakaya¹,
C. S. Rajitha²,
Şule Sağlam¹,
Yusra A. Tashkandy³,
M. E. Bakr³,
Abdisalam Hassan Muse⁴,
Anoop Kumar⁵,
Eslam Hussam⁶ &
…
Ahmed M. Gemeay⁷

Scientific Reports volume 14, Article number: 7214 (2024)

Abstract

This research introduces a unit statistical model, named the power new power function distribution, and presents a thorough analysis of its properties. We investigate the advantages of the new model and derive some of its fundamental distributional properties. The study aims to improve insight and applicability by presenting both quantitative and qualitative perspectives. To estimate the three unknown parameters of the model, we examine several methods: maximum likelihood, least squares, weighted least squares, Anderson–Darling, and Cramér–von Mises. Through a Monte Carlo simulation experiment, we quantitatively evaluate the effectiveness of these estimation methods. A distinctive part of this research is the development of a novel regression analysis based on the proposed distribution. The application of this analysis offers new viewpoints and increases the usefulness of the model in practical situations. As the emphasis of the study is primarily on practical applications, the viability of the proposed model is assessed through the analysis of real datasets sourced from diverse fields.

Introduction

Statistical distributions are fundamental mathematical tools in data modeling, inference, and estimation, and in fields such as public health, actuarial science, biomedical studies, demography, and industrial reliability. When no suitable distribution is available for the data, researchers have often had to select the most appropriate one from the existing families, despite the limitations of the available distribution theory. In many studies, the absence of a proper statistical distribution forces researchers in various fields to develop new distributions to support their conclusions. Applied researchers and practitioners often find modeling complex problems challenging, especially when dealing with the diverse lifetime datasets prevalent in the physical and natural sciences. Exhaustive reviews on this subject can be found in^{1,2}. These references offer comprehensive summaries of statistical distributions derived through various methodologies.

New statistical models built on attractive distributions have long been a favorite in the statistical literature due to the complexity and diversity of modern data. The extended distributions suggested by adding extra parameters provide greater flexibility.

Numerous studies have sought to build probability distributions with more flexible properties that can model real-life data sets of diverse kinds. The need for new distributions arises from theoretical concerns, practical applications, or both. There has been considerable growth in generalizing well-known distributions and applying them as competitors to the classical ones. The exponential distribution, for instance, is well suited to lifetime data for many types of industrial items; its main feature is that it models the performance of objects with a constant failure rate. The primary objective of this paper is to present a new model capable of modeling and fitting distinct forms of data, and to show that it can outperform its competitors. We propose it as a strong and novel candidate for modeling real data sets. When describing a situation with a known model is difficult, generalization can be used to account for extra variation in the data. As the problems encountered in practice grow more complex, further generalizations of probability distributions are needed to capture more complicated data. The proposed model can be used to analyze and fit many real-life data sets, and it can be applied to problems in areas such as medicine, engineering, and industrial reliability analysis.

Moreover, numerous families of probability distributions have been proposed by a compounding technique, following the pioneering work of Adamidis and Loukas³. Compound distributions arise in reliability studies when the lifetime can be expressed as the minimum or maximum of a set of independent and identically distributed (i.i.d.) random variables representing the failure times of the system components. Such compound distributions extend well-known classical distributions and provide flexibility in modeling data. Compounding lifetime distributions with power series (PS) distributions has been proposed by several authors. Examples are the exponential-PS, Weibull-PS, generalized exponential PS, extended Weibull PS, Burr XII PS, Lindley PS, generalized inverse Weibull PS, and complementary exponentiated inverted Weibull PS distributions^{4,5,6,7,8,9,10,11}.

Also, the power function (PF) distribution is a flexible lifetime distribution that may offer a suitable fit to some sets of failure data. Some generalized distributions based on the PF are the beta PF¹², Weibull¹³, Kumaraswamy PF¹⁴, transmuted PF (TPF)¹⁵, exponentiated Kumaraswamy PF¹⁶, exponentiated Weibull PF¹⁷ and odd generalized exponential PF¹⁸ distributions. In addition to the above-mentioned distributions, some nonlinear predictive network epidemic models have been introduced in the literature^{19,20,21,22,23}.

The primary objective of this article is to introduce an advanced model designed for the modeling and fitting of data defined on (0,1). We aim to demonstrate the superiority of this new model by surpassing all existing competitors. We advocate for the proposed distribution as a robust and innovative choice for modeling real datasets. In situations where modeling with a known distribution proves challenging, the utilization of generalization becomes crucial to accommodate additional variations in the data.

This paper aims to develop a three-parameter alternative to several lifetime distributions, including the Kumaraswamy²⁴, unit-Weibull²⁵, unit-Burr XII²⁶, unit-Muth²⁷, and new power function²⁸ distributions. In this context, we develop the statistical properties of the proposed distribution and show that it is a better model for the reliability analysis of data defined on (0,1).

In this paper, a new extended form of the new power function distribution (NPFD) is proposed by applying the power transformation $X=T^{\frac{1}{\sigma }}$ to the cumulative distribution function (CDF) of the NPFD. The proposed distribution is called the power new power function distribution (PNPFD). The PNPFD provides increasing, bathtub, J-shaped, reverse J-shaped, and decreasing shapes. Its density can be left-skewed, unimodal, right-skewed, concave down, or constant. Furthermore, this paper examines the main statistical properties of the PNPFD. The analysis covers the shapes of the density function and hazard rate function, moments, incomplete moments, moment generating function (MGF), order statistics, stochastic ordering, and parameter estimation through the maximum likelihood method. To underscore the practical utility of the model, applications to real datasets are provided, demonstrating the distribution’s applicability and usefulness.

A classical regression model investigates the relationship between one or more independent variables and a dependent variable. Classical regression models relate the mean response to given values of the independent variables. When the dependent variable contains outliers, classical regression models can be inadequate. The median handles these scenarios better than the mean since it is a more robust estimate. For such cases, many regression models for unit data have been introduced, such as the beta regression model by²⁹, the Kumaraswamy regression model by³⁰, the unit Weibull regression model by²⁵, the unit Burr-XII regression model by²⁶, the unit Burr-Hatke regression model by³¹, the unit log-log regression model by³², etc. This paper also introduces a new quantile regression model, based on the proposed distribution, as an alternative to the current ones.

In this paper, we propose a novel probability distribution tailored for data defined on the interval (0,1). This study contributes to the field of statistics by thoroughly examining its statistical and reliability features. By discussing moments, stochastic ordering, the reliability function, the hazard rate function, order statistics, and the quantile function, we provide a comprehensive account of the PNPFD’s properties. Furthermore, we establish a framework for comparing the efficacy of the PNPFD against selected distributions like the Kumaraswamy and beta distributions. This comparative analysis sets the stage for evaluating the PNPFD’s performance in various statistical applications. Through several parameter estimation techniques and Monte Carlo simulations, we assess the precision and reliability of the PNPFD in handling real-world data. Additionally, introducing a novel regression analysis technique based on the PNPFD expands the scope of statistical modeling, particularly in scenarios where the dependent variable is a proportion. Overall, this study presents a new distribution model and highlights its potential to enhance statistical analyses across diverse domains.

The rest of the paper is organized as follows: Sect. 2 (Model formulation) introduces the probability density function (PDF) and hazard rate function (HRF) of the PNPFD. Its statistical properties, such as the moment generating function (MGF), moments, mean residual life (MRL), order statistics, stochastic ordering, and quantile function, are investigated in Sect. 3 (Statistical properties). The estimation of the parameters is discussed in Sect. 4 (Estimation methods). The sample behavior of the PNPFD estimators, assessed with simulated data sets, is detailed in Sect. 5 (Numerical simulation). In Sect. 6 (Regression analysis), a novel quantile regression is presented based on the PNPFD. In Sect. 7 (Real data analysis), real data sets are analyzed using the proposed distribution. Finally, the study is concluded in Sect. 8.

Model formulation

In 2021, Iqbal et al.²⁸ derived a new statistical model called the new power function distribution (NPFD), with CDF defined as follows

$$\begin{aligned} G(t)=1-\left( \frac{1-t}{\delta t+1}\right) ^{\eta },\quad ~0<t<1,~\eta >0,~-1<\delta <\infty . \end{aligned}$$

(1)

Its PDF is defined as follows

$$\begin{aligned} g(t)=(\delta +1) \eta (1-t)^{\eta -1} (\delta t+1)^{-\eta -1}. \end{aligned}$$

(2)

The power transformation $X=T^{\frac{1}{\sigma }}$ is applied to the CDF (1) to obtain the power new power function distribution (PNPFD), with CDF defined as follows

$$\begin{aligned} F(x)=1-\left( \frac{1-x^{\sigma }}{\delta x^{\sigma }+1}\right) ^{\eta },~0<x<1,~\eta ,~\sigma >0,~-1<\delta <\infty . \end{aligned}$$

(3)

The corresponding PDF of the PNPFD is

$$\begin{aligned} f(x)=\frac{(\delta +1) \eta \sigma x^{\sigma -1} \left( \frac{1-x^{\sigma }}{\delta x^{\sigma }+1}\right) ^{\eta }}{\left( 1-x^{\sigma }\right) \left( \delta x^{\sigma }+1\right) }. \end{aligned}$$

(4)
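As a quick numerical sanity check of Eqs. (3) and (4), the following minimal Python sketch evaluates the CDF and PDF and compares the PDF with a finite-difference derivative of the CDF. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
def pnpfd_cdf(x, sigma, eta, delta):
    """CDF of Eq. (3), for 0 < x < 1, sigma > 0, eta > 0, delta > -1."""
    u = x ** sigma
    return 1.0 - ((1.0 - u) / (delta * u + 1.0)) ** eta


def pnpfd_pdf(x, sigma, eta, delta):
    """PDF of Eq. (4), written exactly as the ratio form in the text."""
    u = x ** sigma
    ratio = (1.0 - u) / (delta * u + 1.0)
    numerator = (delta + 1.0) * eta * sigma * x ** (sigma - 1.0) * ratio ** eta
    return numerator / ((1.0 - u) * (delta * u + 1.0))


def central_difference(func, x, h=1e-6):
    """Two-sided finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


# Illustrative (assumed) parameter values and evaluation point.
sigma, eta, delta = 2.0, 1.5, -0.5
x0 = 0.4
pdf_value = pnpfd_pdf(x0, sigma, eta, delta)
slope = central_difference(lambda t: pnpfd_cdf(t, sigma, eta, delta), x0)
print(pdf_value, slope)  # the two numbers should agree closely
```

Agreement between the two printed values indicates that Eq. (4) is the derivative of Eq. (3) at the chosen point.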

Figure 1 shows the graphical representation of the PDF of the PNPFD for different combinations of the parameter values $\delta$, $\eta$, and $\sigma$. Figure 1a–d show that the density can be unimodal, first increasing and then decreasing, for some parameter combinations. Figure 1b shows a density that is initially nearly constant and then increases rapidly as x increases (J-shaped), and Fig. 1a shows that it can be skewed to the left. Figure 1c,d show that the PDF of the PNPFD can be symmetric.

Statistical properties

Mixture representation

The expansion of the PDF of the PNPFD proves valuable in deriving its properties. To facilitate this, we employ the following two lemmas:

Lemma 1

If $\lambda$ is a positive real non-integer and $\mid y \mid \le 1$, then from Gradshteyn et al.³³, Equation (1.110), we get the binomial series expansion

$$\begin{aligned} (1-y)^{\lambda -1} = \sum _{i=0}^{\infty }(-1)^i {\left( {\begin{array}{c}\lambda -1\\ i\end{array}}\right) }y^i. \end{aligned}$$

Lemma 2

If a is a positive real non-integer and $\mid y^b \mid > 1$, then

$$\begin{aligned} (1+y^b)^{-a} = \sum _{k=0}^{\infty } (-1)^k {\left( {\begin{array}{c}a+k-1\\ k\end{array}}\right) }y^{-b(k+a)}, \end{aligned}$$

and if a is a positive real non-integer and $\mid y^b \mid < 1$, then

$$\begin{aligned} (1+y^b)^{-a} = \sum _{k=0}^{\infty } (-1)^k {\left( {\begin{array}{c}a+k-1\\ k\end{array}}\right) }y^{bk}. \end{aligned}$$

Using Lemmas 1 and 2, the expansion of PDF of the PNPFD can be derived as follows.

Case I: $0<\delta x^{\sigma }<1$, we have

$$\begin{aligned} f(x)=(\delta +1) \eta \sigma \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\delta ^k x^{\sigma (j+k+1) -1}. \end{aligned}$$

(5)

Case II: $\delta x^{\sigma }>1$, we have

$$\begin{aligned} f(x)=(\delta +1)\eta \sigma \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\delta ^{-(k+\eta +1)}x^{\sigma (j-k-\eta )-1}. \end{aligned}$$

(6)
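The Case I expansion can be checked numerically by truncating the double series and comparing it with the closed-form density $f(x)=(\delta +1)\eta \sigma x^{\sigma -1}(1-x^{\sigma })^{\eta -1}(1+\delta x^{\sigma })^{-(\eta +1)}$, which is Eq. (4) with the ratio written out. The sketch below is a hedged illustration: the helper names, the truncation level, and the parameter values are assumptions, and the factor $\delta ^k$ is kept inside the sum over k because it depends on the summation index.

```python
def gen_binom(a, j):
    """Generalized binomial coefficient C(a, j) for real a and integer j >= 0."""
    out = 1.0
    for i in range(j):
        out *= (a - i) / (i + 1)
    return out


def pdf_closed(x, sigma, eta, delta):
    """Closed-form PNPFD density, algebraically equal to Eq. (4)."""
    u = x ** sigma
    return ((delta + 1.0) * eta * sigma * x ** (sigma - 1.0)
            * (1.0 - u) ** (eta - 1.0) * (1.0 + delta * u) ** (-(eta + 1.0)))


def pdf_series_case1(x, sigma, eta, delta, terms=60):
    """Truncated double series of Case I (requires |delta * x**sigma| < 1)."""
    a = [(-1) ** j * gen_binom(eta - 1.0, j) for j in range(terms)]
    b = [(-1) ** k * gen_binom(eta + k, k) * delta ** k for k in range(terms)]
    total = sum(a[j] * b[k] * x ** (sigma * (j + k + 1) - 1.0)
                for j in range(terms) for k in range(terms))
    return (delta + 1.0) * eta * sigma * total


# Illustrative (assumed) values with 0 < delta * x**sigma < 1.
x0, sigma, eta, delta = 0.5, 2.0, 1.5, 0.5
closed_value = pdf_closed(x0, sigma, eta, delta)
series_value = pdf_series_case1(x0, sigma, eta, delta)
print(closed_value, series_value)
```

The truncation level of 60 terms is ample here because both $x^{\sigma }$ and $\delta x^{\sigma }$ are well below 1; values closer to 1 would need more terms.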

Reliability characteristics of the PNPFD

The reliability function (rf) of the PNPFD is given by

$$\begin{aligned} R(x)=\left[ \frac{(1-x^{\sigma })}{(\delta x^{\sigma }+1)}\right] ^\eta . \end{aligned}$$

(7)

The HRF of the PNPFD is given by

$$\begin{aligned} H(x)=\frac{(\delta +1) \eta \sigma x^{\sigma -1} }{(1-x^{\sigma })(\delta x^{\sigma }+1)}. \end{aligned}$$

(8)

Figure 2 gives examples of the shapes of the hazard function of our proposed model for different values of $\delta$, $\eta$, and $\sigma$. Figure 2a,c show that the hazard rate function of the PNPFD can be increasing. Figure 2b shows that it can be decreasing, and Fig. 2d shows that it can be bathtub-shaped, depending on the values of its parameters.

The reverse hazard rate function (rhrf) of the PNPFD is given by

$$\begin{aligned} W(x)=\frac{\eta \sigma (\delta +1) x^{\sigma -1}(1-x^{\sigma })^{\eta -1} }{(\delta x^{\sigma }+1)\left[ (\delta x^{\sigma }+1)^\eta -(1-x^{\sigma })^\eta \right] }. \end{aligned}$$

(9)
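Equations (8) and (9) can be cross-checked against their definitions, $H(x)=f(x)/R(x)$ and $W(x)=f(x)/F(x)$. The following Python sketch does this on a small grid; the function names, grid, and parameter values are illustrative assumptions.

```python
def pnpfd_cdf(x, sigma, eta, delta):
    u = x ** sigma
    return 1.0 - ((1.0 - u) / (delta * u + 1.0)) ** eta


def pnpfd_pdf(x, sigma, eta, delta):
    u = x ** sigma
    return ((delta + 1.0) * eta * sigma * x ** (sigma - 1.0)
            * (1.0 - u) ** (eta - 1.0) * (delta * u + 1.0) ** (-(eta + 1.0)))


def pnpfd_hazard(x, sigma, eta, delta):
    """Hazard rate function, Eq. (8)."""
    u = x ** sigma
    return ((delta + 1.0) * eta * sigma * x ** (sigma - 1.0)
            / ((1.0 - u) * (delta * u + 1.0)))


def pnpfd_reverse_hazard(x, sigma, eta, delta):
    """Reverse hazard rate function, Eq. (9)."""
    u = x ** sigma
    num = eta * sigma * (delta + 1.0) * x ** (sigma - 1.0) * (1.0 - u) ** (eta - 1.0)
    den = (delta * u + 1.0) * ((delta * u + 1.0) ** eta - (1.0 - u) ** eta)
    return num / den


# Illustrative (assumed) parameter values and grid.
params = (2.0, 1.5, -0.5)
grid = [0.1 * i for i in range(1, 10)]
max_rel_err = 0.0
for x in grid:
    f = pnpfd_pdf(x, *params)
    F = pnpfd_cdf(x, *params)
    h_def, w_def = f / (1.0 - F), f / F
    max_rel_err = max(max_rel_err,
                      abs(pnpfd_hazard(x, *params) - h_def) / h_def,
                      abs(pnpfd_reverse_hazard(x, *params) - w_def) / w_def)
print(max_rel_err)  # should be at the level of floating-point rounding
```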

Moments

The $r^{th}$ moment $E(X^r)$ of PNPFD is given by

Case I: $0<\delta x^{\sigma }<1$

$$\begin{aligned} E(X^r) =(\delta +1) \eta \sigma \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\frac{\delta ^k}{\sigma (j+k+1)+r}. \end{aligned}$$

(10)

Case II: $\delta x^{\sigma }>1$,

$$\begin{aligned} E(X^r) =(\delta +1)\eta \sigma \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\frac{\delta ^{-(k+\eta +1)}}{\sigma (j-k-\eta )+r}. \end{aligned}$$

(11)

The first four moments of the PNPFD are obtained by substituting $r=1,2,3,4$ in Eqs. (10) and (11).
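As an alternative to the series, the raw moments can be approximated by direct numerical integration of $x^r f(x)$ over (0, 1). The sketch below uses a plain midpoint rule; this is a swapped-in numerical technique for illustration, not the authors' method, and the function names, grid size, and parameter values are assumptions. The midpoint rule converges slowly when the density is unbounded at an endpoint (for example when $\sigma <1$ or $\eta <1$).

```python
def pnpfd_pdf(x, sigma, eta, delta):
    """PNPFD density, algebraically equal to Eq. (4)."""
    u = x ** sigma
    return ((delta + 1.0) * eta * sigma * x ** (sigma - 1.0)
            * (1.0 - u) ** (eta - 1.0) * (delta * u + 1.0) ** (-(eta + 1.0)))


def raw_moment(r, sigma, eta, delta, n=20000):
    """Midpoint-rule approximation of E(X^r) = int_0^1 x^r f(x) dx."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** r * pnpfd_pdf(x, sigma, eta, delta)
    return total * h


# Illustrative (assumed) parameter values.
sigma, eta, delta = 2.0, 1.5, -0.5
moments = [raw_moment(r, sigma, eta, delta) for r in (1, 2, 3, 4)]
variance = moments[1] - moments[0] ** 2
print(moments, variance)
```

A simple check of the routine is the special case $\sigma =\eta =1$, $\delta =0$, where Eq. (3) reduces to $F(x)=x$ (the uniform distribution) and $E(X^r)=1/(r+1)$.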

Moment generating function

The MGF of a PNPFD random variable is given as follows.

Case I: $0<\delta x^{\sigma }<1$

$$\begin{aligned} M_X(t) = \sum _{r=0}^{\infty }\frac{t^r}{r!}\mu _r' = (\delta +1) \eta \sigma \sum _{r=0}^{\infty } \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } \frac{t^r}{r!} (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\frac{\delta ^k}{\sigma (j+k+1)+r}. \end{aligned}$$

Case II: $\delta x^{\sigma }>1$

$$\begin{aligned} M_X(t) = \sum _{r=0}^{\infty }\frac{t^r}{r!}\mu _r' = (\delta +1)\eta \sigma \sum _{r=0}^{\infty } \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } \frac{t^r}{r!} (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\frac{\delta ^{-(k+\eta +1)}}{\sigma (j-k-\eta )+r}. \end{aligned}$$

Incomplete moment

The incomplete $r^{th}$ moment is defined by

$$\begin{aligned} m_r(x) = \int _{0}^{x} t^r f(t) dt. \end{aligned}$$

For the PNPFD, incomplete $r^{th}$ moment is obtained by;

Case I: $0<\delta x^{\sigma }<1$

$$\begin{aligned} \begin{aligned} m_r(x) = (\delta +1) \eta \sigma \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\frac{\delta ^k x^{\sigma (j+k+1)+r}}{\sigma (j+k+1)+r}. \end{aligned} \end{aligned}$$

(12)

Case II: $\delta x^{\sigma }>1$

$$\begin{aligned} \begin{aligned} m_r(x) = (\delta +1) \eta \sigma \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } (-1)^{j+k}{\left( {\begin{array}{c}\eta -1\\ j\end{array}}\right) } {\left( {\begin{array}{c}\eta +k\\ k\end{array}}\right) }\frac{\delta ^{-(k+\eta +1)} x^{\sigma (j-k-\eta )+r}}{\sigma (j-k-\eta )+r}. \end{aligned} \end{aligned}$$

(13)

Mean residual life function

The mean residual life (MRL) function is significant in reliability and survival analysis. It describes how much longer a system will operate, given that it has survived up to time x. For the PNPFD, the MRL is obtained as

$$\begin{aligned} \phi (x) =\frac{R(x+t)}{R(x)} =\Bigg [\frac{(\delta x^{\sigma }+1)(1-(x+t)^{\sigma })}{(\delta (x+t)^{\sigma } +1)(1-x^{\sigma })}\Bigg ]^\eta . \end{aligned}$$

(14)

PDF and CDF of order statistics

The order statistics of a distribution are derived by arranging the sample values in ascending order. The PDF of the $r^{th}$ order statistic is expressed as:

$$\begin{aligned} f_{r:n}(x) = C_{r:n} [F(x)]^{r-1}[1-F(x)]^{n-r}f(x). \end{aligned}$$

where $C_{r:n}= \frac{n!}{(r-1)! (n-r)!}$.

Using Eqs. (3) and (4), the PDF of the $r^{th}$ order statistic of the PNPFD is given as

$$\begin{aligned} \begin{aligned} f_{r:n}(x) =C_{r:n} (\delta +1) \eta \sigma x^{\sigma -1} \Bigg [1-\left( \frac{1-x^{\sigma }}{\delta x^{\sigma }+1}\right) ^{\eta }\Bigg ]^{r-1 }(1-x^{\sigma })^{\eta (n-r+1)-1}(\delta x^{\sigma }+1)^{-\eta (n-r+1)-1}. \end{aligned} \end{aligned}$$

(15)
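The general formula $f_{r:n}(x)=C_{r:n}[F(x)]^{r-1}[1-F(x)]^{n-r}f(x)$ is straightforward to evaluate numerically. The sketch below builds the order-statistic density from the CDF and PDF and checks that it integrates to approximately 1 with a midpoint rule; the function names, the choice $r=2$, $n=5$, and the parameter values are illustrative assumptions.

```python
import math


def pnpfd_cdf(x, sigma, eta, delta):
    u = x ** sigma
    return 1.0 - ((1.0 - u) / (delta * u + 1.0)) ** eta


def pnpfd_pdf(x, sigma, eta, delta):
    u = x ** sigma
    return ((delta + 1.0) * eta * sigma * x ** (sigma - 1.0)
            * (1.0 - u) ** (eta - 1.0) * (delta * u + 1.0) ** (-(eta + 1.0)))


def order_stat_pdf(x, r, n, sigma, eta, delta):
    """Density of the r-th order statistic in a sample of size n."""
    c = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    F = pnpfd_cdf(x, sigma, eta, delta)
    return c * F ** (r - 1) * (1.0 - F) ** (n - r) * pnpfd_pdf(x, sigma, eta, delta)


def midpoint_integral(func, m=20000):
    """Midpoint-rule approximation of the integral of func over (0, 1)."""
    h = 1.0 / m
    return h * sum(func((i + 0.5) * h) for i in range(m))


# Illustrative (assumed) values.
sigma, eta, delta = 2.0, 1.5, -0.5
total_mass = midpoint_integral(lambda x: order_stat_pdf(x, 2, 5, sigma, eta, delta))
print(total_mass)  # should be close to 1
```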

Moreover, the CDF of the $r^{th}$ order statistic is given as

$$\begin{aligned} F_{r:n}(x) = \sum _{m=r}^{n}\left( {\begin{array}{c}n\\ m\end{array}}\right) [F(x)]^{m}[1-F(x)]^{n-m}. \end{aligned}$$

Using Eq. (3), the CDF of the $r^{th}$ order statistic of the PNPFD is given as

$$\begin{aligned} F_{r:n}(x) = \sum _{m=r}^{n}\left( {\begin{array}{c}n\\ m\end{array}}\right) \Bigg [1-\left( \frac{1-x^{\sigma }}{\delta x^{\sigma }+1}\right) ^{\eta }\Bigg ]^{m}\Bigg [\frac{1-x^{\sigma }}{\delta x^{\sigma }+1}\Bigg ]^{\eta (n-m)}. \end{aligned}$$

Stochastic ordering

For a random variable X to be smaller than a random variable Y, certain conditions must be satisfied:

(i) Hazard rate order: $X \le _{hr} Y$ if $h_X(x) \ge h_Y(x)$ for all x.
(ii) Stochastic order: $X \le _{st} Y$ if $F_X(x) \ge F_Y(x)$ for all x.
(iii) Mean residual life order: $X \le _{mrl} Y$ if $M_X(x) \le M_Y(x)$ for all x.
(iv) Likelihood ratio order: $X \le _{lr} Y$ if $\frac{f_X(x)}{f_Y(x)}$ is decreasing in x.

Theorem 1

Let $X \sim PNPFD(\sigma _1,\eta _1,{\delta }_{1})$ and $Y \sim PNPFD(\sigma _2,\eta _2,{\delta }_{2})$. If $\sigma _1 \le \sigma _2$, $\eta _1 \le \eta _2$ and $\delta _1 \le {\delta }_2$, then $X \le _{lr} Y$, and consequently $X \le _{hr} Y$, $X \le _{mrl} Y$ and $X \le _{st} Y$.

Proof

To prove $\frac{f_X(x)}{f_Y(x)}$ decreasing in x we have to show that the derivative of $\frac{f_X(x)}{f_Y(x)}$ is less than 0.

$$\begin{aligned} \begin{aligned} \frac{f_X(x)}{f_Y(x)} =\frac{\frac{(\delta _1 +1) \eta _1 \sigma _1 x^{\sigma _1 -1} \left( \frac{1-x^{\sigma _1 }}{\delta _1 x^{\sigma _1 }+1}\right) ^{\eta _1 }}{\left( 1-x^{\sigma _1 }\right) \left( \delta _1 x^{\sigma _1 }+1\right) }}{\frac{(\delta _2 +1) \eta _2 \sigma _2 x^{\sigma _2 -1} \left( \frac{1-x^{\sigma _2}}{\delta _2 x^{\sigma _2 }+1}\right) ^{\eta _2}}{\left( 1-x^{\sigma _2}\right) \left( \delta _2 x^{\sigma _2 }+1\right) }}. \end{aligned} \end{aligned}$$

Since the logarithm is an increasing function, to prove that $\frac{f_X(x)}{f_Y(x)}$ is decreasing it suffices to show that the derivative of the logarithm of $\frac{f_X(x)}{f_Y(x)}$ is less than 0.

$$\begin{aligned} \begin{aligned} \frac{d}{dx} \ln \left( \frac{f_X(x)}{f_Y(x)}\right) =\frac{\sigma _1-\sigma _2}{x}-\sigma _1x^{\sigma _1-1} \Bigg [\frac{\eta _1-1}{1-x^{\sigma _1}}+\delta _1\frac{\eta _1+1}{1+\delta _1x^{\sigma _1}}\Bigg ]+\sigma _2x^{\sigma _2-1} \Bigg [\frac{\eta _2-1}{1-x^{\sigma _2}}+\delta _2 \frac{\eta _2+1}{1+\delta _2x^{\sigma _2}}\Bigg ]. \end{aligned} \end{aligned}$$

(16)

which is less than 0 when $\sigma _1 \le \sigma _2,\eta _1 \le \eta _2, \delta _1 \le {\delta }_2$. Hence $Y \ge _{lr} X$, and therefore $Y \ge _{hr} X, Y \ge _{mrl} X$ and $Y \ge _{st} X$ when X and Y follow the PNPFD. $\square$

Quantile function

The quantile function (QF) of the PNPFD is obtained by inverting the CDF (3), that is, by solving $F(x)=p$ for x:

$$\begin{aligned} Q(p)=\left( -\frac{(1-p)^{1/\eta }-1}{\delta (1-p)^{1/\eta }+1}\right) ^{1/\sigma },\quad 0<p<1. \end{aligned}$$

(17)
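Because the quantile function is available in closed form, samples can be generated by inverse-transform sampling: draw $p$ uniformly on (0, 1) and return $Q(p)$. The following Python sketch illustrates Eq. (17) and this sampling scheme; the function names, seed, sample size, and parameter values are assumptions made for illustration.

```python
import random


def pnpfd_cdf(x, sigma, eta, delta):
    u = x ** sigma
    return 1.0 - ((1.0 - u) / (delta * u + 1.0)) ** eta


def pnpfd_quantile(p, sigma, eta, delta):
    """Quantile function of Eq. (17), for 0 < p < 1."""
    w = (1.0 - p) ** (1.0 / eta)
    return ((1.0 - w) / (delta * w + 1.0)) ** (1.0 / sigma)


def pnpfd_sample(n, sigma, eta, delta, rng):
    """Inverse-transform sampling: apply the quantile function to uniforms."""
    return [pnpfd_quantile(rng.random(), sigma, eta, delta) for _ in range(n)]


# Illustrative (assumed) parameter values and seed.
sigma, eta, delta = 2.0, 1.5, -0.5
rng = random.Random(2024)
sample = pnpfd_sample(20000, sigma, eta, delta, rng)

median = pnpfd_quantile(0.5, sigma, eta, delta)
share_below_median = sum(x <= median for x in sample) / len(sample)
print(median, share_below_median)  # the share should be near 0.5
```

The round trip $F(Q(p))=p$ provides a direct check that the inversion is correct.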

Estimation methods

In this section, many estimators like maximum likelihood, least squares, weighted least squares, Anderson–Darling, and Cramér-von Mises are examined to estimate the parameters $\sigma , \eta$ and $\delta$ of PNPFD. Let $X_{1},X_{2},\ldots ,X_{n}$ be a random sample from the $PNPFD\left( \sigma ,\eta ,\delta \right)$ distribution and $x_{1},x_{2},\ldots ,x_{n}$ represents the values of the sample. Let $X_{\left( 1\right) },X_{\left( 2\right) },\ldots ,X_{\left( n\right) }$ represent the order statistics for sample $X_{1},X_{2},\ldots ,X_{n}$ with realization $x_{\left( 1\right) },x_{\left( 2\right) },\ldots ,x_{\left( n\right) }$. The likelihood and log-likelihood functions can be given as

$$\begin{aligned} L\left( \Xi \right) =\left( 1+\delta \right) ^{n}\left( \eta \sigma \right) ^{n}\prod \limits _{i=1}^{n}\frac{x_{i}^{-1+\sigma }\left( \frac{ 1-x_{i}^{\sigma }}{1+x_{i}^{\sigma }\delta }\right) ^{\eta }}{\left( 1-x_{i}^{\sigma }\right) \left( 1+x_{i}^{\sigma }\delta \right) }. \end{aligned}$$

and

$$\begin{aligned} \ell \text { }\left( \Xi \right)= & {} n\log \left( 1+\delta \right) +n\log \left( \eta \right) +n\log \left( \sigma \right) +\left( \sigma -1\right) \sum \limits _{i=1}^{n}\log \left( x_{i}\right) \\{} & {} +\eta \text { }\sum \limits _{i=1}^{n}\log \left( \frac{1-x_{i}^{\sigma }}{ 1+x_{i}^{\sigma }\delta }\right) -\sum \limits _{i=1}^{n}\log \left( 1-x_{i}^{\sigma }\right) -\sum \limits _{i=1}^{n}\log \left( 1+x_{i}^{\sigma }\delta \right) . \end{aligned}$$

where $\Xi =\left( \sigma ,\eta ,\delta \right)$. The maximum likelihood estimate (MLE) of $\Xi$, say ${\widehat{\Xi }}=\left( \widehat{ \sigma },{\widehat{\eta }},{\widehat{\delta }}\right)$, is obtained as follows:

$$\begin{aligned} {\widehat{\Xi }}=\underset{\left( \sigma ,\eta ,\delta \right) \in \left( 0,\infty \right) \times \left( 0,\infty \right) \times \left( -1,\infty \right) }{\arg \max }\ \ell \left( \Xi \right) . \end{aligned}$$

Let us consider the following four functions to obtain the other estimators:

$$\begin{aligned} LS\left( \Xi \right)= & {} \sum \limits _{i=1}^{n}\left( \left( 1-\left( \frac{ 1-x_{\left( i\right) }^{\sigma }}{1+x_{\left( i\right) }^{\sigma }\delta } \right) ^{\eta }\right) -\frac{i}{n+1}\right) ^{2} . \end{aligned}$$

(18)

$$\begin{aligned} WLS\left( \Xi \right)= & {} \sum \limits _{i=1}^{n}\frac{\left( n+2\right) \left( n+1\right) ^{2}}{i\left( n-i+1\right) }\left( \left( 1-\left( \frac{ 1-x_{\left( i\right) }^{\sigma }}{1+x_{\left( i\right) }^{\sigma }\delta } \right) ^{\eta }\right) -\frac{i}{n+1}\right) ^{2} . \end{aligned}$$

(19)

$$\begin{aligned} AD\left( \Xi \right) = -n-\frac{1}{n}\sum \limits _{i=1}^{n}\left( 2i-1\right) \left[ \log \left\{ 1-\left( \frac{1-x_{\left( i\right) }^{\sigma }}{1+x_{\left( i\right) }^{\sigma }\delta }\right) ^{\eta }\right\} +\log \left\{ \left( \frac{1-x_{\left( n-i+1\right) }^{\sigma }}{ 1+x_{\left( n-i+1\right) }^{\sigma }\delta }\right) ^{\eta }\right\} \right] . \end{aligned}$$

(20)

and

$$\begin{aligned} CvM\left( \Xi \right) =\frac{1}{12n}+\sum \limits _{i=1}^{n}\left[ \left( 1-\left( \frac{1-x_{\left( i\right) }^{\sigma }}{1+x_{\left( i\right) }^{\sigma }\delta }\right) ^{\eta }\right) -\frac{2i-1}{2n}\right] ^{2}. \end{aligned}$$

(21)

The least squares estimates (LSEs), weighted least squares estimates (WLSEs), Anderson–Darling estimates (ADEs) and Cramér–von Mises estimates (CvMEs) are obtained by minimizing Eqs. (18)–(21), respectively.
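The five objective functions can be written compactly in code. The sketch below is a minimal Python illustration under stated assumptions: the function names and the small data vector are hypothetical, and the Anderson–Darling objective is written in its standard form, which pairs $x_{(i)}$ with $x_{(n-i+1)}$. Only the objectives are shown; the minimization itself can be handed to any general-purpose optimizer (the paper uses BFGS through R's optim).

```python
import math


def pnpfd_cdf(x, sigma, eta, delta):
    u = x ** sigma
    return 1.0 - ((1.0 - u) / (1.0 + delta * u)) ** eta


def pnpfd_log_pdf(x, sigma, eta, delta):
    u = x ** sigma
    return (math.log(delta + 1.0) + math.log(eta) + math.log(sigma)
            + (sigma - 1.0) * math.log(x)
            + (eta - 1.0) * math.log(1.0 - u)
            - (eta + 1.0) * math.log(1.0 + delta * u))


def neg_log_likelihood(params, xs):
    return -sum(pnpfd_log_pdf(x, *params) for x in xs)


def _sorted_cdf(params, xs):
    return [pnpfd_cdf(x, *params) for x in sorted(xs)]


def ls_objective(params, xs):  # Eq. (18)
    n, F = len(xs), _sorted_cdf(params, xs)
    return sum((F[i] - (i + 1) / (n + 1)) ** 2 for i in range(n))


def wls_objective(params, xs):  # Eq. (19)
    n, F = len(xs), _sorted_cdf(params, xs)
    return sum((n + 2) * (n + 1) ** 2 / ((i + 1) * (n - i))
               * (F[i] - (i + 1) / (n + 1)) ** 2 for i in range(n))


def ad_objective(params, xs):  # Eq. (20), standard Anderson-Darling form
    n, F = len(xs), _sorted_cdf(params, xs)
    s = sum((2 * i + 1) * (math.log(F[i]) + math.log(1.0 - F[n - 1 - i]))
            for i in range(n))
    return -n - s / n


def cvm_objective(params, xs):  # Eq. (21)
    n, F = len(xs), _sorted_cdf(params, xs)
    return 1.0 / (12 * n) + sum((F[i] - (2 * i + 1) / (2 * n)) ** 2
                                for i in range(n))


# Hypothetical data on (0, 1) and assumed parameter values (sigma, eta, delta).
data = [0.12, 0.25, 0.31, 0.48, 0.55, 0.63, 0.77, 0.89]
theta = (2.0, 1.5, -0.5)
for objective in (neg_log_likelihood, ls_objective, wls_objective,
                  ad_objective, cvm_objective):
    print(objective.__name__, objective(theta, data))
```

In the uniform special case $\sigma =\eta =1$, $\delta =0$, the log-density is identically zero and the least squares objective vanishes at the points $i/(n+1)$, which gives a convenient check of the implementation.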

Numerical simulation

In this section, the bias and mean squared errors (MSEs) of MLEs, LSEs, WLSEs, ADEs, and CvMEs for parameters of the PNPFD are obtained via 5000 runs. For generating samples for the PNPFD in the simulation experiment, the quantile function provided in Eq. (17) is used. Furthermore, optimization procedures for obtaining estimations from the generated samples are performed using the BFGS method in the optim function in R. Six different scenarios are evaluated for parameter settings. These are $\Xi _{1}=\left( 0.5,1.5,-0.5\right)$, $\Xi _{2}=\left( 2,1.5,-0.5\right) ,$ $\Xi _{3}=\left( 1.5,0.5,2\right)$, $\Xi _{4}=\left( 3,1.5,2\right) ,$ $\Xi _{5}=\left( 0.5,2.5,-0.7\right)$ and $\Xi _{6}=\left( 2.5,0.7,1.5\right)$. The simulation results are given in Tables 1 and 2. Tables 1 and 2 show that the bias and MSEs decrease as the sample size increases for all estimators. According to the bias criterion, the best estimator for the parameters of $\sigma$ and $\eta$ is usually ADEs, while the best estimator for the $\delta$ parameter is MLEs. When scenarios are analyzed in detail, the following interpretations can be made for the MSEs criterion:

In scenario $\Xi _{1}$, the MLEs for $\sigma$ and ADEs for both $\eta$ and $\delta$ are the best estimators.
In scenario $\Xi _{2}$, the LSEs for $\sigma$ and CvMEs for both $\eta$ and $\delta$ are the best estimators.
In scenarios $\Xi _{3}$ and $\Xi _{6}$, the WLSEs are the best estimators for three parameters.
In scenarios $\Xi _{4}$ and $\Xi _{5}$, the MLEs for $\sigma$ and ADEs for both $\eta$ and $\delta$ are the best estimators.

It is observed that the decreasing trend in bias and MSEs for all estimators is achieved as expected with the increase in sample size.

Table 1 The bias of all estimators for PNPFD.

Table 2 The MSEs of all estimators for PNPFD.

Regression analysis

In this section, a novel regression model is presented and serves as an alternative to the Kumaraswamy and beta regression models. The quantile function in Eq. (17) is used to obtain this new regression model. Re-parameterizing the PDF and CDF of the PNPFD can be achieved by utilizing the quantile function. Let $Q\left( p;\sigma ,\eta ,\delta \right) =\mu$ and then

$$\begin{aligned} \sigma =\frac{\log \left( \frac{1-\left( 1-p\right) ^{1/\eta }}{1+\delta \left( 1-p\right) ^{1/\eta }}\right) }{\log \left( \mu \right) } \end{aligned}$$

(22)

is acquired. The CDF and PDF of the re-parametrized distribution are obtained, respectively, by

$$\begin{aligned} F\left( y,\eta ,\delta ,\mu \right) =1-\left( \frac{1-y^{\sigma ^{*}}}{ \delta y^{\sigma ^{*}}+1}\right) ^{\eta }. \end{aligned}$$

(23)

and

$$\begin{aligned} f\left( y,\eta ,\delta ,\mu \right) =\frac{(\delta +1)\eta \sigma ^{*}y^{\sigma ^{*}-1}\left( \frac{1-y^{\sigma ^{*}}}{\delta y^{\sigma ^{*}}+1}\right) ^{\eta }}{\left( 1-y^{\sigma ^{*}}\right) \left( \delta y^{\sigma ^{*}}+1\right) }. \end{aligned}$$

(24)

where

$$\begin{aligned} \sigma ^{*}=\frac{\log \left( \frac{1-\left( 1-p\right) ^{1/\eta }}{ 1+\delta \left( 1-p\right) ^{1/\eta }}\right) }{\log \left( \mu \right) }, \end{aligned}$$

Here, the parameters $\eta > 0$ and $\delta > -1$ characterize the PNPFD, while $\mu \in (0, 1)$ denotes the quantile regression parameter. The value of p is fixed in the range (0, 1), for example 0.25, 0.5, or 0.75. A random variable Y with this re-parameterized distribution is denoted by $Y \sim QPNPF\left( \eta ,\delta ,\mu ,p\right)$.
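The point of the re-parameterization is that $\mu$ becomes the p-th quantile of the distribution, so that $F(\mu ;\eta ,\delta ,\mu )=p$. The Python sketch below computes $\sigma ^{*}$ and verifies this property numerically; the function names and the numerical values are illustrative assumptions.

```python
import math


def sigma_star(mu, eta, delta, p):
    """Shape parameter implied by fixing the p-th quantile at mu, Eq. (22)."""
    w = (1.0 - p) ** (1.0 / eta)
    return math.log((1.0 - w) / (1.0 + delta * w)) / math.log(mu)


def qpnpf_cdf(y, eta, delta, mu, p):
    """Re-parameterized CDF of Eq. (23)."""
    s = sigma_star(mu, eta, delta, p)
    u = y ** s
    return 1.0 - ((1.0 - u) / (delta * u + 1.0)) ** eta


# Illustrative (assumed) values: mu is the target quantile location.
eta, delta, mu = 1.5, -0.5, 0.3
for p in (0.25, 0.5, 0.75):
    print(p, sigma_star(mu, eta, delta, p), qpnpf_cdf(mu, eta, delta, mu, p))
```

For each p, the last printed number should reproduce p itself, confirming that $\mu$ is the p-th quantile.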

Once the QPNPF has been defined, the new regression model based on the PDF of the QPNPF in Eq. (24) can be presented. Let $y_{1},y_{2},\ldots ,y_{n}$ be such that $y_{i}$ is a realization of $Y_{i} \sim QPNPF\left( \eta ,\delta ,\mu _{i},p\right)$ for $i=1,2,\ldots ,n$, where $\eta ,\delta$ and $\mu _{i}$ are unknown parameters and p is known. The proposed quantile regression model is as follows:

$$\begin{aligned} g\left( \mu _{i}\right) ={\textbf{x}}_{i}\mathbf {\beta }^{\texttt{T}}, \end{aligned}$$

(25)

where $\mathbf {\beta }\mathbf {=}\left( \beta _{0},\beta _{1},\ldots ,\beta _{p}\right)$ is the unknown regression parameter vector, ${\textbf{x}} _{i}=\left( 1,x_{i1},x_{i2},\ldots ,x_{ip}\right)$ is the known ith vector of covariates, and g is a link function (in the subscripts, p denotes the number of covariates, not the quantile level). We use the following logit-link function because the QPNPF is defined within the interval (0, 1):

$$\begin{aligned} g\left( \mu _{i}\right) =\log \left( \frac{\mu _{i}}{1-\mu _{i}}\right) ,i=1,2,\ldots ,n. \end{aligned}$$

(26)

Inverting Eq. (26) gives

$$\begin{aligned} \mu _{i}=\frac{\exp \left( {\textbf{x}}_{i}\mathbf {\beta }^{\texttt{T}}\right) }{1+\exp \left( {\textbf{x}}_{i}\mathbf {\beta }^{\texttt{T}}\right) }. \end{aligned}$$

(27)

Parameter estimation for regression parameters

In this section, the maximum likelihood method is used to estimate the unknown regression parameters and model parameters. Let $Y_{1},Y_{2},\ldots ,Y_{n}$ be a random sample of size n with $Y_{i} \sim QPNPF\left( \eta ,\delta ,\mu _{i},p\right)$ and realizations $y_{1},y_{2},\ldots ,y_{n}$, where $\mu _{i}$ is given in (27) for $i=1,2,\ldots ,n$. Then the log-likelihood function is given by

$$\begin{aligned} \ell \left( \Xi \right) =&\, n\log \left( \delta +1\right) +n\log \left( \eta \right) +\sum \limits _{i=1}^{n}\log \left( \sigma _{i}^{*}\right) +\sum \limits _{i=1}^{n}\left( \sigma _{i}^{*}-1\right) \log \left( y_{i}\right) \\ &+\eta \sum \limits _{i=1}^{n}\log \left( \frac{1-y_{i}^{\sigma _{i}^{*}}}{\delta y_{i}^{\sigma _{i}^{*}}+1}\right) -\sum \limits _{i=1}^{n}\log \left( 1-y_{i}^{\sigma _{i}^{*}}\right) -\sum \limits _{i=1}^{n}\log \left( \delta y_{i}^{\sigma _{i}^{*}}+1\right) \end{aligned}$$

(28)

where $\sigma _{i}^{*}$ denotes $\sigma ^{*}$ evaluated at $\mu =\mu _{i}$, and $\Xi =\left( \eta ,\delta ,\mathbf {\beta }\right)$ is the parameter vector. The MLE of $\Xi$, say ${\widehat{\Xi }}=\left( {\widehat{\eta }}, {\widehat{\delta }},{\widehat{\beta }}_{0},{\widehat{\beta }}_{1},\ldots ,{\widehat{\beta }}_{p}\right)$, is obtained by maximizing $\ell \left( \Xi \right)$ in (28) with respect to $\eta ,\delta$ and $\mathbf {\beta }$. Since the log-likelihood function in (28) is nonlinear in the parameters, it can be maximized numerically using the optim function in R.
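To make the structure of the regression likelihood concrete, the sketch below evaluates the log-likelihood for given parameter values. It is a hedged illustration in Python rather than the authors' R code: the function names, the toy design matrix, and the response values are hypothetical. The shape parameter $\sigma ^{*}$ is recomputed for each observation because it depends on $\mu _{i}$, which is obtained from the covariates through the inverse logit link of Eq. (27).

```python
import math


def sigma_star(mu, eta, delta, p):
    w = (1.0 - p) ** (1.0 / eta)
    return math.log((1.0 - w) / (1.0 + delta * w)) / math.log(mu)


def qpnpf_log_pdf(y, eta, delta, mu, p):
    """Logarithm of the re-parameterized density in Eq. (24)."""
    s = sigma_star(mu, eta, delta, p)
    u = y ** s
    return (math.log(delta + 1.0) + math.log(eta) + math.log(s)
            + (s - 1.0) * math.log(y)
            + (eta - 1.0) * math.log(1.0 - u)
            - (eta + 1.0) * math.log(delta * u + 1.0))


def inverse_logit(z):
    """Inverse of the logit link, Eq. (27)."""
    return 1.0 / (1.0 + math.exp(-z))


def regression_log_likelihood(eta, delta, beta, X, y, p=0.5):
    """Sum of log-densities with mu_i = inverse_logit(x_i . beta)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        linear = sum(b * x for b, x in zip(beta, x_i))
        total += qpnpf_log_pdf(y_i, eta, delta, inverse_logit(linear), p)
    return total


# Hypothetical toy data: intercept plus one covariate, responses in (0, 1).
X = [[1.0, 0.2], [1.0, -0.5], [1.0, 1.3], [1.0, 0.7]]
y = [0.30, 0.60, 0.45, 0.52]
value = regression_log_likelihood(1.5, -0.5, [0.1, 0.4], X, y, p=0.5)
print(value)
```

An optimizer would maximize this function over $\eta >0$, $\delta >-1$, and the regression coefficients; unconstrained methods are typically applied after transforming $\eta$ and $\delta$ to the real line.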

Real data analysis

In this section, three real data applications are examined for both the proposed distribution and novel regression model.

Practical examples for PNPFD

In this subsection, two practical data sets are analyzed to demonstrate the usability of the PNPFD. The Kumaraswamy (K)²⁴, unit-Weibull (UW)²⁵, unit-Burr XII (UBXII)²⁶, unit-Muth(UM)²⁷, and NPFD models are used to compare the PNPFD. The PDFs for these models are given, respectively, by

$$\begin{aligned} f_{PNPFD}\left( y\right)= & {} \frac{\left( p_{2}+1\right) p_{3}p_{1}y^{p_{1}-1}\left( \frac{1-y^{p_{1}}}{1+p_{2}y^{p_{1}}}\right) ^{p_{3}}}{(1-y^{p_{1}})(p_{2}y^{p_{1}}+1)},\quad p_{1},p_{3}>0,\ p_{2}>-1.\\ f_{K}\left( y\right)= & {} p_{1}p_{2}y^{p_{1}-1}\left( 1-y^{p_{1}}\right) ^{p_{2}-1},\quad p_{1},p_{2}>0.\\ f_{UW}\left( y\right)= & {} p_{1}p_{2}y^{-1}\left( -\log y\right) ^{p_{2}-1}\exp \left( -p_{1}\left( -\log y\right) ^{p_{2}}\right) ,\quad p_{1},p_{2}>0.\\ f_{UBXII}\left( y\right)= & {} p_{1}p_{2}y^{-1}\left( -\log y\right) ^{p_{2}-1}\left( 1+\left( -\log y\right) ^{p_{2}}\right) ^{-p_{1}-1},\quad p_{1},p_{2}>0.\\ f_{UM}\left( y\right)= & {} p_{2}^{-1}\exp \left( 1/p_{1}\right) \left( y^{-\frac{p_{1}}{p_{2}}}-p_{1}\right) y^{-1-\frac{p_{1}}{p_{2}}}\exp \left( -\frac{1}{p_{1}}y^{-\frac{p_{1}}{p_{2}}}\right) ,\quad p_{1},p_{2}>0.\\ f_{NPFD}\left( y\right)= & {} \left( p_{1}+1\right) p_{2}\left( 1-y\right) ^{p_{2}-1}\left( 1+p_{1}y\right) ^{-p_{2}-1},\quad p_{1},p_{2}>0. \end{aligned}$$
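A quick numerical sanity check of the PNPFD density can be sketched as follows. The code reads the inner ratio as $(1-y^{p_{1}})/(1+p_{2}y^{p_{1}})$, which is the form consistent with the log-likelihood in (28). The closed-form CDF used here is obtained by integrating that density and is not restated in this section, so it should be treated as an assumption of the sketch, as should the function names and parameter values.

```python
def pnpfd_pdf(y, p1, p2, p3):
    """PNPFD density on (0, 1), with p1, p3 > 0 and p2 > -1."""
    u = y ** p1
    ratio = (1.0 - u) / (1.0 + p2 * u)
    numerator = (p2 + 1.0) * p3 * p1 * y ** (p1 - 1.0) * ratio ** p3
    return numerator / ((1.0 - u) * (p2 * u + 1.0))


def pnpfd_cdf(y, p1, p2, p3):
    """CDF implied by the density: 1 - ((1 - y^p1) / (1 + p2 * y^p1))^p3."""
    u = y ** p1
    return 1.0 - ((1.0 - u) / (1.0 + p2 * u)) ** p3


def midpoint_integral(f, a, b, steps=20000):
    """Composite midpoint rule; it never evaluates f at the endpoints a and b."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


# Hypothetical parameter values, for illustration only.
p1, p2, p3 = 2.0, 0.5, 3.0
total_mass = midpoint_integral(lambda y: pnpfd_pdf(y, p1, p2, p3), 0.0, 1.0)
partial_mass = midpoint_integral(lambda y: pnpfd_pdf(y, p1, p2, p3), 0.0, 0.4)
```

If the density and CDF are mutually consistent, `total_mass` should be close to 1 and `partial_mass` close to `pnpfd_cdf(0.4, p1, p2, p3)`.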

The maximum likelihood method is used to estimate the model parameters. The estimated log-likelihood ($\ell$), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) are used to assess the goodness-of-fit of the distributions. Furthermore, the Kolmogorov–Smirnov (KS) statistic and its p-value are calculated.
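The criteria above can be computed from a fitted model as in the sketch below, which uses the standard definitions AIC = 2k − 2ℓ and BIC = k log n − 2ℓ, where k is the number of estimated parameters, together with the usual one-sample KS distance. The function names are hypothetical, and the KS p-value is not computed here.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a fitted CDF."""
    ys = sorted(sample)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = cdf(y)
        # Compare the fitted CDF with the empirical CDF just after and just before y.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical example: three points compared against the uniform CDF on (0, 1).
d_uniform = ks_statistic([0.7, 0.1, 0.4], lambda y: y)
```

Smaller AIC, BIC, and KS values indicate a better fit among the compared models.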

The first data set concerns the cost-effectiveness of firm risk management and is available on the web page of Professor E. Frees (Wisconsin School of Business). The data are defined on (0, 1) and are calculated as the total property and casualty premiums and uninsured losses as a percentage of the total assets. This data set is also reported and analyzed by³⁴. Table 3 reports the modeling results for the first real data set.

Table 3 The goodness-of-fit results for the first data set.

The second data set consists of the recovery rates of viable CD34+ cells in 239 patients who consented to autologous peripheral blood stem cell transplant after myeloablative chemotherapy doses. The CD34+ data are also investigated by²⁶. Results for the CD34+ data are given in Table 4.

Table 4 The goodness-of-fit results for the second data set.

Tables 3 and 4 show that the PNPFD is the best of the compared models for both real data sets on every criterion and statistic. Figures 3 and 4 present goodness-of-fit graphs for the two data sets: the fitted PDF, CDF, SF, and P-P plots of the PNPFD. The fits shown in Figures 3 and 4 indicate that the PNPFD is a suitable choice for modeling these two real data sets.
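For readers who want to reproduce a P-P plot of this kind, the coordinates can be computed as in the sketch below. The plotting positions (i − 0.5)/n are one common convention and are an assumption here, since the paper does not state which convention was used; the function name is hypothetical.

```python
def pp_points(sample, cdf):
    """Return (empirical probability, fitted probability) pairs for a P-P plot."""
    ys = sorted(sample)
    n = len(ys)
    # Assumed plotting position (i - 0.5) / n for the i-th order statistic.
    return [((i - 0.5) / n, cdf(y)) for i, y in enumerate(ys, start=1)]


# Hypothetical example using the uniform CDF on (0, 1).
points = pp_points([0.75, 0.25], lambda y: y)
```

Points that lie close to the 45-degree line indicate agreement between the fitted and empirical distributions.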

Practical example for QPNPFD

In this subsection, the usability of the new regression model is demonstrated through a real data application. For comparison purposes, the Kumaraswamy (Kw)³⁰, beta²⁹, log-extended exponential geometric (LEEG)³⁵, and transmuted unit Rayleigh (TUR)³⁶ regression models are utilized. The quantile parameter p is set to 0.5 for the QPNPFD, Kw, and LEEG regression models. The data is taken from³⁶ and can be found at https://stats.oecd.org/index.aspx?DataSetCode=BLI. Here, the percentage of educational attainment in the OECD countries (y) is the dependent variable, and the percentage of voter turnout ($x_{1}$), the homicide rate ($x_{2}$), and life satisfaction ($x_{3}$) are the independent variables. Detailed information about this data and some descriptive statistics can be found in³⁶. This application aims to reveal the relationship between y and $x_{1}$, $x_{2}$, and $x_{3}$.

The regression model is presented as

$$\begin{aligned} \text {logit} \left( \mu _{i}\right) =\beta _{0}+\beta _{1}x_{i1}+\beta _{2}x_{i2}+\beta _{3}x_{i3},\quad i=1,2,\ldots ,38, \end{aligned}$$

where $\mu _{i}$ represents the median for QPNPFD, Kw, and LEEG models and the mean for Beta regression. Parameter estimates for regression models, p-values for the significance of model parameters, and log-likelihood results are presented in Table 5.

Table 5 Parameter estimates of regression models for OECD data with standard error (SE) and log-likelihoods.

Table 5 shows that the best regression model for the OECD data is the QPNPFD model. For this model, the $\eta$, $\delta$, and $\beta _{0}$ parameters are not statistically significant at the 5% level, whereas $\beta _{1}$, $\beta _{2}$, and $\beta _{3}$ are statistically significant at the 5% level. The median response is positively affected through $\beta _{3}$ and negatively affected through $\beta _{1}$ and $\beta _{2}$. That is, an increase in life satisfaction increases the percentage of educational attainment, while an increase in voter turnout or homicide rate decreases it.
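As a brief worked step, the sign interpretation above follows directly from the logit link. This is a standard consequence of the link function and not a result reported by the authors:

$$\begin{aligned} \frac{\mu _{i}}{1-\mu _{i}}=\exp \left( \beta _{0}+\beta _{1}x_{i1}+\beta _{2}x_{i2}+\beta _{3}x_{i3}\right) ,\qquad \frac{\partial \mu _{i}}{\partial x_{ij}}=\beta _{j}\,\mu _{i}\left( 1-\mu _{i}\right) . \end{aligned}$$

Because $\mu _{i}\left( 1-\mu _{i}\right) >0$, the median moves in the direction of the sign of $\beta _{j}$, and a one-unit increase in $x_{ij}$ multiplies $\mu _{i}/\left( 1-\mu _{i}\right)$ by $\exp \left( \beta _{j}\right)$ when the other covariates are held fixed.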

Conclusion

This study aimed to introduce a new model for data defined on (0,1). The paper introduced a new unit model as an alternative to the Kumaraswamy and beta distributions. The statistical and reliability features of the new model were discussed, including moments, stochastic ordering, the reliability function, the hazard rate function, order statistics, and the quantile function. Furthermore, the PNPFD has flexible shapes for its density and hazard functions. The probability density function plots reveal that the new distribution can be unimodal or J-shaped, while the hazard rate function can be decreasing, increasing, or bathtub-shaped. The objectives established at the start of the study, which set the groundwork for a comprehensive investigation of the efficacy of the PNPFD compared to existing, well-known distributions, have been met. The parameters are estimated using various methods, and the performance of these methods is compared in a Monte Carlo simulation. According to the simulation study, the estimators give similar results for large sample sizes. In terms of bias, the ADEs are typically the best estimators of $\sigma$ and $\eta$, while the MLE is the most suitable estimator of $\delta$. A novel regression analysis is introduced via the proposed distribution. Three real data analyses demonstrate the applicability and reliability of the new distribution and the new regression model, as evidenced by the reported standard errors (SE) and p-values. The fitted plots also show that the new distribution fits the real data well.

In conclusion, this study met its aim and showed that the PNPFD can contribute to the field of statistics. The flexibility of the proposed regression model compared to existing regression models indicates that it is an effective model for situations where the dependent variable is a proportion. The outcomes presented here open paths for future research that incorporates novel heuristic techniques for investigating disease dynamics, and they underline the value of the PNPFD as a tool for researchers in diverse areas, including neuro-computational intelligence, non-linear tumor-immune delayed models, nonlinear multi-delayed tumor oncolytic virotherapy systems, nonlinear influenza-A epidemic models, and nonlinear multi-delay SVEIR epidemic systems. We hope that this model will be used for data analysis in many different fields, such as economics, engineering, and medicine. In addition to the estimation methods discussed here, other methods, such as Bayesian regression and the method of moments, can be employed to estimate the parameters and to assess the efficiency of the model. Applying these methods would support prediction from a given data set and allow further analysis and application of the proposed model.

Data availability

All data are available in the paper and its related references.

References

1. Tahir, M. H. & Nadarajah, S. Parameter induction in continuous univariate distributions: Well established g families. Ann. Braz. Acad. Sci. 87(2), 539–568 (2015).
2. Brito, C. R., Rego, L. C., Oliveira, W. R. & Gomes-Silva, F. Method for generating distributions and classes of probability distributions: The univariate case. Hacet. J. Math. Stat. 48, 897–930 (2019).
3. Adamidis, K. & Loukas, S. A lifetime distribution with decreasing failure rate. Stat. Probab. Lett. 39(1), 35–42 (1998).
4. Chahkandi, M. & Ganjali, M. On some lifetime distributions with decreasing failure rate. Comput. Stat. Data Anal. 53(12), 4433–4440 (2009).
5. Hassan, A. S., Assar, M. S. & Ali, K. A. The compound family of generalized inverse Weibull power series distributions. Brit. J. Appl. Sci. Technol. 14(3), 1–18 (2016).
6. Hassan, A. S., Abd-Elfattah, A. M. & Hussein, A. M. The compound family of generalized inverse Weibull power series distributions. Brit. J. Math. Comput. Sci. 13(2), 1–20 (2016).
7. Mahmoudi, E. & Jafari, A. A. Generalized exponential-power series distributions. Comput. Stat. Data Anal. 56(12), 4047–4066 (2012).
8. Morais, A. L. & Barreto-Souza, W. A compound class of Weibull and power series distributions. Comput. Stat. Data Anal. 55(3), 1410–1425 (2011).
9. Silva, R. B., Bourguignon, M., Dias, C. R. B. & Cordeiro, G. M. The compound class of extended Weibull power series distributions. Comput. Stat. Data Anal. 58, 352–367 (2013).
10. Silva, R. B. & Cordeiro, G. M. The burr xii power series distributions: A new compounding family. Braz. J. Probab. Stat. 29(3), 565–589 (2015).
11. Warahena-Liyanage, G. & Pararai, M. The Lindley power series class of distributions: Model, properties and applications. J. Comput. Model. 5(3), 35–80 (2015).
12. Brito, R. S. & Cordeiro, G. M. The beta power distribution. Braz. J. Probab. Stat. 26(1), 88–112 (2021).
13. Tahir, M., Alizadehz, M., Mansoor, M., Cordeiro, G. M. & Zubair, M. The Weibull-power function distribution with applications. Hacet. Univ. Bull. Nat. Sci. Eng. Ser. Math. Stat. 45(1), 245–265 (2016).
14. Oguntunde, P., Odetunmibi, O. A., Okagbue, H. I., Babatunde, O. S. & Ugwoke, P. O. The kumaraswamy-power distribution: A generalization of the power distribution. Int. J. Math. Anal. 9(13), 637–645 (2015).
15. Haq, M. A., Butt, N. S., Usman, R. M. & Fattah, A. A. Transmuted power function distribution. Gazi Univ. J. Sci. 9(13), 177–185 (2016).
16. Bursa, N. & Kadilar, G. O. The exponentiated Kumaraswamy power function distribution. Hacet. Univ. Bull. Nat. Sci. Eng. Ser. Math. Stat. 46(2), 1–19 (2017).
17. Hassan, A. S. & Assar, S. M. The exponentiated Weibull power function distribution. J. Data Sci. 16(2), 589–614 (2017).
18. Hassan, A. S., Elshrpieny, E. & Mohamed, R. E. Odd generalized exponential power function: Properties and applications. Gazi Univ. J. Sci. 32(1), 351–370 (2019).
19. Anwar, N., Ahmad, I., Kiani, A. K., Shoaib, M. & Raja, M. A. Z. Intelligent solution predictive networks for non-linear tumor-immune delayed model. Comput. Methods Biomech. Biomed. Eng. https://doi.org/10.1080/10255842.2023.2227751 (2023).
20. Anwar, N., Ahmad, I., Kiani, A. K., Shoaib, M. & Raja, M. A. Z. Novel intelligent Bayesian computing networks for predictive solutions of nonlinear multi-delayed tumor oncolytic virotherapy systems. Int. J. Biomath. https://doi.org/10.1142/S1793524523500705 (2023).
21. Anwar, N. et al. Intelligent computing networks for nonlinear influenza-A epidemic model. Int. J. Biomath. 16(04), 2250097 (2022).
22. Shoaib, M. et al. Neuro-computational intelligence for numerical treatment of multiple delays SEIR model of worms propagation in wireless sensor networks. Biomed. Signal Process. Control. 84, 104797 (2023).
23. Shoaib, M. et al. Intelligent networks knacks for numerical treatment of three-dimensional Darcy–Forchheimer Williamson nanofluid model past a stretching surface. Waves Random Complex Media https://doi.org/10.1080/17455030.2022.2058713 (2022).
24. Kumaraswamy, P. A generalized probability density function for double-bounded random processes. J. Hydrol. 46(1–2), 79–88 (1980).
25. Mazucheli, J., Menezes, A. F. B., Fernandes, L. B., De Oliveira, R. P. & Ghitany, M. E. The unit-Weibull distribution as an alternative to the Kumaraswamy distribution for the modeling of quantiles conditional on covariates. J. Appl. Stat. 47(6), 954–974 (2020).
26. Korkmaz, M. Ç. & Chesneau, C. On the unit Burr-XII distribution with the quantile regression modeling and applications. Comput. Appl. Math. 40(1), 29 (2021).
27. Maya, R., Jodra, P., Irshad, M. R. & Krishna, A. The unit Muth distribution: Statistical properties and applications. Ricerche di Matematica https://doi.org/10.1007/s11587-022-00703-7 (2022).
28. Iqbal, M. Z., Arshad, M. Z., Özel, G. & Balogun, O. S. A better approach to discuss medical science and engineering data with a modified Lehmann type-II model. F1000Research 10, 823 (2021).
29. Ferrari, S. & Cribari-Neto, F. Beta regression for modelling rates and proportions. J. Appl. Stat. 31(7), 799–815 (2004).
30. Mitnik, P. A. & Baek, S. The kumaraswamy distribution: Median-dispersion re-parameterizations for regression modeling and simulation-based estimation. Stat. Pap. 54, 177–192 (2013).
31. Saglam, S. & Karakaya, K. Unit burr-hatke distribution with a new quantile regression model. J. Sci. Arts 22(3), 663–676 (2022).
32. Korkmaz, M. Ç. & Korkmaz, Z. S. The unit log–log distribution: A new unit distribution with alternative quantile regression modeling and educational measurements applications. J. Appl. Stat. 50(4), 889–908 (2023).
33. Gradshteyn, I. S., & Ryzhik, I. M. Tables of Integrals, Series and Products. (Elsevier, Inc., London, 2007), Page No.25.
34. Abd El-Bar, A., Bakouch, H. S. & Chowdhury, S. A new trigonometric distribution with bounded support and an application. Revista de la Unión Matemática Argentina 62(2), 459–473 (2021).
35. Jodra, P. & Jimenez-Gamero, M. D. A quantile regression model for bounded responses based on the exponential-geometric distribution. REVSTAT-Stat. J. 18(4), 415–436 (2020).
36. Korkmaz, M. Ç., Chesneau, C. & Korkmaz, Z. S. Transmuted unit Rayleigh quantile regression model: Alternative to beta and Kumaraswamy quantile regression models. Univ. Politeh. Buchar. Sci. Bull. Ser. Appl. Math. Phys. 83, 149–158 (2021).

Acknowledgements

This research project was supported by the Researchers Supporting Project Number (RSP2024R488), King Saud University, Riyadh, Saudi Arabia.

Author information

Authors and Affiliations

Department of Statistics, Faculty of Sciences, Selcuk University, Konya, Turkey
Kadir Karakaya & Şule Sağlam
Department of Mathematics, Amrita School of Physical Sciences, Amrita Vishwa Vidyapeetham, Coimbatore, 641112, India
C. S. Rajitha
Department of Statistics and Operations Research, College of Science, King Saud University, P.O. Box 2455, Riyadh, 11451, Saudi Arabia
Yusra A. Tashkandy & M. E. Bakr
Faculty of Science and Humanities, School of Postgraduate Studies and Research (SPGSR), Amoud University, Borama, Somalia
Abdisalam Hassan Muse
Department of Statistics, Faculty of Basic Science, Central University of Haryana, Mahendergarh, 123031, India
Anoop Kumar
Department of Mathematics, Faculty of Science, Helwan University, Cairo, Egypt
Eslam Hussam
Department of Mathematics, Faculty of Science, Tanta University, Tanta, 31527, Egypt
Ahmed M. Gemeay

Contributions

All authors contributed equally to this paper.

Corresponding author

Correspondence to Abdisalam Hassan Muse.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Karakaya, K., Rajitha, C.S., Sağlam, Ş. et al. A new unit distribution: properties, estimation, and regression analysis. Sci Rep 14, 7214 (2024). https://doi.org/10.1038/s41598-024-57390-7

Received: 11 July 2023
Accepted: 18 March 2024
Published: 27 March 2024
DOI: https://doi.org/10.1038/s41598-024-57390-7
