Improvement in variance estimation using transformed auxiliary variable under simple random sampling

This paper offers a novel approach to formulating efficient ratio estimators of the population variance using a transformed auxiliary variable. The impact of the transformation on the auxiliary information is also discussed. It is observed that incorporating a transformed auxiliary variable results in a high gain in efficiency. Theoretical properties of the newly developed estimators are derived. The empirical and simulation studies show that the suggested estimators outperform the existing estimators.


Methodology
Consider a finite population $\{(y_i, x_i),\ i = 1, 2, \dots, N\}$ of size $N$. Suppose a random sample $(y_i, z_i)$ of size $n$ is drawn from this population by simple random sampling without replacement (SRSWOR). Let $y_i$ and $x_i$ be the values of the $i$th unit of the main study variable and the auxiliary variable, respectively, and let $z_i = t(x_i)$ be the transformed auxiliary variable observed on the sample. The supplementary variate $x$ is assumed to be positively correlated with the main study variable $y$. Note that the correlation between $y_i$ and $z_i$ is the same as that between $y_i$ and $x_i$. In what follows, $r$ and $s$ are non-negative integers, $\mu_{rs}$ is the bivariate moment of order $(r, s)$, $\mu_{20}$ and $\mu_{02}$ are the second-order moments, and $\phi_{rs}$ is the moment ratio.
where $C_y^2$ and $C_x^2$ are the squared coefficients of variation of the survey variable $Y$ and the auxiliary variable $X$, respectively; $\rho_{yx}$ is the correlation coefficient between the main study variable $Y$ and the auxiliary variable $X$; and $\beta_{1(x)}$ and $\beta_{2(x)}$ are the coefficient of skewness and the coefficient of kurtosis of the auxiliary variable, respectively.
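The notation above can be illustrated numerically. The sketch below assumes the conventional definitions $\mu_{rs} = \frac{1}{N-1}\sum_i (y_i - \bar{Y})^r (x_i - \bar{X})^s$ and $\phi_{rs} = \mu_{rs} / (\mu_{20}^{r/2}\mu_{02}^{s/2})$, and it assumes, purely for illustration, a positive linear transformation $t(x) = \gamma_1 x + \gamma_2$. The function names, the toy data, and the values of `gamma1` and `gamma2` are hypothetical. Under these assumptions it checks that the correlation $\phi_{11}$ and the ratio $\phi_{22}$ are the same for $(y, x)$ and $(y, z)$.

```python
import random


def moment(y, x, r, s):
    """Bivariate central moment mu_rs with an (N - 1) divisor (assumed convention)."""
    n = len(y)
    ybar = sum(y) / n
    xbar = sum(x) / n
    return sum((yi - ybar) ** r * (xi - xbar) ** s for yi, xi in zip(y, x)) / (n - 1)


def moment_ratio(y, x, r, s):
    """Moment ratio phi_rs = mu_rs / (mu_20^(r/2) * mu_02^(s/2)) (assumed convention)."""
    scale = moment(y, x, 2, 0) ** (r / 2) * moment(y, x, 0, 2) ** (s / 2)
    return moment(y, x, r, s) / scale


random.seed(42)
# Toy data: x is the auxiliary variable, y is positively correlated with x.
x = [random.gauss(10.0, 1.0) for _ in range(1000)]
y = [5.0 + 2.9 * (xi - 10.0) + random.gauss(0.0, 0.77) for xi in x]

# Hypothetical positive linear transformation z = gamma1 * x + gamma2.
gamma1, gamma2 = 2.0, 3.0
z = [gamma1 * xi + gamma2 for xi in x]

rho_yx = moment_ratio(y, x, 1, 1)  # phi_11 is the correlation coefficient
rho_yz = moment_ratio(y, z, 1, 1)
print("rho_yx =", rho_yx, "rho_yz =", rho_yz)
print("phi_22(y, x) =", moment_ratio(y, x, 2, 2))
print("phi_22(y, z) =", moment_ratio(y, z, 2, 2))
```

The two correlations agree up to floating-point error because a positive linear map rescales $\mu_{rs}$ by $\gamma_1^s$ and $\mu_{02}^{s/2}$ by the same factor. A nonlinear $t$ would not, in general, preserve these ratios.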
In the literature, some estimators of the population variance are given as follows.

1. The usual classical estimator of the population variance is given by
where $\tau_1$ is an unbiased estimator. Its variance is as under

2. The estimator of the population variance developed in ref. 2 is given by
The MSE of $\tau_2$ is as below.

3. The estimator of the population variance provided in ref. 22 is given below
The MSE of $\tau_3$ is given by

4. For the ratio estimator of the population variance, the linear regression estimator developed in ref. 2 is given by
here $b = \frac{s_y^2 V_{22}}{s_x^2 V_{40}}$ represents the sample regression coefficient.

5. The transformed estimator of ref. 23 is as follows:
here $k_1$ and $k_2$ are optimization constants, $\theta$ is a generalization constant that can take values between 0 and 1, and $a$ and $b$ are some functions of the auxiliary variable. The optimum values of $k_1$ and $k_2$ are as below
The minimum MSE at $k_{1(opt)}$ and $k_{2(opt)}$ is given by
where $q = \frac{a S_x^2}{a S_x^2 + b}$. The MSE of this estimator is minimum at $(\theta, a, b) = (1, 1, 0)$.

6. Ref. 7 introduced the following difference-cum-exponential estimator of the population variance
where $c_{11}$ and $c_{12}$ are optimization constants. The optimum MSE of $\tau_6$ is given by

7. Ref. 24 developed the following ratio estimator of the population variance
The optimum values of $k_{11}$ and $k_{12}$ are given by
The minimum MSE at $k_{11(opt)}$ and $k_{12(opt)}$ is given by

8. The estimator $\tau_8$ of ref. 25
where $k_1$ and $k_2$ are optimization constants that minimize the MSE of $\tau_8$, which is given by

9. Ref. 20 suggested a general type of estimator of the population variance, given by
The particular case of the estimator $\tau_9^{a,b}$ for estimating the variance is obtained by putting $a = 0$, $b = 2$, $w_1 = 0$, $w_2 = 1$, and $\omega_2 = 2$, as follows
with MSE given by
which, in the case of the variance estimator, induces the following particular case:

Proposed estimators
The first proposed estimator
Motivated by 24 and using a transformed auxiliary variable, the following class of transformed ratio-product type exponential estimators is suggested
or
where $c_1$ and $c_2$ are optimization constants and $\gamma_1$ and $\gamma_2$ are suitable constants or some functions of the auxiliary variable.

The second proposed estimator
Motivated by 2, we can write the proposed estimator as a linear combination of the usual ratio and exponential estimators as follows
or
where $\psi_1$ and $\psi_2$ are optimization constants whose values are to be obtained so that the MSE of $\tau_{P2}$ is minimum.

The third proposed estimator
Applying the transformation to the auxiliary variable in (26), we can write the third proposed estimator as follows
or
where $\psi_3$ and $\psi_4$ are optimization constants whose values are to be determined so that the MSE of $\tau_{P3}$ is minimum.
Note that for $\gamma_1 = 1$ and $\gamma_2 = 0$, the third proposed estimator $\tau_{P3}$ given by (27) becomes equivalent to the second proposed estimator $\tau_{P2}$ given by (26).
Similarly, the first proposed estimator $\tau_{P1}$ given by (25) becomes equivalent to $\tau_7$ as suggested by Muneer et al. 9 given by (13).

Theoretical properties of the proposed estimators
This section derives the theoretical properties of the new estimators using the notation given in (1) and (2). Rewriting (25), (26), and (27), respectively, in terms of the error terms gives the following

The optimum values of $c_1$ and $c_2$ are obtained by differentiating (37) with respect to $c_1$ and $c_2$, respectively, and equating to zero, as follows

Putting the optimum values of $c_1$ and $c_2$ in (37), we get

Similarly, the optimum values of $\psi_1$, $\psi_2$, $\psi_3$, and $\psi_4$ can be obtained by differentiating the MSEs of $\tau_{P2}$ and $\tau_{P3}$ with respect to $\psi_1$, $\psi_2$, $\psi_3$, and $\psi_4$ and equating to zero. Hence, after simplification, we obtain

The MSE is given by

The bias and MSEs are given by

2. For $k = 2$, $\gamma_1 = 1$ and $\gamma_2 = C_x$, $\tau_{P1}$ and $\tau_{P3}$ take the following form
The bias and MSEs are given by

3. For $k = 3$, $\gamma_1 = \rho_{yx}$ and $\gamma_2 = C_x$, $\tau_{P1}$ and $\tau_{P3}$ take the following form
The bias and the mean square error (MSE) are given by

4. For $k = 4$, $\gamma_1 = C_x$ and $\gamma_2 = \rho_{yx}$, the estimators $\tau_{P1}$ and $\tau_{P3}$ take the following form
The bias and MSEs are obtained as

5. For $k = 5$, $\gamma_1 = X$ and $\gamma_2 = \rho_{yx}$, the estimators $\tau_{P1}$ and $\tau_{P3}$ take the following form

Efficiency comparisons
This section aims to compare the MSEs of the newly developed estimators with the competing estimators discussed in the literature.
The condition mentioned above always holds true for all types of real data in which the correlation between the main study variable and the supplementary variable is positive.

Empirical analysis
This section investigates the performance of the proposed estimators against the competing estimators using data from some real-life situations. Table 1 gives the summary statistics of the various data sets.
Table 2 shows that the newly transformed estimators perform better than the existing estimators for all the real data sets considered. The transformation introduced results in a high gain in efficiency. The first proposed estimator, given by (25), is more efficient than the parent estimator suggested by 24; moreover, it also outperforms all the competing estimators discussed in the literature. The second proposed estimator, given by (26), is more efficient than the first proposed estimator and all other competing estimators. Further, incorporating the transformed auxiliary variable in the second proposed estimator generates the third proposed estimators, given by (27), which are more efficient than all other estimators.

Simulation study
A simulation study is conducted to assess the performance of both the suggested and the existing estimators. Three different populations of size 10,000 have been generated with positive correlation between the main study and auxiliary variables. The correlation between the main study variable and the supplementary variable is high, moderate, and low in the first, second, and third populations, respectively.
Samples of sizes n = 50, 150, and 300 are considered for each population, drawn by simple random sampling without replacement. The steps below summarize the whole simulation procedure in RStudio.
Step 1: A population is generated from a bivariate normal distribution with mean vector $\mu = (5.0, 10.0)$ and covariance matrix $\Sigma = \begin{pmatrix} 9.0 & 2.9 \\ 2.9 & 1.0 \end{pmatrix}$, and all parameters of the auxiliary variable and the constants are calculated from the data.
Step 2: From each target population, samples of sizes n = 50, 150, and 300 are drawn using the SRSWOR method. The loop is run 100,000 times; in each iteration a sample is selected and the estimates are calculated.
Step 4: The estimates obtained in each iteration are stored in a matrix, the percent relative efficiency (PRE) of each estimator is calculated by averaging over all iterations using formula (67), and the results are reported in Table 3.
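The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' R code. It assumes that the first coordinate of the mean vector belongs to $y$, and that the PRE is $100 \times \mathrm{MSE}(\tau_1) / \mathrm{MSE}(\tau)$. It also uses the classical ratio-type variance estimator $s_y^2 S_x^2 / s_x^2$ as a stand-in competitor, because the proposed estimators are not reproduced here. The function names and the reduced number of iterations (2,000 instead of 100,000) are also hypothetical choices made to keep the run short.

```python
import math
import random


def sample_variance(values):
    """Unbiased sample variance with an (n - 1) divisor."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def simulate_pre(n=50, pop_size=10_000, reps=2_000, seed=1):
    """PRE of a ratio-type variance estimator relative to the usual estimator s_y^2."""
    rng = random.Random(seed)

    # Step 1: bivariate normal population with mean (5, 10) and covariance
    # [[9.0, 2.9], [2.9, 1.0]]; the first coordinate is taken as y (assumption).
    rho = 2.9 / (3.0 * 1.0)
    y, x = [], []
    for _ in range(pop_size):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x.append(10.0 + u)
        y.append(5.0 + 3.0 * (rho * u + math.sqrt(1.0 - rho ** 2) * v))
    pop_var_y = sample_variance(y)  # target parameter S_y^2
    pop_var_x = sample_variance(x)  # known auxiliary parameter S_x^2

    # Step 2: repeated SRSWOR samples; accumulate squared errors of both estimators.
    sq_err_usual = 0.0
    sq_err_ratio = 0.0
    for _ in range(reps):
        idx = rng.sample(range(pop_size), n)  # sampling without replacement
        s2y = sample_variance([y[i] for i in idx])
        s2x = sample_variance([x[i] for i in idx])
        sq_err_usual += (s2y - pop_var_y) ** 2
        sq_err_ratio += (s2y * pop_var_x / s2x - pop_var_y) ** 2

    # Final step: PRE (assumed form); the 1/reps factors cancel in the ratio.
    return 100.0 * sq_err_usual / sq_err_ratio


if __name__ == "__main__":
    for size in (50, 150, 300):
        print(size, round(simulate_pre(n=size), 1))
```

A PRE above 100 indicates that the stand-in ratio-type estimator has a smaller simulated MSE than the usual estimator for that sample size. Replacing the single line that computes the ratio-type estimate is enough to study another estimator with the same loop.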
The simulation results indicate that the newly developed estimators are highly efficient compared with the conventional estimators in all situations. For high correlation (ρ ≥ 0.96), the estimators tend to be more efficient than for moderate (ρ ≥ 0.75) and low (ρ ≥ 0.39) correlation. Thus, the strength of the relationship between the study and supplementary variables plays a vital role in increasing the efficiency. The newly developed estimator retains its efficiency in all three cases. Further, the transformation also plays a remarkable role in enhancing efficiency. As can be seen from the simulation results in Table 3, the first transformed proposed estimator is more efficient than its parent estimator $\tau_7$ for different choices of $\gamma_1$ and $\gamma_2$. Similarly, the third proposed estimator is more efficient not only than the conventional estimators but also than the second proposed estimator for different choices of $\gamma_1$ and $\gamma_2$.

Conclusion
The empirical and simulation findings highlight the efficiency of the newly developed estimators in comparison with their conventional counterparts across the scenarios considered. Particularly noteworthy is the enhanced efficiency of these estimators in situations with higher correlation, which emphasizes the pivotal role of the strength of the relationship between the study and supplementary variables. This aligns with the expectation that a stronger correlation contributes to increased estimator efficiency. Furthermore, the newly developed estimator retains its efficiency at all three correlation levels: high, moderate, and low. This consistent performance underscores the versatility and reliability of the proposed estimators.
The introduced transformation emerges as a key contributor to enhanced efficiency. As shown in Table 3, the simulation results affirm that the first transformed proposed estimator consistently outperforms its parent estimator across various choices of the constants. Similarly, the third proposed estimator is more efficient than the conventional estimators and also surpasses the second proposed estimator for different choices. This emphasizes the significant role played by the transformation in improving the overall performance of the estimators.
Additionally, the proposed estimators can be adapted to other sampling designs, such as stratified random sampling, non-response sampling, and adaptive cluster sampling. Their extension to non-conventional sampling designs, including adaptive cluster sampling and stratified adaptive cluster sampling, is under consideration for variance estimation. The estimators may also be improved by using two auxiliary variables when formulating estimates of the population parameter.
In conclusion, the proposed estimators show remarkable efficiency in finite population variance estimation under simple random sampling without replacement. The encouraging findings suggest their applicability in diverse survey scenarios, and future research could further enhance their adaptability by extending them to more intricate sampling designs.
represent the sample means of the study variable, the ancillary variable, and the transformed ancillary variable, and be the population means of the variates $y$, $x$, and $z$. Let us define the random error due to sampling by $\varepsilon_0 =$

Table 1. Summary statistics of various data sets.

Table 2. PRE of proposed and competing estimators against the usual estimator $\tau_1$.

Table 3. Simulation results for the percent relative efficiencies of the newly developed estimators and conventional estimators with respect to the usual estimator, considering different sample sizes and correlation coefficients.