Saddlepoint p-values for a class of location-scale tests under randomized block design

This paper deals with a class of nonparametric two-sample location-scale tests. Its purpose is to approximate the exact p-value of this class under a randomized block design. The exact p-value is approximated by the saddlepoint approximation method and, for comparison, by the traditional normal approximation method. The saddlepoint approximation is more accurate than the normal approximation in approximating the exact p-value, and it requires far less computing time than the simulation method. This accuracy is demonstrated by a simulation study and by applying the methods to three real data sets.


Class of location-scale tests
Consider two independent samples, X (the control group) and Y (the treatment group), drawn from populations with CDFs $F$ and $G$, means $\mu_1$ and $\mu_2$, and standard deviations $\sigma_1$ and $\sigma_2$, respectively. Under the randomized block design, the location-scale class is given by

$$H_B = \sum_{i=1}^{k} b_i \sum_{j=1}^{n_i} \left(a^L_{ij} + a^S_{ij}\right) Z_{ij}, \tag{1}$$

where $k$ is the number of blocks with even sizes $n_i$, $N = \sum_{i=1}^{k} n_i$, and $b_i = 1/(n_i + 1)$ is the optimum weight of block $i$, Elteren 46. The location and scale scores of each observation $j$ in each block $i$ are denoted by the linear combination $(a^L_{ij} + a^S_{ij})$, where $a^L_{ij}$ is the location score and $a^S_{ij}$ is the scale score. $Z_{ij}$ is the group indicator, which takes the value 1 if observation $j$ in block $i$ is from the treatment group Y and the value 0 otherwise. The permutation distribution of the assignments within the blocks is generated under the random allocation rule, with $\binom{n_i}{m_i}$ possible permutations in block $i$, where $m_i$ is the number of treatment observations in block $i$. The asymptotic distribution of $H_B$ is $N(\mu_L + \mu_S, \sigma_L + \sigma_S)$, where $\mu_L$ and $\mu_S$ are the means of the location and scale tests, and $\sigma_L$ and $\sigma_S$ are their standard deviations, respectively.

The considered class includes many location-scale tests. One example is Lepage's statistic, which can be written in the form of Eq. (1), where $W^L_{ij}$ is the score of the Wilcoxon location test with mean $\mu_L$, and which has asymptotic distribution $N(\mu_L + \mu_S, \sigma_L + \sigma_S)$. In addition, LP2, also called the Gastwirth 47 test, can be written in the form of Eq. (1) (see Eq. 3), with location and scale scores defined through $Z_{i(j)}$, the $j$-th order statistic of the combined two samples X and Y in block $i$.
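The weighted sum in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the standard textbook forms of the Wilcoxon score (the within-block rank) and the Ansari-Bradley score, $(n_i + 1)/2 - |R_{ij} - (n_i + 1)/2|$, and it assumes no ties. The function names are hypothetical.

```python
def ranks(values):
    """Ranks 1..n of the pooled observations in one block (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    r = [0] * len(values)
    for position, j in enumerate(order, start=1):
        r[j] = position
    return r


def wilcoxon_scores(values):
    """Location scores: the within-block ranks (assumed textbook form)."""
    return [float(r) for r in ranks(values)]


def ansari_bradley_scores(values):
    """Scale scores: (n+1)/2 - |rank - (n+1)/2| (assumed textbook form)."""
    n = len(values)
    return [(n + 1) / 2 - abs(r - (n + 1) / 2) for r in ranks(values)]


def h_b(blocks):
    """Weighted sum of Eq. (1).

    blocks: list of (pooled_values, z) pairs, one per block, where
    z[j] = 1 for a treatment observation and 0 for a control observation.
    """
    total = 0.0
    for values, z in blocks:
        b = 1.0 / (len(values) + 1)          # block weight b_i = 1 / (n_i + 1)
        a_loc = wilcoxon_scores(values)
        a_scale = ansari_bradley_scores(values)
        total += b * sum((l + s) * zj for l, s, zj in zip(a_loc, a_scale, z))
    return total


# Two small blocks of sizes 4 and 2.
example = [([3.1, 1.2, 2.5, 4.0], [1, 0, 0, 1]), ([10.0, 20.0], [0, 1])]
statistic = h_b(example)
```

Other members of the class would only change the two score functions; the block weights and the indicator sum stay the same.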
The LP3 test takes the form of LP2 in Eq. (3) and follows the location-scale class in Eq. (1), with the Van der Waerden location score and the Klotz scale score, where $\Phi^{-1}$ is the inverse cumulative distribution function of the standard normal distribution.
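As a rough illustration of the LP3 scores, the sketch below uses the usual textbook definitions, which are an assumption here: the Van der Waerden score of rank $r$ in a block of size $n$ is $\Phi^{-1}(r/(n+1))$, and the Klotz score is its square. The function names are hypothetical.

```python
from statistics import NormalDist


def van_der_waerden_scores(n):
    """Location scores Phi^{-1}(r / (n + 1)) for ranks r = 1..n (assumed form)."""
    inv = NormalDist().inv_cdf
    return [inv(r / (n + 1)) for r in range(1, n + 1)]


def klotz_scores(n):
    """Scale scores: squared Van der Waerden scores (assumed form)."""
    return [x * x for x in van_der_waerden_scores(n)]


vdw = van_der_waerden_scores(3)   # scores for ranks 1, 2, 3
klotz = klotz_scores(3)
```

The location scores are antisymmetric around the middle rank, while the scale scores are symmetric and largest at the extreme ranks, which is what lets the sum respond to both shifts and changes in spread.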
The LP4 test takes the same form as LP2 and LP3, but with its own location score and with the Mood scale score, where $[a]$ denotes the greatest integer less than or equal to $a$. All Lepage-type tests, LP2, LP3 and LP4, are asymptotically distributed $N(\mu, \sigma)$. In addition, Rublik 11 investigated the location-scale problem with a test statistic that combines the Wilcoxon location score and the Mood scale score, and that can be written in the same form as Eq. (1).
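The normal approximation used as the baseline can be sketched as follows. The moment formulas are the standard finite-population ones for a weighted sum under the random allocation rule; they are not taken from this paper, whose own expressions for $\mu$ and $\sigma$ are not reproduced here. The name `c` is an assumed shorthand for the coefficients $b_i (a^L_{ij} + a^S_{ij})$ of block $i$.

```python
import math
from statistics import NormalDist


def permutation_moments(blocks):
    """Mean and variance of sum_i sum_j c_ij Z_ij under random allocation.

    blocks: list of (c, m) pairs, where c holds the coefficients of one
    block and m is the number of treatment observations in that block.
    """
    mean = 0.0
    var = 0.0
    for c, m in blocks:
        n = len(c)
        cbar = sum(c) / n
        ss = sum((x - cbar) ** 2 for x in c)
        mean += m * cbar
        var += m * (n - m) / (n * (n - 1)) * ss   # finite-population variance
    return mean, var


def normal_pvalue(blocks, h0):
    """Upper-tail normal approximation to P(H >= h0)."""
    mean, var = permutation_moments(blocks)
    return 1.0 - NormalDist(mean, math.sqrt(var)).cdf(h0)


# One block of ten observations with rank coefficients and five treated units.
ranks_1_to_10 = [float(r) for r in range(1, 11)]
mu, sigma2 = permutation_moments([(ranks_1_to_10, 5)])
p_normal = normal_pvalue([(ranks_1_to_10, 5)], 35.0)
```

Blocks are independent, so their means and variances simply add.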

Saddlepoint approximation
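For simplicity, let the class in Eq. (1) be written as a single weighted sum of the indicators $Z_{ij}$ (Eq. 4), in which the coefficient of $Z_{ij}$ combines the block weight $b_i$ with the scores $a^L_{ij} + a^S_{ij}$. As noted before, the patients within the blocks are allocated under the random allocation design. This means that the random variables $Z_{ij}$ and $Z_{ib}$, for all $i = 1, \ldots, k$ and $j \neq b$, are dependent, but they are independent of $Z_{aj}$ for $a \neq i$.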
To avoid the problem of dependence, we construct a conditional distribution in which the indicators of block $i$ are represented by $V_{i1}, \ldots, V_{in_i}$, independent and identically distributed Bernoulli($\theta_i$) random variables for each $i = 1, \ldots, k$, conditioned on the block treatment totals $M = (m_1, \ldots, m_k)$. This transfers the distribution of the statistic in Eq. (4) to an equivalent conditional distribution. To approximate the exact p-value of $H_B$ in Eq. (4), we need to approximate the conditional probability in Eq. (5), where $h_0$ is the observed value of $H_B$. Using the double saddlepoint approximation of Skovgaard 29, this conditional probability can be approximated in terms of two quantities $\omega$ and $u$, where the two saddlepoints are $(t, S) = (t, s_1, \ldots, s_k)$ and $S_0 = (s_{10}, \ldots, s_{k0})$.
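The conditioning argument can be checked by brute force on a tiny block. The sketch below assumes the conditioning event is that the Bernoulli variables of a block sum to $m_i$. It enumerates all 0/1 vectors of length $n$ and shows that, given the total, every allocation is equally likely whatever the value of $\theta$, which is exactly the random allocation rule. The function name is hypothetical.

```python
from itertools import product


def conditional_allocation_probs(n, m, theta):
    """P(V = v | sum(V) = m) for i.i.d. Bernoulli(theta) indicators V_1..V_n."""
    joint = {}
    for v in product((0, 1), repeat=n):
        if sum(v) == m:
            # Every vector with m ones has the same unconditional probability.
            joint[v] = theta ** m * (1 - theta) ** (n - m)
    total = sum(joint.values())
    return {v: p / total for v, p in joint.items()}


probs_a = conditional_allocation_probs(4, 2, 0.3)
probs_b = conditional_allocation_probs(4, 2, 0.5)
```

Both dictionaries contain the $\binom{4}{2} = 6$ allocations with probability $1/6$ each, so $\theta$ is a free tuning constant in the saddlepoint calculation.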

The joint cumulant generating function of
Hessian matrix, and $Q''_{ss}$ is the second derivative of $Q(0, S_0)$ with respect to $S$.
To calculate $\omega$ and $u$, we first calculate the numerator saddlepoints $(t, S) = (t, s_1, \ldots, s_k)$ by solving the numerator saddlepoint equations, and then find $S_0 = (s_{10}, \ldots, s_{k0})$ by solving the denominator saddlepoint equation. Neither $\omega$ nor $u$ depends on $\theta_i$, so, to make the solution of the SPA equations explicit, we can choose $\theta_i = m_i / n_i$ for all $i = 1, \ldots, k$, which gives $S_0 = (0, \ldots, 0)$.
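The procedure can be sketched numerically as below. This is a hedged illustration, not the authors' code, and it rests on several assumptions: the joint cumulant generating function is taken to be that of independent Bernoulli indicators, $K(t, S) = \sum_i \sum_j \log(1 - \theta_i + \theta_i e^{t c_{ij} + s_i})$, with `c` an assumed name for the coefficients $b_i(a^L_{ij} + a^S_{ij})$; the tail formula is the standard textbook form of the Skovgaard approximation, $1 - \Phi(\omega) - \phi(\omega)(1/\omega - 1/u)$; and the saddlepoint equations are solved by nested bisection rather than by whatever solver the paper used. With $\theta_i = m_i/n_i$ the denominator saddlepoint is zero and $K(0, S_0) = 0$.

```python
import math
from statistics import NormalDist


def _parts(t, s, c, theta):
    """K and its first and second derivatives for one block."""
    K = Kt = Ks = Ktt = Kts = Kss = 0.0
    for cj in c:
        e = theta * math.exp(t * cj + s)
        p = e / (1.0 - theta + e)          # tilted success probability
        v = p * (1.0 - p)
        K += math.log(1.0 - theta + e)
        Kt += cj * p
        Ks += p
        Ktt += cj * cj * v
        Kts += cj * v
        Kss += v
    return K, Kt, Ks, Ktt, Kts, Kss


def _solve_s(t, c, theta, m):
    """Bisection for s_i such that dK/ds_i = m_i at the given t."""
    lo, hi = -70.0, 70.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if _parts(t, mid, c, theta)[2] < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def spa_pvalue(blocks, h0):
    """Double saddlepoint tail probability; blocks is a list of (c, m) pairs.

    Undefined when h0 equals the permutation mean (omega = 0).
    """
    thetas = [m / len(c) for c, m in blocks]   # theta_i = m_i / n_i, so S_0 = 0
    cmax = max(abs(x) for c, _ in blocks for x in c)

    def profile(t):
        rows = []
        for (c, m), th in zip(blocks, thetas):
            s = _solve_s(t, c, th, m)
            rows.append((s,) + _parts(t, s, c, th))
        return rows                            # (s, K, Kt, Ks, Ktt, Kts, Kss)

    lo, hi = -30.0 / cmax, 30.0 / cmax
    for _ in range(100):                       # outer bisection on t
        t = 0.5 * (lo + hi)
        if sum(r[2] for r in profile(t)) < h0:
            lo = t
        else:
            hi = t
    t = 0.5 * (lo + hi)
    rows = profile(t)

    expo = t * h0 + sum(r[0] * m for r, (_, m) in zip(rows, blocks)) - sum(r[1] for r in rows)
    omega = math.copysign(math.sqrt(max(2.0 * expo, 0.0)), t)
    # The Hessian is bordered-diagonal: |K''| = prod(Kss_i) * Schur complement.
    schur = sum(r[4] for r in rows) - sum(r[5] ** 2 / r[6] for r in rows)
    log_ratio = sum(
        math.log(r[6]) - math.log(len(c) * th * (1.0 - th))
        for r, (c, _), th in zip(rows, blocks, thetas)
    )
    u = t * math.sqrt(schur * math.exp(log_ratio))
    nd = NormalDist()
    return 1.0 - nd.cdf(omega) - nd.pdf(omega) * (1.0 / omega - 1.0 / u)


coeffs = [float(r) for r in range(1, 11)]      # one block, rank coefficients
p_spa = spa_pvalue([(coeffs, 5)], 35.0)
```

Because each $s_i$ appears only in its own block, the inner problems are one-dimensional and the determinant reduces to a product times a scalar Schur complement, which keeps the sketch free of matrix code.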

Simulation study
The main aim of the simulation study is to show that the saddlepoint approximation is closer to the simulated mid-p-value than the normal approximation. The exact p-values of the Lepage test, the LP3 test and the Rublik test are approximated using the two methods, the saddlepoint approximation and the normal approximation. To illustrate the accuracy of the SPA p-value, we compare the p-values from these two methods with the simulated mid-p-value, which is calculated from one million random permutations of the treatment and control indicator $Z_{ij}$. The simulated mid-p-value is obtained as $\left[\sum I(H > h_0) + 0.5 \sum I(H = h_0)\right] / 10^6$, and it is denoted here as the "exact" p-value. The benefit of using the mid-p-value instead of the p-value is that the mid-p-value is convenient for discrete test statistics and is not conservative compared to the ordinary p-value; for more details, see Agresti 50, Butler 31 and Delanchy et al. 51.

The simulated data in this section are generated from extreme value and logistic distributions under six cases of the design parameters, including $k$ and $m_i = m$. Different location $(\mu_1, \mu_2)$ and scale $(\sigma_1, \sigma_2)$ parameters are used to generate the data according to these scenarios. The process is repeated 1000 times, based on 1000 generated samples. Tables 1, 2 and 3 present the mean of the SPA p-values, the normal approximation p-values and the simulated mid-p-values, together with the percentage of times the SPA method is closer to the simulated method, "P.SPA". The average relative absolute error for SPA, "R.E.SPA", and the average relative absolute error for the normal approximation method, "R.E.NA", are also calculated and presented in Tables 1, 2 and 3.
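The simulated mid-p-value and the relative absolute error can be sketched as follows. This is an illustrative sketch under assumed names: `c` holds the coefficients of the indicators in one block, and drawing `m` of them without replacement is used as the random allocation within that block. The tolerance in the equality check is an implementation choice for floating-point sums, not something specified in the paper.

```python
import math
import random


def simulated_mid_p(blocks, h0, n_perm=10**6, seed=0):
    """Monte Carlo mid-p-value [#(H > h0) + 0.5 * #(H = h0)] / n_perm.

    blocks: list of (c, m) pairs; m coefficients are drawn without
    replacement from c in each block to mimic the random allocation rule.
    """
    rng = random.Random(seed)
    greater = 0
    equal = 0
    for _ in range(n_perm):
        h = 0.0
        for c, m in blocks:
            h += sum(rng.sample(c, m))
        if math.isclose(h, h0, rel_tol=1e-9, abs_tol=1e-12):
            equal += 1
        elif h > h0:
            greater += 1
    return (greater + 0.5 * equal) / n_perm


def relative_abs_error(approx, exact):
    """Relative absolute error of an approximate p-value."""
    return abs(approx - exact) / exact


# Tiny check: one block of size 2 with one treated unit, observed value at the maximum.
mid_p = simulated_mid_p([([1.0, 2.0], 1)], h0=2.0, n_perm=20000, seed=1)
```

In this toy case the statistic equals the observed value with probability 1/2 and never exceeds it, so the mid-p-value is close to 0.25, whereas the ordinary p-value would be 0.5.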
In Tables 1, 2 and 3, the saddlepoint approximation is more accurate than the normal approximation. This can be seen from R.E.SPA, which is much smaller than R.E.NA in all considered cases.
Table 1. Outcomes of the simulation study for the Lepage test. Extreme value distribution.

Real data examples
To support the aim of this paper, three real data sets are analyzed. The first data set is from Rosenberger and Lachin 21, who analyzed the cholesterol rate of 50 patients; the results can be found in Table 7.4 of that reference. The 50 patients were assigned randomly to the control and treatment groups by generating the vector of group indicators $Z_{ij}$, such that 25 were assigned to control and 25 to treatment, with each block containing 5 patients from each group, i.e. $(N = 50, n_i = 10, k = 5, m = 5)$.
The second data set is from a survey of household expenditure for 20 single men (the treatment group) and 20 single women (the control group). For this data set, $(N = 40, n_i = 8, k = 5, m = 4)$. The second data set is presented in Büning and Thadewald 10. The third data set was presented on page 39 of Hand et al. 52. It consists of 40 measurements of cholesterol levels for 40 men who were divided into two groups, A and B, according to two types of behavior. Type A behavior (the treatment group) is characterized by urgency and aggression, while type B behavior (the control group) is relaxed. For this data set, $(N = 40, n_i = 8, k = 5, m = 4)$.

It remains for us to explain the reason for considering the saddlepoint approximation method as an alternative to the simulation method. The reason is that the saddlepoint approximation method requires much less computing time than the simulation method. To clarify this, the computing time for the different methods is calculated and summarized in Tables 5, 6 and 7.
From the results of the simulation study, we can see that the SPA method is closer to the simulated exact p-value than the normal approximation method. Moreover, Tables 5, 6 and 7 show that the SPA method is faster than the simulation method, which needs a lot of time to approximate the exact p-value.

Confidence intervals for location and scale parameters
The estimated confidence intervals for the location parameter $\mu_2$ and the scale parameter $\sigma_2$ are the sets of all values $\mu_{2o}$ and $\sigma_{2o}$ of the parameters $\mu_2$ and $\sigma_2$, respectively, which, if formulated in the claim $H_o: \mu_2 = \mu_{2o}$ and $\sigma_2 = \sigma_{2o}$, would not be rejected at significance level $\alpha$. Accordingly, if $p(\mu_{2o}, \sigma_{2o})$ is the one-sided p-value for the location-scale test, then $(1 - \alpha)100\%$ confidence intervals for $\mu_2$ and $\sigma_2$ can be constructed by inverting the test, see 34. Let $D_o(\mu_{2o}, \sigma_{2o})$ be the observed test statistic with location parameter $\mu_{2o}$ and scale parameter $\sigma_{2o}$. Over a sufficiently fine grid of $\mu_{2o}$ and $\sigma_{2o}$ values, the cutoff $D_o(\cdot, \cdot)$ is a step function in $\mu_{2o}$ and $\sigma_{2o}$ that increases incrementally as $\mu_{2o}$ and $\sigma_{2o}$ increase. Here, the third real data set is used to illustrate the procedure for creating confidence intervals for the location and scale parameters.

From Table 8, we can see that the estimated 99% confidence interval using the SPA method is closer to the simulated (exact) confidence interval than the corresponding interval from the normal approximation method. For the 95% confidence interval, both methods have the same accuracy as the simulated method.
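The test-inversion idea can be sketched for a single parameter as below. This is a simplified illustration under stated assumptions: `p_value` stands for any of the three p-value methods (simulated, SPA or normal) evaluated at a hypothesized parameter value, and `toy_p_value` is a purely hypothetical stand-in with made-up constants so that the sketch runs on its own. It is not the paper's test and does not use the paper's data.

```python
from statistics import NormalDist


def invert_test(p_value, grid, alpha=0.05):
    """Return (lower, upper) over the grid values that are not rejected at level alpha."""
    kept = [v for v in grid if p_value(v) > alpha]
    return (min(kept), max(kept)) if kept else None


def toy_p_value(mu0, estimate=10.0, se=2.0):
    """Hypothetical two-sided p-value used only to exercise invert_test."""
    z = abs(mu0 - estimate) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


grid = [0.5 * i for i in range(41)]            # candidate values 0.0, 0.5, ..., 20.0
interval_95 = invert_test(toy_p_value, grid, alpha=0.05)
interval_99 = invert_test(toy_p_value, grid, alpha=0.01)
```

A finer grid sharpens the endpoints at the cost of more p-value evaluations, which is where a fast approximation is most useful. The same loop applies to the scale parameter, or to a two-dimensional grid of location and scale values.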

Conclusion
In this article, various nonparametric tests for the location and scale problem have been discussed and rewritten as a common linear rank class. The exact p-value of the considered class is approximated by the SPA method and by the normal approximation method. According to our results in the simulation study and the three real data sets, SPA performs well and approximates the exact p-value more accurately than the normal method. The approach can be applied to different designs, such as the random allocation design, Wei's urn design, the complete design and the truncated binomial design. The proposed study can also be extended to neutrosophic statistics, see Afzal et al. 17, Albassam et al. 18 and Sherwani et al. 19.

2 i $(n_i + 1)$. Also, $R^S_{ij}$ is the score of the Ansari-Bradley scale test that takes the

Table 4 presents the p-values of the LP1 test, the LP3 test and the Rublik test using the simulated, SPA and normal approximation methods. From Table 4, we can see that the SPA p-value is closer to the exact p-value than the normal p-value, which gives more evidence that the SPA method is more accurate than the normal method in approximating the exact p-value.

Table 2. Outcomes of the simulation study for the LP3 test.

Table 3. Outcomes of the simulation study for the Rublik test.

Table 4. P-values for simulated, SPA and normal approximations for the three data sets.

Table 5. The time consumed for calculating the SPA p-values, normal approximation p-values and simulated mid-p-values for the Lepage test.

Table 6. The time consumed for calculating the SPA p-values, normal approximation p-values and simulated mid-p-values for the Rublik test.
We use the "gofTest" R package to estimate the location $\mu_2$ and scale $\sigma_2$ parameters for the third real data set and to test the suitability of the extreme value distribution for this data set. The maximum likelihood estimates of the location and scale parameters are 227.9 and 31.17, respectively, and the p-value of the goodness-of-fit test is 0.887. We evaluate $D_o(\mu_{2o}, \sigma_{2o})$ over a large range of possible values of $\mu_{2o}$ and $\sigma_{2o}$; then, for each value of $D_o(\mu_{2o}, \sigma_{2o})$, the corresponding exact, saddlepoint and normal p-values are calculated. Table 8 includes the exact, saddlepoint and normal confidence intervals for the LP1 test.

Table 7. The time consumed for calculating the SPA p-values, normal approximation p-values and simulated mid-p-values for the LP3 test.

Table 8. 99% and 95% confidence intervals for the location and scale parameters of the third data set for the LP1 test.