Benchmarking the performance of water companies for regulatory purposes to improve its sustainability

The regulation of water companies (WCs) is essential to protect the interests of citizens, as companies are natural monopolies. Consequently, several methodological approaches are applied by water regulators to benchmark the performance of WCs, with data envelopment analysis (DEA) being one of the most widely applied. However, traditional DEA models allocate different weights to variables considered in efficiency assessments of each WC. By contrast, this study proposes and applies a common set of weights (CSW) DEA model to a sample of Chilean WCs. The results showed that the DEA-CSW approach had greater discriminatory capacity compared to traditional DEA techniques. Moreover, weights allocated to input and output variables involved in efficiency assessment diverged among WCs when the traditional DEA model was employed. By contrast, the DEA-CSW approach generated a reliable ranking of WCs based on their efficiency scores, thus facilitating the regulatory decision-making process.


INTRODUCTION
Access to clean drinking water and sanitation are essential human rights recognized by the United Nations since 2010 1 . Moreover, ensuring the availability, sustainability, and sanitation of water for all has been defined as one of the Sustainable Development Goals for water management 2 . Regulation of water companies is extremely relevant to protect customers because urban water and sanitation services are natural monopolies 3 . Thus, set water tariffs and improved quality of service contribute to fair and open competition between water companies, providing incentives to improve water and sanitation services 4,5 . To meet these functional requirements, most water regulators compare the performance of water companies (WCs) by employing benchmarking methods 6 . Thus, one of the main challenges that water regulators face is selecting the most adequate benchmarking technique to evaluate and compare the efficiency of WCs 7 .
The published literature includes a wealth of efficiency studies on the water and sanitation sector using frontier models, such as parametric techniques (stochastic frontier analysis, SFA), non-parametric techniques (data envelopment analysis, DEA) 8-10 or a combination of both, such as stochastic non-parametric envelopment of data (StoNED) 11,12 . There is ongoing debate on the suitability of these methods for regulatory purposes 13,14 . Both SFA and DEA techniques have various advantages and disadvantages. To conduct a robust evaluation of WCs' performance, Lannier and Porcher 15 combined both DEA and SFA methods in a three-stage approach, which allowed them to consider environmental effects and statistical noise in efficiency estimations. Nevertheless, the literature review conducted by Berg and Marques 16 revealed that most past research assessing the efficiency of water companies applied the DEA method.
The main concept of the non-parametric DEA technique is to select a set of input and output weightings for each unit analysed (WC in our study), which maximise efficiency scores, while keeping the efficiency of all units below one 17 . In other words, traditional DEA models allow the potential allocation of different weights for each variable considered to assess the efficiency of each analysed WC. In the framework of WC regulation, this approach has two notable shortcomings. First, flexible weightings allow the performance of WCs to be assessed using their own most favourable weights, which maximise efficiency scores. From a regulatory perspective, this means that several WCs can be identified as efficient (efficiency scores equal to one); consequently, this set of efficient WCs cannot be further discriminated and ranked 18 . Second, because different sets of weights are used to compute efficiency, evaluation results, and thus the ranking, of WCs are sometimes not acceptable to others. For instance, every WC believes that the other WCs are taking this advantage to gain superiority 19 . To our knowledge, these limitations have not been considered in previous studies evaluating the efficiency of WCs.
To overcome the limitation of different weight allocation in DEA, two main methodological approaches have been proposed, namely the cross-efficiency evaluation method and the common-weight evaluation method. The first method was proposed by Sexton et al. 20 , and is based on replacing the self-evaluation mode in DEA by the peer-evaluation mode. In other words, each unit receives one efficiency score using its most favourable weighting and n-1 peer-evaluated efficiencies using the most favoured weights of the other units 21 . However, Kao and Hung 22 showed that it is possible that none of the units receives an efficiency score equal to one when applying this method, so that no unit is identified as efficient. Moreover, the non-Pareto optimality of the cross-efficiency scores reduces the effectiveness of this method. In contrast, when using the common set of weights (CSW) approach, the same weighting is allocated to the variables (inputs and outputs) for all units assessed 23 . Several approaches have been proposed to incorporate CSW, including central values for all the weights, maximizing the average efficiency of all units, and maximizing the number of efficient units 24 .
For the DEA-CSW approach, Wu et al. 25 proposed allocating common weights for efficiency assessment. This approach has clear advantages for benchmarking the performance of WCs and, thus, improving the regulation of these utilities. In this approach, common weights in DEA are determined based on the concept of degree of satisfaction of units. This concept characterises the degree to which each unit is satisfied with the selected CSW. The allocated CSW should not result in units having satisfaction degrees that differ greatly from each other. Moreover, the DEA-CSW model proposed by Wu et al. 25 incorporated Pareto-optimal solutions. These solutions ensure that the final generated CSW is a unique solution. In other words, it allows the full discrimination of all units, and ranks them in a unique order, which is not possible in traditional DEA models. These characteristics of the DEA-CSW model mean that the evaluation results are considered more acceptable by the assessed units (WCs). Consequently, the policies adopted to improve the regulation of WCs are expected to be better received by them. Nevertheless, the DEA-CSW approach, like all DEA models, has the drawback of being sensitive to outliers and not allowing statistical inference. Despite the clear advantages of DEA-CSW methods over DEA models with flexible weights for estimating the efficiency scores used to regulate WCs, to the best of our knowledge, this methodological approach has never been applied to benchmark the performance of WCs.
Thus, the current study aimed to illustrate the usefulness of the DEA-CSW approach to benchmark the performance of WCs. To accomplish this, efficiency scores for a sample of 23 WCs in Chile were computed using a traditional DEA model (Charnes, Cooper and Rhodes DEA model -DEA-CCR-) and the DEA-CSW model. This approach allowed us to compare the ranking of the analysed WCs for the two approaches, and further evaluate the impact of common weighting allocation on the efficiency of WCs. Moreover, degree of satisfaction of assessed WCs was computed to elucidate the acceptability of efficiency results by WCs.
The remaining parts of this paper are structured as follows: section 2 presents and discusses the results and section 3 details the proposed methodology and the case study.

RESULTS AND DISCUSSION
Efficiency scores allocating flexible and common weights
The methodological approach applied here allowed us to rank WCs based on their efficiency scores estimated from CSW. Table 1 shows the evaluation and ranking results of the DEA-CCR model (Model 1) and DEA-CSW model (Model 6). The DEA-CCR approach failed to discriminate WCs effectively when ranking them. For instance, under this approach, five out of 23 (22%) WCs were deemed efficient, occupying the top ranking, with no possibility of further distinguishing them. In contrast, discriminatory power significantly improved when efficiency scores were computed using the DEA-CSW model. Under this approach, only one WC (WC8) was identified as efficient. Hence, this WC was considered as the benchmark, because it presented the best performance out of all evaluated WCs. It is a small WC that provides drinking and sewerage services to 15,571 customers, representing around 0.3% of national customers 26 . Moreover, this WC met all quality requirements for drinking water and treated wastewater. In other words, its indices for QDW and QWW scored one, and the generation of outputs was not penalised.
Under the DEA-CCR model, the average efficiency score was 0.747; however, this value decreased to 0.584 when efficiency scores were estimated using DEA-CSW. Thus, on average, WCs could have produced 25.3% (DEA-CCR) and 41.6% (DEA-CSW) more outputs with the same CAPEX and OPEX. Table 1 also shows that, when CSW is imposed, performance diverged considerably among WCs, as the minimum efficiency score was 0.198 (WC19). In contrast, when efficiency scores were computed with the DEA-CCR model, WC performance diverged less, because input and output weights were allocated to maximise the efficiency score of each WC. Thus, the minimum efficiency score under this approach was 0.323 (WC23).
Table 2 presents the weights allocated to inputs and outputs under DEA-CSW and DEA-CCR approaches. It shows that, when common weights were allocated, the most relevant variable for efficiency assessment was the quality-adjusted volume of drinking water supplied (weighted as 0.431). In contrast, the weight for the variable quality-adjusted number of customers with wastewater treatment was limited to 0.085. Of note, the total weight allocated to the input and output variables was 0.484 (0.187 + 0.297) and 0.516 (0.431 + 0.085), respectively. Thus, under the DEA-CSW approach, both inputs and outputs had similar relevance for computing efficiency scores. In contrast, when efficiency scores were computed using the DEA-CCR model, each WC used its most favourable weights. Under this flexibility, Table 2 shows that some WCs excluded from the efficiency calculation the variables on which they did not perform well. Thus, 14 out of 23 WCs analysed (61%) excluded the variable quality-adjusted volume of drinking water from the efficiency computation. In contrast, none of the WCs evaluated excluded the OPEX variable from the efficiency estimation. Yet, the weighting allocated to this input ranged between 0.079 and 0.500, demonstrating that, when WCs can choose the weighting of variables to maximise their efficiency score, the weight of the same variable varies widely. In fact, only three out of the 23 WCs (13%) included the four variables considered in this study to compute efficiency scores (i.e., the weights for the four variables differed from zero). In contrast, four out of the 23 WCs (17%) only considered two variables (one input and one output) to evaluate efficiency. These differences in the weights allocated to input and output variables make the efficiency scores obtained from the DEA-CCR model inappropriate for benchmarking purposes 24 .
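Counts such as those reported above (how many WCs assign a zero weight to a variable, and how many use all four variables) can be derived mechanically from a table of DEA-CCR weights. The sketch below is illustrative only: the weight rows are hypothetical values rather than those of Table 2, and the names `count_excluded` and `count_full_users` are introduced here as assumptions.

```python
from typing import Dict, List

# Variable labels used for this illustration (two inputs, two outputs).
VARIABLES = ["OPEX", "CAPEX", "y1", "y2"]


def count_excluded(weights: List[Dict[str, float]], tol: float = 1e-9) -> Dict[str, int]:
    """For each variable, count the units that gave it a (numerically) zero weight."""
    return {v: sum(1 for w in weights if abs(w[v]) <= tol) for v in VARIABLES}


def count_full_users(weights: List[Dict[str, float]], tol: float = 1e-9) -> int:
    """Count the units whose weights are non-zero for every variable."""
    return sum(1 for w in weights if all(abs(w[v]) > tol for v in VARIABLES))


# Hypothetical DEA-CCR weights for three units (NOT taken from Table 2).
hypothetical_weights = [
    {"OPEX": 0.500, "CAPEX": 0.000, "y1": 0.000, "y2": 0.500},
    {"OPEX": 0.079, "CAPEX": 0.421, "y1": 0.300, "y2": 0.200},
    {"OPEX": 0.250, "CAPEX": 0.250, "y1": 0.000, "y2": 0.500},
]

excluded = count_excluded(hypothetical_weights)
full_users = count_full_users(hypothetical_weights)
```

In this toy table, two units drop the drinking-water output `y1`, none drops OPEX, and only one unit uses all four variables, which mirrors the kind of pattern described for the DEA-CCR weights.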
Satisfaction degree of water companies
A basic criterion for allocating weights under the DEA-CSW approach is the degree of satisfaction (termed satisfaction degree) of each WC. This parameter ranged between 0.365 and 1.000 for the 23 WCs (Fig. 1). Satisfaction degree measures the satisfaction of WCs with the common weights allocated for efficiency assessments. WC8 had the highest satisfaction degree (1.000), and was the best performing WC (efficiency score: 1.0) under the DEA-CSW approach. In contrast, WC19 had the lowest satisfaction degree (0.365), and was the worst performing WC (efficiency score: 0.198). Thus, differences in satisfaction degree among WCs were explained by differences between the CSW allocated and the most favourable weighting of each WC (i.e., the weights that maximize its efficiency score; see Table 2). WC19 (with the lowest efficiency score) calculated its efficiency score under the DEA-CCR approach without considering the variable of quality-adjusted volume of drinking water (the weight allocated to this variable was zero). However, when efficiency scores were computed under the DEA-CSW model, this variable had the greatest weight. Unlike WC19, very similar weights were allocated to WC8 (with the highest satisfaction degree) under both the DEA-CCR and DEA-CSW models.
Estimation of upper and lower efficiency scores
Figure 2 shows the upper efficiency goal ($E_d^{\max}$), lower efficiency goal ($E_d^{\min}$), and common-weight efficiency ($E_d^{\mathrm{common}}$) for each evaluated WC. For 13 out of the 23 WCs (56.5%), the difference between the DEA-CSW efficiency score and upper efficiency score (DEA-CCR) was below 0.1. Thus, these WCs had high satisfaction degree values, with the efficiency results used for benchmarking performance being acceptable for most analysed WCs. In contrast, WC10, WC14, and WC1 exhibited large differences (>0.5) between the DEA-CSW efficiency and upper efficiency scores. Thus, these WCs were the most negatively "affected" when common weights were selected to evaluate WC efficiency. For example, if efficiency scores were computed using the DEA-CCR model, WC15 would have been identified as efficient (upper efficiency goal of one), and it would have occupied the top WC ranking. In contrast, under the DEA-CSW approach, its efficiency score was 0.489, and it was ranked 16th for performance out of all WCs.
The results from this study show the main advantages of the DEA-CSW approach for evaluating the performance of WCs for regulatory purposes and, therefore, its great potential to benchmark them. It is shown that: (i) DEA-CSW has better discriminatory power than DEA-CCR, identifying a single WC as the best and, therefore, as the reference for the other WCs assessed; (ii) DEA-CSW integrates all variables in the efficiency assessment, which is essential to evaluate the performance of WCs from a holistic perspective. By contrast, when efficiency scores were computed using the DEA-CCR method, some variables that did not have good performance were excluded from the assessment since their weights were 0.0. This means that the performance assessment using the DEA-CCR model is partial and biased because it only embraces the variables whose performance is good; and (iii) the estimation of the upper efficiency goal and lower efficiency goal for each WC shows the impact of using CSW for performance assessment and its effect on ranking WCs.
The results of this empirical study demonstrate that WC rankings clearly differed when efficiency scores were computed using DEA-CCR (different weightings for each WC) versus DEA-CSW models (common weightings for all WCs). Our study confirms the importance of water regulators being transparent when applying benchmarking methods to regulate WCs. The Lisbon Charter 4 states that water regulators should provide reliable, concise, and credible information that can be easily interpreted by all parties. Because efficiency scores estimated using the DEA-CSW model allocate common weights for all units (WCs in this study), the results are more likely to be accepted by them. Consequently, the use of this method in regulation is less likely to be challenged by WCs. Moreover, the regulator can classify WCs into optimal, moderate and poor performers based on common criteria, which is considered an objective approach.
A relevant step in applying the DEA-CSW method to evaluate the efficiency of WCs is the definition of the weights for each variable. In doing so, the regulator might adopt different approaches depending on the maturity of the water industry as well as regulatory policies and the legal framework. One option is to define weights according to the methodology described in this study. It is a robust approach based on a well-defined mathematical procedure. An alternative approach is to define weights based on the priorities and strategic planning defined by the water regulator. For example, in a context of extreme drought, the water regulator might be interested in prioritizing the variables related to water saving, such as water losses, by allocating them larger weights than other quality of service variables. This is an example of the flexibility of the DEA-CSW method for benchmarking the performance of WCs according to the dynamic priorities of the water regulators and decision makers.

Policy recommendations
Because WCs are natural monopolies, they must be regulated to protect the interests of users. Thus, most water regulators benchmark the performance of WCs by employing frontier models, such as DEA. However, traditional DEA models allocate different weights for inputs and outputs for each analysed WC in efficiency assessments. Consequently, benchmarking results are questioned by participating WCs, and several WCs are identified as efficient, which prevents further discrimination and ranking. To overcome these limitations, here, we applied a DEA-CSW method. This method is based on the satisfaction degree concept, allocating common weights to all evaluated units.
The case study showed that the DEA-CSW model had greater discriminatory capacity than the DEA-CCR model. Based on the CSW approach, only one WC was identified as efficient; however, when WCs chose their most favourable weighting, five WCs were identified as efficient. The latter result makes it difficult to regulate WCs, as efficiency scores do not allow WCs with the best performance to be clearly identified. Moreover, the ranking of WCs based on the two methodological approaches (DEA-CCR and DEA-CSW) clearly differed, confirming the importance of using CSW for performance assessment for regulatory purposes. This difference was also evident when the weights allocated to inputs and outputs were compared between the DEA-CSW and DEA-CCR models. The high flexibility of the DEA-CCR approach meant that some WCs were assigned a weight equal to zero for some variables used to evaluate efficiency scores. Consequently, these variables were not considered in the efficiency assessment, so the performance of WCs was compared using different criteria. As a result, efficiency scores might be biased, leading to sub-optimal regulatory decisions. From a policy and regulatory perspective, the current study demonstrated the importance of selecting DEA-CSW models to evaluate the efficiency of WCs, particularly when results are used for regulatory purposes. Efficiency results must be acceptable to WCs to encourage trust, and to validate the regulatory process of the water and sanitation industry. Thus, the use of CSW is recommended to increase the transparency of performance assessments and, thus, the acceptability of the efficiency results, improving the regulatory process.
The empirical application conducted in this study focused on assessing the efficiency of a sample of WCs. It is a static evaluation which does not account for changes in the performance of WCs over time. Future studies on this topic might evaluate the productivity change of WCs, i.e., assess the dynamic efficiency of WCs using the DEA-CSW approach. Moreover, given the number of WCs evaluated in this study, the inputs and outputs considered to assess efficiency were limited to four (two inputs and two outputs). However, with a larger sample of WCs, the number of variables integrated in the assessment might also be higher. Additional quality of service and environmental variables such as non-revenue water, unplanned water supply interruptions or greenhouse gas emissions could be integrated in the DEA-CSW model as undesirable outputs. Dynamic efficiency results based on the suggested assessment would be very useful for water regulators to set water tariffs.

Efficiency estimation
To compute the efficiency scores of WCs based on the DEA-CSW approach, the methodology proposed by Wu et al. 25 was employed. It was assumed that there are $n$ units ($j = 1, \ldots, d, \ldots, n$), with $WC = \{d \mid d \text{ is a water company}\}$, and that each WC uses $m$ inputs ($i = 1, \ldots, m$) to produce $s$ outputs ($r = 1, \ldots, s$). To evaluate the efficiency of $WC_d$, the basic DEA-CCR model proposed by Charnes et al. 17 was used (Model 1), where $\mu_{rd}$ is the weight of output $r$ for $WC_d$ (the observation evaluated) and $\omega_{id}$ is the weight of input $i$ for the water company evaluated ($WC_d$). Model (1) is an output-oriented DEA model because, within a regulatory framework, the objective of WCs is to improve the quality of their services (outputs) while keeping economic costs (inputs) constant. Model (1) selects the set of input and output weights that maximize the efficiency of $WC_d$. In other words, the efficiency score for water company $d$ in the DEA-CCR model ($E_d$) is the best that $WC_d$ can obtain. $WC_d$ is efficient if $E_d = 1$ and is not efficient (i.e., has room for improvement) if $E_d < 1$. Based on Model (1), WCs cannot be evaluated and ranked on the same basis, because different weights are allocated to inputs and outputs in the efficiency assessments.
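For orientation, the CCR model is commonly written in the ratio form sketched below. This is the textbook formulation and only an assumption about the exact form of Model (1); the symbols $y_{rj}$ and $x_{ij}$ (the $r$-th output and $i$-th input of unit $j$) are notation introduced for this sketch.

```latex
\begin{aligned}
E_d = \max \quad & \frac{\sum_{r=1}^{s} \mu_{rd}\, y_{rd}}{\sum_{i=1}^{m} \omega_{id}\, x_{id}} \\
\text{s.t.} \quad & \frac{\sum_{r=1}^{s} \mu_{rd}\, y_{rj}}{\sum_{i=1}^{m} \omega_{id}\, x_{ij}} \le 1, \qquad j = 1, \ldots, n, \\
& \mu_{rd} \ge 0, \quad \omega_{id} \ge 0, \qquad r = 1, \ldots, s; \; i = 1, \ldots, m.
\end{aligned}
```

Written this way, the weights are indexed by $d$ because each unit solves its own problem, which is precisely the flexibility that the common-weight approach removes.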
Unlike the traditional DEA-CCR approach shown in Model (1), in the common-weight DEA approach, a CSW is used to calculate the efficiency of units (WCs) 22,24 . Under this approach, each unit can have its own upper and lower efficiency target 27 . Model (1) shows that the CCR efficiency of $WC_j$ is achieved when its most favourable weights are allocated; thus, under the CSW approach, the upper efficiency target for a unit is its CCR efficiency score (i.e., $E_j^{\max} = E_j$). In contrast, the minimum efficiency score of each unit is 0. However, this value is only generated if the output weights of the CSW are all equal to 0, which would not be acceptable for any unit. Based on Wu et al. 25 , the lower efficiency goal ($E_j^{\min}$) of each unit is calculated with Eq. (2), where $(\mu_{rd}^{*}, \omega_{id}^{*}, \forall i, r)$ is (are) the most favourable set(s) of weights of unit $d$ generated from Model (1). Equation (2) shows that the lower efficiency of a unit is obtained when it is forced to use a set of weights that is most favourable for another unit.
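A minimal sketch of how the lower efficiency goal could be computed, assuming that Eq. (2) takes, for each unit $j$, the smallest efficiency ratio obtained when $j$ is evaluated with the most favourable weights of every other unit. The data are hypothetical (three units, one input fixed at 1, two outputs), the weights are one optimal solution of the ratio-form CCR model for these toy data, and the function names are introduced here for illustration.

```python
from typing import List, Sequence


def efficiency(out_w: Sequence[float], in_w: Sequence[float],
               outputs: Sequence[float], inputs: Sequence[float]) -> float:
    """Efficiency ratio: weighted sum of outputs over weighted sum of inputs."""
    weighted_out = sum(u * y for u, y in zip(out_w, outputs))
    weighted_in = sum(w * x for w, x in zip(in_w, inputs))
    return weighted_out / weighted_in


def lower_efficiency_goals(X: List[Sequence[float]], Y: List[Sequence[float]],
                           in_weights: List[Sequence[float]],
                           out_weights: List[Sequence[float]]) -> List[float]:
    """For each unit j, the minimum efficiency under the weights of every other unit d."""
    n = len(X)
    goals = []
    for j in range(n):
        peer_scores = [
            efficiency(out_weights[d], in_weights[d], Y[j], X[j])
            for d in range(n) if d != j
        ]
        goals.append(min(peer_scores))
    return goals


# Hypothetical data: units A, B, C with a single input equal to 1 and two outputs.
X = [[1.0], [1.0], [1.0]]
Y = [[4.0, 1.0], [1.0, 4.0], [2.0, 2.0]]
# Most favourable weights of each unit (assumed already obtained from Model (1)).
in_weights = [[1.0], [1.0], [1.0]]
out_weights = [[0.25, 0.0], [0.0, 0.25], [0.2, 0.2]]

e_min = lower_efficiency_goals(X, Y, in_weights, out_weights)
```

In this toy case, units A and B, which each ignore one output in their own evaluation, fall to 0.25 when assessed with each other's weights, while the balanced unit C only falls to 0.5.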
Considering the upper and lower efficiency goals of WCs, the DEA-CSW model is defined as Model (3). Model (3), which is non-linear, shows that when a CSW is chosen to evaluate the efficiency of WCs, all efficiency scores are guaranteed to lie between their upper and lower efficiency goals. Thus, in the DEA-CSW model, different sets of common weights can be selected for the efficiency assessments of WCs. When efficiency scores are used to benchmark and rank WCs, a fundamental criterion for selecting the CSW is that it must be satisfactory to all WCs. Thus, Wu et al. 25 proposed the concept of "satisfaction degree" of a unit $d$ for a weighting profile, which is measured as the distance from the proposed efficiency ratio to the efficiency ratio determined using the CSW. Based on Wu et al. 25 , each $WC_d$ was assumed to be rational in its selection of common weights, preferring common weights that achieve its upper efficiency goal, $E_d^{\max}$. However, it cannot accept a set of common weights that makes its efficiency less than or equal to its lower efficiency goal, $E_d^{\min}$. Based on this criterion, the satisfaction degree of $WC_d$ is defined for the selected set of common weights.
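To make the idea concrete, the sketch below assumes that the satisfaction degree of a unit is the position of its common-weight efficiency between its lower and upper efficiency goals, i.e. $(E_d^{\mathrm{common}} - E_d^{\min}) / (E_d^{\max} - E_d^{\min})$. This normalised form is an assumption consistent with the description above; the function name, the handling of the degenerate case and the numbers are all hypothetical.

```python
def satisfaction_degree(e_common: float, e_max: float, e_min: float) -> float:
    """Relative position of the common-weight efficiency between the two goals.

    Returns 1.0 when the unit reaches its upper goal and 0.0 when it only
    reaches its lower goal (assumed normalised form, see the lead-in).
    """
    if not e_min <= e_common <= e_max:
        raise ValueError("e_common must lie between e_min and e_max")
    if e_max == e_min:
        # Assumption: a unit whose goals coincide is treated as fully satisfied.
        return 1.0
    return (e_common - e_min) / (e_max - e_min)


# Hypothetical unit: upper goal 1.0, lower goal 0.5, common-weight efficiency 0.8.
sd_example = satisfaction_degree(0.8, 1.0, 0.5)
# A unit that keeps its CCR score under the common weights is fully satisfied.
sd_best = satisfaction_degree(1.0, 1.0, 0.25)
```

Under this assumed definition, a unit whose common-weight score equals its DEA-CCR score has a satisfaction degree of one, which matches the behaviour reported for the best-performing WC.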
Model (5) maximises the satisfaction degrees of all WCs, and Model (6) allows us to generate the set of common weights for the evaluated WCs. Because Model (6) is non-linear, it cannot be solved directly. To overcome this limitation, Wu et al. 25 proposed two algorithms that allow the CSW to be estimated. These algorithms are shown in the supplementary material.

Case study
The current study focused on the major WCs in Chile. In particular, efficiency was assessed for a sample of 23 WCs that provide both drinking water and sewerage services to around 95% of the urban population in Chile (13.5 million people). They are distributed across the 16 Chilean regions and, therefore, our study covers the whole country from a territorial perspective. These 23 WCs included ten concessionary water companies, eleven private water companies, and one public water company. The water and sewerage industry in Chile was almost entirely privatised between 1998 and 2004 26 . Nevertheless, all WCs are regulated using the same model, which is based on the efficient water company. Under this regulatory model, the "real" costs of each WC are compared with those of a virtual, efficient WC, defined by the urban water regulator (Superintendencia de Servicios Sanitarios, SISS), and considered to be the benchmark 28 . Moreover, the SISS also monitors the quality of the services provided by the WCs and establishes penalties when quality standards are not met. Customers can file complaints with both the WCs and the water regulator, which have to respond appropriately. This mechanism also contributes to monitoring the quality of the service provided by the Chilean WCs.
Moreover, a basic prerequisite for applying DEA is that the selected input and output variables should have an isotonic relationship, which can be validated using correlation analysis. If the correlation between input and output variables is positive, this means the variables maintain an isotonic relationship and are appropriate to use in the DEA model. If the correlation is negative, other variables need to be selected 29,30 .
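The isotonicity check described above can be sketched with a plain Pearson correlation between every input and every output. The figures below are hypothetical and are not the Chilean data; the helper names `pearson` and `is_isotonic` are introduced here for illustration, and requiring a strictly positive coefficient for every pair is one simple reading of the criterion.

```python
import math
from typing import Dict, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_a * var_b)


def is_isotonic(inputs: Dict[str, Sequence[float]],
                outputs: Dict[str, Sequence[float]]) -> bool:
    """True if every input is positively correlated with every output."""
    return all(pearson(x, y) > 0 for x in inputs.values() for y in outputs.values())


# Hypothetical observations for four units (arbitrary units of measurement).
inputs = {"OPEX": [10.0, 20.0, 35.0, 50.0], "CAPEX": [4.0, 9.0, 12.0, 30.0]}
outputs = {"y1": [8.0, 15.0, 30.0, 41.0], "y2": [3.0, 7.0, 9.0, 20.0]}

isotonic = is_isotonic(inputs, outputs)
```

If any input-output pair returned a negative coefficient, `is_isotonic` would be false and, following the rule above, a different variable would need to be selected.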
The two inputs considered were: (i) operational expenditure (OPEX), expressed as CLP per year, involving the annual costs incurred by the WC to provide both drinking water and sewerage services (on 25th February, the conversion rate was 1 US$ ≈ 704 CLP and 1 € ≈ 855 CLP); and (ii) capital expenditure (CAPEX), measured as CLP per year, integrating the funds used by WCs to acquire, upgrade, and maintain physical assets. Following Molinos-Senante et al. 31 , two quality-adjusted outputs were considered to evaluate the efficiency of WCs in Chile. These outputs were estimated based on Eqs. (7) and (8).
$y_1 = VDW \times QDW$ (7)
$y_2 = CWW \times QWW$ (8)
where $y_1$ is the quality-adjusted volume of supplied drinking water, estimated as the product of the volume of supplied drinking water (VDW) and the quality of drinking water (QDW), and $y_2$ is the quality-adjusted number of customers with access to wastewater treatment, which is the product of the number of customers with access to wastewater treatment (CWW) and the quality of the wastewater treated (QWW). Both quality indicators (QDW and QWW) are estimated by the Chilean water regulator for each WC, with values ranging from zero to one. A value of one indicates that the WC met all legal requirements regarding quality issues, and vice versa 26 . In Eqs. (7) and (8), when a WC does not meet all quality requirements (i.e., its quality indicator is lower than one), it is penalized in terms of output production. Hence, multiplying the volume of drinking water (VDW) and the number of customers with access to wastewater treatment (CWW) by the quality indicators (QDW and QWW) avoids favoring WCs that have lower costs but provide poor quality water and sanitation services.
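The quality adjustment described above amounts to scaling each raw output by its quality indicator. The sketch below illustrates this with hypothetical figures; the function name `quality_adjusted` and the input values are assumptions, not data from the case study.

```python
def quality_adjusted(quantity: float, quality: float) -> float:
    """Scale a raw output by a quality indicator between zero and one."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality indicator must lie between 0 and 1")
    return quantity * quality


# Hypothetical water company: raw outputs and regulator-style quality indicators.
vdw, qdw = 1000.0, 0.95   # volume of drinking water supplied, drinking water quality
cww, qww = 5000.0, 1.0    # customers with wastewater treatment, wastewater quality

y1 = quality_adjusted(vdw, qdw)   # quality-adjusted volume of drinking water
y2 = quality_adjusted(cww, qww)   # quality-adjusted number of customers
```

With a quality indicator of one the output is left unchanged, whereas any shortfall in quality reduces the output credited to the company in proportion.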
To test that the selected outputs and inputs fulfil the isotonic condition, the Pearson correlation test was conducted. Our empirical results (Table 3) indicate that the variables used to estimate efficiency scores are strongly and positively correlated, indicating that the selected inputs and outputs can be used in the DEA model. Moreover, we applied the methodology proposed by Lewis et al. 31 (Eq. 9) to detect outliers, because their presence distorts the efficiency results of the WCs. None of the 23 WCs evaluated was identified as an outlier. Table 4 provides an overview of the statistical data employed to compute the efficiency scores of the WCs evaluated in Chile.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author, [MMS], upon reasonable request.