Energy plays a crucial role in various aspects of managing water resources in urban settings, including abstraction, treatment, and distribution1. In particular, the production of drinking water is an energy intensive activity because raw water needs to be cleaned by removing high levels of pollutants2. The Sustainable Development Goals established by the United Nations, specifically Goals 6 and 7, impose an obligation on governments to ensure universal access to clean water and enhance energy efficiency3. Consequently, it is crucial for water resources to meet rigorous quality standards before being deemed suitable for drinking purposes. Simultaneously, the challenges posed by climate change and population growth necessitate the sustainable and efficient utilization of water resources4,5. Moreover, drinking water needs to be provided to people at an affordable price6.

To address the aforementioned challenges, it is crucial to gain a deeper understanding of the energy efficiency of the water treatment process, including the factors that influence energy performance and the potential for energy savings. By examining the energy efficiency of the water treatment process and exploring opportunities for energy conservation, we can work towards achieving sustainable and efficient use of energy in the water sector. Previous studies, such as those conducted by Loubet et al.7, Chini et al.8, and Lam et al.9, have examined the relationship between energy intensity and the water treatment process. These studies have highlighted the growing energy demands in water treatment resulting from factors like climate change, population growth, and urbanization. However, it is worth noting that these studies did not specifically explore the link between energy efficiency and the water treatment process.

Energy intensity and energy efficiency are different concepts10. Energy intensity is the energy consumed [kWh] per unit volume [m3] of drinking water produced; it therefore does not consider how the quality of raw water and drinking water affects the energy consumed by drinking water treatment plants11,12. Energy efficiency, on the other hand, is a synthetic indicator which, in addition to the volume of drinking water produced, integrates the quantity of pollutants removed from raw water13.

Only a limited number of previous studies have specifically focused on evaluating the energy efficiency of drinking water treatment plants (DWTPs). These studies, conducted by Molinos-Senante and Guzman14, Molinos-Senante and Sala-Garrido10,15, Ananda16, Sala-Garrido and Molinos-Senante17, and Maziotis et al.13, utilized the data envelopment analysis (DEA) method. DEA is a non-parametric technique that employs linear programming to measure energy efficiency, enabling the integration of multiple inputs and outputs for each DWTP18. One advantage of DEA is that it does not require a priori definition of the functional form of the production frontier, which represents the relationship between inputs and outputs19. While DEA has its positive characteristics in evaluating energy efficiency of DWTPs, it is important to acknowledge its limitations. One such limitation is its deterministic nature, which makes it sensitive to outliers in the data20. Consequently, the presence of extreme values can significantly impact the efficiency scores derived from DEA analysis.

The DEA method implicitly needs to determine the boundary of the underlying technology, which constitutes the reference benchmark21. Estimating this boundary allows the inefficiency score of each unit (a DWTP in our case study) to be calculated as the deviation of each activity or production plan from the frontier of the production possibility set. For this reason, by definition, the DEA approach suffers from an overfitting problem21,22,23. Overfitting occurs when the model becomes too closely tailored to the specific dataset used for analysis, potentially resulting in less robust efficiency scores. Therefore, caution should be exercised when interpreting and relying solely on efficiency scores derived from DEA analysis.

In order to address the limitations of DEA and enhance the accuracy and robustness of efficiency scores, Esteve et al.21 introduced a novel method called Efficiency Analysis Trees (EAT). The EAT method combines machine learning and linear programming techniques to measure efficiency. Specifically, it utilizes regression (or decision) trees to estimate the value of the response variable based on thresholds of predictor variables. The EAT approach assumes free disposability to estimate a step function production frontier and calculate efficiency scores. Esteve et al.21 demonstrated from a mathematical point of view how the EAT method overcomes overfitting improving the accuracy of the efficiency results. In particular, they demonstrated that the EAT method outperforms other non-parametric techniques such as DEA because the estimated values are not overfitted, ensuring more reliable efficiency measurements24.

In light of the aforementioned context, the main objective of this study is to assess the energy efficiency of the drinking water treatment process using the newly developed method EAT. By employing the EAT approach, this study aims to quantify the optimal level of energy consumption at various thresholds of volume pollutants removed. Additionally, the EAT method allows for the estimation of potential energy savings that could be achieved with efficient water treatment practices. Moreover, in order to gain a deeper understanding of the factors influencing energy efficiency in the water treatment process, this study utilizes bootstrap regression techniques. Specifically, it examines the impact of operational characteristics such as the age of the facility and the type of treatment technology on energy performance. Through these analyses, this study seeks to provide valuable insights into improving energy efficiency in drinking water treatment operations.

Our study makes several significant contributions to the existing literature. Firstly, it stands as the pioneering research that applies machine learning and linear programming techniques to assess the energy performance of the drinking water treatment process. By employing the EAT approach, we are able to generate robust efficiency scores that are not overfitted, in contrast to other non-parametric methods like DEA. Furthermore, our study estimates the potential energy savings achievable in the drinking water treatment process. This allows us to gain insights into the optimal energy utilization required for different pollutant removal volumes. This information is invaluable for managers and decision-makers as it sheds light on the factors influencing energy intensity and aids in the decision-making process. Importantly, our innovative research was implemented and applied to multiple DWTPs in Chile, enhancing its applicability and relevance to real-world scenarios. This empirical application further strengthens the validity and reliability of our findings.

Results and discussion

Optimal energy use in drinking water treatment

By applying the EAT algorithm, we can determine the optimal level of energy use in DWTPs based on the volume of drinking water produced and the efficiency of pollutant removal. The findings from our analysis, as depicted in Fig. 1, highlight the significant impact that removing arsenic and sulfates from raw water has on the energy consumption of DWTPs. By contrast, the removal of the other pollutants considered in this study, i.e., turbidity and total dissolved solids, does not significantly explain energy use in DWTPs. This finding aligns with the conclusions of Molinos-Senante and Sala-Garrido25, who assessed how various pollutants and the volume of treated water affect the energy intensity in a range of water treatment facilities. They showed that total dissolved solids only affect energy use in DWTPs employing coagulation-flocculation and pressure filtration techniques. Similarly, the reduction of turbidity was found to influence energy consumption only in DWTPs that use pressure filtration. For facilities relying on rapid gravity filtration, however, energy use is not affected by the removal of turbidity and total dissolved solids.

For facilities producing more than 2,111,834 m^3 per year of drinking water adjusted by arsenic removal efficiency, the maximum energy use is 1,054,754 kWh per year, i.e., 0.499 kWh m^−3. For DWTPs that produce less than 2,111,834 m^3 per year of drinking water adjusted by arsenic removal and more than 428,440 m^3 per year adjusted by sulfates removal, the maximum energy use could reach 539,412 kWh per year. Hence, the optimal energy use ranges from 0.255 kWh m^−3 to 1.259 kWh m^−3 depending on whether the assessment takes into account the efficiency of arsenic or sulfates removal, respectively. Finally, for DWTPs producing less than 428,440 m^3 per year adjusted by sulfates removal efficiency, the energy use required should not exceed 136,201 kWh per year, i.e., 0.318 kWh m^−3. This illustrates the wide range of optimal energy use depending on the volume of drinking water produced and the quantity of arsenic and sulfates removed. Considering that the average volume of water produced adjusted by arsenic removal is 1,998,544 m^3 per year, the results in Fig. 1 illustrate that smaller facilities can have an energy intensity as low as (or lower than) larger ones, even when the quantity of sulfates to be removed is large (>428,440 m^3 per year).
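The thresholds reported above can be read as a step function. The following minimal sketch encodes only the three branches described in the text; the function name is hypothetical, and the treatment of values exactly equal to a threshold (assigned here to the upper branch) is an assumption, since the text only states "more than" and "less than". It also reproduces the energy intensities quoted in the paragraph.

```python
# Split points and maximum energy use as reported in the text for Fig. 1.
ARSENIC_SPLIT_M3 = 2_111_834   # m^3/year, adjusted by arsenic removal
SULFATE_SPLIT_M3 = 428_440     # m^3/year, adjusted by sulfates removal


def max_energy_kwh(arsenic_adj_m3, sulfate_adj_m3):
    """Maximum energy use (kWh/year) for a DWTP (hypothetical helper).

    Assumption: a value exactly on a threshold falls in the upper branch.
    """
    if arsenic_adj_m3 >= ARSENIC_SPLIT_M3:
        return 1_054_754
    if sulfate_adj_m3 >= SULFATE_SPLIT_M3:
        return 539_412
    return 136_201


# Energy intensities (kWh per m^3) implied by the reported values.
intensity_large = round(1_054_754 / ARSENIC_SPLIT_M3, 3)
intensity_mid_arsenic = round(539_412 / ARSENIC_SPLIT_M3, 3)
intensity_mid_sulfate = round(539_412 / SULFATE_SPLIT_M3, 3)
intensity_small = round(136_201 / SULFATE_SPLIT_M3, 3)
```

Dividing each energy ceiling by the corresponding split point is what yields the 0.499, 0.255, 1.259 and 0.318 kWh m^−3 figures.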

Fig. 1: Efficiency Analysis Tree (EAT) for estimating optimal energy consumption in DWTPs.

The volume of drinking water produced and the quantities of arsenic and sulfates removed from raw water influence the energy use of DWTPs.

Indeed, the results of our study indicate that optimal energy utilization in DWTPs can vary based on different thresholds of water treated to remove specific pollutants such as sulfates and arsenic. This finding underscores the importance of tailoring energy consumption to the specific requirements of pollutant removal in order to achieve optimal energy efficiency. By considering different thresholds or levels of pollutant removal, DWTPs can determine the appropriate energy usage for their particular circumstances. The results suggest that the energy required to remove pollutants like sulfates and arsenic can have a significant impact on overall energy consumption in DWTPs. Therefore, by optimizing energy usage based on these thresholds, water treatment facilities can enhance their energy efficiency and reduce unnecessary energy consumption.

Energy efficiency of drinking water treatment plants

The findings of our study indicate that the water treatment processes examined exhibit high levels of energy inefficiency, with an average energy efficiency score of 0.197 (Fig. 2). This implies that, on average, there is significant room for improvement in terms of energy consumption, with a potential reduction of almost 80% in energy usage. The distribution of energy efficiency scores among DWTPs is shown in Supplementary Fig. 1.

Fig. 2: Energy efficiency scores of drinking water treatment plants evaluated.

Relevant differences in energy efficiency among drinking water treatment plants are reported.

Moreover, it is observed that a small proportion of the evaluated facilities demonstrated full energy efficiency. Specifically, out of the 146 DWTPs analyzed, only four facilities, representing ~2.7% of the total, achieved a full efficiency score of 1.00. This indicates that these particular facilities have effectively optimized their energy usage and are operating at the highest level of energy efficiency within the context of our study. The four energy-efficient facilities utilize pressure filters (PF) for drinking water production. However, they differ in their primary raw water sources, with two of them using groundwater and the other two relying on surface water.

Findings from this study slightly differ from those of Molinos-Senante and Sala-Garrido15, who reported an average energy efficiency score of 0.28 and classified 6 out of 146 DWTPs as energy-efficient. Notably, two facilities deemed efficient in their study were considered inefficient in ours. Furthermore, there are more pronounced discrepancies when compared to Molinos-Senante and Maziotis26, who reported an average energy efficiency score of 0.462, with none of the evaluated facilities being fully energy-efficient. These variations can be attributed to the different methodological approaches used in assessing energy efficiency. Molinos-Senante and Sala-Garrido15 utilized a double-bootstrap DEA method, which reduces data uncertainty but does not address overfitting issues. In contrast, Molinos-Senante and Maziotis26 used stochastic non-parametric envelopment of data (StoNED), a technique that incorporates both inefficiency and noise in the assessment but still has overfitting limitations. Our study, however, employed the EAT approach, a method that is not prone to overfitting issues. This methodological improvement strengthens the validity of our findings and underscores the need for targeted efforts to improve energy efficiency in water treatment plants.

Figure 3 provides valuable insights into the distribution of energy efficiency scores across the evaluated DWTPs. The majority of the facilities reported energy efficiency scores below 0.21, indicating a significant level of energy inefficiency. On average, these plants would need to reduce their energy consumption by nearly 80% to achieve optimal energy efficiency. On the other hand, there are 18 treatment plants that reported relatively higher energy efficiency scores. However, their scores still fall within the range of 0.21 and 0.61, indicating room for improvement. These facilities have the potential to achieve substantial energy savings, ranging from 40% to 80% on average, and bridge the gap with the most energy-efficient plants in the sector. No common characteristics were observed in terms of source of raw water, treatment train, and ownership for this group of DWTPs. The treatment trains employed are coagulation-flocculation with rapid gravity filters (CF-RGF) by 8 facilities, coagulation-flocculation with pressure filters (CF-PF) by 5, and PF alone by another 5. Similarly, the distribution of the main source of raw water varies, with 8 DWTPs treating mixed raw water, 5 treating groundwater, and 5 treating surface water. Ownership-wise, 14 of the 18 DWTPs are fully privately owned, while the remaining 4 are operated by concessioned companies.

Fig. 3: Energy efficiency scores across DWTPs.

Most of the facilities evaluated present a very poor energy efficiency with a score lower than 0.21.

Furthermore, the analysis reveals a group of 11 DWTPs that can be considered best performers. These facilities attained energy efficiency scores ranging between 0.81 and 1.00, indicating a high level of energy efficiency. These best performers serve as examples of successful energy management practices and provide insights into the potential for achieving optimal energy efficiency in the water treatment sector. The shared characteristic among this group of DWTPs is their use of PF as the primary treatment method for producing drinking water. Additional information on the influence of the treatment train on the energy efficiency of DWTPs is shown in Fig. 5. As with the previous group, diverse features are observed in terms of ownership and the main source of raw water. Of these DWTPs, 8 out of 11 are owned by fully private water companies, while the remaining three are operated by concessioned water companies. Regarding the source of water, 5 out of 11 DWTPs treat mixed water, 4 treat groundwater, and 2 treat surface water.

In our study, we investigated the energy-saving potential of the energy-inefficient DWTPs. By applying Eq. (5) and considering the current energy use of the 146 assessed DWTPs, we estimated the potential energy savings for these facilities, as depicted in Fig. 4. The results revealed that the assessed DWTPs have a combined energy-saving potential of 13,344,093 kWh/year. This indicates a substantial opportunity for reducing energy consumption in these facilities while maintaining the same volume of drinking water production and pollutant removal from raw water. The mean potential energy saving for the 146 assessed DWTPs was estimated to be 0.005 kWh m^−3. Additionally, we analyzed the distribution of potential energy savings across the assessed DWTPs. The 25th percentile indicates that 25% of the facilities could achieve energy savings of up to 0.008 kWh m^−3, while the 75th percentile suggests that 25% of the facilities could achieve energy savings of at least 0.146 kWh m^−3. The current energy intensity of DWTPs ranges between 0.002 kWh m^−3 and 0.215 kWh m^−3, with an average value of 0.007 kWh m^−3. The distribution of potential energy savings among DWTPs is shown in Supplementary Fig. 2.

Fig. 4: Potential energy savings if DWTPs were energy efficient.

DWTPs with the lowest energy performance can save up to 2.45 kWh per cubic meter of water.

Factors influencing energy efficiency of DWTPs

To gain a deeper understanding of the factors impacting the energy performance of DWTPs, it is essential to assess how their operational characteristics influence the previously estimated energy efficiency scores. The findings of this analysis are presented in Table 1. We checked for multicollinearity among the explanatory variables in the regression using the variance inflation factor (VIF) test. The estimated VIF was 2.31, indicating that there is no multicollinearity in the regression model. The results indicate that the age of the treatment plant, the source of raw water, and the type of treatment technology have a negative effect on energy efficiency. By contrast, the ownership of the facility does not have a statistically significant influence on its energy performance, which is consistent with past research15,26,27.
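As an illustration of the multicollinearity check, the sketch below computes a VIF for each explanatory variable as 1 / (1 − R²), where R² comes from regressing that variable on the remaining ones. It is a generic NumPy sketch, not the software used in this study; the function name and the design matrix layout are assumptions. As a common rule of thumb, VIF values well below 5 to 10 are taken to indicate no problematic multicollinearity.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    X is a hypothetical (n_plants, n_variables) matrix of explanatory
    variables, without an intercept column.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on the remaining columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2))
    return vifs
```

Uncorrelated explanatory variables give VIF values of one, and the values grow as a variable becomes better explained by the others.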

Table 1 Influence of operational characteristics on energy efficiency. Estimates of the bootstrap truncated regression model

Specifically, the study reveals that older DWTPs tend to have lower energy performance. This can be attributed to the lack of updates and improvements in energy-efficient equipment within these aging plants. Molinos-Senante and Sala-Garrido15 found that newer facilities tend to exhibit better energy performance, suggesting a negative correlation between a facility’s age and its energy efficiency. Conversely, the study by Molinos-Senante and Maziotis26 did not provide a definitive conclusion on how the age of a facility influences its energy efficiency, indicating that the relationship between these factors remains unclear and further research is needed. Furthermore, the analysis demonstrates that DWTPs relying on mixed water resources, i.e., both surface water and groundwater, experience a decrease in energy efficiency. This implies that treating water from multiple sources necessitates extensive treatment processes, potentially leading to higher energy consumption. Consequently, this combination of surface and groundwater treatment may have a detrimental impact on overall energy performance. This finding is consistent with that reported by Molinos-Senante and Maziotis26, whereas Molinos-Senante and Sala-Garrido15 did not identify the source of raw water as an explanatory factor of the energy efficiency of DWTPs.

When focusing on the primary technology used for treating raw water, there are statistically significant differences in energy efficiency scores. Figure 5 illustrates these differences, highlighting that the most energy-efficient technology is PF. On the other hand, treatment plants that use rapid gravity filters (RGF) to remove pollutants from raw water are the least energy-efficient. Treatment plants that use a combination of coagulation and flocculation (CF) with PF or RGF to purify water demonstrated slightly higher energy efficiency scores. However, there is still considerable room for improvement in their energy performance to catch up with plants using more energy-efficient technologies. These findings partially diverge from the conclusions of Molinos-Senante and Sala-Garrido10, who conducted a metafrontier DEA assessment of DWTPs. Their study concluded that DWTPs using a combination of CF and rapid gravity filtering (CF + RGF) were the most energy-efficient. They also found evidence that facilities employing RGF as their treatment process were the least energy-efficient. It is important to note that differences in methodologies, data sources, and specific contexts may contribute to variations in the results between studies.

Fig. 5: Energy efficiency scores across water treatment technologies (PF: pressure filters; RGF: rapid gravity filters; CF-PF: coagulation-flocculation and pressure filters; CF-RGF: coagulation-flocculation and rapid gravity filters).

DWTPs using PF are those with the best energetic performance.

Results from this study provide evidence that the evaluated DWTPs exhibit inadequate energy performance, highlighting significant opportunities to reduce energy consumption. Such reductions can lead to cost savings and help mitigate greenhouse gas emissions, particularly if the energy sources are non-renewable. Water managers and regulators can implement various actions and policies to enhance energy efficiency in water treatment processes (Fig. 6). Potential strategies could be categorized as follows:

Fig. 6: Some strategic actions to promote energy efficiency in drinking water treatment plants.

Energy efficiency improvement can be achieved by applying a diverse range of approaches.

DWTPs can improve energy efficiency both by reducing energy use and by removing more pollutants from raw water. Focusing on the first alternative, as reported by Sowby et al.28, some practices for managing energy in DWTPs are: (i) implementing optimized operational procedures to minimize energy waste, including strategies such as optimizing flow rates, ensuring appropriate maintenance of equipment, and adopting efficient operating schedules; (ii) installing energy recovery systems, such as energy-efficient pumps or turbines, to capture and use energy that would otherwise be wasted during the treatment process; and (iii) deploying advanced monitoring and control systems to enable real-time monitoring of energy consumption and process optimization, which allows operators to identify areas of energy inefficiency promptly and take corrective actions. Results from this study showed that older DWTPs tend to have lower energy performance. Thus, water managers can invest in modernizing treatment plant equipment to use more energy-efficient technologies. This may involve replacing outdated machinery with newer models that offer improved energy performance. The generation and transfer of specific knowledge is also a valuable tool for improving energy efficiency. Training programs and awareness campaigns can educate DWTP staff about the importance of energy efficiency and provide them with the necessary knowledge and skills to identify energy-saving opportunities in their daily operations. Moreover, establishing platforms for collaboration and knowledge sharing among water managers, researchers, and industry experts can facilitate the exchange of best practices and innovative approaches to energy efficiency in water treatment processes. Finally, given the large room for Chilean DWTPs to improve energy efficiency, the Chilean water regulator can introduce incentives and policies to encourage water companies to prioritize energy efficiency. These can include offering financial incentives for adopting energy-efficient technologies or setting energy efficiency targets that must be met to obtain tariff adjustments or other benefits.

The policy implications of this study’s findings are significant and can provide valuable guidance to stakeholders involved in water treatment processes. By employing a novel approach that combines machine learning and linear programming techniques, this study offers a visually intuitive way for water regulators to understand the maximum energy requirements for different pollutant removal scenarios. This can aid decision-making processes by providing clear insights into the energy implications of water treatment operations. The new method overcomes the overfitting issues often encountered in other efficiency techniques. As a result, the energy efficiency scores derived from this approach are more robust and reliable. This increased reliability can contribute to more informed decision-making, as water regulators can have greater confidence in the efficiency assessments provided. The analysis identifies key factors influencing energy performance in water treatment processes. This knowledge enables water regulators to gain insights into the specific aspects that impact efficiency. For example, recognizing that newer treatment plants tend to be more energy-efficient can inform decisions regarding facility upgrades or replacements. Similarly, understanding the energy intensity associated with different water sources and treatment technologies can guide choices in resource allocation and process optimization. In this context, Sowby29 empirically showed that water utilities with energy management policies or plans use less energy. This correlation was attributed to the organization’s culture and operation and also to the identification of energy use as a relevant topic within the organization.


Energy performance assessment based on efficiency analysis tree

Following Esteve et al.21, assume that there is a vector of predictor variables, i.e., factors influencing energy use in DWTPs, defined as \({x}_{1},\ldots ,{x}_{m}\) with \({x}_{i}\in {R}^{m}\). Assume also that this set of variables is used to predict a vector of response variables, i.e., energy used, defined as \({y}_{1},\ldots ,{y}_{n}\) with \({y}_{i}\in {R}^{n}\). The EAT approach selects a predictor variable \(j\) and a threshold \({s}_{j}\in {S}_{j}\), where \({S}_{j}\) is the vector of potential thresholds for variable \(j\), to split the dataset into the right and left nodes, \({t}_{R}\) and \({t}_{L}\), respectively22. The mean squared error (MSE) is used to select the threshold and, consequently, the right and left nodes. This is shown mathematically as follows:

$$R\left({t}_{L}\right)+R\left({t}_{R}\right)=\frac{1}{n}\sum _{\left({x}_{i},{y}_{i}\right)\in {t}_{L}}{\left({y}_{i}-y\left({t}_{L}\right)\right)}^{2}+\frac{1}{n}\sum _{\left({x}_{i},{y}_{i}\right)\in {t}_{R}}{\left({y}_{i}-y\left({t}_{R}\right)\right)}^{2}$$

where \(t\) denotes a node of the regression tree, \(R(t)\) is the MSE of node \(t\), \(n\) is the size of the sample, and \(y\left({t}_{L}\right)\) and \(y\left({t}_{R}\right)\) are the estimated values of the response variable \(y\) in the left and right nodes of the tree, \({t}_{L}\) and \({t}_{R}\), respectively. Each estimate is calculated using the data that fall into the corresponding node.

The estimated values of the response variable for each node of the regression tree are calculated as follows23:

$$y\left({t}_{L}\right)=\max \left\{\max \left\{{y}_{i}:\left({x}_{i},{y}_{i}\right)\in {t}_{L}\right\},y\left({I}_{T\left(k{\rm{|}}{t}^{* }\to {t}_{L},{t}_{R}\right)}\left({t}_{L}\right)\right)\right\}$$
$$y\left({t}_{R}\right)=\max \left\{\max \left\{{y}_{i}:\left({x}_{i},{y}_{i}\right)\in {t}_{R}\right\},y\left({I}_{T\left(k{\rm{|}}{t}^{* }\to {t}_{L},{t}_{R}\right)}\left({t}_{R}\right)\right)\right\}$$

where \(T\) is the sub-tree derived using the EAT method, \(k\) is the number of splits, and \({I}_{T\left({k|}{t}^{* }\to {t}_{L},{t}_{R}\right)}\left({t}_{L}\right)\) and \({I}_{T\left({k|}{t}^{* }\to {t}_{L},{t}_{R}\right)}\left({t}_{R}\right)\) are the sets of leaf nodes of the regression tree generated after performing the \(k\)-th split that Pareto-dominate nodes \({t}_{L}\) and \({t}_{R}\), respectively, with \(y(\cdot )\) denoting the estimated response over those leaf nodes21,23.
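To make the split criterion concrete, the sketch below evaluates Eq. (1) for a single predictor and a first split of the root node. It is a simplified illustration, not the full EAT algorithm of Esteve et al.: the function name is hypothetical, only one predictor is handled, and the Pareto-dominance term in Eq. (2) is reduced to the assumption that the right node (larger predictor values) can never be estimated below the left node.

```python
import numpy as np


def best_split_1d(x, y):
    """Pick the threshold on one predictor minimising R(t_L) + R(t_R).

    Returns (threshold, error, y_left, y_right). Hypothetical helper for a
    single split of the root node.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    best = None
    # Candidate thresholds: every observed value except the smallest,
    # so that both child nodes are non-empty.
    for s in np.unique(x)[1:]:
        left, right = x < s, x >= s
        # Node estimates are maxima, not means, so the fitted step
        # function envelops the data from above.
        y_left = y[left].max()
        # Simplified dominance rule: the right node cannot lie below
        # the left node.
        y_right = max(y[right].max(), y_left)
        err = (((y[left] - y_left) ** 2).sum()
               + ((y[right] - y_right) ** 2).sum()) / n
        if best is None or err < best[1]:
            best = (float(s), float(err), float(y_left), float(y_right))
    return best
```

Using the node maximum instead of the node mean is what distinguishes this criterion from a standard regression tree split and produces a frontier rather than an average fit.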

To avoid any overfitting problems, the EAT approach employs cross validation techniques to select the best regression tree21. Therefore, the production technology that is estimated takes the following form:

$${\widehat{{{PT}}_{{T}_{k}}}}=\left\{\left(x,y\right)\in {R}_{+}^{m+1}:y\le {d}_{{T}_{k}}\left(x\right)\right\}$$

where \({d}_{{T}_{k}}\left(x\right)\) is the predictor estimator regarding the sub-tree \({T}_{k}.\)

The energy efficiency score, which is a synthetic index embracing energy use and pollutants removed from raw water, for each analyzed DWTP is derived after solving the following linear programming model:

$$\theta \left({x}_{k},{y}_{k}\right)=\min \theta$$
$$\sum _{t\in \widetilde{{T}^{* }}}{\lambda }_{t}{a}_{j}^{t}\le \theta {x}_{jk},\quad j=1,\ldots ,m$$
$$\sum _{t\in \widetilde{{T}^{* }}}{\lambda }_{t}{d}_{r{T}^{* }}^{t}({a}^{t})\ge {y}_{rk},\quad r=1,\ldots ,p$$
$$\sum _{t\in \widetilde{{T}^{* }}}{\lambda }_{t}=1$$
$${\lambda }_{t}\in \left\{0,1\right\},\quad t\in \widetilde{{T}^{* }}$$

where \(\theta\) is the energy efficiency score, which ranges between 0 and 1. An energy efficiency score equal to one indicates that the unit is energy-efficient, whereas a value lower than one indicates energy inefficiency. \(\left({a}^{t},{d}_{{T}^{* }}({a}^{t})\right)\) are locations in the input-output space for all \(t\in {T}^{* }\), where \({T}^{* }\) denotes the final sub-tree, and \({\lambda }_{t}\) are intensity variables that are part of the process to construct the efficient frontier30.
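Because the intensity variables in Eq. (4) are binary and sum to one, exactly one leaf node is selected, so the programme can be solved by enumerating the leaf nodes rather than calling a solver. The sketch below illustrates this under stated assumptions: the function and argument names are hypothetical, each leaf is represented by its location \(a^t\) and its estimate \(d(a^t)\), and all components of the evaluated unit's \(x_k\) are strictly positive.

```python
import numpy as np


def eat_input_efficiency(x_k, y_k, leaf_inputs, leaf_outputs):
    """Solve the model of Eq. (4) by enumeration over leaf nodes.

    leaf_inputs[t] is a^t and leaf_outputs[t] is d(a^t) for leaf t
    (hypothetical representation). Returns the minimum theta, or None
    if no leaf satisfies the second constraint.
    """
    x_k = np.asarray(x_k, dtype=float)
    y_k = np.atleast_1d(np.asarray(y_k, dtype=float))
    best = None
    for a_t, d_t in zip(leaf_inputs, leaf_outputs):
        a_t = np.asarray(a_t, dtype=float)
        d_t = np.atleast_1d(np.asarray(d_t, dtype=float))
        # Second constraint: the selected leaf must satisfy d(a^t) >= y_k.
        if np.all(d_t >= y_k):
            # First constraint: a_j^t <= theta * x_jk for every j, so the
            # smallest admissible theta for this leaf is max_j a_j^t / x_jk.
            theta_t = float(np.max(a_t / x_k))
            if best is None or theta_t < best:
                best = theta_t
    return best
```

For example, with two leaves located at (1, 1) and (2, 2) with estimates 1 and 3, a unit with \(x_k = (4, 4)\) and \(y_k = 2\) can only be compared with the second leaf and obtains a score of 0.5.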

Based on the energy efficiency scores estimated using Eq. (4), the potential energy savings if a DWTP were efficient can be estimated using the following equation:

$${{Energy}}_{s}={{Energy}}_{c}\,\cdot\, \left(1-\theta \right)$$

where \({{Energy}}_{s}\) is the potential saving in energy and \({{Energy}}_{c}\) is the actual level of energy consumption of the evaluated DWTP.
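As a brief worked illustration of Eq. (5), the sketch below computes the potential saving for a single plant. The function name and the 100,000 kWh per year consumption figure are hypothetical; the score of 0.197 is the average efficiency reported in the results.

```python
def potential_energy_saving(energy_current_kwh, theta):
    """Eq. (5): energy that could be saved if the plant were efficient."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("efficiency score must lie between 0 and 1")
    return energy_current_kwh * (1.0 - theta)


# Hypothetical plant using 100,000 kWh/year with the average score of 0.197.
saving_kwh = potential_energy_saving(100_000, 0.197)
saving_share = saving_kwh / 100_000
```

With a score of 0.197, the saving share is 1 − 0.197 = 0.803, which corresponds to the reduction of almost 80% mentioned in the results.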

Factors influencing energy efficiency of DWTPs

In the second step of our analysis, we examine the relationship between energy efficiency scores of DWTPs and their operating characteristics. To accomplish this, we utilize bootstrap truncated regression techniques, as proposed by Simar and Wilson31. The choice of employing a truncated regression approach is motivated by the fact that energy efficiency scores are bounded between zero and one. This approach allows us to account for this bounded nature and ensure that the estimated relationship is valid within this range.

By using bootstrap regression techniques, we can mitigate any potential issues related to serial correlation among efficiency scores, error terms, and the operating characteristics. This is an improvement over the standard Tobit regression approach, which may encounter difficulties when dealing with such correlations31. The bootstrap method provides a robust framework for analyzing the relationship between energy efficiency scores and the various operating characteristics of DWTPs in our study.

Mathematically, the regression model takes the following form:

$${\theta }_{i}={{\rm{\mu }}}_{0}+{{\rm{\mu }}}_{i}{\xi }_{i}^{{\prime} }+{time}+{v}_{i}$$

where \({\theta }_{i}\) is the EAT energy efficiency score, \({\mu }_{0}\) is the constant term, \({{\boldsymbol{\xi }}}_{{\boldsymbol{i}}}^{{\boldsymbol{{\prime} }}}\) is the set of operating characteristics of any DWTP \(i\), and \({\mu }_{i}\) are parameters that are estimated. Finally, \({v}_{i}\) is the error term which is distributed following the standard normal distribution24.

Case study and variables used

The case study assesses the energy performance of 146 DWTPs in Chile. The assessment covers the water treatment facilities only and excludes the energy used for raw water abstraction. It is important to note that the water industry in Chile operates under a system of private ownership, which was established during the privatization process between 1998 and 2004. Two types of water companies emerged from this process: fully private water companies, responsible for the long-term operation and maintenance of the water network, and concessionary water companies, tasked with supplying water for a specific period, typically around 30 years32. Due to the monopolistic nature of the water sector, a national regulator called the Superintendencia de Servicios Sanitarios (SISS) was established. This regulatory body is responsible for setting water tariffs for customers, using an efficient company standard as a benchmark33. SISS also monitors the environmental performance of the water sector. The Ministry of Health establishes the quality standards that treated water must meet before it is distributed to end-users for consumption. These quality standards are based on guidelines set by the World Health Organization10.

The selection of predictor and response variables is based on past literature on this topic and on data availability34,35,36,37,38,39. The response variable is energy consumption, measured in kWh per year. To account for the removal of pollutants during the water treatment process and its impact on energy efficiency, our study incorporates four quality-adjusted predictor variables. Following past practice10,40,41, they are estimated as follows:

$${{Quality\,adjusted\,y}}_{s}={{Volume}}_{w}\,\cdot\, \left(\frac{{{Pollutant}}_{s,{in}}-{{Pollutant}}_{s,{ef}}}{{{Pollutant}}_{s,{in}}}\right)$$

where \({{Volume}}_{w}\) denotes the volume of drinking water produced and is measured in m3 per year; \({{Pollutant}}_{s,{in}}\) is the concentration of the pollutant \(s\) in the influent and \({{Pollutant}}_{s,{ef}}\) is the concentration of the pollutant \(s\) in the effluent. Pollutant concentrations are measured in g/m3. This study employs four quality-adjusted predictors because four pollutants are removed during the water treatment process: sulfates, turbidity, arsenic and total dissolved solids. This selection is based on their significant impact on the energy consumption of Chilean DWTPs10,25.
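The quality adjustment amounts to scaling the volume produced by the fraction of the pollutant removed. A minimal sketch follows, with a hypothetical function name and made-up concentrations rather than values from the study:

```python
def quality_adjusted_output(volume_m3: float, conc_in: float, conc_out: float) -> float:
    """Volume of water produced, weighted by the removal rate of one pollutant.

    volume_m3: drinking water produced (m3 per year).
    conc_in:   pollutant concentration in the influent (g/m3).
    conc_out:  pollutant concentration in the effluent (g/m3).
    """
    if conc_in <= 0.0:
        raise ValueError("influent concentration must be positive")
    removal_rate = (conc_in - conc_out) / conc_in
    return volume_m3 * removal_rate


# Hypothetical plant: 1,000 m3 per year, pollutant reduced from 4.0 to 1.0 g/m3.
print(quality_adjusted_output(1000.0, 4.0, 1.0))  # 750.0
```

Because the removal rate is dimensionless, the adjusted variable keeps the units of volume. The calculation is repeated once per pollutant, which gives four values per plant.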

Regarding the operational characteristics that influence the energy performance of DWTPs, the following variables are considered: (i) age of the DWTP, measured in years; (ii) source of the raw water treated (surface water, groundwater, or mixed water resources, in which groundwater and surface water are blended before treatment); (iii) ownership of the DWTP, captured through a dummy variable indicating whether the treatment plant is owned by a fully private or a concessionary water company; and (iv) the type of treatment technology used in the DWTP, i.e., PF (n = 66), RGF (n = 36), CF-PF (n = 18) and CF-RGF (n = 26). The pretreatment in all facilities assessed is a simple screening process, and all of them use chlorine for water disinfection. The descriptive statistics of the variables used in our analysis are reported in Table 2. The data on all variables (predictor and response variables, and operational characteristics of DWTPs) were provided by the Chilean urban water regulator (SISS) following a request under the right to public information in Chile, and correspond to 2020.
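To show how these characteristics could enter a regression as explanatory variables, the sketch below encodes one plant as a numeric row. The choice of baseline categories (surface water, concessionary ownership, PF) and all names are assumptions made for illustration and are not reported in the study.

```python
SOURCES = ("surface", "groundwater", "mixed")
TECHNOLOGIES = ("PF", "RGF", "CF-PF", "CF-RGF")

# Plants per technology as reported in the text; they add up to the 146 DWTPs.
PLANTS_PER_TECHNOLOGY = {"PF": 66, "RGF": 36, "CF-PF": 18, "CF-RGF": 26}


def encode_plant(age_years, source, fully_private, technology):
    """Return [age, source dummies, ownership dummy, technology dummies].

    The first entry of SOURCES and of TECHNOLOGIES is the assumed baseline
    and gets no dummy, which avoids perfect collinearity with the constant.
    """
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source}")
    if technology not in TECHNOLOGIES:
        raise ValueError(f"unknown technology: {technology}")
    row = [float(age_years)]
    row += [1.0 if source == s else 0.0 for s in SOURCES[1:]]
    row.append(1.0 if fully_private else 0.0)
    row += [1.0 if technology == t else 0.0 for t in TECHNOLOGIES[1:]]
    return row


# Hypothetical plant: 25 years old, mixed sources, fully private, CF-RGF.
print(encode_plant(25, "mixed", True, "CF-RGF"))
# [25.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
```

With this encoding, each coefficient on a dummy is interpreted relative to the omitted baseline category.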

Table 2 Descriptive statistics of the variables used in the analysis