Introduction

The linear economic model has resulted in the excessive depletion of natural resources: products are purchased, used, and discarded, and the resulting waste pollutes the environment (Akhtar et al., 2021). Efforts to counteract this environmental damage include advocating for production systems with reduced waste, such as the circular economy. This approach has the potential to drive social and economic development within a country (Penna & Fadel, 2021). Sustainable development hinges on prioritizing environmental, social, and economic elements. Industrial activity plays a pivotal role in conserving resources (Nodoushan et al., 2022), requiring firms to develop models that ensure sustainability. A sustainable developmental trajectory, given limited resources, ensures the long-term viability of process industries (Jia et al., 2017).

Increasing environmental awareness has prompted governments to implement policies (Calise et al., 2019), and there is global demand for investment from all quarters to strengthen the governance and management capabilities required to address climate change, urbanization, and environmental degradation (Jiang, 2022).

Effectively managing water resources is one of the critical challenges in achieving sustainable development, as it requires a delicate balance between economic interests and sustainable water stewardship (Chu et al., 2022; Ganesan & Rajiv, 2022; Song et al., 2022). Water scarcity has pushed countries to develop innovative technologies for water management (Gebre et al., 2021). With the world’s population on the rise, substantial volumes of wastewater are generated by domestic, industrial, and agricultural activities. These waters can be treated and repurposed for other uses, in line with the circular economy ethos (Issaoui et al., 2022). Rahim et al. (2020) identified four factors contributing to water scarcity: (1) uneven geographic distribution of water sources; (2) rapid urbanization leading to population and economic growth; (3) inadequate management of water resources; and (4) prolonged drought.

The growing industrial sector requires ongoing evaluation of both the quantity and quality of water (Wang, 2022). Often, industry focuses on maximizing production rather than on water efficiency and conservation (Navarro-Ramírez et al., 2020), despite the limited availability of water and the wastewater that results from production processes (Hung et al., 2020; Akhtar et al., 2021). While conserving water resources can drive sustainable industrial advancement (Egbueri, 2022), it presents challenges in the production phases today (Ene et al., 2013). Achieving a green industrial transformation in certain areas calls for a systematic evaluation system and a comprehensive theoretical framework. Furthermore, it is essential to understand how the various actors in a water resources system (users and management agencies) make decisions regarding resource utilization (Meempatta et al., 2023).

Empirical evidence underscores the correlation between water management and the industrial development levels of regions. Kosolapova et al. (2021) found a positive relationship between business innovation and effective practices in water resource utilization. Wang et al. (2019) examined the ecological transformation of Chongqing Gepai Wire and Cable Co., Ltd., revealing that interdisciplinary methods are imperative in selecting and dynamically evaluating indicators. Cai et al. (2021) conducted a quantitative spatial analysis across 232 countries and territories to assess Sustainable Development Goal 6 (clean water and sanitation), finding that global water governance faces challenges in resource management and international collaboration. Nations must implement appropriate policy measures to attain universal and equitable access to potable water, sanitation services, and proper hygiene.

In Colombia, the water quality index highlights the industrial sector as a major contributor to the organic load in water bodies. This load is estimated at 51% in terms of biological oxygen demand and 62% in terms of chemical oxygen demand, in relation to the total load contributed by agriculture, industry, and domestic sources. Between 11.8 and 19 million Colombians are at high risk due to poor water quality (IDEAM, 2019).

Despite these empirical studies, Rahim et al. (2020) identified a research gap: most studies rely on total water consumption data, and specific analyses of disaggregated water consumption behavior are scarce. Such analyses are even scarcer for the industrial sector (Burri et al., 2019).

The literature gap lies in the limited understanding of the connection between business performance and water conservation in the Colombian manufacturing industry. In this context, we aim to address the following question: What factors influenced the efficient use and conservation of water in the Colombian manufacturing industry in 2020? To answer it, we analyze data from the Annual Manufacturing Survey (Departamento Administrativo Nacional de Estadística, 2022a) and the Industrial Environmental Survey (Departamento Administrativo Nacional de Estadística, 2022b). We employ two supervised machine-learning algorithms: decision trees and logit regression with Least Absolute Shrinkage and Selection Operator (LASSO) regularization.

Methods

Data

In this research, we used data from the Annual Manufacturing Survey (Departamento Administrativo Nacional de Estadística, 2022a) and the Industrial Environmental Survey (Departamento Administrativo Nacional de Estadística, 2022b). These two surveys meet the requirements of OECD statistical operations. They are publicly accessible, and the data are anonymized. According to the documentation of the Industrial Environmental Survey (Departamento Administrativo Nacional de Estadística, 2022b), the elements needed to describe the sampling are:

  • Unit of observation. Industrial establishments located in the national territory engaged in manufacturing activities that meet the parameters established in the target population.

  • Reporting unit. The person responsible for providing the required information; typically, this is the professional in charge of process engineering or environmental engineering within the industrial establishment.

  • Unit of analysis. Industrial establishments located in the national territory engaged in manufacturing activities that meet the parameters established in the target population.

  • Sampling unit. Industrial establishments located in the national territory engaged in manufacturing activities that meet the parameters established in the target population.

  • Sampling method. The sample design is probabilistic with triple stratification. It is probabilistic because each establishment in the sample frame has a probability greater than zero of being selected. It is triply stratified because the universe is partitioned first by region, second by industrial division, and lastly, through the Hidiroglou algorithm, into a stratum of forced inclusion and another of probabilistic inclusion. In some cases, according to statistical and thematic criteria, the population is classified only as forced inclusion.

    It is stratified according to the following variables:

    1. Region of the country to which the establishment belongs (Caribbean, Eastern, Central, Pacific, Bogotá, and Amazonia-Orinoquia).

    2. Industrial activity carried out by the company.

    3. Size of the establishment, in terms of employed personnel and industrial production.

The observational units were the industrial establishments located in Colombia (7743), and the sample consisted of 2917 establishments across 9 productive sectors. The sample design described above allows for estimates with a coefficient of variation (sampling error) of less than 5% at the national level. Figure 1 displays the spatial distribution of the sample. It should be noted that there are no industrial activities in the eastern departments of Colombia.

Fig. 1: Spatial distribution of the sample.

Note. Fig. 1 displays the spatial distribution of the sample. The observational units were the industrial establishments located in Colombia (7743), and the sample consisted of 2917 establishments across 9 productive sectors. Authors' own elaboration using Philcarto (Waniez, 2023).

Variables

Table 1 summarizes the information for each variable, with Prowater as the target variable. The variables were selected based on two criteria: (i) availability of information; and (ii) relationship with efficient and sustainable water use. The variables can be categorized into five groups: (i) Environmental commitment; (ii) Risk mitigation; (iii) Innovation; (iv) Location; and (v) Entrepreneurial performance. Sendawula et al. (2021) explored the relationship between environmental commitment and good water use practices in small and medium-sized companies. Similarly, Kosolapova et al. (2021) found a positive relationship between business innovation and sustainable water resource management. Chen et al. (2022) analyzed the effect of environmental liability insurance (ELI) on firms’ environmental performance in developing economies.

Table 1 Variables.

Ethical statements

The data collection process employed in this study involved self-completion of an electronic form through a DANE website. Data transmission was conducted in real-time via the electronic form accessed through the website. It is important to note that access to the anonymized microdata for public use is provided free of charge and will be made available on the DANE website. This approach ensures transparency and accessibility while upholding ethical standards in data handling and dissemination.

Algorithms

Supervised learning algorithms have already been used to analyze water consumption and usage. For example, Pesantez et al. (2018) used eight variables to predict water demand 6 hours in advance and employed a three-layer neural model with three neurons in each layer. Similarly, Platsko & van Beek (2018) fitted models based on sets of regression trees using the LSBoost algorithm to predict outdoor residential water consumption. Deep learning techniques and algorithms have also been applied to urban water system planning and management problems (Fu et al., 2022).

We applied two supervised learning algorithms: logistic regression with LASSO regularization and a decision tree. Because the variables had different units of measurement, all of them were standardized to Z-scores (mean 0 and standard deviation 1), which also provides numerical stability.
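The Z-score standardization mentioned above can be illustrated with a minimal sketch. The function name `zscore`, the use of the population standard deviation, and the sample values are assumptions made for illustration; the source does not specify them.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    # Assumption: population standard deviation; the source does not say
    # whether the population or the sample formula was used.
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant column cannot be standardized")
    return [(v - mu) / sigma for v in values]


# Hypothetical water volumes (illustrative values only, not survey data).
water_volume = [120.0, 450.0, 80.0, 900.0, 300.0]
z = zscore(water_volume)
```

After this transformation, predictors measured in different units (for example, monetary amounts and cubic meters) are on a comparable scale, which matters for LASSO because its penalty acts on coefficient magnitudes.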

We divided the sample into two parts. The algorithms used 80% of the sample for training and the remaining 20% for validation.
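As a rough illustration of the 80/20 partition, the sketch below shuffles a list of establishment identifiers and cuts it at 80%. The function name, the fixed seed, and the use of simple (unstratified) random sampling are assumptions; the source states only the proportions.

```python
import random


def train_validation_split(rows, train_fraction=0.8, seed=42):
    """Shuffle rows and split them into training and validation parts."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for the 2917 sampled establishments.
establishment_ids = list(range(2917))
train, validation = train_validation_split(establishment_ids)
```

Every establishment ends up in exactly one of the two parts, so the validation metrics are computed on observations the algorithms did not see during training.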

The training parameters for the classification decision tree were:

(i) the minimum number of observations required in a node to attempt a split was 20;

(ii) the minimum number of observations in a terminal node was seven;

(iii) the maximum depth of any node in the final tree was 30;

(iv) the complexity parameter was 0.01 (any split that does not decrease the overall lack of fit by a factor of this parameter is not attempted);

(v) the seed was randomized; and

(vi) all processes were randomized.
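To make the role of parameters (i) to (iv) concrete, the sketch below shows how they might gate a single candidate split. It is not the tree-growing algorithm itself. The function name and the argument `relative_improvement` (the decrease in lack of fit expressed as a fraction of the root node's lack of fit) are assumptions introduced here. The four values coincide with the defaults of `rpart.control` in R's rpart package.

```python
MIN_SPLIT = 20   # (i) observations needed in a node before a split is attempted
MIN_BUCKET = 7   # (ii) minimum observations in any terminal node
MAX_DEPTH = 30   # (iii) maximum depth of any node, with the root at depth 0
CP = 0.01        # (iv) complexity parameter


def split_allowed(n_node, n_left, n_right, depth, relative_improvement):
    """Return True if a candidate split passes all four stopping rules.

    depth is the depth of the node being split; relative_improvement is a
    hypothetical measure of the fit gained, relative to the root node.
    """
    if n_node < MIN_SPLIT:
        return False  # too few observations to attempt a split
    if min(n_left, n_right) < MIN_BUCKET:
        return False  # a child would be smaller than the minimum leaf size
    if depth >= MAX_DEPTH:
        return False  # children would exceed the maximum depth
    return relative_improvement >= CP  # split must improve fit by at least CP


# A node with 100 observations, split 50/50 at depth 3, improving fit by 5%.
ok = split_allowed(100, 50, 50, 3, 0.05)
```

With only three terminal nodes in the fitted tree, the complexity parameter is the rule most likely to have stopped further growth, although the source does not report which rule was binding.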

The classification decision tree assumptions are:

(i) the target variable is a nominal or ordinal variable; and

(ii) predictors are scalar, nominal, or ordinal variables.

Following Chernozhukov and Wei (2016), the Lasso penalty parameter was set with a plugin estimator:

$$\lambda_{\mathrm{logit}} = \frac{c}{2\sqrt{n}}\,\Phi^{-1}\left[ 1 - \frac{1.1}{2\max\left( n,\; p\ln n \right)} \right]$$

where n is the sample size, p is the number of covariates in the model, Φ⁻¹ is the inverse of the standard normal cumulative distribution function, and c is a constant.
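The plugin formula above can be evaluated directly with the Python standard library. In the sketch below, the default `c = 1.1` and the example values of `n` and `p` are assumptions chosen for illustration; the source does not report them.

```python
from math import log, sqrt
from statistics import NormalDist


def plugin_lambda_logit(n, p, c=1.1):
    """Plugin penalty for the logit lasso, following the formula above.

    n: sample size; p: number of covariates;
    c: constant (1.1 is an assumed value, not taken from the source).
    """
    tail = 1.1 / (2 * max(n, p * log(n)))
    # NormalDist().inv_cdf is the inverse standard normal CDF.
    return c / (2 * sqrt(n)) * NormalDist().inv_cdf(1 - tail)


# Hypothetical training-set size and covariate count.
lam = plugin_lambda_logit(n=2334, p=40)
lam_larger_sample = plugin_lambda_logit(n=10000, p=40)
```

The penalty shrinks as the sample grows, so with more data the LASSO retains more predictors for the same number of covariates.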

Results

Decision tree

Figure 2 illustrates the estimated decision tree generated by the algorithm, which defines three terminal nodes. The algorithm shows good fit indicators: its accuracy (the ratio of correct predictions to total predictions) is 0.81, and its F1 score (the harmonic mean of precision and recall) is also 0.81. Additionally, the area under the curve (AUC) reaches a value of 0.79.
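The first two indicators can be reproduced from a confusion matrix. The sketch below assumes that the reported harmonic mean is the usual F1 score built from precision and recall; the labels and predictions are made-up values, not the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = has a water efficiency program, 0 = does not.
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Accuracy and F1 can coincide, as they do in the reported results, when false positives and false negatives are roughly balanced.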

Fig. 2: Decision tree.

Note: Fig. 2 illustrates the estimated decision tree generated by the algorithm, which defines three terminal nodes. Authors' own elaboration, made in JASP 0.16.2.

The most influential predictors for classifying whether establishments have programs for efficient water use and conservation, in order of importance, are as follows:

  • Total investment and expenses for wastewater treatment.

  • Total volume of water used by the facility.

  • Personnel expenses dedicated to environmental protection activities.

  • Payment of environmental licenses, permits, fees, and fines.

  • Fixed assets.

  • Total investment and expenses for air and climate protection (Table 2).

Table 2 Relative importance of variables.

Logistic regression with LASSO regularization algorithm

The logistic regression model with LASSO regularization yielded a ROC (receiver operating characteristic) score of 0.83 for the training sample and 0.81 for the validation sample, indicating strong predictive performance. This algorithm identified twelve significant predictors (Table 3).
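Assuming the reported ROC score is the area under the ROC curve, it can be read as the probability that a randomly chosen establishment with a program receives a higher predicted score than a randomly chosen establishment without one. The sketch below computes it that way; the function name and the example data are illustrative assumptions.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities (not survey data).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.7]
auc = roc_auc(y_true, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The small gap between the training and validation scores suggests limited overfitting.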

Table 3 Logistic regression with LASSO regularization algorithm.

The results of the logit regression analysis reveal predictors that either decreased or increased the probability of establishments having a program for efficient water use and saving.

Predictors decreasing probability:

  • Absence of environmental certification or eco-labeling or being in the process of implementing it.

  • Total investment and expenditures for the treatment and disposal of hazardous waste.

  • Establishments manufacturing rubber and plastic products, and those in the textile, apparel, footwear, and leather products sector.

  • Establishments located in Boyacá and Bolívar (see Table 3).

Predictors increasing probability:

  • Expenditures for personnel dedicated to environmental protection activities.

  • Total volume of water used by the establishment.

  • Establishments in the food, beverages, and tobacco sector, and those manufacturing chemical products and substances.

  • Total investment and expenditures for air and climate protection.

  • Value added.

  • Payments for environmental policies.

  • Establishments located in Valle del Cauca (see Table 3).

Table 4 provides the calculated elasticity, representing the percentage change in the dependent variable for every 1% change in the predictor variable. This facilitates the interpretation of the coefficients.
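The source does not state how the elasticities in Table 4 were computed. As a hedged illustration, one standard definition for a logit model is the point elasticity below, where p is the fitted probability and β_k is the coefficient on predictor x_k. These symbols are notation introduced here, not taken from the source.

$$p = \frac{1}{1 + e^{-x^{\top}\beta}}, \qquad E_k = \frac{\partial p}{\partial x_k}\cdot\frac{x_k}{p} = \beta_k\, x_k\,(1 - p)$$

Under this definition, the elasticity has the sign of β_k whenever x_k is positive, and its magnitude shrinks as p approaches 1.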

Table 4 Interpretation of elasticities.

The results of logistic regression and decision tree analyses revealed the following factors to be the most influential:

  • Investment in wastewater treatment: Companies that invest more in wastewater treatment are more likely to have water use efficiency programs.

  • Volume of water used: Companies that use more water are more likely to implement water use efficiency programs.

  • Investment in air and climate protection: Companies that invest in air and climate protection are more likely to implement water use efficiency programs.

  • Geographical location: Companies located in certain regions, such as Valle del Cauca, are more likely to implement water use efficiency programs.

  • Investment in personnel dedicated to environmental protection: Companies that invest in personnel dedicated to environmental protection are more likely to implement water use efficiency programs.

In Fig. 3, the departments with a direct relationship to efficient water use and conservation (in red) are: Sucre, Risaralda, Huila, Atlántico, Cundinamarca, Cauca, Meta, Cesar, Casanare, and Valle del Cauca. On the other hand, those with a negative relationship (in blue) are: Nariño, Vichada, Bogotá, Santander, Magdalena, Tolima, Bolívar, and Boyacá.

Fig. 3: Elasticity by geographical location.

Note: Fig. 3 illustrates departments directly associated with effective water use and conservation in red, while those displaying a negative relationship are highlighted in blue. The diagram was created using Philcarto (Waniez, 2023).

These findings suggest that companies that are more committed to environmental protection and sustainability are more likely to implement water use efficiency programs. Additionally, the findings suggest that geographical location and industry type may play a role in the adoption of these programs.

To foster efficient water use and conservation in the Colombian manufacturing industry, the following public policies can be enacted:

  • Offer fiscal and financial incentives to encourage investments in wastewater treatment and reuse technologies.

  • Advocate for training and environmental education programs tailored for corporate staff.

  • Establish initiatives for environmental certification and eco-labeling within the corporate sector.

  • Enforce more stringent environmental regulations and standards for businesses, for example in the following sectors: (i) textiles, (ii) metallurgy and manufacture of metal products, and (iii) manufacture of rubber and plastic products.

Discussion

Colombia faces a considerable challenge in achieving sustainable development, as 40% of surveyed firms lack programs focused on efficient water use. This is particularly concerning given that the industrial sector is the primary contributor to water pollution in the country. The findings of this study shed light on the relationship between water use efficiency and the manufacturing industry in Colombia. The decision tree and the logistic regression model with LASSO regularization identified the factors that most influence water use efficiency within the Colombian manufacturing sector. The decision tree showed that total investment in wastewater treatment and the overall volume of water used by facilities were the strongest predictors for classifying whether companies have programs for efficient water use and conservation. This finding is consistent with existing literature emphasizing the significance of investments in water treatment technologies and the adoption of sound water resource management practices (Chu et al., 2022; Gebre et al., 2021).

Additionally, the logistic regression model identified specific predictors that either heightened or lessened the likelihood of companies implementing water-efficient programs. Notably, substantial investment in air and climate protection, along with the total volume of water consumed, surfaced as pivotal drivers for embracing sustainable practices. This underscores the critical importance of considering both economic and environmental dimensions in business decision-making pertaining to water management (Wang et al., 2019).

Furthermore, we observed that the geographical location of companies significantly influences the adoption of water conservation practices. Companies situated in specific departments, such as Sucre, Risaralda, Huila, Atlántico, Cundinamarca, Cauca, Meta, Cesar, Casanare, and Valle del Cauca, demonstrated a higher propensity for implementing water-efficient programs, indicating the substantial influence of local and regional factors in business decision-making processes. Departments with a positive relationship to efficient water use were not necessarily the places with the highest industrial development in Colombia, in contrast to the findings of Kosolapova et al. (2021).

It is imperative to highlight that this study also unearthed a notable gap in the existing literature. Despite the growing awareness of water management and its pivotal role in the industry, there is a conspicuous dearth of studies delving into disaggregated water consumption behavior, particularly within the context of the Colombian manufacturing industry (Burri et al., 2019). This underscores the need for further research to gain a comprehensive understanding of the intricate dynamics and underlying factors influencing water management within this sector.

Assegid et al. (2022) underscore the substantial water consumption and wastewater production within the textile sector, which aligns with our finding that companies in the Colombian textile, apparel, footwear, and leather products sector exhibit less efficient water resource management practices. Similarly, Colombian firms in the food, beverages, and tobacco sector, which accounts globally for 70% of water consumption, show a greater likelihood of implementing programs for water efficiency and conservation. It is noteworthy that firms adopting water management programs often experience short-term financial gains.

The coefficient and its positive sign for c3rh2vtc corroborate the observations of Chen & Cheung (2021), who noted that companies with high water usage benefit from water management programs, particularly in cost reduction through water recirculation. In this instance, there is a direct correlation between high water usage and its efficient management.

Conversely, the economic investment in technology for industrial water treatment and reuse (c1acygge) offers a longer-term return on investment. This may explain why companies choose to make such investments, albeit with a relatively modest effect (Navarro-Ramírez et al., 2020). The willingness of companies to invest in personnel dedicated to environmental protection activities (c1acyggg) emerges as the most influential factor in driving efficient water use within facilities. Finally, there exists a notable and positive correlation between conscientious water use and (i) companies investing in climate energy savings (c1iygtotr1) and (ii) payments for environmental policies (ppolizamb).

It is important to note that not all surveyed companies have programs in place to ensure the efficient use of water resources, as investments in these programs often yield returns only in the long term. This lack of commitment to environmental certifications and best practices may be attributed to the abundance of water resources in Colombia, particularly in cities where scarcity is not a prevailing concern. Consequently, firms may lack incentives to implement efficient water management plans.

The findings of this study underscore the environmental challenges confronting the Colombian government and highlight the need for sector-specific policies to address these issues. However, the economic interests of firms can sometimes impede investments in responsible water resource management programs. This challenge has also been emphasized in prior studies by Chu et al. (2022), Ganesan & Rajiv (2022), and Song et al. (2022).

Conclusion

This paper substantiates that efficient water utilization in Colombia is closely tied to companies’ willingness to invest, affirming Jiang’s (2022) assertions. Committing resources to water resource management not only strengthens a firm’s ability to mitigate environmental degradation risks, particularly those stemming from irresponsible water consumption in industrial operations, but also bolsters its overall environmental stewardship.

The study has successfully identified sectors that are most likely to have robust resource management plans and at least two commendable practices for wastewater treatment and water reuse. These sectors encompass (i) Food, beverages, and tobacco, (ii) Manufacture of chemical products and substances, (iii) Manufacture of non-metallic mineral products, and (iv) Coking, manufacture of petroleum refining products, and nuclear fuel. Conversely, sectors that demonstrate a divergence from these practices include (i) Manufacture of rubber and plastic products, (ii) Textile, apparel, footwear, and leather products, (iii) Metallurgy and manufacture of metal products, and (iv) Other industrial divisions.

In predicting efficient water usage, the study found a significant predictor in the inclusion of an environmental diagnosis by Colombian industrial firms as a planning instrument. These findings align with those of Wang et al. (2019). Furthermore, the study’s algorithms reinforce the results of Navarro-Ramírez et al. (2020), illustrating the interdependence between firms’ readiness to optimize water usage and their financial capacity.

Moreover, Colombian manufacturing enterprises should strongly contemplate the adoption of water use efficiency programs. These initiatives not only aid in cost reduction but also elevate their environmental performance, contributing significantly to the sustainability of Colombia’s water resources.

Based on these findings, the study advocates for the incorporation of the following points into public policies aimed at safeguarding water resources:

  • Encourage all firms to establish a program for efficient water use.

  • Develop targeted plans for sectors with the highest pollution contributions.

  • Evaluate a firm’s water footprint and offer tax incentives to those effectively mitigating their water impact and achieving neutrality.

  • Launch government-led initiatives to enhance awareness among businesses and the public about the critical importance of water conservation.

  • Encourage the widespread adoption of circular economy principles within the industrial sector.

The study’s limitations include reliance on data from a single country and an exclusive focus on quantitative aspects. Future research endeavors could significantly augment this study by incorporating a qualitative analysis phase with a specific focus on economic sectors.