Distance and weightage-based identification of most critical and vulnerable locations of surface water pollution in Kabul river tributaries

Water plays a key role in the economic growth of an agricultural country. Pakistan is a farming country that uses almost 90% of its water resources for agriculture. Khyber Pakhtunkhwa (KPK) province of Pakistan has extensive surface water resources. In addition to using groundwater resources for irrigation, large parts of its flat plains are irrigated with the Kabul River surface water. Due to large population growth and unregulated small/local scale industries in the region, surface water quality deteriorates with time, which affects people's health when polluted surface water is used for irrigation purposes. This research investigates the surface water quality of the Kabul River's different tributaries. It identifies the most critical and vulnerable locations regarding water quality using the weightage-based identification method and the distance-based iteration method, respectively. The Bara River exhibited the most critical location, surpassing the threshold values by a considerable margin in at least seven water quality parameters. The seven maximum critical values determined for the Bara River using the weightage-based method were 17.5, 5.95, 7.35, 27.65, 1.75, 0.35, and 10.45 for total alkalinity, sodium, total hardness, magnesium, total suspended solids, biological oxygen demand (BOD), and turbidity, respectively. The Khairabad station, where the Kabul River meets the Indus River, was identified as vulnerable by the distance-based method due to elevated levels of total suspended solids, hardness, sulfate, sodium, and magnesium. The locations Adezai, Jindi, Pabbi, and Warsak Dam appeared both critical and vulnerable due to the prevalence of small-scale industries on their banks and high population densities. All the results are finally compared with values interpolated over the entire region using Kriging interpolation to identify critical and vulnerable areas accurately.
The results from the distance and weightage-based methods align with the physical reality on the ground, which further validates them. The critical and vulnerable locations require immediate attention and preventive measures, such as installing monitoring stations and treatment plants, to address the deteriorating water quality parameters and stop further contamination.

www.nature.com/scientificreports/ is during July and August 26 . The Kabul River and its tributaries transport untreated sewage from Afghanistan and Pakistan's cities, towns, and villages from Peshawar, Mardan, Khyber, Mohmand, and Malakand agencies 27 . The lower sections of the river pass through plains that are particularly densely populated. The effluents from many industries end up in the Kabul River, either directly or through nullahs that eventually drain into the river. Of these industries, the sugar mills, distilleries, paper mills, tanneries, ghee factories, and textile mills contribute most of the water pollution hazards 28 . The wastewater contains domestic, industrial, and commercial effluents, and its quality is checked using a variety of laboratory tests, including tests for physical, chemical, and biological parameters 29 . Further research has analyzed the water quality of a region based on these tests using conventional techniques, highlighting the common risks and possible suitable solutions in water supply for domestic and agricultural use 30 . The surface and groundwater quality of the Kabul River at Attock City, Punjab, has been investigated by calculating the water quality index (WQI) 31 . The status of groundwater was compared with World Health Organization (WHO) standards, and a negative trend in nitrates and faecal microbes was found in the Kabul catchment 32 . Another study investigating the concentration of heavy metals in the surface water of the Kabul River found that nickel has the highest concentration, 30 times higher than the WHO permissible limit 33 . The research idea adopted here was previously used to identify flood risk based on the most crucial variables of flood characteristics of Waverly City, Iowa, United States 34 .
The Pakistan Council of Research in Water Resources (PCRWR) studied water quality parameters of the upper Khyber Pakhtunkhwa (KPK) region and northern areas of Pakistan, including the Mardan, Buner, and Swat districts of KPK, among various other regions 35,36 . Although the specific districts on which this study focuses were not part of the PCRWR study, the results exhibit similarities with the PCRWR findings for district Mardan, which is geographically close to the locations of this study and exhibits similar geological, social, and industrial conditions. According to the International Union for Conservation of Nature, Pakistan Program (IUCN), the Kabul watershed houses 205, 10, 41, and 45 industrial units in Peshawar, Charsadda, Nowshera, and Mardan districts, respectively, constituting more than 15% of the total industries in the region 37 . Almost all these industries have no treatment facilities for effluents. Nitrites from agricultural fertilizers and untreated industrial effluents such as ammonia, chromium, and nickel influence the water quality in these areas 38 .
The research conducted on the water quality status of the Kabul River and its tributaries is a valuable and distinctive contribution in several ways. Unlike previous studies that relied solely on the water quality index, this research thoroughly investigates the water quality of all tributaries connected to the Kabul River. By considering the water quality status of these tributaries, the study adopts a comprehensive approach that combines weightage and distance methods to identify the most critical and vulnerable locations. Additionally, the research uses interpolated maps in ArcGIS to compare and analyze the results spatially, providing a deeper understanding of the water quality dynamics. This study identifies critical tributaries and proposes practical and effective solutions for water quality issues. These solutions include installing monitoring stations on the identified tributaries and recommending specific treatment procedures to mitigate the long-lasting effects of water contamination. Moreover, the research emphasizes the importance of stakeholder involvement and advocates launching awareness campaigns, highlighting the need for collective action in vulnerable locations. The uniqueness of this study lies in its comprehensive approach, methodological framework, and practical recommendations, which collectively provide valuable insights and guidance to concerned authorities and decision-makers. By facilitating the implementation of remedial measures, this study becomes a resource for future research. It contributes to protecting and preserving the Kabul River and its tributaries, ensuring the well-being of the communities and the environment that depend on these water resources. Section 3 describes the detailed problem solution with research findings, and Sect. 4 discusses the study's conclusion.

Material and methods
Study area and data collection. The stretch of the Kabul River under study is the section from just upstream of Warsak Dam to its junction with the Indus. This area is densely populated, with much of KPK's small industry dependent upon the water of the Kabul River and its tributaries for different purposes 35,39 . Most industries ultimately drain their effluents into the Kabul River without any treatment. Ironically, this is also where many of the agricultural products consumed in cities like Peshawar, Mardan, and Swat are grown. Much concern has been raised about the quality of water in these areas 31 . The geographical boundary of the Kabul River basin and the sampling locations are shown in Fig. 1. The sampling points are shown as green dots, and a detailed description of the sampling locations is given in Table 1.
All the water samples were collected from ten different tributaries that drain into the Kabul River, using 1-L polyethene (PE) bottles that were washed with deionized water before use. The sample bottles were sealed and placed in a dark environment at a constant temperature range of 4-10 °C to avoid contamination and the effects of light and temperature. Table 1 gives the details of the sampling locations: a unique ID, the tributary of the Kabul River, the latitude and longitude, and the height above mean sea level.
All standard protocols necessary for sample collection, storage, transportation, and testing were followed as per the relevant American Society for Testing and Materials (ASTM) guidelines given in Table 2. A total of nine parameters were tested on the collected water samples under the relevant standards. Table 2 lists each parameter together with its unit, accuracy, test standard, and the Food and Agriculture Organization (FAO) and WHO threshold values for both drinking and agricultural use.
The samples were analyzed for their physical, chemical, and biological properties relevant to the study. Considering the pollution sources the study area is exposed to and the ultimate use of untreated surface water for irrigation purposes, physical parameters like pH, Total Settleable/Suspended Solids (TSS), Total Dissolved Solids (TDS), Turbidity, and Electrical Conductivity (EC) were analyzed 40 .

Distance and weightage-based identification of most critical and vulnerable locations. The methodology presented here is developed to identify the most critical and vulnerable locations having polluted surface water. We define the criticality of a location based on how far an observed value exceeds the threshold defined by WHO. This way, we differentiate between locations with observed parameter values just exceeding the threshold and those exceeding it by a large margin. We define the vulnerability of a location by identifying areas whose observed values are closest to those of the critical areas identified earlier. An iterative distance-based indicator is used to identify locations that need immediate intervention to secure water quality. The detailed procedure used in this paper is illustrated in the flow chart shown in Fig. 2.

The WHO threshold of each physical parameter is first escalated using Eq. (1); similarly, Eqs. (2) and (3) are used for chemical and biological parameters. Eq. (4) is then used to calculate the extent Z to which each water quality parameter exceeds the threshold, and Z is compared with the escalated threshold values defined above. Depending on whether Z exceeds the escalated threshold at a certain location, weights are assigned to these locations to single them out among all locations. The importance indicator (II) for a location thus gives an idea of the extent of intervention needed to mitigate the exceedance situation at that location.
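The weighting step described above can be illustrated with a minimal Python sketch. Only the 50% escalation of Eqs. (1)-(3) and the 90% and 35% weightages come from the text; the exact forms of Eqs. (4) and (5) are not reproduced here, so the definition of Z as the plain difference between the observed value and the WHO limit, the rule for choosing between the two weights, the function names, and the sample readings are all assumptions made for illustration.

```python
def escalated_threshold(who_limit, factor=0.5):
    """Escalate a WHO limit by 50%, following the pattern of Eqs. (1)-(3)."""
    return factor * who_limit + who_limit


def importance_indicator(observed, who_limit, w_high=0.90, w_low=0.35):
    """Weighted exceedance of one parameter at one location.

    ASSUMPTIONS (not confirmed by the source): Z is the difference
    observed - who_limit; the 90% weight applies when the observation
    is above the escalated threshold, the 35% weight when it is only
    above the WHO limit.
    """
    z = observed - who_limit
    if z <= 0:
        return 0.0  # no exceedance, so no weight is assigned
    if observed > escalated_threshold(who_limit):
        return w_high * z
    return w_low * z


# Hypothetical readings for one parameter with a WHO limit of 100 units.
readings = {"site_A": 90.0, "site_B": 120.0, "site_C": 180.0}
ii_values = {site: importance_indicator(v, 100.0) for site, v in readings.items()}
for site, ii in ii_values.items():
    print(site, round(ii, 2))
```

Under these assumptions, a location that only slightly exceeds the limit receives a small II, while a location beyond the escalated threshold is weighted far more heavily, which is the separation the method aims for.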
Finally, the critical location for a particular water quality parameter j is singled out by choosing the location that has the maximum II value (Eq. (6)), where n is the total number of locations over which the search for the maximum is performed.
The above methodology works for those water quality parameters for which the desired value does not exceed the upper threshold value defined by a regulatory body (e.g., Total Hardness). If the desired value is higher than the defined threshold value (e.g., Dissolved Oxygen), the minimum value among all locations is used instead.
Critical locations obtained for each water quality parameter thus identify locations that exceed the threshold values by a large amount. Water use from these areas should be strictly regulated, which is especially important for irrigation water use to avoid harmful chemicals entering the food chain.
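The selection of a critical location per parameter can be sketched as a simple search over locations. The function name, the `higher_is_worse` flag, and the numbers below are hypothetical; whether the minimum for parameters such as Dissolved Oxygen is taken over II values or over observed values is not stated in the text, so the sketch simply applies the search to whatever values it is given.

```python
def critical_location(values, higher_is_worse=True):
    """Return (location, value) holding the extreme value for one parameter.

    values: dict mapping location name -> II value (or observed value).
    higher_is_worse: True for parameters such as Total Hardness, where
        the maximum is searched; False for parameters such as Dissolved
        Oxygen, where the minimum is searched.
    """
    pick = max if higher_is_worse else min
    site = pick(values, key=values.get)
    return site, values[site]


# Hypothetical II values for Total Hardness at three locations.
th_ii = {"site_A": 0.0, "site_B": 7.0, "site_C": 72.0}
# Hypothetical Dissolved Oxygen values at the same locations.
do_values = {"site_A": 6.5, "site_B": 4.2, "site_C": 5.1}

print(critical_location(th_ii))                             # maximum search
print(critical_location(do_values, higher_is_worse=False))  # minimum search
```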
Vulnerable Location Identification. This study proposes an iterative distance-based search method to identify the most vulnerable locations (VL) in terms of exceedance of threshold values. The method is used for locations that, though not yet exceeding the threshold limit, are most likely to do so shortly based on the observed value.
Proposed algorithm of the distance-based method. Equations (8) and (9) inherently show the methodology's iterative nature. We search for locations where the observed values of different physical, chemical, and biological parameters are the closest to the exceeding value. The results would benefit decision-makers in prioritizing mitigating measures for a particular location.
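A minimal sketch of the distance-based search is given below. It assumes that the distance is the WHO limit minus the observed value and that locations already above the limit are excluded because they are treated as critical; Eqs. (8) and (9) themselves are not reproduced, and the function names and readings are hypothetical. The ranking is produced in a single pass rather than by explicit iteration.

```python
def vulnerability_ranking(observed, who_limit):
    """Rank non-exceeding locations by their distance to the WHO limit.

    observed: dict mapping location name -> observed value.
    Returns a list of (location, distance) pairs, smallest distance first.
    ASSUMPTION: distance = who_limit - observed, and locations above the
    limit are skipped because they are handled as critical locations.
    """
    distances = {
        site: who_limit - value
        for site, value in observed.items()
        if value <= who_limit
    }
    return sorted(distances.items(), key=lambda item: item[1])


def most_vulnerable(observed, who_limit):
    """Return the (location, distance) pair closest to the limit, or None."""
    ranking = vulnerability_ranking(observed, who_limit)
    return ranking[0] if ranking else None


# Hypothetical readings for one parameter with a WHO limit of 100 units.
readings = {"site_A": 90.0, "site_B": 120.0, "site_C": 99.5}
print(most_vulnerable(readings, 100.0))
```

Returning the full ranking rather than only the minimum lets a decision-maker see which locations would come next if the closest one were already being addressed.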
Validation by Kriging interpolation. To compare the results obtained from the above methodology, we employ the Kriging interpolation method over our study area to identify the most critical locations. Using a 30-m digital elevation model of the region as external drift, we interpolate the observed values of different parameters over the entire region.

Figures 3, 4, and 5 show the summary results of water quality tests of selected physical, chemical, and biological parameters. The WHO-specified allowable/threshold limits, used as guiding limits, are depicted as red horizontal lines for each parameter. Threshold limits act as "triggers" for starting mitigating efforts to bring the recorded values back to safe levels and as "endpoints" for terminating those efforts.
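To give a feel for the Kriging interpolation used for validation, the following self-contained sketch implements plain ordinary kriging in pure Python. It is deliberately simpler than the approach described above: it does not use the digital elevation model as external drift, and the exponential variogram with unit sill and range, the coordinates, and the sample values are assumptions chosen only for illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def variogram(h, sill=1.0, rng=1.0):
    """Exponential variogram model (assumed; the text does not name a model)."""
    return sill * (1.0 - math.exp(-h / rng))


def ordinary_kriging(points, values, target):
    """Predict the value at `target` from samples at (x, y) `points`."""
    n = len(points)
    # Variogram between all sample pairs, bordered by the unbiasedness constraint.
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical sample coordinates (arbitrary units) and parameter values.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]
print(ordinary_kriging(points, values, (0.5, 0.5)))
```

Because ordinary kriging without a nugget effect is an exact interpolator, predicting at a sampled point returns the sampled value, which is a convenient sanity check before interpolating over a full grid.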

Laboratory test results.
The overall test results contradict the previous general findings about water quality in these areas. Except for EC and DO, almost all considered physical parameters lie below the threshold limits. The purer the water, the lower the conductivity: distilled water is almost an insulator, while salty water is an electrical conductor. Half of the selected locations exceed the threshold limits of EC. Almost all locations exceed the DO values by a considerable margin. Surface water with DO values that are too high (or too low) affects aquatic life and overall water quality.
$$PP^{es}_{j} = 0.5 \times PP_{who,j} + PP_{who,j} \qquad (1)$$
$$CP^{es}_{j} = 0.5 \times CP_{who,j} + CP_{who,j} \qquad (2)$$
$$BP^{es}_{j} = 0.5 \times BP_{who,j} + BP_{who,j} \qquad (3)$$

During chemical analysis of water quality parameters (Fig. 3), selected locations show mixed results. Ca, Mg, and NO₂ exceeded threshold values in almost all locations. Samples taken from Bara River (BR) seem to have the worst surface water quality in terms of both physical (TSS, BOD, Turbidity) and chemical (Na, TH, TA) water quality parameters, and the parameters measured in these samples deviate from the patterns observed at other locations. These deviations indicate that Bara River exhibits distinct characteristics or trends compared to the other sampled locations. The reason could be that most of the small-scale industry at this location (marble and paper industry) drains its effluents directly into the surface water streams. This makes the surface water at this location increasingly polluted with respect to quality parameters for which other locations have reasonably acceptable values. Figure 6, e.g., shows the selected location results for biological parameters such as total and faecal coliforms. All considered locations exceed the acceptable threshold limits of WHO for these parameters. Malik et al. (2010) refer to the WHO study, which says that faecal bacteria, parasites, and other microbes cause about 6000 deaths of adults and children every day, resulting in a statistic of 1.8 million deaths every year from complications caused by the presence of this kind of pollutant in water bodies 43 . Figures 6 and 7 show the correlations between measured physical and chemical-biological water quality parameters. These correlations were calculated to ensure the samples were properly collected, stored, transported, and tested.
Our understanding of the dependence of different water quality parameters on each other, confirmed by the correlation test, gives us confidence that the collected data are trustworthy. For example, the main salts generally dissolved in water are carbonates, bicarbonates, sulfates, chlorides, nitrates, and phosphates. The presence of TDS in water is commonly associated with a higher probability of increased electrical conductivity. The correlation between TDS and EC therefore supports the reliability of the collected data and reinforces the interdependence of various water quality parameters 44 . We obtain a high Pearson correlation value of 0.83 between TDS and EC (Fig. 6). Similarly, high correlation values were obtained between measured TSS and parameters like Turbidity, TH, and COD. The Turbidity/COD and Turbidity/BOD pairs similarly display correlation values above 0.85.
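The consistency check described above rests on the Pearson correlation coefficient, which can be computed as in the following sketch. The TDS and EC readings are hypothetical and are not the study's data; the function name is likewise chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical paired readings at five locations (not the study's data).
tds = [210.0, 340.0, 455.0, 520.0, 610.0]  # mg/L
ec = [330.0, 500.0, 720.0, 790.0, 980.0]   # µS/cm
print(round(pearson(tds, ec), 3))
```

A value close to 1 for a physically linked pair such as TDS and EC suggests that sampling and testing were consistent, whereas a weak or negative value for such a pair would point to a handling or measurement problem.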

The research findings
Eqs. (1)-(3) were applied to escalate the threshold values set by WHO by 50%. In Table 3, the first line of each parameter is the observed value, denoted by CP, PP, and BP, whereas the second line is the escalated value, i.e., the 50% escalation added to the WHO value, of each parameter for all locations. The exceedance values ($Z_{ij}$) were assigned weightages of 90% and 35% following the condition of Eq. (5), and the importance indicator (II) values were obtained for all locations using the values in Table 3. All the highlighted values in Table 4 are the critical values obtained using Eq. (6). Table 5 shows all the vulnerable values identified by the distance-based iterative objective function using Eqs. (8) and (9). All the observed values were subtracted from the WHO standard values, except the highlighted values in Table 3. The observed values that exceeded the threshold were already used for identifying critical locations; they are therefore not given in Table 5 and are left as empty boxes. The minimum value in each row is the most vulnerable value, and the corresponding location is the most vulnerable location.
The locations in the second column of Table 6 are the most critical regarding the exceedance margin from the threshold values. These results were obtained by applying Eqs. (1)-(7) to those physical and chemical water quality parameters that exceed the threshold value. The Bara River's water quality parameters behaved differently from those of all other locations. Table 6 confirms the identification of Bara River as the most critical location in at least seven water quality parameters. All these locations exceed the threshold values by considerable margins. The locations in the third column are the most vulnerable; they were obtained by applying Eqs. (8) and (9) to the physical and chemical water quality parameters that nearly exceed the threshold value. These results closely match the physical reality on the ground. The Khairabad station, where the Kabul River, along with all its tributaries, meets the Indus River, appears most often in Table 6. There, water quality parameters such as TSS, Na, TH, Mg, and SO₄²⁻ are on the upper side of the allowable limits. The location is vulnerable because further deterioration would turn it into a critical area. Adezai, Jindi, and Pabbi Rivers and Warsak Dam appear twice in Table 6; all these locations sustain large population densities along their banks. In addition to agriculture, small-scale industries (ghee, marble, and paper industries, and auto-repair shops) are predominantly the means of sustenance for the local population. The method described in the "Validation by Kriging interpolation" section was employed to compare the results obtained for critical and vulnerable locations. Figures 8, 9, and 10 show the areas identified by Kriging interpolation as the most critical ones.
The areas are mostly the same as those identified through the distance and weightage-based identification methods explained above. Bara River is in the most critical state due to the large margin by which it exceeds the threshold values of different parameters. Mitigating and preventive measures must be taken at the identified location to improve the water quality parameters, especially TSS, BOD, Turbidity, TA, Na, TH, and Mg.

Discussion
The study used the above-mentioned mathematical equations to analyze the water quality of the Kabul River with distance and weightage-based methods to highlight the most critical and vulnerable locations. The analysis of the results presented in Table 3 revealed that multiple water quality parameters across different locations exceeded the escalated values, except for dissolved oxygen. The Bara River (BR) had the highest number of exceedances, followed by PR, RR, JR, IR and AK, as highlighted in Table 3. Table 6 further supported this observation by identifying BR as the most critical location in at least seven water quality parameters. Table 5 identified vulnerable values in each row and indicated the most vulnerable locations. The Khairabad station, located at the confluence of the Kabul River and its tributaries with the Indus River, frequently appeared in Table 6 and exhibited water quality parameters on the upper side of allowable limits. The consistency between the results obtained through different methods further validated the findings.

Conclusion
The study developed two objective functions and applied them to different locations in the larger prone area of Peshawar District, Pakistan. The methodology identified the critical areas and vulnerable locations with respect to water quality. The results show considerable skill in identifying the critical and vulnerable locations when compared with already published work on water quality parameters for the same area. The developed method is intuitive and easy to apply, and it gives results that are in close tandem with the ground reality of the region. The study employed Eqs. (1)-(9) to assess water quality parameters at various locations, revealing that the Bara River had the highest number of exceeded values, signifying its critical condition. Adezai, Jindi, and Pabbi Rivers, Khairabad station, and Warsak Dam were also vulnerable due to parameters nearing the threshold values. These findings correspond to the actual state of the water bodies. Effective measures should be implemented to safeguard these rivers, targeting the improvement of crucial parameters such as Total Suspended Solids, BOD, Turbidity, Total Alkalinity, Sodium, Total Hardness, and Magnesium. These protection measures could involve implementing pollution control measures, enhancing wastewater treatment facilities, promoting sustainable agricultural practices, and raising awareness among local communities about the importance of preserving water quality. Ensuring the long-term health of these rivers requires a collaborative effort from stakeholders, including environmental organizations, government authorities, and local communities. By implementing these measures, we can move towards restoring and maintaining the ecological integrity and sustainability of these vital water resources for both present and future generations.

Data availability
The data generated and analyzed during the current study are included in this article. The analyzed data in the form of an Excel sheet is attached in the supplementary files.
Received: 6 December 2022; Accepted: 30 June 2023

Figure 10. Interpolated values of water quality biological parameters over the entire study area.