Contribution of changing precipitation and climatic oscillations in explaining variability of water extents of large reservoirs in Pakistan

Major threat that Pakistan faces today is water scarcity and any significant change in water availability from storage reservoirs coupled with below normal precipitation threatens food security of more than 207 million people. Two major reservoirs of Tarbela and Mangla on Indus and Jhelum rivers are studied. Landsat satellite’s data are used to estimate the water extents of these reservoirs during 1981–2017. A long-term significant decrease of 15–25% decade−1 in water extent is found for Tarbela as compared to 37–70% decade−1 for Mangla, mainly during March to June. Significant water extents reductions are observed in the range of −23.9 to −53.4 km2 (1991–2017) and −63.1 to −52.3 km2 (2001–2010 and 2011–2017) for Tarbela and Mangla, respectively. The precipitation amount and areas receiving this precipitation show a significant decreasing trend of −4.68 to −8.40 mm year−1 and −358.1 to −309.9 km2 year−1 for basins of Mangla and Tarbela, respectively. The precipitation and climatic oscillations are playing roles in variability of water extents. The ensuing multiple linear regression models predict water extents with an average error of 13% and 16% for Tarbela and Mangla, respectively.

The hydrology and water management department of Pakistan, referred to as water and power development authority (WAPDA) provides information related to water reserves in the two large reservoirs. Water reserves are around 9.87 MAF in August, 2019 as compared to 6.43 MAF in August, 2018 which is still lower than last five years (2014-2018) average of 10.6 MAF during August and is associated with a big gap in storage capacity and inflows 26 . This is due to lower than normal precipitation in the basin areas triggered by climate change during 2018 27 . The situation can be alleviated through only exceptional monsoon or early glacial melting. Thus, a detailed study incorporating these recently changing precipitation patterns is timely for assessment of their influence on the two large water reservoirs of Pakistan.

climatic Variability and Water Reservoirs in pakistan
The large reservoirs are considered as an integral part of the surface water hydrology through their influential role in diluting the extreme events of floods and drought along with optimum water resource management at regional scale 28 . These large reservoirs have undoubtedly improved the socio-economics of the countries by improving the economic growth as well as by alleviating poverty 29 . Large reservoirs have both positive impacts by providing the water during deficit periods and negative impacts by changing the local basin environment through reservoir effects 30 . A recent study has demonstrated that how precipitation and El Niño Southern Oscillations or ENSO can be linked with remote sensing data-based land covers through vegetation proxies in Indonesia 31 . In irrigated areas, vegetation sensitivity is more linked with water availability through storage reservoirs. We attempted to link remote sensing-based water extents (WEs) of the two reservoirs with gridded precipitation and climatic oscillations (COs) of ENSO, North Atlantic Oscillations or NAO and Indian Ocean Dipole oscillations or IOD in this study.
Thus, an updated view of the monthly, seasonal and inter-annual variability of WEs of large reservoirs along with role of upstream precipitation, precipitation surface area and COs is provided. This will deliver the basic information about WE of reservoirs which is considered as an equivalent to water storage volume [32][33][34] . This will also help the policy makers for informed decisions related to future water apportionment and allocations under normal, flood, and drought conditions and ensure more objectivity in water sharing among various stakeholders 35 www.nature.com/scientificreports www.nature.com/scientificreports/ precipitation receiving surface area (Ap) are computed for Tarbela, Mangla and for overall basin. The ERA Tp and APH Ap showed significant annual linear decreasing trend and is associated with ENSO episodes (Fig. 4). The APH Ap showed significant decreasing trends of −358.1 (R 2 = 0.62), −309.9 (R 2 = 0.39) and −319.1 (R 2 = 0.43) km 2 year −1 for Mangla, Tarbela and overall basin, respectively. The ERA Tp showed significant negative linear trends of −4.68 to −8.40 mm year −1 (R 2 = 0.15 to 0. 20). No significant trend is observed for APH Tp and ERA Ap. Categorized annual ENSO episodes suggest that the decreases in Tp and Ap are linked to El Niño and La Niña events. This highlights that there are long term decreasing trends in upstream Tp and Ap linked to COs.
Decadal percent changes are computed in WEs, Tp, and Ap (Extended Fig. 1, Extended Table 2). Tarbela reservoir WE shows a reduction of 15-25% decade −1 with reference to normal based on 1981-2017. Maximum negative change has occurred during Post WD (Post Rabi) and Pre MS (Pre Kharif) with more than 50% change decade −1 . The APH Tp and APH Ap are found more variable during Pre WD (Pre Rabi) while during rest of seasons, the change is around ±10% decade Relationship with cos and precipitation. The variations in WEs are correlated with Tp, Ap and Sea Surface Temperature Anomalies (SSTA) for the three different COs (Fig. 5). Mangla WE showed significant negative correlation during January, April and October attributing to less precipitation from WD except for September. Significant positive correlation was found during September owing to strength of the MS ERA Tp. The Tarbela and overall basin revealed significant negative correlations with Tp and Ap at varying monthly timescale. The COs have both positive and negative correlations with Mangla and Tarbela WEs, depending upon the month of the year. Mangla reservoir is significantly sensitive to positive phase of the ENSO and the IOD. Tarbela correlations showed more strong positive relationship with ENSO (January to June) and NAO (January) and negative correlation with the IOD. Overall basin showed strong negative correlation between WE change and the NAO.
Further, COs along with Tp and Ap are correlated at monthly scale to assess their co-variability over the Tarbela, Mangla and overall basins (Extended Fig. 2). Correlation magnitude and sign depends upon the geographical    (Table 1). We have addressed the issue of over fitting, high R 2 value, and spurious relationships for 148 predictor variables (n = 19,388) through steps of principal component analysis (PCA), and multi-collinearity removal  Table 1 is the best one and showed no higher relationship as compared to models based on any of the above predictor's combinations.
The above presented WEs prediction models provide real time information to be incorporated into water accounting and sharing plan. Though, the accuracy of the models is limited, nevertheless the developed models hold merit to be considered for further engagement of variables in future research studies. Inclusion of the COs nevertheless in model is based on the fact that the global COs are modulating the weather systems responsible for the precipitation regimes in the upper Indus Basin. The study thus suggests that there are connections between the reservoir's WE and Tp and Ap in the respective basins. Thus, WEs are predicted somewhat skillfully for better water management.

Discussion
The presented analysis provides an independent (remotely sensed) information set about WEs of the reservoirs at both monthly and seasonal scales which was previously not available for irrigation water distribution for crops production downstream. We examined the upstream precipitation behavior of both the reservoirs to link WE behavior with it. Interestingly, ERA based Tp and APH based Ap show a significant decreasing trend. This highlights that both Tp and Ap are inherently different as one is modeled and the other one is based on a dense network of meteorological stations.
COs play a role in a hydrological cycle at global, regional and basin scales 37 . The ENSO is a complex climate system which demands improvements in climate related products and services to secure billions of population against vulnerability to natural disasters 38 . Climate system of Pacific Ocean had faced three major El Niño events in the 20th Century i.e., 1997-1998, 1982-1983 and 1925-1926 39 . These COs are responsible for specific weather regimes on land at global and regional scales by influencing the precipitation patterns, temperature regimes, and atmospheric pressures changes. Winter precipitation in western Himalaya is linked to NAO affecting WDs 40 . A large amount of precipitation in western Himalayas is governed by positive NAO phase as compared to lower amount in negative NAO phase based on ERA-40 and 20CR precipitation datasets 41 .
Mangla reservoir WE variability is different from Tarbela as its basin is situated in peak MS belt and less influenced by the WD. However, Tarbela reservoir WE is equally influenced by WD and MS precipitation. The ENSO, NAO and IOD are observed to directly and indirectly impact the precipitation weather system at 6-12 months   T9 T1 T2 T3 T4 T10 T13 T21 T22 T5 T6 T8 T7 T17 T18 T11 T12 T14 T15 T20 T16 T19 T24 T25 T26  The MLRs are developed for prediction of WEs to provide an early information to water planning and management stakeholders to ensure irrigation water supply downstream around critical growth stage of crops. Prior to this study, water supply is planned based on short to long term weather forecasting only. This study suggests that there is connection between the WEs and upstream Tp and Ap. It is reported that water accounting by Indus river system www.nature.com/scientificreports www.nature.com/scientificreports/ authority 13 does not take into account precipitation, evaporation and other information essential for water accounting 25,42 . Briscoe and Qamar indicated that Pakistan water apportionment accord of 1991 needs well calibrated systems for water storage reservoirs monitoring, river flows and transparent along with real time reporting 43,44 . Overall, IRB water balance lacks accuracy and large amount of water remains unaccounted. Remotely sensed data availability has increasing capacity in recent past to provide vital information to assess the poorly monitored IRB 45 . This research study will add on to possible solution related to better IRB water resource management by providing relevant local climate model construction being identified as one of knowledge gap in newly approved National Water Policy 2018 of Pakistan 23 .

conclusions
We pointed out that Landsat satellites-based observed WE of reservoirs for the period of 1981-2017 are an indicator of water storage status. These are linked with precipitation amount, and surface area receiving the precipitation in upstream basins, along with climatic oscillation specific sea surface temperature anomalies. These satellites-based WE assessments will ensure the downstream water and food security to millions of people inside Pakistan by integrated incorporation of available forecasts of precipitation, spatial area receiving this precipitation inside basins of these reservoirs, along with ENSO, NAO and IOD based SSTA anomalies, as pointed out in this study. Previously, such linking is only partially explored in the form of selected stream inflow simulations at specific gauge points based on hydrological models.
A range of WE models are developed and presented in this study. Using the multiple linear regression models that best explain the variability in WE, the relevant stakeholders can estimate the WE year-round for water management and for research and development. Precipitation behavior in Tarbela sub-basins T3 (2283.2 km 2 with 10.7 km 2 of glaciated area) and T25 (7201.1 km 2 with 179.7 km 2 of glaciated area) is dominantly relevant for WE variability in Tarbela reservoir. Similarly, precipitation behavior in sub-basin of GL5P (17327.0 km 2 with 180.8 km 2 of glaciated area) is more pertinent to WE variability in Mangla reservoir, situated in region of Pakistan. Given the considerable multifaceted challenges related to long term weather monitoring in high mountain areas, our obtained results in this study provide information for policy makers/managers for sub-basins where more careful weather monitoring is suggested out of all sub-basins for each reservoir, separately.

Methods
The IRB is spread across four different countries of Pakistan, India, Afghanistan and China, of which major part lies in Pakistan followed by India, and the other two countries 11 . We selected the study area covering two major water reservoirs in Pakistan constructed on rivers of Indus and Jhelum, which are a major source of water (Fig. 1). Every reservoir receives water from a particular basin area and boundary delineation is a primary requisite for research related to hydrology, water resource management, climate change impacts, agricultural water productivity, and environmental and watershed management [46][47][48][49][50][51][52] .
We used an integrated approach for this study (Extended Fig. 4). Satellite data based digital elevation models (DEMs) has been widely used for basin boundary delineation using manual to automatic procedures 53,54 . We used ALOS PRISM 30 m resolution global DEM for basin delineation based on semi-automatic basin and sub-basins extraction approach called soil and water analysis tool (SWAT) 55 . This SWAT tool uses flow direction analysis based on D8 flow routing algorithm 56,57 . The extracted basin boundary after refinements represents around 170,600 km 2 and lie in the upper IRB under study. Pakistan' PUIRB is spread across 96,700 km 2 and out of it 19,100 km 2 is glaciated with around 8000 glaciers 58,59 . Rest of the PUIRB basin lies in disputed area of Kashmir with an area of 73,900 km 2 that includes glaciated area of 6,600 km 2 with 4900 glaciers. The PUIRB basins consist of a number of sub-basins and are documented for both reservoirs and various rivers basins (Extended Table 1). The Tarbela-Indus basin consists of 26 sub-basins linked with 9 sub-basins of Gilgit, Hunza, lower part of upper Indus, Shigar, Astore, Shyok, Shingo, Zanskar and upper Indus (138,505 km 2 ). Tarbela specific basin is fed by 14 different rivers of Gilgit, Khunjrab, Shimshal, Indus, Hunza, Shigar, Astore, Shyok, Nubra, Hushey, Shingo, Dras, Suru and Zanskar. The Mangla-Jhelum consists of five sub-basins linked to five rivers of Neelum, Jhelum, Kunhar, Kanshi and Poonch rivers (32,095 km 2 ).
We used annual, seasonal (Meteorological or MET and Cropping or AGR) and monthly time scales for the current study. Summary of the specific seasons is presented in Extended Table 5. The MET seasons in Pakistan are characterized by two major weather systems which are WD linked to winter precipitation and MS system linked to summer precipitation. Irrigation water distribution through reservoirs is adjusted in view of the two cropping seasons of Kharif (summer) and Rabi (spring). Cotton, rice, sugarcane and maize are major summer crops, while wheat is most dominating spring crop in Pakistan 60,61 .
Landsat satellite data. The Landsat satellite 3, 5, 7 and 8 data covering period of 1981-2017 are used to extract the WEs of both reservoirs (Extended Table 6). Landsat grid number 150/36 covers Tarbela reservoir while grid number149/37 covers Mangla reservoir. A total of 404 cloud free Landsat satellite images are downloaded and processed using an open sources QGIS tool. The normalized difference water index (NDWI) is computed using modified NDWI to detect the WE of reservoirs 62 . All NDWI images are subsetted using reduced area of interest (AOI) to optimize the size of the images for efficient processing based on developed ArcGIS models. The NDWI threshold of 0.00-0.20 is used to extract the WE more effectively in the form of binary raster (1 = water and 0 = no water) as recently proposed 63,64 . The NDWI binary images are further vectorized to get the WE. All peak storage WEs of August to October months are used to establish both reservoirs' maximum boundary. All AOI specific multi-temporal WEs contain some water features outside the reservoir site and are removed using vector clipping approach developed in ArcGIS model environment. The Landsat 7 satellites scan line corrector (SLC) error is corrected in vector layer by using polygons aggregation tool in ArcGIS (see Section on Codes and www.nature.com/scientificreports www.nature.com/scientificreports/ ArcGIS model) instead of using gap filling approach in raster domain. We measured the approximate size of missing data line width size and used a distance parameter of 320-360 m for error correction. The Mangla WE is revised by subtracting approx. 58.3 km 2 from computed WE after 2009 till 2017 due to Mangla raising project (2004-2009). Gridded precipitation data. Gridded precipitation datasets may be used as proxy to in situ data for ungauged locations with varying cons and pros 64 . The APH data with 0.25° × 0.25° spatial resolution is used for the period of 1981-2007 65,66 . The APH precipitation data is based on dense rain gauge network of 5000-12000 stations and is primarily generated through distance weighting spatial interpolation method. We downloaded the APHRO MA V1101 data spread across 60° E-150° E, 15° S-55° N. The APH precipitation data set is selected because of its fine spatial resolution and performance relative to several other existing gridded precipitation data sets available for Pakistan [67][68][69] , as well as in similar climatic conditions 70,71 . The daily ERA precipitation data with 0.25° × 0.25° spatial resolution are also employed 72 . The ERA Web API and python procedure is used for downloading the daily precipitation (parameter = 228.128) in annual netcdf file format. Both gridded precipitation data are extracted based on reservoir's specific basin extents using ArcGIS automatic data extraction tool in the database format (dbf). These daily precipitation dbf files are converted to annual files using Visual Basic macro developed using Microsoft excel.
The APH and ERA precipitation variables of Tp and Ap at monthly, seasonal and annual time scales are computed using Eqs. (1 and 2), respectively.
where p d1 is the observed precipitation amount on day 1 and so on. The Ap is based on summing up the grid box areas of 0.25° × 0.25° or 625 km 2 , where the observed precipitation exceeds the threshold of 0.1 mm day −1 . An objective definition of threshold is difficult and depends on spatial resolution of the data 73 : where a i is the area of grid box, p i is daily precipitation and x 0 is the threshold criterion. The H( ) ⋅ is the Heaviside function in Eq. (2). co indices. We assessed the statistical relationship between the reservoir' WEs variability and COs indices of ENSO, NAO and IOD based on SSTA 37,74,75 . Three month running average approach is used to compute the SSTA anomaly. Selection of specific indices are based on sensitivity of winter and summer precipitation in PUIRB of Pakistan and have already been used in numerous earlier studies 76,77 . The WEs of reservoirs are found to be related with SSTA based indices as evidenced via correlation analysis. The annual CO classification scheme has been proposed using cumulative SSTA by developing 11 classes based on percentiles taking into account negative/positive phases of the oscillations that prevailed throughout 1981-2017, following ref. 75 (Extended Fig. 3, Extended Table 7).
Sensitivity and statistical analysis. Annual trends of Tp and Ap are computed for each reservoir' basin using linearly regressed fits during the period of 1981-2017 77 . These trends are also linked with identified ENSO positive and negative phase years 78 . The time series data of hydrological nature can be statistically analyzed using parametric or non-parametric trend tests to identify the underlying trends 79 . The trends in Tp, Ap, monthly and seasonal indices of ENSO, NAO and IOD are computed using nonparametric MK test and Sen's slope at p ≤ 0.05 and also addressed the issue of serial correlations and biasness for Mangla, Tarbela basins and overall upper IRB [80][81][82] . Further, sensitivity analysis for linear relationship of reservoirs' WE with Tp, Ap, three CO indices as well as between precipitation and COs is performed by utilizing Pearson correlation coefficient (CC) based on Eq. (3) 83,84 : where i represents the observed value, and overbar is the mean observation value. The P represents precipitation. The correlations among WE and CO are computed by replacing P with CO in Eq. where i represents the given minimum value observed for a decade and denominator is a median value for 37 years. The DC for ERA Ap, APH Tp, APH Ap and WE are computed by replacing ERA Tp with these variables in Eq. (4) and so on. Two sample t-test is applied to evaluate the decadal change in mean of WE with respect to normal WE of both reservoirs 84 .
reservoirs with Tp, Ap and three COs along with yearly trends based on MLRs (Table 1, Extended Table 8). Before developing model, we incorporated the new precipitation related variables based on glaciated area at sub-basin and its location in Pakistan or disputed areas (Fig. 7). Three variable classes are established which are GL5 (precipitation from glaciated area less than 5% of sub-basin total area), GB520 (precipitation from glaciated area between 5 -20% of sub-basin total area) and GG20 (precipitation from glaciated area greater than 20% of sub-basin total area). These three classes are further categorized based on precipitation received as overall basins (GL5, GB520 and GG20), within Pakistan (GL5P, GB520P and GG20P) and the disputed areas that lie in basins (GL5D, GB520D and GG20D). Initially, developed regression models are subject to two statistical procedures of PCA and multi-collinearity testing. The PCA is employed to reduce the dimensionality of a large number of interrelated variables and to retain the dominant variations present in the data 84,85 . Multi-collinearity among independent variables is examined based on correlation coefficient matrix analysis 84 . Three approaches are employed to find the best model for the prediction of WE variability. These are best (parameters are minimum variable, maximum variable and adjusted R 2 ), stepwise (independent variable in and out based on probability criteria of in 0.05 and out 0.10) and forward (same as stepwise except for the variable is added only). The validation is carried out based on random 15 observations from the matrix. Best regression model is selected based on highest R 2 value and lowest root mean square error (RMSE) 83 T1   T2  T3   T4   T5   T6   T7   T8   T9   T10   T11  T12   T13  T14  T15   T16   T17  T18   T19  T20   T21   T22   T23   T24   T25   T26 Figure 7. Sub-basin characteristics based on distribution and size of glaciated areas using Rudolph glacier inventory. The glaciated area percent contribution at sub-basin scale is used to generate new predictor variables for multiple linear regression models for WE prediction. We have addressed the following question: whether precipitation of glaciated sub-basins play some role in WE variability or not? Detailed information is provided in Extended Table 1. Glacier inventory data was downloaded from http://www.glims.org/RGI/randolph50. www.nature.com/scientificreports www.nature.com/scientificreports/ aggregation (https://www.ian-ko.com/ET_GeoWizards/UserGuide/aggregatePolygons.htm); ERA interim data is downloaded automatically by using ERA Web API (https://confluence.ecmwf.int/display/WEBAPI/ Web-API+Downloads). We also used SDM toolbox for the conversion of netcdf into tif format (http://sdmtoolbox.org/downloads) and developed a number of GIS models for basic processing like automatic clipping, data extraction and others. Similarly, hundreds of dbf extracted files are compiled into single Excel file based on developed VB macro. All GIS models and codes are available on request.