Background & Summary

This paper documents a global monthly gridded (0.5° resolution) sectoral water withdrawal and consumption dataset that contains conditional projections of water usage (from 2010 to 2100) across a range of future socio-economic and climate scenarios. This dataset is important because it quantifies the sources of demand-side pressures on scarce water resources globally under diverse future scenarios. Mekonnen & Hoekstra 20161 (also cited in the UN World Water Development Report 20222) estimated that roughly 71% (4.1 billion people) of the world’s population was exposed to water scarcity at least one month in the year over the period from 1996 to 2005. In their more recent study, Van Vliet et al. 20213 estimate global water scarcity over the period from 2000 to 2010 to range from 30% (without water quality considered) to 40% (when also including water quality). Global water scarcity is expected to increase across the globe with critical implications for sustainable development4,5,6,7,8. Recent studies highlight that future water scarcity is primarily driven by human water demands rather than climate impacts on water availability4,9. Additionally, irrigation water demands have been shown to have the largest relative impact on water scarcity5,6,10. Furthermore, water access, availability and demands are highly localized, with large energy and economic costs associated with water transfers, and thus a regional understanding of water use is essential11,12. This paper accounts for all of these key factors by providing a transparent and open-source dataset and accompanying methodology that captures the key drivers of future water scarcity (water use for human activities) at a fine spatio-temporal scale (0.5° resolution and monthly) and with added detail on irrigation water use by crop types.

Past studies13,14,15 that have evaluated global gridded water use at monthly resolution have been limited to historical analyses. Other studies, such as World Resources Institute (WRI) 201916, look at future water withdrawals but only at an annual time resolution and up to 2040 with sectoral detail divided into domestic, industry, agriculture and livestock sectors. In this paper we offer a finer spatiotemporal resolution for future projections compared to previous studies applied to a broader suite of socioeconomic and climate forcing scenarios. Additionally, we provide more detail in the irrigation sector which includes 13 different crop types by coupling our water demand model with a land allocation model. Table 1 compares the key features in this study to a representative set of previous studies that have analysed global water use. Table 1 highlights that, compared to previous studies, our study captures additional sectoral detail (especially by irrigated crop types) and a more diverse set of future scenarios.

Table 1 Comparison of selected global water use studies.

This study thus addresses the critical need for future projections of distributed water demand at a fine resolution so that scientists and water managers can start to explore and plan for future water needs. The dataset could also directly support the growing MultiSector dynamics research literature, particularly scenario-based studies of the future interactions between water and other sectors (e.g., energy and land) across scales in a global context17,18,19. The diverse set of 75 scenarios we produce supports scenario-based water demand uncertainty analysis by varying key elements of human and earth system change. The entire dataset can be downloaded from a dataverse online repository20 (https://doi.org/10.7910/DVN/VIQEAB) and is accompanied by a meta-repository (https://jgcri.github.io/khan-etal_2022_tethysSSPRCP/) that provides detailed figures and workflows for interested readers.

We generated this dataset by linking together multiple models and datasets designed to explore the dynamic interactions among energy, water, and land systems at global scale and gridded resolution. Central to our modeling workflow is the Global Change Analysis Model (GCAM4), an integrated tool for exploring the coarse regional dynamics of the coupled human-Earth system and the response of this system to global change, including human system and climate system changes into the future. Tethys21 then spatially and temporally downscales outputs from GCAM to grid resolution. We enhance Tethys’ projections of irrigation water usage by coupling it with Demeter22, a high-resolution downscaling model that uses GCAM outputs to calculate global gridded land-use change. With the combination of GCAM and Demeter, Tethys is able to project water withdrawal and consumption demands for 6 sectors (domestic, electricity generation, irrigation, livestock, industry and mining). The irrigation sector is further divided into 13 different crop types (biomass, corn, fiber crop, miscellaneous crops, oil crop, other grain, palm fruit, rice, root tuber, sugar crop, wheat, fodder herb, and fodder grass). Withdrawal refers to the total volume of water that is extracted by a user from a water source. While some of this withdrawn water may be returned to its original source (e.g., a river), a remaining portion (referred to as consumption) may not be returned to the system (e.g., evaporated water). To capture a range of futures reflecting diverse global change across the human and Earth systems, we used 75 scenarios composed of a combination of 4 Representative Concentration Pathways (RCPs)23, 5 Shared Socioeconomic Pathways (SSPs)24, and 5 Global Climate Models (GCMs) from the Inter-sectoral Impact Model Intercomparison Project (ISIMIP)25 protocol 2b. Fifteen viable combinations of the SSPs and RCPs were combined with each of the 5 GCMs to arrive at the final 75 scenarios. Graham et al. 20204 provide the details on these original GCAM runs for the 75 scenarios, which included a characterization of demand-side narratives corresponding to the SSPs for the water sector26. The GCAM outputs were then passed on to the Demeter model to produce the downscaled irrigated crop land area for 13 different crops in the study by Chen et al. 202027. The combined outputs from the GCAM study and the Demeter study were used in this study to calculate the final downscaled water demand results. The entire workflow of data from the original scenarios through GCAM and Demeter to Tethys is shown in Fig. 1.

Fig. 1

Study workflow showing that the 75 scenarios are a combination of 4 Representative Concentration Pathways (RCPs), 5 Shared Socioeconomic Pathways (SSPs) and 5 Global Climate Models (GCMs). Fifteen viable combinations of SSPs and RCPs were combined with each of the 5 GCMs to arrive at the final 75 scenarios, which were then used to generate the corresponding GCAM scenarios, which were in turn passed on to Demeter. Annual water demands from the GCAM runs (Graham et al. 20204) and irrigated crop land area from the Demeter study (Chen et al. 202027) were then passed on to Tethys to generate the final results of this study.

Methods

GCAM produces water withdrawal and consumption outputs for 32 regions for the domestic, mining, power generation, industry, and livestock sectors and for 434 region-basin intersections for the irrigation sector as shown in Fig. 2. (These spatial boundaries28 are determined by Moirai29, the land data system used by GCAM). Tethys v1.3.130 was used to downscale the water withdrawals and consumption outputs from GCAM onto a 0.5° by 0.5° grid as shown in Fig. 3. Of the 259,200 possible grid cells at this resolution (360 × 720), only the 67,420 cells categorized as land are considered. The Tethys outputs focus only on demand-side dynamics, so they make no distinctions regarding the water supply sources used to meet the demands (i.e., surface water, groundwater, desalinated water), though GCAM does make this distinction. While many adjacent regions differ largely in total water demand, most of this demand is directly related to total population or land area, and often concentrated in a few cells, such as those containing cities. As a result, spatial distributions at the border are smoother than they appear on the region scale map, without additional consideration of the boundaries by Tethys.

Fig. 2

Water withdrawals and consumption from GCAM by a) 32 GCAM regions for domestic, mining, power generation, industry, and livestock sectors and b) 434 GCAM region and basin intersections for the irrigation sector.

Fig. 3

Example outputs of Tethys spatial downscaling of 2010 water withdrawals by sector from GCAM regions and basins to 0.5° × 0.5° grid cells.

Spatial downscaling – non-agriculture

For the non-agricultural sectors (domestic, electricity, manufacturing, and mining), water withdrawals and consumption in each grid cell are assumed to be proportional to that cell’s share of the population of the GCAM region within which the cell is located. The population data set used for this paper is from “Gridded Population of the World” (SEDAC, 2016)31. Tethys uses the nearest available year of population data, which for this paper was 2010 for the year 2010 and 2015 for all other years. Each region’s population is determined by taking the sum of population over all cells belonging to that region. For each of these sectors, Tethys calculates the water withdrawals and consumption for a given cell as shown in Eqs. 1 and 2:

$${{\rm{withdrawal}}}_{{\rm{cell}}}={{\rm{withdrawal}}}_{{\rm{region}}}\times \frac{{{\rm{population}}}_{{\rm{cell}}}}{{{\rm{population}}}_{{\rm{region}}}}$$
(1)
$${{\rm{consumption}}}_{{\rm{cell}}}={{\rm{consumption}}}_{{\rm{region}}}\times \frac{{{\rm{population}}}_{{\rm{cell}}}}{{{\rm{population}}}_{{\rm{region}}}}$$
(2)

Large groups of cells with the same value are a by-product of the areal-weighting method used in the proxy, where coarse census data are evenly distributed.
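The proportional allocation in Eqs. 1 and 2 can be illustrated with a minimal Python sketch. This is not the Tethys implementation; the function name `downscale_by_proxy`, the cell labels and the numbers are hypothetical and chosen only to show the arithmetic.

```python
def downscale_by_proxy(region_total, proxy_by_cell):
    """Split a regional total across cells in proportion to a proxy (Eqs. 1-2).

    region_total  -- regional withdrawal or consumption (e.g., km3 per year)
    proxy_by_cell -- mapping of cell id to proxy value (here, population)
    """
    proxy_sum = sum(proxy_by_cell.values())
    if proxy_sum == 0:
        # Assumption of this sketch: a region with no population cannot be allocated.
        raise ValueError("region has zero total proxy; cannot allocate")
    return {cell: region_total * value / proxy_sum
            for cell, value in proxy_by_cell.items()}


# Hypothetical region with three cells and a regional withdrawal of 10 km3.
population = {"cell_a": 800_000, "cell_b": 150_000, "cell_c": 50_000}
withdrawal = downscale_by_proxy(10.0, population)
print(withdrawal)
```

Because every cell receives the regional total multiplied by its population share, the cell values sum back to the regional total, which is the property checked in the Technical Validation section.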

Spatial downscaling – livestock

Spatial downscaling of livestock water use is calculated using gridded global maps from the FAO gridded livestock of the world (Wint and Robinson, 2007)32 dataset for six types of livestock (cattle, buffalo, sheep, goats, pigs, and poultry). GCAM outputs are organized into five types (beef, dairy, pork, poultry, and “sheepgoat”) and these are first reorganized to match the six types from Wint and Robinson, 200732 using ratios for each region estimated from the dataset. The ratios are stored in two files that are used as inputs to Tethys: bfracFAO2005.csv (“buffalo fraction”) and gfracFAO2005.csv (“goat fraction”). The following formulas are used to map the water withdrawals and consumption values for the five GCAM livestock types to the six livestock types from Wint and Robinson, 200732 for each region:

$${\rm{buffalo}}=\left({\rm{beef}}+{\rm{dairy}}\right)\times {\rm{buffalo}}\_{\rm{fraction}}$$
(3)
$${\rm{cattle}}=\left({\rm{beef}}+{\rm{dairy}}\right)\times \left(1-{\rm{buffalo}}\_{\rm{fraction}}\right)$$
(4)
$${\rm{goat}}=\left({\rm{sheepgoat}}\right)\times {\rm{goat}}\_{\rm{fraction}}$$
(5)
$${\rm{sheep}}=\left({\rm{sheepgoat}}\right)\times (1-{\rm{goat}}\_{\rm{fraction}})$$
(6)

No adjustment is required for pork (pigs) or poultry. After this, downscaling for each livestock type is very similar to downscaling the nonagricultural sectors, with the exception that the respective livestock population (heads) is used as the proxy instead of human population.

$${{\rm{withdrawal}}}_{{\rm{animal}},{\rm{cell}}}={{\rm{withdrawal}}}_{{\rm{animal}},{\rm{region}}}\times \frac{{{\rm{heads}}}_{{\rm{animal}},{\rm{cell}}}}{{{\rm{heads}}}_{{\rm{animal}},{\rm{region}}}}$$
(7)
$${{\rm{consumption}}}_{{\rm{animal}},{\rm{cell}}}={{\rm{consumption}}}_{{\rm{animal}},{\rm{region}}}\times \frac{{{\rm{heads}}}_{{\rm{animal}},{\rm{cell}}}}{{{\rm{heads}}}_{{\rm{animal}},{\rm{region}}}}$$
(8)

The results for each of the six types are then added together to get the total livestock withdrawal and consumption for each cell:

$${{\rm{withdrawal}}}_{{\rm{livestock}},{\rm{cell}}}=\left(\begin{array}{c}{{\rm{withdrawal}}}_{{\rm{cattle}},{\rm{cell}}}+\\ \begin{array}{c}{{\rm{withdrawal}}}_{{\rm{buffalo}},{\rm{cell}}}+\\ {{\rm{withdrawal}}}_{{\rm{sheep}},{\rm{cell}}}+\\ {{\rm{withdrawal}}}_{{\rm{goat}},{\rm{cell}}}+\\ {{\rm{withdrawal}}}_{{\rm{pigs}},{\rm{cell}}}+\\ {{\rm{withdrawal}}}_{{\rm{poultry}},{\rm{cell}}}\end{array}\end{array}\right)$$
(9)
$${{\rm{consumption}}}_{{\rm{livestock}},{\rm{cell}}}=\left(\begin{array}{c}{{\rm{consumption}}}_{{\rm{cattle}},{\rm{cell}}}+\\ \begin{array}{c}{{\rm{consumption}}}_{{\rm{buffalo}},{\rm{cell}}}+\\ {{\rm{consumption}}}_{{\rm{sheep}},{\rm{cell}}}+\\ {{\rm{consumption}}}_{{\rm{goat}},{\rm{cell}}}+\\ {{\rm{consumption}}}_{{\rm{pigs}},{\rm{cell}}}+\\ {{\rm{consumption}}}_{{\rm{poultry}},{\rm{cell}}}\end{array}\end{array}\right)$$
(10)
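As an illustration of Eqs. 3 to 10, the following Python sketch maps the five GCAM livestock types to the six mapped types, downscales each by head count, and sums the result per cell. It is a simplified sketch rather than the Tethys code: the function names, the fraction values and the head counts are hypothetical, and the real fractions are read per region from the input files named above.

```python
def split_livestock(gcam, buffalo_fraction, goat_fraction):
    """Map five GCAM livestock types to six mapped types (Eqs. 3-6)."""
    bovine = gcam["beef"] + gcam["dairy"]
    return {
        "buffalo": bovine * buffalo_fraction,
        "cattle": bovine * (1 - buffalo_fraction),
        "goat": gcam["sheepgoat"] * goat_fraction,
        "sheep": gcam["sheepgoat"] * (1 - goat_fraction),
        "pigs": gcam["pork"],        # no adjustment required
        "poultry": gcam["poultry"],  # no adjustment required
    }


def downscale_livestock(region_by_animal, heads_by_animal):
    """Downscale each animal type by head count and sum per cell (Eqs. 7-10)."""
    cell_totals = {}
    for animal, total in region_by_animal.items():
        heads = heads_by_animal[animal]
        heads_sum = sum(heads.values())
        if heads_sum == 0:
            if total:
                # Assumption of this sketch: flag water use with no animals to host it.
                raise ValueError(f"no {animal} heads in region")
            continue
        for cell, n in heads.items():
            cell_totals[cell] = cell_totals.get(cell, 0.0) + total * n / heads_sum
    return cell_totals


# Hypothetical regional withdrawals (km3) and fractions.
gcam = {"beef": 3.0, "dairy": 1.0, "sheepgoat": 2.0, "pork": 1.0, "poultry": 0.5}
regional = split_livestock(gcam, buffalo_fraction=0.25, goat_fraction=0.5)

# Hypothetical head counts for a two-cell region.
heads = {
    "buffalo": {"c1": 10, "c2": 30},
    "cattle": {"c1": 100, "c2": 100},
    "goat": {"c1": 0, "c2": 50},
    "sheep": {"c1": 20, "c2": 20},
    "pigs": {"c1": 5, "c2": 15},
    "poultry": {"c1": 400, "c2": 100},
}
cells = downscale_livestock(regional, heads)
print(cells)
```

The remapping step conserves the regional total (the buffalo and cattle shares sum to beef plus dairy, and the goat and sheep shares sum to sheepgoat), so the per-cell livestock totals again add up to the GCAM regional value.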

Spatial downscaling – irrigation

GCAM irrigation water withdrawal and consumption outputs are organized by 13 crop types: Biomass, Corn, Fiber Crop, Miscellaneous Crop, Oil Crop, Other Grain, Palm Fruit, Rice, Root Tuber, Sugar Crop, Wheat, Fodder Herb, and Fodder Grass. By downscaling GCAM output, Demeter22 provides a spatial landcover breakdown for each crop type. Because the Demeter outputs used in this study were harmonized to match the land areas of a base map, they are first converted back to be consistent with GCAM. Using these adjusted irrigation area values for each crop, cell withdrawal and consumption values are given by:

$${{\rm{withdrawal}}}_{{\rm{crop}},{\rm{cell}}}={{\rm{withdrawal}}}_{{\rm{crop}},{\rm{region}},{\rm{basin}}}\times \frac{{{\rm{area}}}_{{\rm{crop}},{\rm{cell}}}}{{{\rm{area}}}_{{\rm{crop}},{\rm{region}},{\rm{basin}}}}$$
(11)
$${{\rm{consumption}}}_{{\rm{crop}},{\rm{cell}}}={{\rm{consumption}}}_{{\rm{crop}},{\rm{region}},{\rm{basin}}}\times \frac{{{\rm{area}}}_{{\rm{crop}},{\rm{cell}}}}{{{\rm{area}}}_{{\rm{crop}},{\rm{region}},{\rm{basin}}}}$$
(12)

In cases where the GCAM outputs for a region-basin have nonzero irrigation of a crop type, but Demeter shows no corresponding cells (due to the harmonization with the base map), the distribution is assumed to be proportional to land area. Note that in the current version of Tethys (v.1.3.1) used in this paper, biomass is also downscaled uniformly within a region-basin intersection (with respect to land area), as given by:

$${{\rm{withdrawal}}}_{{\rm{biomass}},{\rm{cell}}}={{\rm{withdrawal}}}_{{\rm{biomass}},{\rm{region}},{\rm{basin}}}\times \frac{{{\rm{area}}}_{{\rm{cell}}}}{{{\rm{area}}}_{{\rm{region}},{\rm{basin}}}}$$
(13)
$${{\rm{consumption}}}_{{\rm{biomass}},{\rm{cell}}}={{\rm{consumption}}}_{{\rm{biomass}},{\rm{region}},{\rm{basin}}}\times \frac{{{\rm{area}}}_{{\rm{cell}}}}{{{\rm{area}}}_{{\rm{region}},{\rm{basin}}}}$$
(14)

The total irrigation sector value for a cell is the sum of that cell’s values for all 13 crops.
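The irrigation rules above (Eqs. 11 to 14, plus the fallback to land area) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the Tethys code: the function name `downscale_crop`, the cell labels and the area values are hypothetical, and both mappings are assumed to cover the same cells of one region-basin intersection.

```python
def downscale_crop(total, crop_area_by_cell, land_area_by_cell):
    """Allocate a region-basin crop total to cells (Eqs. 11-12).

    Falls back to land area when no irrigated area of the crop is mapped
    in the region-basin, as described in the text.
    """
    crop_sum = sum(crop_area_by_cell.values())
    weights = crop_area_by_cell if crop_sum > 0 else land_area_by_cell
    weight_sum = sum(weights.values())
    return {cell: total * w / weight_sum for cell, w in weights.items()}


# Hypothetical two-cell region-basin; areas in km2, totals in km3.
land = {"a": 2500.0, "b": 2500.0}

wheat = downscale_crop(4.0, {"a": 30.0, "b": 10.0}, land)   # proportional to crop area
orphan = downscale_crop(4.0, {"a": 0.0, "b": 0.0}, land)    # fallback to land area
biomass = downscale_crop(2.0, {}, land)                     # Eqs. 13-14: land area only

# Total irrigation per cell is the sum over crops.
irrigation = {c: wheat[c] + orphan[c] + biomass[c] for c in land}
print(irrigation)
```

Passing an empty crop-area mapping for biomass reuses the same fallback path, which mirrors the uniform (land-area) treatment of biomass described for Tethys v1.3.1.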

Temporal downscaling – domestic

Temporally downscaling domestic withdrawal and consumption uses the following formula from Wada et al., 201133. The R parameter described below is from Huang et al. 201813 and temperature data is from Weedon et al. 201434. Withdrawals and consumption for each month of a year for each cell are given by the formula:

$${{\rm{withdrawal}}}_{{\rm{month}}}=\frac{{{\rm{withdrawal}}}_{{\rm{year}}}}{12}\left[\left(\frac{{{\rm{temp}}}_{{\rm{month}}}-{{\rm{temp}}}_{{\rm{mean}}}}{{{\rm{temp}}}_{{\rm{\max }}}-{{\rm{temp}}}_{{\rm{\min }}}}\right){\rm{R}}+1\right]$$
(15)
$${{\rm{consumption}}}_{{\rm{month}}}=\frac{{{\rm{consumption}}}_{{\rm{year}}}}{12}\left[\left(\frac{{{\rm{temp}}}_{{\rm{month}}}-{{\rm{temp}}}_{{\rm{mean}}}}{{{\rm{temp}}}_{{\rm{\max }}}-{{\rm{temp}}}_{{\rm{\min }}}}\right){\rm{R}}+1\right]$$
(16)

Where:

tempmonth = Average temperature for the month

tempmean = Mean monthly temperature for the year

tempmax = Max monthly temperature for the year

tempmin = Min monthly temperature for the year

R = Parameter representing the relative difference of water use between the warmest and coolest months of the year
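A short Python sketch of Eqs. 15 and 16 is given below for illustration. The function name, the temperature series and the value of R are hypothetical, and the handling of a zero temperature range is an assumption of this sketch rather than something stated in the source.

```python
def domestic_monthly(annual, monthly_temps, r):
    """Distribute an annual domestic value over 12 months (Eqs. 15-16)."""
    if len(monthly_temps) != 12:
        raise ValueError("expected 12 monthly temperatures")
    t_mean = sum(monthly_temps) / 12
    t_range = max(monthly_temps) - min(monthly_temps)
    if t_range == 0:
        # Assumption: with no seasonal temperature signal, split evenly.
        return [annual / 12] * 12
    return [annual / 12 * ((t - t_mean) / t_range * r + 1)
            for t in monthly_temps]


# Hypothetical cell: monthly mean temperatures (deg C), annual use of 12 km3.
temps = [2, 4, 8, 13, 18, 22, 25, 24, 20, 14, 8, 3]
monthly = domestic_monthly(12.0, temps, r=0.5)
print([round(v, 3) for v in monthly])
```

Two properties follow directly from the formula. The deviations from the mean temperature sum to zero, so the monthly values add up to the annual value. The difference between the warmest and coolest months equals R times the average monthly value, which matches the definition of R above.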

Temporal downscaling – electricity generation

Water withdrawal and consumption for electricity generation each month are assumed to be proportional to the amount of electricity consumed, using the formula developed in Voisin et al., 201335:

$${{\rm{withdrawal}}}_{{\rm{month}}}={{\rm{withdrawal}}}_{{\rm{year}}}\left[{{\rm{\rho }}}_{{\rm{b}}}\left(\begin{array}{c}{{\rm{\rho }}}_{{\rm{h}}}\frac{{{\rm{HDD}}}_{{\rm{month}}}}{{{\rm{HDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{c}}}\frac{{{\rm{CDD}}}_{{\rm{month}}}}{{{\rm{CDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{u}}}\frac{1}{12}\end{array}\right)+{{\rm{\rho }}}_{{\rm{it}}}\frac{1}{12}\right]$$
(17)
$${{\rm{consumption}}}_{{\rm{month}}}={{\rm{consumption}}}_{{\rm{year}}}\left[{{\rm{\rho }}}_{{\rm{b}}}\left(\begin{array}{c}{{\rm{\rho }}}_{{\rm{h}}}\frac{{{\rm{HDD}}}_{{\rm{month}}}}{{{\rm{HDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{c}}}\frac{{{\rm{CDD}}}_{{\rm{month}}}}{{{\rm{CDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{u}}}\frac{1}{12}\end{array}\right)+{{\rm{\rho }}}_{{\rm{it}}}\frac{1}{12}\right]$$
(18)

Where:

ρb = Proportion of electricity used for buildings

ρit = Proportion of electricity used for industry and transportation

ρb + ρit = 1

ρh = Proportion of electricity used for buildings heating

ρc = Proportion of electricity used for buildings cooling

ρu = Proportion of electricity used for buildings other

ρh + ρc + ρu = 1

HDD = Heating Degree Days

CDD = Cooling Degree Days

Heating degree days (HDD) and cooling degree days (CDD) are indicators for the amount of electricity used to heat and cool buildings, and are calculated from mean daily outdoor air temperature. HDD for a month is the sum of (18 °C − daily temperature) across all days in that month where the temperature is less than 18 °C. CDD for a month is the sum of (daily temperature − 18 °C) across all days in that month where the temperature is greater than 18 °C. Annual HDD and CDD are the sums of their respective monthly values.
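The degree-day definitions can be written as a few lines of Python. This is a minimal sketch with a hypothetical function name and made-up daily temperatures; it only restates the 18 °C rule described above.

```python
BASE_C = 18.0  # base temperature in degrees Celsius, as stated in the text


def degree_days(daily_mean_temps):
    """Return (HDD, CDD) for a sequence of mean daily temperatures in deg C."""
    hdd = sum(BASE_C - t for t in daily_mean_temps if t < BASE_C)
    cdd = sum(t - BASE_C for t in daily_mean_temps if t > BASE_C)
    return hdd, cdd


# Four hypothetical days: two cold, one exactly at the base, one hot.
hdd, cdd = degree_days([10.0, 18.0, 25.0, 16.5])
print(hdd, cdd)  # 9.5 7.0
```

A day at exactly 18 °C contributes to neither total, and applying the function to all days of a month gives the monthly values that are summed to obtain annual HDD and CDD.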

Tethys uses HDD, CDD, and ρ values for each cell from the nearest available year in the input files listed at the end of this subsection, which is 2010 for this data set.

The formula is modified for cells with low annual HDD or CDD as described in Huang et al., 201813, since these may not have heating or cooling services despite nonzero values of ρh or ρc.

When HDDyear<650, the HDD term is removed (leaving only CDD) and ρh is reallocated to the cooling proportion, giving:

$${{\rm{withdrawal}}}_{{\rm{month}}}={{\rm{withdrawal}}}_{{\rm{year}}}\left[{{\rm{\rho }}}_{{\rm{b}}}\left(\begin{array}{c}\left({{\rm{\rho }}}_{{\rm{h}}}+{{\rm{\rho }}}_{{\rm{c}}}\right)\frac{{{\rm{CDD}}}_{{\rm{month}}}}{{{\rm{CDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{u}}}\frac{1}{12}\end{array}\right)+{{\rm{\rho }}}_{{\rm{it}}}\frac{1}{12}\right]$$
(19)
$${{\rm{consumption}}}_{{\rm{month}}}={{\rm{consumption}}}_{{\rm{year}}}\left[{{\rm{\rho }}}_{{\rm{b}}}\left(\begin{array}{c}\left({{\rm{\rho }}}_{{\rm{h}}}+{{\rm{\rho }}}_{{\rm{c}}}\right)\frac{{{\rm{CDD}}}_{{\rm{month}}}}{{{\rm{CDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{u}}}\frac{1}{12}\end{array}\right)+{{\rm{\rho }}}_{{\rm{it}}}\frac{1}{12}\right]$$
(20)

When CDDyear<450, the CDD term is removed (leaving only HDD) and ρc is reallocated to the heating proportion, giving:

$${{\rm{withdrawal}}}_{{\rm{month}}}={{\rm{withdrawal}}}_{{\rm{year}}}\left[{{\rm{\rho }}}_{{\rm{b}}}\left(\begin{array}{c}\left({{\rm{\rho }}}_{{\rm{h}}}+{{\rm{\rho }}}_{{\rm{c}}}\right)\frac{{{\rm{HDD}}}_{{\rm{month}}}}{{{\rm{HDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{u}}}\frac{1}{12}\end{array}\right)+{{\rm{\rho }}}_{{\rm{it}}}\frac{1}{12}\right]$$
(21)
$${{\rm{consumption}}}_{{\rm{month}}}={{\rm{consumption}}}_{{\rm{year}}}\left[{{\rm{\rho }}}_{{\rm{b}}}\left(\begin{array}{c}\left({{\rm{\rho }}}_{{\rm{h}}}+{{\rm{\rho }}}_{{\rm{c}}}\right)\frac{{{\rm{HDD}}}_{{\rm{month}}}}{{{\rm{HDD}}}_{{\rm{year}}}}+\\ {{\rm{\rho }}}_{{\rm{u}}}\frac{1}{12}\end{array}\right)+{{\rm{\rho }}}_{{\rm{it}}}\frac{1}{12}\right]$$
(22)

When annual HDD and CDD are both below their respective thresholds (<650 for HDD and <450 for CDD), all sources of monthly variation vanish and the formula reduces to

$${{\rm{withdrawal}}}_{{\rm{month}}}=\frac{{{\rm{withdrawal}}}_{{\rm{year}}}}{12}$$
(23)
$${{\rm{consumption}}}_{{\rm{month}}}=\frac{{{\rm{consumption}}}_{{\rm{year}}}}{12}$$
(24)
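The four cases in Eqs. 17 to 24 can be combined into one function, sketched below in Python. This is an illustration and not the Tethys implementation: the function and argument names are hypothetical, the ρ values and degree-day series are made up, and treating a value exactly equal to a threshold as "not low" is an assumption.

```python
HDD_MIN = 650  # annual HDD threshold from the text
CDD_MIN = 450  # annual CDD threshold from the text


def electricity_monthly(annual, hdd, cdd, p_b, p_it, p_h, p_c, p_u):
    """Monthly electricity water use from an annual value (Eqs. 17-24).

    hdd, cdd -- 12 monthly degree-day values each
    p_b + p_it = 1 and p_h + p_c + p_u = 1 are assumed to hold.
    """
    hdd_year, cdd_year = sum(hdd), sum(cdd)
    use_h, use_c = hdd_year >= HDD_MIN, cdd_year >= CDD_MIN
    result = []
    for m in range(12):
        if use_h and use_c:        # Eqs. 17-18
            seasonal = p_h * hdd[m] / hdd_year + p_c * cdd[m] / cdd_year
        elif use_c:                # low HDD: Eqs. 19-20
            seasonal = (p_h + p_c) * cdd[m] / cdd_year
        elif use_h:                # low CDD: Eqs. 21-22
            seasonal = (p_h + p_c) * hdd[m] / hdd_year
        else:                      # both low: Eqs. 23-24
            seasonal = (p_h + p_c) / 12
        result.append(annual * (p_b * (seasonal + p_u / 12) + p_it / 12))
    return result


shares = dict(p_b=0.6, p_it=0.4, p_h=0.3, p_c=0.2, p_u=0.5)

# Hypothetical hot cell: no heating demand, strong summer cooling demand.
hot = electricity_monthly(
    12.0, [0] * 12, [0, 0, 50, 100, 200, 300, 350, 300, 200, 100, 0, 0], **shares)

# Hypothetical mild cell: both annual totals fall below their thresholds.
mild = electricity_monthly(12.0, [10] * 12, [10] * 12, **shares)
print([round(v, 3) for v in hot])
print(mild)
```

In every branch the bracketed weights sum to one over the year, so the monthly values always add up to the annual value; only the seasonal shape changes between cases.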

Temporal downscaling – livestock, manufacturing and mining

For livestock, manufacturing, and mining, a uniform distribution is applied. The withdrawal or consumption for the year is divided between months according to the number of days.

$${{\rm{withdrawal}}}_{{\rm{month}}}={{\rm{withdrawal}}}_{{\rm{year}}}\times \frac{{{\rm{days}}}_{{\rm{month}}}}{{{\rm{days}}}_{{\rm{year}}}}$$
(25)
$${{\rm{consumption}}}_{{\rm{month}}}={{\rm{consumption}}}_{{\rm{year}}}\times \frac{{{\rm{days}}}_{{\rm{month}}}}{{{\rm{days}}}_{{\rm{year}}}}$$
(26)
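Equations 25 and 26 amount to weighting each month by its length, as in the Python sketch below. The function name is hypothetical, and using the calendar year to obtain month lengths (including leap years) is an assumption of this sketch; the source does not state how Tethys treats leap years.

```python
import calendar


def uniform_monthly(annual, year):
    """Split an annual value across months by number of days (Eqs. 25-26)."""
    days = [calendar.monthrange(year, month)[1] for month in range(1, 13)]
    days_in_year = sum(days)
    return [annual * d / days_in_year for d in days]


# Hypothetical annual value of 365 units in 2010 gives one unit per day.
monthly = uniform_monthly(365.0, 2010)
print(monthly[:3])
```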

Temporal Downscaling – Irrigation

Temporal downscaling for irrigation water withdrawal and consumption is based on weighted irrigation profiles for each of the 235 basins. Gridded monthly irrigation withdrawal values from the PCR-GLOBWB global hydrological model (from Huang et al. 201813, original data from ISIMIP36) are averaged across the years 1971–2010, then aggregated to the basin scale. The monthly irrigation withdrawal percentages for a basin are applied to all crops in each of its cells.

$${{\rm{withdrawal}}}_{{\rm{month}}}={{\rm{withdrawal}}}_{{\rm{year}}}\times {{\rm{percent}}}_{{\rm{basin}},{\rm{month}}}$$
(27)
$${{\rm{consumption}}}_{{\rm{month}}}={{\rm{consumption}}}_{{\rm{year}}}\times {{\rm{percent}}}_{{\rm{basin}},{\rm{month}}}$$
(28)

In the event that the model has no monthly data for a basin with nonzero irrigation, the profile of the nearest available basin is used.
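The profile-based approach in Eqs. 27 and 28 can be sketched in Python as follows. This is a simplified illustration, not the Tethys code: the function names and numbers are hypothetical, the input is assumed to be already aggregated to the basin scale, and the nearest-basin fallback is not shown.

```python
def basin_profile(monthly_by_year):
    """Build a 12-value profile of monthly shares from multi-year basin data.

    monthly_by_year -- list of years, each a list of 12 monthly withdrawals
    """
    n_years = len(monthly_by_year)
    mean = [sum(year[m] for year in monthly_by_year) / n_years
            for m in range(12)]
    total = sum(mean)
    return [value / total for value in mean]


def irrigation_monthly(annual, profile):
    """Apply a basin profile to an annual value (Eqs. 27-28)."""
    return [annual * share for share in profile]


# Two hypothetical years of basin-level monthly irrigation withdrawals.
history = [
    [0, 0, 1, 2, 4, 6, 8, 6, 3, 1, 0, 0],
    [0, 0, 1, 4, 4, 6, 6, 6, 3, 1, 0, 0],
]
profile = basin_profile(history)
monthly = irrigation_monthly(5.0, profile)
print([round(v, 3) for v in monthly])
```

Since the profile shares sum to one, applying the same profile to every crop in a basin preserves each crop's annual total while giving all crops the same seasonal shape.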

Data Records

Data outputs from this experiment have been minted and are available in the repository indicated in Table 2. A meta-repository with detailed information on the workflows to produce the data is also available and shown in Table 2.

Table 2 Data records.

The dataset contains separate files with names which start with a combination of the following SSP, RCP, GCM and water usage type:

  • SSP: ssp1, ssp2, ssp3, ssp4, ssp5

  • RCP: rcp26, rcp45, rcp60, rcp85

  • GCM: gfdl, hadgem, ipsl, miroc, noresm

  • Water use type: consumption, withdrawals

Example 1: ssp1_rcp26_gfdl_consumption_XXX

Example 2: ssp1_rcp26_gfdl_withdrawal_XXX

The dataset files have then been divided into subsets to manage their size. The following list shows the file structure for one of the SSP, RCP, GCM combinations:

  • ssp1_rcp26_gfdl_consumption_crops_annual.zip

  • ssp1_rcp26_gfdl_consumption_crops_monthly_1.zip

  • ssp1_rcp26_gfdl_consumption_crops_monthly_2.zip

  • ssp1_rcp26_gfdl_consumption_sectors_annual.zip

  • ssp1_rcp26_gfdl_consumption_sectors_monthly_1.zip

  • ssp1_rcp26_gfdl_consumption_sectors_monthly_2.zip

The files with “_crops_” in their names include data for individual crops while the files with “_sectors_” in their name include data for other aggregated sectors. The following expanded list shows the individual files inside the zipped files for the example ssp1_rcp26_gfdl cases. “cd” stands for “consumption downscaled” and “tcd” stands for “temporal consumption downscaled”:

  • ssp1_rcp26_gfdl_consumption_crops_annual.zip

    • crops_cdirr_biomass_km3peryr.csv

    • crops_cdirr_Corn_km3peryr.csv

    • crops_cdirr_FiberCrop_km3peryr.csv

    • crops_cdirr_FodderGrass_km3peryr.csv

    • crops_cdirr_FodderHerb_km3peryr.csv

    • crops_cdirr_MiscCrop_km3peryr.csv

    • crops_cdirr_OilCrop_km3peryr.csv

    • crops_cdirr_OtherGrain_km3peryr.csv

    • crops_cdirr_PalmFruit_km3peryr.csv

    • crops_cdirr_Rice_km3peryr.csv

    • crops_cdirr_Root_Tuber_km3peryr.csv

    • crops_cdirr_SugarCrop_km3peryr.csv

    • crops_cdirr_Wheat_km3peryr.csv

  • ssp1_rcp26_gfdl_consumption_crops_monthly_1.zip

    • crops_tcdirr_biomass_km3peryr.csv

    • crops_tcdirr_Corn_km3peryr.csv

    • crops_tcdirr_FiberCrop_km3peryr.csv

    • crops_tcdirr_FodderGrass_km3peryr.csv

    • crops_tcdirr_FodderHerb_km3peryr.csv

    • crops_tcdirr_MiscCrop_km3peryr.csv

    • crops_tcdirr_OilCrop_km3peryr.csv

  • ssp1_rcp26_gfdl_consumption_crops_monthly_2.zip

    • crops_tcdirr_OtherGrain_km3peryr.csv

    • crops_tcdirr_PalmFruit_km3peryr.csv

    • crops_tcdirr_Rice_km3peryr.csv

    • crops_tcdirr_Root_Tuber_km3peryr.csv

    • crops_tcdirr_SugarCrop_km3peryr.csv

    • crops_tcdirr_Wheat_km3peryr.csv

  • ssp1_rcp26_gfdl_consumption_sectors_annual.zip

    • cddom_km3peryr.csv(Domestic)

    • cdelec_km3peryr.csv(Electricity Generation)

    • cdirr_km3peryr.csv(Irrigation)

    • cdliv_km3peryr.csv(Livestock)

    • cdmfg_km3peryr.csv(Industry & manufacturing)

    • cdmin_km3peryr.csv(Mining)

    • cdnonag_km3peryr.csv(Aggregated non-agriculture)

    • cdtotal_km3peryr.csv(Total)

  • ssp1_rcp26_gfdl_consumption_sectors_monthly_1.zip

    • tcddom_km3peryr.csv(Domestic)

    • tcdelec_km3peryr.csv(Electricity Generation)

    • tcdirr_km3peryr.csv(Irrigation)

  • ssp1_rcp26_gfdl_consumption_sectors_monthly_2.zip

    • tcdliv_km3peryr.csv(Livestock)

    • tcdmfg_km3peryr.csv(Industry & manufacturing)

    • tcdmin_km3peryr.csv(Mining)
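For users who script their downloads, the archive naming pattern listed above can be generated with a small Python helper. This is only a sketch: the function name is hypothetical, and it assumes that withdrawal archives follow the same pattern as the consumption archives shown here, which should be checked against the repository listing.

```python
def archive_names(ssp, rcp, gcm, use_type):
    """Build the six archive names for one SSP-RCP-GCM combination.

    use_type -- water use label as it appears in the file names
                (e.g., "consumption"); other labels are an assumption.
    """
    prefix = f"{ssp}_{rcp}_{gcm}_{use_type}"
    names = []
    for group in ("crops", "sectors"):
        names.append(f"{prefix}_{group}_annual.zip")
        names.append(f"{prefix}_{group}_monthly_1.zip")
        names.append(f"{prefix}_{group}_monthly_2.zip")
    return names


names = archive_names("ssp1", "rcp26", "gfdl", "consumption")
for name in names:
    print(name)
```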

Technical Validation

GCAM outputs are calibrated at a regional scale to match observed data for base year values as described in Graham et al. 20204. Sectoral comparison between GCAM’s future water demand projections and other studies is carried out in the supporting information of Graham et al. 201826. In this study, validation is limited to ensuring that the downscaling algorithms in Tethys are free of errors and there is no loss in values as a result of the temporal or spatial downscaling methodology. The results of this study were validated by re-aggregating spatial and temporal downscaled model outputs and comparing them to the original aggregated inputs. Figure 4a shows how the disaggregated water withdrawal values in km3 equal the original values both spatially for GCAM regions and temporally for annual values across sectors and crops. Figure 4b shows the same validation for how the disaggregated water consumption values in km3 equal the original values both spatially for GCAM regions and temporally for annual values across sectors and crops.

Fig. 4

Validation of downscaled spatial and temporal Tethys water use. a) Water Withdrawals (km3) and b) Water Consumption (km3).
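The re-aggregation check described above can be expressed as a simple conservation test. The Python sketch below is illustrative only: the function name, tolerance and numbers are hypothetical and do not reproduce the validation scripts used for Fig. 4.

```python
import math


def reaggregates(original_total, downscaled_values, rel_tol=1e-9):
    """Return True if downscaled values sum back to the original total."""
    return math.isclose(math.fsum(downscaled_values), original_total,
                        rel_tol=rel_tol)


# Spatial check: hypothetical cell values against their regional total.
spatial_ok = reaggregates(7.5, [2.9, 4.6])

# Temporal check: hypothetical monthly values against their annual total.
monthly = [12.0 * d / 365 for d in (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)]
temporal_ok = reaggregates(12.0, monthly)

# A deliberately inconsistent case is flagged.
lossy_ok = reaggregates(7.5, [2.9, 4.5])
print(spatial_ok, temporal_ok, lossy_ok)
```

Using `math.fsum` limits floating-point accumulation error when many cell values are summed, so a failed check is more likely to indicate a genuine loss in the downscaling than rounding noise.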

Tethys outputs were also compared to results from two other studies, Huang et al. 201813 and Mekonnen, M.M. and Hoekstra, A.Y. 201115, as shown in Fig. 5. Given the larger number of variables and assumptions for future scenarios considered here, we limit the validation with other studies to historical data. Since this work is primarily concerned with the downscaling of existing projections to a gridded monthly scale, we look at how spatial and temporal patterns in the year 2010 (for which all scenarios are identical) compare to those of the chosen datasets.

Fig. 5

Spatial distribution of water withdrawals and consumption across this study (year 2010), Huang et al. 201813 (year 2010) and Mekonnen, M.M. and Hoekstra, A.Y. 201115 (average of years 1996–2005).

Huang et al. 201813 use an earlier version of Tethys on historical data from 1971–2010. The underlying data have more regions and different totals, but many of the downscaling methods are identical, leading to similar results. For the non-agricultural sectors (domestic, electricity, manufacturing, and mining), the same underlying population map is used to downscale water use. For irrigation, Huang et al. 201813 use United States Geological Survey (USGS) and Food and Agriculture Organization (FAO) AQUASTAT irrigation data, whereas the current version of Tethys uses crop landcover maps from Demeter. Consumption and withdrawals generally showed similar spatial patterns, with differences in assumptions regarding each region’s and sector’s consumption-to-withdrawal ratios accounting for some differences. There are also some differences in accounting. For example, in this study hydropower is included in the consumption for electricity generation category, and hydropower by itself is several times greater than the entire water consumption for electricity generation in Huang et al. 201813.

The second data set we compared with is from Mekonnen, M.M. and Hoekstra, A.Y. 201115. It contains monthly total blue water consumption values representing an average of years 1996–2005, which we compare to the base year values from 2010 from this study. The sectoral breakdown is different between the two datasets, but the datasets are at the same spatial-temporal resolution, so we compare monthly totals for each grid cell. Comparing datasets cell by cell is highly sensitive to local differences, and since our spatial downscaling is based on proxy quantities we do not expect every detail to be recreated exactly.

Nonetheless, there is general agreement in the sub-regional patterns across the data sets as seen in Fig. 5. Figure 6 also shows similar sub-annual patterns across the dataset with some differences in total values being attributed to underlying data and year of the study.

Fig. 6

Temporal distribution of global water withdrawals and consumption across this study (year 2010), Huang et al. 201813 (year 2010) and Mekonnen, M.M. and Hoekstra, A.Y. 201115 (average of years 1996–2005).

Table 3 Model and data code availability.

Usage Notes

Users are encouraged to explore the accompanying meta-repository (https://jgcri.github.io/khan-etal_2022_tethysSSPRCP/index.html), which provides detailed visualization across the various scenarios, sectors and time periods. Users can then download specific datasets for water withdrawal or consumption for relevant sectors, crops and desired SSP, RCP or GCM from the accompanying dataset repository20 (https://doi.org/10.7910/DVN/VIQEAB) to analyze the raw data. Some example figures from the meta-repository are presented in this section.

Figure 7a shows the total annual water withdrawals by sector for each of the 75 SSP-RCP-GCM combinations from 2010 to 2100. Similar figures are available for consumption as well as by crop. Figure 7b shows the sub-annual temporal distribution across the same set of scenarios for 2010 and for 2100. Patterns such as an increase in summer water withdrawals can be seen in such figures.

Fig. 7

Global water withdrawals for the 75 SSP-RCP-GCM combinations by sector. (a) Annual water withdrawals by sector from 2010 to 2100. (b) Monthly water withdrawals for 2010 and 2100. Lines of the same color within each plot represent the 5 different GCMs considered.

The meta-repository also includes details on three selected basins: the Indus, Nile and Upper Colorado River Basin (U.S.). These are used to show how the data can be used to explore trends and patterns at this finer resolution. Figure 8a,b are examples showing how land-use change impacts which type of crop becomes the dominant water user in the Indus basin over time for the SSP1-RCP2.6-GFDL scenario. Figure 8c,d show the accompanying distribution of total water withdrawals both spatially and temporally. Similar figures are provided in the meta-repository for water consumption as well as for other sectors, crops and scenarios.

Fig. 8

Indus Basin water withdrawals (km3) by crop for scenario SSP 1, RCP 2.6, GCM GFDL. (a) Showing which crop has the maximum water withdrawals (km3) in each grid cell for years 2025, 2050, 2075 and 2100. (b) Aggregated water withdrawals (km3) by crop in the Indus Basin from 2015 to 2100. (c) Showing total water withdrawals (km3) in each grid cell for years 2025, 2050, 2075 and 2100. (d) Aggregated total water withdrawals (km3) in the Indus Basin from 2015 to 2100.

We highlight that several developments have been planned in the next release of Tethys to improve the methodologies used to downscale water use for the dataset in this paper. Some of the key planned developments include:

  1. Improving the spatial distribution of powerplant water use based on actual and projected powerplant locations instead of population.

  2. Updating the output resolution to 1/8th degree from the existing ½ degree resolution.

  3. Including future population projections to improve on the current methodology, which uses a static base year population map even for future years.

  4. Improving the downscaling of biomass water use, which is currently distributed equally within each region.

  5. Making Tethys compatible with GCAM-USA37, which would allow the use of more accurate state-level water use data instead of national data as inputs to Tethys.

  6. Comparing gridded outputs against observational data for individual sectors and regions where data are available.