Seasonal forecasts offer economic benefit for hydrological decision making in semi-arid regions

Increasing frequencies of droughts require proactive preparedness, particularly in semi-arid regions. As forecasting of such hydrometeorological extremes several months ahead allows for necessary climate proofing, we assess the potential economic value of the seasonal forecasting system SEAS5 for decision making in water management. For seven drought-prone regions analyzed in America, Africa, and Asia, the relative frequency of drought months significantly increased from 10 to 30% between 1981 and 2018. We demonstrate that seasonal forecast-based action for droughts achieves potential economic savings of up to 70% of those from optimal early action. For very warm months and droughts, savings of at least 20% occur even for forecast horizons of several months. Our in-depth analysis for the Upper-Atbara dam in Sudan reveals avoidable losses of US$16 million in one example year for early-action-based drought reservoir operation. These findings stress the advantage and necessity of considering seasonal forecasts in hydrological decision making.


Drought indices
The Standardized Precipitation Index (SPI) 2 is calculated by the following steps:
1. Define the monthly precipitation dataset for a period of at least 30 years, separately for the ensemble forecast and the reference.
2. Aggregate each dataset over the selected timescales, e.g., over i = 1, 3, 4 or 6 months, in a forecast manner rather than a retrospective one. The aggregation uses a moving window: for each month, a new aggregated value is calculated from the current and the next i − 1 months. For example, SPI6 for June accumulates the values from June to November. For the forecasts, this implies using lead times 0-5 of the forecast issued in June, for each ensemble member.
3. Fit a probability distribution to the aggregated values, separately for the forecasts and the reference.
4. Transform the cumulative probability of each aggregated value to the standard normal distribution. The SPI is the resulting standard normal deviate, i.e., the derived deviation for a standard normally distributed probability density with zero mean.
With separate distributions for reference and forecasts, current values are only compared within the distribution of their own system (reference or forecast). A chosen threshold of SPI < −1 thus defines a system-specific quantile value, representing one standard deviation below the system's mean. Accordingly, standard bias correction approaches like linear scaling (correction of the mean) or quantile mapping (correction of absolute quantile values) at the same temporal (monthly) and spatial (basin-mean) resolutions have no effect on the index.
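The forward aggregation and the standardization within one system can be illustrated with a minimal Python sketch. It is not the implementation used in this study: as a simplifying assumption, a rank-based plotting position replaces the fitted probability distribution, and the function names and the 30 hypothetical June-November totals are invented for illustration. The sketch also shows why a linear scaling of the input leaves the index unchanged.

```python
from statistics import NormalDist


def forward_aggregate(monthly, start, i):
    """Sum the value at index `start` and the next i - 1 values."""
    return sum(monthly[start:start + i])


def standardize(sample):
    """Map each value to a standard normal deviate within its own sample.

    Assumption: the empirical plotting position (rank - 0.5) / n stands in
    for the cumulative probability of a fitted distribution. Tied values
    share their average rank.
    """
    n = len(sample)
    normal = NormalDist()
    result = []
    for v in sample:
        below = sum(1 for u in sample if u < v)
        equal = sum(1 for u in sample if u == v)
        average_rank = below + (equal + 1) / 2
        result.append(normal.inv_cdf((average_rank - 0.5) / n))
    return result


# Hypothetical June-November precipitation totals (mm) for 30 years.
totals = [50 + (7 * k % 30) * 5 for k in range(30)]
spi6 = standardize(totals)

# Linear scaling of the input (a mean correction) keeps all ranks,
# so the standardized values are identical.
spi6_scaled = standardize([1.2 * v for v in totals])

drought_years = [k for k, value in enumerate(spi6) if value < -1]
```

Because only the position of a value within its own sample enters the index, the drought years selected by SPI < −1 are the same before and after the scaling.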
For the Standardized Precipitation Evapotranspiration Index (SPEI), a simple water balance between precipitation (P) and potential evapotranspiration (PET) is calculated to derive the water deficit or surplus D. According to Vicente-Serrano et al., 2010 3, the method used to calculate PET is not critical for its use in the drought index. Therefore, we followed their calculation of monthly PET (mm) with the simple approach by Thornthwaite 1948 4, which requires only data on monthly-mean temperature (T in °C):

$$\mathrm{PET} = 16\,K \left( \frac{10\,T}{I} \right)^{m}$$

For the temperature-dependent heat index I, the coefficient m depending on I, and the correction coefficient K depending on latitude and month, we refer to Vicente-Serrano et al., 2010 3. The water deficit or surplus is then calculated for each year, month and lead time, separately for the forecasts and reference, as

$$D = P - \mathrm{PET}$$

Similarly to SPI, the derived values of D are then aggregated over different timescales, and steps 2-4 above are applied.
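The water balance can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used here: the expressions for I and m follow the commonly cited Thornthwaite formulation as summarized by Vicente-Serrano et al., 2010 (they are not spelled out in this text), K is passed in as a plain number instead of being derived from latitude and month, and all function names and input values are hypothetical.

```python
def heat_index(monthly_temperatures):
    """Heat index I: sum of the 12 monthly terms (T / 5) ** 1.514.

    Months with T <= 0 contribute nothing (assumed convention).
    """
    return sum((t / 5.0) ** 1.514 for t in monthly_temperatures if t > 0)


def exponent_m(heat_idx):
    """Coefficient m as a cubic polynomial in I (assumed standard form)."""
    return (6.75e-7 * heat_idx ** 3
            - 7.71e-5 * heat_idx ** 2
            + 1.79e-2 * heat_idx
            + 0.492)


def thornthwaite_pet(temperature, heat_idx, k=1.0):
    """Monthly PET (mm) = 16 K (10 T / I) ** m; zero for T <= 0."""
    if temperature <= 0:
        return 0.0
    return 16.0 * k * (10.0 * temperature / heat_idx) ** exponent_m(heat_idx)


def water_balance(precipitation, pet):
    """Water deficit (negative) or surplus (positive): D = P - PET."""
    return precipitation - pet


# Hypothetical climate: 25 degrees C in every month, 40 mm of rain in one month.
temperatures = [25.0] * 12
I = heat_index(temperatures)
pet = thornthwaite_pet(25.0, I, k=1.0)
D = water_balance(40.0, pet)
```

A negative D marks a month in which the atmospheric demand exceeds the precipitation supply. The series of D values would then be aggregated and standardized in the same way as precipitation for the SPI.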

Bootstrapping algorithm
As stated in the main text, the bootstrapping algorithm is applied to estimate the uncertainty of the hit rate H, the false alarm rate F and the potential economic value PEV that arises from sampling errors of the considered extreme events.
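As a hedged illustration of the general idea, the following Python sketch resamples forecast-observation pairs with replacement and recomputes H, F and PEV for each resample. The details are assumptions, not the procedure of this study: the resampling unit, the number of resamples, the percentile interval and the cost-loss expression for PEV (written in the widely used form of Richardson, 2000, with cost-loss ratio alpha and event base rate s) are chosen for illustration, and the example data are hypothetical.

```python
import random


def hit_and_false_alarm_rate(forecast, observed):
    """Return (H, F) from paired boolean event forecasts and observations."""
    pairs = list(zip(forecast, observed))
    hits = sum(1 for f, o in pairs if f and o)
    misses = sum(1 for f, o in pairs if not f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    correct_negatives = sum(1 for f, o in pairs if not f and not o)
    return (hits / (hits + misses),
            false_alarms / (false_alarms + correct_negatives))


def potential_economic_value(H, F, s, alpha):
    """Cost-loss PEV: 1 for a perfect forecast, 0 for climatology."""
    numerator = min(alpha, s) - F * alpha * (1 - s) + H * s * (1 - alpha) - s
    return numerator / (min(alpha, s) - s * alpha)


def bootstrap_scores(forecast, observed, alpha, n_boot=1000, seed=0):
    """Resample pairs with replacement and return a list of (H, F, PEV)."""
    rng = random.Random(seed)
    n = len(observed)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        f = [forecast[i] for i in idx]
        o = [observed[i] for i in idx]
        n_events = sum(o)
        if n_events == 0 or n_events == n:
            continue  # H or F is undefined for this resample
        H, F = hit_and_false_alarm_rate(f, o)
        pev = potential_economic_value(H, F, n_events / n, alpha)
        scores.append((H, F, pev))
    return scores


def percentile_interval(values, lower=0.05, upper=0.95):
    """Simple percentile interval from the sorted bootstrap values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[round(lower * last)], ordered[round(upper * last)]


# Hypothetical 40-month record: 8 events, 2 misses, 3 false alarms.
observed = [k % 5 == 0 for k in range(40)]
forecast = [(k % 5 == 0 and k % 20 != 0) or k % 13 == 1 for k in range(40)]

scores = bootstrap_scores(forecast, observed, alpha=0.2)
hit_rate_interval = percentile_interval([h for h, _, _ in scores])
```

The width of the resulting interval reflects how strongly the scores depend on the few extreme events in the sample; with only eight events, single hits or misses shift H noticeably.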

Mann-Kendall trend significance test
The Mann-Kendall test is a non-parametric test for trend significance that does not require any particular distribution of the tested timeseries 8. The null hypothesis of the test is the absence of a consistently increasing or decreasing trend in a timeseries x = (x_1, x_2, ..., x_n). The test analyzes the signs of the differences between earlier and later data points. The Mann-Kendall statistic S is defined as

$$S = \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} \operatorname{sgn}(x_j - x_k)$$

where n is the length of the timeseries and sgn denotes the sign function, which takes the values +1, 0 and −1. For increasing or decreasing trends, the value of the Mann-Kendall statistic S should be highly positive or negative, respectively.

To statistically test the trend, the probability associated with the Mann-Kendall statistic S is required. A normal distribution of S can be assumed for datasets with more than 10 sample points and few equal values, i.e., ties. The normalized test statistic, the Z-value associated with S, is calculated as

$$Z = \begin{cases} \dfrac{S-1}{\sqrt{\mathrm{VAR}(S)}} & \text{if } S > 0 \\ 0 & \text{if } S = 0 \\ \dfrac{S+1}{\sqrt{\mathrm{VAR}(S)}} & \text{if } S < 0 \end{cases}$$

where the variance of the Mann-Kendall statistic VAR(S) for a non-tied dataset is defined as

$$\mathrm{VAR}(S) = \frac{n(n-1)(2n+5)}{18}$$

Finally, the probability (p-value) associated with the Z-value is calculated using the standard normal cumulative distribution function for a two-tailed test. Based on the chosen significance level α, typically 0.05, the null hypothesis is retained or rejected. If the p-value of the test is less than α, the test rejects the null hypothesis, i.e., the test signifies the presence of a trend in x. Otherwise, if the p-value is greater than α, the null hypothesis of trend absence is retained.
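The test can be sketched in Python as follows. This is a minimal illustration using the variance for a non-tied dataset only; the function name and the example series are hypothetical, and no tie correction is applied.

```python
from math import sqrt
from statistics import NormalDist


def sign(value):
    """Sign function with values +1, 0 and -1."""
    return (value > 0) - (value < 0)


def mann_kendall(x, alpha=0.05):
    """Return (S, Z, p, reject) for a two-tailed Mann-Kendall test.

    Uses the normal approximation with VAR(S) for a series without ties,
    which is reasonable for more than 10 points and few equal values.
    """
    n = len(x)
    s = sum(sign(x[j] - x[k]) for k in range(n - 1) for j in range(k + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p, p < alpha


# Hypothetical yearly counts of drought months with a steady increase.
rising = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
s, z, p, reject = mann_kendall(rising)
```

For a strictly increasing series every pair contributes +1, so S equals the number of pairs n(n − 1)/2 and the null hypothesis of no trend is rejected.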

Multivariate ENSO Index Version 2 (MEIv2)
The phases of the El Niño Southern Oscillation (ENSO) are described by the Multivariate ENSO Index Version 2 (MEIv2) 1, which is based on a principal-component analysis of standardized anomalies of sea level pressure, sea surface temperature, 10-m zonal and meridional wind, and outgoing longwave radiation. For consistency with the monthly analysis, the two-month MEIv2 product (i.e., data for December-January, January-February, ..., November-December) was linearly interpolated to monthly values. Data are available since 1979.
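One possible reading of this interpolation is sketched below in Python. The exact scheme is not specified in this text, so the sketch rests on an assumption: each two-month value is placed at the boundary between its two months, and the value for a calendar month is the linear interpolation at mid-month, i.e., the mean of the two overlapping two-month values. The function name and the input values are hypothetical.

```python
def bimonthly_to_monthly(bimonthly):
    """Interpolate overlapping two-month values to single months.

    Assumption: with input ordered as December-January, January-February,
    February-March, ..., the first output value refers to January, the
    second to February, and so on. The output has one value fewer than
    the input.
    """
    return [0.5 * (a + b) for a, b in zip(bimonthly, bimonthly[1:])]


# Hypothetical MEIv2 values for December-January, January-February,
# February-March and March-April.
mei_bimonthly = [0.5, 1.0, 2.0, 1.0]
mei_monthly = bimonthly_to_monthly(mei_bimonthly)  # January, February, March
```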