Superstatistical distribution of daily precipitation extremes: A worldwide assessment

Maximum annual daily precipitation is a fundamental hydrologic variable that does not attain asymptotic conditions. Thus the classical extreme value theory (i.e., the Fisher-Tippett’s theorem) does not apply and the recurrent use of the Generalized Extreme Value distribution (GEV) to estimate precipitation quantiles for structural-design purposes could be inappropriate. In order to address this issue, we first determine the exact distribution of maximum annual daily precipitation starting from a Markov chain and in a closed analytical form under the hypothesis of stochastic independence. As a second step, we formulate a superstatistics conjecture of daily precipitation, meaning that we assume that the parameters of this exact distribution vary from a year to another according to probability distributions, which is supported by empirical evidence. We test this conjecture using the world GHCN database to perform a worldwide assessment of this superstatistical distribution of daily precipitation extremes. The performances of the superstatistical distribution and the GEV are tested against data using the Kolmogorov-Smirnov statistic. By considering the issue of model’s extrapolation, that is, the evaluation of the estimated model against data not used in calibration, we show that the superstatistical distribution provides more robust estimations than the GEV, which tends to underestimate (7–13%) the quantile associated to the largest cumulative frequency. The superstatistical distribution, on the other hand, tends to overestimate (10–14%) this quantile, which is a safer option for hydraulic design. The parameters of the proposed superstatistical distribution are made available for all 20,561 worldwide sites considered in this work.


Results and Discussion
We used stations belonging to the Global Historical Climatology Network (GHCN) daily (see also Materials and Methods Section). The selected stations have at least 25 years of quality-controlled, complete daily data and passed a preliminary screening to detect the presence of changing points, monotonic trends, and autocorrelation in annual maxima (all undesired features, see Data and preliminary tests Section).
Using this reduced, but still large database, we also checked if the original time series of daily precipitation were autocorrelated. Autocorrelation could adversely influence our daily precipitation model based on a Markov chain and thus the resulting distribution of annual maxima. We thus tested if the binary 0-1 time series, where "0" means a day with no precipitation, while "1" a day with precipitation, were serially dependent by checking if the sample lag-one autocorrelation was statistically different from zero 34 for each site (also referred to as station), each year, and a selection of thresholds to define nonzero precipitation (x T between 0 and 16 mm). The motivation for considering different threshold values and details about the role of this parameter in our framework are provided in the Materials and Methods Section, to which the reader is referred.
Over all the database, the median value of the percentage of years (calculated for each site) during which data are not serially dependent is 5% for x T = 0, but this statistic increases to 52% for x T = 16 mm. Thus, the series of daily precipitation present autocorrelation, which becomes weaker when increasing the threshold value. This means that a first-order Markov chain seems more appropriate to represent the observed time series than the simpler case of zero-order Markov chain. However, for simplicity, in the next we will make use of results (Eq. 5) strictly valid in the case of zero-order Markov chain, still obtaining satisfactory performances in modeling maximum annual daily precipitation.
SCIentIfIC REpoRTS | (2018) 8:14204 | DOI: 10.1038/s41598-018-31838-z Starting from the time series of daily precipitation, we estimated the parameters of the Weibull distribution for nonzero daily precipitation at each station. We checked the agreement between the cumulative distribution function (CDF) of Weibull and the cumulative frequency (also referred as empirical cumulative distribution function) using the Kolmogorov-Smirnov (KS) test with a 1% significance level. Parameters were calculated for every year with at least 25 days of precipitation using L-moments 35 , assuming, as before, different thresholds x T . We treated the presence of repeated values of nonzero precipitation (viz ties) through their randomization 36 . This operation is necessary because the presence of ties can lead to a misidentification of the probability distribution. Randomization adds to all the repeated values a set of suitable random perturbations in the range of the instrumental resolution adopted during data sampling. In order to avoid unverifiable assumptions, the noise was chosen to be Uniform (i.e., a least-informative approach).
For each site, we calculated the percentage of the years during which the Weibull distribution passed the KS test. Table 1 reports the 1st, 2nd, and 3rd quartiles of the percentage of the years during which the Weibull was accepted as distribution, for different thresholds. The number of stations considered is also reported, again as a function of the threshold. Results show that the number of stations decreases from 20,561 (x T = 0) to 6,421 (x T = 16 mm) with an increasing threshold, whereas the percentage of acceptance increases rapidly to 100% with an increasing threshold. Thus, Table 1 supports the use of Weibull as distribution of daily precipitation, worldwide, as speculated by Wilson and Toumi 37 .
The main idea behind the superstatistics conjecture for daily precipitation is that the yearly variability of the parameters of daily precipitation can be described by probability distributions. Eq. (5) in Materials and Methods Section summarizes the resulting probability distribution for maximum annual daily precipitation, which depends on λ and β (Weibull parameters of the probability distribution of nonzero daily precipitation) and p 0 , an "intermittent parameter" describing the probability of precipitation during any given day. This superstatistics conjecture was tested for each threshold, and each station, by using a Normal distribution for each of the three parameters (λ and β, p 0 ). We estimated the parameters of the Normal distributions (i.e., mean and standard deviation) with the method of moments over the samples of annual estimated values of λ, β, and p 0 . Figure 1 shows an example of annual variability of p 0 , λ and β, for Cagliari, Italy (site IT000016560), with a threshold x T = 3.7 mm. More specifically, panels (a), (c), and (e) show the temporal variability of parameters, while panels (b), (d), and (f) compare the cumulative frequency of these parameters with the CDF of Normal distribution.
Considering the threshold x T = 0, over the whole worldwide dataset, the median value of the mean of p 0 is 0.746 (1st quartile 0.651, 3rd quartile 0.830), whereas it is 7.35 mm for λ (1st quartile 4.6, 3rd quartile 10.57), and 0.766 for β (1st quartile 0.71, 3rd quartile 0.823). The estimate of β looks different from the constant value of 2/3 speculated in 37 . For x T = 0, the median value of the standard deviation of p 0 is 0.042 (1st quartile 0.034, 3rd quartile 0.052), whereas it is 1.81 mm for λ (1st quartile 1.05, 3rd quartile 2.84), and 0.122 for β (1st quartile 0.095, 3rd quartile 0.16). Figure S2 gives the variability of mean and standard deviation of p 0 , λ and β with latitude (x T = 0). The mean of p 0 , and both the mean and standard deviation of λ exhibit some patterns with latitude, while the other statistics are substantially constant.
We checked the agreement between the Normal CDF and the cumulative frequency of sample estimates of the three parameters p 0 , λ, and β using the KS test with a 1% significance level. The Normal distribution is accepted as distribution of p 0 in 20,554 out of 20,561 stations (99.97%), whereas it is accepted in 20,466 stations (99.54%) for λ and β (all results with x T = 0). These percentages of acceptance for the Normal distribution increase when increasing the value of the threshold x T . Similar percentages of acceptance can be obtained using the Gamma distribution, but we preferred the use of Normal distribution because it is more robust in the generation of synthetic samples. Overall, these results support the superstatistics conjecture for daily precipitation parameters and the use of a superstatistical distribution (given in Eq. (5)) for maximum annual daily precipitation.
After validating the superstatistics conjecture, we finally focused on extremes. For each station, we extracted the annual maxima and estimated the GEV parameters with L-moments method [ 38 , chap. 18] by following the standard, operational procedure to calibrate a probability distribution from data. The median value for the shape parameter κ (over all the database) is −0.082 (1st quartile −0.166, and 3rd quartile 0.006), whereas it is 15.5 mm for the scale parameter α (1st quartile 10.9, and 3rd quartile 22.0), and 49.9 mm for the position parameter ε (1st quartile 32.8, and 3rd quartile 68.7). These values are in agreement with the estimates given in Papalexiou and Koutsoyiannis 39 . As an example, Fig. 1, panel (g), gives the temporal variability of annual maxima for Cagliari, while panel (h) compares the cumulative frequency of its annual maxima against the CDF of GEV. We checked the agreement between the GEV and the cumulative frequency of annual maxima using the KS test with a 1% significance level. We found that the GEV was accepted as distribution of annual maxima in the 100% of cases.
For each station, and each threshold, we also calculated the superstatistical distribution (given in Eq. (5)). For a fixed threshold, we performed the calibration if at least five years of data were available, each with at least 25 above-threshold observations. This condition reduced the amount of stations on which we calculated the parameters to 20,561. We selected the best threshold using the smallest value of the Kolmogorov-Smirnov statistic, i.e., the smallest value of the vertical distance between the empirical and theoretical (superstatistical) CDF (see Materials and Methods Section). As an example, Fig. 2 compares the CDF of superstatistical distribution with the one of GEV and the cumulative frequency (the site is again IT000016560 for consistency with Fig. 1). We report results using both the best threshold (x T = 3.7 mm) and the range of thresholds 0-6.5 mm used in this case. Figure 2 shows how the choice of the threshold affects the central body and the left tail of the distribution of maxima rather than its right tail. The median value of the best threshold worldwide is ~5 mm (1st quartile ~1 mm, and 3rd quartile ~10 mm). For small values of x T , the autocorrelation as well as the number of data used for calibrating the parameters of Weibull is high, conversely for high values of x T , both the autocorrelation as well as the number of data decrease. The calibration of x T can be viewed as a trade off between neglecting the temporal dependence of daily precipitation and maximizing the agreement with annual maxima. We checked the goodness-of-fit of the selected superstatistical distribution and found that in 20,518 out of 20,561 (99.8%) the superstatistical distribution passed the KS test at 1% level of significance. Figure 3a gives the violin plot (i.e. a mirrored sample density plot) of the Kolmogorov-Smirnov statistic obtained for both the GEV and the selected superstatistical distribution over the entire sample of 20,561 stations. The median value of the KS statistic for the GEV is 0.067, while the one for the selected superstatistical distribution is 0.079, with a wider variability range. While the parameters of the GEV are directly calibrated on annual maxima, those of the superstatistical distribution are not, except for the threshold x T , so it is expectable that the value of the KS statistic for the GEV will be smaller than the one for the superstatistical distribution. Figure S3 shows the threshold selected by minimizing the Anderson-Darling statistic against the threshold obtained by minimizing the KS statistic, for the 20,561 stations. In 51% of the cases, the selected threshold with the Anderson-Darling (AD) statistic is within the interval (x T ± 0.1x T ) of ±10% of the threshold selected with the KS statistic. This percentage increases to 64%, 71%, 77%, or ~90% if the interval is ±30%, ±50%, ±80%, or ±100% of the threshold, supporting the selection made using the KS statistic.   5) is in its more robust predictive power, compared to the one of the GEV, when predicting unobserved values. In order to illustrate this point, we considered a subsample (357 sites) of the database composed by the longest time series (>100 years). We used the first 25 yrs of each sample to estimate both the parameters of the GEV and those of the superstatistical distribution; then we compared these distributions with annual maxima of the remaining part of the sample. Figure 4 gives an example, using Milan data (ITE00100554). Panel (a) reports the variability (dots) of annual maxima. Panel (b) shows the comparison between the cumulative frequency (dots), the CDF of GEV (blue line), and the one of the selected (x T = 13.3 mm) superstatistical distribution (red line) when all the data are used in calibration (1858-2008). Panel (c) reports the comparison between data and the same distributions when using the first 25 years (1858-1882) of data in calibration mode. Panel (d) compares the performance of the GEV and the superstatistical distribution (calibrated using the first 25 years) in extrapolation mode, that is, over the period 1883-2008. This corresponds to the well-known split-sample validation protocol used for hydrologic models. In panel (b), both models describe well the cumulative frequency; in panel (c), the GEV performs better than the superstatistical model; in panel (d), the superstatistical model performs better than the GEV over the unobserved part of the sample, i.e., the one not used in the calibration. Figure 3b gives the violin plot of the Kolmogorov-Smirnov statistic obtained for the GEV and the superstatistical distribution calculated for the 357 longest stations when using only the first 25 years of data in calibration mode (note that the KS statistics of this violin plot refer to the validation phase). The median value of the KS statistic for the GEV is now 0.139 (1st quartile 0.099, 3rd quartile 0.192), while the one for the selected Figure 2. Comparison between the empirical (dots) and theoretical (line) CDF for the GEV (blue) and the superstatistical distribution (red, x T = 3.7 mm). The station is the same as that in Fig. 1. We also reported the variability (area in light red) of the superstatistical distribution when varying the threshold x T in the range 0-6.5 mm. For values of maximum annual daily precipitation smaller than ~80 mm, x T = 0 and x T = 6.5 mm represent the left and right boundaries of this range. For values of maximum annual daily precipitation greater than ~80 mm, x T = 0 and x T = 6.5 mm represent the right and left boundaries of this range. Overall, these results clearly show that the performance of the GEV tends to decrease when passing from calibration to validation, whereas the superstatistical model shows a remarkable robustness. In addition, the performances in validation of the GEV get worse when decreasing the number of years used in calibration, while those of the superstatistical model are constant. Figure S4 also clearly shows the better performances of the superstatistical model in validation, especially for small data samples (say < 25 yrs), while when increasing the samples (>25 yrs) the performances of the GEV and the superstatistical model in validation are equivalent. This is due to the fact that the superstatistical distribution is calibrated using all the available information, whereas the GEV only exploits annual maxima. The superstatistical distribution is more resilient to the sample variability in case of small sizes.
Using the subsample of 357 sites having more than 100 years, we also investigated the performances of the GEV and the proposed superstatistical distribution in reproducing extremes and in particular the quantile associated to the highest cumulative frequency. This analysis used a varying amount of years in calibration for both distributions (first m years of the samples). Performances are quantified in terms of the median difference (in %) between the theoretical and the empirical quantiles (Table 2). We also reported the percentage of cases when the difference between theoretical and empirical quantiles is negative (viz, the distribution is underestimating quantiles, see the number in parentheses). Results show that, independently from the amount of data used in calibration, the GEV tends to underestimates the quantile associated to the highest cumulative frequency by 7% to 13% in 60% to 70% of the cases. On the other hand, the superstatistical distribution tends to overestimate the quantile associated to the highest cumulative frequency by about 10% to 14% in 65% to 67% of the cases. Thus, the superstatistical distribution is more precautionary than the GEV when estimating the quantile associated to the highest cumulative frequency.
The parameters of the proposed superstatistical distribution for all the 20,561 sites considered in this work can be found at the following link: http://ecohys.blogspot.com/p/data.html. For each station, this dataset includes annual observations of parameters p0, λ, β as well as the optimal threshold xT. These data represent the necessary information to readily apply Eq. (5) at any of the sites considered in this work. While no regular update or revision of this database is scheduled for the future, authors are open to feedback, suggestions, and comments. Any feedback will be incorporated in the database as soon as possible by clearly marking new releases with a progressive number.

Materials and Methods
Data and preliminary tests. In this work, we considered the world database Daily Global Historical Climatology Network (version 3.2), available at ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/. The database includes more than 75,000 stations with daily precipitation data during the period 1797-2015. This dataset has been already used by previous works on extreme precipitation and therefore represents a good benchmark for judging improvements to existing theory. We selected 21,510 stations with at least 25 years of quality-controlled, complete data (viz, without missing data and/or quality flags). Before performing any further statistical analysis, each dataset of annual maxima was further pre-screened to check possible non-stationarities such as changing points or monotonic trends, detected using the Pettitt test 40 , and the Mann-Kendall test 41,42 , respectively. We also preliminarily tested the independence assumption of annual maxima by checking if the lag-one autocorrelation was significantly different from zero 43 . The presence of autocorrelation could induce the detection of a spurious monotonic trend. A 1% significance level was used in order to limit the type-I error. We removed from our analysis any time series that did not pass at least one of these three tests.
The autocorrelation of annual maxima was significantly (1% significance level) greater than zero in 1% (210 time series) of the data, whereas 1.8% of them (389 time series) presented a changing point. The median of this changing-point year across all these 389 sites was 1957, which is in agreement with findings in Southern-East Europe 43 and about 10 years earlier than the changing point found in Austria 44 (late 1960s -early 1970s). 2.7% of data (589 time series) showed a monotonic trend, wheres 0.3% (58) presented both autocorrelation and changing point, 0.2% (51) presented autocorrelation and a monotonic trend, and 1.2% (264) presented a changing point and a monotonic trend. Overall, 4.0% (859) of time series were removed from our analysis, as a result of these preliminary screenings. The number of considered stations was thus reduced to 20,651. Their location is given in Fig. S1 in the Supplementary Material. The stations have a number of years of complete data (i.e., without no data) varying between 25 and 196 yrs, with a median of 46 yrs, a 1st quartile of 34 yrs, and a 3rd quartile of 60 yrs. Rejected stations are evenly distributed around the world, with no evident spatial pattern.

Distribution of maximum annual daily precipitation starting from a Markov chain.
Due to the lack of asymptotic conditions for the maximum annual of daily precipitation, the results of Fisher-Tippett's theorem 13 are not valid, even if they are assumed as reference. Pre-asymptotic results 25,26 , also known as penultimate approximations, have been recently considered in the analysis of daily precipitation extremes 24 , as well as an approximate distribution 27 . Here, differently, we provided some exact results.
We started from the abundant literature 34 about the representation of daily precipitation occurrence through a Markov chain. We determined the distribution of the daily precipitation extremes as the law of the annual maximum of variables over a Markov chain, using some general results given in statistical literature 45 . The daily precipitation has been described by a bivariate sequence of random variables, r.v. 's, {(J n , X n ), n ≥ 0}. The marginal sequence {J n } is a two-state {0, 1} first-order Markov chain with P[J n = j|J n−1 = i] = p ij and i, j = {0, 1}. J n = 1 means that precipitation occurs on day n, J n = 0 means that no precipitation occurs on day n. The r.v.'s {X n } are conditionally independent given {J n }, describing the amount of precipitation. P[X n ≤ x|J n = i] = F i (x), with i = {0, 1} is the cumulative distribution function of X n conditioned by the status J n . In particular, F 0 (x) is a degenerate function at zero (i.e., it has all its probability at zero: Let N T = 365 be the number of days in the year, and = … M X X max { , , } N 1 T , the maximum annual value of the r.v.s X n . The conditional probability can be written as  Table 2. Performances of the GEV and the proposed superstatistical distribution in reproducing the quantile associated to the highest cumulative frequency for the 357 sites having more than 100 years. m represents the number of years used in calibration for both distributions (first m years of the samples). Performances are quantified as median difference (in %) between the theoretical and the empirical quantiles. The number in parentheses is the percentage of cases when the difference between theoretical and empirical quantiles is negative (viz, the distribution is underestimating quantiles).
SCIentIfIC is too complicated to be used for practical applications. In the particular case, when J n is a two-state {0, 1} zero-order Markov chain (i.e., stochastic independence), analytical results are determined. p 11 = p 01 = 1−p 0 and p 00 = p 10 = p 0 . p 0 is denominated the "intermittent" parameter, and represents the probability of zero precipitation in a day. The matrix where N T is fixed and known. Eq. (3) is a mixture distribution with a mass in zero, which accounts for the intermittent behavior of precipitation through the parameter p 0 . In the standard literature 13 , the distribution of the maximum annual daily precipitation is calculated as [F 1 (x)] N , where N is a random variable, representing the number of days in the year with nonzero precipitation. Since N is variable, asymptotic (also known as ultimate), penultimate, or other approximations are necessary. Conversely, Eqs (2-3) are exact results, which generalize the existing literature. As distribution of daily precipitation in days with J = 1, F 1 (x), we considered the Weibull (or stretched Exponential) distribution, following the motivations given by Wilson and Toumi 37 ; this distribution is also adopted in 24,27 . In particular, we used a shifted Weibull, having the following expression, where λ > 0 is the scale parameter, β > 0 the shape parameter, and x T > 0 is the shift or threshold parameter. 1 This threshold parameter aimed at distinguishing precipitation events from spurious, or low nonzero precipitation events. Accordingly, a given day was considered "wet" if the precipitation was greater than x T . Because the value of this threshold could be both site-and instrument-specific, we performed all computations using values in the range [0, 16] mm and then selected the best threshold for each site by minimizing the Kolmogorov-Smirnov (KS) statistic 13 between the cumulative distribution function of maximum annual daily precipitation (see Eq. (5) below) and the cumulative frequency (calculated using the Weibull plotting position 13 ) of annual maxima. This approach allows each site to choose a different optimal threshold based on data fitting. The range for x T is broader than the interval [0, 10] mm considered in the literature 37 , however it is in line with the range of precipitation threshold considered to generate runoff (see Table 1 in 47 ). In any case, our results show that the optimal value of this threshold is smaller than 10 mm in 80% of the sites, which represents a good trade-off between rejecting noise and preserving precipitation events that are usually significant for hydrologic processes.
Eq. (3) can be written as  be (on average) only one day per year with nonzero precipitation and the distribution of maximum annual daily precipitation will be close to the parent distribution F 1 . This suggests that in (very) dry climates the distribution of maximum annual daily precipitation could be very similar to the parent distribution, which supports the use of non-extreme type distributions as found in some references of Table S2. If p 0 = 0, all days will be characterized by nonzero precipitation and the distribution of maximum annual daily precipitation will be represented by the most distant condition from the parent distribution F 1 (x) (with regard to p 0 variability).
The parameters, p 0 , λ, and β, are estimated year-by-year. p 0 can be estimated as the ratio n 0 /N T where n 0 is the number of dry days in the year, while λ and β can be estimated through the L-moments method 35 , which is more robust than the method based on ordinary moments when dealing with outliers in data or with extremes. The agreement between the Weibull distribution and the nonzero daily precipitation data has been checked year-by-year using the KS test 13 with a 1% significance level.
The superstatistical distribution of maximum annual daily precipitation. A complication of expressing daily precipitation extremes using Eq. (4) is that its parameters can vary form year to year due to weather and climate 24,27 . To fully include this variability in the estimation of quantiles, we leveraged the superstatistics conjecture for daily precipitation, i.e., we assumed that the parameters of its distribution are described by probability distributions. This conjecture, even if considered in the literature for daily precipitation 48,49 , has not been tested extensively yet.
Using the Kolmogorov-Smirnov test, we checked if the fluctuations of the yearly values of p 0 , λ, β parameters could be represented by Normal distribution, which was selected among other distributions like χ 2 , or Gamma after a preliminary check. While a right and left truncated distribution could be more appropriate for the parameter p 0 ∈[0, 1], and a left-truncated distribution for λ and β, we considered Normal distributions, both for simplicity and as a first approximation. The parameters of these three Normal distributions are estimated using the method of moments.
In hypothesis of superstatistics, the resulting distribution of M must be calculated as E[F M (x|p 0 , λ, β)], where the expectation is with respect to the (joint) distribution of the three parameters. Given m years of data, empirically, the distribution of the variable M can be calculated as the arithmetic mean of the m distributions in Eq. (4): (5) is the superstatistical distribution. Figure 5b shows the variability of Eq. (5) (red lines), compared to F 1 (x) (black line), and Eq. (4) (blue line). We assumed Normal distributions for the three parameters: the average values are the same as those used in panel 4(a), whereas the coefficients of variation are assumed equal for all the parameters in the range between 0.1 and 0.9. We have checked the goodness-of-fit between Eq. (5) and data using the KS test with a 1% significance level.
The GEV distribution. The cumulative distribution of the GEV is where ε ∈ R, α > 0 and κ ∈ R are the position, scale and shape parameters, respectively. If κ = 0, then the GEV coincides with the Gumbel distribution, if κ > 0 it is the reversed Weibull distribution, and if κ < 0 it is the Fréchet distribution. The GEV parameters are estimated using the L-moments method [ 38 , chap. 18]. We have checked the goodness-of-fit between Eq. (6) and data using the KS test with a 1% significance level.