Multivariate sea storm hindcasting and design: the isotropic buoy-ungauged generator procedure

The present work provides indications for assessment of wave climate and design of structures at sea at ungauged sites, both critical issues in Ocean sciences. The paper is of methodological nature and of global worldwide applicability. It shows how suitable wave hindcasting relations can be exploited in order to provide sea storm scenarios at an ungauged (Target) location useful for design purposes: in particular, only geographical information and the knowledge of another gauged (Source) buoy are used. Several are the novelties introduced. (i) New hindcasting relations are derived. (ii) A full statistical model is set up for the Target area, whereas traditional hindcasting simply transfers time series from a gauged to an ungauged site: this gives the possibility to appropriately deal with design and hazard assessment at the Target location. (iii) The multivariate behavior of non-independent random variables is properly modelled by using the Theory of Copulas. As an illustration, a number of case studies is investigated, involving four pairs of buoys which, given their positions and exposures, are representative of a wide variety of sea states and conditions, as well as of different wave generation mechanisms.


Scientific Reports
| (2020) 10:20517 | https://doi.org/10.1038/s41598-020-77329-y www.nature.com/scientificreports/ always available at any site). The procedure is of global worldwide applicability, and works under minimal (and "natural") assumptions, such as that the climatic event generating the sea storms has the same (comparable) features at both the Source and the Target sites. Note that also the predictions of global wave models implicitly thrust a "homogenous" sea state within each cell of the simulation grid used.
Here, a pragmatic approach is adopted. The idea is to check whether the iBUG procedure is able to provide valuable Target estimates of the design values of some relevant variables associated with given Return Periods (hereinafter, RP). The rationale is that such values are those used in practice for the design of structures at sea and the assessment of the corresponding hazards. In particular, in the present work, the Significant Wave Height H (in m) and the Sea Storm Duration D (in h) are used as the variables characterizing a sea storm-see "Materials".
For this purpose, several buoys, subject to different sea state conditions and exposures, are considered on a pair-wise basis, and the following calculations are carried out.
1. Firstly, one buoy (say, A) is regarded as the Source of information (i.e., it is gauged), whereas another buoy (say, B, the Target one) is assumed to be ungauged (i.e., no data collected at the corresponding site are used to compute the iBUG estimates). Then, the iBUG procedure is utilized to set up a statistical model at B only using (i) the information available at A (i.e., the local observations of the variables H and D), and (ii) the ratios of the Effective Fetches at A and B. Finally, iBUG sea storm scenarios are generated at B via Monte Carlo techniques, and suitable H and D design values (associated with given RP's) are compared with those actually observed at B. 2. The role of the buoys A and B is swapped: B is taken as the Source buoy, and A becomes the Target buoy.
Then, the same calculations as above are carried out.
Most importantly, note that only the Source data, and the information about the Source and Target Fetches, are used to compute the iBUG Target estimates, since the Target buoy is assumed to be ungauged. This corresponds to a situation of "complete ignorance" of the sea state at the Target site, typical of many practical cases. The "double-blind" estimating approach used here (i.e., A vs. B, and B vs. A) gives the possibility to carry out an accurate roundtrip survey of the performance of the iBUG technique.

Results and discussion
In this section, the outcomes of the analyses are illustrated and commented. Hereinafter, the following acronyms are used: MC for Monte Carlo, GoF for Goodness-of-Fit, and CI for Confidence Interval.
In general, waves generated in deep water can be fetch-limited, duration-limited, or constitute a fully developed sea 10,11 . In order to thoroughly illustrate the applicability of the iBUG procedure, in the following four pairs of buoys A and B are considered: given their positions and exposures, these are representative of a wide variety of sea states and conditions, as well as of the wave generation mechanisms mentioned above. The features of the data sets are briefly introduced in "Materials", and fully presented in the accompanying Supplementary Information (SI) document. The pairs investigated are the following: the corresponding maps are plotted in Fig. 1 Interpretation of the results. As already mentioned above, the case studies considered in this work are representative of a wide variety of sea states and conditions, as well as of wave generation mechanisms. The Traditional Hindcasting approach is nothing more than a deterministic technique: given a set of N data at a gauged Source buoy S, only a set of N estimates can be generated at a Target ungauged site T. Moreover, the Traditional Hindcasting may only yield design estimates based on a single sample (i.e., the data available at the Source buoy): in practice, only a single Target sea state scenario can be generated.
On the contrary, the iBUG procedure transfers a full statistical model at the target site T, i.e. it constructs a suitable Probability Space for the ungauged buoy, which also accounts for the dependence structure (i.e., the multivariate Copula) ruling the joint dynamics of the variables at play: note that this would not be possible via Traditional Hindcasting, being a deterministic univariate technique without any link to Statistics. www.nature.com/scientificreports/ Most importantly, the statistical model used by the iBUG procedure may generate a number of possible (independent) Target scenarios via MC techniques: this entails that the Probability Space associated with the Target sea storms can be more appropriately and accurately explored, especially considering rare extreme events associated with large design Return Periods (difficult to observe via a single sample).
It is also interesting to note that, in general, the Monte Carlo iBUG CI's are larger than the corresponding Target and Traditional Hindcasting ones, constructed via Bootstrap techniques. The reason is that the latter are based on a single sample of limited size N, whereas the iBUG CI's represent the information extracted from many (independent) samples of size N-here, 100,000 MC simulations. Clearly, this gives the possibility to better account for the uncertainty and randomness of the Target sea states.
Overall, the results shown in "The Gibraltar case"-"The Hawaii case" indicate that the iBUG procedure is able to construct sea storm scenarios always compatible (comparable) with the sea state observed at the Target  www.nature.com/scientificreports/ site (it is successful in 100% of the cases), whereas (as expected) the Traditional Hindcasting often fails: actually, in 9 cases out of 16, more than 50% of the tests, and especially considering Return Periods larger than 20 years (i.e., the ones of interest in practical applications).

Materials
The available sea storm Time Series provide information about the significant wave height and the wave direction. In order to prevent problems with possible serial dependencies and correlations, in the present study an event-based approach is used (adopting a maritime version of the Run Method outlined in 12   Practically, a sea storm starts when H crosses upwards a given threshold H 0 , and ends when H persists below the same level for at least I 0 hours: for a graphical illustration see Fig. 6, showing a standard Triangular Wave Model 13 used to identify a sea storm sequence. Here, the threshold H 0 corresponds to a suitable (large) percentile p H of the empirical distribution of wave heights, in order to focus the analyses on extreme events. The choice of the percentile represents a trade-off between (i) the purpose of investigating strong sea storms (and hence p H should be as large as possible) and (ii) the resulting size N of the extracted sample of sea storms (which should also be as large as possible for carrying out valuable statistical surveys): clearly, increasing p H reduces N, and vice-versa. In the present work, the following values of p H have been used: 95% for the Gibraltar case, 90% for the North Sea case, 95% for the Bay of Biscay case, and 75% for the Hawaii case.  www.nature.com/scientificreports/ Also, a minimum inter-storm temporal distance I 0 is used to temporally separate successive sea storms: this guarantees the physical independence of the events of interest. Actually, according to the tests provided by 14,15 and implemented in the R package npcp, the distributions of the variables H and D extracted show no temporal Change-Points (distributional stationarity). Here, I 0 = 48 h for the Hawaii (accounting for the specific meteorology of the site), and I 0 = 24 h in all the other cases.
Once the thresholds H 0 and I 0 are chosen, the duration D of a sea storm is immediately computed, while the significant wave Height is taken as the maximum H during the whole sea storm, in order to investigate its extreme dynamics-and, correspondingly, θ is taken as the direction associated with the maximum Height recorded.
As anticipated in "Results and discussion", in the present study four pairs of buoys are investigated.

Methods
In this section, the mathematical framework adopted is thoroughly outlined. The present work is of methodological nature and of global worldwide applicability.
Multivariate hindcasting. For the sake of illustration, here the (multivariate) analyses will involve the variables H and D. However, the same procedures can straightforwardly be used by considering other ones. The general idea (and novelty) is to make use of the observations available at a gauged Source buoy S, and then exploit the hindcasting relations provided by the JONSWAP experiment 7 , in order to construct a multivariate statistical model (accounting for the dependence between H and D) at an ungauged Target site T. Most importantly, such a model can be used to simulate suitable time series and scenarios of sea storms at T (see "Multivariate hindcasting"): clearly, the latters can be of any desired size, even much larger than the size of the sample observed at S-e.g. appropriate for specific design and hazard assessment purposes at T (see below). The basic assumption of hindcasting procedures is that the climatic event generating the sea storms has the same (viz., "comparable") features at both the Source and the Target site. In turn, the wave climate at the Source buoy S is supposed to be the same as-or comparable to-the (unobserved) one at the Target buoy T. In practice, such a condition/constraint can only be approximately satisfied: thus, the hindcasting outcomes must be taken with care. Here, the inspiring engineering principle is that a crude prediction at the ungauged site is always better than no information at all.
Below, we use Copulas as a basic tool for dealing with a multivariate framework. Indeed, a convenient way to model the joint behavior of a set X 1 , . . . , X d of random variables is to utilize Copulas. A copula is a multidimensional function describing, and mathematically formalizing, the dependence structure ruling the joint random dynamics of the X i 's. Note that, in Literature, it has been increasingly recognized that the use of a multivariate framework is practically mandatory in Maritime Engineering, given the fact that often (if not always) the variables at play are non-independent-see [17][18][19] , as well as the typical case studies presented in "Materials".
Indeed, recent advances in Mathematics show how copulas may represent an efficient tool to investigate the statistical behavior of dependent variables. For a theoretical introduction to copulas, see [20][21][22] ; for a practical approach, see 16,23,24 . In particular, elementary guidelines for using copulas in maritime/hydrological applications are outlined in 18,19,[25][26][27][28] . Freeware R routines are provided and illustrated in 29,30 .
According to Sklar's Theorem representation 31 , the joint distribution of the X i 's, with marginal distributions F i 's, can be written as, for all (x 1 , . . . , where C 1,...,d is the copula associated with the random vector (X 1 , . . . , X d ) . Practically, a multivariate model F 1,...,d can be readily constructed by (i) fitting suitable univariate laws F i 's on the marginals X i 's, and (ii) fitting an appropriate copula C 1,...,d on the observed (X 1 , . . . , X d )'s. Target site, the JONSWAP hindcasting formulas 11 make it possible to determine the wave climate at a site where no observations are available. Clearly, the outcomes of this methodology (hereinafter, Transfer Method (TM)), as well as the ones of traditional hindcasting methods, must be taken with care. Let θ be a given direction: for instance, θ = 0 • N indicates the North. The TM consists of the following directional relations, where the subscripts "S" and "T" label, respectively, the Source and Target variables of interest: where t is the duration of the climatic event generating a sea storm, and is the ratio of the Target F T (θ) and Source F S (θ) Fetches associated with the direction θ.
Incidentally, as a novelty, the use of a Triangular Wave Model (see "Materials" and Fig. 6) may provide an estimate of t. In fact, it may be reasonable to assume that t equals half the duration of the triangular sea storm: viz., the storm "grows" until the wind feeds it (i.e., the temporal segment AB in Fig. 6 The iBUG procedure. As a novelty with respect to traditional hindcasting methods, here we do not simply transfer data bases from a Source to a Target site. Rather, by using Eqs. (2)-(7), we show how to convey a full statistical model from a (gauged) Source buoy S to an (ungauged) Target one T. In particular, here we focus the attention on the pair (H, D): however, there are no theoretical limitations to consider other sets of variables.
The advantages of transferring a full statistical model, instead of a single set of data, are obvious: in fact, once the (multivariate) statistics of the sea storms at the Target buoy T is set up, it is then possible to generate Target scenarios and samples of any suitable size (e.g., via MC simulations-see below), and thus provide useful information for design and hazard assessment at the ungauged site.
Before proceeding, the following important point needs to be stressed. In the present work, we adopt an isotropic invariance principle: viz., the Source marginals and copula do not depend upon the direction θ . Instead, the Target distributions will be ruled by the directional Fetch ratios ̺ θ 's-see, below, Eqs. (8), (9), and (11). Actually, there are no theoretical reasons for requiring such a directional distributional invariance, rather only empirical ones. In fact, without such an assumption, specific univariate laws and copulas should be fitted for each θ . Clearly, this makes little sense in practical applications (even introducing a coarse discretization of the directions θ's) since the directional sample sizes would almost surely be insufficient to provide a statistically significant number of observations, both for fitting the Source marginals and copula, and for carrying out univariate and multivariate Goodness-of-Fit tests. Practically, the iBUG procedure starts with an unique Source statistical model, and "drives" it to the Target buoy by exploiting the directional information provided by the Fetches' ratios. Now, denoting by G H S and G D S , respectively, the univariate marginal distributions of the Source sea storm variables H S and D S , it is immediate to calculate the corresponding laws G H T and G D T of the Target variables H T and D T : for all x, y ∈ R . Evidently, given a direction θ , the Target distributions are re-scaled versions of the Source ones via the Fetch ratio ̺ θ . Then, by using Sklar's Theorem, Eqs. (8)- (9), and the isotropic invariance, the bivariate probability distributions F S and F T of the pair (H, D) of, respectively, the sea storms at the Source and Target buoys, can be written as  www.nature.com/scientificreports/ for all x, y ∈ R , where C is the copula associated with the Source pairs (H S , D S )'s. Evidently, one of the advantages of transferring a full statistical model is that it may provide a complete set of information concerning the joint random behavior of the sea storms at the Target buoy.
Estimates of design values. Traditionally, in Maritime Engineering (as well as in Terrestrial Hydrology) design values are computed via a Return Period approach. First, a design RP is chosen (e.g., 20, 50, or 100 years, depending on the structure of interest), and then the associated design value(s) of the variable(s) of interest are estimated using the available data. In the following, the general notion of RP introduced in 25,27 is adopted. Given an event E, the associated RP T E is where µ is the mean inter-arrival time of the events E's in the time series considered. Note that µ provides the time unit (e.g., years) in which T E should be expressed. Considering events like E = {X > x * }-the ones of typical interest in applications-where x * is a suitable threshold for the variable X (e.g., X could be the Significant Wave Height H or the Duration D), it is easy to compute design values x * once a design RP T * is assigned. In fact, Eq. (12) would become T * = µ/P(X > x * ) , and inverting it yields where F X is the distribution function of X.
The iBUG procedure in practice. Below it is shown how to simulate Target sea storm scenarios via the iBUG technique.