Generation mechanism and prediction of an observed extreme rogue wave

Rogue waves are individual ocean surface waves with crest height \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta$$\end{document}η or trough-to-crest height H that are large compared to the significant wave height \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_s$$\end{document}Hs of the underlying sea state: \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H/H_s>2.2$$\end{document}H/Hs>2.2 or \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta /H_s>1.25$$\end{document}η/Hs>1.25. The physics of rogue wave generation and the potential of predicting the rogue wave risk are open questions. Only a few rogue waves in high sea states have been observed directly, but they can pose a danger to marine operations, onshore and offshore structures, and beachgoers. Here we report on a 17.6m high rogue wave in coastal waters with \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta /H_s=1.98$$\end{document}η/Hs=1.98 and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H/H_s=2.9$$\end{document}H/Hs=2.9 which are likely the largest normalized heights ever recorded. Simulations of random superposition of Stokes waves in intermediate water depth show good agreement with the observation. Non-linear wave modulational instability, a well known cause for rogue waves in laboratory settings, did not contribute significantly to the rogue wave generation. A parameter obtained from a routine spectral wave forecast provides a practical risk prediction for rogue waves. These results confirm that probabilistic prediction of oceanic rogue waves based on random superposition of steep waves are possible and should replace predictions based on modulational instability.

Rogue waves can be the consequence of third order nonlinear four-wave interactions between free waves forming a modulational instability 17 where individual waves can grow as sidebands, drawing energy from neighbouring waves. For long-crested waves this mechanisms is well documented in wave flumes [18][19][20] and it is the basis for the Benjamin-Feir Index (BFI) which is routinely being used as rogue wave risk predictor 21 . Under modulational instability the tails of the probability density function of wave crests and wave heights will be higher than predicted by second order theory. However, as waves become more directionally spread modulational instability becomes less effective, as has been shown in wave tank experiments 22 and numerical simulations 23,24 , and for observed real ocean rogue waves BFI has little predictive power 25 .
Here we report observations of an extreme rogue wave ( η/H s = 1.98 , H/H s = 2.91 , H s = 6.05 m) off Vancouver Island, BC, Canada. According to second order theory 13 such an event ( η/H s = 1.98 , wave steepness parameter R = 0.23 ) would occur once in 1300 years. The corresponding Benjamin-Feir Index BFI = 0.13 indicates that modulational instability did not play a role in generating such an extreme wave. However, Monte-Carlo simulation of random superposition of fourth order Stokes waves 8 drawn from the observed wave energy spectra yield wave and crest height distributions in good agreement with the 10 month buoy record, including the extreme rogue wave. Furthermore, we show that the crest-trough correlation r 25-27 calculated from a WAVE-WATCH III ®28 (WW3) wave model has strong predictive power for rogue wave risk prediction from a standard wave forecast model.

Rogue wave observation
A 0.9 m CoastScout wave buoy (MarineLabs) was deployed 13/08/2020 at Amphitrite Bank (48.9 • N, 125.6 • W) about 7 km offshore in 45 m water depth. The buoy records surface elevation at 5 Hz from a 3-D inertial measurement unit (IMU), and from differential GPS. Data are recorded in 20 min bursts every 30 min and are available in near-real time. Here we analyze its data record until 21/05/2021, covering a wide range of sea states 0.5 m < H s ≤ 10.1 m. The record includes an extreme rogue wave on 17/11/2020 ( Fig. 1a) with η = 11.96 m crest height (recorded by IMU and GPS sensor) when the significant wave height was H s = 6.05 m. The total wave height H was 17.6 m (19.5 m) based on the preceding (following) trough. The significant wave height had increased rapidly from a minimum H s = 1.93 m 18 h prior to the event. The rogue wave is the fourth crest in a group of about 10 waves. The crests of the preceeding and following waves were much smaller than the rogue wave crest, with η = 3.27 m and η = 4.13 m, respectively. This is consistent with the fact that rogue waves generally occur near the centre of a group and are unexpected, i.e. there is not a gradual build-up of individual wave heights 29,30 .
The spectrogram (Fig.1b) reveals bursts of energy associated with wave groups and a dominant frequency f p = 0.078 Hz (corresponding to a dominant period T p = 12.8 s ), and a second peak at f 2 = 0.156 Hz ( T 2 = 6.4 s ). The group containing the rogue wave has the highest energy and during the rogue wave the energy  www.nature.com/scientificreports/ is spread continuously over a wide frequency range from f = 0.06 Hz to f = 0.9 Hz . This implies that the rogue wave is the combination of energetic waves from a broad range of scales. As the group evolves the high frequency energy rapidly decreases and the low frequency peak exhibits a continuous upshift and individual wave heights decrease. This spectral shape is unique to the wave group containing the rogue wave.
Rogue wave occurrence rates. It is convenient to plot wave and crest height exceedance probabilities P as ln(− ln(P)) versus the normalized heights η/H s , H/H s on a logarithmic scale (Fig. 2). On such a plot the Rayleigh distribution (Eq. 1) and distributions allowing for second order corrections fall on straight lines and any deviations are readily visible 8 . Our observations follow the standard model [13][14][15] up to H/H s < 2.1 and η/H s < 1.1 but beyond that they show a dramatic increase of probabilities of high waves. As commonly observed, probabilities of wave heights H/H s exceeding a given value are lower than given by the Rayleigh distribution, due to the finite spectral width of ocean wave data 2 . Extrapolation of the observed distributions would give a probability of the November 17 rogue wave as 1 in 5 × 10 9 and the rogue crest as 1 in 10 14 , several orders of magnitude less frequent than observed. In particular, the extreme crest height η/H s = 1.98 would be a nearly impossible event (1 in 31 million years) if based on the extrapolation of moderate waves.
To test whether the buoy observations are consistent with the linear mechanism of wave superposition allowing for the Stokes correction, we perform a Monte-Carlo simulation 8,30,31 based on the observed wave spectra, and add bound harmonics up to fourth order 8 (see "Methods"). This way we obtain a synthetic surface elevation record with the same characteristics as the one recorded in the ocean, but which does not include nonlinear wave interactions and therefore no modulational instability. The wave and crest heights from the simulations are in good agreement with the observed wave height distributions. In particular the rapid increase of rogue wave probabilities over the Rayleigh distribution or second order corrections is well captured (Fig. 2). This supports random superposition of Stokes waves as the leading cause for rogue waves. www.nature.com/scientificreports/

Rogue wave risk prediction
The crest-trough correlation r 14,32 (see "Methods") is strongly correlated with the maximum normalized wave height in a record (Fig. 3). Observed and synthetic data show similar maximum wave heights for a given correlation value. The largest normalized wave heights are clearly associated with high correlation values (Fig. 3). A recent study found that in wave buoy observations the probability P(H/H s > 2) increased by about one order of magnitude for the highest r value compared to records with the lowest r 25 . We performed a similar analysis to evaluate how the rogue wave probability p varies with sea state parameters r and BFI (see "Methods"). This analysis utilizes surface elevation measurements from the MarineLabs buoy and an additional nearby buoy, 2 km offshore in 20 m water depth referred to as the Nearshore buoy, resulting in a combined 28 months of data. Here, p is either defined as the probability of a wave exceeding 2H s or 2.2H s . Based on this observational data set r shows substantial correlation with p while BFI is practically ineffective (Fig. 4). The predictive power P x of r for p = P(H/H s > 2) is 0.85 with a lower bound of 0.56, and p ranging from 9.1 × 10 −6 to 6.3 × 10 −5 . That is, the occurrence rate of a rogue wave exceeding 2H s varies from approximately 1 in 110,000 for the lowest r value to 1 in 16,000 during records with the highest r value. For p = P(H/H s > 2.2) , r has a higher predictive power of 0.99, however with a lower bound of 0.15, and p ranging from 7.2 × 10 −7 to 6.9 × 10 −6 . The occurrence rate of a rogue wave exceeding 2.2H s varies from 1 in 1,400,000 to 1 in 140,000. One of the issues that arises when analyzing extreme events is the limited amount of data, which has the unfortunate consequence of increasing the confidence bounds of the rogue wave probability exceeding 2.2H s compared to 2H s . BFI has a predictive power of only 0.32 with lower bound of 0.077 for p = P(H/H s > 2) and 0.81 for p = P(H/H s > 2.2) with a lower bound of -0.07. The confidence bounds on P x indicate that BFI is not a robust prediction parameter and therefore would perform poorly as a rogue wave risk indicator.
The benefit of a spectrally derived parameter such as r relating to increased rogue wave risk, is that it can be obtained from standard spectral wave forecast models (Fig. 5). However, despite the regional WW3 model predicting H s with high reliability with a correlation coefficient of 0.96, the models prediction of r at the near shore location of the MarineLabs and Nearshore buoy is less reliable with a correlation coefficient of 0.61 and a scatter index of 15%. Even so, the crest-trough correlation from the model r model demonstrates a strong positive correlation with rogue wave probability. Additionally, r model has similar high predictive power for rogue wave probability ( p = P(H/H s > 2.2) ) as r obtained from the observations r buoy .
Thus far the demonstration of the correlation between rogue wave probability and r has been restricted to coastal areas 25 and it has been noted that the relationships between p and sea state parameters will likely depend on location 33 . Long term surface elevation records, which are the basis of the analysis presented here, and in 25 are     34 .
Records are available for a wide range of offshore buoy locations (Fig. 5) and thus the rogue wave prediction can be calibrated for the full domain of the operational wave forecast (Fig. 6). The caveat with this analysis is it does not register multiple rogue wave events in a given record segment and the number of waves in a record is estimated from the mean period. The models forecast of r also improves at offshore locations. For example, at an open ocean buoy (C46004) the correlation coefficient increased to 0.78 and the scatter index decreased to 10%. To formalize the risk prediction forecast we took the average of the semi logarithmic fits from the Marinelabs and Nearshore data set and 9 DFO buoys to get the following equations relating rogue wave probability and r. www.nature.com/scientificreports/ We didn't find substantial differences in the slope of the fits when grouped by depth, region or distance from the coast. The spatial and temporal fields of r can be output from the wave model and using the above equations the probability of extreme wave events can be calculated. This analysis was only performed for extreme wave heights as wave crests by definition are not affected by crest-trough correlation.

Discussion
The spectrogram of the surface elevation (Fig. 1b) highlights the rogue wave as an event where a wide range of energetic spectral wave components align. These components are steep compared to the background wave field with enhanced higher order Stokes corrections, and their superposition generates extreme crest heights. The Stokes enhancement of less energetic, and therefore less steep waves is much weaker (see Eq. 3) and their superposition results in moderate crest heights, only. This explains the dramatic increase of rogue wave probabilities for H/H s 2.1 which is not captured by second-order Stokes corrections.
To develop a rogue wave risk prediction system we evaluated how the rogue wave probability p changes with certain sea state parameters. Based on the strong correlation with rogue wave probability at offshore and nearshore locations, we present crest-trough correlation r to be used for rogue wave risk prediction in wave forecasts. r is a logical parameter to be linked with the generation of rogue waves as it is an estimate of the auto-correlation between the crest heights and the trough depths. Therefore, in narrow-banded seas the crests and successive troughs are approximately the same size resulting in high r values. Narrow-banded seas also maintain wave groups and rogue waves are more likely to form in 'groupy' sea states. Additionally, r can be easily computed by a routine wave forecast as demonstrated in Fig. 5 and alongside a forecast for H s provides a complete risk assessment where overlapping areas of high H s and r pose the greatest risk.

Methods
Linear rogue wave simulation. Long time series of high-resolution wave observations under natural conditions are still rare but are required for assessing the occurrence rate of rogue waves. Many height distributions are evaluated with observed probabilities ≤ 10 −513, 35 and only a few studies reach P ≤ 10 −725, 27,29,36 . Alternatively, synthetic time series can provide good statistics on low-probability events and have the advantage to test non-linear (up to a desired order) or linear dynamics. Here we use the Matlab toolbox WAFO 37 to synthesize surface elevation from linear superposition of spectral components with sine and cosine terms that are independent and Gaussian 30 . For each of the 11,600 data segments from the buoy a thirty-minute synthetic surface elevation h lin with 5Hz resolution and the same spectral characteristics as the observations is generated. All individual waves, defined as data between consecutive zero-down crossings, are then modified with fourth order Stokes correction 12 with wave amplitude a, wave steepness ε = ak , and the wavenumber k is obtained iteratively from the dispersion relation ω 2 = gk tanh(kD) , with ω = 2π/T , zero-down crossing wave period T, gravitational acceleration g and water depth D. The first term on the r.h.s. of Eq.  38 . In addition, nonlinear four wave interaction using discrete interaction approximation, DIA (NL1) 39 , JONSWAP bottom friction (BT1), depth-limited breaking 40 (DB1), triad interactions (TR1) 41 , and a linear input term (LN1) have been used for the computations. To produce the low resolution r field in Fig. 5 the WW3 model was run and point spectra were output every . r was then calculated from Eq. (7).

Rogue wave predictors.
To assess the nonlinear mechanism of rogue wave generation by the modulation instability we use the Benjamin Feir Index (BFI) 17 . BFI is calculated as the ratio of wave steepness ε to spectral bandwidth ν 42 . www.nature.com/scientificreports/ where k p is the wavenumber associated with the peak period T p calculated from the weighted spectrum 43 , and m n = ∞ 0 f n S(f )df is the n th spectral moment. The crest-trough correlation r has been suggested as a parameter to relate to wave height distributions 14 . Crest-trough correlation r is the auto-correlation of the sea surface elevation at half the wave period. It can be estimated from the spectral density, S(f) using the Wiener-Khinchin theorem. Following 15 we compute r as, where τ =T 2 is the lag time at half the spectral mean period T = m 0 m 1 .
Estimation of rogue wave probabilities. We evaluate the univariate rogue wave probability p with uncertainties following the same method as  outlined briefly below 25 . We define p as the probability that the next wave height will exceed the threshold 2H s or 2.2H s . To evaluate how p varies with a wave parameter x we split x and H/H s into N bins with approximately equal number of observations in each bin. We assume that the number of rogue waves n + and the number of non-rogue waves n − in each bin are identically and independently distributed (iid) according to a binomial distribution with probability p. For the prior distribution of p we assume a Beta distribution with parameters α 0 = 1 and β 0 = 3000 for P(H/H s > 2) and β 0 = 16,000 for P(H/H s > 2.2) from the Rayleigh distribution. Applying Bayes theorem we find the posterior probability for p is another Beta distribution, since the Beta prior for p is conjugate to the binomial distribution of n + .
Our estimate of p with uncertainties is calculated from the median and the 95% credible interval of Eq. (8) (based on 2.5th and 97.5th percentiles). The benefit of this analysis is to be able to generate uncertainties for p to determine whether the variation with x is meaningful. Here are a few things to note. Firstly, the purpose of the Beta prior is only to constrain p to a reasonable order of magnitude so the exact choice of α 0 and β 0 does not influence final results. Secondly, this analysis hinges on the assumptions that p is iid distributed in each bin, which isn't the case if p depends on more than one parameter. Therefore, these uncertainties indicate the level of confidence in our best estimate of p if we only consider one parameter at a time.
For buoys where surface elevation is unavailable, the normalized wave is given by H max /H s where H max is the maximum wave in a record and the total number of waves N 0 is estimated using the mean period T and the record length RL where, N 0 = RL/T . To evaluate if this simplifying approach is acceptable we perform these simplifications on our Nearshore and MarineLabs data set to compare with results using the full sea surface elevation. Using H max /H s excludes only three rogue events in the data set. Those events correspond to segments where there were two waves with height exceeding 2H s . Overall, there is no appreciable difference using H max and T 01 to get N 0 , however there is the limitation that the limit of rogue waves flagged in a record is one.
To evaluate the degree of variability of p with parameter x we use the predictive power defined as, where i max is the bin index where x is highest and is bin index where x is lowest. This measures how much p varies with x if we only consider this one parameter. The uncertainty in is quantified through Monte Carlo sampling, based on the known distributions of p imax and given in Eq. (8) and the 95% confidence interval calculated from the 2.5th and 97.5th percentiles of the distribution of .

MEDS buoys.
The offshore buoys data sets are provided by the Marine Environmental Data Section (MEDS) buoys of Fisheries and Oceans Canada (DFO) (https:// meds-sdmm. dfo-mpo. gc. ca) 34 . We utilized the long term time series of H max , H s and frequency spectra. The open ocean buoys (C46004 and C46184) are 6m NOMAD buoys, the rest are AXYS 3m discus buoys. They record vertical acceleration at 1 Hz sampling rate for 34 minutes every hour and output bulk wave statistics. This work only used data which had received a quality control check and the record appears correct from 01/01/2010 to 15/08/2021. Any other suspicious data sections were removed. Due to outages in service and erroneous records the time series from the MEDS buoys are variable in length and do not necessarily span the full 11 year. The buoys location are found in Fig. 5a.

Data availability
The wave model data and the MarineLabs buoy data are available from the corresponding author on reasonable request. Data access is restricted to research and educational applications. Data from the Fisheries and Oceans Canada buoys can be accessed at https:// meds-sdmm. dfo-mpo. gc. ca.

Code availability
The generation of the synthetic surface elevation data was performed with the free Matlab toolbox "WAFO" 37 . WAVEWATCH III ® is open source code 28 .  www.nature.com/scientificreports/