Machine learning analysis of extreme events in optical fibre modulation instability

Närhi, Mikko; Salmela, Lauri; Toivonen, Juha; Billet, Cyril; Dudley, John M.; Genty, Goëry

doi:10.1038/s41467-018-07355-y

Download PDF

Article
Open access
Published: 22 November 2018

Machine learning analysis of extreme events in optical fibre modulation instability

Nature Communications volume 9, Article number: 4923 (2018) Cite this article

8666 Accesses
96 Citations
46 Altmetric
Metrics details

Subjects

Abstract

A central research area in nonlinear science is the study of instabilities that drive extreme events. Unfortunately, techniques for measuring such phenomena often provide only partial characterisation. For example, real-time studies of instabilities in nonlinear optics frequently use only spectral data, limiting knowledge of associated temporal properties. Here, we show how machine learning can overcome this restriction to study time-domain properties of optical fibre modulation instability based only on spectral intensity measurements. Specifically, a supervised neural network is trained to correlate the spectral and temporal properties of modulation instability using simulations, and then applied to analyse high dynamic range experimental spectra to yield the probability distribution for the highest temporal peaks in the instability field. We also use unsupervised learning to classify noisy modulation instability spectra into subsets associated with distinct temporal dynamic structures. These results open novel perspectives in all systems exhibiting instability where direct time-domain observations are difficult.

Machine learning reveals the control mechanics of an insect wing hinge

Article 17 April 2024

Physics-informed machine learning

Article 24 May 2021

Mid-infrared wide-field nanoscopy

Article 17 April 2024

Introduction

A characteristic feature of many nonlinear dispersive systems is the process known as modulation instability (MI), whereby noise on an input signal can be exponentially amplified to create localised structures of high intensity^1,2. There has been significant interest in studies of MI in nonlinear Schrödinger equation (NLSE) systems, with many experiments reported in fibre optics, hydrodynamics and other fields³.

When seeded by noise, the localised structures emerging from MI show complex dynamics and random statistics, and it has even been suggested that MI may be linked to the development of extreme events or rogue waves^4,5,6. Such studies have been of particular interest in nonlinear fibre optics because recent developments in real-time measurement techniques^7,8 have allowed the emergent dynamics to be characterised experimentally in both the temporal and spectral domains. Specifically, in the temporal domain, although optical MI typically occurs on timescales that preclude direct electronic measurement, time-lens magnification has been used to characterise picosecond random breathers and solitons^9,10. In the spectral domain, the dispersive Fourier transform (DFT) has permitted real-time characterisation of a range of instabilities in both optical fibres and laser cavities^{11,12,13,14,15,16,17}.

These new real-time measurement techniques have essentially revolutionised the study of ultrafast instabilities in nonlinear fibre optics^18,19,20, but they nonetheless remain limited in several important respects. For example, time-lens magnification is experimentally complex, typically involving a nonlinear wavelength conversion process which constrains the measurement bandwidth and power. As a result, there are relatively few experiments that have directly measured ultrafast (picosecond or shorter) extreme events in the time domain^9,10. The DFT technique is experimentally simpler because it involves only propagation in dispersive fibre, but is typically associated with a relatively low dynamic range of 20–25 dB²¹. This is a significant limitation to the detailed study of extreme events in MI which are associated with extension in the spectral wings below the −40 dB level^22,23.

In this paper, we describe the development of a new high dynamic range real-time spectrometer that allows the measurement and analysis of unstable MI spectra with an experimental dynamic range approaching 60 dB. Although our measurements are performed in the spectral domain, the application of machine learning to our data allows us to nonetheless compute corresponding statistics for the maximum intensity of the localised temporal peaks in the MI field, peaks that are preferentially associated with rogue wave events. Our approach employs a supervised learning algorithm, first using data from numerical simulations to train a machine learning model (based on a neural network) to correlate the complex spectral and temporal properties of noise-driven MI. We then apply the trained network to analyse high dynamic range experimental measurements of MI spectra in an optical fibre system, and from these data we determine a probability density function for the associated peak shot-to-shot temporal intensity maxima. The temporal probability density function obtained from the experimental spectra is found to be in excellent agreement with numerical modelling, including in the distribution tails that contain the high intensity extreme events. In addition to supervised learning analysis, we show how unsupervised learning can classify noisy MI spectra into subsets associated with distinct temporal dynamic structures. In particular, we show using simulations that machine learning can identify spectral clusters physically associated with different localised breather and rogue wave solutions of the NLSE⁶. Aside from the direct relevance of our results to optics, our approach has a far wider impact in showing how machine learning applied to only spectral data can be successfully used to study the properties of extreme events in the time domain.

Results

Modulation instability and machine learning

Machine learning is an umbrella term that describes the use of statistical techniques to analyse data sets with the aim of detecting patterns and building predictive models. Machine learning has been widely used in areas such as control systems, speech processing, neuroscience and computer vision²⁴, and has very recently been applied to predicting the behaviour of chaotic systems^25,26. Applications of machine learning in the field of photonics is also relatively recent, but a number of studies have been reported in laser optimisation^27,28, ultrashort pulse measurements²⁹, label-free cell classification³⁰, imaging^31,32,33 and coherent communications³⁴. In our case, we aim to apply the techniques of machine learning to the study of chaotic nonlinear dynamics in optics, with the particular aim of studying the statistics of the maximum intensity of temporal peaks in noise-seeded modulation instability using only spectral measurements.

Machine learning algorithms are usually described in terms of two classes: supervised and unsupervised learning²⁴. With supervised learning, prior knowledge of how the input and output of a system are related is used to build a function or model that describes the system response. With unsupervised learning on the other hand, the analysis is more exploratory, and an algorithm will search for inherent patterns and structures in a data set without using any a priori knowledge about the data or system. Here, we have applied both unsupervised and supervised learning to analyse the shot-to-shot spectral fluctuations in noise-seeded MI, and we find that they provide complementary and important insights.

We begin by presenting results applying supervised learning to analyse spectral data from noise-seeded MI. This first involves a training step where a set of MI data with known spectral and temporal characteristics is fed into a neural network to determine a transfer function capable of correlating desired input and output properties. To this end, we use stochastic numerical simulations of a generalised NLSE model to generate a large ensemble of training data (both temporal and spectral) associated with a chaotic MI field. The simulations are parameterised to model our experiments (described below) where MI develops from picosecond pulses injected into the anomalous dispersion regime of a nonlinear optical fibre. The simulations consider input pulses of 3 ps duration (full width at half maximum (FWHM)) and 175 W peak power evolving over a propagation distance of 0.68 m. The MI is seeded from a broadband quantum-limited one photon per mode spectral noise background³⁵. See Methods for further details. It is important to note here that the NLSE simulation model used has been previously shown to provide a very accurate quantitative description of the statistical and noise properties of MI, supercontinuum generation and optical turbulence^{9,10,13,36,37}. This is essential for its use in training the network to subsequently process experimental data. We also note that the use of numerical data to train a network prior to analysing experimental results has previously been used in ultrashort pulse measurement applications²⁹.

Typical results from a single simulation showing the spectral and temporal evolution with distance are plotted in Fig. 1. We see the growth of distinct MI sidebands in the spectral domain (Fig. 1a) associated with the development of a strong modulation and emergence of localised breathers on top of the pulse envelope (Fig. 1b). In the picosecond regime, MI dynamics are highly sensitive to input noise, and for identical initial pulses but with a different random noise background, the spectral and temporal evolution can vary dramatically. This is shown explicitly in Fig. 1c, d where we plot four output spectral and temporal intensity profiles for different random noise seeds (see Methods), as well as the corresponding average profiles calculated over a larger number of 50,000 realisations. The single-shot profiles clearly show complex structure and vary dramatically from shot to shot, but of course these instability characteristics are not seen when the spectra and temporal profiles are averaged. It is for this reason that real-time measurement techniques have proven so valuable in understanding the nonlinear dynamics of MI.

These simulation results allow us to gain insight into the statistics associated with the shot-to-shot variations of MI³⁸. To this end, the solid line in Fig. 1e plots the probability density function (PDF) of the intensity of the localised MI peaks across the pulse envelope. This PDF is calculated from the ~10⁶ temporal peaks identified from analysing the structure on the temporal envelopes obtained from the ensemble of 50,000 realisations. This probability distribution shows typical characteristics of MI with an extended tail, and the dashed vertical line shown in the tail region indicates the rogue wave threshold intensity I_RW defined as I_RW = 2I_1/3 where I_1/3 is the mean intensity of the highest third of intensity peaks.

In the context of relating MI dynamics to the appearance of extreme events and rogue waves, our aim is to determine the intensity of the maximum peak occurring in a given (single-shot) temporal profile (i.e., the points indicated by circles in Fig. 1d) from only the corresponding spectral intensity profile. Note that the associated PDF of these maximum intensity peaks from the simulation data is shown as the red dashed line in Fig. 1e. It is clear that by focussing on the maximum intensity peak from each realisation, we preferentially select out those events which have a greater probability to be classified as rogue wave events from the full distribution.

However, determining the magnitude of these temporal peaks from only spectral intensity profiles (without spectral phase) is a difficult problem because of the complexity of the noisy MI spectral characteristics. Moreover from an experimental point of view, the highest temporal peaks are associated with broad exponentially decaying spectral wings extending over many 10s of dB dynamic range, and determining the spectral bandwidth is not straightforward when dealing with noisy spectra consisting of multiple breathers with random amplitude and phase. As we will see, however, when combined with our novel experimental technique for real-time high dynamic range spectral measurements, machine learning provides a robust and convenient solution that solves this problem.

The specific approach we use is supervised machine learning based on a feed-forward neural network to relate the input (spectral intensity profile) and output (temporal intensity maximum) obtained from simulations. This is illustrated in Fig. 2 (see Methods for further details). In particular, the spectral intensity from a single simulation realisation with index n is written as a vector input X_n = [x₁, x₂...x_N] where x_i is the spectral intensity at wavelength λ_i and mapped via a neural network to a scalar output Y_n corresponding to the maximum intensity of the associated temporal profile. The objective here is to use the training data to determine the weights and biases of the constituent nodes (neurons) that allow the network to perform as a transfer function to link X_n and Y_n. In our case, the neural network was trained using data from an ensemble of 30,000 simulations. The typical dynamic range of the simulation spectra (between the pump and the MI wings) was ~60 dB, and anticipating the use of this network on experimental data, the spectra were also pre-processed to account for experimental conditions such as wavelength response and system resolution (see Methods).

After training, the model was tested on 20,000 simulations from a distinct ensemble of data not used in the training step. The aim here is to test how well the transfer function obtained from training is able to estimate the maximum temporal intensity from a new simulated single-shot spectrum, by comparing the value obtained from the machine learning algorithm with the known value from the time-domain simulation data. The results of this test are shown in Fig. 3. Here Fig. 3a shows a false colour density plot of the “predicted” maximum temporal intensity against the “target” value extracted from the simulation temporal data for the 20,000 test realisations. In order to highlight the grouping of data points, the density plot uses a histogram representation where the data points are grouped into bins of constant area. The colour scale shown corresponds to the normalised density of points in a particular bin. Note also the log scale for better visualisation. We see clear grouping around the expected x = y linear relationship (white dashed line), with very strong correlation (Pearson's correlation coefficient ρ = 0.92).

It is especially important to test the ability of the machine learning model to reproduce the statistics of the MI temporal peaks, and this comparison is shown Fig. 3b. The blue line shows the known PDF of the maximum intensity peaks from the simulation test ensemble, whereas the corresponding PDF determined by the machine learning algorithm is shown as the red dashed line. It is clear that the algorithm performs impressively in reproducing the shape of the probability distribution, especially the slope of the distribution tail (over nearly three orders of magnitude) as it extends to the regime of higher intensity extreme events. In this context, we note that although this agreement might be expected because the training and simulation data are generated from the same numerical model (and indeed possess indistinguishable average spectra), the purpose of the testing step is to evaluate how well the neural network has been configured during the training to yield an accurate mapping from input to output when correlating spectral and temporal properties.

As a further test during this evaluation phase, we examined whether lower-dynamic range spectral measurements (e.g., from conventional fibre-based DFT) could also be suitable for such machine learning analysis. To this end, we repeated the neural network training but truncated the dynamic range of the spectra applying a reduced dynamic range limit of 25 dB which is typical for real-time fibre-DFT systems. We plot the corresponding three-dimensional histogram results obtained when applying the network algorithm to the 20,000 test data in Fig. 3c. From Fig. 3c it is clear that there is greatly reduced visual grouping around the one-to-one relationship (white dashed line) and indeed the Pearson's correlation coefficient here is only ρ = 0.69. Moreover, the predicted PDF shown in Fig. 3d fails to reproduce the slope of the tail, emphasising the importance of the high dynamic range in capturing extreme events. Note that we performed similar tests over a wider range of parameters, and found that machine learning was only able to construct a reliable model when the spectral data possessed a dynamic range exceeding 50 dB. It is of course also important in this regard to ensure that the training and experimental conditions are as close as possible, and the network performance will be reduced if this is not the case. For example, applying the network algorithm as trained above to an ensemble of test data generated with a ±20% difference in input pulse peak power yields a PDF with an overall shape similar to that expected from simulations, but the algorithm in this case does not reproduce the slope of the tail of the PDF corresponding to the highest maximum intensities.

In addition to the supervised learning approach described above, we also applied unsupervised learning to analyse the MI process. The motivation here is to automatically classify unstable MI spectra into subsets associated with different classes of localised breather structures possessing particular analytic solutions³⁸. Because of the very complex nature of these spectra with much low-amplitude fine structure (e.g., see Fig. 1c), this is a challenging objective, but as we show below, machine learning succeeds in performing such classification.

The approach used was to apply a clustering algorithm to partition a large ensemble of simulated spectra into distinct clusters whose structure exhibits similarity based on a distance metric relative to the cluster centroids (see also Methods)³⁹. Note that such an algorithm does not classify the spectra using a single measure such as bandwidth or amplitude, but rather identifies clusters based on the structure of the spectra over their full bandwidth, making it sensitive to complex features such as sidebands, fine structure and the slope of low-amplitude wings.

In our case, we ran a k-means clustering algorithm on an ensemble of 50,000 simulations and found that a partition number k in the range 5–30 yielded similar results. In particular, independent of the number of clusters used, the results showed that the spectra in the cluster with the largest population were associated with temporal profiles whose maximum intensity was close to the maximum of the probability distribution (Fig. 1e). In addition, the mean bandwidth of the spectra in this cluster was closest to that calculated from all the spectra in the ensemble. Significantly, this result allows us to expect on physical grounds that the temporal profiles of the highest intensity peaks corresponding to the spectra in this cluster would be well fitted by the Akhmediev breather solution to the NLSE at the point of the maximum MI gain, as this is the breather solution which dominates the dynamics³⁸. Similarly, we were able to confirm that the cluster with the smallest population grouped together spectra whose associated temporal profiles had maximum intensities in the tail of the probability distribution, and the spectra in this cluster had the largest calculated mean bandwidth. Again on physical grounds, we can then expect these spectra to be associated with temporal peaks corresponding to the strongly localised Peregrine soliton solution of the NLSE, and perhaps even higher-amplitude profiles associated with breather collisions⁹.

To show this explicitly, Fig. 4 presents results obtained using 9 clusters. Firstly, we consider the results in Fig. 4a, b which show spectral and temporal profiles associated with the cluster of spectra closest to the mean of the spectrum across the whole ensemble. This cluster includes 8402 elements. The superposed blue curves in Fig. 4a show individual spectra from this cluster, while the black curve shows their calculated mean. The individual temporal profiles of the highest intensity peaks in the corresponding time-domain fields are shown as the superposed blue curves in Fig. 4b and the black line plots the mean of these temporal peaks. The yellow line plots the analytic Akhmediev breather at maximum MI gain (see Methods), and we note the excellent agreement between the cluster mean and the analytic breather solution.

The results in Fig. 4c, d show spectral and temporal profiles associated with the cluster of spectra with largest mean bandwidth and which includes 2153 elements. Again, the superposed blue curves in Fig. 4c show individual spectra while the black curve shows their calculated mean. The individual temporal profiles of the highest intensity peaks in the corresponding time-domain fields are shown as the blue curves in Fig. 4d and the black line plots the mean of these temporal peaks. The yellow line here plots the analytic Peregrine soliton (see Methods), and we note again the excellent agreement between the cluster mean and the analytic solution.

Experimental setup and results

Our experimental setup was designed to measure a large ensemble of high dynamic range spectra from noise-seeded MI, suitable for analysis using machine learning as described above. To this end, we first generated a noisy MI field by injecting 3 ps duration (FWHM) pulses of 175 W peak power into 0.68 m of photonic crystal fibre (PCF) with zero-dispersion wavelength around 750 nm. At the pump wavelength of 825 nm, the fibre exhibits strong anomalous dispersion such that clear characteristics of MI are observed. The pump source used was an 80 MHz mode-locked Ti:Sapphire laser. Note that the simulations described above used identical parameters to these experiments.

To characterise the shot-to-shot spectra with high dynamic range, we developed a novel real-time spectrometer setup as shown in Fig. 5. We first reduce the MI signal repetition rate to 150 kHz (using an acousto-optic modulator placed after the Ti:Sapphire laser) and use a rapidly rotating mirror to scan sequential spectra onto different vertical positions of the entrance slit of a 1.1 nm resolution Czerny–Turner spectrograph. Most importantly, this approach is combined with spectral windowing and differential attenuation to capture the central region and the lower amplitude wings of the individual spectra separately, such that the two distinct spectral regions are recorded with the full available dynamic range of the detector. Post-processing is then used to recombine the two windowed components, yielding a dynamic range approaching 60 dB, a near four-order magnitude improvement compared to conventional fibre-DFT. See Methods for further details.

Figure 6 shows experimental results where this technique was used to measure an ensemble of 3000 MI spectra. Firstly, Fig. 6a shows a sequence of 60 consecutively recorded spectra to illustrate how the real-time measurements capture the large shot-to-shot fluctuations expected from MI in the picosecond regime^9,12,35. It is especially significant that the high dynamic range clearly reveals variations in the structure of the spectral wings below the −40 dB level. As a check on the fidelity of these measurements, we computed the mean of the 3000 real-time spectra (dashed red line in Fig. 6b) to compare with an independent measurement (solid yellow line) using an integrating optical spectrum analyser (OSA). We see very good agreement between the OSA measurement and the average of the real-time measurements for the central region, the MI sidebands and the slope of the wings on the short wavelength edge. Note that the discrepancy observed for wavelengths beyond 875 nm in the wings when compared to the OSA is due to reduced throughput efficiency of the system (i.e., grating and camera response).

To further highlight the advantage of the windowed-real-time technique developed here, the inset to Fig. 6b compares the average from our high dynamic range measurements with the results of additional experiments where we used a standard fibre-DFT setup with dynamic range of ~22 dB (see Methods for details). The near four orders of magnitude improvement using spectral windowing is very apparent from this comparison. To show explicitly how this enhanced dynamic range reveals shot-to-shot differences in the spectral wings, Fig. 6c compares the structure of two measured single-shot spectra (dashed red line) with the average computed over the 3000 measured spectra (solid black line). We also plot on this figure (blue solid line) the mean spectrum calculated from the full ensemble of 50,000 numerical simulations of our experiments as described above. In this context we note that for the supervised machine learning training step, the simulated spectra were multiplied by a spectral response function to match the experimental fall off above 875 nm. This ensures that the model obtained from training using simulations can be applied to experimental results.

Results applying the trained supervised learning model to the experimental data are shown in Fig. 7. Here, we aim in particular to determine from the measured single-shot spectra the associated maximum temporal intensity. Figure 7 plots the PDF obtained from the machine learning analysis of the experimental data (dashed red) compared with that from numerical simulations (solid blue). We see very good agreement between the experimental and simulation probability density functions, especially in the slope of the distribution tail for the highest intensity extreme events. These results show very clearly that even though the only available experimental data are that of the spectral intensity, we can nonetheless extract significant physical information about the corresponding temporal behaviour, and in particular reproduce the long tail of the statistics associated with the emergence of localised breather and rogue wave structures.

We also ran the unsupervised learning clustering algorithm on the ensemble of 3000 experimental spectra to cluster the spectra into 9 partitions as in our analysis of the simulation data. These results are shown in Fig. 4. Specifically, the dashed red line in Fig. 4a plots the mean of the partition whose bandwidth is closest to that of the mean of the full ensemble, while the dashed red line in Fig. 4b plots the mean of the cluster with largest spectral bandwidth. We see how the spectral clusters identified from the experimental data closely match those from the numerical simulations.

Discussion

There are several major conclusions to be drawn from these results. Firstly, for the modulation instability system studied here, we have shown that real-time measurements of only the spectral intensity can be combined with supervised machine learning to yield quantitative information about temporal characteristics using training based on accurate numerical simulations. In particular, by relating spectral characteristics to the maximal intensity of the corresponding time-domain peaks, we can extract a probability distribution that preferentially selects out events which satisfy rogue wave criteria. Since this allows the presence of deleterious high-power temporal spikes to be captured even though only optical spectra are being measured, this is of potential practical significance in imaging and spectroscopy experiments using supercontinuum sources seeded by an initial regime of modulation instability. Secondly, our simulation results showing the ability of unsupervised machine learning to cluster a large ensemble of modulation instability spectra into different classes associated with specific dynamical structures is another important aspect of our work. A further significant element of our results concerns the experimental technique used to capture the shot-to-shot instability spectra with high dynamic range. By windowing the complex modulation instability spectra and using differential attenuation, it has been possible to measure spectra in real time and with nearly 60 dB dynamic range. Being able to characterise spectra with such a large dynamic range is an essential component in successfully applying machine learning to our data. This approach is experimentally straightforward and can be implemented at all wavelengths where suitable spectrometers are available. In this context we note that it is a long standing problem in ultrafast optics to relate temporal and spectral information when only the intensity properties are known; the underlying fields also contain phase components, and it is generally extremely difficult to correlate temporal and spectral properties without this phase information. The possibility to infer time-domain properties in optics only from real-time spectral measurements that are easier to implement experimentally is significant not only from the point of view of studying the particular process of modulation instability as we do here, but also more generally in the field of ultrafast optics. We also note in this context that the use of accurate simulations to train a neural network subsequently applied to experimental data opens up many possibilities for the applications of machine learning in optics. Indeed, the use of simulation-based training applied to real-world data (”sim-real transfer”) is a burgeoning field of machine learning, and with the wide availability of realistic numerical models for many propagation scenarios in both linear and nonlinear optics, we anticipate many future applications in the analysis of optical systems.

Finally, although demonstrated here in an optical context, the principle of using machine learning to study temporal properties of a nonlinear system based only on spectral intensity measurements would be expected to apply to many physical systems exhibiting chaos and instability where direct time-domain observations are precluded.

Methods

Numerical modelling

Our numerical modelling is based on the well-known generalised NLSE model describing the evolution of a field envelope in an optical fibre³⁵. This model has been previously shown to provide a very accurate quantitative description of the statistical and noise properties of MI and supercontinuum generation^13,36. Here, we model the propagation of 3 ps (FHWM) P₀ = 175 W peak power hyperbolic-secant pulses in the anomalous dispersion regime of a 68 cm-long PCF (NKT Photonics NL-PM-750) with Taylor-series expansion dispersion coefficients at 825 nm: β₂ = −1.03 × 10⁻²⁶ s² m⁻¹, β₃ = 4.74 × 10⁻⁴¹ s³ m⁻¹, β₄ = 2.35 × 10⁻⁵⁶ s⁴ m⁻¹, β₅ = −1.17 × 10⁻⁷⁰ s⁵ m⁻¹ and β₆ = −9.07 × 10⁻⁸⁵ s⁶ m⁻¹. The nonlinear coefficient γ = 0.1 W⁻¹ m⁻¹. For completeness, we also include the Raman and shock terms in the model, but for our parameter regime, these had minor influence on the dynamics (although the Raman effect does lead to the observed MI sideband asymmetry.)

Simulations used 4096 grid points with a temporal window of 12 ps corresponding to an 83 GHz spectral resolution. Noise was included in the frequency domain via a one photon per mode spectral background with random phase to seed the growth of MI sidebands outside the bandwidth of the pump pulse. This random phase background in the initial conditions varies between different simulation realisations. We generated an ensemble of 50,000 numerical simulations corresponding to different input noise seeds with 30,000 used for training and 20,000 for testing of the neural network. To account for experimental shot-to-shot fluctuations introduced by using an acousto-optic modulator (AOM) to operate at reduced repetition rate, the simulations also included an additional ±5% random variation of input peak power between different realisations in the ensemble. Because the MI modulation frequency at maximum gain is given by Ω = (2γP₀/|β₂|)^1/2, this leads to a very small variation in the temporal modulation frequency on the pulse envelope for each realisation in the ensemble. However, the intrinsic randomness of the temporal structure from shot to shot is determined by the initial background spectral noise which is amplified nonlinearly by the MI process. Indeed, without the one photon per mode broadband noise background, the simulations do not show the growth of MI at all.

Analytic breather and Peregrine soliton solutions for MI

We give here the analytic form of the Akhmediev breather and Peregrine soliton solutions of MI that are plotted for comparison with the clustering results in Fig. 4. Akhmediev breathers form a one-parameter class of localised solutions of the NLSE, and at their point of maximum localisation (maximum amplitude and minimum temporal width), their analytic form in dimensional units is given by:

$$A_{{\mathrm{AB}}}(t) = \sqrt {P_0} {\kern 1pt} {\kern 1pt} \frac{{(1 - 4a) + \sqrt {2a} {\mathrm{cos}}(\omega _mt)}}{{\sqrt {2a} {\mathrm{cos}}(\omega _mt) - 1}},$$

(1)

where parameters 0 < a < 0.5 and ω_m are related by $2a = 1 - \omega _m^2|\beta _2|/(4\gamma P_0)$.

The solutions plotted in Fig. 4 are the Akhmediev breather solution corresponding to the frequency of maximum MI gain where a = 0.25 in Eq. (1), and the Peregrine soliton solution which is obtained in the asymptotic limit a = 0.5. In this latter case, the Peregrine soliton takes the rational form:

$$A_{{\mathrm{PS}}}(t) = \sqrt {P_0} \left[ {1 - 4/\left( {1 + 4\gamma P_0t^2/|\beta _2|} \right)} \right].$$

(2)

Note that the parameters used for the analytic solutions in Fig. 4 are the experimental values given in the preceding section. Finally we remark that the simulation and analytical intensity profiles are normalized according to the common convention in nonlinear fibre optics to show instantaneous power³⁵.

Machine learning

Machine learning describes the use of computational and statistical techniques to analyse data sets with the aims of classifying data and building models²⁴. Machine learning algorithms are usually described in terms of two classes. A supervised learning algorithm uses classification and regression techniques to train a model from a known set of input and output data such that the model can be used to map new inputs to new outputs. In contrast, unsupervised learning is used to find patterns or intrinsic structures in data sets without any a priori knowledge of the system or data properties. We now present further details of how these techniques were used here.

Supervised learning

The goal of supervised learning is to use a set of known ”training” data to determine a function or model that will map an input to an output. In our case the input X represents the intensity spectrum of a modulation instability field, while the output Y represents a single number—the intensity of the highest peak in the corresponding temporal intensity profile.

Our training data are obtained from an ensemble of 30,000 numerical simulations, and we denote the training data pairs as (X_n, Y_n) with n = 1…30,000. The mapping function used in supervised learning can take various forms including decision trees, regressions, neural networks or Bayesian classifiers^40,41, and in our implementation, we used a multilayer neural network as shown in Fig. 2. Such a network consists of basic computational units (nodes or ”neurons”) organised into different layers: an input layer which accepts the input data X_n, intermediate hidden layers that perform operations on the data and the final layer which computes the network output. Each node in a given layer accepts multiple inputs from the previous layer and these are weighted, summed and combined with an additive bias to yield a resulting single real-value which is passed to an ”activation function” to generate the node output.

In more detail, the network we used consisted of an input layer, two dense (i.e., fully connected) layers of hidden nodes with a nonlinear activation function and an output layer. The two dense hidden layers had 30 and 10 nodes, respectively. The output layer is a single linear node. The output of a generic node $h_i^{(k)}$ in layer k is calculated by combining the outputs $h_j^{(k - 1)}$ from the previous k − 1-th layer:

$$h_i^{(k)} = f\left( {\mathop {\sum}\limits_{j = 1}^{N_{k - 1}} w_{ij}^{(k)}h_j^{(k - 1)} + b_i^{(k)}} \right).$$

(3)

Here $w_{ij}^{(k)}$ are the weights between nodes $n_j^{(k - 1)}$ and $n_i^{(k)}$ of layers k − 1 and k, respectively and the summation is calculated over N_k−1, the number of nodes in layer k − 1. The term $b_i^{(k)}$ represents the bias for each node $n_i^{(k)}$ in layer k, and f(x) = 2/[1 + exp( − 2x)] − 1 is the nonlinear (hyperbolic tangent sigmoid) activation function. For the output layer, a linear activation function was used. The node weights and biases are initially set to random values and then optimised using conjugate gradient back-propagation^42,43 in order to minimise a cost function, defined as:

$$\epsilon = \frac{1}{N}\mathop {\sum}\limits_{n = 1}^N \left( {Y_n - Y_n^\prime } \right)^2.$$

(4)

where N is the number of samples in the training data, Y_n is the target value and $Y_n^\prime$ is the output of the network. The weights w_ij are iteratively adjusted by an amount Δw_ij with learning rate η towards the negative gradient of ϵ such that ${\mathrm{\Delta }}w_{ij} = - \eta \frac{{\partial \varepsilon }}{{\partial w_{ij}}}$. This process is repeated over a number of ”epochs” (one forward pass and one backward pass of all (X_n, Y_n) training pairs through the network) until convergence (no change in the gradient descent with subsequent multiple iterations). At this point the network is suitably configured to perform as the desired transfer function linking X_n and Y_n pairs.

Because our goal is to apply the machine learning algorithm to real-world experimental data, the simulated MI spectra were pre-processed to account for experimental constraints, i.e., the wavelength-dependent response (which falls off above 875 nm) and the 1.1 nm resolution of the spectrometer. With this pre-processing, the input vector then consists of N = 121 uniformly distributed spectral intensity bins such that each MI spectrum from the simulation ensemble was discretised onto a vector X_n = [x₁, x₂...x_N]. The neural network was trained for 300 epochs of all 30,000 training sets.

Unsupervised learning

Clustering is the most common unsupervised learning technique used for exploratory data analysis. In our analysis, we used the k-means method that divides unclassified data into k mutually exclusive clusters by minimising the distance from the data to the cluster centroid. The algorithm begins with random initialisation of the centroid locations, and this is followed by a classification of the data into clusters based on distance to these centroids. The centroids of these clusters are then calculated, the cluster populations are updated based on these new centroid locations, and this process is repeated until the centroid positions stabilise. It is important to note here that the k-means algorithm does not cluster on a single metric such as, e.g., the spectral bandwidth or amplitude, but rather identifies the clusters of different spectra based on the structure and shape of the spectra over their full bandwidth. Indeed, it is the ability of the clustering algorithm to detect patterns in the shot-to-shot spectral structure that demonstrates its utility for this purpose.

Using this method, the 50,000 simulated spectra were classified using different number of clusters (k varying from 5 to 30) to ensure that any conclusions drawn would be independent of the number of clusters used. For the spectra grouped in each cluster, we calculated the corresponding time-domain intensity profile locally around the intensity maximum and examine these for all clusters. Independent of the number of clusters used, the cluster containing the largest number of spectra yielded local temporal profiles with maximum intensity close to the maximum of the probability distribution (Figs. 1e and 7). On the other hand, clusters corresponding to lower or higher intensities than the distribution maximum contained fewer spectra so that the cluster sizes essentially follow the probability distribution. In the distribution tail at the highest intensities, we found only one cluster would be identified from the algorithm, and this cluster contained the smallest number of spectra. Results in Fig. 4 illustrate the classification results using k = 9 for the clusters with the largest and smallest number of spectra produced by the algorithm both for the 50,000 simulated spectra and the 3000 experimental spectra. For completeness here, we give the number of elements in each cluster generated from the simulated data. Specifically for the 50,000 simulations, the cluster sizes ordered from largest to smallest population were: 8402, 6948, 6327, 5853, 5596, 5074, 4973, 4674 and 2153. Note that the results in Fig. 4 correspond to the clusters with the largest and smallest populations. For the 3000 experimental spectra, the cluster populations were found to be: 528, 526, 471, 430, 429, 317, 280, 210 and 49. Note that the algorithm does not return the clusters with any sorting order. Any physical interpretation of the clusters identified must be performed independently of the algorithm itself as we have done above by associating the spectra in particular clusters with their associated temporal properties.

High dynamic range real-time spectrometer

Single-shot MI spectra were measured in real time at the fibre output using a rapidly rotating mirror mounted on a galvanometer (Nutfield QS-12) with angular speed ω = 240 rev. per min, and focussed with a lens of focal length f = 150 mm at the entrance slit of a Czerny–Turner spectrograph. The spectrograph used a grating with 300 lines per mm and 500 nm blaze (ThorLabs GR25-0305) to disperse consecutive spectra onto different lines of a high-sensitivity electron-multiplying charged-coupled device (EMCCD) camera (Andor iXon 3), allowing single-shot spectral intensity measurements with a 1.1 nm resolution. With this scan rate and our setup, it was necessary to reduce the repetition rate of the laser to 150 kHz using an acousto-optic modulator, but acquisition speeds up to the MHz range would be possible either using a faster galvanometer, using a multi-pass geometry⁴⁴ or by increasing the focal length at the spectrograph entrance slit.

The camera was cooled to −80 °C and used 5× pre-amplifier gain to decrease the noise level to a single electron level corresponding to a maximum dynamic range close to 40 dB. In order to increase the effective dynamic range of the measurement, we used a differential spectral attenuation scheme that captures the central part and wings of the MI spectra separately with the same dynamic range. In this scheme, the MI field at the fibre output is divided between two arms of unequal length corresponding to a 200 ps delay. Differential attenuation was induced in the two arms using a notch filter with a 40 dB, 20 nm rejection band centred at 825 nm (Edmund Optics) and a variable neutral density filter, respectively. Beams from the two arms are then recombined with a beamsplitter such that the central part and wings of the individual spectra are recorded with the same dynamic range and 200 ps delay by the individual lines of the EMCCD. The spectral response of the system was carefully calibrated by measuring the mean spectrum with and without the filters. The full spectra are subsequently recombined by post-processing with an effective 60 dB dynamic range, representing a more than three orders of magnitude improvement compared to a conventional fibre-based DFT approach^12,13. Direct comparison of the average MI spectrum at the PCF output was performed with an integrating optical spectrum analyser (Ando AQ6315B).

Conventional fibre-DFT

The conventional fibre-DFT implemented for comparison with the high dynamic range real-time method used a 100 m custom fabricated fibre (IXfibre IXF-SM series) designed to be single mode over a broad wavelength range in the near-infrared and with total dispersion β₂L = +4030 ps² at 825 nm. The input to the dispersive stretching fibre was attenuated to ensure linear propagation. The real-time spectra were recorded with a 25 GHz InGaAs photodiode (UPD-15-IR2-FC Alphalas) and 20 GHz real-time oscilloscope (DSA72004 Tektronix), leading to an effective resolution of ~0.03 nm.

Data availability

The data that support the plots, code modules used in data analysis and other findings of this study are available from the corresponding author upon reasonable request.

References

Benjamin, T. B. & Feir, J. E. The disintegration of wave trains on deep water. Part I. Theory. J. Fluid Mech. 27, 417–430 (1967).
Article ADS Google Scholar
Bespalov, V. I. & Talanov, V. I. Filamentary structure of light beams in nonlinear liquids. JETP Lett. 3, 307–310 (1966).
ADS Google Scholar
Zakharov, V. E. & Ostrovsky, L. A. Modulation instability: the beginning. Phys. D 238, 540–548 (2009).
Article MathSciNet CAS Google Scholar
Zakharov, V. E., Dyachenko, A. I. & Prokofiev, A. O. Freak waves as nonlinear stage of Stokes wave modulation instability. Eur. J. Mech. B Fluids 25, 677–692 (2006).
Article ADS MathSciNet Google Scholar
Akhmediev, N., Dudley, J. M., Solli, D. R. & Turitsyn, S. K. Recent progress in investigating optical rogue waves. J. Opt. 15, 060201 (2013).
Article ADS Google Scholar
Dudley, J. M., Dias, F., Erkintalo, M. & Genty, G. Instabilities, breathers and rogue waves in optics. Nat. Photonics 8, 755–764 (2014).
Article ADS CAS Google Scholar
Goda, K. & Jalali, B. Dispersive Fourier transformation for fast continuous single-shot measurements. Nat. Photonics 7, 102–112 (2013).
Article ADS CAS Google Scholar
Salem, R., Foster, M. A. & Gaeta, A. L. Application of space-time duality to ultrahigh-speed optical signal processing. Adv. Opt. Photonics 5, 274–317 (2013).
Article ADS Google Scholar
Närhi, M. et al. Real-time measurements of spontaneous breathers and rogue wave events in optical fibre modulation instability. Nat. Commun. 7, 13675 (2016).
Article ADS PubMed PubMed Central Google Scholar
Suret, P. et al. Single-shot observation of optical rogue waves in integrable turbulence using time microscopy. Nat. Commun. 7, 13136 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Solli, D. R., Ropers, C., Koonath, P. & Jalali, B. Optical rogue waves. Nature 450, 1054–1057 (2007).
Article ADS CAS PubMed Google Scholar
Solli, D. R., Herink, G., Jalali, B. & Ropers, C. Fluctuations and correlations in modulation instability. Nat. Photonics 6, 463–468 (2012).
Article ADS CAS Google Scholar
Wetzel, B. et al. Real-time full bandwidth measurement of spectral noise in supercontinuum generation. Sci. Rep. 2, 882 (2012).
Article CAS PubMed PubMed Central Google Scholar
Godin, T. et al. Real time noise and wavelength correlations in octave-spanning supercontinuum generation. Opt. Express 21, 18452–18460 (2013).
Article ADS CAS PubMed Google Scholar
Runge, A. F. J., Broderick, N. G. R. & Erkintalo, M. Observation of soliton explosions in a passively mode-locked fiber laser. Optica 2, 36–39 (2015).
Article CAS Google Scholar
Herink, G., Jalali, B., Ropers, C. & Solli, D. R. Resolving the build-up of femtosecond mode-locking with single-shot spectroscopy at 90 MHz frame rate. Nat. Photonics 10, 321–326 (2016).
Article ADS CAS Google Scholar
Krupa, K., Nithyanandan, K., Andral, U., Tchofo-Dinda, P. & Grelu, P. Real-time observation of internal motion within ultrafast dissipative optical soliton molecules. Phys. Rev. Lett. 118, 243901 (2017).
Article ADS PubMed Google Scholar
Ryczkowski, P. et al. Real-time full-field characterization of transient dissipative soliton dynamics in a mode-locked laser. Nat. Photonics 12, 221–227 (2018).
Article ADS CAS Google Scholar
Tikan, A., Bielawski, S., Szwaj, C., Randoux, S. & Suret, P. Single-shot measurement of phase and amplitude by using a heterodyne time-lens system and ultrafast digital time-holography. Nat. Photonics 12, 228–234 (2018).
Article ADS CAS Google Scholar
Lei, C. & Goda, K. The complete optical oscilloscope. Nat. Photonics 12, 190–191 (2018).
Article ADS CAS Google Scholar
Mahjoubfar, A. et al. Time stretch and its applications. Nat. Photonics 11, 341–351 (2017).
Article ADS CAS Google Scholar
Akhmediev, N., Ankiewicz, A., Soto-Crespo, J. M. & Dudley, J. M. Rogue wave early warning through spectral measurements? Phys. Lett. A 375, 541–544 (2011).
Article ADS CAS Google Scholar
Akhmediev, N., Soto-Crespo, J. M., Ankiewicz, A. & Devine, N. Early detection of rogue waves in a chaotic wave field. Phys. Lett. A 375, 2999–3001 (2011).
Article ADS CAS Google Scholar
Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015).
Article ADS MathSciNet CAS PubMed Google Scholar
Pathak, J., Hunt, B., Girvan, M., Lu, Z. & Ott, E. Model-free prediction of large spatiotemporally chaotic systems from data: a reservoir computing approach. Phys. Rev. Lett. 120, 024102 (2018).
Article ADS PubMed Google Scholar
Zimmermann, R. S. & Parlitz, U. Observing spatio-temporal dynamics of excitable media using reservoir computing. Chaos 28, 043118 (2018).
Article ADS Google Scholar
Woodward, R. I. & Kelleher, E. J. R. Towards ‘smart lasers’: self-optimisation of an ultrafast pulse source using a genetic algorithm. Sci. Rep. 6, 37616 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Baumeister, T., Brunton, S. L. & Kutz, J. N. Deep learning and model predictive control for self-tuning mode-locked lasers. J. Opt. Soc. Am. B 35, 617–626 (2018).
Article ADS CAS Google Scholar
Zahavy, T. et al. Deep learning reconstruction of ultrashort pulses. Optica 5, 666–673 (2018).
Article Google Scholar
Chen, C. L. et al. Deep learning in label-free cell classification. Sci. Rep. 6, 21471 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Lyu, M. et al. Deep-learning-based ghost imaging. Sci. Rep. 7, 17865 (2017).
Article ADS PubMed PubMed Central Google Scholar
Higham, C. F., Murray-Smith, R., Padgett, M. J. & Edgar, M. P. Deep learning for real-time single-pixel video. Sci. Rep. 8, 2369 (2018).
Article ADS PubMed PubMed Central Google Scholar
Rivenson, Y., Zhang, Y., Günaydin, H., Teng, D. & Ozcan, A. Phase recovery and holographic image reconstruction using deep learning in neural networks. Light Sci. Appl. 7, 17141 (2018).
Article CAS PubMed Central Google Scholar
Giacoumidis, E., Wei, J., Aldaya, I. & Barry, L. P. Exceeding the nonlinear Shannon-limit in coherent optical communications using 3D adaptive machine learning. Preprint at https://arxiv.org/pdf/1802.09120 (2018).
Dudley, J. M., Genty, G. & Coen, S. Supercontinuum generation in photonic crystal fiber. Rev. Mod. Phys. 78, 1135–1184 (2006).
Article ADS CAS Google Scholar
Corwin, K. L. et al. Fundamental noise limitations to supercontinuum generation in microstructure fiber. Phys. Rev. Lett. 90, 113904 (2003).
Article ADS CAS PubMed Google Scholar
Frosz, M. H. Validation of input-noise model for simulations of supercontinuum generation and rogue waves. Opt. Express 18, 14778–14787 (2010).
Article ADS CAS PubMed Google Scholar
Toenger, S. et al. Emergent rogue wave structures and statistics in spontaneous modulation instability. Sci. Rep. 5, 10380 (2015).
Article ADS PubMed PubMed Central Google Scholar
James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction to Statistical Learning with Applications in R (Springer, New York, 2013).
Google Scholar
Samarasinghe, S. Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition (Auerbach Publications, New York, 2006).
Book Google Scholar
Shalev-Shwartz, S. & Ben-David, S. Understanding Machine Learning. From Theory to Algorithms (Cambridge, Cambridge, 2014).
Fletcher, R. & Reeves, C. M. Function minimization by conjugate gradients. Comput. J. 7, 149–154 (1964).
Article MathSciNet Google Scholar
Hagan, M. T., Demuth, H. B. & Beale, M. H. Neural Network Design (PWS Publishing, Boston, 1996).
Lai, C., Goosman, D., Wade, J. & Avara, R. Design and field test of a galvanometer deflected streak camera. Vol. 4948 of Proceedings of SPIE, 25th International Congress on High Speed Photography and Photonics, 330–335 (SPIE, Beaune, 2003).

Download references

Acknowledgements

M.N. acknowledges the support from Kaute foundation and TUT graduate school. J.M.D. acknowledges support from the French Investissements d’Avenir programme, project ISITE-BFC (contract ANR-15-IDEX-0003). G.G. acknowledges the support from the Academy of Finland (grants 298463 and 318082). D. Brunner, P. Ryczkowski and T. Sylvestre are acknowledged for useful discussions.

Author information

These authors contributed equally: Mikko Närhi, Lauri Salmela.

Authors and Affiliations

Tampere University of Technology, Laboratory of Photonics, FI-33101, Tampere, Finland
Mikko Närhi, Lauri Salmela, Juha Toivonen & Goëry Genty
Institut FEMTO-ST, Université Bourgogne Franche-Comté, CNRS UMR 6174, 25000, Besançon, France
Cyril Billet & John M. Dudley

Authors

Mikko Närhi
View author publications
You can also search for this author in PubMed Google Scholar
Lauri Salmela
View author publications
You can also search for this author in PubMed Google Scholar
Juha Toivonen
View author publications
You can also search for this author in PubMed Google Scholar
Cyril Billet
View author publications
You can also search for this author in PubMed Google Scholar
John M. Dudley
View author publications
You can also search for this author in PubMed Google Scholar
Goëry Genty
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.N., L.S., J.T. and C.B. performed the experiments with assistance and independent checking from J.M.D. and G.G. Numerical simulations were carried out by L.S., M.N., J.M.D. and G.G. All authors contributed to interpreting the results obtained and to the writing of the manuscript, with overall project supervision by J.M.D. and G.G.

Corresponding author

Correspondence to Goëry Genty.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Närhi, M., Salmela, L., Toivonen, J. et al. Machine learning analysis of extreme events in optical fibre modulation instability. Nat Commun 9, 4923 (2018). https://doi.org/10.1038/s41467-018-07355-y

Download citation

Received: 28 May 2018
Accepted: 30 October 2018
Published: 22 November 2018
DOI: https://doi.org/10.1038/s41467-018-07355-y

This article is cited by

Analysis of interaction dynamics and rogue wave localization in modulation instability using data-driven dominant balance
- Andrei V. Ermolaev
- Mehdi Mabed
- John M. Dudley
Scientific Reports (2023)
Prediction of soft X-ray laser gain value generated from laser plasmas by using a multilayer perceptron neural network
- G. Ghani-Moghadam
Optical and Quantum Electronics (2023)
$\hbox {U}^p$-Net: a generic deep learning-based time stepper for parameterized spatio-temporal dynamics
- Merten Stender
- Jakob Ohlsen
- Alexander Schlaefer
Computational Mechanics (2023)
Modulation instability—rogue wave correspondence hidden in integrable systems
- Shihua Chen
- Lili Bu
- Nail Akhmediev
Communications Physics (2022)
Data-driven model discovery of ideal four-wave mixing in nonlinear fibre optics
- Andrei V. Ermolaev
- Anastasiia Sheveleva
- John M. Dudley
Scientific Reports (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.