Introduction

Deterministic nonlinearities and noise are present in many natural systems and the distinction of their relative influence is a long-standing challenge1,2,3,4,5,6,7,8,9,10,11,12. This can be particularly difficult in complex systems and in systems with time delayed interactions, since high-dimensional chaos can be in practice indistinguishable from stochastic dynamics.

A semiconductor laser with optical feedback is a well-known example of this situation. It displays complex dynamical behavior that results from the interplay of deterministic nonlinear light-matter interactions, spontaneous emission noise and time-delayed feedback13,14,15,16,17. Close to the lasing threshold and for moderate feedback strengths, the laser output intensity displays apparently random dropouts [see Fig. 1(a)] that resemble neuronal spikes. These dynamics have attracted a lot of attention, not only because practical applications require a stable output and the dropouts need to be avoided18,19,20, but also because they involve fundamental questions related to the interplay of delay, noise and nonlinearities. Several statistical studies have been performed in order to validate the models used in the literature and to shed light on the underlying mechanisms that trigger the dropouts21,22,23,24,25,26,27,28.

Figure 1

(a) Experimental time series of the laser intensity displaying several dropouts. The pump current is 26.4 mA. The words ‘012’ and ‘210’ are indicated as examples. Also as examples, a few IDIs are classified either as SIs or as LIs (see text for details). Histogram of the inter-dropout intervals (IDIs) for a pump current of 26.4 mA (b) and 27.8 mA (c). In (b) and (c) the threshold, Tth = 0.9T*, used to classify IDIs as LIs or SIs is indicated with a line.

Here we describe a method of time-series analysis that allows one to distinguish signatures of determinism and stochasticity in the sequence of intensity dropouts. We analyze experimental data consisting of long time series of inter-dropout intervals (IDIs) by means of ordinal analysis29, by which the IDI sequence is transformed into a symbolic sequence of ordinal patterns (OPs), also referred to as words. We choose a threshold, Tth, to first classify IDIs into two types: those shorter than Tth are referred to as short intervals (SIs) and those longer than Tth, as long intervals (LIs). In this way the laser spiking activity is separated into periods of fast dropout events that alternate with periods of no events [see Fig. 1(a)]. The motivation for this classification is that some dropouts can be noise-induced while others can be due to a deterministic underlying dynamics30,31,32. Thus, some IDIs correspond to waiting intervals in a resting state until noise triggers a dropout, while others correspond to time intervals between dropouts that are more likely to have a deterministic origin.

We perform the analysis by varying the laser drive current within the whole range where the laser displays spiking dropouts: from low pump currents, where the intensity is low and the dropouts are almost too small to be distinguished from small fluctuations, to high pump currents, where the dropouts become very frequent and cannot be distinguished from one another as individual events. We compute the probabilities of the words formed by consecutive SIs and by consecutive LIs and find that there is a range of pump currents where they are significantly different: the LI probabilities are consistent with stochastic dropouts, while the SI probabilities reveal a more deterministic behavior. These results are obtained with a threshold of 0.9T*, where T* is the most probable IDI value. Similar results can be obtained with other threshold values around 0.9T*. Since the type of dynamics analyzed here occurs in various natural complex systems under the influence of noise, the method that we propose can be a powerful tool for time-series analysis of these systems, at an event-level description of the dynamics.

Results

The experiments were performed with a commercial semiconductor laser and two sets of measurements were obtained, at temperatures of 18°C and 20°C (see Methods). Similar results were found in both data sets and thus we present only the results for the data obtained at 18°C. In order to perform a robust statistical analysis we recorded time series of 32 million points each, with a sampling time of 0.5 ns. The time series contain about 45,000 dropouts at low pump currents and more than 220,000 dropouts at high pump currents.

For each pump current the IDI sequence, ΔTi = ti+1 − ti (with ti being the time when a dropout occurs), is transformed into a sequence of OPs of length D, by considering the relative length of D consecutive IDIs29. For example, for D = 2 there are two possible OPs: ΔTi < ΔTi+1 gives word ‘01’ and ΔTi > ΔTi+1 gives word ‘10’; for D = 3 there are six possible OPs: ΔTi < ΔTi+1 < ΔTi+2 gives ‘012’, ΔTi+2 < ΔTi+1 < ΔTi gives ‘210’, etc. [see Fig. 1(a)]. This symbolic transformation keeps the information about the correlations in the dropout sequence and the short-time memory in the system, but neglects the information contained in the duration of the IDIs.
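The transformation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the experiments: the function names are hypothetical, the windows are assumed to slide by one IDI at a time, and ties between equal IDIs are broken by order of appearance (the text does not specify either choice).

```python
import numpy as np

def inter_dropout_intervals(dropout_times):
    """IDIs: differences between consecutive dropout times."""
    return np.diff(np.asarray(dropout_times, dtype=float))

def ordinal_words(intervals, D):
    """Map each window of D consecutive intervals to a word such as '012'.

    The digit at position k is the rank of the k-th interval inside the
    window (0 = shortest), so an increasing window gives '012' and a
    decreasing one gives '210'. Windows slide by one interval (assumption)
    and ties are broken by order of appearance (assumption).
    """
    x = np.asarray(intervals, dtype=float)
    words = []
    for i in range(len(x) - D + 1):
        window = x[i:i + D]
        # argsort of argsort turns values into their ranks within the window
        ranks = np.argsort(np.argsort(window, kind="stable"), kind="stable")
        words.append("".join(str(r) for r in ranks))
    return words

# Toy dropout times (arbitrary units), not experimental data
idis = inter_dropout_intervals([0, 1, 3, 6, 8, 9])
print(ordinal_words(idis, 3))
```

Only the relative ordering inside each window enters the word, which is why the symbolic sequence discards the actual durations.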

By counting the number of times a word appears in the symbolic sequence we compute the probabilities of the various words (pi with i = 1 … D!). The results are displayed in Fig. 2, which shows, for each pump current, the probabilities of the two D = 2 OPs [Fig. 2(a)] and of the six D = 3 OPs [Fig. 2(b)]. The error bars represent the confidence interval computed with a binomial test, corresponding to a confidence level of 95%, and the gray region represents the probability values consistent with the null hypothesis (N.H.) that there are no correlations in the sequence of dropouts and thus the OPs are equally probable. Probability values in the gray region, p ± 3σp [where p = 1/D! and σp = √(p(1 − p)/N), with N being the number of OPs in the sequence], are consistent with the N.H.
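As a rough sketch of this step, the word probabilities and the width of the null-hypothesis band could be computed as follows. The function names are hypothetical, and the band assumes the standard deviation of a binomial proportion, σp = √(p(1 − p)/N), for equally probable words.

```python
import math
from collections import Counter
from itertools import permutations

def word_probabilities(words, D):
    """Relative frequency of each of the D! possible words."""
    counts = Counter(words)
    n = len(words)
    all_words = ["".join(map(str, perm)) for perm in permutations(range(D))]
    return {w: counts.get(w, 0) / n for w in all_words}

def null_band(D, n_words):
    """Interval p +/- 3*sigma_p for equally probable words, with p = 1/D!.

    sigma_p is taken as the binomial standard deviation of a proportion
    (assumption): sqrt(p * (1 - p) / N).
    """
    p = 1.0 / math.factorial(D)
    sigma = math.sqrt(p * (1.0 - p) / n_words)
    return p - 3.0 * sigma, p + 3.0 * sigma

# Toy word list, not experimental data
probs = word_probabilities(["01", "10", "10", "10"], 2)
low, high = null_band(2, 100)
outside = {w: p for w, p in probs.items() if not (low <= p <= high)}
print(probs, (low, high), outside)
```

A word whose probability falls outside the band is the kind of deviation from uniformity that the figures use as a signature of correlations in the sequence.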

Figure 2

Probabilities of OPs formed with: (a) D = 2 and with (b) D = 3 consecutive IDIs.

The gray region indicates probabilities consistent with the null hypothesis (that there are no correlations in the IDI sequence).

Figures 2(a) and 2(b) show that the most probable words are ‘10’ for D = 2 and ‘210’ for D = 3, respectively, except at low pump currents. These OPs correspond to two and three consecutively decreasing IDIs respectively. For D = 4 (not shown) a similar result is found, with the word ‘3210’ being the most probable except at low pump currents. Notice that, for D = 3 [Fig. 2(b)], for all pump currents there are several OPs with probabilities outside the N.H. gray region, indicating a certain degree of deterministic behavior. Also, it can be noticed that at low pump currents, the analysis with D = 2 [Fig. 2(a)] gives probabilities that are consistent with stochastic behavior (they are within the N.H. region); however, the analysis with D = 3 [Fig. 2(b)] actually reveals a significant degree of determinism, since the probabilities of all the OPs are outside the gray region. The six probabilities form two groups: there are two more probable OPs (012 and 210) and four less probable ones.

To further analyze the underlying structure of the experimental sequence of dropouts we select a threshold, Tth, close to the most probable IDI value, T* (see discussion below for selecting the threshold) and classify the IDIs into two types: those shorter than Tth, as SIs and those longer than Tth, as LIs. By counting the number of times a word appears in the sequence of consecutive LIs or consecutive SIs, we now compute new probabilities of words formed by consecutive LIs (referred to as LI OPs) and by consecutive SIs (SI OPs).
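One possible way to implement this classification is sketched below. It is an illustration under stated assumptions rather than the procedure actually used: the helper names are hypothetical, an IDI exactly equal to Tth is treated as long, and words are formed only inside maximal runs of consecutive SIs or consecutive LIs, never across a change of type.

```python
import numpy as np
from itertools import groupby

def split_runs(intervals, t_th):
    """Split the IDI sequence into maximal runs of consecutive SIs or LIs.

    Returns (si_runs, li_runs), each a list of lists of intervals.
    An interval equal to t_th is counted as long (assumption).
    """
    si_runs, li_runs = [], []
    for is_short, run in groupby(intervals, key=lambda x: x < t_th):
        (si_runs if is_short else li_runs).append(list(run))
    return si_runs, li_runs

def words_in_runs(runs, D):
    """Ordinal words built from D consecutive intervals of the same run."""
    words = []
    for run in runs:
        for i in range(len(run) - D + 1):
            # rank of each interval within the window (0 = shortest)
            ranks = np.argsort(np.argsort(run[i:i + D], kind="stable"))
            words.append("".join(str(r) for r in ranks))
    return words

# Toy IDI sequence and threshold (arbitrary units), not experimental data
si_runs, li_runs = split_runs([5, 1, 2, 3, 7, 6, 2], t_th=4)
print(words_in_runs(si_runs, 2), words_in_runs(li_runs, 2))
```

Because a run shorter than D contributes no word, the SI and LI word sequences are necessarily shorter than the word sequence of the full IDI series.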

Because the words are now formed with consecutive LIs or SIs, we have shorter sequences of words, as compared to those in the full sequence of IDIs; however, the data sets are long enough to still allow calculating the LI and SI probabilities with good statistics. One of the criteria used for choosing the threshold is to obtain enough LI and SI words to allow for a robust statistical analysis. For example, for Tth = 0.9T* and D = 2, the number of SI OPs is about 6,000–7,000 for low and high current respectively and the number of LI OPs is about 9,000–68,000; for D = 3 the number of OPs is smaller: those formed by SIs number about 1,900–1,300 and those formed by LIs, about 3,800–35,000.

The probabilities of the LI OPs and of the SI OPs are displayed in Fig. 3 for D = 2 and in Fig. 4 for D = 3 OPs. In both figures, the LI probabilities are displayed in the left column and SIs in the right column. To analyze the influence of the threshold, various thresholds are used: in Fig. 3 Tth = 0.85T* top, 0.9T* middle and 0.95T* bottom; in Fig. 4 we use only two, 0.90T* top and 0.95T* bottom, because for 0.85T* the number of words that can be formed with three consecutive SIs is not enough to compute the probabilities with good statistics.

Figure 3

Probabilities of D = 2 OPs formed by consecutive LI intervals (a, c and e) and by consecutive SI intervals (b, d and f).

The probabilities are computed for three threshold values, 0.85T* (a and b), 0.90T* (c and d) and 0.95T* (e and f), where T* is the most probable IDI value [indicated in Figs. 1(b), (c) with a vertical line].

Figure 4

Probabilities of D = 3 OPs formed by: consecutive LI intervals (a, c) and by consecutive SI intervals (b, d).

The probabilities are calculated using the threshold 0.90T* (a, b) and 0.95T* (c, d).

In these figures we observe that the OPs formed by consecutive LIs appear equally probable for all pump currents, as expected for a random process such as noise-induced escapes. On the other hand, the probabilities of the OPs formed by consecutive SIs have a deterministic component, except at low pump currents (see discussion below). Therefore, except at low pump currents, the SI sequence keeps the deterministic signature of the IDI sequence while the random signatures of the LIs have been removed.

As the choice of the threshold is rather arbitrary, it can be expected that there will be short LIs that are wrongly classified as SIs and long SIs that are wrongly classified as LIs. However, Figs. 3 and 4 show that the differences between the LI and SI probabilities are significant (except at low pump currents) and that they are robust to threshold variations. It can be observed that the lower threshold reveals more deterministic SIs (with probabilities far from the uniform distribution) but has the drawback of a larger degree of uncertainty (i.e., the error bars and the N.H. region are wider due to a low number of OPs in the SI sequence). On the other hand, for the larger threshold the degree of SI determinism decreases, while the robustness of the analysis increases (i.e., the error bars and the width of the N.H. region of the SI probabilities decrease due to a larger number of OPs in the sequence). As the variation of the threshold leaves the number of LI OPs largely unaffected, it has almost no effect on the LI probabilities.

In the above figures the thresholds were selected in order to take into account the following three goals: i) we can form enough SI and LI words to compute their probabilities with good statistics (i.e., having small error bars and narrow N.H. region), ii) the distribution of the LIs is close to an exponential and iii) the LI OP distribution is close to the uniform distribution. While here we have chosen the threshold in the same way for all data sets (Tth = αT*, where α in the range 0.85–0.95 takes the same value for all pump currents), the method could be optimized by fine-tuning the threshold such that it is optimal for each data set, giving a sequence of LIs with the closest statistics to a random sequence of events.
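A simple way to obtain a threshold of the form Tth = αT* is sketched here. The text does not say how T* was estimated, so the histogram-based estimate, the number of bins and the function names are all assumptions made for illustration.

```python
import numpy as np

def most_probable_idi(intervals, bins=100):
    """Estimate T* as the centre of the most populated histogram bin.

    The binning is an assumption; the estimate depends on it.
    """
    counts, edges = np.histogram(intervals, bins=bins)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1])

def threshold(intervals, alpha=0.9, bins=100):
    """Tth = alpha * T*, with alpha in the range 0.85-0.95 used in the text."""
    return alpha * most_probable_idi(intervals, bins=bins)

# Toy IDI values (arbitrary units), not experimental data
toy = [1, 1, 1, 1, 5, 9]
print(most_probable_idi(toy, bins=4), threshold(toy, alpha=0.9, bins=4))
```

A per-data-set optimization, as suggested above, would amount to scanning α and keeping the value for which the LI statistics are closest to those of a random sequence of events.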

It should be noticed that at low pump currents, both LIs and SIs are consistent with the null hypothesis, for D = 2 [Fig. 3] and also for D = 3 [Fig. 4]. By fine-tuning we could not find a threshold that allowed us to separate the IDIs into two sets with significantly different statistical properties. While for D = 2 this could be expected (as the IDIs also seem stochastic), for D = 3 this is rather unexpected, as none of the probabilities of the OPs formed by consecutive IDIs is consistent with the N.H. [Fig. 2(b)]. Moreover, the IDI distribution [shown in Fig. 1(b)] has a nontrivial structure at low IDI values and an exponential decay at large IDIs, suggesting the existence of two IDI categories. The fact that when separating the IDIs into LIs and SIs we obtain two sets consistent with the N.H. means that by separating we actually remove the correlations existing in the IDI sequence. This effect can be understood in terms of the numerical results in Refs. 31, 32, where it was shown that the average duration of the transient dynamics decreases with increasing current values. Thus, at low pump currents long intervals between consecutive dropouts might occur during the transient dynamics and these “long SIs” have time-scales comparable to noise-induced escapes. As a result, in the low pump current region the method cannot distinguish two different IDI categories, in spite of the fact that the distribution of IDIs displays a bimodal structure.

The different statistical properties of the IDIs, LIs and SIs are also captured by the permutation entropy29, i.e., the entropy of the probabilities of the OPs, S = −Σi pi ln pi, normalized to its maximum value, Smax = ln D!. This has been proven to be an appropriate measure of complexity for chaotic time series in the presence of noise. Figure 5 displays the permutation entropy computed for OPs formed by consecutive IDIs (top), LIs (middle) and SIs (bottom), for a threshold Tth = 0.95T*. Notice that the entropy of SI-OPs is smaller than that of IDIs and LIs, which is consistent with a lower degree of randomness in the SI sequence, as compared to the IDI and LI sequences.
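The normalized permutation entropy can be illustrated with the short function below. It assumes the Shannon form S = −Σ pi ln pi for the entropy of the word probabilities; the function name is hypothetical and words with zero probability are taken to contribute nothing to the sum.

```python
import math

def permutation_entropy(probabilities, D):
    """Shannon entropy of the word probabilities, divided by ln(D!).

    Returns 1 for equally probable words and 0 when a single word
    always occurs. Zero probabilities are skipped (0 * ln 0 -> 0).
    """
    s = -sum(p * math.log(p) for p in probabilities if p > 0)
    return s / math.log(math.factorial(D))

# Toy probabilities, not experimental data
print(permutation_entropy([0.5, 0.5], 2))    # uniform D = 2 words
print(permutation_entropy([0.25, 0.75], 2))  # one word favoured
```

Values below 1 indicate that some words are favoured over others, which is how a lower entropy of the SI words reflects a lower degree of randomness.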

Figure 5

Permutation entropy of OPs formed by consecutive IDIs (a), LIs (b) and SIs (c).

Circles correspond to OPs of length D = 2, triangles D = 3 and squares to D = 4. Threshold for classifying the IDIs as LIs or SIs is 0.95T*.

To further demonstrate that the LIs and SIs have indeed different statistical properties (and thus are likely to correspond to dropouts triggered by different mechanisms), we computed the histograms of the time intervals composed by the sum of consecutive SIs, ΣTi,SI, and by the sum of consecutive LIs, ΣTi,LI. These are shown in Fig. 6. The histogram of the sum of consecutive LIs displays an exponential decay, as can be expected for a variable that is the sum of independent random variables, each with an exponentially decaying distribution. In contrast, the histogram of the sum of consecutive SIs displays a nontrivial structure. In the interpretation of the SIs as time intervals between deterministic dropouts, the sum of consecutive SIs represents the duration of the transient dynamics, before returning to the resting state. Thus, this distribution of transient times can be traced back to a deterministic attractor that rules the dynamics and can be compared with recent simulations32. The good agreement with the simulated statistics of transient times reinforces the interpretation of the dropouts observed experimentally as a dynamics sustained by intrinsic laser noise.
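The sums of consecutive SIs and LIs can be obtained with a short helper like the one below. This is a sketch that assumes each sum runs over a maximal uninterrupted run of intervals of the same type and that an IDI equal to the threshold counts as long; the function name is hypothetical.

```python
from itertools import groupby

def run_sums(intervals, t_th):
    """Total duration of each maximal run of consecutive SIs and of LIs.

    Returns (si_sums, li_sums). An interval equal to t_th is counted
    as long (assumption).
    """
    si_sums, li_sums = [], []
    for is_short, run in groupby(intervals, key=lambda x: x < t_th):
        (si_sums if is_short else li_sums).append(sum(run))
    return si_sums, li_sums

# Toy IDI sequence and threshold (arbitrary units), not experimental data
si_sums, li_sums = run_sums([5, 1, 2, 3, 7, 6, 2], t_th=4)
print(si_sums, li_sums)
```

Histograms of the two returned lists correspond to the two kinds of distributions compared in the text.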

Figure 6

Histogram of the sum of consecutive LIs, ΣTi,LI (a, c) and of the sum of consecutive SIs, ΣTi,SI, (b, d).

The laser pump current is 26.8 mA (a, b) and 27.8 mA (c, d). Threshold is 0.95T*.

The analysis of the IDI data collected at a higher temperature (20°C) did not reveal any significant difference in the probabilities of the OPs formed by the IDI sequence and their dependence on the pump current, nor in those of the OPs formed by consecutive LIs and SIs.

Discussion

We proposed a novel method of analysis that allows one to distinguish signatures of determinism and stochasticity in the sequence of dropouts of a semiconductor laser with optical feedback. The analysis revealed the existence of an underlying structure in the IDI sequence. By choosing an appropriate threshold, the IDIs can be classified into two categories (SIs and LIs), with significantly different deterministic components that suggest that different physical mechanisms trigger the dropouts. These are consistent with interpreting the LIs as waiting intervals in a resting state and the SIs as intervals between dropouts occurring during the return to the resting state. Thus, the method allows statistically inferring which dropouts could be noise-induced and which ones could have a deterministic origin, due to a stochastic trajectory that follows an underlying attractor in its return to the resting state.

The threshold for classifying the IDIs as LIs or SIs was chosen taking into account three criteria: i) the probability distribution of the LIs is exponentially decaying (expected for noise-induced escapes), ii) the probabilities of the words formed by the LIs are close to the uniform distribution (also expected for noise-induced escapes) and iii) there are enough words formed by consecutive LIs and by consecutive SIs to perform a robust statistical analysis. There is a range of threshold values that meet these criteria and we have shown that the results are qualitatively robust to threshold variations within this range.

The method is computationally simple to implement and the data requirements can be easily adapted to small and large data sets by appropriately choosing the length D of the ordinal patterns. For improved performance, instead of using a general criterion across all data sets for selecting the threshold, Tth could be fine-tuned to work optimally for each data set, giving the sequence of LIs with statistics closest to those of a random sequence of events.

The method proposed here can be a very powerful tool for the analysis of real-world data, such as experimental recordings of neuronal inter-spike intervals, or data generated by complex systems such as inter-event times of user activity in social communities, where signatures of deterministic underlying dynamics can be obscured by the presence of noise.

Methods

The experiments were performed with a 675 nm AlGaInP semiconductor laser (Hitachi Laser Diode HL6724MG) with optical feedback from a diffraction grating. The external cavity length was 45 cm and thus the feedback delay time was 3 ns. To detect the laser output power we used a beam-splitter that sent 50% of the light to a 2.5 GHz oscilloscope (Agilent Infiniium 9000). The laser temperature and pump current were controlled to an accuracy of 0.01°C and 0.01 mA respectively (with an ITC502 Thorlabs laser diode combi controller). Two sets of measurements were obtained, at T = 18°C and 20°C. At 18°C the threshold current of the solitary laser was 27.6 mA and with optical feedback it was reduced to 25.7 mA (the feedback strength being such that the threshold reduction was 7%). In the experiments the pump current was varied in steps of 0.2 mA, from 26.2 mA to 28.0 mA.
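As a consistency check on the quoted values, the delay time and the threshold reduction can be worked out explicitly. The check assumes that the delay is the round-trip time in the external cavity and that the reduction is measured relative to the solitary-laser threshold; the symbols L, c, τ, I_th,sol and I_th,fb are introduced here only for illustration.

```latex
\tau = \frac{2L}{c}
     = \frac{2 \times 0.45\ \mathrm{m}}{3.0 \times 10^{8}\ \mathrm{m/s}}
     = 3.0\ \mathrm{ns},
\qquad
\frac{I_{\mathrm{th,sol}} - I_{\mathrm{th,fb}}}{I_{\mathrm{th,sol}}}
     = \frac{27.6 - 25.7}{27.6}
     \approx 0.069 \approx 7\%.
```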