On-chip photonic decision maker using spontaneous mode switching in a ring laser

Homma, Ryutaro; Kochi, Satoshi; Niiyama, Tomoaki; Mihana, Takatomo; Mitsui, Yusuke; Kanno, Kazutaka; Uchida, Atsushi; Naruse, Makoto; Sunada, Satoshi

doi:10.1038/s41598-019-45754-3

Article
Open access
Published: 01 July 2019

Scientific Reports volume 9, Article number: 9429 (2019)

Abstract

Efficient and accurate decision making is gaining importance with the rapid expansion of information and communication technologies, including artificial intelligence. Here, we propose and experimentally demonstrate an on-chip, integrated photonic decision maker based on a ring laser. The ring laser exhibits spontaneous switching between clockwise and counter-clockwise oscillatory dynamics; we exploit this behavior to solve a multi-armed bandit problem. The spontaneous switching dynamics provides efficient exploration for finding the correct decision. On-line decision making is experimentally demonstrated, including autonomous adaptation to an uncertain environment. This study paves the way for directly utilizing the fluctuating physics inherent in ring lasers, or integrated photonics technologies in general, for achieving or accelerating intelligent functionality.

Introduction

In the age of artificial intelligence, the requirements for efficient and intelligent processing of massive amounts of data are continuously increasing. Present technologies to accommodate these demands rely on digital electronics; however, hardware scaling in electronics, as foreseen by Moore’s law, is predicted to be unsustainable^1,2. As a consequence, studies on novel computing principles and architectures beyond the present Turing–von-Neumann computing paradigm^2,3 are gaining importance. These include neuromorphic computing^4,5,6, photonic deep learning^7,8, reservoir computing^9,10, molecular computing¹¹, and quantum and coherent Ising machines^12,13, where the underlying complexity and fluctuations in natural systems have been used for cognitive processing, prediction, and solving large-scale combinatorial optimization problems.

From the viewpoint of information processing, most of the above-mentioned studies have focused on supervised learning or optimization. Reinforcement learning is another emergent and important branch of machine learning¹⁴, where the utilization of physical processes can enhance or accelerate its performance^15,16,17,18.

As a foundation of reinforcement learning, decision making plays a key role in engineering applications such as cognitive wireless communication^19,20, online advertisements²¹, and Monte-Carlo searches²². Herein, the decision-making problems under study are called multi-armed bandit (MAB) problems²³; the goal is to maximize the total reward from multiple slot machines whose reward probabilities are unknown. A key point of the MAB problems is to resolve the exploration-exploitation dilemma inherent in decision making under uncertainty: sufficient exploratory actions may identify the best slot machine, but they may be accompanied by a significant amount of losses. In contrast, insufficient exploration may result in missing the best machine.

Recently, we have experimentally revealed that optical fluctuation dynamics can be used for exploring and making an optimal decision in the MAB problems^24,25,26,27. In particular, it has been found that complex temporal waveforms generated from a chaotic laser are useful for making decisions at a fast rate in the gigahertz regime²⁶. Previous studies suggest the potential of using complex laser dynamics with ultra-wide bandwidth for fast decision making. However, important issues remain open, ranging from novel fundamental principles and system architectures to device implementations for photonic decision making. For instance, the former studies using laser chaos^26,27 only exploit chaotic waveforms as correlated random numbers for a decision-making software algorithm; the physical dynamics itself was not directly engineered, even though a variety of dynamical features are inherent in laser systems²⁸. Furthermore, the previous studies have used a long fiber-optic delay line to generate chaotic waveforms^26,27; such use could lead to impractically large systems, inhibit stable operation, and prevent practical deployment.

We here propose a compact (<5 mm² area), on-chip photonic decision maker based on a ring laser structure. Unlike previous studies^26,27, the laser structure can generate fast, complex, but controllable dynamics at a chip scale, without a long delay line. The origin of the dynamics is a spontaneous switching phenomenon, i.e., noise-induced mode-hopping^29,30; the phenomenon is used for exploring an optimal solution under uncertainty. We demonstrate that optimal decision-making is efficiently achieved by opto-electronically controlling the spontaneous-switching dynamics.

Principle of Ring-Laser-Based Decision-Making

Ring laser dynamics and device structure

The device structure used for the decision maker is shown in Fig. 1a. A ring laser is coupled to adjacent waveguides that are integrated on the same chip as a GaAs/AlGaAs single quantum well structure. The resonator of the ring laser supports clockwise (CW) and counter-clockwise (CCW) propagating waves, and can exhibit various operating regimes, such as bidirectional operation and bistability, depending on the pump current^31,32. Spontaneous switching between the CW and CCW modes is an interesting dynamic that appears in the transition from the stable bidirectional regime to the bistable regime. Spontaneous switching has been regarded as an obstacle for deterministic optical switching applications^33,34,35. In this work, by contrast, it is deliberately exploited for decision making with feedback control of the CW and CCW modes, as discussed later.

The two waveguides with contact electrodes (denoted by PD_i and BC_i, $i=1,2$, as shown in Fig. 1a) are used for independent input/output control of the two modes in the ring laser: PD₁ and PD₂ are used as photodetectors to monitor the intensities of the CW and CCW modes, whereas BC₁ and BC₂ with current injection are used for introducing an asymmetry and changing the dynamics of the CW and CCW modes, respectively³⁰. (See Methods section for details.) We note that a similar optoelectronic control method has been used for deterministic optical switching³¹ and random number generation³⁶. However, unlike the previous studies, we use this method for changing the statistical characteristics of the spontaneous switching dynamics, as demonstrated later in detail.

Principle of decision-making

Here, we consider a two-armed bandit (TAB) problem, i.e., the task is to select the machine with the higher reward probability among two machines, denoted by SM₁ and SM₂ (Fig. 1b). We examine a TAB problem, the simplest MAB problem, so that we can validate the principle of the first ring-laser-based decision making. Meanwhile, the scalability of photonic decision making has been studied in the literature^25,27, and those approaches could be applied to ring-laser-based device architectures. Our decision-making method is based on the tug-of-war (TOW) model, which exhibits highly efficient decision making compared to conventional algorithms^15,16. Based on the model principle, we solve the TAB problem by repeating the following four steps:

(i) Signal detection: The intensity levels of the CW and CCW outputs, denoted by I_CW and I_CCW, are detected by photodetectors PD₁ and PD₂, respectively.
(ii) Decision of the machine selection: If I_CW is larger than I_CCW, the decision is to select SM₁. Otherwise, the decision is to choose SM₂.
(iii) Playing the selected machine.
(iv) Learning and feedback: If a reward is provided by playing SM₁ or if a reward is not provided by playing SM₂, the current (or voltage) applied to BC₁ is increased to facilitate lasing in the CW mode. Consequently, the probability of selecting SM₁ slightly increases in the next decision. On the other hand, if a reward is provided by playing SM₂ or if a reward is not provided by playing SM₁, the current (or voltage) applied to BC₂ is increased so that CCW lasing is facilitated, leading to a slight increase of the probability of choosing SM₂ in the next step.

Repeating steps (i)–(iv), we can finally choose the best slot machine.

As described in step (iv) above, an important point for the decision making is how to change currents J₁ and J₂ to activate controllers BC₁ and BC₂, respectively. In this study, we control J₁ and J₂ by the following rules in which a dimensionless, time-dependent control parameter C(t) is introduced:

$$\begin{array}{lll}{J}_{1}=KC(t), & {J}_{2}=0, & {\rm{if}}\,C(t)\ge 0,\\ {J}_{1}=0, & {J}_{2}=-\,KC(t), & {\rm{if}}\,C(t)\le 0,\end{array}$$

(1)

where K is a gain parameter. If $C\ge 0$ at the t-th play, the current ${J}_{1}=KC$ (mA) is injected into controller BC₁, whereas ${J}_{2}=-\,KC$ is injected into BC₂ if $C < 0$. The value of C(t) is updated in accordance with the results of slot machine playing as follows:

$$C(t+1)=\alpha C(t)+{\rm{\Delta }}C,$$

(2)

and

$${\rm{\Delta }}C=\begin{cases}+{\rm{\Delta }} & {\rm{if}}\,{\rm{SM}}_{1}\,{\rm{wins}}\\ -{\rm{\Delta }} & {\rm{if}}\,{\rm{SM}}_{2}\,{\rm{wins}}\\ -{\rm{\Omega }}{\rm{\Delta }} & {\rm{if}}\,{\rm{SM}}_{1}\,{\rm{fails}}\\ +{\rm{\Omega }}{\rm{\Delta }} & {\rm{if}}\,{\rm{SM}}_{2}\,{\rm{fails}},\end{cases}$$

(3)

where $\alpha \in [0,1]$ is the memory parameter (typically, ≈0.99–0.999)³⁷, and Δ is an incremental parameter (${\rm{\Delta }}=1$ in this study). ${\rm{\Omega }}$ in Eq. (3) is determined based on the estimated reward probability ${\hat{P}}_{i}$ for SM_i ($i=1,2$) from the history of the betting results. ${\hat{P}}_{i}$ is given by L_i/S_i, where S_i is the total number of times SM_i has been played and L_i is the number of wins obtained from SM_i. ${\rm{\Omega }}$ is then given as

$${\rm{\Omega }}=\frac{{\hat{P}}_{1}+{\hat{P}}_{2}}{2-({\hat{P}}_{1}+{\hat{P}}_{2})}.$$

(4)

The details of the derivation of Eq. (4) are given in ref.¹⁵.
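The update rules of Eqs. (1)–(4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, and two choices are assumptions that the text does not specify, namely the neutral estimate of 0.5 for a machine that has not yet been played and the small guard against a zero denominator in Eq. (4).

```python
def omega(wins, plays):
    """Eq. (4): Omega from the estimated reward probabilities P_hat_i = L_i / S_i."""
    # Assumption: a machine that has never been played gets the neutral estimate 0.5.
    p_hat = [w / s if s > 0 else 0.5 for w, s in zip(wins, plays)]
    total = p_hat[0] + p_hat[1]
    # Assumption: guard against a zero denominator when both estimates equal 1.
    return total / max(2.0 - total, 1e-9)


def update_control(c, machine, rewarded, om, alpha=0.99, delta=1.0):
    """Eqs. (2)-(3): return C(t+1) = alpha * C(t) + dC, where machine is 1 or 2."""
    if rewarded:
        d_c = +delta if machine == 1 else -delta
    else:
        d_c = -om * delta if machine == 1 else +om * delta
    return alpha * c + d_c


def currents(c, gain=1.0):
    """Eq. (1): map the control parameter C(t) to bias currents (J1, J2) in mA."""
    return (gain * c, 0.0) if c >= 0 else (0.0, -gain * c)
```

For example, with estimates of 0.7 and 0.3 the value of Ω is 1, so a win and a loss shift C by the same amount Δ in opposite directions.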

Results

Optoelectronic control of spontaneous switching dynamics

In our ring laser, the spontaneous switching phenomenon used for the above decision-making method appears when the pump current J_p exceeds ~1.3 times the laser threshold current J_th. Figure 2a shows examples of the switching dynamics, where the CW and CCW intensities stochastically change due to internal laser noise. For convenience, we hereafter refer to the state of ${I}_{CW(CCW)} > {I}_{CCW(CW)}$ as the CW (CCW) mode. A statistical analysis reveals that the mode switching is characterized by a characteristic time ${\tau }_{c}$ ≈ 43 ns; on a timescale longer than ${\tau }_{c}$, the switching process can be treated as a Poisson (random) process, and the duration time in the CW (CCW) mode obeys an exponential distribution (see Supplementary Fig. A1 for details). We refer to ${\tau }_{c}$ as the correlation time of the switching process. When current J₁ to BC₁ increases with J₂ = 0, the duration time in the CW mode increases [Fig. 2a(i,ii)]. In particular, we found that for ${J}_{1} > 20\,{\rm{mA}}$, the duration time diverges, and a stable CW mode operation is achieved [Fig. 2a(iii)]. Conversely, increasing J₂ leads to an increase of the duration time in the CCW mode [Fig. 2a(iv,v)], and a stable CCW mode operation is achieved for J₂ > 25 mA [Fig. 2a(vi)].

On-chip decision making: proof-of-concept demonstration

We conducted decision-making experiments based on the controllable dynamics in the ring laser by repeating steps (i)–(iv) described in the previous section. In the experimental setup shown in Fig. 1b, the two machines SM₁ and SM₂ were emulated in a computer with the reward probabilities of (P₁, P₂) = (0.7, 0.3). The gain K, step Δ, and memory parameter α were set to 1, 1, and 0.99, respectively. A machine is selected and played once, and the reward dispensed by either machine is assumed to be 1. The goal of the experiment is to confirm whether the ring-laser-based decision maker selects SM₁ (rather than SM₂) since SM₁ has a higher reward probability (P₁ > P₂). We assume the situation of zero prior knowledge, where the sum of the two hit probabilities is unknown, unlike ref.²⁶.

The experimental results on the decision-making process are displayed in Fig. 3. At first, I_CW and I_CCW switch randomly while the number of plays t < 100 [Fig. 3a(i)], reflecting the exploration for the best machine. The accumulated knowledge is used for estimating the reward probabilities and setting the ${\rm{\Omega }}$-value, and then the C-value is appropriately updated [Fig. 3a(ii)]. The updated C-value affects the dynamics, and the dynamical state changes from the switching mode to the CW mode. Consequently, the best machine (SM₁ in this case) is selected. We repeated the decision-making experiment ${n}_{T}=200$ times and evaluated the correct decision rate (CDR), which is defined as the fraction of the n_T trials in which the slot machine with the higher reward probability is selected at the t-th play²⁴. As shown in Fig. 3b, the CDR monotonically increases and approaches 1, suggesting the achievement of correct decision making.
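The CDR defined above can be illustrated with a purely numerical stand-in for the experiment. The sketch below is not a model of the ring laser: the measured P_CW(C) is replaced by a hypothetical logistic curve (the `width` parameter is an arbitrary assumption), unplayed machines are assigned an estimate of 0.5, and all function names are illustrative. It only shows how steps (i)–(iv) and the averaging over n_T trials fit together.

```python
import math
import random


def p_cw(c, width=5.0):
    """Hypothetical stand-in for the measured P_CW(C): a logistic curve."""
    x = max(-50.0, min(50.0, c / width))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-x))


def run_trial(p_reward, n_plays, rng, alpha=0.99, delta=1.0):
    """One run of steps (i)-(iv); returns 1/0 flags (was the best machine chosen?)."""
    c = 0.0
    wins, plays = [0, 0], [0, 0]
    best = 0 if p_reward[0] >= p_reward[1] else 1
    correct = []
    for _ in range(n_plays):
        # Steps (i)-(ii): CW state selects SM1 (index 0), CCW selects SM2 (index 1).
        m = 0 if rng.random() < p_cw(c) else 1
        # Step (iii): play the selected machine.
        rewarded = rng.random() < p_reward[m]
        plays[m] += 1
        wins[m] += int(rewarded)
        # Step (iv): Eq. (4) for Omega, then Eqs. (2)-(3) for C.
        p_hat = [w / s if s else 0.5 for w, s in zip(wins, plays)]
        total = p_hat[0] + p_hat[1]
        om = total / max(2.0 - total, 1e-9)
        sign = 1.0 if m == 0 else -1.0
        d_c = sign * delta if rewarded else -sign * om * delta
        c = alpha * c + d_c
        correct.append(int(m == best))
    return correct


def correct_decision_rate(p_reward, n_plays=300, n_trials=200, seed=0):
    """CDR(t): fraction of trials choosing the better machine at the t-th play."""
    rng = random.Random(seed)
    totals = [0] * n_plays
    for _ in range(n_trials):
        for t, flag in enumerate(run_trial(p_reward, n_plays, rng)):
            totals[t] += flag
    return [x / n_trials for x in totals]
```

Calling `correct_decision_rate((0.7, 0.3))` returns one CDR value per play; how fast it rises depends on the assumed shape of `p_cw`, which is the point taken up in the Discussion.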

We also conducted similar decision-making experiments with respect to different reward probabilities and parameters; we found that with appropriately tuned parameters (K and Δ), the decision-making performance could be comparable to that of existing decision-making algorithms such as a modified softmax¹⁶ and upper confidence bound 1-tuned (UCB1-tuned)^38,39. As shown in Fig. 4, the CDR of the ring-laser-based method can exceed those of the other methods in some cases.

Discussion

Decision-making strategy and its control

In our decision-making method, the strategy for making good decisions is characterized by the probability of inducing the CW mode as a function of the control parameter C(t), denoted by P_CW(C). As observed in Fig. 2b, P_CW(C) of the ring laser has a plateau region in the range of around −21 ≤ C ≤ 12, where P_CW(C) changes only gradually as C is varied. The plateau region plays a role in the exploration needed to estimate the reward probability (and hence an appropriate ${\rm{\Omega }}$-value), and can lead to a correct decision after many slot plays, as demonstrated in Fig. 4; however, it may also lead to a slow convergence of the CDR. A better alternative strategy (i.e., the design of P_CW(C)) satisfying both fast adaptation speed and decision accuracy can theoretically be estimated in the case when we can obtain prior knowledge on the sum of the reward probabilities, P₁ + P₂, such as when either of two events inevitably occurs with the probabilities P₁ and ${P}_{2}=1-{P}_{1}$.

Let us here assume that the value of P₁ + P₂ is known a priori and that ${\rm{\Omega }}$ in Eq. (4) is a constant. For simplicity, we consider α = 1 and assume that the mode switching is random. Under these assumptions, we can treat the time evolution of C as a random walk. The random walk model gives an analytical expression for the CDR and suggests that a fast and correct decision is made when the probability distribution P_CW(C) is close to 1 for C > 0 and 0 for C < 0, and varies steeply from 0 to 1 near C = 0. (See Sec. 2 of Supplementary Information.)
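As a small worked example of the random-walk picture, the mean step of C per play can be computed directly from Eqs. (3) and (4). The sketch below is not the analysis in the Supplementary Information; it assumes α = 1, that Ω is evaluated at the true reward probabilities, and that SM₁ is chosen with a fixed probability `p_cw`. The function name is hypothetical.

```python
def expected_step(p_cw, p1, p2, delta=1.0):
    """Mean increment E[dC] per play from Eq. (3), with Omega from Eq. (4)
    evaluated at the true reward probabilities (alpha = 1 assumed)."""
    total = p1 + p2
    om = total / (2.0 - total)
    # SM1 is played with probability p_cw, SM2 with probability 1 - p_cw.
    step_sm1 = p1 * delta - (1.0 - p1) * om * delta
    step_sm2 = -p2 * delta + (1.0 - p2) * om * delta
    return p_cw * step_sm1 + (1.0 - p_cw) * step_sm2
```

Substituting Eq. (4) shows that both branches reduce to Δ(P₁ − P₂)/(2 − P₁ − P₂), so under these assumptions the mean drift of C points toward the better machine whatever the value of `p_cw`. The shape of P_CW(C) then determines how quickly a drifting C translates into actually selecting that machine, which is in line with the preference for a steep P_CW(C) stated above.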

In an actual experiment, such a P_CW(C) is effectively realized by modifying the relationship between the control parameter C and J₁₍₂₎ (Eq. 1) as follows:

$$\begin{array}{lll}{J}_{1}=0, & {J}_{2}=0, & {\rm{if}}\,C(t)=0,\\ {J}_{1}=KC(t)+{K}_{1}, & {J}_{2}=0, & {\rm{if}}\,C(t) > 0,\\ {J}_{1}=0, & {J}_{2}=-\,KC(t)+{K}_{2}, & {\rm{if}}\,C(t) < 0,\end{array}$$

(5)

where K₁ and K₂ are chosen such that the plateau region of P_CW(C) shown in Fig. 2b is reduced and the desirable P_CW(C) results. Figure 5a shows P_CW(C) with (K₁, K₂) = (0, 0), (5, 9) and (13, 17), depicted by Types I, II, and III, respectively. As predicted by the random walk model, CDR in Type III most quickly increases and the convergence value is higher than the other types, regardless of the reward probabilities P₁ and P₂ (Fig. 5b,c). Thus, we conclude that the decision-making performance can be enhanced by changing the intrinsic characteristics (P_CW(C)) of the physical devices with an appropriate mode-control.
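The modified mapping of Eq. (5) is simple enough to write out explicitly. The following sketch uses a hypothetical function name; the default offsets correspond to the Type III values (K₁, K₂) = (13, 17) quoted above, and currents are in mA.

```python
def currents_with_offset(c, gain=1.0, k1=13.0, k2=17.0):
    """Eq. (5): return (J1, J2) in mA for the control parameter c."""
    if c > 0:
        return (gain * c + k1, 0.0)
    if c < 0:
        return (0.0, -gain * c + k2)
    return (0.0, 0.0)
```

With these defaults, even the smallest positive C already injects more than 13 mA into BC₁, which is how the offsets skip most of the plateau region of P_CW(C). Setting `k1 = k2 = 0` gives the same currents as Eq. (1), i.e., Type I.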

Decision-making rate

The rate of decision making, i.e., the number of decisions per unit time, in principle depends on the sampling rate of the CW- and CCW-signal detection. Thus, fast decision making is possible by increasing the sampling rate; however, sampling too rapidly may degrade the accuracy of the decision making because nearly identical signal levels will be observed in consecutive samples, owing to the finite timescale of the ring laser dynamics. It is important to know how rapidly decisions can be made without degrading the performance. In order to address this question and gain insight into the effect of the switching dynamics on the decision-making performance, we numerically examine decision-making processes using standard ring laser model equations³². See Methods section for details of the modeling.

Figure 6a shows the evolution of the CDR for various values of the sampling rate 1/${\tau }_{sam}$ when (P₁, P₂) = (0.7, 0.3), where ${\tau }_{sam}$ is the sampling time interval of the signal detection. The CDRs at the 30th play are shown as a function of ${\tau }_{sam}$ in Fig. 6b. These numerical results clearly show that the decision-making performance (accuracy and adaptation) degrades when ${\tau }_{sam}$ is much shorter than the correlation time ${\tau }_{c}$ of the ring laser. Indeed, the autocorrelation of the switching signals sampled at ${\tau }_{sam}\ll {\tau }_{c}$ exhibits a positive value [see Supplementary Fig. A1(d)]. In decision making, the positive correlation may result in repetition of the same choice even when the choice is wrong. In contrast, when ${\tau }_{sam}\gg {\tau }_{c}$, the correlation becomes close to zero, which enables exploration without repeating wrong choices. Accordingly, the sampling time interval (i.e., the inverse of the decision-making rate) can be shortened down to roughly the correlation time without degrading the performance. In principle, the correlation time itself can be shortened, allowing faster decision making, by increasing the noise strength and thereby activating the mode-hopping phenomenon. In an actual experiment, this can be achieved by coupling the laser to an external amplified spontaneous emission noise source; the experimental verification will be an interesting future study.
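The effect of the sampling interval can be illustrated with a toy two-state process. The sketch below is not the rate-equation model used for Fig. 6: it assumes, purely for illustration, that the correlation between consecutive mode readings decays as exp(−τ_sam/τ_c) and that the CW and CCW states are equally likely. All function names are hypothetical.

```python
import math
import random


def lag1_correlation(tau_sam, tau_c):
    """Assumed model: correlation of consecutive samples is exp(-tau_sam / tau_c)."""
    return math.exp(-tau_sam / tau_c)


def repeat_probability(tau_sam, tau_c):
    """Probability that two consecutive samples show the same mode for a
    symmetric two-state process with the correlation assumed above."""
    return 0.5 * (1.0 + lag1_correlation(tau_sam, tau_c))


def sample_modes(n, tau_sam, tau_c, seed=0):
    """Draw n consecutive mode readings (+1 = CW, -1 = CCW)."""
    rng = random.Random(seed)
    stay = repeat_probability(tau_sam, tau_c)
    state = 1 if rng.random() < 0.5 else -1
    out = []
    for _ in range(n):
        out.append(state)
        if rng.random() >= stay:
            state = -state
    return out
```

Under this assumption, sampling every 4.3 ns with τ_c = 43 ns repeats the previous reading in roughly 95% of cases, so a wrong choice tends to be repeated, whereas sampling every 430 ns gives a repeat probability very close to 0.5, i.e., nearly independent decisions.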

Summary

In this study, we proposed and experimentally demonstrated on-chip photonic decision making by an integrated ring laser. Ring lasers generate statistical characteristics regarding the CW and CCW lasing, which are optoelectronically controllable; we directly utilize such inherent spontaneous dynamics of ring lasers for decision-making functionalities. Correct decision making was successfully demonstrated with appropriate optoelectronic control of the dynamics, and it is found that the performance can be enhanced by changing the decision-making strategy with the statistical characteristics (P_CW(C)). These results would open novel research perspectives of controlling complex dynamics based on environmental changes.

One interesting and important future study is to increase the decision-making rate by using faster and more complex switching dynamics. In addition to the above-mentioned method on increasing the noise strength, the use of the delayed feedback structure will be useful. Interestingly, semiconductor ring lasers can exhibit chaotic switching in the GHz regimes by delayed feedback even with a short time delay^40,41. Combination of noise-induced switching with delayed feedback instability indicates a promising research direction.

As for the ring laser structure, we emphasize that in addition to miniaturization, it would be beneficial for the all-optical realization of decision-making devices because all photonic components required for decision making can be monolithically integrated on a chip. Instead of the optoelectronic control methods employed in the present study, it would be interesting to use an optical injection method because ring lasers subjected to optical injection enable low-power and ultrafast switching at picosecond time scales^33,34,35.

Another interesting and important future study is to tackle larger-scale MAB problems. MAB problems can be solved based on a hierarchical TOW principle^25,27. The decision-making based on the hierarchical principle can be achieved by using a number of independent two-choice decision-makers (for two-armed bandit problems) or using a time-division multiplexing scheme²⁷. Compact ring lasers could offer a good experimental platform for implementing the hierarchical principle and addressing the MAB problems.

We believe that the combination of photonic integration technologies and competitive fluctuating dynamics, as demonstrated by the proposed ring laser, will shed light on a way toward novel photonic intelligent computing paradigms.

Methods

Device structure and operating regime

The ring laser device used in this study was fabricated in a graded-index separate-confinement-heterostructure (GRIN-SCH) single-quantum-well GaAs/Al_xGa_1−xAs structure, the emission wavelength of which is designed to be 850 nm. The fabricated laser device was thermally controlled by a heat sink with an accuracy of 0.01 °C. The ring radius is 1 mm, and the waveguide width is 2 μm. In the actual device, multiple waveguides with independent electrical contacts are coupled to the ring at an angle to the cleaved facet. We used the waveguides with contacts, PD_i and BC_i ($i=1,2$), as shown in Fig. 1a. The CW and CCW intensity signals are detected with PD₁ and PD₂ in the waveguide, respectively, and sent to a digital oscilloscope (Tektronix TDS7154B, bandwidth 1.5 GHz, 20 GSample/s) via the bonding wires attached to PD₁ and PD₂. Bias contacts BC₁ and BC₂ were used for the mode control inside the ring laser. Sending current to BC₁ and BC₂ reduces the absorption loss of the waveguide. Thus, the light coupled from the CCW(CW) mode in the ring to the waveguide is back-reflected at the BC₁₍₂₎-side end of the waveguide and re-coupled to the ring in the CW(CCW) direction. In addition, BC₁₍₂₎ can enhance the spontaneous emission noise coupled to the CW(CCW) mode and consequently facilitate the laser operation in the CW(CCW) mode^31,36.

When ${J}_{1}={J}_{2}=0\,{\rm{mA}}$, the threshold current J_th of the ring laser used in the experiment was estimated to be approximately 210 mA at 25 °C. The large threshold may partly be attributed to the non-optimal etching depth of the ring waveguide³². For ${J}_{p}/{J}_{th} < 1.3$, where J_p is the pump current, the ring laser operated in a bidirectional state of the CW and CCW modes. For larger J_p values, a transition to the spontaneous switching regime occurred.

Intensity adjustment

In the experiment, the PD couplings to the CW and CCW modes are not exactly equal to each other due to imperfect device fabrication. In order to reduce the effect of the asymmetry of the PD couplings and appropriately evaluate the decision-making performance, the CW and CCW intensities, I_CW and I_CCW, were adjusted by adding constant biases so that the occurrence probability of each mode is calibrated to approximately 0.5 when ${J}_{1}={J}_{2}=0\,{\rm{mA}}$. This approach allows easy tuning of both intensities; we also note that there is a simpler alternative, which is to measure only one of I_CW and I_CCW and adjust the switching probability to 0.5 by the bias currents J₁ and J₂, without the intensity biases.
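One way to picture this calibration is as a search for a constant offset that balances the two mode occurrences. The sketch below is an assumption about how such an offset could be computed from recorded samples, not the procedure used in the experiment: it applies a single bias to I_CW only (the text mentions biases on both intensities) and takes it as the median of I_CCW − I_CW. The function names are hypothetical.

```python
import statistics


def calibrate_bias(i_cw, i_ccw):
    """Constant b such that i_cw + b > i_ccw holds for about half of the samples."""
    return statistics.median(b - a for a, b in zip(i_cw, i_ccw))


def cw_fraction(i_cw, i_ccw, bias=0.0):
    """Fraction of samples classified as CW mode after adding bias to I_CW."""
    hits = sum(1 for a, b in zip(i_cw, i_ccw) if a + bias > b)
    return hits / len(i_cw)
```

After calibration on samples recorded at J₁ = J₂ = 0 mA, `cw_fraction(i_cw, i_ccw, bias)` should be close to 0.5, so that any later deviation from 0.5 reflects the control currents rather than unequal detector coupling.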

Decision-making experiment

First, BC₁ and BC₂ were connected to a standard current source. Discrete-valued electrical currents were applied to BC₁ or BC₂. Then, the CW and CCW optical intensity signals for different values of J₁ and J₂ were recorded by a digital oscilloscope. In the decision-making experiment, the slot machines were numerically simulated in the embedded signal processing unit in the oscilloscope using pseudorandom numbers. The decision is made immediately based on the sampled intensities. The controllers BC₁ and BC₂ were also connected to a two-channel function generator (Tektronix AFG3152C), which reconfigures the oscillation dynamics of the ring laser in an on-line, real-time manner.

Rate-equation model for semiconductor ring laser

The numerical simulation was conducted by using a set of dimensionless semiclassical equations for the two slowly varying complex amplitudes of CW and CCW waves, E₁ and E₂³².

$$\frac{d{E}_{1,2}}{dt}=(1+i\tilde{\alpha })\,[{\xi }_{1,2}N-1]\,{E}_{1,2}-{k}_{1,2}{E}_{2,1}+{\eta }_{1,2}(t),$$

(6)

$${\xi }_{1,2}=1-s|{E}_{1,2}{|}^{2}-c|{E}_{2,1}{|}^{2},$$

(7)

where $\tilde{\alpha }$ accounts for phase-amplitude coupling, s and c are the self- and cross-saturation coefficients, and $k_{1,2}$ represent the complex backscattering coefficients. We model internal optical noise as complex Gaussian noise satisfying $\langle {\eta }_{i}\rangle =0$ and $\langle {\eta }_{i}(t){\eta }_{j}^{\ast }(t^{\prime} )\rangle =2D{\delta }_{ij}\delta (t-t^{\prime} )$ ($i,j=1,2$). $\langle \cdot \rangle $ represents the ensemble average, and D represents the noise strength. The carrier density N obeys the following equation:

$$\frac{dN}{dt}=2\gamma \,[\mu -N\,(1-{\xi }_{1}|{E}_{1}{|}^{2}-{\xi }_{2}|{E}_{2}{|}^{2})],$$

(8)

where μ is the dimensionless pumping power ($\mu =1$ at the laser threshold). In the above equations, t is dimensionless time rescaled by the photon lifetime ${\tau }_{p}$, and γ is the ratio of ${\tau }_{p}$ to the carrier lifetime ${\tau }_{s}$.

In Eq. (6), the asymmetric coupling caused by activating BC₁ and BC₂ is simply modeled as an asymmetric backreflection effect such that ${k}_{1}={\beta }_{1}{k}_{b}$ and ${k}_{2}={\beta }_{2}{k}_{b}$, where k_b denotes the backreflection coefficient when $C=0$, and β_1,2 denotes a dimensionless asymmetry parameter, depending on C as follows:

$$\begin{array}{lll}{\beta }_{1}=1+kC(t), & {\beta }_{2}=1, & {\rm{if}}\,C(t)\ge 0,\\ {\beta }_{1}=1, & {\beta }_{2}=1-kC(t), & {\rm{if}}\,C(t)\le 0,\end{array}$$

(9)

where C(t) is updated by Eq. (2). This is the simplest model of the asymmetric backscattering, although a real asymmetry may be introduced in a more complex way in the actual experiment. We confirmed that the control of spontaneous switching can be achieved regardless of the details of the asymmetry model. A detailed investigation using a more realistic model is left for future work.

In this study, we set the parameters as follows: $\tilde{\alpha }=3.5$, $2s=c=0.006$, ${k}_{b}=0.004+i0.001$, $k=0.025$, $D=5\times {10}^{-5}$, $\mu =2.0$, ${\tau }_{p}=10\,{\rm{ps}}$, $\gamma =0.01$. With these parameter values, we obtained stochastic switching dynamics with the correlation time ${\tau }_{c}\approx 13\,{\rm{ns}}$ when $C=0$. In the decision-making simulation, we assume that the slot machines provide a reward without any time delay and use Eqs. (2)–(4) and (6)–(9).

References

Vetter, J. S., DeBenedictis, E. P. & Conte, T. M. Architectures for the Post-Moore Era. IEEE Micro. 37(4), 6–8 (2017).
Article Google Scholar
Nahmias, M. A., Shastri, B. J., Tait, A. N., Ferreira de Lima, T. & Prucnal, P. R. Neuromorphic Photonics. Optics & Photonics News 29(1), 34–41 (2018).
Article ADS Google Scholar
Peper, F. The End of Moore’s Law: Opportunities for Natural Computing? New Gener. Comput. 35, 253 (2017).
Article Google Scholar
Merolla, P. A. et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345(6197), 668–673 (2014).
Article ADS CAS Google Scholar
Esser, S. K. et al. Convolutional networks for fast, energy efficient neuromorphic computing. Proc. Natl Acad. Sci. USA 113, 11441–11446 (2016).
Article CAS Google Scholar
Tait, A. N. et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 7, 7430 (2017).
Article ADS Google Scholar
Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).
Article ADS CAS Google Scholar
Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361(6406), 1004–1008 (2018).
Article ADS MathSciNet CAS Google Scholar
Appeltant, L. et al. Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468 (2011).
Article ADS CAS Google Scholar
Van der Sande, G. et al. Advances in photonic reservoir computing. Nanophotonics 6, 561 (2017).
Google Scholar
Bandyopadhyay, A., Pati, R., Sahu, S., Peper, F. & Fujita, D. Massively parallel computing on an organic molecular layer. Nat. Phys. 6(5), 369–375 (2010).
Article CAS Google Scholar
Berkley, A. J. et al. Quantum annealing with manufactured spins. Nature 473, 194–198 (2011).
Article ADS Google Scholar
Inagaki, T. et al. A coherent Ising machine for 2000-node optimization problems. Science 354(6312), 603–606 (2016).
Article ADS CAS Google Scholar
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An introduction. (The MIT Press, Massachusetts, 1998).
MATH Google Scholar
Kim, S.-J. M., Aono & Nameda, E. Efficient decision-making by volume-conserving physical object. New. J. Phys. 17, 083023 (2015).
Article ADS Google Scholar
Kim., S.-J., Aono, M. & Hara, M. Tug-of-war model for the two-bandit problem: Nonlocally-correlated parallel exploration via resource conservation. Biosystems 101(1), 29–36 (2010).
Article Google Scholar
Hu, W., Wu, K., Shum, P. P., Zheludev, N. I. & Soci, C. All-optical implementation of the ant colony optimization algorithm. Sci. Rep. 6, 26283 (2016).
Article ADS CAS Google Scholar
Alonzo, M. et al. All-Optical Reinforcement Learning In Solitonic X-Junctions. Sci. Rep. 8, 5716 (2018).
Article ADS CAS Google Scholar
Jouini, W., Ernst, D., Moy, C. & Palicot, J. Multi-armed bandit based policies for cognitive radio’s decision making issues. 2009 3rd International Conference on Signals, Circuits and Systems (SCS), Medenine, pp. 1–6, https://doi.org/10.1109/ICSCS.2009.5412697 (2009).
Lai, L., El Gamal, H., Jiang, H. & Poor, H. V. Cognitive Medium Access: Exploration, Exploitation, and Competition. IEEE Trans. on Mob. Comput. 10(2), 239–253 (2011).
Article Google Scholar
Agarwal, D., Chen, B. & Elango, P. Explore/Exploit Schemes for Web Content Optimization. 2009 Ninth IEEE International Conference on Data Mining, pp. 1–10, https://doi.org/10.1109/ICDM.2009.52 (2009).
Kocsis, L. & Szepesvári, C. Bandit Based Monte-Carlo Planning. In: Furnkranz, J., Scheffer, T. & Spiliopoulou, M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg, https://doi.org/10.1007/11871842_29.
Google Scholar
Mahajan, A. & Teneketzis, D. Multi-Armed Bandit Problems. In: Hero, A. O., Castanon, D. A., Cochran, D. & Kastella, K. (eds) Foundations and Applications of Sensor Management. Springer, Boston, MA.
Naruse, M. et al. Single-photon decision maker. Sci. Rep. 5, 13253 (2015).
Article ADS CAS Google Scholar
Naruse, M. et al. Single Photon in Hierarchical Architecture for Physical Decision Making: Photon Intelligence. ACS Photonics 3(12), 2505–2514 (2016).
Naruse, M., Terashima, Y., Uchida, A. & Kim, S.-J. Ultrafast photonic reinforcement learning based on laser chaos. Sci. Rep. 7, 8772 (2017).
Naruse, M. et al. Scalable photonic reinforcement learning by time-division multiplexing of laser chaos. Sci. Rep. 8, 10890 (2018).
Uchida, A. Optical Communication with Chaotic Lasers: Applications of Nonlinear Dynamics and Synchronization (Wiley-VCH, 2012).
Beri, S. et al. Topological Insight into the Non-Arrhenius Mode Hopping of Semiconductor Ring Lasers. Phys. Rev. Lett. 101, 093903 (2008).
Gelens, L. et al. Exploring Multistability in Semiconductor Ring Lasers: Theory and Experiment. Phys. Rev. Lett. 102, 193904 (2009).
Sorel, M., Laybourn, P. J. R., Giuliani, G. & Donati, S. Unidirectional bistability in semiconductor waveguide ring lasers. Appl. Phys. Lett. 80, 3051–3053 (2002).
Sorel, M. et al. Operating regimes of GaAs-AlGaAs semiconductor ring lasers. IEEE J. Quantum Electron. 39, 1187–1195 (2003).
Pérez, T., Scirè, A., Van der Sande, G., Colet, P. & Mirasso, C. R. Bistability and all-optical switching in semiconductor ring lasers. Opt. Express 15, 12941–12948 (2007).
Hill, M. T. et al. A fast low-power optical memory based on coupled micro-ring lasers. Nature 432, 206–209 (2004).
Liu, L. et al. An ultra-small, low-power, all-optical flip-flop memory on a silicon chip. Nat. Photonics 4, 182–187 (2010).
Sunada, S. et al. Random optical pulse generation with bistable semiconductor ring lasers. Opt. Express 19, 7439 (2011).
Mihana, T., Terashima, Y., Naruse, M., Kim, S.-J. & Uchida, A. Memory Effect on Adaptive Decision Making with a Chaotic Semiconductor Laser. Complexity 2018, 4318127 (2018).
Auer, P., Cesa-Bianchi, N. & Fischer, P. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 235 (2002).
Kuleshov, V. & Precup, D. Algorithms for multi-armed bandit problems. Journal of Machine Learning Research 1, 1–48, arXiv:1402.6028 (2014).
Ermakov, I., Van der Sande, G. & Danckaert, J. Semiconductor ring laser subject to delayed optical feedback: Bifurcations and stability. Commun. Nonlinear Sci. Numer. Simul. 17, 4767 (2012).
Sunada, S. et al. Chaos laser chips with delayed optical feedback using a passive ring waveguide. Opt. Express 19(7), 5713 (2011).

Acknowledgements

This work was supported in part by JST CREST Grant Number JPMJCR17N2, Japan. S.S. acknowledges the support from the Murata Science Foundation and JSPS KAKENHI Grant No. 16K04974. M.N. and A.U. acknowledge JSPS KAKENHI Grant No. JP17H01277. A.U. acknowledges the support from JSPS KAKENHI Grant No. JP16H0387.

Author information

Authors and Affiliations

Graduate School of Natural Science and Technology, Kanazawa University, Kakuma-machi, Kanazawa, Ishikawa, 920-1192, Japan
Ryutaro Homma, Satoshi Kochi, Tomoaki Niiyama & Satoshi Sunada
Faculty of Mechanical Engineering, Institute of Science and Engineering, Kanazawa University, Kakuma-machi, Kanazawa, Ishikawa, 920-1192, Japan
Tomoaki Niiyama & Satoshi Sunada
Department of Information and Computer Sciences, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama City, Saitama, 338-8570, Japan
Takatomo Mihana, Yusuke Mitsui, Kazutaka Kanno & Atsushi Uchida
Department of Information Physics and Computing, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
Makoto Naruse

Contributions

S.S., A.U. and M.N. directed the project and designed the decision-making principle. R.H., S.K., T.M., Y.M., K.K., A.U. and S.S. designed and performed ring laser experiments. R.H., S.K., T.N. and S.S. analyzed the data. S.S. conducted numerical evaluation of the performance. R.H., M.N., A.U. and S.S. wrote the paper, and all of the authors contributed to the preparation of the manuscript.

Corresponding author

Correspondence to Satoshi Sunada.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Homma, R., Kochi, S., Niiyama, T. et al. On-chip photonic decision maker using spontaneous mode switching in a ring laser. Sci Rep 9, 9429 (2019). https://doi.org/10.1038/s41598-019-45754-3

Received: 14 March 2019
Accepted: 13 June 2019
Published: 01 July 2019
DOI: https://doi.org/10.1038/s41598-019-45754-3
