Abstract
Efficient and accurate decision making is gaining importance with the rapid expansion of information and communication technologies, including artificial intelligence. Here, we propose and experimentally demonstrate an on-chip, integrated photonic decision maker based on a ring laser. The ring laser exhibits spontaneous switching between clockwise and counterclockwise oscillatory dynamics; we exploit this behavior to solve a multi-armed bandit problem. The spontaneous switching dynamics provide efficient exploration for finding the correct decision. Online decision making is experimentally demonstrated, including autonomous adaptation to an uncertain environment. This study paves the way for directly utilizing the fluctuating physics inherent in ring lasers, or integrated photonics technologies in general, for achieving or accelerating intelligent functionality.
Introduction
In the age of artificial intelligence, the requirements for efficient and intelligent processing of massive amounts of data are continuously increasing. Present technologies to accommodate these demands rely on digital electronics; however, hardware scaling in electronics, as foreseen by Moore’s law, is predicted to be unsustainable^{1,2}. As a consequence, studies on novel computing principles and architectures beyond the present Turing–von Neumann computing paradigm^{2,3} are gaining importance. These include neuromorphic computing^{4,5,6}, photonic deep learning^{7,8}, reservoir computing^{9,10}, molecular computing^{11}, and quantum and coherent Ising machines^{12,13}, where the underlying complexity and fluctuations in natural systems have been used for cognitive processing, prediction, and solving large-scale combinatorial optimization problems.
From the viewpoint of information processing, most of the above-mentioned studies have focused on supervised learning or optimization processing. Reinforcement learning is another emergent and important branch of machine learning^{14}, where utilization of physical processes can enhance or accelerate its performance^{15,16,17,18}.
As a foundation of reinforcement learning, decision making plays a key role in engineering applications such as cognitive wireless communication^{19,20}, online advertisements^{21}, and Monte Carlo searches^{22}. Herein, the decision-making problems under study are called multi-armed bandit (MAB) problems^{23}; the goal is to maximize the total reward from multiple slot machines whose reward probabilities are unknown. A key point of the MAB problems is to resolve the exploration–exploitation dilemma inherent in decision making under uncertainty: sufficient exploratory actions may reveal the best slot machine, but they may be accompanied by significant losses. In contrast, insufficient exploration may result in missing the best machine.
Recently, we have experimentally revealed that optical fluctuation dynamics can be used for exploring and making an optimal decision in the MAB problems^{24,25,26,27}. In particular, it has been found that complex temporal waveforms generated from a chaotic laser are useful for making decisions at a fast rate in the gigahertz regime^{26}. Previous studies suggest the potential of using complex laser dynamics with ultrawide bandwidth for fast decision making. However, important issues remain open, ranging from novel fundamental principles and system architectures to device implementations for photonic decision making. For instance, the former studies using laser chaos^{26,27} only exploit chaotic waveforms as correlated random numbers for a decision-making software algorithm; the physical dynamics themselves were not directly engineered, even though a variety of dynamical features are inherent in laser systems^{28}. Furthermore, the previous studies have used a long fiber-optic delay line to generate chaotic waveforms^{26,27}; such use could lead to impractically large systems, inhibit stable operation, and prevent practical deployment.
We here propose a compact (<5 mm^{2} area), on-chip photonic decision maker based on a ring laser structure. Unlike previous studies^{26,27}, the laser structure can generate fast, complex, but controllable dynamics at a chip scale, without a long delay line. The origin of the dynamics is a spontaneous switching phenomenon, i.e., noise-induced mode hopping^{29,30}; the phenomenon is used for exploring an optimal solution under uncertainty. We demonstrate that optimal decision making is efficiently achieved by optoelectronically controlling the spontaneous-switching dynamics.
Principle of Ring-Laser-Based Decision Making
Ring laser dynamics and device structure
The device structure used for the decision maker is shown in Fig. 1a. A ring laser is coupled to adjacent waveguides that are integrated on the same chip as a GaAs/AlGaAs single quantum well structure. The resonator of the ring laser supports clockwise (CW) and counterclockwise (CCW) propagating waves, and can exhibit various operating regimes, such as bidirectional operation and bistability, depending on the pump current^{31,32}. Spontaneous switching between the CW and CCW modes is an interesting dynamic that appears in the transition from the stable bidirectional regime to the bistable regime. Spontaneous switching has been regarded as an obstacle for deterministic optical switching applications^{33,34,35}. In this work, by contrast, it is exploited for decision making with feedback control of the CW and CCW modes, as discussed later.
The two waveguides with contact electrodes (denoted by PD_{i} and BC_{i}, \(i=1,2\), as shown in Fig. 1a) are used for independent input/output control of the two modes in the ring laser: PD_{1} and PD_{2} are used as the photodetectors to monitor the intensities of the CW and CCW modes, whereas BC_{1} and BC_{2} with current injections are used for introducing an asymmetry and changing the dynamics of the CW and CCW modes, respectively^{30}. (See Methods section for details.) We note that a similar optoelectronic control method has been used for deterministic optical switching^{31} and random number generation^{36}. However, unlike the previous studies, we use this method for changing the statistical characteristics of the spontaneous switching dynamics, as demonstrated later in detail.
Principle of decision making
Here, we consider a two-armed bandit (TAB) problem, i.e., the problem of selecting the machine with the higher reward probability among two machines, denoted by SM_{1} and SM_{2} (Fig. 1b). We examine a TAB problem, the simplest MAB problem, so that we can validate the principle of the first ring-laser-based decision making. Meanwhile, the scalability of photonic decision making has been studied in the literature^{25,27}, and the same approaches could be applied to ring-laser-based device architectures. Our decision-making method is based on the tug-of-war (TOW) model, which exhibits highly efficient decision making compared to conventional algorithms^{15,16}. Based on the model principle, we solve the TAB problem by repeating the following four steps:

(i) Signal detection: The intensity levels of the CW and CCW outputs, denoted by I_{CW} and I_{CCW}, are detected by photodetectors PD_{1} and PD_{2}, respectively.
(ii) Decision of the machine selection: If I_{CW} is larger than I_{CCW}, the decision is to select SM_{1}. Otherwise, the decision is to choose SM_{2}.
(iii) Playing the selected machine.
(iv) Learning and feedback: If a reward is provided by playing SM_{1} or if a reward is not provided by playing SM_{2}, the current (or voltage) applied to BC_{1} is increased to facilitate lasing in the CW mode. Consequently, the probability of selecting SM_{1} slightly increases in the next decision. On the other hand, if a reward is provided by playing SM_{2} or if a reward is not provided by playing SM_{1}, the current (or voltage) applied to BC_{2} is increased so that CCW lasing is facilitated, leading to a slight increase of the probability of choosing SM_{2} in the next step.
Repeating steps (i)–(iv), we can finally choose the best slot machine.
As described in step (iv) above, an important point for the decision making is how to change currents J_{1} and J_{2} to activate controllers BC_{1} and BC_{2}, respectively. In this study, we control J_{1} and J_{2} by the following rules in which a dimensionless, time-dependent control parameter C(t) is introduced:
where K is a gain parameter. If \(C\ge 0\) at the t-th play, the current \({J}_{1}=KC\) (mA) is injected to controller BC_{1}, whereas \({J}_{2}=-KC\) is injected to BC_{2} if \(C < 0\). The value of C(t) is updated in accordance with the results of slot machine playing as follows:
and
where \(\alpha \in [0,1]\) is the memory parameter (typically ≈0.99–0.999)^{37}, and Δ is an incremental parameter (\({\rm{\Delta }}=1\) in this study). \({\rm{\Omega }}\) in Eq. (3) is determined from the estimated reward probability \({\hat{P}}_{i}\) of SM_{i} (\(i=1,2\)), obtained from the history of the betting results. \({\hat{P}}_{i}\) is given by L_{i}/S_{i}, where S_{i} is the total number of times SM_{i} has been played and L_{i} is the number of wins obtained with SM_{i}. \({\rm{\Omega }}\) is then given as,
The details of the derivation of Eq. (4) are given in ref.^{15}.
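The display equations referred to as Eqs. (1)–(4) are not reproduced in the text above. The block below is a reconstruction that is consistent with the surrounding definitions, with step (iv), and with the TOW rule of refs.^{15,16}; it is a sketch, not a copy of the original equations. The increment symbol \(\delta C(t)\), the piecewise layout, and the zero current on the inactive controller are assumptions introduced here for illustration.

```latex
% Eq. (1), assumed form: only one controller is driven at a time
J_1 = \begin{cases} KC & (C \ge 0) \\ 0 & (C < 0) \end{cases},
\qquad
J_2 = \begin{cases} 0 & (C \ge 0) \\ -KC & (C < 0) \end{cases}

% Eqs. (2)-(3), assumed form: memory term plus a play-dependent increment
C(t+1) = \alpha\, C(t) + \delta C(t),
\qquad
\delta C(t) =
\begin{cases}
+\Delta & \text{SM}_1 \text{ played, reward} \\
-\Omega & \text{SM}_1 \text{ played, no reward} \\
-\Delta & \text{SM}_2 \text{ played, reward} \\
+\Omega & \text{SM}_2 \text{ played, no reward}
\end{cases}

% Eq. (4), as commonly written in the TOW literature
\Omega = \frac{\hat{P}_1 + \hat{P}_2}{2 - (\hat{P}_1 + \hat{P}_2)}
```

In this form a positive increment favors the CW mode (and hence SM_{1}), and a negative increment favors the CCW mode (SM_{2}), which matches the feedback described in step (iv).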
Results
Optoelectronic control of spontaneous switching dynamics
In our ring laser, the spontaneous switching phenomenon used for the above decision-making method appears when the pump current J_{p} exceeds ~1.3 times the laser threshold current J_{th}. Figure 2a shows examples of the switching dynamics, where the CW and CCW intensities change stochastically due to internal laser noise. For convenience, we hereafter refer to the state of \({I}_{CW(CCW)} > {I}_{CCW(CW)}\) as the CW (CCW) mode. A statistical analysis reveals that the mode switching is characterized by a characteristic time \({\tau }_{c}\) ≈ 43 ns; on timescales longer than \({\tau }_{c}\), the switching can be treated as a Poisson (random) process, and the duration time in the CW (CCW) mode obeys an exponential distribution (see Supplementary Fig. A1 for details). We refer to \({\tau }_{c}\) as the correlation time of the switching process. When current J_{1} to BC_{1} increases with J_{2} = 0, the duration time in the CW mode increases [Fig. 2a(i,ii)]. In particular, we found that for \({J}_{1} > 20\,{\rm{mA}}\), the duration time diverges, and a stable CW mode operation is achieved [Fig. 2a(iii)]. Conversely, increasing J_{2} increases the duration time in the CCW mode [Fig. 2a(iv,v)], and a stable CCW mode operation is achieved for J_{2} > 25 mA [Fig. 2a(vi)].
On-chip decision making: proof-of-concept demonstration
We conducted decision-making experiments based on the controllable dynamics in the ring laser by repeating the processes (i)–(iv) described in the previous section. In the experimental setup shown in Fig. 1b, the two machines SM_{1} and SM_{2} were emulated in a computer with the reward probabilities of (P_{1}, P_{2}) = (0.7, 0.3). The gain K, step Δ, and memory parameter α were set to 1, 1, and 0.99, respectively. A machine is selected and played once, and the reward dispensed by either SM_{1} or SM_{2} is assumed to be 1. The goal of the experiment is to confirm whether the ring-laser-based decision maker selects SM_{1} (rather than SM_{2}), since SM_{1} has the higher reward probability (P_{1} > P_{2}). We assume the situation of zero prior knowledge, where the sum of the two hit probabilities is unknown, unlike ref.^{26}.
The experimental results on the decision-making process are displayed in Fig. 3. Initially, for a number of plays t < 100, I_{CW} and I_{CCW} switch randomly [Fig. 3a(i)], indicating exploration for the best machine. The accumulated knowledge is used for estimating the reward probabilities and setting the \({\rm{\Omega }}\) value, and then the C value is updated accordingly [Fig. 3a(ii)]. The updated C value affects the dynamics, and the dynamical state changes from the switching mode to the CW mode. Consequently, the best machine (SM_{1} in this case) is selected. We repeated the decision-making experiment \({n}_{T}=200\) times and evaluated the correct decision rate (CDR), which is defined as the fraction of the n_{T} trials in which the slot machine with the higher reward probability is selected at the t-th play^{24}. As shown in Fig. 3b, the CDR monotonically increases and approaches 1, suggesting the achievement of correct decision making.
We also conducted similar decision-making experiments for different reward probabilities and parameters; we found that with appropriately tuned parameters (K and Δ), the decision-making performance could be comparable to that of existing decision-making algorithms such as a modified softmax^{16} and upper confidence bound 1-tuned (UCB1-tuned)^{38,39}. As shown in Fig. 4, the CDR of the ring-laser-based method can exceed those of the other methods in some cases.
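The CDR evaluation described above can be sketched numerically. The following is a minimal software surrogate, not the experiment: the ring laser is replaced by a logistic curve standing in for P_{CW}(C), the gain K is absorbed into the hypothetical `width` parameter, and the update of C uses the TOW rule in the assumed form C ← αC ± Δ (reward) or ∓ Ω (no reward) with Ω = (P̂_1 + P̂_2)/(2 − (P̂_1 + P̂_2)). All function names are illustrative.

```python
import math
import random


def p_cw(c, width=5.0):
    """Surrogate for P_CW(C), the probability of observing I_CW > I_CCW.

    Assumption: a logistic curve stands in for the measured curve (Fig. 2b).
    """
    x = max(-50.0, min(50.0, c / width))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def run_trial(probs=(0.7, 0.3), plays=300, alpha=0.99, delta=1.0, rng=random):
    """One decision-making run; returns a list of 'best machine chosen' flags."""
    c = 0.0
    wins = [0, 0]      # L_i: number of rewarded plays of SM_i
    counts = [0, 0]    # S_i: number of plays of SM_i
    best = 0 if probs[0] >= probs[1] else 1
    chose_best = []
    for _ in range(plays):
        arm = 0 if rng.random() < p_cw(c) else 1   # steps (i)-(ii)
        rewarded = rng.random() < probs[arm]        # step (iii)
        counts[arm] += 1
        wins[arm] += int(rewarded)
        if rewarded:
            step = delta
        else:
            # Estimated reward probabilities; an unplayed machine counts as 0
            # (an assumption of this sketch).
            est = [wins[i] / counts[i] if counts[i] else 0.0 for i in (0, 1)]
            total = est[0] + est[1]   # < 2 here, since this play was a loss
            step = -total / (2.0 - total)           # -Omega
        if arm == 1:
            step = -step                            # SM_2 pushes C the other way
        c = alpha * c + step                        # step (iv)
        chose_best.append(arm == best)
    return chose_best


def correct_decision_rate(n_trials=200, seed=0, **kwargs):
    """Fraction of trials choosing the best machine at each play."""
    rng = random.Random(seed)
    plays = kwargs.get("plays", 300)
    totals = [0] * plays
    for _ in range(n_trials):
        for t, ok in enumerate(run_trial(rng=rng, **kwargs)):
            totals[t] += ok
    return [x / n_trials for x in totals]


if __name__ == "__main__":
    cdr = correct_decision_rate()
    for t in (1, 10, 50, 100, 300):
        print(f"play {t:3d}: CDR = {cdr[t - 1]:.3f}")
```

Because the surrogate curve differs from the measured P_{CW}(C), the convergence speed of this sketch should not be compared quantitatively with Fig. 3b; it only illustrates how the CDR is accumulated over n_{T} independent runs.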
Discussion
Decision-making strategy and its control
In our decision-making method, the strategy for making good decisions is characterized by the probability of inducing the CW mode as a function of the control parameter C(t), denoted by P_{CW}(C). As observed in Fig. 2b, P_{CW}(C) of the ring laser has a plateau region in the range of around −21 ≤ C ≤ 12, where P_{CW}(C) changes only moderately when the C value is changed. The plateau region plays a role in the exploration needed to estimate the reward probabilities (and hence an appropriate \({\rm{\Omega }}\) value), and can lead to a correct decision after many slot plays, as demonstrated in Fig. 4; however, it may also lead to a slow convergence of the CDR. A better alternative strategy (i.e., design of P_{CW}(C)) satisfying both fast adaptation and decision accuracy can be estimated theoretically when prior knowledge on the sum of the reward probabilities, P_{1} + P_{2}, is available, such as when either of two events inevitably occurs with the probabilities P_{1} and \({P}_{2}=1-{P}_{1}\).
Let us here assume that the value of P_{1} + P_{2} is known a priori and that \({\rm{\Omega }}\) in Eq. (4) is a constant. For simplicity, we consider α = 1 and assume that the mode switching is random. Under these assumptions, we can treat the time evolution of C as a random walk. The random walk model gives an analytical expression of the CDR and suggests that a fast and correct decision is made when P_{CW}(C) is close to 1 for C > 0 and 0 for C < 0, and varies steeply from 0 to 1 near C = 0 (see Sec. 2 of the Supplementary Information).
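One step of this random walk can be worked out explicitly. The sketch below is not the derivation of the Supplementary Information; it assumes the TOW update in the form \(\delta C=+\Delta\) (SM_{1} rewarded), \(-\Omega\) (SM_{1} not rewarded), \(-\Delta\) (SM_{2} rewarded), \(+\Omega\) (SM_{2} not rewarded), with α = 1, Δ = 1, and the constant \(\Omega =({P}_{1}+{P}_{2})/(2-({P}_{1}+{P}_{2}))\). The symbol p is introduced here for P_{CW}(C).

```latex
% p = P_CW(C): probability of selecting SM_1 at the current play
\mathbb{E}[\delta C \mid C]
  = p\,\bigl[P_1\Delta - (1-P_1)\,\Omega\bigr]
  + (1-p)\,\bigl[(1-P_2)\,\Omega - P_2\Delta\bigr]

% With \Delta = 1 and \Omega = (P_1+P_2)/(2-(P_1+P_2)) both brackets coincide:
P_1 - (1-P_1)\,\Omega \;=\; (1-P_2)\,\Omega - P_2 \;=\; \frac{P_1-P_2}{2-(P_1+P_2)}

% Hence the mean step does not depend on p:
\mathbb{E}[\delta C \mid C] = \frac{P_1-P_2}{2-(P_1+P_2)}
```

Under these assumptions, C drifts on average toward the side of the better machine regardless of the current selection probability; for (P_{1}, P_{2}) = (0.7, 0.3) the mean step is 0.4 per play. The shape of P_{CW}(C) then determines how quickly this drift translates into correct selections, which is in line with the preference for a steep P_{CW}(C) near C = 0.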
In an actual experiment, such a P_{CW}(C) is effectively realized by modifying the relationship between the control parameter C and J_{1(2)} (Eq. 1) as follows:
where K_{1} and K_{2} are chosen such that the plateau region of P_{CW}(C) shown in Fig. 2b is reduced and the desired P_{CW}(C) results. Figure 5a shows P_{CW}(C) with (K_{1}, K_{2}) = (0, 0), (5, 9) and (13, 17), denoted as Types I, II, and III, respectively. As predicted by the random walk model, the CDR of Type III increases most quickly and its convergence value is higher than those of the other types, regardless of the reward probabilities P_{1} and P_{2} (Fig. 5b,c). Thus, we conclude that the decision-making performance can be enhanced by changing the intrinsic characteristics (P_{CW}(C)) of the physical device with an appropriate mode control.
Decision-making rate
The rate of decision making, i.e., the number of decisions made per unit time, in principle depends on the sampling rate of the CW- and CCW-signal detections. Thus, fast decision making is possible by increasing the sampling rate; however, sampling too rapidly may degrade the accuracy of the decision making because nearly identical signal levels will be observed in successive samples, owing to the finite timescale of the ring laser dynamics. It is important to know how rapidly decisions can be made without degrading the performance. In order to address this question and obtain an insight into the effect of the switching dynamics on the decision-making performance, we numerically examine decision-making processes using standard ring laser model equations^{32}. See Methods section for details of the modeling.
Figure 6a shows the evolution of the CDR for various values of the sampling rate 1/\({\tau }_{sam}\) when (P_{1}, P_{2}) = (0.7, 0.3), where \({\tau }_{sam}\) is the sampling time interval of the signal detections. The CDRs at the 30th play are shown as a function of \({\tau }_{sam}\) in Fig. 6b. These numerical results clearly show that the decision-making performance (accuracy and adaptation) degrades when \({\tau }_{sam}\) is much shorter than the correlation time \({\tau }_{c}\) of the ring laser. Indeed, the autocorrelation of the switching signals sampled at \({\tau }_{sam}\ll {\tau }_{c}\) exhibits a positive value [see Supplementary Fig. A1(d)]. In the decision making, the positive correlation may result in repetition of the same choice even when the choice is wrong. In contrast, when \({\tau }_{sam}\gg {\tau }_{c}\), the correlation becomes close to zero, which enables exploration without repeating wrong choices. Accordingly, the sampling time interval (i.e., the inverse of the decision-making rate) can be shortened to approximately the correlation time without degrading the performance. In principle, the correlation time can be shortened by increasing the noise strength and thereby activating the mode-hopping phenomenon, which would allow faster decision making. In an actual experiment, this can be achieved by coupling the laser to an external amplified spontaneous emission noise source; the experimental verification will be an interesting future study.
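The dependence on \({\tau }_{sam}\)/\({\tau }_{c}\) can be illustrated with a simple surrogate. The sketch below does not integrate the ring laser model; it assumes a symmetric random telegraph process (Poisson switching between the CW and CCW states) whose autocorrelation decays as exp(−τ/\({\tau }_{c}\)), so that two consecutive samples differ with probability (1 − exp(−\({\tau }_{sam}\)/\({\tau }_{c}\)))/2. The function names are hypothetical.

```python
import math
import random


def flip_probability(tau_sam, tau_c):
    """Probability that two samples taken tau_sam apart show different modes.

    Assumption: symmetric random telegraph process with autocorrelation
    exp(-tau / tau_c).
    """
    return 0.5 * (1.0 - math.exp(-tau_sam / tau_c))


def sample_modes(n, tau_sam, tau_c, rng):
    """Return n samples of the mode (+1 for CW, -1 for CCW)."""
    q = flip_probability(tau_sam, tau_c)
    state = 1 if rng.random() < 0.5 else -1
    samples = []
    for _ in range(n):
        samples.append(state)
        if rng.random() < q:
            state = -state
    return samples


def lag1_autocorrelation(x):
    """Normalized autocorrelation between consecutive samples."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var


if __name__ == "__main__":
    rng = random.Random(0)
    tau_c = 43.0  # ns, the experimentally observed correlation time
    for ratio in (0.1, 1.0, 10.0):
        x = sample_modes(100_000, ratio * tau_c, tau_c, rng)
        print(f"tau_sam/tau_c = {ratio:5.1f}: "
              f"lag-1 autocorrelation = {lag1_autocorrelation(x):.3f}")
```

In this surrogate, consecutive decisions are strongly correlated for \({\tau }_{sam}\ll {\tau }_{c}\) and nearly independent for \({\tau }_{sam}\gg {\tau }_{c}\), which is the qualitative behavior invoked above to explain the degradation at high sampling rates.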
Summary
In this study, we proposed and experimentally demonstrated on-chip photonic decision making by an integrated ring laser. Ring lasers generate statistical characteristics regarding the CW and CCW lasing, which are optoelectronically controllable; we directly utilize such inherent spontaneous dynamics of ring lasers for decision-making functionalities. Correct decision making was successfully demonstrated with appropriate optoelectronic control of the dynamics, and it was found that the performance can be enhanced by changing the decision-making strategy through the statistical characteristics (P_{CW}(C)). These results would open novel research perspectives on controlling complex dynamics based on environmental changes.
One interesting and important future study is to increase the decision-making rate by using faster and more complex switching dynamics. In addition to the above-mentioned method of increasing the noise strength, the use of a delayed feedback structure will be useful. Interestingly, semiconductor ring lasers can exhibit chaotic switching in the GHz regime by delayed feedback even with a short time delay^{40,41}. The combination of noise-induced switching with delayed feedback instability indicates a promising research direction.
As for the ring laser structure, we emphasize that in addition to miniaturization, it would be beneficial for the all-optical realization of decision-making devices because all photonic components required for decision making can be monolithically integrated on a chip. Instead of the optoelectronic control methods employed in the present study, it would be interesting to use an optical injection method because ring lasers subjected to optical injection enable low-power and ultrafast switching at picosecond time scales^{33,34,35}.
Another interesting and important future study is to tackle larger-scale MAB problems. MAB problems can be solved based on a hierarchical TOW principle^{25,27}. Decision making based on the hierarchical principle can be achieved by using a number of independent two-choice decision makers (for two-armed bandit problems) or by using a time-division multiplexing scheme^{27}. Compact ring lasers could offer a good experimental platform for implementing the hierarchical principle and addressing the MAB problems.
We believe that the combination of photonic integration technologies and competitive fluctuating dynamics, as demonstrated by the proposed ring laser, will shed light on a way toward novel photonic intelligent computing paradigms.
Methods
Device structure and operating regime
The ring laser device used in this study was fabricated in a graded-index separate-confinement-heterostructure (GRINSCH) single-quantum-well GaAs/Al_{x}Ga_{1−x}As structure, the emission wavelength of which is designed to be 850 nm. The fabricated laser device was thermally controlled by a heatsink with an accuracy of 0.01 °C. The ring radius is 1 mm, and the waveguide width is 2 μm. In the actual device, multiple waveguides with independent electrical contacts are coupled to the ring at an angle to the cleaved facet. We used the waveguides with contacts, PD_{i} and BC_{i} (\(i=1,2\)), as shown in Fig. 1a. The CW and CCW intensity signals are detected with PD_{1} and PD_{2} in the waveguide, respectively, and sent to a digital oscilloscope (Tektronix TDS7154B, bandwidth 1.5 GHz, 20 GSample/s) via the bonding wires attached to PD_{1} and PD_{2}. Bias contacts BC_{1} and BC_{2} were used for the mode control inside the ring laser. Injecting current into BC_{1} and BC_{2} reduces the absorption loss of the waveguide. Thus, the light coupled from the CCW(CW) mode in the ring to the waveguide is back-reflected at the BC_{1(2)}-side end of the waveguide and recoupled to the ring in the CW(CCW) direction. In addition, BC_{1(2)} can enhance the spontaneous emission noise coupled to the CW(CCW) mode, and consequently facilitates the laser operation in the CW(CCW) mode^{31,36}.
When \({J}_{1}={J}_{2}=0\,{\rm{mA}}\), the threshold current J_{th} of the ring laser used in the experiment was estimated to be approximately 210 mA at 25 °C. The large threshold may partly be attributed to a non-optimal etching depth of the ring waveguide^{32}. For J/\({J}_{th} < 1.3\), the ring laser operated in a bidirectional state of the CW and CCW modes. For larger values of J, a transition to the spontaneous switching regime occurred.
Intensity adjustment
In the experiment, the PD couplings to the CW and CCW modes are not exactly equal to each other due to imperfect device fabrication. In order to reduce the effect of the asymmetry of the PD couplings and appropriately evaluate the decision-making performance, the CW and CCW intensities, I_{CW} and I_{CCW}, were adjusted by adding constant biases so that the occurrence probability of each mode is calibrated to around 0.5 when \({J}_{1}={J}_{2}=0\,{\rm{mA}}\). This approach allows easy tuning of both intensities; we note that a simpler alternative is to measure only one of I_{CW} and I_{CCW} and adjust the switching probability to 0.5 by the bias currents J_{1} and J_{2}, without the intensity biases.
Decision-making experiment
First, BC_{1} and BC_{2} were connected to a standard current source. Discrete-valued electrical currents were applied to BC_{1} or BC_{2}. Then, the CW and CCW optical intensity signals for different values of J_{1} and J_{2} were recorded by a digital oscilloscope. In the decision-making experiment, the slot machines were numerically simulated in the embedded signal processing unit of the oscilloscope using pseudorandom numbers. The decision is made immediately based on the sampling. The controllers BC_{1} and BC_{2} were also connected to a two-channel function generator (Tektronix AFG3152C), which reconfigures the oscillation dynamics of the ring laser in an online or real-time manner.
Rate-equation model for semiconductor ring laser
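One possible way to choose such a constant bias from recorded traces is sketched below. This is an illustrative assumption, not the calibration procedure actually used: the bias is taken as the median of I_{CW} − I_{CCW} measured at J_{1} = J_{2} = 0 mA, so that about half of the samples are classified as the CW mode. The function names and the synthetic Gaussian traces are hypothetical.

```python
import random
import statistics


def calibration_bias(i_cw, i_ccw):
    """Constant offset b such that I_CW - I_CCW > b for about half the samples."""
    return statistics.median(a - b for a, b in zip(i_cw, i_ccw))


def cw_fraction(i_cw, i_ccw, bias=0.0):
    """Fraction of samples classified as CW mode after subtracting the bias."""
    diffs = [a - b for a, b in zip(i_cw, i_ccw)]
    return sum(d > bias for d in diffs) / len(diffs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic traces with an artificial detector asymmetry (illustration only).
    i_cw = [rng.gauss(1.2, 0.3) for _ in range(1001)]
    i_ccw = [rng.gauss(1.0, 0.3) for _ in range(1001)]
    b = calibration_bias(i_cw, i_ccw)
    print(f"bias = {b:.3f}")
    print(f"CW fraction before: {cw_fraction(i_cw, i_ccw):.3f}")
    print(f"CW fraction after:  {cw_fraction(i_cw, i_ccw, b):.3f}")
```

Using the median rather than the mean keeps the calibrated probability at 0.5 even if the intensity distributions are skewed.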
The numerical simulation was conducted by using a set of dimensionless semiclassical equations for the two slowly varying complex amplitudes of CW and CCW waves, E_{1} and E_{2}^{32}.
where \(\tilde{\alpha }\) accounts for phase–amplitude coupling, s and c are the self- and cross-saturation coefficients, and k_{1,2} represent the complex backscattering coefficients. We model internal optical noises as complex Gaussian noise satisfying \(\langle {\eta }_{i}\rangle =0\) and \(\langle {\eta }_{i}(t){\eta }_{j}^{\ast }(t^{\prime} )\rangle =2D{\delta }_{ij}\delta (t-t^{\prime} )\) (\(i=1,2\)). \(\langle \cdot \rangle \) represents the ensemble average, and D represents the noise strength. Carrier density N obeys the following equation:
where μ is the dimensionless pumping power (\(\mu =1\) at the laser threshold). In the above equations, t is dimensionless time rescaled by photon lifetime \({\tau }_{p}\). γ is the ratio of \({\tau }_{p}\) to carrier lifetime \({\tau }_{s}\).
In Eq. (6), the asymmetric coupling caused by activating BC_{1} and BC_{2} is simply modeled as an asymmetric backreflection effect such that \({k}_{1}={\beta }_{1}{k}_{b}\) and \({k}_{2}={\beta }_{2}{k}_{b}\), where k_{b} denotes the backreflection coefficient when \(C=0\), and β_{1,2} denotes a dimensionless asymmetry parameter, depending on C as follows:
where C(t) is updated by Eq. (2). This is the simplest model of the asymmetric backscattering, although the real asymmetry may be introduced in a more complex way in the actual experiment. We confirmed that, regardless of the details of the asymmetry model, the control of spontaneous switching can be achieved. A detailed investigation using a more realistic model is left for future work.
In this study, we set some of the parameters as follows: \(\tilde{\alpha }=3.5\), \(2s=c=0.006\), \({k}_{b}=0.004+i0.001\), \(k=0.025\), \(D=5\times {10}^{-5}\), \(\mu =2.0\), \({\tau }_{p}=10\,{\rm{ps}}\), \(\gamma =0.01\). With these parameter values, we obtained stochastic switching dynamics with the correlation time \({\tau }_{c}\approx 13\,{\rm{ns}}\) when \(C=0\). In the decision-making simulation, we assume that the slot machines provide a reward without any time delay and use Eqs (2–4) and (6–9).
References
1. Vetter, J. S., DeBenedictis, E. P. & Conte, T. M. Architectures for the Post-Moore Era. IEEE Micro 37(4), 6–8 (2017).
2. Nahmias, M. A., Shastri, B. J., Tait, A. N., Ferreira de Lima, T. & Prucnal, P. R. Neuromorphic Photonics. Optics & Photonics News 29(1), 34–41 (2018).
3. Peper, F. The End of Moore’s Law: Opportunities for Natural Computing? New Gener. Comput. 35, 253 (2017).
4. Merolla, P. A. et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345(6197), 668–673 (2014).
5. Esser, S. K. et al. Convolutional networks for fast, energy-efficient neuromorphic computing. Proc. Natl Acad. Sci. USA 113, 11441–11446 (2016).
6. Tait, A. N. et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 7, 7430 (2017).
7. Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).
8. Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361(6406), 1004–1008 (2018).
9. Appeltant, L. et al. Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468 (2011).
10. Van der Sande, G. et al. Advances in photonic reservoir computing. Nanophotonics 6, 561 (2017).
11. Bandyopadhyay, A., Pati, R., Sahu, S., Peper, F. & Fujita, D. Massively parallel computing on an organic molecular layer. Nat. Phys. 6(5), 369–375 (2010).
12. Berkley, A. J. et al. Quantum annealing with manufactured spins. Nature 473, 194–198 (2011).
13. Inagaki, T. et al. A coherent Ising machine for 2000-node optimization problems. Science 354(6312), 603–606 (2016).
14. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. (The MIT Press, Massachusetts, 1998).
15. Kim, S.-J., Aono, M. & Nameda, E. Efficient decision-making by volume-conserving physical object. New J. Phys. 17, 083023 (2015).
16. Kim, S.-J., Aono, M. & Hara, M. Tug-of-war model for the two-bandit problem: Nonlocally-correlated parallel exploration via resource conservation. Biosystems 101(1), 29–36 (2010).
17. Hu, W., Wu, K., Shum, P. P., Zheludev, N. I. & Soci, C. All-optical implementation of the ant colony optimization algorithm. Sci. Rep. 6, 26283 (2016).
18. Alonzo, M. et al. All-Optical Reinforcement Learning In Solitonic X-Junctions. Sci. Rep. 8, 5716 (2018).
19. Jouini, W., Ernst, D., Moy, C. & Palicot, J. Multi-armed bandit based policies for cognitive radio’s decision making issues. 2009 3rd International Conference on Signals, Circuits and Systems (SCS), Medenine, pp. 1–6, https://doi.org/10.1109/ICSCS.2009.5412697 (2009).
20. Lai, L., El Gamal, H., Jiang, H. & Poor, H. V. Cognitive Medium Access: Exploration, Exploitation, and Competition. IEEE Trans. Mob. Comput. 10(2), 239–253 (2011).
21. Agarwal, D., Chen, B. & Elango, P. Explore/Exploit Schemes for Web Content Optimization. 2009 Ninth IEEE International Conference on Data Mining, pp. 1–10, https://doi.org/10.1109/ICDM.2009.52 (2009).
22. Kocsis, L. & Szepesvári, C. Bandit Based Monte-Carlo Planning. In: Furnkranz, J., Scheffer, T. & Spiliopoulou, M. (eds) Machine Learning: ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg, https://doi.org/10.1007/11871842_29.
23. Mahajan, A. & Teneketzis, D. Multi-Armed Bandit Problems. In: Hero, A. O., Castanon, D. A., Cochran, D. & Kastella, K. (eds) Foundations and Applications of Sensor Management. Springer, Boston, MA.
24. Naruse, M. et al. Single-photon decision maker. Sci. Rep. 5, 13253 (2015).
25. Naruse, M. et al. Single Photon in Hierarchical Architecture for Physical Decision Making: Photon Intelligence. ACS Photonics 3(12), 2505–2514 (2016).
26. Naruse, M., Terashima, Y., Uchida, A. & Kim, S.-J. Ultrafast photonic reinforcement learning based on laser chaos. Sci. Rep. 7, 8772 (2017).
27. Naruse, M. et al. Scalable photonic reinforcement learning by time-division multiplexing of laser chaos. Sci. Rep. 8, 10890 (2018).
28. Uchida, A. Optical Communication with Chaotic Lasers: Applications of Nonlinear Dynamics and Synchronization (Wiley-VCH, 2012).
29. Beri, S. et al. Topological Insight into the Non-Arrhenius Mode Hopping of Semiconductor Ring Lasers. Phys. Rev. Lett. 101, 093903 (2008).
30. Gelens, L. et al. Exploring Multistability in Semiconductor Ring Lasers: Theory and Experiment. Phys. Rev. Lett. 102, 193904 (2009).
31. Sorel, M., Laybourn, P. J. R., Giuliani, G. & Donati, S. Unidirectional bistability in semiconductor waveguide ring lasers. Appl. Phys. Lett. 80, 3051–3053 (2002).
32. Sorel, M. et al. Operating regimes of GaAs–AlGaAs semiconductor ring lasers. IEEE J. Quantum Electron. 39, 1187–1195 (2003).
33. Pérez, T., Scirè, A., Van der Sande, G., Colet, P. & Mirasso, C. R. Bistability and all-optical switching in semiconductor ring lasers. Opt. Express 15, 12941–12948 (2007).
34. Hill, M. T. et al. A fast low-power optical memory based on coupled micro-ring lasers. Nature 432, 206–209 (2004).
35. Liu, L. et al. An ultra-small, low-power, all-optical flip-flop memory on a silicon chip. Nat. Photonics 4, 182–187 (2010).
36. Sunada, S. et al. Random optical pulse generation with bistable semiconductor ring lasers. Opt. Express 19, 7439 (2011).
37. Mihana, T., Terashima, Y., Naruse, M., Kim, S.-J. & Uchida, A. Memory Effect on Adaptive Decision Making with a Chaotic Semiconductor Laser. Complexity 2018, 4318127 (2018).
38. Auer, P., Cesa-Bianchi, N. & Fischer, P. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 235 (2002).
39. Kuleshov, V. & Precup, D. Algorithms for multi-armed bandit problems. Journal of Machine Learning Research 1, 1–48, arXiv:1402.6028 (2014).
40. Ermakov, I., Van der Sande, G. & Danckaert, J. Semiconductor ring laser subject to delayed optical feedback: Bifurcations and stability. Commun. Nonlinear Sci. Numer. Simul. 17, 4767 (2012).
41. Sunada, S. et al. Chaos laser chips with delayed optical feedback using a passive ring waveguide. Opt. Express 19(7), 5713 (2011).
Acknowledgements
This work was supported in part by JST CREST Grant Number JPMJCR17N2, Japan. S.S. acknowledges the support from the Murata Science Foundation and JSPS KAKENHI Grant No. 16K04974. M.N. and A.U. acknowledge JSPS KAKENHI Grant No. JP17H01277. A.U. acknowledges the support from JSPS KAKENHI Grant No. JP16H0387.
Author information
Contributions
S.S., A.U. and M.N. directed the project and designed the decision-making principle. R.H., S.K., T.M., Y.M., K.K., A.U. and S.S. designed and performed ring laser experiments. R.H., S.K., T.N. and S.S. analyzed the data. S.S. conducted numerical evaluation of the performance. R.H., M.N., A.U. and S.S. wrote the paper, and all of the authors contributed to the preparation of the manuscript.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Homma, R., Kochi, S., Niiyama, T. et al. On-chip photonic decision maker using spontaneous mode switching in a ring laser. Sci Rep 9, 9429 (2019). https://doi.org/10.1038/s41598-019-45754-3