Simulating self-learning in photorefractive optical reservoir computers

Photorefractive materials exhibit an interesting plasticity under the influence of an optical field. By extending the finite-difference time-domain method to include the photorefractive effect, we explore how this property can be exploited in the context of neuromorphic computing for telecom applications. By first priming the photorefractive material with a random bit stream, the material reorganizes itself to better recognize simple patterns in the stream. We demonstrate this by simulating a typical reservoir computing setup, which gets a significant performance boost on performing the XOR on two consecutive bits in the stream after this initial priming step.

These two equations describe how the classification of the readout y(t) at each time t is related to the internal reservoir state x(t) through a nonlinear detection operation f (which can simply be the quadratic response of a photodiode) and a set of readout weights W out . Moreover, the internal reservoir states x(t) at the current timestep (2) � y(t) = W out f (� x(t)). 1 Photonics Research Group, UGent-imec, Technologiepark-Zwijnaarde 126, 9052 Ghent, Belgium. 2  www.nature.com/scientificreports/ are related to the input u(t) and reservoir states at the previous timestep � x(t − dt) through completely passive mixing (without any nonlinearities), characterized by W res .

OPEN
Traditionally, all the weights are explicitly encoded in software or dedicated hardware. However, the beauty of RC in physical hardware is that W in and W res can be completely encoded by a physical system, which allows us to only find a suitable readout W out . In this case W in and W res are usually unknown parameters of the physical system.
In this work, we will explore how the photorefractive effect can be used to potentially improve such a simple passive optical reservoir computing setup by influencing the internal (unknown) reservoir state W res .
Indeed, a potentially interesting aspect of these photorefractive crystals, is that photorefractive crystals might exhibit a form of self-learning, i.e. the ability to reconfigure themselves according to prolonged exposure to the signals the reservoir system is supposed to classify. It might be possible to exploit this in a way similar to Hebbian learning 22,23 , which is best characterized by the catchphrase "neurons that fire together wire together". In the case of photorefractive crystals in a RC setup, this would take the form of common patterns and correlations in the training input becoming more expressed in the crystal in such a way to make the final classification by the readout easier.

Results
A photorefractive crystal is placed inside a free-space cavity with 50/50 mirrors, as illustrated in Fig. 1. This combination of crystal and cavity will act as the reservoir. Light leaking out of the cavity will be detected by a camera with a limited number of pixels. On these detected camera pixels a readout can be trained for the application at hand.
Within the proposed setup, the photorefractive crystal thus acts as a diffractive element inside the cavity, which introduces the random mixing necessary for a reservoir to function. However, even though this setup resembles typical diffractive optical reservoirs [24][25][26][27] , it is important to note that the proposed setup does not contain any nonlinear elements and hence follows the passive photonic reservoir setup as introduced in Eq. (1). Indeed, the nonlinear photorefractive effect, which typically acts on a timescale of seconds is typically too slow to have any effect during inference and hence, any charge distribution (and resulting index contrast) within the crystal can be considered constant during inference.
However, whereas traditional reservoir computing setups do not allow any optimization of the reservoir itself, our simulation setup is designed in a way to exploit the self-reorganization of the photorefractive crystal by prolonged exposure to a random bitstream and a reference beam. We will call this initialization procedure the priming of the crystal. During priming, a reference beam is active to induce beam coupling 28 inside the crystal between the signal beam and the reference beam, as illustrated in Fig. 2a. The technique of beam coupling in photorefractive crystals is a well-known concept in photorefractive holography 3,4 and will result in a refractive index distribution that will allow part of the light to be coupled out of the cavity, as illustrated in Fig. 2b during inference, after this initial priming step, when the reference beam is turned off.
Generally speaking, a bit entering the cavity will make at least one roundtrip before its amplitude drops below the noise due to power loss at the 50/50 mirrors, losses in the crystal and leakage out of the cavity. These round trips allow for the time-dependent signal to interfere with itself inside the crystal. Hence, when the reference beam is active, recurring patterns in the time-dependent input signal will start to couple with the reference beam, resulting in an emerging input signal-dependent index contrast in the crystal.
During inference, the actual bit stream is sent through. The aim is that the initial priming step will have improved the performance of the reservoir setup on the task at hand due to the more pronounced correlations Figure 1. A photorefractive crystal placed in a cavity. Light entering the cavity will interact with the photorefractive crystal such that part of the light will be coupled out of the cavity. Moreover, frequently occurring recurring patterns in the time-dependent input signal will interfere similarly in the crystal resulting in a form of self-learning. This figure was created with Inkscape. www.nature.com/scientificreports/ between common bit patterns inside the photorefractive crystal. Indeed, even when a purely random bit stream is used for priming, each subsequence of bits (take for example the two-bit sequences 00, 01, 10, 11) will still interact differently with the photorefractive crystal by exciting it slightly differently. This means that those substrings of bits will interact differently with the crystal during inference in a predictable way, which can be classified by the readout. In our simulations, a bit stream of 10,000 bits is sent through the cavity. When the signal leaks out of the cavity (either behind the mirrors or on the sides of the crystal), the signal will be detected by a camera, consisting of 64 recorded pixel values sampled eight times per bit, which are obtained by spatial averaging of the FDTD grid at the camera location and by performing a lowpass filtering with a cutoff frequency equal to the bitrate for each of the pixels. The readout weights are then trained to follow a boolean target function. Two target functions are considered: a simple copy task, where the same output should be reproduced with a certain latency and the XOR task, where the XOR of two consecutive bits in a bitstream is performed by the system and which also must be reproduced with a certain latency. Training of the readout is, like usually in RC, simply done by linear regression: the chosen readout minimizes the mean squared error between target and prediction. Moreover, after performing a threshold on the predicted output, the Bit Error Rate (BER) can be calculated.
As we are targeting boolean tasks, we would like two consecutive bits in a bit stream to be able to interfere inside the crystal. Hence we choose the width of the cavity such that the propagation time between the two mirrors equals the length of a single bit. However, using typical photorefractive parameters for LiNbO 3 29-31 summarized in Table 1 and a target bitrate of 100 Gbps , this would result in a cavity width of about 700 µ m. Doing an FDTD simulation for such cavity for a meaningful amount of bits at a wavelength of 1550 nm is however near impossible. Therefore, for computational reasons, the simulated cavity is made 100 times smaller to 7µ m. To compensate for this reduced size, the bitrate in simulation is increased by the same factor to 10 Tbps . Moreover, as the diffractive power of a grating is in general proportional to both the length of the grating and its refractive index contrast, the shorter length of the crystal is compensated by an equal and opposite increase of its Pockels coefficients.
Three different cases will be considered: the primed crystal as discussed earlier, an empty cavity and a cavity with a crystal with a random diffraction pattern within. The random diffraction pattern is obtained by performing a band-pass filter on white noise within the spatial frequency range of typical gratings inside a photorefractive crystal, i.e. spatial frequencies corresponding to gratings with pitch between /2 (co-or counter propagating beams) and /(2 √ 2) (perpendicular beams) where allowed. Moreover, the standard deviation on this random index contrast was chosen to equal the standard deviation of the grating in the primed case. These two extra cavity setups should offer a fair comparison between the self-learning priming approach and more typical random diffraction reservoirs.
Copy Task. The copy task consists of sending a bit stream through the reservoir and trying to retrieve the same bit stream with a certain delay. Even though no special calculations need to be performed to do this operation, the copy task still serves as the prime measure for the memory of the reservoir.

Figure 2. (a)
During priming, a random input signal is sent through the cavity containing a photorefractive crystal while a reference beam is active. The signal beam enters the cavity through a 50/50 mirror and will reflect at the other side of the cavity on another 50/50 mirror. Due to interference with the reference beam (and with reflections of the signal beam itself), induced gratings start to form inside the photorefractive crystal for common patterns in the signal beam. These gratings will in turn influence the propagation of the light through the crystal. When the induced gratings become strong enough, beam coupling occurs. (b) During inference, the signal beam will still interact with the gratings created during the priming step These interactions with the photorefractive crystal might give rise to beams that are coupled out of the photorefractive crystal and slower propagation times through the crystal. Overall, such a primed crystal will perform better as a reservoir on the benchmark tasks considered here than the non-primed crystal. This figure was created with Inkscape. www.nature.com/scientificreports/ We attempt to retrieve the original bit stream at different delays or latencies after which they were sent out. This is done for a randomly initialized non-primed crystal and a crystal primed by the previously described initialization procedure.
In Fig. 3, the latency is increased in steps of 0.125 bits. These steps correspond to the sampling rate of the signal (which is 8 times the bit rate). For each of these latencies, the BER is calculated. Note that latency 0 is defined as when the bit starts entering the cavity. As can clearly be seen on the figure, the performance on the copy task degrades after priming, from 0 bits in 10, 000 simulated bits in the non-primed and random case (which corresponds to a max BER of 10 −2 for the amount of 10, 000 bits used 32 ) to about 200 in 10, 000 in the primed case. XOR task. When performing the XOR task, the system is asked to produce the XOR of two consecutive bits in the bitstream. The XOR is a typical benchmark problem in machine learning, as the nonlinearity of the XOR operation makes solving this task non-trivial because the output cannot be found by just performing a linear classification algorithm on the inputs. However, if the mixing in the reservoir is sufficiently large, the non-linearity of the detector is often enough to perform this task 19 . Hence, being able to perform the XOR task in an optical reservoir where a readout is trained on the detected reservoir output is often a good indication of sufficient mixing in the reservoir.
In Fig. 4, we compare again the performance of the primed reservoir with the non-primed and random reservoir for different latencies (in the case of the XOR, the latency is counted from the moment the last bit has started entering the cavity), and here we see a stark difference: whereas the primed reservoir is able to perform the XOR between two consecutive bits in the bitstream with 0 errors out of 10, 000 bits, the non-primed reservoir and the random reservoir are totally unable to do so. This might indicate that priming the reservoir increases the performance of the reservoir system by trading of some of the memory for computational performance.  Figure 3. Performance on the copy task before and after priming. Priming has a detrimental effect on the memory of the reservoir. The latency is increased in steps of 0.125 bits.

Discussion
This study presents an initial attempt at using photorefractive materials for self-learning neuromorphic computing applications. We show that by exposing a photorefractive material to a long, repeated bit stream (priming), the induced gratings make it perform better on a nonlinear time-dependent telecom related benchmark task: the XOR task. Indeed, performance on the XOR task can be improved from 50 % BER (random guessing) in the non-primed and random case to 0 errors in 10, 000 simulated bits in the primed case. Moreover, comparing the random grating with similar properties (average index variation and grating pitch) as the primed grating shows much worse performance on that same task, showing that the primed crystal has indeed learned to perform the XOR by itself.
However, this gain in computational power comes at the cost of reducing the memory of the reservoir system: tasks requiring more memory (but less computational power) will perform worse. This is exemplified by the results on the copy task, where the primed crystal is unable to copy the bits without any errors.
To perform these self-learning reservoir simulations in a reasonable amount of time, some approximations were necessary. The most important limitation on the simulation was the size of the crystal, which was reduced by a factor 100. To compensate for this smaller crystal size, the bitrate was increased by the same factor, from 100 Gbps (which would be the target bitrate in an actual physical setup) to 10 Tbps in simulation. As -generally speaking -the refractive power of a grating is proportional through the index contrast and to its length, we increased the Pockels coefficients in the simulation as well, to compensate for the shorter propagation length through the crystal. All these approximations indicate that the results obtained yield a qualitative indication that the proposed system could work in principle on physical hardware. However, actual experimental results are necessary to confirm this claim.

Methods
The FDTD method. The Finite Difference Time Domain (FDTD) method 33,34 is one of the most-used ways to simulate electromagnetic phenomena.
By discretizing the electric field E and the magnetic field H on a Yee cell, as illustrated in Fig. 5, one can derive the following update equations: with Where q represents the time-index of the simulation and the indices m, n, p represent the index of the Yee-cell the field components belong to along the x, y and z axis respectively (half-integer offsets from the corner of the grid-cell as laid out in Fig. 5 are implicitly assumed). Moreover, s c is known as the Courant number of the simulation, which-for a 3D simulation-must satisfy the following stability requirement.  8)) is related to two processes. The first process (Eq. (9)) describes the excitation of the free carriers n from neutral donors N D (related to the intensity I of the incident optical field, the photo-ionization constant s and the thermal excitation constant β ) and recombination with positively charged traps N + D (according to a recombination constant γ ). The second process (Eq. (10)) describes the diffusion of the free carriers through the material due to the non-uniform charge distribution ∇n and its resulting space-charge field S , where µ represents the mobility of the free carriers, e represents the elementary charge and k is the Boltzmann constant.
The equations, as they are described here, assume no difference between traps and donors in the photorefractive material: each unfilled trap is positively charged and conversely each filled trap is a (neutral) donor. Second, it is implicitly assumed that each trap has the same excitation energy and the excitation energy needed to excite electrons from the valence band is too high to have any influence.
Generation and recombination. In Eq. (9), the change in excited donor density N + D can be split into a generative term and a recombination term. The generative term will be proportional (through a photo-ionization cross-section s) to the intensity of the light I, which in this case is defined in terms of the energy density I = cE . Assuming the only absorption in the photorefractive material is due to the photo-ionization, we can propose a relation between the photo-ionization s and the absorption coefficient in the material α: Note that the assumption that the absorption is completely due to the photo-ionization is an approximation. It gives a lower bound for the absorption, given the photo-ionization cross-section s. Moreover, free carriers will also be uniformly generated due to a thermal excitation rate β . On the other hand, the recombination term will Electron diffusion. The carrier density will also be influenced by diffusion, which is related to Eq. (10) in the following way: Here, we defined the diffusion constant D = µkT/e and the electron flow � F = nµ � E . This diffusion equation can be discretized on the Yee-grid with symmetric differences (n is chosen to be on the corners of the Yee-cell): However, by carefully fixing the time step for this update equation to be the update equation gets considerably simplified: This equation would reduce to the typical Lax-Friedrich scheme if the space-charge field were to be uniform.
Space charge electric field. The diffusion of the free carriers depends on the space-charge field S through Eq. (16). However, the space-charge field itself is related to the free carrier distribution n through the charge density ρ = e(N + D − n): Here, ǫ s is the static permittivity of the photorefractive material, which usually is vastly different than the permittivity at optical wavelengths. Moreover, it is also assumed that S varies slowly enough to allow the second equation to equal zero. The space-charge field is an electric field and hence lives on the edges of the Yee cell. Moreover, n is located on the corners of the Yee-cell, hence Eq. (17) can be discretized as follows: Moreover, the application of the curl (Eq. 18) needs to be solved on the faces of the grid cell: Taking all the equations together for each grid point gives the overdetermined system Where x is the vector of 3MNP unknowns and b the vector of 4MNP targets:   www.nature.com/scientificreports/ Moreover, A is a sparse matrix containing the coefficients of Eqs. (19) and (22). Although this system overdetermined, it turns out a solution to this linear system of equations can still be found by using the left pseudoinverse of A: We use the biconjugate gradient method 36 to solve this system every diffusion timestep. The biconjugate gradient method is efficient because it solves for A T A iteratively and hence no inversion of a sparse matrix (which is generally speaking not sparse itself) is necessary. Moreover, the biconjugate gradient method also allows to initialize the system with an estimate of x for which the value of x at the previous diffusion time step can be used.
The electro-optic effect. Finally, a relation between this space charge electric field S and the optical properties of the material still needs to be found. Generally speaking, the electro-optic effect is described as a dependence of the impermeability tensor η = µ r ǫ −1 r of the material on a present electric field E . When this electric field is small-as is the case for the space charge electric field S -a first-order series expansion can be used: This first-order dependency on the electric field is called the Pockels effect. In the case of non-magnetic materials ( µ r = 1 ), this equation can be rewritten as: This equation mixes E-type field components, located on the edges of the Yee cell with H-type field components, located on the faces of the Yee-cell. This often causes numerical instabilities as it is not clear how to handle the non-diagonal components of ǫ −1 r . To solve this problem, the method proposed in by Werner et al. 37 can be used, which proposes a modified but stable update equation for the electric fields: where (·) {C} is defined as an interpolation of a field component to the corner of the grid cell and (·) {E i } is defined as an interpolation of a field component to the E i -edge of the grid cell.
Towards a photorefractive FDTD simulation. How all these different photorefractive processes are merged into a modified FDTD simulation is visualized in Fig. 6, which also shows the different timescales each of these processes operate on.
The FDTD timescale is best characterized by its time step, which usually is around dt FDTD ≈ 0.1 fs . Depending on the size of the grid, the FDTD simulation is run for a few thousand time steps. As a general rule we assume it takes about 1000 FDTD time steps to simulate the propagation of a single pulse through the photorefractive material at hand, hence after the FDTD simulation, about 100 fs has passed.
However, the diffusion in the crystal happens at a much slower pace. Using (15) we can find that for a typical photorefractive material like LiNbO 3 , with a mobility µ = 0.0015 m 2 /Vs 31 , the diffusion time step is about dt diff ≈ 15 ps-about 2 orders of magnitude larger than the full FDTD simulation. During this characteristic time of the diffusion, the refractive index of the material can be considered constant. To save simulation time, we multiply the absorption profile obtained through the FDTD simulation with a factor 100, which would physically be roughly equivalent to sending the same signal 100 times through the crystal.
The absorption profile can then be converted into free carriers through Eqs. (11) and (9). These are then free to diffuse through the crystal with the mentioned diffusion time step. Typically we will update the space charge field about every 100 diffusion steps and repeat this process 1000 times, which means that the refractive index is updated every 10 −6 s before the whole simulation process is started over again.

Data availibility
Simulation files and results can be obtained from the corresponding author upon reasonable request.