Complex Learning in Bio-plausible Memristive Networks

Emerging memristor-based neuromorphic engineering promises an efficient computing paradigm. However, the lack of internal dynamics in previous feedforward memristive networks, together with the lack of efficient learning algorithms for recurrent networks, fundamentally limits the learning ability of existing systems. In this work, we propose a framework that supports complex learning functions by introducing dedicated learning algorithms to a bio-plausible recurrent memristive network with internal dynamics. To build the network, we fabricate iron oxide memristor-based synapses with well-controllable plasticity and a wide dynamic range of excitatory/inhibitory connection weights. To adaptively modify the synaptic weights, the recursive least-squares (RLS) learning algorithm is introduced. Based on the proposed framework, the learning of various timing patterns and of a complex spatiotemporal human motor pattern is demonstrated. This work paves a new way to explore brain-inspired complex learning in neuromorphic systems.

the two parallel memristors in a synapse. Only one memristor in the synapse is allowed to change its conductance during the modulation stage, which ensures a unique modulation path to drive the synaptic weight from any initial value to the desired one. The joint conductance of the synapse could be gradually modulated from

Supplementary Figure S2 | High-precision modulation under differential pulse pair. To achieve a higher precision when modulating the weight, a pair of differential pulses consisting of a positive pulse and a negative pulse is used to tune the memristor resistance to the desired value. In our measurement, the positive pulse has an amplitude of 2 V and a width of 100 µs. The negative pulse has the same amplitude and width but the opposite polarity. By applying the pulse pair, the resistance of the memristor can be tuned more precisely than with the coarse steps produced when pulses of only one polarity are used, for instance in the process of Fig. 1(c). As shown in the inset of this graph, the minimum step of resistance change is as small as 0.6 Ω, which indicates that the relative precision could reach 0.3% at the 200 Ω level of the low resistance state. By varying the amplitude and width of the pulse pair, an even higher accuracy can be achieved.
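The quoted relative precision follows directly from the two measured values. The short check below reproduces the arithmetic; the variable names are ours and are used only for illustration.

```python
# Values quoted in the caption of Supplementary Figure S2.
min_step_ohm = 0.6    # smallest observed resistance change per pulse pair
lrs_ohm = 200.0       # resistance level of the low resistance state

# Relative precision = smallest step divided by the resistance level.
relative_precision = min_step_ohm / lrs_ohm
print(f"relative precision: {relative_precision:.1%}")
```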

Device mechanism and behavior model
To obtain analog resistance states of the iron oxide-based memristor, a pre-forming process and an initial reset are required before operation. The resistive switching is believed to be due to the growth and rupture of a conducting filament, which is driven by oxygen vacancy migration. To support the simulation, we introduce a phenomenological model [4] to emulate the behaviour of the memristive device, in which G, Gmin and Gmax are the conductance, minimal conductance and maximal conductance of the memristor, respectively, and A and B are two fitting parameters. The plus sign before A indicates the LTP process and the minus sign the LTD process. The simulation results are shown in Supplementary Figure S1(b).
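As an illustration of how a model of this kind can be simulated, the sketch below assumes a saturating-exponential conductance change per pulse, with +A for LTP and -A for LTD. This specific functional form, the helper name pulse_update and all numerical values are assumptions made for illustration and may differ from the model of [4].

```python
import numpy as np

def pulse_update(g, g_min, g_max, a, b, potentiate):
    """Apply one pulse of an ASSUMED exponential-saturation update rule."""
    x = (g - g_min) / (g_max - g_min)       # normalized conductance in [0, 1]
    if potentiate:
        dg = a * np.exp(-b * x)             # LTP (+A): steps shrink near g_max
    else:
        dg = -a * np.exp(-b * (1.0 - x))    # LTD (-A): steps shrink near g_min
    return float(np.clip(g + dg, g_min, g_max))

# Illustrative values only; they are not fitted to the measured device.
g_min, g_max = 1e-3, 5e-3   # conductance bounds in siemens
a, b = 2e-4, 3.0            # stand-ins for the fitting parameters A and B

g = g_min
ltp_trace = []
for _ in range(50):         # 50 potentiating pulses
    g = pulse_update(g, g_min, g_max, a, b, potentiate=True)
    ltp_trace.append(g)

ltd_trace = []
for _ in range(50):         # followed by 50 depressing pulses
    g = pulse_update(g, g_min, g_max, a, b, potentiate=False)
    ltd_trace.append(g)
```

Under these assumptions, the conductance rises quickly at first and then saturates towards Gmax during LTP, and the reverse holds during LTD.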

High-precision modulation of memristor conductance
In building a memristor-based neuromorphic computing system, the most challenging task is to precisely modulate the memristive states. Owing to the proposed synapse structure shown in Supplementary Figure S1(a), we can achieve a wider weight range than that of a conductance-based synapse by matching the feedback resistance Rf of the inverting amplifier. Theoretically, it is reasonable to use the continuous curve in Supplementary Figure S1(b), obtained from the device measurements, to support the simulation model [4,5]. The online weight adaptation is illustrated in Supplementary Figure S1(c). We also sought a higher tuning precision than the crude modulation in Fig. 1(c). As shown in Supplementary Figure S2, we propose a differential pulse pair scheme to tune the memristive states. By choosing the amplitude and width of the pulse pair, the relative precision could reach 0.3%. However, these methods are still based on open-loop modulation, which cannot easily cope with the device variation in a real hardware system, including device-to-device and cycle-to-cycle variation. A natural way to deal with this issue is to use a closed-loop scheme or feedback algorithm [2,6]. Feedback modulation requires complex pulse trains (often with varying amplitude and width) and repeated write/read operations. Nevertheless, in terms of synaptic modulation, our bio-plausible recurrent network and learning algorithm are feasible in memristor-based neuromorphic hardware systems, especially with the development of crossbar and 3-D memristor integration technology [7,8].
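The closed-loop idea can be sketched as a simple write-verify loop: read the state, compare it with the target, and apply a pulse in the required direction until the error falls within a tolerance. The sketch below is illustrative only. The ToyMemristor class, its noise model and all numerical values are hypothetical, and a fixed nominal step is used instead of the varying amplitude and width mentioned above.

```python
import numpy as np

def tune_closed_loop(read, write, target, tol, max_iters=1000):
    """Write-verify loop: pulse until |target - g| <= tol * target."""
    for n in range(max_iters):
        g = read()
        err = target - g
        if abs(err) <= tol * target:
            return g, n                  # converged after n pulses
        write(potentiate=err > 0)        # pulse in the direction that reduces err
    return read(), max_iters

class ToyMemristor:
    """Hypothetical device whose pulse steps vary from cycle to cycle."""
    def __init__(self, g0, step, noise, seed=0):
        self.g = g0
        self.step = step                 # nominal conductance change per pulse
        self.noise = noise               # relative cycle-to-cycle variation
        self.rng = np.random.default_rng(seed)

    def read(self):
        return self.g

    def write(self, potentiate):
        dg = self.step * (1.0 + self.noise * self.rng.standard_normal())
        self.g += dg if potentiate else -dg

dev = ToyMemristor(g0=1e-3, step=1e-5, noise=0.3)
target = 3e-3
g_final, n_pulses = tune_closed_loop(dev.read, dev.write, target, tol=0.005)
print(f"reached {g_final:.3e} S after {n_pulses} pulses")
```

Because every pulse is followed by a read, the loop compensates for step-size variation at the cost of additional read operations, which is the trade-off noted above.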
Offline and online training scheme
By introducing the classical RLS algorithm from control theory, the recurrent network is trained to produce the various complex spatiotemporal patterns in Supplementary Figure S3, including bio-plausible neuronal spiking activity and an ECG signal. In fact, these complex patterns emerge from the internal chaotic activity of the network, as shown in Supplementary Figure S4, which is crucial to timing coding in the brain [9,10]. In our work, we fix g at 1.5 to keep the initial network in a chaotic dynamic regime [11]. To produce a specific target pattern, the network must be trained to reach the corresponding stable state. There are two schemes for training a memristive network: online training and offline training [2,12]. Online training requires rapid weight modulation at each time step and a longer training time, whereas offline training only needs a final import of the predefined parameters, as illustrated in Supplementary Figure S5. In this sense, offline modulation is faster and, because there is no time limit between two iterations, the feedback tuning mentioned earlier can also be introduced to achieve a higher precision. However, offline training depends heavily on a software-based precursor system for computing the parameters, so it is hard to embed in portable devices. In our implementation, we prefer the online scheme with future practical applications in mind.
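A minimal software sketch of online RLS training in a chaotic rate network is given below. It assumes a standard FORCE-style formulation in which only a readout weight vector is trained and its output is fed back into the network. The network size, time step, target signal and the function name force_rls are illustrative assumptions and do not describe the hardware implementation.

```python
import numpy as np

def force_rls(n=200, g=1.5, dt=0.1, steps=3000, alpha=1.0, seed=0):
    """Train readout weights online with RLS; return weights and |error| trace."""
    rng = np.random.default_rng(seed)
    J = g * rng.standard_normal((n, n)) / np.sqrt(n)  # fixed recurrent weights, gain g
    w_fb = rng.uniform(-1.0, 1.0, n)                  # fixed feedback weights
    w = np.zeros(n)                                   # readout weights (trained)
    P = np.eye(n) / alpha                             # inverse correlation estimate
    x = 0.5 * rng.standard_normal(n)                  # initial neuron states
    r = np.tanh(x)                                    # firing rates
    z = 0.0                                           # network output
    t = dt * np.arange(steps)
    target = np.sin(2.0 * np.pi * t / 50.0)           # assumed target pattern
    errors = np.empty(steps)
    for k in range(steps):
        x += dt * (-x + J @ r + w_fb * z)             # Euler step of the rate dynamics
        r = np.tanh(x)
        z = w @ r
        e = z - target[k]                             # error before the update
        Pr = P @ r
        gain = Pr / (1.0 + r @ Pr)
        P -= np.outer(gain, Pr)                       # RLS update of P
        w -= e * gain                                 # RLS update of the weights
        errors[k] = abs(e)
    return w, errors

w, errors = force_rls()
print("mean |error| over the second half of training:", errors[len(errors) // 2:].mean())
```

In this formulation the weights change at every time step, which corresponds to the online scheme. An offline scheme would instead run the same loop in software and write only the final weights to the memristive synapses.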
In this context, the implementation should take into account the limited time interval available to modulate the memristive states at each time step. Fortunately, this is not a concern in principle, because the switching speed of a memristor could be ~10^3 to ~10^6 times faster than brain operation. Accordingly, we can update the weights one by one. However, because of the crossbar architecture, the number of synapses increases rapidly as the network scales up. As a result, the burden on the pulse modulator becomes enormous. One solution is to adopt a parallel programming method in the memristor array. Another is to construct a sparsely connected network with far fewer active synapses to alleviate the modulation burden. As shown in Supplementary Figure S6, our recurrent network is able to produce the right pattern even at a sparseness of 0.3. This promises to support large-scale neuromorphic networks for online learning in the future.
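The reduction in active synapses can be illustrated with a randomly masked weight matrix. The sketch below assumes that a sparseness of 0.3 means each connection is present with probability p = 0.3, and it rescales the remaining weights so that the effective gain g is preserved. Both the interpretation and the function name sparse_recurrent_weights are assumptions made for illustration.

```python
import numpy as np

def sparse_recurrent_weights(n, p, g=1.5, seed=0):
    """Random recurrent weights in which each synapse exists with probability p."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < p                    # True where a synapse is active
    scale = g / np.sqrt(p * n)                       # keeps the overall gain near g
    J = np.where(mask, scale * rng.standard_normal((n, n)), 0.0)
    return J, mask

J, mask = sparse_recurrent_weights(n=500, p=0.3)
active = int(mask.sum())
print(f"{active} active synapses out of {mask.size} crossbar positions")
```

With this assumption, roughly 30% of the crossbar positions would need to be programmed, which reduces the number of pulses required at each time step accordingly.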