Step-like dependence of memory function on pulse width in spintronics reservoir computing

Physical reservoir computing is a type of recurrent neural network that applies the dynamical response from physical systems to information processing. However, the relation between computation performance and physical parameters/phenomena still remains unclear. This study reports our progress regarding the role of current-dependent magnetic damping in the computational performance of reservoir computing. The current-dependent relaxation dynamics of a magnetic vortex core results in an asymmetric memory function with respect to binary inputs. A fast relaxation caused by a large input leads to a fast fading of the input memory, whereas a slow relaxation by a small input enables the reservoir to keep the input memory for a relatively long time. As a result, a step-like dependence is found for the short-term memory and parity-check capacities on the pulse width of input data, where the capacities remain at 1.5 for a certain range of the pulse width, and drop to 1.0 for a long pulse-width limit. Both analytical and numerical analyses clarify that the step-like behavior can be attributed to the current-dependent relaxation time of the vortex core to a limit-cycle state.

A recurrent neural network (RNN) is categorised as a class of artificial neural networks having recurrent interactions between a large number of neurons [1].The RNN can memorise data inputted into the system through recurrent interactions.The dynamical response from the network therefore reflects a time sequence of the input data.In this respect, RNNs have attracted much attention not only in fundamental sciences but also applied sciences such as information processing of time-dependent data, for example, human voices and robotic motions.Physical reservoir computing is a model of RNNs where many-body systems, called reservoirs, are used as the networks [2][3][4][5][6][7][8][9], enabling it to bridge the gap between neural science, information science, biology, and physics.Physical reservoir computing have been performed in several kinds of physical systems, such as optical lasers, soft materials and quantum reservoirs [10][11][12][13][14].
Recent studies have discovered that a fine-structured ferromagnet can also be applied to physical reservoir computing [15][16][17][18][19][20][21][22][23][24].By applying an electric current to a magnetic multilayer in nanoscale, spin transfer [25,26] from conducting electrons to a local magnetisation induces nonlinear magnetisation dynamics such as magnetisation switching and limit-cycle oscillation [27][28][29][30].Such a dynamical response from the ferromagnet is divided into several nodes, with each node regarded as a virtual neuron.The method to construct a virtual manybody system, in this case, a reservoir, is called a timemultiplexing method [12,15,17].The high performance of a voice recognition using a vortex-type spin-torque oscillator [15] has evidently proven that spintronic systems are effective for physical reservoir computing.While such exciting works for practical applications have been reported, the fundamental properties of physical reservoir computing have yet to be fully clarified.For example, the role of the physical parameters, such as the current magnitude and magnetic damping constant, on the performance of physical reservoir computing remains unclear.Further development of physical reservoir computing, or a wide range of brain-inspired computing in general, relies crucially on the clarification of the relation between the physical parameters and computation performances.
In this work, a theoretical study is carried out for physical reservoir computing using a vortex-type ferromagnet.The short-term memory (STM) and parity-check (PC) capacities are evaluated as a function of the pulse width of input data given by current pulses.The capacities show a step-like dependence on the pulse width, where the capacities' values remain at 1.5 for a certain range of the pulse width, and drop to 1.0 for a long pulse-width limit, where the range of the pulse width depends on the material parameters and input-current strength.It is clarified that the step-like behaviour originates from the current-dependent relaxation time of the vortex core.

MODEL
Figure 1(a) schematically shows a magnetic trilayer consisting of a vortex-type free layer and a uniformlymagnetised reference layer separated by a thin nonmagnetic spacer.The z axis is normal to the film-plane.The unit vector pointing in the magnetisation direction of the reference layer is denoted as p = (p x , 0, p z ) (|p| = 1), where the x axis is chosen to be parallel to the projection of p to the xy plane.The dynamics of the magnetic vortex core are known to be well described by the Thiele equation [31][32][33][34][35][36].By applying an electric current density (a) Schematic diagram of the magnetic multilayer consisting of a vortex-type free layer and a uniformlymagnetised reference layer.(b) Time evolution of the normalised centre position of the vortex core in the presence of a direct current.The inset shows a limit-cycle oscillation of the vortex core in a steady state.
J, the spin-transfer torque [25,26] induces the dynamics of the vortex core described by the Thiele equation for the core position X = (X, Y, 0) [31][32][33][34][35][36], where consist of the saturation magnetisation M , the gyromagnetic ratio γ, the Gilbert damping constant α, the thickness L, the disc radius R and the core radius R 0 of the free layer.The polarity p and the chirality c are assumed to be +1 for convenience.The normalised centre position of the vortex core is s = |X|/R.The nonlinear parameter of the damping torque is denoted as ξ.The magnetic potential is where κ = (10/9)4πM 2 L/R and ζ = κ ′ /κ ≃ 1/4 [32,36].The spin-transfer torque strength with the spin polarisation P is a J = π P/(2e).A positive current corresponds to the current flowing from the reference layer to the free layer.The values of the parameters are estimated from experiments and simulations [36][37][38][39] as M = 1500 emu/cm 3 , γ = 1.764 × 10 7 rad/(Oe s), α = 0.005, L = 4 nm, R = 150 nm, R 0 = 10 nm, ξ = 1/4, and P = 0.3.The vortex dynamics are described by two dynamical variables, which are (X, Y ) in a Cartesian coordinate or the normalised distance s = |X|/R of the core centre from the disc centre and the phase of the core position in the xy plane.In previous experiments on reservoir computing in Refs.[15,17], however, only s is used for the computing.Therefore, we are also inclined to focus on the dynamics of s in the following discussion.Simultaneously, we note that recent works have focused on reservoir computing using the phase of the vortex oscillation [21,22].
Figure 1(b) shows the time evolution of the vortex core position s in the presence of a direct current of I = πR 2 J dc = 4.0 mA.The vortex core position s is saturated after a time on the order of 1 µs, and a limitcycle oscillation of the vortex core around the disc centre is excited, as shown in the inset.The relaxation phenomenon plays a key role on the memory function of reservoir computing, as discussed below.We also note that the core position in the limit-cycle state is not constant because the spin-transfer torque originating from the in-plane component of the magnetisation p breaks the rotational symmetry of the vortex core around the z axis.As a result, the distance s of the vortex core from the disc centre for the circular motion is not constant, as shown in Fig. 1(b).We note that a small-amplitude oscillation disappears when p x in Eq. ( 1) is zero.In the experiments of Refs.[15,17,21,22], however, p x remains finite because the output signal from the oscillator, generated through the tunnel magnetoresistance effect, is determined by the number of magnetic moments having the projection to the p direction.On the other hand, the critical current density to excite the vortex motion is determined by p z , as shown below.Therefore, we keep both terms proportional to p x and p z in Eq. ( 1) finite.In the experiments, we apply a small external magnetic field pointing in the z direction to the in-plane magnetised reference layer and make both p x and p z finite [15,17,21,22].
Reservoir computing in the system used in this study is performed by applying random binary pulse inputs to the magnetic trilayer.For example, the input data was encoded in the amplitude of the direct current [15,17] or microwave magnetic field [22] in previous works.The input data in this work is encoded in the amplitude of the direct current as where J dc is the magnitude of the direct current density and ν is the ratio of the binary pulse with respect to the bias current density J dc .In this work, we use ν = 0.2, except the results shown in Fig. 5 where ν = 0.05.The random binary data is bi(t) = 0 or 1, which is constant during a pulse width t p .
Figures 2(a) and 2(b) show the dynamics of the centre position of the vortex core s(t) from the random binary input with pulse widths of 0.5 µs and 3.0 µs, respectively.Six pulses, divided by dotted lines, represent the binary data "001100" as shown by the black line for example.The core position saturates to the value shown in Fig. 1(b) when the binary pulse of bi = 0 is inputted, whereas it saturates to a large value when a binary pulse of bi = 1 is inputted because the spin-transfer torque due to the additional current, νJ dc , pushes the vortex core away from the disc centre.As a result, the relaxation dynamics between the two states is observed when the value of the binary pulse changes.When the pulse width is shorter than the time to saturate the vortex core position, a gradual change in the vortex-core position can be observed, as shown in Fig. 2(a).On the other hand, the vortex core remains almost constant during a pulse when the pulse width is longer than the time to saturate the vortex core position, as shown in Fig. 2(b).

Evaluation of short-term memory and parity-check capacities
As a figure of merit for reservoir computing, we evaluate the STM and PC capacities.The memory capacity characterises the number of data points the reservoir can store and is a dimensionless quantity.The STM capacity is the memory capacity for a linear combination of the input data as defined in Eq. ( 5) below.The PC capacity is the memory capacity for nonlinear data, where a nonlinear transformation is applied to the input data, as shown in Eq. ( 6) below.In general, the memory capacity depends on the sequence of the input data, therefore, it is defined as an average of the number of stored data with respect to random input.
The time-multiplexing method is applied to construct a virtual many-body system from the dynamical response of a single ferromagnet.The dynamical response s(t) during a pulse is divided into N node , and s k,i corresponding to s(t) at the ith node in the presence of the kth pulse is regarded as the ith neuron at a discrete time k [12,15,17].By applying N = 1000 pulses and dividing the output s(t) into N node = 250 nodes, we determined the weight w D,i minimising the error between the system output and the target data given by where the bias term is s k,N node +1 = 1 [12].The integer D(= 0, 1, 2, • • • ) represents the delay [12,16,17,22].Since reservoir computing calculates the time sequence of input data, the capability of a reservoir is characterised by the number of past input data stored in it.The delay D characterises such past input data; for example, it shows whether the (k − D)th input data can be reproduced from the kth output.Accordingly, the delay D is a quantity related to the order of the input data, and thus, is an integer.It does not relate to any time scale of physical systems, such as physical delay time.In this paper, the symbol D appeared in quantities related to reservoir computing and is used as an integer represents the delay, whereas the symbol D that appeared in the Thiele equation represents the damping strength.The data the reservoir should reproduce is called target data or output, and is constructed from the input data.The target outputs for the STM and PC tasks are, respectively, given by where bi k is the kth random binary input data.
After determining the weight, we apply different N ′ = 1000 random binary pulses as a test set and evaluate the reproducibility of the target output v ′ n,D (n = 1, 2, • • • , N ′ ) from the system output defined as where s ′ n,i is the vortex-core position in the presence of N ′ random pulses.Figure 2(c) shows the target output (red) and the system output (blue) for the STM task with a pulse width of 3.0 µs, where the delay varies as D = 1, 2 and 3 from left to right.The system output well reproduces the target output when the delay is small, whereas the reproducibility becomes low as the delay increases.The reproducibility is quantitatively characterised by the correlation coefficient Cor(D) between the target output and system output defined as where • • • is the averaged value.Figures 3(a) and 3(b) show the square of the correlation coefficients for the STM and PC tasks, respectively, at pulse widths of t p = 0.5, 3.0 and 5.0 µs.The correlation coefficient becomes small as the delay D increases and becomes nearly zero as shown in Fig. 3.This is also indicated in Fig. 2(c) where the reproducibility of the input data decreases as the delay increases.The STM and PC capacities, denoted as C STM and C PC , respectively, are defined as The maximum delay D max in this work is set to 20, which is sufficient to evaluate the saturated values of the capacities.We also note that the number of the training data, N = 1000, is also sufficient to saturate the values of the capacities; see Supplemental Information.Figure 3(c) shows the dependences of C STM (red) and C PC (blue) on the pulse width.It should be emphasised that both C STM and C PC show step-like behaviour with respect to the pulse width.When the pulse width is relatively short ( 2.0 µs), the capacities remain at large values over 2.0.In a middle range of the pulse width (2.0 t p 4.5 µs), on the other hand, the capacities suddenly drop to 1.5.The capacities again drop to 1.0 when the pulse width becomes longer (t p 4.5 µs).

Role of current-dependent relaxation on memory function
The step-like behavior of the capacities in Fig. 3(c) can be attributed to the current-dependent relaxation phenomenon of the vortex core.To explain this point, we first focus on an analytical solution of the Thiele equation.Neglecting the small spin-transfer torque due to the in-plane component of the reference layer's magnetisation and higher order terms of s (|s| ≤ 1) and α (α ≪ 1), the Thiele equation for the vortex centre position s becomes ṡ = as − bs 3 , where a and b are defined as Equation ( 10) is the Stuart-Landau equation [40,41] for the real variable s.The vortex core is stabilised at the disc centre when a < 0, whereas a limit-cycle oscillation appears when a > 0. The solution of Eq. ( 10) for a > 0 with the initial condition s(t = 0) = s(0) is given by The solution saturates to lim t→∞ s(t) = a/b.The normalised core position should satisfy 0 ≤ s ≤ 1.Therefore, the condition to excite the limit cycle oscillation is J c1 < J < J c2 , where J c1 = (|D|κ)/(Ga J p z ) and J c2 = J c1 (1 + ξ + ζ) are determined by the conditions of a > 0 and a/b < 1, respectively.We note that the saturated core position depends on the disc radius R through the parameter κ ∝ 1/R.The parameter κ determines the magnetic potential energy of the vortex core, as shown in Eq. ( 2), and becomes small as the disc radius R increases because, in a large disc, a tiny displacement of the vortex core does not change the magnetic energy significantly.Since the spin-transfer torque should overcome the damping torque, which helps keep the vortex core close to the disc centre to minimise the magnetic energy, in driving the vortex-core dynamics, the critical current density J c1 is proportional to the parameter κ.Accordingly, the saturated core position a/b at a given current density J depends on the disc radius R through the parameter κ in the critical current density.Roughly speaking, the saturated core position becomes large as the disc radius R increases because the critical current density becomes small for a large R. Equation (10) indicates that the time scale for the exponential evolution of the vortex core, given by 1/a, depends on the input data through the current density; a large current density results in a fast relaxation.For example, 1/a for the current density J with bi = 0 is 0.36 µs for the present parameters, whereas it becomes 0.14 µs when bi = 1.
Before proceeding to the discussion, we give a brief comment on the relation between the parameter 1/a and the time scale for reservoir computing.Reservoir computing estimates the information of the past input from the temporal response of the reservoir.Therefore, the relaxation phenomenon is key to performing physical reservoir computing.The parameter 1/a(≃ 0.36 or 0.14 µs) characterises the exponential relaxation of the vortex core.We, however, note that nearly e −1 ≃ 0.37, i.e., 37% of the relaxation process to the saturated value a/b is not completed even after a time scale of 1/a has passed.The vortex dynamics, such as those at the last part of the relaxation process, can also be used in reservoir computing.As a result, the time scale of the relaxation dynamics of the vortex core applicable to reservoir computing is longer than 1/a.For example, the pulse width (∼ 4.5 µs) at which the capacities drop from 1.5 to 1.0 shown in Fig. 3(c) is not exactly the same with the parameter 1/a, although such a drop of the capacities is related to the relaxation phenomenon of the vortex core, as explained below.Therefore, we use the word "relaxation time" in the following as a time scale to determine the performance of reservoir computing.However, defining the relaxation time quantitatively is difficult because it depends on the number of significant figures in the calculations.We also note that the relaxation time depends on the current magnitude, as in the case of the parameter 1/a, because of the current-dependent relaxation dynamics.If the relaxation time is faster than the pulse width, the system rapidly relaxes to a steady state and loses the history of the time-series data.In other words, the pulse width should be shorter than the relaxation time of the reservoir.An excessively short pulse width is, however, also not preferable to reservoir computing because the change in dynamical response with respect to a pulse input becomes small for a short pulse width.
Regarding these points, the identification of the data inputted one sequence before the present pulse can be explained as follows.As an example, let us assume that the vortex core remains in a steady state under a binary pulse of bi = 0.For convenience, we name this pulse as the (k − 1)th pulse.If the next (kth) pulse is bi = 0, the vortex state is unchanged, as shown in Fig. 4(a).On the other hand, the vortex core changes the position if the next pulse is bi = 1, as shown in Fig. 4(b).In other words, the response with respect to the kth pulse is affected by the (k − 1)th pulse.Therefore, the value of the (k − 1)th pulse can be estimated from the response under the kth pulse, corresponding to [Cor(D = 1)] 2 = 1.This fact results in capacities of 1.0 in a long pulse-width limit shown in Fig. 3(c).
Another saturated value, 1.5, of the capacities in the middle pulse-width range is explained in a similar way, by taking into account the current-dependent relaxation time.Let us consider the injection of three pulses, named as the (k − 2)th, (k − 1)th and kth pulses, as shown in Figs.4(c) and 4(d).The solid and dotted lines in the figures show the vortex dynamics when the (k − 2)th and (k − 1)th binary pulses are different and the same, respectively.In Fig. 4(c), the (k − 1)th pulse is bi = 1.In this case, the vortex core saturates rapidly, and thus, the vortex core is in a steady state at the end of the (k − 1)th pulse, independent of the (k − 2)th pulse.This fact indicates that the (k − 2)th input data cannot be identified from the vortex dynamics under the presence of the kth input.On the other hand, when the (k − 1) th pulse is bi = 0, the vortex dynamics under the presence of the kth pulse depends on the (k − 2)th pulse.This is because the relaxation time is slow, and thus, the vortex state at the end of the (k − 1)th pulse reflects the state under the (k − 2)th pulse, as shown in Fig. 4(d).As a result, the (k − 2)th input can be identified when the (k−1)th pulse is bi = 0. Therefore, the reproducibility of the (k − 2)th data is 50%, resulting in [Cor(D = 2)] 2 = 0.5 and the capacity value of 1.5.As can be seen in this explanation, the step-like behaviour of the memory function in Fig. 3(c) can be attributed to the currentdependent relaxation mechanism of the vortex dynamics.

Memory capacity for different input strength
The role of the relaxation time of the vortex core on the memory function for reservoir computing can be discussed from a different viewpoint.Figure 5(a) shows the dynamics of the centre position of the vortex core in the presence of six pulses with widths of 0.5 µs.The situation is the same with that shown in Fig. 3(a), but here, we use a relatively weak input pulse characterised by ν = 0.05.Because of the small input-pulse strength ν, the change of s(t) with respect to the input current is small compared with that shown in Fig. 2(a).
Figure 5(b) shows the dependences of the STM and PC capacities on the pulse width for ν = 0.05.Note that we argue above that the value of 1.5 in the memory capacity reflects the difference in the relaxation times between two states.Due to the small parameter ν, the relaxation time from the bi = 0 to bi = 1 state becomes close to that of the opposite case, compared with the system studied in the previous sections.As a result, the reservoir can distinguish both signals in a relatively large pulse-width range.Therefore, the range corresponding to the value of C = 2 for the memory capacities is enlarged, compared with that found in Fig. 3(c) in the main text, and the step-like behavior is observed even in a short pulse-width range.In addition, the pulse width at which the jump from C = 2 to C = 1.5 occurs is shifted to a long pulsewidth region.The results also indicate that the step-like behaviour originates from the current-dependent vortex relation, which is controlled by the parameter ν.In Supplemental Information, we also provide the data showing the STM and PC capacities for various values of ν, which also support the argument here.

DISCUSSION
In summary, physical reservoir computing using magnetic vortex-core dynamics in a fine-structured ferromagnet was performed by solving the Thiele equation numerically.The step-like dependence of the STM and PC capacities on the pulse width was found, where the capacities remain at a value of 1.5 for a certain range of the pulse width, and drop to 1.0 for a long pulse-width limit.Such a half-integer memory capacity originated from the current-dependent relaxation mechanism of the vortex core, where a fast relaxation caused by a large input led to a fast fading of the input memory, whereas a slow relaxation by a small input enabled the reservoir to keep the input memory for a relatively long time.Using a small input-pulse made the difference of the relaxation times between the two states small and suppressed such a fast fading memory.Accordingly, the input-pulse range corresponding to a relatively large capacity, 2.0, was enlarged.
In practical applications, the input data for reservoir computing is converted to electrical inputs by the data conversion, and a preprocessing method is applied to the data [17].The present work indicates that an appropriate choice of pulse width is necessary in the data conversion to achieve high performance of physical reservoir computing.The value of the appropriate pulse width relates to the relaxation time of the reservoir.Spintronics technology provides various kinds of structures, such as macrospin, domain wall, spin-wave and/or skyrmions, by arranging the materials, designs and topologies.The process of the data conversion will depend on the devices because the relaxation time depends on the structures.This study therefore provides a crucial guideline for such device design.

METHODS
The Thiele equation is solved by the fourth-order Runge-Kutta method.The minimisation of the error between the system output and the target data is performed using the Moore-Penrose pseudo inverse matrix determined by the singular value decomposition.
FIG. 1.(a) Schematic diagram of the magnetic multilayer consisting of a vortex-type free layer and a uniformlymagnetised reference layer.(b) Time evolution of the normalised centre position of the vortex core in the presence of a direct current.The inset shows a limit-cycle oscillation of the vortex core in a steady state.

FIG. 2 .FIG. 3 .
FIG.2.Time evolutions of the vortex-core position (red) and binary pulses (black) with pulse widths of (a) 0.5 µs and (b) 3.0 µs.The six binary pulses denote the input "001100".The dotted line represents unit pulse width tp.(c) Target output (red, solid) and system output (blue, dashed) of the STM task.The delay varies as D = 1, 2, and 3 from left to right.The pulse width is 3.0 µs.

FIG. 4 .
FIG. 4. (a), (b)Schematic pictures of the vortex dynamics (red) in the presence of two ((k − 1)th and kth) pulses (black), where the kth and (k − 1)th pulses are the same in (a) and different in (b).The blue circles represent the nodes.(c), (d), Schematic views of similar dynamics with three pulses, where the solid and dotted lines show the cases when the values of the (k − 2)th and (k − 1)th binary pulses are different and the same.

7 FIG. 5 .
FIG. 5. (a) Time evolutions of the vortex-core position (red) and the binary pulses (black) with widths of 0.5 µs and (b) dependences of the STM (red) and PC (blue) capacities on pulse width, where the dimensionless parameter ν that determinines the difference in magnitudes of the binary inputs is 0.05.