Biological plausibility and stochasticity in scalable VO2 active memristor neurons

Neuromorphic networks of artificial neurons and synapses can solve computationally hard problems with energy efficiencies unattainable for von Neumann architectures. For image processing, silicon neuromorphic processors outperform graphic processing units in energy efficiency by a large margin, but deliver much lower chip-scale throughput. The performance-efficiency dilemma for silicon processors may not be overcome by Moore’s law scaling of silicon transistors. Scalable and biomimetic active memristor neurons and passive memristor synapses form a self-sufficient basis for a transistorless neural network. However, previous demonstrations of memristor neurons only showed simple integrate-and-fire behaviors and did not reveal the rich dynamics and computational complexity of biological neurons. Here we report that neurons built with nanoscale vanadium dioxide active memristors possess all three classes of excitability and most of the known biological neuronal dynamics, and are intrinsically stochastic. With the favorable size and power scaling, there is a path toward an all-memristor neuromorphic cortical computer.


Supplementary Figure 4. Structural and compositional characterizations of a 100nm-thick
VO2 film grown on SiNx/Si substrate. a, Grazing incidence X-ray diffraction (GIXRD) spectrum acquired at 1.54059 Å Cu Kα1 wavelength. Indexed lines are the results of a phaseidentification analysis by using the whole pattern fitting method. The best match (at an R factor of 7.28 %) is found with a monoclinic VO2 phase (space group: P21/c (14) PDF# [98-001-4290]). The relative intensities indicate some preferred orientation of the crystallites. b, Highresolution X-ray photoemission spectroscopy (XPS) of the V2p spectral doublet (V2p1/2 and V2p3/2). The V2p3/2 peak is curve fitted in an attempt to quantify the oxidation states of V, showing a dominating V 4+ oxidation state (64 %) with the rest of it being at V 5+ state (36 %). Since XPS only detects the top few nm thickness of the film, the V 5+ state was possibly due to native oxidation after the film was exposed to air. c, Rutherford backscattering spectrum (RBS) showing atomic concentrations of O at (674) % and V at (331) %, or a O:V ratio of 2.03:1. The thickness in RBS is estimated by assuming a density of 7.1510 22 atoms cm -3 . d, Secondary ion mass spectroscopy (SIMS) showing an average O:V ratio of 2.04:1 (in the depth of 40-120 nm).

Supplementary Figure 5. Transmission electron microscopy and electron diffraction
characterizations of a 100nm-thick VO2 film grown on SiNx/Si substrate. a, Cross-sectional bright-field transmission electron microscopy (BFTEM) image of the VO2 sample prepared by focused ion beam cutting. The polycrystalline nature of the VO2 film with columnar grain structures is clearly resolved. The brighter and darker contrasts seen across grain boundaries are caused by electron beam diffractions by lattice planes with a slight tilt from one grain to another. Scale bar: 50 nm. b, High-resolution (HRTEM) image of the VO2/SiNx interface. Scale bar: 4nm. c, BFTEM image of one particular grain ("grain 1") selected for electron diffraction study. Scale bar: 20 nm. d, Selected area electron diffraction (SAED) pattern of grain 1 showing diffraction spots that can be matched nearly perfectly with the d-spacing of a few low-index lattice planes, (002), (100), and (011), from a monoclinic VO2 phase (space group: P21/c (14) PDF# [98-001-4290]). Scale bar: 2 nm -1 . Figure 6. Characteristics of electroform-free VO2 active memristor devices. a, Photo of an array of 576 VO2 crossbar devices (36 reticles and 16 devices in each reticle) fabricated on a 3-inch SiNx/Si substrate. b, Histograms of the switching threshold voltage measured from 288 100  100 nm 2 VO2 devices (top) and 288 50  50 nm 2 devices (bottom) fabricated on the same wafer. The mean values of switching thresholds are size-dependent and tunable by the VO2 film process conditions. c, Temperature dependence of device conductance G near zero bias (50 mV) measured in a heating cycle, showing a sharp Mott transition when temperature rises above 60 C. Thermally activated electron transport in the insulating state is shown by a least-square fit (R 2 = 0.996) by ln(G/T 2 ) vs. 1000/T (dashed line) with a single activation energy of (0.205  0.003) eV. d, Switching endurance data of > 26.6 million pulsedmode on/off switching cycles. All the switching events were measured without subsampling. The data show no sign of device degradation or drift in resistance values during the endurance test. e, Histogram of the ROFF/RON resistance ratio in the endurance measurement of d. Red line is a Gaussian fit with R 2 = 0.99, having a center of 155.65  0.15 and FWHM of 16.72  0.30. f, Four-terminal quasi d.c. I V characteristics (force V, measure I) of the device tested in d before and after the endurance test, showing no deterioration or drift in its switching characteristics. Figure 7. Wafer-scale uniform and electroform-free switching characteristics of VO2 devices. Plotted are 36 sets of four-terminal quasi-d.c. I V traces (force V, measure I) measured from VO2 crossbar devices, sampled one device per reticle from 36 reticles (labeled by the wafer row and column numbers, from 48 to 53) in the same wafer. All the devices are identified as WaferSubsite = D1, meaning that they are all located at the same relative position inside each reticle (D1 is the first device at the bottom left corner in each reticle, as shown in Supplementary Figure 6a). For each device, the force V and measure I sweeps is repeated 10 times at the same setting. Majority of the as-grown VO2 devices, ~98 %, showed upfront resistive switching and NDR in the very first I V sweep without the need of electroforming. The uniformity of switching is demonstrated in both the run-to-run repeatability and the low device-to-device variation in switching thresholds (see Supplementary Figure 8).

Supplementary Figure 8. Statistical dispersions in switching threshold voltage
Vth and device size d of 1152 VO2 nano-crossbar devices, as measured by coefficient of variation V ≡ / (ratio of the standard deviation to the mean ). The device size d, as defined by the electrode linewidth, is measured by a critical dimension scanning electron microscope (CDSEM) system. a, V in switching threshold Vth vs. the mean device size and its inverse 1/ . b, V in device size d vs. the mean device size and its inverse 1/ . A total number of 1152 VO2 devices, with six different designed dimensions, are measured from two 3-inch wafer samples (wafer A: 7575 nm 2 , 150150 nm 2 , 300300 nm 2 , and 600600 nm 2 ; wafer B: 5050 nm 2 and 100100 nm 2 ). Each data point in wafer A is the result from 144 distinctive VO2 devices with the same designed size. For each device, the force V and measure I sweeps is repeated 10 times at the same setting to obtain 20 switching events (including both positive and negative bias polarities). Each data point in wafer B is the result from 288 distinctive VO2 devices in the same manner as in wafer A. At d < 150 nm, V in device size starts to increase linearly with the inverse of device size due to the impact of edge roughness in ebeam lithography. In the worst-case, V ( ) is ~5 % for 5050 nm 2 devices. No such trend is observed in the relative variation of switching threshold, and all the V ( th ) data are less than 13 %. Figure 9. Comparison of switching energy and switching time (speed) of Mott IMT in VO2 and NbO2 devices with channel radius of 5-35 nm. a, Calculated switching energy vs. channel radius for devices with a film thickness of 20 nm. In VO2, due to the much smaller temperature rise needed for IMT to occur (40 K vs. 800 K), the volumetric free energy cost for IMT in VO2 is only one-sixth of that for NbO2 at the same crystal volume, and is less than 1 fJ at channel radii smaller than 7 nm. For details, see Supplementary Table 1. b, Simulated switching time vs. channel radius for devices with a film thickness of 50 nm. SPICE simulations of a VO2-based Pearson-Anson relaxation oscillator (see the circuit in Supplementary Figure 3d) are used to estimate the switching time (speed) of Mott IMT from the rising edges of device current in each oscillation period 16 . The VO2 channel radius is varied while all the other VO2 model parameters are kept the same. Note that the switching speed is a material-dependent parameter, and is not affected by the values of RL and C1 passive components (RL from 5 k to 100 k and C1 of 22 pF were used). Simulations found that, at the same channel dimensions, Mott IMT switching in VO2 is 100 times faster than in NbO2, and is faster than 1 ps at channel radii smaller than 15 nm. Figure 10. All-or-nothing firing behavior in a tonic VO2 neuron circuit. a, Experimentally measured all-or-nothing behavior, showing no response at subthreshold input voltage pulses of 0.1 V and 0.15 V, and spiking at suprathreshold input voltage pulses of 0.25 V and 0.4 V. In the suprathreshold regime, the shape or amplitude of neuron spikes does not change with an increase in the input voltage pulse amplitude. b, Simulated all-or-nothing behavior of the same tonic VO2 neuron circuit. The pulse width in all the data shown is 10 µs. Figure 11. Refractory period behavior in a tonic VO2 neuron circuit. a, Experimentally measured refractory period behavior, showing a spiking in response to the first suprathreshold input voltage pulse, but no response to the second input voltage pulse if it occurred within the refractory period (panels 1 to 4 from the top). The neuron fires again if the second input voltage pulse is outside the refractory period (panels 5 to 7 from the top). From top to bottom, the periods of the input voltage pulse doublets are 20 µs, 40 µs, 60 µs, 80 µs, 100 µs, 120 µs, and 150 µs, respectively. b, Simulated refractory period behavior of the same tonic VO2 neuron circuit. The pulse width in all the data shown is 10 µs. Figure 12. Absolute and relative refractory periods experimentally observed in a tonic VO2 neuron circuit. a, Absolute refractory period behavior, showing that if a second input voltage pulse is applied within the absolute refractory period, regardless of its strength (in this example 1.5 V was used, an amplitude greater than the spike amplitude), the neuron will never fire a second action potential in response. b, Relative refractory period behavior, showing that if the second input voltage pulse applied within the relative refractory period has the same strength as the first input pulse (in this example 0.75 V) , the neuron will not fire a second action potential. c, Relative refractory period behavior, showing that if the second input voltage pulse applied within the relative refractory period is much stronger than the first input pulse (in this example 1.5 V vs. 0.75 V) , the neuron will respond to it and fire a second action potential. The pulse width in all the data shown is 8 µs. Figure 13. Tonic spiking behavior in a tonic VO2 neuron circuit. a, Experimentally measured tonic spiking behavior, showing that the neuron continues to fire a train of spikes in response to a d.c. input current. IRL1, the current flowing through RL1, was monitored by probing the voltage across it using two high-impedance (10 M) oscilloscope probes. The current jitters coinciding with the output spikes are likely caused by the reflection of action potentials toward the neuron input, i.e. "back actions". b, Simulated tonic spiking behavior of the same tonic VO2 neuron circuit in response to a d.c. input current. c, Experimental phase plane of the K + membrane potential VK (aka Vout) vs. the Na + membrane potential VNa. In the phase plane representation, time-domain tonic spiking turns into a limit cycle attractor. d, Simulated phase plane of VK vs. VNa, showing a limit cycle attractor similar to the experimental data in a qualitative manner. The green and magenta dots in a d show the first and last plotted data points. Figure 14. Tunable tonic bursting behavior measured in a tonic VO2 neuron circuit. a-j, Experimental tonic burst patterns measured with a fixed value of C2 capacitor and an increasing value of C1 capacitor (C1 ≫ C2 in all the cases). As C1 becomes larger, both the number of spikes in each burst period and the burst period itself increase. k, Experimental C1 dependence of the number of spike in each burst period. Dashed line is a linear fit (with R 2 = 0.98) which shows a linear trend that the number of spike per period = (1.10.5) + (0.490.03)C1 (nF). l, Experimental C1 dependence of the burst period. Dashed line is a linear fit (with R 2 = 0.997) which shows a linear trend that the burst period (µs) = (17.760.33)C1 (nF). In all the data shown, C2 1 nF comes from the stray capacitance in the experimental setup. Figure 15. Spike frequency adaptation behavior in a tonic VO2 neuron circuit. a, Experimentally measured spike frequency adaptation in tonic burst under a sustained d.c. current stimulation. The spike frequency is relatively high at the onset of tonic bursting, and then it decreases over time, i.e., the neuron adapts. b, Simulated spike frequency adaptation behavior of the same tonic VO2 neuron circuit in response to a d.c. input current. c, Experimental phase plane of the K + membrane potential VK (aka Vout) vs. the Na + membrane potential VNa. A 100-fold down-sampling followed by 5-point adjacent-averaging was applied to the raw oscilloscope data to smooth the curve. d, Simulated phase plane of VK vs. VNa, showing trajectories similar to the experimental data in a qualitative manner. The green and magenta dots in a d show the first and last plotted data points, respectively. Figure 16. Spike frequency adaptation behavior in a phasic VO2 neuron circuit. a, Experimentally measured spike frequency adaptation in phasic burst under a sustained d.c. current stimulation. The spike frequency is relatively high at the onset of stimulation, and then it decreases over time, i.e., the neuron adapts. b, Simulated spike frequency adaptation behavior of the same phasic VO2 neuron circuit in response to a d.c. input current. c, Experimental phase plane of the K + membrane potential VK (aka Vout) vs. the Na + membrane potential VNa. A 10-fold down-sampling followed by 5-point adjacent-averaging was applied to the raw oscilloscope data to smooth the curve. d, Simulated phase plane of VK vs. VNa, showing trajectories similar to the experimental data in a qualitative manner. The green and magenta dots in a d show the first and last plotted data points. Figure 17. Spike latency behavior in a tonic VO2 neuron circuit. a, Experimental spike latencies in responding to suprathreshold 10 µs input voltage (Vin) pulses. Spike delay is longer for a relatively weak input, and it diminishes as the input gets stronger. b, Simulated spike latencies of the same tonic VO2 neuron circuit. c, Dependences of measured and simulated spike latencies on the amplitude of the input pulse. Spike latency is arbitrarily defined as the delay between the onset of Vin and the peak time of spiking. Simulated spike latency  can be fitted (with R 2 = 0.9995) by a logarithmic formula = 0 + ln( − in ), where 0 = (17.29  0.02) µs, = (3.20  0.07) µs/ln(V), and = (0.382  0.007) V. The logarithmic dependence of spike latency on the input amplitude can be accounted for by the logarithmic formula of the capacitor discharge time in a relaxation oscillator 17 . Figure 18. Subthreshold oscillation behavior measured in a tonic Class 2 VO2 neuron circuit. a-c, Subthreshold oscillations in the neuron output, i.e., the K + channel membrane potential, under a sustained d.c. current stimulation of 100 µA, 120 µA, and 140 µA, respectively. It is evident that the frequency of the oscillations increases with the input current level. d-f, Tonic spiking intermixed with occasional subthreshold oscillations (as demonstrated by the missing spikes) in the neuron output under a sustained d.c. current stimulation of 160 µA, 180 µA, and 200 µA, respectively. The transition from subthreshold oscillations to tonic spiking is better observed by ramping up the current stimulation. See Figure 4b in the main text for details. Figure 19. Integrator behavior in a tonic Class 1 VO2 neuron circuit. a, Experimentally measured integrator behavior, showing that the neuron fires an action potential if a doublet of two subthreshold input voltage pulses are applied with sufficiently short interval between them The pulse interval for data in panel 1-5 starting from the top is 4 µs, 6 µs, 10 µs, 11 µs, and 14 µs, respectively. The neuron does not spike if the two subthreshold input voltage pulses are too far apart from each other, i.e., the data in the bottom panel, with an interval of 16 µs. b, Experimental integrator behavior of the same neuron circuit as in a, demonstrated by applying two doublets of subthreshold input voltage pulses in the same measurement. The neuron fires a spike in response to the first input pulse doublet with a shorter interval (5 µs), but does not fire in response to the second input pulse doublet with a longer interval (23 µs). c, Simulated integrator behavior of the same neuron circuit as in b. The pulse width and amplitude in all the data shown is 6 µs and 0.5 V, respectively.

Supplementary Figure 20. Bistability behavior measured in a tonic VO2 neuron circuit. a,
The neuron is driven from the resting mode into a persistent tonic spiking mode (a selfoscillation) by applying the first input pulse stimulation. A second input pulse arriving at an interval of 154 µs successfully switches the neuron from tonic spiking back to the resting mode. b, A second input pulse arriving at an interval of 155 µs fails to switch the neuron from tonic spiking back to the resting mode. In the measurements, 0.85 V and 15 µs wide voltage pulses sent from an arbitrary waveform generator (AWG) were converted into 85 µA input current pulses (Iin) using a stimulus isolator with a gain of 0.1 mA V -1 . Input current was not monitored because the load resistor RL1 was set to be zero to enable bistability. The plotted Iin waveforms are calculated from the monitored AWG voltage waveforms. c, Probability (success rate) of the second input pulse switching off the self-oscillation vs. the pulse interval. Red line is a secondorder polynomial fit. Each data point represents the statistics from 8 to 10 such attempts. At an interval of 154 µs, the success rate is 100 %, or the neuron self-oscillation was switched off in 10 out of 10 attempts. At an interval of 155 µs, the success rate dropped to 62.5 %, or 5 out of 8 attempts. Despite the scattering of data points, it is evident that the success rate peaks at around 154 µs interval, and it drops off as the interval is detuned away. This observation is consistent with the interpretation that the input must arrive at an appropriate phase of oscillation for it to switch the neuron from tonic spiking back to the resting mode. Large scattering in the success rates may be explained by the stochastic onset of tonic spiking, which shifts the phase of oscillation randomly with respect to the fixed intervals used in measurements. Figure 21. Inhibition-induced spiking (IIS) behavior in a tonic VO2 neuron circuit. a, Experimentally measured inhibition-induced spiking behavior, showing that the neuron is quiescent (at rest) when there is no input current, but fires a tonic spike train when it is hyperpolarized by an inhibitory (negative) input current of -90 µA. b, Simulated inhibitioninduced spiking behavior of the same tonic VO2 neuron circuit. In biology, many thalamocortical neurons exhibit the IIS feature. The mechanism was attributed to inhibitory-input induced activation of the h-current and deactivation of the Ca 2+ T-current 18 .

Supplementary Figure 22. Experimental inhibition-induced bursting (IIB) behaviors in tonic VO2 neuron circuits.
The neuron is quiescent (at rest) when there is no input current, but fires irregular bursts of spikes when it is hyperpolarized by an inhibitory (negative) input current of -70 µA. Similar to the case of tonic bursting induced by excitatory (positive) inputs, IIB requires a much slower Na + channel than the K + channel, or C1 ≫ C2. a, IIB measured from a tonic VO2 neuron circuit with the discrete membrane capacitors set at C1 = 35 nF and C2 = 0 nF. b, IIB measured from another tonic VO2 neuron circuit with C1 = 21 nF and C2 = 0 nF. In the test setup, for each discrete capacitor there also exists a stray capacitance of ~1 nF, mostly contributed by the cables. Other circuit parameters can be found in Supplementary Table 2. Figure 23. Experimental excitation block behavior in a Class 2 tonic VO2 neuron circuit. a, Experimentally measured excitation block behavior, showing that the neuron fires tonic spikes when the input current ramps above the spiking threshold at 25 µA. As the stimulus current further increases beyond 95 µA, the neuron suddenly ceases to spike and the output is locked to an elevated value (1.05 V). b, Experimental phase plane of the K + membrane potential VK (aka Vout) vs. the Na + membrane potential VNa for the data shown in a. The green and magenta dots in a and b show the onset of current ramp and the onset of excitation block. The VK VNa trajectory shows characteristics of a distorted letter 'B'-shaped limit cycle attractor. Each loop of the trajectory corresponds to a complete cycle of action potential generation in a. Further increase of the stimulus current beyond 95 µA causes the disappearance of the limit cycle attractor and the locking of neuron state (the magenta dot). A 100-fold down-sampling followed by 16-point adjacent-averaging was applied to the raw oscilloscope data to smooth the curve. c, Experimental phase plane of VK vs. VNa for the time duration of the first tonic spike in a (marked by green and magenta triangles). The loop of letter 'B'-shaped trajectory is marked with arrows. In theory, excitation block is attributed to the conversion from a spiking limit cycle to a supercritical Andronov-Hopf bifurcation phenomenon and is explained in FitzHugh-Nagumo model by the phase plane approach, which shows a stimulus-induced shift of the equilibrium through stable-unstable-stable branches of the 'N'-shaped nullcline 19 . VO2 neuron (b). a, The phasic neuron circuit with a capacitively-coupled input. b, Oscilloscope-captured phasic neuron response to a subthreshold (0.6 V) frequency-sweeping sinusoidal voltage input (ZAP sweep 20 ), showing a pass band centered around ~17 kHz. c, Sporadic spikes occur near ~17 kHz as the ZAP sweep amplitude increased to 0.7 V. d, Intensified spiking near ~17 kHz as the ZAP sweep amplitude further increased to 0.9 V. e, The tonic neuron circuit with all the circuit elements the same as the phasic neuron in a, except for the missing Cin capacitor. f-g, Frequency-domain response of the tonic neuron to ZAP sweeps with an amplitude of 0.6 V, 0.7 V, 0.9 V, respectively. The integrator nature is reflected by the low-pass filter characteristics of the neuron response. Figure 25. Phasic spiking (Class 3 excitable) behavior in a phasic VO2 neuron circuit. a, Experimentally measured phasic spiking behavior, showing that the neuron fires only a single spike at the onset of a d.c. input current, and then it remains quiescent even in the presence of the input current. The plotted input current (Iin) waveform is calculated from the monitored AWG voltage waveform, because in the phasic neuron circuit the load resistor RL1 is replaced with a capacitor Cin. b, Simulated phasic spiking behavior of the same phasic VO2 neuron circuit. Figure 26. Phasic bursting behavior in a phasic VO2 neuron circuit. a, Experimentally measured phasic bursting behavior, showing that the neuron fires only a single period of burst spikes at the onset of a d.c. input current, and then it remains quiescent even in the presence of the input current. The plotted input current (Iin) waveform is calculated from the monitored AWG voltage waveform, because the load resistor RL1 is replaced with a capacitor Cin in the phasic neuron circuit. b, Simulated phasic bursting behavior of the same phasic VO2 neuron circuit. Figure 27. Rebound spike behavior in a phasic VO2 neuron circuit. a, Experimentally measured rebound spike behavior, showing that when the neuron receives and then is released from an inhibitory (negative) input, it fires a post-inhibitory (rebound) spike, in response to the release (the rise edge) of the inhibitory input waveform). b, Simulated rebound spike behavior of the same phasic VO2 neuron circuit. In the case of excitatory input, a phasic spike is fired at the rise edge of the input current (see Supplementary Fig. 25). In the case of inhibitory input, a rebound spike is fired also at the rise edge of the input current. Equipped with a capacitively-coupled input, the phasic neuron essentially acts as a rise-edge detector. Figure 28. Experimental all-or-nothing characteristics in the rebound spike behavior of a phasic VO2 neuron circuit. Similar to the all-or-nothing response to an excitatory stimulus, for an inhibitory input, there also exists a threshold in its amplitude for a rebound spike to be fired when the input is released. a, The lack of response to a subthreshold inhibitory input pulse (-0.4 V), which is not strong enough for the neuron to fire a rebound spike at the rise edge of the input. It is noticed that the Na + channel membrane potential surges up at the rise edge of the input, but it still stays below zero and therefore does not trigger the Na + channel to open. b-c, A rebound spike is fired in response to a suprathreshold inhibitory input pulse of -0.5 V and -0.6 V, respectively. In both cases, the Na + channel membrane potential surges above zero at the rise edge of the input, triggering the coordinated opening or closing of the Na + and K + channels and an action potential generation. d, a close-up view of the greyed-out area in c, showing more details in the time evolution of the input Vin, the Na + channel membrane potential VNa, and the K + channel membrane potential (the neuron output) Vout. Figure 29. Experimental input-duration threshold characteristics in the rebound spike behavior of a phasic VO2 neuron circuit. a, From top to bottom, a rebound spike is fired in response to a suprathreshold inhibitory input pulse of -0.5 V with a duration of 30 µs, 20 µs, 10 µs, and 5 µs, respectively, but the neuron does not fire if the duration is further shortened to 4 µs (the bottom panel). b, a close-up view of the case for 5 µs inhibitory input in a, showing that the Na + channel membrane potential VNa surges above zero at the rise edge of the input (arrow), triggering a rebound spike. c, a close-up view of the case for 4 µs inhibitory input in a, showing that the Na + channel membrane potential VNa surges but stays below zero at the rise edge of the input, and is thus incapable of triggering the rebound spike. Figure 30. Rebound burst behavior in a phasic VO2 neuron circuit. a, Experimentally measured rebound burst behavior, showing that when the neuron receives and then is released from an inhibitory (negative) input, it fires a post-inhibitory (rebound) burst of spikes, in response to the release (the rise edge) of the inhibitory input waveform). b, Simulated rebound burst behavior of the same phasic VO2 neuron circuit.

Supplementary Figure 31. Threshold variability behavior in a phasic VO2 neuron circuit. a,
Experimental threshold variability, showing that the neuron does not fire when it receives a brief subthreshold excitatory or inhibitory input pulse, but it fires a spike if the inhibitory input pulse is followed by an excitatory pulse (both are subthreshold) as long as the interval is short enough. The preceding inhibitory pulse lowers the threshold and makes the neuron more excitable. b, Simulated threshold variability of the same phasic VO2 neuron circuit. Combining the rise edges of the inhibitory and excitatory pulses into one effective rise edge, threshold variability is caused by the same mechanism as the threshold seen in rebound spike (See Supplementary Fig. 28). Figure 32. Experimental depolarizing after-potential behavior of a phasic VO2 neuron circuit. a, Single phasic spikes fired at the onset of d.c. input currents of 200 µA, 300 µA, and 450 µA levels. At relatively weaker input currents, the neuron membrane potential develops the commonly seen hyperpolarizing after-potential (HAP) that goes below the resting level. As the input strengthens, the HAP gradually weakens and morphs into a depolarizing afterpotential (DAP) that goes above the resting level. b, Once a DAP is developed, the neuron has shortened refractory period and becomes superexcitable. A slightly stronger input, from 450 µA to 500 µA, causes the neuron to fire a second spike. The second spike is triggered by Na + channel reactivation when the relative refractory period is nullified by the formation of DAP. Figure 33. Accommodation behavior in a phasic VO2 neuron circuit. a, Experimental accommodation behavior, showing that a slowly ramped input current does not trigger the neuron to fire. In other words, the neuron accommodates the input change and becomes less excitable. A sharply ramped current, however, triggers a spike. All the current ramps have the same maximum amplitude of 150 µA. b, Simulated accommodation behavior of the same phasic VO2 neuron circuit. Figure 34. Mixed-mode spiking behavior of a mixed-mode VO2 neuron circuit. a, A phasic neuron with a capacitive coupling (Cin) to dendritic inputs. b, A tonic neuron with a resistive coupling (RL1) to dendritic inputs. c, A mixed-mode neuron with both capacitive and resistive couplings (Cin in parallel with RL1) to dendritic inputs. Except for the difference in input impedance, the tested neuron circuits in a-c are identical, including the VO2 devices and d.c. biases used. d, Phasic bursting measured from the phasic neuron circuit in a. e, Tonic spiking measured from the tonic neuron circuit in b. f, Mixed-mode spiking, i.e., phasic bursting followed by tonic spiking, measured from the mixed-mode neuron circuit in c. g, Simulated phasic bursting of the phasic neuron circuit in a. h, Simulated tonic spiking of the tonic neuron circuit in b. i, Simulated mixed-mode spiking of the mixed-mode neuron circuit in c. Fig. 6). a f, Scatter recurrence plots of adjacent spike amplitudes at input white noise (peak-to-peak) levels of 5 µApp, 10 µApp, 15 µApp, 25 µApp, 50 µApp, and 75 µApp, respectively, showing that irregularities develop in both the spike timing (See Fig. 6) and the spike amplitude as the input stimulus becomes noisier. g, Dependences of the mean spike amplitude and the skewness in its distribution on the input noise level, showing similar trends of initially a fast decrease with the input noise, then a partial recovery if the input noise level is higher than ~20 µApp. Figure 36. Simulated dynamic power scaling of a VO2 neuron. a, Simulated dynamic spike energy of a tonic VO2 model neuron circuit vs. membrane capacitances, showing a nearly linear scaling with the capacitor values that also determine the neuron area. This is because the energy dissipated in a spike comes from the coordinated discharging of the two membrane capacitors. The dynamic spike energy is calculated by first summing the total power supplied by the -ENa and +EK voltage sources, then integrating over time (see example in b-d).

Supplementary
For simplicity, C1 = C2 was assumed in the simulations. Two sets of VO2 channel dimensions of r/L = 10/10 nm and r/L = 36/50 nm are compared. Here r is the VO2 channel radius, L is the channel length (film thickness). The results show that the impact of the VO2 channel dimensions on the neuron dynamic spike energy is relatively small. By aggressively shrinking the VO2 volume by a factor of 18, from r/L = 36/50 nm to r/L = 10/10 nm, the spike energy is only reduced by 24 %. The top x axes show the calculated total capacitor area at capacitance density of 1 fF μm -2 (blue) and 43 fF μm -2 (magenta), respectively. A dynamic spike energy use <0.1 pJ/spike (green arrow) can be achieved at a total capacitor area of ~1 μm 2 by using 20 fF membrane capacitors, which can be realized by today's integrated high-κ metal-insulator-metal (MIM) capacitors with a record-high capacitance density 21 of 43 fF µm -2 (typical MIM capacitance density of high-κ dielectrics is in the range of 15-20 fF µm -2 ). Note that at a given capacitor area, lower capacitance value (by using lower capacitance density) will result in a lower spiking energy and hence a better EE. This is clearly shown in the EE-area scaling trend lines of VO2 neurons (See Supplementary Fig. 2). b, Time dependent neuron spike waveform at C1, C2 = 10 fF and r/L = 10/10 nm (the circled red dot in a). c, Time dependent total dynamic power supplied by the -ENa and +EK voltage sources. d, Time dependent dynamic energy consumption, calculated by integrating the total dynamic power simulated in c over time. Figure 37. Simulated dynamic and static power scaling of a VO2 neuron over its spike rate. Left axis shows simulated dynamic and static power consumptions of a tonic VO2 model neuron, and the right axis shows the percentage of static power in the total power consumption. Static power is dissipated by the neuron membrane leakage current in the resting state, i.e., leakage current drawn by d.c. biased VO2 devices due to the finite resistivity of the insulating phase (~1 cm). With Vth as the switching threshold voltage, the upper bound (UB) and lower bound (LB) of static power are calculated at d.c. bias Vdc = Vth and Vth/2, respectively. Since the neuron only spikes if (Vin + Vdc)  Vth, the signal gain, capped by Vdc/Vin, is always smaller than Vdc/(Vth -Vdc). The gain will becomes less than 1 if Vdc is less than Vth/2. While static power dissipation is independent of the spike rate, dynamic power dissipation is proportional to the spike rate. Therefore at low spike rates, static power may dominate the total power consumption. At sufficiently high spike rates, the static power makes only an insignificant contribution to the total power consumption, and is not expected to be of major concern for system level energy efficiency. The LB and UB of static power is less than 10 % of the total power at a spike rate higher than 100 MHz and 400 MHz, respectively, and the overall energy use is better than 0.11 pJ/spike. At 100 MHz spike rate, the single neuron total power consumption is 11 µW (LB) to 14 µW (UB). The SPICE simulation assumes a VO2 channel of r/L = 10/10 nm, and the VO2 model neuron has an energy use of 0.1 pJ/spike at C1, C2 = 38.3 fF (see the red dashed line in Supplementary Fig. 36a for details). Other methods, e.g. single damascene, can also be used.

Supplementary Tables
Supplementary Table 1. Comparison of volumetric free energy cost for Mott phase transitions in NbO2 and VO2 materials. Values for NbO2 are from Table 1 in Ref. 16 (and references therein). At the same volume, the free energy cost for NbO2 phase transition is 6.1 times that for VO2. The total volumetric enthalpy change in a VO2 nano-crossbar device with channel radius r = 10 nm and length L = 10 nm is merely 1.15 fJ. Spiking behavior Figure  No.

Material
All-or-nothing S10 (Continued on next page) Spiking behavior Figure  No.

Supplementary Table 4. Biological fidelity and computational cost of neuron models in comparison with experimentally demonstrated biological fidelity of VO2 neurons.
The neurocomputational properties of neuron models are adopted and augmented from Fig. 2 in Ref. 18. "# of FLOPS" is the approximate number of floating point operations needed to simulate the neuron model for a 1 ms duration using a digital computer. (+), (-) and empty square represents possessed, missing, and unconfirmed properties of the model. For VO2 neurons, the only property that remains unconfirmed is chaos.

Supplementary Note 1: VO2 active memristor relaxation oscillator
VO2 is well-known for its first-order thermodynamically-driven Mott insulator-to-metal (IMT) phase transition with a critical temperature TC near 67 C 25 . Joule heating produced by electrical current through a metal/VO2/metal device generates Mott IMT-induced volatile hysteretic resistive switching and an NDR regime, which forms the basis to construct oscillators, amplifiers, and impulse circuits (neurons). Mott memristors, a type of active memristors based on Mott IMT, were previously realized by producing crystalline NbO2 in an electroforming process from amorphous Nb2O5 films 26 . Electroformed devices suffer from large device variability that is undesirable for integration. In our case, electroform-free VO2 active memristor nano-crossbar devices with typical device yield of 98-100 %, low-voltage (down to ~0.5 V), high-endurance, and low device variability (<13 % coefficient of variation in switching threshold voltage) are fabricated on CMOS-compatible 3-inch SiNx-coated silicon substrates (See Methods). Pearson-Anson (PA) relaxation oscillator is the prototype electronic circuit analogue for voltage-gated Na + or K + nerve membrane ion channels. Other ion channels, e.g. Clor Ca 2+ , can also be emulated in a similar manner. If two such relaxation oscillators are coupled with proper impedance, the overall circuit can generate an action potential [26][27][28] . Supplementary Fig. 3 shows the circuit diagram and astable oscillator characteristics measured in a VO2 relaxation oscillator. A one-to-one correspondence can be identified between the quasi d.c. V-I trace (force V, measure I) of the VO2 device (without the capacitor) and the V-time and I-time waveforms of the astable oscillations under an external d.c. bias Vdc 17 . For astable oscillation to occur, the load line, defined by Vdc and the load resistor RL, must intersect the V-I curve in its NDR regime. A complete cycle of astable oscillation, from point (1) to (5), has four stages as explained by figure caption. The actual switching time scale of VO2 is much faster than the rise time of the oscilloscope and cannot be measured. Values ranging from 100 fs to 5 ps have been measured by pump-probe methods 29,30 .
Although many transition metal oxides exhibit Mott IMT, TC in many of these materials are well below 300 K (room temperature). Mott insulators with TC > 300 K, e.g. VO2, Ti2O3, Ti3O5, NbO2, SmNiO3, LaCoO3, are more suitable for electronic applications 31 . NbO2 is a demonstrated material for spiking neurons 26 . However, its TC of 1080 K requires a large local temperature rise of 800 K to operate, which negatively impacts both power consumption and device longevity 31 (See Supplementary Table 1 for volumetric free energy cost of Mott IMT in NbO2 and VO2). We applied SPICE simulations of a VO2-based relaxation oscillator to estimate the switching energy and switching time (speed) of Mott IMT 26 . The switching time of the phase transition is estimated from the rising edges of device current in each oscillation period. The VO2 channel radius is varied while all the other model parameters, including the channel length (50 nm), are fixed. As shown in Supplementary Fig. 9, at the same channel dimensions, simulated Mott IMT switching in VO2 is 100 times faster than in NbO2, and only consumes about one-sixth (16 %) of the energy. Note that the switching speed is a material-dependent parameter, and is not affected by the values of R, C passive elements. <1 fJ switching energy and <1 ps switching speed can be achieved at VO2 channel radius of 7-15 nm, dimensions feasible for advanced-node lithography.

Supplementary Note 2: Device modeling of VO2 active memristors
We use the same analytical mathematical equations developed by the authors of Ref. 16 Supplementary Figure 40. Circuit diagram of a tonic memristor neuron circuit with legends of voltages and currents to assist modeling.
Applying KVL at the joint connecting RL1, C1, X1 and RL2, Supplementary Eq. (14) is then rewritten in the form of a first-order ODE: Supplementary Eqs. (7), (9), (15) and (16) Experimentally it's more convenient to probe the Na + and K + channel membrane potentials Na = 1 − 1 and K = 2 + 2 . Substituting 1 with Na and 2 with K in the above equations, the dynamics equations can be recasted as We noted that the dynamic equations (17) and (18) have been presented in Ref. 32. However, in that reference, the dynamic equations of state variables u1 and u2 are not used for reduceddimension VNa VK nullcline analysis, instead a hard switching between two preset resistance values RON and ROFF were assumed. Such an overly-simplified approach unavoidably will miss some important aspects of the nonlinear dynamics of Supplementary Eqs. (7) and (9). Applying Ohm's law, we rewrite Supplementary Eqs. (7) and (9) as Supplementary Eqs. (19) and (20), wherein the channel currents 1 and 2 are replaced by the corresponding membrane potentials Na and K .

Supplementary Note 4: Dynamic and static power scaling of VO2 neurons
We use SPICE simulations to analyze the dynamic and static power scaling of tonic VO2 neurons (See Supplementary Figs. 37 and 38). The dynamic spike energy is calculated by first summing the total power supplied by the d.c. voltage sources, then integrating over time through the course of a spike.
In Supplementary Fig. 36, the dynamic spiking energy scales almost linearly with the capacitance of membrane capacitors, with a power-law fitted slope of 0.96 at r/L = 10/10 nm and 0.924 at r/L = 36/50 nm, respectively. The neuron area also scales linearly with the membrane capacitance as capacitor elements dominate the circuit area. It is therefore desirable to make smaller neurons to achieve higher (dynamic) spiking energy efficiency (EE) (See Supplementary  Fig. 2). Note that area scaling of membrane capacitors is not a limiting factor for the neuron area scaling, because memristor neurons do not require a minimum value of membrane capacitors to operate, and therefore there is no size constraint posted by the requirement on certain membrane capacitance value. Another observation is that since VO2 switching energy is extremely low (See Supplementary Table 1), only 1.15 fJ/device at r/L = 10/10 nm, aggressive VO2 device scaling is not needed: an 18-fold volume reduction, from r/L = 36/50 nm to 10/10 nm, only reduces the neuron spike energy by ~24 %.
If only considering dynamic power consumption for the case of r/L = 36/50 nm, <0.1 pJ/spike energy use can be achieved at a total capacitor area of ~1 μm 2 by using 20 fF membrane capacitors (see green arrow in Supplementary Fig. 36a), which can be realized by today's integrated high-κ metal-insulator-metal (MIM) capacitors with a record-high capacitance density 21 of 43 fF µm -2 (typical MIM capacitance density of high-κ dielectrics is in the range of 15-20 fF µm -2 ). However, using high-κ dielectric to boost the capacitance density is not a good strategy to achieve higher EE. At a given capacitor area, lower capacitance value (by using lower capacitance density) will result in a lower spiking energy and hence a better EE. This is clearly shown in the EE-area scaling trend lines of VO2 neurons (See Supplementary Fig. 2) stimulated at capacitance density of 1, 10, and 43 fF μm -2 . At the same neuron (and capacitor) area, lower capacitance density translates into lower dynamic spiking energy and hence higher EE. At 1 fF μm -2 capacitance density, VO2 neurons show superior EE-area scaling than the best-case HH cells at neuron sizes smaller than 70 μm 2 , and can surpass the estimated human brain EE of 1.810 14 spike/J (or 5.6 fJ/spike energy use) at neuron sizes smaller than 3 μm 2 .
The static power consumption is dissipated by standby current through d.c. biased VO2 devices due to the finite resistivity of the insulating phase (~1 Ωcm) 24 . At low firing rates, static power may dominate the total power consumption. Since the dynamic power is proportional to the firing rate, while the static power remains a constant, the percentage of static power in total power decreases with firing rate. The lower and higher bounds of static power, estimated at d.c. bias of Vth and Vth/2 (Vth is the switching threshold) respectively, is <10 % of the total power at a firing rate higher than 100 MHz and 400 MHz, respectively, and the overall energy use is lower than 0.11 pJ/spike. We have not considered the possibility that the insulating-phase resistivity of VO2 can be improved. Note that the neuron EE is not the only factor that determines the network power consumption. The firing rate, the synapse resistance, and the synapse/neuron ratio also need to be considered.