Introduction

Two-dimensional (2D) materials have attracted significant interest for the downscaling of CMOS (complementary metal-oxide-semiconductor)1,2,3, as well as for beyond-CMOS electronic applications4,5. Their atomic scale thicknesses and pristine (i.e., dangling-bond free) surfaces could enable ultra-dense integration for next-generation integrated electronic systems5. Consequently, many studies have evolved from the demonstration of isolated devices (e.g., field effect transistors or FETs) based on exfoliated flakes towards large-area methods for fabrication of integrated circuits with 2D materials6,7,8,9,10,11,12. While early device demonstrations focused predominantly on FET applications13,14,15,16, recent studies have proposed memory and neuromorphic devices based on the non-volatile resistive-switching (NVRS) behavior observed in various 2D materials including transition metal dichalcogenides (TMDC)17, black-phosphorus18,19, graphene20,21, hexagonal boron nitride (h-BN)22,23,24,25,26,27,28,29,30, etc. These devices are generally configured in vertical two-terminal structures, where the resistive switching layer is sandwiched between top and bottom metal electrodes. The use of 2D materials has enabled the demonstration of devices with atomically thin resistive switching layers having low voltage operation27 and fast switching speeds23,24. Chemical vapor deposition (CVD)-grown h-BN has attracted much attention for use as the resistive switching layer due to its compatibility with large-area wafer-scale fabrication, and arrays of h-BN memristors have been reported11. In CVD-grown h-BN devices the resistive switching process is attributed to the formation and rupture of conductive paths via penetration of metal ions into defects at h-BN grain boundaries.

Initial studies of h-BN memristors reported on their non-volatile resistive switching behavior observed as transitions or hysteresis in measurements of DC current–voltage characteristics23,24,27,29. Previous work11 has also shown the programming of multiple resistive states in h-BN memristors by the application of consecutive voltage pulses, although using significantly larger pulse widths (milliseconds) compared to what is reported here (nanoseconds). Pulsed programming is required for practical memory and neuromorphic computing applications. Moreover, the pulsed programming of multiple conductive states is critical for the implementation of synaptic plasticity (i.e., long-term potentiation and depression) in neuromorphic hardware, as well as for the analog-based implementation of machine learning functions in memristor arrays31,32. For example, most analog-based implementations of neural networks and/or machine learning hardware based on memristor crossbars rely on dot-product (i.e., multiply-accumulate) operations33,34,35. Here, the accumulated currents at the outputs of the array result from the product of input voltage signals (input vector) and the conductance of the memristors in the array (column vectors). Nevertheless, this basic function has not been reported in arrays of h-BN memristors.

This paper presents the wafer-scale fabrication of memristor arrays based on CVD-grown h-BN resistive switching layers, and their multi-state analog programmability. We focus on the experimental demonstration of dot-product operation on h-BN memristor arrays and on the hardware implementation of multi-variable stochastic linear regression. This work extends beyond existing demonstrations of NVRS behavior in isolated h-BN memristors and paves the way for more sophisticated demonstrations of machine learning applications based on 2D materials.

Results

Fabrication of h-BN memristor arrays

Multilayer CVD-grown h-BN was transferred from copper onto a 90 nm SiO2/Si substrate patterned with Au bottom electrodes. The h-BN film was then shaped using standard photolithographic and etching techniques to expose the bottom electrodes. Subsequently, we prepared top electrodes through patterning and Ti deposition using e-beam evaporation and lift-off (see “Methods” and Supplementary Fig. 2 for fabrication details). Figure 1a shows a schematic of the fabricated Au/h-BN/Ti memristor arrays, where the Au bottom electrode (BE) is shared across several devices, each having an independent Ti top electrode (TE) (1 × 3 and 1 × 10 arrays are shown). Figure 1b illustrates the cross-section of the Au/h-BN/Ti memristor. Figure 1c is a photograph of the memristor arrays on a 2 cm by 2 cm SiO2/Si wafer. A micrograph of the fabricated h-BN memristor arrays shown in Fig. 1d corroborates the dimensions of the 100 µm × 100 µm square pads and the electrodes with 3 µm × 3 µm active areas (see Supplementary Fig. 1 for 20 µm × 20 µm and 50 µm × 50 µm active areas). Figure 1e shows a cross-section transmission electron microscopy (TEM) image of a typical Au/h-BN/Ti memristor. From the TEM image we confirm the thickness of the CVD-grown multilayer h-BN film (~8–10 nm) corresponding to approximately 15–20 atomic layers. Moreover, we can observe local defects that facilitate metallic penetration from the top electrode (Ti) to form conductive paths (i.e., conductive nanofilaments) responsible for the resistive switching behavior in the h-BN memristors.

Resistive-switching properties

Individual h-BN memristors from the arrays were measured electrically to evaluate their resistive-switching properties (see “Methods” for details on electrical characterization). Current–voltage (IV) characteristics were obtained by sweeping a voltage across the top and bottom electrodes while measuring current. Figure 2a plots 100 consecutive cycles of IV measurements on an Au/h-BN/Ti memristor with a 3 µm × 3 µm active area. A compliance current of 0.1 mA was applied for positive voltages. The numbered labels indicate the sweeping sequence during the IV measurement. As shown, clear transitions occur between resistive states, evidence of a forming-free bipolar resistive-switching (RS) operation with low cycle-to-cycle resistance variability and low set and reset voltages (approximately 1 and −1 V). The cumulative distribution plot of the resistive states extracted at a read voltage of 0.1 V from all 100 cycles is shown in Fig. 2b. Two distinct states labeled as HRS (high resistance state) and LRS (low resistance state) are easily observed, as their distributions are separated by approximately two orders of magnitude. Another illustration of the HRS and LRS distributions is provided in Fig. 2c, where the resistances are plotted as a function of the cycle number. A histogram of the set and reset voltages corresponding to transitions between HRS and LRS is shown in Fig. 2d. All results indicate a stable and reliable bipolar RS operation.
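The state extraction described above can be sketched in a few lines of Python: a resistance is obtained from the current sampled at the 0.1 V read voltage, and the HRS/LRS separation is expressed in orders of magnitude. The function names and the two read currents below are hypothetical placeholders chosen for illustration, not values measured in this work.

```python
import math


def resistance(read_current_a, read_voltage_v=0.1):
    """Resistance (ohm) from the current sampled at the read voltage: R = V / I."""
    return read_voltage_v / read_current_a


def window_decades(r_hrs, r_lrs):
    """Separation between HRS and LRS in orders of magnitude."""
    return math.log10(r_hrs / r_lrs)


# Hypothetical read currents (A), chosen only to illustrate a two-decade window.
r_lrs = resistance(1e-5)  # low resistance state
r_hrs = resistance(1e-7)  # high resistance state

print(f"LRS = {r_lrs:.3g} ohm, HRS = {r_hrs:.3g} ohm, "
      f"window = {window_decades(r_hrs, r_lrs):.1f} decades")
```

Applied to every cycle of a measured sweep, the same two functions would yield the distributions that a cumulative plot summarizes.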

We also explore the dependence of the IV characteristics and of the HRS and LRS statistics on h-BN memristor active area. In Fig. 2e we compare the IV characteristics from devices with 3 µm × 3 µm, 20 µm × 20 µm, and 50 µm × 50 µm active areas. All devices were measured for 100 cycles and the results show good repeatability with limited cycle-to-cycle variation. The difference in active area has a larger effect on the HRS, and this is easier to identify in the cumulative distribution plot shown in Fig. 2f. Here, the HRS and LRS resistances are shown for the three devices (all 100 cycles) extracted at a read voltage of 0.1 V. While the LRS distributions are only minimally affected by active area, we see a clear trend in HRS: the HRS resistance decreases with increasing active area. This trend in HRS and LRS with active area has been previously reported for different filamentary-based RS memories36,37. Figure 2g is a box plot showing the distribution of HRS and LRS as a function of cell area side length (3, 20, and 50 µm). The plot includes the raw data (circles), the standard deviation (size of box), and the mean values (solid horizontal lines). We note that while cycle-to-cycle variability is comparable to previous resistive-switching technologies (e.g., oxide-based RRAM38), device-to-device variability remains large. This is likely due to nonuniformity of the h-BN film and may be reduced by optimizing the synthesis and transfer methods.

Multistate non-volatile pulse programmability

Achieving multiple conductive states through the application of programming pulses is critical for the implementation of neuromorphic hardware and for the analog-based implementation of machine learning functions in memristor arrays. We investigate the multistate pulse programmability of the Au/h-BN/Ti memristors by applying a sequence of positive/negative voltage pulses (pulse width is 500 ns, amplitudes indicated in Fig. 3). After each pulse a small read voltage of 0.1 V is applied to read the current (conductive state) of the device (see Fig. 3a top panel). The results are shown in Fig. 3b, where 100 cycles of 50 positive pulses followed by 50 negative pulses were applied. The gray lines are the results from each individual cycle and the solid red line with circles is the average from all 100 cycles. The results show a gradual change in conductance (from ~4 to 10 µS) indicating good analog (i.e., multistate) programmability. Due to the fast-switching behavior (nanoseconds), a low energy consumption per programming pulse of $$E_{\mathrm{pulse}} = I \cdot V \cdot t_{\mathrm{pulse}} \approx 125$$ fJ is achieved. We note that this can be further reduced to aJ/pulse by applying a low compliance current as previously reported on h-BN memristors11. The non-volatile property of the conductive states is also demonstrated by retention tests where current is sampled over 100 s (read voltage 0.1 V) following the application of the programming pulses (Fig. 3c, d). Figure 3c plots current for different programming cycles where the number of positive pulses was varied from two up to twenty. Immediately after the last positive pulse we apply and hold a 0.1 V read voltage and sample current every second for 100 s (see Fig. 3a bottom panel). After the retention test the negative pulses are applied and we then proceed to the next cycle. The results from the retention test are shown in Fig. 3d, where the current is plotted as a function of the retention time (longer retention tests up to $$10^4$$ s confirming a stable, non-volatile response are shown in Supplementary Fig. 6). The results confirm the endurance and robustness of the conductive filaments and demonstrate the multistate non-volatile pulse programmability of Au/h-BN/Ti memristors.
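As a rough numerical illustration of the energy estimate and of the conductance readout described above, the following Python sketch evaluates the product of current, voltage, and pulse width, and converts a read current into a conductance. Only the 500 ns pulse width and the 0.1 V read voltage are taken from the text. The 1 V pulse amplitude and the 0.25 µA pulse current are assumptions chosen so that the product matches the quoted ~125 fJ, and the function names are hypothetical.

```python
def pulse_energy(current_a, voltage_v, width_s):
    """Energy (J) of one rectangular programming pulse: E = I * V * t_pulse."""
    return current_a * voltage_v * width_s


def conductance(read_current_a, read_voltage_v=0.1):
    """Conductive state (S) from the current sampled at the read voltage."""
    return read_current_a / read_voltage_v


T_PULSE = 500e-9   # pulse width from the text (s)
V_PULSE = 1.0      # assumed pulse amplitude (V)
I_PULSE = 0.25e-6  # assumed current during the pulse (A)

energy_fj = pulse_energy(I_PULSE, V_PULSE, T_PULSE) * 1e15
print(f"Energy per pulse: {energy_fj:.0f} fJ")

# A read current of 0.4 uA at 0.1 V corresponds to a 4 uS conductive state.
print(f"Conductance: {conductance(0.4e-6) * 1e6:.1f} uS")
```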

Dot product with h-BN memristor arrays

The dot-product operation is crucial for neuromorphic computing and machine learning hardware. For example, a dot-product operation is typically used in neural networks implemented on memristor crossbar arrays to accumulate currents at the outputs (i.e., the post-synaptic neurons). Here, the input voltage signals (the input vector, v) multiply the conductances of the memristor array (the column vector, G) to accumulate an output current (I). This is achieved in hardware through Ohm’s and Kirchhoff’s laws as given by $$I = \sum_i v_i G_i$$. This dot-product operation has been previously reported on oxide-based memristors34,39, but not on recently developed h-BN memristor arrays. Here we demonstrate the most basic implementation of dot-product on an array of two h-BN memristors where the accumulated current is given by $$I = v_1G_1 + v_2G_2$$. The experimental setup is illustrated in Fig. 4a. As shown, for each memristor we can switch between a pulse source (used to program the memristor conductances G1 and G2) and a voltage source to apply the read voltage on the memristors (v1 and v2). During the read operation we measure the output current through the shared bottom electrode. Figure 4b plots the total current measured with a read voltage of 0.1 V (v1 = v2 = 0.1 V) following the application of consecutive programming pulses (positive then negative). We show the case with both memristors pulsed (i.e., both are programmed with voltage pulses), the case with only one of the memristors pulsed, and the case with none pulsed (20 cycles shown for each case). For each cycle we also sweep the read voltage (v1 = v2 = Vread) between −0.15 and +0.15 V and measure the total current after all 30 positive programming pulses. The results from these voltage sweeps are shown in Fig. 4c. For the case where both memristors were pulsed (blue lines), the conductances G1 and G2 are both high (LRS) and therefore the current is the largest. When none of the memristors are pulsed (black lines), both G1 and G2 are low (HRS), and the current is the lowest. When only one memristor is pulsed, its conductance is high (LRS) while the other memristor’s conductance is low (HRS), and the magnitude of the current is between the first two cases. The results in Fig. 4c indicate good linear behavior of the memristor IV characteristics (needed for reliable dot-product operation)40 and show good repeatability (small cycle-to-cycle variation).
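The three cases discussed above can be reproduced qualitatively with a minimal Python sketch of the dot product on a shared output line. The 0.1 V read voltage is from the text. The two conductance values are illustrative assumptions, borrowed from the ~4 to 10 µS range quoted for pulse programming, and are not the measured conductances of the two devices in Fig. 4.

```python
def array_current(voltages, conductances):
    """Total current at the shared bottom electrode: I = sum_i v_i * G_i."""
    if len(voltages) != len(conductances):
        raise ValueError("voltages and conductances must have the same length")
    return sum(v * g for v, g in zip(voltages, conductances))


G_LOW = 4e-6    # assumed conductance of a memristor that was not pulsed (S)
G_HIGH = 10e-6  # assumed conductance of a pulsed memristor (S)
V_READ = 0.1    # read voltage from the text (V)

cases = {
    "none pulsed": (G_LOW, G_LOW),
    "one pulsed": (G_HIGH, G_LOW),
    "both pulsed": (G_HIGH, G_HIGH),
}

currents = {}
for label, g in cases.items():
    currents[label] = array_current((V_READ, V_READ), g)
    print(f"{label}: {currents[label] * 1e6:.2f} uA")
```

Under these assumed values the total current increases from the unpulsed case to the case with both memristors pulsed, which is the ordering described for Fig. 4c.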

Discussion

We now demonstrate the implementation of stochastic multivariable linear regression on an h-BN memristor array. In this implementation we use an h-BN memristor array to predict the profit of startup companies given their investment in marketing and in research and development (R&D). Our model is trained using a dataset from 50 startup companies available online41. In this implementation, the memristor conductances (G1 and G2) are the model parameters. The training process is illustrated in Fig. 5a. For each training step a single sample from the dataset is randomly selected (the sample includes profit, marketing, and R&D, all in thousands of dollars). The input variables (marketing and R&D) are translated (normalized) to voltages between 0 and 0.15 V. These voltages are applied to the h-BN memristors (v1 and v2). We have previously confirmed that for this range of read voltages the IV response is linear, and the dot-product operation is reliable (see Fig. 4c). This is important for the implementation of linear regression as the prediction (h) is determined from the output current of the h-BN memristor array given by the dot product as

$$I = v^{\mathrm{T}}G,\quad v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix},\quad G = \begin{bmatrix} G_1 \\ G_2 \end{bmatrix}. \qquad (1)$$

The prediction is then compared against the training sample (y = profit) from which we determine the error and the required update for each of the model parameters (ΔG1 and ΔG2) (see “Methods” for details of the implementation). Here we use a hardware-compatible approach to update the model parameters whereby a single programming pulse is applied to each memristor42,43,44, and the polarity of the pulse is determined by the sign of ΔG1 and ΔG2. This programming pulse slightly adjusts the conductances to ultimately minimize the error in the prediction. To achieve good convergence, stochastic regression algorithms typically limit the parameter updates with a learning rate that is gradually reduced as training progresses42,43,45. In our experiments the learning rate is implemented by gradually reducing the amplitude of the programming pulses. We reduce the amplitude of the programming pulses by 0.1% after each iteration (starting with ±1 V, the pulse amplitude is reduced to ±0.67 V after 400 training steps). The width of positive and negative programming pulses is kept fixed at 500 ns throughout the training process.
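The amplitude schedule can be checked with a short Python sketch. It assumes that the 0.1% reduction compounds, i.e., that each iteration multiplies the previous amplitude by 0.999. The function name is hypothetical.

```python
def pulse_amplitude(step, v0=1.0, decay=0.001):
    """Pulse amplitude magnitude (V) after `step` iterations of compounding decay."""
    return v0 * (1.0 - decay) ** step


for step in (0, 100, 200, 400):
    print(f"step {step:3d}: {pulse_amplitude(step):.2f} V")
```

With these settings the amplitude after 400 steps evaluates to about 0.67 V, in agreement with the value quoted above. A linear decrement of 0.1% of the initial amplitude per step would instead give 0.60 V, so the quoted figure is consistent with the compounding interpretation assumed here.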

Figure 5b–d shows the results of the stochastic linear regression implementation. In Fig. 5b we plot the training data (black dots) as well as the model prediction before (magenta plane) and after 400 training steps (green plane). As shown, the trained model predicts the profit of startup companies based on their investments in marketing and R&D much better than the model before training. A more quantitative result is shown in Fig. 5c where we plot the mean squared error (MSE) as a function of the training step (i.e., iteration) as given by $$\mathrm{MSE} = (1/N)\sum_i \delta_i^2$$ where N is the sample size (50 in this case) and $$\delta_i = h_i - y_i$$ is the error in the prediction. As shown, the MSE decreases with training, indicating good convergence of the algorithm. The mean absolute error (MAE) was also calculated and is shown in Supplementary Fig. 4. Figure 5d shows the change in conductances G1 and G2 (the model parameters) during the training process. We see larger updates and fluctuations in the conductances during the initial training steps, and eventually convergence to the optimal values for the model parameters.
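For reference, the two error metrics mentioned above can be written as a minimal Python sketch. The predictions and targets in the example are made-up numbers used only to exercise the functions. They are not taken from the dataset or from the measurements.

```python
def mean_squared_error(predictions, targets):
    """MSE = (1/N) * sum_i (h_i - y_i)^2."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((h - y) ** 2 for h, y in zip(predictions, targets)) / len(targets)


def mean_absolute_error(predictions, targets):
    """MAE = (1/N) * sum_i |h_i - y_i|."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(h - y) for h, y in zip(predictions, targets)) / len(targets)


# Made-up predictions h and targets y (thousands of dollars).
h = [100, 150, 210]
y = [110, 140, 200]
print(mean_squared_error(h, y), mean_absolute_error(h, y))
```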

In this article, we have reported the fabrication and characterization of Au/h-BN/Ti memristor arrays. We have presented statistics for the nonvolatile resistive switching behavior of h-BN memristors, including the effects of cell active area. We have then focused on establishing the non-volatile multistate pulse programmability of the h-BN memristors based on multiple cycles of consecutive programming pulses and on retention tests. Our results show successful multistate programming of conductive states with good stability. Moreover, we have presented the implementation of dot-product operations on h-BN memristor arrays and shown good linearity and repeatability, which are crucial for machine learning hardware. Finally, we have demonstrated the hardware implementation of stochastic multivariable linear regression on an h-BN memristor array. Our hardware-compatible implementation shows good convergence and represents an important milestone in advancing the research and implementation of 2D materials for machine learning hardware. It also paves the way for more sophisticated demonstrations of machine learning algorithms using 2D materials, devices, and circuits.

Methods

h-BN memristor and memristor arrays fabrication

The Au/h-BN/Ti memristor arrays were fabricated on a 90 nm SiO2/Si wafer. First, the bottom electrodes (5 nm Cr/35 nm Au) with 3, 20, and 50 µm widths were patterned on the substrate via photolithography and e-beam evaporation. Second, CVD-grown multilayer h-BN on copper from Graphene Supermarket was transferred onto the prepared SiO2/Si substrate by a wet transfer method. Third, the h-BN film was patterned using oxygen plasma to expose the 100 µm by 100 µm bottom electrode pads. Finally, the top electrodes (70 nm Ti) were patterned with the same electrode widths and the same methods as the bottom electrodes. The top electrodes are exposed to air and a thin surface layer may be oxidized over time. This oxidized layer can be easily penetrated with probe needles during measurements, and its impact on the resistive switching behavior has been ruled out by comparing against devices with Au-capped top electrodes that show very similar characteristics (see Supplementary Fig. 5). More details about the fabrication process and a diagram of the fabrication flow are provided in Supplementary Fig. 2.

Electrical characterization

The electrical characterization was conducted on a Cascade semi-automatic probe station using a Keithley 4200 semiconductor characterization system. The DC IV measurements were performed using source measure units (SMUs), while the pulse programming experiments used a combination of pulse measure units (PMU, model 4225) for programming pulses and SMUs for reading currents. In the pulse programming experiments we switched between PMU and SMU automatically using a Keithley remote amplifier/switch (4225-RPM). Supplementary Fig. 3 shows the experimental setup.

Linear regression test

Our implementation of multivariable stochastic linear regression on the Au/h-BN/Ti memristor arrays was trained using a dataset available online41. The experimental demonstration was done with a Keithley 4200 SCS using a custom test script developed in the Keithley user library tool (KULT) and executed in the Keithley interactive testing environment (KITE). The input parameters to the test script are the minimum and maximum conductance values for each memristor (predetermined based on pulse measurements, used to normalize output currents from the array), the initial values for the programming pulse amplitudes, the constant value for the width of the programming pulses, and the number of iterations. The test script loads the training data and normalizes the independent variables (in this case marketing and R&D investments in thousands of dollars) to voltages between 0 and 0.15 V. We also subtract a constant offset (y-intercept) from the dependent variable (profit) so that the model is based only on two regression coefficients (model parameters represented by the memristor conductances). The script then goes into a loop where it randomly selects a sample from the data set and applies the read voltages (v1 and v2) that correspond to the independent variables of that sample. The current $$I = v_1G_1 + v_2G_2$$ is read at the output of the h-BN memristor array (shared bottom electrode) and is converted from amperes to dollars to be compared against the training sample. This read operation is conducted with the Keithley SMUs. We then calculate the error (δ) in the prediction as well as the required update for each model parameter (i.e., ΔG1 and ΔG2). From the minimization of the cost function (i.e., $$\delta^2/2$$) the updates are calculated as ΔG = −δv. Here we propose a simplified hardware-compatible regression approach where the memristor conductances (i.e., the model parameters) are updated through the application of a single programming pulse, and the polarity of the pulse is determined by the sign of the corresponding ΔG. The programming pulses are applied using the Keithley 4225 PMU (pulse width is fixed to 500 ns). Gradient descent algorithms typically use learning rate decay to improve convergence, where model parameter updates are weighted by a learning rate (α) that is reduced gradually as training advances. In our hardware demonstration we introduce learning rate decay by gradually reducing the amplitude of the programming pulses (we have reduced the amplitude of the programming pulses by 0.1% after each iteration).
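The control flow of the training loop described above can be sketched in Python as a software simulation. This is not the KULT test script used in the experiments. The `ToyMemristor` class, its linear response of 2% of the conductance window per volt, the 4 to 10 µS window, the synthetic training samples, and the choice to keep targets in amperes (skipping the conversion to dollars) are all assumptions made for illustration.

```python
import random

G_MIN, G_MAX = 4e-6, 10e-6  # assumed programmable conductance window (S)
V_MAX = 0.15                # upper read voltage from the text (V)


class ToyMemristor:
    """Hypothetical device: each pulse shifts G by a step that scales with amplitude."""

    def __init__(self, conductance):
        self.g = conductance

    def pulse(self, polarity, amplitude):
        step = 0.02 * (G_MAX - G_MIN) * amplitude  # assumed linear response
        self.g = min(G_MAX, max(G_MIN, self.g + polarity * step))


def read_current(voltages, cells):
    """Dot product I = v1*G1 + v2*G2 at the shared bottom electrode."""
    return sum(v * c.g for v, c in zip(voltages, cells))


def mse(samples, cells):
    """Mean squared error over all samples, with targets expressed in amperes."""
    return sum((read_current(v, cells) - y) ** 2 for v, y in samples) / len(samples)


def train(samples, steps=400, v0=1.0, decay=0.001, seed=0):
    rng = random.Random(seed)
    cells = [ToyMemristor(G_MIN), ToyMemristor(G_MIN)]  # arbitrary starting state
    mse_before = mse(samples, cells)
    amplitude = v0
    for _ in range(steps):
        voltages, target = rng.choice(samples)          # one random training sample
        delta = read_current(voltages, cells) - target  # prediction error
        for v, cell in zip(voltages, cells):
            delta_g = -delta * v                        # update from the cost delta^2 / 2
            if delta_g != 0.0:
                # Single pulse per memristor; only the sign of delta_g is used.
                cell.pulse(1 if delta_g > 0 else -1, amplitude)
        amplitude *= 1.0 - decay                        # learning-rate decay via amplitude
    return [c.g for c in cells], mse_before, mse(samples, cells)


def make_synthetic_samples(n=50, g_true=(8e-6, 5e-6), seed=1):
    """Synthetic (voltages, target current) pairs; not the startup dataset."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        v = (rng.uniform(0.0, V_MAX), rng.uniform(0.0, V_MAX))
        samples.append((v, v[0] * g_true[0] + v[1] * g_true[1]))
    return samples


samples = make_synthetic_samples()
conductances, mse_before, mse_after = train(samples)
print(f"MSE before: {mse_before:.3e} A^2, after: {mse_after:.3e} A^2")
```

Because the read voltages are non-negative, ΔG1 and ΔG2 share the sign of −δ, so in this toy model both simulated cells receive pulses of the same polarity at every step. The sketch therefore illustrates only the sequence of operations (read, compare, single-pulse update, amplitude decay) and the resulting reduction in error. It does not model the state-dependent response or device-to-device variation of real h-BN memristors.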