Graphene-based 3D XNOR-VRRAM with ternary precision for neuromorphic computing

Recent studies on neural network quantization have demonstrated a beneficial compromise between accuracy, computation rate, and architecture size. Implementing a 3D Vertical RRAM (VRRAM) array accompanied by device scaling may further improve such networks' density and energy consumption. Individual device design, optimized interconnects, and careful material selection are key factors determining the overall computation performance. In this work, the impact of replacing conventional devices with microfabricated, graphene-based VRRAM is investigated at the circuit and algorithmic levels. By exploiting a sub-nm thin 2D material, the VRRAM array demonstrates improved read/write margins and a lower read inaccuracy for the weighted-sum procedure. Moreover, energy consumption is significantly reduced in array programming operations. Finally, an XNOR logic-inspired architecture designed to integrate 1-bit ternary precision synaptic weights into graphene-based VRRAM is introduced. Simulations on VRRAM with metal and graphene word-planes demonstrate 83.5% and 94.1% recognition accuracy, respectively, denoting the importance of material innovation in neuromorphic computing.


INTRODUCTION
Deep neural networks (DNNs) have made significant progress in the field of brain-inspired learning for various applications, including voice and image recognition 1,2. For practical purposes, such computation-intensive tasks can be performed in graphic processing units 3. This suggests that, within the framework of the von Neumann paradigm, the on-chip application of DNNs is significantly constrained by the separation of memory and processing units.
In an attempt to further simplify the neural network, recent studies have obtained a beneficial compromise between reducing the immense model size (~16-32×) and the deterioration of learning accuracy 18,19. This compromise was achieved by quantizing 32-bit floating-point weights to 1-bit binary (−1, +1) precision. Consequently, the inference computation is also simplified: the vector-matrix multiplication with floating-point weights can be replaced by addition/subtraction in a Binary Neural Network 18. In turn, this simplification can be further optimized into an XNOR and bit-counting operation in XNOR-Net 19. It is noted that the model can provide improvements in energy efficiency, computation rate, and cost by using a weight-pruning technique, thus achieving ternary precision (−1, 0, +1) 20. Therefore, such algorithms can be practically implemented using binary RRAM devices as synaptic networks 21-24. In conventional cross-point architectures, frequently used for two-terminal memories, the number of critical mask steps increases rapidly as the stack number increases 25. This limits both the bit-cost efficiency and the integration density of the whole array. To overcome such limitations, 3D vertical stacking technologies, such as VNAND for flash memories 26 and VRRAM for resistive memories, are currently implemented to achieve high-density arrays with optimal bit-costs 25,27-29. VRRAMs have demonstrated good write/read margins, energy consumption, and parallel programming properties 23,27,30. However, using conventional metal materials as a word-plane (WP) electrode may, owing to their intrinsic parasitic properties, limit both the planar and vertical sizes of 3D VRRAM 28. In addition, recently proposed techniques for weighted-sum (WS) operations with binary RRAM devices limit the full application of multiple layers 31. These factors impede the application of 3D VRRAM in neuromorphic computing for large image dataset recognition.
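The replacement of a binary dot product by XNOR and bit counting, described above, can be illustrated with a minimal sketch. It assumes NumPy; the function names (`binary_dot_xnor`, `ternary_dot`) and the boolean encoding of ±1 are illustrative choices, not taken from the cited works.

```python
import numpy as np

def binary_dot_xnor(x, w):
    """Dot product of two {-1, +1} vectors via XNOR and bit counting."""
    xb = x > 0                      # encode +1 as True, -1 as False
    wb = w > 0
    agree = int(np.count_nonzero(~(xb ^ wb)))  # XNOR, then popcount
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * agree - x.size

def ternary_dot(x, w):
    """Same idea for ternary weights: pruned (zero) weights are skipped."""
    mask = w != 0
    return binary_dot_xnor(x[mask], w[mask])

rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=400)        # binarized inputs
w_bin = rng.choice([-1, 1], size=400)    # binary weights
w_ter = rng.choice([-1, 0, 1], size=400) # ternary weights

print(binary_dot_xnor(x, w_bin) == int(x @ w_bin))
print(ternary_dot(x, w_ter) == int(x @ w_ter))
```

The ternary case shows why pruning is "natural" in this scheme: zero weights simply drop out of the bit count.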
Therefore, a holistic approach integrating emerging devices, circuits, and system-level analysis is required to overcome these issues. With its remarkable electronic and thermal conductivity, graphene is a potential candidate to replace metal-based interconnects for various devices, including 3D VRRAM 32,33. In the case of the VRRAM, integrating graphene sheets as a WP electrode will drastically change the response of the individual devices within an array 34-36, thus requiring a distinctive programming scheme to have a positive impact on the system. Consequently, inspired by the studies on 1-bit ternary precision quantization 18-20, applications with RRAM devices 21-24, and 3D VRRAM technology 25,27-30, this work investigated the potential of the graphene-based VRRAM array as a memory-centric, neuromorphic computing platform. The graphene edges were used as electrodes in individual devices to extract device characteristics for statistical analysis. Based on the intrinsic behavior of the devices, a large-scale multilayer 3D vertical RRAM array was simulated for read, write, and weighted-sum operations. With the XNOR algorithm-inspired architecture, the graphene-based VRRAMs resulted in considerably higher recognition accuracy compared to VRRAMs using conventional metals such as Pt. The difference was found to be the result of both the improvement of the device and the enhanced performance of the interconnect.

RESULTS
3D VRRAM single device structure and response
In this work, fabricated TiN/HfOx/Pt and TiN/HfOx/graphene structures are referred to as Pt-RRAM and Gr-RRAM, respectively, denoting 3D VRRAM with platinum (Pt) and graphene plane electrodes. Figure 1a,b depicts the cross-sectional schematic of Pt-RRAM and Gr-RRAM single devices with two stacked layers. In 3D VRRAM, the active memory cells are sandwiched between pillar and multilayer plane electrodes 27,35,37. The thicknesses of the Pt and graphene electrodes are 5 nm and ~3 Å, respectively. Figure 1c-e shows high-resolution transmission electron microscopy (TEM) images of cross-sections of both Pt-RRAM and Gr-RRAM devices. The Al2O3 layer was used as an adhesion promoter since it has a higher surface energy than thermally oxidized SiO2. Nevertheless, fabricating an extra adhesion layer is not a crucial stage, as high-quality graphene can be transferred directly onto the SiO2 substrate 37.
The DC I-V characteristics of Pt-RRAM and Gr-RRAM devices with two layers are shown in Fig. 1f. In contrast to the Pt-based devices, we can achieve unconventional switching in VRRAM with graphene plane electrodes, where one of the noticeable distinctions is the inverted polarity of the programming voltage. In conventional 3D memory, the SET operation is achieved by applying a positive bias to the pillar electrode (TiN) and a negative bias to the WP electrode (Pt) 27,28, whereas for Gr-RRAM, the SET is carried out by applying a negative bias to the pillar electrode (TiN) and a positive bias to the plane electrode (graphene). This difference can be explained by the fact that in RRAMs, TiN is commonly used as an oxygen reservoir, and a TiOxN1−x interfacial layer is formed at the TiN/HfOx interface, facilitating the accumulation and the discharge of the oxygen ions 27. Although such a principle is also applicable to Gr-RRAM 35,37, distinctive device features were implemented by utilizing graphene as an active electrode, as it can also be operated as a stand-alone oxygen reservoir 34. Conducting the soft dielectric breakdown and initiating the primary defects in the bulk metal oxide layer determines the further behavior of the Gr-RRAM device by activating one of the electrodes. This means that the polarity of the forming voltage of graphene-based VRRAM with the TiN/HfOx/graphene structure dictates toward which electrode (TiN or graphene) oxygen ions will initially migrate, assigning the consequent programming operation characteristics. More detailed illustrations are shown in Supplementary Fig. 1.
The stochasticity in device response, and programming schemes
Before estimating VRRAM array performance, it is important to ascertain the repeatability of the single devices over time. The device DC responses for 30 cycles are shown in Fig. 2a, b for Pt-RRAM and Gr-RRAM devices, where median curves are shown in blue and red, respectively. Furthermore, Supplementary Fig. 2 shows the SET voltage variations in distinctive quasi-static sweeps for 100 cycles in both VRRAMs, where the cycle-to-cycle variations (σ/μ) were found to be 13% and 6.4% for Pt-RRAM and Gr-RRAM devices, respectively. The switching voltages and the currents are considerably lower for Gr-RRAM than for Pt-RRAM due to its much thinner graphene electrode and its highly focused electrical field at the edge 34. The lack of a TiOxN1−x barrier layer, which impedes current conductance at the pillar electrode and the active memory interface, is another reason for low voltage operation 34,36.
Due to the stochastic nature and the unconventional switching behavior of the Gr-RRAM devices, a protocol must be set up for safe write and read operations based on experimental results. As these devices have asymmetric bipolar characteristics, the write operation can be categorized as SET and RESET, shown in Fig. 2c, d. As a result, one can establish criteria for array programming conditions depending on the biasing scheme. For the considered ½ bias scheme in a 3D Pt-RRAM array, the safe programming conditions should include an applied voltage pulse in the range of 1.5 to 2 V for SET and −1.5 to −2.4 V for RESET to ensure switching without ½ bias disturbance of half-selected cells (i.e., V_W = {2 V, −2.4 V}). In Gr-RRAMs, a voltage range of −1.0 to −1.5 V for SET and 0.4 to 0.5 V for RESET is expected to switch the selected RRAM cell; therefore, the applied pulse should have an amplitude of V_W = {−1.5 V, 0.5 V}. Otherwise, the resistive switching in the metal oxide may result in probabilistic behavior, in which the switching probability roughly follows a Gaussian distribution. For a safe read operation, the read pulse should be in the range of 0.1 V to ½ of the minimum safe write amplitude; in this work, it is 0.1 V (i.e., V_R = 0.1 V). Figure 3a, b displays the DC characteristics from five randomly chosen Pt-RRAM and Gr-RRAM devices. Supplementary Fig. 3 shows the cumulative distribution function for HRS and LRS in several measured devices. Overall, both VRRAMs show comparable uniformity, while Gr-RRAM has a larger memory window. The HRS and LRS resistances are relatively high with the graphene WP, which is beneficial for array applications. Under pulse measurements, Gr-RRAM has a tolerable fluctuation at a 1.5 V/500 ns programming condition while maintaining a minimum detected ON/OFF ratio greater than 10 (Fig. 3c).
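The ½ bias criterion above can be checked with a short sketch. The write amplitudes and the lower edges of the switching ranges are the values quoted in the text; the pass/fail rule itself (a half-selected cell sees half the write amplitude and must stay below the smallest switching magnitude of the same polarity) is a simplifying assumption, and the function name is hypothetical.

```python
def half_bias_is_safe(v_write, v_switch_min):
    """True if a half-selected cell (seeing v_write / 2) stays below the
    smallest voltage magnitude at which switching may begin."""
    return abs(v_write) / 2 < abs(v_switch_min)

# (write amplitude V_W, lower edge of the switching range), in volts,
# as quoted in the text for the 1/2 bias scheme.
schemes = {
    "Pt-RRAM SET":   (2.0, 1.5),
    "Pt-RRAM RESET": (-2.4, -1.5),
    "Gr-RRAM SET":   (-1.5, -1.0),
    "Gr-RRAM RESET": (0.5, 0.4),
}

for name, (v_w, v_min) in schemes.items():
    status = "safe" if half_bias_is_safe(v_w, v_min) else "disturb risk"
    print(f"{name}: half-selected bias {v_w / 2:+.2f} V -> {status}")
```

Under this simplified rule, all four quoted write amplitudes keep half-selected cells outside the switching range.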
From the retention test, the read noise can be retrieved before the unintended resistance shift under various thermal stresses ranging from ~145 to 200°C (shown in Fig. 3d). As expected, the read noise is more substantial at elevated temperatures. To further estimate the array performance, it is essential to build a single device model that can accurately reflect the VRRAM resistive switching behaviors. Therefore, the Verilog-A compact model was configured based on the concepts of tunneling gap evolution, accurately demonstrated in the Stanford RRAM design 38-41, and conductive filament (CF) radial evolution 42,43. One of the pillars of this work is the statistical study based on extensive experimental measurements. Consequently, the RRAM model was defined considering the intrinsic programming variations and read noise.
The modeling of the 3D VRRAM array with graphene word planes
Figure 4a illustrates a virtual 3D VRRAM array schematic with a 2D graphene film as the WP electrode. The thickness t_di of the isolation layer (SiO2 dielectric) is selected according to the type and dimensions of the plane electrode. The vertical density of the 3D array highly depends on the isolation layer and word plane thicknesses; thus, Gr-RRAM has a favorable impact. More detailed information can be found in Supplementary Note 1. Array biasing is performed using the WPs and bit-lines (BLs) connected to the pillar electrodes (y-axis), whereas specific memory cells are selected via vertical transistors 44 controlled by selector-lines (SLs) (connected in the x-axis). V_W/2 on both sides or not selected by vertical transistors. For the single device read operation, all WPs and BLs are grounded except for the selected WP, which is biased to V_R. Following such a pattern of array biasing, we believe, will avert unintentional writes and considerably alleviate the sneak current effect 27,30.
Vector-matrix multiplication (VMM) is one of the critical operations in neuromorphic computing, and a simple cross-point array is advantageous because it can perform a WS operation easily at the junctions, owing to Kirchhoff's current law. However, the stacked 3D VRRAM design, which requires only a single pivotal lithography step, is more bit-cost efficient 27. To exploit both the bit-cost efficiency and the WS operation scheme in a vertical memory structure, several scenarios to conduct the WS operation in the VRRAM array have been proposed. In one study 31, pillar electrodes were exploited as input neurons, whereas WPs were utilized as WS output neurons, allowing a 1TkR configuration (k is the number of layers). However, in this structure, the number of output neurons relies heavily on the number of stacked WPs, which in turn depends on the etching aspect ratio of the technology. Alternatively, the WPs and SLs can be combined to perform the VMM operation 23, in which the selected and unselected WPs are biased to V_R and 0 V, respectively, and selected BLs should be grounded (Fig. 4d). Thus, the output current read in each BL represents the WS of the input voltages and conductances of memory cells located along the y-axis of the planar array.
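The ideal weighted sum described above, where each BL current is the sum of input voltage times cell conductance, can be sketched as follows. This assumes NumPy and ignores wire resistance and sneak paths; the conductance values and the small 4 × 3 array are hypothetical, and only V_R = 0.1 V comes from the text.

```python
import numpy as np

def weighted_sum(v_in, g):
    """Ideal BL currents: I_j = sum_i V_i * G_ij (no parasitics)."""
    return v_in @ g

V_R = 0.1                    # read voltage from the text, in volts
G_LRS, G_HRS = 1e-5, 1e-7    # hypothetical conductances, in siemens

# Hypothetical 4 x 3 slice: rows are inputs, columns are bit-lines.
states = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0],
                   [0, 0, 1]])          # 1 = LRS, 0 = HRS
g = np.where(states == 1, G_LRS, G_HRS)

selected = np.array([1, 0, 1, 1])       # selected inputs get V_R, others 0 V
v_in = V_R * selected

i_bl = weighted_sum(v_in, g)
print(i_bl)                             # one summed current per bit-line
```

In a real array the measured currents deviate from these ideal values because of interconnect resistance and sneak currents, which is what the read inaccuracy metric quantifies.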
For a large-scale simulation of the VRRAM array in HSPICE with RRAM models, the 2 × 2 × 2-size 3D sub-circuit with one virtual node was used as a building block 30,45 , as shown in Supplementary Fig. 5. For both computationally accurate and efficient array simulation, the voltage-dependent VRRAM model was specifically designed to incorporate the intrinsic behavior of the Pt-RRAM and Gr-RRAM devices. The main simulation parameters for the VRRAM array are listed in Table 1.
It is worth noting that the electronic properties of graphene can be further enhanced by doping with nitrogen, boron, or FeCl3 33,46,47. In contrast to the pristine graphene sheet, the resistivity is expected to be reduced by the increase in carrier concentration during the doping process. Supplementary Fig. 6 summarizes recent studies in obtaining highly conductive graphene interconnects through doping with various materials. On the single memory cell level, since the change between the switching resistance states in HfOx-based VRRAM is significantly larger than the graphene's actual sheet resistance, it was presumed that doping the graphene will not have a fundamental effect on the individual device response. Nevertheless, in the large 3D array architecture, where interconnect parasitics have a major impact on signal degradation, doping the monolithic graphene word plane can be an area for continued development. Therefore, the reasonable assumption was made that VRRAM with a doped graphene plane electrode (referred to as DGr-RRAM) has similar switching characteristics to the pristine one (Gr-RRAM) but with lower sheet resistance. As doped graphene, owing to its relatively high conductivity, can support a larger planar size, its potential in the 3D vertical memory array for neuromorphic computing was also evaluated. The interconnect resistivities applied for the metal and graphene materials of the WP and pillar electrodes were estimated from the International Technology Roadmap for Semiconductors table 48 and reports 33,49.
In addition, more detailed information regarding the parameterization of WP and pillar electrodes can be found in Supplementary Note 2 and Supplementary Fig. 7. For the selector transistors with a sub-45 nm node, a Predictive Technology Model was used 50,51. It should be noted that graphene formation in practical applications may be challenging; the difficulties are mostly associated with the graphene (1) synthesis and (2) transfer processes. For synthesis, although the chemical vapor deposition method provides large-area, high-quality, uniform graphene sheet growth, there are certain thermal limitations (<400-500°C) imposed by the back-end-of-line (BEOL) process. Nevertheless, some studies have made significant advancements in graphene synthesis compatible with current and next-generation semiconductor technologies 52. Since transfer-free approaches predominantly require deposition at elevated temperatures that conflict with BEOL limitations, graphene transfer may be an inescapable stage in some synthesis techniques. Due to the quality issues of the wet transfer related to the polymer residues left on the graphene, which degrade its electronic properties, the dry transfer process offers a more promising solution. In this regard, several works have demonstrated the dry transfer potential of graphene 53,54. Such high-quality formation and process integration challenges are among the fundamental issues for all 2D materials. Nevertheless, research interest in these areas is increasing significantly, mainly because it is believed that, with enough advancement, there is a high possibility that 2D materials will be integrated into the current electronics paradigm.
The array performance in programming, read, and WS operations
Figure 5a,b illustrates the simulation results for the accessed voltage drop over the selected furthest cell for various planar array sizes during the SET/RESET processes of the worst-case scenario. The Pt-RRAM array can no longer meet the minimum access voltage requirement beyond the 128 × 128 planar array size, since the voltage falls into the probabilistic region (Fig. 2c) with no guarantee of resistive switching of the selected cell. Nevertheless, this does not apply to graphene-based VRRAM arrays, as selected Gr-RRAM or DGr-RRAM cells can be safely programmed at all considered array sizes. In this study, the 416 × 224 array size is particularly significant for further estimation of image recognition performance. All three investigated types of VRRAM can satisfy the read margin requirements for sense amplifiers to differentiate HRS and LRS states (Fig. 5c). Figure 5d,e indicates that energy consumption can be reduced by an average of ~262× for RESET (sub-pJ levels) and 8× for SET operations in Gr-RRAM arrays, compared to Pt-RRAM arrays. Notably, the energy consumption ratio in the array simulation is reduced by a factor of two, in contrast to the experimentally measured results of the single devices, which can be explained by the presence of the half-selected cells in the ½ biasing condition.
Along with programming and reading of the selected cells, the WS is a crucial operation for further implementing VRRAMs in neuromorphic computing. Similar to the 2D cross-point array 42, the VMM efficiency of specific VRRAM arrays can also be evaluated as the deviation of the WS from the expected ideal value, known as read inaccuracy (Fig. 5f). Both Gr-RRAM and DGr-RRAM arrays show superior effectiveness in WS operations, not exceeding a 10% deviation of accuracy, which does not apply to Pt-RRAM. We have noticed that a large number of parallel BL readings during the inference process causes the read inaccuracy to grow (Fig. 5g), which is related to the increase in sneak-path current. Therefore, eight BLs for parallel inference were found to be optimal. Nevertheless, the VRRAM array with a doped graphene plane electrode can support a larger number of BLs for parallel computing, owing to its low specific resistivity. Figure 5h demonstrates the Shmoo plot, which indicates the performance of a 416 × 224 × 8-size 3D array system according to various conditions of switching current (I_W) and metal WP thickness. Metal films experience a sharp increase in resistivity, becoming comparable to insulators, as the thickness goes below 5 nm 28,49. With the studied Pt-RRAM switching characteristics, the WP thickness should exceed 30 nm to succeed in all operations, including WS for in-memory computation. Fabricating at this thickness, however, may oppose the known trend of stacking more layers and obtaining a high-density memory structure. At sub-3 nm thickness, a VRRAM array with a conventional metal plane electrode is expected to fail regardless of the switching current. On the other hand, Gr-RRAM with only 0.3 nm WP thickness can pass in all necessary operations, owing to its intrinsic properties and switching characteristics. A comparison of the write voltage drop on different components as a function of WP thickness is shown in Fig. 5i.
Below 5 nm WP thickness, the voltage drop on a selected cell declines considerably due to a drastic increase in circuit parasitic resistance. Shmoo plot results suggest that the design of metal-based VRRAMs, including Pt-RRAM, needs to be further optimized by device engineering to accomplish all procedures by meeting the requirements for switching current, WP resistivity, and 3D array density.
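The read inaccuracy metric used above can be sketched under two assumptions: that it is the relative deviation of the measured WS current from its ideal value, and that the parasitics can be lumped, for illustration only, into a single series wire resistance. Both the lumped model and the numerical values are hypothetical; the array simulations in this work use a distributed HSPICE model instead.

```python
def read_inaccuracy(i_actual, i_ideal):
    """Relative deviation of the measured current from the ideal one."""
    return abs(i_actual - i_ideal) / abs(i_ideal)

def cell_current_with_wire(v_read, g_cell, r_wire):
    """Toy model: one cell in series with a lumped wire resistance."""
    return v_read / (1.0 / g_cell + r_wire)

v_read = 0.1      # V_R from the text, in volts
g_lrs = 1e-5      # hypothetical LRS conductance, in siemens
r_wire = 10e3     # hypothetical lumped interconnect resistance, in ohms

i_ideal = v_read * g_lrs
i_actual = cell_current_with_wire(v_read, g_lrs, r_wire)
print(f"read inaccuracy: {read_inaccuracy(i_actual, i_ideal):.1%}")
```

The toy model makes the trend in the Shmoo plot intuitive: as the interconnect resistance grows relative to the cell resistance, a larger share of the applied voltage is lost before reaching the cell.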
The XNOR operation-based architecture of 3D VRRAM arrays
The recognition performance of the VRRAM array can be assessed using handwritten digits from the Modified National Institute of Standards and Technology (MNIST) database 55. In this work, a 2-layer perceptron (MLP) topology, shown in Fig. 6a, with 400 input, 200 hidden, and 10 output neurons is used to estimate the performance of Pt and graphene-based 3D VRRAMs. To reduce the WP planar size and implement the ternary XNOR operation, the original image is cropped and binarized (Fig. 6b). For the training process 1, the required weight update can be determined using the gradient descent method shown in Eq. (1).
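A mini-batch gradient-descent update consistent with the symbols defined below can be sketched as follows, assuming the standard delta-rule form; the sign convention, the per-sample index b, and the index placement are assumptions rather than a verbatim statement of Eq. (1):

```latex
\Delta W^{l}_{ij} \;=\; -\,\frac{\alpha}{B} \sum_{b=1}^{B} V^{(b)}_{i}\,\delta^{(b)}_{j},
\qquad i = 1,\dots,m, \quad j = 1,\dots,n
```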
where α is the learning rate, B is the batch size within which the samples are computed for the sequential weight update, and V_i and δ_j are the specific neuron input value and output error, respectively, for the lth layer with an m × n synapse size. The detailed information about the training flow is shown in Supplementary Fig. 9. Due to the inexpensive stacking properties of the 3D VRRAM arrays, the in situ training itself can be performed in 6-bit or higher weight precisions and can be further optimized to 1-bit ternary precision by following the instructions shown in Fig. 6c and Supplementary Table 3, Note 3. For optimal online learning, it is expected to have 6-bit precision for binary RRAM devices 56 or 64 distinct conductance levels for analog ones. Alternatively, for ex situ training, VRRAM can be directly quantized to ternary levels for further image classifications. Furthermore, with the use of the XNOR operation for ternary weights (Fig. 6b), the computational and energy resources can be reduced through bit-count operations and natural weight pruning. Figure 6d shows the XNOR architecture implemented in 3D VRRAM, where the synaptic weight matrix is achieved with two vertical layers. The output current flowing in a specific BL depends primarily on the input and weight logic values. For instance, the input logic value "1" can be represented by applying positive and negative read pulses to the top and bottom layers, respectively. Thus, given that the top VRRAM is in LRS and the bottom one is in HRS, the expected current flowing in the pillar electrode is I_LRS − I_HRS, which can be represented as "1", following the XNOR logic. In addition, natural synaptic pruning can be obtained by programming both VRRAMs to HRS states, leading to an extremely small output current considered as a logic "0". Although a monolithic WP pattern limits the input vector range used for different RRAMs along the BL (Fig. 6d), the input "0" can be achieved by turning off the corresponding vertical transistor 23. However, due to the 1TkR configuration, the whole pillar will be in an idle state. This may restrain multiple-layer parallel computing for large datasets with diverse input values. Alternatively, provided with adequate compensation, the XNOR architecture can be a possible solution for layer-based partial WS of large datasets that can be integrated with the highly stackable characteristics of 2D graphene. It is worth noting that this work focuses on evaluating the impact of graphene in the XNOR operation-focused 3D VRRAM architecture for neuromorphic computing applications despite the graphene process integration challenges. Therefore, as an alternative, a holistic approach spanning the device, circuit, and architecture/algorithm levels was applied, which also included the simulation of the programming and in-memory computing potential of the large-scale array using the graphene-based RRAM model, which was verified by extensive experimental measurements (Supplementary Note 4).
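The two-layer XNOR cell described above can be sketched for a single pillar. The mappings for input "1", weight "+1", and the pruned weight "0" follow the text; the mappings for input "−1" (reversed pulse polarity) and weight "−1" (top HRS, bottom LRS) are assumed by symmetry, and the conductance values and function names are hypothetical.

```python
V_R = 0.1                    # read voltage from the text, in volts
G_LRS, G_HRS = 1e-5, 1e-7    # hypothetical conductances, in siemens

def weight_to_pair(w):
    """Map a ternary weight to (top, bottom) cell conductances."""
    return {+1: (G_LRS, G_HRS),     # from the text: top LRS, bottom HRS
            -1: (G_HRS, G_LRS),     # assumed by symmetry
             0: (G_HRS, G_HRS)}[w]  # from the text: both HRS (pruned)

def pillar_current(x, w):
    """x = +1 applies (+V_R, -V_R) to (top, bottom); x = -1 is assumed to
    reverse the polarity; x = 0 models a switched-off vertical transistor."""
    g_top, g_bot = weight_to_pair(w)
    return x * V_R * (g_top - g_bot)

def decode(i_pillar):
    """Normalize by the unit current I_LRS - I_HRS and round to {-1, 0, +1}."""
    unit = V_R * (G_LRS - G_HRS)
    return round(i_pillar / unit)

for x in (-1, 0, 1):
    for w in (-1, 0, 1):
        print(f"x={x:+d}, w={w:+d} -> logic {decode(pillar_current(x, w)):+d}")
```

With these assumptions the decoded output equals the product x·w, which is the signed form of the XNOR truth table extended with a pruned state.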
The learning performance of 1-bit ternary VRRAM arrays
Figure 7a demonstrates the evolution of neural network training accuracy based on ideal neuromorphic devices with floating, 6-bit, and 1-bit ternary synaptic precisions. The MLP accuracy with floating weights is ~98%, reaching the baseline software benchmark. Given that the error-free VRRAM array with seven stacked layers acts as an artificial synapse network, a 2% accuracy degradation is expected. Furthermore, in comparison with floating-weight precision, there is only a 3% decline for the 1-bit ternary neural network. Such a network requires only two stacked layers in the VRRAM array, as shown in Fig. 6d. Nevertheless, fluctuations in the learning evolution increase considerably as the weight precision is compressed. It is worth noting that the learning outcomes are highly likely to be degraded in a real VRRAM array, depending on the device properties, circuit parasitics, array dimensions, etc. Particularly, due to the deviations (read inaccuracy) in the WS, which is a crucial operation in neuro-inspired computing, the recognition performance may deteriorate noticeably. Figure 7b presents the learning outcomes, considering the worst-case scenario of the read inaccuracy values corresponding to the Pt-, Gr-, and DGr-RRAM 3D arrays in the inference process. The learning accuracy for Pt-RRAM is significantly decreased and shows substantial stochasticity; this outcome can be explained by the high read inaccuracy values shown in Fig. 5f. However, the accuracy of graphene-based VRRAMs is comparable to that of an artificial neural network based on an error-free VRRAM array, owing to the intrinsic properties of graphene, its interface with the active memory layer, and the 1-bit ternary synaptic architecture with XNOR operation. Such an architecture shows little susceptibility to minor deviations in the read accuracy. In addition, for precise analysis of the WS effect on classification accuracy, a Monte Carlo simulation was conducted, as shown in Fig. 7c. Read inaccuracy values were selected in a uniformly random manner, ranging from minimum to maximum values according to the VRRAM type. Owing to the strong performance of graphene-based VRRAMs in the WS operation, their accuracy range is higher and narrower than that of the Pt-RRAM array, approximately following a Gaussian distribution. Figure 7d-f shows the MLP simulation results projected on Pt and graphene-based 3D VRRAM arrays, considering the intrinsic read and write noises. As a result of the intrinsic properties of the device and its characteristics on the circuit level described previously, the accuracy degradation rate of the Pt-RRAM is relatively higher than that of graphene-based RRAM devices. Consequently, integrating graphene not only affects the interconnect characteristics and dimensions of the 3D VRRAM array; the unconventional switching mechanisms also have a favorable impact on both the circuit and architecture levels.
In addition, by quantizing the neural network to 1-bit ternary precision and implementing the XNOR operation in the 3D VRRAM array for inference computation, the effect of read and write noises is less pronounced than it is in analog synapses with floating precision (Supplementary Fig. 10). Here, the write noise combines both cycle-to-cycle and device-to-device variations. Consequently, based on the experimentally obtained read and write noises of the device (Fig. 3d and Supplementary Fig. 2), one can expect a recognition accuracy of ~83.5% in the Pt-RRAM and ~94.1% in the Gr-RRAM arrays. Since the holistic approach was applied in this study to evaluate the graphene impact in the 3D array architecture for neuromorphic computing applications, it is important to compare the results with other studies. Therefore, a benchmark comparison with related studies in the fields of 2D materials integration into memory technology, circuit analysis of large-scale memristor arrays, and neuromorphic computing using resistive switching devices is provided in Supplementary Note 4, Figure 11, and Table 4 in the Supplementary Information.
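The Monte Carlo treatment of read inaccuracy described above can be illustrated with a toy sketch. It is not the MNIST simulation of this work: it assumes NumPy, uses random ternary weights and random ±1 inputs for a single 200 × 10 layer, scales each BL output by an independent inaccuracy drawn uniformly from a given range, and reports how often the winning output neuron is unchanged. The inaccuracy ranges in the example are hypothetical, apart from the 10% bound quoted for graphene-based arrays.

```python
import numpy as np

def argmax_agreement(inacc_min, inacc_max, trials=2000,
                     n_in=200, n_out=10, seed=0):
    """Fraction of trials in which the winning output neuron is unchanged
    when each weighted sum is reduced by a uniformly drawn inaccuracy."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        w = rng.choice([-1, 0, 1], size=(n_in, n_out))  # ternary weights
        x = rng.choice([-1, 1], size=n_in)              # binarized inputs
        ideal = x @ w                                   # ideal weighted sums
        eps = rng.uniform(inacc_min, inacc_max, size=n_out)
        noisy = ideal * (1.0 - eps)                     # per-BL deviation
        hits += int(np.argmax(noisy) == np.argmax(ideal))
    return hits / trials

# Hypothetical ranges: up to 10% (bound quoted for graphene-based arrays)
# versus a wider, purely illustrative range.
print("narrow range:", argmax_agreement(0.0, 0.10))
print("wide range:  ", argmax_agreement(0.0, 0.40))
```

Replacing the random layer with a trained network and the agreement metric with test accuracy would bring the sketch closer to the procedure described in the text.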

DISCUSSION
In summary, this study investigated the potential performance of the Pt and graphene-based 3D VRRAM arrays as on-chip computing platforms. Replacing the conventional metal word-plane with sub-nm thin graphene increases the possible number of vertical stacks and reduces the effective parasitic resistance, allowing safe read and write procedures in larger planar array sizes. Due to the low switching currents and voltages of individual devices, programming of the furthest cell consumed much lower energy on the circuit level than in a conventional system. The Gr-RRAM array can successfully conduct a VMM operation, resulting in a tolerable read accuracy deviation at >90k planar array size. Furthermore, the design of the XNOR algorithm-inspired architecture for the 3D VRRAM array allows the implementation of 1-bit ternary synaptic weights for image recognition tasks. In particular, the XNOR architecture has the potential to supplement the highly stackable nature of the graphene-based VRRAM arrays for parallel processing of multiple layers. This study highlights the importance of a holistic approach that correlates material and device engineering, circuit structuring, and algorithm building to design a memory-centric, next-generation computing system.

Fabrication summary
Two-layer VRRAM devices with graphene WPs were prepared sequentially following the highlighted stages, including graphene transfer, trench forming, and deposition of metal oxide and pillar electrodes. Initially, a 5 nm thin Al2O3 dielectric was deposited by atomic layer deposition (ALD) on a SiO2 (100 nm)/Si substrate for adhesion promotion, followed by graphene sheet transfer (monolayer sandwiched by copper foil, Graphene Supermarket). Ti (3 nm)/Pt (30 nm) metal pads for probing were deposited by evaporation. A SiO2 (60 nm) passivation layer was deposited using LPCVD.
To form the second and higher layers of the WP, the described process must be repeated 34,37. Next, one pivotal lithography process was conducted to form the trench by dry etching; subsequently, HfO2 (5 nm) and TiN were deposited by ALD and sputtering, respectively.

Device characterization
High-resolution TEM images were obtained using a Tecnai TF-20 Field Emission Gun/TEM at 200 kV (FEI company, UK). Electrical characterization was performed using an Agilent Parameter Analyzer 4155C (Agilent, CA, USA) with an 81150A arbitrary signal function generator (Keysight, CA, USA) and Switch Matrix 707B (Keithley, OH, USA) for pulse measurements (retention, endurance tests).

Circuit analysis
3D VRRAM arrays with conventional Pt and graphene WPs were modeled as a matrix of 2 × 2 × 2-size subcircuits with one virtual node. Gr-RRAM and Pt-RRAM were designed as voltage-dependent models based on the experimentally verified individual device response. As a result, XNOR operation-inspired VRRAM arrays with 8 vertical layers and various planar sizes for individual cell programming, read, and network WS procedures for the worst-case scenario were simulated in HSPICE software (Synopsys, CA, USA). The detailed information regarding the characteristics of WP and pillar electrodes is shown in Supplementary Figs

Neural Network simulation
The ANN, with 400 input, 200 hidden, and 10 output neurons, was simulated in MATLAB software (MathWorks, MA, USA); 60,000 and 10,000 cropped and binarized MNIST images were used for training and testing, respectively. The considered cycle-to-cycle and device-to-device variations, as well as the read noise at elevated environment temperatures, were derived from experimental results (Supplementary Fig. 9).

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.