Redox-based resistive switching random access memory (ReRAM) offers excellent properties to implement future non-volatile memory arrays. Recently, the capability of two-state ReRAMs to implement Boolean logic functionality has gained wide interest. Here, we report on seven-state tantalum oxide devices, which enable the realization of intrinsic modular arithmetic using a ternary number system. Modular arithmetic, a fundamental system for operating on numbers within the limit of a modulus, has been known to mathematicians since the days of Euclid and finds applications in diverse areas ranging from e-commerce to musical notation. We demonstrate that multistate devices not only reduce the storage area consumption drastically, but also enable novel in-memory operations, such as computing using high-radix number systems, which could not be implemented using two-state devices. The use of a high-radix number system reduces the computational complexity by reducing the number of digits needed. Thus, the number of calculation operations in an addition and the number of logic devices can be reduced.
Redox-based resistive switching random access memories (ReRAMs) are considered one of the most promising emerging non-volatile memory technologies1,2,3. The devices can be scaled down to 5 nm4,5, offer endurance up to 10^12 cycles6, 10 years of retention7 and fast read/write speeds below 200 ps8. The devices are switched to a low resistive state (LRS) by a positive SET voltage and to the high resistive state (HRS) by a negative RESET voltage. Up to 8 multi-level states have been shown9, allowing the storage of up to three binary digits in a single cell. Additionally, ReRAM devices offer highly non-linear switching kinetics, i.e. the SET time depends exponentially on the pulse amplitude10. Because the switching events are abrupt, the common approach is to apply an external current compliance (CC) to enable multi-level resistance states11. The drawback of this approach is that the final resistance is defined by the CC rather than by the applied pulse amplitude. However, a direct correlation between pulse height and final resistive state is feasible for a gradual RESET process, where a stop voltage Vstop defines the resistive state in advanced valence change mechanism (VCM) devices12,13. In this work, we use optimized Pt/W/TaOx/Pt ReRAM devices, which offer highly reliable stop-voltage behavior, and use the corresponding multi-level properties to implement modular arithmetic operations, as discussed in the results section.
Modular arithmetic finds usage in everyday applications, e.g., quantifying a specific clock-time, which wraps around after a fixed value is reached. A rigorous mathematical framework of modular arithmetic was developed by Carl Friedrich Gauss14 by defining a congruence relation between integers. Two integers, a and b are said to be congruent modulo n, when their difference (a-b) is divisible by n. In this case, n is known to be the modulus of this relation.
The properties of integers for a specific modulus, spanning addition, subtraction and multiplication, are as follows.
Given a1 ≡ b1 (mod n) and a2 ≡ b2 (mod n), we have
a1 + a2 ≡ b1 + b2 (mod n),
a1 − a2 ≡ b1 − b2 (mod n),
a1 a2 ≡ b1 b2 (mod n).
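The congruence relation and its compatibility with addition, subtraction and multiplication can be checked numerically. The following is a minimal sketch for illustration only; the function names `congruent` and `check_properties` are hypothetical and do not come from this work.

```python
def congruent(a: int, b: int, n: int) -> bool:
    """Return True when a and b are congruent modulo n, i.e. n divides (a - b)."""
    return (a - b) % n == 0


def check_properties(a1: int, b1: int, a2: int, b2: int, n: int) -> bool:
    """Check that congruence is preserved under +, - and * for the given integers."""
    if not (congruent(a1, b1, n) and congruent(a2, b2, n)):
        raise ValueError("premises a1 ≡ b1 and a2 ≡ b2 (mod n) do not hold")
    return (
        congruent(a1 + a2, b1 + b2, n)
        and congruent(a1 - a2, b1 - b2, n)
        and congruent(a1 * a2, b1 * b2, n)
    )


# Clock-style example with modulus 12: 14 ≡ 2 and 25 ≡ 1 (mod 12).
print(check_properties(14, 2, 25, 1, 12))
```

The modulus 12 is chosen here only to mirror the clock-time example; any positive integer modulus works in the same way.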
Apart from applications in mathematics, modular arithmetic plays a fundamental role in modern computer arithmetic. Here, the ring of integers modulo 2 is a Boolean ring, and every Boolean ring gives rise to a Boolean algebra, where the ring multiplication is the conjunction operator (∧) and the ring addition is the exclusive disjunction operator (⊕). Furthermore, secure and fault-tolerant data communication relies on the principles of public-key cryptography and error-correcting codes, respectively. Both of these fields require efficient implementations of modular arithmetic.
Modular arithmetic is also useful for reducing the complexity of standard arithmetic circuits15,16 and is essential for building residue numeral systems (RNS). The RNS representation allows overflow-free addition, subtraction and multiplication, thereby enabling a high degree of parallelism.
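To illustrate why an RNS enables parallelism, the sketch below represents integers by their residues with respect to a set of pairwise coprime moduli and performs addition and multiplication independently in each residue channel. The moduli (3, 5, 7) and all function names are assumptions chosen for illustration; results are only unique within the dynamic range given by the product of the moduli (here 105). The conversion back uses the Chinese remainder theorem.

```python
from math import prod

MODULI = (3, 5, 7)  # assumed pairwise coprime moduli; dynamic range 3 * 5 * 7 = 105


def to_rns(x: int) -> tuple:
    """Represent x by its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a: tuple, b: tuple) -> tuple:
    """Add channel by channel; no carry passes between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a: tuple, b: tuple) -> tuple:
    """Multiply channel by channel; no carry passes between channels."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues: tuple) -> int:
    """Reconstruct the integer in [0, prod(MODULI)) via the Chinese remainder theorem."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % big_m


print(from_rns(rns_add(to_rns(17), to_rns(23))))  # 17 + 23
print(from_rns(rns_mul(to_rns(8), to_rns(12))))   # 8 * 12
```

Each channel only ever handles numbers smaller than its modulus, which is what allows the channels to be evaluated in parallel.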
State-of-the-art modular arithmetic circuits in CMOS technology are implemented using two-state Boolean arithmetic operations, which follow directly from the two-level switching algebra introduced by Shannon17. Memristive devices have been suggested to replace register files in conventional signed-digit adders18 or to be used in conjunction with complex quantization circuits19. The current paper reports the first implementation of modular arithmetic using multi-state ReRAM devices, which is fully crossbar array compatible in conjunction with a selector device. Most previous memristive circuit studies are based on oversimplified memristor models20; here, we use real memristive devices fabricated in word structures to verify the proposed functionality. It should also be noted that we perceive no theoretical limit in scaling the number of states for memristive devices, which opens a new research direction on multi-state storage and computing devices.
In this work, 5 μm × 5 μm Pt/W/TaOx/Pt cross-point devices arranged in word structures (Fig. 1a) are used to realize the three Trit (trinary digit) modular addition. The device stack 25 nm Pt/13 nm W/7 nm TaOx/30 nm Pt is depicted in Fig. 1b. The typical I-V characteristic of this device is shown in Fig. 1c. In supplementary S1, the I-V characteristic of an 80 nm × 80 nm cross-point device is also shown, highlighting the scaling potential of these devices. During the RESET process, the maximum applied voltage |Vstop| defines the final resistive state (1.8 V in Fig. 1c). This feature is also present in pulse mode and can thus be used in memory and logic operations for controlling the multi-level states.
Figure 2a shows the cumulative probability of the low resistance state (LRS) and six multi-level resistive states, which are obtained for 200 ns pulses in the range of Vstop = −1.50 V to −2.25 V. Each state is based on 5 devices with 10 cycles. The inset gives the statistical information of the distributions. The tight distributions highlight the excellent switching properties of this device. In Fig. 2b, the mean value for each of the resistive states R0 to R5 is given.
For the proposed arithmetic operation, the input operands are applied to the top (TE) and bottom (BE) electrode, respectively. To enable an equidistant voltage stepping, we use a predefined OFFSET voltage (VOFFSET) for each pulse. The operand voltages are Vop = 0.00 V to 0.75 V with an increment of 0.15 V. The actual pulse applied at the top electrode is therefore VTE = −(VOFFSET + Vop1) and the pulse applied at the bottom electrode is VBE = VOFFSET + Vop2. Thus, the overall potential difference is Vstop = VTE − VBE = −(2 VOFFSET + Vop1 + Vop2). Since the overall device voltage is always negative, a logic operation corresponds to a RESET pulse whose amplitude depends on the actual operands. To show the multi-level pulse operation mode, we set VBE = 0.75 V and vary VTE from −0.75 V to −1.50 V (Fig. 3a). Depending on the overall device voltage (Vstop = −1.50 V to −2.25 V), six different resistance states (R0, R1, R2, R3, R4 and R5) are easily accessible (Fig. 3b). Note that three resistive states would be sufficient to represent a ternary numeral system (Trit). This multi-level device property is used for the modular arithmetic operation. To enable a highly reproducible RESET operation, we always apply a DC SET operation before each pulsed RESET operation. Note that nanosecond pulsed SET operations are also feasible, but are not applied in this work. Details on the pulsed SET operation can be found in Supplementary S2.
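The relation between electrode pulses, stop voltage and resistive state can be sketched as follows. This is an idealized model, not a device simulation: it assumes that a stop voltage of −1.50 V yields R0 and that every additional 0.15 V of magnitude moves the device exactly one state further, up to R5 at −2.25 V. Only the case VOFFSET = 0.75 V is modelled, voltages are handled in millivolts to keep the arithmetic exact, and all names are hypothetical.

```python
STEP_MV = 150          # operand increment of 0.15 V, in millivolts
V_OFFSET_MV = 750      # VOFFSET = 0.75 V
V_STOP_R0_MV = 1500    # |Vstop| assumed to give state R0
NUM_STATES = 6         # R0 ... R5


def electrode_pulses_mv(op_te: int, op_be: int) -> tuple[int, int]:
    """Return (VTE, VBE) in mV for operand levels 0..5 on each electrode."""
    for level in (op_te, op_be):
        if not 0 <= level <= 5:
            raise ValueError("operand level must be in 0..5")
    v_te = -(V_OFFSET_MV + op_te * STEP_MV)
    v_be = V_OFFSET_MV + op_be * STEP_MV
    return v_te, v_be


def state_index(v_te_mv: int, v_be_mv: int) -> int:
    """Idealized map from Vstop = VTE - VBE to the index k of state Rk."""
    v_stop = v_te_mv - v_be_mv
    k, remainder = divmod(-v_stop - V_STOP_R0_MV, STEP_MV)
    if remainder != 0 or not 0 <= k < NUM_STATES:
        raise ValueError(f"Vstop = {v_stop} mV is outside the modelled range")
    return k


# Sweep of Fig. 3a: VBE fixed at 0.75 V, VTE varied from -0.75 V to -1.50 V.
states = [state_index(*electrode_pulses_mv(op_te, 0)) for op_te in range(6)]
print(states)
```

In this model the state index equals the sum of the two operand levels, which is the property the arithmetic operation relies on. Real devices show a distribution around each level (Fig. 2a), which the sketch ignores.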
Developed Modular Arithmetic Working Principle
The newly developed algorithm calculates the carries and sums directly in the ReRAM devices, which store the results until they are read out. First, all devices in a wordline are initialized, i.e. written to the LRS. Starting from this state, the sum Trit of significance 0 (s0) can be calculated directly in the device of significance 0, while the other devices calculate the first output carry c1. For each further significance, the devices that calculate the sum and the carry are shifted one device to the left.
In general, for the carry algorithm (Fig. 4a), the state of the current device is read first to check whether the input carry cin is 0 or 1. In case of 1, VOFFSET is set to 0.875 V, whereas in case of 0, the OFFSET remains VOFFSET = 0.75 V. Next, the logic operation is conducted after a SET operation using the evaluated OFFSET. We apply VTE = −(VOFFSET + Vop1) to the top electrode and VBE = VOFFSET + Vop2 to the bottom electrode. Finally, the resistive state of the device is read and evaluated. To enable a proper modulus operation, the ReRAM device has to provide 2n states for an n-ary number system. The reason is that in an n-ary number system the operands at each significance are in the range 0…n−1, i.e., the sum of two operands is at most 2n−2. Since an input carry of 1 may also occur, the largest possible result is 2n−1, so the total number of states required per device is 2n. Thus, for a ternary number system six states (R0…R5) are required. If the state is R ≤ R2, the output carry cout is 0 and R0 is written back. For R > R2, the device is written to R1, i.e. cout = 1. Note that prior to the write-back operation, the SET operation is conducted to enable a highly controlled R1 state. In supplementary S4, the required peripheral circuitry is depicted. Note that a certain minimum crossbar array size is required to justify the peripheral circuitry overhead. In this respect, a suitable selector device is a key component enabling large-scale ultra-dense multi-level ReRAM arrays.
Based on the input carry, the final sum can be calculated (Fig. 4b). As for the carry calculation, the required level of the VOFFSET (either 0.75 V or 0.875 V) is first evaluated. Next, the operand voltages VTE = −(VOFFSET + Vop1) and VBE = VOFFSET + Vop2 are applied after the SET operation. Based on a final readout, the mapping R3 → R0, R4 → R1 and R5 → R2 has to be conducted to complete the modulo sum operation. The corresponding write back operation is done subsequently after the SET operation.
Since the signals applied to the TE are the same in both algorithms, these can be conducted in parallel on devices of different significance. Thus, the cycle count can be kept low.
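At the level of state indices, the carry and sum algorithms for a single device can be sketched as below. This is an abstraction under the assumption that the RESET pulse drives the device to state Rk with k = op1 + op2 + cin; the electrical details (SET, OFFSET selection, read-out) are not modelled, and the function names are hypothetical. The sketch also makes the requirement of 2n states for an n-ary system explicit.

```python
def raw_state(op1: int, op2: int, carry_in: int, n: int = 3) -> int:
    """Index k of the state Rk reached by the logic pulse (assumed ideal)."""
    if not (0 <= op1 < n and 0 <= op2 < n and carry_in in (0, 1)):
        raise ValueError("operands must be in 0..n-1 and carry_in in {0, 1}")
    return op1 + op2 + carry_in


def carry_step(op1: int, op2: int, carry_in: int, n: int = 3) -> int:
    """Carry algorithm: states below Rn are written back as R0, all others as R1."""
    return 0 if raw_state(op1, op2, carry_in, n) < n else 1


def sum_step(op1: int, op2: int, carry_in: int, n: int = 3) -> int:
    """Sum algorithm: states Rn..R(2n-1) are mapped back to R0..R(n-1)."""
    k = raw_state(op1, op2, carry_in, n)
    return k - n if k >= n else k


# Largest raw state over all ternary inputs: 2 + 2 + 1 = 5, so R0..R5 are needed.
largest = max(raw_state(a, b, c) for a in range(3) for b in range(3) for c in (0, 1))
print(largest + 1)  # number of states required per device
```

For n = 3 the thresholds reduce to the rules stated above: R ≤ R2 gives cout = 0, R > R2 gives cout = 1, and R3, R4, R5 are mapped to R0, R1, R2 for the sum.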
For the proof-of-concept measurement, a two Trit modular addition is selected, adding the ternary numbers p = p1p0 and q = q1q0. Since the sum output z = z2z1z0 needs three Trit digits, three ReRAM devices are required for this operation; they are first initialized to the LRS. The addition is performed in a word-line structure (cf. Fig. 1a). For the exemplary addition, operand 1 is p = 21 (=7) and operand 2 is q = 22 (=8). Note that input 0 corresponds to Vop = 0.00 V, input 1 corresponds to Vop = 0.15 V and input 2 corresponds to Vop = 0.30 V, using the incremental stepping of 0.15 V described earlier. In Fig. 5a–c, the sequentially obtained resistive states are shown. The arrows mark the order of steps without showing the intermediate SET steps.
The algorithm described in Fig. 4 realizes the following mathematical modulo sum operation:
In device z0, the sum operation is conducted directly:
z0 = (1 + 2) rem 3 = 0 (s0).
Note that the function ‘rem’ returns the remainder. Starting from LRS, the device is reset (p0 = 1 ⇒ VTE = −0.9 V and q0 = 2 ⇒ VBE = 1.05 V, i.e. Vstop = −1.95 V). According to Fig. 2b, this voltage leads to state R3, as can also be seen directly in Fig. 5c. According to the sum algorithm (Fig. 4b), R3 is finally mapped to R0, see Fig. 5c.
In device z1, first the carry operation is conducted:
z1 = (1 + 2) div 3 = 1 (c1 = 1).
The function ‘div’ returns the floor quotient. Starting from LRS, the device is toggled to R3 state by applying p0 and q0 to calculate the carry c1. According to the carry algorithm (Fig. 4a), R3 is then mapped to R1, see Fig. 5b.
Next, the second cell sum Trit is obtained by the following operation:
z1 = (2 + 2 + c1) rem 3 = 2 (s1).
For cell z2 (Fig. 5a), again the carry operation is conducted first:
z2 = (1+2) div 3 = 1 (c1 = 1).
Starting from LRS, R1 state is accessed via R3.
Since we consider a two Trit addition, the final sum Trit equals the carry c2:
z2 = (2 + 2 + c1) div 3 = 1 (c2 = s2 = 1).
The final sum is stored directly in memory:
Sum z = z2 z1 z0 = 120 (=15).
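The worked example can be replayed at the digit level with a short script. This is only an arithmetic cross-check of the steps above using the ‘rem’ and ‘div’ operations; it does not model the devices, and the helper names are hypothetical. Trits are stored least significant first.

```python
def to_trits(value: int, width: int) -> list:
    """Ternary digits of a non-negative integer, least significant first."""
    trits = []
    for _ in range(width):
        trits.append(value % 3)
        value //= 3
    if value:
        raise ValueError("value does not fit in the requested width")
    return trits


def from_trits(trits: list) -> int:
    """Integer value of ternary digits given least significant first."""
    return sum(t * 3**i for i, t in enumerate(trits))


def add_ternary(p: list, q: list) -> list:
    """Ripple addition: each position yields a sum Trit (rem) and a carry (div)."""
    carry = 0
    result = []
    for a, b in zip(p, q):
        total = a + b + carry      # raw result in the range 0..5
        result.append(total % 3)   # sum Trit, 'rem'
        carry = total // 3         # output carry, 'div'
    result.append(carry)           # the last carry is the most significant Trit
    return result


p = to_trits(7, 2)   # p = 21 in ternary, stored as [1, 2]
q = to_trits(8, 2)   # q = 22 in ternary, stored as [2, 2]
z = add_ternary(p, q)
print("".join(str(t) for t in reversed(z)), from_trits(z))
```

The printed digits are z2 z1 z0 followed by the decimal value of the sum.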
In Fig. 6 the schematics of the applied operation voltages and corresponding states are depicted. The first line shows the voltages at the common bottom electrode (BE) acting as a wordline (WL). The second, fourth and sixth lines show the voltages applied to the three separate top electrodes (VTE2, VTE1, VTE0) acting as bitlines (BL), while the third, fifth and seventh lines represent the resistance states (RTE2, RTE1, RTE0) at each BL. Three background colors are used: gray shows the LRS after SET, yellow depicts the logic implementations and blue shows the corresponding states after the logic implementation. Overall, twelve steps are presented. Steps 1–2 show the initialized LRS and the logic implementation. Steps 3–4 depict the corresponding resistance states and the LRS after SET. Steps 5–6 show the logic implementations and the corresponding states. The state is set to LRS in Step 7. The logic is implemented with adjusted OFFSET in Step 8 and the corresponding states are shown in Step 9. The LRS after SET is shown in Step 10. Steps 11–12 show the logic implementations and the corresponding resistance states. The states (RTE2, RTE1, RTE0) shown in Step 12 and Step 6 depict the final sum stored in memory (Sum z = z2 z1 z0 = 120). More details of the overall steps are given in Supplementary S4-S6. A truth table for the overall state definition (R0 – R5) is shown in Fig. 7. Each combination of p (TE) and q (BE) sets the corresponding state with and without the adjustment of OFFSET.
We have demonstrated a ternary number system implementation using multistate tantalum oxide devices in word structures. Depending on the available number of resistive states, higher-order number systems can also be implemented in the same way. For n-ary systems, 2n resistive states are needed; hence, further progress in ReRAM memory technology will directly enable arithmetic operations using higher-radix number systems.
On the other hand, the choice of radix for a number representation can be motivated from the perspective of the underlying implementation as well as by the analysis of radix economy. A quantifiable measure of radix economy proposed in ref. 21 is as follows:
E(b, N) = b ⌊log_b(N) + 1⌋
where b is the radix and N is the number to be represented. This metric yields e (Euler's number) as the most economical real-valued radix. It also turns out that the radix value of 3 (ternary) is more economical than binary. We argue that the above measure does not take the growth of the implementation media into account. For several device technologies, the area requirement grows linearly with the radix size, which makes high-radix implementations very difficult. However, this is not true for multistate memristive devices such as the Pt/W/TaOx/Pt ReRAM, since the implementable radix size depends on the number of resistance states. Considering that a k-state device can be realized at the same cost as a two-state device, a more appropriate metric would be
Compared to binary arithmetic, an n-ary number representation reduces the space complexity in a logarithmic ratio. Given comparable performance for the base devices, the gain in arithmetic circuits, such as integer addition, is also expected to be on a logarithmic scale. However, the actual gain will be somewhat smaller due to the need for better sense amplifiers and more control circuitry.
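The radix comparison can be made concrete with a small calculation. The sketch below assumes the commonly used definition of radix economy, the radix multiplied by the number of digits needed to write N, and additionally reports the plain digit count, which is the relevant quantity if one multistate device stores one digit. The function names are hypothetical, and the digit count is shown only as an illustration of the logarithmic reduction, not as the metric proposed in this work.

```python
def digit_count(number: int, radix: int) -> int:
    """Number of digits needed to write a positive integer in the given radix."""
    if number < 1 or radix < 2:
        raise ValueError("number must be >= 1 and radix >= 2")
    count = 0
    while number:
        number //= radix   # exact integer arithmetic, no floating-point logarithm
        count += 1
    return count


def radix_economy(number: int, radix: int) -> int:
    """Assumed standard measure: radix times the number of digits."""
    return radix * digit_count(number, radix)


N = 10**6
for b in (2, 3, 10):
    print(b, digit_count(N, b), radix_economy(N, b))
```

For N = 10^6, binary needs 20 digits and ternary 13, so a ternary word uses fewer devices whenever a three-state device costs about the same as a two-state one.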
The presented approach is not limited to a specific multistate ReRAM device, but would work for any memristive device offering multiple resistance levels induced by different stop voltages Vstop. The proposed algorithm could be simplified further by avoiding the intermediate SET operations; however, this requires ultra-low variance ReRAM devices. For the considered ReRAM device only RESET pulses were allowed as logic inputs. Appropriate SET pulses enabling a step-by-step decrease of the resistance could also be used to implement subtraction within the same device, similar to the addition operation shown here.
The presented approach is compatible with the passive crossbar array configuration when a selector device is integrated with each TaOx junction. The implementation of the arithmetic functionality within the resistive memory device using the available multi-resistance levels is a highly attractive option for future functionality-enhanced hybrid CMOS/ReRAM chips. This approach enables a reduction of cycle count compared to Boolean logic based ReRAM approaches22,23,24,25. For example, a recently proposed cipher application could be decisively improved using multi-level ReRAMs26, enabling efficient in-hardware encryption and decryption for future smart devices. The energy per operation depends on the device properties, namely switching voltage, multi-level resistive states and inherent switching speed, and on control circuit properties such as the applied pulse width. Since the pulse width (t) used in real applications will be much shorter than the 200 ns used in this study, the energy consumption will be reduced further. In summary, low-variance multi-level ReRAM could play a key role in the implementation of public-key cryptography and error-correcting codes in smart devices.
Pt/W/TaOx/Pt devices enable highly reliable multi-level states, which can be accessed reproducibly by pulses of specific height, starting from a defined LRS. By using word and bit lines as inputs for pulses, the resistive multi-levels can be used to calculate in-memory logic operations and store their results. To avoid an overflow in individual devices, a modulus arithmetic is implemented, ensuring that the device is always in a valid 0, 1, 2 (Trit) state. By using a ternary number system, the number of devices and cycles can be reduced significantly. In contrast to two-state devices, multistate devices provide better radix economy with the option for further scaling. Therefore, establishing multi-state ReRAM for non-volatile memory opens the door to novel storage and in-memory computer arithmetic options.
Devices are fabricated as a 1 × 3 array with a crossbar structure based on a 5 × 5 μm² single cell size. Three top electrodes share a single bottom electrode. For the bottom electrode (BE), 5 nm titanium (Ti) and 30 nm platinum (Pt) layers are deposited by sputtering on top of a thermally grown SiO2 layer (430 nm). Photolithography and dry etch processes are applied to pattern the bottom electrode. Then, 7.0 nm-thick TaOx is deposited by reactive sputtering under a process gas mixture of Ar (23%) and oxygen (7%) with an RF power of 116 W at a chamber pressure of 2.3 × 10^−2 mbar. Without breaking the vacuum, a 13 nm tungsten (W) ohmic electrode and 25 nm platinum (Pt) are deposited consecutively. The top electrode (TE) is then patterned with photolithography and reactive ion etching. A scanning electron microscopy image of the 1 × 3 crossbar array with 5 × 5 μm² cells is shown in Fig. 1a and its corresponding cross-sectional structure, imaged with transmission electron microscopy, is shown in Fig. 1b.
Initially, the resistive switching devices are in the high resistance state (HRS), and an irreversible forming process must be applied to activate the devices before repetitive switching cycles are possible. Detailed information is given in the supplementary section. During the forming process, the resistance state of the device changes from HRS to LRS. To change the resistance state from LRS to HRS, a ‘RESET’ process is required. Conversely, a ‘SET’ process converts the HRS to LRS. The DC operation of a single device, also known as a current-voltage (I-V) sweep, is shown in Fig. 1. The voltage sweep is applied only to the top electrode (TE) while the bottom electrode is grounded. To achieve better control over the resistance state of the device with single pulse operation, the ‘SET’ process is based on DC operation. The other operations, ‘RESET’ and ‘read’, use AC pulse operation. For the RESET operation, pulses with a width of 200 ns (full width at half maximum, FWHM) and 40 ns rise/fall times are applied. The read operation at 0.1 V uses a 120 μs long pulse in order to determine each resistance value more accurately, especially the HRS values. By applying a 0.15 V stepping, seven resistive levels can be distinguished in the considered devices. Depending on the actual devices, even more multi-level states are feasible in ReRAM devices. However, due to variability, e.g., induced by random telegraph noise27, the maximum number of properly accessible multi-levels is limited.
How to cite this article: Kim, W. et al. Multistate Memristive Tantalum Oxide Devices for Ternary Arithmetic. Sci. Rep. 6, 36652; doi: 10.1038/srep36652 (2016).
This work was supported by the German Research Foundation (DFG) within the framework of SFB 917, Nanoswitches. The authors thank Dr. Daesung Park and Mr. Sebastian Zischke from the Central Facility for Electron Microscopy (GFE), RWTH Aachen, for the cross-sectional TEM images.