An error-resilient non-volatile magneto-elastic universal logic gate with ultralow energy-delay product

A long-standing goal of computer technology is to process and store digital information with the same device in order to implement new architectures. One way to accomplish this is to use nanomagnetic logic gates that can perform Boolean operations and then store the output data in the magnetization states of nanomagnets, thereby doubling as both logic and memory. Unfortunately, many of these nanomagnetic devices do not possess the seven essential characteristics of a Boolean logic gate: concatenability, non-linearity, isolation between input and output, gain, universal logic implementation, scalability and error resilience. More importantly, their energy-delay products and error rates tend to vastly exceed those of conventional transistor-based logic gates, which is unacceptable. Here, we propose a non-volatile voltage-controlled nanomagnetic logic gate that possesses all the necessary characteristics of a logic gate and whose energy-delay product is two orders of magnitude less than that of other nanomagnetic (non-volatile) logic gates. The error rate is also superior.

There is significant interest in 'non-volatile logic' because the ability to store and process information with the same device affords immense flexibility in designing computing architectures. Non-volatile logic based architectures can reduce overall energy dissipation by eliminating refresh clock cycles, improve system reliability and produce 'instant-on' computers with virtually no boot delay. A number of non-volatile universal logic gates have been proposed to date [1-3], but they do not necessarily satisfy all the requirements for a logic gate [4,5] and therefore may not be usable in all circumstances. Ref. [1] proposed an idea where digital bits are stored in the magnetization orientations of an array of dipole-coupled nanomagnets and dipole coupling between neighbors elicits logic operation on the bits. This gate is not concatenable since the input and output bits are encoded in dissimilar physical quantities: the inputs are encoded in directions of magnetic fields and the output is encoded in the magnetization orientation of a magnet. Thus, the output of a preceding gate cannot act as the input to the succeeding gate without additional transducer hardware to convert the magnetization orientation of a nanomagnet into the direction of a magnetic field. The gate also lacks true gain since the energy needed to switch the output comes from the inputs and not an independent source such as a power supply. Additionally, the strength of dipole coupling between magnets decreases as the square of the magnet's volume, which limits scalability. Finally, dipole coupling is not sufficiently resilient against thermal noise, resulting in unacceptably large dynamic bit error probability in dipole-coupled logic gates [6-8].
Ref. [2] proposed a different construct where a NAND gate was implemented with a single magneto-tunneling junction (MTJ) placed close to four current lines: two ferry the two input bits to the gate, the third is required for an initialization operation, and the fourth carries the output. The magnetic fields generated by the input currents flip the magnetization of the MTJ's soft layer and switch its resistance, thereby switching the magnitude of the output current and performing NAND logic operation. Slightly different renditions of this idea have been proposed [9] and an experimental demonstration was reported [10]. Unfortunately, this gate too is not directly concatenable since the input bits are encoded in the directions of the input currents while the output bit is encoded in the magnitude of the output current. Moreover, since it is difficult to confine magnetic fields to small regions, the separation between neighboring devices must be large. Individual devices can be small in size, but because the inter-device pitch is large, the device density will be small. There is also some chance that the output current can, by itself, switch the magnetization of the magnetic layers and therefore affect its own state. This is equivalent to lack of isolation between the input and the output, which makes gate operation unreliable. Finally, another MTJ-based logic gate has been proposed recently [11], but it requires a feedback circuit to operate (which makes it extremely energy-inefficient and error-prone) and additionally, the design is flawed [12]. Thus, while these devices are interesting in their own right, they may not be universally usable.
A more recent scheme that overcomes most of the above shortcomings was proposed in Refs. [3] and [4]. It implements non-volatile logic with magnets switched by spin currents. Both computation and communication between gates are carried out with a sequence of clock pulses. Unfortunately, its error-resilience has not been examined. Normally, magnetic devices are much more error-prone than transistors since magnetization dynamics is easily disrupted by thermal noise [6,7]. Logic has stringent requirements on error rates and it is imperative to evaluate the dynamic bit error probability of any gate to assess its viability.
Finally, the most important metric for a logic gate is the energy-delay cost. All non-volatile magnetic logic schemes fail in this area. The scheme in Ref. [2] uses current-generated magnetic fields to switch magnets and hence would dissipate at least 10^9 kT of energy per gate operation at room temperature to switch in ∼1 ns [13] (energy-delay product = 4×10^−21 J-s). A recent experiment conducted to demonstrate this scheme used on-chip current-generated magnetic fields to switch magnets and ended up dissipating approximately 10^12 kT of energy per switching event, despite switching in ∼1 µs (energy-delay product = 4×10^−15 J-s) [14]. The scheme in Ref. [3] is expected to dissipate between 10^5 and 10^6 kT of energy when it switches in 1 ns (energy-delay product = 4×10^−25 to 4×10^−24 J-s) [15], although a lower energy-delay product may be possible with design optimization [16]. In contrast, a low-power transistor may dissipate only 10^3 kT of energy when it switches in 0.1 ns (energy-delay product = 3×10^−28 J-s) [17]. Therefore, all the above non-volatile schemes appear to be far inferior to transistors in energy-delay product, which may preclude their widespread application, despite the non-volatility.
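The energy-delay products quoted above follow from converting kT to joules and multiplying by the switching delay. The sketch below repeats that arithmetic; the temperature of 300 K and the names `energy_delay_product` and `SCHEMES` are assumptions made for illustration, not values or identifiers taken from the cited works.

```python
# Energy-delay products recomputed from the (energy, delay) pairs quoted in the text.
K_B = 1.380649e-23   # Boltzmann constant, J/K
T_ROOM = 300.0       # assumed room temperature, K


def energy_delay_product(energy_in_kt, delay_s, temperature_k=T_ROOM):
    """Return energy x delay in J-s for an energy expressed in units of kT."""
    return energy_in_kt * K_B * temperature_k * delay_s


# (energy in kT, switching delay in seconds), as quoted in the paragraph above
SCHEMES = {
    "Ref. [2], field-switched MTJ (estimate)": (1e9, 1e-9),
    "Field-switching experiment [14]": (1e12, 1e-6),
    "Ref. [3], lower bound": (1e5, 1e-9),
    "Ref. [3], upper bound": (1e6, 1e-9),
    "Low-power transistor [17]": (1e3, 1e-10),
}

for name, (energy_kt, delay_s) in SCHEMES.items():
    print(f"{name}: {energy_delay_product(energy_kt, delay_s):.1e} J-s")
```

With these rounded inputs the transistor entry comes out near 4×10^−28 J-s, the same order as the 3×10^−28 J-s quoted in the text; the small difference presumably reflects rounding of the 10^3 kT figure.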
In this report, we propose a non-volatile nanomagnetic NAND gate that, unlike the other schemes, is switched with voltage (not current). It has an energy-delay product of 1.6×10^−26 J-s, which is smaller than that of others (magnetic logic schemes) by approximately an order of magnitude. The energy-delay product, however, is not by itself the most meaningful metric for benchmarking device performance. It is always possible to reduce this product arbitrarily by sacrificing reliability. For example, one can forcibly switch a device faster and also dissipate less energy to switch (which will reduce the energy-delay product), but at the cost of increased switching failures. A more meaningful metric may be the product of energy, delay and failure (error) probability. The error probability for the proposed NAND gate has been evaluated rigorously from stochastic simulations. With careful choice of parameters, it is possible to reduce the error probability to below 10^−8 at room temperature, which is remarkable for magnetic logic (magnetic logic is typically much more error-prone than transistor logic, which is the price one must pay for non-volatility). Finally, the proposed gate fulfills all the requirements for logic. Therefore, it is the first nanomagnetic logic gate that has the cherished advantage of magnetic logic gates (non-volatility) and yet none of the usual disadvantages.
The proposed gate structure is shown in Fig. 1(a). It is implemented with a skewed MTJ stack, resistors R, a bias dc voltage V_BIAS, and a constant current source I_BIAS. The current source is not used to switch the gate, but merely to produce an output voltage V_out representing the output logic bit. Input bits are encoded in input voltages V_in. Both input and output bits are encoded in the same physical quantity, voltage, which allows direct concatenation.
The bottom layer of the MTJ stack is an elliptical magnetostrictive (metallic) nanomagnet (Terfenol-D) and the top layer is a non-magnetostrictive elliptical (metallic) synthetic antiferromagnet (SAF) with large shape anisotropy. The top layer acts as the hard (or pinned) layer and the bottom layer acts as the soft (or free) layer of the MTJ. There is a small permanent magnetic field directed along the minor axis of the magnetostrictive nanomagnet (+y-direction) which moves its two stable magnetization orientations away from the major axis and aligns them along two mutually perpendicular in-plane directions that lie between the major and minor axes (Fig. 1(b)) [18,19]. The major axis of the top SAF layer is aligned along one of the two stable magnetization orientations of the soft magnet. It is then permanently magnetized in the direction anti-parallel to that orientation. Two electrodes E and E′ are delineated on the PZT surface such that the line joining their centers lies close to that orientation. The electrode lateral dimensions, the separation between their edges, and the PZT film thickness are all approximately equal.
The two electrodes E and E′ are electrically shorted. Whenever an electrostatic potential difference appears between them and the silicon substrate (between point M and point N in Fig. 1(a)), the PZT layer is strained. Since the electrode in-plane dimensions are comparable to the PZT film thickness, the out-of-plane (d_33) expansion/contraction and the in-plane (d_31) contraction/expansion of the piezoelectric regions underneath the electrodes produce a highly localized strain field under the electrodes [20]. Furthermore, since the electrodes are separated by a distance approximately equal to the PZT film thickness, the interaction between the local strain fields below the electrodes will lead to a biaxial strain in the PZT layer underneath the soft magnet [20]. This biaxial strain (compression/tension along the line joining the electrodes and tension/compression along the perpendicular axis) is transferred to the soft magnetostrictive magnet in elastic contact with the PZT, thus rotating its magnetization. This happens despite any substrate clamping and despite the fact that the electric field in the PZT layer just below the magnet is approximately zero [20]. Some of the generated strain may even reach the top hard magnet [21], but since the hard magnet is very anisotropic in shape and is not magnetostrictive, its magnetization will not rotate perceptibly. Rotation of the magnetization of the soft layer of an MTJ due to strain has been recently demonstrated experimentally [21].
Fig. 2 shows the potential energy profile of the soft magnetostrictive nanomagnet in its own plane (φ = 90°) plotted as a function of the angle θ subtended by the magnetization vector with the major axis of the ellipse (z-axis). Note that the energy profile has two degenerate minima (B and C) in the absence of stress (i.e. when no voltage is applied between nodes M and N). These two states correspond to Ψ_1 and Ψ_0, respectively, in Fig. 1. Application of sufficient potential difference between M and N, to generate sufficient stress in the magnetostrictive magnet, transforms the energy profile into a monostable well (with no local minima) located at either B or D, depending on whether the stress is tensile or compressive, i.e. whether node M is at a higher potential than node N, or the opposite [18,19].
Fig. 1 caption: (a) The distance between the electrodes is 100 nm and the electrode lateral dimensions are also of the same order. (b) The fixed magnetization orientation of the top (hard) magnet is denoted by Ψ_f, and the two stable magnetization orientations of the bottom (soft) magnet are denoted by Ψ_0 and Ψ_1. The MTJ resistance is high when the soft magnet's magnetization is aligned along Ψ_1. The MTJ resistance is (ideally) a factor of 2 lower when the soft magnet's magnetization is aligned along Ψ_0. The slanted ellipse is the footprint of the soft magnet and the horizontal ellipse is the footprint of the hard magnet. The black double arrows show the direction of the permanent …
If we apply compressive stress with the right voltage polarity, the system will go to point D and the magnetization will point along the corresponding direction. Thereafter, if we withdraw the voltage and stress, the system will go to the nearer energy minimum at point C (and not the other minimum at B) because of the potential barrier that exists between B and C. This happens with >99.999999% probability at room temperature in the presence of thermal noise (see supplementary material). Once it reaches C, the system will remain there (since it is an energy minimum) and the magnetization will continue to point along the corresponding direction (making the device non-volatile) until tensile stress is applied [by applying voltage of opposite polarity between M and N] to take the system to B, thereby changing the magnetization to the other stable direction. Upon withdrawal of the tensile stress, the system will remain in state B because the energy barrier between B and C will prevent it from migrating to C. Therefore, the system is non-volatile in either state. By merely choosing the polarity of the voltage between nodes M and N, we can deterministically visit either state B or state C and orient the magnetization along either of the two stable states. The magnet will remain in the chosen state after the voltage is withdrawn. This was used as the basis for deterministically writing the bit 0 or 1 in non-volatile memory, irrespective of what the initial stored bit was [18,19]. Here, we have extended that idea to build a non-volatile universal logic gate (NAND) using a magneto-tunneling junction in the manner of Ref. [2].
The gate works as follows. Let us first assume that the binary logic bits '1' and '0' are encoded in voltage levels V_0 and V_0/2 [what determines the minimum value of V_0 is discussed later]. The bias voltage is set to V_BIAS = 5V_0/12. Every logic operation is preceded by a RESET operation where the two inputs V_in1 and V_in2 are set to V_0/4. During RESET, the potential drop appearing between the terminals M and N in Fig. 1 generates in-plane tensile stress in the direction of the line joining the two electrodes and in-plane compressive stress in the direction perpendicular to the line joining the two electrodes. The voltage levels between M and N that generate these stresses are ±112.5 mV. This moves the system to point B in the energy profile in Fig. 2 where the magnetization vector is nearly anti-parallel to the magnetization of the top magnet (SAF) [see the state 'Ψ_1' in Fig. 1(b)]. This makes the resistance of the MTJ 'high'. When the input voltages are subsequently withdrawn by grounding the inputs and shorting the bias voltage source connected to the Si substrate, V_MN drops to nearly zero as long as R is much greater than the resistance of the ultrathin PZT layer. Therefore, the stress in the magnet relaxes, but the system remains at point B.
Consequently, the MTJ is always left in the high resistance state after the RESET step is completed. Hence, the magnetization vector remains oriented very close to Ψ_1 and the MTJ resistance remains high. If either input is low, the magnet is under small compressive stress (-10 MPa) and the global energy minimum moves to B′. However, there is an energy barrier of 23.63 kT separating B and B′, which cannot be transcended at room temperature. Consequently, the magnetization remains stuck at the local minimum near B and the MTJ resistance remains high. When both inputs are high, the magnet experiences high compressive stress (-30 MPa), which makes the energy profile monostable with a single energy minimum at D and no local minimum where the system can be trapped. Therefore, the system migrates to D, the magnetization vector orients close to Ψ_0, and the MTJ resistance goes low.
In the logic operation stage, the following scenarios occur: (1) if both inputs are low (i.e. V_in1 = V_in2 = V_0/2), then V_MN = -V_0/12, and (2) if one input is high and the other low (i.e. V_in1 = V_0 and V_in2 = V_0/2, or vice versa), then V_MN = V_0/12 (see supplementary material). The potential energy profiles for these two scenarios are shown in Fig. 3. When both inputs are low, the global energy minimum lies at a point that nearly coincides with B. Since the RESET operation left the system at B, the magnetization barely rotates and the MTJ resistance remains high. When one input is high and the other low, the global energy minimum moves to B′, which is closer to the other stable magnetization orientation, but there is still a local energy minimum close to B which is separated from B′ by a potential barrier that cannot be crossed. Therefore, the system remains stuck in the metastable state corresponding to the local minimum near B and the magnetization does not rotate perceptibly. Hence, once again, the MTJ resistance remains high. After the inputs are removed by grounding V_in1 and V_in2, shorting the bias voltage sources and open-circuiting the bias current source, the strain in the magnet relaxes and the magnetization settles into the only accessible stable state B. It remains there in perpetuity, thereby implementing non-volatile logic (memory of the last output state is retained). However, (3) if both inputs are high, then V_MN = +V_0/4 (see supplementary material) and the strain becomes in-plane compressive in the direction of the line joining the two electrodes and in-plane tensile in the direction perpendicular to the line joining the two electrodes. This is sufficient to change the potential energy profile dramatically as shown in Fig. 3. Now the operating point moves to D since it becomes the global minimum and there is no local minimum where the system can get stuck. Consequently, the magnetization vector rotates to an orientation nearly perpendicular to the magnetization of the top layer [state 'Ψ_0' in Fig. 1(b)]. The resistance of the MTJ then drops by ∼50% since it is inversely proportional to 1 + η_1 η_2 cos γ, where γ is the angle between the magnetizations of the top and bottom magnets, and η_1, η_2 are the spin injection and detection efficiencies of the magnet-spacer interfaces [22]. Since in the high-resistance state γ = 180° and in the low-resistance state γ ≈ 90°, the resistance ratio is 1/(1 − η_1 η_2), which would be ∼2 if we reasonably assume η_1 = η_2 = 0.7 [23] [if the efficiencies are less than 70%, the logic levels will be encoded in V_0 and xV_0, where x > 0.5]. Subsequent removal of the input voltages (by grounding them) drives the system to state C where the MTJ resistance remains low, thereby retaining memory of the last output state (non-volatility). The probability of the gate working in this fashion, in the presence of thermal noise, has been calculated rigorously from stochastic Landau-Lifshitz-Gilbert simulations of the magnetodynamics (see supplementary material) and that probability was found to exceed 99.999999% in all cases.
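The resistance ratio quoted above can be checked numerically. This is a minimal sketch that takes only the stated proportionality R ∝ 1/(1 + η_1 η_2 cos γ); the function name `mtj_resistance` and the arbitrary prefactor `r_scale` are illustrative assumptions.

```python
import math


def mtj_resistance(gamma_deg, eta1=0.7, eta2=0.7, r_scale=1.0):
    """MTJ resistance (arbitrary units) for an angle gamma between the two magnetizations.

    Uses only the proportionality stated in the text: R ~ 1 / (1 + eta1*eta2*cos(gamma)).
    r_scale is a hypothetical prefactor that cancels in the ratio.
    """
    return r_scale / (1.0 + eta1 * eta2 * math.cos(math.radians(gamma_deg)))


r_high = mtj_resistance(180.0)  # soft layer anti-parallel to the hard layer (state Psi_1)
r_low = mtj_resistance(90.0)    # soft layer perpendicular to the hard layer (state Psi_0)
ratio = r_high / r_low

print(f"R_high / R_low = {ratio:.3f}")
```

For η_1 = η_2 = 0.7 the ratio equals 1/(1 − 0.49), slightly below 2, which is why the text treats the factor of 2 as approximate and allows the low logic level to be xV_0 with x > 0.5 for less efficient interfaces.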
Let us now explain how this translates to NAND logic. Since there is not much electric field in the PZT directly under the MTJ stack [20], we can neglect any voltage drop in the PZT between the magnetostrictive magnet and the silicon substrate. Therefore, V_out = I_BIAS R_MTJ, where R_MTJ is the resistance of the MTJ stack. The biasing constant current source I_BIAS is set to V_0/R_high, where R_high is the resistance of the MTJ in the high-resistance state. Therefore, whenever the MTJ is in the high-resistance state, the output voltage is V_0, and whenever the MTJ is in the low-resistance state, the output voltage is V_0 R_low/R_high = V_0/2. Since the logic bit 1 is encoded in voltage V_0 and logic bit 0 is encoded in the voltage level V_0/2, we find that the output bit is 1 when either input bit is 0, and it is 0 when both inputs are 1. In other words, we have successfully implemented a NAND gate (see the truth table shown in Fig. 1).
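The RESET-then-evaluate sequence described above can be sketched as a small deterministic state machine. Several things here are assumptions for illustration: node M is taken to sit at (V_in1 + V_in2)/3, a divider chosen because it reproduces the node voltages quoted in the text; switching is idealized as a hard threshold at |V_MN| = V_0/4; and thermal noise is ignored. The names `v_mn`, `V_SWITCH` and `NandGate` are hypothetical.

```python
from fractions import Fraction

V0 = Fraction(1)                  # logic-high level; all voltages are in units of V_0
V_BIAS = Fraction(5, 12) * V0     # bias voltage at node N, as stated in the text
V_SWITCH = Fraction(1, 4) * V0    # idealized |V_MN| needed to rotate the magnetization


def v_mn(v_in1, v_in2):
    """Voltage across the PZT, assuming node M sits at (V_in1 + V_in2) / 3."""
    return (v_in1 + v_in2) / 3 - V_BIAS


class NandGate:
    """Behavioral model: the MTJ resistance state persists between operations."""

    def __init__(self):
        self.resistance_high = True   # arbitrary power-up state; RESET overwrites it

    def _apply(self, v_in1, v_in2):
        v = v_mn(v_in1, v_in2)
        if v >= V_SWITCH:             # large compressive stress: rotate to Psi_0
            self.resistance_high = False
        elif v <= -V_SWITCH:          # large tensile stress: rotate to Psi_1
            self.resistance_high = True
        # smaller |V_MN| cannot cross the barrier, so the state is retained

    def reset(self):
        self._apply(V0 / 4, V0 / 4)

    def evaluate(self, bit1, bit2):
        self.reset()
        level = {1: V0, 0: V0 / 2}
        self._apply(level[bit1], level[bit2])
        return 1 if self.resistance_high else 0

    def v_out(self):
        """Output voltage I_BIAS * R_MTJ with I_BIAS = V_0 / R_high and R_low = R_high / 2."""
        return V0 if self.resistance_high else V0 / 2


gate = NandGate()
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", gate.evaluate(a, b), "V_out =", gate.v_out(), "V_0")
```

Exact fractions are used so that the comparison against the V_0/4 threshold is not affected by floating-point rounding; the real device switches probabilistically, which this sketch does not capture.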
Let us now examine if this device fulfills all the requirements of a Boolean logic gate.
a. Concatenability: For concatenability, the output voltage of a preceding gate has to be fed directly to the input of a succeeding gate. This requires that V_in1(high) = V_in2(high) = I_BIAS R_high = V_0, and V_in1(low) = V_in2(low) = I_BIAS R_low = V_0/2, which is easily achieved by choosing I_BIAS = V_0/R_high. In the event the logic levels have to be encoded in V_0 and xV_0 (0.5 ≤ x ≤ 1), the resistive network at the input side and V_BIAS have to be re-designed, but this is trivial.
d. Gain: Gain is ensured when the energy to switch the output bit does not come from the input energy, but from an independent power source [3], which, in our case, is the constant current source. Whenever the inputs V_in1 and V_in2 end up switching the MTJ resistance, the independent current source I_BIAS switches V_out.
e. Universal logic: The gate performs NAND operation which is universal.
f. Scalability: Because we do not use magnetic fields to switch specific gates (unlike Refs. [1] and [2]), but instead use only voltages, we do not have to space gates far apart so that fringing magnetic fields from one gate do not influence the neighbor. As a result, gates can be placed close to each other, thereby increasing the gate density. The gates can scale all the way down to the superparamagnetic limit of the nanomagnets at the operating temperature.
g. Error-resilience: Two types of errors afflict non-volatile gate operation: static errors caused by the magnetization of the soft magnetostrictive layer flipping spontaneously owing to thermal noise [thereby switching the output bit erroneously in standby state], and dynamic errors that occur (also because of thermal noise) when the output switches to an incorrect state in response to the inputs changing. The static error probability is determined by the energy barrier separating the two stable magnetization states in the soft layer. The minimum barrier height is determined by the magnetic field strength, the dimensions of the magnet and material parameters. In our case, it was 69.26 kT at room temperature (see supplementary material), so the static error probability is ∼e^−69.26 ≈ 10^−30 per spontaneous switching attempt [24]. In other words, the retention time of an output bit in the non-volatile logic gate at room temperature will be ∼(1/f_0) e^69.26 = 3.8×10^10 years, since the attempt frequency f_0 in nanomagnets will very rarely exceed 1 THz [25]. The gate is therefore indeed non-volatile. Dynamic gate errors, however, are much more probable and accrue from two sources: (1) thermal noise causing erratic magnetization dynamics that drive magnets to the wrong stable magnetization state, resulting in bit error, and (2) complicated clocking schemes that require precise timing synchronization for gate operation and whose failure causes bit errors. The gate in Ref. [3] works with Bennett clocking [26], which is predicated on the principle of placing the output magnet in its maximum energy state, and then waiting for the input signal to drive it to the desired one among its two minimum energy states to produce the correct output bit. This strategy is risky since the maximum energy state is also maximally unstable. While perched on the energy maximum, thermal fluctuations can drive the output magnet to the wrong minimum energy state with unacceptably high probability [6], resulting in unacceptable bit error rates. A later modification [4] overcame this shortcoming, but at the expense of much increased energy dissipation. Moreover, that logic gate also requires a complicated clocking sequence without which it cannot operate. In contrast, we never place any element of our gate at the maximum energy state (no Bennett clocking) and no complicated clocking sequence is needed.
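The static-error estimate in item g is a direct evaluation of e^(−ΔE/kT) and (1/f_0) e^(ΔE/kT). A minimal sketch of that arithmetic follows, taking the 69.26 kT barrier and the 1 THz attempt frequency from the text; the function names and the Julian-year conversion are illustrative choices.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, used only for unit conversion


def static_error_probability(barrier_in_kt):
    """Probability per attempt of thermally hopping over a barrier given in units of kT."""
    return math.exp(-barrier_in_kt)


def retention_time_years(barrier_in_kt, attempt_frequency_hz=1e12):
    """Mean retention time (1/f0) * exp(barrier/kT), converted to years."""
    return math.exp(barrier_in_kt) / attempt_frequency_hz / SECONDS_PER_YEAR


barrier = 69.26   # minimum barrier height in kT at room temperature, from the text
print(f"static error probability per attempt: {static_error_probability(barrier):.1e}")
print(f"retention time: {retention_time_years(barrier):.1e} years")
```

Because the retention time depends exponentially on the barrier, a few kT of variation changes it by orders of magnitude, so the quoted figure is best read as an order-of-magnitude estimate.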
An important consideration for Boolean logic is logic level restoration [27]. If noise broadens the input voltage levels V_0 and V_0/2, making it harder to distinguish between bits 0 and 1, the logic device should be able to restore the distinguishability by ensuring that the output voltage levels are not broadened and remain well separated. For this, the transfer characteristic of the gate (when used as an inverter) must show a sharp transition. We have computed the transfer characteristic (V_out versus V_in) by shorting the two inputs (thus making it an inverter) and calculating the output V_out for various values of V_in. The calculation procedure is described in the supplementary material. The characteristic is shown in Fig. 4 and the sharpness of the transition allows for excellent logic level restoration capability.
The proposed gate has unprecedented energy-efficiency that far exceeds that of other non-volatile magnetic NAND gates. There are four contributions to the energy dissipated in this logic gate during a logic operation: internal dissipation due to Gilbert damping that occurs while the magnetostrictive layer's magnetization switches (rotates); the energy C(V_MN)^2 dissipated in turning on/off the potential V_MN = ±V_0/4 (= ±112.5 mV) abruptly or non-adiabatically during the RESET stage or logic operation stage (where C is the capacitance between the shorted pair of electrodes and the n+-Si substrate); the energies dissipated in the resistors R; and the maximum energy V_0^2/R_high dissipated in the MTJ when the output is high (the energy dissipated when the output is low is V_0^2/(4R_low) = V_0^2/(2R_high), which is 50% lower). We can make the energies dissipated in the resistors arbitrarily small by choosing arbitrarily high values for R; hence, this contribution is neglected. The other contributions are computed in the supplementary material and add up to a mere 3004 kT (12.5 aJ) at room temperature. This dissipation is comparable to that of state-of-the-art low-power complementary-metal-oxide-semiconductor (CMOS) transistor based two-input NAND gates [17]. The switching time, on the other hand, is ∼1.3 ns, which is one order of magnitude longer than that of the CMOS based logic gate. However, the CMOS based gate is volatile while this gate is non-volatile. The overall energy-delay product of this gate (1.6×10^−26 J-s) is about two orders of magnitude superior to that of any other magnetic (non-volatile) logic gate [17].
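The figures for this gate can be cross-checked from the quoted 3004 kT and 1.3 ns. The sketch below assumes a temperature of 300 K and compares against the 10^5 to 10^6 kT, 1 ns range quoted earlier for Ref. [3]; the variable names are illustrative, and the error-weighted product uses the 10^−8 bound as if it were the actual error probability.

```python
K_B = 1.380649e-23        # Boltzmann constant, J/K
T_ROOM = 300.0            # assumed room temperature, K
KT = K_B * T_ROOM

# This gate: total dissipation and switching delay as quoted in the text
gate_energy_j = 3004 * KT
gate_delay_s = 1.3e-9
gate_edp = gate_energy_j * gate_delay_s

# Energy x delay x error probability, the combined metric suggested in the text
gate_error_bound = 1e-8
gate_ede = gate_edp * gate_error_bound

# Spin-current scheme of Ref. [3]: 1e5 to 1e6 kT in 1 ns, as quoted earlier
ref3_edp_low = 1e5 * KT * 1e-9
ref3_edp_high = 1e6 * KT * 1e-9

print(f"gate energy: {gate_energy_j * 1e18:.2f} aJ")
print(f"gate energy-delay product: {gate_edp:.2e} J-s")
print(f"energy-delay-error product (upper bound): {gate_ede:.2e} J-s")
print(f"Ref. [3] / gate EDP ratio: {ref3_edp_low / gate_edp:.0f} to {ref3_edp_high / gate_edp:.0f}")
```

With these inputs the improvement over Ref. [3] spans roughly 26 to 260, that is, between one and two orders of magnitude depending on which end of the quoted range is used.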
Logic gates of this type may have a special niche in medically implanted processors such as pacemakers [28], wearable electronics [29] or devices implanted in an epileptic patient's brain that monitor brain signals and warn of an impending seizure. Such processors need to dissipate very little energy so that they can be powered by the user's body movements and not require a battery [30]. The present device is tailor-made for such applications.

METHODS
To fabricate the gate, a piezoelectric (PZT) thin film (∼100 nm thick) is deposited on a conducting n+-silicon substrate which is grounded through a bias voltage V_BIAS. A skewed MTJ stack is fabricated on top of the PZT film. The bottom layer material is chosen as Terfenol-D because of its large magnetostriction (900 ppm). The magnetostriction is positive, which tends to make the magnetization align along the direction of tensile stress and perpendicular to the direction of compressive stress. The angle between the major axes of the two elliptical nanomagnets is determined by the angular separation between Ψ_1 and Ψ_0. The current source I_BIAS is connected across the …
To evaluate the dynamic error probability, the magnetization dynamics of the soft magnetostrictive magnet induced by stress in the presence of thermal noise is modeled by the stochastic Landau-Lifshitz-Gilbert equation [6]. In the supplementary material, we present results of simulations to show that if V_0 = 0.45 V, then switching is accomplished in 1.3 ns and the dynamic error probability associated with incorrect switching is less than 10^−8 in every gate operation if we keep the voltage on for 1.3 ns. Therefore, the gate can work at a clock frequency of ∼1/(1.3 ns) > 0.75 GHz with an error probability < 10^−8. Stated succinctly, the probability of the output voltage being low when both inputs are high is > 99.999999% and the probability of it being low when either input is low is < 10^−8. In other words, the NAND gate works with > 99.999999% fidelity. This is unimpressive for transistor-based volatile logic, but it is remarkable for non-volatile magnetic logic gates, which typically have very high error probabilities [6-8]. This degree of error-resilience may be sufficient for use in stochastic logic architectures [31].

I. SUPPLEMENTARY MATERIAL
In this accompanying supplementary material, we elucidate gate operation and concatenation, choice of the voltage level V 0 , the stochastic Landau-Lifshitz-Gilbert simulations, and calculations of the transfer characteristic and energy dissipation in a gate operation.

II. GATE OPERATION
To understand how the RESET, logic and the concatenation schemes work, consider Fig. II. The nodes M and N represent the same nodes as in Fig. 1(a) of the main paper and V_MN is the voltage drop between these nodes. Therefore, V_MN is the voltage drop across the piezoelectric layer that generates strain in the magnetostrictive layer (soft layer of the MTJ) and makes its magnetization rotate. Note that V_MN alone determines the MTJ resistance.
As established by the energy profiles in the main paper, when V_MN is either negative or positive but small, the hard and soft layers of the MTJ remain magnetized in anti-parallel directions and the MTJ resistance remains high. This high resistance is denoted by R_0.
When V_MN is positive and sufficiently large in magnitude, the magnetizations of the hard and soft layers become mutually perpendicular and the MTJ resistance drops by a factor of 2 to become R_0/2.
Since the ratio R_high/R_low is 2:1, logic '1' must be encoded in some voltage level V_0 and logic '0' in voltage level V_0/2. This is needed because the logic levels at the output are determined solely by the MTJ resistance.
Let us consider the RESET operation that is supposed to leave the MTJ resistance in the high state R_0 (Case I in Fig. II). The input voltages are set to V_0/4. The voltage at node M is then found by superposition and it is V_0/6. Since the bias voltage V_BIAS is set to 5V_0/12, the voltage at node N is always fixed at 5V_0/12. Therefore, V_MN, which is the voltage drop across the PZT thin film, becomes -V_0/4. This negative voltage generates tensile stress in the magnetostrictive layer and leaves its magnetization pointing anti-parallel to that of the hard (SAF) layer of the MTJ (close to Ψ_1). Therefore, the MTJ resistance R_MTJ is left high at R_0 by the RESET step.
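The node voltages above can be reproduced with a short nodal analysis. The derivation below is a sketch under an assumption not stated in this excerpt: each input reaches node M through a resistor R, a third equal resistor R connects M to ground, and the PZT draws negligible current. This network is chosen only because it reproduces the quoted value V_M = V_0/6 for RESET; the actual network is the one drawn in Fig. II.

```latex
% Assumed network: V_in1 and V_in2 each feed node M through R; a third R ties M to ground.
\begin{align*}
\frac{V_{in1}-V_M}{R} + \frac{V_{in2}-V_M}{R} &= \frac{V_M}{R}
  \quad\Longrightarrow\quad V_M = \frac{V_{in1}+V_{in2}}{3}, \\
V_{MN} &= V_M - V_N = \frac{V_{in1}+V_{in2}}{3} - \frac{5V_0}{12}. \\[4pt]
\text{RESET } (V_{in1}=V_{in2}=V_0/4):\quad
  V_{MN} &= \frac{V_0}{6} - \frac{5V_0}{12} = -\frac{V_0}{4}, \\
\text{one input high, one low } (V_0,\ V_0/2):\quad
  V_{MN} &= \frac{V_0}{2} - \frac{5V_0}{12} = +\frac{V_0}{12}, \\
\text{both inputs high } (V_0,\ V_0):\quad
  V_{MN} &= \frac{2V_0}{3} - \frac{5V_0}{12} = +\frac{V_0}{4}.
\end{align*}
```

The last two results agree with the values of V_MN given in the main text for the logic operation stage, which supports (but does not prove) the assumed divider.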
Note that the voltage drop between the bottom (soft) layer of the MTJ and node N is almost zero since the metallic layer shorts out the electric field underneath it in the PZT [20].
barrier.As a result, the magnetization of the soft magnetostrictive layer now rotates by ∼90 • , placing it approximately perpendicular to that of the hard layer.Therefore, the MTJ's resistance drops to R 0 /2 and [from Equation (1)] the output voltage drops to V 0 /2.Thus, when both inputs are bit '1', the output is bit '0'.Note that the output voltage levels encoding bits '1' and '0' are V 0 and V 0 /2 which are also the input voltage levels encoding bits '1' and '0'.Therefore, the output of one stage can be directly fed to the next stage as input without requiring additional hardware for amplification or level shifting.That makes this construct concatenable.
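The voltage bookkeeping for the RESET and logic cases can be checked with a few lines of exact arithmetic. The sketch below assumes that the superposition at node M reduces to V_M = (V_in1 + V_in2)/3. This relation is inferred here, not stated in the text, because it reproduces the quoted V 0 /6 at RESET and the V M N values of ±V 0 /12 and V 0 /4 used in the later sections. The function name `v_mn` is hypothetical.

```python
from fractions import Fraction as F

V0 = F(1)                  # all voltages expressed in units of V_0
V_BIAS = F(5, 12) * V0     # node N is held at V_BIAS = 5 V_0 / 12 (from the text)

def v_mn(v_in1, v_in2):
    """Voltage across the PZT film, V_MN = V_M - V_N.

    Assumption: V_M = (V_in1 + V_in2) / 3, inferred from the quoted
    value V_M = V_0 / 6 when both inputs are V_0 / 4.
    """
    v_m = (v_in1 + v_in2) / 3
    return v_m - V_BIAS

HIGH, LOW, RESET = V0, V0 / 2, V0 / 4   # logic '1', logic '0', RESET level

cases = {
    "RESET": (RESET, RESET),
    "both low": (LOW, LOW),
    "one high, one low": (HIGH, LOW),
    "both high": (HIGH, HIGH),
}
for name, (a, b) in cases.items():
    print(f"{name:18s} V_MN = {v_mn(a, b)} V_0")
```

Only the case with both inputs high yields V M N = V 0 /4, the level that is meant to rotate the magnetization; every other case stays at or below V 0 /12.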

III. CHOICE OF VOLTAGE LEVEL V 0
In order to choose the value of V 0 (which ultimately determines the amount of dissipation, switching delay and energy-delay product), we have to ensure that the compressive stress generated by V M N = V 0 /4 is sufficient to overcome the shape anisotropy barrier in the elliptical magnetostrictive layer and rotate its magnetization, but the compressive stress generated by V M N = V 0 /12 is not. The amount of stress generated by a given voltage, and the effective shape anisotropy barrier in the presence of the permanent magnetic field, depend on many parameters such as the strength of the magnetic field, the shape and size of the magnetostrictive layer, the electrode size and placement, the piezoelectric layer thickness, and the piezoelectric and magnetostrictive materials. For the choices we made, we found from stochastic Landau-Lifshitz-Gilbert simulations of magnetodynamics in the presence of room-temperature thermal noise 32 that a compressive stress of 30 MPa rotates the magnetization (and switches the MTJ resistance from high to low) with greater than 99.999999% probability, while a compressive stress of 10 MPa has less than 10^−8 probability of rotating the magnetization and switching the MTJ resistance. Therefore, V M N = V 0 /4 needs to generate a stress of -30 MPa (compressive strain is negative). The material chosen for the magnetostrictive layer is Terfenol-D because of its large magnetostriction. From the Young's modulus of Terfenol-D, we calculated that the strain required to generate a stress of -30 MPa is -3.75×10^−4. To generate this amount of strain, the strength of the electric field in the PZT between the shorted electrodes and the n+-Si substrate should be 1.125 MV/m (interpolated from the results in Ref. 20). This value is well below the breakdown field of PZT. Since the PZT layer thickness is 100 nm, the voltage V M N needed to generate the strain of -3.75×10^−4 will be 112.5 mV. Hence, V 0 = 4V M N = 0.45 V.
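The chain of numbers in this section (stress, strain, field, voltage) can be retraced as a short calculation. The sketch below assumes a Young's modulus of 80 GPa for Terfenol-D, which is the value implied by the quoted stress and strain and is not stated explicitly in the text. It also assumes that stress scales linearly with V M N, which is what the 30 MPa and 10 MPa figures at V 0 /4 and V 0 /12 suggest. The 1.125 MV/m field is taken directly from the text and is not derived here.

```python
# Retrace the choice of V_0 from the required switching stress.
STRESS_SWITCH = -30e6        # Pa, compressive stress required at V_MN = V_0/4 (from the text)
YOUNG_MODULUS = 80e9         # Pa, ASSUMED: implied by -30 MPa / -3.75e-4
E_FIELD = 1.125e6            # V/m, field needed for that strain (quoted, from Ref. 20)
PZT_THICKNESS = 100e-9       # m (from the text)

strain = STRESS_SWITCH / YOUNG_MODULUS   # dimensionless, negative = compressive
v_mn_switch = E_FIELD * PZT_THICKNESS    # V, voltage across the PZT film
v0 = 4.0 * v_mn_switch                   # V, since V_MN = V_0/4 when both inputs are high

# ASSUMED linear scaling of stress with V_MN: V_0/12 is one third of V_0/4.
stress_weak = STRESS_SWITCH * (v0 / 12.0) / (v0 / 4.0)

print(f"strain         = {strain:.3e}")
print(f"V_MN (switch)  = {v_mn_switch * 1e3:.1f} mV")
print(f"V_0            = {v0:.3f} V")
print(f"stress at V_0/12 = {stress_weak / 1e6:.1f} MPa")
```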

IV. STOCHASTIC LANDAU-LIFSHITZ-GILBERT (LLG) SIMULATIONS
The error probability associated with gate operation, the internal energy dissipated during switching, and the switching delay (all in the presence of room-temperature thermal noise) are calculated from the stochastic Landau-Lifshitz-Gilbert equation. We first write expressions for the various contributions to the potential energy of the magnetostrictive layer and then find the effective torques due to these contributions as well as the random torque due to thermal noise. These torques rotate the magnetization vector. The entire procedure is described next.
We refer to Fig. 1(b) from the main paper and define our coordinate system such that the magnet's easy (major) axis lies along the z-axis and the in-plane hard (minor) axis lies along the y-axis (see also Fig. 1(a) in the main paper). Application of a positive/negative voltage between the electrode pair and the conducting n+-Si substrate generates biaxial strain leading to compression/expansion along the z′-axis and expansion/compression along the y′-axis 20. The latter two axes are the axes of Ψ 1 and Ψ 0. The angle between the z- and z′-axes is δ, which is therefore the angle between the major axes of the hard and soft elliptical magnets.
To derive general expressions for the instantaneous potential energies of the nanomagnet due to shape anisotropy, stress anisotropy and the static magnetic field, we used the primed axes of reference (x′, y′, z′) and represented the magnetization orientation of the single-domain magnetostrictive magnet in spherical coordinates, with θ′ representing the polar angle and φ′ representing the azimuthal angle. The magnitude of the magnetization is invariant in time and space owing to the macrospin assumption.
Using the rotated coordinate system (see Fig. S2), the shape anisotropy energy of the nanomagnet E sh (t) can be written as [...], where θ′(t) and φ′(t) are respectively the instantaneous polar and azimuthal angles of the magnetization vector in the rotated frame, M s is the saturation magnetization of the magnet, N d−xx, N d−yy and N d−zz are the demagnetization factors that can be evaluated from the nanomagnet's dimensions 33, µ 0 is the permeability of free space, and Ω = (π/4)abd is the nanomagnet's volume.
The potential energy due to the static magnetic flux density B applied along the in-plane hard axis is given by [...]. The stress anisotropy energy is given by [...], where λ s is the magnetostriction coefficient, Y is the Young's modulus, and ε(t) is the strain generated by the applied voltage V M N at the instant of time t. We only consider the uniaxial strain along the line joining the two electrodes, but the strain is actually biaxial, resulting in tension/compression along that line and compression/tension along the perpendicular direction. The torques due to these two components add. Therefore, we underestimate the stress anisotropy energy, which makes all our figures conservative.
We neglect any contribution due to the dipolar interaction of the hard magnet since the use of the synthetic anti-ferromagnet makes it negligible.
The total potential energy of the nanomagnet at any instant of time t is therefore [...]. The above result is used to plot the energy profiles in the main paper as a function of θ for φ = 90° under various scenarios.
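The energy expressions themselves do not survive in the text above. One set of forms consistent with the quantities defined there is sketched below. It is a reconstruction under stated assumptions, not the authors' equations: the primed frame is assumed to be obtained by rotating the unprimed frame by δ about the common x-axis, with the sign of the rotation chosen so that the θ′-derivative of the field term reproduces the −M s ΩB(cos δ sin φ′ cos θ′ + sin δ sin θ′) factor that appears in the torque expression. The stress term uses the standard uniaxial magnetoelastic form, and the component names m_x, m_y, m_z are introduced here for convenience.

```latex
% Components of the unit magnetization along the unprimed axes (assumed rotation by \delta about x)
\begin{aligned}
m_x &= \sin\theta'\cos\phi', \\
m_y &= \sin\theta'\sin\phi'\cos\delta - \cos\theta'\sin\delta, \\
m_z &= \sin\theta'\sin\phi'\sin\delta + \cos\theta'\cos\delta, \\[4pt]
E_{sh}(t) &= \frac{\mu_0}{2} M_s^2 \Omega
  \left[ N_{d-xx}\, m_x^2 + N_{d-yy}\, m_y^2 + N_{d-zz}\, m_z^2 \right], \\
E_{m}(t) &= -M_s \Omega B \, m_y, \\
E_{stress}(t) &= -\frac{3}{2} \lambda_s\, \epsilon(t)\, Y\, \Omega \cos^2\theta'(t), \\
E(t) &= E_{sh}(t) + E_{m}(t) + E_{stress}(t).
\end{aligned}
```

With these forms, ∂E_m/∂θ′ = −M_s Ω B (cos δ sin φ′ cos θ′ + sin δ sin θ′), which matches the field-dependent term quoted with the torque.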
We follow the standard procedure to derive the time evolution of the polar and azimuthal angles of the magnetization vector in the rotated coordinate frame under the actions of the torques due to shape anisotropy, stress anisotropy, magnetic field and thermal noise.
The torque that rotates the magnetization of the shape-anisotropic magnet in the presence of stress can be written as [...] − M s ΩB(cos δ sin φ′(t) cos θ′(t) + sin δ sin θ′(t)) [...], where m(t) is the normalized magnetization vector, quantities with carets are unit vectors in the original frame of reference, and [...].
At non-zero temperatures, thermal noise generates a random magnetic field h(t) with Cartesian components (h x (t), h y (t), h z (t)) that produces a random thermal torque, which can be expressed as 32 [...], where [...].
In order to find the temporal evolution of the magnetization vector under the vector sum of the different torques mentioned above, we solve the stochastic Landau-Lifshitz-Gilbert (LLG) equation: [...].
From the above equation, we can derive two coupled equations for the temporal evolution of the polar and azimuthal angles of the magnetization vector: [...].
Solutions of these two equations yield the magnetization orientation (θ′(t), φ′(t)) at any instant of time t. Since the thermal torque is random, the solution procedure involves generating switching trajectories by starting each trajectory with an initial value of (θ′, φ′) and finding the values of these angles at any other time by running a simulation with a time step of ∆t = 1 ps for a sufficiently long duration. At each time step, the random thermal torque is generated stochastically. The time step is equal to the inverse of the maximum attempt frequency of demagnetization due to thermal noise in nanomagnets 34. The duration of the simulation is always sufficiently long to ensure that the final results are independent of this duration, and the results are also verified to be independent of the time step.
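The trajectory-generation procedure described above can be illustrated with a minimal macrospin integrator. The sketch below is not the authors' solver. It works in Cartesian components instead of the coupled (θ′, φ′) equations, uses an explicit Euler step with renormalization, and takes the effective field as a fixed vector rather than computing it from the gradient of the potential energy. The noise amplitude uses a commonly quoted form for the thermal field, which is assumed here. All numerical parameters (damping, saturation magnetization, magnet dimensions, temperature) are hypothetical placeholders.

```python
import math
import random

GAMMA = 1.76e11      # gyromagnetic ratio, rad/(s*T)
KB = 1.380649e-23    # Boltzmann constant, J/K

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def thermal_sigma(alpha, temp, ms, volume, dt):
    """Std. dev. (tesla) of each Cartesian component of the random field.

    ASSUMED form: sigma^2 = 2*alpha*kT / (gamma * Ms * volume * dt).
    """
    return math.sqrt(2.0 * alpha * KB * temp / (GAMMA * ms * volume * dt))

def llg_step(m, b_eff, alpha, dt, sigma, rng):
    """One explicit Euler step of dm/dt = -g/(1+a^2) [m x B + a m x (m x B)]."""
    b = tuple(bc + rng.gauss(0.0, sigma) for bc in b_eff)  # add thermal field
    mxb = cross(m, b)
    mxmxb = cross(m, mxb)
    pref = -GAMMA / (1.0 + alpha ** 2)
    m_new = tuple(mc + dt * pref * (p + alpha * q)
                  for mc, p, q in zip(m, mxb, mxmxb))
    return normalize(m_new)  # keep |m| = 1 (macrospin assumption)

def run(m0, b_eff, alpha, dt, n_steps, sigma, seed=0):
    rng = random.Random(seed)
    m = normalize(m0)
    for _ in range(n_steps):
        m = llg_step(m, b_eff, alpha, dt, sigma, rng)
    return m

# Hypothetical parameters, for illustration only.
ALPHA = 0.1                                      # Gilbert damping
MS = 8.0e5                                       # saturation magnetization, A/m
VOLUME = (math.pi / 4) * 100e-9 * 90e-9 * 6e-9   # (pi/4)*a*b*d, m^3
DT = 1e-12                                       # 1 ps time step, as in the text

sigma = thermal_sigma(ALPHA, 300.0, MS, VOLUME, DT)
m_relaxed = run((1.0, 0.0, 0.0), (0.0, 0.0, 0.13), ALPHA, DT, 5000, sigma=0.0)
m_noisy = run((1.0, 0.0, 0.0), (0.0, 0.0, 0.13), ALPHA, DT, 200, sigma=sigma)
```

Without noise the magnetization spirals into alignment with the applied field. With noise, each call with a different `seed` produces a different trajectory, and an ensemble of such runs plays the role of the 10^8 trajectories used in the text.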
The permanent magnetic field (B = 0.1305 T) applied along the +y-direction (hard axis of the magnet) makes the two stable states of the soft magnet's magnetization align along Ψ 1 (θ = θ 1 = 46.8°) and Ψ 0 (θ = θ 0 = 133.2°).
To study the switching dynamics under the influence of stress, we generate 100 million switching trajectories in the stressed state of the magnet by solving Equations (9) and (10), again using a time step of 1 ps. This time the initial magnetization orientation for each of the 10^8 trajectories is chosen from the thermal distributions generated in the previous step with the appropriate weighting, since the RESET step always leaves the magnetization around state Ψ 1. The simulation is continued for 1.5 ns. We find that when the stress is either tensile (+10 MPa, corresponding to V M N = −V 0 /12) or compressive but weak (-10 MPa, corresponding to V M N = V 0 /12), the magnetization's polar angle returns to within 4° of Ψ 1 (θ = θ 1 = 46.8°) in 1.3 ns or less.
This procedure tells us that when the inputs to the logic gate are both low, or one is high and the other is low, the magnetization of the soft layer of the MTJ does not rotate and the MTJ resistance remains high with >99.999999% probability. This fulfills the corresponding requirements of the NAND gate with >99.999999% probability.
In the case of one input low and one input high, what prevents rotation from Ψ 1 to Ψ 0 is the energy barrier of 23.63 kT between these two states as discussed in the main paper.
This barrier is high enough to reduce the switching probability to below 10^−8.
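As a rough plausibility check on this statement, the Boltzmann factor for a barrier of 23.63 kT can be compared with the 10^−8 threshold. This is only an order-of-magnitude estimate under the assumption that the switching probability scales as exp(−E_b/kT). The figure quoted in the text comes from the stochastic LLG simulations, not from this estimate.

```python
import math

BARRIER_KT = 23.63      # energy barrier between Psi_1 and Psi_0, in units of kT (from the text)
THRESHOLD = 1e-8        # target upper bound on the switching probability

# ASSUMED crude estimate: probability ~ exp(-E_b / kT)
boltzmann_factor = math.exp(-BARRIER_KT)

print(f"exp(-{BARRIER_KT}) = {boltzmann_factor:.2e}")
print("below threshold:", boltzmann_factor < THRESHOLD)
```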
When both inputs are high, a compressive stress of -30 MPa is generated in the magnetostrictive magnet. Once again, we pick the initial orientations of the magnetization from the thermal distribution around Ψ 1, which is where the RESET step leaves the magnet, and generate 100 million switching trajectories as before. This time θ approaches within 4° of the final state Ψ 0 (θ = θ 0 = 133.2°) in 1.3 ns or less. We continue the simulation for an additional 0.2 ns to confirm that once the magnetization reaches the vicinity of Ψ 0, it settles around that orientation and does not return to the neighborhood of the initial orientation Ψ 1. We repeated this procedure 10^8 times and found that every single switching trajectory behaved in the above manner. Therefore, we conclude that when the inputs to the logic gate are both high, the magnetization of the soft layer of the MTJ does rotate and the MTJ resistance goes low with >99.999999% probability. That fulfills the remaining requirement of the NAND gate with >99.999999% probability.
[...] vicinity of Ψ 1 and arrives within 4° polar angle of Ψ 0 ) and P d gd (t) is the power dissipation, which can be expressed as [...], where τ eff (t) is the torque due to shape anisotropy, stress anisotropy and the magnetic field (the thermal torque does not dissipate energy). The energy dissipation differs from one switching trajectory to another, and we have found that the mean dissipation is 316 kT at room temperature. This calculation overestimates the energy dissipation slightly, but that only makes our figures conservative.
The next component is the C(V M N)^2 dissipation. We have electrodes of dimensions 100 nm × 100 nm and the thickness of the PZT layer is 100 nm. Thus, the capacitance between either electrode and the silicon substrate is C = 0.88 fF, assuming that the relative dielectric constant of PZT is 1000. The voltage V M N = ±V 0 /4 (magnitude 112.5 mV). Since we have a pair of electrodes, the dissipation will be roughly twice (1/2)C(V M N)^2. We calculated that value as 2688 kT.
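The capacitive figure can be retraced from the stated geometry with a parallel-plate estimate. The sketch below assumes a temperature of 300 K for the conversion to kT, and treats each electrode as an ideal parallel-plate capacitor with no fringing. It lands within about 1% of the quoted 2688 kT; the small difference comes from rounding the capacitance to 0.88 fF.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
KB = 1.380649e-23         # Boltzmann constant, J/K
TEMPERATURE = 300.0       # K, ASSUMED value for "room temperature"

EPS_R = 1000.0            # relative dielectric constant of PZT (from the text)
SIDE = 100e-9             # electrode side length, m (from the text)
THICKNESS = 100e-9        # PZT layer thickness, m (from the text)
V_MN = 0.1125             # |V_MN| = V_0 / 4, in volts (from the text)

# Parallel-plate capacitance of one electrode against the substrate.
capacitance = EPS0 * EPS_R * SIDE ** 2 / THICKNESS

# Two electrodes, each dissipating (1/2) C V^2.
energy_joule = 2 * 0.5 * capacitance * V_MN ** 2
energy_kt = energy_joule / (KB * TEMPERATURE)

print(f"C        = {capacitance * 1e15:.3f} fF")
print(f"2 x (1/2) C V^2 = {energy_kt:.0f} kT")
```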
The dissipation in the resistance R can be made negligible, since this resistance can be made arbitrarily high.
Finally, we have to calculate the maximum energy dissipation due to the bias current flowing through the MTJ stack during a switching action. Since the bias current flows continuously, it results in standby energy dissipation, which would be unacceptable in applications where the circuit is mostly dormant and wakes up to perform an operation infrequently (e.g. cell phones that wake up and perform a function only when a call or message is received). However, there are many applications where the circuit is constantly busy and seldom, if ever, in a standby mode (e.g. medical applications where the implanted device constantly monitors and processes signals). For such applications, standby dissipation is not a serious concern.
In order to calculate this energy dissipation, let us assume that for the sake of adequate noise margin, the bias current I BIAS can be no less than 1 pA. This restricts R high to V 0 /I BIAS = 0.45×10^12 ohms. The MTJ's resistance will increase super-linearly (almost exponentially) with the spacer layer thickness since current flows by tunneling through this layer, so the above resistance is not difficult to achieve. The resulting maximum energy dissipation V 0^2 t s /R low is 0.28 kT, which is negligible. Consequently, in a gate operation, the maximum energy dissipation is 3004 kT (the sum of 316 kT, 2688 kT and 0.28 kT), which is comparable to that of low-power CMOS-based NAND gates 17, but the latter is volatile while the present gate is not. The energy dissipation is almost two orders of magnitude smaller than that in other magnetic non-volatile logic gates 17.
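The bias-current term and the total can be checked in the same way. The sketch below assumes that t s is the 1.3 ns switching time from the simulation section and that kT is evaluated at 300 K; neither value is stated alongside the 0.28 kT figure. R low is taken as R high /2, following the 2:1 resistance ratio.

```python
KB = 1.380649e-23       # Boltzmann constant, J/K
TEMPERATURE = 300.0     # K, ASSUMED value for "room temperature"

V0 = 0.45               # V (from the text)
I_BIAS = 1e-12          # A, minimum bias current (from the text)
T_SWITCH = 1.3e-9       # s, ASSUMED: switching time from the LLG simulations

r_high = V0 / I_BIAS    # ohms
r_low = r_high / 2.0    # ohms, from the 2:1 resistance ratio

# Maximum dissipation in the MTJ stack during one switching event.
e_bias_kt = (V0 ** 2) * T_SWITCH / r_low / (KB * TEMPERATURE)

E_INTERNAL_KT = 316.0   # mean internal dissipation (from the text)
E_CV2_KT = 2688.0       # capacitive dissipation (from the text)
total_kt = E_INTERNAL_KT + E_CV2_KT + e_bias_kt

print(f"R_high = {r_high:.2e} ohm")
print(f"bias dissipation = {e_bias_kt:.2f} kT")
print(f"total = {total_kt:.0f} kT")
```

The capacitive term accounts for roughly 90% of the total, so any reduction in V 0 or in electrode capacitance would have the largest effect on the energy budget.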

FIG. 1. Structure of a NAND gate. (a) The PZT film has a thickness of ∼100 nm and is deposited on a conducting n+-Si substrate. It is poled with an electric field in the direction shown.

FIG. 2. Potential energy profiles of the magnetostrictive layer in Fig. 1 as a function of its magnetization orientation. Energy plot as a function of polar angle (θ) of the magnetization vector, where the red line is for the unstressed magnet, the green line is for the compressively stressed magnet (-30 MPa), and the blue line is for the expansively stressed magnet (+30 MPa).

FIG. 3. Potential energy profiles of the magnetostrictive layer in Fig. 1 for different logic inputs. Energy plot as a function of polar angle (θ) of the magnetization vector. The RESET operation brings the magnetization to state B where the magnetization is oriented along Ψ 1 and the MTJ resistance is high. During logic operation, when both inputs are low, the magnet is under small tensile stress (+10 MPa) and the global energy minimum shifts slightly to B′ (B ≈ B′).
b. Non-linearity: Since the MTJ resistance has only two values (high and low), the gate is inherently non-linear 3.
c. Isolation between input and output: The output voltage cannot change the input voltage levels in any way. This results in isolation.

FIG. 4. Transfer characteristic in the inverter mode.Shorting the two inputs of a NAND gate makes it an inverter.Plot of V out versus V in of the inverter at room temperature, where the V out values have been thermally averaged.

Ψ 1 (θ = θ 1 = 46.8°) and Ψ 0 (θ = θ 0 = 133.2°), leaving a separation angle γ of 86.3° (Fig. S2) between them. Thermal noise, however, will make the magnetization of the soft magnet fluctuate around these two orientations. In order to determine the thermal distribution around Ψ 1 (which is where the RESET operation leaves the magnetization), we solve the last two equations in the absence of any stress by starting with the initial state θ = 46.8° and φ = 90° and obtaining the final values of θ and φ by running the simulation for a long time. This process is repeated for 100 million switching trajectories. A histogram is then generated from these 100 million switching trajectories for the final values of θ and φ, which yields the thermal distribution around Ψ 1.
[...] of Ψ 1 (θ = θ 1 = 46.8°) in 1.3 ns or less for every one of the 10^8 trajectories. After 1.3 ns, the stress is removed abruptly and the simulation is continued for an additional 0.2 ns to ensure that the final state does not change. It did not change for any of the 10^8 trajectories.