Cryogenic Memory Architecture Integrating Spin Hall Effect based Magnetic Memory and Superconductive Cryotron Devices

One of the most challenging obstacles to realizing exascale computing is minimizing the energy consumption of L2 cache, main memory, and interconnects to that memory. For promising cryogenic computing schemes utilizing Josephson junction superconducting logic, this obstacle is exacerbated by the cryogenic system requirements that expose the technology’s lack of high-density, high-speed and power-efficient memory. Here we demonstrate an array of cryogenic memory cells consisting of a non-volatile three-terminal magnetic tunnel junction element driven by the spin Hall effect, combined with a superconducting heater-cryotron bit-select element. The write energy of these memory elements is roughly 8 pJ including the bit-select element, which is designed to add a minimum overhead power consumption of about 30%. Individual magnetic memory cells measured at 4 K show reliable switching with write error rates below 10⁻⁶, and a 4 × 4 array can be fully addressed with bit select error rates of 10⁻⁶. This demonstration is a first step towards a full cryogenic memory architecture targeting energy and performance specifications appropriate for applications in superconducting high performance and quantum computing control systems, which require significant memory resources operating at 4 K.


Architecture Description
The cryogenic memory system we developed includes a memory array and superconducting control circuits. The latter are currently fabricated on separate chips, which can be connected to the memory array by wire-bonding or Multi-Chip Module (MCM) technology. The control circuits, designed with single flux quantum (SFQ) technologies, can supply triggering signals of ~100 μA into a load of a few Ohms. Here we limit our discussion to the design and performance of the main memory array, which targets compatibility with the specifications of these control chips. Each memory cell consists of a SHE-MTJ device in parallel with a bit-select hTron device. The addressing architecture was based on a design described in ref. 24. The SHE-MTJ, as depicted in Fig. 1(a), is a three-terminal device consisting of an MTJ patterned on top of a spin Hall channel. The MTJ itself is an elliptical nanopillar consisting of two ferromagnetic layers separated by a thin tunnel barrier: the reference layer's (RL) magnetization is rigidly pinned, while the free layer's (FL) magnetization can be switched parallel (P) or antiparallel (AP) to that of the RL. One bit of information is encoded in this non-volatile relative orientation: the tunneling magnetoresistance (TMR) effect gives rise to two easily distinguishable resistance states, R_P and R_AP, for the two orientations. In our measurements, memory readout is performed by monitoring the MTJ voltage V_MTJ while applying a small 5 μA sense current through the MTJ.
[Figure 1, partial caption: The arrows indicate the current paths during the writing process of cell (2,2). (e) Micrograph of the memory array. (f) SEM image of cell (2,2) in the array.]
To switch the magnetization of the FL, i.e., to change the MTJ state, a current is applied in the spin Hall channel. The SHE 25-27 of the channel induces spin accumulations on the surfaces of the channel, in turn enacting magnetization reversal of the FL via the spin (transfer) torque 21,28,29 as spins diffuse into the ferromagnet. The low switching current and confinement of the spin-torque switching mechanism within the device (no external magnetic fields, with their detrimental impact on superconductive circuits, are required) make the SHE-MTJ suitable for integration with superconducting structures.
An issue with the SHE-MTJ, however, is that its characteristic impedance and switching currents are too large to be directly compatible with our separately fabricated SFQ control circuits. For example, a 300 nm wide, 5 nm thick spin Hall channel requires a switching current of roughly 1 mA into a 0.5 kΩ load, which is incompatible with typical SFQ circuit output impedance of a few Ohms.
To resolve this problem, we use a heater-cryotron (hTron) bit-select element connected in parallel with the SHE-MTJ channel, which acts as a three-terminal switch and can also be engineered to have a better impedance match to the SHE-MTJ. The hTron 12 is a non-contact variation of the nano-cryotron 23 in which heat generated in a gate nanowire suppresses the critical current of an adjacent superconducting channel below its nominal operating current, thereby switching it into the normal state. Our implementation uses a nanowire gate constriction made from the same material and placed in the same plane and very near to the main channel, as illustrated in Fig. 1(b). A small triggering pulse (green arrow) is applied to the gate constriction, sending it into the normal state and creating a hotspot because of dissipative current flow. The lateral heat flowing away from this hotspot suppresses the critical current of the nearby channel and causes the channel to enter the normal state if it is biased (blue arrow). The resistance of the channel in the normal state can be engineered to reach multiple kilo-Ohms while carrying in excess of 1 mA. Our primitive memory cell is formed by placing the hTron in parallel with a SHE-MTJ channel, as depicted in Fig. 1(c). Triggering the hTron presents a large enough impedance that current redirects through the SHE channel and switches the MTJ's state. In this manner, the memory element can be controlled by a small hTron gate pulse (100 µA) sourced by a line-driver hTron which can be interfaced directly with the SFQ peripheral circuitry. The use of an hTron or an equivalently high isolation device is particularly important in the interior of the memory arrays: sneak currents or leakage current from writing to a cell can otherwise cause redistribution of current within the surrounding circuitry, since in the inactive state all lines are superconducting. This can create multiple current loops in which leakage current accumulates.
This effect is mitigated by inclusion of a bit select element that has high isolation. In addition, when an hTron is inactive, current simply flows through the superconducting path in the memory cell to subsequent cells in the columns and is recycled elsewhere with no concerns of other leakage paths.
Scaling to a device array is accomplished by chaining these memory cells into many parallel columns. The schematic of a 4 × 4 memory array is shown in Fig. 1(d). To address the individual memory cells, the array is connected to peripheral row and column drivers, which are also hTron devices. The writing procedure is as follows: All row and column driver hTrons are biased with currents, which are initially shunted to ground through the driver channels. To write to a cell, a triggering pulse is applied to the corresponding column driver hTron (dashed blue arrow applied at I_B2). The column driver becomes normal and consequently the bias current is diverted down the column of cells and flows through all bit-select hTrons in the column (solid blue arrow), which are still superconducting. Any combination of columns can be activated, with each column constituting a bit in the 4-bit word to be written. Next, one row driver hTron is triggered (dashed green arrow), diverting its bias current through the gates of all bit-select hTrons in that row (solid green arrow). For those bit-select hTrons which have both the current from the column driver flowing through their channel and receive a gate current from the row driver, the triggering condition is fulfilled, diverting sufficient current to the SHE-MTJ channel to produce a change in magnetic state (orange arrow).
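The coincidence condition in this write procedure can be sketched in a few lines of Python. This is only an illustrative logic model, not the authors' control software: the function name `written_cells` and the 0-based indexing are assumptions (the text labels cells from 1), and the polarity of the column bias, which sets whether P or AP is written, is not modelled.

```python
from typing import List, Set, Tuple

def written_cells(word: List[int], row: int, n_rows: int = 4) -> Set[Tuple[int, int]]:
    """Return the (row, column) cells whose bit-select hTron triggers.

    word[c] == 1 means the driver of column c was triggered, so the column
    bias current flows through every bit-select channel in that column.
    `row` is the single row driver triggered afterwards, which sends gate
    current to every bit-select hTron in that row.
    """
    if not 0 <= row < n_rows:
        raise ValueError("row index out of range")
    triggered = set()
    for r in range(n_rows):
        for c, bit in enumerate(word):
            channel_biased = bit == 1   # column current present in the channel
            gate_driven = r == row      # row current present in the gate
            if channel_biased and gate_driven:
                triggered.add((r, c))
    return triggered

# Columns 1 and 3 active, row 1 triggered (0-based indices).
print(sorted(written_cells([0, 1, 0, 1], row=1)))  # [(1, 1), (1, 3)]
```

Cells in an active column but an untriggered row carry the column current without dissipation, which is the current-recycling behavior described above.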
The micrograph of the 4 × 4 memory array is shown in Fig. 1(e). A scanning electron microscopy (SEM) image of cell (2,2) in the array is shown in Fig. 1(f). See Methods for details on the sample multilayer and fabrication process.
Optimizing for power efficiency. Because current is shared between the channels of the SHE-MTJ and hTron bit-select elements in the memory cell during writing events, it is important to match their impedances to minimize power consumption. For fixed SHE-MTJ parameters, the problem becomes determining an hTron design which minimizes dissipation while still providing enough current to switch the MTJ. This means that the normal-state resistance of the hTron should be maximized, so that most of the current will be diverted into the SHE-MTJ channel, with the following two constraints: (i) The column bias current must be lower than the critical current of the hTron channel, to ensure that the bit-select hTron channel is superconducting before its gate is triggered. (ii) The current through the hTron channel during the writing event must be large enough to sustain the hotspot in the channel 30.
Solving straightforward inequalities (see Methods), the optimal width and length of the hTron are found to be: [equations missing] where I_SH is the SHE-MTJ switching current and R_SH its channel resistance, which is roughly 600 Ω for our devices; J_c, d, R_sq and α are the hTron channel's critical current density, thickness, sheet resistance, and heat transfer coefficient per unit area, respectively; and ΔT = T_c − T_s is the difference between the hTron critical temperature T_c and the sample temperature T_s. In this optimal design, the overhead power consumption added by the hTron is then (see Methods): P_overhead = [expression missing], which is fixed by the hTron's material properties and system temperature, independent of SHE-MTJ parameters. This result implies that we can always design the bit-select hTron to accommodate the impedance of a particular SHE-MTJ device with a fixed overhead power. Using typical values for our NbN film (not fully optimized for this application), α = 220 kW/(m²·K), R_sq = 100-140 Ω/sq, T_c = 12.8 K, T_s = 3.6 K, J_c = 25-30 GA/m², the minimal energy overhead P_overhead is 30(6)%. Estimated using the results from nanosecond pulse switching of a standalone SHE-MTJ device, described in the next section, the typical switching energy for a SHE-MTJ device is roughly 6 pJ and thus that of a memory cell is about 8 pJ per switching event.
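The expressions referred to above can be sketched from the two stated constraints. What follows is a reconstruction under stated assumptions, not the authors' published equations: the symbols I_col (column bias current), W and L (channel width and length) and K_h (minimum sheet current density that sustains the hotspot) are introduced here, and the factor of 2 in the hotspot condition is an assumption taken from the standard long-bridge self-heating hotspot model.

```latex
% Assumed symbols: I_col (column bias current), W and L (channel width, length),
% J (hTron current density while writing), K_h (hotspot-sustaining sheet current).
\begin{align}
  I_{\mathrm{col}} &\le J_c W d, \\
  J d &\ge K_h \equiv \sqrt{2\alpha\,\Delta T / R_{sq}}
      \quad\text{(the factor 2 is an assumption)}, \\
  I_{\mathrm{col}} &= I_{SH} + J W d, \qquad
  I_{SH} R_{SH} = (J W d)\,\frac{R_{sq} L}{W}.
\end{align}
% The overhead P_hTron / P_SH = J W d / I_SH is smallest when both
% inequalities are tight:
\begin{align}
  W &= \frac{I_{SH}}{J_c d - K_h}, &
  L &= \frac{I_{SH} R_{SH}}{R_{sq} K_h}, &
  P_{\mathrm{overhead}} &= \frac{J W d}{I_{SH}} = \frac{K_h}{J_c d - K_h}.
\end{align}
```

Under these assumptions P_overhead depends only on film properties and temperature, as the text states. With α = 220 kW/(m²·K), ΔT = 9.2 K, R_sq = 120 Ω/sq, d = 30 nm and J_c = 27.5 GA/m², one finds K_h ≈ 184 A/m and P_overhead ≈ 29%, in line with the quoted 30(6)%; with I_SH = 1 mA and R_SH = 600 Ω the resulting channel is similar in size to the 1.7 μm × 25.5 μm design.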

Measurements and Results
Standalone SHE-MTJ device. As described above, our memory element, the spin-Hall-effect based magnetic tunnel junction (SHE-MTJ), consists of an MTJ atop a spin Hall channel, as illustrated in Fig. 1(a). A writing (electrical) current in the channel induces, via the SHE, a transverse spin accumulation that exerts a torque on the free layer (FL), thereby switching its magnetization back and forth. The conventional figure of merit representing the strength of the SHE is the spin Hall angle, which is the ratio of the induced spin current to the applied electrical current. Energy-efficient switching devices require metallic materials possessing very strong SHE such as Pt 31, β-Ta 21, β-W 32 and their alloys 33-35, whose (absolute) spin Hall angles are in the range of 0.1-0.35.
SHE-MTJ devices in our memory arrays employ a Pt₈₅Hf₁₅ alloy channel with spin Hall angle ~0.2. This was an early design decision made with fabrication stability in mind, and without the benefit of subsequent materials advances. As seen in the SHE-MTJ stack illustrated in Fig. 2(a), a thin 0.5 nm Hf layer is inserted between the channel and the MTJ to suppress the Gilbert magnetic damping caused by spin pumping 36. Further device details are given in the Methods section.
We find a robust SHE-MTJ response, consistent with that seen in other devices of similar composition 37. The tunneling magnetoresistance (TMR) response to an easy-axis magnetic field is shown in Fig. 2(b). The MTJ switches between the parallel (P, low resistance) and antiparallel (AP, high resistance) states with a 20 mT coercivity (loop half-width) and a small offset field of −5 mT. Bi-stability at zero external field is thus achieved: a mandatory criterion given the flux-trapping propensity of the superconducting peripheral circuitry. Another criterion is satisfied in that the AP and P resistance states are easily discernible using a simple superconducting comparator.
The nanosecond pulse switching performance of the SHE-MTJ device is gauged using the same measurement setup and technique outlined in our previous reports 18,37. The switching probability is shown in Fig. 2(c) for the two switching polarities as a function of pulse duration and amplitude. We fit the pulse amplitudes I_50% and pulse durations t_50% along the 50% probability boundary (blue dots) with the macrospin model for ballistic reversal (blue curves) 38,39 to obtain the critical current I_0 and characteristic reversal time t_0. For P to AP switching we find I_0 = 0.9 mA and t_0 = 0.7 ns, while for AP to P switching we find I_0 = 0.9 mA and t_0 = 1.5 ns. These t_0 values are similar to previous measurements (at 4 K) on Pt channel devices 18, while the I_0 values are about two times lower, which is consistent with the two times higher spin Hall angle of Pt₈₅Hf₁₅ when compared to pure Pt 37. With the performance of the memory element itself confirmed, we next verify the ability of our hTron select element to redirect its channel current into the SHE-MTJ channel to actuate switching.
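A fit of the kind described for the 50% boundary can be sketched as follows. The functional form I_50%(t) = I_0 (1 + t_0/t) is an assumption here (it is a form commonly used for macrospin ballistic reversal; the text does not state the expression), the function name `fit_macrospin` is hypothetical, and the data points are synthetic rather than measured.

```python
from typing import Sequence, Tuple

def fit_macrospin(durations_ns: Sequence[float],
                  currents_mA: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of I = I0 * (1 + t0 / t), which is linear in x = 1/t.

    Returns (I0 in mA, t0 in ns). The model form is an assumption.
    """
    xs = [1.0 / t for t in durations_ns]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(currents_mA) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, currents_mA))
    slope = sxy / sxx                    # equals I0 * t0
    intercept = mean_y - slope * mean_x  # equals I0
    return intercept, slope / intercept

# Synthetic points generated from I0 = 0.9 mA, t0 = 0.7 ns (illustrative only).
ts = [0.5, 1.0, 2.0, 5.0, 10.0]
currents = [0.9 * (1 + 0.7 / t) for t in ts]
i0, t0 = fit_macrospin(ts, currents)
print(f"I0 = {i0:.3f} mA, t0 = {t0:.3f} ns")
```

Writing the model as a straight line in 1/t keeps the sketch dependency-free; a nonlinear fitter would be the natural choice for noisy measured boundaries.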
Standalone hTron device. The heater cryotron (hTron) devices shown in Fig. 1(b) are arranged in parallel with SHE-MTJ devices as shown in Fig. 1(c). A current pulse of 100 µA is supplied to the hTron gate, ultimately sourced by an SFQ-DC or SFQ-RO (relaxation oscillator) driver, sending the gate into the normal state and producing sufficient Ohmic heating to trigger hotspot formation in the hTron's current-carrying channel situated 100 nm away. The supercurrent is impeded by the emergence of a ~1 kΩ dissipative impedance of the normal-state NbN, and a large fraction of the current is redirected into the SHE-MTJ channel where it switches the memory element. To minimize the overhead power consumption of the bit-select hTron (see analysis in the previous section), the hTron channel was designed to be 1.7 μm × 25.5 μm, which gives an optimal overhead energy consumption of about 31%.
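The link between the channel geometry and the quoted overhead can be checked with a simple parallel-branch estimate. This is a rough sketch, not the authors' analysis: it assumes the whole channel length turns normal (an upper bound on the hTron resistance) and treats the overhead as the ratio of power dissipated in the hTron to power dissipated in the SHE channel at equal voltage. The function names are hypothetical.

```python
def htron_normal_resistance(r_sq_ohm: float, length_um: float, width_um: float) -> float:
    """Normal-state resistance, assuming the entire channel turns normal."""
    return r_sq_ohm * length_um / width_um

def overhead_fraction(r_sh_ohm: float, r_htron_ohm: float) -> float:
    """P_hTron / P_SH for two parallel branches sharing the same voltage."""
    return r_sh_ohm / r_htron_ohm

R_SH = 600.0             # ohm, SHE-MTJ channel resistance quoted in the text
W_UM, L_UM = 1.7, 25.5   # um, hTron channel dimensions quoted in the text

for r_sq in (100.0, 120.0, 140.0):  # sheet-resistance range quoted for the NbN film
    r_ht = htron_normal_resistance(r_sq, L_UM, W_UM)
    eta = overhead_fraction(R_SH, r_ht)
    i_col_mA = 1.0 * (1.0 + eta)    # column current needed for 1 mA in the SHE channel
    print(f"R_sq={r_sq:.0f} ohm/sq  R_hTron={r_ht:.0f} ohm  "
          f"overhead={eta:.0%}  I_col={i_col_mA:.2f} mA")

# Cell energy from the quoted 6 pJ SHE-MTJ switching energy and 31% overhead.
cell_energy_pJ = 6.0 * (1.0 + 0.31)
print(f"cell energy ~ {cell_energy_pJ:.1f} pJ")
```

The channel is 15 squares long, so the quoted sheet-resistance range brackets the 31% figure, and 6 pJ plus 31% gives roughly the 8 pJ per cell quoted earlier.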
Before studying integrated devices, we performed quasi-static switching measurements on one standalone hTron device. As shown in Fig. 3(a), we biased the hTron channel with a square pulse (blue) and the gate with a triangular pulse (red), while monitoring their respective voltages (right axis). The voltages remain at the noise floor until the gate current reaches the critical value of 87 μA, beyond which the gate becomes normal (resistive). If the channel current is high enough, as is the case in Fig. 3(a), the channel also becomes normal. Beyond t = 0.3 ms, the resistive state of the channel is maintained by the channel current (below its critical value) even though the gate current has been turned off. This important latching behavior is a result of thermal run-away in the channel as its resistance increases and allows the triggering pulse to the gate to be short and of low amplitude.
Scientific Reports | (2020) 10:248 | https://doi.org/10.1038/s41598-019-57137-9 | www.nature.com/scientificreports
We repeated the measurement in Fig. 3(a) 100 times and measured the distribution of current required to turn the gate resistive, shown in Fig. 3(b) along with the cumulative switching probability. We acquired repeated histograms of this type for a range of channel currents, and the resulting switching probability is shown in the left panel of Fig. 3(c). The clear and orthogonal switching boundaries show that the gate and channel of the hTron are galvanically isolated, as expected.
We next performed a similar experiment but this time monitored the switching (to the normal state) of the hTron channel, and obtained the results shown in the right panel of Fig. 3(c). Comparing the two panels of Fig. 3(c) reveals an interesting correlation: For |I_gate| > 90 μA, the gate always turns normal, but the channel state does not necessarily follow. Instead, the boundaries around I_channel = 0.2 mA and |I_gate| = 90 μA are curved. This difference from the orthogonal boundaries seen in the left panel is due to the relative sizes of the hotspots created in the channel and the gate.

Memory cell. To perform switching experiments on a memory cell, as shown in Fig. 1(c), we used external pulse generators to provide bias and triggering current pulses via reference resistances of 5-10 kΩ, as shown in Fig. 4(a). To measure the MTJ resistance, we applied a DC bias current of 5 μA into the top of the MTJ and monitored its voltage drop.
We performed microsecond pulse switching measurements of the representative memory cell (2,2) in the array at 4 K without an external magnetic field (see Fig. 1(d)). In these experiments, we biased the 2nd row driver hTron with a 1 mA pulse and applied a trigger pulse of 100 μA to the gate. The gate trigger level of the 2nd column driver hTron was also 100 μA, but its channel bias was varied between −1.5 and 1.5 mA. Figure 4(b) shows the traces of the input and sample voltages read from an oscilloscope (for column bias = −1.5 mA). The MTJ state is determined by the sign of ∆V_MTJ, which is the MTJ voltage drop minus its median value of 53 mV.
As shown in Fig. 4(b), for time t < 2 μs, the column bias current flows through the superconductive hTron channels of the 2nd column and a small sensing resistance at the bottom, resulting in a small voltage drop. At t = 2 μs, the row driver is triggered, and after a 2.5 μs delay (due to the few-meter-long coaxial cables running from the chip package in the cryostat to the room-temperature electronics), at t = 4.5 μs the row driver channel builds up enough current to trigger the bit-select hTron, resulting in a rapid rise in the column voltage. This result confirms the intended operation of the cell and driver hTrons.
The response of ∆V_MTJ to the column bias is shown in Fig. 4(c). When the 6 μs long column bias current (orange line) is negative, the triggering of the bit-select hTron causes the MTJ voltage to relax to the AP state (∆V_MTJ > 0). Conversely, when the column bias current is positive, the MTJ voltage relaxes to its P state (∆V_MTJ < 0). This clearly demonstrates the successful control of the memory cell (2,2) in the array. We note that the long rise and fall times observed in Fig. 4(b,c) result from the use of slow external electronics, and are not characteristic of the underlying devices.
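The readout rule used in these traces can be written down compactly. The sketch below is illustrative only: the function names are hypothetical and the example voltages in the usage lines are made up; only the 5 μA sense current, the 53 mV median and the polarity rule come from the text.

```python
SENSE_CURRENT_A = 5e-6   # sense current quoted in the text
MEDIAN_V = 53e-3         # median MTJ voltage quoted in the text

def mtj_state(v_mtj: float, median_v: float = MEDIAN_V) -> str:
    """Classify the MTJ state from the sign of dV = V_MTJ - median."""
    delta_v = v_mtj - median_v
    if delta_v > 0:
        return "AP"  # high-resistance state
    if delta_v < 0:
        return "P"   # low-resistance state
    return "undetermined"

def expected_state_after_write(column_bias_mA: float) -> str:
    """Polarity rule reported for cell (2,2): negative bias -> AP, positive -> P."""
    if column_bias_mA == 0:
        raise ValueError("zero bias does not write")
    return "AP" if column_bias_mA < 0 else "P"

# Hypothetical readings, for illustration only.
print(mtj_state(0.060), mtj_state(0.045))
print(expected_state_after_write(-1.5), expected_state_after_write(1.5))
print(MEDIAN_V / SENSE_CURRENT_A)  # median MTJ resistance in ohms
```

The last line shows that a 53 mV median at 5 μA corresponds to a median MTJ resistance of about 10.6 kΩ.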
We determined the write-error-rate (WER) of the cell by repeating the switching experiment 10³-10⁶ times, with each attempt preceded by resetting the MTJ to its original state by applying a high current bias to the column. The resulting behavior, shown in Fig. 5(a), is very similar in shape to the previously reported WER of single standalone SHE-MTJs 18,22. Note that the open symbols near the bottom of the plot indicate that no write error was observed in 1 million switching attempts, implying that the WER is below 10⁻⁶ for bias currents of 1.0 mA (P to AP) and 1.5 mA (AP to P).
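One way to attach a confidence level to a zero-error observation is the exact binomial bound sketched below. This is not part of the analysis in the text: it assumes independent, identically distributed write attempts, and the function name is hypothetical.

```python
def zero_failure_upper_bound(n_trials: int, confidence: float = 0.95) -> float:
    """Largest error rate p still consistent, at the given confidence, with
    observing zero errors in n_trials independent attempts.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    if n_trials <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need n_trials > 0 and 0 < confidence < 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)

bound = zero_failure_upper_bound(1_000_000)
print(f"{bound:.2e}")  # 3.00e-06
```

Under these assumptions, zero errors in 10⁶ attempts puts the observed rate below 1/N = 10⁻⁶, while the 95% upper confidence bound is about three times that value (the familiar "rule of three").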
We characterized the intercell crosstalk by monitoring the MTJ voltage of cell (2,2) while triggering its four neighboring cells. Figure 5(b) shows the response in ∆V_MTJ of cell (2,2) when the adjacent cell (2,1) was triggered. A voltage rise of about 3 mV is observed during triggering events, which translates to roughly 10 μA of leakage current in the SHE-MTJ channel of cell (2,2). This leakage current is too small to affect the state of cell (2,2), which remains unchanged in the AP state. We repeated the experiment for all combinations of the original state of cell (2,2), the bias current polarity, and the neighboring cell being switched (each of the four), with 10⁴ switching attempts for each case. No unintended switching of cell (2,2) was observed.
Array triggering fidelity. We then investigated the triggering fidelity (success rate) of the 4 × 4 memory array by writing all possible 4-bit words to each row and detecting whether the targeted bit-select hTrons triggered. As illustrated in Fig. 6(a), the measurement procedure is as follows: The column and row driver hTron channels are biased with long current pulses. For each bit in the word to be written, the corresponding column driver is activated with a 1 μs long pulse on its gate. The voltage at the top of each column, V_pre, is measured to make sure that none of the bit-select hTrons have been accidentally activated before the row driver. One of the four row drivers is then activated with a 1 μs long current pulse, and the column voltages V_1-4 are measured. A write is counted as successful when three criteria are met: (i) All column voltages fall below a threshold before the row is triggered: V_pre < V_min. (ii) All column voltages V_1-4 for unwritten bits fall below a threshold value after the row is triggered: V_1-4(unwritten) < V_min. (iii) All column voltages V_1-4 for written bits fall within a specified interval after the row is triggered: V_min < V_1-4(written) < V_max.
Figure 6(b) shows the measured triggering fidelity with respect to column bias current and the trigger current used to activate both row and column drivers. The fidelity is assessed from 640 written words per pixel, encompassing all 16 possible bit patterns, written ten times to each row. We found that the fidelity is nearly 100% for a wide range of parameter values. Here we used V_min = 400 mV, V_max = 1 V, and a row bias current of 0.8 mA, but found that the measured fidelity did not depend critically on any of these values.
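The three criteria can be expressed as a short pass/fail check. This is a sketch of the stated rules, not the authors' analysis code: the function names and the example voltages are hypothetical, while the thresholds V_min = 400 mV and V_max = 1 V are the values quoted above.

```python
from typing import Sequence

V_MIN = 0.4  # volts, threshold quoted in the text
V_MAX = 1.0  # volts, upper limit quoted in the text

def word_written_ok(word: Sequence[int], v_pre: Sequence[float],
                    v_post: Sequence[float],
                    v_min: float = V_MIN, v_max: float = V_MAX) -> bool:
    """Apply criteria (i)-(iii) to one write attempt.

    word[c] is the intended bit for column c; v_pre[c] and v_post[c] are the
    column voltages before and after the row driver is triggered.
    """
    if not (len(word) == len(v_pre) == len(v_post)):
        raise ValueError("word and voltage lists must have equal length")
    for bit, before, after in zip(word, v_pre, v_post):
        if before >= v_min:                            # (i) nothing triggered early
            return False
        if bit == 0 and after >= v_min:                # (ii) unwritten bit stays low
            return False
        if bit == 1 and not (v_min < after < v_max):   # (iii) written bit in window
            return False
    return True

def fidelity(results: Sequence[bool]) -> float:
    """Fraction of successful writes."""
    return sum(results) / len(results)

# Hypothetical voltages for the word 1010, for illustration only.
ok = word_written_ok([1, 0, 1, 0],
                     v_pre=[0.01, 0.01, 0.02, 0.01],
                     v_post=[0.7, 0.02, 0.8, 0.01])
print(ok)  # True
```

Each pixel of the fidelity map would then be the `fidelity` of the pass/fail results collected at that combination of bias and trigger currents.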
As previously determined, a column current of ~1 mA suffices to reliably change the state of a SHE-MTJ element, and SFQ circuitry is able to provide trigger pulses of ~100 μA. For this critical combination of parameters, highlighted with a red spot in Fig. 6(b), we benchmarked the writing fidelity in detail with 6.4 million writing cycles (distributed over all rows and bit patterns as above), which yielded a total of 6 errors. The bright area confined by the dotted line in Fig. 6(b) indicates the parameter space suitable for the operation of the SHE-MTJ elements and SFQ control pulses.
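The counts quoted for the fidelity map and the benchmark follow from simple arithmetic, sketched here with the numbers given in the text (the variable names are illustrative):

```python
ROWS, BITS, REPEATS = 4, 4, 10

patterns = 2 ** BITS                          # 16 possible 4-bit words
words_per_point = patterns * REPEATS * ROWS   # words written per map pixel
print(words_per_point)                        # 640

cycles, errors = 6_400_000, 6                 # benchmark at the highlighted point
error_rate = errors / cycles
print(f"{error_rate:.1e}")                    # 9.4e-07
```

Six errors in 6.4 million cycles corresponds to an error rate of about 1 × 10⁻⁶ per written word.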

Discussion
We have experimentally demonstrated the successful integration of SHE-based MRAM technology and the superconductive hTron bit-selects and drivers in a 4 × 4 memory array architecture which can be triggered by current signals as low as 100 μA, compatible with SFQ DC/RO control circuits. The array is fully addressed with an error rate as low as 1 × 10⁻⁶, while the write-error-rate of the individual memory cells is shown to be below 10⁻⁶. The bit-select hTron was designed to match the impedance of the SHE-MTJ channel, adding a minimal energy overhead of 31% to the switching energy, which is currently on the order of 6 pJ per switch.
Beyond the proof-of-concept demonstration of the cryogenic memory array architecture presented in this Report, we anticipate significant improvements to array speed and energy consumption. Because the minimum energy overhead added by the bit-select hTron is independent of SHE-MTJ parameters, the overall energy consumption of the memory cell scales with that of the SHE-MTJ element, which can be further decreased using newly discovered spin Hall materials 35,40, device structures 41 and better lithography technologies. For example, the SHE-MTJ channel can be replaced with materials such as Au₁₋ₓPtₓ alloys whose spin Hall angles approach 0.35 35 for a four-fold energy reduction. The SHE-MTJ channel resistance, currently about 0.6 kΩ, whose major contribution comes from the two vias providing electrical contact to the SHE channel (see Fig. 1(c)), can be reduced 2-3 times by replacing them with metallic vias. Another obvious improvement in design is scaling down the SHE-MTJ channel to reduce the switching current. Combining all these modifications in materials and design, we estimate the cell's switching energies could be as low as 0.1 pJ, more than 10 times less than that of cryogenic DRAM 10.
Besides SHE-MTJs, the same memory architecture can be applied to other types of MRAM, including the 2-terminal spin-transfer-torque (STT) MRAM or cryogenic spin valves 16,17. Despite its greater fabrication complexity, the 3-terminal structure of SHE-MTJs provides several advantages over STT-MRAMs: the read and write paths are separate and thus can be optimized independently; read-disturb errors (accidental writing when reading) are negligible (none have been observed so far); and durability is superior (no known wear-out mechanism). On the other hand, in terms of scalability and energy efficiency, spin valves can be the best option due to their all-metallic structure and hence very low impedance. However, their typically small magnetoresistance variation of less than 1% makes reliable read-out difficult. We anticipate that this problem may be resolved with future material development, such as the application of Heusler compounds 42.
There are also opportunities to improve the performance of the control elements. The hTron column and row drivers used in our prototype array, whose triggering delay is roughly 8 ns 12 due to the heat propagation across the 100 nm gap in the film plane, can be replaced with nTrons 23 with vanishing triggering delay owing to the galvanic connectivity of their gate and channel. Our hTron bit-selects can also be replaced with a stacked variant 15 in which the gate crosses over the channel, thereby providing more efficient heating and a smaller triggering current. These enhancements will result in faster, more energy-efficient, and more compact bit-select and driver elements.
By combining the high speed and non-volatility of the MRAM, the superconductivity of the select elements, and the compatibility with SFQ circuitry and fabrication, our cryogenic memory architecture is proven to be a fast and power efficient candidate for exascale superconducting computing and quantum control systems.

Methods
Samples. As illustrated in Fig. 2(a), the SHE-MTJ stack is as follows (thicknesses in nanometers): substrate/Pt₈₅Hf₁₅(4)/Hf(0.5)/FL/MgO/RL/Ru/PL/cap, where FL = 1.5 nm Fe₆₀Co₂₀B₂₀, and RL is the magnetic fixed layer pinned by the pinning layer PL in the synthetic antiferromagnetic structure RL/Ru/PL, which is tuned to minimize the stray dipole field at the FL so that the magnetization of the FL is bistable at zero bias field. The stack was patterned into SHE-MTJ devices with a 300 nm wide channel and a 75 nm × 190 nm nanopillar MTJ by deep-UV and e-beam lithography and argon ion milling.
After fabrication of the SHE-MTJ devices, a 30 nm thick NbN layer was deposited at ambient temperature and patterned by photo- and electron-beam lithography. Superconducting structures are designed using conformal mapping curves 43 to prevent current crowding at corners. The bit-select hTron channel size was 1.7 μm × 25.5 μm, which was optimized to achieve the minimal energy overhead of 31%. A second level of metallic interconnect was provided by e-beam evaporated aluminum wires and vias and PECVD SiN dielectric. Micrographs of the array and a representative cell are shown in Fig. 1(e,f).
Impedance matching. To minimize the overhead power consumption added to the SHE-MTJ switching by the bit-select hTron, the normal-state resistance of the hTron should be maximized, within the following two constraints: (i) The column bias current must be lower than the critical current of the hTron channel: [equation missing] where J_c, W, d are the critical current density, width and thickness of the hTron channel. (ii) The current through the hTron channel during the writing event must be large enough to maintain the hotspot in the channel 25: [equation missing] where J is the current density in the hTron channel during writing, R_sq and α are its sheet resistance and heat transfer coefficient per unit area in the normal state, and ΔT = T_c − T_s is the difference between the hTron critical temperature T_c and the sample temperature T_s.
Combining the above two conditions and the relation [equation missing] where I_SH is the SHE-MTJ switching current, we obtain: [equations missing]
Measurements. All measurements presented in this Report were performed at the base temperature of 3.6 K in cryogen-free cryostats. Characterizations of stand-alone SHE-MTJ devices (shown in Fig. 2) were done using the same HPD cryostat and control electronics as described in our previous report 18. Measurements and data collection procedures were performed using our custom-built Python package 7, Auspex, available online at https://github.com/BBN-Q/Auspex.