Experimental demonstration of an on-chip p-bit core based on stochastic magnetic tunnel junctions and 2D MoS₂ transistors

Daniel, John; Sun, Zheng; Zhang, Xuejian; Tan, Yuanqiu; Dilley, Neil; Chen, Zhihong; Appenzeller, Joerg

doi:10.1038/s41467-024-48152-0

Article
Open access
Published: 15 May 2024

Nature Communications volume 15, Article number: 4098 (2024)

Abstract

Probabilistic computing is a computing scheme that offers a more efficient approach than conventional complementary metal-oxide–semiconductor (CMOS)-based logic in a variety of applications ranging from optimization to Bayesian inference, and invertible Boolean logic. The probabilistic bit (or p-bit, the base unit of probabilistic computing) is a naturally fluctuating entity that requires tunable stochasticity; by coupling low-barrier stochastic magnetic tunnel junctions (MTJs) with a transistor circuit, a compact implementation is achieved. In this work, by combining stochastic MTJs with 2D-MoS₂ field-effect transistors (FETs), we demonstrate an on-chip realization of a p-bit building block displaying voltage-controllable stochasticity. Supported by circuit simulations, we analyze the three transistor-one magnetic tunnel junction (3T-1MTJ) p-bit design, evaluating how the characteristics of each component influence the overall p-bit output. While the current approach has not reached the level of maturity required to compete with CMOS-compatible MTJ technology, the design rules presented in this work are valuable for future experimental implementations of scaled on-chip p-bit networks with reduced footprint.

Introduction

Computing is at a crossroads: just as the transistor-scaling driven by Moore’s Law has afforded improvements in conventional complementary metal-oxide–semiconductor (CMOS)-based computing performance, there is an inevitable slowing down due to fundamental device limits¹. Furthermore, the inherently deterministic nature of conventional computing makes the current CMOS model unsuitable for contending with the continued future growth of applications such as in neuromorphic computing and Artificial Intelligence (AI)².

A superior approach is that of probabilistic computing. In probabilistic computing, the key component is the probabilistic bit (or p-bit), a unit that fluctuates randomly, but controllably, between 0 and 1³. Indeed, a network of such p-bits can leverage their stochastic nature to function as efficient hardware accelerators for solving complex problems that are themselves inherently probabilistic. These problems, which lie at the core of many real-world machine learning applications and algorithms of AI, range in nature from combinatorial optimization problems (such as integer factorization) to recognition and classification^{4,5,6,7,8,9,10,11,12,13,14,15,16}.

At its core, a p-bit requires a tunable stochastic element. Although this can be implemented with standard CMOS technology^{17,18,19}, doing so incurs a significant device overhead: the resulting p-bit has a large areal and energy footprint and does not offer true randomness²⁰.

An ultra-compact approach for tunable randomness that yields the desired sigmoidal-shaped input/output characteristics, which is scalable and energy-efficient, is achieved by exploiting the physics of low-barrier fluctuating nanomagnets when coupled with existing magnetic tunnel junction (MTJ) technology. Such p-bit implementations using stochastic MTJs have been shown^{21,22,23,24,25}, but as yet, the proof-of-concept implementations used alternate designs to the 3T-1MTJ p-bit structure that made it necessary to employ field-programmable gate arrays (FPGAs) or external circuitry, with orders of magnitude more transistors involved than required in the p-bit design explored in this work.

In this work, an on-chip demonstration of the core of a p-bit, exhibiting tunable stochasticity, is reported. Using a variation of the 3T-1MTJ design proposed by Camsari et al.²⁶, a stochastic MTJ and a high-performance MoS₂ transistor are integrated next to each other on the same chip, experimentally showing the desired gate-controlled fluctuations at room temperature. Moreover, this article elucidates the impact and interaction of the critical device characteristics of the components shown in Fig. 1a: (i) the MTJ, (ii) the transistor that is part of the p-bit core, and (iii) the inverter. It is found that, against common wisdom, a large tunnel magnetoresistance (TMR) is not the best choice for p-bits; bimodal telegraphic fluctuations are highly undesirable and are a sign of a slow device; matching of the MTJ resistance and the transistor characteristics is crucial; and an ideal inverter with a large gain is incompatible with the desired p-bit operation.

**Fig. 1: Implementing probabilistic bits (p-bits) with stochastic magnetic tunnel junctions (MTJs).**

Results

Implementing probabilistic bits (p-bits) with stochastic MTJs

At its core, a magnetic tunnel junction (MTJ) consists of two ferromagnetic layers separated by an ultrathin insulating layer (Fig. 1b). The “fixed” layer, which has the stronger magnetic moment, is used as the reference for the “free” layer, whose magnetic moment is more susceptible to being switched. Important MTJ parameters are tunnel magnetoresistance (TMR), which describes the difference in resistance between the parallel (P) and antiparallel (AP) arrangement of the two magnetic layers, and the energy barrier of the free layer, E_B, which needs to be overcome to toggle between the two resistance states^{27,28,29}.

For stable MTJs, such as those used in spin-transfer torque magnetic random access memory (STT-MRAM) applications³⁰, energy barriers are large and when the resistance is measured as an external magnetic field is swept, the resulting minor loop exhibits deterministic switching of the free layer. Figure 1c shows an example minor loop of a fabricated MTJ that was observed to be stable.

If this energy barrier is made smaller, through material changes or shape scaling³¹, the ambient thermal energy may be sufficient for the free layer to switch stochastically between the two resistance states (Fig. 1d). When biased at the center of this window, the signal is shown to be a naturally fluctuating output whereby the time spent in each resistance state (known as the dwell time, τ) may be described by the equation:

$$\tau = \tau_0 \, e^{E_B/(k_B T)}, \qquad (1)$$

where $k_B$ is the Boltzmann constant, $T$ is the temperature and $\tau_0$ is the “attempt time”, a material-dependent constant that is ~1 ns³². For in-plane stochastic MTJs, dwell times down to ~5 ns have been demonstrated^{33,34}.
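To get a feel for the exponential sensitivity in Eq. (1), the following minimal sketch evaluates the dwell time for a given barrier and inverts the relation to express a dwell time as a barrier height in units of k_B T. The function names are illustrative, and τ₀ = 1 ns and T = 300 K are assumed defaults rather than values fitted to the devices in this work.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def dwell_time(barrier_ev, temperature_k=300.0, attempt_time_s=1e-9):
    """Eq. (1): tau = tau_0 * exp(E_B / (k_B * T)), returned in seconds."""
    return attempt_time_s * math.exp(barrier_ev / (K_B_EV_PER_K * temperature_k))


def barrier_in_kt(tau_s, attempt_time_s=1e-9):
    """Invert Eq. (1): barrier height in units of k_B * T for a given dwell time."""
    return math.log(tau_s / attempt_time_s)


if __name__ == "__main__":
    # Dwell times quoted in the article; tau_0 = 1 ns is an assumption.
    for tau in (5e-9, 29e-6, 27e-3, 695e-3):
        print(f"tau = {tau:.3g} s  ->  E_B = {barrier_in_kt(tau):.1f} k_B*T")
```

Under these assumptions, the dwell times quoted in this article (from ~5 ns up to 695 ms) correspond to barriers from below 2 k_B T to about 20 k_B T, which illustrates how strongly a modest change in barrier height changes the fluctuation speed.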

For p-bit applications, this source of natural stochasticity is ideal; by coupling a stochastic MTJ with an access transistor, and including an inverter for amplification, a compact voltage-controlled p-bit design is achieved (Fig. 1a)²⁶.

The theoretical output from such a p-bit implementation, generated using modified experimental data from stochastic MTJs (Fig. 1e) and circuit simulations of transistor behavior (Fig. 1f), is shown before (Fig. 1g) and after (Fig. 1h) the inverter’s amplification. (For more details regarding the use of experimental data in the circuit simulations, please see Supplementary Note 1). The core of the p-bit, which includes the stochastic MTJ and the N-channel metal-oxide-semiconductor (NMOS) transistor, provides the tunable stochasticity while the inverter provides the thresholding and amplification of the stochastic signal. The resulting sigmoidal output allows for pinning at low- and high-input voltages while exhibiting the desired output fluctuations in the transition region.

The tunability in the output is controlled by varying the transistor gate voltage (V_IN), where changes in the relative resistance of the transistor to the MTJ change the voltage at the inverter’s input. This voltage is then amplified through the inverter’s operation, allowing the output to be pinned to output-low for low V_IN, and to output-high for high V_IN. In the middle region, the stochastic resistance fluctuations from the MTJ manifest as tunable random voltage fluctuations in the p-bit output.

This design is discussed further in the following section, which shows the experimental realization of the p-bit core using a stochastic MTJ and a 2D-MoS₂ transistor.

Experimental demonstration of an on-chip p-bit core

For this demonstration, MTJ devices were fabricated first. Those devices possessing sufficient TMR for a large read signal were then interconnected with appropriately resistance-matched field-effect transistor (FET) devices in a 1T-1MTJ configuration. Ideally, the transistor is chosen such that its on-state resistance is at least two orders of magnitude smaller than the MTJ’s low-resistance state, R_P, and its off-state resistance is at least two orders of magnitude larger than the MTJ’s high-resistance state, R_AP, to attain the maximum swing in the output voltage.
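As a rough worked example of this resistance-matching rule, assume the MTJ and the FET act as a simple resistive divider, with the MTJ on the V_D side and the FET to ground. This is an idealization that ignores the bias dependence of both devices, and the symbols $V_{\mathrm{node}}$, $R_{\mathrm{on}}$ and $R_{\mathrm{off}}$ are introduced here only for illustration. Taking the worst-case MTJ state at each extreme, with $R_{\mathrm{on}} = R_P/100$ and $R_{\mathrm{off}} = 100\,R_{AP}$:

$$V_{\mathrm{node}} = V_D \, \frac{R_{\mathrm{FET}}}{R_{\mathrm{FET}} + R_{\mathrm{MTJ}}}$$

$$V_{\mathrm{node}}^{\mathrm{on}} = V_D \, \frac{R_P/100}{R_P/100 + R_P} = \frac{V_D}{101} \approx 0.010\,V_D, \qquad V_{\mathrm{node}}^{\mathrm{off}} = V_D \, \frac{100\,R_{AP}}{100\,R_{AP} + R_{AP}} = \frac{100}{101}\,V_D \approx 0.990\,V_D$$

Under these assumptions, a two-orders-of-magnitude margin on each side gives a swing of about 98% of V_D, whereas a margin of only one order of magnitude would reduce it to about 82% (9/11 of V_D).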

Figure 2a shows a schematic of the 1T-1MTJ configuration for the on-chip p-bit core. The detailed stack structure for the MTJs used in this demonstration is shown in Fig. 2b. The magnetic layer (CoFeB) thicknesses were chosen to yield MTJs with in-plane anisotropy for two reasons: MTJs with in-plane anisotropy have been shown to be more resistant to spin-transfer torque (STT)-pinning³⁵, and have also been shown to fluctuate on time scales that are orders of magnitude faster than those of perpendicular-anisotropy MTJs^{32,34,36}.

**Fig. 2: Fabricating an on-chip p-bit.**

Figure 2c shows an SEM image of an example elliptical nanopillar with the same dimensions as the MTJs used in this demonstration, while Fig. 2d shows an optical microscope image of a finished MTJ device, along with a tilted-angle false-color SEM image of the MTJ region.

The interdigitated (IDT) monolayer (ML) MoS₂ FETs are then fabricated alongside the completed MTJ devices. The cross-section of the FET is shown in Fig. 2e while an SEM image of a fabricated IDT ML MoS₂ FET is shown in Fig. 2f, where a single IDT FET includes 20 sets of source/drain contacts, with L_ch ~ 150 nm and W_ch ~ 6.5 μm, for a total effective channel width of 130 μm. ML MoS₂ is chosen as the channel material of the drive transistor due to the low thermal budget fabrication process (to help preserve the performance of the fabricated stochastic MTJs, which suffer shorting in the SiO₂ isolation layer for temperatures above ~400 °C), low contact resistance³⁷, the large bandgap (1.8 eV), the high on-state performance of scaled 2D-MoS₂ FETs³⁸ and good electrostatic control achievable with ML MoS₂. Although it would require significant experimental effort, it should be noted that the ultimate p-bit implementation would involve integrating advanced CMOS circuitry with unstable MTJs (rather than using MTJs in an MRAM array structure as nonvolatile memory elements).

Figure 3a shows the minor loop of the stochastic MTJ used in the integrated p-bit. The dashed line at −16 mT indicates the 50–50 point at which the device spends an equal amount of time in the AP- and P-state. All further measurements for this device are performed at this 50–50 point to ensure the MTJ’s resistance output (Fig. 3b) is truly random. As this is an intrinsically Poisson process, fitting the histograms of the AP- and P-state dwell times (Fig. 3c) with an exponential envelope yields the average dwell time in each state ($\tau_{AP}$ and $\tau_{P}$, respectively)^{20,39}. The dwell time of this device, a quantity that determines the speed at which a p-bit may operate, is calculated as the harmonic mean of $\tau_{AP}$ and $\tau_{P}$ and is 695 ms (details on the dwell time extraction and the quality of randomness can be found in Supplementary Note 2).
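The extraction described above can be sketched as follows. This is not the authors’ analysis code: it replaces the histogram fit with the sample mean of the dwell durations (the maximum-likelihood estimate of an exponential time constant), and it assumes the standard two-value harmonic mean 2τ_P τ_AP/(τ_P + τ_AP), since the exact normalization is given only in Supplementary Note 2. All function names and the threshold-based state assignment are illustrative.

```python
import random
import statistics


def simulate_dwell_times(tau, n, seed=0):
    """Draw n exponentially distributed dwell times with mean tau (Poisson switching)."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / tau) for _ in range(n)]


def dwell_times_from_trace(samples, dt, threshold):
    """Split a sampled resistance trace into dwell durations below/above a threshold."""
    runs = []  # each entry is [is_high, duration]
    for s in samples:
        is_high = s > threshold
        if runs and runs[-1][0] == is_high:
            runs[-1][1] += dt
        else:
            runs.append([is_high, dt])
    # The first and last runs are cut off by the measurement window, so drop them.
    interior = runs[1:-1]
    low = [d for is_high, d in interior if not is_high]
    high = [d for is_high, d in interior if is_high]
    return low, high


def mean_dwell(dwells):
    """Sample mean, used here in place of an exponential fit to the histogram."""
    return statistics.fmean(dwells)


def combined_dwell(tau_p, tau_ap):
    """Harmonic mean of the two state dwell times (assumed convention)."""
    return statistics.harmonic_mean([tau_p, tau_ap])


if __name__ == "__main__":
    # Hypothetical time constants, not the measured device values.
    tau_p = mean_dwell(simulate_dwell_times(0.5, 20000, seed=1))
    tau_ap = mean_dwell(simulate_dwell_times(1.0, 20000, seed=2))
    print(f"tau_P ~ {tau_p:.3f} s, tau_AP ~ {tau_ap:.3f} s, "
          f"combined ~ {combined_dwell(tau_p, tau_ap):.3f} s")
```

The harmonic mean is dominated by the shorter of the two dwell times, so a device biased away from its 50–50 point would report a dwell time close to that of its less stable state.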

**Fig. 3: Characterization & measurement of the interconnected on-chip p-bit core.**

The transfer characteristics of 24 as-fabricated IDT ML MoS₂ FETs are shown in Fig. 3d. They exhibit a narrow variation in the threshold voltage, while the benefits of the IDT structure are seen in the high current levels and on/off ratios. The on-current level is around 0.6 mA at V_DS = 0.1 V and the on/off ratio is ~10¹⁰, with a minimum subthreshold slope (SS) of around 94 mV/dec. Note that the scaled devices operate at gate voltages on the order of 1 V, which is critical for the ultimate p-bit implementation, where V_IN and V_OUT must be on the same voltage scale.

Following the characterization of devices, a Ti/Au interconnect is fabricated between the MTJ- and MoS₂-FET pair observed to have the best resistance match and stochastic signal. It is observed, however, that after the integration of MTJ and FETs, there is a degradation in the transistor performance, as shown in Fig. 3e, including degraded on-off ratio and SS. This is not a result of connecting the FET with the MTJ, but likely due to process-induced trap charges in the HfO₂ gate oxide that produced an aging effect, whereby the FET characteristics were observed to degrade over time for this device⁴⁰.

A circuit schematic of this 1T-1MTJ p-bit core is shown in Fig. 3f, while an optical microscope image of the finished device is shown in Fig. 3g. Figure 3h shows the output, V_{INVERTER INPUT}, as a function of the input (FET gate) voltage, V_IN. V_D = 200 mV was used to avoid excessive stress to the MgO barrier and to prevent damage to the MTJ observed at larger current densities. (To better understand the choice of V_D and the impact of large current densities through the MTJ, see Supplementary Note 3).

For this measurement, the MTJ is biased at its 50–50 point (as seen in Fig. 3a), and V_{INVERTER INPUT} is measured 200 times at each input voltage value, V_IN, to demonstrate the impact of the stochastic fluctuations on the p-bit core’s output.

To compensate for the transistor degradation in the interconnected p-bit core, V_IN had to be significantly increased, which will not be required in a further optimized p-bit implementation. At large negative V_IN, when the transistor is in its highly resistive OFF-state, the potential at V_{INVERTER INPUT} is close to V_D. Increasing V_IN yields a decrease in the transistor’s resistance, resulting in a reduction in V_{INVERTER INPUT} as the transistor approaches its threshold voltage, V_TH.

For this device, the leftward shift of the degraded transistor’s threshold voltage, V_TH, results in a leftward shift of the overall sigmoid while the degradation in the transistor’s off-state resistance (shown in Fig. 3e) results in the output not being fully pinned to V_D (see Supplementary Note 5 for off-chip p-bit core implementations with better resistance-matching and better V_IN-V_OUT matching between the constituent MTJ-FET pair, illustrating that the non-idealities in the on-chip demonstration discussed here are a result of process modules and not a fundamental issue).

The impact of the MTJ’s fluctuations also becomes increasingly clear in the p-bit core output as V_IN is increased, with the magnitude of fluctuations observed at a maximum when the resistances of the transistor and the MTJ are approximately equal, and an equal voltage is dropped across both components. The red inset in Fig. 3h reveals a significant voltage drift in the output due to charge traps from the degradation of the transistor gate oxide and its impact on the subthreshold slope.
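The dependence of the fluctuation amplitude on the transistor resistance can be illustrated with an idealized resistive-divider model, in which the MTJ sits on the V_D side and the FET connects the output node to ground. This is a sketch under stated assumptions: the resistance values and TMR below are hypothetical, both devices are treated as bias-independent resistors, and the function names are illustrative. In this model the P-to-AP voltage difference at the node is largest when the FET resistance equals the geometric mean of R_P and R_AP, consistent with the observation that fluctuations peak when the two resistances are approximately equal.

```python
import math


def node_voltage(v_d, r_fet, r_mtj):
    """Divider output with the MTJ tied to V_D and the FET tied to ground."""
    return v_d * r_fet / (r_fet + r_mtj)


def fluctuation_amplitude(v_d, r_fet, r_p, r_ap):
    """Difference in node voltage between the MTJ's P and AP states."""
    return node_voltage(v_d, r_fet, r_p) - node_voltage(v_d, r_fet, r_ap)


if __name__ == "__main__":
    V_D = 0.2            # volts, as used for the on-chip measurement
    R_P = 10e3           # ohms, hypothetical
    TMR = 0.5            # (R_AP - R_P) / R_P, hypothetical
    R_AP = R_P * (1.0 + TMR)

    for exponent in range(2, 8):
        r_fet = 10.0 ** exponent
        amp = fluctuation_amplitude(V_D, r_fet, R_P, R_AP)
        print(f"R_FET = 1e{exponent} ohm -> node swing = {amp * 1e3:.3f} mV")

    r_best = math.sqrt(R_P * R_AP)
    print(f"largest swing at R_FET = {r_best:.0f} ohm")
```

The same model reproduces the two pinned limits: the node sits near V_D when the FET is far more resistive than the MTJ and near 0 V when it is far more conductive, with only small residual fluctuations in either case.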

A further increase in V_IN to the transistor’s ON-state, where the resistance of the transistor is less than that of the MTJ, sees the output approach 0 V. The output here still shows the fluctuations from the MTJ, but at a much smaller scale (green inset, Fig. 3h). This is beneficial because any STT-pinning effects from the large currents at this input voltage, which could bias the 50–50 fluctuations of the MTJ, do not significantly impact the output of the p-bit core (Supplementary Note 3 shows how large current densities through the MTJ can result in STT-pinning).

In this way, this demonstration of a scaled on-chip p-bit core is shown to produce the desired sigmoidal output with the tunable stochasticity that is required for probabilistic computing. Viewed as an individual device, the design compares favorably with a pure CMOS implementation: a high-quality tunable random number generator built in CMOS would require orders of magnitude more transistors and components than the p-bit core experimentally demonstrated on-chip here²⁴.

A desirable feature of the sigmoid is that it is centered around V_IN = V_D/2, such that V_IN and V_OUT may be of similar scales, and the output of one p-bit may be fed into the input of another p-bit to create correlated p-bit networks. This may be achieved by implementing a dual-gated transistor design, whereby the threshold voltage may be shifted to the desired region through the application of an additional top-gate voltage (demonstrated in Supplementary Note 4).

This demonstration also illustrates the impact the transistor has on the p-bit’s output. For example, the subthreshold slope (SS) determines the steepness of the sigmoid (a steeper SS would yield a steeper sigmoid), and how well the transistor is resistance-matched with the MTJ determines the fraction of V_D that the output sigmoid spans and whether the output can be pinned. Moreover, the location of the threshold voltage is critical in determining the centroid of the overall sigmoid (as shown in Supplementary Note 5).

Influence of MTJ characteristics on the p-bit output

To study the impact of an MTJ’s characteristics on a p-bit’s output, experimental data from stochastic MTJs are used as input for circuit simulations, conducted using the Spectre Simulation Platform. A 3T-1MTJ model of the p-bit is used (Fig. 4a), with additional bias points available at the body bias for the N-channel metal-oxide-semiconductor (NMOS) and P-channel metal-oxide-semiconductor (PMOS) transistors of the inverter for tuning of inverter characteristics (further information about data handling, and the transistors that are part of the p-bit circuitry, is provided in Supplementary Note 1).

**Fig. 4: Influence of the MTJ’s TMR on the p-bit output.**

Two key properties of an MTJ are investigated: the MTJ’s TMR and the MTJ’s distribution of resistance states. An ideal p-bit output in a 3T-1MTJ configuration would be a smooth sigmoidal function with a wide region of fluctuations, at the center of which are rail-to-rail fluctuations that could be used to drive other p-bits in a network of such devices.

Figure 4b shows the p-bit output for three MTJs fluctuating at the same frequency but with TMR ratios scaled to different values (Supplementary Note 6 describes how this TMR-scaling was performed using actual measured MTJ fluctuations). The dotted line shows the time-averaged V_OUT at each V_IN, while the shaded background shows the instantaneous output as V_IN is swept linearly from 0 to 1.8 V.

The largest TMR device (300%, blue) has the widest stochastic region and rail-to-rail fluctuations but also shows a plateau in the time-averaged curve. These plateaus, or the pinning of the output over a range of input voltages, are non-ideal for concatenation purposes as they reduce the tunability of an individual p-bit’s fluctuations with changes in its input from other p-bits in the network.

To quantify the degree of plateau, only the central region of stochastic fluctuations is used; this is defined as the V_IN voltage range which corresponds to the middle 90%-interval of the averaged p-bit output, or the region between V_OUT = 0.18 V and V_OUT = 1.62 V. For the 80% TMR device, this corresponds to V_IN = 0.83 V and V_IN = 1.00 V, respectively. These points are used to define the “ideal” gradient, describing a line that spans these points and corresponds to an averaged p-bit output that would be consistently tunable and devoid of plateaus in the stochastic region. A plateau is defined as any point within the averaged p-bit output where the instantaneous gradient is less than 50% of the “ideal” gradient (See Supplementary Note 9 for more information on quantifying the plateau in the p-bit output).
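The plateau definition above can be sketched in a few lines. This is an illustrative reading of the definition, not the procedure of Supplementary Note 9: it assumes a time-averaged output that rises with V_IN, takes the central region as the span between the first and last samples with 0.18 V ≤ V_OUT ≤ 1.62 V, and weights each segment by its V_IN width. The function name and defaults are hypothetical.

```python
def plateau_fraction(v_in, v_out_avg, v_low=0.18, v_high=1.62, threshold=0.5):
    """Fraction of the central stochastic region whose local gradient is
    below `threshold` times the "ideal" end-to-end gradient."""
    central = [i for i, v in enumerate(v_out_avg) if v_low <= v <= v_high]
    if len(central) < 2:
        raise ValueError("need at least two samples in the central region")
    first, last = central[0], central[-1]

    # "Ideal" gradient: straight line joining the ends of the central region.
    ideal = (v_out_avg[last] - v_out_avg[first]) / (v_in[last] - v_in[first])

    flat_width = 0.0
    total_width = 0.0
    for i in range(first, last):
        width = v_in[i + 1] - v_in[i]
        gradient = (v_out_avg[i + 1] - v_out_avg[i]) / width
        total_width += width
        if gradient < threshold * ideal:
            flat_width += width
    return flat_width / total_width


if __name__ == "__main__":
    # Hypothetical averaged outputs on a coarse V_IN grid (arbitrary units).
    grid = [0, 1, 2, 3, 4]
    smooth = [0.2, 0.5, 0.8, 1.1, 1.4]
    stepped = [0.2, 0.8, 0.8, 0.8, 1.4]
    print(plateau_fraction(grid, smooth))
    print(plateau_fraction(grid, stepped))
```

With real data, the averaged curve would first be smoothed or finely sampled, since a noisy finite-difference gradient would otherwise flag spurious plateau segments.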

Using these definitions, it is observed that the 300% TMR device has 67% of the stochastic region formally defined as a plateau, where little tunability is observed in the averaged output.

In contrast, the smallest TMR device (15% TMR, black) has no major plateaus within the central stochastic region but has a narrower range over which the fluctuations are visible (with the middle 90%-interval of the stochastic output being measured as between V_IN = 0.87 V and V_IN = 0.93 V).

This is undesirable as it limits the V_IN range in which usable fluctuations are observed, with the p-bit output primarily in the output-low or output-high state. To understand this behavior, consider Fig. 4c–e.

Figure 4c shows, for increasing V_IN applied to the transistor’s gate, the distributions of values at the inverter’s input for each of the p-bits made with MTJs of differing TMRs, along with the voltage transfer curve (VTC) of the inverter (overlaid in green). The largest TMR device (300%, blue), with the largest resistance fluctuation, has the widest spread of values for Inverter Input, while the smallest TMR device (15%, black) has the narrowest distribution (Supplementary Note 7 provides further explanation of these voltage distributions).

For V_IN = 0.8 V (Fig. 4c), the value at the inverter’s input is centered around V_{INVERTER INPUT} ≈ 1.2 V, such that the V_OUT is within output-low, i.e., close to zero, on the VTC for both the 15% and 80% TMR. However, the 300% TMR device has a sufficient number of states in the bottom arm of its V_{INVERTER INPUT} distribution (blue) that is in-between the noise margin regions of the inverter’s VTC, such that the average V_OUT for the 300% TMR device is shifted to a larger value of ~490 mV (Fig. 4b).

As V_IN is increased, the transistor connected to the stochastic MTJ becomes more conducting, and the center of the distributions shifts to smaller V_{INVERTER INPUT} values. For V_IN = 0.98 V (Fig. 4e), the 15% TMR device (black) has inverter input values such that it interacts primarily with the output-high section of the VTC, giving an average V_OUT that is pinned close to 1.7 V (Fig. 4b).

In contrast, the 300% TMR device has a larger range of inverter input values that spans between the noise margin regions of the inverter. This results in the plateau effect, where changing the input voltage does not yield a meaningful change in average V_OUT as the TMR is large enough for the distribution of inverter input values to span both the output-high and output-low regions of the VTC for a range of V_IN values.

To summarize, the smaller the TMR, the smaller the section of the VTC that is sampled by the inverter input distribution, and the smaller the range of V_OUT over which the values are averaged. This results in a smoother averaged output that is more sigmoidal and less prone to plateauing. However, the V_IN range over which the stochastic fluctuations are observed is small, limited to the range between the output-high and output-low regions of the VTC, where the gain is non-zero. This means that for a small TMR device, rail-to-rail fluctuations are not observed at all. Although it has been shown that rail-to-rail fluctuations are not necessary for the entire fluctuating range²⁶, the diminished output fluctuation range would make it difficult to form networks with small-TMR p-bits due to the insufficient voltage drive it would provide to the next p-bit. A large TMR device is good for attaining rail-to-rail output voltages, such as at V_IN = 0.9 V (Fig. 4d) where the 300% TMR device shows an output spanning 0 to 1.8 V, but is prone to the plateauing effect if the device’s inverter input distribution spans the output-high and output-low regions of the VTC for an extended range of V_IN values.

This is a key finding: for a given inverter, the TMR should not be so high that the inverter input distribution spans the output-low and output-high regions of the inverter over a large V_IN range. Similarly, a “perfect” inverter with infinite gain would be undesirable for p-bit applications, as even an MTJ with a small TMR would then produce a step-like plateau in the output.
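The plateau mechanism can be reproduced with a toy model. The sketch below assumes a logistic inverter transfer curve with switching point V_M = 0.9 V on a 1.8 V supply and a purely two-state (fully bimodal) MTJ, so the inverter input takes one of two values; the circuit simulations in this work use a full transistor model instead, and all names and numbers here are illustrative. When the two input values straddle V_M and the gain is very high, the time-averaged output stays at mid-rail even as the input distribution shifts, which is the plateau; with a low gain, the average follows the shift.

```python
import math


def inverter_vtc(v_in, v_dd=1.8, v_m=0.9, gain=20.0):
    """Logistic inverter curve; the magnitude of its slope at v_m equals `gain`."""
    k = 4.0 * gain / v_dd
    x = max(-700.0, min(700.0, k * (v_in - v_m)))  # clamp to avoid overflow
    return v_dd / (1.0 + math.exp(x))


def average_output(node_voltages, v_dd=1.8, v_m=0.9, gain=20.0):
    """Time-averaged V_OUT when the inverter input visits each value equally often."""
    outputs = [inverter_vtc(v, v_dd, v_m, gain) for v in node_voltages]
    return sum(outputs) / len(outputs)


if __name__ == "__main__":
    # Two-state inverter input (P, AP), shifted down by 20 mV in the second case.
    before = [0.95, 0.85]
    after = [0.93, 0.83]
    for gain in (2.0, 20.0, 1000.0):
        a = average_output(before, gain=gain)
        b = average_output(after, gain=gain)
        print(f"gain = {gain:6.0f}: <V_OUT> = {a:.4f} V -> {b:.4f} V")
```

In this model the average output responds to the 20 mV shift only when the gain is low enough for the transfer curve to be sloped at the sampled input values, which matches the argument that a shallower VTC suppresses plateaus.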

These plateaus are particularly problematic when interconnecting p-bits to form p-circuits. In Kaiser et al.²¹, a circuit of 5 p-bits, made with non-ideal perpendicular MTJs (in an example of off-chip integration), is used to emulate a Full Adder circuit. The performance of this non-ideal p-circuit (in which the constituent p-bits had, on average, 51% of their central stochastic range within a plateau region) is compared to an ideal p-circuit (made of p-bits devoid of plateaus in their output) (see Supplementary Note 9 for further information). It was found that the non-ideal p-circuit took twice as long as the ideal p-circuit to reach the ground state solution, demonstrating that the plateaus in an individual p-bit’s output can have a direct impact on the performance of the wider p-bit network.

Another characteristic that affects the p-bit’s output is the MTJ’s distribution of states. An MTJ with a very bimodal distribution is more prone to plateaus in the output⁴¹, especially if the TMR is large enough for the fluctuations in the inverter’s input to sample both output-high and output-low regions of the VTC. In contrast, an MTJ with a very continuous distribution, with the ideal being a uniform distribution between R_P and R_AP, would sample each value of the VTC equally and would give a much smoother sigmoidal output.

A further key finding of this work is that there appears to be a correlation between the distribution of resistance states and the speed at which these in-plane MTJs fluctuate. To quantify how bimodal an MTJ’s resistance fluctuations are, a new figure-of-merit, the “distribution factor”, is introduced. Using the normalized resistance output of an MTJ, histograms are created and the counts in the 8 edge-state bins are divided by the counts in the 8 middle-state bins. For statistical significance, the total number of data points is the same in each data set. Figure 5a–c and Fig. 5d–f show this process for two MTJs of different dwell times (τ = 29 μs and τ = 27 ms, respectively).
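A minimal sketch of the distribution factor is given below. The text specifies 8 edge-state and 8 middle-state bins but not the total number of bins or how the edge bins are placed, so the choices here are assumptions: 24 equal-width bins over the normalized resistance range, with the edge bins taken as the 4 lowest and 4 highest and the middle bins as the central 8. The function name and parameters are hypothetical.

```python
def distribution_factor(resistance, n_bins=24, n_edge=8, n_middle=8):
    """Ratio of counts in the edge bins to counts in the middle bins of the
    histogram of min-max normalized resistance values."""
    lo, hi = min(resistance), max(resistance)
    if hi == lo:
        raise ValueError("resistance trace has no spread")

    counts = [0] * n_bins
    for r in resistance:
        x = (r - lo) / (hi - lo)                # normalize to [0, 1]
        index = min(int(x * n_bins), n_bins - 1)  # x == 1 goes in the last bin
        counts[index] += 1

    half_edge = n_edge // 2                     # assumed: split evenly between ends
    edge = sum(counts[:half_edge]) + sum(counts[n_bins - half_edge:])
    start = (n_bins - n_middle) // 2
    middle = sum(counts[start:start + n_middle])
    return float("inf") if middle == 0 else edge / middle


if __name__ == "__main__":
    uniform = list(range(2400))                      # evenly spread states
    bimodal = [0.0] * 50 + [1.0] * 50 + [0.5] * 10   # mostly edge states
    print(distribution_factor(uniform))
    print(distribution_factor(bimodal))
```

With this convention a uniform distribution gives a factor close to 1, and a strongly bimodal trace gives a factor well above 1.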

**Fig. 5: Resistance distributions of stochastic MTJ devices.**

Figure 5g shows this distribution factor calculated for 23 stochastic MTJs, made with the same stack material, with dwell times spanning orders of magnitude (Supplementary Note 8 explains in greater detail why it is meaningful and justified to use the distribution factor as a key metric).

It is observed that the faster the MTJ fluctuates, the more middle states there are in the resistance distribution, and the less bimodal the distribution is. One possible explanation for this is that this distribution factor, which compares the number of counts of the edge states to middle states in the resistance distribution of an MTJ’s stochastic fluctuations, is representative of the amount of time the MTJ’s free layer spends in the P- or AP-state (the edge-state counts) compared to the amount of time the free layer spends in transitioning between them (the middle-state counts). This transition time is dependent on material properties of the MTJ stack layers, such as the effective perpendicular-anisotropy field, and for in-plane MTJs is theorized to be in the range of approximately 1–10 ns³². Therefore, the smaller the energy barrier, the smaller the dwell time in the P- or AP-state, while the transition time is relatively unaffected (with a change in the energy barrier size). Thus, the smaller the dwell time, the fewer the edge states relative to middle states, which correlates to a smaller distribution factor.

Considering Fig. 5g, the dotted line is a guide to the eye which suggests that for this material stack, a uniform distribution with equal edge- and middle-state counts would be achieved for MTJs with fluctuations in the tens of ns regime. This correlation suggests that a faster device, with a more continuous distribution, would yield a smoother sigmoidal output.

This is tested with the two devices of different dwell times, τ = 29 μs and τ = 27 ms, that are scaled to the same TMR, and using the same inverter (Fig. 5h). Using the same method as previously described to quantify the plateau, the slower device (27 ms, orange) has a wider plateau region with 59% of the central stochastic region (between V_IN = 0.83 V and V_IN = 1.00 V) identified as having a gradient less than 50% of the “ideal” gradient. In contrast, the faster device (29 μs, red), which has a smaller distribution factor and is less bimodal, shows only 19% of the central stochastic region as being a plateau.

Moreover, considering the severity of the plateau, the slower device’s plateau region is shallower than the plateau in the output of the faster device, resulting in the slower (more bimodal) device having an output that not only has a wider plateau region but also one that is comparatively less tunable in the central operating region.

This is another key finding in that a faster MTJ has a two-fold advantage: firstly, the faster the fluctuation and speed of random number generation, the faster the p-bit may operate asynchronously, and secondly, the faster the MTJ, the more uniform the distribution of states is observed to be, and the more ideal the p-bit’s output is. Thus, it is this interplay of the MTJ’s TMR and the distribution of states, along with the inverter’s properties, that can determine how ideal a p-bit’s output is.

Influence of inverter characteristics on the p-bit output

The inverter also offers a degree of control over the p-bit’s output. Figure 6a shows the voltage transfer curve (VTC) for two inverters: one without applied body bias, called “pristine” (black curve), and the other which has been tuned, through the application of a positive body bias to the NMOS FET, to have a smaller gain (red curve).

**Fig. 6: Influence of the inverter’s characteristics on the p-bit output.**

Using the same MTJ (with a dwell time of $\tau = 27\,\mathrm{ms}$) and transistor, Fig. 6b shows the impact of this inverter tuning on a p-bit’s output: the tuned inverter (red), with the smaller gain, shows a smoother sigmoid while the pristine inverter, with the larger gain, shows a more pronounced undesirable plateau in the output. This is because for a given MTJ with a bimodal distribution, the distribution of voltages at the inverter’s input is less likely to span the output-low and output-high regions for an extended range of V_IN (the cause of the undesirable plateaus) if the VTC is shallower and the gain is small.

However, the tuned inverter also suffers from a degradation in the noise margin, seen in Fig. 6a, which decreases the size of the p-bit’s output fluctuation range. This is because the body bias at the NMOS transistor shifts its threshold voltage, lowering the channel resistance and making it harder to pin to output-high, V_D, for large V_IN.

This issue could be mitigated by using a more aggressively scaled technology node for the inverter than the 180 nm node used here. A 14 nm ultrascaled FinFET inverter (as used in previous p-bit simulation work^{26,41,42}) would be desirable: it provides a more piecewise-linear VTC that offers a lower gain (for a smoother sigmoidal output) together with a wide noise margin to pin the output to V_D at high input voltages.

Discussion

In this work, the experimental realization of an on-chip p-bit core is demonstrated, using a stochastic in-plane MTJ interconnected with a 2D-MoS₂ transistor in a 1T-1MTJ structure. Through experimental demonstration and circuit simulations, it is shown how each component of the p-bit influences the overall output.

For the transistor, a good resistance match with the MTJ and a threshold voltage close to V_D/2 are required to achieve a well-centered sigmoid that spans the full range of V_D and is suitable for inverter amplification.
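The resistance-matching requirement can be made concrete with a simple estimate. As an idealization (an assumption made here for illustration, not the full circuit model used in the simulations), treat the MTJ and the transistor channel resistance $R_T$ as a resistive divider across V_D. As the MTJ fluctuates between R_P and R_AP, the voltage at the shared node switches between two levels separated by

$$\Delta V = V_D \left( \frac{R_T}{R_T + R_P} - \frac{R_T}{R_T + R_{AP}} \right) = V_D \, \frac{R_T \,(R_{AP} - R_P)}{(R_T + R_P)(R_T + R_{AP})}.$$

Setting $\mathrm{d}\Delta V / \mathrm{d}R_T = 0$ gives $R_T = \sqrt{R_P R_{AP}}$, at which point

$$\Delta V_{\max} = V_D \, \frac{\sqrt{R_{AP}} - \sqrt{R_P}}{\sqrt{R_{AP}} + \sqrt{R_P}} = V_D \, \frac{\sqrt{1 + \mathrm{TMR}} - 1}{\sqrt{1 + \mathrm{TMR}} + 1}, \qquad \mathrm{TMR} = \frac{R_{AP} - R_P}{R_P}.$$

Within this simplified picture, the voltage swing handed to the inverter is largest when the transistor resistance in the middle of its V_IN range is comparable to the MTJ resistance, and the swing shrinks as the TMR decreases.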

For the stochastic MTJ, too large a TMR can cause plateaus in the inverter’s average output, while too small a TMR gives an insufficient V_IN range over which the usable fluctuations in V_OUT are observed. Additionally, it is found that the speed at which the MTJ fluctuates is crucial to the p-bit’s output: a faster MTJ is observed to have a more uniform distribution (with more middle states between R_P and R_AP edge states), and for a given inverter, this results in a smoother V_OUT sigmoid with less plateauing. A faster MTJ is also beneficial when concatenating p-bits to form a p-bit network, whereby the speed of the MTJs used can determine the speed of asynchronous operation.
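The link between MTJ speed and the rate at which a p-bit can supply fresh random states can be sketched with an idealized random telegraph signal. The example assumes a symmetric two-state MTJ whose dwell times are exponentially distributed with mean τ in both states; this is an assumption for illustration, and measured dwell-time statistics may differ. The function name `telegraph_flip_times`, the observation window, and the ten-times-faster dwell time are hypothetical; only τ = 27 ms is taken from the device of Fig. 6.

```python
import random


def telegraph_flip_times(tau, duration, seed=0):
    """Times (s) at which an idealized two-state MTJ switches state.

    Assumption: dwell times are exponentially distributed with mean `tau`
    in both states, so switching events form a Poisson process.
    """
    rng = random.Random(seed)
    t = 0.0
    flips = []
    while True:
        t += rng.expovariate(1.0 / tau)  # draw the next dwell time
        if t >= duration:
            return flips
        flips.append(t)


if __name__ == "__main__":
    window = 27.0  # observation window in seconds (hypothetical)
    for tau in (27e-3, 2.7e-3):
        n = len(telegraph_flip_times(tau, window))
        print(f"tau = {tau * 1e3:5.1f} ms -> {n} switching events in {window} s")
```

Under these assumptions the expected number of switching events in a window T is about T/τ, so a ten-times-faster MTJ delivers roughly ten times as many state changes in the same time. This is the sense in which the MTJ speed bounds the speed of asynchronous operation.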

For the inverter, the large gain and steep VTC associated with the conventional 180 nm-node technology used in the simulations were found to be more likely to yield undesirable plateaus in the p-bit output. A smaller-gain inverter, with a piecewise-linear VTC that maintains a wide noise margin in the input-low and input-high regions, achievable with a more scaled process, is desirable for p-bit applications.

These observations highlight how each component is crucial in determining the quality of the p-bit’s output, and they provide design insights that can contribute towards the future goal of fully scaled on-chip p-bit networks.

Methods

MTJ fabrication

The MTJ film stacks are deposited by DC/RF sputtering on thermally oxidized Si substrates. From the bottom, the layers are Ta(8 nm)/CoFeB(2 nm)/MgO(1 nm)/CoFeB(4 nm)/Ta(4 nm)/Ru(5 nm).

These stacks are patterned into elliptical nanopillars using e-beam lithography and Ar-ion beam etching. Amorphous SiO₂ is then deposited, to electrically insulate the bottom contact channel, with the etch hard mask in place as part of a self-aligned process. The hard masks are then removed using an NMP-based solvent, after which the MTJs are annealed at 300 °C for 10 minutes to improve the TMR of the finished devices⁴³. After the annealing procedure, the top contacts are defined using e-beam lithography, with e-beam evaporation used to deposit Ti/Au (20/140 nm) electrodes to enable electrical measurements across the MTJ.

2D FET fabrication

The bottom gate electrode structure is made of a Cr(2 nm)/Au(13 nm) metal stack followed by a 5.5 nm HfO₂ gate oxide. The HfO₂ is deposited by atomic layer deposition (ALD) at 90 °C. The monolayer (ML) MoS₂ flakes are then wet transferred from the original Si/SiO₂ growth substrate onto the bottom gate electrodes and vacuum annealed at a pressure of ~5 × 10⁻⁸ torr at 200 °C for 2 h. After vacuum annealing, the flakes are etched into a stripe before the interdigitated source/drain contacts are defined by electron beam lithography (EBL), and Ni (70 nm) is deposited as the contact metal by electron beam evaporation.

Data availability

Relevant data supporting the key findings of this study are available within the article and the Supplementary Information file. All raw data generated during the current study are available from the corresponding authors upon request.

References

1. Theis, T. N. & Wong, H.-S. P. The end of Moore’s law: a new beginning for information technology. Comput. Sci. Eng. 19, 41–50 (2017).
2. Schuman, C. D. et al. Opportunities for neuromorphic computing algorithms and applications. Nat. Comput. Sci. 2, 10–19 (2022).
3. Camsari, K. Y., Sutton, B. M. & Datta, S. p-bits for probabilistic spin logic. Appl. Phys. Rev. 6, 011305 (2019).
4. Cai, B. et al. Unconventional computing based on magnetic tunnel junction. Appl. Phys. A 129, 236 (2023).
5. Misra, S. et al. Probabilistic neural computing with stochastic devices. Adv. Mater. 2204569 https://doi.org/10.1002/adma.202204569 (2022).
6. Chowdhury, S. et al. A full-stack view of probabilistic computing with p-bits: devices, architectures and algorithms. IEEE J. Explor. Solid-State Comput. Devices Circuits 1–1 https://doi.org/10.1109/JXCDC.2023.3256981 (2023).
7. Kaiser, J. & Datta, S. Probabilistic computing with p-bits. Appl. Phys. Lett. 119, 150503 (2021).
8. Finocchio, G. et al. The promise of spintronics for unconventional computing. J. Magn. Magn. Mater. 521, 167506 (2021).
9. Sutton, B. et al. Autonomous probabilistic coprocessing with petaflips per second. IEEE Access 8, 157238–157252 (2020).
10. Camsari, K. Y. et al. From charge to spin and spin to charge: stochastic magnets for probabilistic switching. Proc. IEEE 108, 1322–1337 (2020).
11. Aadit, N. A. et al. Computing with Invertible Logic: Combinatorial Optimization with Probabilistic Bits. in 2021 IEEE International Electron Devices Meeting (IEDM) 40.3.1–40.3.4. https://doi.org/10.1109/IEDM19574.2021.9720514 (2021).
12. Faria, R., Camsari, K. Y. & Datta, S. Low-barrier nanomagnets as p-bits for spin logic. IEEE Magn. Lett. 8, 1–5 (2017).
13. Faria, R., Camsari, K. Y. & Datta, S. Implementing Bayesian networks with embedded stochastic MRAM. AIP Adv. 8, 045101 (2018).
14. Faria, R., Kaiser, J., Camsari, K. Y. & Datta, S. Hardware design for autonomous bayesian networks. Front. Comput. Neurosci. 15, 584797 (2021).
15. Aadit, N. A. et al. Massively parallel probabilistic computing with sparse Ising machines. Nat. Electron. 5, 460–468 (2022).
16. Pourmeidani, H., Sheikhfaal, S., Zand, R. & DeMara, R. F. Probabilistic interpolation recoder for energy-error-product efficient DBNs with p-bit devices. IEEE Trans. Emerg. Top. Comput. 9, 2146–2157 (2021).
17. Pervaiz, A. Z., Ghantasala, L. A., Camsari, K. Y. & Datta, S. Hardware emulation of stochastic p-bits for invertible logic. Sci. Rep. 7, 10994 (2017).
18. Pervaiz, A. Z., Sutton, B. M., Ghantasala, L. A. & Camsari, K. Y. Weighted p-bits for FPGA implementation of probabilistic circuits. IEEE Trans. Neural Netw. Learn. Syst. 30, 1920–1926 (2019).
19. Chowdhury, S., Camsari, K. Y. & Datta, S. Accelerated quantum Monte Carlo with probabilistic computers. Commun. Phys. 6, 85 (2023).
20. Vodenicarevic, D. et al. Low-energy truly random number generation with superparamagnetic tunnel junctions for unconventional computing. Phys. Rev. Appl. 8, 054045 (2017).
21. Kaiser, J. et al. Hardware-aware in situ learning based on stochastic magnetic tunnel junctions. Phys. Rev. Appl. 17, 014016 (2022).
22. Borders, W. A. et al. Integer factorization using stochastic magnetic tunnel junctions. Nature 573, 390–393 (2019).
23. Grimaldi, A. et al. Experimental evaluation of simulated quantum annealing with MTJ-augmented p-bits. in 2022 International Electron Devices Meeting (IEDM) 22.4.1–22.4.4. https://doi.org/10.1109/IEDM45625.2022.10019530 (2022).
24. Singh, N. S. et al. CMOS plus stochastic nanomagnets enabling heterogeneous computers for probabilistic inference and learning. Nat. Commun. 15, 2685 (2024).
25. Lv, Y., Bloom, R. P. & Wang, J.-P. Experimental demonstration of probabilistic spin logic by magnetic tunnel junctions. IEEE Magn. Lett. 10, 1–5 (2019).
26. Camsari, K. Y., Salahuddin, S. & Datta, S. Implementing p-bits with embedded MTJ. IEEE Electron Device Lett. 38, 1767–1770 (2017).
27. Butler, W. H. Tunneling magnetoresistance from a symmetry filtering effect. Sci. Technol. Adv. Mater. 9, 014106 (2008).
28. Zink, B. R., Lv, Y. & Wang, J.-P. Review of magnetic tunnel junctions for stochastic computing. IEEE J. Explor. Solid-State Comput. Devices Circuits 1–1 https://doi.org/10.1109/JXCDC.2022.3227062 (2022).
29. Bapna, M. & Majetich, S. A. Current control of time-averaged magnetization in superparamagnetic tunnel junctions. Appl. Phys. Lett. 111, 243107 (2017).
30. Koike, H. et al. 40 nm 1T–1MTJ 128 Mb STT-MRAM with novel averaged reference voltage generator based on detailed analysis of scaled-down memory cell array design. IEEE Trans. Magn. 57, 1–9 (2021).
31. Debashis, P., Faria, R., Camsari, K. Y. & Chen, Z. Design of stochastic nanomagnets for probabilistic spin logic. IEEE Magn. Lett. 9, 1–5 (2018).
32. Kanai, S., Hayakawa, K., Ohno, H. & Fukami, S. Theory of relaxation time of stochastic nanomagnets. Phys. Rev. B 103, 094423 (2021).
33. Safranski, C. et al. Demonstration of nanosecond operation in stochastic magnetic tunnel junctions. Nano Lett. 21, 2040–2045 (2021).
34. Hayakawa, K. et al. Nanosecond random telegraph noise in in-plane magnetic tunnel junctions. Phys. Rev. Lett. 126, 117202 (2021).
35. Hassan, O., Faria, R., Camsari, K. Y., Sun, J. Z. & Datta, S. Low-barrier magnet design for efficient hardware binary stochastic neurons. IEEE Magn. Lett. 10, 1–5 (2019).
36. Camsari, K. Y., Torunbalci, M. M., Borders, W. A., Ohno, H. & Fukami, S. Double free-layer magnetic tunnel junctions for probabilistic bits. Phys. Rev. Appl. 15, 044049 (2021).
37. Shen, P.-C. et al. Ultralow contact resistance between semimetal and monolayer semiconductors. Nature 593, 211–217 (2021).
38. Lan, H.-Y., Oleshko, V. P., Davydov, A. V., Appenzeller, J. & Chen, Z. Dielectric interface engineering for high-performance monolayer MoS₂ transistors via TaO_x interfacial layer. IEEE Trans. Electron Devices 70, 2067–2074 (2023).
39. Debashis, P., Faria, R., Camsari, K. Y., Datta, S. & Chen, Z. Correlated fluctuations in spin orbit torque coupled perpendicular nanomagnets. Phys. Rev. B 101, 094405 (2020).
40. McClellan, C. J., Yalon, E., Smithe, K. K. H., Suryavanshi, S. V. & Pop, E. High current density in monolayer MoS₂ doped by AlO_x. ACS Nano 15, 1587–1596 (2021).
41. Hassan, O., Datta, S. & Camsari, K. Y. Quantitative evaluation of hardware binary stochastic neurons. Phys. Rev. Appl. 15, 064046 (2021).
42. Camsari, K. Y., Faria, R., Sutton, B. M. & Datta, S. Stochastic p-bits for invertible logic. Phys. Rev. X 7, 17 (2017).
43. Wang, W.-G. et al. Rapid thermal annealing study of magnetoresistance and perpendicular anisotropy in magnetic tunnel junctions based on MgO and CoFeB. Appl. Phys. Lett. 99, 102502 (2011).

Acknowledgements

The authors thank Prof. K. Camsari for the many helpful discussions and for their invaluable insight. This work was supported by the National Science Foundation (NSF) through Award Number 2106501.

Author information

Authors and Affiliations

Birck Nanotechnology Center, Purdue University, West Lafayette, IN, 47907, USA
John Daniel, Zheng Sun, Xuejian Zhang, Yuanqiu Tan, Neil Dilley, Zhihong Chen & Joerg Appenzeller
Department of Physics and Astronomy, Purdue University, West Lafayette, IN, 47907, USA
John Daniel
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907, USA
Zheng Sun, Xuejian Zhang, Yuanqiu Tan, Zhihong Chen & Joerg Appenzeller

Contributions

J.A. and Z.C. conceived of and supervised the project. N.D. provided film-level analysis of the Magnetic Tunnel Junction (MTJ) stacks from which J.D. fabricated the stochastic MTJ devices. J.D. and Y.T. characterized the stochastic MTJ devices. Z.S. fabricated and characterized the 2D-MoS₂ FET devices. J.D. and Z.S. fabricated and measured the integrated on-chip device. X.Z. performed the circuit simulations and X.Z., J.D., J.A. and Z.C. analyzed the results. J.D. and J.A. wrote the manuscript, with contributions from Z.S. and X.Z. All the authors discussed the data and resulting outcomes.

Corresponding authors

Correspondence to John Daniel or Joerg Appenzeller.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Daniel, J., Sun, Z., Zhang, X. et al. Experimental demonstration of an on-chip p-bit core based on stochastic magnetic tunnel junctions and 2D MoS₂ transistors. Nat Commun 15, 4098 (2024). https://doi.org/10.1038/s41467-024-48152-0

Received: 19 March 2024
Accepted: 23 April 2024
Published: 15 May 2024
DOI: https://doi.org/10.1038/s41467-024-48152-0
