Hardware implementation of Bayesian network building blocks with stochastic spintronic devices

Bayesian networks are powerful statistical models to understand causal relationships in real-world probabilistic problems such as diagnosis, forecasting, computer vision, etc. For systems that involve complex causal dependencies among many variables, the complexity of the associated Bayesian networks become computationally intractable. As a result, direct hardware implementation of these networks is one promising approach to reducing power consumption and execution time. However, the few hardware implementations of Bayesian networks presented in literature rely on deterministic CMOS devices that are not efficient in representing the stochastic variables in a Bayesian network that encode the probability of occurrence of the associated event. This work presents an experimental demonstration of a Bayesian network building block implemented with inherently stochastic spintronic devices based on the natural physics of nanomagnets. These devices are based on nanomagnets with perpendicular magnetic anisotropy, initialized to their hard axes by the spin orbit torque from a heavy metal under-layer utilizing the giant spin Hall effect, enabling stochastic behavior. We construct an electrically interconnected network of two stochastic devices and manipulate the correlations between their states by changing connection weights and biases. By mapping given conditional probability tables to the circuit hardware, we demonstrate that any two node Bayesian networks can be implemented by our stochastic network. We then present the stochastic simulation of an example case of a four node Bayesian network using our proposed device, with parameters taken from the experiment. We view this work as a first step towards the large scale hardware implementation of Bayesian networks.


I. INTRODUCTION
Bayesian networks (BNs) are directed graphical models that are used to represent the causal dependencies among stochastic variables 1 .In a BN, each node represents a stochastic variable, whose probability of occurrence is determined by the states of its parent nodes.The dependence between a set of such nodes is given by a conditional probability table (CPT).BNs are traditionally implemented in software aiming at applications in areas such as forecasting, diagnosis, and computer vision 2 .However, as the complexity of the BNs grows, i.e., as the number of parent nodes affecting the probability of a particular child node becomes large, both the assessment of that child node probability, and the inference about the possible cause becomes impractical 3 .Specifically, as the network size grows, the number of terms in the calculation of the joint probability using probability chain rule increases rapidly 3 .
Direct representation of Bayesian networks in hardware has been proposed as an alternative way to perform the two above representative cases where the magnetization relaxes to the "up" and "down" direction after being released from the hard axis.(c) Experimental measurements on the device showing stochastic behavior with tunability using a charge current through an isolated Oersted ring.The bottom panels show the stochastic outputs, whose averages show the sigmoidal behavior as a function of the input current.mentioned tasks, i.e., probability assessment and inference.In this case, each "node" in a Bayesian network is represented by a stochastic device, having a distinct probability of being in one of two possible states.This probability is controlled by the input it receives dependent on the states of its parent nodes, through the weights of the connections between them.The CPT is encoded in the weights of these connections.By representing a BN with a hardware network of this kind, the required probability of a particular event is readily obtained by sampling the output of the corresponding stochastic device.Moreover, inference about the possible cause of a particular event can be evaluated by observing the joint distribution of the two stochastic devices corresponding to the "event" node and the particular "cause" node of interest.
Several hardware implementations of BNs have been proposed based on CMOS hardware.For example, Zermani et al. 4 demonstrated FPGA based BN implementation utilizing suitable architectural design and memory allocation schemes.Cai et al. 5 demonstrated another FPGA based architectural design along with a suitable pseudo random number generator.Manisinghka et al. 6 implemented a BN in digital circuits using a novel abstraction.Chakrapani et al. 7 and Weijia et al. 8 proposed a probabilistic CMOS hardware for BN implementation, however there has not been an experimental demonstration in literature to our knowledge.Nevertheless, there is an interest in a compact implementation of the stochastic nodes of a BN and their conditional relations.
In this work, we present an experimental demonstration of a spintronics based compact hardware implementation of BNs.The stochastic elements are implemented naturally by a compact device consisting of a perpendicular nanomagnet.The CPTs are translated directly to the connection weights, implemented by resistive connections between such devices.
Unstable nanomagnet based spintronic devices have recently attracted much research interest for probabilistic spin logic (PSL) 9,10,[19][20][21][11][12][13][14][15][16][17][18] and are given the name "p-bit", which is the short form of "probabilistic bit". It ha been proposed that inherently unstable nanomagnet can be a natural implementation of the stochastic variable in a BN 9,18,19,22 .We first present a p-bit implementation using a stochastic spintronic device that has isolated input and output to allow for interconnection in circuitry.The output of such a device is a tunable random number, whose mean is controlled by an electrical input.Then, we build an electrically connected network of two such devices and study the correlation of their outputs for different connections and biases.We show that any CPT can be implemented by changing the connections and biases of this circuit, thus representing a hardware BN building block.Finally, using parameters taken from the experiment, we perform a stochastic Landau Lifshitz Gilbert (sLLG) simulation of a four node BN and compare the results of the forecast with those expected from calculating joint probability distributions.

A. Hard axis initialized PMA magnet as p-bit
In our experiment, the stochastic device is based on a hard axis initialized magnet with perpendicular magnetic anisotropy (PMA), whose output probability is controlled by the magnetic field produced by a charge current passing through an isolated metal ring 14,15,17 .The top left of Fig. 1 (a) shows the schematic of our device.It consists of a nanomagnet island with perpendicular magnetic anisotropy (PMA) shown in orange, on top of a heavy metal (Ta) Hall bar, shown in blue.It is well understood that the magnetization of a PMA magnet can be deterministically switched by the Spin Obit Torque (SOT) of a heavy metal under-layer in the presence of a symmetry breaking in-plane magnetic field 23,24 .However, when the spin current density is large enough, and when this field is absent, the magnetization gets pinned in the direction of the spin polarization, i.e. the magnets hard axis.Once the spin current is removed, ambient thermal noise makes the magnetization relax to either "up" or "down" with equal probability due to the symmetric energy landscape for these two states 14,15,25 as depicted by the cartoon in the top right of Fig. 1 (a).The magnetization state is read out by the anomalous Hall effect (AHE), where the transverse VOUT is +ve for a magnetization in the "up" direction and -ve for "down".The probability of relaxing back to the "up" or "down" direction can be controlled by applying a small out-of-plane magnetic field that lifts the degeneracy of the energy landscape.A positive field in the zdirection lowers the energy of the "up" state and raises that of the "down" state, thus making the "up" state more favorable.A negative z-directed field does the exact opposite.This is depicted in the energy landscape diagrams shown in the bottom panel of Fig. 1 (a).This z-directed field is provided by a ringshaped electrode called the "Oersted ring" henceforth, shown in yellow in the device schematic.A current "IIN" passing through the Oersted ring of radius "r" produces a magnetic field given by B = 0*IIN/2r.
Fig. 1 (b) shows the sLLG simulation of such a device.The top panels show the magnetization dynamics during the pulsing of the device.The current pulse through the GSHE layer is shown in black color in both the panels.The zcomponent of magnetization (mZ) is shown in blue and red.It can be seen that mZ goes to zero while the current pulse is ON.After the pulse is removed, mZ relaxes to -1 in the first case and it relaxes to +1 in the second, nominally identical case, highlighting the stochastic nature of the process.The time scale of this relaxation is governed by the material parameters of the nanomagnet such as MS, HK and damping.The bottom panel of Fig. 1 (b) shows the average of the magnetization (after the dynamics have settled) in the z-direction (perpendicular easy axis) as a function of the input current, resembling a sigmoidal activation function.
For experimental implementation, starting with a stack of Ta(5nm)/CoFeB(1nm)/MgO(2nm)/Ta(1nm) thin film, a Hall bar device with a PMA magnetic island located at the center is fabricated by means of successive e-beam lithography and Ar ion milling steps.To generate the out-of-plane field for tunability, the "Oersted ring" is fabricated on top and electrically isolated from the Hall bar by a dielectric layer.A false colored SEM image of the fabricated device is shown in the inset of Fig. 1(c).
For the operation of the device, a Keithley 6221 current source is used to provide a current pulse of duration 100 s through the Ta Hall bar.This current pulse experimentally implements the required hard axis biasing scheme as shown in the sLLG simulation of Fig. 1 (b).Although the magnet can respond to much faster pulses, as shown in Fig. 1(b), we chose to use 100 s to be safely within the delay times of the measurement circuit.After the pulsing event, the state of the magnetization is read by a lock-in scheme, with a sinusoidal current provided by the same Keithley current source and an SRS830 lock-in amplifier.The device is pulsed repeatedly, and the state of the magnetization is read after each individual pulse.Fig. 1 (c) shows the average magnetization as a function of the input current "IIN".Each data point is obtained by averaging 25 pulsing events, as shown for three representative cases in the bottom panels.These measurements clearly demonstrate the successful implementation of a device with an electrical input and output, which behaves stochastically for individual events, but produces a sigmoidal curve for the average output.This is the desired characteristic for many probabilistic spin logic applications including hardware BNs.

B. Implementing a two node Bayesian network in hardware
Next, we show how the stochastic devices described in the previous section can be used to implement a two node Bayesian network in hardware.The essential characteristic of a BN is captured in the CPT.Fig. 2 (a) shows the example of a twonode network, with the first or the parent node (m1) representing the packaging material for blocks of cheese in a dairy farm, and the second node (m2) representing the probability of finding a stale cheese block.The values "a" and "b" in the CPT represent the probability of a cheese block being stale if the packaging material is of low quality (m1 = 0) vs. high quality (m1=1).Since the packaging material positively affects the shelf life, in this case, a > b.If instead of packaging material, m1 represents the print design on the package, then the shelf life is not affected by it, and hence, a=b in this case.Similarly, if some other variable, that negatively affects the shelf life is represented by m1, then the CPT would have a < b.Now, for the first case, if the cheese was stored in a cold and dry storage, then the shelf life is increased, irrespective of the packaging material quality.This corresponds to adding a positive value to both "a" and "b" in the CPT.Hence, the variables in the CPT can span the entire space between 0 and 1 independently, depending on the problem being modeled.
We first demonstrate that the CPT between the two probabilistic random variables in our example can be implemented by design of proper electrical connections between two of our stochastic devices (of the type shown in Fig. 1).Then, by testing the circuit with designed parameters, we show that the probability of the output device (m2) follows the probability of finding a stale cheese block, obtained from calculating the joint probability distribution.We also show that the inference about the potential cause of stale cheese that is evaluated by Bayes theorem is well matched to the directly observed values from the joint distribution of the device outputs.The results are also verified by stochastic LLG simulations with magnet parameters (MS, HK and volume) taken to match the sigmoidal activation function obtained from the experiment.
Fig. 2 (a) shows the given CPT that represents the relation between the stochastic variables m1 and m2.This CPT is translated into the parameters J21 and h2 of the PSL model as shown in Fig. 2 (b).This translation can be obtained from the analysis below: The total input, I2, received by the second device is given by: I2 = J21*m1 + h2 (1)   where J21 corresponds to the connection from the first to the second device, m1 corresponds to the state of the first device and h2 corresponds to the constant bias given to the second device.As eq. 1 represents the physical input to node 2 (which is current in our hardware design), m1 has to enter as a bipolar value (+1 for 'UP' state and -1 for 'DN' state).The average state of the second device is given by: where  represents the sigmoidal activation function for device 2. The conditional dependencies can be directly seen from this expression.The probability of m2 being high given m1 high is obtained by evaluating <m2> from equation ( 2) by setting m1 = 1.Since this probability should match the value specified in the given CPT, we obtain: Similarly, 'a' can be obtained by setting m1 = -1 (as bipolar entry corresponding to m1 being 'DN' is -1 instead of 0) in equation ( 2) From equations ( 3) and ( 4), we obtain the values of the PSL parameters J21 and h2, from the given CPT table as follows: The parameters J21 and h2 are then used to design the hardware connection strengths and biases to two stochastic devices, as will be discussed in the following paragraphs.
Fig. 2 (c) shows the schematic of our circuit.The output voltage from the first device is amplified by a LT1677 low noise, rail-to-rail precision Op Amp operating in an open loop configuration.The output level of the Op Amp is determined by its +/-VDD supply voltages.This output is then connected to the Oersted ring of the second device through a weight resistor "Rweight" that determines how much current passes through it, and hence controls the output probability of the second device, corresponding to the J21 term in a BN.Additionally, a voltage source "Vbias" is connected to the input of the second device through a resistor "Rbias" to mimic the fixed bias (h2) in a BN.The values of the circuit parameters VDD, Vbias, Rweight and Rbias are obtained from the required J21 and h2 by the following design analysis: In our circuit as shown in Fig. 2(c), J21 is the magnetic field produced by the Oersted ring of device 2, normalized with the field required to saturate its magnetization in the "up" or "down" state, denoted by B0.This is given by: J21 = ±0VDD/2rB0Rweight (7)   where r is the radius of the Oersted ring, 0 is the permeability of vacuum and the ± sign depends on the connection polarity.Similarly, h2 is the additional magnetic field produced by the constant bias Vbias, normalized to B0. h2 = (0Vbias/2rB0Rbias) Note that h2 contributions due to the remnant magnetic field in the measurement setup have been subtracted out in this analysis for brevity.This additional h2 contribution is just added to the calculated h2 in equation 8.
Next, we show that the same circuit can capture any given CPT, by changing the Rweight and Rbias.In the circuit shown in Fig. 2(c), the total input received by device 2 is given by: For Rweight = ∞, which means J21=0, the coefficient of m1 in eq. 9 vanishes, and so does the correlation between the two devices.For a finite Rweight, the connection polarity dictates the sign of the correlation between the two devices, with a strength inversely proportional to Rweight.Vbias makes the correlation asymmetric as its corresponding term in eqn. 9 does not change sign with the state of m1.Therefore, we can span all possible conditional probabilities between two nodes of a BN (given by 'a' and 'b' in the CPT) by changing the circuit parameters Rweight, polarity and Rbias.

C. Experimental testing of the hardware Bayesian network
We take five different CPTs with "a" and "b" spanning the range between 0 and 1, shown in Fig. 3 (a).We then calculate J21 and h2 for these five cases and design our circuit according to eq. ( 7) and eq.( 8).The designed circuits are then tested by repeating a sequential pulsing scheme.The inset of Fig. 2 (c) shows the timing diagram of the measurement procedure.The two devices are pulsed sequentially by a Keithley 6221 current source that provides the clocking scheme for our devices.During the pulsing of the second device, a constant DC read current is passed through the first device in order to generate the input voltage to the second device.Then, this sequential pulsing is repeated to generate the required statistics.The two devices produce random outputs, but with correlated statistics, as is required by the CPT between the two random variables.The output after each pulse is measured by a lock-in amplifier and then digitized.Representative sections of the device outputs are shown in Fig. 3 (b) for three different connection configuration.It is worth noting that the pulsing method being followed in the presented experiments (shown in the inset of Fig. 2(c)) is analogous to Gibbs sampling 26,27 , which is widely used is used for statistical inference 28,29 .Here, each node of the network is pulsed (sampled) sequentially under the influence of all the other nodes, which are fixed to their current values.
The probability of finding a stale cheese block can be found from the joint probability distribution by using the probability chain rule: = p(m2=1|m1=0)×p(m1=0)+ p(m2=1|m1=1)×p(m1=1) = a×p(m1=0) + b×p(m1=1) (10) where p(m1 = 0 or 1) is an input parameter.The number of terms in the above expression grows as 2 N where N is the number of parent nodes for the particular child node of interest 3 .Instead of performing this algebra, the required probability can be obtained from the circuit by directly observing the stochastic output of device 2 and obtaining its mean value over several pulsing cycles.This luxury of having to observe only the nodes of interest while disregarding all the other nodes is an advantage of using a probabilistic algorithm, versus calculating the probabilities using deterministic rules as discussed by Feynman 30 and utilized in many sampling schemes 26 .
Similarly, given that a randomly drawn cheese block from a large lot is stale, the probability that it was caused by a low quality packaging material can be found by using Bayes theorem: p(m1=0|m2=1) = p(m1=0,m2=1)/ p(m2=1) = p(m2=1|m1=0)×p(m1=0)/p(m2=1) = [a×p(m1=0)] / [a×p(m1=0) + b×p(m1=1)] The number of terms required in the evaluation of the above expression also grows as ~2N where "N" is the number of potential binary causes of a particular effect 3 .However, from the hardware BN, this probability can be directly obtained by observing the joint distribution of states of the two devices.It is to be noted here that this way of performing the inference always involves observing the joint distributions of only two nodes of the BN: nodes corresponding to the effect and the potential cause of interest, irrespective of N.
In our experiment, after 100 pulsing cycles, the obtained output probabilities for all the five circuits (representing the five different CPTs of Fig. 3(a)) is comparable with the expectation from calculating the joint probability distribution and is also verified by stochastic LLG simulations, as shown in Fig. 3 (c).Similarly, the obtained probabilities from inference is comparable with that from Bayes theorem and stochastic LLG simulations, seen in Fig. 3 (d).

III. SIMULATION OF A FOUR NODE BAYESIAN NETWORK
In this section, we present a self-consistently coupled sLLG simulation of the more complicated, four node Bayesian network shown in the top left inset of Fig. 4 (a).Here, the BN consists of four nodes: cloud (C), rain (R), sprinkler (S), and wetness of grass (W).In this case, the evaluation of a node probability from the joint probability distribution requires the following evaluation, for example for the W node: (12)   Here the number of terms to be evaluated in the summation is eight, as each of the C, R and S nodes could take two possible values "0" or "1".Similarly performing inference, for example, what is the probability that it had rained, given that the grass is wet requires the following evaluation: where both the numerator and the denominator of the righthand side of the above equation must be evaluated by summing over the joint probability distribution (, , , ), resulting in the evaluation of four and eight terms respectively.However, by using the hardware, the required node probabilities and the inference can be obtained in exactly the same way as our previous two-node example: we simply observe the stochastic output of the corresponding node for probability assessment; and observe the joint distribution of only the R and the W node to perform the required inference.This is demonstrated in the simulation study below.
The parameters used in the sLLG simulation platform such as the magnet dimensions and the output sigmoidal response are benchmarked with the experimental results from the device in Fig. 1 (c).The coupling and biases are benchmarked with the two node BN network experiments shown in Fig. 2 and 3.
Fig. 4 (a) shows the circuit implementation, where each node is represented by a hardware p-bit as described in Fig. 1.It is to be noted here that an auxiliary p-bit (represented by node 'X') is needed to implement this four node Bayesian network.This is because, the CPT capturing the dependency of node 'W' on node 'R' and 'S' has four conditional probabilities, which can take any value between 0 and 1 independent of each other.Therefore, from basic principles of linear algebra, we need four independent physical parameters to implement this CPT.Two of the four required parameters are provided by the two interconnection weights (JWR and JWS) and another parameter is provided by the bias to the node 'W' (hW).The remaining one parameter is provided by the interconnection to the auxiliary node 'X'.The requirement of auxiliary nodes in designing Bayesian networks from p-bits is described in more detail by Faria et al. 18 .The dynamics of the PMA magnet used in the hardware p-bit design is captured by solving the sLLG equation with a monodomain macrospin assumption: where,  ⃗ ⃗ is the total internal and external field along with thermal noise field,   ⃗⃗ is the spin current,   =    is the total magnetic moment with   being the saturation magnetization,  is the damping coefficient,  is the gyromagnetic ratio.), where  is the Oersted field generated from the current coil and  0 is a fitting parameter.
For the system simulation, we start with chosen CPTs for each of the nodes.These are shown as the inputs next to the respective nodes in Fig. 4 (c).These values are then translated into coupling term   and bias term ℎ  by following similar principles as in deriving eq. ( 5) and (6).The derivation for   and ℎ  for an n-node Bayesian network is provided by Faria et al. 18 .The dimensionless terms   and ℎ  are then translated to corresponding Oersted fields to each p-bit by a relation: The coupling and bias component of   can be realized through the coupling resistance  ℎ and   respectively with a mapping principle as described in eq. ( 7) and ( 8) for the two node case.
While solving the coupled sLLG, each p-bit is put along the hard axis by the GSHE current in a sequential order from parent to child node and the magnetizations of all p-bits are recorded after their corresponding pulse is turned off.It is worth noting that the pulse sequence is important for the proper operation of the Bayesian network.The pulsing should start from the first node and move down the hierarchy from parent to corresponding child nodes.The order of pulsing among different nodes on the same hierarchy level (e.g.node R and S in our example) is not critical.Taking these principles into account, the pulsing order for one cycle is shown in Fig. 4 (b).This cycle is repeated several times to generate the probabilities of each of the four nodes.Fig. 4 (c) shows representative data of magnetization of each node for 50 pulses.From this distribution of the magnetization state of each node in 'UP' vs. 'DN' state, probabilities of each node are calculated.For example, the magnetization of the p-bit corresponding to 'sprinkler' node shows more occurrences in the 'DN' state compared to 'UP' state, resulting in a low probability of sprinkler being ON (P(S) ~ 0.25 in this case).Similarly, the probability of 'rain': P(R) and the probability of 'grass being wet': P(W) are obtained from the magnetization state distribution.The obtained probabilities are compared with those obtained by calculating the joint probability distribution as shown in the output tables alongside each of the four nodes in Fig. 4 (c).It can be seen that the probabilities obtained from the coupled sLLG result match well with the simple PSL behavioral model and with the values obtained from the evaluation of equation (12).Similarly, the probability of rain, given that the grass is wet ((|)) is obtained from the coupled sLLG result is which is well matched with the value of 0.75 obtained from equation (13).It to be noted that the accuracy in this depends on the number of samples taken to calculate the probabilities.

IV. CIRCUIT IMPLICATIONS AND IMPROVEMENTS
Previously proposed hardware implementations of Bayesian networks have used CMOS based pseudo random number generators realized with XOR-SHIFT circuits 6 or RAM-based Linear Feedback Gaussian Random Number Generators 4,5 that require a large area footprint.What we have demonstrated here is a compact true random number generator (TRNG) capable of operating at few hundreds of MHz.The speed of the device is only limited by the time required for SOT hard axis initialization and magnetization relaxation after removal of SOT, which normally requires a few nanoseconds as shown in the sLLG time plot panels of Fig. 1 (b).Compared to previously demonstrated spin based TRNG [33][34][35] , this implementation employs a different scheme to generate random numbers.In our approach, any applied current that is larger than that required for hard axis initialization of the magnet will result in the generation of a random number with the correct statistic once the current pulse is removed 17,22 .Hence, in a large network, the device to device variation in the required current can be easily mitigated by choosing the largest value of the required current among all devices.Possible variations in the shape and offset of the sigmoidal activation function of our devices can be controlled by appropriately choosing the parameters B0 and h2 while translating the given CPTs into the connection weights, shown in equations ( 5)- (8).Also note that the Bayesian network proposed here does not require analog voltage sources or CMOS MUX to realize the CPT as proposed previously by Shim et al 22 .Using current controlled tunability of the device and auxiliary nodes, any CPT can be realized by using only p-bits, one voltage level (VDD) and analog memristive elements for interconnections and individual biases similar to RRAM based neural networks.Such programmable analog memristive elements have been successfully demonstrated recently 36,37 .The energy requirement of the device demonstrated here can be improved by using the voltage-controlled magnetism (VCM) effect for hard-axis initialization as proposed by Scott et al. 38 in their benchmarking study (section IV of the main text).In addition, employing magnetic tunnel junctions (MTJs) instead of AHE can eliminate the need for OP-AMPs for readout.The typical difference in the two stable resistive states of an MTJ is of the order of 10 K, whereas in case of AHE, it is a few ohms for standard material stacks.This allows the elimination of the OP-AMPs for readout.Implementations of an MTJ based readout scheme, where the state of the free layer magnet is converted to a voltage by a potential divider formed by the MTJ and a normal resistor was presented by Camsari et al. 12 (figure 3 of the main text) and Hassan et al. 39 (figure 4 of the main text).In these references, the voltage swing generated at the output is large enough to be converted to a "rail-to-rail" swing by a single inverter.In the above references, the MTJ free layer was designed to be a low barrier magnet, but the analysis of the output swing remains unchanged for our hard axis initialization scheme with stable magnets.

V. CONCLUSION
We have experimentally demonstrated that by connecting two stochastic spintronic devices and designing the connection and bias parameters, BN building blocks can be implemented in hardware.By implementing BNs using such hardware, both probability assessment and inference can be performed by sampling the output of only the relevant nodes.Using experimentally benchmarked sLLG simulations, we have shown that a four node BN implemented in hardware using the presented stochastic devices can generate probabilities that are well matched to the theoretical values from calculating the joint probability distribution.This demonstration serves as a step towards building large scale hardware systems for implementing Bayesian networks.

Fig. 1 .
Fig. 1.Hardware building block of Bayesian Networks.(a) Schematic of the probabilistic device and illustration of the hard axis initialization by spin orbit torque.(b) Stochastic LLG simulation of 500 ensembles, showing tunable random behavior of the device.The two top panels show representative cases where the magnetization relaxes to the "up" and "down" direction after being released from the hard axis.(c) Experimental measurements on the device showing stochastic behavior with tunability using a charge current through an isolated Oersted ring.The bottom panels show the stochastic outputs, whose averages show the sigmoidal behavior as a function of the input current.

Fig. 2 .
Fig. 2. Hardware design of a two-node network.(a) The given conditional probability table (CPT) representing the causal dependency of two probabilistic variables, i.e., the quality of packaging and state of cheese (b) PSL model of the two node BN with the CPT parameters translated to PSL parameters (c) Circuit schematic of two connected devices to implement two coupled Bayesian nodes.Inset on the top left shows the timing diagram of various operations performed on device 1 and 2.

Fig. 3 .
Fig. 3. Testing of the two node BN circuit.(a) Five different combinations of the CPT parameters that are experimentally implemented in hardware.(b) Representative sections of the measured data for positive, negative and no connection between device 1 and device 2 as shown in Fig 2(c).(c) Obtained output probabilities of cheese being stale for the five different given CPTs.The experimentally obtained probability values are in good agreement with theory and stochastic LLG simulations.(d) Inference about probability of the packaging being bad quality given that a stale cheese is found is plotted for the different CPTs, showing good match between direct experimental observation, Bayes theorem and stochastic LLG simulations.

Fig. 4 .
Fig. 4. Simulation results of a four node BN.(a) Hardware implementation layout (b) Representative one clock cycle of operation (c) Results obtained from the four node BN with the given CPTs shown as the input tables.Probabilities for node, generated after 500 clock cycles are shown inside the blue boxes.Representative sections of the state of each node after 50 pulses is shown next to them.obtained probabilities show good agreement with expectation from calculating the joint probability distribution.