Mosaic: in-memory computing and routing for small-world spike-based neuromorphic systems

The brain’s connectivity is locally dense and globally sparse, forming a small-world graph—a principle prevalent in the evolution of various species, suggesting a universal solution for efficient information routing. However, current artificial neural network circuit architectures do not fully embrace small-world neural network models. Here, we present the neuromorphic Mosaic: a non-von Neumann systolic architecture employing distributed memristors for in-memory computing and in-memory routing, efficiently implementing small-world graph topologies for Spiking Neural Networks (SNNs). We have designed, fabricated, and experimentally demonstrated the Mosaic’s building blocks, using memristors integrated in 130 nm CMOS technology. We show that, by enforcing locality in the connectivity, the routing efficiency of Mosaic is at least one order of magnitude higher than that of other SNN hardware platforms, while Mosaic still achieves competitive accuracy on a variety of edge benchmarks. Mosaic offers a scalable approach for edge systems based on distributed spike-based computing and in-memory routing.

As n (the number of nodes, e.g., neurons, in the graph) increases and k (the number of neighbouring nodes each node is connected to in a ring topology) decreases, more entries of the connectivity matrix become zero, indicating a growing proportion of unused memory elements in an n × n crossbar array.
Figure S1 quantifies the under-utilization of conventional crossbar arrays when storing example small-world connectivity patterns generated by two standard random graph generation models: Watts-Strogatz small-world graphs [1] and Newman-Watts-Strogatz small-world graphs [2]. The first type of graph is characterized by a high degree of local clustering with short vertex-vertex distances, as observed in neural networks and self-organizing systems, whereas the latter type mostly captures the properties of the lattices studied in statistical physics.
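The under-utilization quantified in Fig. S1 can be reproduced numerically. Below is a minimal self-contained sketch (the helper names and the rewiring routine are our own, not the paper's code): a Watts-Strogatz-style ring lattice with probabilistic rewiring, and the fraction of zero cells a dense n × n crossbar would waste on its connectivity matrix.

```python
# Sketch of crossbar under-utilization for a small-world graph.
import random

def watts_strogatz_adjacency(n, k, p, seed=0):
    """Neighbour sets of an n-node Watts-Strogatz-style graph:
    ring lattice with k neighbours per node, edges rewired with prob p."""
    rng = random.Random(seed)
    nbrs = [set() for _ in range(n)]
    for i in range(n):                       # ring lattice: k/2 per side
        for j in range(1, k // 2 + 1):
            a, b = i, (i + j) % n
            nbrs[a].add(b)
            nbrs[b].add(a)
    for i in range(n):                       # rewire each edge with prob p
        for j in list(nbrs[i]):
            if j > i and rng.random() < p:
                new = rng.randrange(n)
                if new != i and new not in nbrs[i]:  # no self-loops/duplicates
                    nbrs[i].discard(j)
                    nbrs[j].discard(i)
                    nbrs[i].add(new)
                    nbrs[new].add(i)
    return nbrs

def crossbar_zero_fraction(n, k, p=0.1):
    """Fraction of zero (unused) cells when the graph's n x n connectivity
    matrix is stored in a dense crossbar array."""
    ones = sum(len(s) for s in watts_strogatz_adjacency(n, k, p))
    return 1.0 - ones / (n * n)
```

Since rewiring preserves the edge count, the zero fraction is 1 − k/n (e.g., 0.9375 for n = 64, k = 4) and approaches 1 as n grows with k fixed, which is exactly the trend shown in the Fig. S1 heatmaps.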
To communicate events between the computing nodes of neuromorphic chips, the Address-Event Representation (AER) communication scheme has been developed and used [3]. In AER, whenever a spiking neuron in a chip (or module) generates a spike, its "address" (or any given ID) is written on a high-speed digital bus and sent to the receiving neuron(s) in one (or more) receiver module(s). In general, AER processing modules require at least one AER input port and one AER output port. As neuromorphic systems scale up in size, complexity, and functionality, researchers have been developing more complex and smarter AER "variations" to maintain the efficiency, reconfigurability, and reliability of the ever-growing target systems they want to build. The scheme used to transport events can be source-based or destination-based, with the source or destination address embedded in the transmitted event "packet". In the source-based scheme, each receiving neuron has a local Content Addressable Memory (CAM) that stores the addresses of all the neurons connected to it. In the destination-based approach, each event hops between the nodes, where its address is compared to each node's address until it matches and the event is delivered. Source-driven routing gives the designer more freedom to balance event traffic and design routes, but the hardware complexity increases the delays. Destination-based routing creates pre-determined routes along the network, and the designer can only change the output ports [4]. In summary, source-based routing requires a CAM per neuron, which increases the area and the memory read-access times, while destination-based routing reduces the configurability of the network structure.

Comparatively, in the Mosaic, the routers are memory crossbars that are distributed between the computing cores and steer the spiking information in the mesh. Routing tiles define the connectivity of spiking neural networks implemented on Mosaic. When the number of memristive devices in the routing tiles that are in their high-conductive state (HCS) is large, Mosaic resembles a densely connected neural network (Fig. S2, top left). When most of the memristors in the routing tiles are in their low-conductive state (LCS), Mosaic is sparsely connected (Fig. S2, top right). Furthermore, one can further sparsify Mosaic networks by setting memristors in the neuron tiles to the LCS. To do so, we can change the probability of memristors being in their HCS in the neuron tiles, p_n, and in the routing tiles, p_r. The switching of Resistive Random Access Memories (RRAMs) is probabilistic, with a switching probability that depends on the voltage applied during the programming operation, as shown in Fig. S2, bottom.
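Returning to the two AER schemes discussed above, their trade-off can be made concrete with a toy model (all class and function names below are illustrative, not part of any AER standard or of the Mosaic hardware): source-based routing pays a per-neuron CAM lookup, while destination-based routing pays hops along a pre-determined route.

```python
# Toy comparison of source-based vs destination-based AER delivery.

class SourceRoutedNeuron:
    """Source-based AER: each receiving neuron holds a CAM of the
    addresses of its presynaptic neurons."""
    def __init__(self, cam):
        self.cam = set(cam)          # CAM contents: presynaptic addresses
        self.received = []

    def on_event(self, src_addr):
        if src_addr in self.cam:     # CAM match -> deliver the spike
            self.received.append(src_addr)

def destination_route(packet_dst, route):
    """Destination-based AER: the packet carries the destination address
    and hops along a pre-determined route until the address matches.
    Returns the hop count on delivery, or None if never matched."""
    hops = 0
    for node_addr in route:
        hops += 1
        if node_addr == packet_dst:
            return hops
    return None
```

For example, a neuron built with `SourceRoutedNeuron(cam=[3, 7])` accepts events from sources 3 and 7 and silently drops all others, while `destination_route(5, [2, 4, 5, 9])` delivers after three hops; the CAM cost scales with fan-in per neuron, the hop cost with route length.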
Fig. S3 shows the construction of two graph topologies, made of 2 Neuron Tiles and one Routing Tile, to clarify how the graphical structure of the Mosaic is formed. By controlling the probability of connections within the Neuron and Routing Tiles, we can produce a densely connected graph (left) with p_NT = 0.75, p_RT = 0.6, and a sparse graph (right) with p_NT = 0.30, p_RT = 0.05.
The corresponding connectivity matrix is also shown in the figure; it is directly represented in hardware across the three tiles of the Mosaic.
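As a sketch of how such graphs arise (the sigmoidal p(V) curve and all parameter values below are assumptions for illustration; the actual switching curve is the measured one in Fig. S2, bottom), each device can be programmed into the HCS as an independent Bernoulli trial whose probability is set through the programming voltage, and the resulting tiles define a block-structured adjacency matrix:

```python
# Illustrative reconstruction of the Fig. S3 setup: two 4-neuron Neuron
# Tiles with intra-tile connection probability p_nt and inter-tile
# (Routing-Tile-mediated) connection probability p_rt.
import math
import random

def switch_prob(v, v50=1.0, slope=10.0):
    """Assumed sigmoidal switching curve: probability that a device ends
    in the HCS when programmed at voltage v; v50 is the 50% voltage."""
    return 1.0 / (1.0 + math.exp(-slope * (v - v50)))

def mosaic_adjacency(p_nt, p_rt, neurons_per_tile=4, seed=1):
    """8 x 8 binary adjacency over two Neuron Tiles: intra-tile entries
    drawn with probability p_nt, inter-tile entries with p_rt."""
    n = 2 * neurons_per_tile
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same_tile = (i // neurons_per_tile) == (j // neurons_per_tile)
            p = p_nt if same_tile else p_rt
            adj[i][j] = 1 if rng.random() < p else 0
    return adj

dense  = mosaic_adjacency(p_nt=0.75, p_rt=0.60)   # one well-mixed graph
sparse = mosaic_adjacency(p_nt=0.30, p_rt=0.05)   # two clusters emerge
```

With p_NT = 0.30 and p_RT = 0.05, inter-tile entries are rare, so the adjacency matrix splits into two diagonal blocks and the two clusters of Fig. S3 (right) emerge.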
Figure S4 shows the details of the Mosaic architecture, with a zoomed-in neuron and routing tile pair. The diagram at the top shows how one neuron/routing tile sends and receives information to and from a neighbouring routing/neuron tile. This highlights a strength of the architecture: connectivity is achieved through simple wiring to the neighbour, without suffering from long wires, since the maximum wire length is the length of one row/column plus the length of the connecting column/row.

Supplementary Note 5

RRAMs are used as the weights of the neurons. On the arrival of any of the input events V_in<i>, the amplifier pins node V_x to V_top, so a read voltage equal to V_top − V_bot is applied across G_i, giving rise to current i_in at M1. This current is mirrored to M2, producing i_buff, which is in turn mirrored through the M3-M4 transistor pair. The "synaptic dynamics" circuit is the Differential Pair Integrator (DPI) [5]. On the arrival of any of the input events V_i, 0 < i < n, a current I_w, equal to i_buff, flows through transistor M5.
Depending on the value on V_g, a portion of I_w flows out of the MOS capacitor M6 and discharges it. This current is proportional to G_i, 0 < i < n. As soon as the event is gone, MOS capacitor M6 charges back through the M8 path with current I_tau, which determines the rate of charging and thus the time constant of the synaptic dynamics. The output current of the DPI synapse, I_syn, is injected into the neuron's membrane potential node, V_mem, and charges MOS capacitor M13. There is also an alternative path with a DC current input through M17 that can charge the neuron's membrane potential. Membrane-potential charging has a time constant determined by V_lk at the gate of M11. As soon as the voltage developed on V_mem passes the threshold of the following inverter stage, it generates a pulse. The width of the pulse depends on the delay of the feedback path from V_out to the gate of M12. This delay is determined by the inverter delays and the refractory time constant. The inverter symbols with horizontal dashed lines correspond to starved inverter circuits with longer delays. The refractory-period time constant depends on the MOS capacitor M16 and the bias V_rp.

Details of the implementation of the neuron row, the circuit that uses the conductance of a memristor to weight the effect of a spike on a neuron, are shown in Figure S5. The circuit features multiple inputs connected to a row of memristive devices (left) and a front-end circuit buffering the current read from the devices into a differential-pair-integrator synapse. The synapse is then connected to a leaky integrate-and-fire (LIF) neuron, which eventually emits a spike. Figure S6 delves deeper into the behavior of the LIF neuron, analyzing its output spiking frequency against an input DC voltage and its linear behavior with respect to the RRAM conductance in a neuron-row circuit.

The bandwidth (BW) requirement of the neuron tile is 7.5/0.5 = 15 times that of the routing tile. The BW requirements directly translate to the biasing of the amplifier and thus its power consumption. Therefore, the static power consumption of the neuron tile is 15 times that of the routing tile. The current requirements also translate to area, since larger currents require wider transistors.
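The synapse-plus-neuron signal chain described above can be mimicked with a simple discrete-time model (a behavioural sketch under assumed parameter values, not the fabricated subthreshold-analog circuit): an exponentially decaying synaptic current, playing the role of the DPI, drives a leaky integrate-and-fire neuron with a refractory period.

```python
# Behavioural sketch of the neuron row: DPI-like synapse + LIF neuron.
def simulate_lif(spike_times, w=1.0, tau_syn=5e-3, tau_mem=20e-3,
                 v_th=0.05, t_refr=2e-3, dt=1e-4, t_stop=0.2):
    """Return the output spike times (seconds) for the given input train.
    All parameters are illustrative assumptions."""
    i_syn, v_mem, refr = 0.0, 0.0, 0.0
    out = []
    in_steps = {round(ts / dt) for ts in spike_times}
    for step in range(int(round(t_stop / dt))):
        if step in in_steps:
            i_syn += w                       # input event loads the synapse (I_w)
        i_syn -= (dt / tau_syn) * i_syn      # synaptic decay, rate set by I_tau
        if refr > 0:                         # refractory period (V_rp, M16)
            refr -= dt
            continue
        v_mem += dt * (i_syn - v_mem / tau_mem)  # leaky integration (V_lk, M11)
        if v_mem >= v_th:                    # inverter threshold crossed
            out.append(step * dt)
            v_mem = 0.0                      # reset via the feedback pulse
            refr = t_refr
    return out
```

For example, a regular 1 kHz input train, `simulate_lif([k * 1e-3 for k in range(200)])`, produces a sustained output spike train, and strengthening the membrane leak (shrinking tau_mem) lowers the output rate, mirroring the trend measured in Fig. S6.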

Figure S1: The heatmaps show the ratio of zero elements to non-zero elements in the connectivity matrix for two examples of recurrently connected small-world graph generators. As n (the number of nodes, e.g., neurons, in the graph) increases and k (the number of neighbouring nodes each node is connected to in a ring topology) decreases, more entries of the connectivity matrix become zero, indicating a growing proportion of unused memory elements in an n × n crossbar array.

Figure S2: (top) Different random graphs generated using the Mosaic model, obtained by changing the probability of devices being in their high-conductive state in the neuron tiles (p_n) and routing tiles (p_r). (bottom) The probability of device switching as a function of the voltage applied during programming.

Figure S3: Mosaic connectivity example, formed by setting the probability of connection within the Neuron Tiles (p_NT) and Routing Tiles (p_RT). (left) Densely connected Mosaic composed of 2 Neuron Tiles and 1 Routing Tile. The corresponding graph and adjacency matrix are shown as well. (right) Sparsely connected Mosaic. The graph is programmed to favor intra-Neuron-Tile connectivity and allow two clusters to emerge, penalizing connections between the two clusters.

Figure S4: Neuron tiles (green) transfer information in the form of spikes to each other through routing tiles (blue). Details of the Mosaic architecture are shown with the sizes of the neuron and routing tiles. The neuron tiles receive feed-forward input from the four directions North (N), East (E), West (W), and South (S), and local recurrent input from the neurons in the tile. The neurons integrate the information and, once they spike, send their output in 4 directions. Having 4 neurons in a tile gives rise to 16 outputs (4 outputs copied in 4 directions) and 20 inputs (4 inputs from 4 directions (16), plus 4 recurrent inputs). The routing tiles receive 16 inputs (4 inputs from 4 directions) and send out 16 outputs (4 outputs in 4 directions). In the crossbars, the red and black squares represent devices in their high-conductive and low-conductive states, respectively. The connection between the neuron tile and the routing tile is directly through a wire. For instance, V_out<3:0> is the same as V_in,W, and V_in,E<3:0> is the same as V_out,W.

Figure S5: Schematic of the neuron tile, including the CMOS synapse and neuron circuits fabricated for use in this paper. RRAMs are used as the weights of the neurons. On the arrival of any of the input events V_in<i>, the amplifier pins node V_x to V_top, so a read voltage equal to V_top − V_bot is applied across G_i, giving rise to current i_in at M1. This current is mirrored to M2, producing i_buff, which is in turn mirrored through the M3-M4 transistor pair. The "synaptic dynamics" circuit is the Differential Pair Integrator (DPI) [5]. On the arrival of any of the input events V_i, 0 < i < n, a current I_w, equal to i_buff, flows through transistor M5. Depending on the value on V_g, a portion of I_w flows out of the MOS capacitor M6 and discharges it. This current is proportional to G_i, 0 < i < n. As soon as the event is gone, MOS capacitor M6 charges back through the M8 path with current I_tau, which determines the rate of charging and thus the time constant of the synaptic dynamics. The output current of the DPI synapse, I_syn, is injected into the neuron's membrane potential node, V_mem, and charges MOS capacitor M13. There is also an alternative path with a DC current input through M17 that can charge the neuron's membrane potential. Membrane-potential charging has a time constant determined by V_lk at the gate of M11. As soon as the voltage developed on V_mem passes the threshold of the following inverter stage, it generates a pulse. The width of the pulse depends on the delay of the feedback path from V_out to the gate of M12. This delay is determined by the inverter delays and the refractory time constant. The inverter symbols with horizontal dashed lines correspond to starved inverter circuits with longer delays. The refractory-period time constant depends on the MOS capacitor M16 and the bias V_rp.

Figure S6: Measurements of the fabricated neuron's output frequency as a function of the input DC voltage. The DC voltage is applied at the gate of transistor M17, shown in Fig. S5 as V_dc. Therefore, as the gate voltage of M17 changes linearly, the current of M17, and thus the output frequency of the neuron, changes non-linearly. Each curve is measured with a different neuron time constant, set by a different voltage V_lk on the gate of transistor M11 in Fig. S5. As the leak voltage increases, the neuron's time constant decreases, giving rise to a lower output frequency.