A memristive deep belief neural network based on silicon synapses

Abstract

Memristor-based neuromorphic computing could overcome the limitations of traditional von Neumann computing architectures—in which data are shuffled between separate memory and processing units—and improve the performance of deep neural networks. However, this will require accurate synaptic-like device performance, and memristors typically suffer from poor yield and a limited number of reliable conductance states. Here we report floating-gate memristive synaptic devices that are fabricated in a commercial complementary metal–oxide–semiconductor process. These silicon synapses offer analogue tunability, high endurance, long retention time, predictable cycling degradation, moderate device-to-device variation and high yield. They also provide two orders of magnitude higher energy efficiency for multiply–accumulate operations than graphics processing units. We use two 12 × 8 arrays of memristive devices for the in situ training of a 19 × 8 memristive restricted Boltzmann machine for pattern recognition via a gradient descent algorithm based on contrastive divergence. We then create a memristive deep belief neural network consisting of three memristive restricted Boltzmann machines. We test this system using the modified National Institute of Standards and Technology dataset, demonstrating a recognition accuracy of up to 97.05%.
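The multiply–accumulate operation mentioned above maps onto a memristive array through Ohm's law and Kirchhoff's current law: each device contributes a current equal to its conductance times the applied voltage, and the currents on a shared column add up. The following is a minimal sketch of that idea for an ideal array (wire resistance and device nonlinearity are ignored); the function name and all numerical values are illustrative assumptions, not values from the paper.

```python
def crossbar_mac(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij."""
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("one voltage per row is required")
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Illustrative values only (assumed, not measured): 3 rows x 2 columns.
G = [[1e-6, 2e-6],
     [3e-6, 0.0],
     [0.5e-6, 1e-6]]   # conductances in siemens
V = [2.0, 0.0, 2.0]    # row voltages in volts; 0 V encodes an inactive input

currents = crossbar_mac(V, G)  # one summed current per column, in amperes
```

Each column current is a complete dot product, so the array performs one vector-matrix multiplication per read step without moving the weights out of memory.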

Fig. 1: Memristive synapses based on two-terminal floating-gate devices.
Fig. 2: Memristive synaptic array and device-to-device variations.
Fig. 3: Setup of the test system for array operations and memristive RBM demonstration.
Fig. 4: Demonstration of RBM training using two memristive synaptic chips.
Fig. 5: Training of a DBN based on silicon synapses for the MNIST dataset.

Data availability

The data that support the plots within this paper and other findings of this study are available as Source data.

Code availability

The code that supports the device modelling and neural network simulations in this study is provided as Supplementary Source Code and is also available via GitHub at https://github.com/wangweifcc/memristive_dbn_yflash.

Acknowledgements

This work was supported by the European Research Council through the European Union’s Horizon 2020 Research and Innovation Programme under grant 757259 and FET-Open NeuChip project under grant agreement no. 964877. W.W. was supported in part at Technion by the Aly Kaufman Fellowship. W.W. acknowledges help from D. Dutta on the PCB and FPGA code development.

Author information

Contributions

W.W. conceived the concept of the Y-Flash memristor-based RBM and DBN. L.D. designed the memristive chip and led the tapeout to fabrication, including array layout and readout cells, and developed the device-level operation schemes. Y.R. and E.P. suggested the Y-Flash memristor cell and performed the initial verification. W.W. conducted the device characterization and the array-level operation schemes with the assistance of L.D., E.H. and B.H. W.W. conducted the neural network demonstration and simulations with the assistance of L.D. and Y.L. Y.R., E.P. and Z.W. helped with the illustration of the results. All the authors discussed the results and contributed to the preparation of the manuscript. S.K. supervised the research.

Corresponding authors

Correspondence to Wei Wang or Shahar Kvatinsky.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Electronics thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Device structure and band diagrams in different operational modes.

(a) Schematic illustration of the device structure in cross-sectional view. The injection transistor (shorter channel and higher threshold voltage) and the read transistor (longer channel and lower threshold voltage) are connected in parallel: they share a common drain and a common floating gate, and their sources are externally shorted. The capacitance between the floating gate and the drain (\(C_{GD}\)) is much larger than the other capacitances, so the floating-gate voltage is mainly coupled to the drain. (b) The equivalent circuit symbol of the device. (c) Band diagrams of the read transistor in the reading mode. A low drain voltage induces a low floating-gate voltage and keeps the transistor off. As the drain voltage increases, the floating-gate voltage increases and the transistor gradually turns on. The injection transistor behaves similarly but carries a lower current because its threshold voltage is higher. (d) The band diagram of the injection transistor in the depression (program) mode. When a higher voltage is applied to the drain and the source is grounded, the electrons in the channel are accelerated by the electric field in the drain region, and the lucky ones are injected into the floating gate, that is, channel hot electron injection (CHEI). (e) The band diagram of the injection transistor in the potentiation (erase) mode. When a higher voltage is applied to the source and the drain is grounded or floating, the source p-n junction is reverse biased. This induces band-to-band (B2B) hole generation in the source region. The high lateral electric field accelerates the generated holes, and the lucky ones are injected into the floating gate. The hot electron/hole effects are negligible in the read transistor.
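The statement in the caption above that the floating-gate voltage follows the drain can be made explicit with the standard charge-balance expression for a floating gate. This is a textbook relation, not one quoted from the paper; the symbols \(C_{GS}\), \(C_{GB}\), \(C_T\), \(V_S\), \(V_B\) and \(Q_{FG}\) (gate-to-source and gate-to-body capacitances, total capacitance, source and body voltages, stored floating-gate charge) are introduced here for illustration.

```latex
\[
V_{FG} = \frac{C_{GD} V_D + C_{GS} V_S + C_{GB} V_B + Q_{FG}}{C_T},
\qquad
C_T = C_{GD} + C_{GS} + C_{GB}
\]
```

When \(C_{GD}\) dominates \(C_T\), this reduces to \(V_{FG} \approx V_D + Q_{FG}/C_T\): the floating-gate potential tracks the drain voltage and is shifted by the stored charge, which is what sets the conductance state.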

Extended Data Fig. 2 Depression and potentiation of the devices by continuous program or erase pulses.

(a) Read mode of the device by applying a voltage pulse (VR = 2 V for pulse reading) on the D terminal and sensing the current (IR). (b) Depression (program) mode of the device operation by applying a voltage pulse (VP = 5 V) on the D terminal and grounding the S terminal. (c) Potentiation (erase) mode of the device operation by applying a voltage pulse (VE = 8 V) on the S terminal and grounding, or floating, the D terminal. The floating D terminal will be capacitively coupled to the grounded substrate. (d) Schematic of the pulse depression/program test by alternately applying the program pulse and reading the device by the read pulse. (e) The device conductance (G = IR/VR) as a function of the pulse number when depressing the device by programming pulses with a width of 10 μs. (f) Schematic of the pulse potentiation/erase test by alternately applying the erase pulse and reading the device by the read pulse. (g) The device conductance as a function of the pulse number when potentiating the device by erasing pulses with a width of 10 μs.
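The alternating program/read sequence and the conductance definition G = IR/VR described in this caption can be sketched as follows. The `ToyDevice` class is a hypothetical stand-in whose conductance shrinks by a fixed factor per program pulse; it is only there to make the loop runnable and is not the device model used in the paper. Only the 2 V read voltage is taken from the caption.

```python
V_READ = 2.0  # volts, read pulse amplitude quoted in the caption


def conductance(read_current, read_voltage=V_READ):
    """G = I_R / V_R, as defined in panel (e)."""
    return read_current / read_voltage


class ToyDevice:
    """Hypothetical device: each program pulse scales G by a fixed factor.

    This is an assumed placeholder, NOT the paper's compact model.
    """

    def __init__(self, g_initial=1e-6, decay=0.9):
        self.g = g_initial      # conductance in siemens (assumed value)
        self.decay = decay      # per-pulse scaling factor (assumed value)

    def program_pulse(self):
        self.g *= self.decay

    def read_current(self, read_voltage=V_READ):
        return self.g * read_voltage


def depression_test(device, n_pulses):
    """Alternate program and read pulses; return G after each pulse."""
    trace = []
    for _ in range(n_pulses):
        device.program_pulse()
        trace.append(conductance(device.read_current()))
    return trace


trace = depression_test(ToyDevice(), n_pulses=5)
```

A potentiation test would follow the same pattern with an erase pulse in place of the program pulse.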

Extended Data Fig. 3 Depression, potentiation, and continuous readout operations of the devices.

(a) Depression (program) of the device by 200 μs pulses. (b) Potentiation (erase) of the device by 100 μs pulses. (c) The conductance of the device as a function of time after each program pulse by continuously reading the state of the device for 20 seconds. (d) The conductance of the device as a function of time after each erase pulse by continuously reading the state of the device for 20 seconds.

Extended Data Fig. 4 Cycling degradation and its modeling.

(a) Continuous depression/programming of one device from the high conductance state (HCS) to the low conductance state (LCS) for 400 test cycles. (b) Continuous potentiation/erasing of one device from the LCS to the HCS for 400 test cycles. (c) The program time (pulse number multiplied by the pulse width) needed for programming the device from HCS to LCS as a function of the cycling number. Both the experimental data and simulation results are presented in the figure. (d) The erase time needed for erasing the device from LCS to HCS as a function of the cycling number.
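The program-time metric in panel (c) is simply the pulse count multiplied by the pulse width. A small sketch of that arithmetic, using made-up pulse counts and an assumed pulse width purely for illustration:

```python
def total_pulse_time(pulse_count, pulse_width_s):
    """Total program (or erase) time = number of pulses x pulse width."""
    return pulse_count * pulse_width_s


# Hypothetical pulse counts for three successive cycles (not measured data).
pulses_per_cycle = [40, 42, 45]
PULSE_WIDTH_S = 200e-6  # assumed pulse width in seconds, for illustration

program_times = [total_pulse_time(n, PULSE_WIDTH_S) for n in pulses_per_cycle]
```

Plotting `program_times` against the cycle index gives the kind of degradation curve the caption describes.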

Extended Data Fig. 5 Setup of the testing system.

The schematic setup of the test system for array operations and memristive RBM demonstration.

Extended Data Fig. 6 Modelling of device-to-device variations.

(a) Simulated device-to-device variations of the depression/program behavior (red line: a typical depression curve; gray lines: all depression curves of other devices). (b) Simulated device-to-device variations of the potentiation/erase behavior. (c) Statistical result of total program times in different devices compared with simulation results. (d) Statistical result of total erase times in different devices compared with simulation results.

Extended Data Fig. 7 Flow chart of the online training of RBM including test algorithm after each training epoch.

The VMM and weight updates are performed in the memristive array using the testing system. The stochastic sampling, as well as the calculation and accumulation of the contrastive divergence (CD), are performed in software.
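The division of labour described here (VMMs in the array, sampling and CD accumulation in software) can be illustrated with a plain-Python sketch of one-step contrastive divergence (CD-1). Several things are assumptions made for brevity: biases are omitted, the VMM functions are ordinary software loops standing in for the array, the weight update is a floating-point addition rather than a sequence of program/erase pulses, and the training patterns are random toy data. Only the 19 × 8 size comes from the abstract.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward_vmm(v, W):
    """Visible -> hidden VMM (performed in the memristive array in the paper)."""
    return [sum(v[i] * W[i][j] for i in range(len(v))) for j in range(len(W[0]))]


def backward_vmm(h, W):
    """Hidden -> visible VMM using the transposed weights."""
    return [sum(h[j] * W[i][j] for j in range(len(h))) for i in range(len(W))]


def sample(activations, rng):
    """Software stochastic sampling of binary units."""
    return [1 if rng.random() < sigmoid(a) else 0 for a in activations]


def cd1_gradient(v0, W, rng):
    """One CD-1 estimate for a single pattern: v0*h0^T - v1*h1^T."""
    h0 = sample(forward_vmm(v0, W), rng)
    v1 = sample(backward_vmm(h0, W), rng)
    h1 = sample(forward_vmm(v1, W), rng)
    return [[v0[i] * h0[j] - v1[i] * h1[j] for j in range(len(h0))]
            for i in range(len(v0))]


def train_epoch(patterns, W, lr, rng):
    """Accumulate CD over all patterns, then apply one weight update."""
    n_v, n_h = len(W), len(W[0])
    acc = [[0.0] * n_h for _ in range(n_v)]
    for v0 in patterns:
        g = cd1_gradient(v0, W, rng)
        for i in range(n_v):
            for j in range(n_h):
                acc[i][j] += g[i][j]
    for i in range(n_v):
        for j in range(n_h):
            W[i][j] += lr * acc[i][j] / len(patterns)
    return acc


rng = random.Random(0)
n_visible, n_hidden = 19, 8   # RBM size quoted in the abstract
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
patterns = [[rng.randint(0, 1) for _ in range(n_visible)] for _ in range(4)]  # toy data
for _ in range(10):
    acc = train_epoch(patterns, W, lr=0.1, rng=rng)
```

In a hardware-in-the-loop setting, `forward_vmm`, `backward_vmm` and the final weight update would be replaced by calls to the test system, while `sample` and the accumulation loop would stay in software.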

Extended Data Fig. 8 Full hardware design of the memristive part of an RBM layer.

The memristive part of an RBM layer conducts the forward and backward VMMs as well as the stochastic sampling in the analog domain. The peripheral circuit includes multiplexers, trans-impedance amplifiers, noise current generators, and comparators. No digital-to-analog converters (DACs) or analog-to-digital converters (ADCs) are needed in the peripheral circuit. Other parts of the memristive RBM and DBN are all digital circuits.
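One way to see why a noise source plus a comparator can replace a digital sampling step is the following sketch. It assumes the added noise follows a logistic distribution, in which case the probability that the comparator fires equals the sigmoid of the input. The caption does not state the noise distribution used in the actual design, so this is an illustrative assumption, as are the function names and numerical values.

```python
import math
import random


def logistic_noise(rng, scale=1.0):
    """Draw from a logistic distribution via inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:          # avoid log(0); random() never returns 1.0
        u = rng.random()
    return scale * math.log(u / (1.0 - u))


def comparator_sample(signal, rng, threshold=0.0):
    """Binary output of a comparator fed with signal plus noise."""
    return 1 if signal + logistic_noise(rng) > threshold else 0


rng = random.Random(1)
x = 0.8                      # arbitrary normalized input (assumed)
n = 20000
freq = sum(comparator_sample(x, rng) for _ in range(n)) / n
expected = 1.0 / (1.0 + math.exp(-x))   # sigmoid(x)
```

With enough trials, `freq` approaches `expected`, so the comparator output is a binary sample with sigmoid firing probability and no explicit analog-to-digital conversion of the summed current is needed.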

Supplementary information

Supplementary Software

Source code for the device model of the two-terminal floating-gate memristor and for the online training of the memristive DBN.

Source data

Source Data Fig. 1

Source data for Fig. 1c–e.

Source Data Fig. 2

Source data for Fig. 2d–g.

Source Data Fig. 4

Source data for Fig. 4f–j.

Source Data Fig. 5

Source data for Fig. 5b–d.

Source Data Extended Data Fig. 2

Source data for Extended Data Fig. 2e,g.

Source Data Extended Data Fig. 3

Source data for Extended Data Fig. 3a–d.

Source Data Extended Data Fig. 4

Source data for Extended Data Fig. 4a–d.

Source Data Extended Data Fig. 6

Source data for Extended Data Fig. 6c,d.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, W., Danial, L., Li, Y. et al. A memristive deep belief neural network based on silicon synapses. Nat Electron 5, 870–880 (2022). https://doi.org/10.1038/s41928-022-00878-9

  • DOI: https://doi.org/10.1038/s41928-022-00878-9
