In situ training of feed-forward and recurrent convolutional memristor networks

Wang, Zhongrui; Li, Can; Lin, Peng; Rao, Mingyi; Nie, Yongyang; Song, Wenhao; Qiu, Qinru; Li, Yunning; Yan, Peng; Strachan, John Paul; Ge, Ning; McDonald, Nathan; Wu, Qing; Hu, Miao; Wu, Huaqiang; Williams, R. Stanley; Xia, Qiangfei; Yang, J. Joshua

doi:10.1038/s42256-019-0089-1

Article
Published: 09 September 2019

Nature Machine Intelligence volume 1, pages 434–442 (2019)

Abstract

The explosive growth of machine learning is largely due to the recent advancements in hardware and architecture. The engineering of network structures, taking advantage of the spatial or temporal translational isometry of patterns, naturally leads to bio-inspired, shared-weight structures such as convolutional neural networks, which have markedly reduced the number of free parameters. State-of-the-art microarchitectures commonly rely on weight-sharing techniques, but still suffer from the von Neumann bottleneck of transistor-based platforms. Here, we experimentally demonstrate the in situ training of a five-level convolutional neural network that self-adapts to non-idealities of the one-transistor one-memristor array to classify the MNIST dataset, achieving similar accuracy to the memristor-based multilayer perceptron with a reduction in trainable parameters of ~75% owing to the shared weights. In addition, the memristors encoded both spatial and temporal translational invariance simultaneously in a convolutional long short-term memory network—a memristor-based neural network with intrinsic 3D input processing—which was trained in situ to classify a synthetic MNIST sequence dataset using just 850 weights. These proof-of-principle demonstrations combine the architectural advantages of weight sharing and the area/energy efficiency boost of the memristors, paving the way to future edge artificial intelligence.

**Fig. 1: 1T1R implementation of the 5-level convolutional neural network (CNN).**

**Fig. 2: In situ training of the 1T1R-based five-level CNN.**

**Fig. 3: 1T1R implementation of the ConvLSTM network.**

**Fig. 4: In situ training of the 1T1R-based ConvLSTM.**

Data availability

The data that support the plots within this paper and other findings of this study are available in a Zenodo repository at https://doi.org/10.5281/zenodo.3273475.

Code availability

The code that supports the plots within this paper and other findings of this study is available in a Zenodo repository at https://doi.org/10.5281/zenodo.3277298 and at https://github.com/zhongruiwang/memristorCNN. The code that supports the communication between the custom-built measurement system and the integrated chip is available from the corresponding author on reasonable request.

Acknowledgements

This work was supported in part by the US Air Force Research Laboratory (grant no. FA8750-18-2-0122) and the Defense Advanced Research Projects Agency (contract no. D17PC00304). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the US Air Force Research Laboratory. H.W. was supported by the Beijing Advanced Innovation Center for Future Chip and the National Science Foundation of China (grant nos. 61674089 and 61674092). Part of the device fabrication was conducted in the clean room of the Center for Hierarchical Manufacturing, a National Science Foundation Nanoscale Science and Engineering Center located at the University of Massachusetts Amherst.

Author information

These authors contributed equally: Zhongrui Wang, Can Li, Peng Lin.

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA, USA
Zhongrui Wang, Can Li, Peng Lin, Mingyi Rao, Yongyang Nie, Wenhao Song, Yunning Li, Peng Yan, Qiangfei Xia & J. Joshua Yang
Hewlett Packard Labs, Hewlett Packard Enterprise, Palo Alto, CA, USA
Can Li, John Paul Strachan & Ning Ge
Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, USA
Qinru Qiu
Information Directorate, Air Force Research Laboratory, Rome, NY, USA
Nathan McDonald & Qing Wu
Department of Electrical and Computer Engineering, Binghamton University, Binghamton, NY, USA
Miao Hu
Institute of Microelectronics, Tsinghua University, Beijing, China
Huaqiang Wu
Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
R. Stanley Williams

Contributions

J.J.Y. conceived the idea. J.J.Y., Q.X. and Z.W. designed the experiments. Z.W., C.L., P.L., Y.N. and W.S. performed the programming, measurements, data analysis and simulation. M.R., P.Y., C.L. and N.G. built the integrated chips. P.L., Y.L., M.H. and J.P.S. designed the measurement system and firmware. Q.Q., H.W., N.M., Q.W. and R.S.W. helped with experiments and data analysis. J.J.Y. and Z.W. wrote the manuscript. All authors discussed the results and implications and commented on the manuscript at all stages.

Corresponding authors

Correspondence to Qiangfei Xia or J. Joshua Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–13, Tables 1–6 and Notes 1–4

Supplementary Video 1

The in situ training of the five-level CNN on the 60,000 MNIST training images. The upper left panel shows the in-batch training accuracy, which rose sharply between mini-batches 1 and 200 and stayed around 90% for the rest of the training. The lower left three columns show the corresponding weights of the 15 kernels of size 3 × 3 of the first convolutional layer. Each weight is calculated by dividing the averaged conductance difference of the two differential pairs by the constant conductance-to-weight ratio R_gw (see Methods). (The weights are arranged in the same way as those in Supplementary Figure 6.) The middle four columns show the corresponding weights of the four kernels of size 2 × 2 (×15) of the second convolutional layer. The right two columns show the corresponding weights of the 64 × 10 fully connected layer.
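The conductance-to-weight rule in this caption can be sketched in a few lines of NumPy. This is only an illustration: the function name, the array layout (last axis indexing the two differential pairs) and the numerical values are assumptions, not taken from the paper or its released code.

```python
import numpy as np


def conductance_to_weight(g_pos, g_neg, r_gw):
    """Map differential-pair conductances to weights.

    g_pos, g_neg: conductances of the positive and negative device of each
                  differential pair, shape (..., 2), where the last axis
                  indexes the two pairs that encode one weight (assumed layout).
    r_gw:         constant conductance-to-weight ratio, in the same
                  conductance unit as g_pos and g_neg.
    """
    g_pos = np.asarray(g_pos, dtype=float)
    g_neg = np.asarray(g_neg, dtype=float)
    # Average the conductance difference over the two pairs, then rescale.
    return (g_pos - g_neg).mean(axis=-1) / r_gw


# Hypothetical conductances in siemens for a single weight.
example_weight = conductance_to_weight(
    g_pos=[[120e-6, 100e-6]],
    g_neg=[[20e-6, 40e-6]],
    r_gw=100e-6,
)
print(example_weight)
```

Averaging over two pairs is a simple way to reduce the impact of a single poorly programmed device on the effective weight, which is consistent with the self-adapting behaviour described in the abstract.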

Supplementary Video 2

The inference of the 10,000 MNIST test-set images with the five-level CNN. The left panel shows the image to be classified. The middle panel shows the raw output currents of the fully connected layer neurons. The right panel shows the corresponding Bayesian probabilities based on the softmax function. Blue bars indicate correct classifications and red bars indicate misclassifications.
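As a rough sketch of how raw output currents could be turned into the class probabilities shown in the right panel, the snippet below applies a numerically stable softmax. The `scale` factor that converts currents into dimensionless logits and the example currents are assumptions made for illustration; the paper's actual scaling is not given in this preview.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1D vector."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()  # subtracting the maximum avoids overflow in exp
    e = np.exp(z)
    return e / e.sum()


def classify_from_currents(currents, scale=1.0):
    """Return (predicted class index, probabilities) from output currents.

    scale is a hypothetical current-to-logit conversion factor.
    """
    probs = softmax(scale * np.asarray(currents, dtype=float))
    return int(np.argmax(probs)), probs


# Hypothetical currents in amperes for a three-class toy case.
label, probs = classify_from_currents([1e-6, 5e-6, 2e-6], scale=1e6)
print(label, probs)
```

Because softmax is monotonic, the predicted class is always the neuron with the largest output current; the scale only changes how sharply the probabilities are peaked.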

Supplementary Video 3

The in situ training of the ConvLSTM on the 5,958 MNIST-sequence training samples. The upper left panel shows the in-batch training accuracy, which rose sharply between mini-batches 1 and 50 and stayed around 95% for the rest of the training. The lower left four columns show the corresponding weights of the five input kernels of size 3 × 3 of the cell input, input gate, forget gate and output gate of the ConvLSTM layer. Each weight is calculated by dividing the averaged conductance difference of the two differential pairs by the constant conductance-to-weight ratio R_gw (see Methods). (The weights are arranged in the same way as those in Supplementary Figure 8.) The middle four columns show the corresponding weights of the five recurrent input kernels of size 2 × 2 (×5) of the cell input, input gate, forget gate and output gate of the ConvLSTM layer. The right column shows the corresponding weights of the 45 × 6 fully connected layer.
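The layer shapes quoted in the captions of Supplementary Videos 1 and 3 can be tallied to check them against the abstract, which states that the ConvLSTM uses 850 weights. The short count below assumes that "2 × 2 (×15)" and "2 × 2 (×5)" denote kernel depth, that each of the four ConvLSTM gates has its own set of five input and five recurrent kernels, and that no bias terms are counted. The helper name and dictionary keys are illustrative only.

```python
def conv_params(n_kernels, height, width, depth=1):
    """Number of weights in a bank of convolution kernels (no biases)."""
    return n_kernels * height * width * depth


# Five-level CNN, shapes from the Supplementary Video 1 caption.
cnn = {
    "conv1": conv_params(15, 3, 3),      # 15 kernels of size 3 x 3
    "conv2": conv_params(4, 2, 2, 15),   # 4 kernels of size 2 x 2 (x15)
    "fc": 64 * 10,                       # 64 x 10 fully connected layer
}

# ConvLSTM, shapes from the Supplementary Video 3 caption.
n_gates = 4  # cell input, input gate, forget gate, output gate
convlstm = {
    "input_kernels": n_gates * conv_params(5, 3, 3),
    "recurrent_kernels": n_gates * conv_params(5, 2, 2, 5),
    "fc": 45 * 6,                        # 45 x 6 fully connected layer
}

cnn_total = sum(cnn.values())
convlstm_total = sum(convlstm.values())
print(cnn_total, convlstm_total)  # prints "1015 850"
```

Under these assumptions the ConvLSTM total reproduces the 850 weights quoted in the abstract. The CNN total is derived here from the caption and is not a figure stated in the text.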

Supplementary Video 4

The inference of the 1,010 MNIST-sequence test samples with the ConvLSTM. The left three panels show the MNIST sequence to be classified. The fourth panel shows the raw output currents of the fully connected layer neurons at different time steps (time step 1: blue; time step 2: red; time step 3: orange). The last panel shows the corresponding Bayesian probabilities (of the last time step) based on the softmax function. Blue bars indicate correct classifications and red bars indicate misclassifications.

About this article

Cite this article

Wang, Z., Li, C., Lin, P. et al. In situ training of feed-forward and recurrent convolutional memristor networks. Nat Mach Intell 1, 434–442 (2019). https://doi.org/10.1038/s42256-019-0089-1

Received: 17 December 2018
Accepted: 24 July 2019
Published: 09 September 2019
Issue Date: September 2019
DOI: https://doi.org/10.1038/s42256-019-0089-1
