Deep neural networks are efficient at learning from large sets of labelled data, but struggle to adapt to previously unseen data. In pursuit of generalized artificial intelligence, one approach is to augment neural networks with an attentional memory so that they can draw on already learnt knowledge patterns and adapt to new but similar tasks. In current implementations of such memory-augmented neural networks (MANNs), the content of a network’s memory is typically transferred from the memory to the compute unit (a central processing unit or graphics processing unit) to calculate similarity or distance norms. The processing unit hardware incurs substantial energy and latency penalties associated with transferring the data from the memory and updating the data at random memory addresses. Here, we show that ternary content-addressable memories (TCAMs) can be used as attentional memories, in which the distance between a query vector and each stored entry is computed within the memory itself, thus avoiding data transfer. Our compact and energy-efficient TCAM cell is based on two ferroelectric field-effect transistors. We evaluate the performance of our ferroelectric TCAM array prototype for one- and few-shot learning applications. When compared with a MANN where cosine distance calculations are performed on a graphics processing unit, the ferroelectric TCAM approach provides a 60-fold reduction in energy and 2,700-fold reduction in latency for a single memory search operation.
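The idea of computing the distance between a query and every stored entry inside the memory can be illustrated with a small software model. The sketch below is only an analogy, not the authors' circuit or algorithm: it assumes entries are ternary strings over 0, 1 and a "don't care" symbol X, and it assumes a Hamming-style mismatch count as the distance, which the abstract does not specify. The names `ternary_distance`, `tcam_search`, `memory` and `labels` are hypothetical, and the loop over rows stands in for what a TCAM array would evaluate in parallel.

```python
# Illustrative software model only (assumption: Hamming-style distance over
# ternary entries). A hardware TCAM would compare all rows in parallel.
DONT_CARE = "X"  # stored symbol that matches both 0 and 1


def ternary_distance(query, entry):
    """Count positions where a stored 0/1 bit differs from the query bit."""
    if len(query) != len(entry):
        raise ValueError("query and entry must have the same length")
    return sum(1 for q, e in zip(query, entry) if e != DONT_CARE and q != e)


def tcam_search(query, memory):
    """Return (row index, distance) of the stored entry closest to the query."""
    distances = [ternary_distance(query, entry) for entry in memory]
    best = min(range(len(memory)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical one-shot setting: one stored signature per class label.
memory = ["1010XX10", "00001111", "11110000"]
labels = ["cat", "dog", "bird"]

idx, dist = tcam_search("10100110", memory)
print(labels[idx], dist)  # prints "cat 0"
```

In this toy setting each class is represented by a single stored signature, so classifying a new query reduces to one nearest-entry search, which is the operation the abstract's energy and latency comparison refers to.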
The data that support the plots within this paper and other findings of this study are available from the corresponding author on reasonable request.
This work was supported in part by ASCENT, one of six centres in JUMP, sponsored by DARPA and the Semiconductor Research Corporation (SRC).
The authors declare no competing interests.
Ni, K., Yin, X., Laguna, A.F. et al. Ferroelectric ternary content-addressable memory for one-shot learning. Nat Electron 2, 521–529 (2019) doi:10.1038/s41928-019-0321-3