Abstract
Lifelong learning—an agent’s ability to learn throughout its lifetime—is a hallmark of biological learning systems and a central challenge for artificial intelligence (AI). The development of lifelong learning algorithms could lead to a range of novel AI applications, but this will also require the development of appropriate hardware accelerators, particularly if the models are to be deployed on edge platforms, which have strict size, weight and power constraints. Here we explore the design of lifelong learning AI accelerators that are intended for deployment in untethered environments. We identify key desirable capabilities for lifelong learning accelerators and highlight metrics to evaluate such accelerators. We then discuss current edge AI accelerators and explore the future design of lifelong learning accelerators, considering the role that different emerging technologies could play.
Acknowledgements
We thank members of the Neuromorphic AI lab, P. Helfer, T. Pandit, V. Karia and S. Hamed Fatemi Langroudi for discussions on this topic. Part of this material is based on research sponsored by the Air Force Research Laboratory under agreement no. FA8750-20-2-1003 through BAA FA8750-19-S-7010. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Research Laboratory or the US Government. This Article has been approved for public release; distribution unlimited (case no. AFRL-2023-3120, 28 Jun 2023). A.Y. acknowledges Laboratory Directed Research and Development (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the US Department of Energy under contract no. DE-AC02-06CH11357.
Author information
Contributions
D.K. led the team working on the design and concept for the manuscript. D.K., A.D., A.M.Z., F.T.Z., J.B.A., A.Y.-G., N.S., E.N., M.M., V.L., C.D.T. and B.E. held multiple rounds of discussions on the conceptualization of the manuscript and contributed to the iterative drafts and the main manuscript text. A.D. and A.M.Z. prepared the figures, with input from D.K., F.T.Z. and N.S. A.D., F.T.Z. and A.M.Z. collected data and prepared the tables. D.K. critically revised the manuscript for important intellectual content. All authors commented on and reviewed the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Electronics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kudithipudi, D., Daram, A., Zyarah, A.M. et al. Design principles for lifelong learning AI accelerators. Nat Electron 6, 807–822 (2023). https://doi.org/10.1038/s41928-023-01054-3
This article is cited by
- A collective AI via lifelong learning and sharing at the edge. Nature Machine Intelligence (2024).