The computational cost of deep neural networks presents challenges to broadly deploying these algorithms. Low-power and embedded neuromorphic processors offer potentially dramatic performance-per-watt improvements over traditional processors. However, programming these brain-inspired platforms generally requires platform-specific expertise. It is therefore difficult to achieve state-of-the-art performance on these platforms, limiting their applicability. Here we present Whetstone, a method to bridge this gap by converting deep neural networks to have discrete, binary communication. During the training process, the activation function at each layer is progressively sharpened towards a threshold activation, with limited loss in performance. Whetstone sharpened networks do not require a rate code or other spike-based coding scheme, thus producing networks comparable in timing and size to conventional artificial neural networks. We demonstrate Whetstone on a number of architectures and tasks such as image classification, autoencoders and semantic segmentation. Whetstone is currently implemented within the Keras wrapper for TensorFlow and is widely extendable.
This is a preview of subscription content, access via your institution
Access Nature and 54 other Nature Portfolio journals
Get Nature+, our best-value online-access subscription
$29.99 / 30 days
cancel any time
Subscribe to this journal
Receive 12 digital issues and online access to articles
$119.00 per year
only $9.92 per issue
Rent or buy this article
Prices vary by article type
Prices may be subject to local taxes which are calculated during checkout
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).
Pinheiro, P. O., Collobert, R. & Dollár, P. Learning to segment object candidates. Proc. 28th International Conference on Neural Information Processing Systems 2, 1990–1998 (2015).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Yang, T.-J., Chen, Y.-H. & Sze, V. Designing energy-efficient convolutional neural networks using energy-aware pruning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 6071–6079 (IEEE, 2017).
Coppola, G. & Dey, E. Driverless cars are giving engineers a fuel economy headache. Bloomberg.com https://www.bloomberg.com/news/articles/2017-10-11/driverless-cars-are-giving-engineers-a-fuel-economy-headache (2017).
Horowitz, M. 1.1 Computing’s energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC) 10–14 (IEEE, 2014).
Jouppi, N. P. et al. In-datacenter performance analysis of a tensor processing unit. In 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) 1–12 (IEEE, 2017).
Rao, N. Intel® nervana™ neural network processors (NNP) redefine AI silicon. Intel https://ai.intel.com/intel-nervana-neural-network-processors-nnp-redefine-ai-silicon/ (2018).
Hemsoth, N. Intel, Nervana shed light on deep learning chip architecture. The Next Platform https://www.nextplatform.com/2018/01/11/intel-nervana-shed-light-deep-learning-chip-architecture/ (2018).
Markidis, S. et al. Nvidia tensor core programmability, performance & precision. Preprint at https://arxiv.org/abs/1803.04014 (2018).
Merolla, P. A. et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345, 668–673 (2014).
Khan, M. M. et al. Spinnaker: mapping neural networks onto a massively-parallel chip multiprocessor. In IEEE International Joint Conference on Neural Networks, 2008, IJCNN 2008 (IEEE World Congress on Computational Intelligence) 2849–2856 (IEEE, 2008).
Schuman, C. D. et al. A survey of neuromorphic computing and neural networks in hardware. Preprint at https://arxiv.org/abs/1705.06963 (2017).
James, C. D. et al. A historical survey of algorithms and hardware architectures for neural-inspired and neuromorphic computing applications. Biolog. Inspired Cogn. Architec. 19, 49–64 (2017).
Knight, J. C., Tully, P. J., Kaplan, B. A., Lansner, A. & Furber, S. B. Large-scale simulations of plastic neural networks on neuromorphic hardware. Front. Neuroanat. 10, 37 (2016).
Sze, V., Chen, Y.-H., Yang, T.-J. & Emer, J. S. Efficient processing of deep neural networks: a tutorial and survey. Proc. IEEE 105, 2295–2329 (2017).
Bergstra, J., Yamins, D. & Cox, D. D. Hyperopt: a python library for optimizing the hyperparameters of machine learning algorithms. In Proceedings of the 12th Python in Science Conference 13–20 (Citeseer, 2013).
Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A. & Talwalkar, A. Hyperband: a novel bandit-based approach to hyperparameter optimization. J. Mach. Learn. Res. 18, 6765–6816 (2017).
Lin, T.-Y. et al. Microsoft coco: common objects in context. In European Conference on Computer Vision, 740–755 (Springer, 2014).
Hunsberger, E. & Eliasmith, C. Training spiking deep networks for neuromorphic hardware. Preprint at https://arxiv.org/abs/1611.05141 (2016).
Esser, S. K., Appuswamy, R., Merolla, P., Arthur, J. V. & Modha, D. S. Backpropagation for energy-efficient neuromorphic computing. In Advances in Neural Information Processing Systems 28 (eds Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M. & Garnett, R.) 1117–1125 (Curran Associates, Red Hook, 2015).
Esser, S. et al. Convolutional networks for fast, energy-efficient neuromorphic computing. 2016. Preprint at http://arxiv.org/abs/1603.08270 (2016).
Rueckauer, B., Lungu, I.-A., Hu, Y., Pfeiffer, M. & Liu, S.-C. Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Front. Neurosci. 11, 682 (2017).
Bohte, S. M., Kok, J. N. & La Poutré, J. A. Spikeprop: backpropagation for networks of spiking neurons. In European Symposium on Artificial Neural Networks 419–424 (ELEN, London, 2000).
Huh, D. & Sejnowski, T. J. Gradient descent for spiking neural networks. Preprint at https://arxiv.org/abs/1706.04698 (2017).
Cao, Y., Chen, Y. & Khosla, D. Spiking deep convolutional neural networks for energy-efficient object recognition. Int. J. Comput. Vis. 113, 54–66 (2015).
Hunsberger, E. & Eliasmith, C. Spiking deep networks with LIF neurons. Preprint at https://arxiv.org/abs/1510.08829 (2015).
Liew, S. S., Khalil-Hani, M. & Bakhteri, R. Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems. Neurocomputing 216, 718–734 (2016).
Nise, N. S. Control Systems Engineering, 5th edn (Wiley, New York, NY, 2008).
Chollet, F. et al. Keras https://github.com/fchollet/keras (2015).
Rothganger, F., Warrender, C. E., Trumbo, D. & Aimone, J. B. N2A: a computational tool for modeling from neurons to algorithms. Front. Neural Circuits 8, 1 (2014).
Davison, A. P. et al. Pynn: a common interface for neuronal network simulators. Front. Neuroinform. 2, 11 (2009).
Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R. & Bengio, Y. Binarized neural networks. In Proceedings of Advances in Neural Information Processing Systems 4107–4115 (Curran Associates, Red Hook, 2016).
LeCun, Y., Cortes, C. & Burges, C. Mnist handwritten digit database. AT&T Labs http://yann.lecun.com/exdb/mnist 2 (2010).
Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at https://arxiv.org/abs/1708.07747 (2017).
Krizhevsky, A. & Hinton, G. Learning Multiple Layers of Features from Tiny Images. Technical Report, Univ. Toronto (2009).
This work was supported by Sandia National Laboratories’ Laboratory Directed Research and Development (LDRD) Program under the Hardware Acceleration of Adaptive Neural Algorithms Grand Challenge project and the DOE Advanced Simulation and Computing program. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, a wholly owned subsidiary of Honeywell International, for the US Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
This Article describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US Department of Energy or the US Government.
The authors declare no competing interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Severa, W., Vineyard, C.M., Dellana, R. et al. Training deep neural networks for binary communication with the Whetstone method. Nat Mach Intell 1, 86–94 (2019). https://doi.org/10.1038/s42256-018-0015-y
This article is cited by
Nature Computational Science (2022)
Nature Electronics (2022)
Nature Electronics (2019)