Abstract
The ‘Internet of Things’ has brought increased demand for artificial intelligence-based edge computing in applications ranging from healthcare monitoring systems to autonomous vehicles. Quantization is a powerful tool to address the growing computational cost of such applications and yields significant compression over full-precision networks. However, quantization can result in substantial loss of performance for complex image classification tasks. To address this, we propose a principal component analysis (PCA)-driven methodology to identify the important layers of a binary network, and design mixed-precision networks. The proposed Hybrid-Net achieves a more than 10% improvement in classification accuracy over binary networks such as XNOR-Net for ResNet and VGG architectures on CIFAR-100 and ImageNet datasets, while still achieving up to 94% of the energy efficiency of XNOR-Nets. This work advances the feasibility of using highly compressed neural networks for energy-efficient neural computing in edge devices.
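For a concrete picture of the methodology, the sketch below illustrates the general idea of PCA-driven layer significance: for each layer, PCA is applied to its output feature maps, and the fraction of principal components needed to explain most of the variance serves as a measure of that layer's importance, guiding which layers receive higher precision. This is a minimal illustrative sketch in Python/NumPy, not the authors' exact procedure; the function name layer_significance, the 99.9% variance threshold and the 0.9 cutoff are assumed values chosen for illustration (see the repository under 'Code availability' for the authors' implementation).

```python
import numpy as np

def layer_significance(activations: np.ndarray, var_threshold: float = 0.999) -> float:
    """Fraction of principal components needed to explain `var_threshold`
    of the variance in a layer's output feature maps.

    `activations` has shape (num_samples, num_channels): each row is one
    spatial location's channel vector, collected over a batch of inputs.
    """
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Singular values of the centered data matrix give the PCA spectrum.
    s = np.linalg.svd(centered, compute_uv=False)
    explained = (s ** 2) / np.sum(s ** 2)
    cumulative = np.cumsum(explained)
    # Smallest number of components whose cumulative variance reaches the threshold.
    k = int(np.searchsorted(cumulative, var_threshold) + 1)
    return k / activations.shape[1]

# Hypothetical usage with synthetic activations: layers whose significance
# exceeds a chosen cutoff keep higher precision; the rest are binarized.
rng = np.random.default_rng(0)
fake_acts = {f"conv{i}": rng.standard_normal((4096, 64)) for i in range(1, 4)}
cutoff = 0.9  # illustrative value, not from the paper
precision = {name: (4 if layer_significance(a) > cutoff else 1)
             for name, a in fake_acts.items()}
print(precision)
```

The intuition is that layers whose activations are highly redundant need few principal components and can tolerate binarization, whereas layers with high significance are kept at higher precision, yielding the mixed-precision Hybrid-Net described above.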
Code availability
The experiments were performed using the publicly available Python language and PyTorch framework. Custom code for this work is available at https://github.com/ichakra2/pca-hybrid.
Acknowledgements
This work was supported in part by the Center for Brain-inspired Computing Enabling Autonomous Intelligence (C-BRIC), one of six centres in JUMP, a Semiconductor Research Corporation (SRC) programme sponsored by DARPA, in part by the National Science Foundation, in part by Intel, in part by the ONR-MURI programme and in part by the Vannevar Bush Faculty Fellowship.
Author information
Contributions
I.C. and K.R. conceived the idea. I.C., D.R. and I.G. developed the simulation framework. I.C. carried out all experiments. I.C. and A.A. developed the energy and memory analysis framework. I.C., D.R., I.G. and K.R. analysed the results. I.C., D.R. and I.G. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Chakraborty, I., Roy, D., Garg, I. et al. Constructing energy-efficient mixed-precision neural networks through principal component analysis for edge intelligence. Nat Mach Intell 2, 43–55 (2020). https://doi.org/10.1038/s42256-019-0134-0
This article is cited by
- Autonomous vehicles decision-making enhancement using self-determination theory and mixed-precision neural networks. Multimedia Tools and Applications (2023)
- Analyzing point cloud of coal mining process in much dust environment based on dynamic graph convolution neural network. Environmental Science and Pollution Research (2023)
- Lead federated neuromorphic learning for wireless edge artificial intelligence. Nature Communications (2022)