Constructing energy-efficient mixed-precision neural networks through principal component analysis for edge intelligence

A preprint version of the article is available at arXiv.

Abstract

The ‘Internet of Things’ has brought increased demand for artificial intelligence-based edge computing in applications ranging from healthcare monitoring systems to autonomous vehicles. Quantization is a powerful tool to address the growing computational cost of such applications and yields significant compression over full-precision networks. However, quantization can result in substantial loss of performance for complex image classification tasks. To address this, we propose a principal component analysis (PCA)-driven methodology to identify the important layers of a binary network, and design mixed-precision networks. The proposed Hybrid-Net achieves a more than 10% improvement in classification accuracy over binary networks such as XNOR-Net for ResNet and VGG architectures on CIFAR-100 and ImageNet datasets, while still achieving up to 94% of the energy efficiency of XNOR-Nets. This work advances the feasibility of using highly compressed neural networks for energy-efficient neural computing in edge devices.
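As a rough illustration of why binarization can cost accuracy, and why keeping some layers at higher precision may help, the sketch below compares the reconstruction error of a binary weight approximation with that of a 4-bit uniform quantizer. The binary form follows the XNOR-Net-style approximation of a weight vector by a scaling factor times its sign. The helper names, the sample weights and the choice of 4 bits are assumptions made for illustration only; this is not the authors' PCA-based layer selection or their training code.

```python
def binarize(weights):
    """Approximate each weight as alpha * sign(w), with alpha = mean(|w|)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization onto signed integer levels (bits >= 2)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1      # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels          # step size between adjacent levels
    return [round(w / scale) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical weights standing in for one layer of a network.
weights = [0.9, -0.1, 0.4, -0.7, 0.05, -0.3]
binary = binarize(weights)
four_bit = quantize_uniform(weights, bits=4)

print("binary MSE:", mse(weights, binary))
print("4-bit MSE: ", mse(weights, four_bit))
```

On these sample weights the 4-bit version reconstructs the original values more closely than the binary one, at the price of more bits per weight. A mixed-precision design spends those extra bits only on the layers judged important.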

Fig. 1: Illustration to show PCA analysis and subsequent Hybrid-Net design.
Fig. 2: Network configurations for comparison.
Fig. 3: Layer-wise principal component analysis trends for various networks.
Fig. 4: Illustration of the energy–accuracy optimality of Hybrid-Net.

Data availability

All datasets used in this work are publicly available: CIFAR-100 (ref. 31) and ImageNet (ref. 32).

Code availability

The publicly available tools Python and PyTorch were used to perform the experiments. Custom code for this work is available at https://github.com/ichakra2/pca-hybrid.

References

  1. Gubbi, J., Buyya, R., Marusic, S. & Palaniswami, M. Internet of things (IoT): a vision, architectural elements and future directions. Future Gener. Comput. Syst. 29, 1645–1660 (2013).

  2. Yao, S., Hu, S., Zhao, Y., Zhang, A. & Abdelzaher, T. DeepSense. In Proceedings of the 26th International Conference on World Wide Web (WWW '17) 351–360 (ACM Press, 2017).

  3. Krizhevsky, A. et al. ImageNet classification with deep convolutional neural networks. In Proceedings of Advances in Neural Information Processing Systems 1097–1105 (NIPS, 2012).

  4. Szegedy, C. et al. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 1–9 (IEEE, 2015).

  5. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).

  6. Girshick, R. B. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision 1440–1448 (IEEE, 2015).

  7. Kaufman, L. M. Data security in the world of cloud computing. IEEE Secur. Priv. 7, 61–64 (2009).

  8. Gonzalez, N. et al. A quantitative analysis of current security concerns and solutions for cloud computing. J. Cloud Comput. Adv. Syst. Appl. 1, 11 (2012).

  9. Li, D., Salonidis, T., Desai, N. V. & Chuah, M. C. DeepCham: collaborative edge-mediated adaptive deep learning for mobile object recognition. In Proceedings of 2016 IEEE/ACM Symposium on Edge Computing (SEC) 64–76 (IEEE, 2016).

  10. Iandola, F. N., Han, S., Moskewicz, M. W. & Ashraf, K. SqueezeNet: AlexNet-level accuracy with 50× fewer parameters and <0.5 MB model size. Preprint at https://arxiv.org/pdf/1602.07360.pdf (2016).

  11. Alvarez, J. M. & Salzmann, M. Compression-aware training of deep networks. In Proceedings of Advances in Neural Information Processing Systems 856–867 (NIPS, 2017).

  12. Weigend, A. S., Rumelhart, D. E. & Huberman, B. A. Generalization by weight-elimination with application to forecasting. In Proceedings of Advances in Neural Information Processing Systems 875–882 (NIPS, 1991).

  13. Han, S., Pool, J., Tran, J. & Dally, W. Learning both weights and connections for efficient neural network. In Proceedings of Advances in Neural Information Processing Systems 1135–1143 (NIPS, 2015).

  14. Ullrich, K., Meeds, E. & Welling, M. Soft weight-sharing for neural network compression. In International Conference on Learning Representations (ICLR, 2017).

  15. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R. & Bengio, Y. Quantized neural networks: training neural networks with low precision weights and activations. J. Mach. Learn. Res. 18, 6869–6898 (2017).

  16. Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R. & Bengio, Y. Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or −1. Preprint at https://arxiv.org/abs/1602.02830 (2016).

  17. Rastegari, M., Ordonez, V., Redmon, J. & Farhadi, A. XNOR-Net: ImageNet classification using binary convolutional neural networks. In Proceedings of the European Conference on Computer Vision 525–542 (Springer, 2016).

  18. Garg, I., Panda, P. & Roy, K. A low effort approach to structured CNN design using PCA. Preprint at https://arxiv.org/abs/1812.06224 (2018).

  19. Mishra, A., Nurvitadhi, E., Cook, J. J. & Marr, D. WRPN: wide reduced-precision networks. Preprint at https://arxiv.org/abs/1709.01134 (2017).

  20. Liu, Z. et al. Bi-Real Net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In Proceedings of the European Conference on Computer Vision (ECCV) 722–737 (Springer, 2018).

  21. Zhou, S. et al. DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. Preprint at https://arxiv.org/abs/1606.06160 (2016).

  22. Zhou, S.-C., Wang, Y.-Z., Wen, H., He, Q.-Y. & Zou, Y.-H. Balanced quantization: an effective and efficient approach to quantized neural networks. J. Comput. Sci. Technol. 32, 667–682 (2017).

  23. Zhang, D., Yang, J., Ye, D. & Hua, G. LQ-Nets: learned quantization for highly accurate and compact deep neural networks. In Proceedings of the European Conference on Computer Vision (ECCV) 365–382 (Springer, 2018).

  24. Jung, S. et al. Learning to quantize deep networks by optimizing quantization intervals with task loss. Preprint at https://arxiv.org/abs/1808.05779 (2018).

  25. Choi, J. et al. PACT: parameterized clipping activation for quantized neural networks. Preprint at https://arxiv.org/pdf/1805.06085.pdf (2018).

  26. Graham, B. Low-precision batch-normalized activations. Preprint at https://arxiv.org/abs/1702.08231 (2017).

  27. Prabhu, A. et al. Hybrid binary networks: optimizing for accuracy, efficiency and memory. In Proceedings of 2018 IEEE Winter Conference on Applications of Computer Vision (WACV) 821–829 (IEEE, 2018).

  28. Wu, B. et al. Mixed precision quantization of ConvNets via differentiable neural architecture search. Preprint at https://arxiv.org/pdf/1812.00090.pdf (2018).

  29. Sakr, C. & Shanbhag, N. Per-tensor fixed-point quantization of the back-propagation algorithm. Preprint at https://arxiv.org/pdf/1812.11732.pdf (2018).

  30. Paszke, A. et al. Automatic differentiation in PyTorch. In Proceedings of 31st Conference on Neural Information Processing Systems (NIPS, 2017).

  31. Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images Technical Report (Citeseer, 2009).

  32. Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).

  33. Keckler, S. W., Dally, W. J., Khailany, B. & Garland, M. GPUs and the future of parallel computing. IEEE Micro 31, 7–17 (2011).

Acknowledgements

This work was supported in part by the Center for Brain-inspired Computing Enabling Autonomous Intelligence (C-BRIC), one of six centres in JUMP, a Semiconductor Research Corporation (SRC) programme sponsored by DARPA, in part by the National Science Foundation, in part by Intel, in part by the ONR-MURI programme and in part by the Vannevar Bush Faculty Fellowship.

Author information

Contributions

I.C. and K.R. conceived the idea. I.C., D.R. and I.G. developed the simulation framework. I.C. carried out all experiments. I.C. and A.A. developed the energy and memory analysis framework. I.C., D.R., I.G. and K.R. analysed the results. I.C., D.R. and I.G. wrote the paper.

Corresponding author

Correspondence to Indranil Chakraborty.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Chakraborty, I., Roy, D., Garg, I. et al. Constructing energy-efficient mixed-precision neural networks through principal component analysis for edge intelligence. Nat Mach Intell 2, 43–55 (2020). https://doi.org/10.1038/s42256-019-0134-0
