Deep neural networks with applications from computer vision to medical diagnosis [1–5] are commonly implemented using clock-based processors [6–14], in which computation speed is mainly limited by the clock frequency and the memory access time. In the optical domain, despite advances in photonic computation [15–17], the lack of scalable on-chip optical non-linearity and the loss of photonic devices limit the scalability of optical deep networks. Here we report an integrated end-to-end photonic deep neural network (PDNN) that performs sub-nanosecond image classification through direct processing of the optical waves impinging on the on-chip pixel array as they propagate through layers of neurons. In each neuron, linear computation is performed optically and the non-linear activation function is realized opto-electronically, allowing a classification time of under 570 ps, which is comparable with a single clock cycle of state-of-the-art digital platforms. A uniformly distributed supply light provides the same per-neuron optical output range, allowing scalability to large-scale PDNNs. Two-class and four-class classification of handwritten letters with accuracies higher than 93.8% and 89.8%, respectively, is demonstrated. Direct, clock-less processing of optical data eliminates analogue-to-digital conversion and the requirement for a large memory module, allowing faster and more energy-efficient neural networks for the next generations of deep learning systems.
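To put the 570 ps figure in context, the short sketch below converts an end-to-end latency into the clock frequency whose single period would be equally long, and into the number of cycles a given processor would complete in that time. The function names and the 2 GHz example clock are illustrative assumptions, not values taken from the paper.

```python
def equivalent_clock_hz(latency_s: float) -> float:
    """Clock frequency whose single period equals the given latency."""
    return 1.0 / latency_s


def cycles_elapsed(latency_s: float, clock_hz: float) -> float:
    """Number of clock cycles a processor completes during the latency."""
    return latency_s * clock_hz


PDNN_LATENCY_S = 570e-12   # upper bound on classification time quoted in the abstract
EXAMPLE_CLOCK_HZ = 2.0e9   # hypothetical digital clock, chosen only for illustration

print(f"Equivalent clock: {equivalent_clock_hz(PDNN_LATENCY_S) / 1e9:.2f} GHz")
print(f"Cycles at 2 GHz:  {cycles_elapsed(PDNN_LATENCY_S, EXAMPLE_CLOCK_HZ):.2f}")
```

Under these assumptions, a 570 ps latency corresponds to a clock of roughly 1.75 GHz, which is the sense in which the classification time is comparable with one clock cycle.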
Codes that are used in this paper are available from the corresponding author on reasonable request.
Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M. & Poggio, T. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29, 411–426 (2007).
Wang, D., Su, J. & Yu, H. Feature extraction and analysis of natural language processing for deep learning English language. IEEE Access 8, 46335–46345 (2020).
Ribeiro, A. H. et al. Automatic diagnosis of the 12-lead ECG using a deep neural network. Nat. Commun. 11, 1760 (2020).
Lai, L. et al. Computer-aided diagnosis of pectus excavatum using CT images and deep learning methods. Sci. Rep. 10, 20294 (2020).
Yuan, B. et al. Unsupervised and supervised learning with neural network for human transcriptome analysis and cancer diagnosis. Sci. Rep. 10, 19106 (2020).
Shin, H. et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35, 1285–1298 (2016).
Tajbakhsh, N. et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans. Med. Imaging 35, 1299–1312 (2016).
LeCun, Y. & Bengio, Y. in The Handbook of Brain Theory and Neural Networks (ed. Arbib, M. A.) 255–258 (MIT Press, 1998).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Barbastathis, G., Ozcan, A. & Situ, G. On the use of deep learning for computational imaging. Optica 6, 921–943 (2019).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).
Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. In Proc. 27th International Conference on Machine Learning (eds Fürnkranz, J. & Joachims, T.) 807–814 (Omnipress, 2010).
Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2017).
Li, H., Lin, Z., Shen, X., Brandt, J. & Hua, G. A convolutional neural network cascade for face detection. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 5325–5334 (IEEE, 2015).
Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).
Shastri, B. J. et al. Photonics for artificial intelligence and neuromorphic computing. Nat. Photonics 15, 102–114 (2021).
Bogaerts, W. et al. Programmable photonic circuits. Nature 586, 207–216 (2020).
Moons, B. & Verhelst, M. An energy-efficient precision-scalable ConvNet processor in 40-nm CMOS. IEEE J. Solid-State Circuits 52, 903–914 (2017).
Lee, J. et al. UNPU: an energy-efficient deep neural network accelerator with fully variable weight bit precision. IEEE J. Solid-State Circuits 54, 173–185 (2019).
Hill, P. et al. DeftNN: addressing bottlenecks for DNN execution on GPUs via synapse vector elimination and near-compute data fission. In 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) 786–799 (IEEE, 2017).
Nurvitadhi, E. et al. Accelerating binarized neural networks: comparison of FPGA, CPU, GPU, and ASIC. In 2016 International Conference on Field-Programmable Technology (FPT) 77–84 (IEEE, 2016).
Ashtiani, F., Risi, A. & Aflatouni, F. Single-chip nanophotonic near-field imager. Optica 6, 1255–1260 (2019).
Cheng, Z., Rios, C., Pernice, W. H. P., Wright, C. D. & Bhaskaran, H. On-chip photonic synapse. Sci. Adv. 3, e1700160 (2017).
Tait, A. N. et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 7, 7430 (2017).
Feldmann, J. et al. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature 569, 208–214 (2019).
Miscuglio, M. et al. All-optical nonlinear activation function for photonic neural networks. Opt. Mater. Express 8, 3851–3863 (2018).
Jha, A., Huang, C. & Prucnal, P. R. Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics. Opt. Lett. 45, 4819–4822 (2020).
Feldmann, J. et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52–58 (2021).
Zuo, Y. et al. All-optical neural network with nonlinear activation functions. Optica 6, 1132–1137 (2019).
Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361, 1004–1008 (2018).
Bueno, J. et al. Reinforcement learning in a large-scale photonic recurrent neural network. Optica 5, 756–760 (2018).
Zhou, T. et al. Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit. Nat. Photonics 15, 367–373 (2021).
Chang, J. et al. Hybrid optical-electronic convolutional neural networks with optimized diffractive optics for image classification. Sci. Rep. 8, 12324 (2018).
Xu, X. et al. 11 TOPS photonic convolutional accelerator for optical neural networks. Nature 589, 44–51 (2021).
AMD Radeon™ RX 6700 XT Graphics. https://www.amd.com/en/products/graphics/amd-radeon-rx-6700-xt.
Chollet, F. et al. Keras. https://keras.io (2015).
Tait, A. N. et al. Silicon photonic modulator neuron. Phys. Rev. Appl. 11, 064043 (2019).
Stone, M. Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. Series B Stat. Methodol. 36, 111–147 (1974).
LeCun, Y. et al. The MNIST dataset of handwritten digits. http://yann.lecun.com/exdb/mnist/ (1999).
LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
Rakowski, M. et al. 45nm CMOS — Silicon Photonics Monolithic Technology (45CLO) for next-generation, low power and high speed optical interconnects. In 2020 Optical Fiber Communications Conference and Exhibition (OFC) (IEEE, 2020).
Fahrenkopf, N. M. et al. The AIM photonics MPW: a highly accessible cutting edge technology for rapid prototyping of photonic integrated circuits. IEEE J. Sel. Top. Quantum Electron. 25, 1–6 (2019).
Borji, A., Cheng, M., Jiang, H. & Li, J. Salient object detection: a benchmark. IEEE Trans. Image Process. 24, 5706–5722 (2015).
Cheng, M., Mitra, N. J., Huang, X., Torr, P. H. S. & Hu, S. Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37, 569–582 (2015).
Kist, A. M. Deep learning on edge TPUs. Preprint at https://arxiv.org/abs/2108.13732 (2021).
IMAGO Technologies’ Edge AI camera. https://imago-technologies.com/wp-content/uploads/2021/01/Specification-VisionAI-V1.2.pdf.
JeVois smart machine vision. https://www.jevoisinc.com/collections/jevois-hardware/products/jevois-pro-deep-learning-smart-camera.
Kulyukin, V. et al. On image classification in video analysis of omnidirectional Apis mellifera traffic: random reinforced forests vs. shallow convolutional networks. Appl. Sci. 11, 8141 (2021).
Chiu, T. Y., Wang, Y. & Wang, H. A 3.7–43.7-GHz low-power consumption variable gain distributed amplifier in 90-nm CMOS. IEEE Microw. Wirel. Compon. Lett. 31, 169–172 (2021).
Xuan, Z. et al. A low-power 40 Gb/s optical receiver in silicon. In 2015 IEEE Radio Frequency Integrated Circuits Symposium (RFIC) 315–318 (IEEE, 2015).
This work was supported by the Office of Naval Research of the United States under award number N00014-19-1-2248.
F. Aflatouni and F. Ashtiani have filed a patent on the proposed PDNN architecture (publication number WO2022020437A1).
Peer review information
Nature thanks Wolfram Pernice and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, The input pixel array aperture and grating coupler (pixel) design are shown. The printed circuit board (PCB) is tilted (by about 12°) to maximize the pixel efficiency at 1,532 nm. b, Classification measurement setup. Laser 1, emitting at 1,532 nm, serves as the light source for image formation on the input pixel array (in the classification phase) or the calibration array (in the training phase), whereas laser 2, emitting at 1,559.93 nm, is used as the supply light. The target objects (dataset) are printed on a transparency film mounted on a custom-fabricated frame. A high-precision XY positioner is used for scanning through the dataset. A microcontroller is used to write the weights into the photonic chip and to implement MRM alignment control loops.
a, The implemented algorithm to find and revise the threshold values to properly separate N different classes. A linear combination of the network output, in this case, the differential output defined as Vout = Out1 − Out2, is measured and compared with different threshold levels. The threshold values (THj) are revised one by one as measured network differential output values (Vout,i) are sequentially passed into the algorithm.
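The caption describes thresholds that are revised one at a time as measured differential outputs arrive. A minimal sketch of one way such a sequential update could work is given below. The update rule (thresholds placed midway between running per-class means of Vout) and the names `revise_thresholds` and `classify` are assumptions for illustration; the paper's actual flow chart is not reproduced here.

```python
import bisect
from typing import List, Sequence, Tuple


def revise_thresholds(samples: Sequence[Tuple[float, int]],
                      n_classes: int) -> Tuple[List[float], List[int]]:
    """Sequentially revise thresholds from (v_out, label) measurements.

    Assumed rule: keep a running mean of v_out per class and, after every
    sample, put a threshold midway between each pair of adjacent class means.
    Returns the thresholds (ascending) and the class order along the v_out axis.
    """
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    thresholds: List[float] = []
    order: List[int] = []
    for v_out, label in samples:
        sums[label] += v_out
        counts[label] += 1
        seen = [c for c in range(n_classes) if counts[c] > 0]
        means = {c: sums[c] / counts[c] for c in seen}
        order = sorted(seen, key=lambda c: means[c])
        thresholds = [(means[a] + means[b]) / 2 for a, b in zip(order, order[1:])]
    return thresholds, order


def classify(v_out: float, thresholds: List[float], order: List[int]) -> int:
    """Map a differential output to a class by counting the thresholds below it."""
    return order[bisect.bisect_right(thresholds, v_out)]


# Made-up two-class measurements of V_out = Out1 - Out2 (not data from the paper).
demo = [(-1.0, 0), (1.0, 1), (-0.8, 0), (1.2, 1)]
demo_thresholds, demo_order = revise_thresholds(demo, n_classes=2)
print(demo_thresholds, demo_order, classify(0.7, demo_thresholds, demo_order))
```

With N classes this yields N − 1 thresholds, and a new measurement is classified by locating it among them.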
a, Propagation time measurement setup and packaging. b, Calibration setup using a test structure, which consists of a grating coupler and a PD. c, Two detected pulses at nodes A and B showing an end-to-end system delay of about 570 ps. d, Measurement setup used to show the effect of the forward-biased MRM response on the pulse shape.
Image classification schemes implemented using optical and electronic neural networks.
a, An N-layer photonic neural network, in which each layer has its dedicated supply light, allowing scalability to a deep network with a large number of layers. Bandwidth enhancement by absorbing the parasitic capacitance of PDs in a lumped-element transmission line (b) and using one PD per neuron (after optical combining) (c).
a, The implemented algorithm flow chart for micro-ring alignment. The cost function to be minimized is VSUM, which is the sum of the outputs of the second and third layers (that is, Hi and Oi). All micro-rings are thermally tuned to find the optimal heater voltages that correspond to the same resonance wavelength for all seven rings. b, In case of no input illumination, the outputs of the neurons of the first layer (Ii) are zero. If micro-rings are properly aligned, the outputs of the neurons of the second and third layers remain low. c, In the case that the optical input is uniformly illuminating the input pixel array, if all rings are aligned, I1 to I4 will increase, shifting the resonance wavelengths of the MRMs, which results in a large change in the outputs of the neurons of the second and third layers.
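The alignment procedure in panel a amounts to minimizing a measured cost, VSUM, over the heater voltages of the seven micro-rings. The sketch below illustrates the idea with a simple coordinate-wise sweep. The search strategy, the 0 to 3 V range, the grid resolution and the callable `measure_vsum` are all assumptions; on hardware, `measure_vsum` would apply the voltages and read back the summed outputs of the second and third layers, whereas here a toy quadratic model stands in for the chip.

```python
from typing import Callable, List, Sequence


def align_rings(measure_vsum: Callable[[Sequence[float]], float],
                n_rings: int = 7,
                v_min: float = 0.0,
                v_max: float = 3.0,
                steps: int = 31,
                sweeps: int = 2) -> List[float]:
    """Coordinate-wise search for heater voltages that minimize V_SUM."""
    grid = [v_min + i * (v_max - v_min) / (steps - 1) for i in range(steps)]
    voltages = [v_min] * n_rings
    for _ in range(sweeps):
        for ring in range(n_rings):
            best_v = voltages[ring]
            best_cost = measure_vsum(voltages)
            # Sweep one heater while the others are held fixed.
            for v in grid:
                trial = list(voltages)
                trial[ring] = v
                cost = measure_vsum(trial)
                if cost < best_cost:
                    best_v, best_cost = v, cost
            voltages[ring] = best_v
    return voltages


# Toy stand-in for the chip: the cost grows as each heater moves away from a
# hypothetical "aligned" voltage. These targets are invented for the example.
TOY_TARGETS = [0.4, 0.9, 1.3, 1.8, 2.1, 2.5, 2.9]


def toy_vsum(voltages: Sequence[float]) -> float:
    return sum((v - t) ** 2 for v, t in zip(voltages, TOY_TARGETS))


aligned = align_rings(toy_vsum)
print([round(v, 2) for v in aligned])
```

A per-ring sweep is only one possible choice. On a real device, thermal crosstalk between neighbouring heaters may require several sweeps or a joint optimizer.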
The microcontroller sends the clock and data signals to the serial DACs, whereas the outputs of the DACs are connected to their corresponding drivers to drive the on-chip photonic devices (PIN attenuators, ring PN junctions and micro-ring thermal phase shifters).
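As an illustration of the clock-and-data interface described above, the following sketch converts a normalized weight into a DAC code and expands it into the sequence of pin states a microcontroller might shift out. The source does not specify the DAC part, its resolution or its framing, so the 12-bit width, MSB-first ordering, rising-edge latching and both function names are assumptions.

```python
from typing import List, Tuple


def weight_to_code(weight: float, bits: int = 12) -> int:
    """Map a normalized weight in [0, 1] to an unsigned DAC code."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return round(weight * ((1 << bits) - 1))


def serialize(code: int, bits: int = 12) -> List[Tuple[int, int]]:
    """Return (clock, data) pin states that shift out `code` MSB first.

    Assumed protocol: data is set up while the clock is low and is latched
    by the DAC on the rising clock edge.
    """
    states: List[Tuple[int, int]] = []
    for i in reversed(range(bits)):
        bit = (code >> i) & 1
        states.append((0, bit))  # clock low, present the data bit
        states.append((1, bit))  # clock high, DAC latches the bit
    return states


print(weight_to_code(1.0))         # full-scale code for a 12-bit DAC
print(serialize(0b101, bits=3))    # pin states for a 3-bit example word
```

In a real setup, each DAC output would then feed its driver for the corresponding on-chip device, as the caption describes.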
About this article
Cite this article
Ashtiani, F., Geers, A.J. & Aflatouni, F. An on-chip photonic deep neural network for image classification. Nature 606, 501–506 (2022). https://doi.org/10.1038/s41586-022-04714-0