Deep learning with coherent nanophotonic circuits

Journal name: Nature Photonics
Volume: 11
Pages: 441–446
DOI: 10.1038/nphoton.2017.93

Abstract

Artificial neural networks are computational network models inspired by signal processing in the brain. These models have dramatically improved performance for many machine-learning tasks, including speech and image recognition. However, today's computing hardware is inefficient at implementing neural networks, in large part because much of it was designed for von Neumann computing schemes. Significant effort has been made towards developing electronic architectures tuned to implement artificial neural networks that exhibit improved computational speed and accuracy. Here, we propose a new architecture for a fully optical neural network that, in principle, could offer an enhancement in computational speed and power efficiency over state-of-the-art electronics for conventional inference tasks. We experimentally demonstrate the essential part of the concept using a programmable nanophotonic processor featuring a cascaded array of 56 programmable Mach–Zehnder interferometers in a silicon photonic integrated circuit and show its utility for vowel recognition.
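As a rough illustration of what programming a single Mach–Zehnder interferometer (MZI) involves, the sketch below builds the 2×2 transfer matrix of one MZI from two 50:50 beamsplitters and two phase shifters. The beamsplitter convention, the placement of both phase shifters on the upper arm, and the function names (`mzi`, `bar_transmission`, `is_unitary`) are assumptions made for illustration and are not taken from the paper; other conventions differ by fixed phase offsets.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def beamsplitter():
    """Ideal 50:50 beamsplitter (assumed symmetric convention)."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]


def phase(angle):
    """Phase shifter acting on the upper arm only (assumed placement)."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


def mzi(theta, phi):
    """Transfer matrix of one MZI: external phase phi, beamsplitter,
    internal phase theta, beamsplitter (in the order light sees them)."""
    u = matmul2(beamsplitter(), phase(phi))
    u = matmul2(phase(theta), u)
    return matmul2(beamsplitter(), u)


def bar_transmission(theta):
    """Fraction of power staying in the input waveguide (bar port)."""
    return abs(mzi(theta, 0.0)[0][0]) ** 2


def is_unitary(u, tol=1e-12):
    """Check that the conjugate transpose of u times u is the identity."""
    for i in range(2):
        for j in range(2):
            inner = sum(u[k][i].conjugate() * u[k][j] for k in range(2))
            expected = 1.0 if i == j else 0.0
            if abs(inner - expected) > tol:
                return False
    return True
```

Under this assumed convention the bar-port transmission follows sin²(θ/2), so θ = 0 sends all light to the cross port and θ = π sends it all to the bar port. The matrix stays unitary for any (θ, φ), which is what lets a mesh of such elements compose larger unitary transformations.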

Figures

    Figure 1: General architecture of the ONN.

    a, General artificial neural network architecture composed of an input layer, a number of hidden layers and an output layer. b, Decomposition of the general neural network into individual layers. c, Optical interference and nonlinearity units that compose each layer of the artificial neural network. d, Proposal for an all-optical, fully integrated neural network.

    Figure 2: Illustration of OIU.

    a, Schematic representation of our two-layer ONN experiment. The programmable nanophotonic processor is used four times to implement the deep neural network protocol. After the first matrix is implemented, a nonlinearity associated with a saturable absorber is simulated in response to the output of layer 1. b, Experimental feedback and control loop used in the experiment. Laser light is coupled to the OIU, transformed, measured on a photodiode array, and then read on a computer. c, Optical micrograph illustration of the experimentally demonstrated OIU, which realizes both matrix multiplication (highlighted in red) and attenuation (highlighted in blue) fully optically. The spatial layout of MZIs follows the Reck proposal (ref. 27), enabling arbitrary SU(4) rotations by programming the internal and external phase shifters of each MZI (θi, φi). d, Schematic illustration of a single phase shifter in the MZI and the transmission curve for tuning the internal phase shifter. DMMC, diagonal matrix multiplication core.

    Figure 3: Vowel recognition.

    a,b, Correlation matrices for the ONN and a 64-bit electronic computer, respectively, implementing two-layer neural networks for vowel recognition. Each row of the correlation matrices is a histogram of the number of times the ONN or 64-bit computer identified vowel X when presented with vowel Y. Perfect performance for the vowel recognition task would result in a diagonal correlation matrix. c, Correct identification ratio in percent for the vowel recognition problem with phase-encoding (σΦ) and photodetection error (σD). The definitions of these two variables are provided in the Methods. Solid lines are contours for different correctness ratios. In our experiment, σD ≃ 0.1%. The contour line shown in red marks an isoline corresponding to the correct identification ratio for our experiment. d, Two-dimensional projection (log area ratio coefficient 1 on the x axis and 2 on the y axis) of the testing data set, which shows the large overlap between spoken vowel C and D. This large overlap leads to lower classification accuracy for both a 64-bit computer and the experimental ONN.
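The correlation matrices described in the Figure 3 caption can be tallied as in the minimal sketch below, where each row is a histogram of identified vowels for one presented vowel, and the correct identification ratio is the diagonal sum divided by the total count. The function names, the labels and the toy data are hypothetical and serve only to show the bookkeeping; they are not the experimental results.

```python
def correlation_matrix(presented, identified, labels):
    """Rows index the presented vowel, columns the identified vowel."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(presented, identified):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def correct_identification_ratio(matrix):
    """Percentage of samples that fall on the diagonal."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("correlation matrix contains no samples")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


# Hypothetical toy data: vowels C and D are confused with each other.
labels = ["A", "B", "C", "D"]
presented = ["A", "A", "B", "C", "C", "D", "D", "D"]
identified = ["A", "A", "B", "C", "D", "D", "C", "D"]

matrix = correlation_matrix(presented, identified, labels)
ratio = correct_identification_ratio(matrix)
```

For this toy input, 6 of the 8 samples land on the diagonal, giving a ratio of 75%. The off-diagonal counts sit in the C and D rows, mirroring the kind of overlap between two vowels that the caption describes.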

References

  1. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
  2. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
  3. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
  4. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Proc. NIPS 1097–1105 (2012).
  5. Esser, S. K. et al. Convolutional networks for fast, energy efficient neuromorphic computing. Proc. Natl Acad. Sci. USA 113, 11441–11446 (2016).
  6. Mead, C. Neuromorphic electronic systems. Proc. IEEE 78, 1629–1636 (1990).
  7. Poon, C.-S. & Zhou, K. Neuromorphic silicon neurons and large-scale neural networks: challenges and opportunities. Front. Neurosci. 5, 108 (2011).
  8. Shafiee, A. et al. ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars. Proc. ISCA 43, 14–26 (2016).
  9. Misra, J. & Saha, I. Artificial neural networks in hardware: a survey of two decades of progress. Neurocomputing 74, 239–255 (2010).
  10. Chen, Y. H., Krishna, T., Emer, J. S. & Sze, V. Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE J. Solid-State Circuits 52, 127–138 (2017).
  11. Graves, A. et al. Hybrid computing using a neural network with dynamic external memory. Nature 538, 471–476 (2016).
  12. Tait, A. N., Nahmias, M. A., Tian, Y., Shastri, B. J. & Prucnal, P. R. in Nanophotonic Information Physics (ed. Naruse, M.) 183–222 (Springer, 2014).
  13. Tait, A. N., Nahmias, M. A., Shastri, B. J. & Prucnal, P. R. Broadcast and weight: an integrated network for scalable photonic spike processing. J. Lightw. Technol. 32, 3427–3439 (2014).
  14. Prucnal, P. R., Shastri, B. J., de Lima, T. F., Nahmias, M. A. & Tait, A. N. Recent progress in semiconductor excitable lasers for photonic spike processing. Adv. Opt. Phot. 8, 228–299 (2016).
  15. Vandoorne, K. et al. Experimental demonstration of reservoir computing on a silicon photonics chip. Nat. Commun. 5, 3541 (2014).
  16. Appeltant, L. et al. Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468 (2011).
  17. Larger, L. et al. Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing. Opt. Express 20, 3241–3249 (2012).
  18. Paquot, Y. et al. Optoelectronic reservoir computing. Sci. Rep. 2, 287 (2011).
  19. Vivien, L. et al. Zero-bias 40 Gbit/s germanium waveguide photodetector on silicon. Opt. Express 20, 1096–1101 (2012).
  20. Cardenas, J. et al. Low loss etchless silicon photonic waveguides. Opt. Express 17, 4752–4757 (2009).
  21. Yang, L., Zhang, L. & Ji, R. On-chip optical matrix-vector multiplier. In SPIE Optical Engineering + Applications, 88550F (International Society for Optics and Photonics, 2013).
  22. Farhat, N. H., Psaltis, D., Prata, A. & Paek, E. Optical implementation of the Hopfield model. Appl. Opt. 24, 1469–1475 (1985).
  23. Harris, N. C. et al. Bosonic transport simulations in a large-scale programmable nanophotonic processor. Preprint at http://arXiv.org/abs/1507.03406 (2015).
  24. Schmidhuber, J. Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015).
  25. Lawson, C. L. & Hanson, R. J. Solving Least Squares Problems Vol. 15 (SIAM, 1995).
  26. Miller, D. A. B. Perfect optics with imperfect components. Optica 2, 747–750 (2015).
  27. Reck, M., Zeilinger, A., Bernstein, H. J. & Bertani, P. Experimental realization of any discrete unitary operator. Phys. Rev. Lett. 73, 58–61 (1994).
  28. Connelly, M. J. Semiconductor Optical Amplifiers (Springer Science & Business Media, 2007).
  29. Selden, A. Pulse transmission through a saturable absorber. Br. J. Appl. Phys. 18, 743 (1967).
  30. Bao, Q. et al. Monolayer graphene as a saturable absorber in a mode-locked laser. Nano Res. 4, 297–307 (2010).
  31. Schirmer, R. W. & Gaeta, A. L. Nonlinear mirror based on two-photon absorption. J. Opt. Soc. Am. B 14, 2865–2868 (1997).
  32. Soljačić, M., Ibanescu, M., Johnson, S. G., Fink, Y. & Joannopoulos, J. Optimal bistable switching in nonlinear photonic crystals. Phys. Rev. E 66, 055601 (2002).
  33. Xu, B. & Ming, N.-B. Experimental observations of bistability and instability in a two-dimensional nonlinear optical superlattice. Phys. Rev. Lett. 71, 3959–3962 (1993).
  34. Centeno, E. & Felbacq, D. Optical bistability in finite-size nonlinear bidimensional photonic crystals doped by a microcavity. Phys. Rev. B 62, R7683–R7686 (2000).
  35. Nozaki, K. et al. Sub-femtojoule all-optical switching using a photonic-crystal nanocavity. Nat. Photon. 4, 477–483 (2010).
  36. Ríos, C. et al. Integrated all-photonic non-volatile multilevel memory. Nat. Photon. 9, 725–732 (2015).
  37. Krizhevsky, A., Sutskever, I. & Hinton, G. E. in Imagenet Classification with Deep Convolutional Neural Networks (eds Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097–1105 (Curran Associates, 2012).
  38. Cheng, Z., Tsang, H. K., Wang, X., Xu, K. & Xu, J.-B. In-plane optical absorption and free carrier absorption in graphene-on-silicon waveguides. IEEE J. Sel. Top. Quantum Electron. 20, 43–48 (2014).
  39. Chow, D. & Abdulla, W. H. in PRICAI 2004: Trends in Artificial Intelligence (eds Booth, R. & Zhang, M.-L.) 901–908 (Springer, 2004).
  40. Deterding, D. H. Speaker Normalisation for Automatic Speech Recognition. PhD thesis, Univ. Cambridge (1990).
  41. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
  42. Baehr-Jones, T. et al. A 25 Gb/s silicon photonics platform. Preprint at http://arXiv.org/abs/1203.0767 (2012).
  43. Harris, N. C. et al. Efficient, compact and low loss thermo-optic phase shifter in silicon. Opt. Express 22, 10487–10493 (2014).
  44. Bertsimas, D. & Nohadani, O. Robust optimization with simulated annealing. J. Global Optim. 48, 323–334 (2010).
  45. Wang, Q. et al. Optically reconfigurable metasurfaces and photonic devices based on phase change materials. Nat. Photon. 10, 60–65 (2016).
  46. Tanabe, T., Notomi, M., Mitsugi, S., Shinya, A. & Kuramochi, E. Fast bistable all-optical switch and memory on a silicon photonic crystal on-chip. Opt. Lett. 30, 2575–2577 (2005).
  47. Horowitz, M. Computing's energy problem. In 2014 IEEE Int. Solid-State Circuits Conf. Digest of Technical Papers (ISSCC) 10–14 (IEEE, 2014).
  48. Arjovsky, M., Shah, A. & Bengio, Y. Unitary evolution recurrent neural networks. In Int. Conf. Machine Learning (2016).
  49. Sun, J., Timurdogan, E., Yaacobi, A., Hosseini, E. S. & Watts, M. R. Large-scale nanophotonic phased array. Nature 493, 195–199 (2013).
  50. Rechtsman, M. C. et al. Photonic Floquet topological insulators. Nature 496, 196–200 (2013).
  51. Jia, Y. et al. Caffe: convolutional architecture for fast feature embedding. In Proc. 22nd ACM Int. Conf. Multimedia (MM ’14), 675–678 (ACM, 2014).
  52. Sun, C. et al. Single-chip microprocessor that communicates directly using light. Nature 528, 534–538 (2015).

Author information

  1. These authors contributed equally to this work.

    • Yichen Shen &
    • Nicholas C. Harris

Affiliations

  1. Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

    • Yichen Shen,
    • Nicholas C. Harris,
    • Scott Skirlo,
    • Mihika Prabhu,
    • Dirk Englund &
    • Marin Soljačić
  2. Elenion, 171 Madison Avenue, Suite 1100, New York, New York 10016, USA

    • Tom Baehr-Jones &
    • Michael Hochberg
  3. Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

    • Xin Sun
  4. Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

    • Shijie Zhao
  5. Université de Sherbrooke, Administration, 2500 Boulevard de l'Université, Sherbrooke, Quebec J1K 2R1, Canada

    • Hugo Larochelle

Contributions

Y.S., N.C.H., S.S., X.S., S.Z., D.E. and M.S. developed the theoretical model for the optical neural network. N.C.H. designed the photonic chip and built the experimental set-up. N.C.H., Y.S. and M.P. performed the experiment. Y.S., S.S. and X.S. prepared the data and developed the code for training MZI parameters. T.B.-J. and M.H. fabricated the photonic integrated circuit. All authors contributed to writing the paper.

Competing financial interests

The authors declare no competing financial interests.
