Double-deep Q-learning to increase the efficiency of metasurface holograms

Sajedian, Iman; Lee, Heon; Rho, Junsuk

doi:10.1038/s41598-019-47154-z

Download PDF

Article
Open access
Published: 29 July 2019

Double-deep Q-learning to increase the efficiency of metasurface holograms

Scientific Reports volume 9, Article number: 10899 (2019) Cite this article

6091 Accesses
69 Citations
Metrics details

Subjects

Abstract

We use a double deep Q-learning network (DDQN) to find the right material type and the optimal geometrical design for metasurface holograms to reach high efficiency. The DDQN acts like an intelligent sweep and could identify the optimal results in ~5.7 billion states after only 2169 steps. The optimal results were found between 23 different material types and various geometrical properties for a three-layer structure. The computed transmission efficiency was 32% for high-quality metasurface holograms; this is two times bigger than the previously reported results under the same conditions. The found structure is transmission-type and polarization-independent and works in the visible region.

Recent advances in metasurface design and quantum optics applications with machine learning, physics-informed neural networks, and topology optimization methods

Article Open access 07 July 2023

Deep learning enabled topological design of exceptional points for multi-optical-parameter control

Article Open access 16 September 2023

A programmable diffractive deep neural network based on a digital-coding metasurface array

Article 21 February 2022

Introduction

Metasurfaces can manipulate the phase and spectrum of impinging light^{1,2,3,4,5,6,7,8}. The geometrical properties of metasurfaces can be tuned to change the phase of light for desired applications. Many applications have emerged from this idea, including ultrathin lenses^9,10, vortex beam generators^11,12 and holograms^{13,14,15,16,17,18,19,20,21}. Different designs have been proposed to increase the holograms’ efficiency^18,19,20,21; some used metallic structures but the efficiency in visible wavelengths was low because of metals’ intrinsic loss^22,23,24,25. Some efficient designs work only for reflected light²⁶.

The incident light’s polarization must be considered when a metasurface hologram is designed. Some of the proposed metasurfaces are noncircular^18,27,28,29 so they only work for a specific polarization. Some are highly efficient and work independently of polarization, but they do not work in visible light²⁰. The ideal structure would generate the whole phase map, work with transmitted light, work in the visible regime, be independent of polarization, and have high efficiency. No structure has yet satisfied all these conditions simultaneously.

The task of finding the best geometrical parameters and choosing the right materials for a structure is always a challenge. Recently, researchers have used neural networks (NNs) to design nanophotonic structures^{30,31,32,33,34,35,36}, and to design chiral metamaterials³⁷. The double deep Q-learning network (DDQN)³⁸ has been used to find the optimized parameters for a photonic structure³⁹. Here we use DDQN to optimize the design of a metasurface holograms. This method is like an intelligent sweep, which learns how to efficiently explore the given parameter space to reach the highest reward in the lowest time, and it can be used to optimize a given physical structure.

We would like to clarify why using DDQN (which belongs to the family of reinforcement learning (RL) methods) is much more efficient (or probably the only way) than using classic optimization methods like Monte-Carlo search, swarm intelligence, genetic algorithms or Bayesian method as a few examples. In neural networks, a number of hidden layers of nodes connect the input data to the output data. This connection is then optimized by an optimization method. This way the network not only optimizes the problem but also learns from it. To compare this to a human mind, assume that we are getting a reward for each step that we take. For example if we get 10 points for the first step and 20 points for the second step, we immediately learn that to get to 100 points we need to take 10 steps, and no further actions is needed to be taken. In a same way, an RL method optimizes a problem by learning from it and not just finding a maxima or minima in a mathematical space (like optimization methods). This leads to a much more efficient optimization method and in some cases the only method that can find the solution for a complex problem.

A detailed benchmark of DDQN compared to other methods is provided by other researchers³⁸. We explained the details of this method and how it can be used in optics and its benefits previously³⁹. Reinforcement learning was also used for identifying variational protocols in quantum physics⁴⁰ as an optimizing mechanism.

Methods

We can tune the phase of the incident light by changing the geometrical properties of the metasurfaces. The metasurface consists of nano-antennas with dimensions much smaller than the operating wavelength, so the thin layer of metasurface acts as a homogeneous medium with different refractive index n than the surrounding medium. The function of this thin layer is to apply a small phase delay to the light that is passing through it²⁶. n of this thin layer can be controlled by changing the geometrical properties of the nano-antennas; the change in n leads to different phase delays by different structures. So each structure creates a phase delay. This way we can create a phase map (which is a collection of different phase delays from −π to π) by combining different structures.

Some authors use non circular Nano-antennas (like V-shaped²⁵, rectangular²²,…) to produce the required phase map. As an example, V-shaped Nano-antennas can easily be tuned by changing the length of each antenna or by changing the angle between them, to create the required phase delay. But they only work for a specific polarization. A structure that can produce a polarization independent phase delay should be cylindrical. But the problem with cylindrical structures is that they can only be tuned by their radius to generate the required phase delay, compared to non-circular shaped structures (since the thickness should be kept constant for manufacturing limitations). This makes it very hard to find the right structure which can generate the whole phase map and at the same time have a high transmission efficiency. Adding lattice constant and material type variables (which is common between all holograms) to this problem makes it very hard for human researchers to check all the possibilities to find the optimum structures. Here we use an AI code to help us find the optimum structure.

Structure definition

Here we try to find the optimal structure type to achieve high efficiency in the visible range for transmission-type holograms. The metasurface structure that we chose for this idea is a nano-disk laying on a thin film that is laying upon a grating, all on a glass substrate (Fig. 1(a)). Having this structure as the starting point covers many possibilities. The combination of the grating with nano-disks can increase transmission in metallic metamaterials by forming a structure that is similar to a Fabry-Perot cavity⁴¹. Using the circular shape for nanoantennas makes the hologram independent of polarization. All geometrical properties (except disk radius) and material types will be found using the DDQN. The important factor here is that the DDQN decides to use the starting structure as it is or to change it (removing grating, thin film, or both).

DDQN structure

DDQN can be used to optimize a physical structure, as described previously³⁹. Briefly, based on the given and future rewards, DDQN tries to connect the state of the structure to the action that should be taken (Fig. 1(b)). In DDQN we have two neural network models. A main model and an auxiliary model. The auxiliary model is used to update the main model’s weights, and the main model is used to predict the actions. These models had 3 hidden layers with 24, 48, 24 neurons each, with an Adam optimizer with a learning rate of 0.005³⁹. A Markov decision process⁴² is used to predict the actions.

The model creates some data for itself by initial guessing at the beginning and by doing some actions (the model creates some data by itself from what it learned so far) as the code progresses, and all of these data are saved as an experience replay. This experience replay keeps getting updated as the model progresses (the old data is replaced by new data) so the model learns from the newly generated data. In other words, the model is training on data that is continuously updated.

An epsilon-greedy method is used to create the initial database. This method determines when the guessing should finish and the learning should start. To do this we define an epsilon function starting from 0.95 to 0.1 with a decay rate of 0.995 as shown in Fig. 2. At each step, a random number is generated by the code. If the generated random number was lower than epsilon, then the model guesses the next action randomly (known as exploration), and if it was higher than epsilon the model predicts the next action by what it learned so far. At each step, the epsilon decays until it reaches 0.1 (this assures that the model always has a 10 percent chance of exploration).

At each step, the DDQN changes a geometrical property or material type of the structure. Based on the given feedback from the simulating environment it learns the effect of the change it made, and so it learns how to act better in future. The model consists of three parts: (1) the state of the structure at each step; (2) the action that should be taken to change the geometrical properties of the structure at each step; (3) a reward system that awards or penalizes the model for the action that it chose.

The state of the structure is composed of the geometrical properties and material types of the structure at each step:

Nano-disk material type (D_M): 23 different materials.
Thin film material type (F_M): 23 different materials.
Grating material type (G_M): 23 different materials.
Nano disk thickness (D_H): 10 nm–250 nm, step size: 10 nm; number of steps: 24
Film thickness (F_H): 10 nm–150 nm, step size: 10 nm; number of steps: 14
Grating thickness (G_H): 10 nm–150 nm, step size: 10 nm; number of steps: 14
Grating coverage (G_C) in each unit cell: 0–100%; number of steps: 10
The spacing between disks (L): 20 nm–120 nm, step size: 10 nm; number of steps: 10

The total number of possible states = 23 × 23 × 23 × 24 × 14 × 14 × 10 × 10 = 5,723,356,800.

We did not include the disk’s radius in the parameters, because it is used to evaluate the structure’s ability to produce the needed phase map. At each state, a separate loop is performed on the disk’s radius between 45 nm to 190 nm and the phases generated by different radii are computed and saved. If the phase range generated by the structure is big enough for holographic uses, the structure is considered as a candidate for optimization by the DDQN. All of these processes are performed in the reward system.

Action definitions

The next step is to define the actions, which determine what the model should do at each step. To change the material of the disks, film, and grating we represented 23 materials in a matrix (Table 1). If the model wants to change the material of one of the parts, it simply changes the index of the material matrix of that specific part. Two actions are defined for changing the material of each part: one to increase the material’s matrix index and one to decrease it. The definitions of actions are shown in Table 2.

Table 1 List of materials used for disks, thin film, and grating.

Full size table

Table 2 Definitions of actions used in DDQN.

Full size table

Reward system

The final step is to define a reward system. It gives feedback to the model at each step, so it learns to improve its actions in future steps. We designed the reward system to give the highest feedback to the phase-generating property of the structure, and lesser feedback to its efficiency; i.e., the model prioritizes the structures that increase the phase map, then considers their efficiencies. To do this we divided the range of −π to π into six equal parts. A model gets 100 points for finding a structure that generates one of these parts, so in total, a model can get 600 points for finding a structure that can generate the whole phase map. A model gets additional points for the minimum transmitted power of the structure times 100. For example, if a structure generates four phase parts and has a minimum transmitted power of 0.25 it will get 4 × 100 + 0.25 × 100 = 425 points. This way, a structure that generates a large number of phase parts will be preferred by the model compared to a structure with a lower number of generated phase parts regardless of the structure’s efficiency. We choose this scheme because we seek a structure that can generate the whole phase map. The scheme also sets the terminating reward of the structure as 700 as is needed for DDQN model. A score of 700 means that the model has reached its ideal structure and should stop looking for new structures. The final found structure by DDQN is shown in (Fig. 1(c)).

Generating the hologram

Now we discuss how the found structure can be used to generate holograms. The first step is to find the phase that the structure generates. This procedure is performed by calculating the S-parameter, which shows the generated phase. We generated the whole phase range of [−π, π] while changing the radius of the disk from 45 nm to 190 nm. The radius affected the phase and amplitude of the S-parameter of the transmitted light (Fig. 3(a)). It also affects the transmission (Fig. 3(b)), which will be used for calculating hologram’s efficiency.

To generate the hologram, we need the phase map of our desired image. To find the phase map of a given image, we used an algorithm⁴³ that creates a numerical phase map from a given image. Once we have the phase map, the next step is to construct the phase map by metasurfaces (Fig. 4(a)). The phase map is a matrix of numbers. We replace each phase by its corresponding diameter (Fig. 3(a)); this process yields a matrix of diameters by which we can construct an array of metasurfaces, and create the needed phase map and also calculate the efficiency of the hologram (Fig. 3(b)). The full phase map contains all the phases. A Fourier transform of the phase map generates a hologram (Fig. 4(b)). This procedure is done physically by using a Fourier transform lens⁴⁴ or done numerically by applying a Fourier transform to the phase map matrix.

To generate the whole phase map, we need all radii from 45 nm to 190 nm. However, fabricating radii with a 1 nm is precision is impossible in practice, so we chose only some of them. This process is known as an m-level phase map, in which m represents the number of chosen radii. For example, m = 6 means that the phase map has 6 levels or in other words only 6 radii. This procedure leads to loss of data and decreases the quality of the recovered image (Fig. 5(a–d)). Only 6 cylinders were used to calculate the final phase map and since the average of transmission power of those cylinders were higher than the average of transmission power of all the cylinders, the efficiency of 6-level phase map is higher in this case.

Results

The simulations were done at 532 nm (green) to be compatible with most recent experimental work in the visible range^21,23,45. The numerical simulations were performed in Lumerical and the machine learning codes are written in Python. A machine with a 16-core 3.40 GHz processor, 64 GB of RAM, and a NVIDIA GTX 1080ti GPU with 11GB DDR5X RAM was used. Although the number of states was ~5.7 billion, DDQN found the optimal results in only 2169 steps. As can be seen the model could find the results pretty fast. This may imply that the model just found this result by random guessing. As can be seen from Fig. 2 after step 400 the model predicts the actions just by learning and only 10 percent of the actions are done by guessing. It should be noted that there might be better answers than what the model found. So based on how long we let the model run, how the good the rival results are, initial conditions or complexity of the problem, it may take longer or shorter to find the optimum results.

It took a month for coding and running the model. The time is variable for different problems based on their complexity.

The final structure is as follows:

Nano-disk material type (D_M): 19 (Indium phosphide) (Table 2)
Thin film material type (F_M): 23 (SiO2 Glass)
Grating material type (G_M): 23 (SiO2 Glass)
Nano disk thickness (D_H): 190 nm
Film thickness (F_H): 10 nm
Grating thickness (G_H): 20 nm
Grating coverage (G_C) in each unit cell: 0\%
The spacing between disks (L): 70 nm
Minimum transmitted power: 0.16

We can compute the efficiency of the hologram by calculating the average transmitted power from the phase map. We have the transmission for each of the radii (Fig. 3(b)), so by counting the number of each of the radii used in the phase map and calculating the average of transmission power, we can estimate the efficiency of the corresponding phase map (Fig. 5(a)). This is a rough estimation for two reasons. First the coupling between adjacent cylinders should be considered, and second, each image will have its own phase map and so different setup of cylinders is used for each image which results in different transmission efficiencies. So the only way to correctly find the transmission efficiency of a hologram is by fabricating it, and each image will have its own efficiency²¹. But this method gives us an approximate estimation of the average transmission efficiency as is shown in Fig. 4.

The computed transmission efficiency was 32% for a high-quality recovered image (Fig. 5(a)). Compared to the structures with the same properties (transmission type, polarization independent, and in visible regime) our proposed structure’s transmission efficiency is two times higher than²¹ with 17% transmission efficiency (theoretical) (6% experimental), and much higher than other similar work²⁴ with <1% transmission efficiency(experimental). It should be noted that what we computed here is the total transmission efficiency (that is defined as the ratio of image intensity to the total power of incident light) and it shouldn’t be confused with diffraction efficiency (that is defined as the ratio of image intensity to the total power of hologram plane²⁴). In diffraction efficiency, the source monitor is placed after the hologram (unlike the transmission efficiency in which the source monitor is placed before hologram) and so the efficiency is much higher compared to transmission efficiency, since the effect of hologram is neglected. A comparison of our hologram’s transmission efficiency with some of the previously reported results is shown in Table 3.

Table 3 Comparison of our hologram’s efficiency with some of the previously reported results.

Full size table

Conclusion

Here, we used double deep Q-learning to optimize a hologram structure to increase its efficiency. The DDQN model optimized the geometrical properties and also found the best material types for the structure. The hologram structure reported here is transmission type, works in the visible range and is independent of polarization. The previously reported structures with these properties had a maximum of 17% transmission efficiency, but our AI code could find a structure that had a 32% transmission efficiency while yielding a high-quality output.

Data Availability

All data generated or analysed during this study are included in this published article (and its Supplementary Information files).

References

N. K. Kildishev, A. V., Boltasseva, A. & Shalaev, V. M. Broadband light bending with plasmonic nanoantennas. Science 335, 427–427 (2012).
Article ADS Google Scholar
Sun, S. et al. Gradient-index meta-surfaces as a bridge linking propagating waves and surface waves. Nature materials 11, 426 (2012).
Article ADS CAS Google Scholar
Yu, N. et al. Light propagation with phase discontinuities: generalized laws of reflection and refraction. science 334, 333–337 (2011).
Article ADS CAS Google Scholar
Kim, I. et al. Thermally robust ring-shaped chromium perfect absorber of visible light. Nanophotonics 7, 1827–1833 (2018).
Article CAS Google Scholar
Rana, A. S. et al. Tungsten-based ultrathin absorber for visible regime. Scientific Reports 8, 2443 (2018).
Article ADS Google Scholar
Kim, I. et al. Outfitting next generation display with optical metasurfaces. ACS Photonics 5, 3876–3895 (2018).
Article CAS Google Scholar
Lee, T. et al. Plasmonic- and dielectric-based structural coloring: from fundamentals to practical applications. Nano Convergence 5, 1 (2018).
Article Google Scholar
Li, Z. et al. Full-space cloud of random points with a scrambling metasurface. Light: Science & Application 7, 63 (2018).
Article ADS CAS Google Scholar
Aieta, F. et al. Aberration-free ultrathin flat lenses and axicons at telecom wavelengths based on plasmonic metasurfaces. Nano letters 12, 4932–4936 (2012).
Article ADS CAS Google Scholar
Ni, X., Ishii, S., Kildishev, A. V. & Shalaev, V. M. Ultra-thin, planar, Babinet-inverted plasmonic metalenses. Light: Science & Applications 2, e72 (2013).
Article ADS Google Scholar
Li, G. et al. Spin-enabled plasmonic metasurfaces for manipulating orbital angular momentum of light. Nano letters 13, 4148–4151 (2013).
Article ADS CAS Google Scholar
Mahmood, N. et al. Polarisation insensitive multifunctional metasurfaces based on all-dielectric nanowaveguides. Nanoscale 10, 18323–18330 (2018).
Article CAS Google Scholar
Devlin, R. C., Khorasaninejad, M., Chen, W. T., Oh, J. & Capasso, F. Broadband high-efficiency dielectric metasurfaces for the visible spectrum. Proceedings of the National Academy of Sciences 113, 10473–10478 (2016).
Article ADS CAS Google Scholar
Lee, G.-Y. et al. Complete amplitude and phase control of light using broadband holographic metasurface. Nanoscale 10, 4237–4245 (2018).
Article CAS Google Scholar
Yoon, G. et al. “Crypto-display” in dual-mode metasurfaces by simultaneous control of phase and spectral responses. ACS Nano 12, 6421–6428 (2018).
Article CAS Google Scholar
Li, Z. et al. Dielectric meta-holograms enabled with dual magnetic resonances in visible light. ACS Nano 11, 9382–9389 (2017).
Article CAS Google Scholar
Ansari, M. A. et al. A spin-encoded all-dielectric metahologram for visible light. Laser and Photonics Reviews 13, 1900065 (2019).
Article ADS Google Scholar
Arbabi, A., Horie, Y., Bagheri, M. & Faraon, A. Dielectric metasurfaces for complete control of phase and polarization with subwavelength spatial resolution and high transmission. Nature nanotechnology 10, 937 (2015).
Article ADS CAS Google Scholar
Chong, K. E. et al. Efficient polarization-insensitive complex wavefront control using Huygens’ metasurfaces based on dielectric resonant meta-atoms. Acs Photonics 3, 514–519 (2016).
Article CAS Google Scholar
Wang, L. et al. Grayscale transparent metasurface holograms. Optica 3, 1504–1505 (2016).
Article Google Scholar
Yoon, G., Lee, D., Nam, K. T. & Rho, J. Pragmatic metasurface hologram at visible wavelength: the balance between diffraction efficiency and fabrication compatibility. ACS Photonics 5, 1643–1647 (2017).
Article Google Scholar
Li, X. et al. Multicolor 3D meta-holography by broadband plasmonic modulation. Science advances 2, e1601102 (2016).
Article ADS Google Scholar
Huang, K. et al. Ultrahigh-capacity non-periodic photon sieves operating in visible light. Nature communications 6, 7059 (2015).
Article ADS CAS Google Scholar
Huang, K. et al. Photon-nanosieve for ultrabroadband and large-angle-of-view holograms. Laser & Photonics Reviews 11, 1700025 (2017).
Article ADS Google Scholar
Ni, X., Kildishev, A. V. & Shalaev, V. M. Metasurface holograms for visible light. Nature communications 4, 2807 (2013).
Article ADS Google Scholar
Zheng, G. et al. Metasurface holograms reaching 80% efficiency. Nature nanotechnology 10, 308 (2015).
Article ADS CAS Google Scholar
Wang, B. et al. Visible-frequency dielectric metasurfaces for multiwavelength achromatic and highly dispersive holograms. Nano letters 16, 5235–5240 (2016).
Article ADS CAS Google Scholar
Qin, F. et al. Hybrid bilayer plasmonic metasurface efficiently manipulates visible light. Science advances 2, e1501168 (2016).
Article ADS Google Scholar
Ding, X. et al. Ultrathin Pancharatnam–Berry Metasurface with Maximal Cross-Polarization Efficiency. Advanced Materials 27, 1195–1200 (2015).
Article CAS Google Scholar
Liu, D., Tan, Y., Khoram, E. & Yu, Z. Training deep neural networks for the inverse design of nanophotonic structures. ACS Photonics 5, 1365–1369 (2018).
Article CAS Google Scholar
Peurifoy, J. et al. In Frontiers in Optics. FTh4A. 4 (Optical Society of America).
Malkiel, I. et al. Deep learning for design and retrieval of nano-photonic structures. arXiv preprint arXiv 1702, 07949 (2017).
Peurifoy, J. et al. Nanophotonic particle simulation and inverse design using artificial neural networks. Science advances 4, eaar4206 (2018).
Article Google Scholar
Sajedian, I., Kim, J. & Rho, J. Finding the optical properties of plasmonic structures by image processing using the combination of convolutional neural networks and recurrent neural networks. Microsystems and Nanoengineering 5, 27 (2019).
Article ADS Google Scholar
So, S. & Rho, J. Designing nanophotonic structures using conditional-deep convolutional generative adversarial networks. Nanophotonics 8, 1255–1261 (2019).
Article Google Scholar
So, S., Mun, J. & Rho, J. Simultaneous inverse design of materials and structures via deep learning: Demonstration of dipole resonance engineering using core-shell nanoparticles. ACS Applied Materials & Interfaces 11, 24264–24268 (2019).
Article CAS Google Scholar
Ma, W., Cheng, F. & Liu, Y. Deep-learning-enabled on-demand design of chiral metamaterials. ACS nano 12, 6326–6334 (2018).
Article CAS Google Scholar
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529 (2015).
Article ADS CAS Google Scholar
Sajedian, I., Badloe, T. & Rho, J. Optimisation of colour generation from dielectric nanostructures using reinforcement learning. Optics express 27, 5874–5883 (2019).
Article ADS Google Scholar
Bukov, M. et al. Reinforcement learning in different phases of quantum control. Physical Review X 8, 031086 (2018).
Article ADS CAS Google Scholar
Sajedian, I., Zakery, A. & Rho, J. High efficiency second and third harmonic generation from magnetic metamaterials by using a grating. Optics Communications 397, 17–21 (2017).
Article ADS CAS Google Scholar
Sutton, R. S. & Barto, A. G. Reinforcement learning: An introduction. (MIT, 1998).
Gerchberg, R. W. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik 35, 237–246 (1972).
Google Scholar
Goodman, J. W. Introduction to Fourier optics. (Roberts and Company Publishers, 2005).
Huang, K. et al. Silicon multi-meta-holograms for the broadband visible light. Laser & Photonics Reviews 10, 500–509 (2016).
Article ADS CAS Google Scholar

Download references

Acknowledgements

This work is financially supported from the National Research Foundation (NRF) grants (NRF-2018M3D1A1058998, NRF-2019R1A2C3003129, CAMM-2019M3A6B3030637 and NRF2015R1A5A1037668) funded by the Ministry of Science and ICT (MSIT), Republic of Korea.

Author information

Authors and Affiliations

Department of Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea
Iman Sajedian & Junsuk Rho
Department of Materials Science and Engineering, Korea University, Seoul, 02842, Republic of Korea
Iman Sajedian & Heon Lee
Department of Chemical Engineering, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea
Junsuk Rho

Authors

Iman Sajedian
View author publications
You can also search for this author in PubMed Google Scholar
Heon Lee
View author publications
You can also search for this author in PubMed Google Scholar
Junsuk Rho
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

J.R. and I.S. conceived the concept and initiated the project. I.S. designed the research and performed numerical calculations. H.L. provided the materials chracterizations. I.S. and J.R. wrote the manuscript. All authors contributed to the discussion, analysis and confirmed the final manuscript. J.R. guided the entire the project.

Corresponding author

Correspondence to Junsuk Rho.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Sajedian, I., Lee, H. & Rho, J. Double-deep Q-learning to increase the efficiency of metasurface holograms. Sci Rep 9, 10899 (2019). https://doi.org/10.1038/s41598-019-47154-z

Download citation

Received: 24 April 2019
Accepted: 09 July 2019
Published: 29 July 2019
DOI: https://doi.org/10.1038/s41598-019-47154-z

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.