Introduction

Metasurfaces are two-dimensional (2D) versions of metamaterials that provide a new platform for realizing high-performance optical devices and components1,2,3,4,5,6,7,8,9,10,11,12,13,14. By controlling the geometry of each unit cell (i.e., meta-atom), the phase, amplitude, and polarization of optical waves can be fully engineered at the sub-wavelength scale to control light propagation. Recently, significant efforts have been made to move metasurfaces from proof-of-concept demonstrations to practical adoption in beachhead applications exemplified by 3D sensing and augmented reality15,16,17,18,19,20,21. Transition of optical metasurface technologies to the commercial realm is facilitated by their compatibility with the high-volume semiconductor manufacturing processes that are well established for electronic devices. Indeed, there are already multiple instances of wafer-scale metasurface fabrication22,23,24,25,26. As industrial implementation of metasurfaces progresses, it becomes imperative to minimize the performance gap between simulated and manufactured devices. Such differences can be partly attributed to fabrication imperfections, which cause meta-atom geometries to deviate from their designs. In addition, approximations assumed in large-area metasurface modeling are another source of discrepancy. The canonical metasurface design recipe involves full-wave modeling of individual meta-atom responses with periodic boundary conditions (known as the local phase approximation, LPA), followed by assembly of the so-designed meta-atoms according to optical phase profiles optimized via ray-trace simulations. This approach avoids computationally intensive simulation of large-area metasurface devices, but it fails to account for interactions (i.e., optical coupling) between dissimilar neighboring meta-atoms and is often not applicable to metasurface designs with discontinuous phase profiles. Together, these errors can lead to significant disparities between designed and actual performance, increasing development time and cost. Therefore, mitigating these sources of error is crucial to expanding metasurface applications.
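The LPA-based recipe described above can be illustrated with a short sketch. It is not taken from any of the cited works: the eight-entry library, the pillar diameters, and all function names are hypothetical, and a hyperbolic metalens phase profile is assumed as the target. In practice each library phase would come from a full-wave unit-cell simulation with periodic boundary conditions.

```python
import math

# Hypothetical meta-atom library: (pillar diameter in nm, phase in radians).
# Real entries would come from unit-cell simulations under periodic
# boundary conditions; these values are placeholders for illustration.
LIBRARY = [(100 + 20 * i, 2 * math.pi * i / 8) for i in range(8)]


def target_phase(r_um, focal_um, wavelength_um):
    """Hyperbolic lens phase at radius r, wrapped to [0, 2*pi)."""
    path_difference = math.sqrt(r_um ** 2 + focal_um ** 2) - focal_um
    return (-2 * math.pi / wavelength_um * path_difference) % (2 * math.pi)


def phase_distance(a, b):
    """Shortest angular distance between two phases."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def pick_meta_atom(phi):
    """Return the library diameter whose phase is closest to phi."""
    return min(LIBRARY, key=lambda entry: phase_distance(entry[1], phi))[0]


def design_radial_cut(n_cells, pitch_um, focal_um, wavelength_um):
    """Assign one meta-atom per lattice site along a radial line."""
    return [
        pick_meta_atom(target_phase(i * pitch_um, focal_um, wavelength_um))
        for i in range(n_cells)
    ]
```

Each site is chosen independently of its neighbors, which is exactly the simplification that makes the LPA fast and also the reason it neglects coupling between dissimilar meta-atoms.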

This perspective focuses on machine learning (ML) as a promising route to address the abovementioned challenges, highlighting both recent technological progress and future prospects. Various ML methods have demonstrated the potential to alleviate metasurface design burdens without compromising optical performance27,28,29,30,31,32,33,34,35,36,37,38,39. They also provide a facile means to assess fabrication tolerance and hence the manufacturability of metasurface structures. Another area where artificial intelligence (AI) and ML are likely to make a significant impact is the computational backend, where AI-based post-processing can effectively compensate for imperfections or even intrinsic limitations of metasurface optical devices (e.g., chromatic aberration). We foresee that these AI-based approaches will contribute significantly to expediting the industrial adoption of metasurface optics technologies (Fig. 1).

Fig. 1: A graphic summary of AI-based approaches for metasurface applications.

AI-based approaches will likely make a significant impact on AI-enabled metasurface design-for-manufacturing (DFM), design beyond the classical local phase approximation, and the AI-empowered computational backend.

AI-enabled metasurface design-for-manufacturing

Design-for-manufacturing (DFM) aims to formulate device designs optimized for industry-scale manufacturing and packaging processes, and thus constitutes an important element in product development cycles. In the context of optical metasurfaces, multiple examples have been reported40,41,42,43, and AI in particular can simplify and expedite DFM-centric device engineering in two ways. First, ML provides a powerful tool to develop nonintuitive advanced metasurface designs with full consideration of fabrication tolerance. Deviation of the fabricated metasurface geometry from the design layout is probably the most commonly encountered fabrication imperfection and has thus received considerable attention. For example, efforts have been made to design metasurfaces whose performance remains robust against shape distortions. Jenkins et al. developed an optical performance prediction system for metagratings using deep learning44. Using this prediction system, they evaluated how transmission diffraction efficiencies evolved with dimensional deviations to find robust designs insensitive to geometric changes. As a proof of concept, they limited the deterioration in diffraction efficiency to approximately 15%, whereas a normal design would suffer an approximately 50% reduction in efficiency under a similar level of edge deviation (Fig. 2a, b). Another approach involves selecting meta-atoms suitable for processing from an extensive meta-atom library based on the manufacturing constraints. In a study by Ueno et al., computationally efficient prediction of meta-atom performance was attained using a convolutional neural network (CNN) that takes 2D cross-sections of meta-atoms as the input, thereby facilitating construction of a diverse meta-atom library. Using a meta-atom selector tailored to actual manufacturing conditions, fabrication-friendly meta-atoms were chosen from the library for metasurface design (Fig. 2c)45.
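The robustness screening idea can be sketched as a worst-case search over edge deviations. This is a toy illustration only: the Gaussian "surrogate" below is a hypothetical stand-in for a trained efficiency predictor, and the design names, widths, and tolerances are invented for the example rather than taken from ref. 44.

```python
import math


def make_toy_surrogate(peak, optimum_nm, tolerance_nm):
    """Hypothetical stand-in for a trained efficiency predictor.

    Returns a function mapping feature width (nm) to efficiency, modeled
    here as a Gaussian peak purely for illustration.
    """
    def predict(width_nm):
        return peak * math.exp(-((width_nm - optimum_nm) / tolerance_nm) ** 2)
    return predict


def worst_case_efficiency(predict, nominal_nm, edge_dev_nm, n_steps=21):
    """Lowest predicted efficiency over widths within +/- edge_dev_nm."""
    worst = predict(nominal_nm)
    for k in range(n_steps):
        offset = -edge_dev_nm + 2 * edge_dev_nm * k / (n_steps - 1)
        worst = min(worst, predict(nominal_nm + offset))
    return worst


def most_robust(designs, edge_dev_nm):
    """Pick the design name with the best worst-case efficiency.

    designs maps a name to (predict, nominal width in nm).
    """
    return max(
        designs,
        key=lambda name: worst_case_efficiency(*designs[name], edge_dev_nm),
    )


# Two invented candidates: one with a higher but narrower efficiency peak.
designs = {
    "sharp": (make_toy_surrogate(1.0, 200.0, 10.0), 200.0),
    "broad": (make_toy_surrogate(0.9, 200.0, 50.0), 200.0),
}
```

With no edge deviation the "sharp" design wins on nominal efficiency, while a 10 nm deviation budget favors the "broad" design. A fast surrogate makes this kind of exhaustive tolerance sweep affordable where full-wave simulation of every perturbed geometry would not be.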

Fig. 2: Metasurface design considering workability and fabrication tolerance of meta-atoms.

a, b Establishing exhaustive metasurface robustness against fabrication uncertainties through deep learning44. Copyright (2021) Walter de Gruyter. c Deep-learning-designed, fabrication-friendly metasurfaces45. Copyright (2021) Walter de Gruyter.

Second, beyond formulating fabrication-friendly designs, AI can make a significant impact on manufacturing process development for metasurfaces. Fabrication-induced deviations from design can result from various processing steps such as lithography, resist development, and etching. Given the inherent complexity of these fabrication steps, AI provides a predictive process design tool that obviates iterative trial-and-error optimization with constant human intervention. A case in point is ML-assisted optical proximity correction (OPC). By modifying photomask patterns to compensate for diffraction effects, OPC has become a standard practice in state-of-the-art deep ultraviolet lithography to enhance pattern fidelity. The increasing demand for more complex meta-atom geometries for applications such as polarization control and dispersion engineering accentuates the importance of OPC in metasurface fabrication46,47. Liao et al. used a fully convolutional network model for lithographic simulation in OPC. The approach reduced critical dimension (CD) variations to an average of 1.69%, yielding a metalens focusing efficiency of 64%, close to the calculated value of 69%48.
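The basic logic of model-based mask correction can be sketched as a feedback loop around a process model. In ML-assisted OPC the process model would be a trained network; here a linear toy function stands in for it, and its coefficients, the function names, and the CD values are all assumptions made for illustration.

```python
def toy_process(mask_cd_nm):
    """Hypothetical lithography model: printed CD shrinks relative to the mask."""
    return 0.9 * mask_cd_nm - 5.0


def correct_mask(target_cd_nm, process, n_iter=20):
    """Iteratively bias the mask CD until the printed CD matches the target."""
    mask_cd = target_cd_nm  # start from the uncorrected layout
    for _ in range(n_iter):
        error = target_cd_nm - process(mask_cd)
        mask_cd += error  # simple feedback update
    return mask_cd


def mean_cd_variation_percent(targets_nm, printed_nm):
    """Average absolute CD deviation, as a percentage of the target CD."""
    deviations = [abs(p - t) / t * 100.0 for t, p in zip(targets_nm, printed_nm)]
    return sum(deviations) / len(deviations)
```

The second helper shows one plausible way to express an average CD variation as a percentage, comparable in form to the figure quoted above. The exact metric definition used in ref. 48 is not given in this text, so this should be read as an assumption.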

Beyond the local phase approximation

The classical recipe for designing metasurface devices follows a “unit-cell” approach based on the LPA, where a metasurface is synthesized using a set of meta-atoms as unit cells whose responses are modeled under the periodic boundary condition. The approach facilitates efficient design of large-area metasurface devices without excessive computational overhead. The LPA generally yields satisfactory results, especially when (1) the optical coupling between neighboring meta-atoms is weak, such as in the case of high-index-contrast waveguide-type meta-atoms with moderate aspect ratios; and (2) the optical phase gradient is small, such that the periodic boundary condition remains a reasonable approximation. These conditions, however, are being challenged in a growing number of applications. The discrepancy between this approach and experiment becomes unacceptable in the search for ultrahigh-efficiency metasurface designs, where any loss of efficiency cannot be tolerated. Its accuracy is also compromised in designs involving large light bending angles (a critical advantage of metasurfaces over traditional refractive or diffractive optics), exemplified by high numerical aperture (NA) and wide field-of-view (FOV) optics. In fact, theoretical analysis suggests that the LPA approach based on a fixed set of meta-atoms faces fundamental limitations in optical efficiency at large deflection angles due to exacerbated phase sampling error49. Additionally, optical coupling effects are simply non-negligible in many meta-atom systems50. Alternative schemes that circumvent these limitations while enabling accurate and efficient large-area metasurface design are therefore highly sought after.
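The phase sampling issue at large deflection angles can be made concrete with the first-order grating equation. The sketch below assumes normal incidence in air; the function names are hypothetical, and the wavelength and lattice pitch in the usage note are illustrative values, not parameters from ref. 49.

```python
import math


def samples_per_period(wavelength_nm, pitch_nm, deflection_deg):
    """Meta-atoms available per 2*pi phase period.

    Assumes first-order deflection at normal incidence in air, so the
    phase period equals wavelength / sin(deflection angle).
    """
    period_nm = wavelength_nm / math.sin(math.radians(deflection_deg))
    return period_nm / pitch_nm


def phase_step_deg(wavelength_nm, pitch_nm, deflection_deg):
    """Phase jump between adjacent meta-atoms, in degrees."""
    return 360.0 / samples_per_period(wavelength_nm, pitch_nm, deflection_deg)
```

For an assumed 1550 nm wavelength and 500 nm pitch, a 10-degree deflection leaves roughly 18 meta-atoms per phase period, whereas a 70-degree deflection leaves only about 3. With so few samples the phase gradient is coarse and neighboring meta-atoms differ strongly, which is the regime where the periodic boundary assumption is least justified.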

AI-based techniques can contribute to solving this challenge in three ways. First, ML methods can be applied to predict and hence compensate for optical coupling between meta-atoms, thereby closing the gap between design and experimental results. Second, ML methods can be adopted to generate metagrating designs, which are then used to inform non-grating metasurface optimization. Lastly, new ML approaches promise modeling of large-area metasurface performance without full-scale brute-force electromagnetic simulations.

Deep neural network (DNN) predicts optical coupling between meta-atoms

In recent years, several novel DNN models have been proposed51,52 that take the shapes of neighboring meta-atoms as part of their input and use a large dataset to learn the impact of non-identical neighboring meta-atoms under realistic boundary conditions. Because these DNN models are limited to forward prediction of a target meta-atom's response given the dimensions of its neighbors, and adjusting one meta-atom alters the boundary conditions of its surrounding neighbors, an iterative optimization process is often adopted to identify the optimal design across the entire metasurface.

An et al. developed a DNN model51 to predict the transmission and phase delay of a target meta-atom, using its dimensions and those of its eight closest neighbors as inputs (Fig. 3a). The network is based on a CNN architecture composed of 6 consecutive convolutional layers and 3 fully connected layers (FCLs). Once fully trained with sufficient data, this model can rapidly characterize meta-atoms while taking mutual coupling effects into account. The tool effectively enhances the performance of metasurface devices, as demonstrated for beam deflectors and metalenses. For instance, a beam deflector’s efficiency increased from 41.3% before optimization to 68.8% after. Similarly, a metalens achieved over 20% improvement in focusing efficiency using this optimization framework. These gains were achieved by adjusting meta-atom configurations based on predicted local responses, thereby reducing phase errors.

In another example, Zhelyeznyakov et al.53 trained a DNN model that maps the geometry of the target meta-atom and its closest neighbors to its electromagnetic field response. An extended simulation domain was used to account for coupling along both x and y axes. The DNN architecture consists of 11 FCLs, each followed by a ReLU activation function. The training data were acquired by simulating the electromagnetic response of 10 metalenses with a diameter of 50 μm and focal lengths varying between 10 and 100 μm using the finite-difference time-domain (FDTD) method. A direct comparison between the electric field simulated by FDTD and that predicted by the fully trained DNN is shown in Fig. 3b. Compared to conventional adjoint optimization approaches, the overall design process is significantly expedited using this AI approach, even with the data collection and model training time taken into consideration.

Ha et al.52, on the other hand, took a different approach that includes an inverse design network, enabling direct generation of target meta-atom dimensions using the desired transmitted field distribution and the surrounding meta-atom neighbors as input. This significantly streamlines the design process compared to the iterative adjoint optimization methods in refs. 51,53. For instance, a metalens with 50 μm diameter working at 1550 nm wavelength was designed in just 15 s. In comparison, completing 20 iterations of a conventional adjoint-based optimization approach, which served for data collection purposes, would take over 15 days.

These AI-enabled designs do not rely on the LPA and can therefore account for mutual coupling between unit cells. The methods serve as a practical intermediary between the computationally intensive full-wave simulations of entire metasurfaces and the simpler, LPA-based unit cell simulations used to derive combined phase fronts. Given that full-wave simulations of whole metasurfaces are often prohibitively expensive, these AI-driven approaches offer a more efficient alternative for designing metasurfaces with enhanced optical performance. However, training DNNs to predict the coupling between meta-atoms, while avoiding full-wave simulations, still requires a massive amount of data. Although the data collection process can be accelerated through parallel computing, the data requirements remain substantial. Additionally, the inverse design process introduces complexities, as adjustments to one meta-atom can affect the phase front of all its neighbors. This requires iterative optimizations that are further complicated when the shapes of the meta-atoms vary, such as in freeform designs, as detailed in ref. 51.
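The neighbor-aware input used by such predictors can be sketched as a 3 × 3 patch extracted around each meta-atom. This is only an illustration of the data layout, not the preprocessing of any cited model: the function names, the zero padding at the array edge, and the use of a single scalar dimension per meta-atom are assumptions.

```python
def neighborhood(grid, row, col, pad_value=0.0):
    """Return the 3 x 3 patch centered on (row, col), flattened row by row.

    Sites outside the array are filled with pad_value (an assumed
    convention for edge meta-atoms).
    """
    patch = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            patch.append(grid[r][c] if inside else pad_value)
    return patch


def build_inputs(grid):
    """One 9-element input vector per meta-atom, in row-major order."""
    return [
        neighborhood(grid, r, c)
        for r in range(len(grid))
        for c in range(len(grid[0]))
    ]
```

Because every interior meta-atom appears in nine different patches, changing one dimension alters the predictor input for all of its neighbors. This is why forward-only models are typically wrapped in an iterative optimization loop.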

Fig. 3: Metasurface design utilizing AI to enhance optical performance.

a A CNN that predicts a meta-atom’s transmission and phase delay under realistic boundary conditions, accounting for dissimilar meta-atom neighbors51. b A fully connected network (FCN) that predicts the electromagnetic (EM) field response of a meta-atom based on its geometry and closest neighbors53. c An autoencoder that optimizes the configuration of meta-atoms within a 5 × 5 array to achieve the desired transmitted field distribution, considering inter-meta-atom coupling effects54.

AI-enhanced designs based on metagrating assembly

A phase-gradient metasurface with a continuous phase function (e.g., a metalens) can be approximated as a collection of different regions, each of which corresponds to one light deflection angle. Therefore, the task of designing a phase-gradient metasurface can be decomposed into optimization of a series of (in general 2D) metagratings with different light bending angles54,55. Modeling of metagratings involves simulating their repeating units (which comprise multiple meta-atoms) under the periodic boundary condition. In this way, metagratings designed using the “super-cell” approach naturally account for inter-meta-atom coupling. While phase delay is the main design parameter for meta-atoms, metagrating modeling necessarily encompasses different deflection angles, orientations (for 2D metagratings), and initial phase settings. Despite the efficiency of modeling individual metagratings using methods such as rigorous coupled-wave analysis, the more extensive optimization process for metagratings can be computationally onerous. AI is therefore expected to play an important role in streamlining the design process. For example, neural networks can be applied to expedite metagrating design, potentially even extending the meta-atom geometries from simple regular shapes to freeform structures to further enhance performance.

Jiang et al. proposed a platform that uses conditional generative adversarial networks (CGANs) in combination with iterative optimization to enhance the design of metagratings56. This study employed CGANs to learn geometric features from a set of training images, specifically cross sections of freeform metagratings. The CGAN framework comprises two interconnected deep networks: a generator and a discriminator (Fig. 4a). During training, the two networks are trained in alternation: the generator aims to create realistic-looking device designs to deceive the discriminator, while the discriminator endeavors to distinguish between real and generated designs. Upon training completion, 5000 different layouts of metagratings with a 70-degree deflection angle at 1200 nm wavelength were produced using the trained generator. Remarkably, the generated designs included metagratings with over 60% efficiency, showcasing the CGAN’s ability to learn and generalize key features in metagrating design.

Following this work, Wen et al. proposed a progressive GAN (PGGAN) for the design of robust, freeform, high-resolution, high-dimensional metagratings (Fig. 4b)57. Compared to traditional GANs, the PGGAN significantly improves training stability and the quality of the generated images, making it particularly effective for creating high-resolution metasurfaces with intricate design details. The network and device resolution gradually increase from 16 × 32 pixels to 32 × 64, and finally to 64 × 128 during the training process. Upon training completion, the average efficiency of the metagratings generated by the PGGAN is 14% higher than that of metagratings generated by basic GANs. By integrating this AI-enhanced metagrating design approach into large-scale metasurface designs assembled from combined metagratings58, we anticipate that the optical performance of the metasurfaces could be significantly improved. This potential enhancement is due to the elimination of the LPA and the expanded design space.
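The decomposition of a lens into metagrating zones can be sketched from simple geometry. The snippet below assumes an ideal focusing lens under normal incidence in air and splits the aperture into equal-width annular zones; the function names and the zoning scheme are hypothetical choices for illustration, not the procedure of refs. 54, 55, or 58.

```python
import math


def local_deflection_deg(radius_um, focal_um):
    """Angle by which a normally incident ray at this radius bends to reach the focus."""
    return math.degrees(math.atan2(radius_um, focal_um))


def grating_period_nm(wavelength_nm, deflection_deg):
    """First-order grating period for normal incidence in air."""
    return wavelength_nm / math.sin(math.radians(deflection_deg))


def zone_angles(lens_radius_um, focal_um, n_zones):
    """Deflection angle at the mid-radius of each of n_zones equal-width annuli."""
    width = lens_radius_um / n_zones
    return [
        local_deflection_deg((k + 0.5) * width, focal_um)
        for k in range(n_zones)
    ]
```

Under these assumptions, the 70-degree, 1200 nm metagratings mentioned above correspond to a period of roughly 1277 nm. Each zone angle returned by `zone_angles` would then define one metagrating design target, with the outermost zones demanding the largest bending angles.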

Fig. 4: Metagrating design utilizing generative AI.

a A conditional GAN for the design of freeform metagratings with on-demand deflection angle at specified wavelength56. b A progressive GAN for robust high-resolution, high-dimensional metagrating designs57.

Fig. 5: Image processing frameworks for improving metasurface imaging performance.

a Achromatic single-metalens imaging via a deep neural network68. b ML-based approach for aberration-free metalens70. c Design framework to co-optimize the meta-optic and reconstruction algorithm73.

AI-empowered computational backend

AI and ML techniques have already been widely applied to image post-processing. For example, state-of-the-art smartphones merge multiple images to elevate image quality59. In the context of metasurface optics, a computational backend can similarly be implemented to enhance output quality or even ameliorate inherent performance limitations60,61. One example that highlights the efficacy of AI-augmented metalens imaging is chromatic aberration compensation. Chromatic aberration in metalenses, which stems from the wavelength dependence of Fresnel zone boundary positions62, worsens with increasing metalens size, numerical aperture, and FOV63,64,65. While broadband operation via pure hardware correction (e.g., via dispersion or zone engineering66,67) is unrealistic for large metalenses with a moderate FOV and F-number, computational post-processing via DNNs is capable of restoring high-quality color scenes from heavily aberrated raw images captured by metalenses.

Full-color computational imaging with metalenses was recently demonstrated by several groups. Dong et al. used deep learning-based approaches to overcome chromatic aberration and enhance image resolution68 (Fig. 5a). They used a U-Net-based neural network architecture69, which is known for its efficiency in image processing tasks, to directly correct chromatic aberration, leveraging a training set compiled from diverse scenarios to ensure robustness and versatility across various aperture sizes, focusing distances, and real-world conditions. The network yielded reconstructed color images with a gain of over 10 dB in peak signal-to-noise ratio (PSNR) and a 35% increase in structural similarity index measure (SSIM) values, demonstrating the method’s effectiveness in achieving high-quality full-color imaging. In a similar work, Seo et al. introduced a deep learning-based image restoration framework to achieve full-color imaging for large-area metalenses70 (Fig. 5b). In a demonstration, the DNN improved the PSNR by 7.37 dB, the SSIM by 22.8%, and the Learned Perceptual Image Patch Similarity by 35.6% compared to the original color-aberrated images directly collected by the metalens.

The examples cited above used “standard” metasurfaces not specifically designed for broadband operation. A further performance boost can be attained through concurrent optimization of both the metasurface optics (the hardware frontend) and the post-processing algorithm (the computational backend). Such an end-to-end design framework has been discussed in a number of publications71,72. For instance, Tseng et al. developed a technique in which a differentiable metasurface model enables joint optimization of the metasurface design and the deconvolution post-processing algorithm within a differentiable image-formation model73 (Fig. 5c). This allows facile simulation of spatially varying point spread functions (PSFs) and their convolution with the sensor input, followed by neural-network-based deconvolution to reconstruct the final image. The neural nano-optics achieved a spatial resolution of 214 line pairs per millimeter (lp/mm) across all color channels at an object distance of 120 mm, a seven-fold improvement over the previous state of the art.

A major concern in this AI-empowered computational imaging field is the lack of standardized datasets and benchmarks, which are crucial for evaluating the scalability and real-world applicability of ML models. It has been observed that while models often perform well on limited academic datasets, their effectiveness tends to diminish in more diverse real-world settings. This highlights the critical need for robust benchmarks that truly reflect the complexities encountered in practical applications. To circumvent these limitations, the strategy of transfer learning is frequently discussed74. Transfer learning utilizes models that have been pre-trained on comprehensive, varied datasets, thereby enhancing their adaptability and improving generalization capabilities in computational imaging tasks.
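Since the restoration results above are reported as PSNR gains, a minimal sketch of the metric may help in interpreting them. It uses the standard definition, PSNR = 10 log10(MAX² / MSE), on flat lists of pixel values; the function names and the assumption of pixel values normalized to a maximum of 1.0 are illustrative choices.

```python
import math


def mse(reference, restored):
    """Mean squared error between two equally sized flat pixel lists."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)


def psnr_db(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, restored)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)
```

Because the scale is logarithmic, a 10 dB gain in PSNR corresponds to a tenfold reduction in mean squared error relative to the ground-truth image.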