Large area optimization of meta-lens via data-free machine learning

Sub-wavelength diffractive optics, commonly known as meta-optics, present a complex numerical simulation challenge, due to their multi-scale nature. The behavior of constituent sub-wavelength scatterers, or meta-atoms, needs to be modeled by full-wave electromagnetic simulations, whereas the whole meta-optical system can be modeled using ray/ Fourier optics. Most simulation techniques for large-scale meta-optics rely on the local phase approximation (LPA), where the coupling between dissimilar meta-atoms is neglected. Here we introduce a physics-informed neural network, coupled with the overlapping boundary method, which can efficiently model the meta-optics while still incorporating all of the coupling between meta-atoms. We demonstrate the efficacy of our technique by designing 1mm aperture cylindrical meta-lenses exhibiting higher efficiency than the ones designed under LPA. We experimentally validated the maximum intensity improvement (up to 53%) of the inverse-designed meta-lens. Our reported method can design large aperture ( ~ 104 − 105λ) meta-optics in a reasonable time (approximately 15 minutes on a graphics processing unit) without relying on the LPA.


Introduction
In the age of silicon computing, numerical simulations are at the heart of understanding and designing physical systems.For many cases, analytical solutions to complex device geometries are intractable to compute, or simply do not exist.From extremely large systems like rockets 1 to ultra-small nanophotonic devices, 2 numerical simulations provide scientists and engineers with the necessary tools to design non-intuitive structures.In electromagnetics, direct solvers, including the finite difference time domain (FDTD) 3 and the finite difference frequency domain (FDFD) 4,5 simulators, are the usual choices when dealing with heterogeneous structures with sub-wavelength features that require a high degree of numerical accuracy.Most commonly, electromagnetic simulation tools serve to validate the qualitative designs created by engineers based on prior knowledge and intuition.In recent years, the field of nanophotonics has incorporated a new paradigm of computer-aided device design, where a device's performance is summarized by a quantitative figure of merit (FOM) that is optimized over.This method involves running a forward numerical simulation, computing the FOM, and iteratively modifying the device's geometry based on an optimization algorithm to reach the desired FOM.][8][9][10][11][12][13][14][15] However, electromagnetic simulators suffer from a computational resource problem when the device dimension becomes large ( 10 3 λ), where λ is the device's operating wavelength.As most electromagnetic simulations are performed over a sub-wavelength grid size, with increased size, the number of input variable becomes prohibitively large, making the simulation slow and memory extensive.The limitation of such forward electromagnetic simulators becomes even more severe for inverse design, where many such forward simulations are needed.
Sub-wavelength diffractive optics, also known as meta-optics, present an important testbed for these problems: the constituent elements of the meta-optics, i.e. meta-atoms, are subwavelength, but the dimensions of the whole meta-optics are on the order of ∼ 10 3 λ − 10 5 λ.
Thus the underlying physics of each scatterer has to be modelled using full-wave electromagnetic simulation, but the whole meta-optical system needs to be simulated using ray or wave optics.Such multi-scale electromagnetic simulators invariably rely on approximations, the most common of which is the local phase approximation (LPA): the scattering in any small region is taken to be the same as the scattering from a periodic surface. 9This approximation allows the simulation of each scatterer in a periodic array, abstracting out the electromagnetic response as a simple phase shift.While this significantly reduces the computational complexity of simulating a meta-optic, this approximation fails to consider the coupling of each scatterer with their dissimilar neighbors.In fact, it has already been shown that meta-optical lenses designed under LPA have sub-optimal efficiencies, 16 especially when the numerical aperture is large.The LPA becomes even more inaccurate when the material used to create the meta-lens has low index, such as SiN. 17We note that, while a full FDTD coupled with adjoint optimization has been used to design a meta-optic without relying on LPA, their size has been limited to only ∼ 100λ. 18LPA can also be bypassed using Mie scattering approaches, 19 which however limits the shape of scatterers.
To address the computational bottleneck of large-area inverse design, here we introduce a physics-informed neural network (PINN), which can replace a traditional FDTD/ FDFD solver to predict the electric field distribution for a given dielectric distribution.1][22][23][24][25][26][27][28] However, these works used largely periodic structures for which LPA is accurate.We present a solution via PINNs 29,30 for lenses and devices with spatially varying scatterer geometries, where it is necessary to model the whole electric field from several scatterers and their neighbors.PINNs solve partial differential equations (PDEs) by minimizing a loss function constructed from the PDE itself.This loss function is generally some norm of the residual 29 or an energy function derived from the PDE. 31 PINNs have already seen wide usage in the field of fluid mechanics, 32-34 biology, 35 and solving stochastic PDEs. 36In electromagnetic inverse problems, PINNs have also been employed to design meta-optics and nanophotonic devices. 37,38These works, however, did not demonstrate a simulation speedup, and are limited to the inverse design of very small devices.We also note that pre-trained PINNs have been used to design small gratings; 39 however their methodology is limited to small gratings that deflect light fields to specific angles, and thus cannot be readily used for the inverse design of general meta-optics.
In our work, we train PINNs to predict the electric fields from a parameterized set of dielectric meta-atoms corresponding to rectangular pillars.We then use this as a surrogate model to design cylindrical meta-lenses operating in the visible with a diameter of 1 mm (∼ 1500λ).Large area meta-optics are simulated by partitioning the simulation region into groups of 11 meta-atoms, with the outermost meta-atoms overlapping.After simulation, the fields are stitched together.Our PINNs do not require a training data set.They are trained by randomly generating distributions of dielectric meta-atoms , feeding them into a neural network N N , and minimizing the residual of the linear Maxwell PDE operator over the neural network training parameters.This means our PINNs are trained without ever invoking a forward numerical simulation of Maxwell's equations during the training process.
Numerical simulations are invoked only to test the neural network performance (see next section and supplement).Once trained, this method can calculate the full electromagntic field response from a 1 mm diameter cylindrical meta-lens at ∼ 630nm in approximately 3 seconds on a graphics processing unit (GPU).Furthermore, we demonstrate a theoretical and experimental improvement of the maximum intensity of cylindrical metalenses over their forward designed hyperboloid counterparts, signifying the improvement over using LPA.We emphasize that the size of the meta-lens, on which we demonstrate the intensity enhancement of over 50%, is at least one order of magnitude larger than any other inverse designed lens that does not rely on the LPA.We note that the reported method is robust enough to handle even larger meta-optics, with simulation time scaling only linearly with the aperture of the cylindrical lens (see supplement).

Deep neural network proxy to Maxwell's equations
Our problem statement is summarized in Fig. 1.The monochromatic electromagnetic scattering equation for an inhomogeneous, non-magnetic material is given by: In the 2D case, assuming out of plane polarization (0, 0, E z ), and the double curl vector identity, ∇ × ∇× = ∇(∇•) − ∇ 2 we can simplify Eq. ( 2) to: where E z and J z are scalar fields.Equation ( 3) is defined over all space, with boundary conditions at |x| → ∞.To simulate this equation, we discretize it on a Yee grid 3 by replacing the ∇ operator with a matrix, and treating the field E z (x) and current J z as vectors E and J at discrete values of x.Similarly, we treat the dielectric distribution (x) as a diagonal matrix ε.To truncate the simulation to a finite domain, we use perfectly matched boundary layers (PML), by making the transformation on the partial derivative operators ∂ ∂x → Making these substitutions, Eq. (3) becomes: with matrices D h x , D e x , D h y , D e y being the matrix representations of corresponding derivative operators on a Yee grid with incorporated PML boundaries.See supplement S4 for a more detailed description of the matrices.These matrices were extracted from a modified version of the package angler 40 with constants c, , µ set to 1 and the length scale set to µm.To build a neural network proxy to solve Eq. ( 4), we employ a PINN.PINNs generally use the coordinates of the computational grid as the input to the neural network, and then minimize the residual of the physical equations by approximating the target quantity being solved for with a neural network.This approach is slow since it effectively functions as an iterative solver re-parametrized over neural network weights and biases.It also required retraining the neural network for all different dielectric distributions.Our approach is to build a proxy solver that predicts the field E from a dielectric distribution ε.We pre-train the PINN to predict fields from inputs ε before optimizing our meta-lenses.The minimization problem to train the PINN becomes: with N N (ε; θ) being the output field from the PINN, and || • || 1 is the vector l 1 norm.Here θ refers to the weights and biases of the neural network N N .A lower physics informed loss indicates that the neural network is actually satisfying the PDE, and thus predicting the field more accurately.We re-emphasize that there is no data term in f ( ; θ), which simplifies the problems. 39The model is trained for 5 × 10 5 epochs using the ADAM optimizer 41 with a learning rate set to 5 × 10 −4 .The final residual of the fields predicted by the neural network are of the order of ∼ 0.5, compared to the numerical residual produced by FDFD which is on the order of 10 −16 .Although there is a large difference, in the next section we show that this still produces a simulator which is capable of outperforming the LPA when optimizing the efficiency of a metalens.Fig. 2 a shows an example of a field predicted from a random set of pillars by the neural network, by a 2D FDFD code, and their difference, showing good qualitative match.A more quantitative measure at the errors is shown in Fig. 2 b, where we show the point-wise error probability density functions for the relative error between the complex fields predicted by FDFD and that predicted by neural network and field predicted under LPA, and the absolute error between pillar-wise average transmission coefficients.See supplement S2 for a more detailed description of the pillar wise transmission coefficient error.
The relative error is expressed as: For the PINN, E approx is the field predicted from a set of 11 pillars.For the LPA, E approx is fields predicted from the same set of pillars, and then stitched together over the same region.See fig.S2 for a visual explanation.This error can be interpreted as the % difference between the FDFD predicted field and the approximate field, either predicted by neural network without making any LPA or the field under LPA.The mean expected relative error for the neural network is µ = 0.025 with a standard deviation of σ = 0.0073.When using the LPA over the same region, we get a mean relative error of µ = 0.17 with a standard deviation of 0.078.Thus, based on the relative field error, our method is 6.8× more accurate than the LPA.For the pillar-wise transmission coefficient error, we get an expected error of µ = 0.051 for the neural network with a standard deviation of σ = 0.033 and for the LPA method we get an expected error of µ = 0.38 with a standard deviation of 0.14.Thus, based on the transmission coefficient error, our method is 7.2× more accurate than the LPA.

Device optimization
The optimization process based on automatic differentiation functionality of PyTorch for large area meta-optics is outlined in Fig. 3.The forward problem is solved via a pretrained PINN.Since the input into the neural net is a meshed grid of pillars, a differentiable map from pillar half-widths to meshed geometries must be generated.This is achieved by generating Gaussian functions centered around pillar centers, with standard deviations of pillar half-widths in the x dimension, and pillar height in the y dimension, and then using a modified softmax function to transform the Gaussians into rectangles with slightly rounded edges, making them differentiable via automatic differentiation (see supplement S1).The meshed structures are fed into two separate neural networks that have been pre-trained to predict the complex electric field.The fields are then stitched together with regions of the outer half-widths overlapping.The total field is then propagated using the angular spectrum method.The propagated field is used to calculate the FOM f from Eq. ( 5).We use automatic differentiation to compute the gradients of the FOM with respect to the input half-widths ∇ r f , and iteratively update them with the ADAM optimizer. 41

Results
We used PINN surrogate model to optimize 9 different lenses, all with 1 mm aperture, with focal lengths ranging from 250-1500 µm in increments of 250 µm.The minimum feature size is set to 75 nm, to ensure fabricability.To compare our optimization approach, we also generated lenses according to the hyperboloid phase equation: The phase is implemented under LPA using SiN (refractive index 2), a wavelength of 0.633µm, and periodicity of p = 0.443µm.We then optimize the lens employing our PINN to increase the intensity at the focal spot, i.e., the FOM is given by: Fig. 4a and Fig. 4b show the intensity profile of a forward designed and optimized lens with  We validated our designs by fabricating and experimentally testing the meta-lenses using a microscope (details of fabrication and characterization in Methods).Fig. 5 shows an example of the inverse optimized device.Fig. 5a-c shows the scanning electron micrographs (SEMs) of the fabricated optimized lens with focal length F = 500µm.Fig. 5d shows the distribution of the dielectric pillar half-widths of the same forward and optimized lens.
signifying the two designs are very different.Fig. 5e shows the focal spot intensities of the lenses integrated over a r = 3×FWHM region at the focal spot, which yields a quantitative value to compare the lens efficiency 42 among different devices.Fig. 5f plots the maximum intensity plot as a function of the lens NA.For optimized lenses with NA > 0.44, we see improvements of more than 25%, with a maximum improvement of 53% for the NA = 0.9 lens.The experimentally determined intensity integral, which is analogous to the efficiency of a lens, on has improvements of more than 18% in all cases except for the NA=0.9 case.This is because the FWHM of the optimized lens at the NA=0.9 case is actually smaller than the FWHM of the forward designed lens, leading to a smaller integration area when computing the energy.
Fig. 5g shows experimentally measured field profiles of the forward designed F = 500µm meta-lens.Fig. 5h shows the same for an optimized lens.Fig. 5i is the slice of the focal spot intensity profile along the z = F plane.In all these figures, the intensity is normalized such that the maximum intensity of the forward designed lens is 1.

Discussion
We have developed a PINN to use as a proxy surrogate model for simulating the full Maxwell's equations to design dielectric meta-optics.We used the PINN to optimize pillar half-widths to maximize the intensity at the focal spot of 1 mm aperture cylindrical meta-lenses.We demonstrated experimental improvements of the maximum intensity of the lenses up to 53%.
We also want to note that this method was useful for the inverse design of extended depth of focus lenses 10 (see supplement S6).This model did not use the LPA, but simulated metaatoms by splitting up the device into chunks with overlapping boundaries, and stitching the chunks together to approximate the full field response.We emphasize that FDFD simulations were never carried out to train the PINN, and we only minimized the residual of the PDE itself to train the network.The PINN training took approximately 2 hours on our machine.
In our studies, this method provided approximately a 3-5x speedup over conventional FDFD with overlapping boundary conditions, and was much simpler to use as a forward simulator for optimization problems since it can be used as a simple map from to E-field with gradients computed by automatic differentiation.
We would also like to note that the theoretical intensity and efficiency improvements are quite a bit smaller than their experimental counterparts.While we don't have a clear explanation for this discrepancy, the theoretical and experimental trends in lens improvement are similar.We hypothesise that the inverse designed lenses may be more tolerant to fabrication imperfections.The inverse design solution we introduced in this paper can be integrated into various computationally intensive tasks which require mate-optical inverse design such as the end-to-end optimization of computational imaging systems and the design of optical neural networks. 43,44It is worth noting, however, that this method is not a general numerical solver.It is limited to predicting electromagnetic field responses from fixed source, material, and boundary parameters.Source type and k-vector, dielectric constant, geometry type (rectangular pillar of fixed height), and boundary conditions must all remain constant for this method to work.If any of these parameters are modified, the PINN must be retrained.
Furthermore, the method we presented was only implemented under a 2D approximation.
Extending this method to 3 dimensions would take significant effort due to the fact that the electric field E could no longer be treated as a scalar field, and the full vector nature would have to be modeled.On a n × n grid in 2D, the Maxwell operator result in 9n 3 × 9n 3 square matrices due to the additional 2 vector field components that must be modeled.However, these operators are sparse with number of nonzero elements that scale as ∼ 38n 3 in 3D, making small problems still manageable.The other problem with generalizing this method to 3D is the large null-space of the ∇ × ∇× operator which results in slow convergence of numerical methods. 45,46Its highly likely that this could also affect the training of the PINN, and require regularization or preconditioning which deflates the null space of this operator to properly converge onto a solution.On the other hand, in this work we showed that machine-precision numerical accuracy of numerical solvers may be not be needed for inverse design methods with FDFD.Solvers could be sped up by relaxing the relative error tolerance, such that iterative solvers converge quicker for predicting the forward and adjoint problems.In future work we aim to explore these options.

Fabrication
All devices described and discussed in Figure 5 (forward and PINN designed) were fabricated on the same substrate.First a ∼ 700 nm SiN film was deposited on a quartz wafer using plasma enhanced chemical vapor deposition (SPTS Delta LPX PECVD).A thin film of a polymer resist (ZEP 520-A) and a thin film of a discharging polymer layer (DisCharge H2O) were subsequently spun onto the sample.We then used electron beam lithography (JEOL JBX-6300FS, 100 keV, 2nA) to write the various structures.After development, a short descum step (Glow Research, Autoglow, 12 s, 100 W) was used to remove remaining resist residues and subsequently a layer of 60 nm AlOx was deposited using a home-built e-beam evaporator.After overnight lift-off in warm NMP and a further plasma cleaning step in O2, we used inductively coupled reactive ion etching (Oxford Instruments PlasmaLabSystem100) with a fluorine gas chemistry to transfer the pattern from the AlOx hard mask into the underlying SiN layer.The final thickness of the etched layer indicated a pillar height of ∼650 nm.

Experiment set up
For intensity measurements, light from a HeNe laser was transmitted through the backside of the chip and measured on the device side using a translatable microscope relay setup.
In detail, the sample was mounted at a fixed position on a kinematic holder, allowing the fine adjustment for pitch and yaw, as well as the lateral position.Light was transmitted through the substrate side and would propagate entirely in air.The resulting focusing pattern was measured using a microscope setup consisting of a Nikon 100X LU Plan Fluor objective with 0.9 NA (equal or higher to the NA of the meta-optic), a tube lens (Thorlabs), and a camera (Allied Vision ProSilica GT1930), which were mounted on a programmable automated translation stage (NewPort).Frames were acquired at specific intervals of the

S3 Mesh reparametrization
To mesh a set of pillars, we first generate a grid of coordinates x ∈ [−6p, 6p] and y ∈ [− 3 2 p, 3 2 p], p being the pitch of the meta-optics.Since we have a fixed set of pillars, their centroid locations are given by x i = np 2 where n is the pillar index running from −5 to 5. The first transform we define is just a shifted gaussian function on this grid: Similarly, the second one defines the height: The modified softmax is defined by:

S4 Form of FDFD linear system
In this section we give a brief summary of formulating the FDFD linear system on a Yee grid for completeness.The full details of the method can be found elsewhere, S1-S5 so we only give a brief description here.The boundary conditions of Maxwell's equations are defined at |x| → ∞, so perfectly matched layer (PML) boundary conditions must be implemented to truncate the simulation region to a finite size.To do this, scale matrices S x and S y need to be generated.They are constructed by creating scale factors Where w is the coordinate normal (x or y), l is the distance inside the PML from the PML interface, ω o is the operating angular frequency, and 0 is the permittivity of free space.σ w (l) is given by and σ w,max is d is the thickness of the PML.η 0 is the vacuum impedance η 0 = µ 0 / 0 .R is the target reflection coefficient.A good reference for PML boundaries is Shin et.al. S4 The package we used, angler, S3 uses the convention m = 4 and ln(R) = −12.Furthermore we modified the constants inside the package such that µ 0 = 0 = η 0 = 1.The matrices S x and S y are created by computing s w on a meshed grid, flattening it, and embedding it into a diagonal matrix: The numerical derivative matrices on a yee grid, with periodic boundary conditions ∆ x and ∆ y are: . . . . . .
here N x and N y are the grid sizes in the x and y directions and dx and dy are the grid spacings.The derivative matrices D e x , D e y , D h x , and D h y can then be constructed as: where the † operator is the Hermitian transpose.

S5 Neural network training
In this section we describe the effect of adding data to the PINN loss function.Given a PINN loss: and a data loss we can form a total loss function: Where ε is the input dielectric distribution, θ are the trainable parameters of the neural network, and α is the "strength" of the data term.Fig. S5 shows the test loss vs epoch of neural networks trained with various α parameters.We generated a dataset of 10000 fields by directly simulating our problem with random pillar arrangements using angler.S3 The training was done on 9900 fields, and the test was done on the remaining 100.The left hand side of Fig. S5 shows the accuracy of the neural network given by the 2 norm relative error The red lines are the EDOF and the blue lines are the lens.The black line is plotted at the full width half maxima of the EDOF lens, which is how we define the depth of focus.S5 The theoretical and experiment depth of focus for the forward designed lens is 20µm.The EDOF lens has a theoretical depth of focus of 93µm and an experimentally measured depth of focus of 97µm .

S6 Extended depth of focus (EDOF) Lens
We also designed an EDOF lens through a standard max-min objective approach, where we computed the intensity of the field produced by our lens at discrete equidistantly spaced points along a focal line, on an interval between two different focal lengths: We then used forward design to generate a lens with focal length 2050µm and a diameter of 1000µm, and optimized an EDOF lens to extend its focus between 2000 and 2100 microns.Table 1: Resource and speed comparisons between the PINN approach and the FDFD approach for the forward simulation of a 1mm lens.The PINN approach is about 5x faster than the FDFD approach when overlapping boundary conditions are used and 10x faster when we don't use overlapping boundary conditions.Furthermore, the overlapping boundary method is more memory efficient, and thus useful for running inverse design on machines with low RAM.

Figure 1 :
Figure 1: a. Neural network schematic.ε distributions of 11 pillar meta-optics are meshed by randomly generating sets of pillar half-widths of height h = 0.6µm with a dielectric constant 4 corresponding to SiN.The background medium is air.The loss function is the || • || 1 norm of the residual of Eq. (2).b.Neural network architecture.Encoder layers are down-sampled by a maxpool operation with a 2 × 2 kernel.The decoder part of the network is up-sampled by the Conv2DTranspose operation with a 2 × 2 kernel.c.Render of the system under optimization.A current J is incident on a cylindrical metalens with dielectric distribution ε, with output response E.

Figure 2 :
Figure 2: a. Real part of fields predicted by (left) neural network, (center) FDFD, and (right) the difference between the true and predicted fields.b.Comparison between the performance of the proposed neural network and LPA methods.(left) Shows the relative error between FDFD predicted fields and the fields predicted by LPA.(right) Error comparisons between the transmission coefficients predicted by LPA and the neural net.

Figure 3 :
Figure 3: Optimization strategy of 2D meta-optics with PINNs.a.We start with a vector, which contains a list of all pillar half-widths, characterizing the meta-optic.These halfwidths are then batched into groups of 11 with an overlap of 1 pillar on each side.The choice of 11 pillars was made based on the GPU memory required to train the PINN.b.The half-widths are meshed into dielectric distributions which get fed into the c.The neural network predicts patches of fields which are then stitched together, and d. propagated via the angular spectrum method.e.The objective function is formed from the resulting field, and backpropagated using PyTorch's automatic differentiation functionality to update the initial radius distribution.

Figure 4 :
Figure 4: Efficiency and intensity sweeps of forward designed lenses and optimized lenses.a. Focal spot intensity profile of a forward designed lens with focal length F = 500µm.b.Focal spot intensity profile of an optimized lens.c.Slices of intensity profiles for both lenses.The intensity was normalized such that the maximum intensity of the forward designed lens is 1.The theoretical performance improvement is ∼ 3%.d.Maximum intensity at the focal spot vs lens numerical aperture (NA).Intensities are normalized such that the maximum of the largest forward designed intensity is set to 1. e. Theoretically computed efficiencies of the lenses vs NA.The solid lines are visual aids for the trend and do not correspond to a theoretical prediction.
Fig.4c shows the normalized intensity slice at the focal spot of both lenses.As seen in Fig. 4 d the maximum intensities at the focal spots improve in every case.Fig. 4 e shows that the efficiency all improves in all except for the lens with highest NA.

Figure 5 :
Figure 5: a.-c.Scanning Electron Microscope (SEM) images of the fabricated SiN metalens with focal length F = 500µm.The scale bars correspond to 1µm, 0.1µm, and 1µm, respectively.d.Counts of pillar half-widths of the forward and inverse designed lens.e. Measured intensity contained in the region given by 3× FWHM of the focal spot vs lens numerical aperture.The units are normalized to the largest intensity integral of the forward design.f.Maximum intensity at the focal spot.The inverse designed lenses outperform the forward designed lenses for NA > 0.44.The lines are visual aids and not fits to a theoretical model.Units are normalized to the largest intensity of the forward designed lens.In the NA = 0.9 (F = 250µm) case an improvement of 53% is observed.g.Experimentally measured field intensities of the forward designed lens and h. of the inverse designed lens.i. Intensity slice at the focal spot.The intensities are normalized such that the maximum intensity of the forward designed lens is 1.The intensity of the inverse designed lens focal spot is 1.25.

Figure S2 :
Figure S2: Calculating pillar-wise transmission coefficients for a. FDFD and neural network simulations and b.LPA simulations.

Figure S3 :
Figure S3: transforms carried out to make a differentiable map from r →

11 i=1T 3 (Figure S4 :
Figure S4: Description of setup for scale matrices S x and S y .Left hand side is for the x derivative and right hand side is for the y derivative.PML scaling is computing on a meshed grid, then flattened and embedded in a diagonal matrix.

Figure S5 :
Figure S5: Comparison of neural network performance for different values of α in the data loss term.Left: epoch vs normalized error given by eq.S19 and right: the residual.Both both plots are the done on the test data set.

Figure S6 :
Figure S6: EDOF lens inverse design.a. -c.represent theoretical results, and d. -f. are experimentally measured results.a., d. are forward designed lenses with focal length 2.05 mm.b.,e.are the optimized EDOF lenses.c., f. are slices along the z axis with x = 0µm.The red lines are the EDOF and the blue lines are the lens.The black line is plotted at the full width half maxima of the EDOF lens, which is how we define the depth of focus.S5   The theoretical and experiment depth of focus for the forward designed lens is 20µm.The EDOF lens has a theoretical depth of focus of 93µm and an experimentally measured depth of focus of 97µm .

Fig. S6 shows
Fig.S6shows a summary of the results for the inverse designed EDOF lens.To characterize the performance of the lens, we define the depth of focus as the point where the distance at which the focal spot intensity reaches 1/2 of its original value, see Fig.S6 c and f.The EDOF device has a theoretically predicted depth of focus of 93µm and an experimentally measured depth of focus of 97µm.The depth of focus of the forward designed lens is ∼20µm measured both in experiment and theory.

Figure S7 :
Figure S7: I Theoretically predicted intesnities at the focal spot.II.Experimentally measured intensities at the focal spot.III.Theoretically predicted focal spot for forward design IV and inverse design.a. 200 µm focal length.b. 500 µm c. 750 µm d. 1000µm.

Figure S8 :
Figure S8: Comparison of neural network simulation time vs lens diameter.Data is taken for lens diameters of 100, 500, 1000, 2000, and 10000 µm lens diameters.Simulation time increases linearly with lens diameter.