Introduction

Computational tools have been playing an increasingly central role in material design in recent decades. With lower costs and rapid development cycles they are an efficient starting point in the material design process that can later be complemented with experimental validation and refinement1. Especially machine learning techniques have recently seen widespread adoption in applications ranging from predicting material properties to designing nanostructures2,3 and this work aims to advance and support these efforts.

Brute-force parameter sweeps are increasingly being replaced by more sophisticated optimization methods in the design of metamaterials and nanophotonic devices4,5,6,7,8,9,10. Indeed, many recent works rely especially on neural network-based approaches for this4,11,12,13. And, instead of letting human intuition and analytically derived parameters determine the architecture, nanostructured devices can now be autonomously designed for high performance, made possible by the exhaustive exploration of the design space that such algorithms perform7. In this context, this process of determining the physical characteristics from certain properties14, e.g., finding the geometry of a structure for a given optical response, is called inverse design. Recently, there have been a lot of works on inverse design for nanophotonics using different techniques, such as generative adversarial networks4, variational auto-encoders5 and other machine learning and optimization methods1,2,3.

One area where this has received a lot of attention is the space sector, where reducing the size, weight, power, and cost of objects put into orbit is not only a high priority, but can in many cases also determine the feasibility of the project. One example are photon sails, which must have sail thicknesses on the order of only a few micrometers (the IKAROS solar sail was only 7.5 μm thick15) in order to avoid an unsuitable weight. Recently, there has been research on using metasurfaces for solar sails, made with both conventional design16 and inverse design17 methods. These metasurfaces might offer, for instance, attitude control16 and mass reduction17 while maintaining a high efficiency.

Figure 1
figure 1

Training setup for NIDN; regression and classification as well as RCWA and FDTD are described in detail in Methods.

Recent works have also applied metamaterial knowledge to the optical solar reflector (OSR), a structure placed on the external surface of a spacecraft that simultaneously reflects incoming sunlight and emit infrared radiation. Instead of using the conventional design of planar structures, Sun et al. showed that a simple metasurface can reduce the thickness18,19. The terrestrial cousin of the OSR, radiative cooling structures, have followed a similar trend recently20.

Overall, these methods rely on sophisticated software tools and datasets which has inspired efforts to make those available openly as, e.g., in the work of Jiang et al.21. In a similar vain, in this work, we present an open-source software package called Neural Inverse Design of Nanostructures (NIDN) that allows designing nanostructures with neural networks using two different, fully differentiable implementations of Maxwell equations solvers based on the finite-difference time-domain (FDTD) method22 and rigorous coupled-wave analysis (RCWA)23. RCWA is a natural and efficient choice for stacked materials designed with NIDN. The relevance of this is also evident as another group has also concurrently undertaken the effort of implementing RCWA with PyTorch24. FDTD complements it by supporting design in the visible light range of the spectrum and—depending on geometry—providing higher accuracy25. Both, RCWA and FDTD, enable the machine-learning-based design of multilayer materials with patterned nanoscale grids to achieve a desired spectrum of reflectance, transmittance and absorptance. Thus, NIDN has the potential to support a wide range of the aforementioned applications. In comparison to previous similar efforts such as those by Jiang et al.26 we utilize a fundamentally different technique relying on directly encoding the material inside a neural network.

Building on advances in physics-based deep learning27, in NIDN, we directly optimize the nanostructure described by the neural network to achieve one specific spectrum using gradient information from the physical simulation via a loss. In this approach, no prior knowledge or training datasets are required as gradients are backpropagated through a (differentiable) numerical solver. Also, as the neural network does not need to solve the problem of fully describing the relationship between any material and spectral characteristics the task is more tractable. In many previous approaches28,29, neural networks effectively need to invert the Maxwell equations to match material and spectral characteristics to each other. In NIDN, by solving for a specific reflectance, transmittance and absorptance spectrum, we—from a mathematical perspective—constrain the inversion to a specific solution of the Maxwell equations. This furthermore avoids being limited to only a few degrees of freedom, such as placing a specific structure on top of another, in the material design29,30; instead we can fully explore the entire search space of structures. Several nanosurface modification strategies have been employed to tune spectral response in optical devices. NIDN specifically excels in the parametric optimization of heterogeneous, multi-layer devices, which use local composition to tune electromagnetic properties. This is in contrast to the geometric structuring of a single material surface, which has limited local spectral tunability.

There is overall a larger trend in the related research field of topology optimization to use inverse design approaches based on neural networks, where various works have explored this for problems in related domains of material design31,32,33. However, to the authors knowledge, these works have not explored a similar setup for inverse design in photonics which is challenging as a Maxwell solver requires full support of complex numbers.

With NIDN, the neural network learns to produce a spatially continuous function describing the material permittivity. We demonstrate the validity and versatility of the approach by comparing to existing implementations that we built on. We showcase several examples ranging from simple uniform layers to practical applications related to photonic filter design and perfect anti-reflection. NIDN is also capable of running on graphics cards (GPUs) and is available open-source online (https://github.com/esa/NIDN Accessed: 2022-04-21). The authors hope to build a community around the software, extending its capabilities and fidelity even further in the future.

Results

In the following, we showcase results demonstrating NIDN’s capabilities. In particular, we first validate the numerical Maxwell solvers in NIDN against baselines, then show the efficacy of the inverse design on some simple materials, and, finally, investigate two common design scenarios from the literature and perform a comparison with related approaches.

Material design setup

The following results were obtained using NIDN and its Maxwell solvers derived from the RCWA implementations by Jin et al.17 and an adaption of an open-source implementation of FDTD in PyTorch 1.9. Both are described in more detail in the Section Methods. NIDN utilizes PyTorch 1.9 to enable automatic differentiation and neural network training to design materials of stacked layers such as shown in Fig. 1. Details on neural network architectures and training as well as the detailed training setup are given in the Section Methods. Data on material permittivity are obtained from an online database.(https://refractiveindex.info Accessed: 2022-04-21) More detailed info on specific references for each material are provided in the integrated database of materials in NIDN (https://github.com/esa/NIDN/tree/main/nidn/materials/data.

Permittivity between provided measurement points is interpolated linearly. Layer thickness if not specified otherwise is 1 μm. Grid dimensions if not specified are \({1}\,{\upmu }\text{m}\times {1}\,{\upmu }\text{m}\). In general, NIDN supports two modes of inverse design which are described in more detail in the Section Methods: The first one is referred to as regression, where the permittivity of the material is predicted within some value range to investigate hypothetical materials achieving the target spectral characteristics. If not specified otherwise, the range for relative permittivity \(\epsilon _r\) considered by the network is \([0.01 , 20 + 3i]\). All references to permittivity refer to relative permittivity. The second approach is called classification and the permittivity in this case is chosen as a linear combination of reported permittivity of real materials. All considered materials are listed in the codebase. At present, only normal incidence is considered. Figure 1 gives a high-level overview of the training process.

The number of training iterations depended on the task and solver; for RCWA between 1000 and 5000 training iterations were used, for FDTD not more than 150 iterations due to the higher computation time. No dispersion model is used and frequencies are computed independently. For a single uniform layer, a single training iteration with FDTD requires about 8 s per frequency, with RCWA it takes six milliseconds per frequency per iteration. L1 errors reported in the figures refer to the mean absolute error between the target and obtained spectrum. Inversion with FDTD used a coarser grid in the numerical discretization than for validation to keep the computational cost manageable (32 instead of 100).

Figure 2
figure 2

Comparison of the transmittance of a 380 nm thick layer of \(\text{TiO}_{2}\) in NIDN and experimentally measured.

Figure 3
figure 3

Inversion results for a single-layer \(\text{TiO}_{2}\) material using FDTD and regression; (a) displays target and produced spectra, (b) shows the utilized permittivity; differences to Fig. 2 are due to smaller number of discretization grid points.

Validation and numerical stability

NIDN contains a fully differentiable RCWA implementation that is derived from the Python module GRCWA17. For RCWA, a full re-implementation of the GRCWA code to use PyTorch was undertaken replacing many components with adequate alternatives in PyTorch. Extensive tests are performed to validate both, RCWA and FDTD, implementations against the original implementations and results of these can be found in the Supplementary Information.

NIDN adapts the FDTD implementation from the Python module with the same name (https://github.com/flaport/fdtd Accessed: 2022-04-21) but adds support for gradient flow through it and performs post processing. A comparison with an experimental baseline is given to validate the implemented post processing. Further, we directly compare results with the original Python module to validate our changes to it. Overall, these implementations are major contributions of this paper as previously no Maxwell solver in PyTorch was available that allowed backpropagation through the entire simulation. This is also particularly challenging given the dependence on using complex numbers for permittivities and the simualtion as, at the start of this research work, there was a lack of support for complex numbers in PyTorch.

A comparison with an experimental measurement of the transmission spectrum of \(\text{TiO}_{2}\) is performed for both. The \(\text{TiO}_{2}\) layer in the experiment by Wojcieszak et al.34 was 380 nm thick and transmission of wavelengths between 300 and 900 nm were measured using a spectrophotometer. Figure 2 displays transmittance obtained with NIDN RCWA and FDTD using continuous waves with wavelengths from 300 to 900 nm as well as the experimental result. Overall, it can be seen that the transmittance in NIDN resembles the one observed in the experiment. Point clustering in the FDTD version is a discretization effect due to the number of grid points used in the simulation.

Figure 4
figure 4

Inversion results for a single-layer \(\text{TiO}_{2}\) material using FDTD and classification; (a) displays target and produced spectra, (b) shows the utilized permittivity; differences to Fig. 2 are due to smaller number of discretization grid points.

Material and structure inference

As a first demonstration of NIDN’s capabilities, we present results on some rudimentary examples. These examples aim to reconstruct a material for which the spectral characteristics were computed with RCWA / FDTD, thus the inversion is certain to have a solution obtainable by the network. In particular, these results show that, depending on the complexity of the spectral characteristics, almost perfect results can be obtained. Furthermore, we can investigate how unique solutions are by studying the similarity of designed materials with the ground-truth baseline.

Uniform \(\text{TiO}_{2}\) layer inversion

First, we show that NIDN is capable of recreating the permittivity of a single layer of \(\text{TiO}_{2}\), with FDTD in the visible range of the spectrum (for RCWA a more complex three-layer example is tested).

Figure 3 showcases these results. Note that to save computational time, a limited number of frequency points, i.e. 20, were investigated. A clear convergence towards the ground truth is visible with a detailed replication of the desired reflectance spectrum and slightly higher than targeted transmittance. Peaks in the transmittance are at the targeted wavelengths. Overall, the replication of the spectral characteristics is not perfect but strongly resembles the target. However, the predicted permittivity clearly differs from the real one of \(\text{TiO}_{2}\) with a higher real permittivity in the learnt material. Thus, the network seems to have found a solution leading to similar spectral characteristics with a different material.

Even though the inversion with the regression approach—in terms of reproducing the target spectrum—was successful, it was not constrained to realistic materials. In Fig. 4 we show results for the classification approach for this single-layer material where we constrain the network further in terms of allowed permittivities. Overall, the inversion is successful and the obtained spectral characteristics are even closer to the target. Furthermore, as seen in Fig. 4b, the used epsilon value now corresponds well to the experimental permittivity of \(\text{TiO}_{2}\) and thus the network was able to identify the exact material required for this spectrum.

Figure 5
figure 5

Inversion results for a three-layer material using RCWA and regression; (a) displays target and produced spectra, (b) shows the utilized permittivity for each layer.

Figure 6
figure 6

Inversion results for a three-layer material using RCWA and classification; (a) displays target and produced spectra, (b) shows the utilized permittivity for each layer.

Three-layer uniform material

The second example is the design of a three-layer material consisting of \(\text{TiO}_{2}\), Germanium (\(\text{Ge}\)) and \(\text{Ta}_{2}\text{O}_{5}\). Figure 5 displays the regression results with RCWA for this material. The design of the spectral properties is successful and the reflectance is achieved flawlessly. A small difference is noticeable at the lower end of the spectrum for the transmittance. Similar as for the previous inversion with FDTD, however, permittivities of the different layers are notably further from real materials. The obtained permittivity is not resemblant of the materials to generate the target spectrum. This again demonstrates the ill-posed nature of this inversion task where there are non-unique solutions that lead to the desired spectral characteristics.

Similar to FDTD, with the classification and RCWA approach, the restriction for the permittivity actually leads to a virtually flawless reconstruction of the spectral characteristics and to the network learning the permittivity of the utilized materials in the ground truth for each layer, as can be seen in Fig. 6. Furthermore, the achieved spectral characteristics also almost perfectly match the design goal of the ground truth. Thus, the classification was successfully used to constrain the network to realistic material choices.

Figure 7
figure 7

Inversion results for a two-layer patterned material using RCWA and regression; (a) displays target and produced spectra, (b) each line shows the utilized permittivity for a grid cell.

Figure 8
figure 8

Inversion results for a two-layer patterned material using RCWA and classification; (a) displays target and produced spectra, (b) each line shows the utilized permittivity for a grid cell.

Two-layer patterned material

The last case based on a ground truth from the forward models (i.e., the RCWA and FDTD implementations) is a two-layer patterned material consisting of \(\text{TiO}_{2}\) and \(\text{Ge}\) with a square pattern in the center similar to those shown in the Supplementary Information. In the upper layer the square is made of \(\text{TiO}_{2}\) and surroundings of \(\text{Ge}\) and vice versa for the bottom layer. This test case is only studied with RCWA as support for patterned layers is currently still in development for FDTD.

Figure 7 displays the obtained spectral characteristics with regression and RCWA. The desired spectrum is reproduced very well as can be seen in Fig. 7a with both, reflectance and transmittance, matching the target spectral characteristics precisely. However, the utilized permittivity is not close to any material available in NIDN as Fig. 7b shows and the reconstructed pattern is not resemblant of the ground truth. Again, the regression finds a solution that leads to an excellent reconstruction of the desired spectral properties with different material properties than used for the ground truth. Interestingly, the utilized permittivities are somewhat less erratic than in the previous cases (compare Figs. 7 and 3). And, it seems that there are two distinct materials (i.e. permittivity) that the network was converging to. These did not match the ones in NIDN’s dataset though.

For the classification approach and RCWA, shown in Figure. 8, comparable errors in terms of reproducing the spectral characteristics are obtained. In this case—as can be seen in Fig. 8b—the permittivity chosen by the network is close to those of used in the ground truth, i.e., \(\text{TiO}_{2}\) and \(\text{Ge}\). The pattern of the ground truth was not reconstructed implying that the network found a good solution without it.

Figure 9
figure 9

Inversion results for designing a bandpass filter with a ten layer stack using RCWA and regression; (a) displays target and produced spectra, (b) shows the utilized permittivity for each layer and closest material in NIDN (dashed).

Filter design

The first test case closer to a potential application is the design of a bandpass filter. In this case, the target spectral properties are defined manually. Hence, the spectral characteristics may actually not be obtainable with the material properties the network is allowed to utilize. A stack of ten uniform layers is used to design the material. Due to the larger computational cost of FDTD, only RCWA was studied in this case. Also, the classification for this test case did not converge—likely because the available materials in database did not suffice to find a solution for this problem.

Regression

With the regression, the chosen range for the relative permittivity is \(\epsilon _r \in [0 , 20 + 1i]\). The achieved spectral characteristics with regression and RCWA of the designed filter are shown in Fig. 9. The designed material captures the gap at the chosen wavelength around 1550 nm very well with a transmittance around 0.98 at those frequencies and a reflectance over 0.98 for the other wavelengths. Notably, the utilized permittivity is not similar to those of existing materials with erratic peaks in the real component of the permittivity for some of the layers, especially for lower wavelengths. Interestingly for higher wavelengths, the networks seems to converge to three distinct permittivities in the real part near 0, 20 and about 12.5.

Perfect anti-reflection coating

This second test case related to an application explores the scenario of building a perfect anti-reflection on top of a substrate with relative permittivity \(\epsilon _r = 16\). This is closely related to the research described by Kim & Park35. Thus, we use a stack of eight \({0.1}\,{\upmu }\textrm{m}\) thick layers on top of the higher permittivity substrate to achieve an anti-reflection coating (ARC).

Figure 10
figure 10

Inversion results for designing an anti-reflection coating with an eight layer stack using RCWA and regression; (a) displays target and produced spectra, (b) shows the utilized permittivity for each layer and closest material in NIDN (dashed).

Regression

For the regression with RCWA, Fig. 10 displays the achieved spectral characteristics. The obtained transmittance is above 0.9975 for the entire range between 400 and 700 nm, a peak transmittance of 0.9990 is achieved, the average is 0.9985. Thus, in summary the network is able to design a highly effective solution for these spectral characteristics. However, once again we can observe—as seen in Fig. 10b—that the chosen permittivity does not resemble real material properties. In this case, they are particularly erratic in the real part of the permittivity and especially in the lower wavelengths.

Figure 11
figure 11

Inversion results for designing an anti-reflection coating with an eight layer stack using RCWA and classification; (a) displays target and produced spectra, (b) shows the utilized permittivity for each layer and closest material in NIDN (dashed).

Classification

With the classification approach, the material design is still successful as can be seen in Fig. 11. On average, the transmittance is 0.9011 with all values being over 0.7822 and a peak at 0.9756, thus inferior to the regression. As discernible in Fig. 11b, the chosen materials for the layers in this approach are similar to existing ones such as Silicon Nitride and \(\text{TiO}_{2}\) and the permittivities are not erratic as for the regression. Hence, with the limited materials currently included in NIDN, this is a better trade-off of obtaining an ARC with realistic materials.

Figure 12
figure 12

(a) L1 loss versus model evaluations of the three approaches. (b) Permittivities produced with a neural network in NIDN. (c) Permittivities produced with a voxelized grid representation. (d) Permittivities produced with GRCWA and NLopt.

Comparison with previous works and ablation

To study the capabilities of NIDN in detail, we performed two comparisons: First, with the already feasible autograd approach in GRCWA17 as demonstrated in online provided examples (Example 3) using the Python optimization module NLopt. Secondly, to study the role of having a continuous representation, we compared with directly encoding the permittivities in a voxelized grid. Detailed results are shown in Fig. 12. The test case is the optimization of a five-layer uniform material to maximize reflection. With GRCWA and NLopt, the optimization converged (within changes of \(10^{-16}\)) after 363 model evaluations to an optimum L1 error of 0.07045. The voxelized grid approach with NIDN achieved the best loss after 2 000 model evaluations at 0.04994. With the neural network approach with NIDN we obtained a loss of 0.06745 after 2 000 model evaluations.

As can be seen in Fig. 12a convergence was most stable with the voxelized grid and quickest with NLopt. From a convergence standpoint, the voxelized grid approach outperforms the other two. In Fig. 12b–d the permittivities produced by each approach are shown. Notably, in the voxelized grid and GRCWA approach in Fig. 12c,d, respectively, produced more erratic results. This is expected, as each permittivity ( in terms of location, layer and wavelength) is fully independent and this is a fully discrete representation. Thus, the neural network approach achieved similar performance while producing a continuous and differentiable representation.

Discussion

Overall, the results presented in this work clearly highlight the capabilities and potential of NIDN. We have shown it is able to correctly reproduce the material properties of several synthetic examples, which demonstrates that the inversion and training is successful for both, RCWA and FDTD. Similarly, the creation of both a 1550 nm filter and a perfect ARC are achieved. However, especially the regression approach is susceptible to finding solutions which are not plausible for real materials. This is a current limitation of the regression approach as the designed permittivities do not resemble existing materials. The main reason for this is the lack of constraints on the chosen permittivity. A future improvement here may be to have the network predict the parameters of a function that produces realistic permittivities. For the moment, the constraints on physical plausibility in NIDN are implicitly given through the numerical simulations for determining the spectral characteristics. An additional component that checks if the predicted permittivity matches a realistic permittivity may be needed. The classification approach is able to partially remedy this, but for the real-world examples of filter design and ARC it is less effective at matching desired spectral characteristics. This is likely due to the limited number (12) of materials currently available in NIDN. The number can be increased, however, this also creates a harder classification problem which may in turn require finding an optimal trade-off of the number and diversity of available materials vs. the design limitations induced by fewer available materials. In the future, an alternative may be approaches from the topology optimization community where different projections are used to clamp materials36,37 or utilizing a gumbel-softmax38.

One aspect where NIDN is more versatile than previous approaches29 is that no imposition of shape or form is required or performed. In fact, even the discretization into a grid is not a constraint as it is only required by the solver (RCWA or FDTD); the neural network output is spatially continuous. One thing that has been observed is, however, that the simpler case of having uniform layers benefits training convergence when utilizing RCWA. The reason for this is likely the need to iteratively solve an eigenvalue problem if non-uniform layers are present which may deteriorate gradients. This may be remedied, e.g., by returning to a stochastic setting where gradients are averaged over several tested materials to make the training more robust. This does come at an increased computational cost but may be parallelized.

In summary, NIDN demonstrates the flexibility of physics-based deep learning methods. The range of applications for the software goes well beyond the shown test cases of filters and ARC as it conceivably can be used for a broad range applications such as radiative materials18,19 or solar sails17. With RCWA and FDTD, two complementary solvers are integrated, RCWA being fast and well-suited for higher wavelength and FDTD being more potent and capable to investigate nanostructures in the visible and infrared spectrum. In the future, optimization and parallelization of the solvers will be critical to improve performance as especially training with FDTD for complex materials becomes prohibitively computationally expensive. Further, FDTD can also be used for the design of non-periodic structures which is a feature not yet implemented in NIDN.

Methods

Rigorous coupled-wave analysis (RCWA)

The RCWA method is a well-established tool for modeling scattering of stacked structures23. In the NIDN implementation, single material as well as patterned layers can be stacked. A detailed overview of the capabilities can be found in the documentation of the Python module GRCWA17 which NIDN’s implementation is derived from. Note that the numerical stability of RCWA is dependent on the ratio of the size of the material grid to the investigated wavelength, i.e., for ratios above one RCWA may produce non-physical results.

The implementation in NIDN supports specifying a top or bottom layer in addition to the material; by default a vacuum layer is placed at both. The grid dimensions, number of layers and thicknesses of all layers can be specified manually. The default truncation order for simulations is 11 to keep computational costs low but it is configurable. The implementation employs a relaxation by adding a fictitious material absorption loss17,39. For now, we only investigated inversion with perpendicular light incidence angles and polarized light, however, both are also configurable in the RCWA implementation.

Finite-difference time-domain (FDTD)

FDTD is a simulation method that models the electric and magnetic field using a regular grid as a spatial discretization with a finite difference method40. The electric and magnetic fields at each grid point are calculated with numerical approximations of Maxwell’s equations; more specifically the electric field at each point is calculated using the curl of the magnetic field surrounding the point, and the magnetic field is calculated using the curl of the electric field surrounding the point.

The grid size is set to one tenth of the smallest wavelength used in the simulations, based on recommendations from the fdtd module’s documentation. The time step is set according to the Courant-Friedrichs-Lewy Condition41, to ensure numerical stability:

$$\begin{aligned} \Delta t = \frac{S_c\cdot \Delta x}{u} \end{aligned}$$
(1)

where \(\Delta t\) is the time step, u is the wave velocity, \(\Delta x\) is the spatial resolution and \(S_c = \sqrt{dim} = \sqrt{2}\) is the Courant number.

In this work, the FDTD simulations were setup with periodic boundaries in the directions orthogonal to the wave-propagation to simulate an infinite plane. In the direction of propagation, Perfectly Matching Layers (PML) were used. PMLs aim to absorb as much of the electromagnetic-wave as possible to avoid reflections at the boundaries of the grid. In the propagation direction, the grid is built up by a PML layer of 1.5 µm, a vacuum layer of 1.0 µm, the material(s) chosen with the specified thicknesses, a vacuum layer of 1.0 µm and finally another PML layer of 1.5 µm. The source is a line across the width of the grid, radiating a continuous wave, placed at the interface between the first PML-layer and the free-space layer.

We measure the electromagnetic wave at placed detectors in the FDTD simulations such that the signal is independent of the direction of the incoming wave. Due to this, a correction of the reflected signal is necessary as the raw signal contains both the forward-going wave before reflection and the reflected wave. For the correction, we use an identical simulation setup without a material, the so-called free-space case, and subtract the measured signal from this simulation from the one with a material.

The transmission and reflection coefficients are calculated by comparing the energy of the wave after the transmission and a free-space case where there is no loss from a material such that

$$\begin{aligned} T = \frac{P_i}{P_t} = \frac{E_{0i}^2}{E_{0t}^2} \end{aligned}$$
(2)

where the subscripts i,t and 0 refer to incident wave, transmitted wave and amplitude of the wave, respectively. Since the mean squared value of an integer number of periods of a sine wave equals \(\frac{E_0^2}{2}\), the mean squared value is used to calculate the transmission coefficients. The same is the case for the corrected reflection signal. No dispersion model has been implemented as frequencies are computed independently. By running the simulation for all desired wavelengths, the coefficients from each simulation were accumulated to obtain a transmission and reflection spectrum. The absorption spectrum is calculated by \(A = 1-T-R\).

The number of time steps were chosen such that both the reflected signal and transmitted signal have at least two peaks, such that the mean square can be calculated over at least one period. The loss from a material in FDTD comes from the conductivity parameter. This is included in the AbsorbingObject from the fdtd module, which is the only object/material type currently implemented in NIDN. The conductivity \(\sigma\) is set to \(\sigma = \epsilon '' \cdot \epsilon _0\cdot \omega\).

Model inversion approach

The inversion process in NIDN is inspired by recent research related to physics-based deep learning27 and in directly optimizing physical properties using differentiable simulations42. The main advantage of these approaches is that gradients in training neural networks are propagated directly through a numerical simulation (also referred to as forward model in the context of inverse problems). Thus, there is no need to build a dataset of any kind. In particular, the process in NIDN consists of the following steps which are also depicted in Fig. 1:

  1. 1.

    Specify target spectral characteristics (reflectance, transmittance, absorptance) and material properties (size, allowed permittivity, simulation parameters)

  2. 2.

    Randomly initialize a neural network that prompted with a position returns the material’s permittivity at that position

  3. 3.

    Perform an evaluation of the numerical model for all target frequencies, i.e., compute the spectral characteristics for the materials as described by the network

  4. 4.

    Compute the loss, i.e., the difference in the target spectral characteristics and those obtained through the network’s material

  5. 5.

    Backpropagate the gradients and update the network parameters to iteratively minimize the difference between target and obtained spectral characteristics

  6. 6.

    Until results are satisfactory, return to Step 3

After this process, the network is directly encoding the designed material by providing a permittivity for each position in the material. Herein, the conversion from a position to a permittivity is a particularly critical aspect to allow successful inverse design which will now be described in more detail.

Direct permittivity inference

NIDN implements two different ways to determine the relative permittivity \(\epsilon _r^P\) at a point \(P = (x,y,z,f)\) where x and y are the point’s location in a layer, z is the index of the layer it is on and f is the frequency at which we are evaluating. The first way is referred to as regression because the network herein directly infers a permittivity mapping \(P \mapsto \epsilon _r^P\). The values are then clipped at some specified range to restrict it to physically plausible permittivity values. However, as can be seen in the results section, the frequency-dependent permittivity is still fairly unconstrained, allowing for materials that are not plausible as they, e.g., do not necessarily satisfy the Kramers-Kronig relationship43. But, the advantage of this approach is the direct optimization of the material permittivity to obtain a solution that—at least with regard to the Maxwell solver—is plausible and can give a first idea of the hypothetical feasibility of some spectral characteristic assuming few constraints on permittivity. Permittivity values close to 0 are clipped to \(\pm 0.01\) to avoid singularities in the Maxwell solvers.

Material classification

In order to obtain materials of higher physical fidelity, NIDN implements a second way of encoding material permittivity in the neural network. Naively, one may assume that it would be easiest to have the network perform a classification by assigning one specific material (and hence its permittivity) to each grid cell. Unfortunately, this is infeasible as the necessary argmax operation in a classification (to select the most probable material) is not differentiable and thus would break the gradient flow. As an alternative, NIDN features a collection of N—at time of writing twelve—materials from which we determine the relative permittivity \(\epsilon _r\) at a point P through a linear combination such that

$$\begin{aligned} \epsilon _r^P = \sum _{i=1}^N y^P \epsilon _r^i, \end{aligned}$$
(3)

where \(y^P \in [0,1]^N\) is the neural network output for point P and its i-th entry describes the probability of the i-th material at the point P and \(\epsilon _r^i\) is the relative permittivity of the i-th material in NIDN, respectively. Thus, the produced permittivity is strongly resemblant of real experimental permittivity as can be seen in the results for the classification approach. Furthermore, to push the network towards picking one material without breaking gradient flow, the output \(y^P\) is passed through a softmax function \(\sigma (y)\) with \(\beta =16\), i.e.

$$\begin{aligned} \sigma (y)_i = \frac{e^{\beta y_i}}{\sum _{j=1}^N e^{\beta y_j}} . \end{aligned}$$
(4)

Neural network training

The neural network training in NIDN is inspired by several previous works in the field of differentiable models42,44,45,46. Thus, dense neural networks, namely the NeRF45 and Siren46 architecture are employed. Their parameters are optimized based on the gradient information obtained from running the simulation with the material encoded by the network. Thus, as illustrated in Fig. 1, the gradient of the model parameters, i.e. the permittivities, is computed by backpropagation from the loss through the computational graph formulated by the forward model (be it RCWA or FDTD) to the model. NIDN allows a variety of losses, such as absolute mean error or L2 error. For the classification, an additional regularization loss is used penalizing probabilities for their distance from the values 0 and 1. Training uses the Adam optimizer, a default learning rate of \(8\cdot 10^{-5}\) and a learning rate scheduler to reduce the learning rate upon encountering a plateauing loss.

Software structure

In general, NIDN enables users with two potential use cases: The first is to run forward model simulations using either RCWA or FDTD to obtain the spectral characteristics of a material. The second is the inverse design using neural network models to describe a material structure that produces certain spectral characteristics. NIDN’s software repository has Jupyter notebooks describing both processes in detail for RCWA and FDTD to enable users to run their own experiments. The project is open to contributions and under a permissive GPL-3 open-source license to allow customization.

NIDN is designed with modularity in mind, allowing for arbitrary neural network models and Maxwell solvers to be plugged in. It uses a continuous integration methodology with automated unit tests to ensure the correctness of the implementation. Several plotting utilities, such as those used for the figures in this work are available. It is the authors’ hope that the described steps and features will allow others to build and improve upon it.