Deep neural network-based automatic metasurface design with a wide frequency range

Conventional metasurface design demands substantial computational resources and time; an inverse design approach based on machine learning algorithms promises an effective alternative. In this paper, using a Deep Neural Network (DNN), we present an inverse design procedure for a metasurface in an ultra-wide working frequency band, in which the output unit cell structure is computed directly from a specified design target. To reach the highest working frequency when training the DNN, we consider 8 ring-shaped patterns that generate resonant notches over a wide range of working frequencies from 4 to 45 GHz. We propose two network architectures. In one architecture, we restrict the output of the DNN so that the network can only generate the metasurface structure from the 8 ring-shaped patterns. This approach drastically reduces the computational time while keeping the network’s accuracy above 91%. We show that our DNN-based model can satisfactorily generate the output metasurface structure with an average accuracy of over 90% in both network architectures. Direct determination of the metasurface structure without time-consuming optimization procedures, an ultra-wide working frequency band, and high average accuracy provide a promising platform for engineering projects without the need for complex electromagnetic theory.

Metamaterials, defined as artificial media composed of engineered subwavelength periodic or nonperiodic geometric arrays, have attracted significant attention due to their exotic properties, which allow the effective permittivity and permeability of materials to be modified [1][2][3]. Today, just two decades after the first implementation of metamaterials by Smith et al. 4, who unearthed Veselago's original paper 5, metamaterials and their 2D counterpart, metasurfaces, are widely used in practical applications such as, but not limited to, polarization conversion 6,7, reconfigurable wave manipulation 8,9, vortex generation 10,11, and perfect absorption 12,13. Programmable digital metamaterials provide an even wider range of wave-matter applications, which makes them especially appealing for imaging 14, smart metasurfaces 15,16, information metamaterials [17][18][19], and machine learning applications 20,21.
However, all of the abovementioned works are based on traditional design approaches, consisting of model design, trial-and-error methods, parameter sweeps, and optimization algorithms. Conducting full-wave numerical simulations assisted by an optimization algorithm is a time-consuming process that consumes plenty of computing resources. In addition, if the design requirements change, the simulations must be repeated from scratch, which keeps users from concentrating on their actual needs. Therefore, to fill this gap with a fast, efficient, and automated design approach, we turn to machine learning.
Machine learning and its specific branch, deep learning, are approaches that automatically learn the connection between input data and target data from examples of past experience. Machine learning employs algorithms that let a machine learn and operate without its individual actions being explicitly planned and dictated. More specifically, machine learning provides a platform for deducing underlying principles from previously given data, so that for a new input the machine can make logical decisions automatically. With the continuing evolution of machine learning and its capacity to handle crucial challenges, such as signal processing 22 and physical science 23, we are now witnessing its application to electromagnetic problems. Owing to its potential to require fewer computational resources and less design time while offering higher accuracy and more flexibility, machine learning has been applied to various wave-interaction problems, such as Electromagnetic Compatibility (EMC) 24 31. A machine-learning method to realize anisotropic digital coding metasurfaces has been investigated, in which 70000 training coding patterns were used to train the network 32. In Ref. 33, a deep convolutional neural network was studied to encode a programmable metasurface for steered multiple-beam generation with an average accuracy of more than 94%. A metasurface inverse design method using a machine learning approach was introduced in Ref. 34 to design an output unit cell for specified electromagnetic properties with 81% accuracy in a narrow frequency band of 16-20 GHz. Recently, a double deep Q-learning network (DDQN) has been developed to identify the right material type and optimize the design of metasurface holograms 35.
In this paper, using a Deep Neural Network (DNN), we present an inverse design procedure for a metasurface with an average accuracy of up to 92%. Unlike previous works, to reach the highest working frequency, we consider 8 ring-shaped digital distributions (see top left of Fig. 1) that generate resonant notches over a wide range of working frequencies from 4 to 45 GHz. After the deep learning model has been trained on a set of samples, it can automatically generate the desired metasurface pattern from four predetermined pieces of reflection information (number of resonances, resonance frequencies, resonance depths, and resonance bandwidths) for ultra-wide working frequency bands. Comparison of the numerical simulation results with the design targets illustrates that our proposed approach is successful in generating metasurface structures for the desired S-parameter configurations. Direct determination of the metasurface structure without ill-posed optimization procedures, lower consumption of computational resources, ultra-wide working frequency bands, and high average accuracy make our approach beneficial for engineers who are not specialists in electromagnetics; they can thus focus on their practical needs, which speeds up engineering projects.

Methodologies
Metasurface design. Figure 1 shows the schematic representation of the proposed metasurface structure, which consists of three layers, from top to bottom: a copper ring-shaped pattern layer, a dielectric layer, and a ground layer that blocks the backward transmission of EM energy. FR4 is chosen as the substrate, with a permittivity of 4.2+0.025i and a thickness of h = 1.5 mm. The top metallic layer comprises 8 ring-shaped patterns distributed side by side, each of which is divided into 8 × 8 lattices labeled "1" and "0", denoting the areas with and without copper, respectively. Each metasurface is composed of an infinite array of unit-cells. Each unit-cell consists of a 4 × 4 arrangement of randomly distributed 8 × 8 ring-shaped patterns. Therefore, each unit-cell comprises 32 × 32 lattices. The length of the lattices, the periodicity of the unit-cells, and the thickness of the copper metallic patterns are l = 0.2 mm, p = 6.4 mm, and t = 0.018 mm, respectively. Unlike previous works 31, 34, defining 8 ring-shaped patterns to train the DNN is the novelty employed here to generate the desired resonance notches in a wide frequency band. We designed the 8 ring-shaped patterns in such a way that the unit-cells generated in the training dataset can produce single or multiple resonances at different frequencies from 4 to 45 GHz; thus, we can import the dataset of S-parameters to train the network for our specified targets. It is almost impossible to obtain the relationship between the metasurface matrices and the S-parameters directly. Because the metasurface pattern matrix and its corresponding reflection characteristics are nevertheless closely connected, a deep learning algorithm is used to reduce the computational burden of obtaining the optimal solution.
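The unit-cell construction described above (a 4 × 4 grid of 8 × 8 binary patterns, giving 32 × 32 lattices, so that 32 × 0.2 mm = 6.4 mm matches the stated periodicity) can be sketched in a few lines of Python. This is only an illustrative sketch: the eight ring geometries of Fig. 1 are not given in the text, so `make_placeholder_pattern` is a hypothetical stand-in that draws closed and split square rings, and all function and variable names are assumptions.

```python
import random

PATTERN_SIZE = 8   # each ring-shaped pattern is 8 x 8 lattices
GRID_SIZE = 4      # each unit-cell holds 4 x 4 patterns
NUM_PATTERNS = 8   # size of the pattern library


def make_placeholder_pattern(k):
    """Hypothetical stand-in for ring pattern k (NOT the geometry of Fig. 1).

    Draws a one-lattice-wide square ring inset by k // 2 lattices;
    odd k opens a one-lattice gap in the top edge (a split ring).
    """
    lo = k // 2
    hi = PATTERN_SIZE - 1 - lo
    pattern = [[0] * PATTERN_SIZE for _ in range(PATTERN_SIZE)]
    for i in range(lo, hi + 1):
        pattern[lo][i] = pattern[hi][i] = 1   # top and bottom edges ("1" = copper)
        pattern[i][lo] = pattern[i][hi] = 1   # left and right edges
    if k % 2 == 1:
        pattern[lo][PATTERN_SIZE // 2] = 0    # remove copper to open the ring
    return pattern


def assemble_unit_cell(index_grid, library):
    """Tile a 4 x 4 grid of pattern indices into a 32 x 32 binary matrix."""
    side = GRID_SIZE * PATTERN_SIZE
    cell = [[0] * side for _ in range(side)]
    for gr in range(GRID_SIZE):
        for gc in range(GRID_SIZE):
            pattern = library[index_grid[gr][gc]]
            for r in range(PATTERN_SIZE):
                for c in range(PATTERN_SIZE):
                    cell[gr * PATTERN_SIZE + r][gc * PATTERN_SIZE + c] = pattern[r][c]
    return cell


library = [make_placeholder_pattern(k) for k in range(NUM_PATTERNS)]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
index_grid = [[rng.randrange(NUM_PATTERNS) for _ in range(GRID_SIZE)]
              for _ in range(GRID_SIZE)]
unit_cell = assemble_unit_cell(index_grid, library)
print(len(unit_cell), len(unit_cell[0]))  # 32 32
```

Replacing the placeholder library with the actual eight patterns would leave `assemble_unit_cell` unchanged, since it only relies on each pattern being an 8 × 8 matrix of 0s and 1s.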
Deep learning. Artificial neural networks have emerged in the last two decades with many applications, especially in optimization and artificial intelligence. Figure 2 shows an overview of an artificial neuron, with X_1, X_2, ... as its inputs (input neurons). In neural networks, each input X_i has an associated weight W_i, by which it is multiplied. The sum function (sigma) then adds the products X_i W_i, and an activation function determines the output of these operations. With the activation function φ(u) and b as a bias value, the output of the neuron is:
$$ y = \varphi(u) = \varphi\left( \sum_{i} W_i X_i + b \right) $$
The neural network is made up of neurons in different layers. In general, a neural network consists of three types of layers: input, hidden, and output. A greater number of layers, and of neurons in each hidden layer, increases the complexity of the model. When the number of hidden layers and the number of neurons increase, the neural network becomes a deep neural network. In this work, we use a DNN to design the desired metasurface.
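The weighted sum followed by an activation, as described for the neuron of Fig. 2, can be illustrated with a short Python sketch. The choice of a sigmoid for φ is an assumption made only for illustration (the text does not state which activation is used here), and the function names are hypothetical.

```python
import math


def sigmoid(u):
    """Example activation function (an assumed choice for phi)."""
    return 1.0 / (1.0 + math.exp(-u))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Single neuron: multiply each input by its weight, sum, add bias, activate."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    u = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(u)


def dense_layer(inputs, weight_rows, biases, activation=sigmoid):
    """Fully connected layer: every neuron sees all inputs from the previous layer."""
    return [neuron_output(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


# Weighted sum is 1.0*0.5 + 2.0*(-0.25) + 0.0 = 0.0, and sigmoid(0) = 0.5
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))  # 0.5
```

Stacking several calls to `dense_layer`, each fed with the previous layer's output, gives the layered structure that becomes a deep network as the number of hidden layers grows.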
A. Non-restricted output. The inverse design of the metasurface relies on the DNN to determine the intrinsic relationships between the final metasurface structure and its geometrical dimensions. We have generated 2000 sets of random matrices that represent the metasurface structures using the "RAND" function in MATLAB. Each sample is formed by generating 16 random numbers between 1 and 8 and arranging them in a 4 × 4 matrix, where each number represents one of the 8 ring-shaped patterns. In the next step, we have linked MATLAB with CST MWS to calculate the S-parameters of the metasurface. To calculate the reflection characteristics of the infinite arrays of unit-cells, we have conducted simulations in which unit-cell boundary conditions are employed in the x and y directions and open boundary conditions in the z-direction. In the "Training of machine learning" step, our dataset therefore consists of two thousand pairs of S-parameters and metasurface pattern matrices (70% as a training set and 30% as a testing set), and the output of the training model is a matrix of 32 × 32. Each unit-cell can generate up to 8 notches in the frequency band of 4 to 45 GHz. By defining three features for each resonance (namely, notch frequency, notch depth, and notch bandwidth), the input of our proposed DNN is a vector of dimension 24, and the output is a vector of dimension 1024, which represents a unit-cell of 32 × 32 pixels. Finally, when it comes to the design procedure, we only need to enter the predetermined EM reflection properties, and our model generates the output metasurface based on what it learned during the training step. The details of the designed network are summarized in Table 1.
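The bookkeeping of the dataset described above (16 pattern numbers between 1 and 8 per sample, a 24-element input vector built from up to 8 notches with three features each, and a 70%/30% split of 2000 samples) can be sketched as follows. This sketch does not reproduce the MATLAB/CST workflow: the S-parameters must still come from full-wave simulation. Zero-padding unused notch slots, the seeds, and all names are assumptions made for illustration.

```python
import random

MAX_NOTCHES = 8          # up to 8 notches per unit-cell
FEATURES_PER_NOTCH = 3   # notch frequency, notch depth, notch bandwidth


def notch_features(notches):
    """Flatten (frequency_GHz, depth_dB, bandwidth_GHz) tuples into a 24-vector.

    Unused notch slots are zero-padded (an assumed convention).
    """
    if len(notches) > MAX_NOTCHES:
        raise ValueError("at most 8 notches are supported")
    vec = []
    for freq, depth, bandwidth in notches:
        vec.extend([freq, depth, bandwidth])
    vec.extend([0.0] * (MAX_NOTCHES * FEATURES_PER_NOTCH - len(vec)))
    return vec


def random_index_grid(rng):
    """16 random pattern numbers in 1..8, read row by row as a 4 x 4 matrix."""
    return [rng.randint(1, 8) for _ in range(16)]


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(42)
samples = [random_index_grid(rng) for _ in range(2000)]
train_set, test_set = split_dataset(samples)
print(len(train_set), len(test_set))  # 1400 600

# Example design target: a single notch of -15 dB at 15 GHz (bandwidth value is illustrative)
target = notch_features([(15.0, -15.0, 1.0)])
print(len(target))  # 24
```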
In the proposed model, dense and dropout layers are used one after the other (see second step in Fig. 1). In a fully connected (dense) layer, each neuron is connected to all the neurons in the previous layer. In a dropout layer, some neurons are randomly ignored during training in order to avoid misleading the learning process, increase the learning speed, and reduce the risk of overfitting. Selecting relevant features from the input data efficiently enhances the performance of machine learning algorithms. In the proposed model, the batch size and learning rate are set to 30 and 0.001, respectively. In addition, the Adam optimization algorithm is used for tuning the weight values (W_i). During the training process, the difference between the original and generated data is calculated repeatedly while the weight values of each layer are tuned and optimized. The training process stops when this difference, measured by the loss function, satisfies a predetermined criterion. The Mean Square Error (MSE) is used as the loss function, defined as:
$$ \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2 $$
where y_i and \hat{y}_i denote the original and generated data, respectively, and N is the number of output elements.
As a first example, Fig. 3a shows that the output full-wave results achieve the design goals. For the next example, a unit-cell is designed with a single resonance of -15 dB depth at 15 GHz. The simulation results show good conformity with our design target (see Fig. 3b). Furthermore, the curves of the mean square error and the accuracy of the presented non-restricted output DNN method are presented in Fig. 4, where we see that the accuracy is higher than 92%.
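Two of the ingredients mentioned in this paragraph, the MSE loss and the random ignoring of neurons in a dropout layer, can be illustrated with a minimal Python sketch. The dropout shown is the common "inverted" variant that rescales the surviving activations; whether the network here uses exactly this variant, and which dropout rate it uses, is not stated, so treat both as assumptions. The function names are hypothetical.

```python
import random


def mean_squared_error(original, generated):
    """Average of squared differences between original and generated data."""
    if len(original) != len(generated) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(original, generated)) / len(original)


def dropout(activations, rate, rng):
    """Inverted dropout (assumed variant): zero each activation with
    probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


# One wrong pixel out of four gives an MSE of 1/4
print(mean_squared_error([1, 0, 1, 0], [1, 0, 0, 0]))  # 0.25

rng = random.Random(0)
dropped = dropout([1.0] * 10, rate=0.5, rng=rng)  # each entry is either 0.0 or 2.0
```

Dropout is applied only during training; at design time the full network is used, which is why the rescaling keeps the expected activation unchanged.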
B. Restricted output. In order to increase the learning speed, reduce the number of calculations, and improve the efficiency of the design process, the output of the network architecture is restricted in such a way that the DNN must generate the metasurface structure from the proposed 8 ring-shaped patterns. Unlike the previous approach, in which the network outputs a vector of size 1024 to form the 32 × 32 metasurface pixels, in this case the network outputs a vector of size 48. More specifically, each unit-cell consists of a 4 × 4 matrix of these 8 ring-shaped patterns, where each ring-shaped pattern consists of 8 × 8 pixels. To form the output vector, the ring-shaped patterns are denoted by eight 3-bit digital codes, "000" to "111". Therefore, the output of the DNN is a vector of size 16 × 3 = 48. Restricting the output to a vector of size 48 reduces the amount of calculation. It will be shown that the accuracy of the network reaches up to 91%. The details of the restricted-output network are summarized in Table 2. The other parameters are the same as for the non-restricted output network. Figure 5 shows the curves of the loss function and accuracy.
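The 16 × 3 = 48 output encoding described above can be sketched as a pair of Python helpers. The bit order (most significant bit first), the assignment of code "000" to pattern index 0, and the 0.5 threshold used to binarize network outputs are assumptions for illustration; the function names are hypothetical.

```python
def encode_grid(indices):
    """Map 16 pattern indices (0..7) to a 48-element bit vector, 3 bits each."""
    if len(indices) != 16:
        raise ValueError("a unit-cell holds exactly 16 patterns")
    bits = []
    for idx in indices:
        if not 0 <= idx <= 7:
            raise ValueError("pattern index must be in 0..7")
        bits.extend(int(b) for b in format(idx, "03b"))  # e.g. 5 -> 1, 0, 1
    return bits


def decode_output(outputs, threshold=0.5):
    """Binarize 48 network outputs and regroup them into 16 pattern indices."""
    if len(outputs) != 48:
        raise ValueError("expected a 48-element output vector")
    bits = [1 if o >= threshold else 0 for o in outputs]
    return [bits[i] * 4 + bits[i + 1] * 2 + bits[i + 2] for i in range(0, 48, 3)]


grid = [0, 7, 5, 2, 1, 1, 3, 6, 4, 0, 7, 2, 5, 5, 3, 1]
code = encode_grid(grid)
print(len(code))                    # 48
print(decode_output(code) == grid)  # True
```

Compared with the 1024-element pixel output, every decoded 48-element vector corresponds to a valid arrangement of the eight patterns, which is what makes the restriction effective.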
To further validate the effectiveness of the proposed DNN method for restricted output, four different examples are presented. The specified S-parameters are fed into our network, which generates the unit-cell matrices from these input S-parameters. We then re-enter these matrices into CST MWS to simulate the reflection coefficient of the metasurface. The simulated results are in good agreement with our desired design targets (see Table 3 and Fig. 6).
To illustrate the advantages of our DNN approach, Table 4 lists the training time, the time to generate a unit-cell, and the model size for both the restricted and non-restricted structures. The results in Table 4 were obtained using Google Colab with a fixed GPU, a Tesla K80 with 13MB of RAM. The design time of our method is about 0.05 sec, which is much faster than conventional methods that take about 700 to 800 minutes, and also faster than other inverse design methods that use deep learning. In addition, our DNN-based approach requires less storage than the conventional method, which indicates that our method is more efficient. Consequently, the proposed DNN method is shown to outperform other inverse design algorithms for metasurface structures in terms of computational repetitions, training time, and network accuracy. The conformity between the simulated results and the design targets indicates that the proposed DNN approach is an effective method of metasurface design for a variety of practical applications.
Figure 4. Curves of (a) accuracy and (b) loss function relative to 10,000 epochs for the non-restricted network architecture.
Figure 5. Curves of (a) accuracy and (b) loss function relative to 10,000 epochs for the restricted network architecture.

Discussion
Herein, we have proposed an inverse metasurface design method based on a deep neural network, whereby metasurface structures may be computed directly by merely specifying the design targets. After the deep learning model has been trained on a set of samples, it can automatically generate the metasurface pattern as the output from four specified reflection criteria (namely, number of resonances, resonance frequencies, resonance depths, and resonance bandwidths) as the input, over an ultra-wide operating frequency band. Comparing the numerical simulations with the desired design targets illustrates that our proposed approach successfully generates the required metasurface structures with an accuracy of more than 90%. By using 8 ring-shaped patterns during the training process and restricting the output of the network to a vector of size 48, our presented method serves as a fast and effective approach in terms of computational iterations and design time consumption.
Table 3. Desired input targets for four S-parameters, which are presented in Fig. 6.