Predicting the Dispersion Relations of One-Dimensional Phononic Crystals by Neural Networks

In this paper, deep back propagation neural networks (DBP-NNs) and radial basis function neural networks (RBF-NNs) are employed to predict the dispersion relations (DRs) of one-dimensional (1D) phononic crystals (PCs). The data sets generated by the transfer matrix method (TMM) are used to train the NNs and to test their prediction accuracy. The filling fractions, mass density ratios and shear modulus ratios of the PCs are taken as the input values of the NNs. The results show that both the DBP-NNs and the RBF-NNs perform well in predicting the DRs of PCs. For one-parameter prediction, the RBF-NNs have shorter training time and remarkable prediction accuracy; for two- and three-parameter prediction, the DBP-NNs have more stable performance. The present work confirms the feasibility of predicting the DRs of PCs by NNs, and provides a useful reference for the application of NNs in the design of PCs and metamaterials.

www.nature.com/scientificreports

Problem Description
Consider shear-horizontal (SH) waves propagating in the 1D PC shown in Fig. 1. Materials A and B are periodically arranged in the x-direction with $a = a_A + a_B$, where $a$ is the lattice constant; $a_A$ and $a_B$ are, respectively, the thicknesses of materials A and B in one unit cell; and θ denotes the incident direction of the SH waves.
It is known that the TMM has advantages in dealing with 1D PCs. When the SH wave is obliquely incident on the 1D PC, its governing equation can be expressed as

$$\mu\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial z^2}\right) = \rho\,\frac{\partial^2 u}{\partial t^2}, \quad (1)$$

where $u(x, z, t)$ is the displacement of the PC in the y direction, μ is the shear modulus, ρ is the mass density, and t is time. When a harmonic plane wave is considered, the displacement can be assumed to take the time-harmonic form of Eq. (2), where ω is the angular frequency and $k_z$ is the wave number of the SH wave in the z direction, which is a constant by Snell's law. Substituting Eq. (2) into Eq. (1) gives the displacement and the stresses of material A and material B in the n-th lattice, Eqs. (3) and (4). The relationship between the amplitudes of the incident and the reflected waves in the n-th lattice and the (n − 1)-th lattice can be obtained from Eq. (4) and is expressed by Eq. (5) through the transfer matrix T. The amplitudes in adjacent lattices are also related by Eq. (6), where k is the 1D Bloch wave vector. Substituting Eq. (5) into Eq. (6) gives a standard matrix eigenvalue problem, that is

$$\left| T - e^{ika} I \right| = 0, \quad (7)$$

where I is the 2 × 2 unit matrix.
By solving for the eigenvalues of the matrix T, the DRs between the wave vector k and the angular frequency ω can be obtained 14, as given by Eq. (8). For given material parameters and $\theta = 0.5$ rad, the DRs of a PC, i.e., the relations between eigenfrequency and wave vector, can be obtained according to Eq. (8). Figure 2 gives the first three eigenmodes of the PC, where $\Omega = \omega a / (2\pi c_A)$ is the normalized frequency. From Eq. (8), we can see that the DRs of 1D PCs are determined by $a_i$, $\mu_i$ and $\rho_i$ ($i = A, B$) for a given incident wave and periodic constant. For anti-plane waves, the shear modulus ratio, $\bar{\mu} = \mu_B/\mu_A$, and the mass density ratio, $\bar{\rho} = \rho_B/\rho_A$, are the main physical parameters affecting the PC bandgap 15, and the filling fraction, $\bar{a} = a_B/a$, is the key geometrical parameter. Hence, for our problem, we focus on the parameters $\bar{\mu}$, $\bar{\rho}$ and $\bar{a}$. Thus, the relationship between the DRs and the PC parameters can be written as

$$\mathrm{drs} = f_a(\bar{a}, \bar{\mu}, \bar{\rho}),$$

where drs represents the DRs, and $f_a$ is the function which reflects the analytic relations between $\bar{a}$, $\bar{\mu}$, $\bar{\rho}$ and drs. The function $f_a$ belongs to the category of transcendental equations, so it is difficult to obtain analytical solutions.
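As an illustration of how a TMM calculation of this kind can be organized, the following minimal sketch computes $\cos(ka)$ as half the trace of the unit-cell transfer matrix. It is written under stated assumptions and is not the authors' implementation: it uses the displacement-stress state vector $(u, \sigma_{xy})$, a common textbook form, rather than the wave-amplitude form of Eqs. (3)-(7); each layer is passed as a hypothetical tuple `(mu, rho, thickness)`; and all function names are illustrative. The case where the x-component of the wave number is exactly zero is not handled.

```python
import cmath
import math


def layer_matrix(omega, k_z, mu, rho, thickness):
    """2x2 transfer matrix of one homogeneous layer for the state (u, sigma_xy)."""
    # x-component of the wave number; the complex square root keeps
    # evanescent (imaginary q) solutions valid. q == 0 is not handled.
    q = cmath.sqrt(rho * omega ** 2 / mu - k_z ** 2)
    c, s = cmath.cos(q * thickness), cmath.sin(q * thickness)
    return [[c, s / (mu * q)], [-mu * q * s, c]]


def matmul2(p, r):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][m] * r[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def bloch_cos(omega, k_z, layer_a, layer_b):
    """cos(k a) for a unit cell made of layer A followed by layer B."""
    t_a = layer_matrix(omega, k_z, *layer_a)
    t_b = layer_matrix(omega, k_z, *layer_b)
    t = matmul2(t_b, t_a)  # unit-cell matrix; its eigenvalues are exp(+-i k a)
    return 0.5 * (t[0][0] + t[1][1])


def bloch_wavenumber(omega, k_z, layer_a, layer_b):
    """Real Bloch wave number in [0, pi/a], or None inside a band gap."""
    value = bloch_cos(omega, k_z, layer_a, layer_b)
    lattice = layer_a[2] + layer_b[2]
    if abs(value.imag) > 1e-9 or abs(value.real) > 1.0:
        return None
    return math.acos(value.real) / lattice
```

Sweeping `omega` and collecting the frequencies for which `bloch_wavenumber` returns a value traces out the pass bands; frequencies that return `None` lie in a band gap.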
In this paper, our aim is to train NNs to learn the relationship between the DRs and the PC parameters or, in other words, to make the NNs simulate the function $f_a$. The simulated relationship is written as

$$\mathrm{drs} = f_s(\bar{a}, \bar{\mu}, \bar{\rho}),$$

where $f_s$ is the function which simulates the function $f_a$. The ultimate goal of our work is to obtain the function $f_s$ by NN. We investigate this simulated relationship between the DRs and the PC parameters for three cases: first, the relationship between $\bar{a}$ and the DRs; then the relationship between $\bar{\mu}$, $\bar{\rho}$ and the DRs; and finally, the relationship between $\bar{a}$, $\bar{\mu}$, $\bar{\rho}$ and the DRs.

Neural Networks and Data Set
In our work, the key issue is to obtain an input-output relationship by NN, where the input is the PC parameters, namely $\bar{a}$, $\bar{\mu}$ and $\bar{\rho}$, and the output is the DRs. The NN is a data-driven technology: it learns the features of the input-output relationship by being fed enough data. Once trained, the NN can predict the corresponding output for an input that it has never seen during training. We use enough data to train the NNs so that they simulate the relationship between the DRs and the PC parameters.
DBP-NN. The back propagation (BP) NN generally refers to a feedforward NN. In the past, BP-NNs usually had at most three hidden layers. In our work, however, the number of hidden layers is more than three, so the BP-NNs we use are deeper; hence, we call them DBP-NNs. The deeper the hidden layers, the stronger the ability to learn the features of the data. The structure of a DBP-NN is shown in Fig. 3; it consists of an input layer, multiple hidden layers and an output layer. Each layer except the input layer is composed of many neurons. Each neuron consists of inputs, weights, a bias, and an activation function. The output of each neuron is defined as

$$y_o = f_o\left(\sum_i w_i x_i + b\right),$$

where x is the inputs, w is the weights, b is the bias, $f_o$ is the activation function, and $y_o$ is the output of the neuron. A satisfactory NN can be obtained by adjusting the weights and biases.
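The neuron output described above, and the way layers of such neurons are chained in a feedforward network, can be sketched in a few lines. This is only an illustration: the tanh activation and the linear output layer are assumptions (the text does not state which activation function is used), and the function names are hypothetical.

```python
import math


def neuron_output(x, w, b, activation=math.tanh):
    """Output of one neuron: activation(sum_i w_i * x_i + b)."""
    return activation(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)


def layer_output(x, weights, biases, activation=math.tanh):
    """Outputs of a fully connected layer; weights holds one row per neuron."""
    return [neuron_output(x, w, b, activation) for w, b in zip(weights, biases)]


def forward(x, layers, activation=math.tanh):
    """Feed x through the hidden layers, then through a linear output layer.

    layers is a list of (weights, biases) pairs; the last pair is the
    output layer (assumed linear here).
    """
    *hidden, (w_out, b_out) = layers
    for weights, biases in hidden:
        x = layer_output(x, weights, biases, activation)
    return layer_output(x, w_out, b_out, activation=lambda z: z)
```

For a network of the kind discussed here, `x` would hold the PC parameters and the output list would hold the sampled DR values; training then amounts to adjusting every `weights` and `biases` entry.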
Cost function. It has been proved that an NN whose cost function is the mean squared error (MSE) function can estimate posterior probabilities for finite samples with high accuracy 16,17. For the DBP-NNs, we take the MSE function as the cost function, which is defined as

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_{\mathrm{NN},i}\right)^2,$$

where n is the number of groups of training set data, y is the target outputs (DRs in this paper), and $y_{\mathrm{NN}}$ is the outputs of the DBP-NN during training.
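A minimal sketch of the MSE cost for vector-valued targets is given below. It assumes that the squared differences are summed over the output values of each group and then averaged over the n groups; the exact normalization used in the original work is not stated here. Many libraries additionally average over the output values, which only rescales the cost by a constant factor.

```python
def mean_squared_error(targets, predictions):
    """Squared error summed over the outputs of each group, averaged over groups."""
    n = len(targets)
    total = 0.0
    for target, prediction in zip(targets, predictions):
        # Squared Euclidean norm of the error for one group of data
        total += sum((t - p) ** 2 for t, p in zip(target, prediction))
    return total / n
```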

RBF-NN. The radial basis function neural network (RBF-NN) can approximate arbitrary nonlinear functions. With good generalization ability, it is able to learn complicated laws in a system, and its learning efficiency is remarkable. The RBF-NN is composed of an input layer, a hidden layer and an output layer, as shown in Fig. 4. In the RBF-NN, there are no weights between the input layer and the hidden layer; weights exist only between the hidden layer and the output layer. The RBFs calculate the distance or similarity between the inputs and the centers of the hidden layer. The farther the distance or the lower the similarity, the smaller the activation of a neuron and the less obvious its effect.
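The distance-based activation of the hidden layer can be illustrated with a Gaussian basis function. The Gaussian form and the single shared `width` parameter are assumptions made for this sketch; the text does not specify which radial basis function is used.

```python
import math


def gaussian_rbf(x, center, width):
    """Activation of one hidden neuron; decays as x moves away from the center."""
    dist2 = sum((x_i - c_i) ** 2 for x_i, c_i in zip(x, center))
    return math.exp(-dist2 / (2.0 * width ** 2))


def hidden_activations(x, centers, width):
    """Activations of all hidden neurons for a single input vector x."""
    return [gaussian_rbf(x, center, width) for center in centers]
```

An input that coincides with a center gives an activation of 1, and the activation falls toward 0 as the distance grows, which is the behavior described above.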
Linear regression. For the RBF-NNs in this paper, we use the linear regression method to calculate the weights between the hidden layer and the output layer. Compared with the gradient descent method, the linear regression method saves training time and its model is simpler. The weights β are obtained as the least-squares solution that minimizes the difference between y and $y_{\mathrm{NN}}$, where β is the weights, y is the target outputs, and $y_{\mathrm{NN}}$ is the outputs of the RBF-NN during training.
Data set. The data set is composed of the training set, the validation set and the testing set, and the data in these three sets are completely different from each other. The parameters, i.e., the filling fraction $\bar{a}$, the shear modulus ratio $\bar{\mu}$ and the mass density ratio $\bar{\rho}$, are taken as the inputs of the NNs, and the first three eigenmodes of the corresponding DRs calculated by the TMM are taken as the labels. In our work, three cases are considered. For the first case, the training set A, validation set A and testing set A consist of 10, 2 and 2 sets of data respectively, where $\bar{\mu}$ and $\bar{\rho}$ are unchanged and $\bar{a}$ ranges from 0.3 to 0.75. For the second case, the training set B, validation set B and testing set B consist of 100, 20 and 20 sets of data respectively, where $\bar{a}$ is unchanged and $\bar{\mu}$ and $\bar{\rho}$ range from 0.005 to 0.095 and from 0.1667 to 0.5667, respectively. For the third case, the training set C, validation set C and testing set C consist of 1000, 100 and 100 sets of data respectively, where $\bar{a}$, $\bar{\mu}$ and $\bar{\rho}$ range from 0.3 to 0.75, from 0.005 to 0.095 and from 0.1667 to 0.5667, respectively.
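The linear-regression step for the RBF-NN output weights described above can be sketched with NumPy's least-squares solver. This is an illustrative sketch, not the authors' code: the Gaussian basis, the shared `width`, and the function names are assumptions, and `inputs` and `centers` are expected as 2D arrays with one row per sample or center.

```python
import numpy as np


def rbf_design_matrix(inputs, centers, width):
    """Gaussian activations, one row per sample and one column per center."""
    diff = inputs[:, None, :] - centers[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=2) / (2.0 * width ** 2))


def fit_output_weights(inputs, targets, centers, width):
    """Least-squares weights beta minimizing ||H @ beta - targets||^2."""
    hidden = rbf_design_matrix(inputs, centers, width)
    beta, *_ = np.linalg.lstsq(hidden, targets, rcond=None)
    return beta


def predict(inputs, centers, width, beta):
    """Network output: hidden activations times the output weights."""
    return rbf_design_matrix(inputs, centers, width) @ beta
```

Because the weights follow from a single linear solve rather than from iterative gradient updates, training takes one step, which is consistent with the short training times reported for the RBF-NNs.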

Results and Discussions
The performances of the trained DBP-NNs and RBF-NNs are tested for three cases, involving changes of the geometric parameter, changes of the physical parameters, and simultaneous changes of both, respectively.
The computing platform used in our work is a laptop whose configuration is shown in Table 1.
All the programs are written in "Python 3.5". The DBP-NNs are developed in "TensorFlow". The function "time.clock()" is used to measure the running time of the programs.
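For readers reproducing the timing on a current interpreter: `time.clock()` was deprecated in Python 3.3 and removed in Python 3.8, and `time.perf_counter()` is the usual replacement for measuring elapsed time. The helper below is a hypothetical sketch of such a measurement, not the code used in this work.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```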
Here we measure the prediction accuracy by calculating the Euclidean distance (ED) between the predicted DRs and the target DRs. The smaller the Euclidean distance, the higher the prediction accuracy. The ED is defined as

$$\mathrm{ED} = \sqrt{\sum_i \left(y_i - y_{\mathrm{NN},i}\right)^2},$$

where y is the target DRs and $y_{\mathrm{NN}}$ is the predicted DRs.
The choice of NNs architectures. In this section, the architectures of the two NNs are discussed for the three cases. The optimal architectures of the two NNs are determined by comparing the mean errors of the corresponding validation sets.
For the DBP-NNs, several candidate architectures are compared for one-, two- and three-parameter prediction. In each architecture, "1", "2" or "3" is the dimension of the input layer, "303" is the number of neurons in the output layer, and the other numbers are the numbers of neurons in the hidden layers. Figure 5 gives the mean errors of the validation sets of the DBP-NNs under the three cases. It can be seen that for one-parameter prediction, the mean error of "DBP-1-3" is smaller than the others; for two-parameter prediction, "DBP-2-3" and "DBP-2-4" have similar accuracy, but "DBP-2-3" has fewer hidden layers than "DBP-2-4"; and for three-parameter prediction, "DBP-3-3" is the best choice.
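The Euclidean distance used above as the accuracy measure can be sketched as follows, treating one predicted DR and its target as flat sequences of sampled values (the function name is illustrative).

```python
import math


def euclidean_distance(predicted, target):
    """Euclidean distance between a predicted DR and its target DR."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)))
```

Averaging this distance over all samples of a validation set gives a mean error of the kind used to compare candidate architectures.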
Choosing the RBF-NNs architectures. Similar to the previous section, for one-, two- and three-parameter prediction, twelve architectures of the RBF-NNs are compared. Figure 6 gives the mean errors of the validation sets of the RBF-NNs under the three cases. It can be seen that "RBF-1-3", "RBF-2-3" and "RBF-3-3" are the optimal architectures for the first, second and third cases, respectively.
One-parameter prediction. In this section, the DRs of PCs with different filling fractions are predicted, and the data set is composed of the training set A, the validation set A and the testing set A. The architecture of the DBP-NN is "DBP-1-3", and the architecture of the RBF-NN is "RBF-1-3". The predicted DRs of the testing set A are shown in Fig. 7, and the prediction accuracies are shown in Fig. 8. It can be seen that the predicted DRs are in good agreement with the target values. Both NNs perform well in predicting the DRs of PCs with different filling fractions, but the RBF-NN is better.
Two-parameter prediction. In this section, the DRs of PCs with different shear modulus ratios and mass density ratios are predicted, and the data set is composed of the training set B, the validation set B and the testing set B. Due to space limitations, only 2 sets of predictions are shown in Fig. 9 as examples, but the prediction accuracies of all 20 sets are shown in Fig. 10. Here, the architecture of the DBP-NN is "DBP-2-3", and the architecture of the RBF-NN is "RBF-2-3". It can be seen from Fig. 9 that for two-parameter prediction, both NNs give satisfactory predictions with high precision. From Fig. 10, it can be seen that the DBP-NN has more stable performance, while the RBF-NN has several relatively large errors, although most of its errors are very small.
Three-parameter prediction. Predictions of the DRs for different filling fractions, shear modulus ratios and mass density ratios are carried out, and the data set is composed of the training set C, the validation set C and the testing set C. Only 2 sets of predictions are shown in Fig. 11 as examples, while the error statistics of the predicted results of the testing set C are shown in Fig. 12. In this section, the architecture of the DBP-NN is "DBP-3-3", and the architecture of the RBF-NN is "RBF-3-3". It can be seen that the performances of the two NNs are still remarkable, but the DBP-NN performs much better.
Comparisons among the DBP-NNs, the RBF-NNs and the TMM are given in Table 2. It can be seen that the time required by the NNs is extremely short, and the prediction accuracy is remarkable. For one-parameter prediction, the RBF-NNs are superior to the DBP-NNs in training time, prediction accuracy and simplicity of the model. For two-parameter prediction, the RBF-NN has a smaller mean error, but Fig. 10 shows that the DBP-NNs are better than the RBF-NNs in terms of performance stability. For three-parameter prediction, the DBP-NN is a better choice because of its high accuracy and stability. In terms of calculation time, the TMM takes hundreds of thousands of times longer than the DBP-NNs and the RBF-NNs.

Conclusions
The deep back propagation neural networks (DBP-NNs) and the radial basis function neural networks (RBF-NNs) are trained to predict the dispersion relations (DRs) of one-dimensional (1D) phononic crystals (PCs) for three different cases in our work. The results show that both the DBP-NNs and the RBF-NNs can predict the DRs of PCs in a rather short time and with high accuracy. For one-parameter prediction, the RBF-NNs are superior to the DBP-NNs in training time, prediction accuracy and simplicity of the model. For two-parameter prediction, the DBP-NN has more stable performance. For three-parameter prediction, the DBP-NN is a better choice because of its high accuracy and stability. This paper confirms the feasibility and superiority of using NNs to predict the DRs of PCs. It is a fact that 2D and 3D problems are more complex in design and calculation, consuming much more time and computer memory. Therefore, the application of NNs to the design and analysis of 2D and 3D PCs and metamaterials will be of great significance. How to design a suitable NN to solve the problems of PCs and metamaterials, especially their inverse design problems, will be the difficulty and focus of future research. The present work provides a useful reference for related investigations in the future.