Efficient Wideband Numerical Simulations for Nanostructures Employing a Drude-Critical Points (DCP) Dispersive Model

A highly efficient numerical approach for simulating the wideband optical response of nano-architectures comprised of Drude-Critical Points (DCP) media (e.g., gold and silver) is proposed and validated through comparing with commercial computational software. The kernel of this algorithm is the subdomain level discontinuous Galerkin time domain (DGTD) method, which can be viewed as a hybrid of the spectral-element time-domain method (SETD) and the finite-element time-domain (FETD) method. An hp-refinement technique is applied to decrease the Degrees-of-Freedom (DoFs) and computational requirements. The collocated E-J scheme facilitates solving the auxiliary equations by converting the inversions of matrices to simpler vector manipulations. A new hybrid time stepping approach, which couples the Runge-Kutta and Newmark methods, is proposed to solve the temporal auxiliary differential equations (ADEs) with a high degree of efficiency. The advantages of this new approach, in terms of computational resource overhead and accuracy, are validated through comparison with well-known commercial software for three diverse cases, which cover both near-field and far-field properties with plane wave and lumped port sources. The presented work provides the missing link between DCP dispersive models and FETD and/or SETD based algorithms. It is a competitive candidate for numerically studying the wideband plasmonic properties of DCP media.

(Only the mesh on the surfaces are shown while the volumetric mesh is hidden to improve visual clarity. Three of the exterior surfaces are set to be transparent to reveal the nanoloop). P x = 1000 nm; P y = 1000 nm and P z = 200 nm. (C) The remaining of the physical region is meshed with high-order hexahedrons. The space inside the yellow boxes are vacuum as they have already been meshed with tetrahedrons in (B). L 1 = 3000 nm. The shared interfaces (yellow surfaces) uses rectangles mesh on one side while the other side uses triangles. Ren et al. have introduced the technique, which is also used in this work, to solve the energy communication across these non-conformal interfaces 32 .
in Fig. 1(B), due to the curved surfaces and relatively thin structure (2r), the nanoloops and their surrounding space are separated and meshed with dense low-order tetrahedrons to capture the fine geometries. For the remaining free space, a high-order hexahedron mesh is preferred due to the spatial sampling efficiency, as shown in Fig. 1(C). A Riemann solver is applied to fulfill the energy communication between the hexahedron subdomain and the tetrahedron subdomains. It is worth pointing out that all the mesh plots in this paper only show the mesh on the surfaces of the objects and the volumetric meshes are hidden to improve visual clarity. Second, the auxiliary differential equations (ADEs) employed to represent the constitutive relations of the dispersive media adopt E-J collocation to transform the inversion of matrices into efficient vector manipulation. Third, a hybrid scheme is proposed for the time integration. The second order Runge-Kutta (mid-point) method 33,41 is used to update the E and B fields as well as the polarization currents of the Drude model. The Newmark method 42 is chosen to update the polarization currents of the Lorentzian resonances corresponding to the critical points of the band transitions. The two methods are arranged in a proper updating sequence in the coupled system to achieve accurate results with efficacy.
The proposed method is applied to three examples of DCP gold nano-architectures: the near-field response of a cube, the far-field radar cross section (RCS) of a split ring resonator (SRR), and the mutual coupling between two nanoloops. Besides good agreement, a comparison to the results obtained from widely used commercial software packages including ANSYS Electronics Desktop (AEDT) and FEKO, indicates that the proposed method is more computationally efficient as well as requires less CPU time and memory.

Theory and Methods
System Equations (Maxwell's Equations and Auxiliary Differential Equations). The DCP model under consideration in this study was first derived by Etchegoin et al. 20 and extended by McKinley et al. 13 , in which the relative permittivity is represented as: The associated susceptibility is defined by: We can define the ADEs for the polarization currents associated with the two Drude terms and the Lorentzian terms as: where Thus Maxwell's equations for the DCP medium can be expressed in terms of the field variables E and B and the polarization currents given in Eq. 3: To solve Eqs 5 and 6, divergence conforming basis functions, denoted by Ψ, are used to discretize B, while curl conforming basis functions, denoted by Φ, are used to discretize E, J D,1 , J D,2 and J L,m . The expressions and field distributions of the 3D curl-and divergence-conforming basis functions can be found in refs 33 and 35.
Using Φ and Ψ to test Eqs 5 and 6, respectively, and applying the Riemann solver to treat the numerical flux between adjacent subdomains 33 , the discretized system equation for the ith subdomain can be obtained as: j j j are the unknown field vectors for the ith and the jth (local and adjacent) subdomains, respectively. j i is the vector for the impressed current, j D,1 , j D,2 and J L,m are the unknown vectors for the polarization currents, and N s is the number of subdomains.
We assume the sizes of u i , j D,1 , j D,2 and J L,m are N × 1, and the set of DoF indices of the ith subdomain are = … U N {1, 2, 3, , }. The indices of the unknowns for the polarization currents form a subset of U, denoted by U′. If we assume the size of U′ is N′, then the size of the matrix C i is N × N′, where the elemental form is an inner product over the volume of an element V e : Similarly, Φ is also used to test Eq. 3. With the help of E-J collocation 43,44 (the system matrices are identical and can be canceled) and conversion to the time domain, it follows from Eqs 9 and 10 that the auxiliary equations only require vector manipulations: The Newmark-beta method (β = 0) is employed to solve Eq. 10, which is a second order PDE. Assuming the time step size is Δt and evaluating Eq. 10 at = − ∆ ( ) t n t 1 2 , then j L m i , at t = nΔt (denoted as j L m i n , for brevity) can be updated as shown below: ( 1) and I N′ is an identity matrix of size N′ × N′.
, j D n ,1 and j D n ,2 for all subdomains have already been obtained and i are also stored, Eq. 11 can be substituted into Eq. 7 to evaluate d dt n where P i is a N′ × N matrix which isolates the E field unknowns in the DCP medium volume. The matrix  To summarize, within each time step, the polarization currents from the Lorentzian critical points are updated twice using the Newmark-beta method, while the electric field, magnetic flux and polarization current from the

Numerical Results and Discussions
Validation Case with Near-field Response Comparison to Commercial Software. To prove the effectiveness of the spatial discretization and time stepping schemes, the near-field response of a gold cube under an incident plane wave illumination (as illustrated in Fig. 2(A)) is considered as a validation case. The gold is described by a DCP model 13 with the parameters listed in Table 1, where the conversion factor between eV and frequency is 4.13821 × 10 −15 .
The center of this gold cube is located at the origin, and its edge length is 2 µm. The plane wave propagation vector is k = µm. The physical region (L 2 = 4 µm) consists of the gold cube and its surrounding free space as shown in Fig. 2(B). Each hexahedron represents a 4th-order element, and the edge length is 1 µm. To truncate the physical region, one layer of 4th-order hexahedrons are extruded to form a perfectly matched layer (PML) (not shown) to absorb the outgoing wave. The PML formulations employed in this paper can be found in ref. 33, and we will not derive them in detail here.
The total number of simulation steps is 2000 with an interval of 0.05 fs. The fields recorded at the probe are normalized and then transformed from time domain to the frequency domain. The same case is simulated using the commercial software FEKO (frequency domain integral equation solver based) and AEDT (frequency domain finite element solver based) from 30 THz to 270 THz with an interval of 30 THz (9 frequency points in total). The y and z components of the electric field at the probe are compared with results from AEDT and FEKO, as illustrated in Fig. 2(C) and (D). The maximal differences between the commercial software and DGTD are observed to be only about 0.02 V/m and 0.04 V/m (relative errors of 10% and 6.8%), respectively. This good agreement validates the spatial discretization and time integration schemes for DCP media. It is worth noting that the numerical error obviously increases at the higher frequency range (200 THz-250 THz). The reason for this is twofold: first, in the DGTD method, the low spatial sampling density at high frequency will result in relatively less accurate results compared to those for low frequencies; second, the ultra-wideband excitation source has relatively smaller energy in the high frequency range, consequently, the accuracy is more sensitive to the numerical errors.

Radar Cross Section (RCS) Calculation of Split Ring Resonator (SRR) with DCP Gold. The SRR is
a typical metamaterial structure that is widely used for different purposes such as achieving a magnetic response at optical frequencies. To test the efficiency of the proposed method, the RCS of a SRR composed of DCP gold is simulated. Figure 3(A) shows the structure of the SRR 46 , where d 3 = 380 nm, h 3 = 350 nm, w 3 = 115 nm and t 3 = 60 nm. The edge length of the physical region (SRR and its surrounding space) is L 3 = 2 µm and the corresponding mesh is illustrated in Fig. 3(B).
The plane wave propagation vector is k = [k x , k y , k z ] = [0, −1, 0] and the electric field polarization vector is [0, 0, 1]. The source type utilized for the wideband excitation plane wave corresponds to the 1st order derivative of BHW pulse with a characteristic frequency of 200 THz. The total simulation window is 50 fs with 5000 uniform time steps, leading to a resolution of 20 THz in the frequency spectrum.
The DGTD code, AEDT and FEKO were all run on the same desktop (Intel XEON E5-2609 2.5-GHz four-core processor with 24 GB memory). The proposed DGTD method outperforms AEDT and FEKO in both time and peak memory consumption as demonstrated in Table 2. The reason for the advantages are four-fold: first, DGTD is a time domain approach and it can process the wideband analysis with just one simulation; second, high-order hexahedron elements employed in the DGTD method reduce the DoF, the total unknowns is smaller (only 79k) than that of AEDT which uses a low-order tetrahedron mesh over the computational region; third, the system matrices of the DGTD method are sparse, thus they are easier to solve than the dense matrices from FEKO; and   Mutual Coupling of DCP Nanoloops. Next, the mutual coupling between two nanoloops with DCP gold is studied with the proposed method to further exhibit its advantages. The centers of the two nanoloops are located at [−500, −500, −900] nm and [500, 500, 900] nm, respectively. The radius R of each nanoloop is 400 nm and the thickness, 2r, is 80 nm as shown in Fig. 1(A). The hybrid FETD-SETD solver with hp-refinement is employed for this example. The nanoloops are meshed with 2nd order tetrahedrons (i.e., p-refinement and the FETD solver are employed to capture the curved surfaces of the loops), while the remaining free space in the physical region and the PML region are meshed with 3rd order hexahedron elements (i.e., h-refinement and the SETD solver are employed to decrease the DoF). The two nanoloops are fed with lumped ports located at [−100, −500, −900] nm (active) and [900, 500, 900] nm (passive), respectively. The input voltage source is a 1st order derivative of a BHW pulse with the characteristic frequency of 100 THz. The simulation in the DGTD method uses 5000 time steps with an interval of 0.02 fs, thus the frequency spectrum resolution is 10 THz. The same case is simulated by AEDT and FEKO, in which the frequency sweeps from 10 THz to 250 THz with an interval of 10 THz.

Gain wrt FEKO/ AEDT
The scattering parameters are calculated and compared in Fig. 4(A). The differences between the DGTD method and the commercial software packages are illustrated in Fig. 4(B). Maximal differences for both S 11 and S 21 are approximately 2 dB, which indicates that very good agreement is achieved between the three solvers. The computational resource overhead is compared in Table 3. The advantage of the proposed method is obvious since it only requires about one third of the simulation time of its competitors as well as less memory consumption (approximately 60% of FEKO and 28% of AEDT). In addition to the four reasons for this computational gain identified in the above SRR case, an additional advantage of the DGTD method in this coupling case can be attributed to the hp-refinement. Coarse high order hexahedron elements can significantly decrease the DoF in the free space far from the nanoloops. Meanwhile the dense tetrahedron mesh for the nanoloops and their surrounding area can capture the curved surfaces. Therefore, this cases can be simulated with high accuracy as well as low computational burden.

Conclusions
This paper presents a new DGTD method for simulating the wideband response of nano-architectures comprised of DCP dispersive media. It is based on the framework of domain decomposition. The advanced technologies, such as hp-refinement and non-conformal mesh are incorporated to decrease the DoF as well as capture the fine geometries. An E-J collocation scheme is applied to solve the auxiliary differential equations which converts the   Table 3. Computational overhead comparison for the nanoloop coupling case between the DGTD method, AEDT and FEKO.
inversion of the mass matrices to simple vector manipulations. A hybridization of the second order Runge-Kutta and Newmark methods is employed to update the coupled Maxwell's equations and auxiliary equations in an efficient manner. Various examples are simulated by the newly proposed method and compared to commercial software packages. The results validate the high accuracy of this method and its advantages in computational time and memory consumption. In addition, this approach can be used as a reference in the future study of FETD/SETD/DGTD algorithms based on other complex dispersive models. In upcoming research, the proposed method could, for example, be extended to include periodic boundary conditions for the simulation of photonic metamaterials comprised of DCP media based unit cells.