Foundry-fabricated grating coupler demultiplexer inverse-designed via fast integral methods

Silicon photonics is an emerging technology which, enabling nanoscale manipulation of light on chips, impacts areas as diverse as communications, computing, and sensing. Wavelength division multiplexing is commonly used to maximize throughput over a single optical channel by modulating multiple data streams on different wavelengths concurrently. Traditionally, wavelength (de)multiplexers are implemented as monolithic devices, separate from the grating coupler, used to couple light into the chip. This paper describes the design and measurement of a grating coupler demultiplexer—a single device which combines both light coupling and demultiplexing capabilities. The device was designed by means of a custom inverse design algorithm which leverages boundary integral Maxwell solvers of extremely rapid convergence as the mesh is refined. To the best of our knowledge, the fabricated device enjoys the lowest insertion loss reported for grating demultiplexers, small size, high splitting ratio, and low coupling-efficiency imbalance between ports, while meeting the fabricability constraints of a standard UV lithography process. Wavelength division multiplexing is a key method for boosting the data transfer rate of photonic devices, typically requiring multiple on-chip elements to achieve. Here, a combined grating coupler demultiplexer is developed by inverse-design methods and fabricated with standard Si foundry processes.

S ilicon photonics is a rapidly expanding industry due to its potential for positively impacting a broad range of important application domains, including high-speed interconnects, optical phased array transceivers 1 , one-and twodimensional lens-less cameras 2,3 , nanophotonic lens-less projection systems 4 , integrated LIDAR 5 , biosensors 6 , nanophotonic gyroscopes 7 , etc. This paper presents a photonic dual-band grating coupler wavelength demultiplexer (GCWD) for the bands most commonly used in telecommunication applications, namely, the O-band (centered at 1310 nm) and the C-band (centered at 1550 nm) under vertical incidence. A grating coupler couples light into and out of on-chip waveguides via an external optical fiber. The grating, which consists of many etched "teeth" which scatter the incident field, must be designed in such a way that the sum of the scattered contributions interferes coherently in the desired mode. Use of high-efficiency grating couplers, which could allow for reductions in power requirements of the laser source, can be greatly beneficial. The majority of grating couplers are designed to be efficient at coupling in a band around a single wavelength of interest. However, there exist many important applications, such as wavelength division multiplexing (WDM) for optical interconnects where data is encoded on multiple wavelengths and sent concurrently through the same optical fiber link; other applications include spectroscopy, biosensing, and nonlinear harmonic extraction. In such situations, it is useful to utilize a single grating coupler device which can both couple as well as split the signal for the relevant wavelengths incident from the fiber into separate on-chip waveguides; the first such devices were introduced by Xu et al. and other authors [8][9][10] . Although it may be possible to design a single-wavelength grating coupler with reasonable efficiency via trial and error or sequences of parametric sweeps (a challenging and time-consuming approach which may additionally lead to suboptimal designs), it appears completely impossible to even attempt to use such approaches in the design of dual-wavelength one-dimensional grating demultiplexers.
Silicon photonic chips are comprised of nanophotonic structures of large electrical size which typically contain complex subwavelength features, and, thus, the design and optimization of photonic devices amounts to a highly challenging computational problem. Recently, inverse design, which is an optimization approach that can automatically design structures under given desired specifications and fabrication constraints, has emerged as a powerful methodology for the development of highperformance photonic devices 11,12 . A brief description of the general inverse-design methodology is presented at the beginning of the Results section. Significant efforts have been devoted over the last few years to the numerical simulation and topology optimization of such structures [12][13][14][15][16][17] .
The proposed GCWD design was obtained on the basis of a fast integral-equation electromagnetic inverse design framework introduced in Sideris et al. 11 , which is outlined, demonstrated, and compared to other approaches in Supplementary Note 1. Notably, the designed device meets all of the design rule constraints of, and was fabricated using, a standard UV-lithography silicon photonics foundry process; Supplementary Notes 3 and 4 explore the robustness of the design and the tradeoffs that result when the foundry-prescribed 160 nm minimum feature size constraint is relaxed to 50 nm. Additionally, the potential of the GCWD to be used in reverse, as a wavelength combiner, is established in Supplementary Note 5. Finally, the fabricated device measurements are in close agreement with the computational predictions. To the best of our knowledge, the measured device achieves the lowest insertion loss ever reported for a GCWD device (4.95 and 2.96 dB at 1310 and 1550 nm respectively), in addition to a number of other important advantages listed in Table 1 and described in the "Discussion" section.

Results
Grating coupler inverse design via fast integral methods. Inverse design is a computational technique that enables the automated design of photonic devices. Starting from an initial guess (which is typically selected randomly, as we do in this paper, but which could otherwise be chosen on the basis of prior insight), an iterative optimization process is subsequently used which leads to increasingly more efficient structures, until the desired structural response has been achieved. The optimization step proceeds by evaluation of the sensitivity (gradient) of the underlying objective function (which quantifies the quality of the device and its performance) to perturbations of the parameters describing the structure. The gradient points in the direction of maximum improvement. A complete design is then obtained by iteratively recomputing the gradient and adjusting the design parameters at each step, as befits minimization/maximization of nonlinear functions, until convergence to a minimum is achieved. The gradient evaluation would naively require a number of solutions of the field equations: one solution per design parameter. The cost of the procedure would be prohibitive for most photonic applications, including the one considered in this paper.
To tackle this problem we rely on a powerful alternative commonly used in modern inverse design methods, namely, the "adjoint method" for gradient evaluation, which produces the same gradient, without approximations, by relying on only two simulations [11][12][13]18 .
It was recently shown that the use of integral equation methods can provide certain advantages in the areas of photonic inverse design 11 . Indeed, integral methods, which are based on Green's functions and only require discretization of material interfaces can effectively be used to represent complex photonic designs. This is in contrast with the finite element method (FEM) and the finite-difference time-domain (FDTD) which require meshing the complete domain while adequately respecting the material interfaces and satisfying Maxwell's boundary conditions on those interfaces. Aside from significantly reducing the number of unknowns required to solve the forward problem, the integral equation method also leads to a natural boundary-based formulation which is particularly valuable for purposes of optimization, since it allows for modification of the material interfaces without requiring remeshing of any kind. Traditional FEM or FDTD-based inverse design methods, on the other hand, either rely on optimization of a continuous volumetric dielectric distribution and subsequent thresholding 19 or use of level-set methods to simulate boundaries, which must then be mapped to and from the volumetric dielectric distribution at each iteration 16 at significant accuracy loss and computational cost, as discussed in Sideris et al. 11 and Supplementary Note 1.3. Since the integral equation based approach 11 only uses the interfaces to discretize the problem, the optimization parameters directly control the shape of the boundary and the equations that need to be solved, and thus allow the device boundaries to be optimized within a computational context that is well adapted to the evaluation of Maxwell solutions under arbitrary variation of the interface boundaries.
More specifically, the proposed GCWD design was obtained by parametrizing the boundaries of the grating as shown in Fig. 1. The unknowns to be optimized are the grating tooth widths w i and tooth spacings s i . A fixed etch depth of 130 nm was chosen and the minimum width and spacing were constrained to be greater than or equal to 160 nm in order to meet the required foundry specifications 20 . An objective function was subsequently defined whose minimization corresponds to joint optimization of the coupling efficiencies for both wavelengths into the designated output waveguides while concurrently maintaining a high splitting ratio between the two outputs for each frequency.
In detail, a given candidate grating structure of the type depicted in Fig. 1 is considered in what follows, whose specific dimensions are characterized by a parameter vector p = (s 0 , w 1 , s 1 , w 2 , s 2 , …, w n , s n ) of spacings and widths, with the first spacing s 0 defined relative to a prescribed, fixed point along the horizontal. Note that the first spacing, s 0 , allows the complete grating to shift left or right to align with the center of the incident excitation. The quality of the design is evaluated by means of the "objective function" whose minimization yields efficient GCWD designs. Here, η L;R ð1300;1550Þ denotes the coupling efficiency into the guide at the wavelength indicated by the subscript and measured either at the left or right output waveguide, as indicated by the superscript (cf. eq. (3)). The positive, scalar parameter α controls the emphasis of the objective function on maximizing the coupling efficiency at each wavelength vs. minimizing coupling of each wavelength into the unintended port. Note that the design was optimized for center wavelengths of 1300 and 1550 nm as can be seen in the simulated coupling efficiency results (Fig. 2), although 1310 and 1550 nm wavelengths were used for the experimental measurements due to the limitations of the lasers available in the laboratory. For a given grating structure (that is, for a given parameter vector p), these coupling efficiencies are calculated on the basis of electromagnetic simulation 11 and subsequent evaluation of mode overlap integrals across the two output waveguide cross-sections.
Like the Method of Moments 21 , the electromagnetic solver we used utilizes Green's functions, discretizations of surface currents, and integral equations. The actual integral formulation 22 used in this contribution is based on point-based "Nyström" discretization of the boundary and the use of windowing functions for truncation of infinite waveguide domains. The approach is designed to produce high accuracy in short computing times by exploiting numerical techniques that yield extremely rapid convergence as the mesh is refined; see e.g., the left portion of Supplementary Fig. 1. The high speed of the method has significantly facilitated the design process. For example, Fig. 2 in the contribution Sideris et al. 11 compares the integral solver under consideration to other methods often used in engineering practice, demonstrating improvements in design times of the order of two or more orders of magnitude over corresponding design times required by FDTD-and FEM-based approaches; cf. Supplementary Note 1.3.
As mentioned above, this numerical method only requires discretization of the material interfaces. In the structure under consideration all interfaces are composed of horizontal or vertical planar segments, as can be seen in Fig. 1. Each one of these segments is discretized using six points per wavelength but no less than a total of twenty points per segment. Furthermore, to accurately resolve the field singularities present at the corners, a graded mesh, which is induced via a reparametrization, is used to properly cluster points at such corners, as described on pages 83-84 in the book by Colton et al. 23 -so as to give rise to highorder convergence. Additional details concerning our integral equation implementation are presented in Supplementary Notes 1.1 and 1.2. Supplementary Note 1.3 compares the performance of the proposed solver against that of a commercial state-of-the-art FDTD solver in terms of accuracy vs. computing time as well as mesh resolution, clearly demonstrating the advantages of the proposed approach. Once the integral-equation solution is found, the field values at any point within the structure can be obtained in terms of certain interface integrals involving both Green functions and the values of the surface unknowns obtained as part of the integral-equation solution process. Using such field values, the aforementioned modeoverlap integrals are then computed at approximately two wavelengths away from the ends of the grating design on the left and right sides by evaluating the fields for integration along vertical lines perpendicular to the waveguide cross-section. The optimization process was tackled by means of a quasi-Newton Fig. 1 Diagram of the proposed grating coupler wavelength demultiplexer geometry (GCWD). The individual tooth widths and spacings (shown to be uniform in the figure, for simplicity) are parametrized by the variables w i and s i respectively. These variables are selected by the inverse design algorithm to optimize the demultiplexing objective function. The input excitation is modeled as a radiative-type Gaussian beam which closely matches the free-space field emanating from an optical fiber. The quantities n, u inc , P L wg , and P L wg denote the refractive indices, the incident beam excitation, and the power coupled into the left and right silicon waveguides, respectively. method, namely, the L-BFGS algorithm 24 (Limited-memory Broyden-Fletcher-Goldfarb-Shanno) in the variables contained in the vector p, to minimize the function (1). As mentioned above, the necessary gradient is obtained by means of the adjoint integral approach introduced by Sideris et al. 11 , which requires a solution of a certain adjoint integral problem that can be discretized and solved in the same manner as the original forward electromagnetic problem on the waveguide. For the types of problems under consideration in this paper, the minimization process was achieved in approximately fifty to two hundred L-BFGS iterations.
Demultiplexer simulation and design. The design under consideration, for a given vector p of widths w j and spacings s j , is depicted in Fig. 1. This figure demonstrates some of the characteristics of the design, including the top oxide passivation layer (n = 1.44, t = 2.78 μm), the grating Silicon device layer (n = 3.48, t = 0.22 μm), the buried oxide layer (n = 1.44, t = 2.0 μm), and the silicon substrate handle (n = 3.48). A TM-polarized (Transverse Magnetic), vertically incident Gaussian beam excitation of 10 μm waist diameter at a height of 0.22 μm above the passivation layer was used to simulate the fundamental mode of the SMF-28 optical fiber vertically incident over the GCWD device.
The simplicity of the width and spacing parameters for geometry characterization leads to a straightforward implementation of the optimization problem on the basis of the L-BFGS algorithm-by using L-BFGS box constraints requiring that each width and spacing meets the foundry fabrication constraints, as lower bounds, and an arbitrary sufficiently large spacing value (taken to equal 1 μm for this design) as upper bounds. Thus, minimum width and spacing restrictions of 160 nm, as required by the A*STAR AMF silicon photonics foundry 20 that was used to fabricate the structure, were adopted in our design. The optimization process was completed in 122 L-BFGS iterations, resulting in a total computing time of approximately 130 singlecore minutes on a Xeon E5-2630 v3 2.4GHz CPU. The simulated insertion losses from the incident optical fiber to the left and right output ports of the grating are plotted as a function of the wavelength in Fig. 2 using both 2D as well as full-wave 3D FDTD simulations for comparison. The close agreement in insertion losses produced by the 2D and 3D simulations justifies a-posteriori the use of a 2D solver as part of the optimization process.
Note, in particular, that, as intended, the simulated coupling efficiency peaks for 1300 nm on the left port (at 27.5%, corresponding to 5.6 dB insertion loss) and for 1550 nm on the right port (at 35.5%, corresponding to 4.5 dB insertion loss). Furthermore, the coupling efficiency on the right output port at 1550 nm (resp. left output port at 1300 nm) is small, at only 0.29% (resp. 0.33%), implying simulated splitting-ratio values of 21 and 19 dB on the left and right ports respectively. The corresponding optimized design, finally, is depicted in Fig. 3. Figure 3a (resp. 3b) demonstrates coupling to the left (resp. right) port due to incident illumination at the 1300 nm (resp. 1550 nm) wavelength-clearly demonstrating the splitting capability of the proposed device, whose experimental realization, as discussed in the following section, achieves a lower measured insertion loss (Table 1) than previously reported grating splitters. For reference, 24.7 and 13.7% of the incident power is back-reflected upwards at 1300 and 1550 nm respectively, and the remaining power is either transmitted through one of the two output waveguides as indicated above or transmitted downwards through the substrate. The parameter vector p listing the tooth widths and spacings of the optimized GCWD is presented in Supplementary Note 2 (Supplementary Table 1).
Experimental results. As mentioned above, the grating coupler demultiplexer presented in this paper has demonstrated favorable efficiencies and wavelength demultiplexing. In order to quantitatively evaluate the device performance, we present the experimentally observed values of the insertion losses and splitting ratios at each wavelength. Thus, using the outgoing power parameters P left λ and P right λ -which equal, for each of the two wavelengths λ considered, the power obtained at the left and right ports, respectively-, following 12 we define the splitting ratio at wavelength λ by Note that, on account of the absolute value used in this expression, ζ λ yields the logarithmic ratio between the amounts of power going into the intended port and the opposite port for both wavelengths λ = 1300 nm and λ = 1550 nm. The splitting ratio is an important metric for GCWD devices, and should be maximized in order to avoid cross-talk between information in the two bands, but it is not the only important metric in this context. Indeed, the insertion loss from the input incident fiber to each output and the corresponding absolute coupling efficiency at each wavelength are also very important parameters for the assessment of the device performance. The absolute coupling efficiency and the insertion loss in dB at the left (L) and right (R) ports for a given incident wavelength λ are defined by respectively. A GCWD used for optical interconnect applications, for example, must feature both low insertion loss and low mismatch between the insertion losses at the two wavelengths in order to minimize the incident power required for a desired signal-to-noise ratio. Although Piggott et al. 9 do not report absolute coupling efficiency measurements, Fig. 4b in that paper does show that the relative insertion loss of the signal at 1550 nm is at least 10 dB lower than that at 1310 nm. Figure 4b depicts a 3D rendering of the designed GCWD device alongside a scanning electron micrograph (c) of the actual fabricated device, and the measurement setup (a) used to characterize it. The coupling losses and alignment tolerances for the demultiplexer were characterized in both the O-band (1310 nm) and the C-band (1550 nm). The structure was vertically illuminated with a 2D scanning fiber and the coupled power at the two ports was recorded for both wavelengths as illustrated in Fig. 4a. The coupled power at the two ports was measured using integrated Germanium positive-intrinsicnegative (PIN) photodiodes at both wavelengths. The outgoing power at the two output ports of the structure was measured via the photodiodes as the fiber position was swept at normal incidence over a 50 μm × 50 μm area centered at the nominal design position with a 50 nm × 2 μm step size; Fig. 5 displays the power collected at each port for each one of the two wavelengths for each position considered in the sweep. Note that a significantly finer step size was used for the sweep in the x-direction (the axis of propagation of the GCWD) compared to the transverse y-direction axis, on account of the essentially uniform character of the design, and corresponding slow field variation, in the y-direction.
The experimentally measured splitting ratio ζ λ between the two ports is 9.16 dB at 1310 nm and 8.18 dB at 1550 nm for the input fiber at position (x = 0 μm) (centered over the GCWD). The measured absolute coupling efficiency at each port and corresponding wavelength, at the cross-section y = 0 μm, are extracted from Fig. 5 and plotted in Fig. 6 alongside simulation results. It can be seen that there is good agreement between the measured and simulated results. In fact, for both left and right transmission, Fig. 6 shows experimental power values higher than the numerical values obtained for the simulated device; the departure is significantly more noticeable for the power transmitted into the left fiber. But the experimental data does not contradict the predictions, since the objective function consists not only of the power transmitted into the desired waveguide, but it also includes a penalty for power transmitted into the undesired waveguide. When both powers are taken into account the apparent contradiction is eliminated: the value of the objective function for the combined left and right power, for both frequencies, lies below the optimal predicted value. The simulated sensitivity of the device with respect to variations in position and angle of the incident fiber is plotted in Fig. 7. Additional details and analysis concerning sensitivity to angular variation are presented in Supplementary Note 3.2.
Directional coupler and PIN modulator devices were included in-line with the output waveguides in order to provide added flexibility in the experimental characterization of the GCWD. The test devices introduce an extra insertion loss which must be subtracted to accurately characterize the performance of the GCWD itself. To do this, the insertion losses of the directional couplers and PIN modulators were characterized using FDTD simulation. It was found that the PIN modulators have an insertion loss of 0.75 dB at 1310 nm and 0.05 dB at 1550 nm. The directional couplers result in an additional loss of 0.12 dB at 1310 nm and 2.49 dB at 1550 nm. The responsivity of the photodiodes was measured at 0.5 A/W at both wavelengths. Finally, as predicted by the aforementioned results presented in Fig. 7, the peak collected power for the wavelength demultiplexer corresponds to zero degree incident angle and after deembedding the responsivity of the photodiodes, the directional couplers, and the modulators, the adjusted insertion losses of the measured wavelength demultiplexer in isolation were found to be 4.95 dB at 1310 nm and 2.96 dB at 1550 nm, as reported in Table 1.

Discussion
This paper demonstrated the design, fabrication, and measurement of a grating coupler wavelength demultiplexer design with state-of-the-art performance for optical interconnect applications. The GCWD was designed, without any human interaction aside from specification of desired performance criteria and fabrication constraints, by means of a high-speed, inverse-design approach based on the boundary-integral Maxwell solver introduced in Sideris et al. 11 . The tested device was fabricated by A*STAR AMF 20 and the measurements results were found to agree well with the predicted simulations indicating a clear potential for applicability in practical settings. Most importantly, minimum feature size constraints were incorporated in the design process and the grating was designed for operation under vertical input incidence (cf. Fig. 1)-which makes the device fabricable in a standard SOI foundry process and easy to package, respectively. Both of these characteristics are crucial for reducing cost and increasing yield in a commercial setting.
A comparison between the measured performance of the proposed device and related prior work is presented in Table 1. As can be seen from the table, the proposed device provides significant advantages in several of the performance categories listed: among the devices considered, it exhibits the lowest insertion loss as well as high splitting ratio between the ports, low insertion-loss imbalance between the two wavelengths, small total device size, and it was fabricated in a standard UV lithography-based silicon photonics foundry process which was possible in view of its large minimum feature size. For reference, the contribution Piggott et al. 9 , where the 1D GCWD concept was first demonstrated, presents a design with minimum feature size of 61.1 nm, which is incompatible with foundry UV lithography, and therefore required E-beam lithography for fabrication. Further, an imbalance of~14 dB in the coupling efficiency of the two wavelengths can be seen in Fig. 4b in Piggott et al. 9 compared to the 1.95 dB imbalance in the present design.   Additionally, like Xu et al. 8 and Piggott et al. 9 , the proposed device functions under a direct vertical input incidence which greatly simplifies the packaging and alignment requirements in a practical setting. And, finally, although the design presented by Streshinsky et al. 10 enjoys a high splitting ratio at 1550 nm, this comes at the expense of non-vertical fiber incidence (23 ∘ incidence angle is used in that reference), high insertion loss, and very large device size. In all, we suggest that both, the proposed GCWD itself and the underlying rapid boundary integral equation method used in its design, have significant potential for improving the performance and reducing the size of next generation silicon photonic systems.

Methods
Device simulation and optimization. The simulation and inverse design approach used in this paper for modeling and optimization of the grating coupler wavelength demultiplexer is described in detail in Sideris et al. 11 . Briefly, the 2D simulation and optimization proceeds by means of a custom fast boundary integral solver which, based on the use of the Maxwell Green function and like the method of moments, does not rely on volumetric discretization, and it results in highly efficient solution of the Maxwell equations. The device optimization, in turn, utilizes the L-BFGS quasi-Newton gradient-descent algorithm 24 together with the integral-equation based version of the adjoint method for gradient computation. Both 2D and 3D FDTD simulations were additionally used to independently validate the coupling efficiency vs, wavelength at each of the two outputs of the GCWD and to confirm the correctness of the 2D optimization results.
Device fabrication. The finalized GCWD design was laid out using the freely available K-Layout CAD software 25 using the Process Design Kit (PDK) provided by A*STAR AMF 20 . The design was exported in the GDSII file format and sent to the foundry electronically for processing and fabrication. The chip was fabricated by AMF using their standard 193 nm UV lithography multi-project-wafer (MPW) process.
Device measurement. The system was measured and characterized using IFA-600 active alignment system with Keysight 81657A DFB laser module in O-band (1312.1 nm) and C-Band (1546.9 nm). Input polarization was adjusted using an external fiber polarization controller. The two output photodiode currents measured from the integrated Germanium detectors were used in an automated loop to align the illumination fiber and characterize the alignment tolerance. An alignment tolerance of the system better than 0.1 dB was obtained.

Data availability
The design parameters that characterize the proposed device as well as an alternate device requiring smaller minimum feature sizes, are included in the Supplementary Notes 2 and 4. The data used to produce the figures can be obtained upon reasonable request from the corresponding author.

Code availability
The computer codes used for simulation and design are available from the corresponding author upon reasonable request.