Mapping the global design space of nanophotonic components using machine learning pattern recognition

Melati, Daniele; Grinberg, Yuri; Kamandar Dezfouli, Mohsen; Janz, Siegfried; Cheben, Pavel; Schmid, Jens H.; Sánchez-Postigo, Alejandro; Xu, Dan-Xia

doi:10.1038/s41467-019-12698-1

Download PDF

Article
Open access
Published: 21 October 2019

Mapping the global design space of nanophotonic components using machine learning pattern recognition

Nature Communications volume 10, Article number: 4775 (2019) Cite this article

8693 Accesses
119 Citations
6 Altmetric
Metrics details

Abstract

Nanophotonics finds ever broadening applications requiring complex components with many parameters to be simultaneously designed. Recent methodologies employing optimization algorithms commonly focus on a single performance objective, provide isolated designs, and do not describe how the design parameters influence the device behaviour. Here we propose and demonstrate a machine-learning-based approach to map and characterize the multi-parameter design space of nanophotonic components. Pattern recognition is used to reveal the relationship between an initial sparse set of optimized designs through a significant reduction in the number of characterizing parameters. This defines a design sub-space of lower dimensionality that can be mapped faster by orders of magnitude than the original design space. The behavior for multiple performance criteria is visualized, revealing the interplay of the design parameters, highlighting performance and structural limitations, and inspiring new design ideas. This global perspective on high-dimensional design problems represents a major shift in modern nanophotonic design and provides a powerful tool to explore complexity in next-generation devices.

Stochastic and multi-objective design of photonic devices with machine learning

Article Open access 26 March 2024

Machine learning for nanoplasmonics

Article 26 January 2023

Centimeter-scale nanomechanical resonators with low dissipation

Article Open access 18 May 2024

Introduction

A multitude of parameters determine the performance of a photonic device, encompassing the optical properties of the constituent materials, structural geometry, and dimensions. Similarly, the choice of the best design to proceed to fabrication, integration, and system implementation needs to take into account many performance criteria. These not only include the primary functionality but also other metrics, such as insertion loss, the effect of temperature, and the influence on other system components (e.g., back reflections), to name a few. Susceptibility of manufacturing yield to the inherent variability of the fabrication processes is another important consideration.

Historically, conception of a new device relies on theoretical knowledge and physical intuition to identify the potential structure and design parameter range. The design parameter space is explored semi-analytically or numerically, and the relevant performance metrics are analyzed. This approach is constrained in scope by computational resources and limited to structures governed by only a few parameters and where the evaluation process can be decomposed into sequential steps. As the scope of nanophotonics broadens in complexity and application range^1,2, this conventional approach poses increasing challenges. For example, in devices employing metamaterials^3,4,5,6 or complex geometries generated by inverse design and topology optimization^{7,8,9,10,11,12}, not only the number of design parameters vastly increase but they are often strongly interdependent. Sequential optimization is no longer applicable, and simultaneous optimization of multiple parameters is required.

Optimization tools such as genetic algorithm^13,14,15, particle swarm^16,17, and gradient-based optimization^18,19,20,21 are now commonly used to search more efficiently for high-performance designs²². More recently, supervised machine-learning methods, such as the artificial neural network, have begun to enter the fray in speeding up the search process^23,24. While all these approaches represent significant improvements to the design flow, they still suffer constitutive limitations: usually a single performance criterion is optimized; only a single or a handful of optimized designs are discovered; and the optimization process needs to be repeated for modified performance criteria. Furthermore, optimized designs in isolation reveal very little on the characteristics of the design space and the influence of the design parameters on the device performance. Consequently, careful balancing of different figures of merit becomes difficult. A global perspective on the design space of nanophotonic devices is presently missing.

We propose here a methodology based on machine-learning (ML) tools to map and characterize a multiparameter design space. As a first demonstration-of-concept, we analyze a vertical fiber grating coupler in the silicon-on-insulator (SOI) platform consisting of multiple segments whose dimensions need to be optimized simultaneously^{17,21,25,26,27,28,29}. From an initial sparse set of optimized designs, unsupervised dimensionality reduction reveals a lower-dimensional design sub-space where good designs with high-performance reside. Since computational effort grows exponentially with dimensionality, this sub-space can be mapped faster by orders of magnitude than the original design space, enabling the efficient evaluation and visualization of an arbitrary number of performance criteria. The comprehensive characterization of the continuous region that includes all possible good design solutions highlights their performance as well as structural differences and limitations. This provides a clear understanding of the design space, making possible the discovery of superior designs based on the relative priorities for a particular application. Furthermore, this global perspective on the design space can be exploited to arrive at conclusive arguments as to whether certain features can be obtained with a given structure. With the considered grating geometry, an upper limit to the minimum feature size is identified. This design bottleneck revealed by dimensionality reduction inspired a new grating structure that incorporates subwavelength metamaterial and allows a minimum feature size greater than 100 nm without compromising the device performance. The composite design space of refractive index and segment dimensions involved here can be equally well characterized using the same global mapping procedure, indicating the general applicability of our optimization approach to a wide range of high-dimensional design problems. To the best of our knowledge, this is the first time such a global perspective is obtained by leveraging unsupervised machine-learning techniques for high-dimensional design problems in nanophotonics.

Results

Strategy for characterizing a multiparameter design space

In designing multiparameter devices, it is often difficult or impossible to obtain extensive information on device performance variation over the large parameter space due to constrains on computational resources. We tackle this problem by introducing the three-stage process illustrated in Fig. 1.

In the first stage, multiple iterations of an optimization algorithm are used to generate a sparse collection of different good designs, i.e., designs that optimize a primary performance criterion (Fig. 1a). Supervised ML techniques are exploited to speed up the search process by quickly providing promising design candidates as starting points for the optimizer (see the Methods section). In the second stage (Fig. 1b), dimensionality reduction is applied to analyze the relationship in the parameter space between these degenerate designs. The goal is to find a lower-dimensional sub-space where all good designs reside. This design sub-space is described by significantly fewer parameters compared with the original design space. In the last stage (Fig. 1c), we map the design sub-space by computing across it all required performance criteria and identifying a continuous region of good design solutions. Through this process, the sparse initial set of good designs efficiently leads to the identification and comprehensive characterization of the continuum of all good designs in the sub-space.

A vertical fiber grating coupler with five segments per period is taken as the study case. The considered grating structure is illustrated in Fig. 2. Although desirable, perfectly vertical emission makes the design a challenging problem due to the necessity to suppress the second-order diffraction that reflects back into the waveguide²⁶. In a recent work by Watanabe et al.¹⁷, a single optimized design was generated using particle swarm optimization, providing a good fiber–chip coupling efficiency and a fairly low level of back reflections. Each period of the grating consists of a pillar of 220 nm in height and an L-shaped section with a partial etch to 110 nm¹⁷. The multiple segments in each period need to be simultaneously optimized, and therefore provide a good target to demonstrate our machine-learning-based design approach.

Discovery of a sparse collection of good designs

In the first stage, an in-house optimizer launched from random starting points in the original parameter space is used to search for the initial sparse set of grating designs with state-of-the-art fiber coupling efficiency. A supervised machine-learning predictor is trained to determine the diffraction angle of these complex gratings without performing a first principles Bloch mode calculation. The predictor is used to rapidly screen out random start designs that do not radiate close to the vertical direction, thereby speeding up the search process by ~250% (see the Methods section). This algorithm achieves a wide coverage of the initial design space.

For the grating illustrated in Fig. 2, the dimensions [L₁ … L₅] define the five-dimensional design parameter space we explore in this work. As a primary optimization objective, we choose the coupling efficiency η of the diffracted TE-polarized light to a standard single-mode optical fiber (SMF-28) placed vertically on top of the grating. Low back reflection r is another important criterion in minimizing the impact of the grating to other upstream components in the system^30,31. These considerations lead to the formulation of the optimization problem for the first stage represented in Fig. 1a as:

$$\mathop {{{\mathrm{maximize}}}}\limits_{{{L}}_1 \cdots {{L}}_5}\quad {\eta} \left( {L_1 \cdots L_5} \right) \\ {\mathrm{subject}}{\kern 2pt} {\mathrm{to}}\quad r \left ( {L_1 \cdots L_5} \right) < - {\!\!} {15} {\mathrm{dB}} \\ 400{\kern 2pt} {\mathrm{nm}} < {\Lambda} < {1}{\kern 2pt} {\mu} {\mathrm{m}};{\kern 2pt} L_{{i}} > {50}{\kern 2pt} {\mathrm{nm}}, i = 1 \cdots 5.$$

(1)

The optimization is guided by a single performance metric, the coupling efficiency η. Only solutions with η larger than 0.74 are retained, defined here as good designs. Back reflection r is not optimized, but simply constrained by rejecting design solutions with r > −15 dB. Additional constraints on the grating period Λ and the minimum feature size are included to confine the optimization to designs that are physically manufacturable. The wavelength of light is set at λ = 1550 nm. A highly efficient Fourier-type 2-D eigenmode expansion simulator³² is used to compute the device performance. As detailed in the next section, the optimization stage is halted after a sufficient number of good designs is collected. Each good design requires on average 1000 simulations to be identified (computational details in the Methods section).

Sub-space identification through dimensionality reduction

In this second and key step, we study the relationship between the sparse set of good solutions obtained solving the optimization problem (1) through machine-learning dimensionality reduction. The goal is to transform a set of correlated variables into a smaller set of new uncorrelated variables that retain most of the original information. Here, the linear principal component analysis (PCA)³³ is used obtaining a good level of accuracy (see PCA description in the Methods section).

We find that two principal components are sufficient to accurately represent the entire pool of good designs, each defined by five segment length in the original design space. That is, all good designs approximately lie on a 2D hyperplane—the reduced design sub-space. The rest of the design space can be excluded from further investigation. As we discuss in Supplementary Note 1, post-processing error analysis demonstrates that five good designs are sufficient for PCA to provide an accurate result. In order to verify convergence we collect here 45 good designs. The linear design sub-space is defined by two orthogonal basis vectors V_1αβ and V_2αβ. Any design k with dimensions L_k = [L_{1,k …} L_5,k] can hence be written as

$${\mathbf{L}}_{\mathrm{k}} = \alpha _{\mathrm{k}}{\mathbf{V}}_{1{\mathrm{\alpha \beta }}} + \beta _{\mathrm{k}}{\mathbf{V}}_{2{\mathrm{\alpha \beta }}} + {\mathbf{C}}_{{\mathrm{\alpha \beta }}}.$$

(2)

C_αβ is a constant vector that defines the reference origin on the hyperplane. Two scalar coefficients α_k and β_k are thus sufficient to completely describe design k. Details of vector definitions are provided in the Methods section.

Now that the area of interest in the design space is limited to a 2D hyperplane, it becomes feasible to adopt a classical design approach and perform an exhaustive mapping of this sub-space, as illustrated in Fig. 1c. First, we generate a uniform grid of 60 × 60 points covering the α–β hyperplane. Therefore, 3600 sampling points are sufficient to provide a wealth of information. As a comparison, sampling with the same resolution in the original design space would require ~1.5 million points, increasing the computation time by about over 400 times (details in the Methods section). For each point [α_k, β_k], we obtain the corresponding dimensions [L_1,k … L_5,k] in the original design space through Eq. (2) and compute the coupling efficiency η. The results are shown in Fig. 3a only for the designs with η > 0.7. Note that not all points on the α–β plane provide high coupling efficiency. A unit division in α or β corresponds to a Manhattan distance $\mathop {\sum }\nolimits_{i = 1}^5 |L_{i,A} - L_{i,B}|$ of 100 nm. The 60 × 60 grid covering the α–β hyperplane has a resolution of 5 nm in Manhattan distance. This exhaustive mapping results in the discovery of a large and well-defined region of degenerate designs with η > 0.74, highlighted by the black contour line in Fig. 3a. This region encloses a continuum of designs in addition to those discovered in stage 1. Remarkably, although all the good designs have similar coupling efficiencies ranging from η = 0.74 to η = 0.76, the actual structure of the gratings can vary quite significantly, as will be detailed in the next section. Without dimensionality reduction, there is no obvious way to discover these alternative designs with similar coupling efficiency, but potentially different properties in other aspects.

As the last step, we validate the PCA outcome by verifying that the projection on the low-dimensional design sub-space (the α–β hyperplane) is sufficient to represent the region of good grating designs. We generate two additional 2D hyperplanes Γ–Π and X–Π that are orthogonal to each other and to the α–β plane (details in the Methods section). They provide two different cuts through the α–β sub-space and their projections are shown in Fig. 3a with dashed black lines. We generate a uniform grid on Γ–Π and X–Π and simulate the coupling efficiency of the corresponding grating design. Coupling efficiencies η > 0.7 are plotted in Fig. 3b and c, respectively. The intersection with the α–β hyperplane is marked with a white dashed line. The axis uses the same scale as in Fig. 3a: a unit division on X, Γ, or Π corresponds to a Manhattan distance of 100 nm. When projected on Γ–Π and X–Π hyperplanes, the region of good designs essentially reduces to a thin stripe whose thickness depends on the range of accepted coupling efficiencies η. This confirms that this region has approximately a 2D geometry. Although it appears slightly curved (see the Γ–Π projection, Fig. 3b), it is still well approximated by the α–β hyperplane which has the advantage of being a simple linear structure.

Characterization of the low-dimensional good design sub-space

The exhaustive mapping of the sub-space of all good designs can now be readily extended to other performance metrics beyond the primary criterion originally chosen as the optimization objective (the coupling efficiency). This provides the designer a complete picture of the device behavior, including the upper and lower limit of each performance metric. Informed trade-offs and identification of the best design that fits specific application needs are hence made possible.

Along with the coupling efficiency, we evaluate here three additional criteria throughout the sub-space, i.e., back reflections, minimum feature size, and tolerance to fabrication uncertainty, as presented in Figs. 3 and 4. All maps use the same axis scale, range, and sampling as the α–β plane, and the black contour line marks the region with η > 0.74 for reference. Three designs are selected for further examination. Their structural and performance parameters are listed in Table 1. Designs 1 and 2 are on the α–β plane (marked on Fig. 3a, d), while design 3 (marked in Fig. 3b, c) is the global optimum in both Γ–Π and X–Π projections (not exactly represented on the α–β plane). The design proposed in ref. ¹⁷, which was found through particle swarm optimization, also belongs to the sub-space of good solutions and its location on the α–β hyperplane is marked for reference. Despite the very different design parameter (especially for the L-shaped structure), all these gratings have a highly directional vertical emission and good overlap with the fiber mode, leading to a high coupling efficiency (η > 0.75 as listed in Table 1). On the other hand, the attainable back reflection differs markedly, from −21 dB for design 1 to −37 dB for design 2. The possibility to exhaustively map other metrics throughout the sub-space allows the designer to identify a design area with particularly low back reflections around design 2.

As an additional comparison of the performance for designs 1–3, Fig. 3e and f plots the two performance criteria η and r as a function of wavelength, now computed using 2D finite-difference-time-domain (FDTD) method as a cross-check. The results agree well with that predicted by the Fourier-type 2D simulator. All three designs have a 1-dB bandwidth larger than the telecommunication C band (1530 nm–1565 nm, green-shaded area) with design 2 slightly out-performing the other two. Regarding back reflections, the behavior of the three designs is remarkably different (Fig. 3f). Back reflections of design 2 near wavelength 1550 nm are very low, which is particularly important for coupling to a laser. However, a reflection of less than −30 dB can only be achieved within a 7-nm wavelength band. In contrast, the reflection of design 1 and 3 oscillates between −26 dB and −17 dB within the entire C band.

Minimum feature size determines the manufacturability of any nanophotonics device. Here, feature sizes can be easily retrieved exploiting the α–β hyperplane through a query process without performing any additional photonic simulations. For each point, the dimensions [L₁ … L₅] are computed with Eq. (2) and the shortest section is identified. Figure 4a shows the shortest segment among the five-dimensional parameters. It is immediately evident that a design with minimum feature size greater than 88 nm does not exist for this grating structure, even accepting a small penalty in the coupling efficiency. This essential information can be easily retrieved because of the global perspective that our method offers, and it would be difficult to obtain with a conventional optimization procedure. Clearly, the bottleneck is predominantly in either L₁ or L₂. This finding inspired an improved grating structure (described in the next section) that allows a minimum feature size greater than 100 nm while maintaining similar performance.

Another important aspect for nanophotonic devices is the robustness against unavoidable fabrication uncertainty. Here, we examine two sources of common dimensional variability: A width deviation δ_w for both shallow and deeply etched sections and etch depth deviation δ_e from the nominal 110 nm for the shallow-etched section. A good measure of the sensitivity of coupling efficiency and back reflections to variability is provided by a quantity denoted here as a degradation derivative, defined as the average of the two directional derivatives with respect to positive and negative values of δ_w or δ_e (definition in the Methods section). The computed values of the four degradation derivatives are shown in Fig. 4b–e, with a high value indicating a high sensitivity. For width deviations, the coupling efficiency has a particularly sensitive region close to design 1. On the contrary, back reflection is more sensitive to width deviations in the region close to design 2. This region has a large overlap with the region of minimum back reflection shown in Fig. 3d, making design 2 and surrounding designs high-performing when back reflection is considered, but with stringent fabrication requirements. In Supplementary Note 2, we provide a direct verification of these results through a polynomial chaos-based stochastic analysis²². Regarding sensitivity to etch depth variations, the entire region of good designs largely overlaps with a region of low sensitivity for both coupling efficiency and back reflection.

Dimensionality-reduction generality and ML inspired geometry

The proposed design approach requires no physical assumptions on the device under study, enabling its application to other design problems with different types of input parameters and/or objectives. A straightforward demonstration of its generality is carried out by designing vertical grating couplers for the optical communication O-band, centered at 1310 nm (see Supplementary Note 4). We further demonstrate here this generality by investigating a new class of grating couplers utilizing subwavelength metamaterials^3,4 to achieve designs with a minimum feature size greater than 100 nm in both the propagation and the transverse directions. This new grating geometry (see Fig. 5) is inspired by the global mapping of minimum feature size described in the previous section. The optimization now involves both dimensions and the effective material index used to represent the subwavelength segments. Dimensions and refractive index have different numerical magnitudes, but they both significantly impact the device performance when varied. Below, we show that the method presented in the previous sections can successfully map out the high-performance region of this new mixed design space.

The subwavelength metamaterial grating structure, schematically shown in Fig. 5a, is represented by a mixed set of four geometrical parameters and one material parameter ${\bar{\mathbf{L}}} = \left[ {L_1,L_2,L_3,L_4,n_{{\mathrm{swg}}}} \right]$. These complex grating couplers can still be efficiently simulated using the 2D eigenmode expansion simulator. The formulation of the optimization problem remains the same as in the Eq. (1), but we additionally set 1.6 < n_swg< 3. Good designs found by the optimization algorithm (η > 0.74) are used to perform both PCA and the corresponding error analysis (see Supplementary Note 1). Since the new design space includes parameters of different nature, before executing PCA it is essential to normalize the variables through their statistically estimated standard deviations. Performing the PCA analysis (stage 2) reveals that again only two principal components are sufficient to identify all the good designs within the original 5D design space (vectors are defined in the Methods section).

In Fig. 5b and c, we exhaustively map out the coupling efficiency and the back reflection over the 2D hyperplane (stage 3), discovering again a large continuous region with η > 0.74. The map covers the region where the coupling efficiency exceeds 0.7. The black contour encloses all devices with a minimum feature size greater than 100 nm in both propagation and transverse directions and maintains η ≥ 0.74. Three devices are marked on the maps for further investigation, where independent 2D-FDTD simulations provide full wavelength analysis in Fig. 5d, e (see Supplementary Note 3 for structural details and performance metrics). The results show that physically distinct devices with similar performances can be identified very efficiently. Delivering on such objectives would be very challenging by conventional optimization methods.

Discussion

We have demonstrated a new approach for the design of complex photonics devices with a large number of parameters, case studied on a multiparameter vertical fiber grating coupler. Rather than generating a single optimized design solution, our methodology exploits the dimensionality reduction technique from the suite of machine-learning pattern recognition tools to identify within the large design space the lower-dimensional sub-space of good devices with high performance. This approach exponentially scales down the complexity of the problem, making it feasible to exhaustively map the continuous region of grating couplers with comparable fiber coupling efficiencies ($\eta {\,}> {\,} 0.74$ at 1550 nm). Significant differences emerge when different performance criteria are considered, such as back reflection, minimum feature size, and tolerance to fabrication uncertainty. Such a global perspective also reveals performance and structural limitations of the design geometry. In particular, we were able to conclude that good coupling efficiency and a minimum feature size greater than 88 nm could not be obtained simultaneously for this first structure. The analysis of this shortcoming inspired a new class of grating couplers that uses subwavelength metamaterial, achieving a minimum feature size greater than 100 nm while maintaining state-of-the-art coupling efficiency and back reflection.

Given the generality of our implementation, the presented methodology can be readily exploited to navigate and comprehend a wide range of high-dimensional design spaces that photonic designers often encounter. While it is demonstrated here for two different design problems in nanophotonics, applications to photonic circuits or even sub-systems can be foreseen. This design methodology opens up new avenues in photonic device analysis and design where other dimensionality reduction methods such as Kernel PCA³⁴, Principal manifolds³⁵, and Autoencoders³⁶ can deal with navigating through even more complex design spaces. Indeed, automation and integration of dimensionality reduction within the design flow will provide a powerful platform potentially transforming how advanced photonic devices are discovered and investigated.

Methods

Grating coupler simulation

The simulation of coupling efficiency and back reflections for each design of the grating coupler is performed exploiting either a 2D vectorial Fourier eigenmode expansion simulator³² or a commercial 2D-FDTD solver. We consider a structure including silicon substrate, 2-μm buried oxide, 220-nm-thick silicon core and a silica upper cladding of 1.5 μm thickness. Silicon and silica refractive indices are 3.45 and 1.45 at λ = 1550 nm. The mode of an SMF-28 single-mode optical fiber vertically coupled on top of the grating is modeled with a Gaussian function with a mode field diameter of 10.4 μm (λ = 1550 nm). The fiber facet is assumed to be in direct contact with the top of the upper cladding and its longitudinal position along the grating is optimized for each design to maximize the coupling efficiency. The latter is calculated as the overlap integral between the simulated field diffracted upwards by the grating and the Gaussian function.

Machine learning enhanced optimization

We implemented a random restart local-search algorithm to solve problem (1), although other search methods could also be used. For each initial random design, small perturbations are made until a better solution in terms of coupling efficiency is found and a line search is exploited to seek further improvement until convergence. The perturbation and line search process is repeated until no improvement is found in the perturbation stage.

Following an initial optimization round with random-restart where a small collection of good designs was obtained, we trained a supervised machine-learning model, specifically gradient boosted trees, to predict if the radiation angle is within 5° of vertical. Once trained, we used the predictor to sample random designs that are nearly vertical while rejecting other random designs. Only near vertical emitting designs proceed to the local search stage. The use of the predictor in the optimization led to ~250% increase in optimizer speed to find new designs meeting the coupling efficiency criteria η > 0.74.

The performance of the general purpose machine-learning predictor is comparable to that of a predictor based on the scalar grating equation relating the section lengths, effective indices and the radiation angle:

$$\mathop {\sum }\limits_{i = 1}^5 N_iL_i = c + a\,{\mathrm{sin}}\,\theta \cdot \mathop {\sum }\limits_{i = 1}^5 L_i,$$

where $a,c$ are constants related to the wavelength and the overcladding effective index, $N_{{i}}$ is the effective index of the ith section, $L_{{i}}$ is the length of the ith section, and $\theta$ is the radiation angle. Given the simulated data ($\left( {\theta ,L_{{i}}} \right)$, one can use linear regression to estimate all the constants and then use those to predict the radiation angle of the structure for any combination of the section lengths. Despite the fact that the general machine-learning predictor is not aware of this approximation, its predictions are comparable with those obtained using the scalar grating equation. Furthermore, this approach can be applied to predict other quantities that cannot be described by simple closed-form expressions.

PCA

PCA is a dimensionality reduction technique that has been used widely and successfully across various engineering and science disciplines^37,38,39,40 and is implemented in most scientific computing platforms (e.g., Matlab, R, Python, etc.). Consider m data points in an n-dimensional design space. This can be written as a centered data matrix ${\mathbf{L}} \in {\Bbb R}^{m \times n}$, where the statistical average along each dimension is subtracted. PCA finds a sequence of best orthogonal linear projections (called principal components) that maximizes the corresponding variances. Mathematically, this is done by finding a set of vectors ${\mathbf{V}}_1,{\mathbf{V}}_2, \ldots ,{\mathbf{V}}_n \in {\Bbb R}^n$ that are L2 normalized and orthogonal, $\left\| {{\mathbf{V}}_i} \right\|_2 = 1,{\mathbf{V}}_i\, \bot\, {\mathbf{V}}_j$, and for every $k < n$, minimize

$$\left\| {{\mathbf{LR}} - {\mathbf{L}}} {\left[ {{\mathbf{R}}_{\mathbf{k}},{\mathbf{0}}^{n \times (n - k)}} \right]} \right\|_2,$$

where, ${\mathbf{R}} = [{\mathbf{V}}_1,{\mathbf{V}}_2, \ldots ,{\mathbf{V}}_n] \in {\Bbb R}^{n \times n}$ is the transformation matrix that is formed by using the PCA vectors as its columns, ${\mathbf{R}}_{\mathrm{k}} = [{\mathbf{V}}_1,{\mathbf{V}}_2, \ldots ,{\mathbf{V}}_{\mathrm{k}}] \in {\Bbb R}^{n \times k}$ is similar but transforms to a lower-dimensional space consisting of only ${\mathrm{k}}$ first components, and $0^{n \times (n - k)}$ is a null matrix used for padding. Such an exercise also returns the weights associated with PCA vectors from which the main principal components can be selected and the remaining can be dropped, with negligible amount of information lost. The original data set can now be approximately represented by $k < n$ effective parameters using a matrix product ${\mathbf{L}}^{{\mathrm{PCA}}} = {\mathbf{LR}}_k$, where each row corresponds to a transformed representation of a single data point. The transformation back to the original n-dimensional space can be done using ${\mathbf{L}}^{{\mathrm{est}}} = {\mathbf{L}}\,{\mathbf{R}}_k{\mathbf{R}}_k^{\mathrm{T}}$, which can be used to quantify approximation errors incurred by reducing the dimensionality.

Hyperplanes definitions

The hyperplane approximation given in Eq. (2) and computed by PCA is defined by three 5D vectors V₁, V₂, and C, the latter being the reference origin within the hyperplane. The scaled vectors computed for the grating structure in Fig. 2 are: V_1αβ = [−0.43, 3.78, −20.82, 44.77, −30.21] nm, V_2αβ = [−25.80, 10.81, −37.69, −3.86, 21.93] nm and the origin C_αβ = [102,73,156, 243,156] nm.

For the two orthogonal hyperplanes Γ–Π and X–Π described in Fig. 3b and c, the vectors are found following linear algebra considerations. Γ–Π is defined enforcing the plane to pass through the two points (defined in the 5D design space) [77, 84, 115, 249, 171] nm (design 1 in Table 1) and [95, 83, 104, 336, 98] nm and to be orthogonal to the α–β hyperplane. The resulting vectors are V_1ΓΠ = [9.45, −0.35, −6.13, 45.95, −38.35] nm, V_2ΓΠ = [−22.48, 33.08, 9.87, −17.87, −28.78] nm, C_ΓΠ = [85, 84, 110, 289, 138] nm. X–Π is defined as orthogonal to both α–β and Γ–Π and passing through the point [82, 87, 111, 283, 139] nm (design 3 in Table 1). The vectors are V_1×Π = [−25.92, 11.96, −44.16, 10.05, 12.61] nm, V_2×Π = V_2ΓΠ, and C _XΠ = [85, 84, 110, 284, 142] nm.

Table 1 Structural and performance parameters of selected grating designs as marked in Fig. 3

Full size table

For the grating structure with subwavelength transverse metamaterial (Fig. 5a), the 5D vectors defining the 2D hyperplane are V_1αβ = [13.37 nm, −3.34 nm, 17.21 nm, −19.5 nm, −0.03], V_2αβ = [5.07 nm, 14.07, −7.33 nm, 2.53 nm, −0.076] and C_αβ = [272 nm, 71 nm, 247 nm, 120 nm, 2.63].

Computational resources

In the proposed method, the largest fraction of the computational time is dominated by design simulations with a negligible overhead time for data processing. The initial optimization (stage 1) enhanced by the ML angle predictor required ~5000 photonic simulations to identify the initial five good designs used to find the reduced design sub-space through PCA. The exhaustive mapping was then performed by sampling 3600 points (design) arranged in a square grid in the sub-space and doing the corresponding 3600 photonic simulations to compute simultaneously coupling efficiency and back reflections. Other sampling strategies or sparser grids can be used to reduce the number of simulations. Likewise, a different simulation approach could be used to retrieve additional metrics within the same simulation. The computation of additional performance metrics may require additional simulations, for example, when fabrication tolerance is estimated through the degradation derivatives as reported in Fig. 4b–e. On the other hand, the computation of the minimum feature size does not require any additional simulation.

It is worth calculating the number of points of a grid with the same resolution used for the α–β hyperplane, but across the five grating dimensions and covering all designs with coupling efficiency η > 0.74 found in stage 1. The five dimensions of these good designs span a range of 60 nm, 27 nm, 86 nm, 138 nm, 118 nm, respectively. A grid with 5 -nm resolution across all segments and covering all good designs, including two extra points to confirm the boundaries per segment, results in 14 × 7 × 19 × 30 × 26 ≈ 1.5 ∙ 10⁶ points (designs). Compared with mapping directly the original 5D design space, mapping the α–β hyperplane thus reduces the computation time of about 400 times. For higher dimensions, the reduction can be even more significant.

Uncertainty model

For the investigation of design tolerance to fabrication uncertainty, we assume a width deviation δ_w for both shallow-etched and deeply etched sections

$$L_1^\prime = L_1 - \delta _w;\,L_2^\prime = L_2 + \delta _w;\,L_3^\prime = L_3 - \delta _w;\,L_4^\prime = L_4;\,L_5^\prime = L_5 + \delta _w$$

We define a degradation derivative that is computed from two directional derivatives, assuming that over-etch and under-etch are equally likely (positive and negative values of δ_w can in general affect the device performance differently). Calculating the common derivative would not be informative, as for locally optimized devices it would be close to zero. For coupling efficiency, we are interested in calculating

$$\alpha _{\mathrm{\eta }} = - \frac{1}{2}\left( {\frac{{\partial ^ + \eta }}{{\partial \delta _w^{}}} - \frac{{\partial ^ - \eta }}{{\partial \delta _w^{}}}} \right) \cong - \frac{1}{{2|{\mathrm{\Delta }}\delta _w|}}\left( {\eta ^ + + \eta ^ - - 2\eta _0} \right),$$

where ∂⁺ and ∂⁻ are the derivatives computed for positive and negative values of δ_w. The minus sign ensures that a positive value of α_η indicates a worse (lower) coupling efficiency. Derivatives are numerically computed considering a small width variation and simulating the coupling efficiencies η⁺ (when Δδ_w = 5 nm) and η⁻ (when Δδ_w = −5 nm). η₀ is the coupling efficiency for δ_w = 0. Similarly, for back reflections

$$\alpha _{\mathrm{r}} = \frac{1}{2}\left( {\frac{{\partial ^ + \eta }}{{\partial \delta _w^{}}} - \frac{{\partial ^ - \eta }}{{\partial \delta _w^{}}}} \right) \cong \frac{1}{{2|\Delta \delta _w|}}\left( {\eta ^ + + \eta ^ - - 2\eta _0} \right).$$

Also in this case, a positive α_r indicates worse (higher) back reflections. The degradation derivatives plotted in Figs. 4b and c are finally computed as

$$d = \left\{ {\begin{array}{*{20}{l}} {\alpha {\quad}\,{\mathrm{if}}\,\alpha{\,} > {\,} 0}\hfill \\ {0{\quad}\,{\mathrm{otherwise}}} \end{array}} \right.$$

When etch uncertainty is considered, δ_e represents the variability on the 110 nm etch depth in the fourth section of the grating. The same definitions apply for the degradation derivatives.

Data availability

The data that support the plots within this paper and other findings of this study are available from the corresponding author upon request.

Code availability

The custom code that has been used to generate the results reported in this paper is available from the corresponding author upon request.

References

Thomson, D. et al. Roadmap on silicon photonics. J. Opt. 18, 073003 (2016).
Article ADS Google Scholar
Smit, M. et al. An introduction to InP-based generic integration technology. Semicond. Sci. Technol. 29, 083001 (2014).
Article ADS Google Scholar
Cheben, P., Halir, R., Schmid, J. H., Atwater, H. A. & Smith, D. R. Subwavelength integrated photonics. Nature 560, 565–572 (2018).
Article CAS ADS Google Scholar
Halir, R. et al. Subwavelength-grating metamaterial structures for silicon photonic devices. Proc. IEEE 106, 2144–2157 (2018).
Article CAS Google Scholar
Cheben, P. et al. Refractive index engineering with subwavelength gratings for efficient microphotonic couplers and planar waveguide multiplexers. Opt. Lett. 35, 2526–2528 (2010).
Article ADS Google Scholar
Frellsen, L. F., Ding, Y., Sigmund, O. & Frandsen, L. H. Topology optimized mode multiplexing in silicon-on-insulator photonic wire waveguides. Opt. Express 24, 16866 (2016).
Article CAS ADS Google Scholar
Jensen, J. S. & Sigmund, O. Topology optimization for nano-photonics. Laser Photonics Rev. 5, 308–321 (2011).
Article CAS ADS Google Scholar
Lu, J. & Vučković, J. Nanophotonic computational design. Opt. Express 21, 13351 (2013).
Article ADS Google Scholar
Piggott, A. Y. et al. Inverse design and demonstration of a compact and broadband on-chip wavelength demultiplexer. Nat. Photonics 9, 374–377 (2015).
Article CAS ADS Google Scholar
Shen, B., Wang, P., Polson, R. & Menon, R. Integrated metamaterials for efficient and compact free-space-to-waveguide coupling. Opt. Express 22, 27175–27182 (2014).
Article ADS Google Scholar
Shen, B., Wang, P., Polson, R. & Menon, R. An integrated-nanophotonics polarization beamsplitter with 2.4 × 2.4 μm² footprint. Nat. Photonics 9, 378–382 (2015).
Article CAS ADS Google Scholar
Liu, D., Gabrielli, L. H., Lipson, M. & Johnson, S. G. Transformation inverse design. Opt. Express 21, 14223 (2013).
Article ADS Google Scholar
Covey, J. & Chen, R. T. Efficient perfectly vertical fiber-to-chip grating coupler for silicon horizontal multiple slot waveguides. Opt. Express 21, 10886 (2013).
Article CAS ADS Google Scholar
Håkansson, A. & Sánchez-Dehesa, J. Inverse designed photonic crystal de-multiplex waveguide coupler. Opt. Express 13, 5440–5449 (2005).
Article ADS Google Scholar
Melati, D., Waqas, A., Xu, D.-X. & Melloni, A. Genetic algorithm and polynomial chaos modelling for performance optimization of photonic circuits under manufacturing variability. In Optical Fiber Communication Conference M3I.4 (Optical Society of America, 2018). https://doi.org/10.1364/OFC.2018.M3I.4
Ma, Y. et al. Ultralow loss single layer submicron silicon waveguide crossing for SOI optical interconnect. Opt. Express 21, 29374–29382 (2013).
Article ADS Google Scholar
Watanabe, T., Ayata, M., Koch, U., Fedoryshyn, Y. & Leuthold, J. Perpendicular grating coupler based on a blazed antiback-reflection structure. J. Light. Technol. 35, 4663–4669 (2017).
Article CAS ADS Google Scholar
Jensen, J. S. & Sigmund, O. Systematic design of photonic crystal structures using topology optimization: low-loss waveguide bends. Appl. Phys. Lett. 84, 2022–2024 (2004).
Article CAS ADS Google Scholar
Lalau-Keraly, C. M., Bhargava, S., Miller, O. D. & Yablonovitch, E. Adjoint shape optimization applied to electromagnetic design. Opt. Express 21, 21693 (2013).
Article ADS Google Scholar
Niederberger, A. C. R., Fattal, D. A., Gauger, N. R., Fan, S. & Beausoleil, R. G. Sensitivity analysis and optimization of sub-wavelength optical gratings using adjoints. Opt. Express 22, 12971–12981 (2014).
Article CAS ADS Google Scholar
Su, L. et al. Fully-automated optimization of grating couplers. Opt. Express 26, 4023–4034 (2018).
Article CAS ADS Google Scholar
Weng, T.-W., Melati, D., Melloni, A. & Daniel, L. Stochastic simulation and robust design optimization of integrated photonic filters. Nanophotonics 6, 299–308 (2017).
Article Google Scholar
Turduev, M. et al. Ultracompact photonic structure design for strong light confinement and coupling into nanowaveguide. J. Light. Technol. 36, 2812–2819 (2018).
Article CAS ADS Google Scholar
Peurifoy, J. et al. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. Adv. 4, eaar4206 (2018).
Article ADS Google Scholar
Michaels, A. & Yablonovitch, E. Inverse design of near unity efficiency perfectly vertical grating couplers. Opt. Express 26, 4766 (2018).
Article CAS ADS Google Scholar
Wang, B., Jiang, J. & Nordin, G. P. Embedded slanted grating for vertical coupling between fibers and silicon-on-insulator planar waveguides. IEEE Photonics Technol. Lett. 17, 1884–1886 (2005).
Article CAS ADS Google Scholar
Roelkens, G., Thourhout, D. V. & Baets, R. High efficiency grating coupler between silicon-on-insulator waveguides and perfectly vertical optical fibers. Opt. Lett. 32, 1495–1497 (2007).
Article CAS ADS Google Scholar
Dabos, G. et al. Perfectly vertical and fully etched SOI grating couplers for TM polarization. Opt. Commun. 350, 124–127 (2015).
Article CAS ADS Google Scholar
Chen, X., Li, C. & Tsang, H. K. Fabrication-tolerant waveguide chirped grating coupler for coupling to a perfectly vertical optical fiber. IEEE Photonics Technol. Lett. 20, 1914–1916 (2008).
Article CAS ADS Google Scholar
Liu, A. Y., Komljenovic, T., Davenport, M. L., Gossard, A. C. & Bowers, J. E. Reflection sensitivity of 1.3 μm quantum dot lasers epitaxially grown on silicon. Opt. Express 25, 9535–9543 (2017).
Article CAS ADS Google Scholar
Melati, D., Melloni, A. & Morichetti, F. Real photonic waveguides: guiding light through imperfections. Adv. Opt. Photon 6, 156–224 (2014).
Article CAS Google Scholar
Zavargo-Peche, L., Ortega-Moñux, A., Wangüemert-Pérez, J. G. & Molina-Fernández, I. Fourier based combined techniques to design novel sub-wavelength optical integrated devices. Prog. Electromagn. Res. 123, 447–465 (2012).
Article Google Scholar
Pearson, K. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 2, 559–572 (1901).
Article Google Scholar
Schölkopf, B., Smola, A. & Müller, K.-R. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10, 1299–1319 (1998).
Article Google Scholar
Gorban, A. N., Kégl, B., Wunsch, D. C., & Zinovyev, A. Y. (Eds.). Principal manifolds for data visualization and dimension reduction. 58, 96–130 (Springer Berlin Heidelberg, 2008)
Kingma, D. P. & Welling, M. Auto-encoding variational bayes. Preprint at https://arxiv.org/abs/1312.6114 (2013).
M. Shenai, P., Xu, Z. & Zhao, Y. Applications of principal component analysis (PCA) in materials science. In Principal Component Analysis - Engineering Applications, 25–40 (ed. Sanguansat, P.) (InTech, 2012).
Kassahun, Y. & Kebedee, T. Application of principal component analysis in surface water quality monitoring. In Principal Component Analysis - Engineering Applications, 83–100 (ed. Sanguansat, P.) (InTech, 2012).
Pasini, G. Principal component analysis for stock portfolio management. Int. J. Pure Apllied Math. 115, 153–167 (2017).
Article Google Scholar
Dunteman, G. H. Principal Components Analysis (SAGE, 1989).

Download references

Author information

These authors contributed equally: Daniele Melati, Yuri Grinberg.

Authors and Affiliations

Advanced Electronics and Photonics Research Centre, National Research Council Canada, 1200 Montreal Rd., Ottawa, ON, K1A 0R6, Canada
Daniele Melati, Mohsen Kamandar Dezfouli, Siegfried Janz, Pavel Cheben, Jens H. Schmid & Dan-Xia Xu
Digital Technologies Research Centre, National Research Council Canada, 1200 Montreal Rd., Ottawa, ON, K1A 0R6, Canada
Yuri Grinberg
Universidad de Málaga, Departamento de Ingeniería de Comunicaciones, ETSI Telecomunicación, Campus de Teatinos s/n, 29071, Málaga, Spain
Alejandro Sánchez-Postigo

Authors

Daniele Melati
View author publications
You can also search for this author in PubMed Google Scholar
Yuri Grinberg
View author publications
You can also search for this author in PubMed Google Scholar
Mohsen Kamandar Dezfouli
View author publications
You can also search for this author in PubMed Google Scholar
Siegfried Janz
View author publications
You can also search for this author in PubMed Google Scholar
Pavel Cheben
View author publications
You can also search for this author in PubMed Google Scholar
Jens H. Schmid
View author publications
You can also search for this author in PubMed Google Scholar
Alejandro Sánchez-Postigo
View author publications
You can also search for this author in PubMed Google Scholar
Dan-Xia Xu
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

D.M., Y.G. and D.X.X. conceived the design approach and developed the theoretical framework. Y.G. developed the machine-learning algorithms. D.M. analysed the data and performed the stochastic analyses. P.C., A.S.P. and J.H.S. assisted in selecting the grating coupler study case. A.S.P. contributed to the development of the interface between the photonic simulator and machine-learning algorithms. S.J., P.C. and J.H.S. provided theoretical and design guidance. D.X.X. and Y.G. supervised the project. M.K.D. and A.S.P. conceived and analysed the grating design with subwavelength patterning. All authors contributed to the discussion and paper preparation.

Corresponding author

Correspondence to Dan-Xia Xu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Andrea Melloni and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Melati, D., Grinberg, Y., Kamandar Dezfouli, M. et al. Mapping the global design space of nanophotonic components using machine learning pattern recognition. Nat Commun 10, 4775 (2019). https://doi.org/10.1038/s41467-019-12698-1

Download citation

Received: 24 June 2019
Accepted: 19 September 2019
Published: 21 October 2019
DOI: https://doi.org/10.1038/s41467-019-12698-1

This article is cited by

Stochastic and multi-objective design of photonic devices with machine learning
- Paolo Manfredi
- Abi Waqas
- Daniele Melati
Scientific Reports (2024)
Machine learning for knowledge acquisition and accelerated inverse-design for non-Hermitian systems
- Waqas W. Ahmed
- Mohamed Farhat
- Ying Wu
Communications Physics (2023)
Highly efficient ultra-broad beam silicon nanophotonic antenna based on near-field phase engineering
- Shahrzad Khajavi
- Daniele Melati
- Winnie N. Ye
Scientific Reports (2022)
Intelligent designs in nanophotonics: from optimization towards inverse creation
- Ning Wang
- Wei Yan
- Min Qiu
PhotoniX (2021)
Photonic bandgap engineering using second-order supersymmetry
- Nitish Chandra
- Natalia M. Litchinitser
Communications Physics (2021)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.