Optimizing PCF-SPR sensor design through Taguchi approach, machine learning, and genetic algorithms

Kaziz, Sameh; Echouchene, Fraj; Gazzah, Mohamed Hichem

doi:10.1038/s41598-024-55817-9

Article
Open access
Published: 03 April 2024

Scientific Reports volume 14, Article number: 7837 (2024)

Abstract

Designing Photonic Crystal Fibers incorporating the Surface Plasmon Resonance Phenomenon (PCF-SPR) has led to numerous interesting applications. This investigation presents a highly sensitive surface plasmon resonance sensor integrated into a dual-core photonic crystal fiber and designed specifically for low refractive index (RI) detection. A plasmonic material, silver (Ag), deposited externally on the fiber structure enables real-time monitoring of variations in the refractive index of the surrounding medium. To ensure long-term functionality and prevent oxidation, a thin layer of titanium dioxide (TiO₂) covers the silver coating. To optimize the sensor, five key design parameters, including pitch, air hole diameter, and silver thickness, are tuned using the Taguchi L₈(2⁵) orthogonal array. The optimized design reaches spectral and amplitude sensitivities of 10,000 nm/RIU and 235,882 RIU⁻¹, respectively. In addition, Artificial Neural Network (ANN) optimization techniques, specifically Multi-Layer Perceptron (MLP) and Particle Swarm Optimization (PSO), are used to predict a critical optical property of the sensor, the confinement loss (α_loss). These predictions are derived from the same input structure parameters that are present in the full L₃₂(2⁵) design experiment. A genetic algorithm (GA) is then applied with the goal of maximizing the confinement loss. Our results highlight the effectiveness of PSO-trained artificial neural networks and demonstrate their ability to quickly and accurately predict results for unknown geometric dimensions. The proposed sensor design can be used for various applications, including pharmaceutical inspection and detection of low refractive index analytes.

Introduction

The optical phenomenon known as surface plasmon resonance (SPR) occurs when free electrons oscillate at the interface between a metallic surface and a dielectric layer. In this fascinating phenomenon, the photon wavelengths of the incident electromagnetic wave align with the wavelengths of the surface electrons, especially under p-polarized light radiation¹. This unique phenomenon has spurred extensive research into SPR sensors, primarily because of their attractive properties. These sensors offer efficiency, precision in sensing, fast response times, real-time and label-free detection, and an exceptional ability to effectively control light². Traditional SPR sensors have been designed using prisms, fiber Bragg gratings, slot waveguides, and V-groove waveguides. However, these designs tend to be bulky and costly². To overcome these limitations, SPR sensors based on photonic crystal fibers (PCFs) have been introduced. PCF-based sensors provide portability, compactness, and the ability for remote sensing. Various PCF-SPR structures have been investigated for different sensing applications. These include configurations such as microfluidic slot-based designs, external metal-coated structures, long-period fiber Bragg gratings, internal metal-coated structures, and D-shaped structures, among others². A PCF-SPR sensor uses two different sensing configurations: external and internal. In the internal sensing approach, the analyte selectively occupies the air holes in the fiber. This mechanism enhances sensitivity because the introduced analyte directly modifies the initial refractive index distribution of the fiber. However, internal sensing is not suitable for real-time and distributed sensing applications due to its impracticality and susceptibility to significant propagation losses. To overcome these challenges, the external sensing technique is used. 
In this method, the analyte is located on the surface of the PCF, eliminating the need for analyte infiltration into the fiber. The external sensing technique has gained popularity due to its ease of detection and practical implementation³. In previous research, a gold lattice PCF-SPR sensor was introduced that achieved an impressive wavelength sensitivity (Ws) of 3340 nm/RIU⁴. Another study reported a D-shaped PCF sensor with a sensing range of 1.33 to 1.43, resulting in a maximum Ws of 46,000 nm/RIU⁵.

In a separate study, a gold-plated D-shaped PCF-SPR sensor with a refractive index (RI) detection range of 1.33 to 1.38 and a maximum sensitivity of 10,493 nm/RIU at an RI of 1.38 was discussed⁶. Numerous other PCF-based SPR sensors capable of detecting analytes with RI values as low as 1.33 have been documented in various studies^{4,5,6,7,8,9,10,11,12,13,14}. However, most of this work has focused on sensor structures suited to analytes with RI values greater than 1.33. Research on PCF-SPR sensors capable of detecting lower RIs, particularly those below 1.30, has been relatively limited³. Demand for such sensors is growing as applications emerge in diverse fields, including aerogels¹⁵, halogenated ethers¹⁶, sevoflurane, and pharmaceuticals. Recognizing this need, a few PCF-SPR sensors have been proposed for low-RI analyte detection^{16,17,18,19,20,21}, with Ws values of 13,500, 6000, 11,055, 20,000, 13,000, and 51,000 nm/RIU, respectively. However, only two of these studies^{19,21} reported the amplitude sensitivity (As), with values of approximately 1054 and 1872 RIU⁻¹, respectively. This leaves room for PCF-SPR sensors that detect lower refractive indices with improved sensitivity in both interrogation methods.

In this research, we introduce a dual-core photonic crystal fiber surface plasmon resonance (PCF-SPR) sensor specifically designed for low refractive index detection and analyze it numerically. The improved performance of the sensor is achieved by incorporating a dual sensing channel created by a microchannel and a bimetallic configuration²². This design improves the sensitivity of the sensor in both wavelength and amplitude interrogation methods. The addition of a titanium dioxide (TiO₂) layer on top of the silver coating plays a key role in improving sensor performance. It generates a significant number of surface electrons that effectively attract the field from the core, resulting in a robust interaction with the plasmonic mode.

Accurate modeling and optimization of photonic crystal structures typically depend on numerical methods, including the finite difference method²³, the finite element method (FEM)²⁴, the block-iterative frequency domain method²⁵, and the plane wave expansion method^{26,27}.

However, these methods require significant computational resources, especially for complex photonic crystal structures that need multiple simulations to reach an optimized design. Moreover, the computational burden of these iterative analyses grows with the number of input design parameters to be optimized. Therefore, in our study, we used the Taguchi approach to optimize five critical structural parameters of the PCF sensor: the three air hole diameters, the pitch, and the silver layer thickness. By using the Taguchi approach, we were able to streamline the optimization process and achieve our goals with a limited number of simulations^{28,29,30}.

Recently, machine learning (ML) and deep learning have become dominant tools in various fields, including computer vision, robotics, chatbots, natural language processing, and many others. In addition, researchers have extended machine learning to the field of photonics. This expansion has included diverse areas such as multimode fibers³¹, plasmonics³², biosensing³³, metamaterials³⁴, and networking³⁵. In one notable case, Kiarashinejad et al.³⁶ introduced a deep learning-based algorithm that used dimensionality reduction techniques to gain insight into the interactions between electromagnetic waves and nanostructures. In addition, a geometric deep learning approach has been used to study nanophotonic structures³⁷. In 2018, extreme learning machines and deep learning techniques were combined to compute dispersion relations³⁸ and optimize Q factors³⁹ for photonic crystals.

A genetic algorithm is a search and optimization method inspired by natural selection and genetics. It is used to solve complex problems by evolving a population of potential solutions over generations. Through operations such as reproduction, mutation, and selection, genetic algorithms aim to obtain increasingly better solutions over time, simulating the process of biological evolution to find optimal or near-optimal solutions. These algorithms are widely used in optimization and heuristic search.

In our work, we combine the Taguchi methodology with artificial intelligence, using machine learning techniques to predict confinement losses in photonic crystal fibers. We combine finite element simulations with artificial neural networks (ANN) to facilitate fast and accurate computations. The motivation of this work is to design simple feed-forward Multilayer Perceptron (MLP) and Particle Swarm Optimization (PSO) models that can be trained to estimate critical parameters such as the confinement loss (α_loss) of a PCF structure. Furthermore, a Genetic Algorithm (GA) is applied to maximize the confinement loss in the sensor.

Design and numerical simulation

The proposed dual-core PCF sensor configuration and x–y cross-sectional view are shown in Fig. 1a and b. This novel sensor design is organized in a square lattice with two layers of air holes (Fig. 1c). The sensing area spans a length of L = 1 mm. To improve the interaction between the core-guided and surface plasmon polariton (SPP) modes, we reduced the size of two air holes (d₂) located at the top of the initial ring. In addition, we excluded two air holes located in the center of the initial ring when fabricating the dual-core structure.

The manufacturing process involves the layering of capillaries and solid rods, followed by drawing at a certain speed to form the fiber. Different dimensions of air holes, including both large and small sizes, and absence of air holes are achieved by using thin and thick capillaries and solid rods, respectively⁹. Upon completion of fiber fabrication, a polishing technique is implemented⁵. This technique involves polishing a segment of the fiber, incorporating the large thin-walled capillary from the second ring, while the remaining part of the capillary forms the microchannel. Finally, a chemical deposition technique^18,19 is used to deposit a coating of silver and TiO₂ on the polished side of the fiber.

The finite element method (FEM) was used for the numerical analysis of the proposed sensor. In order to improve the absorption of the radiation power, a perfectly matched layer was included as the outermost layer. To achieve the highest simulation accuracy, a very fine mesh element was used. The optimized structural parameters consist of the diameters of the air holes (d₁, d₂, d₃), the pitch (Λ), and the thickness of the silver layer (t_Ag). Furthermore, the opening of the microchannel is set to 1.75 µm.

The dielectric constant of silver is determined using the Drude model, as described in the reference⁴⁰:

$${\varepsilon }_{Ag}\left(\omega \right)={\varepsilon }_{\infty }-\frac{{\omega }_{p}^{2}}{\omega (\omega +i{\omega }_{\tau })}$$

(1)

where ${\varepsilon }_{\infty }=9.84$ is the dimensionless high-frequency (infinite-frequency) permittivity, ${\omega }_{p}=1.367\times {10}^{16}\,\mathrm{rad/s}$ is the plasma frequency, and ${\omega }_{\tau }=1.018\times {10}^{14}\,\mathrm{rad/s}$ is the collision frequency.
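As a quick numerical check of Eq. (1), the sketch below evaluates the Drude permittivity of silver at a given free-space wavelength using the three constants quoted above. The function name `eps_silver` and the sample wavelength of 1.5 µm are illustrative choices, not taken from the paper.

```python
import math

C = 299_792_458.0      # speed of light in vacuum, m/s
EPS_INF = 9.84         # high-frequency permittivity (value from the text)
OMEGA_P = 1.367e16     # plasma frequency, rad/s (value from the text)
OMEGA_TAU = 1.018e14   # collision frequency, rad/s (value from the text)


def eps_silver(wavelength_um: float) -> complex:
    """Drude permittivity of silver, Eq. (1), at a free-space wavelength in µm."""
    omega = 2.0 * math.pi * C / (wavelength_um * 1e-6)  # angular frequency, rad/s
    return EPS_INF - OMEGA_P**2 / (omega * (omega + 1j * OMEGA_TAU))


# Illustrative wavelength inside the near-infrared range discussed in the paper.
eps = eps_silver(1.5)
print(f"eps_Ag(1.5 um) = {eps.real:.2f} + {eps.imag:.2f}j")
```

With this sign convention the real part is strongly negative in the near infrared, which is the metallic behaviour required for plasmon excitation, and the imaginary part is positive (absorption).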

As for the background material, SiO₂ is used, and its refractive index is determined using the following Sellmeier equation, as described in reference⁴¹:

$${n}_{Si}^{2}=1+\frac{0.6961663{\lambda }^{2}}{{\lambda }^{2}-{(0.0684043)}^{2}}+\frac{0.4079426{\lambda }^{2}}{{\lambda }^{2}-{(0.1162414)}^{2}}+\frac{0.897479{\lambda }^{2}}{{\lambda }^{2}-{(9.896161)}^{2}}$$

(2)

In this context, the refractive index of silica is expressed as ${n}_{Si}$, and the operating wavelength is expressed as λ in µm. The refractive index of air is assumed to be 1.
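The Sellmeier relation of Eq. (2) can be evaluated directly. The sketch below uses the coefficients quoted in the equation and assumes the standard Sellmeier form, in which the three resonance terms are added to 1; the function name `n_silica` is hypothetical.

```python
import math

# (B_i, C_i) pairs from Eq. (2); the C_i are resonance wavelengths in µm.
SELLMEIER_TERMS = [
    (0.6961663, 0.0684043),
    (0.4079426, 0.1162414),
    (0.897479, 9.896161),
]


def n_silica(wavelength_um: float) -> float:
    """Refractive index of fused silica from the three-term Sellmeier equation."""
    lam2 = wavelength_um ** 2
    n_squared = 1.0 + sum(b * lam2 / (lam2 - c ** 2) for b, c in SELLMEIER_TERMS)
    return math.sqrt(n_squared)


# Print the index at a few wavelengths inside the sensor's operating range.
for lam in (1.44, 1.55, 1.92):
    print(f"n_silica({lam} um) = {n_silica(lam):.4f}")
```

At 1.55 µm this gives an index close to 1.444, the familiar value for fused silica, which is a convenient sanity check on the coefficients.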

The dielectric constant of TiO₂ is expressed by the provided equation²²:

$${n}_{{TiO}_{2}}^{2}=5.913+\frac{2.441\times {10}^{7}}{({\lambda }^{2}-0.803\times {10}^{7})}$$

(3)

The excitation of surface plasmons is measured by evaluating the loss of the optical fiber. The confinement loss, quantified in decibels per centimeter (dB/cm), correlates directly with the imaginary component of the effective refractive index and is expressed mathematically by the following Eq. (4) ²²:

$${\alpha }_{loss} \left[dB/cm\right]=8.686\times \frac{2\pi }{\lambda }Im({n}_{eff})\times {10}^{4}$$

(4)

where ${k}_{0}=\frac{2\pi }{\lambda }$ is the free-space wavenumber, λ is the operating wavelength in µm, and $Im({n}_{eff})$ is the imaginary part of the effective refractive index.
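Equation (4) is a one-line conversion from the imaginary part of the effective index to a loss in dB/cm. A minimal sketch follows; the function name and the sample value Im(n_eff) = 1e-4 are illustrative assumptions, not simulation results from the paper.

```python
import math


def confinement_loss_db_per_cm(wavelength_um: float, im_neff: float) -> float:
    """Confinement loss of Eq. (4); the wavelength must be given in µm."""
    k0 = 2.0 * math.pi / wavelength_um  # free-space wavenumber, rad/µm
    # 8.686 converts nepers to dB; 1e4 converts per-µm to per-cm.
    return 8.686 * k0 * im_neff * 1e4


# Hypothetical imaginary index, chosen only to show the order of magnitude.
print(confinement_loss_db_per_cm(1.92, 1.0e-4))
```

The factor 10⁴ only makes sense when λ is expressed in µm, so any caller working in nm or m needs to convert first.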

Wavelength sensitivity ($Ws$) and resolution ($R$) can be defined using the equations given in references^5,22, as shown in Eqs. (5) and (6):

$$Ws [nm/RIU]=\frac{\Delta {\lambda }_{peak}}{\Delta {n}_{a}}$$

(5)

where $\Delta {\lambda }_{peak}$ is the shift in the wavelength of the loss resonance peak and $\Delta {n}_{a}$ is the change in the refractive index of the analyte.

$$R \left[RIU\right]={\Delta n}_{a}\times \frac{{\Delta \lambda }_{min}}{\Delta {\lambda }_{peak}}=\frac{{\Delta \lambda }_{min}}{Ws}$$

(6)

In Eq. (6), $\Delta {\lambda }_{min}$ corresponds to the minimum wavelength resolution, and $\Delta {\lambda }_{peak}$ denotes the shift of the resonance peak in the wavelength domain.

The amplitude sensitivity (As) of the proposed sensor is calculated using the following formula²², which is defined in Eq. (7):

$$As \left[{RIU}^{-1}\right]=-\frac{1}{\alpha (\lambda ,{n}_{a})}\frac{\partial \alpha (\lambda ,{n}_{a})}{\partial {n}_{a}}$$

(7)

Here, $\alpha (\lambda ,{n}_{a})$ represents the loss for the given analyte with refractive index ${n}_{a}$, $\partial \alpha \left(\lambda ,{n}_{a}\right)$ is the difference between two loss spectra, and $\partial {n}_{a}$ is the change in analyte refractive index.
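In practice Eq. (7) is evaluated as a finite difference between two loss spectra sampled on the same wavelength grid. The sketch below shows that computation; the function name and the tiny two-point spectra are hypothetical placeholders, not data from the paper.

```python
def amplitude_sensitivity(alpha_ref, alpha_shifted, delta_na):
    """Finite-difference form of Eq. (7), evaluated wavelength by wavelength.

    alpha_ref     -- loss spectrum at analyte index n_a
    alpha_shifted -- loss spectrum at analyte index n_a + delta_na
    delta_na      -- refractive index step between the two spectra
    """
    if len(alpha_ref) != len(alpha_shifted):
        raise ValueError("both spectra must share the same wavelength grid")
    return [
        -(a1 - a0) / (a0 * delta_na)
        for a0, a1 in zip(alpha_ref, alpha_shifted)
    ]


# Placeholder spectra (dB/cm) at two wavelengths, for illustration only.
print(amplitude_sensitivity([2.0, 4.0], [2.2, 3.6], 0.01))
```

The sign convention means that the sensitivity is negative where the loss rises with n_a and positive where it falls, so the extreme value is usually reported by magnitude.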

Results and discussion

Dispersion and mode field distribution

Figure 2a to c visually illustrate the distribution of the mode field, providing an intuitive assessment of the coupling intensity. The color bar reflects the normalized mode field intensity distribution, ranging from 0 (indicating a weaker field) to 1 (indicating a stronger field), with the color spectrum shifting from blue to red to represent this intensity variation. Figure 2d shows the dispersion curves for the core mode and the surface plasmon polariton (SPP) mode, assuming a refractive index of 1.34. The wavelength is plotted on the x-axis and the real part of the effective refractive index, which reflects the light dispersion capabilities of the sensor, is plotted on the right y-axis. The left y-axis shows the attenuation constant per centimeter, which mirrors the pattern of the imaginary part of the effective refractive index. This measure doesn't affect the assessment of wavelength sensitivity (Ws), but effectively characterizes the light absorption or loss capabilities of the sensor.

The optimized structural parameters for the configuration are as follows: d₁ = 1.80 µm, d₂ = 1.00 µm, d₃ = 1.65 µm, and pitch Λ = 3.30 µm. In addition, the silver and TiO₂ layers have thicknesses of 65 nm and 10 nm, respectively. The aperture of the microchannel is 1.75 µm.

In this scenario, the enhanced evanescent field in the y-polarized transverse electric (TE) mode, TE^y, is proposed to result from the excitation of a larger fraction of free electrons at the surface compared to the TE^x mode. The optimal power transfer becomes apparent when the phase matching condition is satisfied, facilitating the transition from the core-guided fundamental mode to the plasmonic mode. As a result, a distinct peak appears at the interface.

Taguchi approach and ANOVA analysis

The Taguchi method is a robust optimization technique that has gained widespread recognition in various fields for its ability to systematically optimize multiple parameters and their respective levels while minimizing the need for extensive experimentation^28,30,42,43. When applied to the task of optimizing the structural parameters of the PCF sensor, which include the diameters of the air holes (d₁, d₂, d₃), the pitch (Λ), and the thickness of the silver (t_Ag), the Taguchi method provides an efficient approach. In Table 1, we present the optimization parameters along with their associated levels. Using the Taguchi method, our goal is to identify the ideal combination of parameter settings that will increase the performance and accuracy of the PCF sensor. This methodical approach not only saves time and resources, but also fine-tunes these critical structural parameters to achieve superior results.

Table 1 Optimization parameters and their levels.

To systematically investigate the effects of these parameters, we used a Taguchi L₈(2⁵) orthogonal array, as shown in Table 2. In this design, L₈ signifies eight experimental runs, while 2⁵ indicates five factors, each with two levels. Factors are the variables or parameters that can affect the outcome of a process or product, and levels are the settings or values that each factor can take. The choice of factors and their levels is crucial for conducting efficient experiments while capturing the effects of interest^{28,30,42,43}. With the L₈(2⁵) orthogonal array, the effects of the five factors can be explored in eight runs instead of the 32 required by a full factorial design. In the context of Taguchi optimization, the signal-to-noise (S/N) ratio serves as the key metric. It is used to quantify the influence of different parameter combinations on the effectiveness of the sensor, specifically in terms of confinement loss (α_loss). Higher confinement loss indicates stronger coupling between the core mode and the surface plasmon polariton (SPP) mode, and vice versa. Our primary goal is to maximize the S/N ratio, which represents an optimal balance between desired performance (signal) and unwanted variation (noise). The S/N ratios were determined using the larger-is-better criterion of Eq. (8)³⁰:

$$\mathrm{Larger \, is \, better}:{(S/N)}_{i}=-10{{\text{log}}}_{10}\left(\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{{Y}_{i}^{2}}\right)\right)$$

(8)

where n is the number of simulation tests performed and ${Y}_{i}$ is the measured response (confinement loss) for the ith simulation. Table 2 shows the numerical results for the PCF-SPR sensor's confinement loss peak ${\alpha }_{loss}^{peak}$, wavelength peak ${\lambda }_{peak}$, and the corresponding signal-to-noise (S/N) ratios obtained by the L₈ experimental layout.
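The larger-is-better criterion of Eq. (8) is straightforward to compute. The sketch below implements it for a list of responses; the function name is hypothetical. When each run has a single simulated response, as is usual for deterministic simulations, the expression reduces to 20·log₁₀(Y).

```python
import math


def sn_larger_is_better(responses):
    """Signal-to-noise ratio of Eq. (8) for one run, in dB."""
    n = len(responses)
    mean_inverse_square = sum(1.0 / (y * y) for y in responses) / n
    return -10.0 * math.log10(mean_inverse_square)


# Single-response case, using the optimum loss quoted in the text (dB/cm).
print(round(sn_larger_is_better([31.536]), 3))
```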

Table 2 The Taguchi L₈(2⁵) orthogonal table.

Figure 3 shows the loss curves corresponding to all the experimental tests described in Table 2.

To assess the impact of each key parameter, the mean signal-to-noise (S/N) response is calculated for each level. This is done by summing the S/N results associated with that level in the orthogonal table and dividing the sum by the number of tests performed at that level. The significance of each factor is then given by the difference between the maximum and minimum mean S/N ratios across its two levels, referred to as the delta, as shown in Table 3. A larger delta indicates a greater effect of that control factor. Examination of the response data presented in Table 3 indicates that the air hole diameter d₃ has the most significant influence.
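The level-averaging procedure described above can be sketched as follows. Two things in this example are assumptions rather than data from the paper: the design uses the first five columns of the standard L₈(2⁷) array (the actual column assignment of Table 2 is not reproduced here), and the responses are stand-ins generated from the coded regression of Eq. (9) instead of the FEM values. The printed ranking therefore reflects only these stand-in responses.

```python
import math

FACTORS = ["d1", "d2", "d3", "pitch", "t_Ag"]

# First five columns of the standard L8(2^7) array (assumed assignment).
L8 = [
    (1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2),
    (1, 2, 2, 1, 1),
    (1, 2, 2, 2, 2),
    (2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1),
    (2, 2, 1, 1, 2),
    (2, 2, 1, 2, 1),
]

# Coded coefficients of Eq. (9), used only to generate stand-in responses.
INTERCEPT = 14.35
COEFFS = (2.06, -3.73, 6.93, -2.88, -3.81)


def stand_in_loss(levels):
    """Stand-in confinement loss (dB/cm) for one run; level 1 -> -1, level 2 -> +1."""
    coded = [-1 if level == 1 else 1 for level in levels]
    return INTERCEPT + sum(c * x for c, x in zip(COEFFS, coded))


def sn_ratio(y):
    """Larger-is-better S/N of Eq. (8) with a single response per run."""
    return 20.0 * math.log10(y)


def response_table(design, sn_values):
    """Mean S/N per level and the delta (max minus min) for every factor."""
    table = {}
    for col, name in enumerate(FACTORS):
        means = []
        for level in (1, 2):
            vals = [sn for run, sn in zip(design, sn_values) if run[col] == level]
            means.append(sum(vals) / len(vals))
        table[name] = {
            "level1": means[0],
            "level2": means[1],
            "delta": max(means) - min(means),
        }
    return table


sn_values = [sn_ratio(stand_in_loss(run)) for run in L8]
table = response_table(L8, sn_values)
ranking = sorted(FACTORS, key=lambda f: table[f]["delta"], reverse=True)
for name in ranking:
    print(name, {k: round(v, 2) for k, v in table[name].items()})
```

Because every level of every factor appears in exactly four of the eight runs, each level mean averages over the same mix of the other factors, which is what lets eight runs separate five main effects.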

Table 3 Response Table for Signal to Noise Ratios.

In Fig. 4, the plot of the signal-to-noise (S/N) ratio versus each key parameter shows that the maximum confinement loss occurs when d₁ is at level 2, d₂ is at level 1, d₃ is at level 2, Λ is at level 1, and t_Ag is at level 1. Notably, the optimal combination obtained (d₁ = 1.9 µm, d₂ = 0.8 µm, d₃ = 1.75 µm, Λ = 3.2 µm, and t_Ag = 55 nm) was not among the runs of the L₈ orthogonal array. Simulating the confinement loss with these optimal parameters yields a value of 31.536 dB/cm, exceeding the values obtained in all other tests performed.

The analysis of variance (ANOVA), applied to the L₈ Taguchi design and presented in Table 4, is used to determine the percentage contribution of each parameter to the increase in confinement loss. DF is the number of degrees of freedom associated with each factor, Seq-SS is the sequential sum of squares, and Adj-MS is the adjusted mean square (the adjusted sum of squares divided by the degrees of freedom). Table 4 and Fig. 5 together show that the largest contributions are associated with the parameter d₃, which accounts for 53%, and t_Ag, which contributes 16%. Conversely, parameters Λ and d₁ show minimal contributions of 9% and 5%, respectively, to α_loss. The factor d₃ also appears to have a statistically significant impact on the variability of the data, as indicated by its low p-value (the usual significance threshold is 0.05). The F-values and percentage contributions show the relative importance of each parameter.

Table 4 Results of the ANOVA on the sensor confinement loss.

Multiple linear regression model

Multiple linear regression (MLR) analysis is a statistical modeling method used to examine the relationship between a dependent variable (in this case, the response) and two or more independent variables (designated as inputs). Using the data extracted from the Taguchi table, we can create an MLR model to examine how the five control factors (d₁, d₂, d₃, $\Lambda$, t_Ag) relate to the confinement loss ${\alpha }_{Loss}$. The regression analysis is performed using Matlab software. The explicit equation for this model is given below:

$${\alpha }_{Loss}\left(dB/cm\right)=14.35+2.06\times {d}_{1}-3.73\times {d}_{2}+6.93\times {d}_{3}-2.88\times \Lambda -3.81\times {t}_{Ag}$$

(9)

In the above MLR equation, the key parameters are coded (−1 for the low level and +1 for the high level).

Based on the predicted results of the MLR model presented in Table 5, the high R-squared (R²) value of approximately 97.69% indicates that the model captures most of the variance in the dependent variable. The adjusted R-squared (${R}_{Adj}^{2}$) takes into account the number of predictors in the model and penalizes the R-squared for including irrelevant predictors. With an ${R}_{Adj}^{2}$ value of 91.93%, approximately 91.93% of the variance is explained once the number of predictors is taken into account. This provides a more conservative estimate of the explanatory power of the model, especially in cases with multiple predictors such as ours. The predictive ability of the model for new data points is assessed by the predicted R-squared (${R}_{pred}^{2}$). Its value of 63.11% indicates that the model can predict approximately 63.11% of the variance in new observations.
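The coded coefficients of Eq. (9) also allow a rough cross-check of the ANOVA contributions quoted for Table 4. In a balanced two-level design with ±1 coding, each factor's sum of squares is proportional to the square of its coefficient, and the model as a whole explains a fraction R² of the total. The sketch below applies that relation, assuming Eq. (9) was fitted to the same eight orthogonal runs; it is a consistency check, not the authors' ANOVA procedure.

```python
# Coded coefficients of Eq. (9) and the reported R-squared.
COEFFS = {"d1": 2.06, "d2": -3.73, "d3": 6.93, "pitch": -2.88, "t_Ag": -3.81}
R_SQUARED = 0.9769


def percent_contributions(coeffs, r_squared):
    """Share of the total sum of squares per factor, in percent.

    Valid for an orthogonal two-level design with -1/+1 coding, where the
    factor sum of squares equals (number of runs) * coefficient**2.
    """
    total = sum(b * b for b in coeffs.values())
    return {name: 100.0 * r_squared * b * b / total for name, b in coeffs.items()}


contributions = percent_contributions(COEFFS, R_SQUARED)
for name, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.1f} %")
```

Rounded to whole percentages, the values for d₃, t_Ag, Λ, and d₁ come out at 53, 16, 9, and 5, matching the contributions reported in the ANOVA discussion.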

Table 5 Comparison of simulated and predicted MLR values.

Under optimal conditions, the MLR model predicts a confinement loss (${\widehat{\alpha }}_{Loss}$) of 33.761 dB/cm. Performing the FEM simulation with these optimized settings yields an observed value (${\alpha }_{Loss}$) of 31.536 dB/cm, corresponding to a relative error of $\left|\frac{{\alpha }_{Loss}-{\widehat{\alpha }}_{Loss}}{{\alpha }_{Loss}}\right|\times 100\approx 7\%$. This level of error is considered acceptable in engineering.
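The prediction and the relative error can be reproduced from Eq. (9) with the factors in coded units. In the sketch below, the coded optimum takes d₁ and d₃ at the high level and d₂, Λ, and t_Ag at the low level; placing t_Ag at the low level is an assumption, chosen because it is the setting that reproduces the quoted prediction. With the rounded coefficients of Eq. (9) the result is 33.76 dB/cm rather than 33.761 dB/cm.

```python
INTERCEPT = 14.35
COEFFS = {"d1": 2.06, "d2": -3.73, "d3": 6.93, "pitch": -2.88, "t_Ag": -3.81}

# Coded optimum: +1 is the high level, -1 the low level (t_Ag level assumed).
OPTIMUM_CODED = {"d1": 1, "d2": -1, "d3": 1, "pitch": -1, "t_Ag": -1}


def predict_loss(coded_levels):
    """Confinement loss (dB/cm) predicted by the coded MLR model of Eq. (9)."""
    return INTERCEPT + sum(COEFFS[name] * coded_levels[name] for name in COEFFS)


predicted = predict_loss(OPTIMUM_CODED)
simulated = 31.536  # FEM value quoted in the text, dB/cm
relative_error = abs(simulated - predicted) / simulated * 100.0

print(f"predicted = {predicted:.2f} dB/cm, relative error = {relative_error:.2f} %")
```

Because the model is linear in coded units, its maximum over the design space always sits at a corner, with each factor pushed to the level that matches the sign of its coefficient.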

Confinement loss with changing analyte RI (n_a)

The loss peak approach is widely accepted for evaluating the efficiency of an SPR sensor. Increased losses contribute to an expanded evanescent field within the cladding of the photonic crystal fiber (PCF), thereby increasing sensitivity. The proposed sensor is capable of detecting even subtle variations in the refractive index (RI) of the analyte. This is particularly evident when the effective RI (n_eff) of the fundamental mode is significantly affected by the analyte RI (n_a), as shown in the confinement loss spectra in Fig. 6. In this scenario, a noticeable shift in the resonance wavelength accompanies a change in analyte RI from 1.29 to 1.36. As the analyte RI increases, the confinement loss also increases and the resonance peak shifts to longer wavelengths. This phenomenon is due to the fact that variations in RI induce changes in both the propagation constant and the kinetic binding energy⁴⁴. The smallest loss peak, 3.7654 dB/cm, occurs at 1.44 μm for n_a = 1.29, while the largest loss peak, 31.536 dB/cm, occurs at 1.92 μm for n_a = 1.36.

Wavelength sensitivity

The wavelength interrogation method is used to determine the wavelength sensitivity (Ws), which is defined by Eq. (5). In our proposed SPR sensor, we observed Δλ_peak values of 40, 60, 40, 80, 60, 100, and 100 nm as n_a varied from 1.29 to 1.30, 1.30 to 1.31, 1.31 to 1.32, 1.32 to 1.33, 1.33 to 1.34, 1.34 to 1.35, and 1.35 to 1.36, respectively. The corresponding Ws values are 4000, 6000, 4000, 8000, 6000, 10,000, and 10,000 nm/RIU. Consequently, the wavelength sensitivity reaches a peak value of approximately 10,000 nm/RIU within the analyte RI range of 1.34 to 1.36.
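The sensitivities above follow directly from Eq. (5) with a refractive index step of 0.01, and Eq. (6) then gives the resolution. The sketch below reproduces the arithmetic; the minimum wavelength resolution of 0.1 nm is an assumed instrument value, not a figure stated in this section.

```python
PEAK_SHIFTS_NM = [40, 60, 40, 80, 60, 100, 100]  # from the text, n_a = 1.29 ... 1.36
DELTA_NA = 0.01                                   # refractive index step between spectra
DELTA_LAMBDA_MIN_NM = 0.1                         # assumed spectrometer resolution


def wavelength_sensitivity(shift_nm, delta_na):
    """Eq. (5): resonance shift per unit change in analyte index, nm/RIU."""
    return shift_nm / delta_na


def resolution(delta_lambda_min_nm, ws):
    """Eq. (6): smallest detectable index change, RIU."""
    return delta_lambda_min_nm / ws


ws_values = [wavelength_sensitivity(s, DELTA_NA) for s in PEAK_SHIFTS_NM]
best_resolution = resolution(DELTA_LAMBDA_MIN_NM, max(ws_values))

print([round(w) for w in ws_values])
print(best_resolution)
```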

Amplitude sensitivity

Unlike wavelength sensitivity, amplitude sensitivity provides a simple and inexpensive method of measuring sensitivity at a specific wavelength. The amplitude sensitivity observed by varying the sample refractive index (RI) from 1.29 to 1.36 is shown in Fig. 7. As shown in the figure, the amplitude sensitivity shows an increase as the sample RI increases from 1.29 to 1.33. The peak shifts to a higher wavelength, indicating an enhanced interaction between the evanescent field and the surface plasmon polariton (SPP) mode. Consequently, the amplitude sensitivity reaches a maximum value of approximately 235.882 RIU⁻¹ at n_a = 1.33 and an operating wavelength of 1.72 μm.

Machine learning models

Machine learning models were used to optimize and predict the confinement loss of the PCF sensor using simulation data. The input factors were air hole diameters (d₁, d₂, d₃), pitch (Λ), and silver thickness (t_Ag). Their effects on the efficiency of PCF sensors were evaluated by a full experimental design (2⁵) with 32 samples, since machine learning models require a large dataset. The dataset and model architecture were used to evaluate the effectiveness of two different Artificial Neural Network (ANN) optimization techniques, namely Multi-Layer Perceptron (MLP) and Particle Swarm Optimization (PSO).

Statistical error analysis

The following coefficients⁴⁵ were calculated to monitor the performance of the models used in this analysis, MLP-ANN and PSO-ANN:

$$VAF=\left[1-\frac{var\left(y-\widehat{y}\right)}{var\left({y}_{i}\right)}\right]\times 100$$

(10)

$$RMSE=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({y}_{i}-\widehat{{y}_{i}}\right)}^{2}}$$

(11)

$$MAPE=\frac{1}{N}\sum_{i=1}^{N}\left|\frac{{y}_{i}-\widehat{{y}_{i}}}{{y}_{i}}\right|\times 100$$

(12)

$${R}^{2}=1-\frac{\sum_{i=1}^{N}{\left({y}_{i}-\widehat{{y}_{i}}\right)}^{2}}{\sum_{i=1}^{N}{\left(\overline{y }-{y}_{i}\right)}^{2}}$$

(13)

$${R}_{Adj}^{2}=1-\left(\left(1-{R}^{2}\right)\frac{N-1}{N-k-1}\right)$$

(14)

Here, N is the number of samples, VAF is the variance accounted for, RMSE is the root mean square error, MAPE is the mean absolute percentage error, ${R}^{2}$ is the coefficient of determination, and ${R}_{Adj}^{2}$ is the adjusted ${R}^{2}$. In addition, ${y}_{i}$ is the actual value, $\widehat{{y}_{i}}$ is the predicted value, $\overline{y}$ is the average value of y, and k is the number of features (input variables).
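Equations (10) to (14) translate almost line for line into code. The sketch below uses only the Python standard library; the function names and the four-point example are illustrative, and the variance in VAF is taken as the population variance, which is an assumption since the text does not specify the normalization (the ratio is the same either way).

```python
import math
from statistics import mean, pvariance


def vaf(y, y_hat):
    """Variance accounted for, Eq. (10), in percent."""
    residuals = [a - b for a, b in zip(y, y_hat)]
    return (1.0 - pvariance(residuals) / pvariance(y)) * 100.0


def rmse(y, y_hat):
    """Root mean square error, Eq. (11)."""
    return math.sqrt(mean((a - b) ** 2 for a, b in zip(y, y_hat)))


def mape(y, y_hat):
    """Mean absolute percentage error, Eq. (12), in percent."""
    return mean(abs((a - b) / a) for a, b in zip(y, y_hat)) * 100.0


def r_squared(y, y_hat):
    """Coefficient of determination, Eq. (13)."""
    y_bar = mean(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((y_bar - a) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R-squared, Eq. (14)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Illustrative values only, not data from the paper.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.0, 4.0, 6.0, 10.0]
r2 = r_squared(y_true, y_pred)
print(vaf(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred),
      r2, adjusted_r_squared(r2, len(y_true), 1))
```

MAPE divides by the actual value, so it is undefined if any y_i is zero; that is not an issue for confinement losses, which are strictly positive.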

MLP-ANN optimization

A Multi-Layer Perceptron Artificial Neural Network (MLP-ANN) is a type of artificial neural network that uses the Multi-Layer Perceptron architecture. This architecture is characterized by its composition of multiple connected layers of neurons, which typically include an input layer, one or more hidden layers, and an output layer. ANNs are widely used in the field of machine learning, where they are applied to various tasks such as classification, regression, and pattern recognition. Figure 8 provides a visual representation of the network structure of the MLP used in this study.

The overall network consists of multiple interconnected layers, and learning is accomplished by adjusting weights and biases during the training phase, typically using optimization techniques such as gradient descent. The following equation is a mathematical representation of the feedforward process in a neural network. It calculates the output of a given neuron in the output layer based on the inputs, weights, and biases from the previous layer, incorporating activation functions:

$${Y}_{k}=g\left({\sum }_{j=1}^{q}{k}_{j}^{0}\,f\left(\left({\sum }_{i=1}^{p}{w}_{ij}{x}_{i}\right)+{b}_{j}^{h}\right)+{b}^{0}\right)$$

(15)

In this equation, ${Y}_{k}$ represents the output of neuron k in the output layer, and $g$ and $f$ denote the activation functions of the neurons in the output layer and the hidden layer, respectively (Fig. 8). Here, $p$ is the number of inputs, $q$ is the number of neurons in the hidden layer, ${w}_{ij}$ is the weight of the connection between neuron i in the input layer and neuron j in the hidden layer, and ${k}_{j}^{0}$ is the weight of the connection between neuron j in the hidden layer and the output neuron. ${b}_{j}^{h}$ is the bias term associated with neuron j in the hidden layer, and ${b}^{0}$ is the bias term associated with neuron k in the output layer. Several networks with different numbers of hidden layer neurons were trained and then evaluated. The architecture of the ANN used in this investigation is a feed-forward structure with sigmoid activation functions in the hidden layer and a linear activation function at the output node.
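The forward pass of Eq. (15) for a single output neuron can be sketched in a few lines, with a sigmoid for f and the identity for g as described above. The function and argument names are hypothetical, and the all-zero input weights in the example are placeholders chosen so that the result is easy to verify by hand; they are not trained weights.

```python
import math


def sigmoid(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_hidden, b_hidden, k_out, b_out):
    """One-hidden-layer forward pass, Eq. (15), with a linear output.

    x        -- list of p inputs
    w_hidden -- p-by-q nested list, w_hidden[i][j] links input i to hidden neuron j
    b_hidden -- list of q hidden biases
    k_out    -- list of q hidden-to-output weights
    b_out    -- output bias
    """
    hidden = []
    for j in range(len(b_hidden)):
        z = sum(w_hidden[i][j] * x[i] for i in range(len(x))) + b_hidden[j]
        hidden.append(sigmoid(z))
    return sum(k * h for k, h in zip(k_out, hidden)) + b_out


# 5 inputs and 11 hidden neurons, matching the 5:11:1 structure in the text.
p, q = 5, 11
x = [1.0, -1.0, 1.0, -1.0, -1.0]            # coded inputs (illustrative)
w_hidden = [[0.0] * q for _ in range(p)]    # placeholder weights
y = mlp_forward(x, w_hidden, [0.0] * q, [1.0] * q, 1.0)
print(y)
```

With zero input weights every hidden neuron outputs sigmoid(0) = 0.5, so the example returns 11 × 0.5 + 1 = 6.5 regardless of the inputs.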

Following Bishop's seminal work in 1995⁴⁶, which suggests that more than one hidden layer is often unnecessary, our architectures have only one hidden layer. A back-propagation gradient descent algorithm was used to train the ANN. The dataset was carefully divided into three distinct subsets for the duration of the training phase: a training dataset (70%), a test dataset (15%), and a validation dataset (15%). The number of neurons in the hidden layer was systematically adjusted in the range of 1 to 20 in order to evaluate the performance of the model. The mean squared error between the simulation data and the model output, shown in Fig. 9, was expressed as a function of the number of neurons in the hidden layer. The selection of the most effective network depended on its ability to predict responses with the lowest mean squared error. Consequently, as shown in Fig. 9, our results showed that the optimal network configuration was achieved with a 5:11:1 structure (11 neurons in the hidden layer). This result is consistent with the formula derived from the literature⁴⁷:

$${N}_{neurone}=\frac{{N}_{in}+\sqrt{{N}_{p}}}{L}$$

(16)

In this equation, L is the number of hidden layers (in our case, L = 1), ${N}_{in}$ is the number of inputs (in our case, ${N}_{in}=5$), and ${N}_{p}$ is the number of samples (in our case, ${N}_{p}$=32).
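As a quick arithmetic check of Eq. (16) with the values quoted above, the short sketch below evaluates the formula; the function name `hidden_neurons` is hypothetical, and rounding to the nearest integer is an assumption about how the estimate is turned into a neuron count.

```python
import math


def hidden_neurons(n_in, n_samples, n_layers):
    """Eq. (16): (N_in + sqrt(N_p)) / L."""
    return (n_in + math.sqrt(n_samples)) / n_layers


estimate = hidden_neurons(n_in=5, n_samples=32, n_layers=1)
rounded = round(estimate)  # assumption: round to the nearest integer
print(f"estimate = {estimate:.2f}, rounded = {rounded}")
```

With $N_{in}=5$, $N_{p}=32$ and $L=1$, the estimate is about 10.66, which rounds to the 11 hidden neurons selected from Fig. 9.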

The architecture, parameters and optimization process of the ANN network are shown in Table 6.

Table 6 Architecture and parameters of ANN.

Figure 10 assesses the performance of the MLP-ANN model in predicting PCF sensor efficiency. Panel (a) compares the observed (FEM simulation) data with the values predicted by the MLP-ANN model, showing how closely the predictions track the simulated data. Panel (b) shows the statistical fit of the MLP-ANN model, which indicates its goodness of fit. Panel (c) shows the deviation of the predicted values from the actual values, making the prediction errors visible and supporting further analysis and refinement of the model if necessary.

The achieved correlation coefficient of 0.949, which is close to 1, indicates a strong correlation. Furthermore, the values for RMSE, VAF, MAPE, ${R}^{2}$, and ${R}_{adj}^{2}$ underscore the robust predictive capability of the MLP-ANN model.
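The metrics named above can be computed as in the following sketch. It assumes the commonly used definitions of RMSE, VAF, MAPE, $R^{2}$ and adjusted $R^{2}$, which may differ in detail from the exact expressions used in this work; the function name `regression_metrics` and the sample data are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def regression_metrics(y_true, y_pred, n_inputs):
    """Return common goodness-of-fit metrics for a regression model."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - _mean(y_true)) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {
        "RMSE": math.sqrt(ss_res / n),
        # Variance accounted for, in percent.
        "VAF": (1.0 - _variance(residuals) / _variance(y_true)) * 100.0,
        # Mean absolute percentage error; requires non-zero targets.
        "MAPE": 100.0 / n * sum(abs(r / t) for r, t in zip(residuals, y_true)),
        "R2": r2,
        # Adjusted for the number of model inputs.
        "R2_adj": 1.0 - (1.0 - r2) * (n - 1) / (n - n_inputs - 1),
    }


# Illustrative data only, not simulation results.
example = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.0, 4.0, 6.0, 6.0], n_inputs=1)
```

A perfect prediction gives RMSE = 0, VAF = 100%, MAPE = 0% and $R^{2}=1$, which is the limit the reported values are compared against.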

PSO-ANN optimization

Using the Particle Swarm Optimization (PSO) algorithm in conjunction with a traditional Artificial Neural Network (ANN) is a promising approach⁴⁸. As mentioned earlier, a typical ANN structure has several layers: an input layer, one or more hidden layers, and an output layer. These layers are connected by a network of weighted connections, and the value of each neuron is determined by the weighted sum of its incoming connections. An activation function, typically sigmoidal in nature, is then applied to this value. The weights of the network are traditionally tuned by error backpropagation and gradient descent. Incorporating the PSO algorithm into this framework helps optimize the connection weights within the ANN, with the goal of identifying the weight values that produce the best results. Initially, the PSO algorithm generates a population of particles, each of which is used within the neural network. The fitness of each particle, representing a potential solution set, is evaluated, and pertinent local and global information is retained within each particle. PSO then uses this information to update particle velocities and effectively explore the solution space. The PSO-ANN model configured with a swarm size of 150, a cognitive coefficient C₁ of 1.5, a social coefficient C₂ of 2, and an inertia weight W of 0.9 provides the most accurate prediction results, resulting in an exceptionally high regression coefficient of approximately 0.99 (as shown in Fig. 11). The global parameters of the PSO algorithm for the optimization of the ANN network are shown in Table 7.
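The idea of letting PSO search the network weights can be illustrated with the sketch below. It is a simplified stand-in, not the implementation used in this work: the network has one input and three hidden neurons, the training data are a toy curve, the swarm is smaller than the 150 particles quoted above to keep the run short, and the weight bound of ±5 is an assumption. Only the coefficients W = 0.9, C₁ = 1.5 and C₂ = 2 are taken from the text; all function names are hypothetical.

```python
import math
import random


def predict(weights, x, n_hidden):
    """1-input MLP: sigmoid hidden layer, linear output.
    Layout of weights: [w_1..w_q, b_1..b_q, k_1..k_q, b_out]."""
    q = n_hidden
    out = weights[3 * q]
    for j in range(q):
        z = weights[j] * x + weights[q + j]
        out += weights[2 * q + j] / (1.0 + math.exp(-z))
    return out


def mse(weights, data, n_hidden):
    """Mean squared error of the network on (x, y) pairs."""
    return sum((predict(weights, x, n_hidden) - y) ** 2 for x, y in data) / len(data)


def pso_train(data, n_hidden=3, swarm_size=20, iterations=60,
              w=0.9, c1=1.5, c2=2.0, bound=5.0, seed=0):
    """Search the network weights with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = 3 * n_hidden + 1
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_fit = [mse(p, data, n_hidden) for p in pos]
    g = min(range(swarm_size), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + cognitive (personal best) + social (global best) terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep each weight inside the assumed search bounds.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            fit = mse(pos[i], data, n_hidden)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, history


# Toy training set (y = x^2 on [-1, 1]); not sensor data.
data = [(x / 10.0, (x / 10.0) ** 2) for x in range(-10, 11)]
best_weights, history = pso_train(data)
```

Because the global best is only replaced by a strictly better particle, `history` never increases, which mirrors the decreasing RMSE-versus-iteration behaviour typically reported for this kind of training.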

Table 7 Parameters of PSO algorithm.

The performance metrics, including RMSE, VAF, MAPE, and R² (detailed in Table 8), consistently show that the PSO-ANN model outperforms the MLP-ANN model in terms of prediction accuracy. In particular, the RMSE obtained at the 500th iteration was exceptionally low.

Table 8 Key performance metrics for MLP-ANN and PSO-ANN models.

Comparative study

In this section, we compare the predictions of the two machine learning models for the confinement loss of the PCF-SPR sensor with the results obtained by finite element simulations. Figure 12 and Table 8 provide a visual and numerical representation of this comparative study. Figure 12 shows the discrepancies between the values predicted by the MLP-ANN and PSO-ANN models and the actual confinement loss data. It is evident that the PSO-ANN model provided more reliable predictions than the MLP-ANN model.

Furthermore, when examining the key performance metrics presented in Table 8, it is evident that the PSO-ANN model outperforms the MLP-ANN model. Specifically, metrics such as R² (coefficient of determination), RMSE (root mean square error), VAF (variance accounted for), and MAPE (mean absolute percentage error) all indicate superior performance for the PSO-ANN model. Taken together, these results indicate that the PSO-ANN model excels in predicting the optical properties of the PCF-SPR sensor.

Genetic algorithm optimization

To maximize the sensor confinement loss, the independent parameters [air hole diameters (d₁, d₂, d₃), pitch (Λ) and silver thickness (t_Ag)] were also optimized by a genetic algorithm (GA). The optimized PSO-ANN model was used as the objective function of the GA. The optimization was performed under constraints so that the predicted optimal conditions remain within the experimental range: the ranges adopted in the Taguchi design were used as bounds for the five input variables. The optimization problem to be solved by the GA was constructed as follows:

$$\text{Maximize objective function (optimized PSO-ANN)}\left\{\begin{array}{c}1.7\,\mu m\le {d}_{1}\le 1.9\,\mu m\\ 0.8\,\mu m\le {d}_{2}\le 1.2\,\mu m\\ 1.55\,\mu m\le {d}_{3}\le 1.75\,\mu m\\ 3.2\,\mu m\le \Lambda \le 3.4\,\mu m\\ 55\,nm\le {t}_{Ag}\le 75\,nm\end{array}\right.$$

(17)

The optimization process was continued until very low mean squared error (MSE) and root mean square error (RMSE) values were obtained between the mean and individual fitness values. After mutation, the optimization cycle resumed, and if the target result was not achieved, the whole population was used for the next cycle of breeding, crossover, and mutation. The objective function was written as a MATLAB file using the PSO-ANN model. The GA parameters used for optimization are shown in Table 9.
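The bounded search of Eq. (17) can be sketched with a small real-coded GA. The example below is written in Python rather than MATLAB and is only an illustration: the trained PSO-ANN model is not available here, so `surrogate_objective` is a hypothetical stand-in with its maximum at the centre of the search box, and the population size, mutation rate, tournament size and operators are assumptions rather than the settings of Table 9. Only the bounds are taken from Eq. (17).

```python
import random

# Bounds from Eq. (17): d1, d2, d3 and pitch in µm, silver thickness in nm.
BOUNDS = [(1.7, 1.9), (0.8, 1.2), (1.55, 1.75), (3.2, 3.4), (55.0, 75.0)]


def surrogate_objective(x):
    """Hypothetical stand-in for the trained PSO-ANN model (NOT the paper's
    model): peaks at 0 when every variable sits at the middle of its range."""
    return -sum(((v - (lo + hi) / 2.0) / (hi - lo)) ** 2
                for v, (lo, hi) in zip(x, BOUNDS))


def clamp(x):
    """Project an individual back into the box constraints."""
    return [max(lo, min(hi, v)) for v, (lo, hi) in zip(x, BOUNDS)]


def run_ga(objective, pop_size=30, generations=60, mutation_rate=0.2, seed=1):
    """Maximize `objective` inside BOUNDS with a simple real-coded GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(pop, key=objective)
        history.append(objective(best))
        next_pop = [best[:]]  # elitism: the best individual always survives
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=objective)
            p2 = max(rng.sample(pop, 3), key=objective)
            # Arithmetic crossover.
            a = rng.random()
            child = [a * u + (1.0 - a) * v for u, v in zip(p1, p2)]
            # Gaussian mutation scaled to each variable's range.
            for i, (lo, hi) in enumerate(BOUNDS):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
            next_pop.append(clamp(child))
        pop = next_pop
    best = max(pop, key=objective)
    history.append(objective(best))
    return best, history


best_design, fitness_history = run_ga(surrogate_objective)
```

To reproduce the workflow described here, `surrogate_objective` would be replaced by a call to the trained PSO-ANN predictor; a solver that minimizes rather than maximizes would be given the negated prediction instead.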

Table 9 GA parameters used for optimization.

The evolution of the fitness value as a function of the number of generations is shown in Fig. 13. It is clear from this figure that from the 60th generation onward, the fitness value remains constant with an average value of −32.2692. The optimization performed by the GA resulted in the following conditions: d₁ = 1.9 µm, d₂ = 0.8 µm, d₃ = 1.75 µm, Λ = 3.27 µm and t_Ag = 58.02 nm. Under these optimized conditions, the predicted value of the confinement loss is 32.2692 dB/cm.
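A simple consistency check of the reported optimum against the bounds of Eq. (17) is sketched below. The dictionary keys and the helper `check_against_bounds` are hypothetical names; the numbers are the bounds and optimum quoted in the text.

```python
# Bounds from Eq. (17) and the GA optimum reported above.
BOUNDS = {
    "d1_um": (1.7, 1.9),
    "d2_um": (0.8, 1.2),
    "d3_um": (1.55, 1.75),
    "pitch_um": (3.2, 3.4),
    "t_ag_nm": (55.0, 75.0),
}
REPORTED_OPTIMUM = {
    "d1_um": 1.9,
    "d2_um": 0.8,
    "d3_um": 1.75,
    "pitch_um": 3.27,
    "t_ag_nm": 58.02,
}


def check_against_bounds(optimum, bounds):
    """Flag, per variable, whether the value is feasible and whether it
    sits exactly on one of its bounds."""
    report = {}
    for name, value in optimum.items():
        lo, hi = bounds[name]
        report[name] = {
            "feasible": lo <= value <= hi,
            "on_bound": value in (lo, hi),
        }
    return report


report = check_against_bounds(REPORTED_OPTIMUM, BOUNDS)
on_bound = sorted(name for name, r in report.items() if r["on_bound"])
```

All five values are feasible, and the three air hole diameters lie on a bound of their range (d₁ and d₃ at the upper bound, d₂ at the lower bound), while Λ and t_Ag are interior points.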

Conclusion

In this study, we used the Taguchi optimization approach to efficiently identify optimal structural parameters for a PCF-SPR sensor, including air hole diameters (d₁, d₂, d₃), pitch (Λ), and silver layer thickness (t_Ag). Then, we developed MLP-ANN and PSO-ANN machine learning models to predict the confinement loss based on these parameters. The results showed that the PSO-ANN model outperformed the MLP-ANN model, achieving an impressive R² value of 0.99, indicating exceptional prediction accuracy. Taguchi optimization demonstrated its effectiveness in minimizing the number of trials required for sensor optimization. Finally, a genetic algorithm (GA) was applied to further optimize the sensor conditions with the goal of increasing the confinement loss. Under the optimized parameters (d₁ = 1.9 µm, d₂ = 0.8 µm, d₃ = 1.75 µm, Λ = 3.27 µm, t_Ag = 58.02 nm), the GA approach yielded a maximum confinement loss of 32.2692 dB/cm. These combined results underscore the comprehensive optimization approach using Taguchi optimization, machine learning models, and genetic algorithms for improved performance in PCF-SPR sensor design.

References

1. Zhan, Y. et al. Surface plasmon resonance-based microfiber sensor with enhanced sensitivity by gold nanowires. Opt. Mater. Express 8(12), 3927–3940 (2018).
2. Hasan, M. R. et al. Plasmonic refractive index sensor employing niobium nanofilm on photonic crystal fiber. IEEE Photon. Technol. Lett. 30(4), 315–318 (2017).
3. An, G. et al. Quasi-D-shaped optical fiber plasmonic refractive index sensor. J. Opt. 20(3), 035403 (2018).
4. Lu, J. et al. D-shaped photonic crystal fiber plasmonic refractive index sensor based on gold grating. Appl. Opt. 57(19), 5268–5272 (2018).
5. Rifat, A. A. et al. Highly sensitive D-shaped photonic crystal fiber-based plasmonic biosensor in visible to near-IR. IEEE Sens. J. 17(9), 2776–2783 (2017).
6. An, G. et al. D-shaped photonic crystal fiber refractive index sensor based on surface plasmon resonance. Appl. Opt. 56(24), 6988–6992 (2017).
7. Patnaik, A., Senthilnathan, K. & Jha, R. Graphene-based conducting metal oxide coated D-shaped optical fiber SPR sensor. IEEE Photon. Technol. Lett. 27(23), 2437–2440 (2015).
8. Luan, N. et al. Surface plasmon resonance sensor based on D-shaped microstructured optical fiber with hollow core. Opt. Express 23(7), 8576–8582 (2015).
9. Paul, A. K. et al. Twin core photonic crystal fiber plasmonic refractive index sensor. IEEE Sens. J. 18(14), 5761–5769 (2018).
10. Hasan, M. R. et al. Spiral photonic crystal fiber-based dual-polarized surface plasmon resonance biosensor. IEEE Sens. J. 18(1), 133–140 (2017).
11. Dash, J. N., Das, R. & Jha, R. AZO coated microchannel incorporated PCF-based SPR sensor: A numerical analysis. IEEE Photon. Technol. Lett. 30(11), 1032–1035 (2018).
12. Liu, M. et al. High-sensitivity birefringent and single-layer coating photonic crystal fiber biosensor based on surface plasmon resonance. Appl. Opt. 57(8), 1883–1886 (2018).
13. Wu, J. et al. Ultrahigh sensitivity refractive index sensor of a D-shaped PCF based on surface plasmon resonance. Appl. Opt. 57(15), 4002–4007 (2018).
14. Islam, M. S. et al. Dual-polarized highly sensitive plasmonic sensor in the visible to near-IR spectrum. Opt. Express 26(23), 30347–30361 (2018).
15. Bellunato, T. et al. Refractive index of silica aerogel: Uniformity and dispersion law. Nucl. Instrum. Methods Phys. Res. A 595(1), 183–186 (2008).
16. Wang, F. et al. A highly sensitive SPR sensors based on two parallel PCFs for low refractive index detection. IEEE Photon. J. 10(4), 1–10 (2018).
17. Liu, C. et al. Birefringent PCF-based SPR sensor for a broad range of low refractive index detection. IEEE Photon. Technol. Lett. 30(16), 1471–1474 (2018).
18. Chen, X., Xia, L. & Li, C. Surface plasmon resonance sensor based on a novel D-shaped photonic crystal fiber for low refractive index detection. IEEE Photon. J. 10(1), 1–9 (2018).
19. Haque, E. et al. Surface plasmon resonance sensor based on modified D-shaped photonic crystal fiber for wider range of refractive index detection. IEEE Sens. J. 18(20), 8287–8293 (2018).
20. Liu, C. et al. Mid-infrared surface plasmon resonance sensor based on photonic crystal fibers. Opt. Express 25(13), 14227–14237 (2017).
21. Haque, E. et al. Microchannel-based plasmonic refractive index sensor for low refractive index detection. Appl. Opt. 58(6), 1547–1554 (2019).
22. Haque, E. et al. Highly sensitive dual-core PCF based plasmonic refractive index sensor for low refractive index detection. IEEE Photon. J. 11(5), 1–9 (2019).
23. Yu, C.-P. & Chang, H.-C. Applications of the finite difference mode solution method to photonic crystal structures. Opt. Quant. Electron. 36, 145–163 (2004).
24. Cucinotta, A. et al. Holey fiber analysis through the finite-element method. IEEE Photon. Technol. Lett. 14(11), 1530–1532 (2002).
25. Johnson, S. G. & Joannopoulos, J. D. Block-iterative frequency-domain methods for Maxwell’s equations in a planewave basis. Opt. Express 8(3), 173–190 (2001).
26. Shi, S., Chen, C. & Prather, D. W. Plane-wave expansion method for calculating band structure of photonic crystal slabs with perfectly matched layers. JOSA A 21(9), 1769–1775 (2004).
27. Norton, R. A. & Scheichl, R. Planewave expansion methods for photonic crystal fibres. Appl. Numer. Math. 63, 88–104 (2013).
28. Kaziz, S. et al. Numerical simulation and optimization of AC electrothermal microfluidic biosensor for COVID-19 detection through Taguchi method and artificial network. Eur. Phys. J. Plus 138(1), 96 (2023).
29. Kaziz, S. et al. 3D simulation of microfluidic biosensor for SARS-CoV-2 S protein binding kinetics using new reaction surface design. Eur. Phys. J. Plus 137(2), 241 (2022).
30. Kaziz, S., Jemmali, A. & Echouchene, F. Optimization of annular microfluidic biosensor enhanced by active and passive effects using Taguchi’s method coupled with multi-layer perceptron neural networks (MLP-NN) models. Microfluid. Nanofluid. 27(9), 60 (2023).
31. Borhani, N. et al. Learning to see through multimode fibers. Optica 5(8), 960–966 (2018).
32. Baxter, J. et al. Plasmonic colours predicted by deep learning. Sci. Rep. 9(1), 8074 (2019).
33. Tittl, A. et al. Metasurface-based molecular biosensing aided by artificial intelligence. Angew. Chem. Int. Ed. 58(42), 14810–14822 (2019).
34. Ma, W., Cheng, F. & Liu, Y. Deep-learning-enabled on-demand design of chiral metamaterials. ACS Nano 12(6), 6326–6334 (2018).
35. Musumeci, F. et al. An overview on application of machine learning techniques in optical networks. IEEE Commun. Surv. Tutor. 21(2), 1383–1408 (2018).
36. Kiarashinejad, Y. et al. Deep learning reveals underlying physics of light–matter interactions in nanophotonic devices. Adv. Theor. Simul. 2(9), 1900088 (2019).
37. Kiarashinejad, Y. et al. Knowledge discovery in nanophotonics using geometric deep learning. Adv. Intell. Syst. 2(2), 1900132 (2020).
38. da Silva Ferreira, A., Malheiros-Silveira, G. N. & Hernández-Figueroa, H. E. Computing optical properties of photonic crystals by using multilayer perceptron and extreme learning machine. J. Lightw. Technol. 36(18), 4066–4073 (2018).
39. Asano, T. & Noda, S. Optimization of photonic crystal nanocavities based on deep learning. Opt. Express 26(25), 32704–32717 (2018).
40. Jiao, S. et al. Highly sensitive dual-core photonic crystal fiber based on a surface plasmon resonance sensor with a silver nano-continuous grating. Appl. Opt. 57(28), 8350–8358 (2018).
41. Shuai, B. et al. A multi-core holey fiber based plasmonic sensor with large detection range and high linearity. Opt. Express 20(6), 5974–5986 (2012).
42. Ben Mariem, I. et al. Numerical optimization of microfluidic biosensor detection time for the SARS-CoV-2 using the Taguchi method. Indian J. Phys. 1, 1–8 (2023).
43. Kaziz, S. et al. Taguchi optimization of integrated flow microfluidic biosensor for COVID-19 detection. Eur. Phys. J. Plus 137(11), 1235 (2022).
44. Shafkat, A. Analysis of a gold coated plasmonic sensor based on a duplex core photonic crystal fiber. Sens. Bio-Sens. Res. 28, 100324 (2020).
45. Jemmali, A., Kaziz, S., Echouchene, F. & Gazzah, M. H. Optimization of Lab-on-a CD by experimental design and machine learning models for microfluidic biosensor application. IEEE Sens. J. 1, 99 (2024).
46. Bishop, C. M. Neural Networks for Pattern Recognition (Oxford University Press, 1995).
47. Ke, J. & Liu, X. Empirical analysis of optimal hidden neurons in neural network modeling for stock prediction. in 2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application. (2008).
48. Eberhart, R. & Kennedy, J. A new optimizer using particle swarm theory. in MHS'95 Proceedings of the Sixth International Symposium on Micro Machine and Human Science. (1995).

Author information

Authors and Affiliations

NANOMISENE Laboratory, LR16CRMN01, Centre for Research on Microelectronics and Nanotechnology (CRMN) of Sousse Technopole, Sahloul, B.P.334, 4054, Sousse, Tunisia
Sameh Kaziz
Electronic and Microelectronics Lab, Department of Physics, Faculty of Science of Monastir, University of Monastir, 5019, Monastir, Tunisia
Fraj Echouchene
Quantum and Statistical Physics Laboratory, Faculty of Sciences of Monastir, University of Monastir, Environment Boulevard, 5019, Monastir, Tunisia
Mohamed Hichem Gazzah

Contributions

S.K.: study design, resources, data analysis and writing of the main manuscript. F.E.: data analysis and methodology. M.H.G.: revision of the manuscript and supervision.

Corresponding author

Correspondence to Sameh Kaziz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kaziz, S., Echouchene, F. & Gazzah, M.H. Optimizing PCF-SPR sensor design through Taguchi approach, machine learning, and genetic algorithms. Sci Rep 14, 7837 (2024). https://doi.org/10.1038/s41598-024-55817-9

Received: 19 January 2024
Accepted: 28 February 2024
Published: 03 April 2024
DOI: https://doi.org/10.1038/s41598-024-55817-9
