Introduction

The significance of maintaining compact size has been consistently increasing in the design of contemporary passive high-frequency devices, primarily driven by various application areas such as implantable systems1, energy harvesting2, mobile communications3, the internet of things4, or RFID5. Due to the dependence of the physical dimensions of passive components on the guided wavelength, achieving miniaturization is a complex endeavour. It is typically accomplished through a combination of methods, including transmission line (TL) meandering6, exploiting slow-wave phenomena7 (e.g., compact microstrip resonant cells, CMRCs8), integrating defected grounds9, utilizing metamaterials10, employing substrate-integrated waveguides11, or incorporating auxiliary structures (stubs12, slots13, shorting pins14). While often unavoidable, this approach results in intricate geometries characterized by numerous parameters and the presence of strong cross-coupling effects. Both aspects pose significant design challenges. On the one hand, precise rendition of circuit characteristics demands computationally expensive electromagnetic (EM) simulation, as neither analytical nor circuit-theory-based models provide sufficient accuracy15,16. On the other hand, geometry parameters must be meticulously tuned in a synchronized manner, preferably utilizing rigorous optimization methods17,18.

Optimizing microwave components at the EM level poses challenges primarily due to its computational cost. The associated expenses are considerable, even for local parameter tuning (utilizing methods like gradient-based19 or stencil-based algorithms20), not to mention global search21, statistical analysis (e.g., yield estimation22), design centering23, or multi-criterial design24. Numerous efforts have been made in the literature to streamline EM-driven design procedures25,26,27,28,29,30. Some available methods include cost reduction in sensitivity estimation for gradient-based algorithms (employing adjoint sensitivities31, restricted gradient updates32,33), surrogate-based techniques utilizing both behavioural34 and physics-based metamodels35, and machine learning frameworks36. Surrogate modelling methods applied in these procedures encompass kriging37, radial basis functions38, Gaussian process regression39, support vector machines40, as well as artificial neural networks41,42,43. Physics-based routines often rely on space mapping44 or various response correction methods45,46,47.

In the realm of compact components, parameter tuning is typically geared towards achieving miniaturization, making the reduction of circuit size a primary objective48. Simultaneously, stringent conditions are imposed on electrical performance metrics, such as the required return loss level, port isolation across a specified frequency range, power split ratio, or the phase response49,50. Numerically, this results in highly constrained optimization problems, where the constraints are computationally heavy to assess, relying on EM simulation. Explicit constraint handling is generally challenging, although recent developments have shown promise (e.g.,51). Another approach involves implicit constraint control using penalty functions52, with the main goal (diminishing size) supplemented by penalty components measuring potential constraint violations53. While this enables reformulation of the problem into an unconstrained task, penalty coefficient adjustment is crucial for optimization process performance. Manual setup is non-trivial and usually requires multiple attempts, each followed by a test optimization run. Recently, adaptive approaches have been proposed with penalty terms automatically adjusted using actual constraint levels and/or the algorithm’s convergence status54,55. However, these methods are suitable for local parameter tuning, whereas the geometric complexity of compact circuits necessitates global optimization. The most prevalent solution methods are bio-inspired algorithms56,57,58,59. Unfortunately, their global search capability comes at the cost of poor computational efficiency. In fact, direct EM-driven nature-inspired optimization is impractical if not prohibitive. Mitigation methods primarily rely on surrogate modelling techniques25,26,60,61.
Typical surrogate-assisted frameworks take the form of iterative procedures, where the metamodel serves as a predictor, facilitating the identification of the optimum design, and is progressively refined using accumulated EM simulation data. Infill points are generated using various rules favouring the improvement of the model’s dependability62, allocation of the optimum63, or balanced exploration and exploitation64. Algorithms of this kind are often referred to as machine learning procedures65,66.

The bottleneck in surrogate-assisted methods lies in constructing the metamodel itself, hindered by the nonlinearity of microwave circuit responses and dimensionality issues67. Consequently, many algorithms are only showcased using test cases that involve a limited number of variables and/or restricted parameter ranges68,69. This study introduces a novel surrogate-based methodology for global simulation-driven miniaturization of microwave components. Our approach is a machine learning framework that operates by incorporating response features of the circuit of interest, enabling the creation of a reliable surrogate model with small training datasets. The infill criterion involves the enhancement of the predicted merit function, and the infill points are generated using a particle swarm optimizer (PSO)70. This is facilitated by employing penalty functions for controlling design constraints, which allow us to reformulate the problem into an unconstrained task, aside from the lower/upper parameter bounds defining the search space. Our technique has been showcased using two couplers and juxtaposed against several benchmark methods, including gradient-based optimization, population-based procedures, and a machine learning framework directly processing complete frequency characteristics of the circuit. The results demonstrate consistent and competitive operation with respect to achievable size reduction and low running costs. The average expenses incurred by the search process are merely about 100 EM circuit analyses.

The paper encompasses several original components and technical contributions, including: (i) the creation of an innovative machine learning procedure designed for explicit reduction of microwave device’s size; (ii) the implementation of mechanisms that facilitate the search process, such as conducting optimization using response features and implicit constraint handling; (iii) the demonstration of the framework’s competitive operation in terms of design quality and consistency of produced solutions; (iv) the exhibition of excellent computational efficiency in the proposed procedure. As far as the authors are aware, there have been no comparable algorithms reported in the literature to date.

Simulation-based size reduction. Constraint handling

This part of the work revisits the formulation of EM-based miniaturization for microwave components as a nonlinear constrained minimization problem. We also explore explicit and implicit constraint handling and provide an overview of existing solution methodologies. The global search procedure will be detailed in Sect. 3.

Microwave circuit size reduction as optimization problem

Compact structures have gained significant importance across various application areas, including the IoT, RFID, and wearable/implantable devices. The design of these structures involves selecting appropriate circuit architectures, employing techniques like transmission line meandering6, incorporating metamaterial components10, or leveraging the slow-wave phenomenon7. To achieve the necessary electrical performance while keeping a small size, concurrent adjustment of all circuit dimensions is essential. Traditional parametric studies are no longer sufficient, necessitating rigorous numerical optimization instead.

It should be emphasized that the design of compact microwave components differs somewhat from the development of other types of high-frequency structures, in particular, compact antennas. In antenna design, a possible size reduction is impeded by physical constraints, e.g., the current path length required to ensure sufficient impedance matching at the lower end of the operating bandwidth. In the case of conventional microwave components, the situation is similar. For example, the transmission line lengths are related to the operating frequency and expressed in terms of the guided wavelength. However, using techniques such as those mentioned in the previous paragraph allows one to bypass these physical limitations. At the same time, these techniques result in topologically complex designs described by an increased number of parameters, and in the need to use full-wave EM simulations to ensure accurate circuit evaluation. These factors contribute to making the optimization process (including explicit size reduction) a challenging endeavour.

This section presents the formalization of the miniaturization task as a nonlinear minimization problem. The pertinent notation is detailed in Table 1, and the design task at hand is expressed as follows:

Table 1 Simulation-based microwave size reduction. Notation and terminology.

The feasible space Xf in Eq. (1) is defined as the subset of X such that design constraints are satisfied for all x ∈ Xf.

$${{\bf{x}}^*} = \arg \mathop {\min }\limits_{{\bf{x}} \in {X_f}} A({\bf{x}})$$
(1)

Explicit and implicit treatment of design constraints

Managing design constraints represents a challenging aspect of optimization-driven circuit miniaturization. In addition to geometrical conditions such as lower and upper parameter bounds, constraints are applied to electrical characteristics and necessitate electromagnetic (EM) analysis for evaluation. Explicitly addressing these costly constraints is troublesome, particularly in the context of global optimization. An alternative approach is implicit treatment using a penalty function method52, where the problem (1) is formulated as:

$${{\bf{x}}^*} = \arg \mathop {\min }\limits_{\bf{x}} {U_P}({\bf{x}})$$
(2)

The merit function UP in (2) is defined as

$${U_P}({\bf{x}}) = A({\bf{x}}) + \sum\nolimits_{k = 1}^{{n_g} + {n_h}} {{\beta _k}{c_k}({\bf{x}})}$$
(3)

In this formulation, the main design goal (miniaturization) is accompanied by contributions from penalty functions ck(x), which quantify constraint violations. The weighting factors βk (penalty coefficients) dictate the influence of individual terms. Examples of constraints are outlined in Table 2. Table 3 offers potential definitions for the penalty functions, wherein some cases measure relative constraint violations concerning assumed acceptance thresholds (e.g., − 20 dB).

Table 2 Example constraints in size-reduction of microwave components

Note that the second power, [.]², is used so that UP is a smooth function at the boundary of the feasible space Xf. This alleviates the difficulties pertinent to the exploration of that region, which is otherwise necessary as one or more constraints are normally active at the minimum-size design.

Table 3 Possible formulation of penalty functions for constraints of Table 2.

A practical problem of implicit constraint handling is the selection of the coefficients βk. Their values cannot be too low, as this would impede constraint control. If βk are too high, numerical issues arise due to high nonlinearity of UP in the region adjacent to the boundary of Xf. Finding the optimum levels of βk is non-trivial. The adaptive constraint handling strategies proposed recently54,55 address this issue to a great extent. At the same time, improved explicit constraint handling methods have been proposed51,71. Both types of approaches offer reasonable trade-offs between constraint control and possible size reduction; however, they are only applicable to local optimization.

Global size reduction by response features and parameter space pre-screening

This section offers a comprehensive overview of the proposed global size reduction method. The technique utilizes a surrogate-assisted machine learning approach, integrating an iterative prediction-correction scheme. In this process, the surrogate model guides the search towards the optimal design and is refined iteratively with accumulated electromagnetic (EM) simulation data. The size reduction task of Eq. (1) is reformulated in terms of the characteristic points of the circuit. Subsequently, we revisit the response feature method, present an analytical formulation of the feature-based merit function, discuss the pre-screening step of the optimization process, detail the generation of infill points using nature-inspired algorithms, and provide a summary of the operational flow of the complete framework.

Response feature approach. Size reduction task reformulation

The size reduction procedure proposed in this study relies on efficient surrogate models designed to accelerate the exploration of the parameter space. The primary challenge lies in the high nonlinearity of microwave component characteristics, which makes them difficult to represent accurately using data-driven methods, especially across broad ranges of frequencies and geometry parameters. Figure 1 illustrates exemplary responses of a miniaturized coupler, which is considered as one of the verification circuits in Sect. 4. Furthermore, optimization for any specific objective (e.g., size reduction) concerning a given target operating frequency must be conducted in a global sense, as local tuning is likely to converge to an inferior local optimum due to the inherent nonlinearity.

Fig. 1
figure 1

Scattering parameters of a microstrip coupler at selected random parameter vectors. The characteristics are highly nonlinear and therefore difficult to represent using data-driven surrogates. At the same time, given a target center frequency (here, 1.5 GHz), coupler optimization with local (e.g., gradient-based) routines would fail when initiated from the majority of the shown designs.

To tackle these challenges, we leverage the response feature technology72. The underlying concept involves reformulating the problem using suitably defined characteristic (feature) points extracted from the frequency responses, utilizing EM simulation data73. As observed in existing literature, the dependence between the feature points and geometry variables exhibits less nonlinearity compared to the corresponding relationship for frequency characteristics74,75,76. This regularization of objective functions leads to faster convergence in optimization processes75, enabling global search ability even with formally local algorithms, and reducing the training set size for metamodel rendition76. The feature point selection should align with the designated design objectives, making them problem-dependent73. Specific examples are illustrated in Fig. 2. The characteristic points identified in Fig. 2(a) correspond to the minima of the matching and isolation responses (relevant for managing the operating frequency of the circuit), − 20 dB levels of |S11| and |S41| (pertinent for the circuit’s matching/isolation bandwidth), and the levels of the transmission responses at the center frequency (helpful for controlling the power split ratio).

Fig. 2
figure 2

Response features for a coupler structure: (a) possible characteristic point choices: o – points associated with the minima of |S11| and |S41|, - points corresponding to the power division ratio computed at the frequency f0 being the average of the frequencies of |S11| and |S41| minima, - points corresponding to − 20 dB levels of |S11| and |S41|; (b) relationship between operating conditions (extracted from the response features, here the center frequency f0 and power split ratio KP) and selected design variables of the circuit. The plots are created using a set of randomly-generated designs. Only the points for which the corresponding characteristics allow for extracting the approximated operating parameters, as indicated above, are illustrated. Clear patterns can be observed although the shown designs were not subject to optimization.

The most important advantage of response features is their weakly-nonlinear dependence on the circuit designable parameters72,73,74,75,76. Figure 2(b) demonstrates this using the coupler of Fig. 1, in the form of the dependency between the feature-based-evaluated operating parameters and selected design variables. Representing such dependencies through behavioral models is considerably simpler than behavioral modeling of the frequency characteristics. The size reduction framework proposed in this study leverages the properties of response features illustrated here.

For the purpose of restating the design task using characteristic points, we need appropriate notation, which has been introduced in Fig. 3. Figure 3(a) describes a general notation, whereas Fig. 3(b) elaborates on the specific choice of the feature points for coupling structures. These particular points allow us to account for the −20 dB bandwidth for |S11| and |S41|, as well as the power split ratio KP, which can be calculated as KP = fL.5 − fL.6 (cf. (5)).
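To make the notion of characteristic points more concrete, the following minimal Python sketch extracts three frequency-type features from a single sampled response (e.g., |S11| in dB): the two −20 dB crossings surrounding the response minimum and the frequency of the minimum itself. The function name `extract_band_features`, the linear interpolation between frequency samples, and the convention of returning `None` for a non-extractable response are illustrative assumptions; the extraction routine actually used in this work is not reproduced here.

```python
import numpy as np

def extract_band_features(freq, s_db, level=-20.0):
    """Return (f_lo, f_hi, f_min) for one response in dB, or None if not extractable.

    f_lo, f_hi : frequencies at which the response crosses `level` around its minimum
    f_min      : frequency of the response minimum
    """
    freq = np.asarray(freq, dtype=float)
    s_db = np.asarray(s_db, dtype=float)
    i_min = int(np.argmin(s_db))
    if s_db[i_min] >= level:
        return None  # the response never drops below the acceptance level

    # Expand left and right from the minimum while the response stays below the level.
    lo = i_min
    while lo > 0 and s_db[lo - 1] < level:
        lo -= 1
    hi = i_min
    while hi < len(freq) - 1 and s_db[hi + 1] < level:
        hi += 1
    if lo == 0 or hi == len(freq) - 1:
        return None  # a band edge lies outside the simulated frequency range

    def crossing(i_in, i_out):
        # Linear interpolation between the last in-band and first out-of-band sample.
        t = (level - s_db[i_in]) / (s_db[i_out] - s_db[i_in])
        return freq[i_in] + t * (freq[i_out] - freq[i_in])

    return crossing(lo, lo - 1), crossing(hi, hi + 1), freq[i_min]
```

Returning `None` for distorted responses mirrors the observation that feature points cannot be extracted at every design; such designs have to be handled separately by the calling code.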

Fig. 3
figure 3

Notation pertinent to response features as utilized in this work: (a) general notation, (b) an example selection of features for the microstrip coupler design problem (cf. (5)).

Let, as before, f1 and f2 define the frequency range over which both |S11| and |S41| are to be not higher than −20 dB, and let KP stand for the target power split ratio. These conditions become design constraints (cf. Tables 2 and 3) from the point of view of reducing the circuit size A(x). The size reduction problem, equivalent to (1) but expressed using response features, takes the form of

$${{\bf{x}}^*} = \arg \mathop {\min }\limits_{\bf{x}} {U_F}({\bf{x}},{{\bf{f}}_P}({\bf{x}}))$$
(4)

in which the merit function UF is determined using the feature vector fP(x). For a particular coupler design task as discussed above, it may be defined as

$$\begin{array}{c} {U_F}({\bf{x}},{{\bf{f}}_P}({\bf{x}})) = A({\bf{x}}) + {\beta _1}{\left\| {\left[ \begin{array}{l} \max \{ {f_{f.1}} - {f_1},0\} \\ \max \{ {f_{f.3}} - {f_1},0\} \\ \max \{ {f_2} - {f_{f.2}},0\} \\ \max \{ {f_2} - {f_{f.4}},0\} \end{array} \right]} \right\|^2} + \\ + {\beta _2}{\left[ {{f_{f.5}} - \frac{{{f_1} + {f_2}}}{2}} \right]^2} + {\beta _3}{\left[ {({f_{L.5}} - {f_{L.6}}) - {K_P}} \right]^2} \end{array}$$
(5)

As before, minimization of the circuit footprint area A(x) is the main goal. Then, we have three penalty terms. The first one is to ensure that the bandwidth constraint imposed on |S11| and |S41| is fulfilled.

Note that this term only contributes to UF if there is a bandwidth violation either at f1 or f2. The second term is to enforce that the center frequency of the coupler is in the middle of the prescribed bandwidth, whereas the last term controls the power split constraint. The coefficients βk can be set to relatively high values because the search process will be conducted by means of the surrogate model. In this paper, we set βk = 10⁴ for k = 1, 2, 3.
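The structure of the merit function (5) can be illustrated by a short Python sketch. It is a direct transcription of (5) under a few stated assumptions: the function name `merit_uf` is hypothetical, the features are passed as dictionaries keyed by the feature index used in (5), and frequencies are assumed to be expressed in GHz and levels in dB.

```python
import numpy as np

def merit_uf(area, f_f, f_l, f1, f2, kp, beta=(1e4, 1e4, 1e4)):
    """Feature-based merit function of Eq. (5).

    area   : footprint area A(x)
    f_f    : dict of frequency coordinates of the feature points, keys 1..5
    f_l    : dict of level coordinates of the feature points, keys 5 and 6
    f1, f2 : lower and upper edge of the target bandwidth
    kp     : target power split ratio
    beta   : penalty coefficients (beta_1, beta_2, beta_3)
    """
    # Bandwidth violations at f1 and f2; each entry is zero when satisfied.
    bw = np.array([
        max(f_f[1] - f1, 0.0),
        max(f_f[3] - f1, 0.0),
        max(f2 - f_f[2], 0.0),
        max(f2 - f_f[4], 0.0),
    ])
    center = f_f[5] - 0.5 * (f1 + f2)   # center frequency misalignment
    split = (f_l[5] - f_l[6]) - kp      # power split error
    return (area
            + beta[0] * float(np.dot(bw, bw))
            + beta[1] * center ** 2
            + beta[2] * split ** 2)
```

For a design satisfying all three conditions the function returns the footprint area alone, whereas, for example, a 0.01 GHz bandwidth violation at f1 adds 10⁴ · 0.01² = 1 to the merit value.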

Parameter space pre-screening. Constructing initial surrogate model

The machine-learning-based framework proposed in this paper utilizes a surrogate model to expedite the optimization process. The procedure begins with pre-screening of the search space and construction of the initial metamodel s(0)(x). For efficiency and reliability, s(0)(x) is built using characteristic points (cf. Sect. 3.1). It represents the components of the feature vector, i.e.,

$${{\bf{s}}^{(0)}}({\bf{x}}) = {\left[ {{{\left[ {s_{f.1}^{(0)}({\bf{x}})\;...\;s_{f.K}^{(0)}({\bf{x}})} \right]}^T}\;{{\left[ {s_{L.1}^{(0)}({\bf{x}})\;...\;s_{L.K}^{(0)}({\bf{x}})} \right]}^T}} \right]^T}$$
(6)

The modelling method employed here is kriging interpolation37, yet, this choice is not critical. The model is identified based on the dataset {xB(j), fP(xB(j))}, j = 1, …, Ninit, with the feature vectors fP(xB(j)) determined using EM simulation results at xB(j).

The parameter Ninit is not determined beforehand. The samples are generated sequentially until the metamodel’s accuracy achieves the prescribed level. Here, the metric of choice is the relative root-mean square (RMS) error77. The target predictive power is Emax. The samples xB(j) are generated to satisfy the following conditions:

  • The circuit size A(xB(j)) ≤ Amax (a user-defined parameter);

  • The circuit size A(xB(j)) ≥ Amin (a user-defined parameter);

  • The feature points are extractable at xB(j).

The first two conditions are optional and introduced to avoid EM analysis for designs at which the circuit size is clearly too large or too small; the necessary limits Amin and Amax can be deduced from prior experiments with the circuit. As for the last condition, at certain designs the extraction of the response features may not be possible, e.g., due to a severe distortion of the circuit outputs, in which case the design is not included in the training dataset.

The pseudocode of the pre-screening and training data generation procedures is shown in Fig. 4. Owing to a relatively simple relation between the characteristic points and the circuit’s design variables, it is possible to construct usable models using small numbers of samples Ninit, typically fewer than a hundred. However, as the circuit outputs at some of the random designs are degenerated, the actual number of trial points is considerably larger than Ninit, usually by a factor between 1.5 and 3.
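The sequential sampling described above can be sketched in Python as follows. This is a simplified illustration rather than the procedure of Fig. 4: the callbacks `simulate_features` (EM analysis followed by feature extraction, returning `None` when the features are not extractable), `area`, `build_model`, and `model_error` (an estimator of the relative RMS error) are hypothetical placeholders to be supplied by the user, and uniform random sampling within the parameter bounds is an assumption.

```python
import numpy as np

def prescreen(simulate_features, area, lb, ub, build_model, model_error,
              e_max=0.10, a_min=None, a_max=None, max_draws=10000, seed=0):
    """Generate training data sequentially until the surrogate error is <= e_max.

    Returns the accepted designs X, their feature vectors F, the last surrogate,
    and the number of EM analyses performed (accepted and rejected designs).
    """
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    X, F, model, n_em = [], [], None, 0
    for _ in range(max_draws):
        x = lb + rng.random(lb.size) * (ub - lb)   # random design within the bounds
        a = area(x)
        # Optional size conditions: reject before spending an EM analysis.
        if a_max is not None and a > a_max:
            continue
        if a_min is not None and a < a_min:
            continue
        n_em += 1
        f = simulate_features(x)
        if f is None:
            continue                               # features not extractable
        X.append(x)
        F.append(np.asarray(f, float))
        model = build_model(np.array(X), np.array(F))
        if model_error(model, np.array(X), np.array(F)) <= e_max:
            break                                  # target predictive power reached
    return np.array(X), np.array(F), model, n_em
```

In this sketch the surrogate is rebuilt after every accepted sample for simplicity; rebuilding it only every few samples would be an equally valid choice when model identification is not negligible in cost.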

Fig. 4
figure 4

Parameter space pre-screening and training data generation for initial surrogate model construction.

Infill point generation

The core stage of the size reduction process is launched upon constructing the initial surrogate s(0), as described in Sect. 3.2. It iteratively produces infill points x(i+1), i = 0, 1, 2, …, and the updated surrogate models s(j), j = 1, 2, … .

The role of the surrogates is to provide predictions about the minimum-size design, which are obtained by optimizing the current metamodel in a global sense, as

$${{\bf{x}}^{(i + 1)}} = \arg \mathop {\min }\limits_{{\bf{x}} \in X} {U_F}({\bf{x}},{{\bf{s}}^{(i)}}({\bf{x}}))$$
(7)

The problem (7) is equivalent to (4), with the merit function UF defined as in (5), except that the vector fP(x) at x is predicted using the surrogate s(i). Further, (7) is solved using particle swarm optimization (PSO)78. At this juncture, the specific choice of the global optimization routine is of secondary significance, as the surrogate model is cheap to evaluate. PSO (or any other nature-inspired algorithm) can therefore be configured with a substantial computational budget. Furthermore, the optimization problem is formulated in terms of response features, rendering it numerically more manageable than the conventional formulation (refer to Sect. 2).
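For illustration, a basic global-best PSO that could be used to solve a box-constrained problem of the form (7) on a cheap surrogate is sketched below in Python. It is a textbook variant, not the implementation employed in this work; the function name `pso_minimize` and the control parameter values (inertia weight 0.7, acceleration coefficients 1.5, swarm size 30, 200 iterations) are assumptions chosen for the example.

```python
import numpy as np

def pso_minimize(fun, lb, ub, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fun(x) over the box [lb, ub] with a global-best particle swarm."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    x = lb + rng.random((n_particles, lb.size)) * (ub - lb)   # initial positions
    v = np.zeros_like(x)                                      # initial velocities
    p_best = x.copy()                                         # personal best positions
    p_val = np.array([fun(xi) for xi in x])                   # personal best values
    g = p_best[np.argmin(p_val)].copy()                       # global best position
    g_val = float(p_val.min())
    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lb, ub)                            # enforce parameter bounds
        val = np.array([fun(xi) for xi in x])
        better = val < p_val
        p_best[better] = x[better]
        p_val[better] = val[better]
        if p_val.min() < g_val:
            g_val = float(p_val.min())
            g = p_best[np.argmin(p_val)].copy()
    return g, g_val
```

Because each call to `fun` is a surrogate evaluation rather than an EM analysis, the swarm size and the number of iterations can be increased freely without a noticeable impact on the overall cost.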

From a machine learning standpoint, the procedure outlined in Eq. (7) aligns with the infill criterion being grounded on the predicted improvement in the objective function79. The rationale for selecting this criterion is twofold: (i) due to the feature-based formulation, the initial surrogate model typically exhibits a sufficiently high quality, and (ii) the promising subset of the search space has already been identified through the pre-screening process detailed in Sect. 3.2. The surrogate model undergoes refinement after each iteration.

In particular, the infill point x(i+1) generated by solving (7) is added, along with the corresponding response feature vector fP(x(i+1)) obtained through EM simulation, to the training dataset, which becomes {xB(k), fP(xB(k))}, k = 1, …, Ninit + i + 1, where xB(Ninit+j) = x(j) for j = 1, …, i + 1. The surrogate model is rebuilt using this set and employed as a predictor in the next iteration.

The following termination conditions are utilized (as logical alternatives): (i) convergence in argument, i.e., ||x(i+1) − x(i)|| < ε, (ii) no objective function improvement over the last Nno_improve iterations. The default control parameter values are ε = 10⁻³ and Nno_improve = 10.
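The iterative refinement and the two termination conditions can be summarized by the following Python sketch. All callbacks are hypothetical placeholders: `simulate_features(x)` stands for EM analysis followed by feature extraction, `build_model(X, F)` returns a callable surrogate, `merit(x, f)` evaluates the feature-based merit function, and `optimize(fun)` returns a global minimizer of `fun` over the search space. Treating a design with non-extractable features as an iteration without improvement is an assumption of this sketch.

```python
import numpy as np

def infill_loop(X, F, simulate_features, build_model, merit, optimize,
                eps=1e-3, n_no_improve=10, max_iter=100):
    """Surrogate-assisted search with predicted-improvement infill points."""
    X, F = [np.asarray(x, float) for x in X], [np.asarray(f, float) for f in F]
    model = build_model(np.array(X), np.array(F))
    x_prev, best_x, best_u, stall, n_em = None, None, np.inf, 0, 0
    for _ in range(max_iter):
        # Infill point: global minimizer of the merit function predicted by the surrogate.
        x_new = np.asarray(optimize(lambda x: merit(x, model(x))), float)
        f_new = simulate_features(x_new)          # one EM analysis per iteration
        n_em += 1
        improved = False
        if f_new is not None:
            X.append(x_new)
            F.append(np.asarray(f_new, float))
            model = build_model(np.array(X), np.array(F))   # rebuild the surrogate
            u = merit(x_new, f_new)
            if u < best_u:
                best_x, best_u, improved = x_new, u, True
        stall = 0 if improved else stall + 1
        # (i) convergence in argument, (ii) no improvement over n_no_improve iterations
        if x_prev is not None and np.linalg.norm(x_new - x_prev) < eps:
            break
        if stall >= n_no_improve:
            break
        x_prev = x_new
    return best_x, best_u, n_em
```

The returned count `n_em` covers only the infill stage; the cost of building the initial training set has to be added to obtain the total number of EM analyses.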

Optimization algorithm

This section provides an overview of the procedural workflow for the proposed method of global feature-based size reduction of microwave components. We commence by delving into the control parameters, which are consolidated in Table 4. It is crucial to underscore that there are only three parameters, with two being associated with the termination condition. These parameters primarily govern the requisite resolution of the search process. For typical microwave components featuring geometry parameters expressed in millimeters, a resolution of 0.001 mm is more than sufficient. The last parameter determines the necessary predictive power of the initial surrogate. A 10% relative error (default value) is a mild condition; however, given that the model is rendered using characteristic points of the system outputs, this condition can typically be achieved with less than a hundred samples (also due to the pre-screening procedure), resulting in computational efficiency for the entire framework.

Table 4 Control parameters of the presented algorithm for globalized size reduction procedure.

Figure 5 shows the pseudocode of the complete algorithm. For auxiliary elucidation, Fig. 6 provides its flow diagram. Steps 2 and 3 constitute the initial stage of the process, which includes parameter space pre-screening and rendition of the initial metamodel. Both were elaborated on in Sect. 3.2. The core part of the size reduction procedure consists of Steps 5 through 8. Therein, a series of optimum design approximations is generated using the assumed infill criterion (predicted objective function improvement), and the surrogate model is re-built based on the EM analysis data acquired thus far. The process is continued until the termination condition has been fulfilled.

Fig. 5
figure 5

Operating flow of the suggested global feature-based size reduction algorithm.

Fig. 6
figure 6

Flow diagram of the proposed algorithm for global size reduction of microwave passive components.

At this point, a few comments should be made concerning the implementation and the complexity of the proposed procedure. The underlying programming environment is Matlab. As indicated earlier, there are several algorithmic components that include randomized pre-screening, definition and extraction of the feature points, surrogate model construction and its (global) optimization, as well as interfacing EM simulation software. Most of these components are straightforward to implement (pre-screening, surrogate model rendition, model optimization). In particular, in this work, kriging interpolation is employed as an underlying modelling approach. It is widely available through third-party toolboxes82,83,84 (here, we use the Matlab toolbox SUMO82). Surrogate model optimization is carried out using PSO, which is one of the most popular bio-inspired techniques. This algorithm is easy to implement, and although a plethora of implementations are available, our own Matlab implementation is employed here. There are two algorithmic components which are less generic: (i) the interface between the programming language (here, Matlab) and EM simulation software (here, CST Microwave Studio), which is implemented using Visual Basic to allow batch simulations and automated data acquisition from the EM solver; (ii) a procedure for extracting response features. The last component is the only one which is problem-dependent, and generally changes for a different arrangement of the response features.

However, the feature selection and extraction only vary between different types of microwave circuits (e.g., couplers, filters, power dividers). Consequently, a library of extraction procedures may be implemented and employed as necessary because the number of different types of circuits is rather limited. Apart from this aspect, the remaining part of the procedure is generic and does not require re-implementation for different design scenarios.

Results and benchmarking

In this part of the paper, we examine the characteristics of the size reduction framework outlined in Sect. 3. The algorithm is deployed to obtain minimum-size designs for two microstrip couplers. Additionally, it is juxtaposed against several state-of-the-art methods, encompassing nature-inspired optimization, local (gradient-based) tuning, and a machine learning procedure directly handling the frequency characteristics of the circuits under consideration. The aim of the numerical experiments is to evaluate the operation of both the proposed method and selected state-of-the-art techniques in terms of achievable miniaturization rate, precision in controlling design constraints, and computational efficiency.

Test cases

Our numerical experiments are conducted using a compact rat-race coupler80 (Circuit I), and a compact branch-line coupler with CMRCs81 (Circuit II). The circuit geometries as well as the essential parameters are shown in Figs. 7 and 8, respectively. The EM models are simulated using CST Microwave Studio. The primary design goal is minimization of the footprint area A(x). Furthermore, we impose the following constraints: (i) h1(x) = |S31(x,f0) – S21(x,f0)|, and (ii) g1(x) = max{f ∈ F : max{|S11(x,f)|, |S41(x,f)|}} + 20 dB; therein, f0 stands for the center frequency, whereas F is the target bandwidth. The first constraint is introduced to maintain equal power division, whereas the second is to secure the assumed −20 dB bandwidth for |S11| and |S41|. Both constraints are controlled using penalty functions, cf. (2), (3).

Fig. 7
figure 7

Circuit I80: (a) geometry, (b) essential parameters.

Fig. 8
figure 8

Circuit II81: (a) geometry, (b) essential parameters.

The penalty coefficients are set to 100 and 1,000 for the equality and inequality constraint, respectively. These values provide a good trade-off between the miniaturization rate and the quality of constraint handling. In particular, given that the typical circuit size is a few hundred mm², the contribution of the penalty term (cf. (3)) is 2.5 for an inequality constraint violation of 1 dB (corresponding to a relative violation of 0.05), and quickly goes up if the violation increases. This setup is generally sufficient to ensure that the constraint violation at the optimized design does not exceed about 1 dB, which is practically acceptable. Similarly, the penalty coefficient of 100 for the equality constraint brings a penalty of one for an absolute power split ratio violation of 0.1 dB, which is again sufficient for practical purposes. It should also be noted that the parameter spaces are large for both circuits: the average upper-to-lower variable bound ratio is as high as about thirteen for Circuit I, and almost seven for Circuit II.
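The penalty contributions quoted above can be verified with a few lines of Python. The sketch assumes the squared-violation form of the penalty terms in (3), with the inequality violation taken relative to the 20 dB acceptance threshold and the equality violation taken in absolute terms; the function names are hypothetical.

```python
def penalty_inequality(violation_db, threshold_db=20.0, beta=1000.0):
    """Penalty for the bandwidth constraint: beta * (relative violation)^2."""
    rel = max(violation_db, 0.0) / threshold_db   # e.g., 1 dB / 20 dB = 0.05
    return beta * rel ** 2

def penalty_equality(violation_db, beta=100.0):
    """Penalty for the power split constraint: beta * (absolute violation)^2."""
    return beta * violation_db ** 2
```

With these definitions, a 1 dB violation of the matching/isolation level contributes 1,000 · 0.05² = 2.5, a 2 dB violation already contributes 10, and a 0.1 dB power split error contributes 100 · 0.1² = 1, all of which are small but non-negligible relative to a footprint of a few hundred mm².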

Results

The presented methodology has been employed to optimize Circuits I and II using the setup of Table 4, i.e., Emax = 10%, ε = 10⁻², and Nno_improve = 10. For the sake of comparison, four benchmark techniques have been employed, as specified in Table 5. These include:

Table 5 Benchmark algorithms.
  • A gradient-based optimizer, utilized to demonstrate multimodality of the considered design task, the latter justifying the need for global search. The algorithm is initiated from random starting points.

  • A bio-inspired algorithm, utilized to demonstrate the challenges of nature-inspired size reduction of microwave components. Particle swarm optimizer (PSO) is chosen as a representative procedure. The computational budget is kept at 1,000 objective function evaluations. This number is low for this class of algorithms, yet, it is high from the perspective of EM-driven design (the typical algorithm run would take two to three days to complete).

  • A population-based search procedure, differential evolution (DE), selected as one of the most successful methods for continuous optimization. Here, the computational budget is also kept at 1,000 objective function evaluations for the same reasons as in the case of PSO.

  • A machine learning procedure that employs the same infill criterion and surrogate model (kriging) as the algorithm of Sect. 3. However, it works at the level of complete circuit characteristics. This algorithm is incorporated into the benchmark set to illustrate the benefits of integrating the response feature technology into the optimization framework proposed in this work.

The values of the control parameters of the PSO and DE algorithms are standard, as found in the literature. The assumption is that a typical user does not tune the algorithm for the particular task at hand but employs the most conventional parameter values. Regarding Algorithm IV, the acceptance threshold of 10% is chosen because this level of relative RMS error normally corresponds to reasonably good visual alignment between surrogate-predicted and EM-simulated circuit responses, and can therefore be treated as a sufficient starting point for the subsequent ML-based search process. The computational budget of 400 samples for initial surrogate model construction is selected for purely practical reasons (to avoid excessive costs of model rendition given the relatively long EM simulation time).
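The acceptance test applied to the surrogate of Algorithm IV can be illustrated as follows. This is a minimal sketch that assumes the relative RMS error is the norm of the difference between the surrogate-predicted and EM-simulated responses divided by the norm of the EM-simulated response, averaged over a set of test designs; the function names and this particular averaging are assumptions made for illustration.

```python
import math

E_MAX = 0.10  # acceptance threshold quoted in the text (10%)


def relative_rms_error(predicted, simulated):
    """Return ||predicted - simulated|| / ||simulated|| for two equal-length responses."""
    if len(predicted) != len(simulated):
        raise ValueError("responses must be sampled on the same frequency grid")
    diff_sq = sum((p - s) ** 2 for p, s in zip(predicted, simulated))
    ref_sq = sum(s ** 2 for s in simulated)
    return math.sqrt(diff_sq / ref_sq)


def surrogate_accepted(test_pairs, threshold=E_MAX):
    """Accept the surrogate when the mean relative RMS error does not exceed the threshold."""
    errors = [relative_rms_error(p, s) for p, s in test_pairs]
    return sum(errors) / len(errors) <= threshold


# Placeholder responses in dB (not taken from the paper), one test design only
pairs = [([-11.0, -19.0, -10.0], [-10.0, -20.0, -10.0])]
print(surrogate_accepted(pairs))
```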

The investigation into the repeatability of solutions involved conducting ten independent runs of each algorithm. Tables 6 and 7 present numerical results, offering average values and standard deviations of the circuit size, design constraint violations, and the average computational cost of the optimization process. Figures 9 and 10 visually depict the circuit responses for the optimized designs obtained in selected runs of the proposed procedure, highlighting the evolution of the circuit footprint area and the values of design constraints.
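The repeatability statistics reported in Tables 6 and 7 can be reproduced from per-run data along the following lines. The sketch assumes the sample standard deviation (n − 1 in the denominator); the source does not state which estimator was used, and the function name and the input values are placeholders, not results from the paper.

```python
import statistics


def summarize_runs(values):
    """Return the mean, the sample standard deviation, and the latter as a percentage of the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumed estimator)
    return mean, std, 100.0 * std / mean


# Placeholder footprint areas in mm^2 for three hypothetical runs
mean_area, std_area, std_percent = summarize_runs([90.0, 100.0, 110.0])
print(f"mean = {mean_area:.1f} mm^2, std = {std_area:.1f} mm^2 ({std_percent:.1f}% of the mean)")
```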

Table 6 Circuit I: optimization results.
Table 7 Circuit II: optimization results.

Discussion

The results gathered in Sect. 4.2 provide an extensive performance assessment of the explicit circuit miniaturization framework and of how it compares to the four benchmark algorithms outlined earlier.

To start, it is worth examining the outcomes obtained using the gradient search procedure (Algorithm I). This method demonstrates poor repeatability of solutions; for instance, the standard deviation of the footprint area exceeds 15% and 10% for Circuit I and II, respectively. This observation indicates the presence of multiple local optima for the considered test problems. Given that the results are heavily dependent on the initial design, the use of global optimizers becomes essential. Moreover, while the average circuit size produced by gradient search is small, the constraint control is subpar, especially for the inequality condition, with an average violation of about 4 dB.

Figure 9. S-parameters of Circuit I at the optimum designs obtained using our size reduction framework (top), circuit size versus iteration index (middle), and design constraint violations versus iteration index (bottom), shown for the chosen algorithm executions: (a) run 1, (b) run 2. The iteration counter starts after constructing the initial surrogate model. Vertical and horizontal lines indicate the intended operating frequency range (here, from 0.95 GHz to 1.05 GHz), and the acceptance level for |S11| and |S41| (here, − 20 dB).

Algorithms II (particle swarm optimizer) and III (differential evolution) perform even worse. On the one hand, the average circuit sizes are considerably larger than for the remaining methods, as are their standard deviations. On the other hand, the constraint control is mediocre, although not as bad as for Algorithm I. This level of performance is, in large part, a result of the assigned computational budget (only 1,000 objective function calls). Arguably, increasing the budget to five or ten thousand objective function evaluations would improve the results of Algorithms II and III significantly. Unfortunately, this is impractical or even prohibitive, as a typical algorithm run would then take several weeks to conclude.

Figure 10. S-parameters of Circuit II at the optimum designs obtained using our size reduction framework (top), circuit size versus iteration index (middle), and design constraint violations versus iteration index (bottom), shown for the chosen algorithm executions: (a) run 1, (b) run 2. The iteration counter starts after constructing the initial surrogate model. Vertical and horizontal lines indicate the intended operating frequency range (here, from 1.45 GHz to 1.55 GHz), and the acceptance level for |S11| and |S41| (here, − 20 dB).

The performance of Algorithm IV (ML working with the complete circuit responses) is similarly poor. Although the surrogate model accuracy threshold Emax of 10% seems reasonable, the actual predictions (especially concerning design constraints) are impeded by the high nonlinearity of the circuit frequency characteristics. Consequently, the resulting constraint violations are the highest among the considered algorithms, while the obtained circuit sizes are the smallest. Meanwhile, the average number of data points needed to secure the error level of Emax is over 350, and it is the major contribution to the overall CPU expenses of the search process. It is evident that improving the design quality would require setting a considerably lower accuracy threshold (e.g., 2% or 3%), which, however, would also lead to significantly higher CPU expenses.

The algorithm proposed in this work performs better than the benchmark methods. Its most important property is that it delivers truly minimum-size designs with excellent constraint control. The average violation of the inequality constraint is a fraction of a decibel, whereas the violation of the equality constraint is essentially zero. This places the designs produced by the algorithm on the feasible region boundary. At the same time, the achieved footprint areas are competitive (it should be stressed that the smaller sizes found by Algorithms I and IV are strongly infeasible), whereas repeatability of solutions is good: the footprint area standard deviation amounts to only 2% and 5% on average for Circuit I and II, respectively.

It should be recalled that design feasibility refers to a situation where all design constraints are satisfied. The inequality constraints are satisfied if the reflection and port isolation (|S11| and |S41|, respectively) do not exceed − 20 dB within the frequency range of interest. The equality constraints are satisfied if the power split ratio is precisely equal to the target value (here, 0 dB, i.e., equal power division). Rendering the exact required value of the power split is impossible, so no design can be formally feasible. However, power split errors at the level of 0.1 dB (or about 1% in relative terms, i.e., in terms of relative deviation from 50/50 power division) are considered feasible for practical purposes. For the inequality constraint, when the target level is set at − 20 dB, violations of up to 1 dB are also acceptable. Keeping this in mind, “strong infeasibility” is understood as excessive violation of constraints, which means more than 1–2 dB in the case of the inequality constraint, and more than 0.2–0.3 dB in the case of the equality constraint. The tighter tolerance for the latter reflects the fact that the power split ratio should be more precise than the impedance matching and port isolation levels.
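The practical feasibility rules described above can be expressed as a short classification routine. The sketch below is illustrative only: it assumes that the power split ratio is the difference |S21| − |S31| in dB evaluated at the center frequency, it takes the upper ends of the quoted ranges (2 dB and 0.3 dB) as the "strongly infeasible" limits, and it introduces a hypothetical "marginal" label for designs that fall between the two categories. All function names are assumptions.

```python
def inequality_violation_db(s11_db, s41_db, target_db=-20.0):
    """Worst-case excess of |S11| and |S41| (dB) above the target over the band; zero if satisfied."""
    worst = max(max(s11_db), max(s41_db))
    return max(worst - target_db, 0.0)


def equality_violation_db(s21_db_f0, s31_db_f0, target_split_db=0.0):
    """Absolute deviation (dB) of the assumed power split ratio |S21| - |S31| from its target."""
    return abs((s21_db_f0 - s31_db_f0) - target_split_db)


def classify(ineq_db, eq_db):
    """Label a design using the tolerances quoted in the text (limits of 2 dB and 0.3 dB assumed)."""
    if ineq_db <= 1.0 and eq_db <= 0.1:
        return "practically feasible"
    if ineq_db > 2.0 or eq_db > 0.3:
        return "strongly infeasible"
    return "marginal"  # hypothetical intermediate label, not used in the paper


# Placeholder in-band samples in dB (not taken from the paper)
ineq = inequality_violation_db([-25.0, -19.0], [-30.0, -22.0])
eq = equality_violation_db(-3.0, -3.0)
print(ineq, eq, classify(ineq, eq))
```

With these rules, a design exhibiting a violation of about 4 dB, such as the average reported for the gradient-based search, is labeled strongly infeasible regardless of its footprint area.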

A separate note should be made concerning the standard deviations of both the achieved circuit size and the constraint violations for the proposed and the benchmark algorithms. The high standard deviations for Algorithms I, II, and III are related to two factors. For Algorithm I, the primary reason is that this method is a local one; consequently, it identifies the local optimum that is typically nearest to the starting point (the latter being randomly selected). Most of these local optima are different from each other and many are of inferior quality. For Algorithms II and III, the principal issue is the relatively limited computational budget, meaning that the algorithms were unable to properly converge. At the same time, this budget (1,000 EM simulations) is significant from a practical perspective. Thus, the results indicate that direct population-based size reduction is not an attractive option. Algorithm IV performs better in this respect; however, its standard deviations are generally higher than for the proposed algorithm. Here, the underlying reason is the limited accuracy of the surrogate model, which leads to limited-accuracy predictions of the circuit characteristics during the machine learning search process. In contrast, the proposed approach constructs the model in a pre-screened parameter space region and leverages the properties of the response features, both leading to considerably better reliability. The latter translates into improved consistency of the results and lower standard deviation values.

With regard to computational efficiency, our methodology is evidently more time-consuming than local search; however, the differences are not substantial (seven hours versus three hours for Circuit I, and seven hours versus five hours for Circuit II). It is important to note that our technique provides global search capability. In comparison to global design procedures, our algorithm is, on average, over six and about nine times faster than PSO/DE for Circuit I and II, respectively, and approximately three times faster than the machine learning procedure for both circuits. Therefore, the computational advantages are undeniable.
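The speedup factors over PSO/DE follow directly from the simulation counts. The sketch below uses the budget of 1,000 objective function evaluations assigned to PSO and DE together with the approximate costs of the proposed algorithm quoted in the conclusion (150 and 110 EM simulations); it assumes that the running time is proportional to the number of EM simulations, and the variable names are illustrative.

```python
PSO_DE_BUDGET = 1000  # objective function evaluations assigned to PSO and DE

# Approximate cost of the proposed algorithm in EM simulations, as quoted in the conclusion
PROPOSED_COST = {"Circuit I": 150, "Circuit II": 110}

# Ratio of simulation counts, used here as a proxy for the ratio of running times
speedup = {name: PSO_DE_BUDGET / cost for name, cost in PROPOSED_COST.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f} times fewer EM simulations than PSO/DE")
```

The resulting ratios, roughly 6.7 and 9.1, agree with the "over six" and "about nine" factors stated above.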

Given the comments formulated above, one can conclude that the suggested procedure exhibits properties that are attractive from the perspective of practical global optimization of microwave passives. It not only offers quasi-global search capability but, as a result of incorporating the response features and the pre-screening stage, also achieves better computational efficiency than representative nature-inspired and machine learning methods.

It should also be emphasized that it is not possible to compare the circuit size before and after optimization. This is because the optimization process is global and there is no initial design that the optimized design might be compared to. In particular, the search process starts from randomized pre-screening used to acquire data for initial surrogate model construction. This data contains a mixture of feasible and infeasible designs (mostly infeasible), and it is not possible to find any reasonable point of reference. Design feasibility and size reduction are improved only gradually, in the course of the optimization process. Consequently, we may only compare the circuit sizes obtained by different algorithms (the proposed one and the benchmark methods).

Conclusion

This paper proposed an innovative technique for explicit miniaturization of microwave passive components. Our method involves the response feature approach to perform design space pre-screening and to facilitate the rendition of a dependable surrogate model. The latter is employed to identify the position of the optimum design. The search process is embedded in a machine learning framework that uses kriging interpolation as a surrogate modelling technique, a particle swarm optimizer to generate the sequence of designs approximating the minimum-size design, and predicted merit function improvement as the infill criterion. The design constraints are controlled implicitly by means of penalty functions. Numerical experiments conducted using two compact microstrip couplers demonstrate competitive performance of our procedure with respect to design quality, repeatability of solutions, and computational efficiency. In particular, our algorithm enables generation of small-size designs with improved control of constraint violations and good consistency: the standard deviation of the circuit footprint area is lower than 5% of its mean value. The CPU cost associated with circuit optimization is slightly above 100 EM simulations (150 and 110 for the first and the second verification circuit), which is significantly less than for the benchmark methods.

A comparison with a machine-learning framework that processes complete circuit characteristics corroborates the benefits of incorporating response features, with respect to both the improved surrogate model accuracy and the computational efficiency of the search process. It should be mentioned that the employment of this technology is also a source of potential limitations of the method. On the one hand, the very definition and extraction of characteristic points is problem dependent. On the other hand, for larger parameter spaces, the likelihood of generating random observables with extractable feature points is diminished, which would increase the cost of the size reduction process, a large part of which is contributed by the pre-screening step. Notwithstanding, this issue would not be pronounced if the parameter space is reasonably defined, e.g., using the designer’s insight established at the stage of developing the circuit geometry. This means, among other things, that the parameter bounds are set up to focus the optimization process on designs that are likely to be of decent quality, and to eliminate parameter combinations that evidently (i.e., according to the designer’s experience) have no chance of fulfilling the assumed specifications imposed on electrical characteristics (e.g., designs that are physically too small). For the specific test cases considered in this work, the design spaces were quite extensive with respect to the parameter ranges, yet the algorithm managed to identify the minimum-size designs in a consistent manner.

In conclusion, the size-reduction framework introduced in this study emerges as a possible alternative to state-of-the-art methods, particularly when the CPU budget assigned to the optimization process is a key consideration. Aside from the necessity to individually define feature points in a problem-dependent manner, the algorithm is undemanding to implement and configure, requiring no tailoring of control parameters.

One of the objectives of future work is to investigate the properties and performance of the proposed algorithm for high-dimensional parameter spaces (n > 10). Another topic of interest will be extending the range of applicability of the presented framework to other types of microwave components (e.g., filters, power dividers, multi-band structures). As the algorithm itself is generic, the main challenge will be the appropriate definition and extraction of the response features, which must be carried out individually for each type of circuit.