Over the last few years, significant research efforts have been made on photonic topological insulator (PTI)1,2,3,4, especially on PTI lasers. While the efforts have been concentrated on the spatial stability of the topologically protected edge modes, namely on the existence of such topological edge modes in non-Hermitian PTIs5,6,7,8,9,10,11,12,13, the temporal stability has not been the focus of interest so far14,15,16,17,18,19. Due to the non-linear nature of PTI lasers, the temporal stability is an important characteristic to take into account for experimental demonstrations and real-life applications. Although the spatial stability of the topological modes, i.e., its robustness, may be guaranteed in active non-Hermitian PTIs, unstable behaviour may be present in the time domain. In this regard, the temporal dynamics of the topologically protected modes have been studied14,16,17, mainly using linear stability analysis. It is, however, a challenging task to apply the same approach to more complex structures because of the high-dimensional phase space and parameter space as well as the lack of analytical solutions15.

Machine learning (ML) can be advantageous for the theoretical study of the stability of PTIs, which requires repetitive numerical simulations for several varying parameters. ML is a data-based method that can be implemented with different strategies, and the most appropriate ML strategy depends on the dataset under study. For instance, a supervised learning strategy relies on labelled data, a dataset of input-output pairs. This has been utilised in topological photonics20 to draw topological phase diagrams21, calculate topological invariants9, or explore topological band structures22. On the other hand, an unsupervised learning strategy consists of extracting information from the dataset for which we do not have labels. This is used for dimensional reductions by keeping only the main features of the high-dimensional structure of the dataset or for clustering problems from which the data is divided into different types23. For instance, this has been successful in obtaining the phase transition in the Ising model24, and clustering Hamiltonians that belong to the same symmetry classes25.

In the unsupervised learning strategy, modal decomposition is a common and successful method, which reduces the analysis of very high-dimensional data to a set of relatively few modes. Among the modal decomposition methods, principal component analysis (PCA) is a method, which derives the eigenmodes or the main features based on their variance in the data26. These eigenmodes can then be utilised as a basis to represent the dataset27. This reduced-order model method has been extended to identify distinct non-linear regimes28,29,30,31,32,33,34 by constructing a library composed of representatives of these regimes: this is known as representation classification. Nevertheless, the preliminary identification of the regimes composing the library and the construction of the library is a manual process and requires expert knowledge of the complex system.

In this paper, we propose a representation classification method to study the spatio-temporal dynamics of non-linear topological systems. The results will be based on the phase diagram of the Su-Schrieffer-Heeger (SSH) lattice35 with a domain wall and with saturable gain16,17. To remove the necessity of the required expertise on the complex system, we present an algorithm, which constructs an appropriate library of the different phases automatically. For this goal, we propose two approaches: a top-down approach in which the library has numerous phases that are merged into the equivalent phases, and a bottom-up approach in which the library is completed on the fly to get the most accurate classification. Via reverse engineering of the derived phase diagram, our proposed method can be used as a tool to find novel topological lasing modes in more complicated settings. For given rate equations of a lasing system, one would only need to integrate the differential equations in the desired parameter space region and then apply the adaptive representation classification to obtain the phase diagram.


Phase diagram of the SSH Model

As a toy model, we will consider the domain-wall-type SSH lattice with saturable gain [Fig. 1a]. The system has a domain wall, at the A site n = 0, which separates two SSH lattices, namely lattices composed of two sites per unit cell, A and B, and characterised by intra- and inter-unit cell couplings, tintra and tinter, respectively. tintra = t1 and tinter = t2 (tintra = t2 and tinter = t1) for the lattice on the left (right) side of the domain wall, i.e., the sites with n < 0 (n ≥ 0). The dynamics of the system \(\psi (t)=\left({\psi }_{-N}(t),\ldots ,{\psi }_{-1}(t),{\psi }_{0}(t),{\psi }_{1}(t),\ldots ,{\psi }_{N}(t)\right)\equiv x(t)\), with Ns = 2N + 1 sites, reads, for n = − N, …, N:

$$i\frac{d{\psi }_{n}}{dt}= \, i\left(\frac{{g}_{n}}{1+| {\psi }_{n}{| }^{2}}-{\gamma }_{n}\right){\psi }_{n}\\ +{t}_{{{{{{{\textrm{intra},\textit{n}+1}}}}}}}{\psi }_{n+1}+{t}_{{{{{{{\textrm{inter},\textit{n}-1}}}}}}}{\psi }_{n-1}$$

where ψn is the amplitude of the n-th site. gn and γn are the linear gain and linear loss at the n-th site, respectively. Using explicitly the amplitudes ap and bp of the A and B sites on the p-th unit cell, respectively, in \(\psi (t)=(\ldots ,{a}_{p}(t),{b}_{p}(t),\ldots )\), Eq. (1) can be re-written as:

$$i\frac{d{a}_{p}}{dt}=i\left(\frac{{g}_{A}}{1+| {a}_{p}{| }^{2}}-{\gamma }_{A}\right){a}_{p}+{t}_{{{{{{{{\rm{intra}}}}}}}}}{b}_{p}+{t}_{{{{{{{{\rm{inter}}}}}}}}}{b}_{p-1},$$
$$i\frac{d{b}_{p}}{dt}=i\left(\frac{{g}_{B}}{1+| {b}_{p}{| }^{2}}-{\gamma }_{B}\right){b}_{p}+{t}_{{{{{{{{\rm{intra}}}}}}}}}{a}_{p}+{t}_{{{{{{{{\rm{inter}}}}}}}}}{a}_{p+1},$$

where gσ and γσ are the linear gain and linear loss at the site σ = A, B.

Fig. 1: Phase diagram of the domain-wall-type Su-Schrieffer-Heeger (SSH) lattice with saturable gain.

a Schematic of the domain-wall-type SSH lattice considered. The vertical red line is a guide to the eye for the domain-wall between the two SSH lattices, SSH 1 and SSH 2. b Phase diagram of the domain-wall-type SSH lattice with saturable gain and linear loss on the A and B sublattices. The white and grey areas correspond to the non-oscillating and oscillating topological phases, respectively. Representative total intensity IA (and IB) of the A (and B) sublattice in blue (and orange) for the c non-oscillating and d oscillating topological lasing mode. Spatio-temporal dynamics of the e non-oscillating and f oscillating topological lasing modes. The colour bars correspond to the amplitudes of the real and imaginary part of the modes. The non-oscillating and oscillating topological modes displayed are chosen at (γAB, gA − γAB) = (0.48, 0.06) and (0.16, 0.44), respectively. γAB and gA are, respectively, the linear loss and gain on the A sites.

In the passive setting (gA = gB = 0, γA = γB = 0), this configuration, with t1 > t2, is known to give a single topologically protected zero-energy (non-oscillating) mode localised at the domain wall and with non-vanishing amplitudes only on the A sites36. This is due to the bulk-boundary correspondence at the domain-wall between trivial and non-trivial topological SSH lattices. Indeed, an SSH lattice is topologically trivial (non-trivial) if the intra-unit cell coupling is greater (lower) than the inter-unit cell coupling.
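The zero-energy mode of the passive lattice can be checked numerically by diagonalising the tight-binding Hamiltonian. The following is a minimal sketch, not code from the original work. The helper names `bond_couplings` and `passive_hamiltonian` are hypothetical, and the bond assignment is our reading of the lattice described above: even n are A sites, the A-B bond inside a unit cell is t1 on the left of the domain wall and t2 on the right, and the domain-wall site n = 0 is therefore connected by the weaker coupling t2 on both sides.

```python
import numpy as np

def bond_couplings(N, t1, t2):
    """Coupling between sites n and n+1 for n = -N, ..., N-1 (assumed layout)."""
    n = np.arange(-N, N)
    even = n % 2 == 0  # even n: A site, so the bond (n, n+1) is intra-cell
    # Left lattice: intra = t1, inter = t2. Right lattice: intra = t2, inter = t1.
    return np.where(n < 0, np.where(even, t1, t2), np.where(even, t2, t1))

def passive_hamiltonian(N, t1, t2):
    """Hermitian nearest-neighbour Hamiltonian without gain or loss."""
    t = bond_couplings(N, t1, t2)
    return np.diag(t, 1) + np.diag(t, -1)

N, t1, t2 = 10, 1.0, 0.7
sites = np.arange(-N, N + 1)
energies, modes = np.linalg.eigh(passive_hamiltonian(N, t1, t2))

k = np.argmin(np.abs(energies))        # eigenvalue closest to zero
zero_mode = modes[:, k]
weight_on_B = np.sum(np.abs(zero_mode[sites % 2 == 1]) ** 2)
peak_site = sites[np.argmax(np.abs(zero_mode))]

print(f"E = {energies[k]:.2e}, weight on B sites = {weight_on_B:.2e}, peak at n = {peak_site}")
```

Under these assumptions the mode closest to zero energy has vanishing weight on the B sites and its largest amplitude at n = 0, in line with the description above.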

In the active setting, the lasing modes refer to the modes that do not vanish over time, and it has been shown that the topological phase of the system depends on the gain and coupling parameters16,17. As parameters, we use the values from ref. 17 with t1 = 1, t2 = 0.7, gB = 0 and γA = γB ≡ γAB. Figure 1b shows the phase diagram, by varying gA and γAB in the parameter space, for a lattice composed of Ns = 21 sites (N = 10). In this configuration, the system has two distinct topological lasing modes: a non-oscillating lasing mode [white area in Fig. 1b] and an oscillating lasing mode [grey area in Fig. 1b]. The dynamics of the two topological phases can be visualised by plotting the total intensity \(I_A=\sum_p |a_p|^2\) (\(I_B=\sum_p |b_p|^2\)) of the A (B) sites in Fig. 1c, d as in refs. 16,17. Alternatively, more details can be understood by plotting the space-time dynamics of the topological modes as shown in Fig. 1e, f. The non-oscillating phase is similar to the zero-energy mode in the passive SSH lattice. We can see in Fig. 1e that the mode is localised at the interface and has the majority of its amplitude on the A sublattice. On the other hand, the system with saturable gain exhibits a new topological phase with no counterpart in the passive setting. The new topological mode is characterised by an edge mode at the domain-wall with an oscillating behaviour of the amplitudes on the A and B sites, as shown in Fig. 1f.

The classification of the new topological phases in non-linear systems requires, so far, an expert knowledge of the given non-linear systems, for example, the known results derived in ref. 16. In fact, the phase diagram in Fig. 1b has been obtained solely by the fast Fourier transform of the time series in the parameter space. Thus, the main aim of this paper is to develop a tool to explore the topological phases of PTI lasers in more complicated settings for which we have little knowledge.
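The text does not detail how the fast Fourier transform is turned into a regime label, so the following is only a hypothetical sketch of one possible criterion. It flags a time series of the total intensity as oscillating when, after discarding the first half of the series as transient, the largest non-zero-frequency component exceeds a small fraction of the mean level. The function name `is_oscillating`, the transient cut and the threshold `rel_threshold` are all assumptions made for illustration.

```python
import numpy as np

def is_oscillating(intensity, rel_threshold=1e-3):
    """Heuristic oscillation test on a sampled total-intensity time series."""
    series = np.asarray(intensity, dtype=float)
    tail = series[series.size // 2:]          # drop the first half as transient (assumption)
    mean_level = tail.mean()
    spectrum = np.abs(np.fft.rfft(tail - mean_level))
    # Amplitude of the strongest non-zero-frequency component.
    ac_amplitude = 2.0 * spectrum[1:].max() / tail.size
    return bool(ac_amplitude > rel_threshold * abs(mean_level))

# Synthetic stand-ins for the two behaviours (not simulation data).
t = np.linspace(0.0, 100.0, 2000)
steady = 1.0 - np.exp(-t)                     # relaxes to a constant intensity
pulsing = 1.0 + 0.3 * np.sin(2.0 * t)         # sustained oscillation

print(is_oscillating(steady), is_oscillating(pulsing))
```

Such a criterion needs a hand-tuned threshold and prior knowledge of what to look for, which is the kind of expert input the representation classification aims to avoid.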

The phase diagram shown in Fig. 1b will serve as a reference for our proposed method. The dataset we will utilise throughout this paper is composed of about 1000 samples, which are randomly generated from the same coupling and gain parameters’ range as in Fig. 1b. The coupled-mode equations [Eqs. (2) and (3)] are integrated using the fourth-order Runge-Kutta method with ap(t = 0) = bp(t = 0) = 0.01 for all p as the initial condition. Although the integration has been performed using a fixed time step dt = 0.01 until a final time t = 1400, only 2000 time snapshots are uniformly retrieved in order to keep the time series at a reasonable size. For the parameters given above, this sample rate leaves about 10 time steps per period for the oscillating regime case [Fig. 1d]. The phase diagram is then obtained solely from the time series of the states within the given parameter space.
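The integration described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the paper. The function names, the default arguments and the bond layout (even n are A sites, t1 inside left-hand unit cells, t2 inside right-hand ones) are our own choices. Eq. (1) is divided by i, so that dψ/dt = (g/(1 + |ψ|²) − γ)ψ − iHψ, with H the coupling matrix.

```python
import numpy as np

def bond_couplings(N, t1, t2):
    """Coupling between sites n and n+1 for n = -N, ..., N-1 (assumed layout)."""
    n = np.arange(-N, N)
    even = n % 2 == 0
    return np.where(n < 0, np.where(even, t1, t2), np.where(even, t2, t1))

def rhs(psi, H, g, gamma):
    """Eq. (1) solved for d(psi)/dt: saturable gain, linear loss and couplings."""
    return (g / (1.0 + np.abs(psi) ** 2) - gamma) * psi - 1j * (H @ psi)

def integrate_rk4(gA, gamma_AB, N=10, t1=1.0, t2=0.7, gB=0.0,
                  dt=0.01, n_steps=2000, save_every=10, psi0=0.01):
    """Fixed-step RK4; returns snapshots with shape (Ns, n_saved)."""
    sites = np.arange(-N, N + 1)
    t = bond_couplings(N, t1, t2)
    H = np.diag(t, 1) + np.diag(t, -1)
    g = np.where(sites % 2 == 0, gA, gB)          # gain gA on A sites, gB on B sites
    gamma = np.full(sites.size, gamma_AB)         # same linear loss on every site
    psi = np.full(sites.size, psi0, dtype=complex)
    snapshots = [psi.copy()]
    for step in range(1, n_steps + 1):
        k1 = rhs(psi, H, g, gamma)
        k2 = rhs(psi + 0.5 * dt * k1, H, g, gamma)
        k3 = rhs(psi + 0.5 * dt * k2, H, g, gamma)
        k4 = rhs(psi + dt * k3, H, g, gamma)
        psi = psi + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        if step % save_every == 0:                # keep only a uniform subset of steps
            snapshots.append(psi.copy())
    return np.array(snapshots).T

def sublattice_intensities(X, N=10):
    """Total intensities I_A(t) and I_B(t) from a snapshot matrix."""
    sites = np.arange(-N, N + 1)
    power = np.abs(X) ** 2
    return power[sites % 2 == 0].sum(axis=0), power[sites % 2 == 1].sum(axis=0)

# Short sanity run without gain: the norm must decay as exp(-gamma_AB * t).
X = integrate_rk4(gA=0.0, gamma_AB=0.1, n_steps=1000, save_every=100)
I_A, I_B = sublattice_intensities(X)
print(X.shape, np.linalg.norm(X[:, -1]))
```

With dt = 0.01, a run to t = 1400 corresponds to `n_steps=140000`, and `save_every=70` then keeps about 2000 snapshots. The short gain-free run only verifies the integrator against the exact exponential decay of the norm.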

Representation classification method

To classify topological lasing modes based on their distinct non-linear regimes, we use a representation classification method28,32,33. The general idea of representation classification relies on the assumption, valid in many common situations, that the dynamics of a high-dimensional system evolves on a low-dimensional attractor such as fixed points or periodic orbits37. The low-dimensional structure of the attractor allows for a reduced-order model that accurately approximates the underlying behaviour of the system: the dynamics of the complex system can thus be written using a basis that spans the low-dimensional space. Representation classification consists of constructing a library of appropriate bases, representative of the dynamical regimes of interest, and only then employing a filtering strategy to identify the regime corresponding to a given unknown time series. In the following, we will use the term “regime” to denote the different dynamical behaviours or the different topological phases in the non-linear SSH lattice with saturable gain. Besides, for convenience, we will plot only the total intensity on the A (\(I_A=\sum_p |a_p|^2\)) and B (\(I_B=\sum_p |b_p|^2\)) sublattices to represent the given regimes. Nevertheless, the time series of the complex amplitudes at each site will be considered for the construction of the library.

As is common in complex dynamical systems, the dynamics of a system close to an attractor lie in a low-dimensional space. This means that a given spatio-temporal dynamics, denoted by the vector x(t), can be approximately written in terms of a basis \({{\Phi }}={\{{\phi }_{i}\}}_{i = 1,\ldots ,D}\) spanning the low D-dimensional space, namely:

$$x(t)\approx \mathop{\sum }\limits_{i=1}^{D}{\phi }_{i}{\beta }_{i}(t)={{\Phi }}\beta (t)$$

where βi are the weighting coefficients in the above linear combination of basis states ϕi. Using the terminology used in the literature28,32,33, x(t) will, in the following, be referred to as the state measured at time t.

However, one of the main characteristics of non-linear systems is the drastically different dynamical behaviours with respect to the system’s parameters. Therefore, the reduced-order modelling strategy using a single representative basis, i.e., corresponding to a single regime, is bound to fail. Instead of finding a global basis, we here construct a set of local bases, i.e., construct a library composed of the bases of each non-linear regime of interest:

$${{{{{{{\mathcal{L}}}}}}}}=\{{{{\Phi }}}_{1},\cdots \,,{{{\Phi }}}_{J}\}={\{{\phi }_{j,i}\}}_{j = 1,\ldots ,J,i = 1,\ldots ,D},$$

where J is the number of regimes, Φj is the basis of the dynamical regime j, and ϕj,i are the corresponding basis states. This is the supervised learning part of the method, in which the data-driven method attempts to capture the dynamics of the system in the reduced-order model. Therefore, the library \({{{{{{{\mathcal{L}}}}}}}}\) contains the representative basis of each regime of interest, and corresponds to an overcomplete basis that approximates the dynamics of the system across the given parameter space. A better approximation of x(t), instead of using Eq. (4) for a single regime basis, then reads:

$$x(t)\approx \mathop{\sum }\limits_{j=1}^{J}\mathop{\sum }\limits_{i=1}^{D}{\phi }_{j,i}{\beta }_{j,i}(t)=\mathop{\sum }\limits_{j=1}^{J}{{{\Phi }}}_{j}{\beta }_{j}(t)$$

where βj,i are the weighted coefficients in the above linear combination in the overcomplete basis library \({{{{{{{\mathcal{L}}}}}}}}\). It is worth noting that the library modes ϕj,i are not orthogonal to each other, but instead orthogonal in groups of modes for each different regime j.

Throughout this paper, the bases used for constructing the library \({{{{{{{\mathcal{L}}}}}}}}\) will be generated by using a time-augmented dynamic mode decomposition (aDMD) method33 to consider both the spatial and temporal behaviours (see Supplementary note 1 for additional information).
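The details of the aDMD are given in Supplementary note 1 and are not reproduced here. To illustrate only the idea of time augmentation, the sketch below stacks Nw consecutive snapshots into one long vector, so that each column carries both spatial and temporal information, and then extracts an orthonormal basis with a truncated singular value decomposition. The SVD is a simple stand-in for the DMD step, and the names `time_augmented_matrix` and `svd_basis`, the window convention (Nw snapshots per window) and the rank D are assumptions.

```python
import numpy as np

def time_augmented_matrix(X, Nw):
    """Stack windows of Nw consecutive snapshots of X (Ns x T) as columns."""
    Ns, T = X.shape
    windows = [X[:, i:i + Nw].reshape(-1, order="F") for i in range(T - Nw + 1)]
    return np.stack(windows, axis=1)              # shape (Ns * Nw, T - Nw + 1)

def svd_basis(X_aug, D):
    """Leading D left-singular vectors, used here in place of aDMD modes."""
    U, _, _ = np.linalg.svd(X_aug, full_matrices=False)
    return U[:, :D]

# Synthetic complex data standing in for a simulated time series.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 40)) + 1j * rng.normal(size=(5, 40))

X_aug = time_augmented_matrix(X, Nw=4)
Phi = svd_basis(X_aug, D=3)
print(X_aug.shape, Phi.shape)
```

Each column of `X_aug` is the concatenation of Nw snapshots, which is the form in which a measurement over a time window can later be projected onto a regime basis.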

Here, we use a classification scheme based on a simple hierarchical strategy33. The regime classification approach is fundamentally a subspace identification problem, where each regime is represented by a different subspace. Given the state \(x({t}_{i}:{t}_{i+{N}_{w}})\) measured within the time window \([{t}_{i},{t}_{i+{N}_{w}}]\), with Nw the time step window size, the correct regime j* is identified as the corresponding subspace in the library \({{{{{{{\mathcal{L}}}}}}}}\) closest to the measurement in the L2-norm sense33. In other words, the classification strategy is to find the subspace that maximises the projection of the measurement onto the regime subspace:

$${j}^{* }=\arg \mathop{\max }\limits_{j=1,\ldots ,J}\parallel {P}_{j}x({t}_{i}:{t}_{i+{N}_{w}}){\parallel }_{2},$$

where Pj is a projection operator given by:

$${P}_{j}={{{\Phi }}}_{j}{{{\Phi }}}_{j}^{+}$$

with \({{{\Phi }}}_{j}^{+}\) being the pseudo-inverse of Φj, \(\parallel \cdot {\parallel }_{2}\) the L2-norm of a vector, \(\parallel \!\!v{\parallel }_{2}:= \sqrt{{\sum }_{i}| {v}_{i}{| }^{2}}\), and \(\arg \max\) the function that returns the index of the maximum value.
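The classification rule of Eqs. (7) and (8) can be sketched in a few lines. Since \(P_j=\Phi_j\Phi_j^{+}\) is an orthogonal projector, \(\parallel x\parallel_2^2=\parallel P_j x\parallel_2^2+\parallel x-P_j x\parallel_2^2\), so maximising the projection is the same as minimising the distance to the subspace. The function names and the synthetic library below are illustrative assumptions, and the measurement over a time window is taken to be already flattened into a single vector.

```python
import numpy as np

def projector(Phi):
    """P = Phi Phi^+ [Eq. (8)], the orthogonal projector onto the span of Phi."""
    return Phi @ np.linalg.pinv(Phi)

def classify(x, library):
    """Return the regime index maximising ||P_j x||_2 [Eq. (7)] and all scores."""
    scores = [np.linalg.norm(projector(Phi) @ x) for Phi in library]
    return int(np.argmax(scores)), scores

# Synthetic library of two random two-dimensional subspaces of R^8.
rng = np.random.default_rng(0)
library = [np.linalg.qr(rng.normal(size=(8, 2)))[0] for _ in range(2)]

# A measurement lying exactly in the second subspace.
x = library[1] @ np.array([1.0, -0.5])
j_star, scores = classify(x, library)
print(j_star, scores)
```

The pseudo-inverse keeps the projector well defined even when the columns of Φj are not orthonormal, which is why it is used instead of a plain conjugate transpose.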

For the 1D system exemplified in this paper, the data collection is not too expensive because Ns is reasonable. However, if Ns is very large, sparse measurement might be desirable and a slight change in the methodology is then needed as explained in Supplementary note 2.

Phase diagrams

In the following we will draw phase diagrams using different approaches to the representation classification. In the phase diagrams we will mark the different identified regimes by the colour of the dots, where the (dark or light) blue dots always mark the oscillating regime, the green dots the non-oscillating regime, the red dots the transient regime and the orange dots the transition regime. The white and grey areas are overlays of the referenced phase diagram obtained in Fig. 1. The aDMD bases have been generated with Nw = 25 from the time series starting at the 1800-th time step.

Fixed library

Figure 2a displays the phase diagram derived from a library made of the two topological regimes known from Fig. 1. These two topological modes, used for the construction of the library, are represented by the two magenta dots. They are randomly chosen from the known regimes’ regions in Fig. 1b, and the details of the resulting phase diagram depend on that random choice. The remaining coloured dots in the plot represent the identified regime j* [Eq. (7)] of each sample in the parameter space. However, we can see that the phase diagram fails to correctly predict all the dynamical behaviours. Indeed, we observe that many time series are not correctly identified. Using a different choice of topological regimes in the parameter space to construct the library could be a solution to find a better phase diagram, but our attempts only showed marginal improvement of the agreement. Testing every possible choice in the dataset with no guarantee of finding the accurate phase diagram is therefore not a practical solution. Instead, using four bases in the library rather than two, or equivalently considering four regimes from the given parameter space, the phase diagram in Fig. 2b shows better results: the identified oscillating and non-oscillating regimes are more separated and fit the reference phase diagram better, even though they belong to distinct regime indices j*. Merging the three non-oscillating regimes present in the library would then give a more accurate derived phase diagram. Therefore, Fig. 2 suggests that increasing the number of bases in the library \({{{{{{{\mathcal{L}}}}}}}}\) and merging some of them might help to get closer to the desired phase diagram, as we will see in the later sections.

Fig. 2: Representation classification based on a fixed library.

Phase diagram obtained from the library composed of a two regimes (one non-oscillating and one oscillating) and b four regimes (three non-oscillating and one oscillating). The library is constructed by the magenta dots located at (γAB, gA − γAB) = (0.48, 0.06) and (0.16, 0.44), and for b additionally at (0.31, 0.11) and (0.17, 0.24). γAB and gA are, respectively, the linear loss and gain on the A sites.

Top-down adaptive library

In the previous section, the construction of the library \({{{{{{{\mathcal{L}}}}}}}}\) was a manual process from which we already knew the different dynamical regimes. This, however, requires prior knowledge of the complex system considered. The strategy, here, is to adaptively construct the library based on the given data samples. Here, we employ a top-down approach in which we start with too many samples for the construction of the library, and then reduce the library size by merging some of them. Based on some measures in the decision process, this automated construction of the library thus removes the manual construction of the regimes.

The underlying assumption of the classification scheme is based on the dissimilarity between the subspaces of different regimes. We thus propose to consider regimes that are similar as equivalent regimes. This would, for instance, help us to merge the three phases in the non-oscillating region in Fig. 2b, and consider them as a single regime. In other words, the regimes i and j are said to be equivalent, denoted by i ~ j, if the fraction of information retained after the projection onto each other, γij ∈ [0, 1], is high enough:

$${\gamma }_{ij}\, > \,{\gamma }_{{{\rm{th}}}},$$

where γth ∈ [0, 1] is the hyper-parameter that sets the threshold value for merging different regimes, and γij is the subspace alignment given by:

$${\gamma }_{ij}:= \frac{\parallel \!\!{P}_{i}{P}_{j}{\parallel }_{F}^{2}}{\parallel \!\!{P}_{i}{\parallel }_{F}\parallel \!\!{P}_{j}{\parallel }_{F}},$$

with \(\parallel \cdot {\parallel }_{F}\) the Frobenius norm of a matrix, \(\parallel \!\!M{\parallel }_{F}:= \sqrt{{\sum }_{ij}| {M}_{ij}{| }^{2}}\). Importantly, the relation Eq. (9) is numerically computed in such a way that the transitivity property of the equivalence relation is satisfied, namely that if i ~ j and j ~ k then i ~ k. The relation Eq. (9) is then indeed an equivalence relation because the reflexive (i ~ i) and symmetric (i ~ j ⇔ j ~ i) properties of the relation are automatically satisfied from the definition of γij [Eq. (10)]. Supplementary note 3 gives more details on the top-down library generation principle.
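The exact numerical procedure is described in Supplementary note 3. One simple way to obtain a transitive relation from Eqs. (9) and (10), shown here as a hedged sketch rather than the authors' implementation, is to link every pair of regimes with γij > γth and to take the connected components of the resulting graph as the equivalence classes. The function names and the toy library of one-dimensional subspaces are assumptions.

```python
import numpy as np

def projector(Phi):
    return Phi @ np.linalg.pinv(Phi)

def alignment(P_i, P_j):
    """Subspace alignment gamma_ij of Eq. (10)."""
    num = np.linalg.norm(P_i @ P_j, "fro") ** 2
    return num / (np.linalg.norm(P_i, "fro") * np.linalg.norm(P_j, "fro"))

def merge_regimes(library, gamma_th):
    """Label regimes so that any chain of aligned regimes shares one label."""
    P = [projector(Phi) for Phi in library]
    parent = list(range(len(P)))

    def find(i):                       # root of the class containing regime i
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(len(P)):
        for j in range(i + 1, len(P)):
            if alignment(P[i], P[j]) > gamma_th:   # Eq. (9)
                parent[find(i)] = find(j)          # union of the two classes

    roots = [find(i) for i in range(len(P))]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]

def direction(angle_deg):
    """One-dimensional basis in R^3, rotated in the x-y plane."""
    a = np.deg2rad(angle_deg)
    return np.array([[np.cos(a)], [np.sin(a)], [0.0]])

# For 1D subspaces gamma_ij = cos^2(angle): 0.75 at 30 degrees, 0.25 at 60 degrees.
library = [direction(0), direction(30), direction(60), np.array([[0.0], [0.0], [1.0]])]
labels = merge_regimes(library, gamma_th=0.7)
labels_without_link = merge_regimes([library[0], library[2], library[3]], gamma_th=0.7)
print(labels, labels_without_link)
```

In this toy example the first and third bases are not directly aligned (γ = 0.25) but are merged through the intermediate basis. Removing that intermediate basis leaves them in separate classes.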

The top-down representation classification strategy is to classify the time series according to a large library of bases, and only then merge the equivalent identified regimes via the calculated subspace alignment γij and the equivalence relation Eq. (9). Figure 3 shows the phase diagram obtained from the top-down algorithm that started with a library composed of J = 60 randomly chosen regimes, along with the representative of each phase. We observe in Fig. 3a that the derived phase diagram is able to distinguish the non-oscillating [top panel of Fig. 3b] and oscillating regimes [middle panel of Fig. 3b]. In addition, a third regime corresponding to a transient regime [bottom panel of Fig. 3b] is found close to the γAB = 0 or gA − γAB = 0 axis. This transient regime indicates that a longer simulation time might be needed before these samples can be assigned to either the oscillating or the non-oscillating regime.

Fig. 3: Representation classification based on a top-down adaptive library.

a Phase diagram obtained using the top-down classification strategy with an initial library composed of J = 60 regimes randomly chosen and the hyper-parameter threshold γth = 0.75. b Representative total intensity IA (and IB) of the A (and B) sublattice in blue (and orange) for the different regimes. The black vertical dotted line indicates the starting time from which the bases are generated.

However, we can see that the derived phase diagram is still failing in the low γAB and low gA − γAB region (bottom-left region of the present phase diagram), where some time series are interpreted as belonging to the non-oscillating instead of the oscillating regime. This shows the limitation of this method, where the initially constructed library may lack some of the paths that connect similar bases. For example, regimes i and k might not be similar enough to be considered as equivalent directly [Eq. (9)] but are both equivalent to a regime j, i.e., i ~ j and j ~ k, which is missing in the library. The natural workaround would be to increase the initial library size and ensure that the regimes in the library have no missing paths, as we will see in the next section.

The hyper-parameter γth is an important quantity in the algorithm since it dictates which regimes are equivalent or not. A low threshold γth will easily merge regimes while a high γth will barely reduce the size of the library as depicted in Fig. 4a. The threshold is here arbitrarily chosen based on Fig. 4, and based on the refinement of the desired library. For example, we can see in Fig. 4b that the derived phase diagram with γth = 0.55 has two different phases. For this coarse threshold, the transient regime is identified but the distinct non-oscillating and oscillating phases are merged together into a single phase. On the other hand, with the same library as in Fig. 4b, Fig. 4c displays the obtained phase diagram for a finer threshold value γth = 0.95. The plot shows that the algorithm separates the parameter space into several regimes, which can be grouped into four main regimes. In addition to the non-oscillating, the oscillating and the transient regimes, there is a regime corresponding to the transition between the two topological phases. Besides, this finer description allows us to see distinct sets of modes in the oscillating parameter space region [dark blue and light blue dots in Fig. 4c]. Nevertheless, we observe again that the initial library misclassifies some of the non-oscillating time series most likely because of some missing paths, as said previously.

Fig. 4: Hyper-parameter dependency for the top-down adaptive method.

a Library size against the hyper-parameter threshold γth. Phase diagrams derived using the top-down classification strategy with b γth = 0.55 and c γth = 0.95. The initial library is composed of J = 60 regimes randomly chosen.

Bottom-up adaptive library

We propose an alternative and dual approach, which considers fewer samples in the library. The core idea of this bottom-up approach is then to add samples on the fly during the classification of the given sample if the library is not good enough.

Here, the library is considered to be good enough if the maximal projection of the measurement onto the regimes’ subspace is high enough. In other words, the library is said to be good enough if the worst relative reconstruction error, ϵ, is low enough:

$$\epsilon \, < \,{\epsilon }_{{{\rm{th}}}},$$

where ϵth is the hyper-parameter which decides the threshold quality of the library and

$$\epsilon := \mathop{\max }\limits_{j=1,\ldots ,J}\frac{\parallel \!\!{P}_{j}y(t)-y(t){\parallel }_{2}}{\parallel \!y(t){\parallel }_{2}},$$

with \(\parallel \cdot {\parallel }_{2}\) the L2-norm of a vector. Supplementary note 4 gives more details on the bottom-up library generation principle.

The advantage of this bottom-up approach is the full exploration of the parameter space region and the automatic construction of a library based on its quality. This method does not depend on a random choice of the samples used to construct the library, and the library composition is not restricted to a narrow parameter space region. With a sufficiently high library quality, the algorithm should therefore be able to resolve the missing paths issue of the top-down method.

The bottom-up representation classification scheme consists of classifying the time series according to a given library or adding this sample into the library if the library is not good enough, and only then merging the different phases obtained into groups of equivalent regimes using the top-down method. Figure 5 depicts the phase diagram derived from the bottom-up classification algorithm with a starting library composed of a single regime. Similarly to the top-down approach in Fig. 3a, we observe three distinct regimes corresponding to the non-oscillating, oscillating and transient regimes [Fig. 5b]. Nevertheless, the obtained phase diagram now better predicts the regimes. The misclassifications of the non-oscillating and oscillating regimes, which were due to missing paths in the library, are now reduced, and only very few dots are not correctly identified due to being close to the topological transition boundary. Likewise, the transient points indicate that longer simulations are needed because of the long transient time.
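The on-the-fly construction can be illustrated with the loop below. It is a sketch under explicit assumptions, the full procedure being given in Supplementary note 4. We read "good enough" as the best-matching regime reconstructing the sample with a relative error below ϵth, which corresponds to the maximal projection being high enough. The basis of a new regime is built with a truncated SVD as a stand-in for the aDMD, the measurement y is taken as the last snapshot of each sample, and all function names are hypothetical.

```python
import numpy as np

def svd_basis(X, rank):
    """Stand-in basis generator (truncated SVD instead of aDMD)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]

def relative_errors(library, y):
    """Relative reconstruction error ||P_j y - y|| / ||y|| for every regime j."""
    errors = []
    for Phi in library:
        residual = Phi @ (np.linalg.pinv(Phi) @ y) - y
        errors.append(np.linalg.norm(residual) / np.linalg.norm(y))
    return np.array(errors)

def bottom_up_classify(samples, eps_th, rank=1):
    """Classify each sample, growing the library when no regime fits it."""
    library, labels = [], []
    for X in samples:
        y = X[:, -1]                                  # measurement (assumption)
        errors = relative_errors(library, y)
        if errors.size == 0 or errors.min() >= eps_th:
            library.append(svd_basis(X, rank))        # library not good enough
            labels.append(len(library) - 1)
        else:
            labels.append(int(errors.argmin()))       # closest regime subspace
    return library, labels

# Synthetic samples living on two different directions of R^4.
e1, e2 = np.eye(4)[:, 0], np.eye(4)[:, 1]
samples = [np.outer(e1, [1.0, 2.0, 3.0]), np.outer(e2, [1.0, -1.0, 2.0]),
           np.outer(e1, [2.0, 1.0, 0.5]), np.outer(e2, [3.0, 1.0, 1.0])]

library, labels = bottom_up_classify(samples, eps_th=0.005)
print(len(library), labels)
```

The regimes collected in this way would then be grouped into equivalent regimes with the subspace alignment of Eq. (10), a step that is not repeated in this sketch.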

Fig. 5: Representation classification based on a bottom-up adaptive library.

a Phase diagram obtained using the bottom-up classification strategy with a starting library composed of a single regime randomly chosen, and the hyper-parameters threshold ϵth = 0.005 and γth = 0.75. b Representative total intensity IA (and IB) of the A (and B) sublattice in blue (and orange) for the different regimes.

Along with the γth hyper-parameter, the threshold hyper-parameter ϵth is an important parameter since it decides whether or not a given sample is added to the library. We observe in Fig. 6a that a low threshold ϵth will add many samples to the library, whereas a high ϵth will not add samples to the library at all. The threshold value ϵth is, again, arbitrarily chosen but with a preference for a high-quality library, i.e., low ϵth, in order to avoid missing paths. For example, we can see in Fig. 6c the phase diagram derived using ϵth = 0.05 (and γth = 0.95), namely with a library that gives a relative reconstruction error of the measurement of less than 5%. This set of hyper-parameters gives four main regimes that seem to correspond to the non-oscillating, oscillating, transition and transient regimes. Yet, there is some misidentification of the two topological phases, most likely because of missing paths in the obtained library. On the other hand, with a better library quality, here ϵth = 0.005 (and γth = 0.95), the missing paths are retrieved and the derived phase diagram correctly predicts the topological phases [Fig. 6b]. Figure 6b shows that the different regimes, obtained previously with a lower library quality, are now better defined. The non-oscillating and oscillating regimes are well located in their respective parameter space regions, and the transition points follow the transition boundary between the two topological phases. In addition, the bottom-up representation classification splits the oscillating regime into two oscillating modes [dark blue and light blue dots in Fig. 6b]. The presence of distinct oscillating modes is an example of new insights given by the data-driven classification method. Indeed, the complex values of the amplitudes of x(t) are, here, taken into account instead of solely the total intensity of each sublattice A and B as in refs. 16,17. This allows for a finer description of the dynamic pattern based on the whole lattice, through the relative phase difference between the sites or the absolute value of the amplitudes.

Fig. 6: Hyper-parameter dependency for the bottom-up adaptive method.

a Library size against the hyper-parameter threshold ϵth. The inset is a zoom-in of the plot. Phase diagram derived using the bottom-up classification strategy with b ϵth = 0.005 and c ϵth = 0.05. The initial library is composed of a single regime randomly chosen, and the other hyper-parameter threshold value is γth = 0.95.


We have proposed a data-driven approach to identify topological phases of dynamical systems. By utilising the representation classification strategy based on the aDMD, we have successfully drawn the phase diagrams of the domain-wall-type SSH lattice with saturable gain. To avoid manual labelling in the classification, we have proposed two automatic library construction schemes: top-down and bottom-up approaches that merge similar phases in a library or adaptively construct a library according to its quality, respectively. We find that the best approach to tackle the problem of drawing the phase diagram for the SSH lattice is the bottom-up adaptive method. This shows that reverse engineering methods allow us to explore the parameter space without any expert knowledge of the complex non-linear system.

Our study points out some pitfalls to avoid while using reverse engineering, and a strategy to extend the method to more complex systems. While not all phases may be identified on the first try, our approach is capable of clustering similar behaviour and gives a first classification of the different modes in the system. It should be complemented by a thorough analysis of these modes. Nonetheless, because of its different approach to drawing the phase diagram and its capability of clustering similar behaviour, reverse engineering holds the potential to find novel topological lasing modes, which could have been overlooked in other approaches.