Spectral analysis of climate dynamics with operator-theoretic approaches

The Earth’s climate system is a classical example of a multiscale, multiphysics dynamical system with an extremely large number of active degrees of freedom, exhibiting variability on scales ranging from micrometers and seconds in cloud microphysics, to thousands of kilometers and centuries in ocean dynamics. Yet, despite this dynamical complexity, climate dynamics is known to exhibit coherent modes of variability. A primary example is the El Niño Southern Oscillation (ENSO), the dominant mode of interannual (3–5 yr) variability in the climate system. The objective and robust characterization of this and other important phenomena presents a long-standing challenge in Earth system science, the resolution of which would lead to improved scientific understanding and prediction of climate dynamics, as well as assessment of their impacts on human and natural systems. Here, we show that the spectral theory of dynamical systems, combined with techniques from data science, provides an effective means for extracting coherent modes of climate variability from high-dimensional model and observational data, requiring no frequency prefiltering, but recovering multiple timescales and their interactions. Lifecycle composites of ENSO are shown to improve upon results from conventional indices in terms of dynamical consistency and physical interpretability. In addition, the role of combination modes between ENSO and the annual cycle in ENSO diversity is elucidated.


Introduction
Ever since the discovery of phenomena such as ENSO 1 and the Madden-Julian Oscillation (MJO) 2 , the objective identification and characterization of coherent modes of climate variability have been vigorously studied across the disciplines of Earth system science. In the face of dynamical complexity and eventto-event diversity, the state of large-scale patterns of climate dynamics is typically described through a reduced representation provided by climatic indices, constructed using physical understanding and/or statistical approaches. For example, ENSO is an oscillation with a broadband periodicity of 2-7 years, commonly monitored using so-called Niño indices 3 . The latter are defined as spatial and temporal averages of sea surface temperature (SST) anomalies over the equatorial Pacific source region of ENSO. Such indices are employed for a multitude of diagnostic and prognostic purposes, including lifecycle composites 4 and prediction skill assessment 5 . Clearly, the success of these efforts depends strongly on the properties of the indices employed to characterize the phenomenon of interest. In general, it is desirable that a climatic index be as objective as possible, i.e., reveal an intrinsic pattern of climate dynamics independent of subjective choices such as data prefiltering, or details of the observation modality. For oscillatory patterns such as ENSO and MJO, it is important that the indices reveal the full cycle as a sequence of observables, e.g., SST fields in the case of ENSO. Yet, despite their widespread use, conventional approaches for defining climatic indices have inherent limitations, obfuscating the properties of the phenomenon under study, and sometimes yielding inconsistent results 6 . Empirical Orthogonal Function (EOF) analysis 7 , for example, is perhaps the most commonly used statistical technique for identification of climatic indices, yet it is well known to exhibit timescale mixing and poor physical interpretability due to EOF invariance under temporal permutations of the data, even in idealized settings 8 . In the context of ENSO, scalar Niño indices do not provide full information about the state of the cycle because the index could be increasing or decreasing.
In this paper, we show that the operator-theoretic framework provides an effective route for identifying slowly decaying (equivalently, slowly decorrelating) observables of the climate system as dominant eigenfunctions of transfer/Koopman operators and their generator. These eigenfunctions directly describe coherent climate phenomena such as ENSO, with higher dynamical consistency and physical interpretability than indices derived through conventional approaches. The principal distinguishing aspects of this analysis, illustrated in Fig. 1, can be summarized as: 1. Identification of cycles from spatio-temporal information: Our spectral approach is based on dynamical systems techniques, providing a superior basis for extracting persistent cyclic behavior. We transform the underlying full nonlinear dynamics to a larger linear space, yielding a complete linear picture for our spectral analysis. This transformation is built directly from physical spatiotemporal fields such as SST snapshots. Complex pairs of eigenvalues and their eigenvectors directly reveal persistent cycles (see outer panels of Fig. 1) and their periods.
2. Dynamical rectification: The underlying oscillations in ENSO are clearly revealed in a "rectified" two-dimensional (2D) phase space provided by a complex eigenvector. Temporal evolution of the oscillation is well described by a harmonic oscillator, represented by motion at a fixed speed around a circle in 2D phase space, with the oscillation frequency α determined by the complex eigenvalue. Importantly, this property holds true even if the dynamics of the full system is chaotic. See Figs. 2, 4, and 5, and accompanying animations in Supplementary Movies 1 and 2 for illustrations. Our thorough treatment of rectification clearly shows the asymmetry of ENSO, and enables an estimate of the "local speed" of the ENSO cycle.
3. Phase equivariance: If the 2D phase space is partitioned into S "wedges", each corresponding to a lifecycle phase, then the dynamical evolution of the samples starting in any given phase over a time interval of 2π/(Sα) maps them consistently to the next phase. See Fig. 1 (center) and Fig. 9 2/41 Compute leading eigenvalues ( Figure 3) and eigenvectors.
Subdivide the extracted cycle into phases -see Figure 1 (center). These phases are equivariant and respect the cyclic dynamics ( Figure 9).
Provides a projection onto the dominant cycle and the cycle frequency. The cycle is rectified to proceed at constant speed ( Figure 5).
Produces a cycle of spatio-temporal composites, e.g. for SST -see Figure 1 (outer panels).

Input spatio-temporal data
Build transfer/Koopman operator or generator.
Subdivide the extracted cycle into phases -see Fig. 1 (center). These phases are equivariant and respect the cyclic dynamics ( Fig. 9).
Provides a projection onto the dominant cycle and the cycle frequency. The cycle is rectified to proceed at constant speed (Fig. 5).

Input spatio-temporal data
Build t opera Each point in the 2D phase space represents an ENSO state. Dynamical evolution progresses in an approximately cyclical manner via counter-clockwise rotation. The period of the cycle is equal to 2π/α ≈ 4 yr, where α is the imaginary part of the eigenvalue corresponding to g. The 2D phase space is partitioned into S = 8 "wedges" of equal angular extent (distinguished by different solid colors), each corresponding to a distinct ENSO phase. Outer panels: The panels linked to each wedge are phase composites of SST (colors) and surface wind anomaly fields (green arrows). Collectively, they reveal a complete ENSO cycle, starting from a mature El Niño in Phase 1, and progressing to an El Niño to La Niña transition in Phase 3, mature La Niña in Phases 4-5, and La Niña to El Niño transition in Phase 7. The identified phases are equivariant, meaning that they each span a time interval of 2π/(Sα) ≈ 0.5 yr, and under forward dynamical evolution by 0.5 yr the samples making up phase i correlate strongly with the samples making up phase i + 1. By virtue of this property, the generator-based ENSO lifecycle captures the duration asymmetry between the El Niño to La Niña and La Niña to El Niño transition. This is evidenced by the fact that the strongest La Niña anomalies occur in Phase 4, as opposed to Phase 5 (which would be expected for a time-symmetric oscillation). Right: Flow chart of the computational approach for identification of slowly decaying cycles through eigenfunctions of transfer/Koopman operators. Red-, blue-, and orange-shaded boxes represent input, computation, and output, respectively.

3/41
for examples of this behavior with S = 8. An important consequence of equivariance combined with slow decay is that it endows the identified phases with higher predictability, while enabling the discovery of new mechanistic relationships between physical fields because of a more accurate lifecycle. Our improved phasing suggests that ENSO has a more significant cyclical component than previously thought. shows the phase angle on the attractor obtained by treating the leading two PCs as the real and imaginary parts of a complex observable. The black line depicts the same portion of the dynamical trajectory as the black line in Panel (a). Panel (c) shows a 2D projection associated with the leading two EOF PCs for the same time interval as Panel (d). Since these PCs correspond to linear projections of the data onto the corresponding EOFs, the evolution in the 2D phase space spanned by PC 1 , PC 2 has comparable complexity to the "raw" L63 dynamics, exhibiting a chaotic mixing of two cycles associated with the two lobes of the attractor. Panels (e-h) show the corresponding results to Panels (a-d), respectively, obtained from the leading non-constant eigenfunction g 1 of the transfer operator P ε (see Methods) . Panel (f) shows the argument of the complex-valued g 1 (color is the argument) evaluated at the 16,000 points in the trajectory. Notice that there is a cyclic "rainbow" of color as one progresses around each individual L63 attractor wing in phase space. Panel (g) plots these same arguments of g 1 , now in the complex plane, demonstrating that the output of g 1 lies approximately on the unit circle. Panel (h) shows the real part of the trajectory in Panel (g) plotted versus time, illustrating approximately simple harmonic motion. Thus, the second eigenvector g 1 of the transfer operator P ε extracts the dominant cyclic behavior of L63 on the attractor's wings.

5/41
stationarity is governed by a probability measure µ on Ω, which is preserved by the dynamics; formally µ(Φ −t (A)) = µ(A) for any measurable set A ⊆ Ω. The ergodicity assumption is an indecomposability hypothesis: there are no non-trivial Φ t -invariant sets, meaning that Ω cannot be decomposed into separate subsystems.
Operator-theoretic approaches shift attention from studying the properties of the (generally, nonlinear) flow Φ t on state space to studying its induced action on linear spaces of (generally, nonlinear) observables. We denote by F the space of complex-valued functions on Ω. The space F has the structure of an infinite-dimensional linear (vector) space equipped with the standard operations of function addition and scalar multiplication, but the elements of F need not be linear functions. We will consider the subspace of observables H = { f ∈ F : Ω | f | 2 dµ < ∞}. Intuitively, thinking of µ as the climatological distribution of the system, the space H consists of all observables with finite climatological mean and variance.
The dynamics acts naturally by composition on each element f 0 ∈ H. For invertible Φ t , the composition operator, f t := P t f 0 = f 0 • Φ −t , known as the transfer operator, evolves f 0 forward t units of time to the function f t . Dual to (and here, the inverse of) the transfer operator P t , is the Koopman operator defined by U t f 0 := f 0 • Φ t . Traveling forward in time along a trajectory {Φ t (ω 0 )} t≥0 , the observations recorded by f 0 along this trajectory are f 0 (Φ t (ω 0 )) = (U t f 0 )(ω 0 ).
Ergodicity may be equivalently characterized by the constant function 1 being the unique (normalized) fixed point of U t . Ergodicity implies (via Birkhoff's Ergodic Theorem or the strong law of large numbers) that sufficiently long trajectories in Ω will well sample µ. This will be important in this paper because we are using a single trajectory as our input data. We note that many operator-theoretic algorithms may also use information from multiple trajectories and are not restricted to using a single time series. Similar operator constructions can be carried out in other functional settings, notably there is a well-developed spectral theory for infinite compositions of different transfer operators arising from non-autonomous dynamical systems 24,[57][58][59] .
We now describe how the spectral properties of P t and U t provide natural notions of persistent almost-cyclic functions and observations. We distinguish between methods applicable for discreteand continuous-time dynamics. Discrete-time approaches are based on approximations of the time-1 transfer/Koopman operators, whereas continuous-time approaches target the infinitesimal generators of the transfer/Koopman evolution semigroups. In the present setting of observables in the Hilbert space H associated with the invariant measure, the Koopman and transfer operators are unitary, and are duals to one another under operator adjoints, i.e., P t * = U t . Thus, working with P t vs. U t is merely a matter of convention.
the approximate P is contained in the unit disk {z ∈ C : |z| ≤ 1}, rather than lying on the unit circle {z ∈ C : |z| = 1}. This addition of noise, which may also be done theoretically, for example by convolution with a stochastic kernel 15,63,64 , is frequently harnessed to easily select the most important eigenvalues from the typically infinite collection σ e (P), namely those eigenvalues with large magnitude (close to 1).
Let P ε denote this perturbed operator and consider an eigenfunction g (ε) corresponding to an eigenvalue Λ (ε) of large magnitude. Because (P ε ) t g (ε) = (Λ (ε) ) t g (ε) , these eigenfunctions g (ε) decay slowly under iteration of P ε relative to the decay rates of eigenfunctions corresponding to eigenvalues of smaller magnitude. It is these "leading" or "dominant" eigenfunctions that will persist over long timescales and will accurately describe the evolution of the dynamics over similarly long timescales.
Returning to our example Φ rotating the circle by an angle α, the eigenfunctions of P ε with least decay will be approximations of (and for carefully chosen approximations, equal to) e ±iθ (k = ±1) because they are the most regular, and persist longest under continued perturbation. The corresponding eigenvalues are Λ (ε) ±1 = R ε e ±iα ε ≈ R ε e ±iα , for 0 < R ε 1, which correspond to rotation by ±α with small decay rate of R ε per unit time. Thus, the eigenvalues of P ε of greatest magnitude (excluding the eigenvalue 1) automatically identify the rotation angle α. See Methods for a description of our numerical approach for approximating P.

Continuous time.
In continuous time one can consider generators for the transfer and Koopman operators. These generators are time-derivatives of P t and U t , and are given by The operators G and V are defined on a dense subspace of H, and are skew-symmetric duals to one another, i.e., G = V * = −V . One has σ e (G) = σ e (V ) are additive subgroups of iR (the eigenspectrum lies on the imaginary axis in C); that is, if λ ,λ ∈ σ e (G) = σ e (V ) then λ +λ and λ −λ are both in σ e (P). Eigenvalues of G and V are interpreted as rates of rotation per unit time. If our phase space is S 1 , and Φ t rotates the circle at a rate α, then G and V have eigenvalues λ k = ±ikα and corresponding eigenfunctions e ikθ for k ∈ Z and θ ∈ S 1 .
The operators G and V "generate" the semigroup of operators P t and U t by P t = e tG and U t = e tV , and the spectral mapping theorem connects their spectra: σ e (P t ) = e tσ e (G) and σ e (U t ) = e tσ e (V ) (if Φ t is not invertible, the spectral value 0 = e −∞ is treated separately). For example, the relationship Λ = e λ links the eigenvalues Λ of the discrete-time operators with the eigenvalues λ of their continuous-time counterparts.
As in the discrete-time setting, one may perturb the generators by addition of a diffusion process or through a numerical scheme. In the former case, if Φ t is governed by a vector field then natural "diffused" versions G ε and V ε of G and V are provided by normalized forward and backward Kolmogorov equations, respectively. In the latter case, one may apply various numerical schemes 26,30,34,36,65 . The scheme 34 is outlined in the Methods section. The eigenvalues of G ε and V ε are in general complex numbers with zero or negative real part. For the same reasons as in the discrete-time setting, one seeks eigenvalues with real part closest to the imaginary axis, which describe the slowest decay rate. In our example of a circle rotation with rotation rate α, the eigenfunctions of least decay rate are e ±iθ (k = ±1) with corresponding eigenvalues λ (ε) ±1 = −r ε ± iα ε ≈ −r ε ± iα, for r ε 0 (r ε is analogous to log R ε from the discrete-time setting).

Eigenvalue frequency analysis of monthly-averaged Indo-Pacific SST
We analyze model and observational SST data over the Indo-Pacific domain 28 • E-70 • W, 60 • S-20 • N. This domain was selected as a representative region of activity for several large-scale modes of climate variability on seasonal to decadal timescales, including ENSO, ENSO combination modes 66 , and the Interdecadal Pacific Oscillation (IPO) 67 . The model data comprise 1300 yr of monthly-averaged SST fields from a pre-industrial control integration of the Community Climate System Model Version 4 (CCSM4) 68 ,

7/41
sampled at the model's native ocean grid of approximately 1 • resolution. As observational data, we use monthly averaged SST fields at 2 • resolution from the Extended Reconstructed Sea Surface Temperature Version 4 (ERSSTv4) reanalysis product 69 over the period January 1970 to February 2020. The resulting SST data vectors x n have dimension d = 44,771 and 4868 for CCSM4 and ERSSTv4, respectively.
Our numerical approach builds an approximation of the generator in a data-driven basis consisting of eigenvectors of a kernel matrix. The kernel matrix, K K K, has sizeÑ ×Ñ, whereÑ = N − Q + 1, and is constructed from delay-embedded SST fields over a window of Q − 1 = 48 lags of length ∆t = 1 month, corresponding to an interannual time interval of Q ∆t = 4 years. Its eigenvectors represent temporal patterns that can be thought of as nonlinear generalizations of the PCs obtained via EEOF analysis. Previously 70-72 , such kernel eigenvectors were shown to successfully recover physically meaningful modes from monthly averaged SST data in both Indo-Pacific and Antarctic domains. For our purposes, however, the eigenvectors and eigenvalues of K K K are employed to construct a data-driven version of the regularized generator V ε , and extract dominant modes by solution of an associated eigenvalue problem (see Methods). The approximation basis formed by the eigenvectors of K K K is (i) learned from the highdimensional SST data at a feasible computational cost; (ii) is refinable, in the sense of having a well-defined asymptotic limit as the amount of data N increases; and (iii) as the delay window Q ∆t increases, it is provably well-adapted to representing eigenfunctions of the generator 33,37 . The results we obtain are not particularly sensitive to the precise choice of kernels and lags, nor to the use of the generator or transfer operator. For example, similar results can be obtained with the transfer operator P ∆t constructed using a single lag (Q = 2) of length = 12 months (see Methods). A summary of the dataset attributes and numerical parameters employed in our computations is displayed in Supplementary Table 1. Figure 3 and Supplementary Table 2 show eigenvalues λ 0 , λ 1 , . . . of the generator V ε computed from the CCSM4 and ERSSTv4 datasets, arranged in order of decreasing real part (i.e., increasing decay rate). The leading eigenvalues form distinct branches corresponding to (i) the annual cycle and its harmonics; (ii) ENSO and its combination modes with the annual cycle; and (iii) low-frequency (decadal) modes with vanishing oscillatory frequency. In the case of the observational data, the spectrum also contains a trend-like mode representing climate change, as well as combination modes representing the modulation of the annual cycle by the trend (see Supplementary Fig. 1).
In interpreting the results in Fig. 3 and Supplementary Table 2, it should be kept in mind that, modulo a small amount of numerical drift, the CCSM4 data are generated by autonomous dynamics associated with fixed (pre-industrial) concentrations of greenhouse gases and perfectly periodic radiative forcing representing the seasonal cycle. In particular, the phase of the seasonal cycle is implicitly represented in the delay-embedded SST data. The autonomous techniques employed in this paper are therefore rigorously applicable in this dataset. In contrast, the ERSSTv4 data are subject to different natural and anthropogenic external forcings (e.g., volcanoes and greenhouse gas emissions, respectively), so strictly speaking our autonomous methodology does not formally apply here. Nevertheless, our spectral decomposition separates the trend (corresponding to a real eigenvalue) from the periodic and approximately periodic cycles (corresponding to complex eigenvalues), which are by definition trendless. In fact, we posit that an advantage of our approach is that it is capable of extracting trendless cyclical modes in ERSSTv4 without ad hoc detrending of the data, which is oftentimes performed in the context of EOF analysis and related approaches.
In both CCSM4 and ERSSTv4, the seasonal-cycle modes occur first in our ordering, which is consistent with the fact that these are purely periodic modes remaining correlated for arbitrarily long times. Two pairs of eigenfrequencies ν j := Im λ j /(2π) in this family are accurately identified by the data-driven eigenvalue problem, namely the annual (1 yr −1 ) and semiannual (2 yr −1 ), eigenfrequencies where the numerical results agree with the true values to within 1% and 4%, respectively (see Supplementary Table 2). The Indo-Pacific SST data, highlighting eigenvalues corresponding to seasonal, interannual, decadal, and trend modes. The vertical and horizontal axes show the frequency ν j = Im λ j /(2π) and growth rate, Re λ j , respectively. Note that complex eigenvalues occur in complex-conjugate pairs as appropriate for describing oscillatory signals at the corresponding eigenfrequencies. Moreover, negative values of Re λ j correspond to decay. Lines connecting eigenvalues serve as visual guides for the seasonal (periodic), trend, and ENSO branches of the spectra. The annual (dark blue) and ENSO (red) eigenfrequencies indicated in the spectra correspond, through their imaginary parts, to the frequencies ν annual and ν ENSO discussed in the main text. Note that decadal modes are present in the CCSM4 spectrum, but they have larger decay rates, − Re λ j , than the range depicted in Panel (a). See Supplementary Table 2 for a listing of the leading 25 generator eigenvalues extracted from CCSM4 and ERSSTv4. third (triannual) harmonic is not identified as accurately, being assigned an eigenfrequency of 2.5 yr −1 as opposed to the expected 3 yr −1 . This discrepancy is at least partly due to finite-difference errors in our numerical approximation of the generator; this is discussed in more detail in the Methods section. Other contributing factors to approximation errors for the eigenfrequencies include the Nyquist limit (which imposes a limit of 1/(2 ∆t) = 6 cycles/yr on the maximum frequency that can be resolved with a monthly sampling interval) and the addition of diffusion (which in general perturbs the eigenvalues along both the real and imaginary axes) in the construction of the regularized generator V ε .
Beyond the seasonal cycle branch, the CCSM4 spectrum exhibits a branch of eigenvalues consisting of a pair of fundamental modes with an interannual frequency ν 7 0.25 yr −1 =: ν ENSO , as well as combination frequencies ν j , j = 9, 11, 13, 15, approximately equal to ν ENSO + mν annual , where ν annual = 1 yr −1 is the annual-cycle frequency, and m is an integer taking values in the set {−2, −1, 1, 2}. Note that the spacing of 2 in the index j is due to restricting to positive frequencies in this discussion; see Supplementary Table 2.
We will shortly interpret the eigenfunction corresponding to the eigenvalue ν ENSO as representing the fundamental ENSO cycle. We note that this choice is unambiguous as the eigenvalue with largest real part and frequency close to 0.25 yr −1 . Similarly, the frequencies ν ENSO + mν annual are naturally interpretable as combination modes, consistent with the group structure of the generator spectrum described above. Further notable aspects of these results are that (i) distinct generator eigenvalues correspond to distinct combination frequencies (as opposed to EOF analysis, which mixes the combination and fundamental 9/41 frequencies 66 ); and (ii) two harmonics are identified corresponding to the annual and semiannual cycles. In separate calculations, we have verified that the ENSO eigenfrequencies extracted from the CCSM4 data remain unchanged to two significant digits for embedding windows ranging from 1 year (Q = 12) to 16 years (Q = 192).
ENSO and ENSO combination eigenvalues are also identified in the ERSSTv4 spectrum, but these eigenvalues occur after an eigenvalue with vanishing imaginary part that we interpret as a representation of climate change trend. As shown in Supplementary Fig. 1(a), the eigenfunction time series corresponding to this eigenvalue has a manifestly nonstationary character, which is broadly consistent with accepted climate change signals such as persistent warming from the 1980s to early 2000s, "hiatus" during the mid to late 2000s, and accelerated warming during the early to mid 2010s 73 . In addition, the trend eigenfunction time series is found to correlate with area-averaged anomalies of Indo-Pacific SST and global surface air temperature, with 0.83 and 0.78 correlation coefficients, respectively. As with ENSO, this trend eigenfunction comes with its own "combination frequencies" close to 1 yr −1 (since the trend frequency is zero), capturing the modulation of the annual cycle by the trend (see Supplementary Fig. 1(b)). Aside from this trend family, both the CCSM4 and ERSSTv4 spectra contain additional modes with zero corresponding eigenfrequency, representing internal decadal variability of the Indo-Pacific 70,71 . The spectra also contain interannual modes with higher frequencies than ν ENSO , notably a mode with an approximately 3-year eigenperiod (see Supplementary Table 2). In what follows, we will focus on the fundamental ENSO eigenfunctions and the corresponding lifecycle analysis. These correspond to eigenfunctions g 7 and g 6 in the CCSM4 and ERSSTv4 ordering, respectively. Since the observational data are sparser and noisier than the model data, we expect larger (numerical) decay rates for the observational data as stronger diffusion is needed to regularize the generator (see Methods). This is borne out in Fig. 3, where the real parts of the generator eigenvalues for ENSO and the ENSO combination modes are more negative for the ERSSTv4 data than for CCSM4.
In summary, we have extracted ENSO eigenfunctions and eigenfrequencies from two datasets (CCSM4 and ERSSTv4), using two computational techniques (generator and transfer operator) and a range of numerical parameters (lag and embedding window length). Moreover, in each spectral analysis experiment, there is no ambiguity in associating particular eigenfunctions with ENSO, as discussed above.
A rectified ENSO lifecycle from eigenfunctions As described above, when nonzero eigenfrequencies exist, the dominant eigenfunctions correspond to observables with approximately cyclic evolution, even if the underlying flow Φ t is aperiodic. We will use this idea to extract a rectified ENSO lifecycle from the spatiotemporal SST data.
Factoring out approximate cycles from eigenfunctions. Let (U ε ) t g (ε) = (Λ (ε) ) t g (ε) as before. We follow an orbit in state space Ω starting at some ω 0 . Evaluating both sides of ( where we have inserted the definition of U t 0 as the middle term, recalling we have U ε ≈ U in some sense. Defining the multiplicative action of a complex number Λ on another complex number by ). Thus, we may think of the eigenfunction g (ε) as an approximate projection (or factor map) from Ω to C; this is summarized in the following (approximate) commutative diagram: Ω Ω

10/41
Evolution under Φ t on Ω is projected down (by g (ε) ) to approximately a fixed multiplicative action on C by Λ (ε) . Further, for |Λ (ε) | ≈ 1, we may consider the multiplicative action of M Λ (ε) as an approximate action on corresponds to an approximate rotation on S 1 by an angle of ±α. Thus, for |Λ (ε) ±1 | ≈ 1, evolution under Φ t on Ω is projected down (by g (ε) ) to approximately a fixed rotation on S 1 by α. The above statement is illustrated numerically for the Lorenz equations in Fig. 2 The evolution lies approximately on S 1 ⊂ C ( Fig. 2(g)), rotating at an approximately fixed rate ( Fig. 2(h)). See Supplementary Movie 1 for a more direct visualization of these results. Projections of this type for real eigenfunctions of the transfer operator have been used to project out fast dynamics in multiple time scale systems 74 .
Rectified cycles from eigenfunctions. The fact that the rotation on S 1 occurs at close to a fixed rate is a key aspect of our ENSO analysis, and so we emphasize this property by discussing a simple example that is strongly illustrative for climate cycles such as ENSO. We imagine a crude model of the ENSO cycle with one-dimensional phase space Ω = S 1 . The dynamics of this idealized model is given by a flow Φ t : S 1 → S 1 , generated by a nonconstant velocity on S 1 . We choose a sawtooth-like velocity field to model the observation that the La Niña to El Niño transition is slower than the transition in the other direction 75 ; see Fig. 4(c) for the corresponding evolution of a "normalized Niño 3.4" index vs. time and Supplementary Movie 2 for the corresponding animation. In this situation there is no need to "extract" a cycle in the dynamics, because the dynamics is a cycle-but importantly with nonconstant speed.
Similar to above, let M t Λ denote the flow that advances the angle on S 1 by arg(Λ)t, where arg(Λ) = (2π)/T and T is the period of the cycle. The flow M t Λ has a constant velocity around S 1 , namely 2π/T ; see Fig. 4(f) for the corresponding cosine-like evolution. Because Φ t and M t Λ are both cycles of the same period, there exists a homeomorphism h : We denote by θ the angle in the "original" cycle (the upper part of the commutative diagram) and by θ the angle in the rectified cycle (the lower part of the commutative diagram); by definition θ = h(θ ). We set θ = 0 to represent the peak El Niño state in our crude cyclic model of ENSO, and without loss of generality we fix h(0) = 0 so that peak El Niño occurs at the same angle θ = θ = 0 in both the original and rectified cycles. We define θ = π as peak La Niña, according to the original cycle, directly opposing El Niño; this is represented by the green dot in Fig. 4(a). The eigenfunction of the Koopman operator corresponding to M t Λ with eigenvalue Λ is g rect (θ ) := e iθ ; this eigenfunction is illustrated Fig. 4(e) where the value of the real part of e iθ is colored. By the above conjugacy, the function (g rect • h)(θ ) = e ih(θ ) is an eigenfunction of U t with eigenvalue Λ; see Fig. 4(a). Because La Niña is reached more quickly from El Niño than vice-versa in the original flow, so too (by conjugacy) must this occur in the rectified, constant-speed flow. Thus, La Niña in fact appears earlier than half-way through the rectified cycle; see the green dot in Fig. 4(e), which lies at the angle h(π). Finally, let f orig (θ ) := e iθ represent the complex-valued function corresponding to our crude cyclic model of ENSO, where θ = 0 is El Niño and θ = π is La Niña. We can map f orig to the rectified space by f orig • h −1 (θ ) = e ih −1 (θ ) ; the real part of this latter function is shown in Fig. 4(b). We will use 11/41 Figure 4. Rectification of a variable-speed oscillator by eigenfunctions of the generator. The dynamics is chosen such that the speed dθ dt is faster when θ lies in the interval Θ fast := (0, π) and slower for θ ∈ Θ slow := (π, 2π), resulting in the time series for cos(θ (t)) shown in Panel (c). Panel (a) shows the state space of the original oscillator, i.e., the unit circle S 1 consisting of all phase angles θ ∈ [0, 2π), colored by the value of the real part of the function f orig (θ ) = e iθ , namely Re f orig (θ ) = cos θ . In an analogy with ENSO, θ = 0 and π (green dot) would correspond to El Niño and La Niña climate states, respectively, while cos θ would correspond to a Niño index. The asymmetry in rotation speed then mimics the fact that El Niño-La Niña transitions take a shorter amount of time than La Niña-El Niño transitions. In Panel (d) we push down to rectified state space and show in color the real part of f orig • h −1 . Note that La Niña (green dot) appears earlier in this constant-speed cycle on rectified space; this compensates for the variations of speed of the original oscillator. It is this representation, i.e., the original "ENSO index" mapped to the rectified state space, that we focus on in Fig. 5. Panel (e) shows the real part of the eigenfunction g rect (θ ) = e iθ (color) on rectified space, which appears as a pure cosine wave in Panel (f). The evolution of the phase angle in the rectified state space is that of a harmonic oscillator with constant angular frequency 2π/T ; note the period is T = 4 years in analogy with ENSO. Finally, in Panel (b) we pull the function g rect back to the original space and display the real part of g rect • h (color).

12/41
versions of the functions f orig and f orig • h −1 as our main demonstration of our rectification process in Fig. 5. These results are an example of the automatic rectification performed by Koopman eigenfunctions (Theorem 17.11 11 ) for systems with discrete spectra. Operator-theoretic approaches to different kinds of rectification have also been explored 76,77 .
Comparing our rectified ENSO eigenfunction lifecycle to the Niño 3.4 index. We now apply the ideas from the previous two subsections to the CCSM4 and ERSSTv4 data, where g rect in those sections will be the generator eigenfunctions g 7 and g 6 , arising from these two datasets, respectively. In the following we will refer to g 6 and g 7 collectively as simply g j . Figure 5 compares several aspects of the g j to new, lagged ENSO indices f nino derived from the Niño 3.4 index output from the CCSM4 and ERSSTv4 data as follows. At each time instance, f nino is a 2D vector consisting of the current Niño 3.4 value and its value months in the past; that is, (Niño 3.4(t), Niño 3.4(t − months)). We choose to be the lag that gives the most cycle-like behavior for f nino . If the Niño 3.4 index evolved as a perfect cycle with a period of T = 4 months, the two components of f nino would be in quadrature (90 • phase difference), resulting in a purely angular motion in the associated 2D phase space. This situation would be analogous to the evolution of the f orig observable depicted in Fig. 4(a), which is periodic but not of fixed frequency. Yet, in Fig. 5(a, i), it is evident that the evolution of f nino exhibits significant departures from an approximate 4-year cycle, featuring both retrograde and radial motion, particularly in the case of the ERSSTv4 data ( Fig. 5(i)). In Fig. 5(d, l), we show the evolution of the phase angle obtained by treating the components of f nino as the real and imaginary parts of a complex number, analogous to the L63 example in Fig. 2(b, f) (note that the latter representations are in the full phase space). Here, an approximately cyclical evolution of f nino would induce an approximately monotonic phase evolution (modulo 2π), which would additionally be linear for a constant-frequency cycle. While such a behavior is discernible in Fig. 5(d, i), the phase evolution of f nino is clearly corrupted by high-frequency noise due to retrograde/radial motion.
Consider now the generator eigenfunctions g j . The time series plots ( , but instead appear earlier in the rectified cycle. These facts and the fact that the corresponding eigenfrequencies ν j are interannual and well-approximate ν ENSO , provide evidence that the g j provide a representation of the ENSO lifecycle; a fact which will be corroborated further below using phase composites. Before doing that, however, we note two important aspects of the results in Fig. 5. First, the generator eigenfunctions provide a significantly more cyclic representation of the ENSO lifecycle than conventional Niño indices. In Fig. 5(e, m), the 2D phase space trajectories associated with the real and imaginary parts of g j are seen to undergo a predominantly polar evolution, with little to no retrograde motion when g j is located sufficiently away from the origin (|g j | 1). As noted above, this is in contrast to the retrograde and radial motion seen in the Niño 3.4-based f nino index. Moreover, in separate calculations we have verified that the generator eigenfunctions g j are also more cyclical than the twodimensional f nino indices constructed from the Niño 4, 3, and 1+2 indices. Two-dimensional phase space representations of the ENSO state with approximately cyclical behavior can also be constructed through multivariate indices, such as SST and thermocline depth anomalies 4 , that reveal recharge-discharge 13/41  Fig. 4(d, e), respectively. Panels (c, g, k, o) show Niño 3.4 time series (c, k) and the real part of g j (g, o), plotted over a 30-year portion of the available data. These time series are analogous to those in Panels (c, f) of Fig. 4, respectively. Panels (d, h, l, p) show phase angles determined from f nino (d, l) and g j (h, p). Note that the slope in (h, p) is approximately constant, consistent with the automatic rectification process, namely that trajectories precess around the origin at a fixed angular speed. This regular rectification from our complex eigenvector is in strong contrast to the irregular angular behavior of the lagged Niño 3.4 index in (d, l). 14/41 processes 78 , but these representations are also generally less coherent than those provided by the generator eigenfunctions.
Second, the generator eigenfunctions "rectify" the ENSO cycle in a manner analogous to the oscillator example in Fig. 4. In Fig. 5(h), the phase angle associated with the CCSM4-derived g j undergoes a near-linear evolution, with some excursions from this behavior occurring. We observe that these deviations from linear behavior occur when the Niño 3.4 (scalar) index is close to zero (white color in Figs. 5(h,p)). Mathematically, deviations from cyclic behavior are more likely when |g j | is small, which implies Re g j is also small, and is in turn consistent with weak ENSO amplitude. Visually, the rectification induced by g j can be seen in the time series plots in Fig. 5(c, g), where a comparatively uniform El Niño-La Niña cycling of Re g j (Fig. 5(g)) is contrasted with slow La Niña to El Niño ramp ups followed by rapid El Niño to La Niña decays in the f nino representation (Fig. 5(c)). In Fig. 6(c), we examine the relationship between the phase angles associated with f nino and g j through a curve fit of θ := arg g j as a function of θ := arg f nino (shown in a solid yellow line). The fitted curve provides an estimate of the homeomorphism function h discussed above in the context of the oscillator example. When θ = π (i.e., during La Niñas according to the Niño 3.4 index), the fitted θ is less than π, which shows that La Niña events occur earlier than half-way through the g j cycle, as in Fig. 4.
A similar general behavior of the phase angle is observed for the ERSSTv4 data (Figs. 5(l, p) and 6(f)), though as one might expect the results are noisier than for CCSM4. Still, the phase angle progression associated with g j (Fig. 5(p)) exhibits a significantly more rectified behavior than its f nino counterpart (Fig. 5(l)), particularly during significant El Niño/La Niña events (highlighted with green star markers). Interestingly, the generator angle arg g j corresponding to La Niña events following strong El Niños (e.g., the 1973/74 and 1999/00 La Niñas in Fig. 5(n)) is close to 90 • . This is consistent with the fact that strong consecutive El Niño and La Niña events in the observational record have a tendency to occur one year apart, corresponding to a quarter of the 4-year ENSO cycle.
In summary, our spectral analysis extracts a canonical ENSO cycle, and provides rectified coordinates representing the cycle as an approximately fixed-speed oscillation. In rectified space it is clear that the representation of the ENSO cycle in terms of Niño indices (SST anomalies) is asymmetric because La Niña appears earlier (in phase/angle space) around the one-dimensional cycle (see Figs. 5(f) and 5(j)). Without the rectified representation, it would be difficult to assign a characteristic speed/frequency around the cycle. This notion of characteristic frequency will be useful below for constructing phase composites, and should also be useful for constructing reduced models. More broadly, we suggest that rectification is an important conceptual construction, which should be useful in a wide range of climate dynamics applications.

ENSO phases and their associated composites
We construct reduced representations of the ENSO lifecycle by partitioning the 2D phase spaces associated with the generator eigenfunctions and lagged Niño 3.4 index into angular phases, and then study the properties of associated phase composites of relevant oceanic and atmospheric fields. Figure 6(a, b) and Fig. 6(d, e) depict the phase space partitions over eight such phases for CCSM4 and ERSSTv4, respectively. Each phase is constructed from samples at times for which |g j | lies in the top m values in the corresponding 45 • radial sector, where m = 200 and 20 for CCSM4 and ERSSTv4, respectively. Larger magnitude values of the eigenfunction g j occur at times belonging to stronger ENSO cycles, and because we seek a strong canonical ENSO cycle, we subsample at these times. Mathematically, the phase composites constructed in this manner can be interpreted as conditional expectations of observables (e.g., SST anomaly fields) with respect to a discrete variable π j : Ω → {0, 1, . . . , 8} indexing the eight phases associated with eigenfunction g j . The inclusion of a "zero" phase nominally is to account for states which

15/41
are not ENSO-active, consistent with earlier work 24, 79 that prioritizes larger values of real eigenfunctions and equivariant functions; see Methods for further details.
It should be noted that in the eigenfunction-based representation, partitioning the phase space into phases of uniform angular extent is a natural choice since the evolution is rectified and takes place at an approximately constant angular frequency. In other words, in the eigenfunction picture in Fig. 6(b,e), phases of uniform angular extent correspond to phases of uniform temporal duration, in this case approximately 4/8 = 1/2 years. In the case of the Niño 3.4-based representation in Fig. 6(a,d), achieving a well-balanced partitioning is more challenging due to variable/retrograde angular speed and significant radial motion. Here, we have opted to employ a uniform partitioning scheme which is common practice with many cyclical climatic indices, including indices for the MJO and other intraseasonal oscillations 80 . We note that this is already an improvement over a characterization of ENSO phases based on scalar indices, since such representation cannot distinguish the time tendency (increasing or decreasing) of the oscillation.
In both the Niño 3.4-and eigenfunction-based representations, the phases are numbered such that Phase 1 corresponds to El Niño, and periodic cycling of the phases from 1 to 8 represents an El Niño to La Niña to El Niño evolution. Turning back to the Niño-3.4 representation in Fig. 5(b), Phase 5 is a La Niña phase centered at angle π. On the other hand, in the generator representation in Fig. 5(f), La Niña (deep blue, corresponding to lowest Niño-3.4 values) occurs at Phase 4, centered at 3π/4, due to the rectification. This means that the rectified generator representation allocates more phases (Phases 5-8) in the La Niña to El Niño portion of the ENSO lifecycle, thus yielding a more granular description of ENSO initiation processes.
In Fig. 7, we examine phase composites of monthly averaged SST and surface wind anomalies, constructed using the Niño 3.4 and generator phases from CCSM4 and ERSSTv4 depicted in Fig. 6. In the the CCSM4 analysis we use surface wind data from the atmospheric component of the model (CAM2). In the ERSSTv4 analysis, the surface wind data is from the NCEP/NCAR Reanalysis 1 product 81 . First, on a coarse level, both the Niño-and generator-based composites recover the salient features of the ENSO lifecycle. These include (i) the characteristic El Niño "tongue" of positive SST anomalies in the Eastern equatorial Pacific, together with its associated anomalous surface westerlies, in Phase 1; (ii) meridional discharge in the ensuing intermediate phases; and (iii) formation of negative SST anomalies and easterly surface winds during the La Niña phases (Phases 5 and 4 for the Niño-and eigenfunction-based representations, respectively). The Niño-3.4 and generator-based composites in Fig. 7 also exhibit important differences, particularly in the La Niña to El Niño transition phases. In both CCSM4 and ERSSTv4, Phases 6-8 of the generator capture a reorganization of the large-scale surface winds from a convergent configuration over the Maritime Continent in Phase 6 to a divergent configuration initiating in Phase 7 with a buildup of anomalous westerlies in the Western Pacific, developing further in Phase 8. In particular, the anomalous westerlies in Phase 7 are consistent with the aggregate effect of higher-frequency, stochastic atmospheric variability such as westerly wind bursts 82 that trigger the development of El Niño events.
To examine this behavior in more detail, in Fig. 8  and m = 20 for CCSM4 and ERSSTv4, which corresponds to approximately 1.5% and 3% of the available data per phase, respectively. The selected data points in each phase are marked by distinct colors, with red corresponding to the El Niño phase, Phase 1. Progression from Phase 1 to Phase 8 takes place in a counter-clockwise sense. The La Niña phases in the Niño 3.4 and generator representation are Phases 5 (blue) and 4 (pink), respectively. Gray lines show the phase space evolution over the entire 1300-year (a, b) and 50-year (d, e) analysis intervals. Panels (c) and (f) plot the rectified generator angle (y-axis) against the Niño-3.4 angle (x-axis). The yellow curves fit the data points and by construction pass through the origin (0, 0), which corresponds to El Niño for both the Niño-3.4 and generator representations. Note that according to the yellow curve in Panel (c), El Niño for the Niño-3.4 representation, occurring at angle π on the x-axis, corresponds to an angle of ≈ 3π/4 on the y-axis for the rectified cycle. This quantifies the more rapid transition between El Niño and La Niña in CCSM4, in comparison to the reverse transition. Although less noticeable in the yellow curve in Panel (f), the ERSSTv4 results display a similar El Niño-La Niña transition asymmetry, manifested by the tendency of La Niñas (dark blue dots) to occur below the diagonal dashed line. The speed of transition is indicated at the finer level of phases by the green lines, which are spaced equally on the x-axis according to Niño-3.4 phase boundaries. Wider (resp. narrower) spacing of the horizontal green lines corresponds to a slower (resp. faster) transition between phases. degree of asymmetry between the Northern and Southern hemispheres.
In contrast to the generator composites, the Niño-3.4-based composites exhibit significantly more abrupt El Niño-La Niña and La Niña-El Niño transitions, failing to recover a number of the processes outlined above. In particular, Niño Phase 2 (which represents El Niño decay in the generator picture), closely resembles the mature El Niño phase in Phase 1. In Phase 3, the Phase 2 configuration is abruptly replaced by near-neutral conditions, failing to capture the southward shift of the anomalous equatorial westerlies associated with El Niño termination. The Niño- the eastern equatorial Pacific being replaced by well-developed El Niño conditions. Importantly, there is no representation of anomalous westerlies during these phases. The more physically informative reconstruction of the ENSO lifecycle provided by the generator is likely due to the dynamical rectification property discussed above, which enables phase partitioning in the "intrinsic" phase of the oscillation. Beyond ENSO, we expect this rectification property to be beneficial in diagnostic and mechanistic studies of different climate phenomena.

Phase equivariance
Besides the diagnostic aspects described above, an important requirement of an index representing a coherent oscillatory phenomenon such as ENSO is that phase progression is consistent with the temporal evolution of the samples constituting each phase-this is the concept of phase equivariance stated in the Introduction. In the particular setting of the eight-phase ENSO reconstruction studied here, phase equivariance means that the forward evolution of the samples that constitute phase i by six months (the nominal duration of each phase) should map these samples into the samples making up phase i + 1, modulo 8. Theoretically, this correspondence should be exact for a purely periodic process such as the variable-speed oscillator in Fig. 4, but for a chaotic oscillator such as ENSO we expect it to hold only approximately. We will demonstrate below that the indices based on the generator eigenfunctions g 7 (CCSM4) and g 6 (ERSSTv4) exhibit greater equivariance than the lagged Niño 3.4 index f nino .
To test for equivariance in the Niño-3.4 and generator-based representation of ENSO, in Fig. 9 we show this forward evolution in the corresponding 2D phase spaces in six-month increments starting from 19/41 Figure 9. Evolution of ENSO Phase 7 at six-month increments for (a) CCSM4 and (b) ERSSTv4. In each set of panels, the top and bottom rows depict the phase evolution associated with the lagged Niño 3.4 indices f nino and generator eigenfunctions g j , respectively. Bold yellow dots show the Phase 7 members (left column) and their forward images (right four columns) under the dynamics. Dots colored in muted colors show the phase partitioning from Fig. 6 for reference. Observe that the generator-based evolution undergoes a uniform phase progression with significantly smaller spread than the Niño-based evolution.

20/41
Phase 7 (i.e., the phase most closely related to El Niño initiation). There, it is evident that the generator lifecycle exhibits phase equivariance on significantly longer intervals than the Niño 3.4 lifecycle, in both CCSM4 and ERSSTv4. In the case of the generator, the centroid of the cloud of points making up the forward evolution of Phase 7 has a phase angle consistent with equivariant phase evolution over the examined 2-year interval. While there is visible dispersion occurring by 12 months, this dispersion occurs predominantly in the radial direction and has a limited effect on the phase classification. In contrast, the point clouds corresponding to forward evolution of the Niño-based Phase 7 exhibit strong dispersion in both radial and angular directions, decorrelating with the target phase expected from equivariance on intervals as short as 6-12 months. The difference in equivariance between the Niño 3.4 and generator lifecycle is most striking in the ERSSTv4 data, where after a 1-year interval the forward-evolved Phase 7 from Niño 3.4 has zero overlap with the expected Phase 1, whereas in the case of the generator that overlap is close to 100%. These results open the possibility that methods of characterizing ENSO based on area-averaged anomalies (such as the lagged Niño 3.4 index f nino ) may conflate unrelated parts of the cycle. This could contribute to difficulties with ENSO prediction, as shown in Fig. 9(a, b; top) for f nino , where poorly chosen groupings mix together unrelated ENSO phases, leading to rapid divergence of these "false" groupings. Our results suggest that ENSO may have a more significant cyclic component than previously realized.
As a more quantitative assessment of phase equivariance, in Supplementary Fig. 2 we show the fractional sample overlap between the forward-evolved ENSO phases in CCSM4 in six-month increments with the expected target phases from equivariance. It is worthwhile noting that highest predictability of the generator phases occurs for start phases near the El Niño/La Niña peaks (Phases 1, 2, and 6), where the fractional overlap remains above 0.5 for at least a year. The evolution initialized at intermediate phases such as 3-5, 7, and 8 is somewhat less equivariant, with the relative overlap dropping to smaller than 0.5 values after a year. This behavior may be a manifestation of the ENSO spring predictability barrier 84 .

ENSO diversity
ENSO diversity, i.e., the tendency of El Niño/La Niña events to differ from each other in terms of their spatial and temporal characteristics, has been a topic of considerable interest in the literature 4, 85-88 . It is common to spatially classify El Niño events as being of Eastern Pacific (EP) or Central Pacific (CP) type, depending on the longitudinal location of the highest SST anomalies 4 . Some studies have interpreted these patterns as being the outcome of distinct temporal processes, with CP events dominated by quasi-biennial (QB; 1.5-3 yr) components, and strong EP events exhibiting both QB and low-frequency (LF) components in the 3-7 yr band 88 . Other studies have classified ENSO events as cyclic, episodic, or multiyear, depending on whether they are preceded by the opposite, neutral, or same phase, respectively 87 4 .
Recall from Fig. 3(b) that the top part of the generator spectrum exhibits the fundamental ENSO eigenfunctions (with a 4 y eigenperiod), the associated ENSO combination modes (with various eigenperiods in the interannual to seasonal band), and also a pair of eigenfunctions, g 19 and g 20 , with a 3 y eigenperiod (not shown in Fig. 3(b); see Supplementary Table 2). To assess the contribution of these eigenfunctions in the variability of the Niño 3.4 index, we compute associated time series reconstructions, or "modes", using the standard approach employed in SSA, EEOF analysis, and other comparable techniques utilizing delay embedding 56,70 . Given a complex-conjugate pair of generator eigenfunctions, {g j , g j+1 }, this procedure produces a (real) time series that represents the component of the Niño 3 Fig. 10(a), we present reconstructed Niño 3.4 time series based on the fundamental 4-year ENSO mode (red line), the 3-year ENSO mode (blue line), and the sum of the leading two ENSO combination modes (green line). The sum of these ENSO-related modes is also shown (orange line), and captures greater variability than the fundamental ENSO mode alone. Shaded time intervals indicate periods where the the 2-month running correlation coefficient between the latter reconstruction and the Niño 3.4 index is greater than 0.9.
First, it is readily apparent that certain El Niños are well captured by a small number of leading modes-i.e., modes that reflect greater dynamical persistence and cyclicity. In particular, consider the very intense 1982/83 and 1997/98 El Niños: for these two events, the peaks of all three ENSO-related modes are effectively coincident in Fig. 10(a). The next most intense event, 2015/16, is characterized by 4-year mode amplitude comparable to 1982/83, with overall correlation among the 4-year, 3-year, and combination modes. However, the 3-year ENSO mode for 2015/16 is less intense than the corresponding mode for 1982/3.
Consider now the 1986/87 event, which is shown in detail in Fig. 10(b). In this case, we see a distinct behavior, as the combination modes have two peaks, one occurring before and one after the peak of the 4-year mode, and a trough occurring during the peak of the 4-year mode (see green and red lines in Fig. 10(a)). Superposing the combination modes with the fundamental ENSO mode (green line in Fig. 10(b)) results in consecutive peaks in the reconstructed Niño 3.4 index around the peak of the 4-year mode. If we additionally include the 3-year mode (orange line) the relative amplitude of the two peaks changes. In contrast, if we superpose only the 3-year and 4-year modes, the two consecutive peaks do not occur (see blue line in Fig. 10(b)).
Previously, ENSO combination modes have received significant attention due to their role in El Niño termination in boreal spring 66,83,89 . The results in Fig. 10 show that ENSO combination modes can also play an important role in reconstructing events with multiple peaks. We note that while we have directly computed the combination eigenfunctions, in theory (as discussed in subsection "Eigenvalue frequency analysis of monthly-averaged Indo-Pacific SST") they may be determined from the state of the annual and the 4-year ENSO eigenfunction. Thus, our results show that certain doubly-peaked events, such as the 1986/87 El Niño, can be reconstructed from the same small set of modes as those used to reconstruct strong EP events.
It is noteworthy that the 1982/83, 1986/87, and 1997/98 El Niños, as well as the 2009/10 event (which is also well captured by the reconstructions in Fig. 10(a)), are all followed by La Niñas in the subsequent year. In the transition-based classification of ENSO 87 , these La Niñas are thus all classified as cyclic (cyclic El Niños are defined in a symmetric way). From the perspective of our spectral analysis approach, the occurrence of these cyclic La Niñas can be explained from the fact that once a generator eigenfunction g j becomes "active", i.e., |g j (ω)| is large for a given climate state ω, it will, with high likelihood, remain active for at least a significant fraction of the cycle that it represents (since g j (ω) precesses in the complex plane with fixed frequency and a weaker radial motion; see, e.g., Fig. 6(e)). In particular, significant El Niño events in the generator-based representation have high likelihood of leading to La Niñas in the following year. On the other hand, the generator eigenfunctions have only moderate magnitude in the La Niña phase (see Fig. 5(m, n)). This suggests fewer cyclic El Niños, which is consistent with the different triggering mechanisms of the two phenomena 87 .
In contrast to all of the events mentioned above, other events such as the 1991/92 El Niño, are not readily accounted for by the leading modes. It is possible that this event is tied to the June 1991 eruption of Mt. Pinatubo [90][91][92] ; external forcing of this event may explain why the eigenmodes fail to capture it. We note that the 1982/83 El Niño followed the eruption of El Chichón, but is nonetheless a strong EP event captured by the leading ENSO modes. The 1994/95 CP El Niño represents an example of another "missed" event in the context of the modes illustrated in Fig. 10. The fact that the leading generator eigenfunctions, which favor cyclicity and dynamical persistence, do not capture these events is consistent with the broadly accepted observation that CP events exhibit less canonical behavior than their EP counterparts 4 . Intriguingly, the reconstructions in Fig. 10 indicate the existence of a 3-year interannual mode, associated with generator eigenfunctions g 19 and g 20 , which plays a significant role in strong EP events but does not significantly contribute to CP events. This differs somewhat from previous mode decompositions of the Niño 3.4 index 88 , which have identified a single QB mode contributing to both CP and strong EP events.
In summary, the results in Fig. 10 show that our spectral approach can differentiate certain ENSO events in terms of amplitude and phasing of an underlying set of dominant modes. What we would argue here, at least from the viewpoint afforded by a single regional index like , is that a rich diversity of El Niño behavior can be "constructed" from a small number of eigenfunctions of a dynamical operator. That the ENSO combination modes also contribute in a discernible way, either by adding to 23/41 the fundamental and 3-year ENSO modes as in 1982/83 or 1997/98 or creating consecutive peaks as in 1986/87, is also of note, as these modes may not be separately identified in the variance basis of EOFs, although they have been noted in SSA.

Discussion
Operator-theoretic approaches for dynamical systems, realized through kernel methods for machine learning, provide an effective framework for identification of persistent cyclic modes of variability in climate dynamics. Central to this framework is modeling the evolution of observables of the climate system with transfer and Koopman operators. The dominant eigenfunctions of these operators yield succinct and physically interpretable representations of fundamental modes of climate variability, with the corresponding eigenvalues reflecting the intrinsic timescale of variability of the mode. We have shown by means of theoretical arguments and numerical analyses of (i) idealized dynamical systems, (ii) comprehensive climate models, and (iii) reanalysis data, that these eigenfunctions reveal approximate cycles embedded in complicated systems with several advantageous characteristics over conventional approaches. Composites in the original observation space can be readily constructed; see Fig. 1. A further distinguishing aspect of our eigenfunctions is that they provide rectified coordinates for the state of the oscillation (Figs. 5 and 6), making them better suited for indexing the fundamental oscillations of the climate. Moreover our extracted cycles display a high level of self-consistency under forward evolution (Fig. 9), a desirable property for characterizing a canonical strong ENSO and promising for prediction.
A major focus of this work has been the El Niño Southern Oscillation, extracted from monthly-averaged Indo-Pacific SST data from a millennial control integration of a comprehensive climate model (CCSM4) and reanalysis data (ERSSTv4). In both of these datasets, the generator spectrum (Fig. 3) contains a pair of slowly decaying eigenfunctions with an interannual eigenfrequency, providing a rectified representation of the canonical ENSO lifecycle. In addition to the fundamental ENSO modes, the spectrum of the generator is found to exhibit a hierarchy of combination modes between ENSO and the annual cycle with the theoretically expected frequencies. These combination modes appear to play a role in capturing "double El Niño" events in the recent observational record, such as the 1986/87 series of events. Meanwhile, other events, such as the 1991/92 El Niño following the Mt. Pinatubo eruption and the 1994/95 central Pacific El Niño are not captured by the leading eigenfunctions, suggesting a different dynamical origin. Going beyond cyclic behavior, in the case of the reanalysis data, the spectrum of the generator was found to contain nonstationary modes associated with climate change, as well as combination modes representing the modulation of the annual cycle by the climate-change trend. Our analysis motivates further application of the spectral theory of dynamical systems to diagnosing and predicting the fundamental dynamical patterns of the climate.

Methods
As described in the main text, we have a time-ordered dataset x 1 , . . . , x N ∈ R d , arising as a series of observations x n = X(ω n ) from a trajectory of an abstract dynamical system Φ t : Ω → Ω, where ω n = Φ n ∆t (ω 0 ). In our experiments, the SST field sampled at d 1 Indo-Pacific gridpoints at time-index i yields a vector x i ∈ R d (see Supplementary Table 1 for further details on the datasets employed in this study). We also consider low-dimensional examples with d = 3 (L63 system; Fig. 2) and d = 2 (variable-frequency oscillator; Fig. 4), where X is the identity map on the respective state space Ω. Recall that ∆t > 0 is the sampling interval, and µ is an assumed physically meaningful invariant probability measure for Φ t , In what follows, we describe data-driven techniques for approximation of (i) the transfer 24/41 operator P ∆t , or the Koopman operator U ∆t ; and (ii) the generator V of the transfer/Koopman operator semigroups. In the measure-preserving setting, the transfer and Koopman operators on H = L 2 (Ω, µ) form dual pairs related by adjoints, (P t ) * = U t for every t ∈ R. Thus, for conciseness of exposition, in what follows we focus on approximation of the transfer operator P = P ∆t , corresponding to Φ = Φ ∆t . In addition, we describe our procedure for computing spatiotemporal mode reconstructions from eigenfunctions.

Delay embedding
We will delay-embed the data to form vectorsx i = (x i−(Q−1) , . . . , x i− , x i ) ∈ R Qd for some positive integers Q and , to have an improved estimation of the underlying state ω i ∈ Ω as in standard Takens embedding 51,52 . Let ν be the measure induced on R Qd by the invariant measure µ on Ω. We seek to approximate projected versions of the operators P ∆t , U ∆t , and V that act on functions on a space of projected observables L 2 (R Qd , ν). In practice, integrals with respect to ν are approximated by integrals with respect to the sampling probability measure ν N : where δx i is the Dirac δ -measure centered atx i . This will shortly reduce to summing over the original data pointsx i .

Approximation of transfer and Koopman operators
We define a novel, data-driven Markov chain approximation of P, where each embedded data point x i ∈ R Qd , i = (Q − 1) , . . . , N − 2, N − 1 is identified with a Markov state. Our approximation retains important structural properties of P, namely it is a positive operator (nonnegative functions are mapped to nonnegative functions) and preserves integrals with respect to the data-based measure ν N . The addition of noise mentioned in the main text to form P ε is done via Gaussian kernels centered on each pointx i , where ε is a positive bandwidth parameter, the choice of which is discussed at the end of this subsection. We discretely approximate the Markov operator P ε : L 2 (R Qd , ν) → L 2 (R Qd , ν) defined by as Evaluating P ε f at an embedded data pointx i , we have where .
The matrix P P P = [P i j ] is column stochastic and we may think of P P P i j as the conditional probability that the state j (or data pointx j ) transitions to the state i (or data pointx i ), in one time step, according to the

25/41
kernel k ε and the measure ν N . A function f : is evolved forward in time by matrix-vector multiplication P P P f f f ; this approximates the action of P ε f . By construction, the mass conservation property P * ε 1 = 1 is inherited by P P P, namely P P P 1 = 1. In practice, we compute the numerical spectrum of P P P, P P Pg g g j = Λ j g g g j , and extract the most persistent cyclic behavior from the eigenvector g g g 1 corresponding to the eigenvalue Λ 1 = re iα with largest magnitude inside the unit circle (largest |r| < 1) and α > 0. As described in the main text, α represents the angle of rotation around the extracted cycle per unit time. The corresponding eigenvector g g g 1 approximates the eigenfunction g (ε) (x i ) of P ε that approximately projects the system from R Qd to the most persistent cycle (approximately lying on S 1 ) as illustrated, e.g., in Fig. 2(f)-(h) in the context of the L63 system.
In the experiment in Fig. 2(f)-(h), we used N = 16,000 samples taken at a sampling interval of ∆t = 0.01 time units. Moreover, the dimension is d = 3, and we do not need to embed the data as we have access to the original state, thus we set Q = 1. In calculations not reported here using the ERSSTv4 (N = 600) and CCSM (N = 15,600) Indo-Pacific SST datasets, using ∆t = 1 month we find that a single lag of Q = 2 with = 12 months (approximately one quarter of the cycle period) is sufficient to accurately extract an accurate ENSO frequency and ENSO eigenfunctions; see Supplementary Table 1.
Regarding the choice of ε, generally one wishes to select an ε as small as possible, while maintaining an eigenvalue 1 of P P P with unit multiplicity. A ballpark estimate of suitable ε is the mean nearest neighbor distance (averaged over all embedded data pointsx i ), divided by √ 2; this scales the Gaussian in (1) to have greatest slope (and therefore "distinguishing ability") when x i − y is the mean nearest neighbor distance. The values of ε used in the Lorenz, ERSSTv4, and CCSM calculations are shown in Supplementary Table  1, and are modified by less than a factor of four from the above ballpark estimate.
2. Formation of an L × L matrix W W W approximating the operator V ε and solution of the associated eigenvalue problem.
In what follows, we outline these steps, referring the reader to our previous work 34 for additional details and pseudocode.
Kernel matrix and basis functions. Using the delay-embedded datax i , we compute anÑ ×Ñ matrix K K K, whose entries are given by the values K i j = k γ (x i ,x j ) of a pairwise kernel function k γ : R dQ × R dQ → R.
We use variable-bandwidth kernels centered on each pointx i , where γ is a positive bandwidth parameter, and σ (x i , y) is a positive bandwidth function. Intuitively, the role of σ is to control the rate of decay (locality) of the kernel in data-dependent

26/41
manner, such that in regions of high sampling density σ is small, leading to a tighter kernel k γ , and allowing resolution of finer-scale features. Conversely, in low-density regions σ is large, and hence we obtain a broader kernel k γ , enhancing robustness to statistical sampling errors. Note that the radial Gaussian kernel in (1) is a special case of (2) with the constant bandwidth function σ (x i , y) = 1 and bandwidth parameter γ = ε. Here, we use the symbol γ for the kernel bandwidth parameter to distinguish it from ε employed for transfer/Koopman operator approximation in the previous section. The choice of γ and bandwidth function σ will be discussed in a subsequent section. It should be noted that in addition to improving state estimation, delay embedding also improves the efficiency of basis vectors derived from the datax j in approximating transfer/Koopman operator eigenfunctions 33 (as noted in the main text).
Having constructed the kernel matrix K K K, we next normalize it to obtain a bistochastic kernel matrix, i.e., a symmetricÑ ×Ñ matrixK K K with positive entriesK i j , satisfying ∑ N−1 j=Q−1K i j = 1 for all i ∈ {Q − 1, . . . , N − 1}. The normalization procedure 93 employs the steps where D D D and S S S are diagonal matrices with diagonal entries D ii = ∑ N−1 j=Q−1 K i j and S ii = ∑ N−1 j=Q−1 K i j /D j j , respectively. The basis vectors φ φ φ j are then obtained by solving the matrix eigenvalue problem By convention, we order the eigenvalues η j in decreasing order, η 0 ≥ η 1 ≥ · · · , and normalize the corresponding eigenvectors such that φ φ φ i φ φ φ j =Ñδ i j . By Markovianity ofK K K and strict positivity of k γ (which implies that the elements ofK K K are strictly positive), the leading eigenvalue η 0 is equal to 1, and is strictly greater than η 1 . Moreover, the corresponding eigenvector φ φ φ 0 has constant elements, which can be set to 1 by our choice of normalization. As a result, viewed as temporal patterns t i → φ i j , the eigenvectors φ φ φ j with j > 1 have zero mean (since they are orthogonal to φ φ φ 0 ) and unit variance (since φ φ φ j 2 /Ñ = 1). Note that because k γ from (1) is a nonlinear kernel, the entries of φ φ φ j are not necessarily linear projections of the datax i onto a corresponding extended EOF (EEOF); that is, in general, φ i j is not equal to u jx i for an EEOF u j ∈ R dQ . The φ φ φ j can therefore be viewed as nonlinear principal components, which are able to span richer spaces of observables than conventional EEOF techniques utilizing linear (covariance) kernels. This property is particularly important for our purposes, since in what follows we will use the φ φ φ j to build Galerkin approximation spaces for the generator that can act on nonlinear functions.
In what follows, our approach is to fix L N and employ the leading eigenvectors φ φ φ 0 , . . . , φ φ φ L−1 as basis vectors for approximating the generator. We choose an L-dimensional approximation space with high regularity, which reduces the sensitivity of our generator approximations to sampling errors.

Spectral analysis of the generator.
Viewing vectors f f f = ( f Q−1 , . . . , f N−1 ) ∈ CÑ as complex-valued temporal patterns t i → f i sampled discretely in time at the sampling interval ∆t, we approximate the generator V by a finite-difference operator V : CÑ → CÑ. As a concrete example, used in all generator calculations in this paper, the following is a fourth-order central scheme: Using (3), we approximate the generator V by the L × L antisymmetric matrix V V V with elements

27/41
It can be shown that V V V provides a data-driven Galerkin approximation matrix for V , which converges in a suitable large-data limit 33,34 . Similarly, we construct an L × L matrix W W W approximating the diffusionregularized generator V ε = V − ε∆, by defining Here, ∆ is a diffusion operator on the Hilbert space of observables H, and ∆ ∆ ∆ a positive-semidefinite, self-adjoint matrix approximating ∆. We set ∆ ∆ ∆ to the diagonal matrix with entries With these definitions in place, and a choice of regularization parameter ε > 0, we solve the L × L matrix eigenvalue problem W W W u u u j =λ j u u u j .
The eigenvaluesλ j provide approximations to the eigenvalues λ j of V ε . Moreover, the eigenvectors u u u j = (u 0 j , . . . , u (L−1) j ) ∈ C L contain the expansion coefficients of the approximate generator eigenfunction g g g j in the φ φ φ i basis; that is, In analogy to the discrete-time case, we order the eigenpairs (λ j , g g g j ) in decreasing order of Re λ j . We normalize the g g g j such that g g g † j g g g j =Ñ, where † denotes the complex-conjugate transpose. Note that, in general, the eigenvectors g g g j are not orthogonal (though they are approximately orthogonal for sufficiently small ε).
The imaginary parts of the eigenvalues, Imλ j , represent the angular frequencies (radians per unit time) corresponding to the eigenfunctions g g g j . In the main text (e.g., Fig. 3) we show the frequencies ν j = Imλ j /(2π) measuring cycles per unit time. Meanwhile, the real part Reλ j measures the (negative) decay rate of g g g j under the evolution semigroup generated by W W W . By construction, W W W has a constant eigenvector g g g 0 = 1 1 1 corresponding to the eigenvalueλ 0 = 0 (i.e., zero decay rate and oscillatory frequency). All other eigenvalues have strictly negative real part, and we order them in order of decreasing Reλ j (i.e., in order of increasing decay rate) by convention.
In separate calculations with synthetic periodic data, we have verified that the 17% approximation error of the triennial eigenfrequency in Fig. 3 can be reduced to 5% by using a eighth-order finitedifference approximation scheme at a fixed monthly sampling interval ∆t. Since our focus in this work is on lower frequencies (e.g., the interannual ENSO frequency), we have elected to work with the fourth-order scheme in (3), which provides adequate accuracy for the eigenfrequencies of interest while being less sensitive to numerical perturbations than higher-order schemes.
Bandwidth function and parameter tuning. In the CCSM4 and ERSSTv4 analyses, we employ a nonseparable bandwidth function that promotes connectivity between datapoints whose relative displacement vector is aligned with the local dynamical flow 94

28/41
where 0 ≤ ζ < 1, and v i =x i −x i−1 , v j =x j −x j−1 are (trajectory tangent) vectors representing the local time tendency of the data. Following Refs 70-72 , we set ζ to a value close to 1, namely ζ = 0.995 (see Supplementary Table 1). This has the effect of promoting slow timescales in the extracted basis functions φ φ φ j , which reduces the error of the finite-difference approximation of the generator. In separate calculations not reported here, we have computed approximate generator eigenfunctions for the L63 system, which exhibit persistent cyclicity analogously to the transfer operator experiment in Fig. 2. In these experiments, we employ a separable bandwidth function 40 , where ρ(x i ) is an estimate of the sampling density of the data atx i , and m > 0 a parameter approximating the dimension of the data manifold in R dQ . The density estimator is formally given by ρ(y) = exp(− z − y 2 /γ 2 ) dν(z), whereγ > 0 is a bandwidth parameter (in general, different from γ in (2)). The dimension parameter m is determined numerically using the same procedure 40 (outlined below) as for tuning the bandwidth parameters γ andγ. In the L63 experiments, we obtain a value m ≈ 2.06 approximating the fractal dimension of the Lorenz attractor. Even though the L63 snapshot data x i ∈ R 3 contain full state information, we have used a long embedding window of Q = 800 samples at a ∆t = 0.01 sampling interval. This has the effect of "biasing" the basis vectors φ φ φ j towards approximate Koopman/transfer eigenvectors 33,37 , thus improving the efficiency of the basis in approximating solutions to the generator eigenvalue problem. In all generator calculations reported in this paper we tune the bandwidth parameters γ andγ automatically using a numerical procedure 40 . This involves computing the kernel sum S(γ l ) := ∑ N−2 i, j=Q−1 k γ l (x i ,x j ) on a logarithmic grid γ l of trial bandwidth parameters, and choosing γ as the bandwidth parameter γ l that maximizes the derivative d log S(γ l )/d log γ l (estimated numerically by finite differences). See Algorithm 1 in Ref. 34 for pseudocode. The maximum valuem of d log S(γ l )/d log S(γ l ) can be shown to be approximately equal to the dimension of the data manifold in R dQ divided by 2. Based on that, in our generator calculations utilizing the bandwidth function in (6) we set the dimension parameter m =m/2.

Phase composites
Here, we describe the procedure for constructing phase composites of the observables employed in Figs. 7 and 8. Let Y : Ω → R d be a target observable for compositing. For instance, in Fig. 7, Y is either the global SST, zonal surface wind, or meridional surface wind anomaly field sampled at d gridpoints. Let g : Ω → C be a complex-valued index representing the phenomenon of interest for which composites are created. In Figs. 7 and 8, g is equal to either the generator eigenfunctions g j , or the lagged Niño 3.4 index f nino . We further letΩ = {ω Q−1 , . . . , ω N−1 } denote the set of states sampled along our dynamical trajectory (taking delay embedding with Q delays into account), S ∈ N the number of phases, and m an integer less thanÑ/S representing the number of samples in each phase (recall thatÑ = (N − Q + 1)/S). We define S "wedges" W 1 , . . .W S ⊂ C in the complex plane by W j = {z ∈ C : |z| ≥ a j and arg z ∈ Θ j }, j = 1, . . . , S, where Θ j = [2π( j − 1)/S, 2π j/S), and a j is the m-th largest modulus of the complex numbers in the set {g(ω) : ω ∈Ω and arg ω ∈ Θ j }.
The sets W 1 , . . . ,W S represent S "phases" of an oscillatory process represented by g. In addition, we define a phase W 0 = C \ S j=1 W j associated with the states in Ω for which the process represented by g is considered inactive. For each phase W j we define the associated phase composite as the vector Y j ∈ R d
We can interpret the phase composites Y j as values of the conditional expectation of the observable Y with respect to the partition {W 0 , . . . ,W S } of C induced by the complex-valued index g. For that, note that the partition induces a discrete variable π : Ω → {0, 1, . . . , S − 1}, where π(ω) = j if and only if ω lies in W j . We define Y : Ω → R d as a discrete observable representing the empirical conditional expectation of Y given π, i.e., Note that Y is a discrete observable satisfying Y(ω) = Y j whenever the eigenfunction value g(ω) lies in W j for the state ω ∈ Ω.

Mode reconstruction
Our approach for computing spatiotemporal mode reconstructions (e.g. as shown in Fig. 10) is closely related to the reconstruction procedure in SSA 56 , with appropriate modifications to take into account the facts that eigenfunctions of evolution operators may be (i) complex-valued; and (ii) non-orthogonal. Let Y : Ω → R d be a target observable for reconstruction as in the previous section. For instance, in Fig. 10, the target observable Y is the Niño 3.4 region-averaged SST anomaly, which is a scalar with d = 1, but Y can also be vector-valued, for example when one reconstructs the original input data and sets Y = X.
Let ·, · denote the inner product of H, f 1 , f 2 = Ωf 1 f 2 dµ, and let g j denote an element of the biorthonormal basis of the {g i }; that is, g j , g i = δ ji . A procedure for constructing the biorthonormal set {g g g 0 , . . . , g g g L−1 } to {g g g 0 , . . . , g g g L−1 }, satisfying g g g † i g g g j = δ i j is to (i) form the L × L Gram matrix G G G with G i j = u u u † i u u u j ; (ii) compute u u u i = G G G −1 u u u i ; and (iii) form the linear combination g g g i = ∑ L−1 k=0 u ki φ φ φ k . For each eigenfunction g j and lag q ∈ Z, we define the complex-valued spatial pattern A (q) j ∈ C d given by projection of P q ∆t Y = (U q ∆t ) * Y onto generator eigenfunction g j ; formally, Numerically, we approximate A (q) j by projecting the samples y 0 , . . . , y N−1 , y i = Y (ω i ), of the target observable, lagged by q steps, onto the dual eigenvector g g g j , viz.
It is worthwhile noting that for q = 0 the spatial patternsÂ (0) j are analogous to Koopman modes employed in data-driven Koopman operator techniques 12,13,17 . The patternsÂ (0) j can thus be thought of as time-shifted Koopman modes.
Next, we define an approximate projection of the target observable Y onto the eigenfunction g j , namelỹ Y j : Ω → C d , by multiplication of A (q) j with U q ∆t g j , followed by averaging over the delay-embedding window 56 , j U q ∆t g j .

30/41
Numerically,Ỹ j is approximated by the spatiotemporal patternŶ Y Y j = (ŷ j(Q−1) , . . . ,ŷ j(N−1) ) ∈ C d ×Ñ , wherê approximatesỸ j (ω i ). Note thatỸ j can be equivalently expressed as from which we can interpretỸ j as a projection of the observable Y onto an order-Q Krylov subspace generated by eigenfunction g j . Adopting standard terminology from climate science, in the main text we refer to the reconstructed patternsŶ Y Y j as modes, though it should be kept in mind that these patterns are different from Koopman modes in that they have both spatial and temporal character. The individual modesỸ j can be combined into sum modes by choosing an index set J = ( j 1 , . . . , j l ) and definingỸ J = ∑ l k=1Ỹ j k . Similarly, in the empirical setting, we defineŶ Y Y J = ∑ l k=1Ŷ Y Y j k . Note thatỸ J (resp.Ŷ Y Y J ) is real whenever J consists of indices of pairs of complex-conjugate eigenvalues λ j (resp.λ j ). In Fig. 10, we show reconstructions using index sets J representing various complex-conjugate pairs of ENSO and ENSO combination modes associated with generator eigenfunctions.